Classifiers
How Bigeye Detects Sensitive Data
Bigeye Sensitive Data Scanning uses a combination of pattern matching, checksum validation, and machine learning to detect sensitive data across your structured data sources. These techniques are combined to build confidence before a finding is produced, balancing precision against recall so that false positives are kept to a minimum.
All detection runs entirely within your environment. Bigeye does not train models on your data; classification is performed using inference only.
Key Concepts
Classifier: A classifier contains the detection logic used to identify a specific type of sensitive data. Each classifier produces a single data class as its output when a match is found (e.g., "SSN", "Email Address", "Credit Card Number"). When you configure a classifier, you select the data class that is applied when the classifier's detection logic is satisfied.
Data class: A label that describes the type of sensitive data discovered. Each data class can be mapped to a sensitivity level — Public, Internal Only, Confidential, or Restricted — to indicate the level of risk associated with that data.
Detector: The atomic unit of detection logic within a classifier. Detectors can use regex patterns or machine learning models. When a classifier contains multiple detectors, they can be combined using AND or OR logic.
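The relationship between these three concepts can be sketched in Python. This is an illustrative model only, not Bigeye's API: the `Sensitivity`, `DataClass`, `Detector`, and `Classifier` types, the `regex_detector` helper, the email pattern, and the choice of "Confidential" for email addresses are all hypothetical.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional, Tuple


class Sensitivity(Enum):
    # The four sensitivity levels named in this document.
    PUBLIC = "Public"
    INTERNAL_ONLY = "Internal Only"
    CONFIDENTIAL = "Confidential"
    RESTRICTED = "Restricted"


@dataclass(frozen=True)
class DataClass:
    name: str
    sensitivity: Sensitivity


@dataclass(frozen=True)
class Detector:
    # Hypothetical: a detector is modeled as a named predicate over one value.
    name: str
    predicate: Callable[[str], bool]


def regex_detector(name: str, pattern: str) -> Detector:
    compiled = re.compile(pattern)
    return Detector(name, lambda value: compiled.fullmatch(value) is not None)


@dataclass(frozen=True)
class Classifier:
    data_class: DataClass
    detectors: Tuple[Detector, ...]
    logic: str = "OR"  # "AND" requires every detector to match, "OR" any one

    def classify(self, value: str) -> Optional[DataClass]:
        results = [d.predicate(value) for d in self.detectors]
        hit = all(results) if self.logic == "AND" else any(results)
        return self.data_class if hit else None


# A classifier with a single regex detector (assumed pattern, for illustration).
email = Classifier(
    data_class=DataClass("Email Address", Sensitivity.CONFIDENTIAL),
    detectors=(regex_detector("email_shape", r"[^@\s]+@[^@\s]+\.[^@\s]+"),),
)
```

Calling `email.classify("a@example.com")` returns the "Email Address" data class, while a value that does not satisfy the detector returns `None`.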
Out-of-the-Box Classifiers
Bigeye ships with a curated set of out-of-the-box classifiers ready to use in your scan jobs. These cover common sensitive data types across PII, PHI, PCI, and financial data.
Personally Identifiable Information (PII)
| Data Class | Description |
|---|---|
| Name | Full or partial person names |
| Email Address | Email addresses |
| Phone Number | Phone numbers (USA only) |
| Home or Mailing Address | Physical street and mailing addresses |
| Date of Birth | Dates of birth |
| Place of Birth | Location of birth |
| Age | Age values |
| Mother's Maiden Name | Maiden name identifiers, commonly used in security verification |
| Nationality, Religion, or Political Group | Demographic and group affiliation identifiers |
| Password | Password fields identified by column name and value patterns |
| Personal Account Usernames | User account identifiers |
| IP Address | IPv4 and IPv6 addresses |
| Location | General geographic or location data |
| Date or Datetime | Date and timestamp values |
| Web URL | Web addresses that may contain identifying information |
| Device Identifier or Serial Number | Hardware or device serial numbers |
| Crypto Wallet ID | Cryptocurrency wallet addresses |
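
As an illustration of value-level detection for one of these data classes, IPv4 and IPv6 addresses can be recognized by parsing rather than by regex alone. The sketch below uses Python's standard `ipaddress` module; the `ip_version` helper is a hypothetical name, and nothing here is meant to describe how Bigeye's IP Address classifier is actually implemented.

```python
import ipaddress
from typing import Optional


def ip_version(value: str) -> Optional[int]:
    """Return 4 or 6 if the value parses as an IP address, otherwise None."""
    try:
        return ipaddress.ip_address(value.strip()).version
    except ValueError:
        return None
```

Parsing rejects values such as `999.1.1.1` that a loose digits-and-dots pattern would accept.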
Government & Travel Identifiers
| Data Class | Description |
|---|---|
| US SSN/TIN | US Social Security Numbers and Tax Identification Numbers |
| US ITIN | US Individual Taxpayer Identification Numbers |
| US Driver's License Number / State ID | US driver's license and state ID numbers across all US states |
| US Passport Number | US passport numbers |
| Vehicle Identifiers (VIN, Plate #, Registration #) | Vehicle identification numbers, license plates, and registration numbers |
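
To show what pattern matching for a government identifier can look like, here is a minimal sketch for hyphenated US Social Security Numbers. It encodes the publicly documented structural rules (area is never 000, 666, or 900-999; group is never 00; serial is never 0000). The `SSN_RE` pattern and `looks_like_ssn` helper are hypothetical and are not Bigeye's shipped pattern.

```python
import re

# Area (3 digits), group (2 digits), serial (4 digits), with invalid ranges excluded.
SSN_RE = re.compile(r"(?!000|666|9\d\d)\d{3}-(?!00)\d{2}-(?!0000)\d{4}")


def looks_like_ssn(value: str) -> bool:
    return SSN_RE.fullmatch(value.strip()) is not None
```

Because ITINs begin with 9, a pattern like this one does not match them, which is consistent with treating US ITIN as a separate data class.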
Protected Health Information (PHI)
| Data Class | Description |
|---|---|
| Patient Name | Patient name fields in healthcare contexts |
| MRN (Medical Record Number) | Medical record number identifiers |
| Diagnosis | Diagnostic codes and descriptions (e.g., ICD codes) |
| Treatment Codes | Medical treatment and procedure codes |
| Medical Information | General medical and clinical information |
| Medical License | Medical practitioner license numbers |
| Provider NPI (National Provider Identifier) | National Provider Identifier numbers |
| Health Plan / Insurance Number | Health insurance and plan identifiers |
| Healthcare Admission Date | Patient admission dates |
| Healthcare Discharge Date | Patient discharge dates |
Payment Card Industry (PCI)
| Data Class | Description |
|---|---|
| Credit Card Number | Credit and debit card numbers, validated with checksum |
| CC Expiration Date | Card expiration dates |
| CVV | Card verification values |
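
Checksum validation for card numbers is typically done with the Luhn algorithm. The sketch below pairs a loose regex for candidate values with a Luhn check, which is the general "pattern plus checksum" approach described earlier. The `CARD_CANDIDATE` pattern and function names are hypothetical, and this document does not state which checksum or pattern Bigeye uses internally.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\d(?:[ -]?\d){12,18}")


def luhn_valid(digits: str) -> bool:
    """Luhn check: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def looks_like_card_number(value: str) -> bool:
    value = value.strip()
    if CARD_CANDIDATE.fullmatch(value) is None:
        return False
    return luhn_valid(re.sub(r"[ -]", "", value))
```

The checksum step is what filters out arbitrary 16-digit values, such as order numbers, that match the pattern but are not card numbers.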
Personal Financial Information (PFI)
| Data Class | Description |
|---|---|
| US Bank Account Number | US bank account numbers |
| IBAN Code | International Bank Account Numbers |
| Account PIN | Account PIN codes |
| Credit Score | Credit score values |
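
IBANs carry their own check digits, validated with the standard mod-97 procedure. The following is a minimal sketch of that procedure; it checks overall shape and the checksum only, not country-specific lengths. The `IBAN_SHAPE` pattern and `iban_valid` helper are hypothetical, and whether Bigeye's IBAN Code classifier applies this check is an assumption.

```python
import re

# Country code, two check digits, then 11 to 30 alphanumeric characters.
IBAN_SHAPE = re.compile(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}")


def iban_valid(value: str) -> bool:
    iban = re.sub(r"\s+", "", value).upper()
    if IBAN_SHAPE.fullmatch(iban) is None:
        return False
    # Move the first four characters to the end, map A..Z to 10..35,
    # and require the resulting integer to be congruent to 1 modulo 97.
    rearranged = iban[4:] + iban[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```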
If you need a classifier that is not listed above, contact your Bigeye representative.
Customizing Classifiers
Clone and Adjust
You can clone any out-of-the-box classifier to create your own version. Cloned classifiers can be adjusted — for example, modifying or adding regex patterns to better match data formats specific to your organization.
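Conceptually, cloning copies a classifier's definition and lets you change the copy while the original stays untouched. The sketch below models that in Python with an immutable dataclass. The `RegexClassifier` type, both phone patterns, and the classifier names are hypothetical; in Bigeye this is done through the product's classifier configuration rather than through code like this.

```python
import re
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class RegexClassifier:
    name: str
    data_class: str
    patterns: Tuple[str, ...]

    def matches(self, value: str) -> bool:
        # A value matches if any one of the patterns matches it in full.
        return any(re.fullmatch(p, value) for p in self.patterns)


# Stand-in for an out-of-the-box classifier (assumed pattern).
builtin_phone = RegexClassifier(
    name="Phone Number",
    data_class="Phone Number",
    patterns=(r"\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}",),
)

# Clone it and add an organization-specific format with an extension suffix.
custom_phone = replace(
    builtin_phone,
    name="Phone Number (with extension)",
    patterns=builtin_phone.patterns + (r"\d{3}-\d{3}-\d{4} x\d{1,5}",),
)
```

The clone keeps the original data class, so findings from both classifiers are labeled the same way.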
Build from Scratch
You can also create classifiers entirely from scratch. When building a custom classifier, you can add one or more detectors using regex patterns, machine learning models, or a combination of both. Multiple detectors within a single classifier can be linked with AND or OR logic to fine-tune detection accuracy.
When creating or editing any classifier, you will assign a data class that will be applied as the finding output whenever the classifier's detection logic is satisfied.
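As a rough illustration of how AND logic can tighten a custom classifier, the sketch below requires both a column-name detector and a value-pattern detector to agree before the data class is applied. Everything in it is hypothetical: the `CustomClassifier` type, the helper functions, the "Employee ID" data class, the `EMP-` format, and the 0.8 match ratio are assumptions made for the example, not Bigeye behavior.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# In this sketch a detector sees the column name and a sample of its values.
Detector = Callable[[str, Sequence[str]], bool]


def column_name_matches(pattern: str) -> Detector:
    rx = re.compile(pattern, re.IGNORECASE)
    return lambda column, values: rx.search(column) is not None


def values_match(pattern: str, min_ratio: float = 0.8) -> Detector:
    rx = re.compile(pattern)

    def detect(column: str, values: Sequence[str]) -> bool:
        if not values:
            return False
        hits = sum(1 for v in values if rx.fullmatch(v))
        return hits / len(values) >= min_ratio

    return detect


@dataclass(frozen=True)
class CustomClassifier:
    data_class: str
    detectors: Tuple[Detector, ...]
    logic: str = "AND"  # "AND" or "OR"

    def classify(self, column: str, values: Sequence[str]) -> Optional[str]:
        results = [d(column, values) for d in self.detectors]
        satisfied = all(results) if self.logic == "AND" else any(results)
        return self.data_class if satisfied else None


employee_id = CustomClassifier(
    data_class="Employee ID",
    detectors=(
        column_name_matches(r"emp(loyee)?_?id"),
        values_match(r"EMP-\d{6}"),
    ),
)
```

With AND logic, a column named `order_ref` that happens to contain `EMP-000123` produces no finding. Switching `logic` to `"OR"` would flag it, trading precision for recall.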
