Scan Outputs
Sensitive Data Scanning produces structured results every time a scan runs. This page explains how those results are organized, where to find them, how to export them, and — critically — what each output type actually tells you (and what it doesn't).
Before reading this: If you haven't set up your first scan yet, start with Build your first scan. This page assumes you have at least one scan job configured and have seen scan results.
Key concepts
Before diving in, it helps to understand some key scan terms:
| Concept | Definition |
|---|---|
| Scan job | A saved configuration: which tables to scan, which classifiers to use, and how often to run. A scan job produces many scan runs over time. |
| Scan run | One execution of a scan job. Each run is a snapshot — a record of what was found at a specific point in time. |
| Finding | The result of a classifier checking a column. A finding can be positive (data class detected) or negative (nothing found). Findings are recorded at the column level, not the row level. |
| Data class | The type of sensitive data identified, e.g. SSN, Email Address, Credit Card Number. |
| Classifier | A detection rule that evaluates column data and determines whether it matches a specific data class. A scan job can include multiple classifiers. |
| Sensitivity level | The risk tier assigned to a data class: Public (L0), Internal Only (L1), Confidential (L2), or Restricted (L3). Unassigned data classes default to Restricted. |
Two views of your results
Every scan job exposes two distinct views of its findings. Understanding the difference between them is the most important thing on this page.
Snapshot view (per scan run)
The runs view shows the output of a single scan run. It captures what the classifiers found during that specific execution and is frozen in time — it will never change, regardless of what happens to your data, data class sensitivity levels, or your classifier configuration afterward.
Navigate to a snapshot from: Data sensitivity → [Scan job name] → Runs tab → click a run
The snapshot view shows:
| Column | Description |
|---|---|
| Source / Schema / Table / Column | The full path to the scanned column |
| Data class | The sensitive data type detected |
| Rows scanned | Number of rows the classifier evaluated |
| Rows matched | Number of rows that triggered the classifier |
| Sensitivity | The assigned sensitivity level |
Tip: The default filter on the runs view shows only positive findings (where rows matched). To see columns that were scanned but produced no match, use the filter to include negative findings as well.
Aggregate view (across all runs)
The aggregate view consolidates findings across every run a scan job has ever completed. Rather than showing a single moment in time, it answers the question: "Has this column ever been found to contain this type of sensitive data?"
Note that aggregate behavior varies by scan type. For auto, incremental, and sampled scans, aggregate findings are additive — once a finding is recorded, it persists until a reset scan clears it. For full scans, findings are replaced on each run, so the aggregate reflects the most recent execution.
Navigate to the aggregate view from: Data sensitivity → [Scan job name] → Aggregate
The aggregate view surfaces a Reset Scan action per finding row for auto, incremental, and sampled scan types. Reset scans are not available for full scan jobs because full scans re-evaluate all data on every run. See Clearing stale aggregate findings below.
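The difference between the two aggregation behaviors can be sketched in a few lines of Python. This is purely illustrative: `Finding`, `merge_additive`, and `merge_full` are hypothetical names, not part of any Bigeye API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    column: str       # e.g. "users.email"
    data_class: str   # e.g. "Email Address"

def merge_additive(aggregate, run, reset_columns=frozenset()):
    """Auto/incremental/sampled scans: findings accumulate.

    A finding only leaves the aggregate when its column was marked
    for a reset scan AND the new run no longer reports it.
    """
    kept = {f for f in aggregate if f.column not in reset_columns}
    return kept | run

def merge_full(aggregate, run):
    """Full scans: the aggregate is simply replaced by the latest run."""
    return run
```

In this sketch, a stale SSN finding survives a clean run under additive merging, but is cleared once the column is reset (or, for full scans, as soon as the next run completes without it).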
Catalog sensitivity tab
In addition to the scan-specific views above, aggregated findings from all scan jobs are surfaced in the data catalog. Navigate to any source, schema, or table and open the Sensitivity tab to see all findings associated with that asset, with attribution showing which scan job and classifier produced each result.
Note: If a column has findings from two different scan jobs, both appear in the catalog — nothing is suppressed or merged. The catalog always shows the union of all scan job results.
The findings table in detail
Whether you're looking at a snapshot or aggregate view, findings are displayed as one row per column + data class pairing. A single column can appear multiple times in the table if multiple classifiers each found a different data class.
Example:
| Column | Data class | Sensitivity | Rows scanned | Rows matched |
|---|---|---|---|---|
| users.email | Email Address | Confidential | 50,000 | 48,210 |
| users.email | Username | Internal Only | 50,000 | 12 |
| orders.card_num | Credit Card Number | Restricted | 50,000 | 50,000 |
In this example, the users.email column has two separate findings because two classifiers flagged it for different data classes. This is expected behavior — a column can legitimately contain data that matches multiple classes.
Important: Findings are never deduplicated or merged across classifiers. Each classifier produces an independent result. When reviewing output for compliance purposes, account for this when counting "sensitive columns."
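When counting "sensitive columns" from a findings export, deduplicate by column path rather than counting rows. A minimal illustration (the tuples mirror the example table above and are not a real export format):

```python
# Illustrative findings rows: (column, data_class, sensitivity).
findings = [
    ("users.email", "Email Address", "Confidential"),
    ("users.email", "Username", "Internal Only"),
    ("orders.card_num", "Credit Card Number", "Restricted"),
]

# A naive row count overstates "sensitive columns": 3 rows, but only
# 2 distinct columns.
distinct_columns = {col for col, _, _ in findings}

# Counting only columns with at least one Confidential or Restricted
# finding is often what a compliance report actually needs:
high_risk = {col for col, _, sens in findings
             if sens in ("Confidential", "Restricted")}
```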
Runs vs. aggregate: which should you use?
Use the right view for the right job:
| Use case | Recommended view |
|---|---|
| Point-in-time audit or compliance report | Runs |
| Understanding current data risk posture | Aggregate (with a recent re-scan) |
| Verifying a remediation (e.g., "did we remove SSNs?") | Runs from a run after remediation |
| Reviewing all findings across scan jobs for an asset | Catalog sensitivity tab |
| Debugging a specific scan run | Runs |
Risks of misinterpreting scan outputs
This section covers the most common mistakes when reading scan results. Read it carefully before sharing reports with stakeholders.
Tables added after scan creation are not included
Tables added to a source or schema after a scan job is created will not be automatically included in the scan. If new tables appear that need scanning, create a new scan job. See the scope warning displayed during the scan job creation wizard.
Aggregate findings can be stale
For auto, incremental, and sampled scans, aggregate findings are additive and persistent. Once a finding is recorded, it remains in the aggregate view until you explicitly trigger a reset scan for that column — even if the sensitive data has since been deleted from your database. (Full scans replace findings on each run, so this concern does not apply to full scan jobs.)
Scenario: Your team manually redacts SSNs from a column that previously contained SSNs. You rerun the scan. The snapshot from the new run shows no SSN findings for that column. However, the aggregate view still shows the SSN finding from the prior run.
Why this matters: If you share an aggregate report to demonstrate compliance after a remediation, it will appear as though the sensitive data is still present.
What to do: Use the Reset scan action on the aggregate view to flag affected columns for a full rescan, then re-run the scan job. Once the rescan completes, the aggregate will reflect current reality.
Sampled scans may miss sensitive data
When a scan job uses sampled scanning, the classifier only evaluates a subset of rows — not the full column. If sensitive data exists only in rows that weren't included in the sample, the scan will produce no finding.
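To build intuition for how easily a sample can miss rare sensitive data, here is a rough back-of-the-envelope calculation. It assumes uniform random sampling without replacement, which may not match the product's actual sampling strategy.

```python
def p_miss(total_rows: int, sensitive_rows: int, sample_size: int) -> float:
    """Probability a uniform random sample (without replacement)
    contains none of the sensitive rows (hypergeometric)."""
    p = 1.0
    for i in range(sensitive_rows):
        p *= (total_rows - sample_size - i) / (total_rows - i)
    return max(p, 0.0)

# 5 sensitive rows hidden among 1,000,000; a 10,000-row sample
# misses all of them roughly 95% of the time.
print(p_miss(1_000_000, 5, 10_000))
```

The takeaway: for rare sensitive values, even a large sample leaves a substantial chance of a false "no finding."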
Auto/incremental scans only evaluate new rows
Auto/incremental scans evaluate only rows added or updated since the last run, using a row creation time column. They do not re-evaluate rows that were present in previous runs.
Scenario: Your scan job is configured as incremental. An SSN was loaded into a column two years ago, before incremental scanning was enabled. The SSN will never appear in an incremental scan finding — it's in rows that are never evaluated.
What to do: Use an auto scan, which performs a full-table scan on its first run to establish a complete baseline, then switches to incremental scanning for subsequent runs. Alternatively, run a full scan at least once. Use auto/incremental scans to catch new sensitive data efficiently, but don't rely on them to surface sensitive data that predates the scan job.
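The blind spot can be illustrated with a toy filter. This assumes the incremental run simply skips rows whose row creation time predates the last run's high-water mark (an assumption about the mechanism, not Bigeye's exact implementation):

```python
from datetime import datetime

# Illustrative rows: (value, row_creation_time). Values are made up.
rows = [
    ("123-45-6789", datetime(2023, 1, 1)),   # old SSN, loaded years ago
    ("hello",       datetime(2025, 6, 1)),   # recently added row
]

last_scan_high_water_mark = datetime(2025, 1, 1)

# An incremental run only evaluates rows past the high-water mark,
# so the classifier never sees the old SSN row at all.
evaluated = [value for value, rct in rows if rct > last_scan_high_water_mark]
```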
Data can change during a scan
Bigeye splits tables into chunks and processes each chunk independently over a period of time. If your table's data changes while a scan is in progress, some chunks may reflect pre-change data while others reflect post-change data. The window between a scan's "Started At" and "Completed At" timestamps is therefore the period during which the underlying data may have shifted. Keep this in mind when interpreting findings from long-running scans on actively written tables.
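A toy sketch of the chunking idea (the chunking key and chunk size are assumptions; Bigeye's actual chunking strategy may differ):

```python
def chunk_ranges(min_id: int, max_id: int, chunk_size: int):
    """Split a table's key range into independently scanned chunks.
    Because each chunk is read at a different moment, rows modified
    mid-scan can appear in their pre-change state in one chunk and
    their post-change state in another."""
    lo = min_id
    while lo <= max_id:
        hi = min(lo + chunk_size - 1, max_id)
        yield (lo, hi)
        lo = hi + 1
```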
Snapshot staleness
Snapshots reflect the state of your data at the time of the scan run. They are labeled with their completion timestamp. A snapshot from three months ago does not tell you what your data looks like today.
Always check the "last scan completed" timestamp before treating a snapshot as current. The UI displays this prominently and will indicate if a snapshot is stale relative to your scan schedule.
Clearing stale aggregate findings
If a column's aggregate findings no longer reflect the current state of your data, kick off a Reset Scan. This option is available for auto, incremental, and sampled scan types only — full scans re-evaluate all data on every run and do not need reset scans.
- Navigate to Data sensitivity → [Scan job name] → Aggregate
- Find the column-data class row you want to refresh
- Click the Reset scan icon (↻) on that row
- The icon changes to an × — click it to undo if needed
- On the next run, the column will be fully re-evaluated from scratch, and stale findings will be cleared if the data class is no longer present
Note: Clicking the reset icon does not trigger an immediate scan. It marks the column for a full rescan on the next scheduled (or manual) run. The column will be treated as if it were newly added — all chunks will be rescanned against all classifiers.
Kicking off a reset scan requires Write permission on the scan job.
Exporting results
Both the runs and aggregate views support CSV and PDF export. Every export explicitly identifies whether it represents a run or an aggregate result.
To create an export, select the Export button in the top-right corner of the Runs or Aggregate tab.
Export format and compliance use
Each export file identifies in its filename whether it is a run (snapshot) or aggregate report. Before using an export for a compliance audit or data inventory:
- Check the report type. Aggregate reports are not point-in-time certifications. Snapshot reports are.
- Check the scan type. A report from a sampled scan carries higher uncertainty than one from a full scan. CSV exports include a column indicating the scan type; PDF exports currently reflect the scan type in the header only.
- Check the timestamp. Verify the scan completed recently enough to be meaningful for your use case.
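If you script these checks against a CSV export, it might look like the sketch below. The column names (`report_type`, `scan_type`, `completed_at`) are hypothetical; inspect your actual export header before relying on them.

```python
import csv
from datetime import datetime, timedelta

def check_export(path: str, max_age_days: int = 30) -> list:
    """Illustrative pre-flight checks before using an export for
    compliance. Column names are assumptions, not a documented schema."""
    warnings = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row.get("report_type") == "aggregate":
                warnings.append("aggregate report: not a point-in-time certification")
            if row.get("scan_type") == "sampled":
                warnings.append("sampled scan: findings carry sampling uncertainty")
            completed = datetime.fromisoformat(row["completed_at"])
            if datetime.now() - completed > timedelta(days=max_age_days):
                warnings.append(f"scan completed {completed:%Y-%m-%d}: may be stale")
            break  # report-level metadata assumed identical on every row
    return warnings
```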
What is preserved when things are deleted
Understanding deletion behavior matters for audit trails:
| Deleted object | Effect on snapshot findings | Effect on aggregate findings |
|---|---|---|
| Scan job (deleted) | Soft-deleted; preserved indefinitely for audit | Soft-deleted; preserved indefinitely for audit |
| Classifier (deleted) | Snapshot findings persist forever | Aggregate findings persist until next re-scan |
| Scan job edited to remove a classifier | Snapshot findings persist forever | Behaves the same as "classifier deleted" — findings persist until next re-scan |
| Column (deleted from source) | Historical run outputs are preserved and marked as deleted | Findings immediately disappear from the aggregate view |
| Table (deleted from source) | Historical run outputs are preserved and marked as deleted | Findings immediately disappear from the aggregate view |
| Data class (deleted) | Snapshot findings persist forever | Aggregate findings persist until next re-scan |
Frequently asked questions
Can I drill down to see which specific rows triggered a finding? No. For security reasons, Bigeye does not display the raw data values that triggered a classifier. The findings table shows the column path, data class, and row counts only.
A column appears twice in my findings table. Is that a bug? No — it means two different classifiers found data in that column for different data classes. This can also happen if the same column is scanned by two different scan jobs using the same classifier. Review each finding independently.
My reset scan completed, but the aggregate still shows the old finding. The reset scan marks the column for re-scan, but the aggregate won't update until the scan job actually runs. Check whether a run has completed since you marked the column for re-scan. If the finding persists after a completed run, it means the data class still exists in the column.
I deleted sensitive data from my database. How do I get it out of the aggregate view? For auto, incremental, or sampled scan jobs: trigger a reset scan on the affected column(s), then run the scan job. Once the new run completes with no finding, the aggregate will no longer show that data class for that column. For full scan jobs: the aggregate is replaced on each run, so simply re-running the scan will clear findings for data that is no longer present.
Why does my snapshot show fewer rows scanned than exist in the table? You're likely using a sampled scan or an auto/incremental scan. Check the scan type on the run detail page. For full coverage, configure the scan job as a full table scan.
Why does my aggregate show fewer rows scanned than exist in the table? The aggregate reflects cumulative scan activity. If you're using an incremental or sampled scan, not all rows may have been evaluated across all runs. For a complete picture, consider running a full scan or an auto scan (which starts with a full scan on its first run).
Why do auto/incremental scans keep scanning rows even though there's no new data? Bigeye re-scans "boundary rows" — rows near the maximum row creation time (RCT) value observed in the previous run — to account for late-arriving data. It also re-scans rows with null RCT values, because it cannot determine whether they were previously scanned. See the Row Creation Time section in the scan configuration docs for more details.
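A sketch of what such a re-scan predicate might look like (the column name, boundary window width, and SQL shape are all assumptions for illustration, not Bigeye's actual query):

```python
from datetime import datetime, timedelta

def rescan_predicate(rct_col: str, prev_max_rct: datetime,
                     boundary: timedelta = timedelta(hours=1)) -> str:
    """Illustrative WHERE clause for an auto/incremental run. Rows
    inside the boundary window are re-scanned to catch late arrivals;
    NULL-RCT rows are always re-scanned because their prior scan
    status is unknowable."""
    cutoff = (prev_max_rct - boundary).isoformat(sep=" ")
    return f"{rct_col} > '{cutoff}' OR {rct_col} IS NULL"
```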
