Frequently Asked Questions
This document consolidates the most common questions about what data is sent to Bigeye, how it’s handled, and how different connection and data modes work. It’s designed for security, compliance, and architecture reviewers who need a clear, single reference.
Connection Modes
Overview
Bigeye supports two connection architectures for integrating with customer data sources:
Agent-based deployments keep all source connections inside the customer’s network.
Direct Connect executes source connections from Bigeye’s managed cloud, which reduces setup complexity but requires the data source to accept network connections from outside the customer’s network.
Both connection modes collect and send the same information back to Bigeye. The choice between agent-based and Direct Connect (agent-less) affects where the connection originates, not what data is collected.
| Mode | Description | Typical Deployment | Security Notes |
|---|---|---|---|
| Agent-based | The Bigeye Agent runs within the customer’s environment, connecting to sources from inside the customer’s network. | Used by nearly all enterprise customers. | Source credentials and source network traffic stay inside the customer’s environment; only encrypted metadata and metrics are sent to Bigeye. |
| Direct Connect (Agent-less) | Connections are initiated directly from Bigeye’s cloud to the customer’s data source. | Primarily used in lower-sensitivity or proof-of-concept deployments. | Requires network access from Bigeye’s control plane to customer data source. |
What Data Bigeye Receives & Stores by Default
All data is encrypted in transit to Bigeye and at rest once stored, in alignment with Bigeye encryption policies.
Metadata
By default, Bigeye collects metadata from connected sources. Examples include:
- Schema, table, and column names
- Column data types
- Lineage information (e.g., column A is upstream of column B)
- Report and ETL job names
Aggregate Metrics
For any monitored source, Bigeye collects aggregate metrics only. Examples include:
- Count of nulls in a column
- Minimum, maximum, or average values
- Time since a table was last updated
These metrics are computed at the source via SQL pushdown. No raw data is transmitted to Bigeye to compute metric results.
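The pushdown pattern described above can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for a customer data source; the function name `collect_aggregate_metrics`, the `orders` table, and the exact SQL are assumptions made for illustration and are not Bigeye's actual queries. The point is that the database does the aggregation, and only one summary row leaves the source.

```python
import sqlite3

def collect_aggregate_metrics(conn, table, column):
    """Run a single aggregate query at the source and return only summary values."""
    # Identifiers are interpolated for brevity; in this sketch they come from
    # trusted configuration, not user input.
    query = (
        f'SELECT COUNT(*), '
        f'SUM(CASE WHEN "{column}" IS NULL THEN 1 ELSE 0 END), '
        f'MIN("{column}"), MAX("{column}"), AVG("{column}") '
        f'FROM "{table}"'
    )
    row = conn.execute(query).fetchone()
    keys = ("row_count", "null_count", "min_value", "max_value", "avg_value")
    return dict(zip(keys, row))

# Hypothetical source table standing in for a monitored warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, None), (3, 30.0), (4, 20.0)],
)

# Only this dictionary of aggregates would be transmitted, never the rows.
metrics = collect_aggregate_metrics(conn, "orders", "amount")
print(metrics)
```

However many rows the table holds, the result is a fixed-size set of numbers, which is why no raw data is needed to compute the metric.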
What features can send raw data to Bigeye?
When Data Restricted Mode is off (default for Bigeye), certain features may send raw data to Bigeye temporarily. In most cases, the raw data is not stored long-term.
| Feature | Description | Raw Data Handling | Control |
|---|---|---|---|
| Table Previews | Displays small sample previews in the catalog. | Includes actual row values in the UI. Raw data is not stored in Bigeye and is scoped to the request. | Can be disabled or limited via RBAC. |
| Grouped Metrics | Groups metric results by column values. | Group values are raw column values and must be stored so they can be displayed in the UI. | Controlled at feature level. |
| Issue Debug Query Preview | Bigeye generates a SQL query for debugging issue rows that can be run from the UI. | Query results, which will contain raw data, are previewed in the UI. Raw data is not stored by Bigeye and is scoped to the request. | Feature-flag controlled. |
| Enhanced Issue Resolution & Descriptions | Uses AI to generate issue explanations and resolution steps based on debug query output. | The raw query results are never stored, but descriptions may include values (e.g., “duplicate values found where state = 'New Mexico'”). | Feature-flag controlled. |
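To see why Grouped Metrics is listed in the table above even though the metric itself is an aggregate, consider the following sketch. It again uses `sqlite3` as a stand-in source; the `customers` table, the `null_count_by_group` helper, and the SQL are hypothetical and are not taken from Bigeye. The counts are aggregates, but each row of the result is keyed by a raw value from the grouping column.

```python
import sqlite3

def null_count_by_group(conn, table, metric_column, group_column):
    """Count nulls in metric_column for each distinct value of group_column."""
    # Identifiers are interpolated for brevity; assume trusted configuration.
    query = (
        f'SELECT "{group_column}", '
        f'SUM(CASE WHEN "{metric_column}" IS NULL THEN 1 ELSE 0 END) '
        f'FROM "{table}" '
        f'GROUP BY "{group_column}" ORDER BY "{group_column}"'
    )
    return conn.execute(query).fetchall()

# Hypothetical source table for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (state TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        ("New Mexico", None),
        ("New Mexico", "a@example.com"),
        ("Ohio", None),
        ("Ohio", None),
    ],
)

# Each result row pairs a raw group value with its aggregate.
grouped = null_count_by_group(conn, "customers", "email", "state")
print(grouped)
```

The group labels ("New Mexico", "Ohio") are actual column values, so displaying the grouped result requires storing them alongside the counts.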
Can I disable all features that can send raw data?
Yes, with Data Restricted Mode. When Data Restricted Mode is enabled:
- No raw data is sent to Bigeye under any circumstance.
- All features in the above section that could transmit or display raw data are automatically disabled.
- Functionality is reduced for certain features such as table previews, grouped metrics, and AI-enhanced issue resolution.
This mode ensures complete prevention of raw data transmission but may limit functionality. Customers can fine-tune access and feature enablement through RBAC and per-feature configuration.
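The relationship between Data Restricted Mode and per-feature configuration can be modeled roughly as follows. This is a conceptual sketch only: the feature keys, the `effective_features` function, and the default of "off unless explicitly enabled" are assumptions for illustration and do not reflect Bigeye's actual configuration API.

```python
# Hypothetical feature keys, named after the raw-data features listed earlier.
RAW_DATA_FEATURES = (
    "table_previews",
    "grouped_metrics",
    "issue_debug_query_preview",
    "enhanced_issue_resolution",
)

def effective_features(data_restricted_mode, per_feature):
    """Return which raw-data features are effectively enabled.

    Data Restricted Mode overrides every per-feature setting. When it is
    off, each feature follows its own setting (assumed off if unspecified).
    """
    return {
        name: (not data_restricted_mode) and per_feature.get(name, False)
        for name in RAW_DATA_FEATURES
    }

# Hypothetical per-feature settings chosen by a customer.
settings = {"table_previews": True, "grouped_metrics": True}

unrestricted = effective_features(False, settings)
restricted = effective_features(True, settings)
print(unrestricted)
print(restricted)
```

With the mode off, only the explicitly enabled features are active; with the mode on, every raw-data feature is disabled regardless of its individual setting.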
