Sensitive Data Scanning Agent

Scan data sources for sensitive data like PII

The Sensitive Data Scanning (SDS) agent scans your data sources to detect sensitive data such as personally identifiable information (PII). It runs as a dual-container service: a Java agent for data collection and a Python service for classification.

Installation

# Run the installer and select SDS
./bigeye-agent install

After installation, add database connectors for the sources you want to scan:

./bigeye-agent add-connector -c snowflake

Commands

Start the Agent

./bigeye-agent sds start

Starts both the Java agent and Python service containers. The Python service is configured with a 12 GB memory limit and runs alongside the Java agent using shared networking.
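
To confirm that both containers came up and that a memory limit is in effect, you can query the standard Docker CLI directly. The following is a minimal sketch and is not part of the bigeye-agent CLI: the function names are hypothetical, and because the SDS container names are not documented here, the helpers list every running container rather than filtering by name.

```shell
# Hypothetical helpers (not part of bigeye-agent); they only call the standard Docker CLI.

# List running containers with image and status; look for the two SDS containers.
list_running_containers() {
    docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}'
}

# Print a one-off snapshot of memory usage and limit for each running container.
show_memory_usage() {
    docker stats --no-stream --format 'table {{.Name}}\t{{.MemUsage}}'
}
```

After sourcing these functions, run list_running_containers, then show_memory_usage; the MEM USAGE / LIMIT column should show the limit applied to the Python service.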

Stop the Agent

./bigeye-agent sds stop

Stops both the Java and Python containers.

View Logs

# View Java agent logs (default)
./bigeye-agent sds logs

# View Python service logs
./bigeye-agent sds logs -c python

# View Java agent logs explicitly
./bigeye-agent sds logs -c java

Option             Description
--container, -c    Which container's logs to view: java or python (default: java)
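
When the logs are long, it can help to filter them down to likely problems. The sketch below wraps the documented logs command in a hypothetical helper, sds_log_errors, and pipes its output through grep. It assumes the logs are plain text and that problem lines contain the words "error" or "exception"; the BIGEYE_AGENT variable is also an assumption, introduced only so the binary path can be overridden.

```shell
# Path to the CLI; override BIGEYE_AGENT if the binary lives elsewhere (assumed variable).
BIGEYE_AGENT="${BIGEYE_AGENT:-./bigeye-agent}"

# Hypothetical helper: print only log lines that mention errors or exceptions.
# $1 is the container to read: "java" (default) or "python".
sds_log_errors() {
    container="${1:-java}"
    case "$container" in
        java|python) ;;
        *) echo "usage: sds_log_errors [java|python]" >&2; return 2 ;;
    esac
    # Merge stderr into stdout so both streams are filtered.
    "$BIGEYE_AGENT" sds logs -c "$container" 2>&1 | grep -iE 'error|exception'
}
```

For example, sds_log_errors python shows matching lines from the Python service. If the logs command keeps streaming output, interrupt it with Ctrl-C when done.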

Update Configuration

# Run SDS configuration
./bigeye-agent sds configure

# Prompt for new configuration details even if configuration already exists
./bigeye-agent sds configure --overwrite

Option             Description
--overwrite, -o    Prompt for new configuration details even if configuration already exists (default: false)

Upgrade

./bigeye-agent sds upgrade

Updates the SDS agent to the latest Docker image versions for both the Java and Python containers.

Share Diagnostics

./bigeye-agent sds share-diagnostics
./bigeye-agent sds share-diagnostics -m "Scan not completing on large tables"

Uploads diagnostic logs from both containers to Bigeye for troubleshooting.

Option             Description
--message, -m      Message to include with the diagnostics for context
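
One way to use the message option is to upload diagnostics automatically when a start attempt fails. The wrapper below is a sketch under stated assumptions: sds_start_or_report and the BIGEYE_AGENT variable are hypothetical names, and the script assumes the CLI returns a non-zero exit status when sds start fails, which is not confirmed here.

```shell
# Path to the CLI; override BIGEYE_AGENT if the binary lives elsewhere (assumed variable).
BIGEYE_AGENT="${BIGEYE_AGENT:-./bigeye-agent}"

# Hypothetical wrapper: start the agent and, if the start command fails,
# upload diagnostics with a message describing the failure.
# Assumes the CLI exits non-zero on failure.
sds_start_or_report() {
    if "$BIGEYE_AGENT" sds start; then
        echo "SDS agent started"
    else
        status=$?  # exit status of the failed start command
        "$BIGEYE_AGENT" sds share-diagnostics -m "sds start failed with exit status $status"
        return "$status"
    fi
}
```

Calling sds_start_or_report from a deployment script keeps the original exit status, so the calling script can still detect the failure.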

Deployment Notes

  • The SDS agent is currently supported only for Docker deployments. For Kubernetes deployments, contact Bigeye support.
  • The agent runs two interconnected containers sharing a network. The Python service uses the Java container's network stack.
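
The shared network stack can be inspected with the standard Docker CLI. The helper below is a hypothetical sketch, not part of bigeye-agent. It assumes the sharing is implemented with Docker's container network mode, in which case the Python container would typically report a mode of the form container:<id>. Take the container name or ID from docker ps, since the SDS container names are not documented here.

```shell
# Hypothetical helper: print the Docker network mode of a container.
# $1 is a container name or ID taken from `docker ps`.
container_network_mode() {
    docker inspect --format '{{.HostConfig.NetworkMode}}' "$1"
}
```

Containers that share a network stack can generally reach each other on localhost, which is useful to know when reading connection errors in either container's logs.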