Agent Infrastructure Specifications

To ensure consistent performance for sensitive data scans, we recommend running the Data Sensitivity Module on a dedicated AWS c7i.2xlarge instance (8 vCPU, 16 GiB RAM).

Dedicated vs. Shared Deployment

Preferred: Deploy on a dedicated instance to isolate SDS workloads from other Bigeye agents.
Supported: The SDS agent can run alongside other agents (e.g., Cross-Source Agent or Data Source Agent), provided sufficient CPU and memory are reserved to prevent resource contention.

Sensitive data scans are CPU- and memory-intensive, particularly when ML detectors are enabled. Resource isolation improves scan predictability and throughput.

Required Containers

The Data Sensitivity Module requires two container images:

sds-agent
sds

Both containers must be deployed together to support scan execution.

Resource Allocation

Allocate the following to each container:

8 vCPU
16 GiB memory

If both containers run on the same instance, the host needs enough total capacity for both: at least 16 vCPU and 32 GiB RAM unless you oversubscribe.
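The arithmetic behind the host minimum can be sketched as a small check. This is only an illustration: the Resources type and function names are hypothetical, and the per-container figures are the ones stated in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    vcpu: int
    memory: int  # same unit as the memory figures in this section


# Per-container allocation stated in this section.
PER_CONTAINER = Resources(vcpu=8, memory=16)
CONTAINERS = ("sds-agent", "sds")


def required_capacity(containers=CONTAINERS, per_container=PER_CONTAINER) -> Resources:
    """Sum the per-container allocations for containers that share one host."""
    count = len(containers)
    return Resources(
        vcpu=per_container.vcpu * count,
        memory=per_container.memory * count,
    )


def fits_without_oversubscription(host: Resources) -> bool:
    """True if the host can give every container its full allocation."""
    need = required_capacity()
    return host.vcpu >= need.vcpu and host.memory >= need.memory


if __name__ == "__main__":
    print(required_capacity())
    print(fits_without_oversubscription(Resources(vcpu=16, memory=32)))
    print(fits_without_oversubscription(Resources(vcpu=8, memory=16)))
```

A host with 8 vCPU and 16 of memory fails this check, so running both containers on an instance of that size implies oversubscription.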

Networking Requirements

The sds-agent container communicates with the sds container over HTTP at localhost:5001.

This port is configurable if required.
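To confirm that something is listening on the expected port, a plain TCP probe is usually enough. The sketch below uses only the Python standard library and makes no assumption about the HTTP endpoints exposed by the sds container; the function name is hypothetical, and the probe must run from inside the same network namespace as the containers to be meaningful.

```python
import socket


def sds_port_reachable(host: str = "localhost", port: int = 5001,
                       timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    print("sds reachable:", sds_port_reachable())
```

If the port has been changed from the default, pass the configured value as the port argument.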

Deployment Requirement

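The two containers must run within the same network namespace:

On ECS, deploy both in the same task definition, using a network mode in which the containers of a task share localhost (awsvpc or host).
On Docker directly, have them share a network namespace, for example with host networking or by starting one container with --network container:<name> pointing at the other. Attaching both to the same Docker network is not sufficient on its own, because each container on that network still has its own localhost.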

If deployed in separate tasks or network namespaces, localhost communication will fail.
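For ECS, the single-task requirement could look roughly like the task definition built below. This is a sketch, not Bigeye's published task definition: the family name and image references are placeholders, the memory figure is treated as binary gigabytes, and the EC2 launch type is an assumption. The field names are standard ECS task definition parameters.

```python
import json

CPU_UNITS_PER_VCPU = 1024  # ECS expresses CPU in units; 1024 units = 1 vCPU
MIB_PER_GIB = 1024         # ECS expresses container memory in MiB


def container(name: str, image: str, vcpu: int = 8, memory_gib: int = 16) -> dict:
    """One container definition sized per the Resource Allocation figures."""
    return {
        "name": name,
        "image": image,
        "cpu": vcpu * CPU_UNITS_PER_VCPU,
        "memory": memory_gib * MIB_PER_GIB,
        "essential": True,
    }


def build_task_definition(agent_image: str, sds_image: str) -> dict:
    """Put both containers in one task so they share a network namespace."""
    return {
        "family": "bigeye-sds",  # hypothetical family name
        # In awsvpc mode, containers in the same task can reach each other
        # over localhost.
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["EC2"],  # assumption: EC2 launch type
        "containerDefinitions": [
            container("sds-agent", agent_image),
            container("sds", sds_image),
        ],
    }


if __name__ == "__main__":
    # Image references are placeholders; substitute the real ones.
    task = build_task_definition("<sds-agent-image>", "<sds-image>")
    print(json.dumps(task, indent=2))
```

The printed JSON can serve as a starting point for a task definition file. Any settings for the port or for connecting to Bigeye are deliberately left out, since they are not specified here.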

Setting Up the Agent

Follow the standard Bigeye agent setup instructions:

Agent Installation Guide

To simplify installation, you may use:

Bigeye Agent CLI Tool

The CLI can provision and configure the SDS components as part of the agent installation workflow.

Operational Considerations (Recommended for Enterprise Deployments)

Ensure outbound TLS connectivity to Bigeye infrastructure is allowed.
Confirm IAM roles/service accounts follow least-privilege principles.
Monitor CPU and memory utilization during initial scans to validate sizing.
For high-volume environments, consider horizontal scaling across multiple agent instances.
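As an illustration of validating sizing from monitoring data, the sketch below summarizes utilization samples collected during initial scans. How the samples are collected (CloudWatch, container stats, or another tool) is left open, and the 85% threshold is an arbitrary assumption rather than a Bigeye recommendation.

```python
from statistics import quantiles
from typing import Dict, Sequence


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Peak and 95th percentile of utilization samples (percent, 0 to 100)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # n=20 yields 19 cut points; the last one is the 95th percentile.
    p95 = quantiles(samples, n=20, method="inclusive")[-1]
    return {"peak": max(samples), "p95": p95}


def sizing_ok(cpu: Sequence[float], memory: Sequence[float],
              threshold: float = 85.0) -> bool:
    """True if both CPU and memory p95 stay at or below the threshold."""
    return (summarize(cpu)["p95"] <= threshold
            and summarize(memory)["p95"] <= threshold)


if __name__ == "__main__":
    cpu_samples = [42.0, 55.5, 61.0, 78.2, 70.4, 66.9]     # made-up values
    memory_samples = [35.0, 40.1, 48.7, 52.3, 50.0, 49.5]  # made-up values
    print(summarize(cpu_samples))
    print(sizing_ok(cpu_samples, memory_samples))
```

Using the 95th percentile instead of the peak keeps a single short spike from triggering a resize. Sustained values near the threshold suggest a larger instance or additional agent instances.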