Kubernetes Deployment

Deploy Bigeye agents on Kubernetes clusters

Some Bigeye agents can be deployed on Kubernetes instead of Docker. The Agent CLI generates Kubernetes YAML charts and provides step-by-step kubectl commands for deployment.

Supported Agents

Agent | Kubernetes Support
Data Source | Deployment chart generated
Lineage Plus | Job chart generated
DataHealth | CronJob chart generated
SDS | Not supported; use Docker
Cross-Source | Not supported; use Docker
External Monitors | Not supported; use Docker

For agents not yet supported on Kubernetes, contact Bigeye support.

Initial Setup

During ./bigeye-agent install, select Kubernetes as your deployment target. This sets deploy_on_kubernetes: true in your configuration and prompts for a Kubernetes namespace (default: bigeye).

Even with Kubernetes as the deployment target, you still run the Agent CLI locally to generate the configuration files. You then apply the generated YAML charts to your cluster.

Data Source Agent on Kubernetes

After running ./bigeye-agent source configure, a deployment chart is generated at bigeye-agent-deployment.yaml.

Prerequisites

# Create the namespace (use the one chosen during install; default: bigeye)
kubectl create namespace bigeye

# Create Docker registry secret
kubectl create secret docker-registry bigeye-docker-pat \
  --namespace bigeye \
  --docker-server=https://index.docker.io/v1/ \
  --docker-username=bigeyedata \
  --docker-password=<your-docker-pat>

# Create configmap from agent config directory
kubectl create configmap bigeye-agent-config \
  --namespace bigeye \
  --from-file=agent_config/

Deploy

kubectl apply -f bigeye-agent-deployment.yaml
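
After applying the chart, it can help to confirm that the agent actually started. The sketch below uses only standard kubectl commands. The bigeye namespace is the default from the install step; the KUBECTL variable and the check_agent function name are illustrative and not part of the Agent CLI.

```shell
# Illustrative helper (not part of the Agent CLI).
# KUBECTL can be overridden, e.g. KUBECTL=echo, to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

check_agent() {
  local ns="${1:-bigeye}"   # assumption: namespace chosen during install
  # List Deployments and their ready replica counts.
  "$KUBECTL" get deployments -n "$ns" || return 1
  # List pods; a healthy agent pod should reach STATUS Running.
  "$KUBECTL" get pods -n "$ns"
}
```

Call check_agent for the default namespace, or check_agent <namespace> for a custom one. A pod stuck in ImagePullBackOff often points to a problem with the registry secret created in the prerequisites.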

Lineage Plus Agent on Kubernetes

Partial Support: The Kubernetes deployment does not support scenarios where custom JARs are required for lineage collection, or configurations exceeding the 1 MiB Kubernetes ConfigMap size limit.
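
A rough way to tell whether a configuration is near the ConfigMap limit is to add up the sizes of the files you plan to include. The sketch below is an approximation: the limit applies to the ConfigMap's stored data as a whole, so the file total is a lower bound. The function names are hypothetical.

```shell
# Illustrative helpers (not part of the Agent CLI).
CONFIGMAP_LIMIT_BYTES=1048576   # 1 MiB

# Print the combined size, in bytes, of the files given as arguments.
configmap_payload_bytes() {
  local total=0 size f
  for f in "$@"; do
    size=$(wc -c < "$f") || return 1
    total=$((total + size))
  done
  echo "$total"
}

# Succeed only if the combined file size is below the 1 MiB limit.
fits_in_configmap() {
  local total
  total=$(configmap_payload_bytes "$@") || return 2
  [ "$total" -lt "$CONFIGMAP_LIMIT_BYTES" ]
}
```

For example, fits_in_configmap lineage_config/*.properties lineage_config/*.sh returns a non-zero status when those files alone would exceed the limit, in which case the Docker deployment is the safer choice.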

Prerequisites

  1. Install the Lineage Plus agent and add connectors using the CLI:
./bigeye-agent install       # Select LINEAGE_PLUS
./bigeye-agent add-connector  # Add your connectors

This creates the lineage_config/ directory with all necessary files.

  2. Download the Kubernetes chart:
# Ad hoc Job
wget https://bigeye-public-web.s3.amazonaws.com/lineage-plus-kubernetes.yaml

# CronJob (for scheduled runs)
wget https://bigeye-public-web.s3.us-west-2.amazonaws.com/lineage-plus-kubernetes-cronjob.yaml

Configure

  1. Update resource limits in the YAML to match the max_memory parameter from bigeye_agent.yml.
  2. Update log_dir in lineage_config/global_settings.sh to match the mountPath of the logs-vol volume.
  3. Create ConfigMaps from your configuration files:
# Example for Snowflake connector
kubectl create configmap -n bigeye tmp-lineage-plus-config \
  --from-file=lineage_config/connectors/snowflake/snowflake.properties \
  --from-file=lineage_config/connectors/snowflake/snowflake.sh \
  --from-file=lineage_config/lineage_plus.properties \
  --from-file=lineage_config/global_settings.sh \
  --from-file=lineage_config/system.properties \
  --from-file=lineage_config/lineage_plus.lic \
  --from-file=lineage_config/application-context.xml

File paths depend on the connector type. For example, PostgreSQL uses postgresql/postgresql.properties and postgresql/postgresql.sh. Check the lineage_config/connectors directory for the correct paths.

  4. Verify the container command in the YAML executes /app/lineage_plus/scripts/<connector_type>.sh and that initContainer mount paths reference the correct connector type.
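
The connector-specific parts of these steps can be sketched as two small shell helpers. Both function names are hypothetical. The first assumes the <type>/<type>.properties and <type>/<type>.sh layout shown above for Snowflake and PostgreSQL, so check lineage_config/connectors before relying on it. It only prints the command for review and does not run it.

```shell
# Illustrative helpers (not part of the Agent CLI).

# Print the kubectl command that builds the ConfigMap for one connector type.
print_configmap_cmd() {
  local connector="$1" ns="${2:-bigeye}" f
  printf 'kubectl create configmap -n %s tmp-lineage-plus-config' "$ns"
  for f in \
    "connectors/$connector/$connector.properties" \
    "connectors/$connector/$connector.sh" \
    lineage_plus.properties \
    global_settings.sh \
    system.properties \
    lineage_plus.lic \
    application-context.xml
  do
    printf ' \\\n  --from-file=lineage_config/%s' "$f"
  done
  printf '\n'
}

# Succeed only if the chart references the script for this connector type.
chart_runs_connector() {
  local chart="$1" connector="$2"
  grep -qF "/app/lineage_plus/scripts/$connector.sh" "$chart"
}
```

For example, print_configmap_cmd postgresql prints the PostgreSQL variant of the command, and chart_runs_connector lineage-plus-kubernetes.yaml postgresql fails if the chart still points at another connector's script. The initContainer mount paths still need a manual check.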

Deploy

# Apply the job (for scheduled runs, apply lineage-plus-kubernetes-cronjob.yaml instead)
kubectl apply -f lineage-plus-kubernetes.yaml

# View pods
kubectl get pods -n bigeye

# View logs
kubectl logs -n bigeye <pod-name>

# Delete the job when complete
kubectl delete -f lineage-plus-kubernetes.yaml
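
For an ad hoc run, waiting for the Job to finish before reading its logs avoids polling by hand. The sketch below uses the standard kubectl wait and kubectl logs commands. The function name and the KUBECTL override are illustrative, and the Job name is whatever metadata.name the downloaded chart sets, so it is passed in as an argument rather than assumed.

```shell
# Illustrative helper (not part of the Agent CLI).
# KUBECTL can be overridden, e.g. KUBECTL=echo, to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

# Wait for a Job to complete, then print its logs.
wait_and_log() {
  local job="$1" ns="${2:-bigeye}" timeout="${3:-600s}"
  "$KUBECTL" wait --for=condition=complete "job/$job" -n "$ns" --timeout="$timeout" || return 1
  "$KUBECTL" logs -n "$ns" "job/$job"
}
```

Usage: wait_and_log <job-name>. Note that waiting on the complete condition does not return early when a Job fails; in that case the command runs until the timeout, so pick a timeout that fits your lineage collection.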

DataHealth Agent on Kubernetes

The CLI generates a CronJob chart at bigeye-datahealth-cronjob.yaml with a default schedule of daily at 00:00 UTC. Edit the chart to customize the schedule before applying.

kubectl apply -f bigeye-datahealth-cronjob.yaml
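
To change the schedule before running the apply command above, the schedule line can be edited by hand or with a small helper like the one sketched here. It assumes the generated chart contains a standard CronJob line of the form schedule: "0 0 * * *", which has not been verified against the generated file. The function name is hypothetical.

```shell
# Illustrative helper (not part of the Agent CLI).
# Assumes the chart contains a standard CronJob line such as:
#   schedule: "0 0 * * *"
set_cron_schedule() {
  local chart="$1" schedule="$2" tmp rc
  tmp=$(mktemp) || return 1
  # Rewrite the value of the schedule: key, keeping its indentation.
  sed "s|^\([[:space:]]*schedule:\).*|\1 \"$schedule\"|" "$chart" > "$tmp" \
    && cat "$tmp" > "$chart"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}
```

For example, set_cron_schedule bigeye-datahealth-cronjob.yaml "0 6 * * *" moves the daily run from 00:00 to 06:00. The five fields are minute, hour, day of month, month, and day of week.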