Lineage Plus Agent

Managed by the Bigeye agent installer

Requirements

Pre-install checklist

  1. Set up a VM or host (Ubuntu 20.04+ or Red Hat Enterprise Linux (RHEL) 8+ preferred)

    • Minimum hardware size
      • 25 GB of RAM
      • 4 CPUs
      • 55 GB of disk space
    • Networking
      • Firewall access to the hostnames and URL paths listed below:
        1. app-metacenter-portal.bigeye.com
        2. app-metacenter-solr.bigeye.com
      • Firewall rules must NOT strip Authorization headers from requests to the host/domain names listed in this section.
      • Egress (outbound) access to the data sources whose lineage you want to track in Bigeye
      • Egress (outbound) access to the Bigeye SaaS environment
        • app.bigeye.com
      • Ingress (inbound) access to retrieve licenses from Bigeye for the Agent CLI
      • Access to pull images from docker.io
  2. Bigeye information (provided by Bigeye)

    • The company name associated with your agent installs
    • The password to authenticate to your tenant and get your associated Lineage Plus License
  3. Docker personal access token (PAT)

    • Provided by Bigeye
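
The minimum hardware size in the checklist can be checked on the host before installing. The script below is a minimal sketch and is not part of the Bigeye tooling: the check_min helper is hypothetical, the script assumes a Linux host with /proc/meminfo and nproc, and it assumes the 55 GB figure refers to free space on the filesystem that holds the current directory. Memory is rounded to the nearest GB because the kernel reports slightly less than the provisioned amount.

```shell
#!/usr/bin/env bash
# Illustrative pre-install check; thresholds are taken from the checklist above.
REQUIRED_RAM_GB=25
REQUIRED_CPUS=4
REQUIRED_DISK_GB=55

# check_min <label> <actual> <required>: print PASS or FAIL for one resource.
check_min() {
  if [ "$2" -ge "$3" ] 2>/dev/null; then
    echo "PASS $1: $2 (minimum $3)"
  else
    echo "FAIL $1: $2 (minimum $3)"
  fi
}

# Total memory in GB, rounded to the nearest GB (MemTotal is reported in kB).
ram_gb=$(awk '/^MemTotal:/ {printf "%d", $2 / 1024 / 1024 + 0.5}' /proc/meminfo 2>/dev/null)
# Number of CPUs available to this process.
cpus=$(nproc 2>/dev/null || echo 0)
# Available space in GB on the filesystem holding the current directory.
disk_gb=$(df -Pk . | awk 'NR == 2 {printf "%d", $4 / 1024 / 1024}')

check_min "RAM (GB)" "${ram_gb:-0}" "$REQUIRED_RAM_GB"
check_min "CPUs" "${cpus:-0}" "$REQUIRED_CPUS"
check_min "Disk (GB)" "${disk_gb:-0}" "$REQUIRED_DISK_GB"
```

Run it from the directory where the agent will be installed so that the disk check covers the right filesystem.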

Updating the Lineage Plus Agent in the CLI

Use the command below to update the Lineage Plus agent through the Agent CLI.

./bigeye-agent lineage upgrade

Run on Kubernetes

🚧

Partial Support

The steps below, and the chart provided, do not support scenarios where custom jars are required for lineage collection, or scenarios where any customization exceeds the size limit of a Kubernetes ConfigMap (1 MiB). Full support for Lineage Plus on Kubernetes is still in the development stage.
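
Whether a set of files stays under the ConfigMap size limit can be estimated before creating the ConfigMap. The sketch below is an illustration only: total_bytes and fits_in_configmap are hypothetical helper names, and the sum of file sizes is an approximation because the 1 MiB limit applies to the ConfigMap object as a whole, including keys and metadata. For that reason the check uses a strict less-than and should be read as a rough guide.

```shell
#!/usr/bin/env bash
# 1 MiB, the ConfigMap size limit mentioned above.
CONFIGMAP_LIMIT_BYTES=1048576

# total_bytes <file>...: print the combined size in bytes of the given files.
total_bytes() {
  local total=0 f size
  for f in "$@"; do
    size=$(wc -c < "$f") || return 1
    total=$((total + size))
  done
  echo "$total"
}

# fits_in_configmap <file>...: succeed if the combined size is under the limit.
fits_in_configmap() {
  local total
  total=$(total_bytes "$@") || return 2
  [ "$total" -lt "$CONFIGMAP_LIMIT_BYTES" ]
}

# Example (run after the lineage_config directory has been generated):
#   fits_in_configmap lineage_config/lineage_plus.properties lineage_config/system.properties \
#     && echo "within limit" || echo "too large or unreadable"
```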

To run on Kubernetes, the Agent CLI is required. Follow steps 1-3 of the Installation section, and then return here to complete the following prerequisites:

  1. Install the Lineage Plus agent.
  2. Add any connectors for lineage collection. This generates the files needed for lineage collection.

# Install the agent with the for-kubernetes flag (only valid for the Lineage Plus agent)
./bigeye-agent install --for-kubernetes

# Add the necessary connectors
./bigeye-agent add-connector

📘

What to expect

Running the install command creates a file called bigeye_agent.yml. This file stores information for Bigeye, the Lineage Plus agent, and connection details for the sources from which lineage will be collected.

The add-connector command will create a directory called lineage_config. Within that directory will be all the necessary files for the lineage process to run. These files will be used to run the process as a Kubernetes job.
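
A quick way to confirm that both commands produced what is described above is to check for the generated paths. This is a minimal sketch: check_generated is a hypothetical helper, and it assumes bigeye_agent.yml and lineage_config were both created in the directory passed to it (here, the directory where the Agent CLI was run).

```shell
#!/usr/bin/env bash
# check_generated <dir>: report which expected paths exist under <dir>;
# returns non-zero if any are missing.
check_generated() {
  local base=$1 missing=0 path
  for path in "$base/bigeye_agent.yml" \
              "$base/lineage_config" \
              "$base/lineage_config/connectors"; do
    if [ -e "$path" ]; then
      echo "found: $path"
    else
      echo "missing: $path"
      missing=1
    fi
  done
  return "$missing"
}

check_generated . || echo "Re-run the install and add-connector steps before continuing."

# List the connector types that add-connector generated, if any.
ls lineage_config/connectors 2>/dev/null || true
```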

Configure Kubernetes

Download the chart for Lineage Plus on Kubernetes

# Download the Kubernetes YAML (the namespace set in the file is bigeye)
wget https://bigeye-public-web.s3.amazonaws.com/lineage-plus-kubernetes.yaml

  1. Update the resource limits in lineage-plus-kubernetes.yaml to match the value entered during the Lineage Plus installation. This is the value of the max_memory parameter in bigeye_agent.yml.

  2. Update the log_dir variable in lineage_config/global_settings.sh to match the mountPath of the logs-vol volume mount of the container.

  3. Create a configMap of the necessary files.

    📘

    Each connector type is different

    The first two files listed in the command below have paths and file names that depend on the connector type. For example, a connector for Postgres would have postgresql in the path, with files named postgresql.properties and postgresql.sh. The example below shows a connector for Snowflake. Verify paths and names by looking at the lineage_config/connectors directory.

    # Example configMap for Snowflake
    kubectl create configmap -n bigeye tmp-lineage-plus-config \
      --from-file=lineage_config/connectors/snowflake/snowflake.properties \
      --from-file=lineage_config/connectors/snowflake/snowflake.sh \
      --from-file=lineage_config/lineage_plus.properties \
      --from-file=lineage_config/global_settings.sh \
      --from-file=lineage_config/system.properties \
      --from-file=lineage_config/lineage_plus.lic \
      --from-file=lineage_config/application-context.xml
      
    
  4. Verify in lineage-plus-kubernetes.yaml that the container command executes /app/lineage_plus/scripts/<connector_type>.sh and that the mount paths reference the correct connector type. These will look like mountPath: /app/lineage_plus/scripts/<connector_type>.sh and mountPath: /app/lineage_plus/config/snapshot/<connector_type>.properties.

  5. Run Lineage Plus

    # Apply the job
    kubectl apply -f lineage-plus-kubernetes.yaml
    
    # View pods
    kubectl get pods -n bigeye
    NAME                        READY   STATUS      RESTARTS   AGE
    bigeye-lineage-plus-dgwn3   0/1     Running     0          16m
    
    # View logs (add -f to tail)
    kubectl logs -n bigeye bigeye-lineage-plus-dgwn3
    
    # Delete the job when it completes
    kubectl delete -f lineage-plus-kubernetes.yaml
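
The configMap command in step 3 can be generated for other connector types by following the naming pattern described there. The sketch below only prints the command so it can be reviewed before running. It is an illustration, not part of the Bigeye tooling: configmap_files and print_configmap_command are hypothetical helpers, and the assumption that every connector follows the connectors/<connector_type>/<connector_type>.properties and .sh pattern should be verified against the lineage_config/connectors directory.

```shell
#!/usr/bin/env bash
# configmap_files <connector_type>: print the files for the ConfigMap, one per line.
# The first two paths assume the naming pattern shown for Snowflake and Postgres.
configmap_files() {
  local connector=$1
  printf '%s\n' \
    "lineage_config/connectors/$connector/$connector.properties" \
    "lineage_config/connectors/$connector/$connector.sh" \
    "lineage_config/lineage_plus.properties" \
    "lineage_config/global_settings.sh" \
    "lineage_config/system.properties" \
    "lineage_config/lineage_plus.lic" \
    "lineage_config/application-context.xml"
}

# print_configmap_command <connector_type>: print (do not run) the kubectl command.
print_configmap_command() {
  local connector=$1 file
  printf 'kubectl create configmap -n bigeye tmp-lineage-plus-config'
  configmap_files "$connector" | while IFS= read -r file; do
    printf ' \\\n  --from-file=%s' "$file"
  done
  printf '\n'
}

# Example: print the command for a Postgres connector.
print_configmap_command postgresql
```

Printing rather than executing keeps the sketch safe to run anywhere. Review the output, confirm that each file exists, and then run the printed command from the directory that contains lineage_config.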