Matillion ETL

The Matillion ETL connector extracts lineage from Matillion Orchestration and Transformation jobs. It supports both REST API integration and file-based operation, building lineage across Matillion pipelines at the table and column level.

Prerequisites

  • Matillion version 1.54.7 or higher with Enterprise Mode enabled
  • Matillion user credentials with API access
  • Supported dialects: Snowflake, Delta Lake on Databricks, Amazon Redshift

Supported Features

REST API-based metadata extraction

  • Connects to Matillion ETL via REST API using username/password authentication
  • Extracts groups, projects, jobs (Orchestration and Transformation), components, and steps
  • Retrieves lineage at table and field level where metadata is available
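
As a rough illustration of username/password access, the sketch below builds an HTTP Basic Authorization header and a URL for listing groups. The helper names are hypothetical, and the `/rest/<version>/group` path is an assumption based on the Matillion ETL v1 API path convention; the connector's actual requests are not documented here, so verify the path against your instance before relying on it.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build a standard HTTP Basic Authorization header (RFC 7617)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def groups_url(instance_url: str, api_version: str = "v1") -> str:
    """Assumed URL for listing groups; the path layout is not confirmed by this page."""
    return f"{instance_url.rstrip('/')}/rest/{api_version}/group"


if __name__ == "__main__":
    # Placeholder credentials for illustration only.
    print(groups_url("https://matillion.company.com"))
    print(basic_auth_header("bigeye_service", "example-password"))
```

The header and URL can be passed to any HTTP client; nothing in the sketch performs a network call.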

File-based lineage extraction

  • Runs the connector on exported job JSON files
  • Produces lineage results consistent with API-based extraction
  • Useful for POC scenarios or security-restricted environments

Internal lineage modeling

  • Tracks data flow within jobs and across jobs
  • Establishes source-to-target mapping relationships and directed transformation graphs
  • Uses SQL parsing to extract field-level lineage from embedded queries
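
The source-to-target model described above can be pictured as a directed graph whose nodes are tables or columns. The following is a minimal sketch, not the connector's internal implementation; the node names and the `build_graph` and `downstream` helpers are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Turn (source, target) pairs into an adjacency map of a directed graph."""
    graph = defaultdict(set)
    for source, target in edges:
        graph[source].add(target)
    return graph


def downstream(graph, start):
    """Return every node reachable from `start` using breadth-first search."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical column-level edges spanning two jobs.
EDGES = [
    ("raw.orders.amount", "stg.orders.amount"),    # job A
    ("stg.orders.amount", "mart.revenue.total"),   # job B
    ("stg.orders.tax", "mart.revenue.total"),      # job B
]

if __name__ == "__main__":
    print(sorted(downstream(build_graph(EDGES), "raw.orders.amount")))
```

Because edges from different jobs share node names, a single traversal follows data flow both within a job and across jobs.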

Known Limitations

  • Stored procedures — Lineage within stored procedures is not supported in the current release
  • Non-SQL components — Components with logic in arbitrary scripting languages are not parsed; only components that expose SQL or metadata are supported
  • File-based sources/sinks — Lineage for sources like Amazon S3 or other file storage is not yet supported

Configuration Parameters

Create a properties file (for example, matillion.properties) with your connection configuration:

| Property | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| environment.name.N | String | Yes | Environment identifier used to group projects | Prod |
| bigeye.host.N | URL | Yes | Bigeye instance URL | https://app.bigeye.com |
| bigeye.apikey.N | String | Yes | Bigeye API key | bigeye_pak_abc123 |
| bigeye.allowed.workspaces.N | Integer List | Yes | Comma-separated workspace IDs | 123 |
| matillion.instance.url.N | URL | Yes | URL of the Matillion ETL instance | https://matillion.company.com |
| matillion.api.version.N | String | No | Matillion API version | v1 |
| matillion.username.N | String | Yes | Matillion username | bigeye_service |
| matillion.password.N | String | Yes | Matillion password | |
| matillion.environment.N | String | No | Matillion environment name | Production |
| matillion.include.groups.N | String List | No | Groups to include (comma-separated) | ETL,Analytics |
| matillion.exclude.groups.N | String List | No | Groups to exclude (comma-separated) | Dev,Test |
| matillion.include.projects.N | String List | No | Projects to include (comma-separated) | DW_Load |
| matillion.exclude.projects.N | String List | No | Projects to exclude (comma-separated) | Sandbox |
| matillion.startTimestamp.N | Long | No | Start time in UTC milliseconds. Defaults to start of current day | 1700000000000 |
| matillion.endTimestamp.N | Long | No | End time in UTC milliseconds. Defaults to current time | 1700086400000 |
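
To make the timestamp defaults concrete, here is a small sketch that computes a start-of-day and current-time window in epoch milliseconds. The `default_window` helper is hypothetical, and it assumes "start of current day" means midnight UTC, which this page does not state explicitly.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since the Unix epoch."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def default_window(now=None):
    """Return (start, end) in epoch milliseconds: UTC midnight today through `now`."""
    now = now or datetime.now(timezone.utc)
    start_of_day = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return to_epoch_millis(start_of_day), to_epoch_millis(now)


if __name__ == "__main__":
    start, end = default_window()
    print(f"matillion.startTimestamp.1={start}")
    print(f"matillion.endTimestamp.1={end}")
```

Setting both values explicitly is useful when a run must cover a fixed historical window rather than the current day.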

Sample Properties File

environment.name.1=Matillion Production
bigeye.host.1=https://app.bigeye.com
bigeye.apikey.1=bigeye_pak_acbdefg123456
bigeye.allowed.workspaces.1=123
matillion.instance.url.1=https://matillion.company.com
matillion.username.1=bigeye_service
matillion.password.1=${MATILLION_PASSWORD}
matillion.environment.1=Production
matillion.include.groups.1=ETL,Analytics
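
Before running the connector, it can help to check a properties file for missing required keys. The sketch below is a hypothetical pre-flight check, not part of the Bigeye tooling. It assumes the trailing `.N` is a numeric index that groups the settings for one connection, as the `.1` suffix in the sample suggests, and it takes the required keys from the table above.

```python
import re
from collections import defaultdict

# Keys marked "Yes" in the Required column of the parameter table.
REQUIRED = (
    "environment.name",
    "bigeye.host",
    "bigeye.apikey",
    "bigeye.allowed.workspaces",
    "matillion.instance.url",
    "matillion.username",
    "matillion.password",
)

# Splits "matillion.username.1" into name "matillion.username" and index "1".
KEY_PATTERN = re.compile(r"^(?P<name>.+)\.(?P<index>\d+)$")


def parse_properties(text: str) -> dict:
    """Group key=value lines by their numeric suffix, skipping blanks and comments."""
    configs = defaultdict(dict)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        match = KEY_PATTERN.match(key.strip())
        if match:
            configs[int(match.group("index"))][match.group("name")] = value.strip()
    return dict(configs)


def missing_required(config: dict) -> list:
    """List required property names that are absent or empty in one config group."""
    return [name for name in REQUIRED if not config.get(name)]


if __name__ == "__main__":
    with open("matillion.properties", encoding="utf-8") as handle:
        for index, config in sorted(parse_properties(handle.read()).items()):
            print(index, missing_required(config) or "ok")
```

The parser handles only simple `key=value` lines; it does not resolve placeholders such as `${MATILLION_PASSWORD}` or support line continuations.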

Running the Connector

With the Agent CLI (recommended)

# Install and configure the Lineage Plus agent
./bigeye-agent install

# Add the Matillion connector
./bigeye-agent add-connector -c matillion

# Run the connector
./bigeye-agent lineage run -c matillion

With Docker

docker run --rm \
  -v /path/to/config:/app/config \
  --entrypoint bash bigeyedata/source-connector:latest \
  -c "bigeye-connector run -c matillion -p /app/config/matillion.properties"

Performance Considerations

  • The connector queries metadata once per job, so runtime grows with the number of jobs and may be noticeable in large environments
  • Field-level detail varies: some transformations expose fields clearly, while others (for example, SQL blocks) require parsing or inference
  • Lineage completeness depends on the level of detail Matillion exposes per component