Matillion ETL
The Matillion ETL connector extracts lineage from Matillion Orchestration and Transformation jobs. It supports both REST API integration and file-based operation, building lineage across Matillion pipelines at the table and column level.
Prerequisites
- Matillion version 1.54.7 or higher with Enterprise Mode enabled
- Matillion user credentials with API access
- Supported dialects: Snowflake, Delta Lake on Databricks, Amazon Redshift
Supported Features
REST API-based metadata extraction
- Connects to Matillion ETL via REST API using username/password authentication
- Extracts groups, projects, jobs (Orchestration and Transformation), components, and steps
- Retrieves lineage at table and field level where metadata is available
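As a rough illustration of the username/password authentication described above, the following Python sketch builds an HTTP Basic authorization header and the URLs for listing groups and projects. The `/rest/<version>/group` path layout is an assumption based on the public Matillion ETL v1 API and is not taken from this connector's documentation; the helper names are hypothetical.

```python
import base64
from urllib.parse import quote
from urllib.request import Request


def basic_auth_header(username: str, password: str) -> str:
    """Encode credentials as an HTTP Basic Authorization header value."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return f"Basic {token.decode('ascii')}"


def group_list_url(instance_url: str, api_version: str = "v1") -> str:
    """URL for listing groups. ASSUMPTION: <instance>/rest/<version>/group."""
    return f"{instance_url.rstrip('/')}/rest/{api_version}/group"


def project_list_url(instance_url: str, group: str, api_version: str = "v1") -> str:
    """URL for listing projects in a group (same assumed path layout)."""
    base = group_list_url(instance_url, api_version)
    return f"{base}/name/{quote(group, safe='')}/project"


def build_request(url: str, username: str, password: str) -> Request:
    """Prepare (but do not send) an authenticated GET request."""
    headers = {
        "Authorization": basic_auth_header(username, password),
        "Accept": "application/json",
    }
    return Request(url, headers=headers)
```

Passing the result of `build_request` to `urllib.request.urlopen` would issue the call. The connector's actual request sequence may differ.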
File-based lineage extraction
- Runs the connector on exported job JSON files
- Produces lineage results consistent with API-based extraction
- Useful for POC scenarios or security-restricted environments
Internal lineage modeling
- Tracks data flow within jobs and across jobs
- Establishes source-to-target mapping relationships and directed transformation graphs
- Uses SQL parsing to extract field-level lineage from embedded queries
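To make the idea of source-to-target mappings in a directed graph concrete, here is a minimal Python sketch. It is an illustration only, not the connector's internal model; the `LineageGraph` class and the column names are hypothetical.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Directed graph whose edges point from a source field to a target field."""

    def __init__(self):
        self._downstream = defaultdict(set)

    def add_edge(self, source: str, target: str) -> None:
        """Record that `target` is derived from `source`."""
        self._downstream[source].add(target)

    def downstream(self, start: str) -> set:
        """Return every node reachable from `start` (breadth-first)."""
        seen = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in self._downstream.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        seen.discard(start)  # a cycle could lead back to the start node
        return seen


# Hypothetical column-level edges spanning two jobs
graph = LineageGraph()
graph.add_edge("raw.orders.amount", "staging.orders.amount")
graph.add_edge("staging.orders.amount", "mart.revenue.total")
graph.add_edge("raw.orders.currency", "staging.orders.amount")
impacted = graph.downstream("raw.orders.amount")
```

A traversal like this answers impact questions such as "which fields are affected if `raw.orders.amount` changes?".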
Known Limitations
- Stored procedures — Lineage within stored procedures is not supported in the current release
- Non-SQL components — Components with logic in arbitrary scripting languages are not parsed; only components that expose SQL or metadata are supported
- File-based sources/sinks — Lineage for sources like Amazon S3 or other file storage is not yet supported
Configuration Parameters
Create a properties file (for example, matillion.properties) with your connection configuration:
| Property | Type | Required | Description | Example |
|---|---|---|---|---|
| environment.name.N | String | Yes | Environment identifier used to group projects | Prod |
| bigeye.host.N | URL | Yes | Bigeye instance URL | https://app.bigeye.com |
| bigeye.apikey.N | String | Yes | Bigeye API key | bigeye_pak_abc123 |
| bigeye.allowed.workspaces.N | Integer List | Yes | Comma-separated workspace IDs | 123 |
| matillion.instance.url.N | URL | Yes | URL of the Matillion ETL instance | https://matillion.company.com |
| matillion.api.version.N | String | No | Matillion API version | v1 |
| matillion.username.N | String | Yes | Matillion username | bigeye_service |
| matillion.password.N | String | Yes | Matillion password | |
| matillion.environment.N | String | No | Matillion environment name | Production |
| matillion.include.groups.N | String List | No | Groups to include (comma-separated) | ETL,Analytics |
| matillion.exclude.groups.N | String List | No | Groups to exclude (comma-separated) | Dev,Test |
| matillion.include.projects.N | String List | No | Projects to include (comma-separated) | DW_Load |
| matillion.exclude.projects.N | String List | No | Projects to exclude (comma-separated) | Sandbox |
| matillion.startTimestamp.N | Long | No | Start time in epoch milliseconds (UTC). Defaults to the start of the current day | 1700000000000 |
| matillion.endTimestamp.N | Long | No | End time in epoch milliseconds (UTC). Defaults to the current time | 1700086400000 |
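The table above can be checked before a run with a small pre-flight script. The Python sketch below parses a properties file, groups keys by their numeric `.N` suffix, reports missing required properties, and computes the default time window. It is a hypothetical helper, not part of the connector, and it assumes "start of current day" means midnight UTC.

```python
from datetime import datetime, timezone

# Properties marked "Yes" in the Required column above (without the .N suffix)
REQUIRED = (
    "environment.name", "bigeye.host", "bigeye.apikey",
    "bigeye.allowed.workspaces", "matillion.instance.url",
    "matillion.username", "matillion.password",
)


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def group_by_index(props: dict) -> dict:
    """Group 'some.key.N' entries into {N: {'some.key': value}}."""
    grouped = {}
    for key, value in props.items():
        base, _, suffix = key.rpartition(".")
        if suffix.isdigit():
            grouped.setdefault(int(suffix), {})[base] = value
    return grouped


def missing_required(cfg: dict) -> list:
    """Return required property names that are absent or empty."""
    return [name for name in REQUIRED if not cfg.get(name)]


def default_window_ms(now=None) -> tuple:
    """(start, end) in epoch milliseconds. ASSUMPTION: day starts at 00:00 UTC."""
    now = now or datetime.now(timezone.utc)
    start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return int(start.timestamp() * 1000), int(now.timestamp() * 1000)
```

For example, `missing_required(group_by_index(parse_properties(text))[1])` returns an empty list when every required property is set for configuration 1.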
Sample Properties File
environment.name.1=Matillion Production
bigeye.host.1=https://app.bigeye.com
bigeye.apikey.1=bigeye_pak_acbdefg123456
bigeye.allowed.workspaces.1=123
matillion.instance.url.1=https://matillion.company.com
matillion.username.1=bigeye_service
matillion.password.1=${MATILLION_PASSWORD}
matillion.environment.1=Production
matillion.include.groups.1=ETL,Analytics
Running the Connector
With the Agent CLI (recommended)
# Install and configure the Lineage Plus agent
./bigeye-agent install
# Add the Matillion connector
./bigeye-agent add-connector -c matillion
# Run the connector
./bigeye-agent lineage run -c matillion
With Docker
docker run --rm \
-v /path/to/config:/app/config \
--entrypoint bash bigeyedata/source-connector:latest \
-c "bigeye-connector run -c matillion -p /app/config/matillion.properties"
Performance Considerations
- Job metadata is queried once per job, so runtime grows with the number of jobs and may be noticeable in large environments
- Field-level detail varies: some transformations expose fields clearly, while others (for example, SQL blocks) require parsing or inference
- Lineage completeness depends on the level of detail Matillion exposes per component
