Matillion ETL
The Matillion ETL connector extracts lineage from Matillion Orchestration and Transformation jobs. It supports both live API integration and file-based operation, building lineage across Matillion pipelines at the table and column level.
Prerequisites
- Matillion version 1.54.7 or higher with Enterprise Mode enabled
- Matillion user credentials with API access
- Supported dialects: Snowflake, Delta Lake on Databricks, Amazon Redshift
Supported Features
REST API-based metadata extraction
- Connects to Matillion ETL via REST API using username/password authentication
- Extracts groups, projects, jobs (Orchestration and Transformation), components, and steps
- Retrieves lineage at table and field level where metadata is available
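The username/password connection described above can be illustrated with a minimal sketch. It assumes HTTP Basic authentication and a `/rest/<version>/group` endpoint layout for listing groups. Both are assumptions for illustration and should be checked against your Matillion instance. The function `build_group_request` is a hypothetical helper, not part of the connector. The request is only built, never sent.

```python
import base64
import urllib.request


def build_group_request(instance_url: str, username: str, password: str,
                        api_version: str = "v1") -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request for the group list."""
    # Assumed endpoint layout: <instance>/rest/<version>/group
    url = f"{instance_url.rstrip('/')}/rest/{api_version}/group"
    # Assumed auth scheme: HTTP Basic, i.e. base64("username:password")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


# Placeholder values; read the real password from a secret store or environment variable.
request = build_group_request(
    "https://matillion.company.com", "bigeye_service", "example-password"
)
print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the call. A successful response is a quick way to confirm that the service account has API access before running the connector.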
File-based lineage extraction
- Runs the connector on exported job JSON files
- Produces lineage results consistent with API-based extraction
- Useful for POC scenarios or security-restricted environments
Internal lineage modeling
- Tracks data flow within jobs and across jobs
- Establishes source-to-target mapping relationships and directed transformation graphs
- Uses SQL parsing to extract field-level lineage from embedded queries
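The idea of source-to-target mappings forming a directed graph can be sketched as follows. This is a conceptual illustration only, not the connector's internal implementation. The table names and the helpers `build_graph` and `downstream` are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical source-to-target edges, as might be derived from two jobs.
edges = [
    ("raw.orders", "staging.orders"),
    ("staging.orders", "analytics.daily_revenue"),
    ("raw.customers", "staging.customers"),
    ("staging.customers", "analytics.daily_revenue"),
]


def build_graph(edges):
    """Map each source table to the set of tables it feeds directly."""
    graph = defaultdict(set)
    for source, target in edges:
        graph[source].add(target)
    return graph


def downstream(graph, start):
    """Return every table reachable from `start` (breadth-first traversal)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in graph.get(node, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


graph = build_graph(edges)
print(sorted(downstream(graph, "raw.orders")))
# ['analytics.daily_revenue', 'staging.orders']
```

Because edges from different jobs share table names as nodes, a single traversal follows data flow both within a job and across jobs.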
Known Limitations
- Stored procedures — Lineage within stored procedures is not supported in the current release
- Non-SQL components — Components with logic in arbitrary scripting languages are not parsed; only components that expose SQL or metadata are supported
- File-based sources/sinks — Lineage for sources like Amazon S3 or other file storage is not yet supported
Configuration Parameters
Create a properties file (for example, matillion.properties) with your connection configuration:
| Property | Type | Required | Description | Example |
|---|---|---|---|---|
| environment.name.N | String | Yes | Environment identifier used to group projects | Prod |
| bigeye.host.N | URL | Yes | Bigeye instance URL | https://app.bigeye.com |
| bigeye.apikey.N | String | Yes | Bigeye API key | bigeye_pak_abc123 |
| bigeye.allowed.workspaces.N | Integer List | Yes | Comma-separated workspace IDs | 123 |
| matillion.instance.url.N | URL | Yes | URL of the Matillion ETL instance | https://matillion.company.com |
| matillion.api.version.N | String | No | Matillion API version | v1 |
| matillion.username.N | String | Yes | Matillion username | bigeye_service |
| matillion.password.N | String | Yes | Matillion password | |
| matillion.environment.N | String | No | Matillion environment name | Production |
| matillion.include.groups.N | String List | No | Groups to include (comma-separated) | ETL,Analytics |
| matillion.exclude.groups.N | String List | No | Groups to exclude (comma-separated) | Dev,Test |
| matillion.include.projects.N | String List | No | Projects to include (comma-separated) | DW_Load |
| matillion.exclude.projects.N | String List | No | Projects to exclude (comma-separated) | Sandbox |
| matillion.startTimestamp.N | Long | No | Start time in UTC milliseconds. Defaults to start of current day | 1700000000000 |
| matillion.endTimestamp.N | Long | No | End time in UTC milliseconds. Defaults to current time | 1700086400000 |
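The two timestamp properties take milliseconds since the Unix epoch. As a minimal sketch, the values can be computed with the Python standard library. The helpers `to_epoch_millis` and `default_window` are hypothetical names. The sketch also assumes that "start of current day" means midnight UTC, which the table does not state explicitly.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def default_window(now: datetime) -> tuple[int, int]:
    """Return (start, end) mirroring the documented defaults, assuming a UTC day."""
    now_utc = now.astimezone(timezone.utc)
    start_of_day = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return to_epoch_millis(start_of_day), to_epoch_millis(now_utc)


start_ms, end_ms = default_window(datetime.now(timezone.utc))
print(f"matillion.startTimestamp.1={start_ms}")
print(f"matillion.endTimestamp.1={end_ms}")
```

The printed lines can be pasted into the properties file when an explicit window is preferred over the defaults. For reference, the example value 1700000000000 corresponds to 2023-11-14 22:13:20 UTC.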
Sample Properties File
environment.name.1=Matillion Production
bigeye.host.1=https://app.bigeye.com
bigeye.apikey.1=bigeye_pak_acbdefg123456
bigeye.allowed.workspaces.1=123
matillion.instance.url.1=https://matillion.company.com
matillion.username.1=bigeye_service
matillion.password.1=${MATILLION_PASSWORD}
matillion.environment.1=Production
matillion.include.groups.1=ETL,Analytics
Running the Connector
With the Agent CLI (recommended)
# Install and configure the Lineage Plus agent
./bigeye-agent install
# Add the Matillion connector
./bigeye-agent add-connector -c matillion
# Run the connector
./bigeye-agent lineage run -c matillion
With Docker
docker run --rm \
-v /path/to/config:/app/config \
--entrypoint bash bigeyedata/source-connector:latest \
-c "bigeye-connector run -c matillion -p /app/config/matillion.properties"
Performance Considerations
- Job metadata queries are made per job, which may impact runtime for large environments
- Field-level detail varies: some transformations expose fields clearly, while others (for example, SQL blocks) require parsing or inference
- Lineage completeness depends on the level of detail Matillion exposes per component
