Bigeye metrics can be run as part of your Airflow DAG. This is done by adding an extra task, after a table has been loaded, that runs all the checks defined on that table. If you want to include your Bigeye metrics as part of your ETL definition, you can also use the operators to create or update the metric definitions.
You can use our plugin by adding a reference to our GitHub project in your `requirements.txt`:
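The original page's install line isn't shown here; as a sketch, a pip-style Git reference in `requirements.txt` generally looks like the following (the repository URL below is an assumption, so substitute the actual Bigeye Airflow project URL):

```
# Hypothetical repository URL -- replace with the real Bigeye Airflow project
git+https://github.com/bigeyedata/bigeye-airflow.git
```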
In order to communicate with Bigeye, you will have to define an HTTP connection in Airflow; the login and password are the same ones you use to log into the Bigeye UI:
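For example, one way to create the connection is with the Airflow 2.x CLI, as sketched below. The connection name matches the `bigeye_connection` used later in this page; the host URL is an assumption, so use whatever URL you normally use to reach the Bigeye UI:

```bash
# Host shown is an assumption -- point it at your Bigeye instance.
airflow connections add bigeye_connection \
    --conn-type http \
    --conn-host https://app.bigeye.com \
    --conn-login your-bigeye-username \
    --conn-password your-bigeye-password
```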
All operations run against a specific warehouse, so you will need to provide the warehouse ID. The easiest way to find it is to log into the Bigeye UI and navigate to the warehouse of interest; in the screenshot below, the ID is 414:
You can use the `RunMetricsOperator` to run all the metrics defined on a table. This operator raises a `ValueError` on failure and logs how many and which metrics failed.
In order to run all the metrics on the `users` table in the `analytics` schema from the warehouse seen above, using the `bigeye_connection` connection that we've defined, you can build the operator as follows:
```python
from bigeye_airflow.operators.run_metrics_operator import RunMetricsOperator

...

run_metrics = RunMetricsOperator(
    task_id='run_bigeye_metrics',
    connection_id='bigeye_connection',  # the Airflow HTTP connection defined above
    warehouse_id=414,                   # the warehouse ID from the Bigeye UI
    schema_name="analytics",
    table_name="users",
    dag=dag,
)
```
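Since the operator raises a `ValueError` when metrics fail, wiring it downstream of your load task lets a bad load block the rest of the DAG. For example, assuming a hypothetical `load_users` task that loads the table: `load_users >> run_metrics`.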