Bigeye metrics can be run as part of your Airflow DAG. This is done by adding an extra task, after a table has been loaded, that runs all the checks defined on that table. If you want to include your Bigeye metrics as part of your ETL definition, you can also use the operators to create or update the metric definitions.
You can use our plugin by adding a reference to our GitHub project in your `requirements.txt`:
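The original page's install line isn't shown here; as a sketch, a pip-style Git reference in `requirements.txt` generally looks like the following (the repository URL below is an assumption, so substitute the actual Bigeye Airflow project URL):

```
# Hypothetical repository URL -- replace with the real Bigeye Airflow project
git+https://github.com/bigeyedata/bigeye-airflow.git
```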
In order to communicate with Bigeye, you will have to define an HTTP connection in Airflow; the login and password are the same ones you use to log into the Bigeye UI:
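For example, one way to create the connection is with the Airflow 2.x CLI, as sketched below. The connection name matches the `bigeye_connection` used later in this page; the host URL is an assumption, so use whatever URL you normally use to reach the Bigeye UI:

```bash
# Host shown is an assumption -- point it at your Bigeye instance.
airflow connections add bigeye_connection \
    --conn-type http \
    --conn-host https://app.bigeye.com \
    --conn-login your-bigeye-username \
    --conn-password your-bigeye-password
```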
All operations run against a specific warehouse, so you will need to provide the warehouse ID. The easiest way to find it is to log into the Bigeye UI and navigate to the warehouse of interest; in the screenshot below, the ID is 414:
You can use the `RunMetricsOperator` to run all the metrics defined on a table. This operator raises a `ValueError` on failure and logs how many and which metrics failed.
In order to run all the metrics on the `users` table in the `analytics` schema from the warehouse seen above, using the `bigeye_connection` connection that we've defined, you can build the operator as follows:
```python
from bigeye_airflow.operators.run_metrics_operator import RunMetricsOperator

...

run_metrics = RunMetricsOperator(
    task_id='run_bigeye_metrics',
    connection_id='bigeye_connection',  # the Airflow HTTP connection defined above
    warehouse_id=414,                   # the warehouse ID from the Bigeye UI
    schema_name="analytics",
    table_name="users",
    dag=dag,
)
```
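Since the operator raises a `ValueError` when metrics fail, wiring it downstream of your load task lets a bad load block the rest of the DAG. For example, assuming a hypothetical `load_users` task that loads the table: `load_users >> run_metrics`.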