Article·airflow.apache.org
workflow · dag · scheduling · mlops · pipeline · apache · airflow
Apache Airflow (ML Edition)
Orchestrate ML workflows with Apache Airflow. This Action Pack guides you through setting up an Airflow DAG to trigger a simple machine learning task, showcasing Airflow's capabilities in MLOps.
beginner · 30 min · 4 steps
The play
- Install Apache Airflow: Install Airflow using pip. We'll use the 'apache-airflow' package with the 'amazon' extra for AWS integration.
- Configure Airflow: Initialize the Airflow metadata database. This creates the tables Airflow needs to track DAGs, runs, and task states.
- Create a Simple DAG: Create a DAG file (e.g., `ml_pipeline.py`) in your Airflow DAGs folder. This DAG defines a single placeholder task that stands in for an ML step (here, printing a message).
- Run the DAG: Unpause the DAG in the Airflow UI and trigger a DAG run, then monitor the task's execution from the same UI.
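The steps above can also be sketched as command-line calls. The snippet below is a minimal, hedged sketch rather than an official setup script: the helper names `build_step_commands` and `run_steps` are hypothetical, the CLI equivalents of the UI actions (`airflow dags unpause`, `airflow dags trigger`) are an assumption about how you might prefer to work, and `airflow db init` assumes Airflow 2.x (from 2.7 on, `airflow db migrate` replaces it). Creating the DAG file itself (step 3) is not a command, so it is not listed.

```python
import subprocess
import sys


def build_step_commands(dag_id):
    """Return one CLI command per scripted step, in order (hypothetical helper)."""
    return [
        # Step 1: install Airflow with the 'amazon' extra for AWS integration.
        [sys.executable, "-m", "pip", "install", "apache-airflow[amazon]"],
        # Step 2: initialize the metadata database (Airflow 2.x command).
        ["airflow", "db", "init"],
        # Step 4: CLI equivalents of unpausing and triggering in the UI.
        ["airflow", "dags", "unpause", dag_id],
        ["airflow", "dags", "trigger", dag_id],
    ]


def run_steps(dag_id, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in build_step_commands(dag_id):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)
```

Calling `run_steps("simple_ml_pipeline")` only prints the commands; pass `dry_run=False` to execute them. A triggered run will only execute if an Airflow scheduler is running.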
Starter code
from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime
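# schedule_interval=None means the DAG only runs when triggered manually
# (Airflow 2.4+ prefers the equivalent schedule=None).
with DAG('simple_ml_pipeline', start_date=datetime(2023, 1, 1), schedule_interval=None, catchup=False) as dag:
    # Placeholder task: replace the echo with a real training or scoring command.
    task1 = BashOperator(task_id='print_message', bash_command='echo "Running ML Task!"')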
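The print task in the starter code is only a stand-in. As a rough sketch of what a first real step might look like, the variant below swaps the `BashOperator` for a `PythonOperator` that fits a straight line to toy data. Everything specific here is an assumption for illustration: the `train_model` function, the toy data, the DAG id `simple_ml_pipeline_python`, and the use of `schedule=None` (which assumes Airflow 2.4 or later). The Airflow import is guarded so the training function can be run and tested on a machine without Airflow installed.

```python
from datetime import datetime


def train_model():
    """Hypothetical training step: least-squares fit of y = slope * x + intercept."""
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0, 4.0, 6.0, 8.0]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    variance = sum((x - x_mean) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_mean - slope * x_mean
    # The returned dict is stored by Airflow as the task's XCom value.
    return {"slope": slope, "intercept": intercept}


try:
    from airflow import DAG
    from airflow.operators.python import PythonOperator
except ImportError:
    # Airflow is not installed; only train_model() is usable.
    DAG = None

if DAG is not None:
    with DAG(
        'simple_ml_pipeline_python',  # hypothetical DAG id
        start_date=datetime(2023, 1, 1),
        schedule=None,  # manual triggers only; assumes Airflow 2.4+
        catchup=False,
    ) as dag:
        train = PythonOperator(task_id='train_model', python_callable=train_model)
```

Keeping the training logic in a plain function means it can be checked with ordinary unit tests before it is scheduled, and the operator only handles the wiring.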