Track Your First ML Experiment with MLflow

Use MLflow to log parameters, metrics, and models from a machine learning experiment. This action pack shows how to instrument a Python script and view the results in the MLflow UI, making your work reproducible and comparable.

Beginner · 15 min · 5 steps

The play

Install MLflow
Install MLflow and scikit-learn with pip: `pip install mlflow scikit-learn`. MLflow is a Python library, and we'll use scikit-learn to train a simple model for demonstration.
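To confirm the install worked before moving on, a quick check like the following may help. It is a minimal sketch that uses only the standard library; the helper name `installed_versions` is hypothetical and not part of MLflow.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("mlflow", "scikit-learn")):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Report what is available in the current Python environment
versions = installed_versions()
for name, ver in versions.items():
    print(f"{name}: {ver if ver else 'not installed'}")
```

If either package shows as not installed, rerun the pip command in the same environment you use to run the script.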
Create a Training Script
Write a basic Python script to train a model. We'll use a Logistic Regression model on the Iris dataset. This script forms the foundation of the experiment we will track.
Log Parameters and Metrics
Use `mlflow.start_run()` to begin tracking. Inside this block, log hyperparameters like regularization strength with `mlflow.log_param()` and performance results like accuracy with `mlflow.log_metric()`.
Log the Model as an Artifact
Within the same `mlflow.start_run()` block, save the trained model using `mlflow.sklearn.log_model()`. This stores the serialized model alongside metadata that records its dependencies, making it easy to reload and deploy later.
Launch the MLflow UI
After your script runs, it creates an `mlruns` directory in the working directory. Launch the MLflow Tracking UI by running `mlflow ui` from that same directory in your terminal, then open http://127.0.0.1:5000 in your browser to inspect and compare your runs.

Starter code

import mlflow
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Set an experiment name
mlflow.set_experiment("Iris Classification")

# Load data
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Start an MLflow run
with mlflow.start_run():
    # Define hyperparameters
    C = 0.1
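    # lbfgs handles the three-class Iris problem natively; recent scikit-learn
    # releases warn when liblinear is used for multiclass data
    solver = 'lbfgs'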

    # Log parameters
    mlflow.log_param("regularization_strength_C", C)
    mlflow.log_param("solver", solver)

    # Train the model
    model = LogisticRegression(C=C, solver=solver, max_iter=200)
    model.fit(X_train, y_train)

    # Make predictions and evaluate
    predictions = model.predict(X_test)
    accuracy = accuracy_score(y_test, predictions)

    # Log metrics
    mlflow.log_metric("accuracy", accuracy)

    # Log the model artifact
    mlflow.sklearn.log_model(model, "iris_logistic_regression")

    run_id = mlflow.active_run().info.run_id
    print(f"Run completed. Run ID: {run_id}")
    print(f"Accuracy: {accuracy}")
    print("\nTo view the run, execute 'mlflow ui' in your terminal and navigate to http://127.0.0.1:5000")