Article
time-series, forecasting, python, darts, prediction, machine-learning, statistical-modeling
Time-Series Forecasting with the Darts Python Library
Use the Darts library for time-series forecasting in Python. This guide shows how to load data, train a simple statistical model like Exponential Smoothing, evaluate it on held-out data, and then try a gradient boosting model and compare the two forecasts.
intermediate | 30 min | 5 steps
The play
- Install Darts and Load Data: First, install the Darts library and its dependencies. We'll use a LightGBM model later, so we install the `lgb` extras. Then, load a built-in sample dataset, like the monthly air passengers, into a Darts `TimeSeries` object.
- Split Data and Train a Baseline Model: To evaluate your forecast, you must set aside a validation set. Split the series into a training set (the first 75% of the data) and a validation set (the remaining 25%). Then, train a simple, fast baseline model like `ExponentialSmoothing` on the training data.
- Generate and Evaluate a Forecast: Use the trained model to predict as many future steps as there are points in your validation set. Then score the forecast against the actual validation data with a metric such as Mean Absolute Percentage Error (MAPE), available in `darts.metrics`.
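- Train a Gradient Boosting Model: Now, train a machine learning model like `LightGBMModel`. Darts regression models such as this one are configured with lags rather than an input chunk length: the `lags` argument sets how many past values are used as input features (the lookback), and `output_chunk_length` sets how many steps the model predicts in one pass (the forecast horizon per call).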
- Compare Model Performance: Generate a forecast with the new LightGBM model and calculate its MAPE. You can then plot the actual data against the forecasts from both models to visually compare their performance and see how well they captured the series' trend and seasonality.
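To make the lookback and forecast horizon from the gradient boosting step concrete, here is a minimal pure-Python sketch of how a series can be cut into training rows for a tabular model. The names `make_windows`, `lookback`, and `horizon` are hypothetical and used only for illustration; Darts builds its own features internally, and this is not its implementation.

```python
def make_windows(values, lookback, horizon):
    """Cut a sequence into (past window, future targets) training pairs."""
    rows = []
    # Each window needs `lookback` inputs followed by `horizon` targets.
    for start in range(len(values) - lookback - horizon + 1):
        past = values[start:start + lookback]
        future = values[start + lookback:start + lookback + horizon]
        rows.append((past, future))
    return rows


# Toy data, not the air passengers series
toy_series = [10, 12, 14, 16, 18, 20]
windows = make_windows(toy_series, lookback=3, horizon=1)
for past, future in windows:
    print(past, "->", future)
# [10, 12, 14] -> [16]
# [12, 14, 16] -> [18]
# [14, 16, 18] -> [20]
```

A longer lookback gives the model more context per row but leaves fewer rows to train on, which matters for a short series like the monthly air passengers data.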
Starter code
import matplotlib.pyplot as plt
from darts import TimeSeries
from darts.datasets import AirPassengersDataset
from darts.models import ExponentialSmoothing
from darts.metrics import mape
# 1. Load data into a Darts TimeSeries object
print("Loading Air Passengers dataset...")
series = AirPassengersDataset().load()
# 2. Split into training and validation sets
# We'll train on the first 75% and validate on the remaining 25%
train, val = series.split_before(0.75)
# 3. Create and train a simple model
print("Training Exponential Smoothing model...")
model = ExponentialSmoothing()
model.fit(train)
# 4. Generate a forecast for the validation period
# The 'n' parameter is the number of time steps to predict
print("Generating forecast...")
prediction = model.predict(n=len(val))
# 5. Evaluate the forecast
error_mape = mape(val, prediction)
print(f"MAPE on validation set: {error_mape:.2f}%")
# 6. Plot the results
print("Plotting actuals vs. forecast...")
series.plot(label='Actual')
prediction.plot(label='Forecast')
plt.title(f'Air Passengers Forecast (MAPE: {error_mape:.2f}%)')
plt.legend()
plt.show()