Enhancing AI and Dynamical Subseasonal Forecasts with Probabilistic Bias Correction

Improve subseasonal weather forecasts, those with lead times beyond two weeks, by applying probabilistic bias correction to both AI and traditional dynamical models. Correcting systematic errors makes the forecasts more accurate and reliable, which matters for planning in agriculture, energy, and disaster management.

Intermediate · 1 hour · 6 steps

The play

Understand Subseasonal Forecast Limitations
Recognize the inherent challenges in achieving accurate weather forecasts beyond the two-week horizon and their critical importance for various decision-making sectors.
Identify Raw Forecast Model Outputs
Pinpoint the specific outputs from your AI (e.g., neural networks, ML models) and dynamical (e.g., GCMs, ensemble predictions) subseasonal forecasting models that require post-processing.
Select a Probabilistic Bias Correction Method
Choose an appropriate probabilistic bias correction technique (e.g., Quantile Mapping, Ensemble Model Output Statistics (EMOS), Bayesian Model Averaging) based on your forecast data characteristics and desired output (e.g., corrected mean, full predictive distribution).
Implement Bias Correction Training
Apply the selected method to historical raw forecasts and corresponding observations to train the bias correction model. This step learns the systematic errors and their statistical properties.
Apply Correction to New Forecasts
Integrate the trained bias correction model into your forecasting pipeline to adjust new, real-time subseasonal forecast outputs, generating more reliable and accurate probabilistic predictions.
Evaluate and Refine Performance
Continuously assess the impact of the bias correction on forecast skill, reliability, and sharpness using appropriate metrics (e.g., Continuous Ranked Probability Score (CRPS), Reliability Diagrams, ROC curves). Iterate and refine your chosen method as needed.
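
As one concrete illustration of the method choice in the steps above, empirical quantile mapping can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the method of the source paper: the helper names `fit_quantile_mapping` and `apply_quantile_mapping`, the synthetic data, the 20-member ensemble, and the 99-point quantile grid are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_quantile_mapping(hist_forecasts, hist_observations, n_quantiles=99):
    """Estimate matched forecast and observation quantiles from history."""
    probs = np.linspace(0.01, 0.99, n_quantiles)
    fcst_q = np.quantile(hist_forecasts, probs)
    obs_q = np.quantile(hist_observations, probs)
    return fcst_q, obs_q


def apply_quantile_mapping(new_forecasts, fcst_q, obs_q):
    """Map each forecast value to the observed value at the same quantile."""
    # np.interp clamps values outside the training range to the end quantiles.
    return np.interp(new_forecasts, fcst_q, obs_q)


# Synthetic history (assumed): forecasts track the truth but run about 2 units high.
rng = np.random.default_rng(0)
truth = 8 + 1.5 * rng.standard_normal(500)
raw = truth + 2.0 + 1.0 * rng.standard_normal(500)

fcst_q, obs_q = fit_quantile_mapping(raw, truth)

# Hypothetical new 20-member ensemble drawn with the same high bias.
new_ensemble = 10 + 2.0 * rng.standard_normal(20)
corrected_ensemble = apply_quantile_mapping(new_ensemble, fcst_q, obs_q)

print(f"Raw ensemble mean:       {new_ensemble.mean():.2f}")
print(f"Corrected ensemble mean: {corrected_ensemble.mean():.2f}")
```

Because each ensemble member is mapped separately, the corrected members still form an ensemble, so the output remains probabilistic rather than collapsing to a single corrected value. Note that this simple form cannot extrapolate beyond the range seen in training.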

Starter code

import numpy as np
from sklearn.linear_model import LinearRegression

# 1. Simulate historical observations and raw forecasts of the same events
np.random.seed(42)
historical_observations = 8 + 1.5 * np.random.randn(100)  # True observations
# Raw forecasts track the observations but run about 2 units too high, with added noise
historical_forecasts = historical_observations + 2 + np.random.randn(100)

# Reshape data for scikit-learn (expects 2D array)
X_train = historical_forecasts.reshape(-1, 1)
y_train = historical_observations

# 2. Train a simple linear bias correction model
# This model learns a mapping from biased forecasts to observations
bias_corrector = LinearRegression()
bias_corrector.fit(X_train, y_train)

# 3. Simulate a new raw forecast from your model
new_raw_forecast = np.array([12.5]).reshape(-1, 1)

# 4. Apply the bias correction
corrected_forecast = bias_corrector.predict(new_raw_forecast)

print(f"Average historical raw forecast: {np.mean(historical_forecasts):.2f}")
print(f"Average historical observation: {np.mean(historical_observations):.2f}")
print("---\n")
print(f"New raw forecast: {new_raw_forecast[0,0]:.2f}")
# predict() returns a 1D array when the model was fitted on a 1D target
print(f"Bias-corrected forecast: {corrected_forecast[0]:.2f}")

# For full probabilistic correction, you would typically model the residuals
# of this correction to generate a full predictive distribution, e.g.,
# by fitting a Gaussian distribution to `y_train - bias_corrector.predict(X_train)`.
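
The residual-based idea in the closing comment can be sketched as follows. This is a minimal illustration, assuming Gaussian, constant-variance residuals and synthetic data; the train/test split, the helper name `crps_gaussian`, and the raw-forecast baseline are assumptions for demonstration only. The CRPS uses the standard closed form for a Gaussian predictive distribution.

```python
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LinearRegression

# Synthetic data (assumed): forecasts track the truth but run about 2 units high.
rng = np.random.default_rng(42)
truth = 8 + 1.5 * rng.standard_normal(300)
raw = truth + 2.0 + 1.0 * rng.standard_normal(300)

# Hold out the last 100 cases so the evaluation is out of sample.
train, test = slice(0, 200), slice(200, 300)
model = LinearRegression().fit(raw[train].reshape(-1, 1), truth[train])

# Predictive spread: standard deviation of the training residuals.
residuals = truth[train] - model.predict(raw[train].reshape(-1, 1))
sigma = residuals.std(ddof=2)  # two fitted parameters: slope and intercept

# Predictive distribution for each held-out case: Normal(mu_test, sigma)
mu_test = model.predict(raw[test].reshape(-1, 1))


def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a Normal(mu, sigma) forecast against observation y."""
    z = (y - mu) / sigma
    return sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi))


crps_corrected = crps_gaussian(mu_test, sigma, truth[test]).mean()
# Baseline: same spread, but centered on the uncorrected raw forecast.
crps_raw = crps_gaussian(raw[test], sigma, truth[test]).mean()

# 90% prediction interval for the first held-out case
lower, upper = norm.ppf([0.05, 0.95], loc=mu_test[0], scale=sigma)

print(f"Mean CRPS, raw center:       {crps_raw:.3f}")
print(f"Mean CRPS, corrected center: {crps_corrected:.3f}")
print(f"90% interval, first case:    [{lower:.2f}, {upper:.2f}]")
```

Lower CRPS is better. A single constant `sigma` is the simplest choice; methods such as EMOS instead let the spread depend on the ensemble variance.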

Source

Paper: arxiv.org