Ridge Regression

Ridge Regression adds L2 regularization to Ordinary Least Squares (OLS), making it particularly well-suited for A-Share return prediction where multiple factors exhibit high multicollinearity.


Overview

The Ridge Regression objective function:

$$\min_{w} \|Xw - y\|_2^2 + \alpha \|w\|_2^2$$

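where $\alpha \ge 0$ controls regularization strength. When factors are highly correlated (e.g., multiple momentum factors), $X^TX$ is nearly singular and OLS coefficient estimates become unstable: small changes in the data produce large swings in the coefficients. Ridge regression shrinks the coefficients through the penalty term, which keeps the estimates stable.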
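
The stabilizing effect described above can be illustrated with a minimal sketch. The data here is synthetic and purely an assumption for illustration: two nearly identical "momentum" factors, a true weight of 0.5 on each, and `alpha=1.0` chosen arbitrarily.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
n = 200

# Two synthetic factors that are almost identical (correlation near 1).
base = rng.normal(size=n)
X = np.column_stack([base, base + 1e-3 * rng.normal(size=n)])

# Assumed true relationship: each factor contributes 0.5, plus noise.
y = X @ np.array([0.5, 0.5]) + 0.1 * rng.normal(size=n)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

# OLS splits the weight between the two columns almost arbitrarily;
# ridge pulls both coefficients toward a shared, moderate value.
print("OLS coefficients:  ", ols.coef_)
print("Ridge coefficients:", ridge.coef_)
```

Comparing the two printed vectors shows how the penalty trades a small bias for a large reduction in coefficient variance along the near-collinear direction.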

Official docs: Ridge Regression — scikit-learn


Applications in A-Share Quantitative Strategies

1. Multi-Factor Return Prediction

Use 50+ alpha factors as the feature matrix $X$ and forward 5-day returns as the target $y$. Ridge regression keeps coefficient estimates stable under high factor correlation, where OLS coefficients can grow very large and flip sign between samples.
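
A minimal sketch of this setup follows. Everything in it is an assumption for illustration: the factor matrix and returns are synthetic stand-ins, `alpha=10.0` is arbitrary, and the 80/20 chronological split and factor standardization are common choices rather than anything prescribed here.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_samples, n_factors, n_latent = 1000, 50, 5

# Synthetic stand-in for a factor library: 50 factors driven by 5 latent
# sources, so many columns are strongly correlated with one another.
latent = rng.normal(size=(n_samples, n_latent))
loadings = rng.normal(size=(n_latent, n_factors))
X = latent @ loadings + 0.1 * rng.normal(size=(n_samples, n_factors))

# Synthetic stand-in for forward 5-day returns: weak signal plus noise.
y = 0.01 * latent[:, 0] + 0.02 * rng.normal(size=n_samples)

# Chronological split (no shuffling), so the model never trains on later rows.
split = int(0.8 * n_samples)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Standardize factors so the L2 penalty treats them on a common scale.
model = make_pipeline(StandardScaler(), Ridge(alpha=10.0))
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
print("Out-of-sample R^2:", model.score(X_test, y_test))
```

With real data, `X` would hold one row per stock-date observation and `y` the matching forward return; the split should still respect time order.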

2. Risk Model Factor Exposure Estimation

In Barra-style risk models, Ridge regression fits individual stock returns against style factor exposures. Adding $\alpha I$ keeps $X^TX + \alpha I$ invertible even when exposure columns are nearly collinear and $X^TX$ itself is close to singular.
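
The singularity point can be checked numerically. The sketch below uses hypothetical exposures (`size`, `value`, and a third column that is their exact sum) and synthetic cross-sectional returns; the names and `alpha = 1.0` are assumptions, not part of any specific risk model.

```python
import numpy as np

rng = np.random.default_rng(1)
n_stocks = 300

# Hypothetical style exposures: the third column is an exact linear
# combination of the first two, so X is rank deficient.
size = rng.normal(size=n_stocks)
value = rng.normal(size=n_stocks)
X = np.column_stack([size, value, size + value])

gram = X.T @ X
alpha = 1.0
ridge_gram = gram + alpha * np.eye(X.shape[1])

print("rank of X:              ", np.linalg.matrix_rank(X))
print("cond of X^T X + alpha I:", np.linalg.cond(ridge_gram))

# Synthetic cross-sectional returns and the ridge coefficients on the exposures.
returns = 0.02 * size - 0.01 * value + 0.05 * rng.normal(size=n_stocks)
w = np.linalg.solve(ridge_gram, X.T @ returns)
print("ridge coefficients:", w)
```

The plain normal equations have no unique solution here because `X` has only two independent columns, while the regularized system is well conditioned and solves without trouble.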

3. Automatic Alpha Selection (RidgeCV)

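scikit-learn's RidgeCV selects $\alpha$ from a candidate grid by cross-validation. With the default cv=None it uses an efficient Leave-One-Out scheme; passing an integer such as cv=5 switches to 5-fold cross-validation: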

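```python
from sklearn.linear_model import RidgeCV

# X_train, y_train: factor matrix and forward returns, assumed prepared upstream
alphas = [0.01, 0.1, 1, 10, 100]  # candidate strengths on a log grid
model = RidgeCV(alphas=alphas, cv=5)  # 5-fold CV; omit cv for leave-one-out
model.fit(X_train, y_train)
print(model.alpha_)  # Selected regularization strength
```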
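
As a self-contained variant, the sketch below compares the default leave-one-out selection with an explicit time-ordered splitter. The data is synthetic, and using `TimeSeriesSplit` is a suggestion for return data with a time dimension, not something the original example prescribes.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(7)

# Synthetic training data: 10 factors, only the first carries signal.
X_train = rng.normal(size=(500, 10))
y_train = 0.5 * X_train[:, 0] + rng.normal(size=500)

alphas = np.logspace(-2, 2, 9)  # 0.01 ... 100 on a log grid

# cv=None (the default) uses the efficient leave-one-out scheme.
loo_model = RidgeCV(alphas=alphas).fit(X_train, y_train)

# Explicit splitter: each validation fold comes after its training fold.
ts_cv = TimeSeriesSplit(n_splits=5)
ts_model = RidgeCV(alphas=alphas, cv=ts_cv).fit(X_train, y_train)

print("alpha (leave-one-out): ", loo_model.alpha_)
print("alpha (time-series CV):", ts_model.alpha_)
```

The two schemes may pick different values of `alpha_`; on time-ordered data the forward-looking split is usually the more conservative estimate of out-of-sample behavior.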

| Parameter | Description | Recommended |
| --- | --- | --- |
| alpha | L2 regularization strength | 0.1–100 (log search) |
| fit_intercept | Fit intercept term | True |
| solver | Solver algorithm | 'auto' |

Strengths & Limitations

Strengths:

  • Closed-form solution — extremely fast training: $w = (X^TX + \alpha I)^{-1}X^Ty$
  • Coefficients are directly interpretable as factor weights for strategy attribution
  • Robust to multicollinear factors — common in A-Share factor libraries
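
The closed-form expression above can be checked against scikit-learn with a short sketch. It assumes synthetic data and sets `fit_intercept=False`, because the formula as written has no intercept term.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=100)
alpha = 2.0

# Closed form: solve (X^T X + alpha I) w = X^T y rather than forming the inverse.
w_closed = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

# fit_intercept=False so scikit-learn solves the same intercept-free problem.
w_sklearn = Ridge(alpha=alpha, fit_intercept=False).fit(X, y).coef_

print("closed form: ", w_closed)
print("scikit-learn:", w_sklearn)
```

Solving the linear system is preferred over computing the explicit inverse because it is cheaper and numerically more stable.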

Limitations:

  • Linear assumption cannot capture non-linear factor interactions
  • L2 regularization does not produce sparse solutions — all factors are retained
  • Sensitive to outliers — apply Winsorization to financial data first
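
One simple way to winsorize before fitting is percentile clipping, sketched below. The `winsorize` helper and the 1st/99th percentile cutoffs are illustrative assumptions; suitable thresholds depend on the data.

```python
import numpy as np

def winsorize(x, lower_pct=1.0, upper_pct=99.0):
    """Clip values outside the given percentiles to the percentile bounds."""
    lo, hi = np.percentile(x, [lower_pct, upper_pct])
    return np.clip(x, lo, hi)

rng = np.random.default_rng(5)

# Synthetic daily returns with a few injected extreme values.
returns = rng.normal(scale=0.02, size=1000)
returns[:3] = [0.9, -0.8, 1.2]

clipped = winsorize(returns)
print("max before:", returns.max(), "max after:", clipped.max())
print("min before:", returns.min(), "min after:", clipped.min())
```

For a factor matrix, the same clipping would typically be applied column by column (and per cross-section date) so that each factor is bounded on its own scale.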
