Stacking Optimization

An ensemble portfolio construction method that learns how to combine several base optimisers using a meta-optimiser, producing a single allocation that draws on the strengths of each.

Overview

Stacking borrows the "stacked generalisation" idea from machine learning. Several base portfolio optimisers each produce a candidate weight vector; a meta-estimator then learns the linear combination of those candidates that performs best on out-of-fold data. The output is a single combined portfolio.

Folio Lab's stack uses three complementary base estimators that each emphasise a different aspect of risk:

  • Inverse Volatility — captures the variance signal.
  • Maximum Diversification — captures correlation structure.
  • Risk Budgeting (CVaR) — captures tail risk.

The meta-estimator is a Mean-Risk optimiser fitted with 3-fold cross-validation: it is trained only on portfolio returns generated out-of-fold, so it never scores a base estimator on data that estimator was fitted to. The motivation is the bias–variance tradeoff: each base estimator brings a different inductive bias, and the meta-stage learns how much to trust each in the current data regime.

Algorithm

Step 1 — Fit base estimators

On each cross-validation fold, fit Inverse Volatility, Maximum Diversification, and Risk Budgeting (CVaR) on the training portion of the returns matrix.
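Of the three base estimators, only Inverse Volatility has a closed form (Maximum Diversification and CVaR Risk Budgeting each require a convex solver). As a minimal sketch of this step, the snippet below fits inverse-volatility weights on the training portion of a fold; the returns matrix `R` and the split point are illustrative, not part of the library.

```python
import numpy as np

def inverse_volatility_weights(returns: np.ndarray) -> np.ndarray:
    """Weight each asset by 1/sigma, normalised to sum to 1 (long-only)."""
    inv_vol = 1.0 / returns.std(axis=0, ddof=1)
    return inv_vol / inv_vol.sum()

# Toy daily returns: 250 observations, 4 assets with increasing volatility.
rng = np.random.default_rng(0)
R = rng.normal(0.0005, 0.01, size=(250, 4)) * np.array([1.0, 1.5, 2.0, 3.0])

train = R[:180]                       # training portion of the fold
w = inverse_volatility_weights(train)
print(w, w.sum())                     # lowest-vol asset gets the largest weight
```

The same fit-on-the-training-slice pattern applies to the other two base estimators, just with an optimisation call in place of the closed form.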

Step 2 — Generate out-of-fold portfolio returns

Each fitted estimator produces a weight vector. Applying those weights to the held-out fold's returns yields a series of out-of-fold portfolio returns for each base estimator.
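This step is a single matrix product. In the sketch below the weight matrix `W` is hypothetical (stand-ins for three fitted base estimators); each column of the result is one base estimator's out-of-fold return series.

```python
import numpy as np

rng = np.random.default_rng(1)
R = rng.normal(0.0005, 0.01, size=(250, 4))
train, test = R[:180], R[180:]        # test is the held-out fold

# Hypothetical weight vectors from three fitted base estimators (one per row).
W = np.array([
    [0.25, 0.25, 0.25, 0.25],   # e.g. inverse volatility on this sample
    [0.40, 0.10, 0.30, 0.20],   # e.g. maximum diversification
    [0.10, 0.40, 0.20, 0.30],   # e.g. CVaR risk budgeting
])

# Out-of-fold portfolio returns: one column per base estimator.
oof = test @ W.T                 # shape (n_test_obs, n_base_estimators)
print(oof.shape)                 # (70, 3)
```

Stacking the per-fold results vertically gives the full out-of-fold matrix the meta-stage trains on.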

Step 3 — Fit meta-optimiser

The Mean-Risk meta-optimiser is fitted on the matrix of out-of-fold portfolio returns (one column per base estimator) to find the convex combination of base portfolios that scores best under its mean-risk objective.
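To make the meta-stage concrete, here is a deliberately crude stand-in: a coarse grid search over the simplex that maximises a mean-over-volatility score on the out-of-fold returns. The real meta-optimiser solves a proper convex mean-risk programme; this sketch only illustrates what "convex combination that performed best" means.

```python
import itertools
import numpy as np

def fit_meta_weights(oof: np.ndarray, step: float = 0.05) -> np.ndarray:
    """Grid-search the simplex for the convex combination of base
    portfolios that maximises mean/volatility on out-of-fold returns.
    A toy stand-in for the Mean-Risk meta-optimiser."""
    n = oof.shape[1]
    best_w, best_score = None, -np.inf
    ticks = np.arange(0.0, 1.0 + 1e-9, step)
    for combo in itertools.product(ticks, repeat=n - 1):
        last = 1.0 - sum(combo)
        if last < -1e-9:            # outside the simplex
            continue
        w = np.array(combo + (max(last, 0.0),))
        r = oof @ w                 # candidate combined return series
        score = r.mean() / (r.std(ddof=1) + 1e-12)
        if score > best_score:
            best_w, best_score = w, score
    return best_w

# Toy out-of-fold matrix: 210 observations, 3 base estimators.
rng = np.random.default_rng(2)
oof = rng.normal(0.0004, 0.01, size=(210, 3))
meta_w = fit_meta_weights(oof)
print(meta_w, meta_w.sum())         # nonnegative, sums to 1
```

Because the search is restricted to the simplex, the meta weights are automatically nonnegative and sum to one, which is what keeps the final combined portfolio long-only in Step 4.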

Step 4 — Refit on full data

Each base estimator is refit on the full sample, and the meta weights are applied to combine the resulting weight vectors into a single long-only portfolio.
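The final combination is again just linear algebra. In this sketch both the refit base weights `W_full` and the meta weights `meta_w` are hypothetical numbers; because each set is nonnegative and sums to one, the resulting portfolio is long-only and fully invested.

```python
import numpy as np

# Hypothetical base weight vectors after refitting on the full sample.
W_full = np.array([
    [0.25, 0.25, 0.25, 0.25],
    [0.40, 0.10, 0.30, 0.20],
    [0.10, 0.40, 0.20, 0.30],
])
meta_w = np.array([0.5, 0.3, 0.2])   # hypothetical meta weights

w_final = meta_w @ W_full            # convex combination of base portfolios
print(w_final, w_final.sum())        # sums to 1, all weights >= 0
```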

Advantages & Limitations

Advantages

  • Adaptive: The meta-stage tunes the mix to the data.
  • Diversified bias: Combines variance, correlation, and tail-risk views.
  • Robust: Mistakes in one base estimator are partially diluted by the others.
  • Single portfolio output: No need to choose one method.

Limitations

  • Compute-heavy: Every base estimator is fit once per fold and again on the full sample before the meta-stage runs.
  • Overfitting risk: Small samples can mislead the meta-stage.
  • Opacity: The final allocation blends three methods, so individual positions are harder to attribute.
  • Solver chain: Failures in any base estimator propagate.

References

  • Wolpert, D. H. (1992). "Stacked generalization." Neural Networks, 5(2), 241-259.
  • Breiman, L. (1996). "Stacked regressions." Machine Learning, 24(1), 49-64.
  • skfolio documentation — skfolio.optimization.StackingOptimization.