Sparse Index Tracking
Replicate the return profile of a benchmark index using only a small subset of its constituents. The classical formulation is a regression of index returns on asset returns with an L1 penalty, solved iteratively by reweighted L1 minimisation; the result is a portfolio that hugs the benchmark with a small number of names.
Overview
A passively-managed index fund typically holds every constituent of its benchmark in cap-weighted proportions. For institutional investors with mandate-driven turnover or transaction-cost budgets, holding hundreds of names is impractical: the bottom of the cap-weight distribution contributes negligibly to index return but generates real trading cost. Sparse index tracking solves the same replication problem with a fraction of the constituents.
The original formulation by Brodie, Daubechies, De Mol, Giannone and Loris (2009) used an $\ell_1$ penalty on the weight vector to induce sparsity. Benidis, Feng and Palomar (2018) developed the high-dimensional theory and proposed the reweighted-$\ell_1$ iteration of Candes, Wakin and Boyd (2008) as the practical solver: the per-coordinate penalty weights are updated at each iteration so that small portfolio weights are penalised more aggressively, which yields a tighter approximation to the $\ell_0$ sparsity ideal than a single L1 solve.
FolioLab implements the reweighted-L1 path with an automatic sparsity controller: the user specifies a target cardinality (number of holdings) and the optimiser sweeps the penalty path to find the smallest portfolio that still meets a tracking-error budget against the chosen Indian benchmark.
Mathematical Formulation
Notation
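- $R \in \mathbb{R}^{T \times N}$: matrix of asset returns for $N$ assets over $T$ periods
- $r_b \in \mathbb{R}^{T}$: benchmark return vector
- $w \in \mathbb{R}^{N}$: portfolio weights, summing to 1, long-only
- $\lambda \ge 0$: L1 penalty strength
- $\omega^{(k)} \in \mathbb{R}^{N}$: per-coordinate weights at iteration $k$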
L1 sparse regression (Brodie et al., 2009)
$$\min_{w} \; \lVert r_b - R w \rVert_2^2 + \lambda \lVert w \rVert_1$$
The squared loss measures the in-sample tracking error of the portfolio against the benchmark; the L1 penalty drives small weights to zero. A larger $\lambda$ means a sparser portfolio at the cost of higher tracking error.
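A minimal sketch of a single penalised fit, using cyclic coordinate descent in NumPy. One caveat shapes it: if the weights are both long-only and fully invested, the L1 norm of every feasible portfolio equals 1, so the plain penalty cannot change the solution. The sketch therefore assumes the sum-to-one constraint is dropped during the penalised fit (only long-only is kept) and that the weights are rescaled afterwards. The function name `nonneg_lasso`, its parameters and the synthetic data are illustrative assumptions, not FolioLab's API.

```python
import numpy as np

def nonneg_lasso(R, r_b, lam, n_sweeps=500, tol=1e-12):
    """Minimise ||r_b - R w||^2 + lam * sum(w) subject to w >= 0.

    Assumption of this sketch: the sum-to-one constraint is NOT imposed here;
    rescale the result afterwards if a fully invested portfolio is needed.
    """
    R = np.asarray(R, dtype=float)
    resid = np.asarray(r_b, dtype=float).copy()   # equals r_b - R @ w for w = 0
    col_sq = (R ** 2).sum(axis=0)                 # squared norm of each asset column
    w = np.zeros(R.shape[1])
    for _ in range(n_sweeps):
        largest_move = 0.0
        for j in range(w.size):
            # Correlation of asset j with the residual that excludes asset j.
            rho = R[:, j] @ resid + col_sq[j] * w[j]
            # Closed-form one-dimensional minimiser, clipped at zero (long-only).
            w_new = max(0.0, (rho - lam / 2.0) / col_sq[j])
            delta = w_new - w[j]
            if delta != 0.0:
                resid -= R[:, j] * delta
                w[j] = w_new
                largest_move = max(largest_move, abs(delta))
        if largest_move < tol:
            break
    return w

# Synthetic example (hypothetical data): 20 assets, 250 daily returns.
rng = np.random.default_rng(0)
R = rng.normal(0.0, 0.01, size=(250, 20))
w_index = np.r_[np.full(5, 0.19), np.full(15, 0.05 / 15)]   # five large names dominate
r_b = R @ w_index

lam_max = 2.0 * np.max(R.T @ r_b)   # at or above this value every weight is zero
w_dense = nonneg_lasso(R, r_b, lam=0.0)
w_sparse = nonneg_lasso(R, r_b, lam=0.5 * lam_max)
print("holdings at lam = 0:            ", np.count_nonzero(w_dense))
print("holdings at lam = 0.5 * lam_max:", np.count_nonzero(w_sparse))
```

Dividing `w_sparse` by `w_sparse.sum()` gives a fully invested portfolio on the selected names. Coordinate descent is used here only because each one-dimensional update has a closed form; any QP solver would serve equally well.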
Reweighted L1 (Candes, Wakin & Boyd, 2008)
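Iterate, for $k = 0, 1, 2, \dots$ with $\omega^{(0)} = \mathbf{1}$ and a small constant $\varepsilon > 0$:
$$w^{(k+1)} = \arg\min_{w} \; \lVert r_b - R w \rVert_2^2 + \lambda \sum_{i} \omega_i^{(k)} \lvert w_i \rvert, \qquad \omega_i^{(k+1)} = \frac{1}{\lvert w_i^{(k+1)} \rvert + \varepsilon}$$
Each iteration solves a weighted L1 problem; the per-coordinate weight $\omega_i^{(k)}$ grows for small $\lvert w_i^{(k)} \rvert$, accelerating their decay to zero. The fixed point of this iteration approximates an $\ell_0$ sparsity solution far better than a single L1 fit, and the iteration converges in 5-15 passes on typical Indian universes.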
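The iteration can be sketched as follows, keeping the long-only and sum-to-one constraints throughout. On that feasible set the weighted penalty is linear in the weights, so each inner problem is solved here by plain projected gradient onto the simplex; this solver choice, the names `project_simplex`, `weighted_l1_fit` and `reweighted_l1`, and the values of `lam` and `eps` are assumptions made for illustration, not FolioLab's implementation.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def weighted_l1_fit(R, r_b, lam, omega, w0, n_iter=500):
    """Projected gradient for min ||r_b - R w||^2 + lam * omega @ w on the simplex."""
    step = 1.0 / (2.0 * np.linalg.norm(R, 2) ** 2)   # 1 / Lipschitz constant of the gradient
    w = w0.copy()
    for _ in range(n_iter):
        grad = 2.0 * R.T @ (R @ w - r_b) + lam * omega
        w = project_simplex(w - step * grad)
    return w

def reweighted_l1(R, r_b, lam, eps=1e-3, n_outer=10):
    """Reweighted L1: refit, then set omega_i = 1 / (|w_i| + eps) and repeat."""
    n_assets = R.shape[1]
    w = np.full(n_assets, 1.0 / n_assets)
    omega = np.ones(n_assets)   # unit weights: on the simplex this first pass is plain least squares
    for _ in range(n_outer):
        w = weighted_l1_fit(R, r_b, lam, omega, w)
        omega = 1.0 / (w + eps)   # w >= 0, so |w_i| = w_i
    return w

# Synthetic example (hypothetical data): 20 assets, 250 daily returns.
rng = np.random.default_rng(0)
R = rng.normal(0.0, 0.01, size=(250, 20))
w_index = np.r_[np.full(5, 0.19), np.full(15, 0.05 / 15)]
r_b = R @ w_index

w_rw = reweighted_l1(R, r_b, lam=2e-5)
print("holdings:", np.count_nonzero(w_rw), "of", w_rw.size)
```

Because the projection clips at zero, dropped names come out as exact zeros rather than small residual weights, so the support can be read off directly with `np.flatnonzero(w_rw)`. Plain projected gradient can be slow when asset returns are highly correlated; a production solver would likely use a dedicated QP routine.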
Constrained refit step
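After convergence the support set $S = \{i : w_i > 0\}$ is fixed and the weights are re-optimised on this support without the L1 penalty:
$$\min_{w} \; \lVert r_b - R w \rVert_2^2 \quad \text{subject to} \quad \mathbf{1}^\top w = 1, \; w \ge 0, \; w_i = 0 \text{ for } i \notin S$$
This step removes the shrinkage bias that L1 introduces while retaining the cardinality constraint, and gives the final tracking portfolio.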
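A minimal sketch of the refit, assuming the support has already been chosen: ordinary least squares restricted to the selected columns, with the long-only and sum-to-one constraints enforced by projected gradient. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def refit_on_support(R, r_b, support, n_iter=2000):
    """Least-squares refit on a fixed support, long-only and fully invested, no L1 term."""
    R_s = R[:, support]
    step = 1.0 / (2.0 * np.linalg.norm(R_s, 2) ** 2)
    w_s = np.full(len(support), 1.0 / len(support))
    for _ in range(n_iter):
        grad = 2.0 * R_s.T @ (R_s @ w_s - r_b)
        w_s = project_simplex(w_s - step * grad)
    w = np.zeros(R.shape[1])
    w[support] = w_s            # names outside the support stay at exactly zero
    return w

# Synthetic example (hypothetical data): refit on the five largest names.
rng = np.random.default_rng(0)
R = rng.normal(0.0, 0.01, size=(250, 20))
w_index = np.r_[np.full(5, 0.19), np.full(15, 0.05 / 15)]
r_b = R @ w_index

support = np.arange(5)
w_refit = refit_on_support(R, r_b, support)
print("refit weights on support:", np.round(w_refit[support], 4))
```

The refit cannot add names, so the cardinality is at most the size of the support; it can be smaller if the long-only constraint pushes a selected name to zero.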
Practical use on Indian benchmarks
For Nifty 50 the cap-weighted top-10 names typically account for 60-65% of index weight; a sparse tracker built on the top 15-20 names alone usually achieves an annualised tracking error of well under 1%, which is competitive with full replication once trading costs are accounted for. On Nifty 200 and Nifty 500 the trade-off matters more: a 30-50 name sparse portfolio captures most of the index return with materially lower turnover than holding all 200 or 500 names.
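To make the tracking-error figures concrete, the sketch below computes the annualised tracking error of a naive tracker that keeps only the k largest index weights and rescales them. It assumes daily returns and 252 trading days per year; the one-factor return model and the 1/rank weight profile are synthetic stand-ins, not Nifty data, so the printed numbers say nothing about the real indices.

```python
import numpy as np

def annualised_te(R, r_b, w, periods_per_year=252):
    """Annualised tracking error: std of active returns times sqrt(periods per year)."""
    active = R @ w - r_b
    return float(np.std(active, ddof=1) * np.sqrt(periods_per_year))

def top_k_tracker(w_index, k):
    """Keep the k largest index weights and rescale them to sum to 1."""
    keep = np.argsort(w_index)[-k:]
    w = np.zeros_like(w_index)
    w[keep] = w_index[keep]
    return w / w.sum()

# Synthetic universe (hypothetical): 50 assets driven by one common market factor.
rng = np.random.default_rng(1)
n_days, n_assets = 500, 50
market = rng.normal(0.0, 0.01, size=(n_days, 1))
R = market + rng.normal(0.0, 0.015, size=(n_days, n_assets))
w_index = 1.0 / np.arange(1, n_assets + 1)   # illustrative cap-weight profile
w_index /= w_index.sum()
r_b = R @ w_index

for k in (10, 20, 50):
    w_k = top_k_tracker(w_index, k)
    print(f"{k:>2} names: annualised TE = {annualised_te(R, r_b, w_k):.2%}")
```

Truncating by cap weight is only a baseline: it ignores how the dropped names co-move with the retained ones, which is exactly the information the penalised regression uses when it chooses a support.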
FolioLab's implementation lets the user specify either $\lambda$ directly or the target cardinality; the sparsity controller then sweeps the $\lambda$ path automatically.
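One way such a controller could work is sketched below: walk a geometric grid of penalty values from sparse to dense and stop at the first portfolio that meets the tracking-error budget without exceeding the holdings cap. This is a guess at the general shape, not FolioLab's actual controller. The function names, the grid, the 252-day annualisation and the simplified inner solver (long-only, rescaled to sum to 1 after the fit) are all assumptions.

```python
import numpy as np

def nonneg_lasso(R, r_b, lam, n_sweeps=200, tol=1e-12):
    """Coordinate descent for min ||r_b - R w||^2 + lam * sum(w) subject to w >= 0."""
    resid = r_b.astype(float)              # astype returns a copy; equals r_b - R @ w at w = 0
    col_sq = (R ** 2).sum(axis=0)
    w = np.zeros(R.shape[1])
    for _ in range(n_sweeps):
        largest_move = 0.0
        for j in range(w.size):
            rho = R[:, j] @ resid + col_sq[j] * w[j]
            delta = max(0.0, (rho - lam / 2.0) / col_sq[j]) - w[j]
            if delta != 0.0:
                resid -= R[:, j] * delta
                w[j] += delta
                largest_move = max(largest_move, abs(delta))
        if largest_move < tol:
            break
    return w

def sweep_penalty_path(R, r_b, te_budget, max_names, n_grid=30, periods_per_year=252):
    """Return the first (sparsest-side) portfolio on the grid that meets the
    tracking-error budget with at most max_names holdings, or None."""
    lam_max = 2.0 * np.max(R.T @ r_b)      # every weight is zero at or above this penalty
    for lam in lam_max * np.logspace(0, -3, n_grid):
        w = nonneg_lasso(R, r_b, lam)
        k = int(np.count_nonzero(w))
        if k == 0 or k > max_names:
            continue                        # empty portfolio, or too many holdings
        w = w / w.sum()                     # rescale to a fully invested portfolio
        te = float(np.std(R @ w - r_b, ddof=1) * np.sqrt(periods_per_year))
        if te <= te_budget:
            return {"lam": float(lam), "weights": w, "names": k, "te": te}
    return None

# Synthetic example (hypothetical data): 20 assets, 250 daily returns.
rng = np.random.default_rng(0)
R = rng.normal(0.0, 0.01, size=(250, 20))
w_index = np.r_[np.full(5, 0.19), np.full(15, 0.05 / 15)]
r_b = R @ w_index

result = sweep_penalty_path(R, r_b, te_budget=0.02, max_names=10)
if result is not None:
    print(f"lam = {result['lam']:.2e}, names = {result['names']}, TE = {result['te']:.2%}")
```

A grid search is the simplest choice. Because cardinality is not guaranteed to change monotonically along the path, the sketch checks every grid point instead of bisecting on the penalty.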
Advantages & Limitations
Advantages
- Holdings-cap aware: Targets a small number of positions natively.
- Approximates $\ell_0$: Reweighted L1 is much closer to true sparsity than a single L1 solve.
- Convex per iteration: Each L1 step is a tractable QP.
- Cost aware: Lower cardinality means fewer names to trade, which lowers transaction costs and reduces exposure to illiquid constituents.
Limitations
- Path sensitivity: Small changes in $\lambda$ can flip individual names in or out of the support.
- Out-of-sample drift: A support chosen on history may not track the benchmark equally well in regime breaks.
- No view on returns: Pure replication, no enhancement.
- Sample TE optimised: Future tracking error can exceed the in-sample objective.
References
- Benidis, K., Feng, Y., & Palomar, D. P. (2018). "Sparse Portfolios for High-Dimensional Financial Index Tracking." IEEE Transactions on Signal Processing, 66(1), 155-170.
- Candes, E. J., Wakin, M. B., & Boyd, S. P. (2008). "Enhancing Sparsity by Reweighted $\ell_1$ Minimization." Journal of Fourier Analysis and Applications, 14(5), 877-905.
- Brodie, J., Daubechies, I., De Mol, C., Giannone, D., & Loris, I. (2009). "Sparse and Stable Markowitz Portfolios." Proceedings of the National Academy of Sciences, 106(30), 12267-12272.
- Jansen, R., & van Dijk, R. (2002). "Optimal Benchmark Tracking with Small Portfolios." The Journal of Portfolio Management, 28(2), 33-39.
- Tibshirani, R. (1996). "Regression Shrinkage and Selection via the Lasso." Journal of the Royal Statistical Society: Series B, 58(1), 267-288.
- Palomar, D. P. (2025). Portfolio Optimization: Theory and Application. Cambridge University Press, Chapter 13 (Index Tracking Portfolios), Sections 13.3 and 13.5.