3 Portfolio Theory

Published

July 4, 2025

3.1 Introduction

TODO: Given some menu of possible investments, what mix should we hold? How should we hold value through time?

Mean-variance analysis is a part of fundamental analysis and sits at the core of modern portfolio theory (MPT). Developed by Harry Markowitz in 1952 and later expanded by William Sharpe, it characterizes securities by their expected returns (mean) and risk (variance) in order to construct optimal portfolios.

3.2 Modern portfolio theory

3.2.1 History and pedagogy

Keywords:

Historical background:

Markowitz, H.M. (1952). Portfolio selection. ¹
Roy, A.D. (1952). Safety first and the holding of assets. ²
Markowitz, H.M. (1959). Portfolio Selection: Efficient Diversification of Investments. ³
Merton, R.C. (1972). An analytic derivation of the efficient portfolio frontier. ⁴
Levy, H. & Markowitz, H.M. (1979). Approximating expected utility by a function of mean and variance. ⁵
In 1990, Harry Markowitz, Merton Miller, and William F. Sharpe were awarded the Nobel Prize in Economics “for their pioneering work in the theory of financial economics”.
Markowitz, H.M. (1990). Nobel lecture: Foundations of portfolio theory. ⁶
Markowitz, H.M. (2005). Market efficiency: A theoretical distinction and so what? ⁷

¹ Markowitz (1952).

² Roy (1952).

³ Markowitz (1959).

⁴ Merton (1972).

⁵ Levy & Markowitz (1979).

⁶ Markowitz (1990).

⁷ Markowitz (2005).

Lecture notes:

Armerin, F. (2023). Lecture notes: More on mean-variance analysis.
Caflisch, R. (2003). Lecture notes: Mathematics of Finance.
Das, S.R. (2016). Data Science: Theories, Models, Algorithms, and Analytics. ⁸
- Also: Das, S.R. (2017). Being mean with variance: Markowitz optimization.
Ireland, P. (2013). Lecture notes: Principles of Macroeconomics.
Ireland, P. (2024). Lecture notes: Mathematics for Economists.
Ireland, P. (2025). Lecture notes: Financial Economics.
- In particular, Lecture 6
Kasa, K. (2023). Lecture notes by Ken Kasa (SFU)
- In particular, Lecture 7
Kwok, Y.K. (2017). Lecture notes: Fundamentals of Mathematical Finance. ⁹
- In particular, Lecture 2
Sigman, K. (2005). Notes on fund theorems.
Tam, A.S. (2021). Lagrangians and portfolio optimization.

⁸ Das (2016).

⁹ Kwok (2017).

3.2.2 Markowitz portfolio problem

Return of a portfolio:

\[ r = \vec{w}^\intercal \, \vec{r} = \sum_i w_{i} \, r_{i} \]

Variance of a portfolio:

\[ \sigma^2 = \vec{w}^\intercal \, V \, \vec{w} = \sum_{ij} w_{i} \, V_{ij} \, w_{j} \]

TODO: Show above ¹⁰

¹⁰ Luenberger (1998), p. 150.
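As a minimal sketch of the two formulas above, the portfolio return and variance can be computed in plain Python. The two-asset weights, returns, and covariance values are made up for illustration and are not taken from the text.

```python
def portfolio_return(w, r):
    """Portfolio return: r = w . r = sum_i w_i r_i."""
    return sum(w_i * r_i for w_i, r_i in zip(w, r))


def portfolio_variance(w, V):
    """Portfolio variance: sigma^2 = w^T V w = sum_ij w_i V_ij w_j."""
    n = len(w)
    return sum(w[i] * V[i][j] * w[j] for i in range(n) for j in range(n))


# Hypothetical inputs: two uncorrelated assets with volatilities 20% and 10%.
w = [0.5, 0.5]
r = [0.10, 0.04]
V = [[0.04, 0.00],
     [0.00, 0.01]]

r_p = portfolio_return(w, r)        # 0.5 * 0.10 + 0.5 * 0.04
var_p = portfolio_variance(w, V)    # 0.25 * 0.04 + 0.25 * 0.01
```

With these inputs the equally weighted portfolio has a standard deviation of about 11.2%, below the 15% average of the two asset volatilities, which is the diversification effect in its simplest form.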

Markowitz portfolio problem

Given an \(n\)-dimensional vector of expected returns, \(\vec{\mu}\), an \(n\times{}n\)-dimensional expected covariance matrix, \(V\), an \(m\times{}n\)-dimensional constraint matrix, \(A\), an \(m\)-dimensional constraint vector, \(\vec{b}\), and a target return, \(r_{\ast}\), solve for the portfolio weights, \(\vec{w}_{\ast}\), an \(n\)-dimensional vector, that are efficient, i.e. those that minimize the standard deviation of the portfolio return, \(\sigma\), for a given target return. Return \((\vec{w}_{\ast}, \sigma_{\ast})\). ¹¹

Solve

\[ \vec{w}_{\ast} = \underset{\vec{w}}{\mathrm{argmin}}\ \vec{w}^\intercal \, V \, \vec{w} \]

such that

\[ \vec{w} \cdot \vec{1} = 1 \]

\[ \vec{w} \cdot \vec{\mu} = r_{\ast} \]

and with further optional constraints

\[ A \, \vec{w} \geq \vec{b} \]

¹¹ Markowitz (1959), p. 172.

There are a lot of topics to discuss about solving for the efficient frontier:

How there is an analytic solution if you allow shorts
Solving with Lagrange multipliers
Solving with numerical convex optimization

TODO: Discuss the above more.

It can be shown¹² that there is an analytic solution where:

¹² Merton (1972) was the first to show there was an analytic solution to the Markowitz portfolio problem? For the analytic results discussed here, we generally follow Kwok (2017) and use Kwok's variable names. To convert from Merton's names to Kwok's: \(a_\mathrm{M} = b_\mathrm{K}\), \(b_\mathrm{M} = c_\mathrm{K}\), \(c_\mathrm{M} = a_\mathrm{K}\).

\[ a \equiv \vec{1}^\intercal \, V^{-1} \, \vec{1}, \qquad b \equiv \vec{1}^\intercal \, V^{-1} \, \vec{\mu}, \qquad c \equiv \vec{\mu}^\intercal \, V^{-1} \, \vec{\mu}, \qquad d \equiv a\,c - b^2 \]
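One way to see where these quantities come from is a Lagrange-multiplier sketch. It assumes short sales are allowed, \(V\) is invertible, and only the two equality constraints are imposed. The multipliers \(\lambda\) and \(\gamma\) are notation introduced here for illustration, and the factor of \(\tfrac{1}{2}\) is a convenience that does not change the minimizer.

\[\begin{align}
\mathcal{L} &= \tfrac{1}{2} \, \vec{w}^\intercal \, V \, \vec{w} - \lambda \, (\vec{w} \cdot \vec{1} - 1) - \gamma \, (\vec{w} \cdot \vec{\mu} - r_{\ast}) \\
\frac{\partial \mathcal{L}}{\partial \vec{w}} = 0 \quad &\Rightarrow \quad \vec{w} = \lambda \, V^{-1} \, \vec{1} + \gamma \, V^{-1} \, \vec{\mu} \\
\vec{w} \cdot \vec{1} = 1 \quad &\Rightarrow \quad \lambda \, a + \gamma \, b = 1 \\
\vec{w} \cdot \vec{\mu} = r_{\ast} \quad &\Rightarrow \quad \lambda \, b + \gamma \, c = r_{\ast} \\
\lambda &= \frac{c - b \, r_{\ast}}{d} \,, \qquad \gamma = \frac{a \, r_{\ast} - b}{d}
\end{align}\]

The last line solves the \(2 \times 2\) linear system for the multipliers, which requires \(d \neq 0\), i.e. that the expected returns are not all equal.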

There are two efficient portfolios of note: the minimum variance portfolio, \(\vec{w}_\mathrm{min}\), and the tangent portfolio, \(\vec{w}_\mathrm{tan}\).

The minimum variance portfolio is

\[ \vec{w}_\mathrm{min} = \frac{V^{-1} \, \vec{1}}{a} = \frac{V^{-1} \, \vec{1}}{\vec{1}^\intercal \, V^{-1} \, \vec{1}} \]

It has a return

\[ r_\mathrm{min} = \vec{w}_\mathrm{min} \cdot \vec{\mu} = \frac{\vec{1}^\intercal \, V^{-1} \, \vec{\mu}}{a} = \frac{b}{a} \]

and a variance

\[ \sigma_\mathrm{min}^2 = \vec{w}_\mathrm{min}^\intercal \, V \, \vec{w}_\mathrm{min} = \left( \frac{\vec{1}^\intercal \, V^{-1}}{a} \right) V \left( \frac{V^{-1} \, \vec{1}}{a} \right) = \frac{\vec{1}^\intercal \, V^{-1} \, \vec{1}}{a^2} = \frac{1}{a} \]

The tangent portfolio is

\[ \vec{w}_\mathrm{tan} = \frac{V^{-1} \, \vec{\mu}}{b} = \frac{V^{-1} \, \vec{\mu}}{\vec{1}^\intercal \, V^{-1} \, \vec{\mu}} \]

It has a return

\[ r_\mathrm{tan} = \vec{w}_\mathrm{tan} \cdot \vec{\mu} = \frac{\vec{\mu}^\intercal \, V^{-1} \, \vec{\mu}}{b} = \frac{c}{b} \]

and a variance

\[ \sigma_\mathrm{tan}^2 = \vec{w}_\mathrm{tan}^\intercal \, V \, \vec{w}_\mathrm{tan} = \left( \frac{\vec{\mu}^\intercal \, V^{-1}}{b} \right) V \left( \frac{V^{-1} \, \vec{\mu}}{b} \right) = \frac{\vec{\mu}^\intercal \, V^{-1} \, \vec{\mu}}{b^2} = \frac{c}{b^2} \]
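The closed-form expressions above can be checked numerically. The following is a minimal two-asset sketch in plain Python; the values of `mu` and `V` are hypothetical, and the helper `inv2` only handles the \(2 \times 2\) case.

```python
def inv2(V):
    """Inverse of a 2x2 matrix (enough for this two-asset sketch)."""
    (p, q), (r, s) = V
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def matvec(M, x):
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def dot(x, y):
    return sum(x_i * y_i for x_i, y_i in zip(x, y))


# Hypothetical inputs (not from the text): expected returns and covariance.
mu = [0.10, 0.04]
V = [[0.040, 0.006],
     [0.006, 0.010]]
ones = [1.0, 1.0]

Vinv_1 = matvec(inv2(V), ones)   # V^{-1} 1
Vinv_mu = matvec(inv2(V), mu)    # V^{-1} mu

a = dot(ones, Vinv_1)
b = dot(ones, Vinv_mu)
c = dot(mu, Vinv_mu)
d = a * c - b * b

w_min = [x / a for x in Vinv_1]    # minimum variance portfolio
w_tan = [x / b for x in Vinv_mu]   # tangent portfolio (no risk-free asset)

# Returns and variances computed directly from the weights.
r_min, var_min = dot(w_min, mu), dot(w_min, matvec(V, w_min))
r_tan, var_tan = dot(w_tan, mu), dot(w_tan, matvec(V, w_tan))
```

The directly computed `r_min`, `var_min`, `r_tan`, and `var_tan` should agree with \(b/a\), \(1/a\), \(c/b\), and \(c/b^2\) up to floating-point error.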

Any portfolio on the efficient frontier can be written as a linear combination of any two distinct efficient portfolios. This is discussed in more detail in the section on Fund theorems. Writing it as a combination of the minimum variance and tangent portfolios gives

\[ \vec{w}_{\ast} = \psi \, \vec{w}_\mathrm{min} + (1-\psi) \, \vec{w}_\mathrm{tan} \]

where

\[ \psi = (c - b \, r_{\ast}) \, a \, / \, d \]

The efficient frontier portfolio can be equivalently written

\[\begin{align} \vec{w}_{\ast} &= \psi \, \vec{w}_\mathrm{min} + (1-\psi) \, \vec{w}_\mathrm{tan} \\ &= \left( \frac{c - b \, r_{\ast}}{d} \right) a \, \vec{w}_\mathrm{min} + \left( \frac{a \, r_{\ast} - b}{d} \right) b \, \vec{w}_\mathrm{tan} \\ &= \left( \frac{c - b \, r_{\ast}}{d} \right) V^{-1} \, \vec{1} + \left( \frac{a \, r_{\ast} - b}{d} \right) V^{-1} \, \vec{\mu} \end{align}\]

Along the frontier, the return is

\[ r_{\ast} = \psi \, r_\mathrm{min} + (1-\psi) \, r_\mathrm{tan} \]

The variance is

\[ \sigma^2_{\ast} = \frac{a}{d} \, r_{\ast}^{2} - \frac{2 \, b}{d} \, r_{\ast} + \frac{c}{d} \]
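A sketch of the shorts-allowed frontier for a general number of assets, assuming \(V\) is positive definite. The small Gaussian-elimination routine `solve` stands in for a linear-algebra library to keep the example dependency-free, and the three-asset inputs are hypothetical.

```python
def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [list(row) + [y_i] for row, y_i in zip(A, y)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def dot(x, y):
    return sum(x_i * y_i for x_i, y_i in zip(x, y))


def frontier_portfolio(mu, V, r_target):
    """Shorts-allowed frontier weights and variance for a target return."""
    ones = [1.0] * len(mu)
    Vinv_1, Vinv_mu = solve(V, ones), solve(V, mu)
    a, b, c = dot(ones, Vinv_1), dot(ones, Vinv_mu), dot(mu, Vinv_mu)
    d = a * c - b * b
    coef_1 = (c - b * r_target) / d     # coefficient of V^{-1} 1
    coef_mu = (a * r_target - b) / d    # coefficient of V^{-1} mu
    w = [coef_1 * u + coef_mu * v for u, v in zip(Vinv_1, Vinv_mu)]
    variance = (a * r_target**2 - 2 * b * r_target + c) / d
    return w, variance


# Hypothetical three-asset inputs (illustrative values only).
mu = [0.10, 0.06, 0.03]
V = [[0.040, 0.006, 0.0000],
     [0.006, 0.010, 0.0010],
     [0.000, 0.001, 0.0025]]

w_star, var_star = frontier_portfolio(mu, V, r_target=0.07)
direct_var = dot(w_star, [dot(row, w_star) for row in V])  # w^T V w
```

Scanning `r_target` over a range of values traces out the frontier; `direct_var` is included only as a consistency check against the quadratic formula for \(\sigma^2_{\ast}\).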

TODO: Note calculation order of \(\vec{w}_\mathrm{min}(\mu, V)\) and \(\vec{w}_\mathrm{tan}(\mu, V, r_\mathrm{f})\), then calculate \(r_{\ast}(\sigma_{\ast})\), scanning from \(\sigma_\mathrm{min}\) to \(\sigma_\mathrm{max}\).

Figure 3.1: The “Markowitz Bullet”, the efficient frontier shown in Markowitz (1959), p. 152.

In general, depending on the correlations of the assets, the efficient frontier portfolios will short various positions, indicated by having negative weights.

3.2.3 No-shorts frontier

If one adds an additional constraint to the Markowitz portfolio problem as stated, requiring that we don’t short any positions

\[ w_i \geq 0 \]

then the problem doesn’t have an analytic solution. TODO: Citation needed.

The no-shorts frontier can be solved numerically with quadratic programming. In general, the no-shorts frontier will follow the unconstrained efficient frontier when there isn’t any shorting in the efficient portfolios, and the no-shorts frontier will pull away from the efficient frontier to somewhat lower returns when there is shorting on the efficient frontier.

An example of the efficient frontier and the no-shorts frontier is shown in Figure 3.2.

Figure 3.2: The efficient frontier and no-shorts frontier for a few example assets, using daily data from 2014-01-02 to 2024-08-30, 10 years and 8 months.

Quadratic programming and convex optimization are discussed in more detail in the section on Convex optimization.
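To illustrate the effect of the no-shorts constraint without a QP library, here is a rough sketch that uses projected gradient descent, which is a simple stand-in and not a production quadratic programming solver. It assumes only the budget and no-shorts constraints (no return target), so it finds the no-shorts minimum variance portfolio. The function names, iteration count, and step-size rule are choices made for this example.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum_i w_i = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t  # keep the threshold from the last index that passes
    return [max(x - theta, 0.0) for x in v]


def min_variance_no_shorts(V, n_iter=2000):
    """Minimize w^T V w over the simplex by projected gradient descent."""
    n = len(V)
    # Step size 1 / (2 trace V) is a conservative bound for a PSD matrix V.
    step = 1.0 / (2.0 * sum(V[i][i] for i in range(n)))
    w = [1.0 / n] * n
    for _ in range(n_iter):
        grad = [2.0 * sum(V[i][j] * w[j] for j in range(n)) for i in range(n)]
        w = project_to_simplex([w_i - step * g_i for w_i, g_i in zip(w, grad)])
    return w


# Hypothetical covariance with correlation 0.75 between the two assets.
# The unconstrained minimum variance weights here are (1.25, -0.25).
V_corr = [[0.04, 0.06],
          [0.06, 0.16]]
w_no_shorts = min_variance_no_shorts(V_corr)
```

In this example the unconstrained solution shorts the second asset, while the constrained solution puts all weight on the first asset, so the constraint is binding. Adding a target-return constraint would require a more general solver.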

3.2.4 Efficient-market hypothesis

Efficient-market hypothesis
Eugene Fama (b. 1939)
Fama, E. (1970). Efficient capital markets: A review of theory and empirical work. ¹³

¹³ Fama (1970).

3.2.5 Lessons of MPT

Markowitz:

[I]n trying to make variance small it is not enough to invest in many securities. It is necessary to avoid investing in securities with high covariances among themselves. We should diversify across industries because firms in different industries, especially industries with different economic characteristics, have lower covariances than firms within an industry. ¹⁴

¹⁴ Markowitz (1952), p. 89.

Dalio: “The Holy Grail”, see Figure 3.3.

Figure 3.3: TODO: Citation needed. See: financhill.com.

3.3 Estimation of covariance matrices

3.3.1 Overview

This is how we estimate \(V\) (and \(\mu\)).

Estimation of covariance matrices
Marsaglia, G. (1964). Conditional means and covariances of normal variables with singular covariance matrix. ¹⁵
Coqueret, G. & Milhau, V. (2014). Estimating covariance matrices for portfolio optimization ¹⁶
Fan, J., Liao, Y., & Liu, H. (2015). An overview on the estimation of large covariance and precision matrices. ¹⁷
Ayyala, D.N. (2020). High-dimensional statistical inference: Theoretical development to data analytics. ¹⁸

¹⁵ Marsaglia (1964).

¹⁶ Coqueret & Milhau (2014).

¹⁷ Fan, Liao, & Liu (2015).

¹⁸ Ayyala (2020).

Figure 3.4: The correlation matrix of a few example assets using daily data from 2014-01-02 to 2024-08-30, 10 years and 8 months.

Software:

3.3.2 Sample mean and covariance

TODO:

Sample mean and covariance
Rolling mean and covariance

3.3.3 Online mean and covariance

Algorithms for calculating variance
Welford, B.P. (1962). Note on a method for calculating corrected sums of squares and products. ¹⁹
Neely, P.M. (1966). Comparison of several algorithms for computation of means, standard deviations and correlation coefficients. ²⁰
Youngs, E.A. & Cramer, E.M. (1971). Some results relevant to choice of sum and sum-of-product algorithms. ²¹
Ling, R.F. (1974). Comparison of several algorithms for computing sample means and variances. ²²
Chan, T.F., Golub, G.H., & LeVeque, R.J. (1979). Updating formulae and a pairwise algorithm for computing sample variances. ²³
Pébay, P. (2008). Formulas for robust, one-pass parallel computation of covariances and arbitrary-order statistical moments. ²⁴
Finch, T. (2009). Incremental calculation of weighted mean and variance. ²⁵
Cook, J.D. (2014). Accurately computing running variance.
Meng, X. (2015). Simpler online updates for arbitrary-order central moments. ²⁶
Pébay, P., Terriberry, T.B., Kolla, H. & Bennett, J. (2016). Numerically stable, scalable formulas for parallel and online computation of higher-order multivariate central moments with arbitrary weights. ²⁷
Schubert, E. & Gertz, M. (2018). Numerically stable parallel computation of (co-)variance. ²⁸
Chen, C. (2019). Welford algorithm for updating variance.

¹⁹ Welford (1962).

²⁰ Neely (1966).

²¹ Youngs & Cramer (1971).

²² Ling (1974).

²³ Chan, Golub, & LeVeque (1979).

²⁴ Pébay (2008).

²⁵ Finch (2009).

²⁶ Meng (2015).

²⁷ Pébay, Terriberry, Kolla, & Bennett (2016).

²⁸ Schubert & Gertz (2018).

Univariate central moments:

\[ \mu_{p} = \mathbb{E}\left[(x - \mu)^{p}\right] \]

\[ M_{p} = \sum_{i=1}^{n} (x_i - \mu)^{p} \]

\[ \mu = \frac{1}{n} \sum_{i=1}^{n} x_i \,, \qquad \mu_{p} = \frac{M_{p}}{n} \,, \qquad \sigma^2 = \frac{M_{2}}{n} \]

where \(\mu\) is the mean; the first central moment, \(M_{1}/n\), vanishes identically.

Online mean:

\[ \delta \equiv x_n - \mu_{n - 1} \]

\[ \hat{\mu}_{n} = \mu_{n-1} + \frac{\delta}{n} \]

Online variance:

\[ S_{n} = M_{2,n} \]

\[ \hat{\sigma}^2 = \frac{S_{n}}{n-1} \]

where the \(n-1\) includes Bessel’s correction for sample variance.

Incrementally,

\[ S_{n} = S_{n-1} + (x_n - \mu_{n - 1}) (x_n - \mu_n) \]

Note that for \(n > 1\),

\[ (x_n - \mu_n) = \frac{n-1}{n} (x_n - \mu_{n - 1}) \]

Therefore,

\[ S_{n} = S_{n-1} + \frac{n-1}{n} (x_n - \mu_{n - 1}) (x_n - \mu_{n - 1}) \]

\[ S_{n} = S_{n-1} + \frac{n-1}{n} \delta^2 = S_{n-1} + \delta \left( \delta - \frac{\delta}{n} \right) \]
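The incremental updates for the mean and \(S_n\) can be sketched as a small accumulator. The class name and the sample data are made up for this example, and a two-pass calculation is included only for comparison.

```python
class OnlineVariance:
    """Running mean and sample variance via Welford's update."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.S = 0.0  # running sum of squared deviations, M_2

    def update(self, x):
        self.n += 1
        delta = x - self.mean              # x_n - mu_{n-1}
        self.mean += delta / self.n        # mu_n
        self.S += delta * (x - self.mean)  # (x_n - mu_{n-1}) (x_n - mu_n)

    @property
    def variance(self):
        """Sample variance with Bessel's correction (needs n >= 2)."""
        return self.S / (self.n - 1)


data = [0.011, -0.004, 0.007, 0.002, -0.013, 0.009]  # made-up returns
acc = OnlineVariance()
for x in data:
    acc.update(x)

# Two-pass reference for comparison.
mean_ref = sum(data) / len(data)
var_ref = sum((x - mean_ref) ** 2 for x in data) / (len(data) - 1)
```

Each update uses only the previous mean and sum of squares, so the data never need to be stored.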

Online covariance (Welford algorithm):

\[ C_{n}(x, y) = C_{n-1} + (x_n - \bar{x}_{n - 1}) (y_n - \bar{y}_n) = C_{n-1} + \delta_{x} \delta_{y}^\prime \]

\[ C_{n}(x, y) = C_{n-1} + \frac{n-1}{n} (x_n - \bar{x}_{n - 1}) (y_n - \bar{y}_{n - 1}) = C_{n-1} + \frac{n-1}{n} \delta_{x} \delta_{y} \]

\[ \hat{V}_{xy} = \frac{C_{n}(x, y)}{n-1} \]

Matrix form:

\[ C_{n} = C_{n-1} + \left( \vec{x}_{n} - \vec{\mu}_{n-1} \right) \left( \vec{x}_{n} - \vec{\mu}_{n} \right)^\intercal = C_{n-1} + \vec{\delta} \: \vec{\delta^\prime}^\intercal \]

\[ C_{n} = C_{n-1} + \frac{n-1}{n} \left( \vec{x}_{n} - \vec{\mu}_{n-1} \right) \left( \vec{x}_{n} - \vec{\mu}_{n-1} \right)^\intercal = C_{n-1} + \frac{n-1}{n} S(\vec{x}_{n}, \vec{\mu}_{n-1}) \]

\[ \hat{V} = \frac{C_{n}}{n-1} \]

Note that the update term for the online covariance is a term in a scatter matrix, \(S\), using the currently observed data, \(\vec{x}_{n}\), and the previous means, \(\vec{\mu}_{n-1}\). The \(\vec{\delta} \: \vec{\delta^\prime}^\intercal\) form is also convenient because it comes naturally normalized and can be readily generalized for weighting.
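A sketch of the matrix-form update using plain Python lists. The class name and the sample rows are hypothetical, and the two-pass covariance at the end is only a cross-check.

```python
class OnlineCovariance:
    """Running mean vector and covariance matrix (matrix form of Welford)."""

    def __init__(self, dim):
        self.n = 0
        self.mean = [0.0] * dim
        self.C = [[0.0] * dim for _ in range(dim)]  # co-moment matrix C_n

    def update(self, x):
        self.n += 1
        # delta = x_n - mu_{n-1}
        delta = [x_i - m_i for x_i, m_i in zip(x, self.mean)]
        self.mean = [m_i + d_i / self.n for m_i, d_i in zip(self.mean, delta)]
        # delta_new = x_n - mu_n
        delta_new = [x_i - m_i for x_i, m_i in zip(x, self.mean)]
        for i, d_i in enumerate(delta):
            for j, d_j in enumerate(delta_new):
                self.C[i][j] += d_i * d_j  # outer product delta delta'^T

    def covariance(self):
        """Sample covariance matrix, V_hat = C_n / (n - 1); needs n >= 2."""
        return [[c_ij / (self.n - 1) for c_ij in row] for row in self.C]


# Made-up return observations for two assets, one row per time step.
rows = [(0.010, 0.004), (-0.020, -0.007), (0.015, 0.001), (0.003, 0.006)]
acc = OnlineCovariance(dim=2)
for row in rows:
    acc.update(row)
V_hat = acc.covariance()

# Two-pass reference for the off-diagonal element.
n = len(rows)
mx = sum(r[0] for r in rows) / n
my = sum(r[1] for r in rows) / n
cov_ref = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
```

Because \(\vec{\delta^\prime}\) is proportional to \(\vec{\delta}\), the accumulated matrix stays symmetric up to floating-point error.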

Weighted mean:

\[ \hat{\mu}_{n} = \mu_{n-1} + \frac{w_{n,n}}{W_n} \delta = \mu_{n-1} + \frac{w_{n,n}}{W_n} (x_n - \mu_{n - 1}) \]

where

\[ W_{n} = \sum_{i=1}^{n} w_{n,i} \]

Weighted covariance:

\[ C_{n} = \frac{W_n - w_{n,n}}{W_{n-1}} C_{n-1} + w_{n,n} \left( x_{n} - \bar{x}_{n-1} \right) \left( y_{n} - \bar{y}_{n} \right) \]

\[ C_{n} = \frac{W_n - w_{n,n}}{W_{n-1}} C_{n-1} + w_{n,n} \left( \vec{x}_{n} - \vec{\mu}_{n-1} \right) \left( \vec{x}_{n} - \vec{\mu}_{n} \right)^\intercal \]

where

\[ \hat{V} = \frac{C_{n}}{W_{n}} \]

Exponential-weighted mean:

\[ \alpha = 1 - \mathrm{exp}\left( \frac{-\Delta{}t}{\tau} \right) \simeq \frac{\Delta{}t}{\tau} \]

\[ \hat{\mu}_{n} = \mu_{n-1} + \alpha (x_{n} - \mu_{n-1}) = (1 - \alpha) \mu_{n-1} + \alpha x_{n} \]

Exponential-weighted covariance:

\[ C_{n} = (1 - \alpha) C_{n-1} + \alpha \left( x_{n} - \bar{x}_{n-1} \right) \left( y_{n} - \bar{y}_{n} \right) \]

\[ C_{n} = (1 - \alpha) C_{n-1} + \alpha \left( \vec{x}_{n} - \vec{\mu}_{n-1} \right) \left( \vec{x}_{n} - \vec{\mu}_{n} \right)^\intercal \]

where by summing a geometric series, one can show that for exponential weighting, \(W_{n} = 1\), so \(\hat{V} = C_{n}\).
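A sketch of the exponentially weighted updates for a pair of series. Seeding the means with the first observation and the covariance with zero is an assumption of this example, not something the formulas above prescribe, and the class and function names are hypothetical.

```python
import math


def ew_alpha(dt, tau):
    """Smoothing factor alpha = 1 - exp(-dt / tau)."""
    return 1.0 - math.exp(-dt / tau)


class EWCovariance:
    """Exponentially weighted means and covariance of two series."""

    def __init__(self, alpha, x0, y0):
        # Assumption: seed the means with the first observation and C with 0.
        self.alpha = alpha
        self.mean_x, self.mean_y = x0, y0
        self.C = 0.0

    def update(self, x, y):
        a = self.alpha
        dx = x - self.mean_x              # x_n - xbar_{n-1}
        self.mean_x += a * dx             # xbar_n
        self.mean_y += a * (y - self.mean_y)
        dy_new = y - self.mean_y          # y_n - ybar_n
        self.C = (1.0 - a) * self.C + a * dx * dy_new
        return self.C


alpha = ew_alpha(dt=1.0, tau=20.0)  # close to dt / tau = 0.05

ew = EWCovariance(alpha=0.5, x0=0.0, y0=0.0)
c_after_one = ew.update(2.0, 2.0)   # 0.5 * 0 + 0.5 * (2 - 0) * (2 - 1)
```

Passing the same series as both `x` and `y` gives the exponentially weighted variance.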

Rolling mean (reverse Welford algorithm):

\[ \vec{\mu}_{n-1} = \vec{\mu}_{n} - \frac{1}{n-1} \left( \vec{x}_{n} - \vec{\mu}_{n} \right) \]

Rolling covariance (reverse Welford algorithm):

\[ C_{n-1} = C_{n} - \left( \vec{x}_{n} - \vec{\mu}_{n-1} \right) \left( \vec{x}_{n} - \vec{\mu}_{n} \right)^\intercal = C_{n} - \vec{\delta} \: \vec{\delta^\prime}^\intercal \]
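A sketch of a fixed-window variance built from the forward and reverse updates. Since the mean and sum of squares do not depend on the order of the observations, the reverse step can remove the oldest value in the window rather than the most recent one. The class name and data are hypothetical, and the window is assumed to hold at least two values.

```python
from collections import deque


class RollingVariance:
    """Fixed-window mean and sample variance using Welford add/remove steps."""

    def __init__(self, window):
        self.window = window  # assumed >= 2
        self.values = deque()
        self.mean = 0.0
        self.S = 0.0

    def _add(self, x):
        n = len(self.values) + 1
        delta = x - self.mean
        self.mean += delta / n
        self.S += delta * (x - self.mean)
        self.values.append(x)

    def _remove_oldest(self):
        n = len(self.values)  # count before removal
        x = self.values.popleft()
        if n == 1:
            self.mean, self.S = 0.0, 0.0
            return
        mean_prev = self.mean - (x - self.mean) / (n - 1)  # reverse mean update
        self.S -= (x - mean_prev) * (x - self.mean)        # reverse S update
        self.mean = mean_prev

    def update(self, x):
        if len(self.values) == self.window:
            self._remove_oldest()
        self._add(x)

    @property
    def variance(self):
        """Sample variance of the current window (needs at least 2 values)."""
        return self.S / (len(self.values) - 1)


roll = RollingVariance(window=3)
for x in [1.0, 4.0, 2.0, 8.0, 5.0, 7.0]:
    roll.update(x)
# The window now holds the last three values: 8, 5, 7.
```

Subtractive updates like this can accumulate round-off over long runs, so recomputing from the stored window from time to time is a reasonable safeguard.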

3.3.4 Shrinkage estimators

Shrinkage
Ledoit, O. & Wolf, M. (2001). Honey, I shrunk the sample covariance matrix. ²⁹
Ledoit, O. & Wolf, M. (2003). Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. ³⁰

²⁹ Ledoit & Wolf (2001).

³⁰ Ledoit & Wolf (2003).

3.3.5 Precision matrices

Precision matrices
Galloway, M. (2019). Shrinking characteristics of precision matrix estimators: An illustration via regression. ³¹
Bax, K., Taufer, E., & Paterlini, S. (2022). A generalized precision matrix for t-Student distributions in portfolio optimization. ³²
Dutta, S. & Jain, S. (2023). Precision versus shrinkage: A comparative analysis of covariance estimation methods for portfolio allocation. ³³

³¹ Galloway (2019).

³² Bax, Taufer, & Paterlini (2022).

³³ Dutta & Jain (2023).

TODO:

Why are precision matrices sparse?

3.4 Convex optimization

This is how we minimize \(\sigma\).

Linear programming
- George Dantzig (1914-2005)
Quadratic programming
- No-shorts efficient frontier
- Karush-Kuhn-Tucker (KKT) conditions
- Jagannathan, R. & Ma, T. (2003). Risk reduction in large portfolios: Why imposing the wrong constraints helps. ³⁴
- Tam, A.S. (2021). Lagrangians and portfolio optimization.
Boyd, S. & Vandenberghe, L. (2004). Convex Optimization. ³⁵
- Course website at Stanford: Convex Optimization I
- Convex Optimization Short Course
- Boyd Lectures on youtube: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
- Affine combinations and convex sets
Markowitz’s Critical Line Algorithm (CLA)
- Markowitz, H.M. (1956). The optimization of a quadratic function subject to linear constraints. ³⁶
- Bailey, D.H. & López de Prado, M. (2013). An open-source implementation of the critical-line algorithm for portfolio optimization. ³⁷
- Markowitz, H.M., Starer, D., Fram, H., & Gerber, S. (2019). Avoiding the downside: A practical review of the Critical Line Algorithm for mean-semivariance portfolio optimization. ³⁸
Software:

³⁴ Jagannathan & Ma (2003).

³⁵ Boyd & Vandenberghe (2004).

³⁶ Markowitz (1956).

³⁷ Bailey & López de Prado (2013).

³⁸ Markowitz, Starer, Fram, & Gerber (2019).

TODO: Discuss optimizing the no-shorts frontier.

3.5 Fund theorems

3.5.1 Mutual fund separation theorem

Mutual fund separation theorem
Cass, D. & Stiglitz, J.E. (1970). The structure of investor preferences and asset returns, and separability in portfolio allocation: A contribution to the pure theory of mutual funds. ³⁹
Chamberlain, G. (1983). A characterization of the distributions that imply mean-variance utility functions. ⁴⁰
Owen, J. & Rabinovitch, R. (1983). On the class of elliptical distributions and their applications to the theory of portfolio choice. ⁴¹

³⁹ Cass & Stiglitz (1970).

⁴⁰ Chamberlain (1983).

⁴¹ Owen & Rabinovitch (1983).

Cass & Stiglitz:

[G]iven a market in which there are available \(n\) different assets, nonetheless all the opportunities relevant to the investor’s decision can be provided by a set of \(m\) (\(< n\)) “mutual funds,” i.e., a set of \(m\) linear combinations (with weights adding to one) of the available assets. ⁴²

⁴² Cass & Stiglitz (1970), p. 122.

3.5.2 Two-fund theorem

We continue in the context of a portfolio of risky assets only; the risk-free asset is considered in the next section.

Tobin⁴³ is often credited as the first to note, and later Merton⁴⁴ exposited more formally, the Two-fund theorem:

⁴³ Tobin (1958).

⁴⁴ Merton (1972).

Merton:

Given \(m\) assets satisfying the conditions […], there are two portfolios (“mutual funds”) constructed from these \(m\) assets, such that all risk-averse individuals, who choose their portfolios so as to maximize utility functions dependent only on the mean and variance of their portfolios, will be indifferent in choosing between portfolios from among the original \(m\) assets or from these two funds. ⁴⁵

⁴⁵ Merton (1972), p. 1858.

Kasa:

Any portfolio on the efficient frontier can be written as a linear combination of two fixed efficient portfolios.

\[ \vec{w}_{\ast} = \psi \, \vec{w}_{1} + (1-\psi) \, \vec{w}_{2} \]

TODO: reparameterize? ⁴⁶

⁴⁶ TODO: Throughout this we have parameterized \(\psi\) such that as it goes from 0 to 1, we go from holding portfolio 2 to holding portfolio 1. Let’s reparameterize so that \(\psi \rightarrow (1-\psi)\).

3.5.3 One-fund theorem

Now we consider adding the possibility of holding a risk-free asset with a risk-free return, \(r_\mathrm{f}\).

One-fund theorem:

Kwok:

Any efficient portfolio can be expressed as a [linear] combination of the risk free asset and the portfolio (or fund) represented by \(M\).

Kasa:

Any portfolio on the efficient frontier can be written as a linear combination of one fixed efficient non-risk-free portfolio and the risk-free asset.

The portfolio weights are

\[ \vec{w}_{\ast} = \kappa \, \vec{w}_\mathrm{f} + (1-\kappa) \, \vec{w}_\mathrm{tan} \]

The portfolio return is

\[ r_{\ast} = \kappa \, r_\mathrm{f} + (1-\kappa) \, r_\mathrm{tan} \]

The portfolio standard deviation is

\[ \sigma_{\ast} = \left| 1-\kappa \right| \sigma_\mathrm{tan} \]

Because the efficient frontier is now a linear combination of the risk-free asset (“cash”) and a single portfolio of risky assets (“stocks”), it forms a line in return-risk space from the risk-free asset to the tangent portfolio, and it continues further up that line if one allows borrowing at the risk-free rate to invest in the tangent portfolio. This line is called the Capital Allocation Line because it represents the possible portfolios one can hold depending on how much cash is deployed into risky assets in the market.

The functional form of the Capital Allocation Line is

\[ r_\mathrm{CAL}(\sigma) = r_\mathrm{f} + \sigma \, \sqrt{ a \, r_\mathrm{f}^{2} - 2 \, b \, r_\mathrm{f} + c} \]

TODO: Double-check the expression and example values of this slope.

Note that while the shape of the efficient frontier of risky assets is unchanged by introducing or varying the risk-free rate of return, which portfolio on that frontier is the tangent portfolio depends on the risk-free rate of return.

The tangent portfolio with a risk-free asset is

\[ \vec{w}_\mathrm{tan} = \frac{V^{-1} \, (\vec{\mu} - r_\mathrm{f} \, \vec{1})}{\vec{1}^\intercal \, V^{-1} \, (\vec{\mu} - r_\mathrm{f} \, \vec{1})} \]

It has a return

\[ r_\mathrm{tan} = \vec{\mu} \cdot \vec{w}_\mathrm{tan} = \frac{c - b \, r_\mathrm{f}}{b - a \, r_\mathrm{f}} \]

and a variance

\[ \sigma_\mathrm{tan}^{2} = \vec{w}_\mathrm{tan}^\intercal \, V \, \vec{w}_\mathrm{tan} = \frac{(\vec{\mu} - r_\mathrm{f} \, \vec{1})^\intercal \, V^{-1} \, (\vec{\mu} - r_\mathrm{f} \, \vec{1})}{\left( \vec{1}^\intercal \, V^{-1} \, (\vec{\mu} - r_\mathrm{f} \, \vec{1}) \right)^2} = \frac{a \, r_\mathrm{f}^2 - 2 \, b \, r_\mathrm{f} + c}{(b - a \, r_\mathrm{f})^2} \]

The tangent portfolio is the portfolio with the maximum Sharpe ratio, \(S_i\).

\[ S_i \equiv \frac{ r_i - r_\mathrm{f} }{ \sigma_i } \label{eq:sharpe_ratio} \]

The Sharpe ratio is a measure of how much excess return an asset had over a risk-free asset, adjusted for the risk as measured by the standard deviation of return.
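The relations between the tangent portfolio, its Sharpe ratio, and the slope of the Capital Allocation Line can be checked numerically. This is a two-asset sketch with hypothetical inputs, assuming \(r_\mathrm{f} < b/a\) so that the tangent portfolio lies on the upper branch of the frontier. The grid search is only a coarse sanity check.

```python
import math


def inv2(V):
    """Inverse of a 2x2 matrix (enough for this two-asset sketch)."""
    (p, q), (r, s) = V
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def matvec(M, x):
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def dot(x, y):
    return sum(x_i * y_i for x_i, y_i in zip(x, y))


# Hypothetical inputs (illustrative values only).
mu = [0.10, 0.04]
V = [[0.040, 0.006],
     [0.006, 0.010]]
r_f = 0.02
ones = [1.0, 1.0]
V_inv = inv2(V)


def sharpe(w):
    """Sharpe ratio of a fully invested portfolio of the risky assets."""
    return (dot(w, mu) - r_f) / math.sqrt(dot(w, matvec(V, w)))


# Tangent portfolio: V^{-1} (mu - r_f 1), normalized to sum to one.
z = matvec(V_inv, [m - r_f for m in mu])
w_tan = [z_i / sum(z) for z_i in z]
r_tan = dot(w_tan, mu)
var_tan = dot(w_tan, matvec(V, w_tan))
sharpe_tan = sharpe(w_tan)

# Slope of the Capital Allocation Line from the scalars a, b, c.
a = dot(ones, matvec(V_inv, ones))
b = dot(ones, matvec(V_inv, mu))
c = dot(mu, matvec(V_inv, mu))
cal_slope = math.sqrt(a * r_f**2 - 2 * b * r_f + c)

# Coarse grid over two-asset portfolios (x, 1 - x), including some shorting.
best_on_grid = max(sharpe([x / 100, 1 - x / 100]) for x in range(-50, 151))
```

For these inputs `sharpe_tan` and `cal_slope` should agree to floating-point precision, and no portfolio on the grid should have a higher Sharpe ratio than the tangent portfolio.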

TODO:

Citation needed for the one-fund theorem
Related to the efficient-market hypothesis: in equilibrium, the tangent portfolio becomes the market portfolio

3.6 Capital asset pricing model

Keywords:

Background:

Jensen, M. (1968). The performance of mutual funds in the period 1945-1964. ⁴⁷
Sharpe, W.F. (1963). A simplified model for portfolio analysis. ⁴⁸
Sharpe, W.F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. ⁴⁹
Sharpe, W.F. (1999). Portfolio Theory and Capital Markets. ⁵⁰
Sharpe, W.F. (1990). Nobel lecture: Capital asset prices with and without negative holdings. ⁵¹

⁴⁷ Jensen (1968).

⁴⁸ Sharpe (1963).

⁴⁹ Sharpe (1964).

⁵⁰ Sharpe (1999).

⁵¹ Sharpe (1990).

\[ \beta_i = \frac{ \mathrm{Cov}(r_i, r_m) }{ \mathrm{Var}(r_m) } = \mathrm{Cor}(r_i, r_m) \: \frac{\sigma_i}{\sigma_m} \label{eq:sharpe_beta} \]

Viewed in \(r_i\) vs \(r_m\) space, with points accumulating over time, \(\alpha_{i}\) and \(\beta_{i}\) can be estimated via linear regression:

SCL:

\[ r_{it} - r_\mathrm{f} = \hat{\alpha}_i + \hat{\beta}_i \, (r_{mt} - r_\mathrm{f}) + \varepsilon_{it} \label{eq:alpha_beta_regression} \]

The Security Characteristic Line (SCL) is the line in \(r_i\) vs \(r_m\), fit to a particular asset, \(i\), with its slope, \(\hat{\beta}_{i}\), and its \((r_i - r_\mathrm{f})\) intercept, \(\hat{\alpha}_i\).
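A sketch of the regression using ordinary least squares on excess returns. The data are synthetic, constructed with no noise so that the fitted values can be compared with the known inputs, and the function name is hypothetical.

```python
def fit_alpha_beta(asset_returns, market_returns, r_f):
    """OLS fit of (r_i - r_f) = alpha + beta (r_m - r_f) + eps."""
    y = [r - r_f for r in asset_returns]    # asset excess returns
    x = [r - r_f for r in market_returns]   # market excess returns
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    beta = cov_xy / var_x            # slope: Cov(r_i, r_m) / Var(r_m)
    alpha = y_bar - beta * x_bar     # intercept
    return alpha, beta


# Synthetic data built with alpha = 0.001 and beta = 1.5 (no noise term).
r_f = 0.0002
market = [0.012, -0.008, 0.005, 0.020, -0.015, 0.003]
asset = [r_f + 0.001 + 1.5 * (r_m - r_f) for r_m in market]

alpha_hat, beta_hat = fit_alpha_beta(asset, market, r_f)
```

With real data the residuals \(\varepsilon_{it}\) are not zero, and the estimates carry sampling error that this sketch does not quantify.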

Jensen’s alpha uses the same form, but evaluated at a particular time point, using the historical fit for \(\hat{\beta}_{i}\) but not for \(\alpha_{i}\).

\[ \alpha_{i} = (r_i - r_\mathrm{f}) - \hat{\beta}_{i} \, (r_m - r_\mathrm{f}) \label{eq:jensen_alpha} \]

TODO: Compare with this:

\[ \alpha_{i} = (r_i - r_\mathrm{f}) - \hat{\beta}_{i} \, (\mu_{m} - r_\mathrm{f}) \]

The Security Market Line (SML), viewed in \(r_i\) vs \(\beta_i\) space, goes through the market portfolio at (\(\beta_m\), \(r_m\)), where \(\beta_m = 1\).

SML:

\[ \mathbb{E}(r_i) = r_\mathrm{f} + \beta_i \left( \mathbb{E}(r_m) - r_\mathrm{f} \right) \]

Figure 3.5: The Capital Asset Pricing Model (CAPM) applied to a few example assets using daily data from 2014-01-02 to 2024-08-30, 10 years and 8 months.

Treynor ratio

\[ T_i \equiv \frac{ r_i - r_\mathrm{f} }{ \beta_i } \label{eq:treynor_ratio} \]

Gibbons, M., Ross, S., & Shanken, J. (1989). A test of the efficiency of a given portfolio. ⁵²
Luenberger, D.G. (1998). Investment Science. ⁵³

⁵² Gibbons, Ross, & Shanken (1989).

⁵³ Luenberger (1998).

3.7 Black-Litterman model

Black, F. & Litterman, R. (1991). Asset allocation.⁵⁴
Black, F. & Litterman, R. (1992). Global portfolio optimization. ⁵⁵

⁵⁴ Black & Litterman (1991).

⁵⁵ Black & Litterman (1992).

3.8 Factor models

3.8.1 Factor analysis

Factor analysis

3.8.2 Fama-French model

Fama-French three-factor model
Fama, E.F. & French, K.R. (1992). The cross-section of expected stock returns. ⁵⁶

⁵⁶ Fama & French (1992).

3.8.3 Carhart four-factor model

Carhart four-factor model
Yontar, T. & Benham, F. (2016). US small cap equity: Which benchmark is best? ⁵⁷

⁵⁷ Yontar & Benham (2016).

3.9 Risk preferences

Risk preferences
Kelly criterion
- Kelly, J.L. (1956). A new interpretation of information rate. ⁵⁸
General consumption/investment problem
- Gambler’s ruin problem
- Merton’s portfolio problem
- Merton, R.C. (1969). Lifetime portfolio selection under uncertainty: The continuous-time case. ⁵⁹
- Karatzas, I., Lehoczky, J.P., Sethi, S.P., & Shreve, S.E (1986). Explicit solution of a general consumption/investment problem. ⁶⁰
Conditional Value at Risk (CVaR) or Expected shortfall
- Rockafellar, R.T. & Uryasev, S. (2000). Optimization of conditional value-at-risk. ⁶¹

⁵⁸ Kelly (1956).

⁵⁹ Merton (1969).

⁶⁰ Karatzas, Lehoczky, Sethi, & Shreve (1986).

⁶¹ Rockafellar & Uryasev (2000).

3.10 Postmodern portfolio theory

3.10.1 Criticisms of MPT

Post-modern portfolio theory

Criticisms of MPT:

Sensitivity of portfolio weights to the estimates of \(\hat{\mu}\) and \(\hat{V}\).
- Error propagation
- Even assuming Gaussian distributed returns
Problem of induction
- Past performance is no guarantee of future results
- Criticisms of using historical estimators of \(\hat{\mu}\) and \(\hat{V}\)
- Non-Gaussian distributed returns
- Heteroskedasticity
Variance is not a good measure of risk
- Downside risk is better
Criticisms of the Efficient Market Hypothesis

3.10.2 Error propagation

Lo, A.W. (2002). The statistics of Sharpe ratios. ⁶²
Bodnar, T. & Schmid, W. (2011). On the exact distribution of the estimated expected utility portfolio weights: Theory and applications. ⁶³
Bodnar, T., Mazur, S., & Podgórski, K. (2016). Singular inverse Wishart distribution and its application to portfolio theory. ⁶⁴

⁶² Lo (2002).

⁶³ Bodnar & Schmid (2011).

⁶⁴ Bodnar, Mazur, & Podgórski (2016).

3.10.3 Heteroskedasticity

3.10.4 Downside risk

Downside risk, semi-variance, semi-deviation, target semi-variance (TSV), target semi-deviation
Markowitz, H.M., Starer, D., Fram, H., & Gerber, S. (2019). Avoiding the downside: A practical review of the Critical Line Algorithm for mean-semivariance portfolio optimization. ⁶⁵
Mean-Semivariance frontier in scikit-portfolio

⁶⁵ Markowitz et al. (2019).

\[ \mathrm{TSV}(r_i, r_\mathrm{tan}) = \mathbb{E}\left[ (r_i - r_\mathrm{tan})^2 \: \mathbb{1}_{\{r_i < r_\mathrm{tan}\}} \right] \label{eq:target_semi_variance} \]

\[ \mathrm{TSD}(r_i, r_\mathrm{tan}) = \sqrt{\mathrm{TSV}(r_i, r_\mathrm{tan})} \label{eq:target_semi_deviation} \]
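A sketch of sample estimates of the target semi-variance and target semi-deviation. The expectation is estimated by averaging over all observations, not only those below the target, which matches the indicator-function form above. The returns and the zero target are made up for this example.

```python
import math


def target_semi_variance(returns, target):
    """Mean of squared shortfalls below the target (zero at or above it)."""
    shortfalls = [min(r - target, 0.0) for r in returns]
    return sum(s * s for s in shortfalls) / len(returns)


def target_semi_deviation(returns, target):
    """Square root of the target semi-variance."""
    return math.sqrt(target_semi_variance(returns, target))


returns = [0.02, -0.01, 0.03, -0.03]   # made-up sample
tsv = target_semi_variance(returns, target=0.0)
tsd = target_semi_deviation(returns, target=0.0)
```

Only the two negative returns contribute here, giving a semi-variance of \((0.01^2 + 0.03^2)/4 = 2.5 \times 10^{-4}\).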

Sortino ratio

3.10.5 Criticisms of the Efficient Market Hypothesis

Bessembinder, H. & Chan, K. (1998). Market efficiency and the returns to technical analysis. ⁶⁶
TODO: Room for fundamentally-motivated indicator analysis — sits between fundamental analysis and technical analysis.

⁶⁶ Bessembinder & Chan (1998).

3.10.6 More

Rom, B.M. & Ferguson, K. (1993). Post-modern portfolio theory comes of age. ⁶⁷
Sortino, F. (2010). The Sortino Framework for Constructing Portfolios. ⁶⁸
Elton, E.J., Gruber, M.J., Brown, S.J., & Goetzmann, W.N. (2014). Modern Portfolio Theory and Investment Analysis. ⁶⁹
Low-volatility anomaly

⁶⁷ Rom & Ferguson (1993).

⁶⁸ Sortino (2010).

⁶⁹ Elton, Gruber, Brown, & Goetzmann (2014).

3.11 Hierarchical risk analysis

Asset trees
- Stock correlation network
- Mantegna, R.N. (1998). Hierarchical structure in financial markets. ⁷⁰
- Onnela, J.P., Chakraborti, A., Kaski, K., Kertész, J., & Kanto, A. (2003). Dynamics of market correlations: Taxonomy and portfolio analysis. ⁷¹
- Onnela, J.P., Kaski, K., & Kertész, J. (2004). Clustering and information in correlation based financial networks. ⁷²
Hierarchical Risk Parity (HRP)
- López de Prado, M. (2016). Building diversified portfolios that outperform out-of-sample. ⁷³
- López de Prado, M. (2018). Advances in Financial Machine Learning. ⁷⁴
- Lohre, H., Rother, C., & Schäfer, K.A. (2020). Hierarchical Risk Parity: Accounting for tail dependencies in multi-asset multi-factor allocations. ⁷⁵
Raffinot, T. (2018). Hierarchical clustering-based asset allocation. (HCAA) ⁷⁶
Raffinot, T. (2018). The hierarchical equal risk contribution portfolio. (HERC) ⁷⁷
Hudson & Thames. (2024). The Modern Guide to Portfolio Optimization. ⁷⁸
Cotton, P. (2024). Hierarchical minimum variance portfolios: A unifying approach using Schur complements. ⁷⁹
Blogs:

⁷⁰ Mantegna (1998).

⁷¹ Onnela et al. (2003).

⁷² Onnela, Kaski, & Kertész (2004).

⁷³ López de Prado (2016).

⁷⁴ López de Prado (2018).

⁷⁵ Lohre, Rother, & Schäfer (2020).

⁷⁶ Raffinot (2018a).

⁷⁷ Raffinot (2018b).

⁷⁸ Hudson & Thames (2024).

⁷⁹ Cotton (2024).

Correlation distance between two assets:

\[ d_{ij} = \sqrt{\frac{1}{2} \left( 1 - \rho_{ij} \right)} \label{eq:hrp_distance} \]

Euclidean distance between two assets in \(n\)-asset space:

\[ \tilde{d}_{ij} = \sqrt{ \sum_{k=1}^{n} \left( d_{ki} - d_{kj} \right)^{2} } \label{eq:hrp_tilde_distance} \]

Note that \(\tilde{d}_{ij}\) is a function of the entire correlation matrix over all assets, whereas \(d_{ij}\) is defined for asset pairs.
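Both distances can be sketched directly from a correlation matrix. The three-asset correlation matrix is hypothetical, and the function names are chosen for this example.

```python
import math


def correlation_distance(rho):
    """d_ij = sqrt((1 - rho_ij) / 2) for each pair of assets."""
    # max(..., 0.0) guards against tiny negative values from round-off.
    return [[math.sqrt(max(0.5 * (1.0 - r_ij), 0.0)) for r_ij in row]
            for row in rho]


def distance_of_distances(d):
    """d~_ij: Euclidean distance between columns i and j of the matrix d."""
    n = len(d)
    return [[math.sqrt(sum((d[k][i] - d[k][j]) ** 2 for k in range(n)))
             for j in range(n)] for i in range(n)]


# Hypothetical correlation matrix for three assets.
rho = [[1.0, 0.8, 0.0],
       [0.8, 1.0, 0.1],
       [0.0, 0.1, 1.0]]
d = correlation_distance(rho)
d_tilde = distance_of_distances(d)
```

In this example the first two assets are highly correlated, so they are close under both measures, while the third asset is far from both.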

Figure 3.6: Dendrogram after hierarchical clustering.

3.12 Forecasting

Models:

Vector autoregression models (VAR)
Bayesian VAR
Autoregressive conditional heteroskedasticity (ARCH)
Autoregressive moving average (ARMA)
Generalized autoregressive conditional heteroskedasticity (GARCH)
Autoregressive integrated moving average (ARIMA)

Davis, J.H., Brandl-Cheng, L., & Khang, K. (2024). Megatrends and the U.S. economy, 1890-2040. ⁸⁰
Koop, G.M. (2013). Forecasting with medium and large Bayesian VARs. ⁸¹

⁸⁰ Davis, Brandl-Cheng, & Khang (2024).

⁸¹ Koop (2013).

Ayyala, D. N. (2020). High-dimensional statistical inference: Theoretical development to data analytics. In A. S. R. S. Rao & C. R. Rao (Eds.), Handbook of Statistics (pp. 289–335). Elsevier.

Bailey, D. H. & López de Prado, M. (2013). An open-source implementation of the critical-line algorithm for portfolio optimization. Algorithms, 6, 169–196. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2197616

Bax, K., Taufer, E., & Paterlini, S. (2022). A generalized precision matrix for t-Student distributions in portfolio optimization. https://arxiv.org/abs/2203.13740

Bessembinder, H. & Chan, K. (1998). Market efficiency and the returns to technical analysis. Financial Management, 27, 5–17. https://www.jstor.org/stable/3666289

Black, F. & Litterman, R. (1991). Asset allocation. Journal of Fixed Income, 1, 7–18.

———. (1992). Global portfolio optimization. Financial Analysts Journal, 48, 28–43.

Bodnar, T., Mazur, S., & Podgórski, K. (2016). Singular inverse Wishart distribution and its application to portfolio theory. Journal of Multivariate Analysis, 143, 314–326. https://www.sciencedirect.com/science/article/pii/S0047259X15002353

Bodnar, T. & Schmid, W. (2011). On the exact distribution of the estimated expected utility portfolio weights: Theory and applications. Statistics & Risk Modeling, 28, 319–342.

Boyd, S. & Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press. https://web.stanford.edu/~boyd/cvxbook/

Cass, D. & Stiglitz, J. E. (1970). The structure of investor preferences and asset returns, and separability in portfolio allocation: A contribution to the pure theory of mutual funds. Journal of Economic Theory, 2.

Chamberlain, G. (1983). A characterization of the distributions that imply mean-variance utility functions. Journal of Economic Theory, 29, 185–201.

Chan, T. F., Golub, G. H., & LeVeque, R. J. (1979). Updating formulae and a pairwise algorithm for computing sample variances. Stanford University. Technical Report STAN-CS-79-773. http://infolab.stanford.edu/pub/cstr/reports/cs/tr/79/773/CS-TR-79-773.pdf

Coqueret, G. & Milhau, V. (2014). Estimating covariance matrices for portfolio optimization. https://web.archive.org/web/20230928050136/https://www.gcoqueret.com/files/Estim_cov.pdf

Cotton, P. (2024). Hierarchical minimum variance portfolios: A unifying approach using Schur complements. https://github.com/microprediction/precise/blob/main/academic/Schur_Complementary_Portfolios_Peter_Cotton.pdf

Das, S. (2016). Data Science: Theories, Models, Algorithms, and Analytics. http://srdas.github.io/Papers/DSA_Book.pdf

Davis, J. H., Brandl-Cheng, L., & Khang, K. (2024). Megatrends and the U.S. economy, 1890-2040. https://ssrn.com/abstract=4702028

Dutta, S. & Jain, S. (2023). Precision versus shrinkage: A comparative analysis of covariance estimation methods for portfolio allocation. https://arxiv.org/abs/2305.11298

Elton, E. J., Gruber, M. J., Brown, S. J., & Goetzmann, W. N. (2014). Modern Portfolio Theory and Investment Analysis (9th ed.). Wiley.

Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. Journal of Finance, 25, 383–417. http://www.e-m-h.org/Fama70.pdf

Fama, E. F. & French, K. R. (1992). The cross-section of expected stock returns. Journal of Finance, 47, 427–465. https://onlinelibrary.wiley.com/doi/10.1111/j.1540-6261.1992.tb04398.x

Fan, J., Liao, Y., & Liu, H. (2015). An overview on the estimation of large covariance and precision matrices. https://arxiv.org/abs/1504.02995

Finch, T. (2009). Incremental calculation of weighted mean and variance. https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=5f77d0594f66e49cc0e8c2b8177ead33b7137183

Galloway, M. (2019). Shrinking characteristics of precision matrix estimators: An illustration via regression. https://mattxgalloway.com/oral_manuscript/

Gibbons, M., Ross, S., & Shanken, J. (1989). A test of the efficiency of a given portfolio. Econometrica, 57, 1121–1152. https://www.jstor.org/stable/1913625

Hudson & Thames. (2024). The Modern Guide to Portfolio Optimization. https://github.com/hudson-and-thames/guide_to_modern_portfolio_optimization

Jagannathan, R. & Ma, T. (2003). Risk reduction in large portfolios: Why imposing the wrong constraints helps. Journal of Finance, 58, 1651–1683. https://www.jstor.org/stable/3648224

Jensen, M. (1968). The performance of mutual funds in the period 1945-1964. Journal of Finance, 23, 389–416. https://www.jstor.org/stable/2325404

Karatzas, I., Lehoczky, J. P., Sethi, S. P., & Shreve, S. (1986). Explicit solution of a general consumption/investment problem. Mathematics of Operations Research, 11, 261–294. https://www.jstor.org/stable/3689808

Kelly, J. L. (1956). A new interpretation of information rate. Bell System Technical Journal, 35, 917–926. https://www.princeton.edu/~wbialek/rome/refs/kelly_56.pdf

Koop, G. M. (2013). Forecasting with medium and large Bayesian VARs. Journal of Applied Econometrics, 28, 177–203. https://ssrn.com/abstract=1514412

Kwok, Y. K. (2017). Lecture notes: Fundamentals of Mathematical Finance. https://www.math.hkust.edu.hk/~maykwok/MATH4512.htm

Ledoit, O. & Wolf, M. (2001). Honey, I shrunk the sample covariance matrix. http://www.ledoit.net/honey.pdf

———. (2003). Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. Journal of Empirical Finance, 10, 603–621. http://www.ledoit.net/ole2.pdf

Levy, H. & Markowitz, H. M. (1979). Approximating expected utility by a function of mean and variance. American Economic Review, 69, 308–317.

Ling, R. F. (1974). Comparison of several algorithms for computing sample means and variances. Journal of the American Statistical Association, 69, 859–866.

Lo, A. W. (2002). The statistics of Sharpe ratios. Financial Analysts Journal, 58, 36–52. https://www.jstor.org/stable/4480405

Lohre, H., Rother, C., & Schäfer, K. A. (2020). Hierarchical Risk Parity: Accounting for tail dependencies in multi-asset multi-factor allocations. In E. Jurczenko (Ed.), Machine Learning and Asset Management (pp. 332–368). ISTE and Wiley. https://papers.ssrn.com/sol3/Delivery.cfm?abstractid=3513399

López de Prado, M. (2016). Building diversified portfolios that outperform out-of-sample. Journal of Portfolio Management. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2708678

———. (2018). Advances in Financial Machine Learning. Wiley.

Luenberger, D. G. (1998). Investment Science. Oxford University Press.

Mantegna, R. N. (1998). Hierarchical structure in financial markets. https://arxiv.org/abs/cond-mat/9802256

Markowitz, H. M. (1952). Portfolio selection. Journal of Finance, 7, 77–91. https://www.jstor.org/stable/2975974

———. (1956). The optimization of a quadratic function subject to linear constraints. Naval Research Logistics Quarterly, 3, 111–133.

———. (1959). Portfolio Selection: Efficient Diversification of Investments. Wiley.

———. (1990). Nobel lecture: Foundations of portfolio theory. https://www.nobelprize.org/uploads/2018/06/markowitz-lecture.pdf

———. (2005). Market efficiency: A theoretical distinction and so what? Financial Analysts Journal, 61, 17–30.

Markowitz, H. M., Starer, D., Fram, H., & Gerber, S. (2019). Avoiding the downside: A practical review of the Critical Line Algorithm for mean-semivariance portfolio optimization. https://www.hudsonbaycapital.com/documents/FG/hudsonbay/research/599440_paper.pdf

Marsaglia, G. (1964). Conditional means and covariances of normal variables with singular covariance matrix. Journal of the American Statistical Association, 59, 1203–1204.

Meng, X. (2015). Simpler online updates for arbitrary-order central moments. https://arxiv.org/abs/1510.04923

Merton, R. C. (1969). Lifetime portfolio selection under uncertainty: The continuous-time case. Review of Economics and Statistics, 51, 247–257. https://www.jstor.org/stable/1926560

———. (1972). An analytic derivation of the efficient portfolio frontier. Journal of Financial and Quantitative Analysis, 7, 1851–1872. https://www.jstor.org/stable/2329621

Neely, P. M. (1966). Comparison of several algorithms for computation of means, standard deviations and correlation coefficients. Communications of the ACM, 9, 496–499. https://dl.acm.org/doi/pdf/10.1145/365719.365958

Onnela, J. P., Kaski, K., & Kertész, J. (2004). Clustering and information in correlation based financial networks. European Physical Journal B, 38, 353–362. https://link.springer.com/article/10.1140/epjb/e2004-00128-7

Onnela, J. P. et al. (2003). Dynamics of market correlations: Taxonomy and portfolio analysis. http://arXiv.org/abs/cond-mat/0302546v1

Owen, J. & Rabinovitch, R. (1983). On the class of elliptical distributions and their applications to the theory of portfolio choice. Journal of Finance, 38, 745–752.

Pébay, P. (2008). Formulas for robust, one-pass parallel computation of covariances and arbitrary-order statistical moments. Sandia National Laboratories. Technical Report SAND2008-6212. https://www.osti.gov/servlets/purl/1028931

Pébay, P., Terriberry, T. B., Kolla, H., & Bennett, J. (2016). Numerically stable, scalable formulas for parallel and online computation of higher-order multivariate central moments with arbitrary weights. Computational Statistics, 31, 1305–1325. https://link.springer.com/article/10.1007/s00180-015-0637-z

Raffinot, T. (2018a). Hierarchical clustering-based asset allocation. Journal of Portfolio Management, 44, 89–99. https://www.proquest.com/docview/2196551795

———. (2018b). The hierarchical equal risk contribution portfolio. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3237540

Rockafellar, R. T. & Uryasev, S. (2000). Optimization of conditional value-at-risk. Journal of Risk, 2, 21–42. https://sites.math.washington.edu/~rtr/papers/rtr179-CVaR1.pdf

Rom, B. M. & Ferguson, K. (1993). Post-modern portfolio theory comes of age. Journal of Investing. Winter 1993.

Roy, A. D. (1952). Safety first and the holding of assets. Econometrica, 20, 431–449. https://www.jstor.org/stable/1907413

Schubert, E. & Gertz, M. (2018). Numerically stable parallel computation of (co-)variance. Proceedings of the 30th International Conference on Scientific and Statistical Database Management, SSDBM18. https://ds.ifi.uni-heidelberg.de/files/Team/eschubert/publications/SSDBM18-covariance-slides.pdf

Sharpe, W. F. (1963). A simplified model for portfolio analysis. Management Science, 9, 277–293.

———. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. Journal of Finance, 19, 425–442.

———. (1990). Nobel lecture: Capital asset prices with and without negative holdings. https://www.nobelprize.org/uploads/2018/06/sharpe-lecture.pdf

———. (1999). Portfolio Theory and Capital Markets. McGraw-Hill. (Originally published in 1970).

Sortino, F. (2010). The Sortino Framework for Constructing Portfolios. Elsevier.

Tobin, J. (1958). Liquidity preference as behavior towards risk. Review of Economic Studies, 25, 65–86. https://doi.org/10.2307/2296205

Welford, B. P. (1962). Note on a method for calculating corrected sums of squares and products. Technometrics, 4, 419–420.

Yontar, T. & Benham, F. (2016). US small cap equity: Which benchmark is best? Meketa Investment Group. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2770660

Youngs, E. A. & Cramer, E. M. (1971). Some results relevant to choice of sum and sum-of-product algorithms. Technometrics, 13, 657–665. https://www.jstor.org/stable/1267176