3 Portfolio Theory
3.1 Introduction
TODO: Given some menu of possible investments, what mix should we hold? How should we hold value through time?
3.2 Modern portfolio theory
3.2.1 History and pedagogy
Keywords:
- Modern portfolio theory (MPT)
- Harry Markowitz (1927-2023)
- Markowitz model
- Rayleigh quotient
- Sharpe ratio
Historical background:
- Markowitz, H.M. (1952). Portfolio selection. 1
- Roy, A.D. (1952). Safety first and the holding of assets. 2
- Markowitz, H.M. (1959). Portfolio Selection: Efficient Diversification of Investments. 3
- Merton, R.C. (1972). An analytic derivation of the efficient portfolio frontier. 4
- Levy, H. & Markowitz, H.M. (1979). Approximating expected utility by a function of mean and variance. 5
- Markowitz, H.M. (1990). Nobel lecture: Foundations of portfolio theory. 6
- Markowitz, H.M. (2005). Market efficiency: A theoretical distinction and so what? 7
1 Markowitz (1952).
2 Roy (1952).
3 Markowitz (1959).
4 Merton (1972).
5 Levy & Markowitz (1979).
6 Markowitz (1990).
7 Markowitz (2005).
Lecture notes:
- Armerin, F. (2023). Lecture notes: More on mean-variance analysis.
- Das, S.R. (2016). Data Science: Theories, Models, Algorithms, and Analytics. 8
- Also: Das, S.R. (2017). Being mean with variance: Markowitz optimization.
- Kasa, K. (2023). Lecture notes by Ken Kasa (SFU)
- In particular, Lecture 7
- Kwok, Y.K. (2017). Lecture notes: Fundamentals of Mathematical Finance. 9
- In particular, Lecture 2
- Sigman, K. (2005). Notes on fund theorems.
3.2.2 Markowitz portfolio problem
Return of a portfolio:
\[ r = \vec{w}^\intercal \, \vec{r} = \sum_i w_{i} \, r_{i} \]
Variance of a portfolio:
\[ \sigma^2 = \vec{w}^\intercal \, V \, \vec{w} = \sum_{ij} w_{i} \, V_{ij} \, w_{j} \]
TODO: Show above 10
10 Luenberger (1998), p. 150.
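A short sketch of why the two expressions above hold, assuming the weights \(\vec{w}\) are fixed (not random), and writing \(\vec{\mu} = \mathbb{E}[\vec{r}]\) and \(V = \mathbb{E}\left[ (\vec{r} - \vec{\mu}) (\vec{r} - \vec{\mu})^\intercal \right]\):
\[\begin{align} \mathbb{E}[r] &= \mathbb{E}\left[ \vec{w}^\intercal \, \vec{r} \right] = \vec{w}^\intercal \, \vec{\mu} \\ \sigma^2 &= \mathbb{E}\left[ (r - \mathbb{E}[r])^2 \right] = \mathbb{E}\left[ \vec{w}^\intercal \, (\vec{r} - \vec{\mu}) (\vec{r} - \vec{\mu})^\intercal \, \vec{w} \right] = \vec{w}^\intercal \, V \, \vec{w} \end{align}\]
The first line uses linearity of expectation. The second uses \(r - \mathbb{E}[r] = \vec{w}^\intercal (\vec{r} - \vec{\mu})\) and pulls the constant weights outside the expectation.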
Given an \(n\)-dimensional vector of expected returns, \(\vec{\mu}\), an \(n\times{}n\)-dimensional expected covariance matrix, \(V\), an \(m\times{}n\)-dimensional constraint matrix, \(A\), an \(m\)-dimensional constraint vector, \(\vec{b}\), and a target return, \(r_{\ast}\), solve for the portfolio weights, \(\vec{w}_{\ast}\), an \(n\)-dimensional vector, that are efficient, i.e. those that minimize the standard deviation of the portfolio return, \(\sigma\), for a given target return. Return \((\vec{w}_{\ast}, \sigma_{\ast})\). 11
Solve
\[ \vec{w}_{\ast} = \underset{\vec{w}}{\mathrm{argmin}}\ \vec{w}^\intercal \, V \, \vec{w} \]
such that
\[ \vec{w} \cdot \vec{1} = 1 \]
\[ \vec{w} \cdot \vec{\mu} = r_{\ast} \]
and with further optional constraints
\[ A \, \vec{w} \geq \vec{b} \]
11 Markowitz (1959), p. 172.
Several topics are worth discussing when solving for the efficient frontier:
- The analytic solution that exists when shorting is allowed
- Solving with Lagrange multipliers
- Solving with numerical convex optimization
TODO: Discuss the above more.
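As a minimal sketch of the Lagrange-multiplier approach, assuming only the two equality constraints (shorting allowed, no \(A \, \vec{w} \geq \vec{b}\) constraints): setting the gradient of the Lagrangian to zero gives \(2 \, V \, \vec{w} = \lambda_1 \, \vec{1} + \lambda_2 \, \vec{\mu}\), which together with the two constraints forms one linear system. The function name `solve_markowitz_kkt` and the example values of `mu` and `V` are hypothetical, chosen only for illustration.

```python
import numpy as np

def solve_markowitz_kkt(mu, V, r_target):
    """Minimize w' V w subject to sum(w) = 1 and w' mu = r_target (shorts allowed)."""
    n = len(mu)
    ones = np.ones(n)
    # Constraint matrix C (2 x n): the rows are 1' and mu'.
    C = np.vstack([ones, mu])
    # Stationarity (2 V w - C' lam = 0) stacked on feasibility (C w = [1, r_target]).
    K = np.block([[2.0 * V, -C.T], [C, np.zeros((2, 2))]])
    rhs = np.concatenate([np.zeros(n), [1.0, r_target]])
    sol = np.linalg.solve(K, rhs)
    w = sol[:n]  # the last two entries of sol are the multipliers
    sigma = float(np.sqrt(w @ V @ w))
    return w, sigma

# Hypothetical example inputs (not from the text).
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])
w_star, sigma_star = solve_markowitz_kkt(mu, V, r_target=0.09)
print(w_star, sigma_star)
```

The linear system is solvable when \(V\) is positive definite and \(\vec{\mu}\) is not proportional to \(\vec{1}\). Inequality constraints such as no-shorting need a numerical solver instead.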
It can be shown 12 that there is an analytic solution, written in terms of the following quantities:
12 Merton (1972) may have been the first to show that there is an analytic solution to the Markowitz portfolio problem (TODO: confirm). For the analytic results discussed here, we generally follow Kwok (2017). Note that we use variable names following Kwok; to convert from Merton to Kwok: \(a_\mathrm{M} = b_\mathrm{K}\), \(b_\mathrm{M} = c_\mathrm{K}\), \(c_\mathrm{M} = a_\mathrm{K}\).
\[ a \equiv \vec{1}^\intercal \, V^{-1} \, \vec{1}, \qquad b \equiv \vec{1}^\intercal \, V^{-1} \, \vec{\mu}, \qquad c \equiv \vec{\mu}^\intercal \, V^{-1} \, \vec{\mu}, \qquad d \equiv a\,c - b^2 \]
There are two efficient portfolios of note: the minimum variance portfolio, \(\vec{w}_v\), and the tangent portfolio, \(\vec{w}_t\).
The minimum variance portfolio is
\[ \vec{w}_{v} = \frac{V^{-1} \, \vec{1}}{a} = \frac{V^{-1} \, \vec{1}}{\vec{1}^\intercal \, V^{-1} \, \vec{1}} \]
It has a return
\[ r_{v} = \vec{w}_{v} \cdot \vec{\mu} = \frac{\vec{1}^\intercal \, V^{-1} \, \vec{\mu}}{a} = \frac{b}{a} \]
and a variance
\[ \sigma_{v}^2 = \vec{w}_{v}^\intercal \, V \, \vec{w}_{v} = \left( \frac{\vec{1}^\intercal \, V^{-1}}{a} \right) V \left( \frac{V^{-1} \, \vec{1}}{a} \right) = \frac{\vec{1}^\intercal \, V^{-1} \, \vec{1}}{a^2} = \frac{1}{a} \]
The tangent portfolio is
\[ \vec{w}_{t} = \frac{V^{-1} \, \vec{\mu}}{b} = \frac{V^{-1} \, \vec{\mu}}{\vec{1}^\intercal \, V^{-1} \, \vec{\mu}} \]
It has a return
\[ r_{t} = \vec{w}_{t} \cdot \vec{\mu} = \frac{\vec{\mu}^\intercal \, V^{-1} \, \vec{\mu}}{b} = \frac{c}{b} \]
and a variance
\[ \sigma_{t}^2 = \vec{w}_{t}^\intercal \, V \, \vec{w}_{t} = \left( \frac{\vec{\mu}^\intercal \, V^{-1}}{b} \right) V \left( \frac{V^{-1} \, \vec{\mu}}{b} \right) = \frac{\vec{\mu}^\intercal \, V^{-1} \, \vec{\mu}}{b^2} = \frac{c}{b^2} \]
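The closed-form results above can be checked numerically. The following is a minimal sketch with NumPy, assuming hypothetical example values for `mu` and `V` (they are not taken from the text). It computes \(a\), \(b\), \(c\), \(d\), then the minimum variance and tangent portfolios, and compares their returns and variances with \(b/a\), \(1/a\), \(c/b\), and \(c/b^2\).

```python
import numpy as np

# Hypothetical example inputs (not from the text).
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])
ones = np.ones(len(mu))

# Solve linear systems rather than forming V^{-1} explicitly.
Vinv_1 = np.linalg.solve(V, ones)   # V^{-1} 1
Vinv_mu = np.linalg.solve(V, mu)    # V^{-1} mu

a = ones @ Vinv_1
b = ones @ Vinv_mu
c = mu @ Vinv_mu
d = a * c - b**2

w_v = Vinv_1 / a    # minimum variance portfolio
w_t = Vinv_mu / b   # tangent portfolio (as defined above, without a risk-free rate)

r_v, var_v = w_v @ mu, w_v @ V @ w_v
r_t, var_t = w_t @ mu, w_t @ V @ w_t

print(np.isclose(r_v, b / a), np.isclose(var_v, 1 / a))
print(np.isclose(r_t, c / b), np.isclose(var_t, c / b**2))
```

Note that \(d > 0\) whenever \(V\) is positive definite and \(\vec{\mu}\) is not proportional to \(\vec{1}\), by the Cauchy-Schwarz inequality.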
The efficient frontier can be written as a linear combination of any two efficient portfolios. This is discussed in more detail in the section on Fund theorems. Writing it as a combination of the minimum variance and tangent portfolios gives
\[ \vec{w}_{\ast} = \xi \, \vec{w}_{v} + (1-\xi) \, \vec{w}_{t} \]
where
\[ \xi = (c - b \, r_{\ast}) \, a \, / \, d \]
The efficient frontier portfolio can be equivalently written
\[\begin{align} \vec{w}_{\ast} &= \xi \, \vec{w}_{v} + (1-\xi) \, \vec{w}_{t} \\ &= \left( \frac{c - b \, r_{\ast}}{d} \right) a \, \vec{w}_{v} + \left( \frac{a \, r_{\ast} - b}{d} \right) b \, \vec{w}_{t} \\ &= \left( \frac{c - b \, r_{\ast}}{d} \right) V^{-1} \, \vec{1} + \left( \frac{a \, r_{\ast} - b}{d} \right) V^{-1} \, \vec{\mu} \end{align}\]
Along the frontier, the return is
\[ r_{\ast} = \xi \, r_{v} + (1-\xi) \, r_{t} \]
The variance is
\[ \sigma^2_{\ast} = \frac{a}{d} \, r_{\ast}^{2} - \frac{2 \, b}{d} \, r_{\ast} + \frac{c}{d} \]
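A minimal sketch of tracing the frontier from these formulas, assuming hypothetical example inputs and a hypothetical helper name `frontier_portfolio`. For each target return it builds \(\vec{w}_{\ast}\) from \(\xi\), and compares \(\vec{w}_{\ast}^\intercal \, V \, \vec{w}_{\ast}\) with the quadratic expression for \(\sigma^2_{\ast}\).

```python
import numpy as np

def frontier_portfolio(mu, V, r_target):
    """Return (weights, variance) of the frontier portfolio with return r_target."""
    ones = np.ones(len(mu))
    Vinv_1 = np.linalg.solve(V, ones)
    Vinv_mu = np.linalg.solve(V, mu)
    a, b, c = ones @ Vinv_1, ones @ Vinv_mu, mu @ Vinv_mu
    d = a * c - b**2
    w_v, w_t = Vinv_1 / a, Vinv_mu / b
    xi = (c - b * r_target) * a / d
    w = xi * w_v + (1.0 - xi) * w_t
    var = (a * r_target**2 - 2.0 * b * r_target + c) / d
    return w, var

# Hypothetical example inputs (not from the text).
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])

for r_target in np.linspace(0.07, 0.12, 6):
    w, var = frontier_portfolio(mu, V, r_target)
    # The weights sum to one, hit the target return, and match the variance formula.
    print(round(r_target, 3), np.round(w, 4), np.isclose(w @ V @ w, var))
```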
TODO: Note calculation order of \(\vec{w}_{v}(\mu, V)\) and \(\vec{w}_{t}(\mu, V, r_{f})\), then calculate \(r_{\ast}(\sigma_{\ast})\), scanning from \(\sigma_{v}\) to \(\sigma_\mathrm{max}\).
In general, depending on the correlations of the assets, the efficient frontier portfolios will short various positions, indicated by having negative weights.
3.2.3 No-shorts frontier
If one adds an additional constraint to the Markowitz portfolio problem as stated, requiring that we don’t short any positions
\[ w_i \geq 0 \]
then the problem doesn’t have an analytic solution. TODO: Citation needed.
The no-shorts frontier can be solved numerically with quadratic programming. In general, the no-shorts frontier will follow the unconstrained efficient frontier when there isn’t any shorting in the efficient portfolios, and the no-shorts frontier will pull away from the efficient frontier to somewhat lower returns when there is shorting on the efficient frontier.
An example of the efficient frontier and the no-shorts frontier is shown in Figure 3.2.
Quadratic programming and convex optimization are discussed in more detail in the section on Convex optimization.
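A minimal numerical sketch of one point on the no-shorts frontier, assuming SciPy's general-purpose `scipy.optimize.minimize` with the SLSQP method as a stand-in for a dedicated quadratic programming solver. The helper name `no_shorts_portfolio` and the example inputs are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def no_shorts_portfolio(mu, V, r_target):
    """Minimize w' V w subject to sum(w) = 1, w' mu = r_target, and w >= 0."""
    n = len(mu)
    constraints = [
        {"type": "eq", "fun": lambda w: np.sum(w) - 1.0},
        {"type": "eq", "fun": lambda w: w @ mu - r_target},
    ]
    bounds = [(0.0, None)] * n      # the no-shorts constraint
    w0 = np.full(n, 1.0 / n)        # start from equal weights
    res = minimize(lambda w: w @ V @ w, w0, jac=lambda w: 2.0 * V @ w,
                   method="SLSQP", bounds=bounds, constraints=constraints)
    if not res.success:
        raise RuntimeError(res.message)
    return res.x, float(np.sqrt(res.fun))

# Hypothetical example inputs (not from the text).
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])
w_ns, sigma_ns = no_shorts_portfolio(mu, V, r_target=0.11)
print(np.round(w_ns, 4), sigma_ns)
```

For these example inputs the unconstrained frontier portfolio at a target return of 0.11 shorts the first asset, so the no-shorts solution sets that weight to zero and accepts a somewhat higher standard deviation for the same return.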
3.2.4 Lessons of MPT
Markowitz:
[I]n trying to make variance small it is not enough to invest in many securities. It is necessary to avoid investing in securities with high covariances among themselves. We should diversify across industries because firms in different industries, especially industries with different economic characteristics, have lower covariances than firms within an industry. 13
13 Markowitz (1952), p. 89.
Dalio: “The Holy Grail”, see Figure 3.3.
3.3 Estimation of covariance matrices
This is how we estimate \(V\) (and \(\mu\)).
- Algorithms for calculating variance
- Estimation of covariance matrices
- Shrinkage
- Ledoit, O. & Wolf, M. (2001). Honey, I shrunk the sample covariance matrix. 14
- Ledoit, O. & Wolf, M. (2003). Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. 15
- Coqueret, G. & Milhau, V. (2014). Estimating covariance matrices for portfolio optimization 16
- Covariance estimation and asset trees
- Stock correlation network
- Mantegna, R.N. (1998). Hierarchical structure in financial markets. 17
- Onnela, J.P., Chakraborti, A., Kaski, K., Kertesz, J., & Kanto, A. (2003). Dynamics of market correlations: Taxonomy and portfolio analysis. 18
- Onnela, J.P., Kaski, K., & Kertész, J. (2004). Clustering and information in correlation based financial networks. 19
- See Hierarchical Risk Parity
14 Ledoit & Wolf (2001).
15 Ledoit & Wolf (2003).
16 Coqueret & Milhau (2014).
17 Mantegna (1998).
18 Onnela, J.P. et al. (2003).
19 Onnela, Kaski, & Kertész (2004).
TODO:
- Sample mean and covariance
- Rolling mean and covariance
- Exponential moving mean and covariance
- Online mean and covariance
- Shrinkage estimators
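As a starting point for the first and fourth items, here is a minimal sketch of the sample mean and covariance with NumPy, alongside an online (one observation at a time) update in the style of Welford's algorithm. The class name `OnlineMeanCov` and the simulated returns are hypothetical, and this is only one of several possible online formulations.

```python
import numpy as np

class OnlineMeanCov:
    """Running mean and sample covariance, updated one observation at a time."""

    def __init__(self, n):
        self.count = 0
        self.mean = np.zeros(n)
        self.M2 = np.zeros((n, n))  # running sum of deviation outer products

    def update(self, x):
        x = np.asarray(x, dtype=float)
        self.count += 1
        delta = x - self.mean                       # deviation from the old mean
        self.mean += delta / self.count
        self.M2 += np.outer(delta, x - self.mean)   # old-mean times new-mean deviation

    @property
    def cov(self):
        return self.M2 / (self.count - 1)           # unbiased (ddof = 1)

# Simulated returns: 200 periods of 3 assets (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(0.001, 0.02, size=(200, 3))

# Batch sample estimates.
mu_hat = X.mean(axis=0)
V_hat = np.cov(X, rowvar=False)

# Online estimates agree with the batch estimates.
est = OnlineMeanCov(3)
for x in X:
    est.update(x)
print(np.allclose(est.mean, mu_hat), np.allclose(est.cov, V_hat))
```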
3.4 Convex optimization
This is how we minimize \(\sigma\).
- Affine combinations and convex sets
- Linear programming
- George Dantzig (1914-2005)
- Quadratic programming
- Markowitz’s Critical Line Algorithm (CLA) 20
- No-shorts efficient frontier
- Karush-Kuhn-Tucker (KKT) conditions
- Jagannathan, R. & Ma, T. (2003). Risk reduction in large portfolios: Why imposing the wrong constraints helps. 21
- Tam, A.S. (2021). Lagrangians and portfolio optimization.
- Boyd, S. & Vandenberghe, L. (2004). Convex Optimization.
- Software:
TODO: Discuss optimizing the no-shorts frontier.
3.5 Fund theorems
3.5.1 Mutual fund separation theorem
- Mutual fund separation theorem
- Cass, D. & Stiglitz, J.E. (1970). The structure of investor preferences and asset returns, and separability in portfolio allocation: A contribution to the pure theory of mutual funds. 22
- Chamberlain, G. (1983). A characterization of the distributions that imply mean-variance utility functions. 23
- Owen, J. & Rabinovitch, R. (1983). On the class of elliptical distributions and their applications to the theory of portfolio choice. 24
Cass & Stiglitz:
[G]iven a market in which there are available \(n\) different assets, nonetheless all the opportunities relevant to the investor’s decision can be provided by a set of \(m\) (\(< n\)) “mutual funds,” i.e., a set of \(m\) linear combinations (with weights adding to one) of the available assets. 25
25 Cass & Stiglitz (1970), p. 122.
3.5.2 Two-fund theorem
We continue the discussion in the context of a portfolio of risky assets only; adding a risk-free asset is considered in the next section.
Tobin 26 is often credited as the first to note the Two-fund theorem, which Merton 27 later set out more formally:
Merton:
Given \(m\) assets satisfying the conditions […], there are two portfolios (“mutual funds”) constructed from these \(m\) assets, such that all risk-averse individuals, who choose their portfolios so as to maximize utility functions dependent only on the mean and variance of their portfolios, will be indifferent in choosing between portfolios from among the original \(m\) assets or from these two funds. 28
28 Merton (1972), p. 1858.
Kasa:
Any portfolio on the efficient frontier can be written as a linear combination of two fixed efficient portfolios.
\[ \vec{w}_{\ast} = \xi \, \vec{w}_{1} + (1-\xi) \, \vec{w}_{2} \]
TODO: reparameterize? 29
29 TODO: Throughout this we have parameterized \(\xi\) so that as it goes from 0 to 1, we go from holding portfolio 2 to holding portfolio 1. Let’s reparameterize so that \(\xi \rightarrow (1-\xi)\).
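A minimal numerical illustration of the two-fund theorem, assuming the analytic frontier weights from the Markowitz portfolio problem section (shorting allowed). The helper name `frontier_weights`, the example inputs, and the choice of \(\xi\) are hypothetical. Because the frontier weights are affine in the target return, mixing two frontier portfolios gives the frontier portfolio at the mixed return.

```python
import numpy as np

def frontier_weights(mu, V, r_target):
    """Analytic frontier weights for a target return (shorts allowed)."""
    ones = np.ones(len(mu))
    Vinv_1 = np.linalg.solve(V, ones)
    Vinv_mu = np.linalg.solve(V, mu)
    a, b, c = ones @ Vinv_1, ones @ Vinv_mu, mu @ Vinv_mu
    d = a * c - b**2
    return ((c - b * r_target) * Vinv_1 + (a * r_target - b) * Vinv_mu) / d

# Hypothetical example inputs (not from the text).
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])

# Two fixed efficient portfolios ("funds").
r_1, r_2 = 0.07, 0.10
w_1 = frontier_weights(mu, V, r_1)
w_2 = frontier_weights(mu, V, r_2)

# Mix the two funds and compare with the frontier portfolio at the mixed return.
xi = 0.3
w_mix = xi * w_1 + (1.0 - xi) * w_2
r_mix = xi * r_1 + (1.0 - xi) * r_2
print(np.allclose(w_mix, frontier_weights(mu, V, r_mix)))
```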
3.5.3 One-fund theorem
Now we consider adding the possibility of holding a risk-free asset with a risk-free return, \(r_{f}\).
One-fund theorem:
Kwok:
Any efficient portfolio [on the Capital Allocation Line] can be expressed as a combination of the risk free asset and the portfolio (or fund) represented by \(M\).
Kasa:
Any portfolio on the efficient frontier can be written as a linear combination of one fixed efficient non-risk-free portfolio and the risk-free asset.
The portfolio weights are
\[ \vec{w}_{\ast} = \kappa \, \vec{w}_{f} + (1-\kappa) \, \vec{w}_{t} \]
The portfolio return is
\[ r_{\ast} = \kappa \, r_{f} + (1-\kappa) \, r_{t} \]
The portfolio standard deviation is
\[ \sigma_{\ast} = \left| 1-\kappa \right| \sigma_{t} \]
Since the efficient frontier is a linear combination of the risk-free asset (“cash”) and a single portfolio of risky assets (“stocks”), it forms a line in return-risk space from the risk-free asset to the tangent portfolio, and continues further up the same line if one allows borrowing at the risk-free rate to invest in the tangent portfolio. This line is called the Capital Allocation Line because it represents the possible portfolios an investor can hold depending on how much of their cash is deployed into risky assets in the market.
The functional form of the Capital Allocation Line is
\[ r_\mathrm{CAL}(\sigma) = r_{f} + \sigma \, \sqrt{ a \, r_{f}^{2} - 2 \, b \, r_{f} + c} \]
TODO: Double-check the expression and example values of this slope.
Note that while the shape of the efficient frontier of risky assets is unchanged by introducing or varying the risk-free rate of return, which portfolio along the frontier is the tangent portfolio depends on the risk-free rate of return.
The tangent portfolio with a risk-free asset is
\[ \vec{w}_{t} = \frac{V^{-1} \, (\vec{\mu} - r_{f} \, \vec{1})}{\vec{1}^\intercal \, V^{-1} \, (\vec{\mu} - r_{f} \, \vec{1})} \]
It has a return
\[ r_{t} = \vec{\mu} \cdot \vec{w}_t = \frac{c - b \, r_{f}}{b - a \, r_{f}} \]
and a variance
\[ \sigma_{t}^{2} = \vec{w}_{t}^\intercal \, V \, \vec{w}_{t} = \frac{(\vec{\mu} - r_{f} \, \vec{1})^\intercal \, V^{-1} \, (\vec{\mu} - r_{f} \, \vec{1})}{\left( \vec{1}^\intercal \, V^{-1} \, (\vec{\mu} - r_{f} \, \vec{1}) \right)^2} = \frac{a \, r_{f}^2 - 2 \, b \, r_{f} + c}{(b - a \, r_{f})^2} \]
The tangent portfolio is the portfolio with the maximum Sharpe ratio, \(S_i\).
\[ S_i \equiv \frac{ r_i - r_f }{ \sigma_i } \label{eq:sharpe_ratio} \]
The Sharpe ratio is a measure of how much excess return an asset had over a risk-free asset, adjusted for the risk as measured by the standard deviation of return.
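A minimal numerical check of the tangent portfolio and the slope of the Capital Allocation Line, assuming hypothetical example values for `mu`, `V`, and `r_f` (with \(r_f\) below the return of the minimum variance portfolio, so that \(b - a \, r_f > 0\)). It computes the tangent portfolio from its definition and compares its Sharpe ratio with \(\sqrt{a \, r_f^2 - 2 \, b \, r_f + c}\).

```python
import numpy as np

# Hypothetical example inputs (not from the text).
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])
r_f = 0.02

ones = np.ones(len(mu))
z = np.linalg.solve(V, mu - r_f * ones)   # V^{-1} (mu - r_f 1)
w_t = z / (ones @ z)                      # tangent portfolio weights

r_t = w_t @ mu
sigma_t = np.sqrt(w_t @ V @ w_t)
sharpe_t = (r_t - r_f) / sigma_t

a = ones @ np.linalg.solve(V, ones)
b = ones @ np.linalg.solve(V, mu)
c = mu @ np.linalg.solve(V, mu)
slope = np.sqrt(a * r_f**2 - 2.0 * b * r_f + c)   # slope of the CAL

def r_cal(sigma):
    """Return on the Capital Allocation Line at standard deviation sigma."""
    return r_f + sigma * slope

print(np.isclose(sharpe_t, slope), np.isclose(r_cal(sigma_t), r_t))
```

Under these assumptions the Sharpe ratio of the tangent portfolio equals the slope of the line, and the line passes through \((\sigma_t, r_t)\).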
TODO:
- Citation needed for the one-fund theorem
- Related to the efficient-market hypothesis: in equilibrium, the tangent portfolio becomes the market portfolio
3.6 Efficient-market hypothesis
- Efficient-market hypothesis
- Eugene Fama (b. 1939)
- Fama, E. (1970). Efficient capital markets: A review of theory and empirical work. 30
30 Fama (1970).
3.7 Capital asset pricing model
Keywords:
- Capital Asset Pricing Model (CAPM)
- William F. Sharpe (b. 1934)
- Beta
- Alpha
- Security Characteristic Line (SCL)
- Security Market Line (SML)
- Jensen’s alpha
- Treynor ratio
Background:
- Jensen, M. (1968). The performance of mutual funds in the period 1945-1964. 31
- Sharpe, W.F. (1963). A simplified model for portfolio analysis. 32
- Sharpe, W.F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. 33
- Sharpe, W.F. (1999). Portfolio Theory and Capital Markets. 34
- Sharpe, W.F. (1990). Nobel lecture: Capital asset prices with and without negative holdings. 35
\[ \beta_i = \frac{ \mathrm{Cov}(r_i, r_m) }{ \mathrm{Var}(r_m) } = \mathrm{Cor}(r_i, r_m) \: \frac{\sigma_i}{\sigma_m} \label{eq:sharpe_beta} \]
Viewed in \(r_i\) vs \(r_m\) space, with points accumulated over time, \(\alpha_{i}\) and \(\beta_{i}\) can be estimated via linear regression:
SCL:
\[ r_{it} - r_f = \hat{\alpha}_i + \hat{\beta}_i \, (r_{mt} - r_f) + \varepsilon_{it} \label{eq:alpha_beta_regression} \]
The Security Characteristic Line (SCL) is the line in \(r_i\) vs \(r_m\) space, fit to a particular asset, \(i\), with slope \(\hat{\beta}_{i}\) and \((r_i - r_f)\)-intercept \(\hat{\alpha}_i\).
Jensen’s alpha uses the same form, but at a particular time point, using a historical fit for \(\hat{\beta}_{i}\), but not \(\alpha_{i}\).
\[ \alpha_{i} = (r_i - r_f) - \hat{\beta}_{i} \, (r_m - r_f) \label{eq:jensen_alpha} \]
TODO: Compare with this:
\[ \alpha_{i} = (r_i - r_f) - \hat{\beta}_{i} \, (\mu_{m} - r_f) \]
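A minimal sketch of estimating \(\hat{\alpha}_i\) and \(\hat{\beta}_i\) from the SCL regression, using simulated returns with a known alpha and beta. The data, the per-period risk-free return, and the variable names are hypothetical, and `np.polyfit` is used here as a simple ordinary least squares fit.

```python
import numpy as np

rng = np.random.default_rng(0)
r_f = 0.0001                      # hypothetical per-period risk-free return
true_alpha, true_beta = 0.0002, 1.3

# Simulated per-period market and asset returns (hypothetical data).
r_m = r_f + rng.normal(0.0005, 0.01, size=1000)
r_i = r_f + true_alpha + true_beta * (r_m - r_f) + rng.normal(0.0, 0.005, size=1000)

x = r_m - r_f   # market excess return
y = r_i - r_f   # asset excess return

# Beta from its definition, Cov(r_i, r_m) / Var(r_m).
beta_cov = np.cov(y, x)[0, 1] / np.var(x, ddof=1)

# The same beta, plus alpha, from the SCL regression.
# For a degree-1 fit, polyfit returns [slope, intercept].
beta_hat, alpha_hat = np.polyfit(x, y, 1)

print(beta_cov, beta_hat, alpha_hat)
```

The two beta estimates agree up to floating-point error, since the ordinary least squares slope is the sample covariance divided by the sample variance of the regressor.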
The Security Market Line (SML), viewed in \(r_i\) vs \(\beta_i\) space, goes through the market portfolio at (\(\beta_m = 1\), \(r_m\)).
SML:
\[ \mathbb{E}(r_i) = r_f + \beta_i \left( \mathbb{E}(r_m) - r_f \right) \]
\[ T_i \equiv \frac{ r_i - r_f }{ \beta_i } \label{eq:treynor_ratio} \]
- Gibbons, M., Ross, S., & Shanken, J. (1989). A test of the efficiency of a given portfolio. 36
- Luenberger, D.G. (1998). Investment Science. 37
3.8 Black-Litterman model
3.9 Factor models
3.9.1 Factor analysis
3.9.2 Fama-French model
- Fama-French three-factor model
- Fama, E.F. & French, K.R. (1992). The cross-section of expected stock returns. 40
40 Fama & French (1992).
3.9.3 Carhart four-factor model
3.10 Risk preferences
- Risk preferences
- Kelly criterion
- Kelly, J.L. (1956). A new interpretation of information rate. 41
- General consumption/investment problem
- Gambler’s ruin problem
- Merton’s portfolio problem
- Merton, R.C. (1969). Lifetime portfolio selection under uncertainty: The continuous-time case. 42
- Karatzas, I., Lehoczky, J.P., Sethi, S.P., & Shreve, S.E. (1986). Explicit solution of a general consumption/investment problem. 43
- Conditional Value at Risk (CVaR) or Expected shortfall
- Rockafellar, R.T. & Uryasev, S. (2000). Optimization of conditional value-at-risk. 44
3.11 Postmodern portfolio theory
3.11.1 Criticisms of MPT
Criticisms of MPT:
- Sensitivity of portfolio weights to the estimates of \(\hat{\mu}\) and \(\hat{V}\).
- Error propagation
- Problem of induction
- Past performance is no guarantee of future results
- Criticisms of using historical estimators of \(\hat{\mu}\) and \(\hat{V}\)
- Variance is not a good measure of risk
- Downside risk is better
3.11.2 Error propagation
- TODO
- Lo, A.W. (2002). The statistics of Sharpe ratios. 45
45 Lo (2002).
3.11.3 Downside risk
- Downside risk, semi-variance, semi-deviation, target semi-variance (TSV), target semi-deviation
- Markowitz, H.M., Starer, D., Fram, H., & Gerber, S. (2019). Avoiding the downside: A practical review of the Critical Line Algorithm for mean-semivariance portfolio optimization. 46
- Mean-Semivariance frontier in scikit-portfolio
46 Markowitz, Starer, Fram, & Gerber (2019).
\[ \mathrm{TSV}(r_i, r_t) = \mathbb{E}\left[ (r_i - r_t)^2 \: \mathbb{1}_{\{r_i < r_t\}} \right] \label{eq:target_semi_variance} \]
\[ \mathrm{TSD}(r_i, r_t) = \sqrt{\mathrm{TSV}(r_i, r_t)} \label{eq:target_semi_deviation} \]
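A minimal sketch of sample estimators for the target semi-variance and target semi-deviation, assuming the expectation is estimated by an average over all observed returns (not only those below the target). The function names and example returns are hypothetical.

```python
import numpy as np

def target_semi_variance(returns, target):
    """Sample estimate of E[(r - target)^2 * 1{r < target}]."""
    returns = np.asarray(returns, dtype=float)
    # Returns at or above the target contribute zero shortfall.
    shortfall = np.minimum(returns - target, 0.0)
    return float(np.mean(shortfall**2))

def target_semi_deviation(returns, target):
    """Square root of the target semi-variance."""
    return float(np.sqrt(target_semi_variance(returns, target)))

# Hypothetical example returns with a target return of zero.
returns = np.array([0.02, -0.01, 0.03, -0.04, 0.00])
tsv = target_semi_variance(returns, target=0.0)
tsd = target_semi_deviation(returns, target=0.0)
print(tsv, tsd)
```

Only the two negative returns contribute here, so the estimate is \((0.01^2 + 0.04^2) / 5 = 0.00034\).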
3.11.4 More
- Rom, B.M. & Ferguson, K. (1993). Post-modern portfolio theory comes of age. 47
- Sortino, F. (2010). The Sortino Framework for Constructing Portfolios. 48
- Elton, E.J., Gruber, M.J., Brown, S.J., & Goetzmann, W.N. (2014). Modern Portfolio Theory and Investment Analysis. 49
- Low-volatility anomaly
3.12 Hierarchical risk parity
- Hierarchical Risk Parity (HRP)
- López de Prado, M. (2016). Building diversified portfolios that outperform out-of-sample. 50
- López de Prado, M. (2018). Advances in Financial Machine Learning. 51
- Lohre, H., Rother, C., & Schäfer, K.A. (2020). Hierarchical Risk Parity: Accounting for tail dependencies in multi-asset multi-factor allocations. 52
- Blogs: