3  Portfolio Theory

Published

October 21, 2024

3.1 Introduction

TODO: Given some menu of possible investments, what mix should we hold? How should we hold value through time?

3.2 Modern portfolio theory

3.2.1 History and pedagogy

Keywords:

Historical background:

1 Markowitz (1952).

2 Roy (1952).

3 Markowitz (1959).

4 Merton (1972).

5 Levy & Markowitz (1979).

6 Markowitz (1990).

7 Markowitz (2005).

Lecture notes:

8 Das (2016).

9 Kwok (2017).

3.2.2 Markowitz portfolio problem

Return of a portfolio:

\[ r = \vec{w}^\intercal \, \vec{r} = \sum_i w_{i} \, r_{i} \]

Variance of a portfolio:

\[ \sigma^2 = \vec{w}^\intercal \, V \, \vec{w} = \sum_{ij} w_{i} \, V_{ij} \, w_{j} \]
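As a quick numerical check of the two expressions above, the following sketch compares the mean and variance computed from the weights with the mean and variance of the portfolio return series itself. The return data, the weights, and all variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 250 periods of returns for 3 assets (rows = periods).
returns = rng.normal(loc=0.001, scale=0.02, size=(250, 3))
w = np.array([0.5, 0.3, 0.2])  # portfolio weights, summing to 1

mu_hat = returns.mean(axis=0)          # sample mean return per asset
V_hat = np.cov(returns, rowvar=False)  # sample covariance matrix (ddof=1)

# Portfolio return in each period: r = w^T r
port_returns = returns @ w

mean_from_weights = w @ mu_hat
var_from_weights = w @ V_hat @ w  # sigma^2 = w^T V w

# Both routes agree up to floating-point error.
print(np.isclose(mean_from_weights, port_returns.mean()))
print(np.isclose(var_from_weights, port_returns.var(ddof=1)))
```

The agreement is exact (not just approximate) because the sample covariance is bilinear in the return series, which is the same property the variance formula relies on.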

TODO: Show above 10

10 Luenberger (1998), p. 150.

Markowitz portfolio problem

Given an \(n\)-dimensional vector of expected returns, \(\vec{\mu}\), an \(n\times{}n\)-dimensional expected covariance matrix, \(V\), an \(m\times{}n\)-dimensional constraint matrix, \(A\), an \(m\)-dimensional constraint vector, \(\vec{b}\), and a target return, \(r_{\ast}\), solve for the portfolio weights, \(\vec{w}_{\ast}\), an \(n\)-dimensional vector, that are efficient, i.e. those that minimize the standard deviation of the portfolio return, \(\sigma\), for a given target return. Return \((\vec{w}_{\ast}, \sigma_{\ast})\). 11

Solve

\[ \vec{w}_{\ast} = \underset{\vec{w}}{\mathrm{argmin}}\ \vec{w}^\intercal \, V \, \vec{w} \]

such that

\[ \vec{w} \cdot \vec{1} = 1 \]

\[ \vec{w} \cdot \vec{\mu} = r_{\ast} \]

and with further optional constraints

\[ A \, \vec{w} \geq \vec{b} \]

11 Markowitz (1959), p. 172.

There are a lot of topics to discuss about solving for the efficient frontier:

  • How there is an analytic solution if you allow shorts
  • Solving with Lagrange multipliers
  • Solving with numerical convex optimization

TODO: Discuss the above more.
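As a minimal sketch of the Lagrange-multiplier approach, assuming shorts are allowed and only the two equality constraints are imposed: setting the gradient of the Lagrangian to zero and stacking it with the constraints gives one linear system in \((\vec{w}, \lambda_1, \lambda_2)\). The function name `solve_markowitz_kkt` and the three-asset inputs are hypothetical and used only for illustration.

```python
import numpy as np

def solve_markowitz_kkt(mu, V, r_target):
    """Minimize w^T V w subject to 1^T w = 1 and mu^T w = r_target (shorts allowed)."""
    n = len(mu)
    ones = np.ones(n)
    # Stationarity of the Lagrangian: 2 V w - lam1 * 1 - lam2 * mu = 0,
    # stacked with the two equality constraints into one linear system.
    K = np.zeros((n + 2, n + 2))
    K[:n, :n] = 2.0 * V
    K[:n, n] = -ones
    K[:n, n + 1] = -mu
    K[n, :n] = ones
    K[n + 1, :n] = mu
    rhs = np.concatenate([np.zeros(n), [1.0, r_target]])
    solution = np.linalg.solve(K, rhs)
    w = solution[:n]  # the last two entries are the multipliers
    sigma = np.sqrt(w @ V @ w)
    return w, sigma

# Hypothetical expected returns and covariance matrix for three assets.
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.040, 0.006, 0.010],
              [0.006, 0.090, 0.018],
              [0.010, 0.018, 0.160]])

w_star, sigma_star = solve_markowitz_kkt(mu, V, r_target=0.09)
print(w_star, sigma_star)
```

The system is solvable when \(V\) is positive definite and \(\vec{\mu}\) is not proportional to \(\vec{1}\).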

It can be shown12 that there is an analytic solution where:

12 Merton (1972) is possibly the first to show that the Markowitz portfolio problem has an analytic solution. For the analytic results discussed here, we generally follow Kwok (2017). We use variable names following Kwok; to convert from Merton to Kwok: \(a_\mathrm{M} = b_\mathrm{K}\), \(b_\mathrm{M} = c_\mathrm{K}\), \(c_\mathrm{M} = a_\mathrm{K}\).

\[ a \equiv \vec{1}^\intercal \, V^{-1} \, \vec{1}, \qquad b \equiv \vec{1}^\intercal \, V^{-1} \, \vec{\mu}, \qquad c \equiv \vec{\mu}^\intercal \, V^{-1} \, \vec{\mu}, \qquad d \equiv a\,c - b^2 \]

There are two efficient portfolios of note: the minimum variance portfolio, \(\vec{w}_v\), and the tangent portfolio, \(\vec{w}_t\).

The minimum variance portfolio is

\[ \vec{w}_{v} = \frac{V^{-1} \, \vec{1}}{a} = \frac{V^{-1} \, \vec{1}}{\vec{1}^\intercal \, V^{-1} \, \vec{1}} \]

It has a return

\[ r_{v} = \vec{w}_{v} \cdot \vec{\mu} = \frac{\vec{1}^\intercal \, V^{-1} \, \vec{\mu}}{a} = \frac{b}{a} \]

and a variance

\[ \sigma_{v}^2 = \vec{w}_{v}^\intercal \, V \, \vec{w}_{v} = \left( \frac{\vec{1}^\intercal \, V^{-1}}{a} \right) V \left( \frac{V^{-1} \, \vec{1}}{a} \right) = \frac{\vec{1}^\intercal \, V^{-1} \, \vec{1}}{a^2} = \frac{1}{a} \]

The tangent portfolio is

\[ \vec{w}_{t} = \frac{V^{-1} \, \vec{\mu}}{b} = \frac{V^{-1} \, \vec{\mu}}{\vec{1}^\intercal \, V^{-1} \, \vec{\mu}} \]

It has a return

\[ r_{t} = \vec{w}_{t} \cdot \vec{\mu} = \frac{\vec{\mu}^\intercal \, V^{-1} \, \vec{\mu}}{b} = \frac{c}{b} \]

and a variance

\[ \sigma_{t}^2 = \vec{w}_{t}^\intercal \, V \, \vec{w}_{t} = \left( \frac{\vec{\mu}^\intercal \, V^{-1}}{b} \right) V \left( \frac{V^{-1} \, \vec{\mu}}{b} \right) = \frac{\vec{\mu}^\intercal \, V^{-1} \, \vec{\mu}}{b^2} = \frac{c}{b^2} \]
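The closed-form expressions above can be checked numerically. The sketch below computes \(a\), \(b\), \(c\), \(d\) and the two portfolios for a hypothetical three-asset example; the inputs are invented, and `np.linalg.solve` is used in place of an explicit matrix inverse.

```python
import numpy as np

# Hypothetical expected returns and covariance matrix for three assets.
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.040, 0.006, 0.010],
              [0.006, 0.090, 0.018],
              [0.010, 0.018, 0.160]])
ones = np.ones(len(mu))

Vinv_1 = np.linalg.solve(V, ones)  # V^{-1} 1, without forming the inverse
Vinv_mu = np.linalg.solve(V, mu)   # V^{-1} mu

a = ones @ Vinv_1
b = ones @ Vinv_mu
c = mu @ Vinv_mu
d = a * c - b**2

w_v = Vinv_1 / a   # minimum variance portfolio
w_t = Vinv_mu / b  # tangent portfolio as defined above

# Returns and variances match the closed forms b/a, 1/a, c/b, c/b^2.
print(np.isclose(w_v @ mu, b / a), np.isclose(w_v @ V @ w_v, 1 / a))
print(np.isclose(w_t @ mu, c / b), np.isclose(w_t @ V @ w_t, c / b**2))
```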

The efficient frontier can be written as a linear combination of any two distinct efficient portfolios. This is discussed in more detail in the section on Fund theorems. Writing it as a combination of the minimum variance and the tangent portfolios gives

\[ \vec{w}_{\ast} = \xi \, \vec{w}_{v} + (1-\xi) \, \vec{w}_{t} \]

where

\[ \xi = (c - b \, r_{\ast}) \, a \, / \, d \]

The efficient frontier portfolio can be equivalently written

\[\begin{align} \vec{w}_{\ast} &= \xi \, \vec{w}_{v} + (1-\xi) \, \vec{w}_{t} \\ &= \left( \frac{c - b \, r_{\ast}}{d} \right) a \, \vec{w}_{v} + \left( \frac{a \, r_{\ast} - b}{d} \right) b \, \vec{w}_{t} \\ &= \left( \frac{c - b \, r_{\ast}}{d} \right) V^{-1} \, \vec{1} + \left( \frac{a \, r_{\ast} - b}{d} \right) V^{-1} \, \vec{\mu} \end{align}\]

Along the frontier, the return is

\[ r_{\ast} = \xi \, r_{v} + (1-\xi) \, r_{t} \]

The variance is

\[ \sigma^2_{\ast} = \frac{a}{d} \, r_{\ast}^{2} - \frac{2 \, b}{d} \, r_{\ast} + \frac{c}{d} \]
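A short sketch of scanning the frontier with these formulas, assuming the same kind of hypothetical three-asset inputs; the helper name `frontier_portfolio` is ours. For each target return it builds the weights from \(\xi\) and checks them against the constraints and the variance parabola.

```python
import numpy as np

# Hypothetical expected returns and covariance matrix for three assets.
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.040, 0.006, 0.010],
              [0.006, 0.090, 0.018],
              [0.010, 0.018, 0.160]])
ones = np.ones(len(mu))

Vinv_1 = np.linalg.solve(V, ones)
Vinv_mu = np.linalg.solve(V, mu)
a, b, c = ones @ Vinv_1, ones @ Vinv_mu, mu @ Vinv_mu
d = a * c - b**2
w_v, w_t = Vinv_1 / a, Vinv_mu / b

def frontier_portfolio(r_target):
    """Return (weights, variance) of the frontier portfolio with return r_target."""
    xi = (c - b * r_target) * a / d
    w = xi * w_v + (1.0 - xi) * w_t
    variance = (a * r_target**2 - 2.0 * b * r_target + c) / d
    return w, variance

# Scan from the minimum variance return, b / a, upward.
for r_target in np.linspace(b / a, 0.15, 5):
    w, variance = frontier_portfolio(r_target)
    assert np.isclose(w.sum(), 1.0)          # fully invested
    assert np.isclose(w @ mu, r_target)      # hits the target return
    assert np.isclose(w @ V @ w, variance)   # matches the parabola
    print(f"r = {r_target:.4f}, sigma = {np.sqrt(variance):.4f}")
```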

TODO: Note calculation order of \(\vec{w}_{v}(\mu, V)\) and \(\vec{w}_{t}(\mu, V, r_{f})\), then calculate \(r_{\ast}(\sigma_{\ast})\), scanning from \(\sigma_{v}\) to \(\sigma_\mathrm{max}\).

Figure 3.1: The “Markowitz Bullet”, the efficient frontier shown in Markowitz (1959), p. 152.

In general, depending on the correlations of the assets, the efficient frontier portfolios will short various positions, indicated by having negative weights.

3.2.3 No-shorts frontier

If one adds an additional constraint to the Markowitz portfolio problem as stated, requiring that we don’t short any positions

\[ w_i \geq 0 \]

then the problem doesn’t have an analytic solution. TODO: Citation needed.

The no-shorts frontier can be solved numerically with quadratic programming. In general, the no-shorts frontier will follow the unconstrained efficient frontier when there isn’t any shorting in the efficient portfolios, and the no-shorts frontier will pull away from the efficient frontier to somewhat lower returns when there is shorting on the efficient frontier.
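A minimal sketch of one point on the no-shorts frontier, assuming SciPy is available and using its general-purpose SLSQP solver rather than a dedicated quadratic programming library. The function name `no_shorts_portfolio` and the three-asset inputs are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def no_shorts_portfolio(mu, V, r_target):
    """Minimize w^T V w with 1^T w = 1, mu^T w = r_target, and w_i >= 0."""
    n = len(mu)
    constraints = [
        {"type": "eq", "fun": lambda w: w.sum() - 1.0},
        {"type": "eq", "fun": lambda w: w @ mu - r_target},
    ]
    result = minimize(
        lambda w: w @ V @ w,
        x0=np.full(n, 1.0 / n),      # start from equal weights
        jac=lambda w: 2.0 * V @ w,   # gradient of the objective
        bounds=[(0.0, 1.0)] * n,     # no shorting
        constraints=constraints,
        method="SLSQP",
    )
    if not result.success:
        raise RuntimeError(result.message)
    return result.x, np.sqrt(result.fun)

# Hypothetical expected returns and covariance matrix for three assets.
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.040, 0.006, 0.010],
              [0.006, 0.090, 0.018],
              [0.010, 0.018, 0.160]])

w_low, sigma_low = no_shorts_portfolio(mu, V, r_target=0.09)
w_high, sigma_high = no_shorts_portfolio(mu, V, r_target=0.11)
print(w_low, sigma_low)
print(w_high, sigma_high)
```

For these inputs the unconstrained frontier portfolio at a target of 0.09 has no negative weights, so the two frontiers coincide there. At a target of 0.11 the unconstrained solution shorts the first asset, so the no-shorts solution sets that weight to zero and accepts a higher standard deviation. The target must lie between the smallest and largest expected returns, otherwise the problem is infeasible.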

An example of the efficient frontier and the no-shorts frontier is shown in Figure 3.2.

Figure 3.2: The efficient frontier and no-shorts frontier for a few example assets, using daily data from 2014-01-02 to 2024-08-30, 10 years and 8 months.

Quadratic programming and convex optimization are discussed in more detail in the section on Convex optimization.

3.2.4 Lessons of MPT

Markowitz:

[I]n trying to make variance small it is not enough to invest in many securities. It is necessary to avoid investing in securities with high covariances among themselves. We should diversify across industries because firms in different industries, especially industries with different economic characteristics, have lower covariances than firms within an industry. 13

13 Markowitz (1952), p. 89.
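To illustrate the point numerically, consider a simplified, hypothetical setting: \(n\) assets with equal volatility \(\sigma\), a common pairwise correlation \(\rho\), and equal weights. The portfolio variance is then \(\sigma^2 \left( \frac{1}{n} + \left(1 - \frac{1}{n}\right) \rho \right)\), so adding more assets cannot push the volatility below \(\sigma \sqrt{\rho}\). The function name below is ours.

```python
import numpy as np

def equal_weight_volatility(n, sigma, rho):
    """Volatility of an equal-weight portfolio of n assets with common sigma and rho."""
    V = sigma**2 * (rho * np.ones((n, n)) + (1.0 - rho) * np.eye(n))
    w = np.full(n, 1.0 / n)
    return np.sqrt(w @ V @ w)

sigma = 0.20
for rho in (0.0, 0.3, 0.6):
    vols = [equal_weight_volatility(n, sigma, rho) for n in (1, 5, 20, 100)]
    floor = sigma * np.sqrt(rho)  # limit as n grows
    print(rho, np.round(vols, 4), round(floor, 4))
```

With uncorrelated assets the volatility keeps falling like \(1/\sqrt{n}\), while with highly correlated assets most of the benefit is gone after a handful of positions.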

Dalio: “The Holy Grail”, see Figure 3.3.

Figure 3.3: TODO: Citation needed.

3.3 Estimation of covariance matrices

This is how we estimate \(V\) (and \(\mu\)).

14 Ledoit & Wolf (2001).

15 Ledoit & Wolf (2003).

16 Coqueret & Milhau (2014).

17 Mantegna (1998).

18 Onnela, J.P. et al. (2003).

19 Onnela, Kaski, & Kertész (2004).

Figure 3.4: The correlation matrix of a few example assets using daily data from 2014-01-02 to 2024-08-30, 10 years and 8 months.

TODO:

  • Sample mean and covariance
  • Rolling mean and covariance
  • Exponential moving mean and covariance
  • Online mean and covariance
  • Shrinkage estimators
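As a starting sketch for the first and last items, the following computes the sample mean and covariance from a matrix of returns and applies a simple linear shrinkage toward a scaled identity matrix. Note the assumptions: the shrinkage intensity is picked by hand rather than estimated from the data as in the Ledoit & Wolf papers, the target is only one of several possible choices, and the data and function names are invented.

```python
import numpy as np

def sample_estimates(returns):
    """Sample mean vector and covariance matrix; rows are periods, columns are assets."""
    return returns.mean(axis=0), np.cov(returns, rowvar=False)

def shrink_covariance(V_hat, intensity):
    """Blend V_hat with a scaled identity target; intensity in [0, 1] is chosen by hand."""
    n = V_hat.shape[0]
    target = (np.trace(V_hat) / n) * np.eye(n)  # same average variance as V_hat
    return (1.0 - intensity) * V_hat + intensity * target

rng = np.random.default_rng(0)
# Hypothetical data: 60 periods of returns for 5 assets.
returns = rng.normal(loc=0.001, scale=0.02, size=(60, 5))

mu_hat, V_hat = sample_estimates(returns)
V_shrunk = shrink_covariance(V_hat, intensity=0.2)

# Shrinkage pulls the eigenvalues together, which improves conditioning.
print(np.linalg.cond(V_hat), np.linalg.cond(V_shrunk))
```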

3.4 Convex optimization

This is how we minimize \(\sigma\).

20 Markowitz (1956).

21 Jagannathan & Ma (2003).

TODO: Discuss optimizing the no-shorts frontier.

3.5 Fund theorems

3.5.1 Mutual fund separation theorem

  • Mutual fund separation theorem
  • Cass, D. & Stiglitz, J.E. (1970). The structure of investor preferences and asset returns, and separability in portfolio allocation: A contribution to the pure theory of mutual funds. 22
  • Chamberlain, G. (1983). A characterization of the distributions that imply mean-variance utility functions. 23
  • Owen, J. & Rabinovitch, R. (1983). On the class of elliptical distributions and their applications to the theory of portfolio choice. 24

22 Cass & Stiglitz (1970).

23 Chamberlain (1983).

24 Owen & Rabinovitch (1983).

Cass & Stiglitz:

[G]iven a market in which there are available \(n\) different assets, nonetheless all the opportunities relevant to the investor’s decision can be provided by a set of \(m\) (\(< n\)) “mutual funds,” i.e., a set of \(m\) linear combinations (with weights adding to one) of the available assets. 25

25 Cass & Stiglitz (1970), p. 122.

3.5.2 Two-fund theorem

We continue the discussion in the context of a portfolio of risky assets only; adding a risk-free asset is considered in the next section.

Tobin26 is often credited as the first to note the Two-fund theorem, which Merton27 later stated more formally:

26 Tobin (1958).

27 Merton (1972).

Merton:

Given \(m\) assets satisfying the conditions […], there are two portfolios (“mutual funds”) constructed from these \(m\) assets, such that all risk-averse individuals, who choose their portfolios so as to maximize utility functions dependent only on the mean and variance of their portfolios, will be indifferent in choosing between portfolios from among the original \(m\) assets or from these two funds. 28

28 Merton (1972), p. 1858.

Kasa:

Any portfolio on the efficient frontier can be written as a linear combination of two fixed efficient portfolios.

\[ \vec{w}_{\ast} = \xi \, \vec{w}_{1} + (1-\xi) \, \vec{w}_{2} \]
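A numerical illustration of the theorem, under the assumption of hypothetical three-asset inputs and with shorts allowed: two frontier portfolios are computed from the analytic solution, and a third frontier portfolio is reproduced as a combination of them. Because the frontier weights are affine in the target return, \(\xi\) follows from matching returns.

```python
import numpy as np

# Hypothetical expected returns and covariance matrix for three assets.
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.040, 0.006, 0.010],
              [0.006, 0.090, 0.018],
              [0.010, 0.018, 0.160]])
ones = np.ones(len(mu))

Vinv_1 = np.linalg.solve(V, ones)
Vinv_mu = np.linalg.solve(V, mu)
a, b, c = ones @ Vinv_1, ones @ Vinv_mu, mu @ Vinv_mu
d = a * c - b**2

def frontier_weights(r_target):
    """Analytic frontier weights for a given target return (shorts allowed)."""
    return ((c - b * r_target) * Vinv_1 + (a * r_target - b) * Vinv_mu) / d

# Two fixed "funds" on the frontier.
r_1, r_2 = 0.08, 0.12
w_1, w_2 = frontier_weights(r_1), frontier_weights(r_2)

# Any other frontier portfolio is a combination of the two funds.
r_3 = 0.10
xi = (r_3 - r_2) / (r_1 - r_2)  # chosen so that xi * r_1 + (1 - xi) * r_2 = r_3
w_3 = xi * w_1 + (1.0 - xi) * w_2

print(np.allclose(w_3, frontier_weights(r_3)))
```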

TODO: reparameterize? 29

29 TODO: Throughout this we have parameterized \(\xi\) such that, as it goes from 0 to 1, we go from holding portfolio 2 to holding portfolio 1. Let’s reparameterize so that \(\xi \rightarrow (1-\xi)\).

3.5.3 One-fund theorem

Now we consider adding the possibility of holding a risk-free asset with a risk-free return, \(r_{f}\).

One-fund theorem:

Kwok:

Any efficient portfolio [on the Capital Allocation Line] can be expressed as a combination of the risk free asset and the portfolio (or fund) represented by \(M\).

Kasa:

Any portfolio on the efficient frontier can be written as a linear combination of one fixed efficient non-risk-free portfolio and the risk-free asset.

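With a fraction \(\kappa\) held in the risk-free asset, denoted \(\vec{w}_{f}\), and the remainder in the tangent portfolio, \(\vec{w}_{t}\), the portfolio weights are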

\[ \vec{w}_{\ast} = \kappa \, \vec{w}_{f} + (1-\kappa) \, \vec{w}_{t} \]

The portfolio return is

\[ r_{\ast} = \kappa \, r_{f} + (1-\kappa) \, r_{t} \]

The portfolio standard deviation is

\[ \sigma_{\ast} = \left| 1-\kappa \right| \sigma_{t} \]
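A small worked example of these two expressions, with made-up values for \(r_f\), \(r_t\), and \(\sigma_t\); the function name is ours. A negative \(\kappa\) corresponds to borrowing at the risk-free rate to hold more than 100% in the tangent portfolio.

```python
def capital_allocation(kappa, r_f, r_t, sigma_t):
    """Return and standard deviation when a fraction kappa is held in the risk-free asset."""
    r = kappa * r_f + (1.0 - kappa) * r_t
    sigma = abs(1.0 - kappa) * sigma_t
    return r, sigma

# Hypothetical inputs: 2% risk-free rate, tangent portfolio with 8% return, 15% volatility.
r_f, r_t, sigma_t = 0.02, 0.08, 0.15

r_mix, sigma_mix = capital_allocation(0.4, r_f, r_t, sigma_t)    # 40% cash, 60% stocks
r_lev, sigma_lev = capital_allocation(-0.5, r_f, r_t, sigma_t)   # borrow 50%, 150% stocks

print(r_mix, sigma_mix)
print(r_lev, sigma_lev)

# Excess return per unit of risk is the same for both mixes.
print((r_mix - r_f) / sigma_mix, (r_lev - r_f) / sigma_lev)
```

Both mixes have the same ratio of excess return to standard deviation as the tangent portfolio itself, which is why they all fall on one line.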

Since the efficient frontier is a linear combination of the risk-free asset, “cash”, and a single portfolio of risky assets, “stocks”, it forms a line in return-risk space from the risk-free asset to the tangent portfolio, and continues along that line if one allows borrowing at the risk-free rate to invest in the tangent portfolio. This line is called the Capital Allocation Line because it represents the portfolios an investor can hold, depending on how much of their cash is deployed into risky assets in the market.

The functional form of the Capital Allocation Line is

\[ r_\mathrm{CAL}(\sigma) = r_{f} + \sigma \, \sqrt{ a \, r_{f}^{2} - 2 \, b \, r_{f} + c} \]

TODO: Double-check the expression and example values of this slope.

Note that while the shape of the efficient frontier of risky assets is unchanged by introducing or varying the risk-free rate of return, which portfolio along the frontier is the tangent portfolio depends on the risk-free rate of return.

The tangent portfolio with a risk-free asset is

\[ \vec{w}_{t} = \frac{V^{-1} \, (\vec{\mu} - r_{f} \, \vec{1})}{\vec{1}^\intercal \, V^{-1} \, (\vec{\mu} - r_{f} \, \vec{1})} \]

It has a return

\[ r_{t} = \vec{\mu} \cdot \vec{w}_t = \frac{c - b \, r_{f}}{b - a \, r_{f}} \]

and a variance

\[ \sigma_{t}^{2} = \vec{w}_{t}^\intercal \, V \, \vec{w}_{t} = \frac{(\vec{\mu} - r_{f} \, \vec{1})^\intercal \, V^{-1} \, (\vec{\mu} - r_{f} \, \vec{1})}{\left( \vec{1}^\intercal \, V^{-1} \, (\vec{\mu} - r_{f} \, \vec{1}) \right)^2} = \frac{a \, r_{f}^2 - 2 \, b \, r_{f} + c}{(b - a \, r_{f})^2} \]

The tangent portfolio is the portfolio with the maximum Sharpe ratio, \(S_i\).

\[ S_i \equiv \frac{ r_i - r_f }{ \sigma_i } \label{eq:sharpe_ratio} \]

The Sharpe ratio is a measure of how much excess return an asset had over a risk-free asset, adjusted for the risk as measured by the standard deviation of return.
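The sketch below ties these pieces together for a hypothetical three-asset example with an assumed risk-free rate of 2%: it computes the tangent portfolio, checks its return and variance against the closed forms in \(a\), \(b\), \(c\), and confirms that its Sharpe ratio equals the slope of the Capital Allocation Line. It assumes \(b - a \, r_f > 0\), i.e. the risk-free rate is below the return of the minimum variance portfolio.

```python
import numpy as np

# Hypothetical expected returns, covariance matrix, and risk-free rate.
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.040, 0.006, 0.010],
              [0.006, 0.090, 0.018],
              [0.010, 0.018, 0.160]])
r_f = 0.02
ones = np.ones(len(mu))

a = ones @ np.linalg.solve(V, ones)
b = ones @ np.linalg.solve(V, mu)
c = mu @ np.linalg.solve(V, mu)

# Tangent portfolio: V^{-1} (mu - r_f 1), normalized to sum to one.
z = np.linalg.solve(V, mu - r_f * ones)
w_t = z / z.sum()  # z.sum() equals b - a * r_f

r_t = w_t @ mu
sigma_t = np.sqrt(w_t @ V @ w_t)
sharpe_t = (r_t - r_f) / sigma_t
cal_slope = np.sqrt(a * r_f**2 - 2.0 * b * r_f + c)

print(np.isclose(r_t, (c - b * r_f) / (b - a * r_f)))
print(np.isclose(sigma_t**2, (a * r_f**2 - 2.0 * b * r_f + c) / (b - a * r_f)**2))
print(np.isclose(sharpe_t, cal_slope))
```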

TODO:

  • Citation needed for the one-fund theorem
  • Related to the efficient-market hypothesis: in equilibrium, the tangent portfolio becomes the market portfolio

3.6 Efficient-market hypothesis

30 Fama (1970).

3.7 Capital asset pricing model

Keywords:

Background:

31 Jensen (1968).

32 Sharpe (1963).

33 Sharpe (1964).

34 Sharpe (1999).

35 Sharpe (1990).

\[ \beta_i = \frac{ \mathrm{Cov}(r_i, r_m) }{ \mathrm{Var}(r_m) } = \mathrm{Cor}(r_i, r_m) \: \frac{\sigma_i}{\sigma_m} \label{eq:sharpe_beta} \]

Viewed in \(r_i\) vs \(r_m\) space, with points accumulated over time, \(\alpha_{i}\) and \(\beta_{i}\) can be estimated via linear regression:

SCL:

\[ r_{it} - r_f = \hat{\alpha}_i + \hat{\beta}_i \, (r_{mt} - r_f) + \varepsilon_{it} \label{eq:alpha_beta_regression} \]

The Security Characteristic Line (SCL) is the line in \(r_i\) vs \(r_m\), fit to a particular asset, \(i\), with its slope, \(\hat{\beta}_{i}\), and its \((r_i - r_f)\) intercept, \(\hat{\alpha}_i\).
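A minimal sketch of fitting the SCL by ordinary least squares, using simulated returns with a known \(\alpha\) and \(\beta\). All numbers are made up, and a constant risk-free rate is assumed. The fitted slope coincides with the covariance-over-variance definition of \(\beta_i\).

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated (hypothetical) daily returns with known alpha and beta.
r_f = 0.0001
true_alpha, true_beta = 0.0002, 1.3
r_m = rng.normal(loc=0.0005, scale=0.01, size=1000)
noise = rng.normal(loc=0.0, scale=0.005, size=1000)
r_i = r_f + true_alpha + true_beta * (r_m - r_f) + noise

# Regress the asset's excess return on the market's excess return.
x = r_m - r_f
y = r_i - r_f
beta_hat, alpha_hat = np.polyfit(x, y, deg=1)  # returns slope, then intercept

# The OLS slope equals Cov(r_i, r_m) / Var(r_m).
beta_from_cov = np.cov(r_i, r_m)[0, 1] / np.var(r_m, ddof=1)

print(alpha_hat, beta_hat, beta_from_cov)
```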

Jensen’s alpha uses the same form, but at a particular time point, using a historical fit for \(\hat{\beta}_{i}\), but not \(\alpha_{i}\).

\[ \alpha_{i} = (r_i - r_f) - \hat{\beta}_{i} \, (r_m - r_f) \label{eq:jensen_alpha} \]

TODO: Compare with this:

\[ \alpha_{i} = (r_i - r_f) - \hat{\beta}_{i} \, (\mu_{m} - r_f) \]

The Security Market Line (SML), viewed in \(r_i\) vs \(\beta_i\) space, goes through the risk-free asset at \((0, r_f)\) and the market portfolio at \((\beta_m, r_m)\), where \(\beta_m = 1\).

SML:

\[ \mathbb{E}(r_i) = r_f + \beta_i \left( \mathbb{E}(r_m) - r_f \right) \]

Figure 3.5: The Capital Asset Pricing Model (CAPM) applied to a few example assets using daily data from 2014-01-02 to 2024-08-30, 10 years and 8 months.

\[ T_i \equiv \frac{ r_i - r_f }{ \beta_i } \label{eq:treynor_ratio} \]
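A short worked example connecting the SML and the Treynor ratio, with made-up values for the risk-free rate and the expected market return; the function names are ours. For any asset priced exactly on the SML, the Treynor ratio computed from its expected return equals the market risk premium, \(\mathbb{E}(r_m) - r_f\).

```python
def capm_expected_return(beta, r_f, expected_r_m):
    """Expected return on the Security Market Line."""
    return r_f + beta * (expected_r_m - r_f)

def treynor_ratio(r, r_f, beta):
    """Excess return per unit of beta; undefined for beta = 0."""
    return (r - r_f) / beta

# Hypothetical inputs: 2% risk-free rate, 8% expected market return.
r_f, expected_r_m = 0.02, 0.08

for beta in (0.5, 1.0, 1.5):
    expected_r = capm_expected_return(beta, r_f, expected_r_m)
    print(beta, expected_r, treynor_ratio(expected_r, r_f, beta))
```

An asset with a Treynor ratio above the market risk premium plots above the SML, which corresponds to a positive \(\alpha_i\).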

36 Gibbons, Ross, & Shanken (1989).

37 Luenberger (1998).

3.8 Black-Litterman model

  • Black, F. & Litterman, R. (1991). Asset allocation.38
  • Black, F. & Litterman, R. (1992). Global portfolio optimization. 39

38 Black & Litterman (1991).

39 Black & Litterman (1992).

3.9 Factor models

3.9.1 Factor analysis

3.9.2 Fama-French model

40 Fama & French (1992).

3.9.3 Carhart four-factor model

3.10 Risk preferences

41 Kelly (1956).

42 Merton (1969).

43 Karatzas, Lehoczky, Sethi, & Shreve (1986).

44 Rockafellar & Uryasev (2000).

3.11 Postmodern portfolio theory

3.11.1 Criticisms of MPT

Criticisms of MPT:

  1. Sensitivity of portfolio weights to the estimates of \(\hat{\mu}\) and \(\hat{V}\).
    • Error propagation
  2. Problem of induction
    • Past performance is no guarantee of future results
    • Criticisms of using historical estimators of \(\hat{\mu}\) and \(\hat{V}\)
  3. Variance is not a good measure of risk
    • Downside risk is better

3.11.2 Error propagation

45 Lo (2002).

3.11.3 Downside risk

46 Markowitz, Starer, Fram, & Gerber (2019).

\[ \mathrm{TSV}(r_i, r_t) = \mathbb{E}\left[ (r_i - r_t)^2 \: \mathbb{1}_{\{r_i < r_t\}} \right] \label{eq:target_semi_variance} \]

\[ \mathrm{TSD}(r_i, r_t) = \sqrt{\mathrm{TSV}(r_i, r_t)} \label{eq:target_semi_deviation} \]
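A minimal sketch of estimating these two quantities from a sample of returns, replacing the expectation with a sample average over all observations. The function names and the five example returns are invented for illustration; here `target` plays the role of \(r_t\).

```python
import numpy as np

def target_semivariance(returns, target):
    """Sample estimate of E[(r - target)^2 * 1{r < target}]."""
    shortfall = np.minimum(np.asarray(returns, dtype=float) - target, 0.0)
    return np.mean(shortfall**2)  # averages over all observations, not only shortfalls

def target_semideviation(returns, target):
    """Square root of the target semivariance."""
    return np.sqrt(target_semivariance(returns, target))

# Hypothetical returns; only -0.02 and -0.04 fall below the target of 0.
returns = np.array([0.03, -0.02, 0.01, -0.04, 0.05])

tsv = target_semivariance(returns, target=0.0)  # (0.02^2 + 0.04^2) / 5
tsd = target_semideviation(returns, target=0.0)
print(tsv, tsd)
```

Unlike the variance, returns above the target contribute nothing, so two assets with the same variance can have quite different target semideviations.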

3.11.4 More

  • Rom, B.M. & Ferguson, K. (1993). Post-modern portfolio theory comes of age. 47
  • Sortino, F. (2010). The Sortino Framework for Constructing Portfolios. 48
  • Elton, E.J., Gruber, M.J., Brown, S.J., & Goetzmann, W.N. (2014). Modern Portfolio Theory and Investment Analysis. 49
  • Low-volatility anomaly

47 Rom & Ferguson (1993).

48 Sortino (2010).

49 Elton, Gruber, Brown, & Goetzmann (2014).

3.12 Hierarchical risk parity

50 López de Prado (2016).

51 López de Prado (2018).

52 Lohre, Rother, & Schäfer (2020).