Portfolio Optimization using Gradient Descent

You should do 60/40 allocation in stocks/bond.
You should diversify your portfolio.
You should buy this, you should buy that.

Okay, those are the things we heard a lot when we start to invest.
But what is the mathematically optimal portfolio where you get the best returns while taking the least risk?

How to measure how is your portfolio performing?

If you measure solely based on return, just leverage your positions then you can get what you wanted.

But like a lot of things in life, things are not that simple.
In investment, you care about the risk you take also.
In this context, the proxy to risk is the volatility of an asset.
Though there are controversy to the measure of risk using volatility, but that is the most easily obtained quantitative metric that we can get to measure risk.

Hence, the way to measure your portfolio performance is the return of investment per unit of risk.
Or as we call it, the Sharpe ratio

\begin{equation} \text{Sharpe Ratio} = \frac{E[R_p - R_f]}{\sqrt{Var[R_p - R_f]}} \end{equation}

Optimization using gradient descent method

Consider that we have a 3 fund portfolio,
The return of a portfolio will be the weighted sum of the return minus the risk free rate return

Total Return = w_1R_1 + w_2R_2 + w_3R_3 - R_f

Where the volatility is the weigthed sum of an asset’s volatility and the covariance between assets

Volatility =\sqrt{w_1^2 \sigma_1^2 .w_2^2 \sigma_2^2. w_3^2 \sigma_3^2 + 2w_1w_2cov(1,2) + 2w_2w_3cov(2,3) + 2w_1w_3cov(1,3)}

Hence,

\text f(w) = \frac{w_1R_1 + w_2R_2 + w_3R_3 - R_f}{\sqrt{w_1^2 \sigma_1^2 .w_2^2 \sigma_2^2. w_3^2 \sigma_3^2 + 2w_1w_2cov(1,2) + 2w_2w_3cov(2,3) + 2w_1w_3cov(1,3)}}

Idea of gradient descent

Like climbing a mountain, the goal of gradient descent is to maximize/minimize a function.
To achieve that, we need to know how steep are we at right now, or in mathematical terms, the gradient.

We’ll need to get the first derivative of the Sharpe ratio with respect to weights.

Naturally, I’ll skip this part because I’m not good at typing in LaTeX.

Talk is cheap, show me the code

In this post, I am optimizing a three fund portfolio which consists of
Vanguard Total World Stock Index Fund ETF (VT)
Vanguard Total Bond Market Index Fund ETF (BND)
SPDR Bloomberg 1-3 Month T-Bill ETF (BIL)

import math
import numpy as np

R_1 =  0.09005572157459407
R_2 =  0.011467362280556559
R_3 =  0.0041235900715872464
R_free = 0.0275 # using maybank 1 month FD rate here

# weights
w_1 = 0.6
w_2 = 0.2
w_3 = 0.2

cov_11 =  0.00013121288625519365
cov_22 = 7.225186779674805e-06
cov_33 =  2.4716643519128892e-08

cov_12 =  2.31881732577847e-07
cov_13 =  -4.1092654882446956e-08
cov_23 =  1.3072393870561457e-08

portfolio_expected_return = R_1 * w_1 + R_2 * w_2 + R_3 * w_3
portfolio_volatility = math.sqrt( pow(w_1,2)*cov_11 + pow(w_2,2)*cov_22 + pow(w_3,2)*cov_33 + 2*w_1*w_2*cov_12 + 2*w_1*w_3*cov_13 + 2*w_2*w_3*cov_23) 

objective_function = portfolio_expected_return / portfolio_volatility

df_dw1 = -1 * ( (R_1 - 0.5*(2*w_1*pow(cov_11,2) + 2*w_2*cov_12 + 2*w_3*cov_13)) * portfolio_expected_return ) / pow(portfolio_volatility,2)
df_dw2 = -1 * ( (R_2 - 0.5*(2*w_2*pow(cov_22,2) + 2*w_1*cov_12 + 2*w_3*cov_23)) * portfolio_expected_return ) / pow(portfolio_volatility,2)
df_dw3 = -1 * ( (R_3 - 0.5*(2*w_3*pow(cov_33,2) + 2*w_2*cov_23 + 2*w_1*cov_13)) * portfolio_expected_return ) / pow(portfolio_volatility,2)

def get_df_dw1(v_w_1, v_w_2, v_w_3):
    portfolio_expected_return = R_1 * v_w_1 + R_2 * v_w_2 + R_3 * v_w_3
    portfolio_volatility = math.sqrt( pow(v_w_1,2)*cov_11 + 
                                      pow(v_w_2,2)*cov_22 +
                                      pow(v_w_3,2)*cov_33 +
                                      2*v_w_1*v_w_2*cov_12 +
                                      2*v_w_1*v_w_3*cov_13 +
                                      2*v_w_2*v_w_3*cov_23) 

    return -1 * ( (R_1 - 0.5*(2*w_1*pow(cov_11,2) + 2*w_2*cov_12 + 2*w_3*cov_13)) * portfolio_expected_return ) / pow(portfolio_volatility,2)

def get_df_dw2(v_w_1, v_w_2, v_w_3):
    portfolio_expected_return = R_1 * v_w_1 + R_2 * v_w_2 + R_3 * v_w_3
    portfolio_volatility = math.sqrt( pow(v_w_1,2)*cov_11 + 
                                      pow(v_w_2,2)*cov_22 +
                                      pow(v_w_3,2)*cov_33 +
                                      2*v_w_1*v_w_2*cov_12 +
                                      2*v_w_1*v_w_3*cov_13 +
                                      2*v_w_2*v_w_3*cov_23) 

    return -1 * ( (R_2 - 0.5*(2*w_2*pow(cov_22,2) + 2*w_1*cov_12 + 2*w_3*cov_23)) * portfolio_expected_return ) / pow(portfolio_volatility,2)

def get_df_dw3(v_w_1, v_w_2, v_w_3):
    portfolio_expected_return = R_1 * v_w_1 + R_2 * v_w_2 + R_3 * v_w_3
    portfolio_volatility = math.sqrt( pow(v_w_1,2)*cov_11 + 
                                      pow(v_w_2,2)*cov_22 +
                                      pow(v_w_3,2)*cov_33 +
                                      2*v_w_1*v_w_2*cov_12 +
                                      2*v_w_1*v_w_3*cov_13 +
                                      2*v_w_2*v_w_3*cov_23) 

    return -1 * ( (R_3 - 0.5*(2*w_3*pow(cov_33,2) + 2*w_2*cov_23 + 2*w_1*cov_13)) * portfolio_expected_return ) / pow(portfolio_volatility,2)

def project(x):
    total = x.sum()
  
    return [round(x[0]/total, 2), round(x[1]/total, 2), round(x[2]/total, 2)]


def gradient_descent():
    x = np.array([w_1, w_2, w_3])

    # initial 
    alpha = 1
    learning_rate = np.array([alpha, alpha, alpha])
    gradient = [df_dw1, df_dw2, df_dw3]

    print("\n====BEFORE OPTIMIZATION=====")
    print("| asset                   VT  BND  BIL")
    print("| initial weights      ", w_1, w_2, w_3)
    print("| portfolio return     ", portfolio_expected_return)
    print("| portfolio volatility ", portfolio_volatility) 
    print("| sharpe ratio         ", (portfolio_expected_return - R_free)/ portfolio_volatility)
    print("| obj func gradient    ", gradient)
    print("============================")

    i = 0
    while (True):
        # plug gradient into x_k+1 = x_k - alpha * gradient
        x = x - learning_rate*np.array([gradient[0], gradient[1], gradient[2]])
        # update gradient
    
        gradient = np.array([get_df_dw1(x[0], x[1], x[2]), get_df_dw2(x[0], x[1], x[2]), get_df_dw3(x[0], x[1], x[2])])
        i += 1
        # iterate many times or
        # alternatively, stop when gradient is near 0
        if i==50000:
            break

    (vw_1, vw_2, vw_3) = project(x)
    new_portfolio_expected_return = R_1 * vw_1 + R_2 * vw_2 + R_3 * vw_3
    new_portfolio_volatility = math.sqrt( pow(vw_1,2)*cov_11 + pow(vw_2,2)*cov_22 + pow(vw_3,2)*cov_33 + 2*vw_1*vw_2*cov_12 + 2*vw_1*vw_3*cov_13 + 2*vw_2*vw_3*cov_23) 
    print("\n====AFTER OPTIMIZATION======")
    print("| asset                   VT  BND  BIL")
    print("| optimized weight     ", vw_1, vw_2, vw_3)
    print("| portfolio return     ", new_portfolio_expected_return)
    print("| portfolio volatility ", new_portfolio_volatility) 
    print("| sharpe ratio         ", (new_portfolio_expected_return - R_free)/ portfolio_volatility)
    print("| obj func gradient    ", gradient)
    print("============================")

gradient_descent()

Output:


====BEFORE OPTIMIZATION=====
| asset                   VT  BND  BIL
| initial weights       0.6 0.2 0.2
| portfolio return      0.057151623415185204
| portfolio volatility  0.006897352416612769
| sharpe ratio          4.298986281136967
| obj func gradient     [-108.18693435263714, -13.775960933899709, -4.953835084916042]
============================

====AFTER OPTIMIZATION======
| asset                   VT  BND  BIL
| optimized weight      0.85 0.11 0.04
| portfolio return      0.07797371679212968
| portfolio volatility  0.009743174905879868
| sharpe ratio          7.317839330720598
| obj func gradient     [-0.02504764 -0.00318944 -0.00114692]
============================

The starting allocation is 60/20/20 for VT/BND/BIL portfolio, with Sharpe ratio of 4.298, which is not too bad! After optimization, it’s 85/11/4, with Sharpe ratio raised to 7.318, with a slight rise in portfolio volatility but big rise in portfolio return!

Afterthoughts

We are optimizing a portfolio with assumption that asset return and volatility is constant over time, which is simply not true!
Though with that being said, this still serves some reference value, as Mark Twain said, History never repeats itself, but it does often rhyme

How to measure how is your portfolio performing?

Optimization using gradient descent method

Idea of gradient descent

Talk is cheap, show me the code

Afterthoughts

Leave a Reply Cancel reply