The Correlation Coefficient

$\rho_{xy}=\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{(n-1)\,\sigma_x\sigma_y}$

$\rho_{xy} = 1$: Perfect (Positive) Correlation

$\rho_{xy} > 0$: Positive Correlation (The two variables tend to move in the same direction.)

$\rho_{xy} < 0$: Negative Correlation (The two variables tend to move in opposite directions.)

$\rho_{xy} = 0$: No Correlation (The two variables have no linear relationship. This alone does not imply that they are independent.)

Example:

$\text{cov}_{xy}=\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{n-1}$

x: house size

y: house price
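
As a minimal sketch of the house size and price example, the sample covariance and the correlation coefficient can be computed by hand with NumPy. The five observations below are made-up numbers used only for illustration; they do not come from this notebook.

```python
import numpy as np

# Hypothetical sample (not from the source): house sizes and prices.
x = np.array([650.0, 785.0, 1200.0, 720.0, 975.0])                      # size, sq. ft.
y = np.array([772_000.0, 998_000.0, 1_200_000.0, 800_000.0, 895_000.0])  # price, USD

n = len(x)
dx = x - x.mean()  # deviations of x from its mean
dy = y - y.mean()  # deviations of y from its mean

# Sample covariance: sum of products of deviations, divided by n - 1.
cov_xy = (dx * dy).sum() / (n - 1)

# Sample standard deviations, using the same n - 1 divisor.
sigma_x = np.sqrt((dx ** 2).sum() / (n - 1))
sigma_y = np.sqrt((dy ** 2).sum() / (n - 1))

# Correlation: covariance rescaled by both standard deviations.
rho_xy = cov_xy / (sigma_x * sigma_y)

print(cov_xy, rho_xy)
```

The covariance depends on the units of x and y, while the correlation is unit-free and always lies between -1 and 1, which is why the interpretation thresholds are stated for the correlation.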

Covariance and Correlation

\begin{eqnarray*} \text{Covariance Matrix:} \ \ \Sigma = \begin{bmatrix} \sigma_{1}^2 & \sigma_{12} & \dots & \sigma_{1I} \\ \sigma_{21} & \sigma_{2}^2 & \dots & \sigma_{2I} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{I1} & \sigma_{I2} & \dots & \sigma_{I}^2 \end{bmatrix} \end{eqnarray*}
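
The structure of the covariance matrix can be checked numerically: the diagonal holds each variable's variance and the matrix is symmetric, since $\sigma_{ij} = \sigma_{ji}$. The sketch below uses randomly generated returns for three hypothetical assets named A, B and C; none of these values come from this notebook.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns for three hypothetical assets (illustration only).
rng = np.random.default_rng(0)
demo_returns = pd.DataFrame(
    rng.normal(0.0, 0.01, size=(500, 3)),
    columns=['A', 'B', 'C'],
)

sigma = demo_returns.cov()  # 3 x 3 sample covariance matrix

# Off-diagonal entries mirror each other: sigma_ij == sigma_ji.
is_symmetric = np.allclose(sigma.values, sigma.values.T)

# Diagonal entries are the individual sample variances.
diag_is_variance = np.allclose(np.diag(sigma.values), demo_returns.var().values)

print(is_symmetric, diag_is_variance)
```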
In [26]:
import numpy as np
import pandas as pd
from pandas_datareader import data as wb
import matplotlib.pyplot as plt
In [27]:
tickers = ['PG', 'BEI.DE']

sec_data = pd.DataFrame()

for t in tickers:
    sec_data[t] = wb.DataReader(t, data_source='yahoo', start='2007-1-1')['Adj Close']

sec_returns = np.log(sec_data / sec_data.shift(1))
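
The log-return expression above can be illustrated on a tiny price series. The prices below are hypothetical; the sketch only shows that `shift(1)` leaves the first return undefined (NaN) and that log returns add up to the log of the total price change.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices (illustration only).
prices = pd.Series([100.0, 102.0, 101.0, 105.0])

# Same expression as for sec_returns: log of price over previous price.
log_returns = np.log(prices / prices.shift(1))

# The first entry is NaN because there is no previous price.
first_is_nan = np.isnan(log_returns.iloc[0])

# Log returns are additive: their sum equals log(last price / first price).
# Series.sum() skips the NaN by default.
total_log_return = log_returns.sum()

print(first_is_nan, total_log_return)
```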

Variance

In [28]:
PG_var = sec_returns['PG'].var()
PG_var
Out[28]:
0.00014225385298706933
In [29]:
BEI_var = sec_returns['BEI.DE'].var()
BEI_var
Out[29]:
0.00019114232202711176
In [30]:
# annualized variance (assumes about 250 trading days per year)
PG_var_a = sec_returns['PG'].var() * 250
PG_var_a
Out[30]:
0.03556346324676733
In [31]:
# annualized variance (assumes about 250 trading days per year)
BEI_var_a = sec_returns['BEI.DE'].var() * 250
BEI_var_a
Out[31]:
0.04778558050677794
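
A short sketch of what the factor of 250 does, reusing the daily variances printed above. Under the usual assumption that daily returns are uncorrelated over time, variance scales linearly with the number of days, so the annualized standard deviation (volatility) scales with the square root of 250. The constant name `TRADING_DAYS` and the volatility variables are introduced here for illustration.

```python
import numpy as np

TRADING_DAYS = 250  # same annualization factor as in the cells above

# Daily variances copied from the outputs above.
pg_var_daily = 0.00014225385298706933
bei_var_daily = 0.00019114232202711176

# Variance scales linearly with time.
pg_var_annual = pg_var_daily * TRADING_DAYS
bei_var_annual = bei_var_daily * TRADING_DAYS

# Volatility (standard deviation) scales with the square root of time.
pg_vol_annual = np.sqrt(pg_var_annual)
bei_vol_annual = np.sqrt(bei_var_annual)

print(pg_var_annual, pg_vol_annual)
print(bei_var_annual, bei_vol_annual)
```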

Covariance

In [32]:
cov_matrix = sec_returns.cov()
cov_matrix
Out[32]:
              PG    BEI.DE
PG      0.000142  0.000045
BEI.DE  0.000045  0.000191
In [33]:
# annualized covariance (assumes about 250 trading days per year)
cov_matrix_a = sec_returns.cov() * 250
cov_matrix_a
Out[33]:
              PG    BEI.DE
PG      0.035563  0.011226
BEI.DE  0.011226  0.047786

Correlation

In [34]:
corr_matrix = sec_returns.corr()
corr_matrix
Out[34]:
              PG    BEI.DE
PG      1.000000  0.271889
BEI.DE  0.271889  1.000000
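
The correlation matrix is the covariance matrix with each entry divided by the product of the two corresponding standard deviations. The sketch below checks this on synthetic returns for two hypothetical assets A and B (not the PG and BEI.DE data), and also shows that annualizing does not change the correlation, because the factor of 250 cancels. With real data that contains missing values, `cov()` and `var()` may use slightly different sets of rows, so the hand-computed ratio can differ marginally from `corr()`.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns sharing a common component (illustration only).
rng = np.random.default_rng(1)
common = rng.normal(0.0, 0.01, 1000)
demo = pd.DataFrame({
    'A': common + rng.normal(0.0, 0.01, 1000),
    'B': common + rng.normal(0.0, 0.01, 1000),
})

cov = demo.cov()
std = np.sqrt(np.diag(cov.values))  # standard deviations from the diagonal

# corr_ij = cov_ij / (sigma_i * sigma_j)
corr_from_cov = cov / np.outer(std, std)

# Annualized inputs give the same correlation: 250 cancels out.
std_annual = std * np.sqrt(250)
corr_from_annual = (cov * 250) / np.outer(std_annual, std_annual)

print(corr_from_cov)
```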