PCA¶
Assume the historical instrument returns of the estimation universe are represented by a T x N matrix R. With singular value decomposition (SVD), the covariance matrix \(\hat{Q}\) is decomposed into its eigenvectors and eigenvalues,
\[\hat{Q} = V D V^{T},\]
where V is the matrix of eigenvectors (each column is an eigenvector) and D is a diagonal matrix with the eigenvalues \(\lambda_i\) in decreasing order on the diagonal.
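The factor exposure matrix \(B\) is built from \(V_nD^{\frac{1}{2}}_n\), where n is the number of largest eigenvalues selected and \(V_n\), \(D_n\) keep only the corresponding eigenvectors and eigenvalues. The product \(V_nD^{\frac{1}{2}}_n\) itself has dimension \((N, n)\); \(B\) is given in dimension \((n, N)\), which corresponds to its transpose.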
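The decomposition and the construction of \(B\) can be sketched with NumPy. This is a minimal illustration on synthetic data, not the library's own implementation: the sizes, the random returns, the demeaning step and the \(1/(T-1)\) covariance scaling are all assumptions made for the example.

```python
import numpy as np

# Hypothetical sizes and synthetic returns, for illustration only.
rng = np.random.default_rng(0)
T, N, n = 250, 6, 2
R = rng.normal(scale=0.01, size=(T, N))

# Sample covariance of the demeaned returns, shape (N, N).
R_centered = R - R.mean(axis=0)
Q_hat = R_centered.T @ R_centered / (T - 1)

# eigh handles symmetric matrices and returns eigenvalues in ascending
# order, so reorder them to be decreasing.
eigvals, eigvecs = np.linalg.eigh(Q_hat)
order = np.argsort(eigvals)[::-1]
D = eigvals[order]        # eigenvalues, largest first
V = eigvecs[:, order]     # matching eigenvectors as columns

# The same eigenvalues follow from the singular values of the demeaned returns.
_, s, _ = np.linalg.svd(R_centered, full_matrices=False)
D_from_svd = s**2 / (T - 1)

# Keep the n largest components and scale each eigenvector by sqrt(lambda_i).
V_n = V[:, :n]                      # (N, n)
D_n = D[:n]                         # (n,)
B = (V_n * np.sqrt(D_n)).T          # exposures stored as (n, N)
```

With this construction `B.T @ B` equals \(V_n D_n V_n^{T}\), the rank-n approximation of \(\hat{Q}\).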
Factor returns \(F\) (in dimension \((T, n)\)) and residual returns \({\Gamma}\) (in dimension \((T, N)\)) can then be computed by either ordinary or weighted least squares,
where \(W\) is the weight matrix in the regression, e.g. an identity matrix in ordinary least squares.
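The regression step can be sketched as follows. The formulas are an assumption inferred from the stated dimensions, not a quotation of the library: the model is taken to be \(R = FB + \Gamma\) with an \((N, N)\) symmetric weight matrix \(W\) across instruments, so that \(F = R W B^{T} (B W B^{T})^{-1}\) and \(\Gamma = R - FB\). The function name `estimate_factors` and the random stand-in exposures are hypothetical.

```python
import numpy as np

def estimate_factors(R, B, W=None):
    """Cross-sectional least squares for R = F B + Gamma (hypothetical helper).

    R: (T, N) returns, B: (n, N) exposures, W: (N, N) symmetric weights.
    W=None falls back to the identity, i.e. ordinary least squares.
    """
    if W is None:
        W = np.eye(B.shape[1])
    # Solve (B W B^T) F^T = B W R^T rather than forming an explicit inverse.
    lhs = B @ W @ B.T                    # (n, n)
    rhs = B @ W @ R.T                    # (n, T)
    F = np.linalg.solve(lhs, rhs).T      # (T, n)
    residuals = R - F @ B                # (T, N)
    return F, residuals

# Synthetic inputs, for illustration only.
rng = np.random.default_rng(1)
T, N, n = 250, 6, 2
R = rng.normal(scale=0.01, size=(T, N))
B = rng.normal(size=(n, N))              # stand-in exposures

F_ols, G_ols = estimate_factors(R, B)                 # ordinary least squares
W = np.diag(rng.uniform(0.5, 2.0, size=N))            # assumed diagonal weights
F_wls, G_wls = estimate_factors(R, B, W)              # weighted least squares
```

Solving the normal equations with `np.linalg.solve` avoids an explicit matrix inverse; the weighted residuals are orthogonal to the exposures, i.e. `G_wls @ W @ B.T` is numerically zero.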
Module¶
- class fpm_risk_model.statistical.pca.PCAConfig(*, show_all_instruments: bool = False, n_components: Union[int, float, str], demean: Optional[bool] = True, speedup: Optional[bool] = True)¶
PCA statistics model configuration class.
Parameters¶
- n_components: int
Number of components.
- demean: Optional[bool]
Indicate whether to demean before fitting. Default is True.
- speedup: Optional[bool]
Indicate whether to speed up the computation as much as possible. Default is True.
- n_components: Union[int, float, str]¶
- demean: Optional[bool]¶
- speedup: Optional[bool]¶
- class fpm_risk_model.statistical.pca.PCA(n_components: int, demean: Optional[bool] = True, speedup: Optional[bool] = True, **kwargs)¶
PCA statistics model.
- __init__(n_components: int, demean: Optional[bool] = True, speedup: Optional[bool] = True, **kwargs)¶
Constructor.
Parameters¶
- n_components: int
Number of components.
- demean: Optional[bool]
Indicate whether to demean before fitting. Default is True.
- speedup: Optional[bool]
Indicate whether to speed up the computation as much as possible. Default is True.
- fit(X: Union[ndarray, DataFrame], weights: Optional[Union[ndarray, Series]] = None) → object¶
Fit the risk model to the returns.
Parameters¶
- X: pandas.DataFrame or numpy.ndarray
Instrument returns, where the columns are the instruments and the index is the date / time in ascending order. For example, with N instruments and T days of returns, the input has dimension (T, N).
Returns¶
- object
The object itself.
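A usage sketch based only on the signatures documented above: it builds a (T, N) returns frame with an ascending date index and fits a model with three components. The synthetic data, the instrument names and the choice of `n_components=3` are assumptions for illustration, and the import is guarded so the data preparation still runs when the package is not installed.

```python
import numpy as np
import pandas as pd

try:
    from fpm_risk_model.statistical.pca import PCA
except ImportError:  # package not available in this environment
    PCA = None

# Synthetic (T, N) returns: 60 business days, 8 hypothetical instruments.
rng = np.random.default_rng(42)
dates = pd.bdate_range("2022-01-03", periods=60)
instruments = [f"INST_{i}" for i in range(8)]
returns = pd.DataFrame(
    rng.normal(scale=0.01, size=(60, 8)),
    index=dates,
    columns=instruments,
)

if PCA is not None:
    model = PCA(n_components=3)    # demean and speedup keep their defaults
    fitted = model.fit(returns)    # documented to return the object itself
```

Because `fit` is documented to return the object itself, construction and fitting can also be chained as `PCA(n_components=3).fit(returns)`.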