Asymptotic Principal Component Analysis (APCA)

When PCA is run over a rolling window of length \(T\) against a universe of \(N\) instruments, \(T\) must be much greater than \(N\) for the estimates to be of good quality. To address this shortcoming, asymptotic PCA performs the analysis in the \(T\) space rather than in the \(N\) space.

\[ \hat{Q} = \frac{R R^T}{T} = VDV^T \]

The matrix \(V_n^T\), whose rows are the eigenvectors associated with the \(n\) largest eigenvalues, is then taken as the factor returns \(F\), and the returns are regressed on \(F\):

\[ R = B F + {\Gamma} \]
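The two steps above can be sketched in NumPy. This is a minimal illustration, not the library's implementation. It assumes the returns are held as a (T, N) array `X` (the layout that `fit` accepts), so that `X @ X.T` is the T x T matrix, and it reads \(B\) as the factor exposures and \(\Gamma\) as the residual returns. The function name `apca_sketch`, the choice to demean each instrument over time, and the simulated data are all assumptions made for the example.

```python
import numpy as np


def apca_sketch(X, n_components, demean=True):
    """Illustrative asymptotic PCA on a (T, N) returns array (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    if demean:
        # Assumption: remove each instrument's time-series mean.
        X = X - X.mean(axis=0, keepdims=True)
    T = X.shape[0]

    # T x T cross-product matrix: its size depends on the window, not on N.
    Q = X @ X.T / T

    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(Q)
    top = np.argsort(eigvals)[::-1][:n_components]

    # Factor returns: rows are the leading eigenvectors, shape (n, T).
    F = eigvecs[:, top].T

    # Least-squares regression of the returns on the factors: X ~ F.T @ B.T
    B_T, *_ = np.linalg.lstsq(F.T, X, rcond=None)
    B = B_T.T                 # exposures, shape (N, n)
    residuals = X.T - B @ F   # shape (N, T)
    return F, B, residuals


# Simulated universe with far more instruments than observations (N >> T).
rng = np.random.default_rng(0)
T, N, n = 60, 500, 3
true_F = rng.normal(size=(n, T))
true_B = rng.normal(size=(N, n))
X = (true_B @ true_F).T + 0.01 * rng.normal(size=(T, N))

F, B, residuals = apca_sketch(X, n_components=n)
print(F.shape, B.shape, residuals.shape)  # (3, 60) (500, 3) (500, 60)
```

Because the eigenvectors are orthonormal, the regression reduces to a projection of the returns onto the factor rows; `lstsq` is used here only to keep the regression step explicit.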

Reference

Gregory Connor, Robert A. Korajczyk (1988). Risk and return in an equilibrium APT: Application of a new test methodology. Journal of Financial Economics, 21(2), 255-289.

Module

class fpm_risk_model.statistical.apca.APCAConfig(*, show_all_instruments: bool = False, n_components: Union[int, float, str], demean: Optional[bool] = True)

Configuration class for the asymptotic PCA statistical model.

Parameters

n_components: int

Number of components.

demean: Optional[bool]

Whether to demean the returns before fitting. Default is True.

n_components: Union[int, float, str]

demean: Optional[bool]

class fpm_risk_model.statistical.apca.APCA(n_components: int, demean: Optional[bool] = True, **kwargs)

Asymptotic PCA statistical model.

ConfigClass

alias of APCAConfig

__init__(n_components: int, demean: Optional[bool] = True, **kwargs)

Constructor.

Parameters

n_components: int

Number of components.

demean: Optional[bool]

Whether to demean the returns before fitting. Default is True.

fit(X: Union[ndarray, DataFrame], atol: float = 1e-10) → object

Fit the risk model to the returns.

Parameters

X: pandas.DataFrame or numpy.ndarray

Instrument returns, where the columns are the instruments and the index is the date / time in ascending order. For example, with N instruments and T days of returns, the input has dimension (T, N).

Returns

object

The object itself.
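As a rough illustration of the input layout that fit expects, the sketch below builds a small (T, N) returns frame with pandas. The tickers, dates and random values are hypothetical. The commented-out lines at the end only indicate how the documented constructor and fit signatures would be used if the package is installed.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# T = 5 business days in ascending order, N = 3 hypothetical instruments.
dates = pd.bdate_range("2023-01-02", periods=5)
instruments = ["AAA", "BBB", "CCC"]

returns = pd.DataFrame(
    rng.normal(scale=0.01, size=(len(dates), len(instruments))),
    index=dates,
    columns=instruments,
)

# Sort by date so the index is ascending, as the documentation requires.
returns = returns.sort_index()
print(returns.shape)  # (5, 3), i.e. (T, N)

# With the package installed, the documented call would look like:
# from fpm_risk_model.statistical.apca import APCA
# model = APCA(n_components=2).fit(returns)
```

Since fit returns the object itself, construction and fitting can be chained as in the commented example.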