Fama-French 3-Factor Model

This Python code replicates the Fama-French risk factors SMB and HML, along with the excess market return factor. It uses CRSP data for prices, returns, and shares outstanding, and Compustat data for fundamentals.

We start by importing the relevant Python packages and then establishing a connection to the WRDS server.



Next, we extract fundamental information from Compustat and calculate the book value of equity.
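The book-equity step can be sketched on toy data as follows. This is a minimal illustration, not the original cell: it assumes the standard Compustat item names (`seq`, `txditc`, `pstkrv`, `pstkl`, `pstk`), the numbers are made up, and treating non-positive book equity as missing is an assumption of this sketch.

```python
import numpy as np
import pandas as pd

# Toy Compustat-style annual records in $ millions (values are made up).
comp = pd.DataFrame({
    "gvkey":  ["001", "002", "003"],
    "seq":    [100.0, 50.0, 20.0],     # stockholders' equity
    "txditc": [5.0, np.nan, 1.0],      # deferred taxes and investment tax credit
    "pstkrv": [10.0, np.nan, np.nan],  # preferred stock, redemption value
    "pstkl":  [12.0, 4.0, np.nan],     # preferred stock, liquidating value
    "pstk":   [11.0, 3.0, np.nan],     # preferred stock, par value
})

# Preferred stock: use redemption value, else liquidating value, else par value, else 0.
comp["ps"] = comp["pstkrv"].fillna(comp["pstkl"]).fillna(comp["pstk"]).fillna(0.0)
comp["txditc"] = comp["txditc"].fillna(0.0)

# Book equity = stockholders' equity + deferred taxes - preferred stock.
comp["be"] = comp["seq"] + comp["txditc"] - comp["ps"]

# Assumption of this sketch: non-positive book equity is set to missing.
comp["be"] = comp["be"].where(comp["be"] > 0)

print(comp[["gvkey", "be"]])
```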


Price and shares outstanding information is extracted from CRSP to calculate the market value of equity.

Market cap is calculated at the issue level (permno), whereas book value of equity is calculated at the company level, so market cap must be aggregated to the company level (permco) before the book-to-market ratio can be calculated. The company's market cap in December of year t-1 serves as the denominator of the book-to-market ratio used for portfolio formation in June of year t.
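The aggregation from permno to permco might look like the sketch below. The data are made up, and assigning the company total to the permno with the largest market cap is one common convention, assumed here rather than taken from the text. The `form_year` column is a hypothetical helper name.

```python
import pandas as pd

# Toy CRSP-style monthly records (values are made up).
crsp = pd.DataFrame({
    "permno": [10001, 10002, 10003],
    "permco": [500, 500, 600],          # permnos 10001 and 10002 share a company
    "date": pd.to_datetime(["2020-12-31"] * 3),
    "prc": [20.0, -10.0, 5.0],          # CRSP stores a bid/ask average as a negative price
    "shrout": [1000, 500, 2000],        # shares outstanding, in thousands
})

# Issue-level market cap in $ thousands; abs() handles the negative-price convention.
crsp["me"] = crsp["prc"].abs() * crsp["shrout"]

# Company-level market cap: sum across all issues of the same permco on each date.
by_firm = crsp.groupby(["date", "permco"])["me"]
firm_me = by_firm.transform("sum")
is_largest = crsp["me"] == by_firm.transform("max")

# Keep one row per company (the largest issue) and give it the company total.
firm = crsp.loc[is_largest].copy()
firm["me"] = firm_me[is_largest]

# December market cap of year t-1, tagged with the formation year t.
dec_me = firm.loc[firm["date"].dt.month == 12, ["permno", "me"]].rename(columns={"me": "dec_me"})
dec_me["form_year"] = firm["date"].dt.year + 1

print(dec_me)
```

Note that if two issues of the same company tie for the largest market cap, this sketch keeps both rows; a production version would need a tie-break.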


Next, we extract linking information from the CRSP/Compustat Merged (CCM) link table to join the market cap information from CRSP with the book value of equity from Compustat. With the joined data frame, book-to-market is a straightforward ratio calculation.
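A rough sketch of the join, on made-up data. It assumes a link table with `gvkey`, `permno`, `linkdt`, and `linkenddt` columns, checks link validity against `datadate` for simplicity, and assumes book equity is in $ millions while market cap is in $ thousands (price times shares in thousands), hence the factor of 1000.

```python
import pandas as pd

# Book equity from Compustat, in $ millions (made-up values).
be = pd.DataFrame({
    "gvkey": ["001", "002"],
    "datadate": pd.to_datetime(["2020-12-31", "2020-12-31"]),
    "be": [95.0, 46.0],
})

# Toy link table: gvkey 002 has an expired link (10009) and a current one (10003).
link = pd.DataFrame({
    "gvkey": ["001", "002", "002"],
    "permno": [10001, 10003, 10009],
    "linkdt": pd.to_datetime(["2000-01-01", "2015-01-01", "2005-01-01"]),
    "linkenddt": pd.to_datetime([None, None, "2014-12-31"]),
})

# An open-ended link has a missing end date; treat it as valid through today.
link["linkenddt"] = link["linkenddt"].fillna(pd.Timestamp.today().normalize())

# December market cap from CRSP, in $ thousands (made-up values).
dec_me = pd.DataFrame({"permno": [10001, 10003], "dec_me": [25000.0, 10000.0]})

# Keep only links that were active on the fiscal year-end date.
merged = be.merge(link, on="gvkey", how="inner")
active = (merged["datadate"] >= merged["linkdt"]) & (merged["datadate"] <= merged["linkenddt"])
ccm = merged.loc[active].merge(dec_me, on="permno", how="inner")

# Book-to-market: convert book equity to $ thousands to match market cap.
ccm["beme"] = ccm["be"] * 1000 / ccm["dec_me"]

print(ccm[["gvkey", "permno", "beme"]])
```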


Following the original paper's methodology, the NYSE stock universe is used to set the size breakpoints for portfolio formation.
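As an illustration, breakpoints can be computed from the NYSE subset and then applied to all stocks. The sketch below uses made-up data and assumes the CRSP exchange code column `exchcd` (1 = NYSE). It also computes 30th and 70th percentile book-to-market breakpoints from the same NYSE subset, which is the usual Fama-French convention but is an assumption here.

```python
import pandas as pd

# Toy June cross-section (values are made up).
june = pd.DataFrame({
    "permno": [1, 2, 3, 4, 5, 6],
    "exchcd": [1, 1, 1, 1, 3, 3],   # CRSP exchange codes: 1 = NYSE, 3 = NASDAQ
    "me":     [100.0, 200.0, 300.0, 400.0, 50.0, 5000.0],
    "beme":   [0.2, 0.5, 0.8, 1.1, 0.3, 2.0],
})

# Breakpoints come from NYSE stocks only, so that the many small
# non-NYSE stocks do not drag the size median down.
nyse = june[june["exchcd"] == 1]
size_median = nyse["me"].median()
bm30, bm70 = nyse["beme"].quantile([0.3, 0.7])

print(size_median, bm30, bm70)
```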


The following functions assign stocks to size and book-to-market portfolios:
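A minimal sketch of what such assignment functions could look like. The function names, the bucket labels, and the breakpoint values passed in are all hypothetical and chosen for illustration.

```python
import pandas as pd

def size_bucket(me, size_median):
    """Return 'S' (small) or 'B' (big) relative to the size breakpoint."""
    if pd.isna(me):
        return ""
    return "S" if me <= size_median else "B"

def bm_bucket(beme, bm30, bm70):
    """Return 'L', 'M', or 'H' for low, medium, or high book-to-market."""
    if pd.isna(beme):
        return ""
    if beme <= bm30:
        return "L"
    if beme <= bm70:
        return "M"
    return "H"

# Toy data; the breakpoints 250.0, 0.47 and 0.83 are illustrative values.
stocks = pd.DataFrame({
    "me":   [100.0, 400.0, 50.0, 5000.0],
    "beme": [0.2, 1.1, 0.6, float("nan")],
})
stocks["szport"] = stocks["me"].apply(lambda v: size_bucket(v, 250.0))
stocks["bmport"] = stocks["beme"].apply(lambda v: bm_bucket(v, 0.47, 0.83))

# Combined label such as 'SL' or 'BH'; left empty when either bucket is missing.
stocks["port"] = [s + b if s and b else "" for s, b in zip(stocks["szport"], stocks["bmport"])]

print(stocks)
```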


Then apply the functions to the dataframe:

The final steps involve forming portfolios based on the size and book-to-market assignments.

Apply the function to calculate value-weighted returns within each portfolio:
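The sketch below shows one way to compute value-weighted portfolio returns and combine the six portfolios into SMB and HML. The data are made up, `value_weighted_returns` is a hypothetical helper, and the weight column `wt` is assumed to hold lagged market cap.

```python
import pandas as pd

def value_weighted_returns(df, ret_col="ret", weight_col="wt"):
    """Value-weighted return per (date, port), returned as a date-by-portfolio table."""
    tmp = df[["date", "port"]].copy()
    tmp["wret"] = df[ret_col] * df[weight_col]
    tmp["wt"] = df[weight_col]
    sums = tmp.groupby(["date", "port"])[["wret", "wt"]].sum()
    return (sums["wret"] / sums["wt"]).unstack("port")

# One month of toy stock returns with portfolio labels (values are made up).
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-07-31"] * 7),
    "port": ["SL", "SL", "SM", "SH", "BL", "BM", "BH"],
    "ret":  [0.02, 0.04, 0.03, 0.05, 0.01, 0.02, 0.03],
    "wt":   [100.0, 300.0, 50.0, 80.0, 1000.0, 2000.0, 1500.0],
})

vwret = value_weighted_returns(df)

# SMB: average of the three small portfolios minus average of the three big ones.
smb = vwret[["SL", "SM", "SH"]].mean(axis=1) - vwret[["BL", "BM", "BH"]].mean(axis=1)

# HML: average of the two high-B/M portfolios minus average of the two low-B/M ones.
hml = vwret[["SH", "BH"]].mean(axis=1) - vwret[["SL", "BL"]].mean(axis=1)

print(vwret)
print(smb, hml)
```

Computing the weighted sums with a plain `groupby(...).sum()` avoids `groupby.apply`, whose treatment of grouping columns differs across pandas versions.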

One might be interested in comparing the program-generated Fama-French factors with the actual ones reported by Fama and French:
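One simple way to compare the two series is to merge them on date and look at their correlation. In the sketch below both series are made-up placeholders, not actual factor values; note also that the Ken French data library reports returns in percent, so the published series would need to be divided by 100 to match decimal returns.

```python
import pandas as pd

dates = pd.to_datetime(["2020-01-31", "2020-02-29", "2020-03-31", "2020-04-30"])

# Placeholder values only: 'replicated' stands in for the program output and
# 'published' for the Fama-French series after conversion to decimal returns.
replicated = pd.DataFrame({"date": dates, "smb": [0.010, -0.020, 0.015, 0.005]})
published = pd.DataFrame({"date": dates, "smb": [0.011, -0.019, 0.014, 0.006]})

# Align the two series on date and keep only months present in both.
both = replicated.merge(published, on="date", how="inner", suffixes=("_rep", "_ff"))

# Pearson correlation between the replicated and published factor.
corr = both["smb_rep"].corr(both["smb_ff"])
print(round(corr, 4))
```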

Finally, the chart below illustrates how the output of this Python program lines up against the published Fama-French three factors. The solid blue line represents the risk factor generated by the Python code, and the dashed red line represents the original data series from the Fama-French data library.