Characteristics-based benchmarks (DGTW)

This code replicates the characteristics-based benchmarks proposed by Daniel, Grinblatt, Titman and Wermers (1997), hence the short form DGTW benchmarks. In the asset pricing literature, DGTW benchmarks are widely used to measure the performance of institutional investors based on portfolio holdings data.

As always, the first step is to import the relevant Python packages and establish a connection to the WRDS server.

Pricing-related information (return, price, shares outstanding) is obtained from CRSP using the following "raw_sql" step.
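
As a rough sketch of what such a step might look like, the function below pulls monthly returns, prices and shares outstanding from the CRSP monthly stock file. The function name fetch_crsp_monthly, the default date range and the exact column list are assumptions made for illustration and are not necessarily the query used here. The conn argument is expected to be an open wrds.Connection().

```python
def fetch_crsp_monthly(conn, start="1970-01-01", end="2017-12-31"):
    """Pull monthly CRSP pricing data through conn.raw_sql.

    conn is assumed to be an open wrds.Connection(); any object exposing a
    compatible raw_sql method works too, which keeps the function testable.
    The default dates are hypothetical placeholders.
    """
    # The dates are formatted straight into the SQL text for brevity; they
    # come from the caller, not from untrusted input.
    sql = f"""
        select permno, permco, date, ret, prc, shrout
        from crsp.msf
        where date between '{start}' and '{end}'
    """
    # date_cols asks raw_sql to parse the 'date' column as datetimes.
    return conn.raw_sql(sql, date_cols=["date"])
```

Passing the connection in as an argument, rather than opening it inside the function, makes it easy to reuse one session for the CRSP, Compustat and link-table queries.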

Compustat Fundamentals Annual contains the financial statement information needed to calculate the book value of equity.
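
To make the book equity calculation concrete, here is a minimal sketch assuming a Fama-French style definition: stockholders' equity (seq) plus deferred taxes and investment tax credit (txditc), minus preferred stock taken from pstkrv, pstkl or pstk in that order. The definition, the helper name book_equity and the toy numbers are assumptions for illustration; the actual code may use a slightly different variant.

```python
import numpy as np
import pandas as pd

def book_equity(comp: pd.DataFrame) -> pd.Series:
    """Book equity = SEQ + TXDITC - preferred stock (assumed convention)."""
    # Preferred stock: redemption value, then liquidating value, then par value.
    pref = comp["pstkrv"].fillna(comp["pstkl"]).fillna(comp["pstk"]).fillna(0.0)
    # Missing deferred taxes are treated as zero.
    be = comp["seq"] + comp["txditc"].fillna(0.0) - pref
    # Non-positive book equity is set to missing, since it does not give a
    # meaningful book-to-market ratio.
    return be.where(be > 0)

# Toy data, not real Compustat values.
comp = pd.DataFrame({
    "gvkey": ["001", "002", "003"],
    "seq": [100.0, 50.0, 5.0],
    "txditc": [10.0, np.nan, 0.0],
    "pstkrv": [np.nan, 5.0, np.nan],
    "pstkl": [20.0, np.nan, np.nan],
    "pstk": [30.0, 7.0, 10.0],
})
comp["be"] = book_equity(comp)
```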

To link these two sources together, CRSP provides a handy linking table to join the key identifiers, permco and gvkey, from both databases.
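
The join itself can be sketched as follows. I assume the link table has the columns gvkey, lpermno, linkdt and linkenddt, and that a Compustat record is matched only when its datadate falls inside the link's validity window. The helper name link_comp_to_crsp and the toy rows are made up for illustration, and a real link table also carries lpermco plus link-type filters that are left out here.

```python
import pandas as pd

def link_comp_to_crsp(comp, link):
    """Attach the CRSP permno to Compustat rows using link validity dates."""
    link = link.copy()
    # An open-ended link has a missing end date; treat it as valid through today.
    link["linkenddt"] = link["linkenddt"].fillna(pd.Timestamp.today().normalize())
    merged = comp.merge(link, on="gvkey", how="inner")
    # Keep only rows whose fiscal period end lies inside the link window.
    valid = (merged["datadate"] >= merged["linkdt"]) & (
        merged["datadate"] <= merged["linkenddt"]
    )
    return (
        merged.loc[valid]
        .rename(columns={"lpermno": "permno"})
        .reset_index(drop=True)
    )

# Toy data, not real identifiers.
comp = pd.DataFrame({
    "gvkey": ["001", "001", "002"],
    "datadate": pd.to_datetime(["2000-12-31", "2010-12-31", "2005-12-31"]),
})
link = pd.DataFrame({
    "gvkey": ["001", "001", "002"],
    "lpermno": [10001, 10002, 20001],
    "linkdt": pd.to_datetime(["1990-01-01", "2006-01-01", "2006-01-01"]),
    "linkenddt": pd.to_datetime(["2005-12-31", None, None]),
})
linked = link_comp_to_crsp(comp, link)
```

In the toy data, the third Compustat row is dropped because its datadate comes before the only link for that gvkey starts.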

Below is a somewhat clumsy function to assign stocks into Fama-French 48 industry classification based on the historical SIC codes. Please refer to Ken French's site for a detailed text file describing the classification scheme.
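
For readers who just want the shape of such a function, here is a much shorter, table-driven sketch. It covers only a handful of SIC ranges for four of the 48 groups, and even those groups are incomplete, so treat the ranges as illustrative and take the full table from the classification file. The name ffi48_sketch is hypothetical and is not the function used in this code.

```python
# Illustrative subset of SIC ranges: (low, high, industry number, short name).
# Only four of the 48 groups appear, and each is partial; the complete
# definitions should come from the classification file on Ken French's site.
FF48_RANGES = [
    (100, 299, 1, "Agric"),
    (1040, 1049, 27, "Gold"),
    (1200, 1299, 29, "Coal"),
    (2100, 2199, 5, "Smoke"),
]

def ffi48_sketch(sic):
    """Return the industry number for a SIC code, or None if unmatched."""
    if sic is None:
        return None
    for low, high, industry, _name in FF48_RANGES:
        if low <= sic <= high:
            return industry
    # Unmatched codes are left unassigned in this sketch.
    return None
```

With pandas, a function like this can be mapped over a SIC column using Series.apply, for example df["sic"].apply(ffi48_sketch).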


The function "ffi48" is then applied to the merged CRSP and Compustat data frame created in the previous step to assign each stock to its proper Fama-French 48 industry group.


Now that we have created an industry classification scheme, the rest is quite straightforward. Because the DGTW approach benchmarks performance on the value, momentum and size of stocks, we need to calculate the industry averages of these metrics.

Let's start with the Book-to-Market ratio.
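
A minimal sketch of this step, assuming the data frame already holds book equity (be), market equity (me) and the industry code (ffi48): the ratio is be divided by me, and the industry adjustment subtracts the industry mean within each year. Both the column names and the choice of a simple within-year mean are assumptions for illustration.

```python
import pandas as pd

# Toy data, not real CRSP/Compustat values.
df = pd.DataFrame({
    "permno": [1, 2, 3, 4],
    "year": [2000, 2000, 2000, 2000],
    "ffi48": [1, 1, 5, 5],
    "be": [50.0, 30.0, 20.0, 60.0],
    "me": [100.0, 100.0, 100.0, 100.0],
})

# Book-to-market ratio for each stock.
df["bm"] = df["be"] / df["me"]
# Industry average, broadcast back to every stock in the industry-year.
df["bm_ind_avg"] = df.groupby(["year", "ffi48"])["bm"].transform("mean")
# Industry-adjusted book-to-market.
df["bm_adj"] = df["bm"] - df["bm_ind_avg"]
```

Using transform rather than agg keeps the result aligned with the original rows, so no extra merge is needed.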


The next step is to create the momentum factors for the industry groups.
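
One way to build a stock-level momentum measure is sketched below. I assume momentum means the cumulative return over the 11 months ending one month before the current month (a "12 months, skipping the most recent" convention); the window lengths, the helper name add_momentum and the column names are assumptions and may differ from the actual code.

```python
import numpy as np
import pandas as pd

def add_momentum(df, lookback=11, skip=1):
    """Cumulative return over `lookback` months ending `skip` months before t."""
    df = df.sort_values(["permno", "date"]).copy()
    # Summing log returns is equivalent to compounding simple returns.
    df["logret"] = np.log1p(df["ret"])
    cum_log = df.groupby("permno")["logret"].transform(
        lambda s: s.shift(skip).rolling(lookback, min_periods=lookback).sum()
    )
    df["mom"] = np.expm1(cum_log)
    return df.drop(columns="logret")

# Toy data: one stock earning 1% every month for 13 months.
toy = pd.DataFrame({
    "permno": [1] * 13,
    "date": pd.date_range("2000-01-01", periods=13, freq="MS"),
    "ret": [0.01] * 13,
})
out = add_momentum(toy)
```

Requiring min_periods equal to the full window means a stock needs a complete return history before it gets a momentum value, so the first 11 rows of the toy example are missing.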


The third step is to create size portfolios based on NYSE size breakpoints.
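
The idea behind NYSE breakpoints is that the cutoffs are computed from NYSE stocks only and then applied to every stock. A minimal sketch, assuming quintiles, a market equity column me, and the CRSP exchange code exchcd with 1 meaning NYSE (the helper name assign_size_quintile is made up):

```python
import numpy as np
import pandas as pd

def assign_size_quintile(df):
    """Size group 1 (small) to 5 (large) from NYSE-only breakpoints."""
    # Breakpoints come from NYSE stocks only.
    nyse_me = df.loc[df["exchcd"] == 1, "me"]
    breakpoints = nyse_me.quantile([0.2, 0.4, 0.6, 0.8]).to_numpy()
    # searchsorted counts how many breakpoints lie strictly below each value,
    # so a stock at or below the 20th percentile lands in group 1.
    # Rows with missing me should be dropped before calling this.
    groups = np.searchsorted(breakpoints, df["me"].to_numpy(), side="left") + 1
    return pd.Series(groups, index=df.index, name="size_grp")

# Toy data: five NYSE stocks (exchcd 1) and three non-NYSE stocks (exchcd 3).
df = pd.DataFrame({
    "exchcd": [1, 1, 1, 1, 1, 3, 3, 3],
    "me": [10.0, 20.0, 30.0, 40.0, 50.0, 5.0, 35.0, 100.0],
})
df["size_grp"] = assign_size_quintile(df)
```

In practice the breakpoints are recomputed each formation period, for example by calling a function like this inside a groupby over the formation date.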

The final step involves calculating value-weighted portfolio returns controlled for size, book-to-market and momentum. I really wish Python could add a value-weighted option to the mean() function. How I miss the weight=size capability of the SAS "proc means" statement.
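
In the meantime, a small helper built on numpy.average gets reasonably close. This is a sketch, not the exact code used here: the helper name vw_mean, the weight column me_lag (lagged market equity) and the portfolio label column port are assumptions.

```python
import numpy as np
import pandas as pd

def vw_mean(group, value_col="ret", weight_col="me_lag"):
    """Weighted average of value_col using weight_col as weights."""
    valid = group[[value_col, weight_col]].dropna()
    if valid.empty or valid[weight_col].sum() == 0:
        return np.nan
    return np.average(valid[value_col], weights=valid[weight_col])

# Toy data: two portfolios in one month.
df = pd.DataFrame({
    "date": ["2000-01", "2000-01", "2000-01", "2000-01"],
    "port": ["111", "111", "555", "555"],
    "ret": [0.10, 0.00, 0.02, 0.04],
    "me_lag": [100.0, 300.0, 50.0, 50.0],
})

# One value-weighted return per date and portfolio.
dgtw_vwret = (
    df.groupby(["date", "port"])[["ret", "me_lag"]]
    .apply(vw_mean)
    .rename("dgtw_vwret")
    .reset_index()
)
```

In the toy data the first portfolio returns 2.5%, because the stock with a zero return carries three quarters of the weight.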


For those of you who have already created your DGTW excess returns using the sample SAS code available in the Research Application section of the WRDS website, you may want to compare the output of this Python code with it. The correlation between the two dgtw_vwret series is 99.4%, with a p-value of 0.0.
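
If you want to run a similar check yourself, the comparison can be sketched as below: align the two outputs on date and portfolio, then compute the Pearson correlation. The numbers are synthetic toy values, not the actual Python or SAS output, and the frame names py_out and sas_out and the column port are made up for the example.

```python
import pandas as pd

# Synthetic stand-ins for the Python and SAS outputs.
py_out = pd.DataFrame({
    "date": ["2000-01", "2000-02", "2000-03", "2000-04"],
    "port": ["111", "111", "111", "111"],
    "dgtw_vwret": [0.010, 0.020, 0.030, -0.010],
})
sas_out = pd.DataFrame({
    "date": ["2000-01", "2000-02", "2000-03", "2000-04"],
    "port": ["111", "111", "111", "111"],
    "dgtw_vwret": [0.011, 0.019, 0.030, -0.010],
})

# Align the two series on their shared keys before comparing.
both = py_out.merge(sas_out, on=["date", "port"], suffixes=("_py", "_sas"))
# Series.corr defaults to the Pearson correlation.
corr = both["dgtw_vwret_py"].corr(both["dgtw_vwret_sas"])
```

Merging on the keys first matters, because correlating two unaligned columns would silently compare different portfolios or months.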

I also plot the time series of the corner portfolios below to provide a visual comparison between the two series. The DGTW portfolios are ordered by size, book-to-market and momentum. The solid blue line represents the value-weighted portfolio return from the Python output (dgtw_vwret), and the dashed red line shows the result from the SAS output.