wrds sec

This set of code demonstrates how to access various SEC indices and filings on WRDS SEC tool. 

Import Python packages and establish connection with WRDS server.

If you don't already know the data component of the WRDS SEC filings, you may want to explore which tables are part of this library. The most commonly used tables include:

Usage 1: Query Data by Keywords

Occasionally, researchers want to zoom in on certain filings that contain keywords. Taking the insider filings for instance, researchers might want to focus on any insider transaction that is related to Rule 10b5-1. One can parse out the insider filings that contain this keyword by using the code below.

The "where" statement in the rawl_sql block focuses on records whose "text" column contains the keyword "10b5". Notice the command "LIMIT 1000" in the raw_sql statement limits the output to first 1000 records that match the keyword criteria. 

And here is the output of the text column of the first record that meets the criteria.

The stock option exercise reported in this Form 4 was effected pursuant to a Rule 10b5-1 trading plan adopted by the reporting person on December 3, 2018.

Usage 2: Query Filing Index by Ticker

This part of the code enables researchers to look up a company's CIK using TICKER, and it further looks up the file name path that point to the actual filings on Edgar's website. WRDS SEC server also mirrors this file path structure. 

This raw_sql block looks up the CIKs for two tickers, AAPL and WFC, using the linking table wciklink_names prepared by WRDS research team. The table below shows the content of the cikdf dataframe. 

Next, we can use the CIK information to query these two companies' Edgar filing path. One can also impose date range conditions to focus on certain years' filings.

And the output dataframe df contains the following information:

Usage 3: Get Actual Filing Content

For textual analysis projects involving sentiment, readability, topic modelling and many more advanced tasks, researchers need to access the actual SEC filings themselves. WRDS SEC pre-cleans all the SEC filings and has them stored on server for easy access.

One might notice that under the form column, there are various types of SEC filings, such as Form 3, 4, and many others, in addition to the commonly used 10-K, 10-Q and 8-K filings. As a matter of fact, for the two companies in this example, there are over 40 types of SEC filings. 

If one wants to focus on certain type of filing (e.g. 10-K), the operation below filters out all the non 10-K filings and reduces the sample to 13 records.

Last but not least, let's try extract the actual 10-K filings for these two companies. To do so, we will rely on the wrdsfname field in the dataframe above. This variable contains file name and location information for each filing on wrds server under the SEC specific project folder. To complete the file path, we will need to use the os.path.join command below to add on the location of the project folder.  

The output list textLst should contain the body of the thirteen 10-K filings filed by these two companies in the sample period. 

And below is the partial (full content hundreds of pages long) output of the first 10-K filed by WFC on 2015/02/25 for period end 2014/12/31.