European Short Position Data

Policy Background

Following the global financial crisis of 2008, the European Union adopted Regulation (EU) No 236/2012 on short selling and certain aspects of credit default swaps, with the European Securities and Markets Authority (ESMA) coordinating its application. The purpose of this EU-wide reporting regulation is to increase market stability by making market activity less opaque through mandatory reporting by institutional investors.

Accordingly, each participating country's financial supervisory authority requires market participants holding significant short positions in qualifying securities to self-report those positions, and then makes the information available to the public.

Technical Background

Each government agency's website presents the data differently, and scraping it often involves mimicking mouse clicks in a browser. Selenium, a package that automates a real browser, suits this need well.

However, I would like to emphasize that this code needs human monitoring over time, because an agency's website can change its layout or present the data in a different format. Please do not rely on the code output blindly. More likely than not, it will crash somewhere before reaching the end!
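One way to make such crashes easier to monitor is to run each country's scraper in isolation and record which ones failed, rather than letting the first error stop the whole run. The following is only a minimal sketch of that idea, not the approach this notebook takes: `run_scrapers`, `scrape_austria`, and `scrape_belgium` are hypothetical names, and the two stand-in functions return made-up rows instead of driving Selenium.

```python
from typing import Callable, Dict, List, Tuple


def run_scrapers(
    scrapers: Dict[str, Callable[[], List[dict]]],
) -> Tuple[Dict[str, List[dict]], Dict[str, str]]:
    """Run every country scraper, keeping results and failures apart."""
    results: Dict[str, List[dict]] = {}
    failures: Dict[str, str] = {}
    for country, scrape in scrapers.items():
        try:
            rows = scrape()
            if not rows:
                # An empty result often means the page layout changed.
                raise ValueError("scraper returned no rows")
            results[country] = rows
        except Exception as exc:  # broad on purpose: any failure needs review
            failures[country] = f"{type(exc).__name__}: {exc}"
    return results, failures


# Hypothetical stand-ins for the real per-country functions (made-up data).
def scrape_austria() -> List[dict]:
    return [{"holder": "Fund A", "isin": "AT0000000001", "position": 0.61}]


def scrape_belgium() -> List[dict]:
    raise RuntimeError("download button not found")


results, failures = run_scrapers(
    {"Austria": scrape_austria, "Belgium": scrape_belgium}
)
print(sorted(results))  # countries that produced data
print(failures)         # countries that need a human to look at the site
```

The `failures` dictionary gives a short list of sites to inspect by hand after each run, which fits the warning above that the output should not be trusted blindly.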

Our student intern Shuai Zheng contributed to this code.


0. Import packages

1. Define functions and set up Selenium driver

2. Scrape each country's agency site

2.1 Austria

2.2 Belgium

2.3 Czech Republic

2.4 Denmark

2.5 Finland

2.6 France

2.7 Germany

2.8 Greece

2.9 Hungary

2.10 Iceland

2.11 Ireland

2.12 Italy

2.13 Luxembourg

2.14 Netherlands

2.15 Poland

2.16 Spain

2.17 Sweden

2.18 UK

2.19 Norway

After its most recent website revamp, the Norwegian Finanstilsynet presents the data in JSON format through its API portal, so the code does not resemble the download process used for the other countries.

The field activePositions is a list of dictionaries that report detailed information on short sellers and their respective holdings. This field needs to be expanded into a long table that shows each item as a separate row.

I naively read each row in the activePositions column, convert it into a temporary dataframe, and append it to the final output. I am sure there are smarter ways to proceed, but this approach gets the job done. I also rename the aggregate-level variables to distinguish them from the individual fund-level ones.
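The expansion described above can be sketched without any third-party package. This is an illustration under assumptions, not the notebook's actual code: apart from activePositions, every field name (`isin`, `issuerName`, `shortPercent`, `positionHolder`), the sample values, and the rename map `AGGREGATE_RENAMES` are hypothetical and do not come from the Finanstilsynet API.

```python
from typing import Any, Dict, List

# Hypothetical rename map: aggregate-level fields that would otherwise
# collide with a field of the same name at the individual fund level.
AGGREGATE_RENAMES = {"shortPercent": "totalShortPercent"}


def flatten_active_positions(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return one flat row per entry in each record's activePositions list."""
    rows: List[Dict[str, Any]] = []
    for record in records:
        # Copy the aggregate-level fields, renaming the ones that collide.
        aggregate = {
            AGGREGATE_RENAMES.get(key, key): value
            for key, value in record.items()
            if key != "activePositions"
        }
        # Records with a missing or empty list contribute no rows.
        for position in record.get("activePositions") or []:
            rows.append({**aggregate, **position})
    return rows


# Made-up sample shaped like the structure described in the text.
sample = [
    {
        "isin": "NO0000000001",
        "issuerName": "Example ASA",
        "shortPercent": 1.3,
        "activePositions": [
            {"positionHolder": "Fund A", "shortPercent": 0.8},
            {"positionHolder": "Fund B", "shortPercent": 0.5},
        ],
    },
    {
        "isin": "NO0000000002",
        "issuerName": "Other ASA",
        "shortPercent": 0.0,
        "activePositions": [],
    },
]

rows = flatten_active_positions(sample)
for row in rows:
    print(row)
```

The resulting list of dictionaries can be passed directly to `pandas.DataFrame(rows)` to obtain the long table. Building all rows first and constructing the dataframe once also avoids repeatedly appending small temporary dataframes.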

Done scraping raw data for all countries.