Seminar "Selected Topics in Finance" Summer 2019: What’s in a p-value? The reliability of tests in empirical asset pricing


The seminar is open to Master students.

An empirical asset pricing study typically looks like this: Researchers show that a variable significantly predicts stock returns in a way that cannot be explained by existing models. Recently, however, doubts about the reliability of p-values in such studies have grown. Simply put: if 1000 researchers each search for a significant variable, about 10 can be expected to find one that is significant at the 1% level, even if no variable has any true predictive power. We will look at this problem and possible solutions to it.
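The arithmetic behind this example can be checked with a small simulation. The following is a minimal sketch and not part of the seminar material: it assumes that every researcher tests one pure-noise predictor against pure-noise returns and uses the normal critical value 2.576 as an approximation for a two-sided test at the 1% level. The function name and all parameters are hypothetical.

```python
import math
import random


def count_false_discoveries(n_researchers=1000, n_obs=120, crit=2.576, seed=1):
    """Count researchers who find a 'significant' predictor in pure noise."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_researchers):
        x = [rng.gauss(0, 1) for _ in range(n_obs)]  # candidate predictor (noise)
        r = [rng.gauss(0, 1) for _ in range(n_obs)]  # returns, independent of x
        mx, mr = sum(x) / n_obs, sum(r) / n_obs
        sxy = sum((a - mx) * (b - mr) for a, b in zip(x, r))
        sxx = sum((a - mx) ** 2 for a in x)
        srr = sum((b - mr) ** 2 for b in r)
        rho = sxy / math.sqrt(sxx * srr)
        # t-statistic of the slope in a regression of r on x
        t_stat = rho * math.sqrt((n_obs - 2) / (1 - rho ** 2))
        if abs(t_stat) > crit:
            hits += 1
    return hits


# Expected number of false discoveries under the null: 1000 * 0.01 = 10
expected_hits = 1000 * 0.01
# Probability that at least one of 1000 independent tests is significant
prob_at_least_one = 1 - 0.99 ** 1000

print(count_false_discoveries(), expected_hits, prob_at_least_one)
```

The simulated count varies with the seed but stays close to 10, although none of the predictors has any relation to returns.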

To pass the seminar, you need to write a paper and give a presentation. Papers can be written in either German or English and should have a length of 15-20 pages (team of two) or 20-25 pages (team of three). For hints on how to write a paper, see our guidelines. You need to hand in a printed version and a digital one (PDF). The seminar talks should be given in English.

The two main parts of your paper and presentation will be (i) an explanation of the analytical methodology and (ii) a replication of analyses from the key references. You should also provide an introduction, a short summary of the literature (which can be part of the introduction), and some concluding remarks. For most topics, you will need to use software such as R or MATLAB.

Please contact your supervisor to discuss the outline of your paper, your empirical part, and any questions that you may have. For organizational questions, please ask Syed Wasif Hussain.

FAQ & Organisational matters

  • Do we get a grade? Yes. Your paper and your presentation will be graded and lead to one grade (equally weighted). Both the paper and presentation have to be passed.
  • What do we have to hand in? An outline of your paper, which serves as the basis for discussing its content with your supervisor, and your final paper one week before the presentation.
  • Who is responsible? For content-related questions, please contact your supervisor. For organizational questions, please ask Syed Wasif Hussain.

Time Table

  • 28.01.2019 - 02.02.2019 Students must submit their preferences over seminars for the first matching round.
  • 03.02.2019 1st round of seminar matching.
  • 07.02.2019 2nd round of seminar matching.
  • 15.02.2019 General information about the seminar, introductory meeting (16:30, HeHo 18, room 1.20)
  • 15.02.2019 Topic allocation (rank the topics on Taddle by 24.02.2019, 23:55)
  • 01.04.2019 - 19.04.2019 Registration at the Higher Services Portal
  • Until 26.04.2019 Meet your supervisor to discuss the outline of the paper
  • 07.06.2019 Submission of the paper by 12:00 noon, HeHo 18, room 1.00 (to Wasif, by email and as a hard copy)
  • 14.06.-15.06.2019 Presentations*, HeHo 22, room E.04 (Fr) and O25/346 (Sa)

*The exact schedule was sent via email on 27.05.2019.


1. P-values and p-hacking

Start by explaining the use and meaning of p-values in hypothesis testing. Then explain what is meant by “p-hacking” and discuss why p-hacking presents a big problem in financial economics. In doing so, also make comparisons to other disciplines. As a practical application, find the best long-short portfolios based on the letters of the ticker symbols of German stocks. The instructions for forming the portfolios will be very similar to those in Harvey (2017), page 1400.

Literature to get started:

Harvey, C. R. (2017): Presidential Address: The Scientific Outlook in Financial Economics. Journal of Finance 72(4), 1399–1440.

Fanelli, D. (2010): “Positive” results increase down the Hierarchy of the Sciences. PLoS ONE 5, e10068.

Fanelli, D. (2012): Negative results are disappearing from most disciplines and countries. Scientometrics 90, 891–904.

Fanelli, D. (2013): Positive results receive more citations, but only in some disciplines. Scientometrics 94, 701–709.

supervisor: Nenad Ćurčić

students: Artemij Cadov, Florian Schinnerling and Kimberley Sperrfechter



2. Bayesian inference measures

The CAPM has been a major focus of financial research for decades, ever since it was introduced by Sharpe (1964). While Fama and MacBeth (1973) find the CAPM to be valid, as they do not find risk premiums other than for systematic market risk, Fama and French (1992) document the opposite and conclude that the CAPM is not valid. However, in light of the recent p-hacking debate, many questions have been raised about how much evidence the traditional p-value approach really provides. Harvey (2017) presents some new approaches that could be used for hypothesis testing. Your task is to summarize the Bayesian measures and then revisit the studies of Fama and MacBeth (1973) [Table 3] and Fama and French (1992) [Table 3] by using the minimum Bayes factor approach suggested by Harvey (2017) instead of the p-value approach.
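To give an idea of what the Bayesian measures look like in practice, here is a minimal sketch of three textbook quantities: the minimum Bayes factor exp(-t²/2), the symmetric-descending minimum Bayes factor -e·p·ln(p) (valid for p < 1/e), and the posterior probability of the null that results from combining a Bayes factor with prior odds. The function names and the prior odds of 4:1 in the usage lines are illustrative assumptions; please check the exact definitions against Harvey (2017) before using them in your paper.

```python
import math


def mbf_from_t(t_stat):
    """Minimum Bayes factor exp(-t^2 / 2) in favor of the null."""
    return math.exp(-0.5 * t_stat ** 2)


def sd_mbf_from_p(p_value):
    """Symmetric-descending minimum Bayes factor -e * p * ln(p), for p < 1/e."""
    if not 0 < p_value < 1 / math.e:
        raise ValueError("formula only applies for 0 < p < 1/e")
    return -math.e * p_value * math.log(p_value)


def posterior_prob_null(bayes_factor, prior_odds_null):
    """Posterior probability of the null given prior odds (null : alternative)."""
    posterior_odds = bayes_factor * prior_odds_null
    return posterior_odds / (1 + posterior_odds)


# Illustration with assumed inputs: t = 2 (p of roughly 0.05), prior odds 4:1
print(mbf_from_t(2.0))
print(sd_mbf_from_p(0.05))
print(posterior_prob_null(mbf_from_t(2.0), prior_odds_null=4.0))
```

Note that a t-statistic of 2, which is conventionally "significant", still leaves a sizeable posterior probability for the null when the prior is skeptical.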

Literature to get started:

Fama, E. and French, K. (1992), ‘The cross-section of expected stock returns’, The Journal of Finance 47, 427–465

Fama, E. and MacBeth, J. (1973), ‘Risk, return, and equilibrium: Empirical tests’, The Journal of Political Economy 81, 607–636.

Harvey, C. R. (2017), ‘Presidential Address:  The Scientific Outlook in Financial Economics’, Journal of Finance 72(4), 1399–1440.

supervisor: Syed Wasif Hussain

students: Emanuele Luzzi, Runzhu Cao and Vitalii Piankov




3. Trading strategies based on Google trends: An instance of data snooping?

Preis et al. (2013) used Google search volume data for finance-related words to develop profitable trading strategies and showed that data from Google Trends contain enough information to predict future financial index returns. Challet and Ayed (2013) contest their findings by showing that finance-related keywords do not contain more exploitable predictive information than random keywords related to illnesses, classic cars and arcade games. Your first task is to discuss the issues raised by Challet and Ayed (2013) in using Google Trends to predict the stock market. Your second task is to replicate the work of Preis et al. (2013) and include other non-finance terms to verify some of the critique in Challet and Ayed (2013).
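As a starting point for the replication, here is a minimal sketch of a search-volume trading rule in the spirit of Preis et al. (2013). It assumes that the rule compares last week's search volume with its average over the preceding dt weeks, goes short for one week after an increase and long for one week after a decrease. The function name, the parameter names and the toy data are hypothetical; check the exact rule and timing conventions against the paper.

```python
import math


def trends_strategy_log_return(volume, prices, dt=3):
    """Cumulative log return of a weekly search-volume strategy.

    volume[t]: search volume of the keyword in week t
    prices[t]: index price at the start of week t
    """
    total = 0.0
    for t in range(dt + 1, len(prices) - 1):
        # average volume over weeks t-1-dt, ..., t-2
        baseline = sum(volume[t - 1 - dt:t - 1]) / dt
        delta = volume[t - 1] - baseline
        weekly_log_return = math.log(prices[t + 1] / prices[t])
        if delta > 0:
            total -= weekly_log_return  # short the index for one week
        elif delta < 0:
            total += weekly_log_return  # long the index for one week
        # delta == 0: no position in this sketch
    return total


# Toy data, purely illustrative
toy_volume = [1, 1, 3, 1, 1]
toy_prices = [100, 100, 100, 100, 90, 99]
print(trends_strategy_log_return(toy_volume, toy_prices, dt=2))
```

Running the same function on many unrelated keywords and keeping only the best one is exactly the kind of selection that Challet and Ayed (2013) warn about.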

Literature to get started:

Preis, T., Moat, H. S., & Stanley, H. E. (2013). Quantifying trading behavior in financial markets using Google Trends. Scientific reports, 3, 1684.

Challet, D., & Ayed, A. B. H. (2013). Predicting financial markets with Google Trends and not so random keywords. Working paper.

supervisor: Syed Wasif Hussain

students: Qing Chang, Ali Görkem Vapur and Miguel Franco Troncoso


4. Multiple testing, fishing for factors, and the market factor

Over the last decade, many factors have been proposed to explain the cross-section of expected returns. However, as more and more tests are carried out, there is an increasing risk of finding patterns in the data that arise only by chance. As a consequence, it is often not possible to replicate the results of previous research. Harvey and Liu (2018) propose a novel approach to take multiple testing into account. In contrast to recent studies, they find that the original market factor proposed by Sharpe (1964) helps the most to explain the cross-section of expected returns.

Your task is to first summarize the problems that arise from multiple testing and to describe possibilities to cope with them. In addition, you should apply the approach from Harvey and Liu (2018) on your own and replicate some of their findings related to factors.
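To illustrate what coping with multiple testing can look like, here is a minimal sketch of three classical p-value adjustments: Bonferroni, Holm and Benjamini-Hochberg. These are standard textbook procedures and not the approach of Harvey and Liu (2018); the function names and the example p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0_i if p_i <= alpha / m (controls the family-wise error rate)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Step-down procedure: compare the k-th smallest p-value with alpha / (m - k + 1)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):  # rank = 0 for the smallest p-value
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # stop at the first non-rejection
    return reject


def benjamini_hochberg(p_values, alpha=0.05):
    """Step-up procedure controlling the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            k_max = rank  # largest rank that passes its threshold
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject


example_p = [0.001, 0.012, 0.02, 0.035, 0.30]
print(bonferroni(example_p))
print(holm(example_p))
print(benjamini_hochberg(example_p))
```

In this example all four small p-values would pass an unadjusted 5% test, while the adjusted procedures reject one, two and four hypotheses, respectively.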

Literature to get started:

Harvey, C. R. (2017). Presidential Address: The Scientific Outlook in Financial Economics. Journal of Finance 72(4), 1399-1440.

Harvey, C. R. and Liu, Y. (2018). Lucky Factors. Working Paper, SSRN.

Sharpe, W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. Journal of Finance, 19(3), 425-442.

supervisor: Niklas Paluszkiewicz

students: Aaminah Imran and Gokul Menon


5. Machine learning, stock market patterns, and data snooping

With increasing computing resources and availability of data, researchers have more tools than ever to study the behavior of financial markets. This comes along with advancements in machine learning and artificial intelligence. In 2016, for example, an algorithm developed by DeepMind beat Lee Sedol, one of the best Go players in the world. However, as financial data is scarce, it is not yet clear how much benefit these new methods will bring to the financial sector. Some researchers have even warned that these new methods make it easier to find spurious patterns in financial data. This is particularly relevant because many false positives have already been reported in finance.

Your first task is to elaborate on the benefits, limitations and dangers of applying machine learning to financial data. In this context you should discuss some of the ideas of Arnott et al. (2018) on how best to cope with these new opportunities. As a second task you should perform an analysis similar to Chong et al. (2017) and examine whether neural networks provide benefits in the prediction of future returns compared to linear methods (e.g. AR(p)). In contrast to Chong et al. (2017), you should do this on daily data for stocks of your choice.
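As an illustration of the linear benchmark, here is a minimal sketch of an out-of-sample comparison between an AR(1) model (the simplest case of AR(p)) and the training-sample mean. It uses simulated data and closed-form OLS; the function names, the sample split and the simulated process are assumptions for illustration only and do not reproduce the setup of Chong et al. (2017).

```python
import random


def fit_ar1(series):
    """OLS regression of y_t on a constant and y_{t-1}; returns (intercept, slope)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def out_of_sample_mse(series, n_train):
    """Fit on the first n_train observations, forecast the rest one step ahead.

    Returns (MSE of the AR(1) forecast, MSE of the training-mean forecast).
    """
    intercept, slope = fit_ar1(series[:n_train])
    train_mean = sum(series[:n_train]) / n_train
    n_test = len(series) - n_train
    err_ar = err_mean = 0.0
    for t in range(n_train, len(series)):
        err_ar += (series[t] - (intercept + slope * series[t - 1])) ** 2
        err_mean += (series[t] - train_mean) ** 2
    return err_ar / n_test, err_mean / n_test


def simulate_ar1(n, phi, seed=0):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal shocks."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0, 1))
    return y


simulated = simulate_ar1(2000, phi=0.5)
print(out_of_sample_mse(simulated, n_train=1000))
```

The same split into training and test sample should be used for the neural network, so that any improvement over the linear benchmark is measured on data that neither model has seen.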

Literature to get started:

Arnott, D. A., Harvey, C. R. and Markowitz, H. (2018). A Backtesting Protocol in the Era of Machine Learning. Working Paper, SSRN.

Chong, E., Han, C., and Park, F. C. (2017) Deep learning networks for stock market analysis and prediction: Methodology, data representations, and case studies. Expert Systems with Applications, 83, 187-205.

supervisor: Niklas Paluszkiewicz

students: Abbas Farrokhi, Kevin Walter and Ritank Gupta


6. P-hacking in equity premium prediction

After Goyal and Welch showed that models for the prediction of the equity premium do not work out-of-sample, new models with significant out-of-sample performance have been suggested. You should take the combination approach of Rapach et al. (2010) as an exemplary case to examine how different specification choices influence the predictive accuracy and its significance. Start by replicating parts of Rapach et al. (2010), in particular Table 1. Then examine how the results change if you use different sets of variables (instead of 15 variables, you should also consider sets of fewer and more variables), and if results are reported for different sub-periods. Data sources and an Excel template will be provided to you.
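To make the two central quantities concrete, here is a minimal sketch of an equal-weighted combination forecast and of the out-of-sample R², defined as one minus the ratio of the model's squared forecast errors to those of a benchmark forecast. It assumes equal weights and a benchmark series that you supply yourself (for example the historical average); the function names and the toy numbers are hypothetical and not taken from Rapach et al. (2010).

```python
def mean_combination(individual_forecasts):
    """Equal-weighted average of several forecast series.

    individual_forecasts[i][t] is the forecast of model i for period t.
    """
    n_models = len(individual_forecasts)
    return [sum(period) / n_models for period in zip(*individual_forecasts)]


def r2_out_of_sample(realized, forecast, benchmark):
    """1 - SSE(forecast) / SSE(benchmark); positive values favor the forecast."""
    sse_model = sum((r - f) ** 2 for r, f in zip(realized, forecast))
    sse_bench = sum((r - b) ** 2 for r, b in zip(realized, benchmark))
    return 1 - sse_model / sse_bench


# Toy numbers, purely illustrative
realized = [0.02, -0.01, 0.03]
forecasts = [[0.01, 0.00, 0.02], [0.03, -0.02, 0.02]]
historical_average = [0.01, 0.01, 0.01]

combined = mean_combination(forecasts)
print(combined)
print(r2_out_of_sample(realized, combined, historical_average))
```

Recomputing this statistic for different variable sets and sub-periods, and counting how many specifications you tried, shows how much room for p-hacking the reported result leaves.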

Literature to get started:

Rapach, D., et al. (2010): Out-of-Sample Equity Premium Prediction: Combination Forecasts and Links to the Real Economy. Review of Financial Studies 23(2), 821-862.

supervisor: Nenad Ćurčić

students: Giulia Pancaldi and Javier Alberto Almeida Garcia






All relevant information can be found on this page. Students will occasionally be contacted via e-mail.

Dates and Room

Please note the detailed timetable.

Module description

This seminar is open to Master students.