HEC Liège – Management School of the University of Liège. Rue Louvrex 14, Liège
Finance / Accounting (FIN/ACC)
Textual analysis and machine learning with applications to economics and finance
Thomas Renault is an Assistant Professor at the University Paris 1 Panthéon-Sorbonne (France) and scientific advisor at the French Council of Economic Analysis. He received his PhD in 2017 from Paris 1 Panthéon-Sorbonne. His work focuses on the use of textual data, mostly from social media, to forecast financial markets and to construct novel indicators that track economic conditions. He teaches the following classes at the University Paris 1 Panthéon-Sorbonne: Applied Data Science in Finance, Applied Big Data in Finance, Introduction to Python, Digital Data and Network Analysis. He has published articles in the Journal of Public Economics, the Journal of Banking and Finance, and the Journal of International Money and Finance.
Matthieu Picault is an Assistant Professor at the University of Orléans (France) in the Laboratoire d’Economie d’Orléans (LEO). He received his PhD from the University of Aix-Marseilles (AMSE) in 2017. His research focuses primarily on central bank communication and its impact on financial and macroeconomic variables. It includes textual analysis of both official documents and media. He teaches Introduction to Python and Natural Language Processing with Python in the Applied Econometrics Master. He has published articles in the Journal of International Money and Finance, Finance Research Letters, and the International Journal of Finance & Economics.
The objective of this course is to study how the millions of texts published on the Internet and social media every day can improve our understanding of various economic and financial phenomena. After an introduction to the Python programming language, we will start by seeing how to extract online content, either through existing APIs or by implementing web scraping tools. We will create an application to collect articles from a major media site and we will use an API to extract tweets from a social network dedicated to finance. Next, we will see how to analyse a text using Natural Language Processing (NLP) methods. We will apply these methods to the speeches made by the European Central Bank to show how it is possible to give structure to unstructured data. The next session will be dedicated to sentiment analysis and will present the different methods (dictionary approach and machine learning). We will analyse Twitter data to build a sentiment indicator capturing the well-being of individuals in a country. The fourth session will be devoted to machine learning using text as data, with an application to StockTwits data (asset pricing). In the last session, we will introduce unsupervised methods of textual analysis (topic modelling and transformers). We will apply Latent Dirichlet Allocation to a large corpus of Glassdoor reviews.
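To give a flavour of the dictionary approach to sentiment analysis mentioned above, here is a minimal sketch in Python. It is not taken from the course material: the word lists, the function names and the sample messages are all hypothetical, and the score (positive minus negative matches, divided by total matches) is only one of several common conventions.

```python
import re

# Hypothetical toy word lists, for illustration only; applied work would
# rely on an established lexicon rather than hand-picked words like these.
POSITIVE = {"gain", "growth", "strong", "improve", "bullish"}
NEGATIVE = {"loss", "weak", "decline", "risk", "bearish"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_score(text):
    """Return (pos - neg) / (pos + neg), or 0.0 when no dictionary word matches."""
    tokens = tokenize(text)
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


# Made-up messages standing in for social media posts.
messages = [
    "Strong earnings and solid growth, feeling bullish",
    "Weak guidance, more decline ahead",
    "Market opens at nine",
]
scores = [sentiment_score(message) for message in messages]
```

Averaging such scores over all messages posted on a given day would be one simple way to turn individual texts into a daily indicator. The machine learning methods covered in the same session replace the fixed word lists with weights learned from labelled data.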
For each session, we will first present the related theories and methods, in a language accessible to non-mathematicians, together with their latest applications in the economic and financial literature. We will then study and share with the participants scripts and code to carry out different tasks in Python. We will also offer participants the opportunity to present their research and/or projects and, if possible, we will assist them with their projects, on both the data collection side and the data analysis side.
Participants should have a basic understanding of computer programming. It is possible to follow the tutorial available at https://www.learnpython.org/ to learn or review the basics of programming in Python.