Combining Corpus and Experimental Data in Linguistics


7 September 2021

Prof. Dr. Sandrine Zufferey, Institut de langue et de littérature françaises, Université de Berne.

Dr. Jérôme Jacquin, Section des sciences du langage et de l'information, Université de Lausanne.



Prof. Gaëtanelle Gilquin, Institut Langage et Communication, Université Catholique de Louvain.

Dr. Ludivine Crible, School of Philosophy, Psychology and Language Sciences, University of Edinburgh.


Research in linguistics is increasingly based on analyses of corpus data and controlled experiments. Most studies, however, rely on only one of these empirical methods. In this workshop, we invite students to discover the various ways in which corpora and experiments complement each other and can be used together to study the same linguistic phenomenon.
From a methodological perspective, we will see that corpus data offer a powerful way to calibrate experimental materials. Corpora and experiments can also be combined in several other ways to increase the validity of empirical findings. First, combining corpus data and experimental results allows researchers to compare the production of a given linguistic phenomenon in naturally occurring data and in a controlled elicitation task, thus benefiting from the insights offered by both methods. Second, it makes it possible to compare the production and comprehension aspects of linguistic competence. Third, linguistic analyses (of syntactic, semantic or pragmatic features) performed on corpus data can be related to the way in which the same element is used and understood by speakers in an experimental context. For example, corpus analyses may reveal that certain words or structures typically display a set of complex features. Such complexity can then be linked to the time it takes participants in an experiment to process and understand them. In sum, comparing data from corpora and from experiments gives linguists a powerful way to test linguistic hypotheses on the basis of strong empirical findings.
During the workshop, Prof. Gaëtanelle Gilquin, who co-authored a pioneering article on the use of corpora and experimental data (Gilquin & Gries, 2009), will first present the various ways in which corpora and experiments can be combined in linguistics. Then, Dr. Ludivine Crible will present her own research, which combines corpora and experiments to analyze discourse phenomena.
In the afternoon, PhD students will be invited to present their own empirical data from corpora and/or experiments to receive feedback and practical advice, and thus to contribute to the discussion on the challenges and possibilities linked to the use of several empirical methods. The invited presentations will be given in English, but students should feel free to present their research and contribute to the discussion in either English or French.


