Detailed course information
Crowdsourcing linguistic annotations and experiments
4 September 2023
|Language||Workshop language is English|
|Activity lead||
Prof. Sandrine Zufferey, UNIBE
Dr. Jérôme Jacquin, UNIL
Prof. Laura Speed, Radboud University, The Netherlands
Dr. Merel Scholman, Utrecht University, The Netherlands
Traditionally, empirical research in linguistics involved a limited range of highly qualified people. Corpus annotations were performed by a small group of expertly trained annotators (most of the time the researchers themselves), and data points were often annotated by only one or two annotators. Participants in experimental research were mostly undergraduate students in linguistics and psychology, with an expert knowledge of language. More recently, advances in web-based methods have helped broaden the scope of linguistic research by giving linguists new options for crowdsourcing the annotation of linguistic data to a large pool of naïve annotators. In addition, the development of online platforms such as Amazon Mechanical Turk and Prolific has greatly helped to connect linguists with a large number of speakers of a variety of languages, who now contribute to online experiments in exchange for a small remuneration. By including more participants, and annotators with a wide range of linguistic and demographic profiles, these methods represent a major step forward for empirical linguistics. Yet they also come with a number of shortcomings, such as limited access to participants' characteristics and less control over the conditions in which the tasks are performed.
This one-day CUSO workshop aims to present these new methodologies, both emphasizing their important advantages and discussing ways of limiting their shortcomings. Prof. Laura Speed, from Radboud University in the Netherlands, will discuss ways of taking (psycho)linguistic research outside the lab, thus increasing the diversity and quantity of data that can be collected. Dr. Merel Scholman from the University of Saarland in Germany will present the possibilities offered by crowdsourcing the linguistic annotation of corpus data, underlining how important it is for linguistic research to define linguistic categories and annotation procedures that naïve annotators can apply.
PhD students using empirical methods in their research, whether corpus-based, experimental, or both, are encouraged to present their projects and get feedback on methodological aspects.
University of Bern