[Air-L] CfP 4S/EASST Amsterdam - P036 Questioning data annotation for AI: Empirical studies

Girard-Chanudet Camille camille.girard-chanudet at ehess.fr
Tue Jan 23 07:36:58 PST 2024


Dear colleagues, 

We are pleased to invite you to participate in the panel on data annotation activities for AI, which we are organizing as part of the EASST/4S Congress taking place in Amsterdam from July 16 to 19, 2024. 

This panel aims to bring together empirical research on annotation work, documenting the organizational configurations, doubts, and choices that progressively shape training datasets and, thus, the results produced by artificial intelligence. The panel will also provide an opportunity to lay the foundations for an international working group on this subject. 

The deadline for submitting proposals is February 12, 2024: https://www.easst4s2024.net/open-panels/#14069 

Detailed information is provided below. 

Assia Wirth (Affiliation ?) 
Camille Girard-Chanudet (CEMS - EHESS/INSERM/CNRS) 

----- 

PANEL 036 - Questioning data annotation for AI: Empirical studies 



The growing implementation of Artificial Intelligence (AI) technologies in “sensitive” fields such as health, justice, or surveillance has raised wide-ranging concerns about algorithmic opacity. Efforts to crack open the “black box” of machine learning have mainly focused on coding architectures and practices, on the one hand, and on the constitution of training datasets, on the other. These two components of machine learning dispositifs are, however, held together by an essential link that has largely remained on the fringes of AI studies: the manual annotation of data by human professionals. 

Annotation work consists of the manual, meticulous labeling of documents (pictures, texts…) with the desired outputs that the algorithmic model will then learn to reproduce. It can be undertaken by various categories of actors. While annotation conducted by underpaid and outsourced workers has been well documented, these activities can also be carried out by qualified workers, including within prestigious professions. All of these empirical cases raise questions about the micro-level doubts, inquiries, choices, and forms of expertise that progressively shape training datasets and, thus, the results produced by AI. Data labeling puts classification systems into practice, defining categories and their empirical borders and constructing information infrastructures. Despite these strong political implications, data annotation remains largely invisible. 

This panel welcomes papers addressing the question of annotation from an empirical perspective. Contributions may include: 

- Ethnographic studies documenting the practices of annotation work in particular contexts. 

- Historical studies situating annotation work for AI in a broader genealogy of classification instruments and inscription practices. 

- Organizational studies describing the effects of different institutional settings (in-house, outsourced, subcontracted, etc.) and social configurations (gender, nationality, socioeconomic background, etc.) on the annotation process. 

The final session will be organized as an open workshop aimed at drawing out shared research questions and perspectives. 
