[Air-L] First Call for Shared Task Participation: Event Causality Identification with Causal News Corpus at CASE @ EMNLP 2022
ali hürriyetoglu
ali.hurriyetoglu at gmail.com
Thu Apr 28 14:08:41 PDT 2022
Dear all,
We invite you to participate in the CASE-2022 Shared Task 3: Event
Causality Identification with Causal News Corpus.
The task is being held as part of the 5th Workshop on Challenges and
Applications of Automated Extraction of Socio-political Events from Text
(CASE 2022). All participating teams will be able to publish their system
description papers in the workshop proceedings published by ACL.
Workshop Website: https://emw.ku.edu.tr/case-2022/
Motivation
================================================
Causality is a core cognitive concept and appears in many natural language
processing (NLP) works that aim to tackle inference and understanding. We
are interested in studying event causality in news and therefore introduce
the Causal News Corpus.
The Causal News Corpus consists of 3,559 event sentences, extracted from
protest event news, that have been annotated with sequence labels on
whether they contain causal relations or not. Causal sentences are
additionally annotated with Cause, Effect and Signal spans. Our subtasks
work on the Causal News Corpus, and we hope that accurate, automated
solutions will be proposed for the detection and extraction of causal
events in news.
Task Overview
================================================
We focus on two subtasks relevant to Event Causality Identification:
- Subtask 1: Causal Event Classification – Does an event sentence contain
  any cause-effect meaning?
- Subtask 2: Cause-Effect-Signal Span Detection – Which consecutive spans
  correspond to cause, effect or signal per causal sentence?
  - Subtask 2.1: Cause-Effect Span Detection – This subtask identifies
    the spans corresponding to cause and effect per sentence.
  - Subtask 2.2: Signal Span Detection – This subtask identifies the
    spans corresponding to the signal, or causal connective, per cause and
    effect relation.
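To make the expected targets concrete, here is a minimal Python sketch of how
one example could be represented for both subtasks. The class names, the
half-open token-index convention and the example sentence are all
illustrative assumptions; they are not the actual data format of the Causal
News Corpus, which is documented in the task repository.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical convention: a span is a half-open token range [start, end).
Span = Tuple[int, int]


@dataclass
class CausalRelation:
    """One cause-effect relation (Subtask 2.1) with an optional signal (Subtask 2.2)."""
    cause: Span
    effect: Span
    signal: Optional[Span] = None


@dataclass
class Example:
    tokens: List[str]
    is_causal: bool                  # Subtask 1 target
    relations: List[CausalRelation]  # Subtask 2 targets; may hold several relations


def span_text(tokens: List[str], span: Span) -> str:
    """Return the text covered by a half-open token span."""
    start, end = span
    return " ".join(tokens[start:end])


# Invented sentence for illustration only; it is not taken from the corpus.
tokens = "The workers went on strike because wages were cut .".split()
example = Example(
    tokens=tokens,
    is_causal=True,
    relations=[CausalRelation(cause=(6, 9), effect=(0, 5), signal=(5, 6))],
)

for rel in example.relations:
    print("Cause :", span_text(example.tokens, rel.cause))
    print("Effect:", span_text(example.tokens, rel.effect))
    if rel.signal is not None:
        print("Signal:", span_text(example.tokens, rel.signal))
```

A list of relations is used because a single sentence can contain more than
one cause-effect relation.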
Participants may design solutions that work on a single subtask, on
several, or on all subtasks concurrently. Participants are also allowed to
combine Subtask 1 and 2 annotations for either task. However, the target
labels of the development and test sets should not be introduced during
training in any way (e.g. not even for data augmentation).
Data Content
================================================
Our work extends a prior socio-political news corpus by annotating whether
event-containing sentences have causal relations or not. Our data sizes and
splits are described as follows:
- Subtask 1: Causal Event Classification – 869 news documents and 3559
  English sentences were annotated with labels on whether they contain
  causal relations or not. The data splits were: 2925 train, 323
  development, and 311 test.
- Subtask 2: Cause-Effect-Signal Span Detection – Positive causal
  sentences from Subtask 1 were retained and annotated with
  Cause-Effect-Signal spans. Of the 1957 examples available, we have
  annotated only 180 sentences, but intend to complete all in the future.
  We annotated 130 train+dev examples so far. There can be multiple
  relations per sentence. The data splits were: 130 train and 13
  development. We will release more training examples closer to the test
  period. The test set will include >=50 examples.
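As a quick sanity check on the Subtask 1 figures above, the following short
Python sketch confirms that the three splits add up to the 3559 annotated
sentences and prints the share of each split. The variable names are
illustrative only.

```python
# Subtask 1 split sizes as stated in this announcement.
subtask1_splits = {"train": 2925, "development": 323, "test": 311}

total = sum(subtask1_splits.values())
assert total == 3559, "splits should add up to the full corpus size"

# Share of each split relative to the full set of annotated sentences.
shares = {name: count / total for name, count in subtask1_splits.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```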
Task Repository: https://github.com/tanfiona/CausalNewsCorpus
Codalab Site: https://codalab.lisn.upsaclay.fr/competitions/2299
Subtask 1 Paper description (to appear at LREC 2022):
http://arxiv.org/abs/2204.11714
Important Dates
================================================
Training data available: Apr 15, 2022
Validation data available: Apr 15, 2022
Validation labels available: Aug 01, 2022
Test data available: Aug 01, 2022
Test start: Aug 01, 2022
Test end: Aug 15, 2022
System Description Paper submissions due: Sep 07, 2022
Notification to authors after review: Oct 09, 2022
Camera ready: Oct 16, 2022
Workshop period @ EMNLP: Dec 7-8, 2022
Organization
================================================
- Fiona Anting Tan, Institute of Data Science/ National University of
  Singapore, Singapore
- Ali Hürriyetoğlu, KNAW Humanities Cluster, the Netherlands
- Tommaso Caselli, Rijksuniversiteit Groningen, Netherlands
- Nelleke Oostdijk, Radboud University
- Tadashi Nomoto, National Institute of Japanese Literature, Japan
- Onur Uca, Mersin University
- Iqra Ameer, Centro de Investigación en Computación/ Instituto
  Politécnico Nacional, Mexico
- Hansi Hettiarachchi, Birmingham City University, United Kingdom
- Farhana Ferdousi Liza, University of East Anglia, United Kingdom
- Tiancheng Hu, ETH Zürich, Switzerland
Please contact Fiona Anting Tan at tan.f at u.nus.edu, with your subject
line starting with “CNC ST”, or post questions on the Forum page in Codalab.