[Air-L] First Call for Shared Task Participation: Event Causality Identification with Causal News Corpus at CASE @ EMNLP 2022

ali hürriyetoglu ali.hurriyetoglu at gmail.com
Thu Apr 28 14:08:41 PDT 2022


Dear all,

We invite you to participate in the CASE-2022 Shared Task 3: Event
Causality Identification with Causal News Corpus.

The task is being held as part of the 5th Workshop on Challenges and
Applications of Automated Extraction of Socio-political Events from Text
(CASE 2022). All participating teams will be able to publish their system
description paper in the workshop proceedings published by ACL. Workshop
Website: https://emw.ku.edu.tr/case-2022/


Motivation

================================================

Causality is a core cognitive concept and appears in many natural language
processing (NLP) works that aim to tackle inference and understanding. We
are interested in studying event causality in news and therefore introduce
the Causal News Corpus.

The Causal News Corpus consists of 3,559 event sentences extracted from
protest event news. Each sentence is annotated with a sequence-level label
indicating whether it contains causal relations. Causal sentences are
additionally annotated with Cause, Effect and Signal spans. Our subtasks
build on the Causal News Corpus, and we hope they lead to accurate,
automated solutions for detecting and extracting causal events in news.



Task Overview

================================================

We focus on two subtasks relevant to Event Causality Identification:

   - Subtask 1: Causal Event Classification – Does an event sentence contain
     any cause-effect meaning?

   - Subtask 2: Cause-Effect-Signal Span Detection – Which consecutive spans
     correspond to cause, effect or signal per causal sentence?

      - Subtask 2.1: Cause-Effect Span Detection – This subtask identifies
        the spans corresponding to cause and effect per sentence.

      - Subtask 2.2: Signal Span Detection – This subtask identifies the
        spans corresponding to the signal, or causal connective, per cause
        and effect relation.

Participants may design solutions that address one, several or all
subtasks concurrently. Participants may also combine Subtask 1 and
Subtask 2 annotations for either task. However, the target labels of the
development and test sets should not be introduced during training in any
way (e.g. not even for data augmentation).
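
To make the span detection subtask concrete, here is a minimal sketch of one
possible way to represent Cause, Effect and Signal spans as consecutive token
ranges and convert them to BIO tags. The sentence, the (label, start, end)
record and the spans_to_tags helper are all made up for illustration. They
are assumptions, not the corpus file format or the official evaluation
scheme; please consult the task repository for the actual data format.

```python
from typing import List, Tuple

# Hypothetical span record: (label, start_token, end_token_exclusive).
Span = Tuple[str, int, int]


def spans_to_tags(num_tokens: int, spans: List[Span]) -> List[str]:
    """Convert consecutive token spans into BIO tags for a single relation."""
    tags = ["O"] * num_tokens
    for label, start, end in spans:
        if not (0 <= start < end <= num_tokens):
            raise ValueError(f"span out of range: {(label, start, end)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"overlapping span: {(label, start, end)}")
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


# Made-up sentence for illustration only; it is not taken from the corpus.
tokens = "Workers went on strike because wages were cut".split()
spans = [("Effect", 0, 4), ("Signal", 4, 5), ("Cause", 5, 8)]
tags = spans_to_tags(len(tokens), spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Under this assumed representation, a sentence with several causal relations
would get one tag sequence per relation rather than a single merged one.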



Data Content

================================================

Our work extends a prior socio-political news corpus to annotate if
event-containing sentences have causal relations or not. Our data sizes and
splits are described as follows:

   - Subtask 1: Causal Event Classification – 869 news documents and 3,559
     English sentences were annotated with labels indicating whether each
     sentence contains causal relations. The data splits are: 2,925 train,
     323 development, and 311 test.

   - Subtask 2: Cause-Effect-Signal Span Detection – Positive causal
     sentences from Subtask 1 were retained and annotated with
     Cause-Effect-Signal spans. Of the 1,957 examples available, we have
     annotated only 180 sentences, but intend to complete all of them in the
     future. We have annotated 130 train+dev examples so far. There can be
     multiple relations per sentence. The data splits are: 130 train and 13
     development. We will release more training examples closer to the test
     period. The test set will include >=50 examples.
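
As a quick sanity check on the Subtask 1 figures above, the following sketch
sums the stated split sizes and prints each split's share of the corpus. The
numbers are copied from this announcement; the variable names are only
illustrative.

```python
# Subtask 1 split sizes as stated in this announcement.
splits = {"train": 2925, "development": 323, "test": 311}

total = sum(splits.values())
# The three splits should add up to the full set of annotated sentences.
assert total == 3559

for name, size in splits.items():
    print(f"{name}: {size} sentences ({size / total:.1%} of {total})")
```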

Task Repository: https://github.com/tanfiona/CausalNewsCorpus

Codalab Site: https://codalab.lisn.upsaclay.fr/competitions/2299

Subtask 1 Paper description (to appear at LREC 2022):
http://arxiv.org/abs/2204.11714



Important Dates

================================================

Training data available: Apr 15, 2022

Validation data available: Apr 15, 2022

Validation labels available: Aug 01, 2022

Test data available: Aug 01, 2022

Test start: Aug 01, 2022

Test end: Aug 15, 2022

System Description Paper submissions due: Sep 07, 2022

Notification to authors after review: Oct 09, 2022

Camera ready: Oct 16, 2022

Workshop period @ EMNLP: Dec 7-8, 2022



Organization

================================================

   - Fiona Anting Tan, Institute of Data Science/ National University of
     Singapore, Singapore

   - Ali Hürriyetoğlu, KNAW Humanities Cluster, the Netherlands

   - Tommaso Caselli, Rijksuniversiteit Groningen, Netherlands

   - Nelleke Oostdijk, Radboud University

   - Tadashi Nomoto, National Institute of Japanese Literature, Japan

   - Onur Uca, Mersin University

   - Iqra Ameer, Centro de Investigación en Computación/ Instituto
     Politécnico Nacional, Mexico

   - Hansi Hettiarachchi, Birmingham City University, United Kingdom

   - Farhana Ferdousi Liza, University of East Anglia, United Kingdom

   - Tiancheng Hu, ETH Zürich, Switzerland


Please contact Fiona Anting Tan at tan.f at u.nus.edu, with a subject line
starting with “CNC ST”, or post questions on the Forum page in Codalab.

