CoNLL-2010 Shared Task
Learning to detect hedges and their scope in natural language text

Overview of the CoNLL-2010 Shared Task

The CoNLL-2010 Shared Task aimed at identifying hedges and their scope in natural language texts. Twenty-three teams from all over the world participated in the shared task; 22, 16 and 13 teams submitted output for Task 1 (biological papers), Task 1 (Wikipedia) and Task 2, respectively. Papers on the submitted systems and an overview of the shared task can be downloaded here.

Although the shared task is over, the corpora used in the tasks can still be downloaded. Registered participants can also access the results.

We would like to thank all the participants for their efforts and inspiring feedback on the shared task. Comments, suggestions and remarks on the shared task are welcome at conll2010st(AT)inf(DOT)u-szeged(DOT)hu.

Introduction

In Natural Language Processing (NLP) - in particular, in Information Extraction (IE) - many applications aim at extracting factual information from text. In order to distinguish facts from unreliable or uncertain information, linguistic devices such as hedges (indicating that authors do not or cannot back up their opinions/statements with facts) have to be identified. Applications should then handle the detected speculative parts differently from factual statements.

Hedge detection has recently received considerable interest in the biomedical NLP community, including research papers on detecting hedge devices in biomedical texts and some recent work on detecting the in-sentence scope of hedge cues. Exploiting the BioScope corpus, which is annotated for hedge scope, and publicly available Wikipedia weasel annotations, the goals of the Shared Task were:

Task 1: learning to detect sentences containing uncertainty and
Task 2: learning to resolve the in-sentence scope of hedge cues.
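To make the two tasks concrete, the following is a minimal sketch of a naive lexicon-based baseline. It is an illustration only, not a participating system or the organisers' method: the cue list `HEDGE_CUES`, the tokenizer, and the "scope runs from the cue to the end of the sentence" heuristic are all assumptions introduced here, whereas the shared task itself was about learning these decisions from the annotated corpora.

```python
import re

# Hypothetical cue lexicon for illustration; NOT the annotated cue
# inventory of the BioScope or Wikipedia corpora.
HEDGE_CUES = {"may", "might", "suggest", "suggests", "possibly", "likely"}


def tokenize(sentence):
    """Split into word tokens and single punctuation marks (assumed tokenizer)."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def is_uncertain(sentence):
    """Task 1 style decision: flag the sentence if any token is a known cue."""
    return any(tok.lower() in HEDGE_CUES for tok in tokenize(sentence))


def naive_scopes(sentence):
    """Task 2 style guess: each cue's scope spans from the cue itself to the
    last token before sentence-final punctuation (a crude heuristic)."""
    tokens = tokenize(sentence)
    end = len(tokens)
    if tokens and tokens[-1] in {".", "!", "?"}:
        end -= 1  # leave sentence-final punctuation outside the scope
    return [
        (tok, " ".join(tokens[i:end]))
        for i, tok in enumerate(tokens)
        if tok.lower() in HEDGE_CUES
    ]


if __name__ == "__main__":
    print(is_uncertain("The protein may bind DNA."))
    print(naive_scopes("The protein may bind DNA."))
```

A fixed lexicon of this kind misses context-dependent cues and over-generates scopes, which is part of why both tasks were posed as learning problems.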

The shared task was part of the CoNLL conference held in conjunction with ACL 2010 in Uppsala, Sweden, July 15-16, 2010.

For more information please visit the FAQ site or contact: conll2010st(AT)inf(DOT)u-szeged(DOT)hu.

References

Veronika Vincze, György Szarvas, Richárd Farkas, György Móra, and János Csirik: The BioScope corpus: biomedical texts annotated for uncertainty, negation and their scopes. BMC Bioinformatics, 9(Suppl 11):S9, 2008.

Viola Ganter and Michael Strube: Finding hedges by chasing weasels: Hedge detection using Wikipedia tags and shallow linguistic features. In Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pages 173-176, Suntec, Singapore, August 2009. Association for Computational Linguistics.

Roser Morante and Walter Daelemans: Learning the scope of hedge cues in biomedical texts. In Proceedings of the BioNLP 2009 Workshop, pages 28-36, Boulder, Colorado, June 2009. Association for Computational Linguistics.

Dates

The important dates for the shared task were as follows:

  • January 11, 2010: trial datasets and scorer
  • January 18, 2010: registration for the task opens
  • February 1, 2010: training and development sets available
  • March 26, 2010: test set available
  • April 2, 2010: systems' outputs collected
  • April 18, 2010: deadline for paper submission
  • May 2, 2010: notification of acceptance
  • May 9, 2010: deadline for camera ready paper submission
  • July 15 or 16, 2010: shared task session at the CoNLL conference, Uppsala, Sweden

Organisers

The CoNLL-2010 Shared Task was organised by the Human Language Technology Group, University of Szeged.

Organising team:

Richárd Farkas, Human Language Technology Group, University of Szeged

Veronika Vincze, Human Language Technology Group, University of Szeged

György Szarvas, Ubiquitous Knowledge Processing Lab, Technische Universität Darmstadt

György Móra, Human Language Technology Group, University of Szeged

János Csirik, Research Group of Artificial Intelligence, Hungarian Academy of Sciences

Programme Committee

  • Ekaterina Buyko, University of Jena
  • Kevin Cohen, University of Colorado
  • Hercules Dalianis, Stockholm University
  • Maria Georgescul, University of Geneva
  • Filip Ginter, University of Turku
  • Henk Harkema, University of Pittsburgh
  • Shen Jianping, Harbin Institute of Technology
  • Yoshinobu Kano, University of Tokyo
  • Jin-Dong Kim, Database Center for Life Science, Japan
  • Ruy Milidiu, Pontifícia Universidade Católica do Rio de Janeiro
  • Roser Morante, University of Antwerp
  • Lilja Ovrelid, University of Potsdam
  • Arzucan Ozgur, University of Michigan
  • Vinodkumar Prabhakaran, Columbia University
  • Sampo Pyysalo, University of Tokyo
  • Marek Rei, Cambridge University
  • Buzhou Tang, Harbin Institute of Technology
  • Erik Tjong Kim Sang, University of Groningen
  • Katrin Tomanek, University of Jena
  • Erik Velldal, University of Oslo
  • Andreas Vlachos, University of Wisconsin-Madison
  • Xinglong Wang, University of Manchester
  • Torsten Zesch, University of Darmstadt
  • Qi Zhao, Harbin Institute of Technology
  • HuiWei Zhou, Dalian University of Technology