What is corpus annotation?

Annotating a corpus means adding one or more layers of linguistic interpretation to raw data. Annotations can be of very different kinds: morpho-syntactic categories, semantic or discourse annotations and, in the case of spoken or multimodal corpora, information on prosody, gestures, etc.
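To make the idea of "layers" concrete, here is a minimal sketch in Python. It assumes a stand-off representation in which the raw tokens are never modified and each annotation points to a token span. The class names, the layer names ("pos", "syntax") and the labels are illustrative assumptions, not a format prescribed by CORLI or by any particular annotation tool.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Annotation:
    layer: str   # hypothetical layer name, e.g. "pos" or "syntax"
    start: int   # index of the first token covered
    end: int     # index one past the last token covered
    label: str   # the interpretation assigned by the annotator


@dataclass
class AnnotatedText:
    tokens: List[str]                                   # raw data, left untouched
    annotations: List[Annotation] = field(default_factory=list)

    def annotate(self, layer: str, start: int, end: int, label: str) -> None:
        """Add one annotation over tokens[start:end] in the given layer."""
        if not (0 <= start < end <= len(self.tokens)):
            raise ValueError("span outside the token sequence")
        self.annotations.append(Annotation(layer, start, end, label))

    def layer(self, name: str) -> List[Tuple[str, str]]:
        """Return (covered text, label) pairs for a single layer."""
        return [
            (" ".join(self.tokens[a.start:a.end]), a.label)
            for a in self.annotations
            if a.layer == name
        ]


text = AnnotatedText(tokens=["The", "cat", "sleeps"])

# First layer: one morpho-syntactic category per token.
text.annotate("pos", 0, 1, "DET")
text.annotate("pos", 1, 2, "NOUN")
text.annotate("pos", 2, 3, "VERB")

# Second layer: a syntactic label over a multi-token span.
text.annotate("syntax", 0, 2, "NP")

print(text.layer("pos"))
print(text.layer("syntax"))
```

Because each layer is stored separately from the tokens, further layers (for example prosody or gesture in a multimodal corpus) could be added in the same way without touching the raw data or the existing layers.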

Annotation is carried out during annotation campaigns by human annotators with varying levels of expertise, who rely on an annotation guide.

More resources on the CORLI website:

  • The CORLI Annotation Network-Group is dedicated to issues related to corpus annotation. You can subscribe to its mailing list here.
  • Several training sessions organized by CORLI members have been dedicated to corpus annotation. You will find the list of these sessions as well as the course materials here.