XMLisation

Coordination: Antonio Balvet, Sascha Diwersy

The CORLI “XMLisation” working group addresses the many facets of encoding linguistic data as structured XML, whether in TEI or in other formats. We will look at the whole process: “before”, “during”, and “after”. The “before” part deals with methods for cleaning and preparing raw data so that it converts cleanly to XML formats. The “during” part focuses on choosing the standards and metadata suited to each project’s specific needs, including the selection of the relevant subset of TEI or of another standard. Finally, the “after” part focuses on the use of structured data: how to enrich and exploit well-formed corpora for linguistic analysis, publication and scientific dissemination. The working group aims to provide an overview of best practices for maximizing the quality and usability of XML data in linguistic projects.