Orthographic text convention

A convention for orthographic text specifies how the information needed to render an audio recording as text is encoded. In addition to the tokens themselves, a transcription encodes low-level information such as noises, partial words, or particular pronunciations. This low-level information is generally not included in annotations, which carry higher-level information and for which we recommend a separate encoding (respecting the principle of stand-off annotation). Working Group 1 (Groupe de travail 1) of the IRCOM group is preparing a transcription convention that may serve as a reference for French.
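
The separation between the transcription layer and stand-off annotations can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Token` and `Annotation` classes, the `kind` labels, the `voul-` and `[rire]` notations, and the annotation labels are hypothetical and do not reproduce the IRCOM convention or any other published one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One unit of the transcription layer (hypothetical structure)."""
    text: str
    kind: str = "word"  # assumed labels: "word", "noise", "partial"


@dataclass(frozen=True)
class Annotation:
    """Stand-off annotation: refers to tokens by index, never copies them."""
    start: int  # index of the first token covered
    end: int    # index one past the last token covered
    label: str


# Transcription layer: tokens plus low-level information
# (a partial word and a noise), using made-up notations.
transcription = [
    Token("je"),
    Token("voul-", kind="partial"),
    Token("voudrais"),
    Token("[rire]", kind="noise"),
    Token("un"),
    Token("café"),
]

# Annotation layer, stored separately from the transcription.
annotations = [
    Annotation(0, 3, "disfluency"),
    Annotation(4, 6, "noun_phrase"),
]


def covered_text(tokens, ann):
    """Return the transcribed text that an annotation points to."""
    return " ".join(t.text for t in tokens[ann.start:ann.end])


def words_only(tokens):
    """Return full words only, dropping noises and partial words."""
    return [t.text for t in tokens if t.kind == "word"]


if __name__ == "__main__":
    for ann in annotations:
        print(ann.label, "->", covered_text(transcription, ann))
    print(words_only(transcription))
```

Because the annotations only hold token indices, a higher-level layer can be added, revised, or discarded without touching the transcription itself, which is the point of the stand-off principle mentioned above.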