Anonymization consists of removing all information that could be used to identify an individual, so that data can be shared without compromising privacy. It applies to the identifying information about a participant or place that would allow participants to be identified, to the audio or video signal, and to the transcription, which may contain personal information such as addresses, telephone numbers, or proper nouns. Rather than simply erasing parts of the signal, masking or filtering techniques can make sensitive portions unintelligible while leaving them analyzable in other ways (e.g., prosodic analysis).
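As an illustration of masking rather than erasing, the sketch below low-pass filters only a chosen time interval of a mono signal. The idea is that frequencies well above the cutoff, which carry most of what makes words recognizable, are strongly attenuated, while low-frequency content such as the pitch contour is largely kept. Everything here is an assumption made for illustration: the function name `mask_interval`, the 400 Hz cutoff, the cascade of four first-order filters, and the representation of audio as a list of floats. It is not the method of any tool cited in this entry, and a production tool would likely use a steeper filter and check intelligibility by listening.

```python
import math


def mask_interval(samples, sample_rate, start_s, end_s,
                  cutoff_hz=400.0, passes=4):
    """Return a copy of `samples` with [start_s, end_s) low-pass filtered.

    Hypothetical helper: `samples` is a list of floats (mono audio),
    and the cutoff and number of passes are illustrative defaults.
    """
    out = list(samples)
    first = max(0, int(start_s * sample_rate))
    last = min(len(out), int(end_s * sample_rate))

    # Coefficient of a first-order (one-pole) low-pass filter.
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)

    # Repeating the filter steepens the roll-off above the cutoff.
    for _ in range(passes):
        state = 0.0
        for i in range(first, last):
            state += alpha * (out[i] - state)
            out[i] = state
    return out


if __name__ == "__main__":
    sr = 16000
    # One second of a synthetic signal: a low tone plus a high tone.
    signal = [math.sin(2 * math.pi * 100 * n / sr)
              + math.sin(2 * math.pi * 3000 * n / sr)
              for n in range(sr)]
    # Mask the (hypothetical) sensitive span from 0.25 s to 0.75 s.
    masked = mask_interval(signal, sr, 0.25, 0.75)
    print(len(masked) == len(signal))
```

Because the output has the same length as the input and samples outside the interval are untouched, time alignments in an existing transcription remain valid after masking.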
For visual-gestural information, notably in signed languages, anonymization (for example by blurring) remains problematic, since gaze and facial expressions can be major sources of linguistic information. See "Anonymisation de corpus réutilisables" (Reffay and Teutsch, 2000) and "Script PRAAT d’anonymisation de fichiers sonores".

Contents validated by Groupe de Travail 4