Australian National Database of Spoken Language (ANDOSL)

How is the data annotated?

What do the symbols mean?

The ANDOSL data was originally annotated at the acoustic-phonetic level by expert phoneticians. Because this work is labour-intensive, only a subset of the sentence data was annotated in this way. These annotations were reduced to the phonemic level and used to train segmentation models. The remaining sentences were then annotated automatically at the phonemic level using these models. The phonemic annotations are not included on the CDROMs; they are held at the ANU, which makes them available in their most recently corrected form to all purchasers of the CDROMs.

The criteria for the original acoustic-phonetic segmentation as used by Croot and Taylor can be found here.

A legend of the phonemic labels used in both manual and automated labelling of ANDOSL data can be found here.

