Australian National Database of Spoken Language (ANDOSL)
What is significant about ANDOSL?
It is Australian. It comprises spoken language as it occurs in a variety
of major speaker groups in Australia.
It is extendible. Only a core set of data has been collected, but a framework
of formats and standards has been set which can be used to create additional
technically compatible components of data.
It comprises carefully balanced material from Australian speakers, both
native-born speakers and overseas-born migrants. The material includes
phonetically rich read material and spontaneously spoken, culturally relevant
material.
The speakers have been rigorously selected within phonologically defined
speaker groups. Each group has been balanced for age ranges and gender.
Detailed anonymous descriptors of the speakers are available as part of
the data corpus description.
It was recorded in a high quality environment at the National Acoustic
Laboratories. Both a detailed parametric description of the environment
and detailed notes of any anomalies in the recording process have been
integrated into the data description.
It has been very honestly described in a set of formal data description
files which are distributed with the speech signal data.
It is being annotated under high levels of quality control. Manual annotation
at both word and phonemic levels by highly trained transcribers is being
combined with automatic methods to ensure a high quality product.
It is being disseminated using CD-ROM technology, according to internationally
Last modified: 24 March 1999.