Language Documentation & Conservation, 1(2), December 2007: Brotchie's review of Audiamus

Volume 1, Number 2 (December 2007)


Technology Review

Audiamus 2.3
created by Nicholas Thieberger

Reviewed by Amanda Brotchie, University of Melbourne

Audiamus is a software program that links digitized media to their transcripts, so that when a line of text is highlighted with a mouse-click, the corresponding segment of the media file is played. It is a valuable tool for linguists, particularly descriptive linguists working with large corpora, because it enables instant access to any part of a sound file in their database at the click of a mouse.

Audiamus was developed in 1998 to enable linguists to refer back to their recordings to check transcriptions and glean other audio information, such as intonation patterns or evidence of self-correction, without having to scroll through large sound files or search through boxes of CDs and tapes. Audiamus allows access to the audio material without having to segment the sound file itself; it requires only that the transcription be associated with start and end time-codes for each line of text.
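To illustrate why time-codes alone are enough, here is a minimal Python sketch that pulls one clip out of a WAV file by converting a start and end time into frame positions. It is not Audiamus's own implementation (the review does not describe its internals); the function name `read_clip` and the use of seconds as the time unit are assumptions made for the example.

```python
import wave


def read_clip(wav_path, start_s, end_s):
    """Return the raw audio frames between start_s and end_s (in seconds).

    Hypothetical helper: the sound file is never cut up on disk; the
    time-codes are simply translated into frame offsets at read time.
    """
    with wave.open(wav_path, "rb") as wav:
        rate = wav.getframerate()
        total = wav.getnframes()
        # Clamp both offsets so out-of-range time-codes cannot raise an error.
        first = min(max(int(start_s * rate), 0), total)
        last = min(max(int(end_s * rate), first), total)
        wav.setpos(first)
        return wav.readframes(last - first)
```

The same file can serve any number of transcript lines, because each line only stores two numbers rather than its own audio segment.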

The Audiamus interface is simple and straightforward. The main window shows the lines of transcribed text, and in the top left-hand corner the user enters the name of the audio file, so that the program knows where to look for the sound clips. There is no limit to the number or length of files that can be incorporated into the Audiamus database. The interface can be locked, in which case no changes can be made to the transcription or file. By clicking unlock, the user can make alterations, which can then be saved by clicking save in the dropdown file menu. There is another dropdown menu listing all the files in the Audiamus database, enabling easy access to any file. There is also a find function to locate a particular word or phrase within a file and an option to search all files.

Figure 1: Audiamus Main Window

Audiamus has several other valuable uses. It has a concordance function, which finds and collates all instances of each word in the corpus. For a corpus containing twenty hours of transcribed data, the concordance takes about ten minutes to compile. And the playlist function in Audiamus is a quick and convenient way to compile a list of utterances for a presentation. The user can also edit the in and out points of a clip, to isolate the word or phrase under review.
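The idea behind a concordance can be sketched in a few lines of Python. This is only an illustration of the general technique, not the routine Audiamus uses; the function name `build_concordance` and the simple word-splitting rule are assumptions.

```python
import re
from collections import defaultdict


def build_concordance(lines):
    """Map each lower-cased word to every place it occurs.

    Each occurrence is recorded as (line_index, word_position), so a word
    that appears twice in one line is listed twice.
    """
    index = defaultdict(list)
    for line_no, text in enumerate(lines):
        # Assumption: a "word" is any run of letters, digits or underscores.
        for pos, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].append((line_no, pos))
    return dict(index)
```

Because every entry points back to a transcript line, and each line carries its own time-codes, a concordance entry can lead straight to the matching stretch of audio.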

The functions and interface of Audiamus are regularly improved, and several updated versions have been released over the past few years. The current version, Audiamus 2.3, was released in October 2006. It has two useful functions that previous versions lacked: a repeat button, which loops the clip, and a slider that controls playback speed. For people working with earlier versions, the upgrade is achieved by exporting all the data from the existing version of Audiamus, quitting it, then opening the new version and performing a mass import from the dropdown menu.

Audiamus can be incorporated into the work process as an additional step, after chunking and transcribing the digitized data, and before glossing it. The input for Audiamus is tab-delimited text documents or Transcriber’s LIMSI label documents, which can be exported from transcription programs, such as Transcriber, and which have start and end time-codes for each clip following the line of text. The file can then be exported from Audiamus along with time-code information as a plain text file, or in another format, such as XML or Toolbox. The transcription is then ready for glossing.
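As a rough illustration of the tab-delimited input described above, the Python sketch below reads lines of the form text, start time, end time. The exact column layout and the use of seconds are assumptions made for the example; the Audiamus documentation should be checked for the format it actually expects. The names `Segment` and `parse_segments` are hypothetical.

```python
import io
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    start: float  # assumed to be seconds
    end: float    # assumed to be seconds


def parse_segments(source):
    """Parse lines of 'text<TAB>start<TAB>end' from an open text stream."""
    segments = []
    for line_no, line in enumerate(source, start=1):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        text, start, end = fields[0], float(fields[1]), float(fields[2])
        if end < start:
            raise ValueError(f"line {line_no}: end time precedes start time")
        segments.append(Segment(text, start, end))
    return segments


# Example with two invented transcript lines.
sample = io.StringIO("hello there\t0.0\t1.5\nsecond line\t1.5\t3.25\n")
segments = parse_segments(sample)
```

Checking that each end time follows its start time at this stage catches export errors before the transcript is linked to the audio.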

One drawback of working this way is the amount of time spent setting up the database. However, this is balanced by the time saved throughout the research, since there is no longer any need to search for utterances in large sound files or on tape or CD. Unicode is not yet supported in Audiamus, so diacritics from transcription programs, and IPA characters, need to be converted into plain text characters before importing into Audiamus. There is also an issue with the size of the interface in the current version. The interface requires a computer screen of around 29 x 19 cm. Many computer screens are slightly smaller than this, and the interface then needs to be shifted around for the user to access all the functions. There are also requirements that the user needs to know for the program to work:

Sound files need to be stored in the same folder as the program.
The name of the sound file entered in the top left-hand corner needs to match exactly the name of the sound file in the folder.
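The constraints described in this section (plain text characters only, and a sound file whose name matches exactly) can be checked before import. The Python sketch below is one possible approach, not part of Audiamus: both function names are hypothetical, and simply dropping diacritics is an assumption that loses distinctions a project may need, in which case a hand-made character mapping would be the safer choice.

```python
import unicodedata
from pathlib import Path


def to_plain_text(text):
    """Strip combining diacritics and report what is still non-ASCII.

    Returns (stripped_text, leftovers). The leftovers (for example IPA
    letters) are left in place, because only the transcriber can decide
    on a plain-text substitute for them.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    leftovers = sorted({ch for ch in stripped if ord(ch) > 127})
    return stripped, leftovers


def sound_file_present(folder, name):
    """True only if a file in `folder` has exactly this name (case included)."""
    return name in {p.name for p in Path(folder).iterdir() if p.is_file()}
```

Comparing against the names actually listed in the folder keeps the check case-sensitive, even on file systems that would otherwise treat `Story1.wav` and `story1.wav` as the same file.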

.wav files are compatible with Audiamus but take up a lot of hard disk space. Audiamus can also play mp3 files, but some mp3 encoders produce files that progressively slip out of sync with the text. The sync should therefore be checked once the text and mp3 file are linked, and a different encoder used if problems appear.

My personal experience is that I refer to the audio files using Audiamus several times a day as I am writing up my work, to investigate apparent anomalies in the data. Initial transcriptions of the material can be flawed, as the sounds and grammar of the language are relatively unfamiliar. Almost invariably I find that apparent anomalies in the data turn out to be mistranscriptions, hesitation phenomena, or self-correction phenomena that were either not originally detected or not represented as such. As a result, the transcriptions are constantly being refined, so the quality of the text database is continually improving, in effect making more and more of the data available for analysis. The program allows linguists to be more rigorous in their analyses.

The Audiamus home page contains a description of its functions, technical information, and detailed instructions for downloading, importing, and exporting files.

Primary function:	Creates a corpus of digitized media indexed by transcripts.
Pros:	Facilitates rigorous descriptive work by ensuring that data recordings (and not the transcriptions) are used as the primary source material; allows repeated listenings to recorded data to ensure accurate transcription; editable text and time-code; variable speed function for playback; simple and straightforward interface; fast-find function to locate words or phrases; reasonably fast concordance function; able to compile and store playlists comprising one or more clips for presentations; supports a potentially infinitely large database; data not locked in to a proprietary format; freeware.
Cons:	Unicode is not yet supported.
Platforms:	Mac OSX, Windows
Open Source:	Audiamus is available from the website http://www.linguistics.unimelb.edu.au/thieberger/audiamusdemo.htm There is no download fee, but potential users need to contact [thien at unimelb dot edu dot au] before downloading.
Proprietary:	Audiamus is compiled from Runtime Revolution, so it is freely available, but is not modifiable without purchasing Runtime Revolution. Transcriptions which have been modified in Audiamus can be exported in a nonproprietary, plain text format.
Reviewed version:	Audiamus 2.3
Application size:	1.6MB
Documentation:	There is detailed supporting material, including technical information, available from the website: http://www.linguistics.unimelb.edu.au/thieberger/audiamus.htm

Amanda Brotchie
a.brotchie2 at pgrad dot unimelb dot edu dot au

Attribution Non-Commercial No Derivatives License