Volume 2, Number 1 (June 2008)
(Doulos SIL font required)
Notes from the Field
Minangali (Kalinga) Digital Wordlist: Presentation Form
Kenneth S. Olson
This paper presents a 207-item digital wordlist of Minangali, an Austronesian language spoken in the Philippines. The wordlist includes orthographic and broad phonetic transcriptions of each word, an English gloss, an individual WAV recording of each item, and metadata for resource discovery. An archival form of the wordlist was deposited into an institutional archive (the SIL Language and Culture Archives) and includes the original WAV digital recording, the original RTF wordlist form, descriptive markup encoding of the wordlist in XML employing Unicode transcription, and the metadata record. The presentation form was then generated directly from the archival form.
1. INTRODUCTION. This paper presents a 207-item digital wordlist of core vocabulary in Minangali, a variety of Lower Tanudan Kalinga spoken in the town of Mangali, Kalinga Province, Philippines. Lower Tanudan Kalinga is an Austronesian language (Northern Philippine subgroup) spoken by approximately 11,000 people (ISO 639-3 code: kml; Gordon 2005). This presentation form was generated from an archival form of the data. The procedure we followed for creating both forms is detailed in Simons, Olson, and Frank 2007 and summarized below.
In addition to a description of the primary data in the form of phonetic transcription, we provide a documentation of the data in the form of digital audio recordings (cf. Himmelmann 1998), enabling the reader to verify and critique our transcription. This is important for this particular wordlist because Minangali has some unusual phonological phenomena, including the rare interdental approximant speech sound (to our knowledge, these are the first published recordings of the sound) and word-internal …VCV… sequences in which the intervocalic consonant is the coda of a preceding syllable.
This presentation of the data includes the following materials:
The original wordlist materials included two items: an electronic wordlist form (in RTF format) and an 18.5-minute digital recording in WAV format. The wordlist form that we used presented 207 items, an amalgamation of the Swadesh 200 (Swadesh 1952:456–457) and Swadesh 100 (Swadesh 1955:124, 133–137) wordlists, with some minor modifications (thou → you sg., ye → you pl., person → man (human being), woods → forest, berry → fruit, claw → fingernail, and right → correct). For each item, the form provided a prompt in English and spaces for the transcription of the elicited form in both orthography and broad phonetic script. The third author translated the wordlist into Minangali, with some assistance from the second author.
The wordlist was recorded on April 17, 2006, in the recording studio at the SIL Center in Bagabag, Nueva Vizcaya Province, Philippines. During the recording session, the second author produced the English prompt, and the third author produced the target word twice. The recording was made using a Samson C01U USB studio condenser microphone connected to a notebook computer, using Speech Analyzer v. 2.7 for audio capture.
The third author is a sixty-year-old native speaker of Minangali. He has lived most of his life in Mangali, but he lived in Cebu for three years during his post-secondary education. He also speaks Ilocano, English, Cebuano, and some Tagalog.
The microphone and software allowed for recording at a sampling rate of 44.1 kHz with 16-bit quantization (i.e., standard audio CD quality). This is sufficient for technical purposes, since a 44.1 kHz sampling rate captures frequencies up to about 22 kHz and thus covers nearly all acoustic information pertinent to language, but it does not meet the generally accepted recommended best practice of 96 kHz and 24-bit for archival data (Ladefoged 2003:18, 26; Simons et al. 2007:31; Plichta and Kornbluh 2002; IASA-TC03 2005:8). We recommend that field researchers record primary data at archival-quality rates if possible.
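As an illustration of the distinction drawn above, the following minimal Python sketch reads the sampling rate and bit depth from a WAV header and compares them with the two targets mentioned in the text. It is not part of our workflow; the function names (wav_format, meets, make_silent_wav) are hypothetical, and the WAV data are generated silence so that the sketch runs without a real recording.

```python
import io
import wave

# (sampling rate in Hz, bit depth) targets named in the text.
CD_QUALITY = (44_100, 16)
ARCHIVAL_QUALITY = (96_000, 24)


def wav_format(source):
    """Return (sampling rate, bit depth) for a WAV path or binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getframerate(), wav.getsampwidth() * 8


def meets(fmt, target):
    """True if both the rate and the bit depth reach the target."""
    rate, bits = fmt
    return rate >= target[0] and bits >= target[1]


def make_silent_wav(rate, bits, frames=100):
    """Build a tiny in-memory mono WAV of silence (stand-in for a recording)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(bits // 8)
        wav.setframerate(rate)
        wav.writeframes(bytes(frames * (bits // 8)))
    buf.seek(0)
    return buf


fmt = wav_format(make_silent_wav(44_100, 16))
print(fmt, "CD quality:", meets(fmt, CD_QUALITY),
      "archival quality:", meets(fmt, ARCHIVAL_QUALITY))
```

For a recording on disk, wav_format can be given the file path directly instead of an in-memory buffer.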
The draft transcription contained fields for five annotations: the item number, an English prompt, an orthographic transcription of the Minangali utterance, a broad phonetic transcription of the Minangali utterance using the International Phonetic Alphabet (IPA 1999), and additional notes. The draft transcription was revised by the authors, converted to a comma-delimited (CSV) file, and imported into TableTrans v. 1.2 software (Bird et al. 2002), where it was time-aligned to the original audio recording. The resulting annotation was exported as an XML annotation graph and then transformed into an XML descriptive wordlist format using an XSLT script.
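The general shape of this conversion can be sketched in Python. This is only an illustrative stand-in for the XSLT step, not the script we used: the element and attribute names (wordlist, item, n, lang) are assumptions rather than the archival schema, time alignment is omitted, and the two sample rows use forms cited in this paper as they appear in plain text.

```python
import csv
import io
import xml.etree.ElementTree as ET

# The five annotation fields described in the text (column names are assumed).
FIELDS = ["number", "prompt", "orthographic", "phonetic", "notes"]

SAMPLE_CSV = """number,prompt,orthographic,phonetic,notes
20,few,poos,poos,
31,heavy,dam-ot,dam.ot,syllable break after intervocalic consonant
"""


def csv_to_wordlist_xml(csv_text, language_code="kml"):
    """Turn CSV rows into a simple XML wordlist (hypothetical element names)."""
    root = ET.Element("wordlist", {"lang": language_code})
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = ET.SubElement(root, "item", {"n": row["number"]})
        for field in FIELDS[1:]:
            value = (row.get(field) or "").strip()
            if value:  # leave out empty annotations such as blank notes
                ET.SubElement(item, field).text = value
    return root


xml_root = csv_to_wordlist_xml(SAMPLE_CSV)
print(ET.tostring(xml_root, encoding="unicode"))
```

Keeping each annotation in its own element is what makes the markup descriptive: a later transformation can select, for example, only the phonetic field without parsing display text.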
The original electronic wordlist (in RTF format), the original WAV file, the XML descriptive wordlist, and a metadata record constitute the archival form of the wordlist (Machlan and Olson 2008). The metadata record follows the standard set up by the Open Language Archives Community (OLAC, http://www.language-archives.org/OLAC/metadata.html). A copy of the archival materials can be ordered on CD-ROM for a nominal fee from:
SIL Language and Culture Archives
The presentation form of the wordlist was then generated from the archival form. An XSLT script was employed to convert the archival XML descriptive wordlist into an HTML presentation wordlist. Then, TableTrans was used to automatically create individual sound files corresponding to each of the segments identified in the transcription process for use in the presentation form.
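The segmentation step can be approximated as follows. This is a minimal sketch, not the TableTrans implementation: the function names and the item time offsets are hypothetical, and the source audio is generated silence rather than the Minangali recording. It cuts one in-memory WAV per aligned segment while preserving the source format.

```python
import io
import wave


def make_test_wav(rate=44_100, seconds=2):
    """Return the bytes of a mono 16-bit WAV of silence (placeholder audio)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(bytes(rate * seconds * 2))
    return buf.getvalue()


def extract_segment(wav_bytes, start_s, end_s):
    """Return WAV bytes holding the audio between start_s and end_s (seconds)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        total = src.getnframes()
        start = min(int(round(start_s * src.getframerate())), total)
        end = min(int(round(end_s * src.getframerate())), total)
        src.setpos(start)
        frames = src.readframes(max(0, end - start))
    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setparams(params)   # same channels, bit depth, and sampling rate
        dst.writeframes(frames)  # the frame count in the header is corrected on close
    return out.getvalue()


# Hypothetical time alignments (item number -> start and end in seconds).
alignments = {"020": (0.25, 0.75), "031": (1.0, 1.5)}
source = make_test_wav()
clips = {item: extract_segment(source, s, e) for item, (s, e) in alignments.items()}
for item, clip in clips.items():
    print(item, len(clip), "bytes")
```

Because the cut points are computed in frames from the sampling rate, the clips keep the original quantization and no resampling takes place.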
The broad phonetic transcription requires a few remarks. First, two adjacent identical consonants represent a long consonant. We write them as two consonants in order to allow for stress to be marked properly on forms such as item 33: abobba [abobba] ‘short’. Second, two adjacent identical vowels represent two distinct vowels occurring in separate syllables, such as in item 20: poos [poos] ‘few’. Third, some words have an unusual pattern in which there is a syllable break after an intervocalic consonant. In such cases we explicitly mark the syllable break with a period, e.g. item 31: dam-ot [dam.ot] ‘heavy’. In cognates from many other Philippine languages, there is a glottal stop in this position, so it appears that in Minangali the glottal stop has been deleted historically in these cases. Fourth, the phoneme transcribed in the Minangali orthography as <k> is normally realized as a glottal stop [ʔ] in Minangali. Cognates from many other Philippine languages have a /k/ in this position, so it appears that in Minangali the /k/ has evolved into /ʔ/. However, in item 68 it is a [k] that is produced: sakkud [sakkud] ‘horn’. It is not certain that this word is a borrowing. Fifth, the eth with lowering sign [ð̞] represents the interdental approximant, a rare speech sound found in about a dozen Philippine languages, including Kagayanen, Karaga Mandaya, Kalagan, Southern Catanduanes Bicolano, and several varieties of Kalinga (Olson and Mielke 2007).
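Since the symbols for the glottal stop and the interdental approximant are easily lost when the text is copied or displayed without a suitable font, it may help to state their Unicode encoding explicitly. The short Python sketch below assumes the standard IPA-to-Unicode mapping (U+0294 for the glottal stop; U+00F0 followed by the combining lowering diacritic U+031E for the eth with lowering sign); the constant and function names are hypothetical.

```python
import unicodedata

GLOTTAL_STOP = "\u0294"                   # glottal stop
INTERDENTAL_APPROXIMANT = "\u00f0\u031e"  # eth + combining lowering sign


def describe(text):
    """List (code point, Unicode character name) for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for label, symbol in [("glottal stop", GLOTTAL_STOP),
                      ("interdental approximant", INTERDENTAL_APPROXIMANT)]:
    print(label, describe(symbol))

# The approximant has no precomposed form, so NFC normalization keeps it as
# a base letter plus a combining mark.
print(len(unicodedata.normalize("NFC", INTERDENTAL_APPROXIMANT)))
```

Because the approximant is a two-code-point sequence, software that counts or compares segments in the transcription needs to treat the base letter and its diacritic as one unit.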
Bird, Steven, Kazuaki Maeda, Xiaoyi Ma, Haejoong Lee, Beth Randall, and Salim Zayat. 2002. TableTrans, MultiTrans, InterTrans and TreeTrans: Diverse tools built on the Annotation Graph Toolkit. Proceedings of the Third International Conference on Language Resources and Evaluation. Paris: European Language Resources Association. http://arxiv.org/abs/cs/0204006.
Gordon, Raymond G., ed. 2005. Ethnologue: Languages of the world, 15th edition. Dallas, TX: SIL International. http://www.ethnologue.com.
Himmelmann, Nikolaus. 1998. Documentary and descriptive linguistics. Linguistics 36:161–195.
IASA-TC03. 2005. The safeguarding of the audio heritage: Ethics, principles and preservation strategy, version 3. http://www.iasa-web.org/IASA_TC03/TC03_English.pdf.
International Phonetic Association. 1999. Handbook of the International Phonetic Association. Cambridge: Cambridge University Press.
Ladefoged, Peter. 2003. Phonetic data analysis: An introduction to fieldwork and instrumental techniques. Oxford: Blackwell.
Machlan, Glenn, and Kenneth S. Olson. 2008. Minangali (Kalinga) digital wordlist: Archival form. SIL-LCA-50319. SIL Language and Culture Archives, Dallas, TX.
Olson, Kenneth S., and Jeff Mielke. 2007. Articulation of the Kagayanen interdental approximant: An ultrasound study. Paper presented at the Linguistic Society of America annual meeting, January 2007, in Anaheim, CA.
Plichta, Bartek, and Mark Kornbluh. 2002. Digitizing speech recordings for archival purposes. http://www.historicalvoices.org/papers/audio_digitization.pdf.
Simons, Gary F., Kenneth S. Olson, and Paul Frank. 2007. Ngbugu digital wordlist: A test case for best practices in archiving and presenting language documentation. Linguistic Discovery 5(1):28–39. http://journals.dartmouth.edu/cgi-bin/WebObjects/Journals.woa/2/xmlpage/1/article/314.
Swadesh, Morris. 1952. Lexico-statistic dating of prehistoric ethnic contacts: With special reference to North American Indians and Eskimos. Proceedings of the American Philosophical Society 96(4):452–463.
Swadesh, Morris. 1955. Towards greater accuracy in lexicostatistic dating. International Journal of American Linguistics 21:121–137.
Kenneth S. Olson