Corpus linguistics is characterized by the analysis of large sets of transcribed natural speech; the Taipei Corpus is one of the largest sets available for any language. The Corpus consists of 64 hours of transcribed speech from Mandarin-speaking two-year-olds, recorded in Taipei between 1975 and 1980. Collected by Professor Mary Erbaugh, the transcripts total almost 10,000 handwritten pages and are a unique resource, with detailed contextual notes on matters such as the children's gestures and activities.

The collection focuses on four Taipei children aged 1 year 10 months to 2 years 10 months, whose speech was recorded every other week; two of them were recorded for a year.

SELECTED REFERENCES:

CHILDES (The Child Language Data Exchange System)
Mary S. Erbaugh. 2001. The Pear Stories: Narrative in 7 Chinese Dialects.
Mary S. Erbaugh. 1992. "The Acquisition of Mandarin," in Dan I. Slobin, ed., The Crosslinguistic Study of Language Acquisition, volume 3. Hillsdale, NJ: Lawrence Erlbaum. 373-455.


CHILD BIOGRAPHIES:

Kang Biography
Pang Biography
LH Biography
Zhong Biography


AUDIO EXAMPLES (MP3 files):

KANG 11-01 KANG 11-05
KANG 11-02 KANG 11-06
KANG 11-03 KANG 11-07
KANG 11-04 KANG 11-08


         

KANG 1 KANG 2 KANG 3 KANG 4 KANG 5
KANG 6 KANG 7 KANG 8 KANG 9 KANG 10
KANG 11 KANG 12 KANG 13 KANG 14 KANG 15
KANG 16 KANG 17 KANG 18 KANG 19 KANG 20
KANG 21 KANG 22 KANG 23
-----
-----
LH 1 LH 2 LH 3 LH 4 LH 5
LH 6 LH 7
-----
-----
-----
PANG 1 PANG 2 PANG 3 PANG 4 PANG 5
PANG 6 PANG 7 PANG 8 PANG 9 PANG 10
PANG 11
PANG 16 PANG 17 PANG 18 PANG 19 PANG 20
PANG 21 PANG 22 PANG 23 PANG 24 PANG 25
ZHONG 1 ZHONG 2 ZHONG 3 ZHONG 4 ZHONG 5
ZHONG 6 ZHONG 7 ZHONG 8 ZHONG 9 ZHONG 10
FuYi-Zhen        
Xiao Xing