How babies acquire their first words. From babbling to talking.

Language acquisition is one of the greatest mysteries of life. Children need only a few months to start mastering one of the most complex phenomena in the world. Using the CHILDES database, we made an animated visualization of this process for English.

Linguists and psychologists have been arguing for generations over how children acquire their native language. Did nature provide us with a special ability to pick up a language in a relatively short time, or do we acquire our mother tongue with general learning capabilities? Chomsky, among others, argues that every human must have a language acquisition device, while Tomasello proposes that language learning is not a special skill but just one manifestation of the learning skills of primates.

We don’t take sides in this debate, but we want to call attention to the unquestionable boom in babies’ vocabulary and grammar at around 18 months of age. As the animation above shows, babies at first babble and use only a few words, but at some point (around the age of 18 months) the number of words they use and the connections between them explode.

Methodology, data, code

We used the American and British English corpora of the CHILDES database. Following Jinyun Ke and Yao Yao’s paper, we generated so-called accumulative language networks. We aggregated speakers’ utterances by month. The nodes of the network are words (or contracted forms like “that’s”), while edges are based on skip-gram (3-skip-2-gram) co-occurrences. The visualization shows only the 400 most frequent words.
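
To make the network construction more concrete, here is a minimal Python sketch of the idea. It is not the code linked below, and it rests on several assumptions: that a 3-skip-2-gram is a pair of words with at most three other tokens between them, that utterances are tokenized by whitespace, that edges are undirected, and that the 400-word cutoff is applied to the cumulative counts. The function names and the input shape (a dict from age in months to a list of utterance strings) are hypothetical.

```python
from collections import Counter


def skip_bigrams(tokens, max_skip=3):
    """Return ordered word pairs with at most `max_skip` tokens between them."""
    pairs = []
    for i, left in enumerate(tokens):
        # j runs over the next max_skip + 1 positions, clipped to the utterance.
        for j in range(i + 1, min(i + max_skip + 2, len(tokens))):
            pairs.append((left, tokens[j]))
    return pairs


def cumulative_networks(utterances_by_month, top_n=400, max_skip=3):
    """Yield (month, nodes, edges); counts accumulate from month to month."""
    word_counts = Counter()
    edge_counts = Counter()
    for month in sorted(utterances_by_month):
        for utterance in utterances_by_month[month]:
            # Assumption: plain whitespace tokenization keeps "that's" whole.
            tokens = utterance.lower().split()
            word_counts.update(tokens)
            for left, right in skip_bigrams(tokens, max_skip):
                if left != right:  # assumption: ignore self-loops
                    # Sorting the pair makes the edge undirected.
                    edge_counts[tuple(sorted((left, right)))] += 1
        # Keep only the most frequent words seen so far, and edges among them.
        nodes = {word for word, _ in word_counts.most_common(top_n)}
        edges = {
            pair: count
            for pair, count in edge_counts.items()
            if pair[0] in nodes and pair[1] in nodes
        }
        yield month, nodes, edges


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = {12: ["mama"], 18: ["more juice", "that's my ball"]}
    for month, nodes, edges in cumulative_networks(toy):
        print(month, len(nodes), len(edges))
```

Each yielded snapshot could then be exported as one time slice of a dynamic graph. Because the counts are never reset, later months always contain the earlier ones, which is what makes the networks accumulative.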

You can find the code used to collect and analyze the data here.

The network visualizations for each month were made with Gephi’s dynamic network tool. The animation was assembled from the individual images with FFmpeg.
