The scientific network “Language contact phenomena in multilingual first language acquisition” aims to bring together early career researchers and established scholars in the fields of multilingualism and first language acquisition in order to foster the exchange of research on language contact phenomena across multiple language pairs and to pursue joint research projects. In particular, the network provides an opportunity to combine the efforts of its members to create state-of-the-art corpora of multilingual first language acquisition, to make them available to the scientific community, and to analyze them using innovative quantitative methods. Thus, the objectives of the project fall into two broad categories: the first set of objectives is methodological and data-oriented, while the second relates to theoretical questions of how to model multilingual language acquisition in general and transfer phenomena in particular.

As for the first aspect, our goal is to refine and consolidate the inventory of methodological approaches currently used in the empirical study of multilingual first language acquisition, and to create well-annotated, well-documented, and widely reusable corpus resources for the scientific community. Importantly, most network members have already built up extensive corpora of bilingual language acquisition in different language pairs (e.g. English – German, English – Polish, German – Turkish). In all cases, the data have been transcribed, or are currently being transcribed, according to the CHAT standards (MacWhinney 2000). However, all corpora need further work in order to meet the high standards set by previous child language corpora, many of which are available via the Child Language Data Exchange System (CHILDES), the standard repository for child language acquisition data. A variety of tools for automatically enriching language acquisition data already exist – e.g. the MOR program by MacWhinney (e.g. 2018) for English, adapted for German by Koch (2019) – but they are tailored to specific languages and have to be adapted in order to be usable for bilingual data and for languages for which no MOR grammars exist so far. The network provides an opportunity to develop and improve technical solutions jointly and to make the annotations of the different corpora as comparable as possible across languages and language pairs. In addition, researchers with different methodological backgrounds – some with significant experience in corpus linguistics, others with expertise in statistical analysis and computational modelling – can exchange their knowledge in the context of the network to arrive at the best possible operationalizations of the research questions that emerge from a usage-based approach to multilingual language acquisition.

As for the second aspect, the network members commit to pursuing a number of joint research questions that concern how to model multilingual acquisition and how the results obtained from our data across multiple language pairs feed back into a usage-based account of language learning. These questions are relevant for our understanding of multilingual language acquisition, and the corpora compiled by the network members provide an ideal resource for addressing them. They relate to key issues such as how children differentiate their languages and whether transfer phenomena and code-mixing have socio-pragmatic functions in early child language (as has been shown for these phenomena in adult use). In particular, our goal is to identify the “building blocks” of multilingual acquisition in psychologically plausible ways by detecting constructional patterns using data-driven methods, and to investigate how constructions from different languages interact in each child’s emergent constructional network, how the individual repertoires differ between children, how children make use of these linguistic repertoires in different situations, and how each child’s inventory of constructions changes over time. These results can contribute valuable insights into how linguistic “multi-competence” (Cook 2016) in general, and code-mixing and transfer phenomena in particular, can be modelled from a usage-based perspective.