When:
Friday, March 6, 2026
3:30 PM - 5:00 PM CT
Where: Chambers Hall, Ruan Conference Room, 600 Foster St, Evanston, IL 60208
Audience: Faculty/Staff - Student - Public - Post Docs/Docs - Graduate Students
Contact:
Annie Lee
annielee@northwestern.edu
Group: Linguistics Department
Category: Academic
Interrogating and transcending proxies in language research: Emergent profiles and confounded variables
What's a "native speaker/signer"? Socially-informed scholarship has shown that language users (and linguist-language-users) have ideologically-mediated intuitions about who counts and who doesn't, with notions of social power and authenticity being particularly relevant factors (Ortega 2020, D'Onofrio 2019, Ramjattan 2019, Faez 2011, Leonard & Haynes 2010, Bucholtz 2003). At the same time, much experimental work operates on the assumption that self-identifying as "monolingual", "heritage", or "native" is a reliable proxy for particular profiles of language experience. While this could be the case in some communities, the logical independence of such terms and the factors they are meant to stand in for requires that researchers ask what these labels actually mean in a given context (Grammon & Babel 2024, Birkeland et al. 2024, Cheng et al. 2021, Namboodiripad, Kutlu, et al. 2026). I present collaborative work which investigates whether and how variation in language experience aligns with commonly-used proxy variables in experimental research.
First, we asked about the structure of variation in the Hindi-Urdu (Cheng et al. 2022) and Malayalam (Abtahian et al. in prep) diasporas. We used agglomerative clustering analysis, a dimensionality reduction tool, to create emergent profiles of language experience, and compared these profiles to proxies such as generation of immigration, self-identifying as a "native speaker", and diaspora vs. in situ status. In both diaspora groups, the emergent profiles outperformed the proxy variables in explaining variation in acceptability judgments (Hindi-Urdu) and choice of food terms (Malayalam).
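To make the clustering step concrete, here is a minimal sketch of agglomerative clustering in plain Python. It is an illustration under stated assumptions, not the analysis used in the studies: the participant IDs, the three experience measures, the numeric values, the "native speaker" proxy labels, the average-linkage rule, and the choice of three clusters are all invented for the example.

```python
from itertools import combinations
from math import dist

# Hypothetical toy data (NOT from the studies). Each participant is scored 0-1 on:
# (early exposure to the language, current-day use, self-rated comprehension)
participants = {
    "P1": (0.90, 0.80, 0.90),
    "P2": (0.85, 0.75, 0.95),
    "P3": (0.90, 0.10, 0.80),
    "P4": (0.80, 0.15, 0.70),
    "P5": (0.10, 0.05, 0.10),
    "P6": (0.20, 0.10, 0.15),
}
# Hypothetical proxy variable: does the participant self-identify as a "native speaker"?
native_label = {"P1": True, "P2": True, "P3": True, "P4": False, "P5": False, "P6": False}


def average_linkage(a, b, points):
    """Mean Euclidean distance over all pairs with one point in each cluster."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerative_clusters(points, n_clusters):
    """Start with singletons and repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters (a < b) with the smallest average linkage.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], points),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because b > a
    return [sorted(c) for c in clusters]


names = list(participants)
points = [participants[n] for n in names]
profiles = agglomerative_clusters(points, n_clusters=3)

# Compare each emergent profile with the proxy label of its members.
for members in profiles:
    ids = [names[i] for i in members]
    print(ids, [native_label[n] for n in ids])
```

In this toy data, P3 and P4 land in the same emergent profile (high early exposure, low current use) even though their hypothetical "native speaker" labels differ, which is the kind of mismatch between profiles and proxies that the comparison is meant to surface.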
Next, we studied an overrepresented group which is assumed to be homogeneous: adult US-based Prolific participants who list English as their primary language. Participants (N=196) answered a detailed questionnaire about language use and exposure across their lifespan, current-day linguistic practices, and alignment with various social and demographic labels (Cheng & Namboodiripad in prep.). When we asked about language experience in an open-ended way that centered linguistic variation, we found considerable self-reported linguistic diversity. Further, we found that early variation in language experience does not correspond to variation in current-day use of English. Finally, we found that participants who self-identified as "monolingual native English speakers", an often prized "control" group in psycholinguistic research, did not significantly differ from others when it came to their patterns of language experience or use. Instead, they were more likely to identify as White.
Across studies, we found structured heterogeneity amongst participants. Language experience profiles aligned with concepts like "receptive bilingual" and "switch dominance", while also supporting theoretical approaches which deemphasize the role of very early experience in determining linguistic practices. Overall, this work aims to center and make sense of the variation which characterizes everyone's language use. Interrogating these ubiquitous proxy variables also allows us to learn, for example, that studies which limit recruitment to self-identified monolingual English speakers in the US, all else being equal, overrepresent White people.