Data Science Nights

Data Science Nights - MAY 2026 - Speaker: Xudong Tang, Computer Science and NICO

Thursday, May 28, 2026 | 5:30 PM - 7:00 PM CT
Technological Institute, M416, 2145 Sheridan Road, Evanston, IL 60208

MAY MEETING: Thursday, May 28, 2026 at 5:30pm (US Central)

LOCATION: 
ESAM Conference Room, Tech M416 
2145 Sheridan Road, Evanston, IL 60208

AGENDA:
5:30pm - Meet and greet with refreshments
6:00pm - Talk with Xudong Tang, PhD Student, Computer Science, NICO, and the Human-AI Collaboration Lab, Northwestern University

TALK TITLE:
Human and Machine Perception of Voice Similarity

ABSTRACT:
Modern voice cloning systems generate synthetic speech that listeners frequently cannot identify as synthetic. But a voice can sound natural without sounding like the intended person, and what determines whether a clone is heard as a particular person is an open question. Here we report a large-scale preregistered experiment in which we collected 92,239 responses from 175 participants on their perception of pairs of real recordings, voice clones, and continuously morphed voices drawn from 100 contemporary celebrities across 20 speaker groups. We find that voice clones do not reliably preserve perceived speaker identity, reducing same-speaker judgments by 12.7 percentage points even though the clones are produced by a state-of-the-art text-to-speech model, while leaving different-speaker judgments unchanged. Using continuously morphed stimuli, we find that speakers vary substantially in how much variation their perceived identity tolerates, and that this variation is not predicted by speaker demographics. Speaker embeddings account for 58.9% (95% CI = [55.7, 61.9]) of variance in identity judgments, which is more than acoustic features, social attributes, and clone status combined. Once all these observed features are accounted for, clone status adds no additional predictive power. These results show that the perceptual impact of voice cloning is positional rather than categorical: we can model how listeners judge a voice by how close it falls to the perceptual boundary that defines each speaker's recognizable voice, applying the same criterion to real and synthetic speech alike.

DATA SCIENCE NIGHTS are monthly meetings featuring presentations and discussions about data-driven science and complex systems, organized by Northwestern University graduate students and scholars. Students and researchers of all levels are welcome! For more information: http://bit.ly/nico-dsn

FUTURE DATES:
Data Science Nights will return in September!

Audience

  • Faculty/Staff
  • Student
  • Public
  • Post Docs/Docs
  • Graduate Students

Contact

Stefan Pate

Interest

  • Academic (general)
  • Social Events
  • Data Science & AI
