DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon

Abstract

Finding word boundaries in continuous speech is challenging as there is little or no equivalent of a ‘space’ delimiter between words. Popular Bayesian non-parametric models for text segmentation (Goldwater et al., 2006, 2009) use a Dirichlet process to jointly segment sentences and build a lexicon of word types. We …