Keyword Index

topic models

  • P-3-41
    Kow Kuroda (Kyorin University, Faculty of Medicine)
    Hierarchical Dirichlet Process (HDP) is a nonparametric counterpart of Latent Dirichlet Allocation (LDA). HDP was used for unsupervised extraction of (1) constitutive patterns of English words, in either spelling or pronunciation, and (2) associative patterns between spellings and pronunciations. In this setting, words are treated as “documents” and their character n-grams as “terms”, with a distinction between continuous “regular” n-grams and discontinuous “skippy” n-grams. The results suggest that regular n-grams allow extraction of morphemes, whereas skippy n-grams allow extraction of more abstract patterns that capture rules of word formation. The proposed method is language-independent and can therefore be applied to any language in an unsupervised manner.
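
The words-as-documents setup described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the `_` gap marker, and the `max_span` parameter are assumptions introduced here, and the abstract does not define skippy n-grams precisely. The sketch takes them to be n characters kept in their original order with at least one skipped position.

```python
from itertools import combinations

GAP = "_"  # hypothetical gap marker; the source does not specify a notation


def regular_ngrams(word, n):
    """Return the contiguous character n-grams of `word`, left to right."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def skippy_ngrams(word, n, max_span=None):
    """Return discontinuous n-grams: n characters in original order with
    at least one skipped position, each skipped run marked by GAP.

    `max_span` (an assumed parameter) optionally limits how many positions
    of the word a pattern may cover.
    """
    result = []
    for idx in combinations(range(len(word)), n):
        span = idx[-1] - idx[0] + 1
        if span == n:
            continue  # contiguous, so it is a regular n-gram
        if max_span is not None and span > max_span:
            continue
        pieces = [word[idx[0]]]
        for prev, cur in zip(idx, idx[1:]):
            if cur - prev > 1:
                pieces.append(GAP)  # mark the skipped characters
            pieces.append(word[cur])
        result.append("".join(pieces))
    return result


def word_to_terms(word, n_values=(2, 3), skippy=False):
    """Treat one word as a "document" and return its bag of n-gram "terms"."""
    extract = skippy_ngrams if skippy else regular_ngrams
    return [gram for n in n_values for gram in extract(word, n)]
```

For example, `regular_ngrams("cats", 2)` yields `ca`, `at`, `ts`, while `skippy_ngrams("cats", 2)` yields `c_t`, `c_s`, `a_s`. Bags of terms built this way, one per word, would then be passed to a topic model such as HDP; that step is omitted here because the abstract does not say which implementation was used.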