Spelling Variants Model


From the results of the non-lead-end words filters, we observed that if a term has one or more associated spelling variants in the same corpus (the MEDLINE n-Gram set), it is likely a valid multiword. We therefore developed spelling variant pattern algorithms to identify spelling variant groups in the n-Grams. These groups can be used as matchers to retrieve valid multiwords from the n-Grams.
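The matcher idea above can be illustrated with a minimal sketch. It is not the SpVar pattern algorithm itself: the two rewrite rules, the function names variant_key and spelling_variant_groups, and the sample n-grams are all hypothetical, chosen only to show how terms that differ by a spelling pattern can be collected into groups.

```python
import re
from collections import defaultdict

# Hypothetical rewrite rules (NOT the actual SpVar patterns). Each maps a
# British-style ending to an American-style one. The length guards keep short
# words such as "four" or "rise" from being rewritten; the rules can still
# over-generate on other words.
RULES = [
    (re.compile(r"(\w{3,})our\b"), r"\1or"),                     # tumour -> tumor
    (re.compile(r"(\w{2,})is(e|ed|es|ing|ation)\b"), r"\1iz\2"),  # ionising -> ionizing
]

def variant_key(term):
    """Normalize a term so that its spelling variants share one key."""
    key = term.lower()
    for pattern, replacement in RULES:
        key = pattern.sub(replacement, key)
    return key

def spelling_variant_groups(ngrams):
    """Return groups of n-grams that have at least one spelling variant in the input."""
    groups = defaultdict(set)
    for term in ngrams:
        groups[variant_key(term)].add(term.lower())
    # A group with a single member has no variant in the corpus, so it is dropped.
    return sorted(sorted(members) for members in groups.values() if len(members) > 1)

# Made-up sample input, for illustration only.
sample_ngrams = [
    "tumour necrosis factor",
    "tumor necrosis factor",
    "of the tumor",
    "ionising radiation",
    "ionizing radiation",
    "four patients",
    "for patients",
]

for group in spelling_variant_groups(sample_ngrams):
    print(group)
```

In this sketch, only the n-grams that fall into a group with more than one member are kept as multiword candidates; n-grams such as "of the tumor" have no variant in the input and are left out.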

Definition and types

By definition, spelling variants (spVars) are terms that have the same meaning, category (POS), pronunciation, and syntax, but different spellings. They are the differing spellings of American and British English.

SpVar Pattern Algorithm

SpVar Pattern Test