Introduction - Build the SPECIALIST Lexicon

Background

The SPECIALIST Lexicon is widely used for Part of Speech (POS) tagging, indexing, information retrieval, concept mapping, etc. in many Natural Language Processing (NLP) projects, such as Lexical Tools, MetaMap, SemRep, UMLS Metathesaurus, and ClinicalTrials.gov. A new systematic methodology has been developed to identify single words and multiwords from MEDLINE through the use of element words. The results show accelerated growth of the Lexicon, particularly an increase in multiword records. Hence, improvements in recall and precision can be anticipated in NLP projects that use the SPECIALIST Lexicon and its applications.

LexBuild Processes

  • The Lexicon is built by linguists using a web-based computer-aided tool, LexBuild. A list of high-frequency element words is generated by computer programs from MEDLINE abstracts and titles. These element words are reviewed by linguists to:
    • add new lexical records if no exact/close match is found in LexBuild
    • update existing lexical records if related records are found by close match

    All lexical records (single words or multiwords) associated with these element words are reviewed through the Essie search engine, Google Scholar, dictionaries, etc. during the LexBuild process.
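
The element-word step described above can be illustrated with a minimal sketch. It is not the LexBuild implementation: the function names, the tokenization rule, the `min_count` threshold, and the sample titles and lexicon are all hypothetical, and only exact matching is shown (close matching and the linguists' review are not modeled).

```python
import re
from collections import Counter


def element_word_frequencies(texts):
    """Count lowercase alphanumeric tokens across titles/abstracts.

    Assumption: an element word is approximated here as a maximal run
    of letters or digits; the real tokenization rules may differ.
    """
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z0-9]+", text.lower()))
    return counts


def triage(texts, lexicon, min_count=2):
    """Split high-frequency element words into known words and candidates.

    'known' words have an exact match in the lexicon; 'candidates' do not
    and would be passed on for review.
    """
    counts = element_word_frequencies(texts)
    known, candidates = [], []
    for word, n in counts.most_common():
        if n < min_count:
            break  # most_common() is sorted by count, so we can stop here
        (known if word in lexicon else candidates).append((word, n))
    return known, candidates


# Hypothetical sample data, for illustration only.
sample_texts = [
    "Aspirin therapy in stroke prevention",
    "Low-dose aspirin and stroke risk",
    "Stroke outcomes after thrombolysis",
]
sample_lexicon = {"aspirin", "therapy", "in", "and", "risk"}

known, candidates = triage(sample_texts, sample_lexicon)
print("already in lexicon:", known)
print("needs review:", candidates)
```

In this toy run, "stroke" is frequent but missing from the sample lexicon, so it lands in the review list, while "aspirin" is frequent and already present.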