UMLS Metathesaurus Programs

This page describes the programs run on UMLS releases (AA and AB):

1.ProcessUmlsFiles

  • This program gets all English terms, concepts (CUI), the preferred term of each concept, synonyms, etc. It is needed for the STMT and TC projects. It needs to be run twice annually, on ${YEAR}AA and ${YEAR}AB.
  • Data supported by: Dr. Kin Wah Fung, Joe Chow
  • Location: ash2:/u03/umls/Releases/${YEAR}${VERSION}/Full/RRF/META/
  • Source files: ${META}/data/${UMLS_RELEASE}/org
    • MRCONSO.RRF
    • MRFILES.RRF
    • MRXNS_ENG.RRF
    • MRXNW_ENG.RRF
  • Run Programs:
    	${META}/bin/1.ProcessUmlsFiles
    	${UMLS_RELEASE}
    	${LVG_YEAR}
    	
    • 2: Get English terms
    • 3: Regenerate and test MRXNS_ENG.RRF
    • 4~5: Test Norm on English Strings (MRXNS_ENG.RRF); run only if step 3 does not work
    • 6: Regenerate and test MRXNW_ENG.RRF
    • 7~8: Test Norm Words from English Strings (MRXNW_ENG.RRF); run only if step 6 does not work
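
The kind of extraction described above (English terms, CUIs, preferred terms, synonyms) can be sketched as follows. This is a minimal illustration only, not the code behind 1.ProcessUmlsFiles. It assumes the standard pipe-delimited MRCONSO.RRF column order, and it assumes that a concept's preferred term is the row with TS=P, STT=PF and ISPREF=Y, a common convention that the actual program may not follow. The function name english_terms_by_cui is hypothetical.

```python
from collections import defaultdict

# Zero-based positions of the MRCONSO.RRF columns used here
# (assumption: standard column order, pipe-delimited).
CUI, LAT, TS, STT, ISPREF, STR = 0, 1, 2, 4, 6, 14


def english_terms_by_cui(lines):
    """Group English strings by CUI and pick one preferred term per concept.

    Hypothetical helper for illustration; not part of the UMLS tooling.
    """
    terms = defaultdict(set)   # CUI -> all English strings (term + synonyms)
    preferred = {}             # CUI -> preferred English term
    for line in lines:
        fields = line.rstrip("\r\n").split("|")
        # Skip malformed rows and non-English rows.
        if len(fields) <= STR or fields[LAT] != "ENG":
            continue
        cui, term = fields[CUI], fields[STR]
        terms[cui].add(term)
        # Assumed rule for "preferred term": TS=P, STT=PF, ISPREF=Y.
        if (fields[TS], fields[STT], fields[ISPREF]) == ("P", "PF", "Y"):
            preferred.setdefault(cui, term)
    return terms, preferred
```

With a real release, the function could be fed an open file handle on ${META}/data/${UMLS_RELEASE}/org/MRCONSO.RRF (opened with encoding="utf-8"); the synonyms of a concept are then its terms minus the preferred one.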

2.GetAtoms

  • This program gets all English terms (the atoms file) from MRCONSO.RRF. It retrieves every string (STR, the 15th field) from MRCONSO.RRF whose language (LAT, the 2nd field) is either ENG or SPA. LSG has generated this file itself for the 2015+ releases. It needs to be run once annually, on ${YEAR}AA, for the Lexical Tools canonical tables.
  • Source files: ${META}/data/${UMLS_RELEASE}/org
    • MRCONSO.RRF
  • Run Programs:
    	${META}/bin/2.GetAtoms
    	${UMLS_RELEASE}
    	
    • 1: Get Atoms by Java program
    • 2: Get Atoms by OCCS script
    • 3: Compare the above two files (after sorting, they are the same)
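
The atom retrieval and the sorted comparison in step 3 can be sketched as follows. This is a minimal illustration, not the Java program or the OCCS script. It assumes MRCONSO.RRF is pipe-delimited with LAT as the 2nd field and STR as the 15th, as stated above. The names extract_atoms and same_after_sort are hypothetical.

```python
LAT_INDEX = 1              # LAT is the 2nd field (zero-based index 1)
STR_INDEX = 14             # STR is the 15th field (zero-based index 14)
LANGUAGES = {"ENG", "SPA"}


def extract_atoms(lines, languages=LANGUAGES):
    """Yield the STR value of every row whose LAT is in `languages`."""
    for line in lines:
        fields = line.rstrip("\r\n").split("|")
        # Skip rows that are too short to hold a STR field.
        if len(fields) > STR_INDEX and fields[LAT_INDEX] in languages:
            yield fields[STR_INDEX]


def same_after_sort(lines_a, lines_b):
    """Return True if both line sequences hold the same lines, ignoring order."""
    def normalize(lines):
        return sorted(line.rstrip("\r\n") for line in lines)
    return normalize(lines_a) == normalize(lines_b)
```

For step 3, same_after_sort could be called with the lines of the two generated atoms files. It sorts both in memory, so for a full release an external sort followed by a line-by-line diff may be the more practical choice.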