gov.nih.nlm.nls.nlp.textfeatures
Class Variant

java.lang.Object
  extended by gov.nih.nlm.nls.nlp.textfeatures.MmObject
      extended by gov.nih.nlm.nls.nlp.textfeatures.Variant
All Implemented Interfaces:
java.io.Serializable

public class Variant
extends MmObject

Variant Fri Jun 02 17:46:46 EDT 2000, divita Initial Version

Version:
$Id: Variant.java,v 1.13 2005/12/21 20:43:17 divita Exp $
See Also:
Serialized Form

Field Summary
static char ACRONYM
           
static int ACRONYM_COST
           
static char ACRONYM_EXPANSION
           
static int ACRONYM_EXPANSION_COST
           
static char DERIVATION
           
static int DERIVATION_COST
           
static char INFLECTION
           
static int INFLECTION_COST
           
static char NO_OPERATION
           
static int NO_OPERATION_COST
           
static char SPELLING_VARIANT
           
static int SPELLING_VARIANT_COST
           
static char SYNONYM
           
static int SYNONYM_COST
           
 
Fields inherited from class gov.nih.nlm.nls.nlp.textfeatures.MmObject
serialVersionUID
 
Constructor Summary
Variant(java.lang.String pRow)
          This is a constructor for Variant used to create the fullVars table from MetaMap's vars, varsan, varsanu, and varsu tables.
Variant(java.lang.String pVariant, int pCat, int pDistance, java.lang.String pHistory, Span pSpan, Span pPhraseCharSpan, Span pPhraseWordSpan, int pVariantId)
          This is a constructor for Variant
Variant(java.lang.String pVariant, int pCat, int pInfl, java.lang.String pRuleOrFact, java.lang.String pHistory, Span pSpan, int pVariantId)
          This is a constructor for Variant
Variant(java.lang.String pOrigTerm, int pOrigCat, java.lang.String pVariant, int pCat, int pInfl, java.lang.String pRuleOrFact, java.lang.String pHistory, int pDistance, int pFoundInFullTable, int pFoundInNounAdjDerivationFilteredTable, int pFoundInNounAdjDerivationAndUniqAcronymFilteredTable, int pFoundInUniqAcronymsFilteredTable)
          This is a constructor for Variant used for variant generation.
 
Method Summary
 void addCategory(int pCategory)
          Method addCategory
 boolean containsNoAcronymsOrAbbreviations()
          Method containsNoAcronymsOrAbbreviations returns true if this variant is acronym and abbreviation free.
 boolean containsNoDerivations()
          Method containsNoDerivations returns true if this variant is derivation free.
 java.lang.String display(gov.nih.nlm.nls.utils.GlobalBehavior settings)
          Method display returns a string of the variant at the level of detail set by the global behaviors.
 boolean equals(java.lang.Object pVariant)
          Method equals
 int foundInFullTable()
          Method foundInFullTable
 int foundInNounAdjDerivationAndUniqAcronymFilteredTable()
          Method foundInNounAdjDerivationAndUniqAcronymFilteredTable returns 1 if this variant is found in the filtered variants table that filters out all those variants that have derivations that are non noun/adjective pairs or that have non-unique acronyms or acronym expansions.
 int foundInNounAdjDerivationFilteredTable()
          Method foundInNounAdjDerivationFilteredTable returns 1 if this variant is found in the filtered variants table that filters out all those variants that have derivations that are non noun/adjective pairs.
 int foundInUniqAcronymsFilteredTable()
          Method foundInUniqAcronymsFilteredTable returns 1 if this variant is found in the filtered variants table that filters out all those variants that have non-unique acronyms or acronym expansions
static Variant fromBytes(byte[] pBytes)
          Method fromBytes creates a variant from a byte array: ten ints (term cat, variant cat, distance, FullVar, VarAN, VarANU, VarU, #term, #var, #hist) followed by the term (n bytes), the variant (m bytes), and the history (p bytes).
static Variant fromTableRow(java.lang.String pRow)
          Method fromTableRow creates a variant from a row from the fullVars table.
 int getCategories()
          Method getCategories returns the category(s) of this variant
 int getDistance()
          Method getDistance
 java.lang.String getHistory()
          Method getHistory
 int getInflections()
          Method getInflections returns the inflection(s) of this variant.
 java.lang.String getNormalizedTerm()
          Method getNormalizedTerm retrieves the Variant term, lowercased, devoid of punctuation - this is supposed to match the normalization that was done to create the normalized string, string sui table.
 int getOrigCat()
          Method getOrigCat
 java.lang.String getOrigTerm()
          Method getOrigTerm
 LexicalElement getParent()
          Method getParent returns the LexicalElement that this variant came from.
 Span getPhraseCharSpan()
          Method getPhraseCharSpan
 int getPhrasePosition()
          getPhrasePosition gets the lexical element position of this variant within the phrase.
 int getPhraseTokenPosition()
          Method getPhraseTokenPosition
 Span getPhraseWordSpan()
          Method getPhraseWordSpan
 java.lang.String getTerm()
          Method getTerm retrieves the Variant term
 java.util.Vector getTokens()
          Method getTokens computes the tokens (if it has not already done so) of this term, and returns them.
static void main(java.lang.String[] argv)
          This is a test main, whose purpose is to test the functionality of each method developed for this class.
 void setDistance(int pDistance)
          Method setDistance
 void setFoundInFullTable()
          Method setFoundInFullTable
 void setFoundInNounAdjDerivationAndUniqAcronymFilteredTable()
          Method setFoundInNounAdjDerivationAndUniqAcronymFilteredTable
 void setFoundInNounAdjDerivationFilteredTable()
          Method setFoundInNounAdjDerivationFilteredTable
 void setFoundInUniqAcronymsFilteredTable()
          Method setFoundInUniqAcronymsFilteredTable
 void setHistory(java.lang.String pHistory)
          Method setHistory
 void setNormalizedTerm(java.lang.String pTerm)
          Method setNormalizedTerm sets the Variant term, lowercased, devoid of punctuation - this is supposed to match the normalization that was done to create the normalized string, string sui table.
 void setParent(LexicalElement pParent)
          Method setParent preserves the link from the original LexicalElement to this variant
 void setPhraseCharSpan(Span pSpan)
          Method setPhraseCharSpan
 void setPhrasePosition(int pPos)
          setPhrasePosition records the position of this variant (as a lexical Element) within the phrase.
 void setPhraseTokenPosition(int pPos)
          Method setPhraseTokenPosition.
 void setPhraseWordSpan(Span pSpan)
          Method setPhraseWordSpan
 void setSpan(Span pSpan)
          Method setSpan
 void setTokens(java.util.Vector pTokens)
          Method setTokens adds the set of tokens that make up this variant
 byte[] toBytes()
          Method toBytes returns this variant as a byte array: ten ints (term cat, variant cat, distance, FullVar, VarAN, VarANU, VarU, #term, #var, #hist) followed by the term (n bytes), the variant (m bytes), and the history (p bytes).
 java.lang.String toFormatedString()
          Method toFormatedString returns this variant in the format used to load it into the db's variant table.
 java.lang.String toMetaMapVarsRow()
          Method toMetaMapVarsRow returns this variant formatted as a row of one of MetaMap's Vars tables.
 java.lang.String toPipedString()
          Method toPipedString returns a pipe-delimited fielded representation of some of the attributes of this variant.
 java.lang.String toString()
          Method toString
 
Methods inherited from class gov.nih.nlm.nls.nlp.textfeatures.MmObject
appendOriginalString, getCharOffset, getId, getLabel, getOriginalString, getSpan, getStrippedString, getTrimmedString, setId, setLabel, setOriginalString, setSpan, setStrippedString, setTrimmedString
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

NO_OPERATION_COST

public static final int NO_OPERATION_COST
See Also:
Constant Field Values

SPELLING_VARIANT_COST

public static final int SPELLING_VARIANT_COST
See Also:
Constant Field Values

INFLECTION_COST

public static final int INFLECTION_COST
See Also:
Constant Field Values

SYNONYM_COST

public static final int SYNONYM_COST
See Also:
Constant Field Values

ACRONYM_COST

public static final int ACRONYM_COST
See Also:
Constant Field Values

ACRONYM_EXPANSION_COST

public static final int ACRONYM_EXPANSION_COST
See Also:
Constant Field Values

DERIVATION_COST

public static final int DERIVATION_COST
See Also:
Constant Field Values

NO_OPERATION

public static final char NO_OPERATION
See Also:
Constant Field Values

SPELLING_VARIANT

public static final char SPELLING_VARIANT
See Also:
Constant Field Values

INFLECTION

public static final char INFLECTION
See Also:
Constant Field Values

SYNONYM

public static final char SYNONYM
See Also:
Constant Field Values

ACRONYM

public static final char ACRONYM
See Also:
Constant Field Values

ACRONYM_EXPANSION

public static final char ACRONYM_EXPANSION
See Also:
Constant Field Values

DERIVATION

public static final char DERIVATION
See Also:
Constant Field Values
Constructor Detail

Variant

public Variant(java.lang.String pVariant,
               int pCat,
               int pInfl,
               java.lang.String pRuleOrFact,
               java.lang.String pHistory,
               Span pSpan,
               int pVariantId)
This is a constructor for Variant

Parameters:
pVariant - The variant of the lexicalElement
pCat - The category of this variant.
pInfl - The inflection of this variant.
pRuleOrFact - String indicating whether this variant came from a rule or a fact, i.e., RULE|FACT
pHistory - The history of how this variant came to be from the original term. The history will be composed of the following:
n None
y Synonym
d Derivation
A Acronym
a Acronym expansion
s Spelling variant
pSpan - The span of the original term.
pVariantId - The id of the variant of this lexical element
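
The history codes listed above can be decoded one character at a time. The following is a minimal sketch and not part of this class: HistoryCodes, name, and describe are hypothetical names, and the sketch assumes the history string holds exactly one code letter per operation. Only the codes documented above are handled; the letter used for inflections is not listed on this page, so it is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Variant API.
final class HistoryCodes {

    // Maps one documented history code to its name.
    static String name(char code) {
        switch (code) {
            case 'n': return "None";
            case 'y': return "Synonym";
            case 'd': return "Derivation";
            case 'A': return "Acronym";
            case 'a': return "Acronym expansion";
            case 's': return "Spelling variant";
            default:
                throw new IllegalArgumentException("Undocumented history code: " + code);
        }
    }

    // Assumption: the history holds one code letter per operation, in order.
    static List<String> describe(String history) {
        List<String> names = new ArrayList<String>();
        for (int i = 0; i < history.length(); i++) {
            names.add(name(history.charAt(i)));
        }
        return names;
    }

    public static void main(String[] args) {
        // A derivation followed by an acronym.
        System.out.println(describe("dA"));
    }
}
```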

Variant

public Variant(java.lang.String pVariant,
               int pCat,
               int pDistance,
               java.lang.String pHistory,
               Span pSpan,
               Span pPhraseCharSpan,
               Span pPhraseWordSpan,
               int pVariantId)
This is a constructor for Variant

Parameters:
pVariant - The variant of the lexicalElement
pCat - The category of this variant.
pDistance - Precomputed distance from the history
pHistory - The history of how this variant came to be from the original term. The history will be composed of the following:
n None
y Synonym
d Derivation
A Acronym
a Acronym expansion
s Spelling variant
pSpan - The span of the original term.
pPhraseCharSpan -
pPhraseWordSpan -
pVariantId - The id of the variant of this lexical element

Variant

public Variant(java.lang.String pOrigTerm,
               int pOrigCat,
               java.lang.String pVariant,
               int pCat,
               int pInfl,
               java.lang.String pRuleOrFact,
               java.lang.String pHistory,
               int pDistance,
               int pFoundInFullTable,
               int pFoundInNounAdjDerivationFilteredTable,
               int pFoundInNounAdjDerivationAndUniqAcronymFilteredTable,
               int pFoundInUniqAcronymsFilteredTable)
This is a constructor for Variant used for variant generation. Variants created using this constructor have no span. This variant is a mirror of the information in the vars table.

Parameters:
pOrigTerm - The original term.
pOrigCat - The category of the original term.
pVariant - The variant of the lexicalElement
pCat - The category of this variant.
pInfl - The inflection of this variant.
pRuleOrFact - String indicating whether this variant came from a rule or a fact, i.e., RULE|FACT
pHistory - The history of how this variant came to be from the original term. The history will be composed of the following:
n None
y Synonym
d Derivation
A Acronym
a Acronym expansion
s Spelling variant
pDistance - Precomputed distance from the history
pFoundInFullTable - 1 if in the full Table
pFoundInNounAdjDerivationFilteredTable - 1 if in the table that filters out non noun/adj deriv. variants. 0 otherwise.
pFoundInNounAdjDerivationAndUniqAcronymFilteredTable - 1 if in the table that filters out non noun/adj deriv. and non unique acronym/acronym expansion variants. 0 otherwise.
pFoundInUniqAcronymsFilteredTable - 1 if in the table that filters out non unique acronym/acronym expansion variants. 0 otherwise.
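
Since pDistance is described as precomputed from the history, one plausible reading is that the distance is the sum of a per-operation cost over the history codes. The sketch below illustrates that reading only. HistoryDistance is a hypothetical name, summation is an assumption, and the cost values are placeholders: the real values are the *_COST constants, which this page does not list.

```java
// Hypothetical sketch, not part of the Variant API.
final class HistoryDistance {

    // Placeholder costs for illustration only. The real values are the
    // *_COST constants of Variant (see Constant Field Values).
    static int cost(char code) {
        switch (code) {
            case 'n': return 0; // None
            case 's': return 0; // Spelling variant
            case 'y': return 2; // Synonym
            case 'A': return 2; // Acronym
            case 'a': return 2; // Acronym expansion
            case 'd': return 3; // Derivation
            default:
                throw new IllegalArgumentException("Undocumented history code: " + code);
        }
    }

    // Assumption: the distance is the sum of the costs of all operations.
    static int distance(String history) {
        int total = 0;
        for (int i = 0; i < history.length(); i++) {
            total += cost(history.charAt(i));
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(distance("dA"));
    }
}
```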

Variant

public Variant(java.lang.String pRow)
This is a constructor for Variant used to create the fullVars table from MetaMap's vars, varsan, varsanu, and varsu tables. It takes as a parameter a line from one of these files.

Parameters:
pRow - A row of the format: key|input cat|variant|output cat|history|[]
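
To make the row format concrete, here is a small sketch of splitting such a row into its fields. It is not the constructor's actual code: VarsRow and parse are hypothetical names, the sample row is made up, and the categories are kept as strings because this page does not say how they are encoded in the file.

```java
// Hypothetical sketch, not part of the Variant API.
final class VarsRow {
    final String key;
    final String inputCat;
    final String variant;
    final String outputCat;
    final String history;

    private VarsRow(String key, String inputCat, String variant,
                    String outputCat, String history) {
        this.key = key;
        this.inputCat = inputCat;
        this.variant = variant;
        this.outputCat = outputCat;
        this.history = history;
    }

    // Splits a row of the form key|input cat|variant|output cat|history|[]
    static VarsRow parse(String row) {
        // The pipe must be escaped in the regex; limit -1 keeps empty trailing fields.
        String[] fields = row.split("\\|", -1);
        if (fields.length < 5) {
            throw new IllegalArgumentException("Expected at least 5 fields: " + row);
        }
        return new VarsRow(fields[0], fields[1], fields[2], fields[3], fields[4]);
    }

    public static void main(String[] args) {
        // Made-up row; real rows come from the MetaMap vars files.
        VarsRow row = parse("heart|noun|cardiac|adj|d|[]");
        System.out.println(row.variant + " <- " + row.key + " via " + row.history);
    }
}
```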
Method Detail

setTokens

public void setTokens(java.util.Vector pTokens)
Method setTokens adds the set of tokens that make up this variant

Parameters:
pTokens - a Vector of Token

setParent

public void setParent(LexicalElement pParent)
Method setParent preserves the link from the original LexicalElement to this variant

Parameters:
pParent -

addCategory

public void addCategory(int pCategory)
Method addCategory

Parameters:
pCategory -

setHistory

public void setHistory(java.lang.String pHistory)
Method setHistory

Parameters:
pHistory -

getHistory

public java.lang.String getHistory()
Method getHistory

Returns:
String

setDistance

public void setDistance(int pDistance)
Method setDistance

Parameters:
pDistance -

setSpan

public void setSpan(Span pSpan)
Method setSpan

Parameters:
pSpan -

setPhraseCharSpan

public void setPhraseCharSpan(Span pSpan)
Method setPhraseCharSpan

Parameters:
pSpan -

setPhraseWordSpan

public void setPhraseWordSpan(Span pSpan)
Method setPhraseWordSpan

Parameters:
pSpan -

getDistance

public int getDistance()
Method getDistance

Returns:
int

getTerm

public java.lang.String getTerm()
Method getTerm retrieves the Variant term

Returns:
String

getNormalizedTerm

public java.lang.String getNormalizedTerm()
Method getNormalizedTerm retrieves the Variant term, lowercased, devoid of punctuation - this is supposed to match the normalization that was done to create the normalized string, string sui table.

Returns:
String
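
The kind of normalization described here (lowercasing and removing punctuation) can be sketched as follows. This is an illustration only: TermNormalizer is a hypothetical name, and replacing punctuation with a space and collapsing whitespace are assumptions. The normalization actually used to build the table may differ.

```java
import java.util.Locale;

// Hypothetical sketch, not part of the Variant API.
final class TermNormalizer {

    static String normalize(String term) {
        // Lowercase independently of the default locale.
        String lower = term.toLowerCase(Locale.ROOT);
        // Assumption: ASCII punctuation becomes a space rather than being dropped.
        String noPunct = lower.replaceAll("\\p{Punct}", " ");
        // Collapse runs of whitespace left behind by the previous step.
        return noPunct.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(normalize("Heart Attack, NOS."));
    }
}
```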

setNormalizedTerm

public void setNormalizedTerm(java.lang.String pTerm)
Method setNormalizedTerm sets the Variant term, lowercased, devoid of punctuation - this is supposed to match the normalization that was done to create the normalized string, string sui table.

Parameters:
pTerm -

getCategories

public int getCategories()
Method getCategories returns the category(s) of this variant

Returns:
int an int enumerated from the Category class

getInflections

public int getInflections()
Method getInflections returns the inflection(s) of this variant.

Returns:
int an int enumerated from the Inflection class

getTokens

public java.util.Vector getTokens()
Method getTokens computes the tokens (if it has not already done so) of this term, and returns them.

Returns:
Vector of Tokens

display

public java.lang.String display(gov.nih.nlm.nls.utils.GlobalBehavior settings)
Method display returns a string of the variant at the level of detail set by the global behaviors. By default, that is just the trimmed String.

Parameters:
settings -
Returns:
String

toPipedString

public java.lang.String toPipedString()
Method toPipedString returns a pipe-delimited fielded representation of some of the attributes of this variant. The fielded output is of the following form: Variant|Vid|string|cat|beginChar|endChar|position in phrase|distance|history

Returns:
String
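
As an illustration of the fielded form, the sketch below joins nine fields with pipes. PipedFields and toPiped are hypothetical names, and the sketch assumes the first field is the literal label "Variant" and that the numeric fields are written as plain decimal integers. The real method may format them differently.

```java
// Hypothetical sketch, not part of the Variant API.
final class PipedFields {

    // Builds Variant|Vid|string|cat|beginChar|endChar|position in phrase|distance|history
    static String toPiped(int vid, String term, int cat, int beginChar,
                          int endChar, int phrasePosition, int distance,
                          String history) {
        return String.join("|",
                "Variant", // assumption: a literal label
                String.valueOf(vid),
                term,
                String.valueOf(cat),
                String.valueOf(beginChar),
                String.valueOf(endChar),
                String.valueOf(phrasePosition),
                String.valueOf(distance),
                history);
    }

    public static void main(String[] args) {
        // Made-up values for illustration.
        System.out.println(toPiped(7, "cardiac", 2, 10, 17, 1, 3, "d"));
    }
}
```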

toString

public java.lang.String toString()
Method toString

Overrides:
toString in class MmObject
Returns:
String

setPhrasePosition

public void setPhrasePosition(int pPos)
setPhrasePosition records the position of this variant (as a lexical Element) within the phrase. Note: This is a lexical element counter, not a word counter.

Parameters:
pPos - Position of the lexical Element in the phrase.

getPhrasePosition

public int getPhrasePosition()
getPhrasePosition gets the lexical element position of this variant within the phrase.

Returns:
int Position of the lexical Element in the phrase.

setPhraseTokenPosition

public void setPhraseTokenPosition(int pPos)
Method setPhraseTokenPosition. The phraseTokenPosition is the position of this variant (as if this variant equates to a token) in the phrase.

Parameters:
pPos -

getPhraseTokenPosition

public int getPhraseTokenPosition()
Method getPhraseTokenPosition

Returns:
int of the phraseTokenPosition of this variant.

getPhraseWordSpan

public Span getPhraseWordSpan()
Method getPhraseWordSpan

Returns:
Span

getPhraseCharSpan

public Span getPhraseCharSpan()
Method getPhraseCharSpan

Returns:
Span

foundInFullTable

public int foundInFullTable()
Method foundInFullTable

Returns:
int 1 if this variant is in the full variants table, 0 if it is not (not likely in this case.)

setFoundInFullTable

public void setFoundInFullTable()
Method setFoundInFullTable


foundInNounAdjDerivationFilteredTable

public int foundInNounAdjDerivationFilteredTable()
Method foundInNounAdjDerivationFilteredTable returns 1 if this variant is found in the filtered variants table that filters out all those variants that have derivations that are non noun/adjective pairs.

Returns:
int 1 if this variant is in this filtered variants table, 0 if it is not

setFoundInNounAdjDerivationFilteredTable

public void setFoundInNounAdjDerivationFilteredTable()
Method setFoundInNounAdjDerivationFilteredTable


foundInNounAdjDerivationAndUniqAcronymFilteredTable

public int foundInNounAdjDerivationAndUniqAcronymFilteredTable()
Method foundInNounAdjDerivationAndUniqAcronymFilteredTable returns 1 if this variant is found in the filtered variants table that filters out all those variants that have derivations that are non noun/adjective pairs or that have non-unique acronyms or acronym expansions.

Returns:
int 1 if this variant is in this filtered variants table, 0 if it is not

setFoundInNounAdjDerivationAndUniqAcronymFilteredTable

public void setFoundInNounAdjDerivationAndUniqAcronymFilteredTable()
Method setFoundInNounAdjDerivationAndUniqAcronymFilteredTable


containsNoDerivations

public boolean containsNoDerivations()
Method containsNoDerivations returns true if this variant is derivation free. That is, this variant is a spelling variant, synonym, acronym, acronym expansion, or inflection, but not a derivation or a variant of a derivation.

Returns:
boolean

containsNoAcronymsOrAbbreviations

public boolean containsNoAcronymsOrAbbreviations()
Method containsNoAcronymsOrAbbreviations returns true if this variant is acronym and abbreviation free. That is, this variant is a spelling variant, synonym, derivation, or inflection, and involves no acronyms, abbreviations, their expansions, or variants based on them.

Returns:
boolean
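
Both checks above can be pictured as scans of the history string for particular codes. The sketch below assumes exactly that, using the codes documented for the constructors (d for derivation, A for acronym, a for acronym expansion). HistoryChecks is a hypothetical name, and the class itself may implement these checks differently.

```java
// Hypothetical sketch, not part of the Variant API.
final class HistoryChecks {

    // True when no derivation step ('d') appears anywhere in the history.
    static boolean containsNoDerivations(String history) {
        return history.indexOf('d') < 0;
    }

    // True when neither an acronym ('A') nor an acronym expansion ('a') appears.
    static boolean containsNoAcronymsOrAbbreviations(String history) {
        return history.indexOf('A') < 0 && history.indexOf('a') < 0;
    }

    public static void main(String[] args) {
        System.out.println(containsNoDerivations("ys"));
        System.out.println(containsNoAcronymsOrAbbreviations("dA"));
    }
}
```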

foundInUniqAcronymsFilteredTable

public int foundInUniqAcronymsFilteredTable()
Method foundInUniqAcronymsFilteredTable returns 1 if this variant is found in the filtered variants table that filters out all those variants that have non-unique acronyms or acronym expansions

Returns:
int 1 if this variant is in this filtered variants table, 0 if it is not

setFoundInUniqAcronymsFilteredTable

public void setFoundInUniqAcronymsFilteredTable()
Method setFoundInUniqAcronymsFilteredTable


getOrigTerm

public java.lang.String getOrigTerm()
Method getOrigTerm

Returns:
String

getParent

public LexicalElement getParent()
Method getParent returns the LexicalElement that this variant came from.

Returns:
LexicalElement

getOrigCat

public int getOrigCat()
Method getOrigCat

Returns:
int

toFormatedString

public java.lang.String toFormatedString()
Method toFormatedString returns this variant in the format used to load it into the db's variant table. This method is called by the generateVariants.VariantStore.add method, which is called by the GenerateMMVariants program/class.

Returns:
String

toMetaMapVarsRow

public java.lang.String toMetaMapVarsRow()
Method toMetaMapVarsRow returns this variant formatted as a row of one of MetaMap's Vars tables.

Returns:
String

equals

public boolean equals(java.lang.Object pVariant)
Method equals

Overrides:
equals in class java.lang.Object
Parameters:
pVariant -
Returns:
boolean

fromTableRow

public static Variant fromTableRow(java.lang.String pRow)
                            throws java.lang.Exception
Method fromTableRow creates a variant from a row from the fullVars table.

Parameters:
pRow -
Returns:
Variant
Throws:
java.lang.Exception

toBytes

public byte[] toBytes()
               throws java.lang.Exception
Method toBytes returns this variant as a byte array with the following layout:

Field  Type     Content
0      int      term cat
1      int      variant cat
2      int      distance
3      int      FullVar
4      int      VarAN
5      int      VarANU
6      int      VarU
7      int      #term
8      int      #var
9      int      #hist
10     n bytes  term
11     m bytes  variant
12     p bytes  history

Returns:
byte[]
Throws:
java.lang.Exception

fromBytes

public static Variant fromBytes(byte[] pBytes)
                         throws java.lang.Exception
Method fromBytes creates a variant from a byte array with the following layout:

Field  Type     Content
0      int      term cat
1      int      variant cat
2      int      distance
3      int      FullVar
4      int      VarAN
5      int      VarANU
6      int      VarU
7      int      #term
8      int      #var
9      int      #hist
10     n bytes  term
11     m bytes  variant
12     p bytes  history

Parameters:
pBytes -
Returns:
Variant
Throws:
java.lang.Exception
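
The byte layout shared by toBytes and fromBytes (ten ints followed by three variable-length byte strings) can be sketched with java.nio.ByteBuffer. This is an illustration under stated assumptions, not the class's actual code: VariantBytes, encode, and decodeStrings are hypothetical names, ints are assumed to be 4 bytes in big-endian order, strings are assumed to be UTF-8, and #term, #var, and #hist are assumed to be the byte lengths of the three strings.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not part of the Variant API.
final class VariantBytes {

    // Writes fields 0-9 as ints, then the term, variant, and history bytes.
    static byte[] encode(int termCat, int variantCat, int distance,
                         int fullVar, int varAN, int varANU, int varU,
                         String term, String variant, String history) {
        byte[] t = term.getBytes(StandardCharsets.UTF_8);
        byte[] v = variant.getBytes(StandardCharsets.UTF_8);
        byte[] h = history.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(10 * 4 + t.length + v.length + h.length);
        buf.putInt(termCat).putInt(variantCat).putInt(distance);
        buf.putInt(fullVar).putInt(varAN).putInt(varANU).putInt(varU);
        // Assumption: fields 7-9 hold the byte lengths of the three strings.
        buf.putInt(t.length).putInt(v.length).putInt(h.length);
        buf.put(t).put(v).put(h);
        return buf.array();
    }

    // Reads back fields 10-12 (term, variant, history) using the lengths in fields 7-9.
    static String[] decodeStrings(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        buf.position(7 * 4); // skip fields 0-6
        byte[] t = new byte[buf.getInt()];
        byte[] v = new byte[buf.getInt()];
        byte[] h = new byte[buf.getInt()];
        buf.get(t);
        buf.get(v);
        buf.get(h);
        return new String[] {
            new String(t, StandardCharsets.UTF_8),
            new String(v, StandardCharsets.UTF_8),
            new String(h, StandardCharsets.UTF_8)
        };
    }

    public static void main(String[] args) {
        // Made-up values for illustration.
        byte[] bytes = encode(1, 2, 3, 1, 0, 0, 1, "heart", "cardiac", "d");
        String[] parts = decodeStrings(bytes);
        System.out.println(parts[0] + "|" + parts[1] + "|" + parts[2]);
    }
}
```

Storing the three lengths ahead of the string data lets a reader locate each string without delimiters, which is presumably why the layout carries the #term, #var, and #hist fields.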

main

public static final void main(java.lang.String[] argv)
This is a test main, whose purpose is to test the functionality of each method developed for this class. This main strives to test the boundary conditions as well as some sample common ways each public method is intended to be used.

Parameters:
argv - The command line input, tokenized


The use and distribution of this material is subject to the terms and conditions included in the file SPECIALIST_NLP_TOOLS_TERMS_AND_CONDITIONS.TXT, located in the root directory of the distribution.