gov.nih.nlm.nls.gspell
Class Options

java.lang.Object
  extended by gov.nih.nlm.nls.gspell.Cfg
      extended by gov.nih.nlm.nls.gspell.Options

public class Options
extends Cfg

Options retrieves options from the command line.


Field Summary
static int EXPORT
           
static int FIND
           
static int HELP
           
static int INDEX
           
static int UPDATE
           
static int VERSION
           
 
Constructor Summary
Options()
          This is a constructor for Options. This constructor takes no options.
Options(java.io.File pDictionary)
          This is a constructor for Options
Options(int pMode, java.lang.String pDictionaryName, java.lang.String pDictionaryDir)
          This is a constructor for Options. Use this constructor for indexing and updating from within your applications.
Options(int pMode, java.lang.String pDictionaryName, java.lang.String pDictionaryDir, int pTruncate, int pConsiderNCandidates, int pMaxEditDistance)
          This is a constructor for Options.
Options(java.lang.String pDictionaryName)
          This is a constructor for Options
Options(java.lang.String[] argv)
          This is a constructor for Options
Options(java.lang.String pDictionaryDir, java.lang.String pDictionaryName)
          This is a constructor for Options
Options(java.lang.String pDictionaryName, java.lang.String pDictionaryDir, int pTruncate, int pConsiderNCandidates, int pMaxEditDistance)
          This is a constructor for Options.
 
Method Summary
(package private)  java.lang.String _makeDir(java.lang.String pDirectory)
          Method _makeDir. This method creates a dictionaries/dictName directory.
 java.lang.String get(java.lang.String pAttribute)
          get retrieves the value of the requested attribute. This method returns null for attributes that do not exist.
 int getAction()
          Method getAction retrieves the type of action.
 java.lang.String getAspellMode()
          Method getAspellMode retrieves the aspell spelling suggestion mode (ultra|fast|normal|bad-spellers). For aspell only.
 int getCacheSize()
          Method getCacheSize gets an estimate of how big to set the cache to hold the grams.
 int getConsiderNCandidates()
          Method getConsiderNCandidates reports how many candidates to evaluate when processing a query.
 java.lang.String getCorpusName()
          Method getCorpusName retrieves the name of the corpus to retrieve frequency info from
 int getCorrectField()
          Method getCorrectField reports the field to pick up the "correct" term from.
 java.lang.String getDictionaryDir()
          Method getDictionaryDir retrieves the full path of the dictionary directory
 java.lang.String getDictionaryName()
          Method getDictionaryName retrieves the name of the dictionary
(package private)  java.lang.String getDictionaryPath()
          Method getDictionaryPath
(package private)  java.lang.String getDictionaryPath(java.lang.String pDictionaryName)
          Method getDictionaryPath
 boolean getFieldedText()
          Method getFieldedText
 FindOptions getFindOptions()
          Method getFindOptions returns a FindOptions Object with options initially set from the global options.
 java.lang.String getInputFileName()
          Method getInputFileName reports the name of the input file.
 java.io.BufferedReader getInputFileReader()
          Method getInputFileReader opens and retrieves the inputFileStream given the option --inputFile=xxxxx.
 int getMaxEditDistance()
          Method getMaxEditDistance reports back the edit distance threshold set to weed out candidates from consideration when the distance is greater than or equal to this threshold.
 int getMaxReferences()
          Method getMaxReferences retrieves the gram with the largest number of documents associated with it, and returns the number of documents of this gram.
 int getMode()
          Method getMode
 int getNumberOfDocuments()
          Method getNumberOfDocuments gets the number of words in the dictionary.
 java.lang.String getOutputFileName()
          Method getOutputFileName gets the name of the output file.
 java.io.PrintWriter getOutputFileWriter()
          Method getOutputFileWriter.
 boolean getReportTime()
          Method getReportTime retrieves whether or not to keep and report the time it takes to process a query.
 boolean getStrict()
          Method getStrict returns true if the --strict option has been set.
 int getTermField()
          Method getTermField retrieves the field from the fielded text to pick the query term from.
 int getTruncate()
          Method getTruncate reports the number of candidates to return
 boolean getWordLengthHeuristic()
          Method getWordLengthHeuristic gets whether or not to use the wordLength heuristic
 void makeDir(java.lang.String pDirectoryName)
          Method makeDir. This method creates a dictionaries/dictName directory and sets the dictionaryDir.
 void read()
          Method read
(package private)  void read(java.lang.String pDictionaryName)
          Method read. This method takes an absolute path to the directory where the dictionary resides. Otherwise it searches the classpath for directories named "dictionaries" and looks under them for the dictionary directory and a config file named "dictionary".cfg.
(package private)  void read(java.lang.String pDictionaryDir, java.lang.String pDictionaryName)
          Method read. This method takes an absolute path to the directory where the dictionary resides. Otherwise it searches the classpath for directories named "dictionaries" and looks under them for the dictionary directory and a config file named "dictionary".cfg.
 void readCommandLineArguments(java.lang.String[] argv)
          Method readCommandLineArguments retrieves these options. Actions: --help --index --update --find --export --reportTime --version. Options: --dictionary=XXXXX --inputFile=YYYYY --outputFile=ZZZZZ --truncate=N --considerNCandidates=N --maxEditDistance=N --fieldedText --termField=X [ The first field is 1 not 0 ] --correctField=Y --wordSizeHeuristic=true|false --casheSize=Z --strict
 void save()
          Method save
(package private)  void save(java.lang.String pDictionaryName)
          Save.
(package private)  void set(java.lang.String pKey, java.lang.String pValue)
          Set.
 void setAspellMode(java.lang.String pAspellMode)
          Method setAspellMode sets the aspell spelling suggestion mode (ultra|fast|normal|bad-spellers). For aspell only.
 void setConsiderNCandidates(int pMaxCandidates)
          Method setConsiderNCandidates sets the number of candidates to evaluate when processing a query.
 void setCorpusName(java.lang.String pName)
          Method setCorpusName sets the name of the corpus to retrieve frequency info from
 void setCorrectField(int pField)
          Method setCorrectField sets the field to pick up the "correct" term from.
 void setDictionaryDir(java.lang.String pDictionaryDir)
          Method setDictionaryDir sets the name of the dictionary directory.
 void setDictionaryName(java.lang.String pDictionaryName)
          Method setDictionaryName sets the name of the dictionary.
(package private)  void setDictionaryPath(java.lang.String pDictionaryDir)
          Method setDictionaryPath
 void setFieldedText(boolean pVal)
          Method setFieldedText tells the options that the input text is fielded
 void setInputFileName(java.lang.String pFileName)
          Method setInputFileName sets the name of the input file.
 void setInputFileReader(java.io.BufferedReader pReader)
          Method setInputFileReader sets the inputFileStream
 void setMaxEditDistance(int pMaxEditDistance)
          Method setMaxEditDistance sets a threshold to weed out candidates from consideration when the distance is greater than or equal to this number.
 void setMode(int pMode)
          Method setMode sets the type of mode gspell uses: READ_ONLY, READ_WRITE, WRITE_ONLY
 void setOption(java.lang.String pKey, java.lang.String pVal)
          Method setOption
 void setOutputFileName(java.lang.String pFileName)
          Method setOutputFileName sets the name of the output file.
 void setOutputFileWriter(java.io.PrintWriter pStream)
          Method setOutputFileWriter.
 void setReportTime(boolean pVal)
          Method setReportTime sets whether or not to keep and report the time it takes to process a query.
 void setTermField(int pField)
          Method setTermField tells the options the field from the fielded text to pick the query term from.
 void setTruncate(int pTruncate)
          Method setTruncate sets the number of candidates to return
 void setWordLengthHeuristic(boolean pVal)
          Method setWordLengthHeuristic sets whether or not to use the wordLength heuristic
 java.lang.String toString()
          Method toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

HELP

public static final int HELP
See Also:
Constant Field Values

VERSION

public static final int VERSION
See Also:
Constant Field Values

INDEX

public static final int INDEX
See Also:
Constant Field Values

UPDATE

public static final int UPDATE
See Also:
Constant Field Values

FIND

public static final int FIND
See Also:
Constant Field Values

EXPORT

public static final int EXPORT
See Also:
Constant Field Values
Constructor Detail

Options

public Options()
        throws GSpellException
This is a constructor for Options. This constructor takes no options. Make sure the following are set at some point:
  this.setMode(int pMode)
  this.setDictionaryName(String pDictionaryName) (and/or) this.setDictionaryDir(String pDictionaryDir)
For find options:
  setOutputFileWriter(PrintWriter pStream) (for Find, Export only)
  setTruncate(int pTruncate)
  setConsiderNCandidates(int pMaxCandidates)
  setMaxEditDistance(int pMaxEditDistance)

Throws:
GSpellException

Options

public Options(int pMode,
               java.lang.String pDictionaryName,
               java.lang.String pDictionaryDir)
        throws GSpellException
This is a constructor for Options. Use this constructor for indexing and updating from within your applications.

Parameters:
pMode - this should be set to GSpell.CREATE or GSpell.READ_WRITE
pDictionaryName - This is the name of the dictionary. This needs to be set. This name is both the name of the created dictionary directory and this name is prepended to each index file deposited in the created directory.
pDictionaryDir - This is the name leading up to the dictionary directory path. By default, the programs that use these options look in the classpath for a "dictionaries" subdirectory. Set this pDictionaryDir only when the dictionaries are not being placed this way.
Throws:
GSpellException

Options

public Options(java.lang.String pDictionaryName,
               java.lang.String pDictionaryDir,
               int pTruncate,
               int pConsiderNCandidates,
               int pMaxEditDistance)
        throws GSpellException
This is a constructor for Options. Use this constructor for find operations.

Parameters:
pDictionaryName - This is the name of the dictionary. This needs to be set. This name is both the name of the created dictionary directory and this name is prepended to each index file deposited in the created directory.
pDictionaryDir - This is the name leading up to the dictionary directory path. By default, the programs that use these options look in the classpath for a "dictionaries" subdirectory. Set this pDictionaryDir only when the dictionaries are not being placed this way.
pTruncate - This indicates how many candidates you want returned. If you don't have a preference, set this to -1. The default is 10.
pConsiderNCandidates - This indicates how many candidates to compare. On a fast PC, 3000 is a good number to use; on a slower Sun machine, use a number under 1000. If you don't have a preference, set this to -1. The default is 3000.
pMaxEditDistance - Candidates whose edit distance is evaluated to be larger than this value are thrown out from consideration. If you don't have a preference, set this to -1. The default is 4.
Throws:
GSpellException
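
The -1 "no preference" convention described for pTruncate, pConsiderNCandidates, and pMaxEditDistance can be pictured with a minimal sketch. The class name FindDefaultsSketch and the method resolve are hypothetical and not part of gspell; only the default values 10, 3000, and 4 come from the parameter descriptions above. How the library resolves -1 internally is not documented here.

```java
// Hypothetical helper, not part of gspell: resolves the -1 "no preference" sentinel.
final class FindDefaultsSketch {
    // Default values as stated in the parameter descriptions above.
    static final int DEFAULT_TRUNCATE = 10;
    static final int DEFAULT_CONSIDER_N_CANDIDATES = 3000;
    static final int DEFAULT_MAX_EDIT_DISTANCE = 4;

    /** Returns the fallback when the caller passed -1, otherwise the value itself. */
    static int resolve(int value, int fallback) {
        return value == -1 ? fallback : value;
    }
}
```

For example, resolve(-1, DEFAULT_TRUNCATE) yields 10, while resolve(25, DEFAULT_TRUNCATE) keeps the caller's 25.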

Options

public Options(int pMode,
               java.lang.String pDictionaryName,
               java.lang.String pDictionaryDir,
               int pTruncate,
               int pConsiderNCandidates,
               int pMaxEditDistance)
        throws GSpellException
This is a constructor for Options. Use this constructor for find and update/find operations.

Parameters:
pMode - GSpell.READ_ONLY|GSpell.READ_WRITE
pDictionaryName - This is the name of the dictionary. This needs to be set. This name is both the name of the created dictionary directory and this name is prepended to each index file deposited in the created directory.
pDictionaryDir - This is the name leading up to the dictionary directory path. By default, the programs that use these options look in the classpath for a "dictionaries" subdirectory. Set this pDictionaryDir only when the dictionaries are not being placed this way.
pTruncate - This indicates how many candidates you want returned. If you don't have a preference, set this to -1. The default is 10.
pConsiderNCandidates - This indicates how many candidates to compare. On a fast PC, 3000 is a good number to use; on a slower Sun machine, use a number under 1000. If you don't have a preference, set this to -1. The default is 3000.
pMaxEditDistance - Candidates whose edit distance is evaluated to be larger than this value are thrown out from consideration. If you don't have a preference, set this to -1. The default is 4.
Throws:
GSpellException

Options

public Options(java.lang.String[] argv)
        throws GSpellException
This is a constructor for Options

Parameters:
argv -
Throws:
GSpellException

Options

public Options(java.lang.String pDictionaryName)
        throws GSpellException
This is a constructor for Options

Parameters:
pDictionaryName - This is the name of the dictionary. This needs to be set. This name is both the name of the created dictionary directory and this name is prepended to each index file deposited in the created directory. By default, the programs that use these options look in the classpath for a "dictionaries" subdirectory.
Throws:
GSpellException

Options

public Options(java.io.File pDictionary)
        throws GSpellException
This is a constructor for Options

Parameters:
pDictionary -
Throws:
GSpellException

Options

public Options(java.lang.String pDictionaryDir,
               java.lang.String pDictionaryName)
        throws GSpellException
This is a constructor for Options

Parameters:
pDictionaryDir - This is the name leading up to the dictionary directory path. By default, the programs that use these options look in the classpath for a "dictionaries" subdirectory. Set this pDictionaryDir only when the dictionaries are not being placed this way.
pDictionaryName - This is the name of the dictionary. This needs to be set. This name is both the name of the created dictionary directory and this name is prepended to each index file deposited in the created directory.
Throws:
GSpellException
Method Detail

getAction

public int getAction()
Method getAction retrieves the type of action.

Returns:
int HELP|VERSION|INDEX|UPDATE|FIND|EXPORT

readCommandLineArguments

public void readCommandLineArguments(java.lang.String[] argv)
                              throws GSpellException
Method readCommandLineArguments retrieves these options:

Actions
  --help
  --index
  --update
  --find
  --export
  --reportTime
  --version
Options
  --dictionary=XXXXX
  --inputFile=YYYYY
  --outputFile=ZZZZZ
  --truncate=N
  --considerNCandidates=N
  --maxEditDistance=N
  --fieldedText
  --termField=X   [ The first field is 1 not 0 ]
  --correctField=Y
  --wordSizeHeuristic=true|false
  --casheSize=Z
  --strict

Parameters:
argv -
Throws:
GSpellException
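
The flags listed for readCommandLineArguments come in two shapes: bare flags such as --find and key=value pairs such as --truncate=N. The following is a minimal sketch of splitting such arguments into a map. ArgParseSketch and parse are hypothetical names, and storing bare flags as "true" is an assumption of this sketch; it is not the parser that Options uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the gspell command-line parser.
final class ArgParseSketch {
    /**
     * Splits each "--key=value" argument into a map entry.
     * A bare "--flag" is stored with the value "true".
     * Arguments that do not start with "--" are ignored in this sketch.
     */
    static Map<String, String> parse(String[] argv) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String arg : argv) {
            if (!arg.startsWith("--")) {
                continue;
            }
            String body = arg.substring(2);
            int eq = body.indexOf('=');
            if (eq < 0) {
                options.put(body, "true");
            } else {
                options.put(body.substring(0, eq), body.substring(eq + 1));
            }
        }
        return options;
    }
}
```

Given the arguments --find --dictionary=myDict --truncate=5, the map holds find, dictionary, and truncate as keys, and a lookup of an absent key such as strict returns null.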

getDictionaryName

public java.lang.String getDictionaryName()
Method getDictionaryName retrieves the name of the dictionary

Returns:
String

setDictionaryName

public void setDictionaryName(java.lang.String pDictionaryName)
Method setDictionaryName sets the name of the dictionary. The name of the dictionary needs to be set because its name is prepended to the name of each index file created.

Parameters:
pDictionaryName -

getDictionaryDir

public java.lang.String getDictionaryDir()
                                  throws GSpellException
Method getDictionaryDir retrieves the full path of the dictionary directory

Returns:
String
Throws:
GSpellException

setDictionaryDir

public void setDictionaryDir(java.lang.String pDictionaryDir)
Method setDictionaryDir sets the name of the dictionary directory. This method should only be called after setDictionaryName() has been called.

Parameters:
pDictionaryDir - This is the name leading up to the dictionary directory path. By default, the programs that use these options look in the classpath for a "dictionaries" subdirectory. Set this pDictionaryDir only when the dictionaries are not being placed this way.

getCorpusName

public java.lang.String getCorpusName()
Method getCorpusName retrieves the name of the corpus to retrieve frequency info from

Returns:
String

setCorpusName

public void setCorpusName(java.lang.String pName)
Method setCorpusName sets the name of the corpus to retrieve frequency info from


getInputFileReader

public java.io.BufferedReader getInputFileReader()
                                          throws GSpellException
Method getInputFileReader opens and retrieves the inputFileStream given the option --inputFile=xxxxx. If this option is not set, the default given back is standard input.

Returns:
BufferedReader
Throws:
GSpellException
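
The fallback to standard input described for getInputFileReader can be sketched as follows. InputReaderSketch and open are hypothetical names, and the UTF-8 encoding is an assumption of this sketch; the encoding gspell actually uses is not stated in this documentation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical illustration only; not the gspell implementation.
final class InputReaderSketch {
    /** Opens the named file, or wraps standard input when no name was given. */
    static BufferedReader open(String inputFileName) throws IOException {
        if (inputFileName == null) {
            // No --inputFile option: read queries from standard input.
            return new BufferedReader(new InputStreamReader(System.in, StandardCharsets.UTF_8));
        }
        return Files.newBufferedReader(Paths.get(inputFileName), StandardCharsets.UTF_8);
    }
}
```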

setInputFileReader

public void setInputFileReader(java.io.BufferedReader pReader)
                        throws GSpellException
Method setInputFileReader sets the inputFileStream

Parameters:
pReader -
Throws:
GSpellException

getOutputFileWriter

public java.io.PrintWriter getOutputFileWriter()
                                        throws GSpellException
Method getOutputFileWriter. This method returns the outputFile writer. This method will open for writing the file specified by --outputFile=XXXXX. If this option is not set, the default returned is standard output.

Returns:
PrintWriter
Throws:
GSpellException

setOutputFileWriter

public void setOutputFileWriter(java.io.PrintWriter pStream)
                         throws GSpellException
Method setOutputFileWriter. This method sets the outputFile writer.

Parameters:
pStream -
Throws:
GSpellException

getReportTime

public final boolean getReportTime()
Method getReportTime retrieves whether or not to keep and report the time it takes to process a query.

Returns:
boolean

setReportTime

public final void setReportTime(boolean pVal)
Method setReportTime sets whether or not to keep and report the time it takes to process a query.

Parameters:
pVal -

getTruncate

public final int getTruncate()
Method getTruncate reports the number of candidates to return

Returns:
int

setTruncate

public final void setTruncate(int pTruncate)
Method setTruncate sets the number of candidates to return

Parameters:
pTruncate -

getConsiderNCandidates

public final int getConsiderNCandidates()
Method getConsiderNCandidates reports how many candidates to evaluate when processing a query.

Returns:
int

setConsiderNCandidates

public final void setConsiderNCandidates(int pMaxCandidates)
Method setConsiderNCandidates sets the number of candidates to evaluate when processing a query. It is impractical to consider all candidates (all the words in a dictionary) against a query, so pruning heuristics limit the number of candidates considered. Considering more candidates gives better results but makes the process slower.

Parameters:
pMaxCandidates -

getMaxEditDistance

public final int getMaxEditDistance()
Method getMaxEditDistance reports back the edit distance threshold set to weed out candidates from consideration when the distance is greater than or equal to this threshold.

Returns:
int

setMaxEditDistance

public final void setMaxEditDistance(int pMaxEditDistance)
Method setMaxEditDistance sets a threshold to weed out candidates from consideration when the distance is greater than or equal to this number. An edit distance is a metric that measures the difference between two strings by counting the transformations required to turn one string into the other: inserting a character, removing a character, or swapping in one character for another. The currently implemented Levenshtein algorithm does not consider transposed characters (ie -> ei) as a cost of 1.

Parameters:
pMaxEditDistance -
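
To make the metric concrete, here is a minimal sketch of the classic Levenshtein distance, in which insertion, removal, and substitution each cost 1. EditDistanceSketch, distance, and withinThreshold are hypothetical names and this is not the gspell implementation. The threshold check follows the "greater than or equal" wording of this method's description, which is an assumption about how the comparison is made.

```java
// Hypothetical illustration only; not the gspell implementation.
final class EditDistanceSketch {
    /** Classic Levenshtein distance: insert, delete, and substitute each cost 1. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int substitution = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                curr[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            // The row just filled becomes the previous row for the next character of a.
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[b.length()];
    }

    /** Keeps a candidate only when its distance is below the threshold. */
    static boolean withinThreshold(String query, String candidate, int maxEditDistance) {
        return distance(query, candidate) < maxEditDistance;
    }
}
```

Under this metric a transposition such as "ie" to "ei" costs 2 (two substitutions), which is why "recieve" is at distance 2 from "receive".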

getFieldedText

public final boolean getFieldedText()
Method getFieldedText

Returns:
boolean True if the input is fielded

setFieldedText

public final void setFieldedText(boolean pVal)
Method setFieldedText tells the options that the input text is fielded

Parameters:
pVal - True if the input is fielded

getTermField

public final int getTermField()
Method getTermField retrieves the field from the fielded text to pick the query term from.

Returns:
int The field to pick up the term from

setTermField

public final void setTermField(int pField)
Method setTermField tells the options the field from the fielded text to pick the query term from.

Parameters:
pField - The field to pick up the term from

getCorrectField

public final int getCorrectField()
Method getCorrectField reports the field to pick up the "correct" term from. This field is used when both the bad and good term are passed in, for benchmarking purposes.

Returns:
int The field to pick up the correct term from.

setCorrectField

public final void setCorrectField(int pField)
Method setCorrectField sets the field to pick up the "correct" term from. This field is used when both the bad and good term are passed in, for benchmarking purposes.

Parameters:
pField - The field to pick up the correct term from.
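
The term and correct fields are numbered from 1, as noted for --termField under readCommandLineArguments. The sketch below shows one way to pick a 1-based field from a line of fielded text. FieldedTextSketch and field are hypothetical names, and the '|' delimiter is an assumption made purely for illustration; this documentation does not state which delimiter gspell expects.

```java
// Hypothetical illustration only; the '|' delimiter is an assumption.
final class FieldedTextSketch {
    /**
     * Returns the field at a 1-based position, or null when the position
     * is below 1 or the line has fewer fields than requested.
     */
    static String field(String line, int oneBasedField) {
        // The -1 limit keeps trailing empty fields instead of dropping them.
        String[] fields = line.split("\\|", -1);
        if (oneBasedField < 1 || oneBasedField > fields.length) {
            return null;
        }
        return fields[oneBasedField - 1];
    }
}
```

With a benchmarking line such as "recieve|receive", field 1 would be the query term and field 2 the correct term.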

getAspellMode

public final java.lang.String getAspellMode()
Method getAspellMode retrieves the aspell spelling suggestion mode (ultra|fast|normal|bad-spellers). For aspell only.

Returns:
String ultra|fast|normal|bad-spellers

setAspellMode

public final void setAspellMode(java.lang.String pAspellMode)
Method setAspellMode sets the aspell spelling suggestion mode (ultra|fast|normal|bad-spellers). For aspell only.

Parameters:
pAspellMode - "ultra|fast|normal|bad-spellers"

setMode

public final void setMode(int pMode)
Method setMode sets the type of mode gspell uses: READ_ONLY, READ_WRITE, WRITE_ONLY

Parameters:
pMode - GSpell.WRITE_ONLY|READ_WRITE|READ_ONLY

setWordLengthHeuristic

public final void setWordLengthHeuristic(boolean pVal)
Method setWordLengthHeuristic sets whether or not to use the wordLength heuristic

Parameters:
pVal -

getWordLengthHeuristic

public final boolean getWordLengthHeuristic()
Method getWordLengthHeuristic gets whether or not to use the wordLength heuristic


getMode

public final int getMode()
Method getMode

Returns:
int GSpell.WRITE_ONLY|READ_WRITE|READ_ONLY

setInputFileName

public final void setInputFileName(java.lang.String pFileName)
Method setInputFileName sets the name of the input file. This file should either be an absolute path or be in the current directory.

Parameters:
pFileName -

getInputFileName

public final java.lang.String getInputFileName()
Method getInputFileName reports the name of the input file.

Returns:
pFileName

setOutputFileName

public final void setOutputFileName(java.lang.String pFileName)
Method setOutputFileName sets the name of the output file. That is, the file where the candidates are reported to. This option only has meaning when finding candidates. This file should either be an absolute path or be in the current directory. The default is that this is null, and the output gets printed directly to standard output.

Parameters:
pFileName -

getOutputFileName

public final java.lang.String getOutputFileName()
Method getOutputFileName gets the name of the output file. That is, the file where the candidates are reported to. This option only has meaning when finding candidates. This file should either be an absolute path or be in the current directory. The default is that this is null, and the output gets printed directly to standard output.

Returns:
pFileName

getNumberOfDocuments

public final int getNumberOfDocuments()
Method getNumberOfDocuments gets the number of words in the dictionary. This number is calculated at index time and added to with each update, and stored in the configuration file.

Returns:
int

getCacheSize

public final int getCacheSize()
Method getCacheSize gets an estimate of how big to set the cache to hold the grams. The more that can fit in memory, the quicker the application goes. This makes the most difference for indexing.

Returns:
int

getMaxReferences

public final int getMaxReferences()
Method getMaxReferences retrieves the gram with the largest number of documents associated with it, and returns the number of documents of this gram. This sets the upper bound of the size of the bins needed to hold the gram and its document references.

Returns:
int

getStrict

public final boolean getStrict()
Method getStrict returns true if the --strict option has been set.

Returns:
boolean

setOption

public final void setOption(java.lang.String pKey,
                            java.lang.String pVal)
Method setOption

Parameters:
pKey -
pVal -

save

public final void save()
                throws GSpellException
Method save

Throws:
GSpellException

read

public final void read()
                throws GSpellException
Method read

Throws:
GSpellException

getFindOptions

public final FindOptions getFindOptions()
Method getFindOptions returns a FindOptions Object with options initially set from the global options.

Returns:
FindOptions

makeDir

public final void makeDir(java.lang.String pDirectoryName)
                   throws GSpellException
Method makeDir. This method creates a dictionaries/dictName directory and sets the dictionaryDir.

Throws:
GSpellException

toString

public final java.lang.String toString()
Method toString

Returns:
String

read

void read(java.lang.String pDictionaryName)
    throws java.lang.Exception
Method read. This method takes an absolute path to the directory where the dictionary resides. Otherwise it searches the classpath for directories named "dictionaries" and looks under them for the dictionary directory and a config file named "dictionary".cfg.

Parameters:
pDictionaryName -
Throws:
java.lang.Exception

read

void read(java.lang.String pDictionaryDir,
          java.lang.String pDictionaryName)
    throws java.lang.Exception
Method read. This method takes an absolute path to the directory where the dictionary resides. Otherwise it searches the classpath for directories named "dictionaries" and looks under them for the dictionary directory and a config file named "dictionary".cfg.

Parameters:
pDictionaryDir -
pDictionaryName -
Throws:
java.lang.Exception

get

public java.lang.String get(java.lang.String pAttribute)
get retrieves the value of the requested attribute. This method returns null for attributes that do not exist.

Parameters:
pAttribute -
Returns:
String

set

void set(java.lang.String pKey,
         java.lang.String pValue)
Set. This method sets the key=value pair in the properties.

Parameters:
pKey -
pValue -

save

void save(java.lang.String pDictionaryName)
    throws java.lang.Exception
Save. This method writes the properties file back out.

Parameters:
pDictionaryName -
Throws:
java.lang.Exception
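
The get, set, and save methods above describe key=value pairs held in "the properties" and written back to a file. A minimal sketch of that pattern follows, assuming a java.util.Properties backing store; the source does not confirm that Cfg is implemented this way. CfgPropertiesSketch, saveToString, and loadFromString are hypothetical names, and the sketch serializes to a string rather than to a "dictionary".cfg file to stay self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;

// Hypothetical illustration only; assumes a java.util.Properties backing store.
final class CfgPropertiesSketch {
    private final Properties properties = new Properties();

    /** Stores the key=value pair. */
    void set(String key, String value) {
        properties.setProperty(key, value);
    }

    /** Returns null when the attribute was never set. */
    String get(String attribute) {
        return properties.getProperty(attribute);
    }

    /** Serializes the pairs in the java.util.Properties text format. */
    String saveToString() throws IOException {
        StringWriter out = new StringWriter();
        properties.store(out, null);
        return out.toString();
    }

    /** Rebuilds a configuration from text produced by saveToString. */
    static CfgPropertiesSketch loadFromString(String text) throws IOException {
        CfgPropertiesSketch cfg = new CfgPropertiesSketch();
        cfg.properties.load(new StringReader(text));
        return cfg;
    }
}
```

A value set with set("truncate", "10") survives a save and load round trip, while get("missing") returns null, matching the behavior described for get.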

_makeDir

java.lang.String _makeDir(java.lang.String pDirectory)
                    throws java.lang.Exception
Method _makeDir. This method creates a dictionaries/dictName directory.

Parameters:
pDirectory -
Returns:
String
Throws:
java.lang.Exception

getDictionaryPath

java.lang.String getDictionaryPath()
                             throws java.lang.Exception
Method getDictionaryPath

Returns:
String
Throws:
java.lang.Exception

getDictionaryPath

java.lang.String getDictionaryPath(java.lang.String pDictionaryName)
                             throws java.lang.Exception
Method getDictionaryPath

Parameters:
pDictionaryName -
Returns:
String
Throws:
java.lang.Exception

setDictionaryPath

void setDictionaryPath(java.lang.String pDictionaryDir)
Method setDictionaryPath

Parameters:
pDictionaryDir -


The use and distribution of this material is subject to the terms and conditions included in the file SPECIALIST_NLP_TOOLS_TERMS_AND_CONDITIONS.TXT, located in the root directory of the distribution.