# ----------------------------------------------
# SPECIALIST Tagset (T1)
#   This is the tagset based on the SPECIALIST
#   categories. 
#
#   See http://lexsrv3.nlm.nih.gov/SPECIALIST/
#               Projects/lexicon/2006/release/
#               LEXICON/DOCS/techrpt.pdf
#   for a definition of the principal categories.
#
#   The punctuation category tags came primarily
#   from the conversion that Larry Smith uses to map
#   his tagset to the SPECIALIST tags. [citation needed]
#
#   The Shape tags come from shapes that the Xerox
#   PARC tagger identified, or from shapes that
#   textTools identifies or hopes to identify.
#
#   A "1" in the Open Class column indicates that the tag
#   is an open class. This matters when guessing the class
#   of an unknown word: we only want to guess open classes.
#   An open class is defined to be a lexical category whose
#   membership is typically large and which can easily accept
#   new members. In English, this includes the categories of
#   noun, verb, and adjective. [A Dictionary of Grammatical
#   Terms in Linguistics, R.L. Trask, c. 1993, p. 195]
#
#   We will presume that we have a lexicon filled with
#   all the closed class words and tokens. 
#
#   Tags that have a 1 in the Shape column are tags
#   that will not appear on a term within an official
#   lexicon. Rather, these tags are put on terms
#   recognized by shape identifiers, such as numbers,
#   URLs ...
#  
#   This tagger relies heavily upon the END tag, which should
#   be present in all tagsets associated with this tagger.
#   The END tag is implicitly put before the beginning
#   and after the end of an utterance (sentence).
#
#   The Java code needs to know about numbers and punctuation.
#   The num and punct tags should always remain in any tagset,
#   as represented here. Otherwise, alter the
#   TagSet.getNumberTagId() and TagSet.getPunctuationTagId()
#   methods to correspond to the replacement tags.
#   
#-+----------+---------------------------------+-----+-----+-----------
# |  POS     |                                 |Open |Shape|Example
# |  tag     |              Name               |Class|     |Character
#-+----------+---------------------------------+-----+-----+-----------
end          |END                              |0| |
noun         |noun                             |1| | 
adj          |adjective                        |1| | 
adv          |adverb                           |1| | 
verb         |verb                             |1| | 
aux          |auxiliary verb "be", "do"        |0| | 
modal        |modal verb "have"                |0| | 
to           |infinitive marker to             |0| | 
conj         |conjunction                      |0| | 
pron         |pronoun                          |0| | 
compl        |complementizer(that)             |0| | 
det          |determiner                       |0| | 
pos          |genitive marker                  |0| | 
prep         |preposition                      |0| | 
num          |number or numeric                |0|1| 
real         |real number                      |0|1| 
unknown      |unknown                          |0|1| 
punct        |punctuation                      |0|1| 
pd           |end of sentence period           |0| |.
cm           |comma                            |0| |,
hy           |hyphen                           |0| |-
cl           |colon                            |0| |:
;            |semiColon                        |0| |;
ap           |right quote or double quote      |0| |'"
bq           |left quote (backquote)           |0| |`
lp           |left parenthesis                 |0| |(
rp           |right parenthesis                |0| |)
~            |tilde                            |0| |~
!            |bang                             |0| |!
@            |at sign                          |0| |@
pound        |pound sign                       |0| |#
$            |dollar sign                      |0| |$
%            |percent sign                     |0| |%
^            |caret sign                       |0| |^
&            |and sign                         |0| |&
*            |asterisk                         |0| |*
=            |equal sign                       |0| |=
_            |underBar sign                    |0| |_
+            |plus sign                        |0| |+
{            |left curly bracket               |0| |{
}            |right curly bracket              |0| |}
bar          |bar                              |0| ||
[            |left bracket                     |0| |[
]            |right bracket                    |0| |]
\            |backslash                        |0| |\
/            |slash                            |0| |/
<            |lessThan                         |0| |<
>            |greaterThan                      |0| |>
?            |questionMark                     |0| |?
tab          |tab                              |0| | 
shape        |shape                            |0|1| 
prefix       |prefix                           |0|1| 
money        |money                            |0|1| 
phone        |phonenumber                      |0|1| 
date         |date                             |0|1| 
url          |URL                              |0|1| 
email        |EMAIL address                    |0|1| 
unitOfMeasure|unit of measure                  |0|1| 
chem         |chemical                         |0|1| 
propername   |proper name                      |0|1| 
acronym      |acronym                          |0|1| 
localAcronym |local acronym                    |0|1| 
percent      |percent number                   |0|1| 
fraction     |fraction                         |0|1| 
range        |range                            |0|1| 
glob         |glob                             |0|1| 
equation     |equation                         |0|1| 
levelOfSignificance|level of significance      |0|1|
experimentSize|experiment size                 |0|1|
none         |none                             |0|0|