What is new
The Lexical Tools 2009 release is developed in Java 1.6 and integrates the HyperSonic SQL database (HSqlDb 1.8.0.10) and International Components for Unicode (ICU4J 4.0). The major changes are described below:
Package
- Provides both full and lite versions of lvg.2009
- JRE 1.6.0_06
- HSqlDb 1.8.0.10 (HyperSonic SQL DB)
- ICU4J 4.0 (International Components for Unicode)
Unicode Related Features
The Lexical Tools 2009 release includes many features for Unicode normalization, as described below:
- toAscii: tools and APIs that convert Unicode to pure ASCII
- Lvg flow components, normalizes Unicode to pure ASCII:
  - -f:q5: normalize Unicode to pure ASCII
  - Norm (-f:N): normalization tool
  - luiNorm (-f:N3): normalization and canonicalization tool
  - -f:q6: normalize Unicode to pure ASCII with synonym options
- Lvg flow components, basic Unicode normalization:
  - -f:q: strip diacritics
  - -f:q0: map symbols & punctuation to ASCII
  - -f:q1: map Unicode to ASCII
  - -f:q2: split ligatures
  - -f:q3: get Unicode name & information
  - -f:q4: get Unicode synonym base
  - -f:q7: Unicode core normalization
  - -f:q8: strip or map Unicode
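To give a rough idea of what operations such as stripping diacritics, splitting ligatures, and stripping the remaining non-ASCII characters involve, here is a minimal Java sketch built only on the JDK's java.text.Normalizer. It is not the Lvg implementation and does not use the Lvg API. The class and method names (UnicodeToAsciiSketch, stripDiacritics, toAsciiSketch) are hypothetical. The sketch drops any character that has no ASCII decomposition, whereas the flow components listed above may instead map symbols, punctuation, and synonyms to ASCII.

```java
import java.text.Normalizer;

// Illustrative only: approximates a few of the ideas above with the JDK.
// All names here are hypothetical and are not part of the Lexical Tools API.
public class UnicodeToAsciiSketch {

    // Strip diacritics: canonical decomposition (NFD) separates base letters
    // from combining marks, then all combining marks (category M) are removed.
    static String stripDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Rough "to ASCII" pass: compatibility decomposition (NFKD) also expands
    // ligatures such as U+FB01 to "fi"; combining marks are then removed and
    // any character still outside the ASCII range is dropped (assumption:
    // a real tool would map such characters rather than discard them).
    static String toAsciiSketch(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFKD);
        String noMarks = decomposed.replaceAll("\\p{M}+", "");
        return noMarks.replaceAll("[^\\x00-\\x7F]", "");
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file itself pure ASCII.
        System.out.println(stripDiacritics("r\u00e9sum\u00e9")); // resume
        System.out.println(toAsciiSketch("\ufb01anc\u00e9"));    // fiance
    }
}
```

Discarding unmapped characters keeps the sketch short. A table-driven mapping would be the natural next step for symbols and punctuation.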
Database
The Lexical Tools 2009 release uses the HyperSonic SQL database (1.8.0.10) as the default database because it handles Unicode characters. If you use a different database, make sure the database server can handle Unicode.
IO & Data
Lvg
- Lvg 2009 includes 62 flow components and 37 options.
ToAscii
- Provides new command line tools, Java APIs, GUI tools, and Web tools to convert Unicode (UTF-8) to pure ASCII