Package com.sun.speech.freetts.en
Class TokenizerImpl
- java.lang.Object
  - com.sun.speech.freetts.en.TokenizerImpl
Field Summary
Fields (Modifier and Type, Field: Description)
- static java.lang.String DEFAULT_POSTPUNCTUATION_SYMBOLS: A string containing the default post-punctuation characters.
- static java.lang.String DEFAULT_PREPUNCTUATION_SYMBOLS: A string containing the default pre-punctuation characters.
- static java.lang.String DEFAULT_SINGLE_CHAR_SYMBOLS: A string containing the default single characters.
- static java.lang.String DEFAULT_WHITESPACE_SYMBOLS: A string containing the default whitespace characters.
- static int EOF: A constant indicating that the end of the stream has been read.
-
Constructor Summary
Constructors (Constructor: Description)
- TokenizerImpl(): Constructs a Tokenizer.
- TokenizerImpl(java.io.Reader file): Creates a tokenizer that will return tokens from the given file.
- TokenizerImpl(java.lang.String string): Creates a tokenizer that will return tokens from the given string.
-
Method Summary
All Methods, Instance Methods, Concrete Methods (Modifier and Type, Method: Description)
- java.lang.String getErrorDescription(): If hasErrors returns true, this will return a description of the error encountered; otherwise it will return null.
- Token getNextToken(): Returns the next token.
- boolean hasErrors(): Returns true if there were errors while reading tokens.
- boolean hasMoreTokens(): Returns true if there are more tokens, false otherwise.
- boolean isBreak(): Determines if the current token should start a new sentence.
- void setInputReader(java.io.Reader reader): Sets the input reader.
- void setInputText(java.lang.String inputString): Sets the text to tokenize.
- void setPostpunctuationSymbols(java.lang.String symbols): Sets the postpunctuation symbols of this Tokenizer to the given symbols.
- void setPrepunctuationSymbols(java.lang.String symbols): Sets the prepunctuation symbols of this Tokenizer to the given symbols.
- void setSingleCharSymbols(java.lang.String symbols): Sets the single character symbols of this Tokenizer to the given symbols.
- void setWhitespaceSymbols(java.lang.String symbols): Sets the whitespace symbols of this Tokenizer to the given symbols.
-
-
-
Field Detail
-
EOF
public static final int EOF
A constant indicating that the end of the stream has been read.
- See Also: Constant Field Values
-
DEFAULT_WHITESPACE_SYMBOLS
public static final java.lang.String DEFAULT_WHITESPACE_SYMBOLS
A string containing the default whitespace characters.
- See Also: Constant Field Values
-
DEFAULT_SINGLE_CHAR_SYMBOLS
public static final java.lang.String DEFAULT_SINGLE_CHAR_SYMBOLS
A string containing the default single characters.
- See Also: Constant Field Values
-
DEFAULT_PREPUNCTUATION_SYMBOLS
public static final java.lang.String DEFAULT_PREPUNCTUATION_SYMBOLS
A string containing the default pre-punctuation characters.
- See Also: Constant Field Values
-
DEFAULT_POSTPUNCTUATION_SYMBOLS
public static final java.lang.String DEFAULT_POSTPUNCTUATION_SYMBOLS
A string containing the default post-punctuation characters.
- See Also: Constant Field Values
-
-
Constructor Detail
-
TokenizerImpl
public TokenizerImpl()
Constructs a Tokenizer.
-
TokenizerImpl
public TokenizerImpl(java.lang.String string)
Creates a tokenizer that will return tokens from the given string.
- Parameters: string - the string to tokenize
-
TokenizerImpl
public TokenizerImpl(java.io.Reader file)
Creates a tokenizer that will return tokens from the given file.
- Parameters: file - where to read the input from
-
-
Method Detail
-
setWhitespaceSymbols
public void setWhitespaceSymbols(java.lang.String symbols)
Sets the whitespace symbols of this Tokenizer to the given symbols.
- Specified by: setWhitespaceSymbols in interface Tokenizer
- Parameters: symbols - the whitespace symbols
-
setSingleCharSymbols
public void setSingleCharSymbols(java.lang.String symbols)
Sets the single character symbols of this Tokenizer to the given symbols.
- Specified by: setSingleCharSymbols in interface Tokenizer
- Parameters: symbols - the single character symbols
-
setPrepunctuationSymbols
public void setPrepunctuationSymbols(java.lang.String symbols)
Sets the prepunctuation symbols of this Tokenizer to the given symbols.
- Specified by: setPrepunctuationSymbols in interface Tokenizer
- Parameters: symbols - the prepunctuation symbols
-
setPostpunctuationSymbols
public void setPostpunctuationSymbols(java.lang.String symbols)
Sets the postpunctuation symbols of this Tokenizer to the given symbols.
- Specified by: setPostpunctuationSymbols in interface Tokenizer
- Parameters: symbols - the postpunctuation symbols
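The prepunctuation and postpunctuation symbol sets suggest that punctuation is peeled off the front and back of each word. The following is a minimal sketch of that idea, not the FreeTTS implementation. The class `PunctuationSplitSketch`, its `split` helper, and the two symbol strings are all hypothetical; this page does not list the actual values of the `DEFAULT_*` constants.

```java
import java.util.Arrays;

/** Illustrative only: not the FreeTTS implementation. */
class PunctuationSplitSketch {
    // Hypothetical symbol sets chosen for this example; the real default
    // constants are not shown on this page.
    static final String PRE = "\"'`({[";
    static final String POST = "\"'`.,:;!?(){}[]";

    /**
     * Splits one whitespace-free chunk into
     * {prepunctuation, word, postpunctuation}.
     */
    static String[] split(String chunk, String pre, String post) {
        // Advance past leading characters that belong to the prepunctuation set.
        int start = 0;
        while (start < chunk.length() && pre.indexOf(chunk.charAt(start)) >= 0) {
            start++;
        }
        // Step back over trailing characters that belong to the postpunctuation set.
        int end = chunk.length();
        while (end > start && post.indexOf(chunk.charAt(end - 1)) >= 0) {
            end--;
        }
        return new String[] {
            chunk.substring(0, start),
            chunk.substring(start, end),
            chunk.substring(end)
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("(hello),", PRE, POST)));
    }
}
```

With these assumed sets, the chunk `(hello),` splits into the prepunctuation `(`, the word `hello`, and the postpunctuation `),`. Passing different strings to `split` plays the role that the setter methods play on a real tokenizer.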
-
setInputText
public void setInputText(java.lang.String inputString)
Sets the text to tokenize.
- Specified by: setInputText in interface Tokenizer
- Parameters: inputString - the string to tokenize
-
setInputReader
public void setInputReader(java.io.Reader reader)
Sets the input reader.
- Specified by: setInputReader in interface Tokenizer
- Parameters: reader - the input source
-
getNextToken
public Token getNextToken()
Returns the next token.
- Specified by: getNextToken in interface Tokenizer
- Returns: the next token if it exists, null if there are no more tokens
-
hasMoreTokens
public boolean hasMoreTokens()
Returns true if there are more tokens, false otherwise.
- Specified by: hasMoreTokens in interface Tokenizer
- Returns: true if there are more tokens, false otherwise
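The hasMoreTokens and getNextToken methods together describe an iteration contract: keep asking for tokens while more remain. The sketch below shows that loop against a stand-in class, so that it compiles without the FreeTTS jar. `WhitespaceTokenizerSketch` is hypothetical: it borrows the method names documented here but returns plain strings instead of Token objects, splits only on an assumed whitespace set, and may differ from TokenizerImpl in edge cases such as trailing whitespace.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Stand-in that mirrors the documented method names. NOT FreeTTS code:
 * tokens are plain strings and only whitespace splitting is modeled.
 */
class WhitespaceTokenizerSketch {
    private String text = "";
    private int pos = 0;
    // Hypothetical whitespace set for this sketch.
    private String whitespace = " \t\n\r";

    void setWhitespaceSymbols(String symbols) {
        whitespace = symbols;
    }

    void setInputText(String inputString) {
        text = inputString;
        pos = 0;
    }

    boolean hasMoreTokens() {
        skipWhitespace();
        return pos < text.length();
    }

    /** Returns the next token, or null if there are none left. */
    String getNextToken() {
        skipWhitespace();
        if (pos >= text.length()) {
            return null;
        }
        int start = pos;
        while (pos < text.length() && whitespace.indexOf(text.charAt(pos)) < 0) {
            pos++;
        }
        return text.substring(start, pos);
    }

    private void skipWhitespace() {
        while (pos < text.length() && whitespace.indexOf(text.charAt(pos)) >= 0) {
            pos++;
        }
    }

    /** The loop shape the documented contract implies. */
    static List<String> tokenize(String input) {
        WhitespaceTokenizerSketch t = new WhitespaceTokenizerSketch();
        t.setInputText(input);
        List<String> out = new ArrayList<>();
        while (t.hasMoreTokens()) {
            out.add(t.getNextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello  world, again."));
    }
}
```

Checking hasMoreTokens before each call avoids having to test the result of getNextToken for null inside the loop.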
-
hasErrors
public boolean hasErrors()
Returns true if there were errors while reading tokens.
-
getErrorDescription
public java.lang.String getErrorDescription()
If hasErrors returns true, this will return a description of the error encountered; otherwise it will return null.
- Specified by: getErrorDescription in interface Tokenizer
- Returns: a description of the last error that occurred.
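The hasErrors and getErrorDescription pair implies that read failures are recorded for later inspection rather than thrown to the caller. One way such a contract could work is sketched below with a stand-in class. `ErrorReportingSketch`, `readChar`, and `failingReader` are hypothetical names, and how TokenizerImpl records errors internally is not described on this page.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/**
 * Stand-in showing one possible hasErrors()/getErrorDescription() behavior.
 * NOT FreeTTS code.
 */
class ErrorReportingSketch {
    private final Reader reader;
    private String errorDescription; // stays null until a read fails

    ErrorReportingSketch(Reader reader) {
        this.reader = reader;
    }

    /** Reads one character; returns -1 at end of input or after a failure. */
    int readChar() {
        try {
            return reader.read();
        } catch (IOException e) {
            // Record the failure instead of propagating it.
            errorDescription = e.getMessage() != null ? e.getMessage() : e.toString();
            return -1;
        }
    }

    boolean hasErrors() {
        return errorDescription != null;
    }

    String getErrorDescription() {
        return errorDescription;
    }

    /** Hypothetical reader that always fails, used to trigger the error path. */
    static Reader failingReader() {
        return new Reader() {
            @Override
            public int read(char[] buf, int off, int len) throws IOException {
                throw new IOException("simulated read failure");
            }

            @Override
            public void close() {
            }
        };
    }

    public static void main(String[] args) {
        ErrorReportingSketch ok = new ErrorReportingSketch(new StringReader("a"));
        ok.readChar();
        System.out.println("ok has errors: " + ok.hasErrors());

        ErrorReportingSketch bad = new ErrorReportingSketch(failingReader());
        bad.readChar();
        if (bad.hasErrors()) {
            System.out.println("error: " + bad.getErrorDescription());
        }
    }
}
```

A caller following this pattern checks hasErrors after tokenizing and reads getErrorDescription only when it returns true, since the description is null otherwise.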
-
-