
3.6 Tokenization

If you use the festival-freebsoft-utils tokenizer instead of the built-in Festival tokenizer, you can place additional limits on tokenization beyond those imposed by eou_tree:

max-number-of-tokens

Maximum number of tokens in a single utterance. Text is chunked into utterances so that each resulting utterance contains at most this many tokens.

max-number-of-token-chars

Maximum number of characters in a single token. A token longer than this limit is split into smaller tokens.
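
The combined effect of the two limits can be illustrated with a small Python sketch. This is not the festival-freebsoft-utils implementation: the function names are hypothetical, cutting an overlong token into fixed-size slices is an assumption about how the split might be done, and the sketch ignores eou_tree entirely.

```python
from typing import List


def split_long_token(token: str, max_chars: int) -> List[str]:
    """Cut a token into pieces of at most max_chars characters.

    Assumption: fixed-size slices; the real tokenizer may split differently.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    if len(token) <= max_chars:
        return [token]
    return [token[i:i + max_chars] for i in range(0, len(token), max_chars)]


def chunk_tokens(tokens: List[str], max_tokens: int, max_chars: int) -> List[List[str]]:
    """Group tokens into utterance-like chunks that respect both limits."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    # First enforce the per-token character limit.
    pieces: List[str] = []
    for token in tokens:
        pieces.extend(split_long_token(token, max_chars))
    # Then enforce the per-utterance token limit.
    return [pieces[i:i + max_tokens] for i in range(0, len(pieces), max_tokens)]


if __name__ == "__main__":
    for chunk in chunk_tokens(["a", "bcdefgh", "i", "j"], max_tokens=3, max_chars=3):
        print(chunk)
```

In this sketch the character limit is applied first, so the pieces of a split token count individually toward the token limit of an utterance; whether the real tokenizer orders the two steps this way is likewise an assumption.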



This document was generated by Milan Zamazal on August, 11 2009 using texi2html 1.78.