Lucene++ - a full-featured, C++ search engine
API Documentation
#ifndef CHARTOKENIZER_H
#define CHARTOKENIZER_H
virtual wchar_t normalize(wchar_t c)
Called on each token character to normalize it before it is added to the token. The default implement...
#define LUCENE_CLASS(Name)
Definition: LuceneObject.h:24
boost::shared_ptr< Reader > ReaderPtr
Definition: LuceneTypes.h:547
int32_t offset
Definition: CharTokenizer.h:22
CharTokenizer(const AttributeFactoryPtr &factory, const ReaderPtr &input)
CharTokenizer(const ReaderPtr &input)
static const int32_t IO_BUFFER_SIZE
Definition: CharTokenizer.h:30
OffsetAttributePtr offsetAtt
Definition: CharTokenizer.h:34
CharArray ioBuffer
Definition: CharTokenizer.h:32
An abstract base class for simple, character-oriented tokenizers.
Definition: CharTokenizer.h:15
virtual bool isTokenChar(wchar_t c)=0
Returns true if a character should be included in a token. This tokenizer generates as tokens adjacen...
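The interplay between isTokenChar() and normalize() can be illustrated with a small standalone sketch. It does not use the Lucene++ classes: SketchCharTokenizer, LowerCaseLetterSketch, next() and collectTokens() are hypothetical names made up for this example, and the identity default for normalize() is an assumption of the sketch. It only models the idea of grouping adjacent accepted characters into tokens and normalizing each character as it is appended.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical stand-in for the pattern described above; not the Lucene++ class.
class SketchCharTokenizer {
public:
    explicit SketchCharTokenizer(const std::wstring& text) : text_(text), pos_(0) {}
    virtual ~SketchCharTokenizer() {}

    // Writes the next token into `token`; returns false once the input is exhausted.
    bool next(std::wstring& token) {
        token.clear();
        while (pos_ < text_.size()) {
            wchar_t c = text_[pos_++];
            if (isTokenChar(c)) {
                token.push_back(normalize(c)); // normalize before appending
            } else if (!token.empty()) {
                break; // a rejected character ends the current token
            }
        }
        return !token.empty();
    }

protected:
    // Decides whether a character belongs inside a token.
    virtual bool isTokenChar(wchar_t c) = 0;

    // Assumption of this sketch: the default leaves the character unchanged.
    virtual wchar_t normalize(wchar_t c) { return c; }

private:
    std::wstring text_;
    std::size_t pos_;
};

// Example subclass: tokens are runs of letters, folded to lower case.
class LowerCaseLetterSketch : public SketchCharTokenizer {
public:
    explicit LowerCaseLetterSketch(const std::wstring& text) : SketchCharTokenizer(text) {}

protected:
    virtual bool isTokenChar(wchar_t c) {
        return std::iswalpha(static_cast<std::wint_t>(c)) != 0;
    }
    virtual wchar_t normalize(wchar_t c) {
        return static_cast<wchar_t>(std::towlower(static_cast<std::wint_t>(c)));
    }
};

// Drains a tokenizer into a vector for easy inspection.
std::vector<std::wstring> collectTokens(SketchCharTokenizer& tokenizer) {
    std::vector<std::wstring> tokens;
    std::wstring token;
    while (tokenizer.next(token)) {
        tokens.push_back(token);
    }
    return tokens;
}
```

With the input L"Hello, World 42 foo", collectTokens() on a LowerCaseLetterSketch yields the three tokens "hello", "world" and "foo": punctuation, spaces and digits are rejected by isTokenChar() and therefore act as separators.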
Definition: AbstractAllTermDocs.h:12
int32_t dataLen
Definition: CharTokenizer.h:27
virtual bool incrementToken()
Consumers (i.e., IndexWriter) use this method to advance the stream to the next token....
CharTokenizer(const AttributeSourcePtr &source, const ReaderPtr &input)
boost::shared_ptr< OffsetAttribute > OffsetAttributePtr
Definition: LuceneTypes.h:40
boost::shared_ptr< AttributeSource > AttributeSourcePtr
Definition: LuceneTypes.h:520
A Tokenizer is a TokenStream whose input is a Reader.
Definition: Tokenizer.h:20
static const int32_t MAX_WORD_LEN
Definition: CharTokenizer.h:29
boost::shared_ptr< TermAttribute > TermAttributePtr
Definition: LuceneTypes.h:58
TermAttributePtr termAtt
Definition: CharTokenizer.h:33
virtual void end()
This method is called by the consumer after the last token has been consumed, after incrementToken() ...
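The call order implied by incrementToken() and end() can be sketched from the consumer's side. The code below is a standalone model, not Lucene++ code: SketchTokenStream, term(), ended() and consume() are hypothetical names, and end() here does nothing beyond recording that it was called. The point is only the sequence: loop while incrementToken() returns true, then call end() once.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal stream; models the call order only, not the Lucene++ types.
class SketchTokenStream {
public:
    explicit SketchTokenStream(const std::vector<std::wstring>& tokens)
        : tokens_(tokens), next_(0), ended_(false) {}

    // Advances to the next token; returns false when none remain.
    bool incrementToken() {
        if (next_ >= tokens_.size()) {
            return false;
        }
        current_ = tokens_[next_++];
        return true;
    }

    // End-of-stream hook; in this sketch it only records the call.
    void end() { ended_ = true; }

    const std::wstring& term() const { return current_; }
    bool ended() const { return ended_; }

private:
    std::vector<std::wstring> tokens_;
    std::size_t next_;
    std::wstring current_;
    bool ended_;
};

// Consumer side: drain the stream, then signal the end exactly once.
// Returns the number of tokens seen and joins them with spaces into `joined`.
std::size_t consume(SketchTokenStream& stream, std::wstring& joined) {
    std::size_t count = 0;
    joined.clear();
    while (stream.incrementToken()) {
        if (count > 0) {
            joined += L' ';
        }
        joined += stream.term();
        ++count;
    }
    stream.end();
    return count;
}
```

Calling end() only after the loop has finished keeps any end-of-stream bookkeeping separate from per-token work.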
int32_t bufferIndex
Definition: CharTokenizer.h:26
virtual void reset(const ReaderPtr &input)
Reset the tokenizer to a new reader. Typically, an analyzer (in its reusableTokenStream method) will ...
boost::shared_ptr< AttributeFactory > AttributeFactoryPtr
Definition: LuceneTypes.h:519