net.sf.jabb.util.text.word
Class AnalyzedTextWordLister

java.lang.Object
  extended by net.sf.jabb.util.text.word.AnalyzedTextWordLister
All Implemented Interfaces:
com.enigmastation.extractors.WordLister, Serializable

public class AnalyzedTextWordLister
extends Object
implements com.enigmastation.extractors.WordLister

A WordLister that handles both Chinese and English text. Word segmentation is based on dictionary matching.
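
The page does not say which dictionary-matching strategy the class uses. As a rough illustration of the general idea only, the sketch below applies forward maximum matching: at each position it takes the longest dictionary entry that fits, and falls back to a single character otherwise. The class name DictionaryMatchSketch, its segment method, and the handling of ASCII runs are all assumptions made for this example; none of it is taken from the jabb source.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical illustration of dictionary matching; NOT the jabb implementation. */
final class DictionaryMatchSketch {
    private final Set<String> dictionary;
    private final int maxWordLength;

    DictionaryMatchSketch(Collection<String> words) {
        this.dictionary = new HashSet<>(words);
        int max = 1;
        for (String w : words) {
            max = Math.max(max, w.length());
        }
        this.maxWordLength = max;
    }

    /** Splits text into words; works on char units, so it assumes BMP characters. */
    List<String> segment(String text) {
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                i++; // skip whitespace and punctuation
                continue;
            }
            if (isAsciiLetterOrDigit(c)) {
                // Assumption: a run of ASCII letters/digits is one lower-cased word.
                int j = i;
                while (j < text.length() && isAsciiLetterOrDigit(text.charAt(j))) {
                    j++;
                }
                result.add(text.substring(i, j).toLowerCase(Locale.ROOT));
                i = j;
                continue;
            }
            // Forward maximum matching: try the longest candidate first.
            int end = Math.min(text.length(), i + maxWordLength);
            String match = null;
            for (int j = end; j > i + 1; j--) {
                String candidate = text.substring(i, j);
                if (dictionary.contains(candidate)) {
                    match = candidate;
                    break;
                }
            }
            if (match == null) {
                match = text.substring(i, i + 1); // no entry fits: emit one character
            }
            result.add(match);
            i += match.length();
        }
        return result;
    }

    private static boolean isAsciiLetterOrDigit(char c) {
        return c < 128 && Character.isLetterOrDigit(c);
    }
}
```

Greedy longest-match is simple and fast, but it can pick the wrong split when dictionary entries overlap. The real class may use a different strategy or a different dictionary source.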

Author:
Zhengmao HU (James)
See Also:
Serialized Form

Field Summary
protected  boolean includeLengthCategory
           
 
Constructor Summary
AnalyzedTextWordLister()
           
AnalyzedTextWordLister(boolean includeLengthCategory)
           
 
Method Summary
 void addWords(Object document, Collection<String> collection)
           
 Set<String> getUniqueWords(Object document)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

includeLengthCategory

protected boolean includeLengthCategory
Constructor Detail

AnalyzedTextWordLister

public AnalyzedTextWordLister()

AnalyzedTextWordLister

public AnalyzedTextWordLister(boolean includeLengthCategory)
Method Detail

addWords

public void addWords(Object document,
                     Collection<String> collection)
Specified by:
addWords in interface com.enigmastation.extractors.WordLister

getUniqueWords

public Set<String> getUniqueWords(Object document)
Specified by:
getUniqueWords in interface com.enigmastation.extractors.WordLister
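
The two signatures above suggest a common pattern: addWords appends to a caller-supplied collection, while getUniqueWords returns a de-duplicated set. The sketch below shows that pattern with a stand-in interface, WordListerShape, which only mirrors the two signatures listed on this page, and a hypothetical whitespace-splitting implementation. Whether the real class builds getUniqueWords on top of addWords, and whether addWords keeps duplicates, is an assumption here and is not documented on this page.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/**
 * Stand-in that mirrors the two method signatures shown above.
 * The real interface is com.enigmastation.extractors.WordLister.
 */
interface WordListerShape {
    void addWords(Object document, Collection<String> collection);

    Set<String> getUniqueWords(Object document);
}

/** Hypothetical lister that splits on whitespace; for illustration only. */
final class WhitespaceWordLister implements WordListerShape {
    @Override
    public void addWords(Object document, Collection<String> collection) {
        if (document == null) {
            return; // nothing to add
        }
        for (String token : document.toString().trim().split("\\s+")) {
            if (!token.isEmpty()) {
                collection.add(token); // duplicates survive if the collection allows them
            }
        }
    }

    @Override
    public Set<String> getUniqueWords(Object document) {
        // Assumed design: reuse addWords and let the Set drop duplicates.
        Set<String> unique = new HashSet<>();
        addWords(document, unique);
        return unique;
    }
}
```

Under this design, passing a List to addWords keeps repeated words, which is useful for frequency counts, while getUniqueWords suits membership checks.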


Copyright © 2012. All Rights Reserved.