@Bindable(prefix="PhraseExtractor") public class PhraseExtractor extends Object
This class saves the following results to the PreprocessingContext:
- PreprocessingContext.AllPhrases.wordIndices
- PreprocessingContext.AllPhrases.tf
- PreprocessingContext.AllPhrases.tfByDocument
- PreprocessingContext.AllTokens.suffixOrder
- PreprocessingContext.AllTokens.lcp

This class requires that Tokenizer, CaseNormalizer and
LanguageModelStemmer be invoked first.
| Modifier and Type | Field and Description |
|---|---|
| int | dfThreshold: Phrase Document Frequency threshold. |
| Constructor and Description |
|---|
| PhraseExtractor() |
| Modifier and Type | Method and Description |
|---|---|
| void | extractPhrases(PreprocessingContext context): Performs phrase extraction and saves the results to the provided context. |
@Processing @Input @Attribute @IntRange(min=1, max=100) @Label(value="Phrase document frequency threshold") @Level(value=ADVANCED) @Group(value="Phrase extraction") public int dfThreshold
Phrase Document Frequency threshold. Phrases appearing in fewer than dfThreshold documents will be ignored.

public void extractPhrases(PreprocessingContext context)
Performs phrase extraction and saves the results to the provided context.
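
The effect of a document frequency threshold can be illustrated with a small, self-contained sketch. It does not use or reproduce PhraseExtractor: the class DfThresholdSketch and its methods are hypothetical, phrases are simplified to lower-cased two-word sequences split on whitespace, and a phrase is assumed to be kept when it occurs in at least dfThreshold documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the documented API.
public class DfThresholdSketch {

    /** Counts, for every two-word phrase, the number of documents that contain it. */
    static Map<String, Integer> documentFrequencies(List<String> documents) {
        Map<String, Integer> df = new TreeMap<>(); // sorted keys give a stable output order
        for (String document : documents) {
            String trimmed = document.trim().toLowerCase(Locale.ROOT);
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] words = trimmed.split("\\s+");
            // A set ensures each phrase is counted at most once per document.
            Set<String> seenInThisDocument = new HashSet<>();
            for (int i = 0; i + 1 < words.length; i++) {
                seenInThisDocument.add(words[i] + " " + words[i + 1]);
            }
            for (String phrase : seenInThisDocument) {
                df.merge(phrase, 1, Integer::sum);
            }
        }
        return df;
    }

    /** Keeps phrases found in at least dfThreshold documents; the rest are ignored. */
    static List<String> frequentPhrases(List<String> documents, int dfThreshold) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : documentFrequencies(documents).entrySet()) {
            if (entry.getValue() >= dfThreshold) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> documents = Arrays.asList(
            "data mining tools",
            "open source data mining",
            "data mining for text",
            "open source search");
        System.out.println(frequentPhrases(documents, 2)); // [data mining, open source]
        System.out.println(frequentPhrases(documents, 3)); // [data mining]
    }
}
```

In this sketch, raising the threshold from 2 to 3 drops "open source", which occurs in only two of the four documents, while "data mining" occurs in three and is kept.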