@Bindable(prefix="DocumentAssigner") public class DocumentAssigner extends Object
For each label in PreprocessingContext.AllLabels.featureIndex, a BitSet with the assigned documents is
constructed. The assignment algorithm is simple: to be assigned to a
label, a document must contain at least one occurrence of each non-stop word from the
label.
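The assignment rule described above can be illustrated with a minimal, standalone sketch. It is not the Carrot2 implementation: the class name `DocumentAssignmentSketch`, the method `assignDocuments`, and the representation of documents and labels as plain `String` arrays are assumptions made for illustration only. The real class works on the indexed data held in `PreprocessingContext`.

```java
import java.util.BitSet;
import java.util.Set;

// Hypothetical illustration of the rule "a document is assigned to a label
// if it contains every non-stop word of that label at least once".
final class DocumentAssignmentSketch {

    /** Returns a BitSet in which bit d is set when documents[d] matches the label. */
    static BitSet assignDocuments(String[] labelWords, Set<String> stopWords, String[][] documents) {
        BitSet assigned = new BitSet(documents.length);
        for (int d = 0; d < documents.length; d++) {
            if (containsAllNonStopWords(labelWords, stopWords, documents[d])) {
                assigned.set(d);
            }
        }
        return assigned;
    }

    /** Checks that every non-stop word of the label occurs in the document. */
    private static boolean containsAllNonStopWords(String[] labelWords, Set<String> stopWords, String[] document) {
        for (String word : labelWords) {
            if (stopWords.contains(word)) {
                continue; // stop words in the label are ignored
            }
            boolean found = false;
            for (String token : document) {
                if (token.equals(word)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false; // one missing non-stop word rules the document out
            }
        }
        return true;
    }
}
```

Note that in this sketch a label consisting only of stop words would match every document, and word order within the label is ignored.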
This class saves the following results to the PreprocessingContext :
This class requires that Tokenizer, CaseNormalizer,
StopListMarker, PhraseExtractor and LabelFilterProcessor be
invoked first.
| Modifier and Type | Field and Description |
|---|---|
| boolean | exactPhraseAssignment: Only exact phrase assignments. |
| int | minClusterSize: Determines the minimum number of documents in each cluster. |
| Constructor and Description |
|---|
| DocumentAssigner() |
@Input @Processing @Attribute @Label(value="Exact phrase assignment") @Level(value=MEDIUM) @Group(value="Preprocessing") public boolean exactPhraseAssignment
@Input @Processing @Attribute @IntRange(min=1, max=100) @Label(value="Minimum cluster size") @Level(value=MEDIUM) @Group(value="Preprocessing") public int minClusterSize
public void assign(PreprocessingContext context)