@Bindable public class ContaminationMetric extends Object
Computes cluster contamination. If a cluster groups documents found in the same
Document.PARTITIONS, its contamination is 0. If a cluster groups an equally
distributed mix of all partitions, its contamination is 1.0. For a full definition,
please see section 4.4.1 of this work.
Contamination is calculated for top-level clusters only, taking into account documents
from the cluster and all subclusters. Finally, contamination will be calculated only if
all input documents have non-blank Document.PARTITIONS values.
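The two boundary cases above (0 for a cluster drawn from a single partition, 1.0 for an even mix of all partitions) can be illustrated with a small stand-alone sketch. This is not the Carrot2 implementation, and it does not claim to reproduce the formula from section 4.4.1. It is a hypothetical stand-in that counts pairs of documents coming from different partitions and normalizes by the value reached for a perfectly even mix. The class name ContaminationSketch, the method contamination and its parameters are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative stand-in only; NOT the Carrot2 contamination formula. */
final class ContaminationSketch {

    /**
     * @param partitionIds   partition id of every document in the cluster
     *                       (including documents from subclusters)
     * @param partitionCount total number of partitions in the input
     * @return a value in [0, 1]: 0 for a pure cluster, 1 for an even mix
     */
    static double contamination(List<String> partitionIds, int partitionCount) {
        int n = partitionIds.size();
        if (n == 0 || partitionCount < 2) {
            return 0.0; // nothing that could be mixed
        }

        // Count how many documents fall into each partition.
        Map<String, Integer> counts = new HashMap<>();
        for (String id : partitionIds) {
            counts.merge(id, 1, Integer::sum);
        }

        double sumOfSquares = 0.0;
        for (int c : counts.values()) {
            sumOfSquares += (double) c * c;
        }

        // n^2 - sum(c_i^2) is twice the number of cross-partition pairs.
        double total = (double) n * n;
        double mixed = total - sumOfSquares;

        // Value of the same quantity when documents are spread evenly
        // over all partitions (assumed normalization).
        double maxMixed = total * (partitionCount - 1) / partitionCount;

        return mixed / maxMixed;
    }
}
```

With three partitions, a cluster whose documents all carry the same partition id scores 0, and a cluster with one document from each partition scores 1, matching the two cases described in the text. Values in between depend on the chosen measure, so they should not be compared with numbers reported by Carrot2.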
| Modifier and Type | Field and Description |
|---|---|
| List<Cluster> | clusters |
| static String | CONTAMINATION: Key for the contamination value of a cluster. |
| List<Document> | documents |
| boolean | enabled: Calculate contamination metric. |
| String | partitionIdFieldName: Partition id field name. |
| double | weightedAverageContamination: Average contamination of the whole cluster set, weighted by the size of cluster. |
| Constructor and Description |
|---|
| ContaminationMetric() |
public static final String CONTAMINATION
@Processing @Output @Attribute public double weightedAverageContamination
@Processing @Input @Attribute public boolean enabled
@Processing @Input @Attribute(key="documents") public List<Document> documents
@Processing @Input @Attribute(key="clusters") public List<Cluster> clusters
@Input @Processing @Attribute public String partitionIdFieldName
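The weightedAverageContamination output is described as the average contamination of the whole cluster set, weighted by cluster size. A minimal sketch of such a size-weighted average follows. The class WeightedAverageSketch and its method are hypothetical and are not part of Carrot2. The sketch also assumes the caller supplies each cluster's size, and how Carrot2 itself counts that size is not stated here.

```java
/** Illustrative sketch of a size-weighted average; not Carrot2 code. */
final class WeightedAverageSketch {

    /**
     * @param contaminations contamination value of each top-level cluster
     * @param clusterSizes   number of documents in the corresponding cluster
     * @return sum(size_i * contamination_i) / sum(size_i), or 0 if all sizes are 0
     */
    static double weightedAverage(double[] contaminations, int[] clusterSizes) {
        if (contaminations.length != clusterSizes.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }

        double weightedSum = 0.0;
        long totalSize = 0;
        for (int i = 0; i < contaminations.length; i++) {
            weightedSum += contaminations[i] * clusterSizes[i];
            totalSize += clusterSizes[i];
        }

        // Avoid division by zero for an empty cluster set.
        return totalSize == 0 ? 0.0 : weightedSum / totalSize;
    }
}
```

Weighting by size means that a large pure cluster pulls the average down more than a small contaminated cluster pulls it up, so the result reflects the share of documents affected and not just the share of clusters.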
public void calculate()
Specified by IClusteringMetric. Processing Input attributes will have been bound before a call to this method.
public boolean isEnabled()
Specified by IClusteringMetric. Returns true if this metric should be calculated.