@Bindable(inherit=LexicalDataLoader.class) public class DefaultLexicalDataFactory extends Object implements ILexicalDataFactory
resourceLookup, reloadResources, mergeResources.

| Modifier and Type | Field and Description |
|---|---|
| boolean | mergeResources: Merges stop words and stop labels from all known languages. |
| boolean | reloadResources |
| ResourceLookup | resourceLookup |

| Constructor and Description |
|---|
| DefaultLexicalDataFactory() |

| Modifier and Type | Method and Description |
|---|---|
| ILexicalData | getLexicalData(LanguageCode languageCode): The main logic for acquiring a shared ILexicalData instance. |
| static HashSet<String> | load(IResource resource): Loads words from a given IResource (UTF-8, one word per line, #-starting lines are considered comments). |

@Processing @Input @Attribute(key="reload-resources", inherit=true) public boolean reloadResources
@Processing @Input @Attribute(key="merge-resources") @Label(value="Merge lexical resources") @Level(value=MEDIUM) @Group(value="Preprocessing") public boolean mergeResources
Merges stop words and stop labels from all known languages. If set to false, only stop words and stop labels of the active language will be used. If set to true, stop words from all LanguageCodes will be used together and stop labels from all languages will be used together, regardless of the active language. Lexical resource merging is useful when clustering data in a mix of different languages and should increase clustering quality in such settings.
@Processing @Input @Internal @Attribute(key="resource-lookup", inherit=true) public ResourceLookup resourceLookup
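The effect of the mergeResources attribute described above can be illustrated with a minimal sketch. This is not the factory's actual implementation: the class MergeResourcesSketch, the method effectiveStopWords, and the use of plain string language keys in place of LanguageCode are all hypothetical names chosen for illustration.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of the documented API.
class MergeResourcesSketch {
    /**
     * Returns the stop words that would apply for the active language.
     * With mergeResources == true, the sets of all languages are combined;
     * otherwise only the active language's set is used.
     */
    static Set<String> effectiveStopWords(Map<String, Set<String>> byLanguage,
                                          String activeLanguage,
                                          boolean mergeResources) {
        Set<String> result = new HashSet<>();
        if (mergeResources) {
            // Union of every language's stop words, regardless of the active one.
            for (Set<String> words : byLanguage.values()) {
                result.addAll(words);
            }
        } else {
            // Only the active language contributes (empty if it has no resources).
            Set<String> own = byLanguage.get(activeLanguage);
            if (own != null) {
                result.addAll(own);
            }
        }
        return result;
    }
}
```

The same selection would apply to stop labels. Merging costs nothing at lookup time once the combined set is built, which is one reason a single shared set per configuration is a natural design for mixed-language input.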
public ILexicalData getLexicalData(LanguageCode languageCode)
The main logic for acquiring a shared ILexicalData instance.
Specified by: getLexicalData in interface ILexicalDataFactory
public static HashSet<String> load(IResource resource) throws IOException
Loads words from a given IResource (UTF-8, one word per line, #-starting lines are considered comments).
Throws: IOException
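The word-list format that load reads (UTF-8, one word per line, lines starting with # treated as comments) can be parsed along the following lines. This is a sketch under stated assumptions, not the library's code: it reads from a plain InputStream rather than an IResource, the class name WordListSketch is hypothetical, and trimming whitespace and skipping blank lines are assumptions that the documentation does not confirm.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;

// Hypothetical illustration only; not part of the documented API.
class WordListSketch {
    /** Reads a UTF-8 word list: one word per line, '#' lines are comments. */
    static HashSet<String> load(InputStream in) throws IOException {
        HashSet<String> words = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim(); // assumption: surrounding whitespace is ignored
                if (line.isEmpty() || line.startsWith("#")) {
                    continue; // skip blank lines (assumption) and comment lines
                }
                words.add(line);
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        String data = "# sample stop words\nthe\nand\n\nof\n";
        InputStream in = new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8));
        System.out.println(load(in));
    }
}
```

Returning a HashSet mirrors the documented return type and gives constant-time membership checks during preprocessing.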