Package org.getalp.dbnary.languages.nld
Class WiktionaryExtractor
- java.lang.Object
  - org.getalp.dbnary.languages.AbstractWiktionaryExtractor
    - org.getalp.dbnary.languages.nld.WiktionaryExtractor
All Implemented Interfaces: IWiktionaryExtractor
public class WiktionaryExtractor extends AbstractWiktionaryExtractor
Author: malick
Field Summary
Modifier and Type                          Field
protected static Pattern                   languageSectionPattern
protected static String                    languageSectionPatternString
protected static HashMap<String,String>    nymMarkerToNymName
protected static Pattern                   pronPattern
protected static String                    pronPatternString
protected static Pattern                   sectionPattern
protected static String                    sectionPatternString
Fields inherited from class org.getalp.dbnary.languages.AbstractWiktionaryExtractor
debutOrfinDecomPatternString, expander, NON_STANDARD_LANGUAGE_MAPPINGS, pageContent, wdh, wi, xmlCommentPattern
Constructor Summary
WiktionaryExtractor(IWiktionaryDataHandler wdh)
Method Summary
Modifier and Type    Method
void                 extractData()
void                 extractDefinition(String definition, int defLevel)
protected void       extractDefinitions(int startOffset, int endOffset)
void                 extractExample(String example)
void                 extractExample(Matcher definitionMatcher)
protected void       extractNetherlandData(int startOffset, int endOffset)
void                 setWiktionaryIndex(WiktionaryPageSource wi)
Methods inherited from class org.getalp.dbnary.languages.AbstractWiktionaryExtractor
cleanUpMarkup, cleanUpMarkup, computeRegionEnd, computeStatistics, convertToHumanReadableForm, extractData, extractDefinition, extractNyms, extractOrthoAlt, filterOutPage, getWiktionaryPageName, populateMetadata, postProcessData, postProcessModel, removeXMLComments, setWiktionaryPageName, stripParentheses, validateAndStandardizeLanguageCode
Field Detail
languageSectionPatternString
protected static final String languageSectionPatternString
See Also: Constant Field Values

sectionPatternString
protected static final String sectionPatternString

pronPatternString
protected static final String pronPatternString
See Also: Constant Field Values

languageSectionPattern
protected static final Pattern languageSectionPattern

sectionPattern
protected static final Pattern sectionPattern

pronPattern
protected static final Pattern pronPattern
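
The fields above pair each `...PatternString` constant with a compiled `Pattern`, and several methods take a `startOffset`/`endOffset` range. This page does not document the pattern values or how the offsets are computed, so the following is only a minimal sketch of that general idiom. The class `SectionPatternSketch`, its `Section` type, and the heading regex (a generic `==Title==` heading) are all hypothetical and are not taken from dbnary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SectionPatternSketch {
    // Hypothetical pattern string: NOT the actual value of sectionPatternString.
    // Matches a level-2 heading such as "==Noun==" on a line of its own.
    static final String SECTION_PATTERN_STRING =
            "(?m)^==[ \\t]*([^=\\n]+?)[ \\t]*==[ \\t]*$";

    // Compiled once and reused, mirroring the String/Pattern field pairing.
    static final Pattern SECTION_PATTERN = Pattern.compile(SECTION_PATTERN_STRING);

    // Hypothetical holder for a section title and the offsets of its body.
    static final class Section {
        final String title;
        final int startOffset;
        final int endOffset;

        Section(String title, int startOffset, int endOffset) {
            this.title = title;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    // Splits the page into sections; each body runs from the end of its
    // heading to the start of the next heading (or the end of the page).
    static List<Section> sections(String page) {
        List<Section> out = new ArrayList<>();
        Matcher m = SECTION_PATTERN.matcher(page);
        String title = null;
        int start = -1;
        while (m.find()) {
            if (title != null) {
                out.add(new Section(title, start, m.start()));
            }
            title = m.group(1);
            start = m.end();
        }
        if (title != null) {
            out.add(new Section(title, start, page.length()));
        }
        return out;
    }

    public static void main(String[] args) {
        String page = "==Noun==\nfirst\n==Verb==\nsecond";
        for (Section s : sections(page)) {
            String body = page.substring(s.startOffset, s.endOffset).trim();
            System.out.println(s.title + ": " + body);
        }
    }
}
```

An extractor built this way would pass each offset pair to a range-based method, in the spirit of `extractDefinitions(int startOffset, int endOffset)`; whether the real class does so is an assumption.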
Constructor Detail
WiktionaryExtractor
public WiktionaryExtractor(IWiktionaryDataHandler wdh)
Method Detail
setWiktionaryIndex
public void setWiktionaryIndex(WiktionaryPageSource wi)
Specified by: setWiktionaryIndex in interface IWiktionaryExtractor
Overrides: setWiktionaryIndex in class AbstractWiktionaryExtractor

extractData
public void extractData()
Specified by: extractData in class AbstractWiktionaryExtractor

extractNetherlandData
protected void extractNetherlandData(int startOffset, int endOffset)

extractDefinitions
protected void extractDefinitions(int startOffset, int endOffset)
Overrides: extractDefinitions in class AbstractWiktionaryExtractor

extractDefinition
public void extractDefinition(String definition, int defLevel)
Overrides: extractDefinition in class AbstractWiktionaryExtractor

extractExample
public void extractExample(Matcher definitionMatcher)
Overrides: extractExample in class AbstractWiktionaryExtractor

extractExample
public void extractExample(String example)
Overrides: extractExample in class AbstractWiktionaryExtractor