Package org.getalp.dbnary.languages.zho
Class WiktionaryExtractor
java.lang.Object
  org.getalp.dbnary.languages.AbstractWiktionaryExtractor
    org.getalp.dbnary.languages.zho.WiktionaryExtractor

All Implemented Interfaces: IWiktionaryExtractor

public class WiktionaryExtractor extends AbstractWiktionaryExtractor

Field Summary
Modifier and Type                          Field
protected static Pattern                   level2HeaderPattern
protected static String                    level2HeaderPatternString
protected static HashMap<String,String>    nymMarkerToNymName
protected static Pattern                   wikiSectionPattern
protected static String                    wikiSectionPatternString

Fields inherited from class org.getalp.dbnary.languages.AbstractWiktionaryExtractor
debutOrfinDecomPatternString, NON_STANDARD_LANGUAGE_MAPPINGS, pageContent, wdh, wi, xmlCommentPattern

Constructor Summary
Constructor
WiktionaryExtractor(IWiktionaryDataHandler wdh)

Method Summary
Modifier and Type    Method
void                 extractData()
void                 extractDefinition(String definition, int defLevel)
void                 extractExample(String example)
protected void       extractNyms(String currentNym, int startOffset, int endOffset)
void                 setWiktionaryIndex(WiktionaryPageSource wi)

Methods inherited from class org.getalp.dbnary.languages.AbstractWiktionaryExtractor
cleanUpMarkup, cleanUpMarkup, computeRegionEnd, computeStatistics, convertToHumanReadableForm, extractData, extractDefinition, extractDefinitions, extractExample, extractOrthoAlt, filterOutPage, getHumanReadableForm, getWiktionaryPageName, populateMetadata, postProcessData, postProcessModel, removeXMLComments, setWiktionaryPageName, stripParentheses, validateAndStandardizeLanguageCode

Field Detail

wikiSectionPatternString
protected static final String wikiSectionPatternString
See Also: Constant Field Values

level2HeaderPatternString
protected static final String level2HeaderPatternString
See Also: Constant Field Values

wikiSectionPattern
protected static final Pattern wikiSectionPattern

level2HeaderPattern
protected static final Pattern level2HeaderPattern
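
The values of level2HeaderPatternString and level2HeaderPattern are not shown on this page. As an illustration only, the sketch below shows one way a pattern for level 2 wikitext headers (==Title==) could be compiled and applied with java.util.regex. The class name Level2HeaderSketch, the pattern string, and the headers helper are all hypothetical and are not taken from DBnary; the real constants may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the DBnary implementation.
class Level2HeaderSketch {
  // Assumed pattern: a line that starts with exactly two '=' signs,
  // followed by a title, followed by two closing '=' signs.
  static final String HEADER_PATTERN_STRING = "^==([^=].*?)==[ \\t]*$";

  // MULTILINE makes '^' and '$' match at every line boundary of the page.
  static final Pattern HEADER_PATTERN =
      Pattern.compile(HEADER_PATTERN_STRING, Pattern.MULTILINE);

  // Collects the titles of all level 2 headers found in the page text.
  static List<String> headers(String pageContent) {
    List<String> titles = new ArrayList<>();
    Matcher m = HEADER_PATTERN.matcher(pageContent);
    while (m.find()) {
      titles.add(m.group(1).trim());
    }
    return titles;
  }

  public static void main(String[] args) {
    // Placeholder section titles; real pages use language names in Chinese.
    String page = "==Chinese==\n===Noun===\n# a definition\n==Japanese==\n";
    System.out.println(headers(page)); // prints [Chinese, Japanese]
  }
}
```

In this sketch the level 3 header (===Noun===) is skipped because the first character after the opening == must not be another = sign.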

Constructor Detail

WiktionaryExtractor
public WiktionaryExtractor(IWiktionaryDataHandler wdh)

Method Detail

setWiktionaryIndex
public void setWiktionaryIndex(WiktionaryPageSource wi)
Specified by: setWiktionaryIndex in interface IWiktionaryExtractor
Overrides: setWiktionaryIndex in class AbstractWiktionaryExtractor

extractData
public void extractData()
Specified by: extractData in class AbstractWiktionaryExtractor

extractDefinition
public void extractDefinition(String definition, int defLevel)
Overrides: extractDefinition in class AbstractWiktionaryExtractor

extractExample
public void extractExample(String example)
Overrides: extractExample in class AbstractWiktionaryExtractor

extractNyms
protected void extractNyms(String currentNym, int startOffset, int endOffset)
Overrides: extractNyms in class AbstractWiktionaryExtractor