Package org.getalp.dbnary.jpn
Class WiktionaryExtractor
java.lang.Object
org.getalp.dbnary.AbstractWiktionaryExtractor
org.getalp.dbnary.jpn.WiktionaryExtractor
- All Implemented Interfaces:
IWiktionaryExtractor
public class WiktionaryExtractor extends AbstractWiktionaryExtractor
- Author:
- serasset
-
Field Summary
Fields
Modifier and Type          Field                          Description
protected boolean          isCurrentlyExtracting
protected static Pattern   level2HeaderPattern
protected static String    level2HeaderPatternString
protected static String    pronounciationPatternString
protected static Pattern   sectionPattern
protected static String    wikiSectionPatternString
Fields inherited from class org.getalp.dbnary.AbstractWiktionaryExtractor
debutOrfinDecomPatternString, glossFilter, pageContent, wdh, wi, xmlCommentPattern
-
Constructor Summary
Constructors
Constructor                                       Description
WiktionaryExtractor(IWiktionaryDataHandler wdh)
-
Method Summary
Modifier and Type   Method                                                Description
void                extractData()
void                extractDefinition(String definition, int defLevel)
void                extractTranslations(int startOffset, int endOffset)
boolean             isCurrentlyExtracting()
void                setWiktionaryIndex(WiktionaryIndex wi)
Methods inherited from class org.getalp.dbnary.AbstractWiktionaryExtractor
cleanUpMarkup, cleanUpMarkup, computeRegionEnd, computeStatistics, convertToHumanReadableForm, extractData, extractDefinition, extractDefinitions, extractExample, extractExample, extractNyms, extractOrthoAlt, filterOutPage, getHumanReadableForm, getWiktionaryPageName, populateMetadata, postProcessData, removeXMLComments, setWiktionaryPageName, stripParentheses
-
Field Details
-
level2HeaderPatternString
- See Also:
- Constant Field Values
-
wikiSectionPatternString
- See Also:
- Constant Field Values
-
pronounciationPatternString
- See Also:
- Constant Field Values
-
sectionPattern
-
level2HeaderPattern
-
isCurrentlyExtracting
protected boolean isCurrentlyExtracting
-
-
Constructor Details
-
WiktionaryExtractor
-
-
Method Details
-
setWiktionaryIndex
public void setWiktionaryIndex(WiktionaryIndex wi)
- Specified by:
setWiktionaryIndex
in interface IWiktionaryExtractor
- Overrides:
setWiktionaryIndex
in class AbstractWiktionaryExtractor
-
isCurrentlyExtracting
public boolean isCurrentlyExtracting()
-
extractData
public void extractData()
- Specified by:
extractData
in class AbstractWiktionaryExtractor
-
extractDefinition
public void extractDefinition(String definition, int defLevel)
- Overrides:
extractDefinition
in class AbstractWiktionaryExtractor
-
extractTranslations
public void extractTranslations(int startOffset, int endOffset)
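extractTranslations takes a pair of integer offsets, and the class holds a level2HeaderPattern, which suggests that sections are located by header matching and then handed on as character ranges of the page text. The following is a minimal, self-contained sketch of that idea only. It does not use the DBnary classes. The regular expression, the SectionOffsets class, the Section holder and the sample page are all assumptions made for illustration, because the actual value of level2HeaderPatternString is not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SectionOffsets {

    // Hypothetical stand-in for level2HeaderPattern: a line of the form "==Title==".
    // The real pattern string used by WiktionaryExtractor is not documented here.
    static final Pattern LEVEL2_HEADER =
            Pattern.compile("^==([^=\\n]+)==[ \\t]*$", Pattern.MULTILINE);

    /** Title of a level-2 section and the [startOffset, endOffset) range of its body. */
    public static final class Section {
        public final String title;
        public final int startOffset;
        public final int endOffset;

        Section(String title, int startOffset, int endOffset) {
            this.title = title;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Splits the page text into level-2 sections; deeper headers stay inside the body. */
    public static List<Section> findLevel2Sections(String pageContent) {
        List<Section> sections = new ArrayList<>();
        Matcher m = LEVEL2_HEADER.matcher(pageContent);
        String title = null;
        int bodyStart = -1;
        while (m.find()) {
            if (title != null) {
                // The previous section ends where the next level-2 header begins.
                sections.add(new Section(title, bodyStart, m.start()));
            }
            title = m.group(1).trim();
            bodyStart = m.end();
        }
        if (title != null) {
            // The last section runs to the end of the page.
            sections.add(new Section(title, bodyStart, pageContent.length()));
        }
        return sections;
    }

    public static void main(String[] args) {
        // Made-up wikitext, not taken from a real Wiktionary page.
        String page = "==Japanese==\n===Noun===\n# a definition\n==English==\n# another one\n";
        for (Section s : findLevel2Sections(page)) {
            System.out.println(s.title + " [" + s.startOffset + ", " + s.endOffset + ")");
        }
    }
}
```

In a sketch like this, each (startOffset, endOffset) pair is the kind of range that a method shaped like extractTranslations(int startOffset, int endOffset) could receive. How the real extractor computes its offsets is not stated on this page.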