Package org.getalp.dbnary.languages.pol
Class WiktionaryExtractor
java.lang.Object
  org.getalp.dbnary.languages.AbstractWiktionaryExtractor
    org.getalp.dbnary.languages.pol.WiktionaryExtractor

All Implemented Interfaces: IWiktionaryExtractor

public class WiktionaryExtractor extends AbstractWiktionaryExtractor
Nested Class Summary
Modifier and Type          Class
protected static class     WiktionaryExtractor.SectionType
-
Field Summary
Modifier and Type                                                   Field
protected ExpandAllWikiModel                                        definitionExpander
protected static Pattern                                            languageSectionPattern
protected static String                                             languageSectionPatternString
protected static HashMap<String,String>                             nymMarkerToNymName
protected static String                                             partOfSpeechPatternString
protected static Pattern                                            polishDefinitionPattern
protected static String                                             polishDefinitionPatternString
protected static Pattern                                            polishNymLinePattern
protected WiktionaryDataHandler                                     polwdh
protected static Pattern                                            sectionPattern
protected static String                                             senseNumberRegExp
protected static String                                             subSection4PatternString
protected static HashMap<String,WiktionaryExtractor.SectionType>    validSectionTemplates
-
Fields inherited from class org.getalp.dbnary.languages.AbstractWiktionaryExtractor
debutOrfinDecomPatternString, NON_STANDARD_LANGUAGE_MAPPINGS, pageContent, wdh, wi, xmlCommentPattern
-
-
Constructor Summary
Constructor
WiktionaryExtractor(IWiktionaryDataHandler wdh)
-
Method Summary
Modifier and Type    Method
void                 extractData()
protected void       extractDefinitions(int startOffset, int endOffset)
protected void       extractNyms(String synRelation, int startOffset, int endOffset)
void                 setWiktionaryIndex(WiktionaryPageSource wi)
-
Methods inherited from class org.getalp.dbnary.languages.AbstractWiktionaryExtractor
cleanUpMarkup, cleanUpMarkup, computeRegionEnd, computeStatistics, convertToHumanReadableForm, extractData, extractDefinition, extractDefinition, extractExample, extractExample, extractOrthoAlt, filterOutPage, getHumanReadableForm, getWiktionaryPageName, populateMetadata, postProcessData, postProcessModel, removeXMLComments, setWiktionaryPageName, stripParentheses, validateAndStandardizeLanguageCode
-
Field Detail
-
senseNumberRegExp
protected static final String senseNumberRegExp
See Also: Constant Field Values
-
languageSectionPatternString
protected static final String languageSectionPatternString
See Also: Constant Field Values
-
partOfSpeechPatternString
protected static final String partOfSpeechPatternString
See Also: Constant Field Values
-
subSection4PatternString
protected static final String subSection4PatternString
See Also: Constant Field Values
-
polishDefinitionPatternString
protected static final String polishDefinitionPatternString
See Also: Constant Field Values
-
polwdh
protected WiktionaryDataHandler polwdh
-
definitionExpander
protected ExpandAllWikiModel definitionExpander
-
languageSectionPattern
protected static final Pattern languageSectionPattern
-
polishDefinitionPattern
protected static final Pattern polishDefinitionPattern
-
polishNymLinePattern
protected static final Pattern polishNymLinePattern
-
sectionPattern
protected static final Pattern sectionPattern
-
validSectionTemplates
protected static final HashMap<String,WiktionaryExtractor.SectionType> validSectionTemplates
-
-
Constructor Detail
-
WiktionaryExtractor
public WiktionaryExtractor(IWiktionaryDataHandler wdh)
-
-
Method Detail
-
setWiktionaryIndex
public void setWiktionaryIndex(WiktionaryPageSource wi)
Specified by: setWiktionaryIndex in interface IWiktionaryExtractor
Overrides: setWiktionaryIndex in class AbstractWiktionaryExtractor
-
extractData
public void extractData()
Specified by: extractData in class AbstractWiktionaryExtractor
-
extractDefinitions
protected void extractDefinitions(int startOffset, int endOffset)
Overrides: extractDefinitions in class AbstractWiktionaryExtractor
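
The signature only tells us that extraction is bounded by two offsets into the page text. The following is a minimal sketch of how such region-bounded matching could look with java.util.regex. It is not the DBnary implementation: the class name DefinitionRegionSketch, the pattern DEF_LINE, and the sample page are all assumptions made for illustration, and the real polishDefinitionPattern may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DefinitionRegionSketch {
    // Hypothetical pattern (not taken from DBnary): a definition line of the
    // form ": (1.1) text", where "1.1" is the sense number.
    static final Pattern DEF_LINE =
        Pattern.compile("^:\\s*\\((\\d+\\.\\d+)\\)\\s*(.+)$", Pattern.MULTILINE);

    // Collect "senseNumber text" for every definition line found between
    // startOffset (inclusive) and endOffset (exclusive) of pageContent.
    static List<String> definitionsIn(String pageContent, int startOffset, int endOffset) {
        List<String> out = new ArrayList<>();
        Matcher m = DEF_LINE.matcher(pageContent);
        m.region(startOffset, endOffset); // restrict matching to the section
        while (m.find()) {
            out.add(m.group(1) + " " + m.group(2).trim());
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up page fragment with two sections.
        String page = "{{znaczenia}}\n: (1.1) pies domowy\n: (1.2) samiec psa\n"
                    + "{{odmiana}}\n: (1.1) nie dotyczy\n";
        int start = page.indexOf("{{znaczenia}}");
        int end = page.indexOf("{{odmiana}}");
        System.out.println(definitionsIn(page, start, end));
    }
}
```

Using Matcher.region keeps lines from the following section out of the result without copying a substring, so any offsets reported by the matcher still refer to the full page text.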
-
extractNyms
protected void extractNyms(String synRelation, int startOffset, int endOffset)
Overrides: extractNyms in class AbstractWiktionaryExtractor
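
The nymMarkerToNymName field and the synRelation parameter suggest that a section marker is mapped to a relation name before the lines of that section are parsed. The sketch below illustrates that idea only. The class NymLineSketch, the map contents, the relation names "syn" and "ant", and both patterns are assumptions; none of them are taken from the DBnary source.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NymLineSketch {
    // Hypothetical mapping from Polish section template names to relation
    // names; the actual contents of nymMarkerToNymName are not documented here.
    static final Map<String, String> MARKER_TO_NYM = new LinkedHashMap<>();
    static {
        MARKER_TO_NYM.put("synonimy", "syn");
        MARKER_TO_NYM.put("antonimy", "ant");
    }

    // Hypothetical line shape: ": (1.1) [[a]], [[b|label]]".
    static final Pattern NYM_LINE =
        Pattern.compile("^:\\s*\\((\\d+\\.\\d+)\\)\\s*(.*)$");
    // A wiki link; group 1 is the link target, any "|label" part is ignored.
    static final Pattern LINK =
        Pattern.compile("\\[\\[([^\\]|]+)(?:\\|[^\\]]*)?\\]\\]");

    // Return the sense number of a nym line, or null if the line does not match.
    static String senseNumber(String line) {
        Matcher m = NYM_LINE.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    // Return the link targets listed on a nym line (empty if it does not match).
    static List<String> targets(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = NYM_LINE.matcher(line);
        if (!m.matches()) {
            return out;
        }
        Matcher link = LINK.matcher(m.group(2));
        while (link.find()) {
            out.add(link.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String line = ": (1.1) [[pies]], [[kundel|kundle]]";
        String relation = MARKER_TO_NYM.get("synonimy");
        System.out.println(relation + " " + senseNumber(line) + " " + targets(line));
    }
}
```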
-
-