Package org.getalp.dbnary.languages.mlg
Class WiktionaryExtractor
java.lang.Object
  org.getalp.dbnary.languages.AbstractWiktionaryExtractor
    org.getalp.dbnary.languages.mlg.WiktionaryExtractor
- All Implemented Interfaces:
- IWiktionaryExtractor
public class WiktionaryExtractor extends AbstractWiktionaryExtractor
- Author:
- roques
-
-
Field Summary
Fields (Modifier and Type, Field):
protected static HashMap<String,String> blockName
protected static Pattern blockPattern
protected static String blockPatternString
protected static HashMap<String,org.getalp.dbnary.languages.mlg.WiktionaryExtractor.Block> blockValue
protected static Pattern defPattern
protected static String defPatternString
protected static Pattern languageSectionPattern
protected static String languageSectionPatternString
protected static Pattern nymsPattern
protected static String nymsPatternString
protected static Pattern pronPattern
protected static String pronPatternString
protected static Pattern tradPattern
protected static String tradPatternString
-
Fields inherited from class org.getalp.dbnary.languages.AbstractWiktionaryExtractor
debutOrfinDecomPatternString, expander, NON_STANDARD_LANGUAGE_MAPPINGS, pageContent, wdh, wi, xmlCommentPattern
-
-
Constructor Summary
Constructors (Constructor):
WiktionaryExtractor(IWiktionaryDataHandler wdh)
-
Method Summary
Methods (Modifier and Type, Method):
void extractData()
protected void extractDataBlock(int startOffset, int endOffset, org.getalp.dbnary.languages.mlg.WiktionaryExtractor.Block currentBlock, String blockString)
protected void extractDataLang(int startOffset, int endOffset, String lang)
protected void extractDefinitions(int start, int end)
protected void extractExample(int start, int end)
protected void extractNyms(int start, int end, String blockString)
protected void extractPOS(String blockString)
protected void extractPron(int start, int end)
protected void extractTranslations(int start, int end)
-
Methods inherited from class org.getalp.dbnary.languages.AbstractWiktionaryExtractor
cleanUpMarkup, cleanUpMarkup, computeRegionEnd, computeStatistics, convertToHumanReadableForm, extractData, extractDefinition, extractDefinition, extractExample, extractExample, extractNyms, extractOrthoAlt, filterOutPage, getWiktionaryPageName, populateMetadata, postProcessData, postProcessModel, removeXMLComments, setWiktionaryIndex, setWiktionaryPageName, stripParentheses, validateAndStandardizeLanguageCode
-
-
Field Detail
-
languageSectionPatternString
protected static final String languageSectionPatternString
- See Also:
- Constant Field Values
-
blockPatternString
protected static final String blockPatternString
- See Also:
- Constant Field Values
-
tradPatternString
protected static final String tradPatternString
- See Also:
- Constant Field Values
-
nymsPatternString
protected static final String nymsPatternString
- See Also:
- Constant Field Values
-
pronPatternString
protected static final String pronPatternString
- See Also:
- Constant Field Values
-
defPatternString
protected static final String defPatternString
- See Also:
- Constant Field Values
-
languageSectionPattern
protected static final Pattern languageSectionPattern
-
blockPattern
protected static final Pattern blockPattern
-
tradPattern
protected static final Pattern tradPattern
-
nymsPattern
protected static final Pattern nymsPattern
-
pronPattern
protected static final Pattern pronPattern
-
defPattern
protected static final Pattern defPattern
-
blockValue
protected static HashMap<String,org.getalp.dbnary.languages.mlg.WiktionaryExtractor.Block> blockValue
-
-
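Each `...PatternString` constant above has a matching compiled `...Pattern` field. The sketch below illustrates that pairing in isolation. The class name, the field names, and above all the regular expression are hypothetical: the real constant values are only reachable through "Constant Field Values" and are not reproduced on this page.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the "xxxPatternString + xxxPattern" pairing.
// The regex is an assumption made for this example, NOT a dbnary constant.
class PatternPairSketch {
    // Assumed heading shape: a line of the form "== Title ==".
    protected static final String sectionPatternString =
        "(?m)^==\\s*([^=\\n]+?)\\s*==\\s*$";

    // Compiled once and shared, mirroring the static Pattern fields above.
    protected static final Pattern sectionPattern =
        Pattern.compile(sectionPatternString);

    // Returns the title of the first matching heading, or null if none.
    static String firstSectionTitle(String wikitext) {
        Matcher m = sectionPattern.matcher(wikitext);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstSectionTitle("intro\n== Title ==\n# text\n"));
    }
}
```

Keeping the source string next to the compiled pattern means the expression is compiled once per class while the raw text stays available, for example to be embedded in a larger expression.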
Constructor Detail
-
WiktionaryExtractor
public WiktionaryExtractor(IWiktionaryDataHandler wdh)
-
-
Method Detail
-
extractData
public void extractData()
- Specified by:
- extractData in class AbstractWiktionaryExtractor
-
extractDataLang
protected void extractDataLang(int startOffset, int endOffset, String lang)
-
extractDataBlock
protected void extractDataBlock(int startOffset, int endOffset, org.getalp.dbnary.languages.mlg.WiktionaryExtractor.Block currentBlock, String blockString)
-
extractPOS
protected void extractPOS(String blockString)
-
extractTranslations
protected void extractTranslations(int start, int end)
-
extractNyms
protected void extractNyms(int start, int end, String blockString)
-
extractPron
protected void extractPron(int start, int end)
-
extractDefinitions
protected void extractDefinitions(int start, int end)
- Overrides:
- extractDefinitions in class AbstractWiktionaryExtractor
-
extractExample
protected void extractExample(int start, int end)
-
-
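The signatures above suggest an offset-based dispatch: `extractData()` locates language sections, `extractDataLang` splits one section into blocks, and `extractDataBlock` handles a single block. The following self-contained sketch shows one way such a flow could be organised. It is inferred from the signatures alone and is not the dbnary implementation: the heading syntax, both patterns, the `log` list, and the class name are assumptions, and the `Block` parameter is omitted because its values are not documented on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: patterns and control flow are assumptions inferred from the
// method signatures, not taken from the dbnary sources.
class OffsetDispatchSketch {
    // Hypothetical headings: "== lang ==" and "=== block ===".
    static final Pattern LANG = Pattern.compile("(?m)^==([^=\\n]+)==$");
    static final Pattern BLOCK = Pattern.compile("(?m)^===([^=\\n]+)===$");

    final String pageContent;
    final List<String> log = new ArrayList<>();

    OffsetDispatchSketch(String pageContent) {
        this.pageContent = pageContent;
    }

    // Hand each language region (heading end to next heading start,
    // or end of page) to extractDataLang.
    void extractData() {
        Matcher m = LANG.matcher(pageContent);
        int start = -1;
        String lang = null;
        while (m.find()) {
            if (lang != null) extractDataLang(start, m.start(), lang);
            lang = m.group(1).trim();
            start = m.end();
        }
        if (lang != null) extractDataLang(start, pageContent.length(), lang);
    }

    // Same idea one level down, restricted to [startOffset, endOffset).
    void extractDataLang(int startOffset, int endOffset, String lang) {
        Matcher m = BLOCK.matcher(pageContent);
        m.region(startOffset, endOffset);
        int start = -1;
        String block = null;
        while (m.find()) {
            if (block != null) extractDataBlock(start, m.start(), lang, block);
            block = m.group(1).trim();
            start = m.end();
        }
        if (block != null) extractDataBlock(start, endOffset, lang, block);
    }

    // Stand-in for the per-block extractors: records what it was given.
    void extractDataBlock(int startOffset, int endOffset, String lang, String blockString) {
        String body = pageContent.substring(startOffset, endOffset).trim();
        log.add(lang + "/" + blockString + ": " + body);
    }
}
```

Passing offsets into the shared page text, rather than substrings, lets every level work on the same buffer without copying it. In the real class, `extractDataBlock` presumably routes to methods such as `extractDefinitions` or `extractTranslations`, but that routing is not described here.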