Package org.getalp.dbnary.languages.ita
Class WiktionaryExtractor
- java.lang.Object
  - org.getalp.dbnary.languages.AbstractWiktionaryExtractor
    - org.getalp.dbnary.languages.ita.WiktionaryExtractor
- All Implemented Interfaces: IWiktionaryExtractor
public class WiktionaryExtractor extends AbstractWiktionaryExtractor
- Author: serasset
Field Summary
Modifier and Type                        Field
protected static Pattern                 carPattern
protected static String                  carPatternString
protected static String                  entrySectionPatternString
protected int                            INIT
protected int                            LANGUE
protected static Pattern                 level2HeaderPattern
protected static String                  level2HeaderPatternString
protected static Pattern                 macroOrLinkOrcarPattern
protected static String                  macroOrLinkOrcarPatternString
protected static HashMap<String,String>  nymMarkerToNymName
protected static String                  pronounciationPatternString
protected static Pattern                 sectionPattern
protected int                            TRAD
protected static String                  wikiSectionPatternString
-
Fields inherited from class org.getalp.dbnary.languages.AbstractWiktionaryExtractor
debutOrfinDecomPatternString, expander, NON_STANDARD_LANGUAGE_MAPPINGS, pageContent, wdh, wi, xmlCommentPattern
-
Constructor Summary
Constructor
WiktionaryExtractor(IWiktionaryDataHandler wdh)
-
Method Summary
Modifier and Type  Method
void               extractData()
void               extractData(WikiText page)
void               extractDefinition(String definition, int defLevel)
void               extractExample(String example)
protected void     extractItalianData(int startOffset, int endOffset)
void               setWiktionaryIndex(WiktionaryPageSource wi)
protected void     setWiktionaryPageName(String wiktionaryPageName)
-
Methods inherited from class org.getalp.dbnary.languages.AbstractWiktionaryExtractor
cleanUpMarkup, cleanUpMarkup, computeRegionEnd, computeStatistics, convertToHumanReadableForm, extractData, extractDefinition, extractDefinitions, extractExample, extractNyms, extractOrthoAlt, filterOutPage, getWiktionaryPageName, populateMetadata, postProcessData, postProcessModel, removeXMLComments, stripParentheses, validateAndStandardizeLanguageCode
-
Field Detail
-
level2HeaderPatternString
protected static final String level2HeaderPatternString
- See Also:
- Constant Field Values
-
entrySectionPatternString
protected static final String entrySectionPatternString
-
wikiSectionPatternString
protected static final String wikiSectionPatternString
- See Also:
- Constant Field Values
-
pronounciationPatternString
protected static final String pronounciationPatternString
- See Also:
- Constant Field Values
-
sectionPattern
protected static final Pattern sectionPattern
-
level2HeaderPattern
protected static final Pattern level2HeaderPattern
-
carPatternString
protected static final String carPatternString
-
macroOrLinkOrcarPatternString
protected static final String macroOrLinkOrcarPatternString
-
macroOrLinkOrcarPattern
protected static final Pattern macroOrLinkOrcarPattern
-
carPattern
protected static final Pattern carPattern
-
INIT
protected final int INIT
- See Also:
- Constant Field Values
-
LANGUE
protected final int LANGUE
- See Also:
- Constant Field Values
-
TRAD
protected final int TRAD
- See Also:
- Constant Field Values
-
Constructor Detail
-
WiktionaryExtractor
public WiktionaryExtractor(IWiktionaryDataHandler wdh)
-
Method Detail
-
extractData
public void extractData()
- Specified by: extractData in class AbstractWiktionaryExtractor
-
extractData
public void extractData(WikiText page)
-
extractItalianData
protected void extractItalianData(int startOffset, int endOffset)
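The (startOffset, endOffset) pair suggests that the extractor works on a region of the page text bounded by level-2 headers. The following is a minimal sketch of that idea, not the class's actual implementation: the regular expression, the class name Level2SectionSketch, the helper findSection and the sample header `== {{-it-}} ==` are all assumptions made for illustration, since the real value of level2HeaderPatternString is not shown on this page.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Level2SectionSketch {
    // Hypothetical pattern: matches lines such as "== title ==" but not "=== title ===".
    // The real level2HeaderPatternString may differ.
    static final Pattern LEVEL2_HEADER =
        Pattern.compile("^==\\s*([^=\\n].*?)\\s*==\\s*$", Pattern.MULTILINE);

    /**
     * Returns {startOffset, endOffset} of the text that follows the level-2
     * header whose title equals wanted, up to the next level-2 header or the
     * end of the page. Returns null when no such header exists.
     */
    static int[] findSection(String pageContent, String wanted) {
        Matcher m = LEVEL2_HEADER.matcher(pageContent);
        int start = -1;
        while (m.find()) {
            if (start >= 0) {
                // The next level-2 header closes the region we opened earlier.
                return new int[] {start, m.start()};
            }
            if (m.group(1).equals(wanted)) {
                start = m.end();
            }
        }
        return start >= 0 ? new int[] {start, pageContent.length()} : null;
    }

    public static void main(String[] args) {
        String page = "== {{-it-}} ==\nciao\n== {{-en-}} ==\nhello\n";
        int[] region = findSection(page, "{{-it-}}");
        System.out.println(page.substring(region[0], region[1]).trim());
    }
}
```

A caller could then hand the two offsets to a method shaped like extractItalianData(startOffset, endOffset), so that only the Italian region is scanned.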
-
extractDefinition
public void extractDefinition(String definition, int defLevel)
- Overrides: extractDefinition in class AbstractWiktionaryExtractor
-
extractExample
public void extractExample(String example)
- Overrides: extractExample in class AbstractWiktionaryExtractor
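This page does not say how a caller obtains the defLevel argument or tells definitions from examples. As a rough sketch only, the code below assumes the common wikitext convention in which definition lines start with one or more `#` characters and example lines add `*` or `:` after them. The class DefinitionLineSketch and its helpers are hypothetical and are not part of the dbnary API.

```java
public class DefinitionLineSketch {
    /** Counts leading '#' characters; 0 means the line is not a definition line. */
    static int defLevel(String line) {
        int level = 0;
        while (level < line.length() && line.charAt(level) == '#') {
            level++;
        }
        return level;
    }

    /** True when the '#' markers are directly followed by '*' or ':' (assumed example marker). */
    static boolean isExample(String line) {
        int i = defLevel(line);
        return i > 0 && i < line.length()
            && (line.charAt(i) == '*' || line.charAt(i) == ':');
    }

    /** Strips the list markers and surrounding whitespace from the line. */
    static String body(String line) {
        int i = defLevel(line);
        while (i < line.length()
            && (line.charAt(i) == '*' || line.charAt(i) == ':' || line.charAt(i) == ' ')) {
            i++;
        }
        return line.substring(i).trim();
    }

    public static void main(String[] args) {
        String[] lines = {"# saluto informale", "#* ''ciao a tutti''", "## sottosenso"};
        for (String line : lines) {
            String kind = isExample(line) ? "example" : "definition";
            System.out.println(kind + " (level " + defLevel(line) + "): " + body(line));
        }
    }
}
```

Under these assumptions, a definition line would be passed on as extractDefinition(body, defLevel) and an example line as extractExample(body).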
-
setWiktionaryIndex
public void setWiktionaryIndex(WiktionaryPageSource wi)
- Specified by: setWiktionaryIndex in interface IWiktionaryExtractor
- Overrides: setWiktionaryIndex in class AbstractWiktionaryExtractor
-
setWiktionaryPageName
protected void setWiktionaryPageName(String wiktionaryPageName)
- Overrides: setWiktionaryPageName in class AbstractWiktionaryExtractor
-