Package org.getalp.dbnary.languages.swe
Class SwedishTableExtractor
java.lang.Object
  org.getalp.dbnary.morphology.HtmlTableHandler
    org.getalp.dbnary.morphology.TableExtractor
      org.getalp.dbnary.languages.swe.SwedishTableExtractor
public class SwedishTableExtractor extends TableExtractor
-
-
Field Summary
-
Fields inherited from class org.getalp.dbnary.morphology.TableExtractor
alreadyParsedTables, currentEntry
-
-
Constructor Summary
Constructor, Description
SwedishTableExtractor()
-
Method Summary
All Methods, Instance Methods, Concrete Methods
Modifier and Type, Method, Description
protected Set<String> getInflectedForms(org.jsoup.nodes.Element cell)
Extracts word forms from a table cell. Splits the cell content on <br> or comma and removes HTML formatting.
protected List<SwedishInflectionData> getInflectionDataFromCellContext(List<String> context)
Returns the inflection data that corresponds to the current cell context.
protected List<String> getRowAndColumnContext(int nrow, int ncol, ArrayMatrix<org.jsoup.nodes.Element> columnHeaders)
protected boolean shouldProcessCell(org.jsoup.nodes.Element cell)
-
Methods inherited from class org.getalp.dbnary.morphology.TableExtractor
addToContext, decodeH2Context, handleNestedTables, handleSimpleCell, isHeaderCell, isNormalCell, parseHTML, parseTable, shouldIgnoreCurrentH2
-
Methods inherited from class org.getalp.dbnary.morphology.HtmlTableHandler
explodeTable, getBackgroundColor
-
Method Detail
-
getRowAndColumnContext
protected List<String> getRowAndColumnContext(int nrow, int ncol, ArrayMatrix<org.jsoup.nodes.Element> columnHeaders)
- Overrides: getRowAndColumnContext in class TableExtractor
-
getInflectionDataFromCellContext
protected List<SwedishInflectionData> getInflectionDataFromCellContext(List<String> context)
Description copied from class: TableExtractor
Returns the inflection data that corresponds to the current cell context. The cell context is a list of Strings corresponding to all column and row headers, plus the section headers, under which the cell appears.
- Specified by: getInflectionDataFromCellContext in class TableExtractor
- Parameters: context - a list of Strings that represents the cell context
- Returns: the inflection data corresponding to the context
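To make the idea of a cell context concrete, here is a minimal sketch of how a list of header strings might be turned into grammatical features. It is not the DBnary implementation: the class CellContextSketch, the method decode, the feature names, and the header labels ("Singular", "Plural", "Nominativ", "Genitiv") are all assumptions made for illustration, and a plain Map stands in for SwedishInflectionData, whose fields are not documented on this page.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not part of DBnary.
final class CellContextSketch {

  // Maps each recognized header in the context to a feature/value pair.
  static Map<String, String> decode(List<String> context) {
    Map<String, String> features = new LinkedHashMap<>();
    for (String header : context) {
      switch (header.trim().toLowerCase(Locale.ROOT)) {
        case "singular":
          features.put("number", "singular");
          break;
        case "plural":
          features.put("number", "plural");
          break;
        case "nominativ":
          features.put("case", "nominative");
          break;
        case "genitiv":
          features.put("case", "genitive");
          break;
        default:
          // Headers this sketch does not recognize are ignored.
          break;
      }
    }
    return features;
  }
}
```

In this sketch, a context such as ["Substantiv", "Plural", "Genitiv"] would yield number=plural and case=genitive, with the unrecognized section header ignored.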
-
getInflectedForms
protected Set<String> getInflectedForms(org.jsoup.nodes.Element cell)
Description copied from class: TableExtractor
Extracts word forms from a table cell. Splits the cell content on <br> or comma and removes HTML formatting.
- Overrides: getInflectedForms in class TableExtractor
- Parameters: cell - the current cell in the inflection table
- Returns: the set of word forms (Strings) found in this cell
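The splitting behaviour described above can be sketched in plain Java. This is an approximation under stated assumptions, not the actual override: the real method receives an org.jsoup.nodes.Element, whereas the hypothetical helper below (class CellFormSplitSketch, method splitForms) works on the cell's inner HTML as a String and strips tags with a regular expression so that it needs only the standard library.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of DBnary.
final class CellFormSplitSketch {

  // Splits the inner HTML of a cell into distinct word forms.
  static Set<String> splitForms(String cellHtml) {
    Set<String> forms = new LinkedHashSet<>();
    // Split on <br>, <br/> or <br /> (case-insensitive), or on commas.
    for (String part : cellHtml.split("(?i)<br\\s*/?>|,")) {
      // Remove any remaining tags, then trim surrounding whitespace.
      String text = part.replaceAll("<[^>]*>", "").trim();
      if (!text.isEmpty()) {
        forms.add(text);
      }
    }
    return forms;
  }
}
```

Returning a Set means a form that appears twice in the same cell is kept only once, which matches the Set<String> return type of getInflectedForms.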
-
shouldProcessCell
protected boolean shouldProcessCell(org.jsoup.nodes.Element cell)
- Overrides: shouldProcessCell in class TableExtractor
-