The Polish Language Extractor is now online

DBnary now contains extractions from 13 Wiktionary language editions, with Polish added today. Polish data is available in the same format as the other languages. The extraction work was really tedious because the Polish language edition uses a rather different micro-structure for its pages. For instance, the definitions of all the lexical entries on a page are gathered in a single definitions section (with the different parts of speech grouped as subsections). This makes the extractor trickier to write.
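
To illustrate why this layout complicates extraction, here is a minimal Python sketch that regroups definitions by part of speech. It is not the DBnary extractor. The sample page, the section markers, the regular expressions and the function name `split_definitions` are all simplified assumptions made for this example.

```python
import re

# Hypothetical, simplified page: one shared definitions section whose
# part-of-speech headers appear as italic sub-lines. Real pages are richer.
SAMPLE = """\
{{znaczenia}}
''rzeczownik, rodzaj męski''
: (1.1) first noun sense
: (1.2) second noun sense
''czasownik''
: (2.1) first verb sense
{{odmiana}}
: (1.1) inflection notes, not a definition
"""

SECTION_LINE = re.compile(r"^\{\{(\w+)\}\}")               # e.g. {{znaczenia}}
POS_LINE = re.compile(r"^''(.+?)''\s*$")                    # e.g. ''czasownik''
DEF_LINE = re.compile(r"^:\s*\((\d+\.\d+)\)\s*(.+)$")       # e.g. : (1.1) text


def split_definitions(wikitext, section="znaczenia"):
    """Group numbered definitions under the part-of-speech line preceding them."""
    entries = {}
    in_section = False
    current_pos = None
    for line in wikitext.splitlines():
        match = SECTION_LINE.match(line)
        if match:
            # A new section starts: only the definitions section is of interest.
            in_section = match.group(1) == section
            current_pos = None
            continue
        if not in_section:
            continue
        match = POS_LINE.match(line)
        if match:
            current_pos = match.group(1)
            entries.setdefault(current_pos, [])
            continue
        match = DEF_LINE.match(line)
        if match and current_pos is not None:
            entries[current_pos].append((match.group(1), match.group(2)))
    return entries


if __name__ == "__main__":
    for pos, senses in split_definitions(SAMPLE).items():
        print(pos, senses)
```

The point of the sketch is that a lexical entry is not a page section of its own: it has to be reconstructed from the part-of-speech lines inside the shared definitions section, and the other sections then refer back to it through the sense numbers.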

Moreover, the part-of-speech information is rather detailed in the Polish language edition, with many subcategorization options. Almost all of this subcategorization information is extracted and rendered using the standard lexinfo vocabulary.
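
As a rough illustration of the idea, the Python sketch below maps a comma-separated Polish part-of-speech label to lexinfo property/value pairs. The mapping tables and the function name `pos_to_lexinfo` are assumptions made for this example and cover only a handful of labels. They are not the mapping DBnary actually uses.

```python
# Namespace of the lexinfo 2.0 ontology.
LEXINFO = "http://www.lexinfo.net/ontology/2.0/lexinfo#"

# Illustrative (assumed) mapping from Polish labels to lexinfo parts of speech.
POS_MAP = {
    "rzeczownik": "noun",
    "czasownik": "verb",
    "przymiotnik": "adjective",
    "przysłówek": "adverb",
}

# Illustrative (assumed) mapping from sub-categorization labels to
# (lexinfo property, lexinfo value) pairs.
FEATURE_MAP = {
    "rodzaj męski": ("gender", "masculine"),
    "rodzaj żeński": ("gender", "feminine"),
    "rodzaj nijaki": ("gender", "neuter"),
}


def pos_to_lexinfo(label):
    """Return (property IRI, value IRI) pairs for a label such as
    'rzeczownik, rodzaj żeński'. Unknown parts are silently skipped."""
    parts = [part.strip() for part in label.split(",")]
    pairs = []
    if parts[0] in POS_MAP:
        pairs.append((LEXINFO + "partOfSpeech", LEXINFO + POS_MAP[parts[0]]))
    for part in parts[1:]:
        if part in FEATURE_MAP:
            prop, value = FEATURE_MAP[part]
            pairs.append((LEXINFO + prop, LEXINFO + value))
    return pairs


if __name__ == "__main__":
    for prop, value in pos_to_lexinfo("rzeczownik, rodzaj żeński"):
        print(prop, value)
```

A real mapping would need many more entries and would have to cope with labels that are not cleanly comma-separated, which this sketch deliberately ignores.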