Author: Gilles Sérasset

4 more languages immediately available

Thanks to Malick Diagne and Steve Roques, DBnary now extracts data from the Dutch, Lithuanian, Serbo-Croat and Swedish editions. The Serbo-Croat extractor also extracts morphological information.

WikDict, a web service based on DBnary data

Karl Bartel has created a very simple web service to look up translations in many languages. The data powering WikDict is provided by DBnary. The service is still preliminary, but I’m quite sure it will improve with time. Go to

Improvement in Russian extractor

Several bugs were fixed in the Russian extractor. Definitions should now be extracted more completely, and empty definitions should no longer be extracted. Translation extraction has also been modified to avoid (rather infrequent) special cases where

French and German morphology is now extracted

Since December 2014, morphological data has been extracted from the French and German language editions. This data is currently stored in exhaustive form, meaning that every inflected form may be found in a lemon:otherForm property. German morphology extraction is still rather preliminary, but French
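To illustrate what this exhaustive representation implies, here is a minimal sketch that models a lexical entry's triples as plain Python tuples and collects every inflected form reachable through lemon:otherForm. The `ex:` identifiers, the sample forms and the helper name `other_forms` are hypothetical, and the sketch assumes that each form node carries its spelling in lemon:writtenRep, as in the lemon model. It is not taken from the DBnary code base.

```python
# Triples as (subject, predicate, object).
# All "ex:" identifiers below are made up for illustration.
TRIPLES = [
    ("ex:petit__adj__1", "lemon:otherForm", "ex:form_1"),
    ("ex:form_1", "lemon:writtenRep", "petite"),
    ("ex:petit__adj__1", "lemon:otherForm", "ex:form_2"),
    ("ex:form_2", "lemon:writtenRep", "petits"),
    ("ex:grand__adj__1", "lemon:otherForm", "ex:form_3"),
    ("ex:form_3", "lemon:writtenRep", "grande"),
]


def other_forms(entry, triples):
    """Return the written representations of all otherForm nodes of `entry`."""
    # Step 1: find the form nodes attached to the entry.
    form_nodes = {o for s, p, o in triples
                  if s == entry and p == "lemon:otherForm"}
    # Step 2: read the spelling of each of those form nodes.
    return sorted(o for s, p, o in triples
                  if s in form_nodes and p == "lemon:writtenRep")


print(other_forms("ex:petit__adj__1", TRIPLES))
```

Against real data, the same two-step lookup would typically be written as a SPARQL query rather than a list scan.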

English extraction improved

English extraction has been improved slightly. More parts of speech are now extracted (mainly affixes, proverbs, …). Extracted data is now more precisely typed and described, using the lexinfo vocabulary. Additionally, all lexical entries are now explicitly typed as LexicalEntry while

Usage examples are now extracted and attached to word senses

During the French “TALN” workshop, several French researchers asked me if I could add usage examples to the extracted data. This has been done for French, with the addition of a new property (named dbnary:exampleSource) that gives the source of

The Polish Language Extractor is now online

DBnary now contains extractions from 13 Wiktionary language editions, with Polish being added today. Polish data is available in the very same format as the other languages. The extraction work has been really tedious as the Polish language edition uses a

Setting up a virtuoso-opensource server to mirror DBnary data

This post details the steps necessary to install a Virtuoso server and bulk load Wiktionary data. You’ll have to adapt them to your own settings. Download and Compile virtuoso-opensource. First, you’ll have to install several development libraries (this is for Debian): sudo

Translations are now connected to word senses

A translation is supposed to connect a source word sense to a target word sense. However, current DBnary data connects a lexical entry to a target string. By using available glosses, we are able to connect translations to their source
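The idea of using glosses to attach a translation to a source word sense can be illustrated with a minimal sketch. This is not DBnary's actual algorithm: it assumes a naive token-overlap (Jaccard) score between the translation gloss and each sense definition, and the names `tokenize`, `jaccard`, `best_sense` and the sample data are all hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_sense(gloss, senses):
    """Return the id of the sense whose definition best overlaps the gloss.

    `senses` maps a sense id to its definition text. Returns None when no
    definition shares a token with the gloss.
    """
    gloss_tokens = tokenize(gloss)
    best_id, best_score = None, 0.0
    for sense_id, definition in senses.items():
        score = jaccard(gloss_tokens, tokenize(definition))
        if score > best_score:
            best_id, best_score = sense_id, score
    return best_id


# Hypothetical data: two senses of "bank" and a translation gloss.
senses = {
    "bank_1": "An institution where one can deposit and borrow money",
    "bank_2": "The sloping land along the edge of a river",
}
print(best_sense("edge of a river", senses))  # prints "bank_2"
```

A translation whose gloss matches no sense stays attached to the lexical entry only (`best_sense` returns `None`), which keeps the sketch conservative.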

DBnary now supports 12 language editions

With the addition of Bulgarian and Spanish, the DBnary data now covers 12 language editions. More to come, we hope!