Gilles Sérasset

More examples extracted from Wiktionary

By Gilles Sérasset Posted on July 15, 2024 Posted in Uncategorized

Wiktionary dictionary entries sometimes include examples or citations, and we already extracted some of them for certain languages. With the work of Emmanuel Mayoux-Li, our undergrad intern, we added more languages and also fixed obsolete extractors to better get such …

Gaelic and Catalan editions are now part of DBnary

By Gilles Sérasset Posted on July 19, 2023 Posted in Uncategorized Tagged with extractor, Morphology

Gaelic and Catalan editions are now part of DBnary, which brings the number of language editions extracted from Wiktionary to 25! Thanks to Arnaud Alet (a first-year student in Grenoble) for creating the two extractors.

Examples are now extracted from the English version

By Gilles Sérasset Posted on December 4, 2022 Posted in Uncategorized

The English extractor should now correctly extract examples and citations (along with references) from the English language edition. Such examples were initially extracted for the French edition, and we will try to extend this extraction to all other editions.

Eager to meet the Exolexica ?

By Gilles Sérasset Posted on January 29, 2022 Posted in Announce Tagged with exolexicon, HDT

From its beginning, the DBnary dataset has chosen to extract what I call the endolexicon, i.e. the language data corresponding to the extracted language edition (e.g. French data is extracted from the French language edition). Since its 20220120 edition, DBnary …

DBnary dataset is now made available in HDT

By Gilles Sérasset Posted on January 29, 2022 Posted in Announce Tagged with HDT

You can now directly download an HDT* version of a whole language edition dataset (i.e. core, statistics, lime, enhanced translations, morphology and etymology (when available)). Everything is combined in one HDT file that is directly usable. The HDT data may …

A new dashboard will give you latest statistics on DBnary dataset

By Gilles Sérasset Posted on August 24, 2021 Posted in Uncategorized

Since 2020, the statistics on DBnary datasets have been provided in RDF using the Data Cube vocabulary. As a result, the graphs available on this site were inaccurate (they displayed older stats). We have built a new dashboard WordPress plugin …

Pre 2017 extracts are now available on Zenodo only

By Gilles Sérasset Posted on August 24, 2021 Posted in Announce, Uncategorized Tagged with Archive, Zenodo

The DBnary project has extracted every Wiktionary dump from mid-2012 (for the earliest extracted language) until now. In July 2017, the extracts were transitioned from the original lemon model to the now de facto standard OntoLex model. As we are lacking …

Kurdish language edition is now part of DBnary

By Gilles Sérasset Posted on July 9, 2021 Posted in Announce Tagged with extractor, Kurdish

DBnary adds a new language to its collection: Kurdish. Since July 1st, 2021, the Kurdish lexical data has been available in the DBnary dataset. This brings the collection to 22 languages.

Turkish extractor has been rewritten

By Gilles Sérasset Posted on July 22, 2020 Posted in Announce Tagged with extractor, Turkish

For almost a year, the Turkish data was empty because the extractor had not been fixed after the Turkish Wiktionary community drastically changed the way pages were encoded. This has now been fixed with a new version of the Turkish …

Swedish Morphology Extraction is available

By Gilles Sérasset Posted on July 22, 2020 Posted in Announce Tagged with Morphology, swedish

The DBnary extractor now extracts Swedish morphology. This data is available from the 20200701 extraction onwards, which used version 2.3.1 of the extractor. The morphological data may be downloaded and will be available as soon as we upload …

Author: Gilles Sérasset