Open source

The software and computational linguistic tools developed by the Dutch Language Institute are available as open source: external users can access the source code and adapt it for their own purposes. Examples are the corpus retrieval engine BlackLab and the morphological parser MBMP.

BlackLab

BlackLab is a corpus search system built on Apache Lucene. It performs complex searches quickly across the large, annotated text collections in our historical and contemporary corpora, and it highlights the search results within the text. We will make our corpora accessible through this search system. A beta version of the Gysseling Corpus can already be consulted.
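To give an impression of what a search against such an annotated corpus can look like, the sketch below builds a request URL for a BlackLab Server instance. It assumes the server exposes a hits endpoint that takes a Corpus Query Language pattern in the patt parameter, as described in the BlackLab documentation. The server address, the corpus name and the lemma annotation are hypothetical placeholders, not details of our own installation.

```python
from urllib.parse import urlencode


def build_hits_url(server_url, corpus, cql_pattern, max_hits=20):
    """Build a URL that asks a BlackLab Server for hits matching a CQL pattern.

    Assumption: the server follows the documented layout
    <server>/<corpus>/hits?patt=...&number=...&outputformat=json
    """
    query = urlencode(
        {"patt": cql_pattern, "number": max_hits, "outputformat": "json"}
    )
    return f"{server_url.rstrip('/')}/{corpus}/hits?{query}"


# Hypothetical server and corpus; the pattern matches every token
# whose (assumed) lemma annotation equals "huis".
url = build_hits_url(
    "https://example.org/blacklab-server", "my-corpus", '[lemma="huis"]'
)
print(url)
```

The pattern is URL-encoded because Corpus Query Language uses brackets and quotation marks, which cannot appear unescaped in a query string.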

MBMP (Memory Based Morphological Parser)

MBMP is a memory-based morphological parser for Python. It analyses words morphologically: it can segment a word into morphemes, assign part-of-speech tags to those morphemes, or produce a complete hierarchical analysis. The package also offers the functionality of a generic memory-based classifier. We developed this tool specifically for the morphological component of GiGaNT.
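The idea behind memory-based segmentation can be illustrated with a small, self-contained sketch. This is not MBMP's code or API: every name below is hypothetical, the training words are a toy sample, and the classifier is a plain nearest-neighbour lookup with an overlap metric. Each letter is described by a window of surrounding characters, and the classifier predicts whether a morpheme boundary follows that letter by recalling the most similar stored window.

```python
from collections import Counter


def windows(word, size=2):
    """Yield one character window per letter: `size` characters of context on each side."""
    padded = "_" * size + word + "_" * size
    for i in range(len(word)):
        yield tuple(padded[i:i + 2 * size + 1])


def training_instances(segmented_words):
    """Turn words such as 'werk+en' into (window, label) pairs.

    Label 1 means a morpheme boundary follows the letter, 0 means it does not.
    """
    instances, labels = [], []
    for segmented in segmented_words:
        morphemes = segmented.split("+")
        word = "".join(morphemes)
        boundaries, position = set(), 0
        for morpheme in morphemes[:-1]:
            position += len(morpheme)
            boundaries.add(position - 1)  # index of the morpheme's last letter
        for index, window in enumerate(windows(word)):
            instances.append(window)
            labels.append(1 if index in boundaries else 0)
    return instances, labels


class MemoryBasedClassifier:
    """Generic k-nearest-neighbour classifier that keeps all training instances in memory."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []

    def fit(self, instances, labels):
        self.memory = list(zip(instances, labels))
        return self

    def predict(self, instance):
        def overlap(stored):
            # Number of feature positions on which the two instances agree.
            return sum(a == b for a, b in zip(stored, instance))

        nearest = sorted(
            self.memory, key=lambda item: overlap(item[0]), reverse=True
        )[: self.k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]


def segment(classifier, word):
    """Insert '+' after every letter for which a boundary is predicted."""
    parts = []
    for letter, window in zip(word, windows(word)):
        parts.append(letter)
        if classifier.predict(window) == 1:
            parts.append("+")
    return "".join(parts)


# Toy training data (hypothetical, for illustration only).
instances, labels = training_instances(["werk+en", "boek+en", "fiets+er", "huis+je"])
classifier = MemoryBasedClassifier(k=1).fit(instances, labels)
print(segment(classifier, "denken"))
```

Because the classifier itself only compares feature tuples, the same class could be reused for other tasks, such as assigning part-of-speech tags to morphemes, by changing how instances and labels are built.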

Tags: tools

Contact: Jan Niestadt

Last modified: 03/09/2025
