
Belgian Chamber of Representatives: making 183 million words more accessible

Plenum.be collects 183 million words from debates in the Belgian Chamber of Representatives. The scanned pages are converted to text with OCR. We updated the process and the website with a wonderful research tool courtesy of UAntwerp digital humanities.

Challenge

Both the website and the process needed an update: a more user-friendly experience on the outside, and a new database and taxonomy on the inside.

Solution

Our solution touched many aspects of Plenum.be. We implemented MongoDB, a document database that is well suited to text files, scales well and works through an API.
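To illustrate why a document database suits this kind of material, one intervention from a debate could be stored as a single document. This is only a sketch: the field names (`session_date`, `speaker`, `intervention_type`, `text`, `word_count`) and the sample values are hypothetical placeholders, not the actual Plenum.be schema or data. The code builds the document as a plain Python dictionary and does not connect to a database.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Intervention:
    """One intervention in a plenary debate (hypothetical schema)."""
    session_date: str        # ISO date of the sitting (placeholder value below)
    speaker: str             # speaker name as found in the proceedings
    intervention_type: str   # assumed label, e.g. "question" or "reply"
    text: str                # OCR output for this intervention


def to_document(item: Intervention) -> dict:
    """Convert an intervention to a JSON-like dict and add a word count."""
    doc = asdict(item)
    doc["word_count"] = len(item.text.split())
    return doc


# Placeholder values for illustration only, not real debate data.
example = Intervention(
    session_date="1900-01-01",
    speaker="Example Speaker",
    intervention_type="question",
    text="Placeholder text standing in for OCR output.",
)
document = to_document(example)
print(json.dumps(document, indent=2))
```

A dictionary of this shape is what a driver call such as PyMongo's `insert_one` accepts, so text and metadata can travel together in one record.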

We introduced an improved taxonomy using Ocelot NLP techniques, along with improved OCR techniques such as automatic layout analysis.

We also added an API-powered search function, and enriched the data with speaker and intervention-type metadata to research how this would improve accessibility.
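As a rough sketch of how speaker and intervention-type metadata can narrow a full-text search, the function below builds a MongoDB-style query filter. It is not the Plenum.be search API: the function name `build_search_filter` and the field names `speaker` and `intervention_type` are assumptions made for illustration. The `$text` and `$search` operators are MongoDB's standard full-text query operators and require a text index on the collection.

```python
from typing import Optional


def build_search_filter(
    terms: str,
    speaker: Optional[str] = None,
    intervention_type: Optional[str] = None,
) -> dict:
    """Build a MongoDB-style filter: full-text terms plus optional metadata."""
    if not terms.strip():
        raise ValueError("terms must not be empty")
    # Full-text part of the query; relies on a text index existing.
    query: dict = {"$text": {"$search": terms}}
    # Metadata constraints (hypothetical field names) are added only when given.
    if speaker is not None:
        query["speaker"] = speaker
    if intervention_type is not None:
        query["intervention_type"] = intervention_type
    return query


query = build_search_filter("pension reform", intervention_type="question")
print(query)
```

A filter like this could be passed to a driver call such as PyMongo's `find`. Keeping the metadata constraints optional means the same function serves both a plain keyword search and a search restricted to one speaker or one type of intervention.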

For the UX design, we led an ideation session at the start of the project; its results will feed into wireframes later on.

Everything is deployed on a local server.

Result

This project is in progress.