Datasets

True to our roots, we have open source datasets available.

Discover our datasets

Some of our findings are available as open source for transparency. We also offer tailored and standard training programs to expand your knowledge of what happens in the online world.

EN Toxic Word Embeddings

English word embeddings, trained on tendentious language from 4chan.

Request

NL Toxic Word Embeddings

Dutch word embeddings, trained on tendentious language from sources such as GeenStijl.nl.

Request

NL Word Embeddings

Dutch word embeddings, trained on large amounts of public data.

Download

Get in touch
