
Ocelot

With Ocelot, our modular Document Management System, we offer an efficient and sustainable solution to support document processing and management in a fully customizable AI-powered pipeline. Artificial Intelligence enables smart machine reading of document archives and activates the knowledge within this dormant collective brain.


Preprocessing

Documents in different file types (PDF, DOC(X), HTML, XML, XLSX, …) are imported automatically or can be added manually through a drag-and-drop interface. Scanned documents are passed through an OCR (optical character recognition) pipeline. The documents are then cleaned up, and semantic duplicate detection removes redundant information.
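To make the cleanup and duplicate-removal step more concrete, here is a minimal sketch. It is not Ocelot's implementation: it swaps in a simple lexical technique (word shingles compared with Jaccard similarity) as a stand-in for semantic duplicate detection, and the function names and the 0.8 threshold are assumptions chosen for illustration.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences vanish."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = re.findall(r"\w+", normalize(text))
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Share of shingles two documents have in common (1.0 means identical sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def drop_near_duplicates(docs: list, threshold: float = 0.8) -> list:
    """Keep the first occurrence of each document and skip later near-copies."""
    kept, kept_shingles = [], []
    for doc in docs:
        current = shingles(doc)
        if any(jaccard(current, seen) >= threshold for seen in kept_shingles):
            continue  # too similar to a document we already kept
        kept.append(doc)
        kept_shingles.append(current)
    return kept


docs = [
    "The quarterly report was approved by the board.",
    "The  quarterly report was APPROVED by the board.",  # same text, different formatting
    "Invoices are due within thirty days.",
]
unique_docs = drop_near_duplicates(docs)
```

A lexical comparison like this only catches copies that share wording. Detecting documents that say the same thing in different words would require a semantic representation, which this sketch does not attempt.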

Machine Learning

NLP techniques search for meaningful patterns in the data and automatically derive a custom knowledge model. The knowledge model is then used to add searchable metadata to the documents.

Discovery

A customized dashboard is built to present the relevant information clearly to the end user. Ocelot’s modular design makes it possible to optimize the user experience in collaboration with the customer. Data can also be exported to various formats via a REST API.

Self-learning Domain Models

Ocelot has ready-made modules to extract knowledge from documents, such as the names of persons and organizations, keywords, locations, telephone numbers, e-mail addresses, etc. These add a first layer of metadata to the existing data. Then the data itself is activated by deriving a custom domain model. Our models automatically adapt to the data based on state-of-the-art techniques from the domains of machine reading and natural language understanding (NLU).
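As an illustration of what a first metadata layer can look like, the sketch below extracts e-mail addresses and telephone numbers with regular expressions. The patterns, the function name and the output keys are hypothetical and deliberately simple. They do not represent Ocelot's modules, and names of persons, organizations and locations would need proper NLP models rather than patterns.

```python
import re

# Deliberately simple, hypothetical patterns; real extractors need to be more robust.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s./-]{7,}\d")


def extract_contact_metadata(text: str) -> dict:
    """Return de-duplicated e-mail addresses and phone numbers found in the text."""
    return {
        "emails": sorted(set(EMAIL_RE.findall(text))),
        "phones": sorted({match.strip() for match in PHONE_RE.findall(text)}),
    }


sample = "Contact Jane at jane.doe@example.org or call +32 3 123 45 67."
metadata = extract_contact_metadata(sample)
```

The resulting dictionary can be stored alongside the document, so that it becomes searchable by the values it contains.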

Classification models

Ocelot’s open design enables the addition of new classification models on top of the ready-made analysis modules. Our machine learning models can learn from data that has already been classified and derive a classification model that enriches unclassified data with metadata. Ocelot only uses machine learning algorithms from the subdomain of Explainable AI (xAI), which make it possible to visualize which language patterns have led to the classification of a document.
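The idea of a classifier that can show which language patterns drove its decision can be sketched with a small multinomial Naive Bayes model. This is an assumption made for illustration, not a statement about which algorithms Ocelot uses. The class name, the training texts and the labels are all hypothetical. The explanation is the per-word log-probability difference between the predicted label and the runner-up.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


class ExplainableNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing and per-word explanations."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.doc_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def _log_word_prob(self, word, label):
        counts = self.word_counts[label]
        return math.log((counts[word] + 1) / (sum(counts.values()) + len(self.vocab)))

    def scores(self, text):
        """Log prior plus summed log likelihood of the known words, per label."""
        total_docs = sum(self.doc_counts.values())
        tokens = [t for t in tokenize(text) if t in self.vocab]
        return {
            label: math.log(self.doc_counts[label] / total_docs)
            + sum(self._log_word_prob(t, label) for t in tokens)
            for label in self.doc_counts
        }

    def explain(self, text):
        """Return the predicted label and the words that favor it over the runner-up."""
        ranked = sorted(self.scores(text).items(), key=lambda kv: kv[1], reverse=True)
        best, runner_up = ranked[0][0], ranked[1][0]  # needs at least two labels
        tokens = {t for t in tokenize(text) if t in self.vocab}
        contributions = {
            t: self._log_word_prob(t, best) - self._log_word_prob(t, runner_up)
            for t in tokens
        }
        return best, sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical, already classified training data.
texts = [
    "invoice payment due amount",
    "payment invoice total amount due",
    "contract agreement parties signed",
    "agreement contract clause parties",
]
labels = ["invoice", "invoice", "contract", "contract"]
model = ExplainableNaiveBayes().fit(texts, labels)
label, evidence = model.explain("Please settle the invoice payment")
```

Here `evidence` lists each known word with a score. A positive score means the word pushed the document towards the predicted label, which is the kind of signal a dashboard could highlight in the text.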

Document similarity

Ocelot also integrates effective and scalable document similarity techniques. These make it possible to compare overlaps between documents, both in form and in content. This is particularly useful for duplicate searches, where different versions of a document can be compared with one another. Moreover, this technology also allows searches to be performed at paragraph level, retrieving related sections from across the archive.
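A paragraph-level search of this kind can be sketched with TF-IDF vectors and cosine similarity. This is a common baseline chosen here as an assumption. The source does not say which similarity technique Ocelot uses, and the function names and sample archive are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def split_paragraphs(document):
    """Split on blank lines and drop empty paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]


def tfidf_vectors(paragraphs):
    """Return one sparse TF-IDF vector (dict) per paragraph plus the IDF table."""
    token_lists = [tokenize(p) for p in paragraphs]
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    n = len(paragraphs)
    idf = {t: math.log((1 + n) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()} for tokens in token_lists
    ]
    return vectors, idf


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def most_similar_paragraphs(query, paragraphs, top_k=3):
    """Rank archive paragraphs by cosine similarity to the query text."""
    vectors, idf = tfidf_vectors(paragraphs)
    # Words that never occur in the archive carry no weight in the query vector.
    query_vec = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scored = [(cosine(query_vec, vec), p) for vec, p in zip(vectors, paragraphs)]
    return sorted(scored, key=lambda item: item[0], reverse=True)[:top_k]


archive = (
    "The supplier delivers goods within ten days.\n\n"
    "Employees are entitled to twenty vacation days.\n\n"
    "The supplier must deliver the goods on time."
)
paragraphs = split_paragraphs(archive)
results = most_similar_paragraphs("supplier goods delivery", paragraphs, top_k=2)
```

Because this baseline matches exact word forms, "delivery" does not match "delivers". A production system would typically add stemming or embeddings to capture such variants.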

Lasting and flexible

Ocelot offers a long-term solution for document management. New documents can be continuously added to the database, after which the domain and classification models adapt to the new information. This ensures that the insights and analyses remain up to date. Ocelot’s design also allows the front end to be decoupled from the back-end analysis modules. This enables open data exports and programmatic access from business intelligence tools (e.g. Power BI) and document management platforms such as SharePoint.

Textgain is trusted by a variety of organizations and industries.

Want to discuss one of our products in more detail?

A short get-together is always more enlightening.

Would our products be right for you? How could they help you? Just talk to one of our experts.

Get in touch with us!