Advanced and bespoke machine translation for low-resource languages.


Machine Translation Tools

Collecting quality data for low-resource languages is often difficult and costly. Natural Language Processing (NLP) technologies require good-quality training data to enhance the accuracy of their models. VoxCroft is amassing valuable data for use in various NLP initiatives through its population-centric intelligence work in austere media environments.

VoxCroft also works with organizations to manage, expand, and supplement their datasets for use in various intelligence and machine learning applications. Our proprietary Proteus micro-tasking and data collection platform, coupled with our network of global linguists and team of machine translation (MT) and AI experts, allows us to provide scalable solutions for NLP projects in the toughest operating environments.

Bilingual Corpora for Low-Resource Languages

High-quality bilingual sentence pairs, speech-to-text data, and datasets containing audiovisual and text content in low-resource languages, drawn from domain-specific corpora tailored to customer requirements.

Crowd-Sourced Language Experts

An expansive network of over 400 mother-tongue language and subject-area experts from austere language environments who undergo rigorous testing and training to work on a range of language tasks to our in-house quality assurance (QA) criteria.

Proteus Micro-Tasking Platform

A proprietary, data-agnostic online and mobile platform that supports micro-tasks including translation, transcription, image identification, audio recording, and quality assurance (QA), and that can be adapted to specific customer requirements.

Machine Translation Models

Machine translation (MT) models for languages where commercial MT tools either do not yet exist or perform poorly. Our models, trained on domain-specific corpora, often outperform commercial MT models trained on larger corpora.

Keyword Spotting for Broadcast Monitoring

VoxCroft provides real-time analysis of audio content in languages including Amharic, Tamashek, Wolof, and Zarma, and builds tailored keyword spotting models for specific customer use cases.

Technical and Operational Consultancy Services

We provide customized language services, particularly in austere language environments, and work with customers to develop tailored NLP solutions that integrate seamlessly with their existing systems and processes.

Machine Translation

“Important voices and entire communities remain unheard when decision-makers cannot access information in low-resource languages.”

- Barend Lutz



Sense & Sensemaking

By combining the best tradecraft from our analysts with our proprietary AI, we have evolved our sense and sensemaking methodology.

Infinity graph showing our machine learning tools' sense and sensemaking

Population-Centric Intelligence

Find out how we use tradecraft, expert human analysts, and cutting-edge AI to listen to the voice of the people.

Sense & Sensemaking Methodology

See how we use sense and sensemaking to better tell humanity’s stories.


Detect early warning indicators of instability, terrorist violence, inflammatory rhetoric, crime, or misinformation.