Features

File-based API

A web-based API for integration with workflows, tools and web clients. It allows the language functions (transcribe, translate, voicing) to be performed on demand, either synchronously or asynchronously.

Streaming API

Allows for real-time transcription, for example to provide live subtitling, simultaneous translation, live recommendations or monitoring.

This uses a WebSocket interface, for easy integration with workflows, tools and web clients. The API implements transcription capabilities and can be paired with the translation functions of the file-based API to translate real-time transcriptions of finished or partial phrases.
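
As a rough illustration of how a client might treat partial and finished phrases, the Python sketch below keeps the latest partial phrase as provisional text and commits a phrase once it is marked final. The message fields (`text`, `is_final`) and the `SubtitleBuffer` class are assumptions made for this example; they are not taken from the EuroVOX API.

```python
from dataclasses import dataclass, field


@dataclass
class SubtitleBuffer:
    """Collects streaming transcription messages into display text."""

    committed: list[str] = field(default_factory=list)  # finished phrases
    pending: str = ""  # latest partial phrase, may still change

    def handle(self, message: dict) -> None:
        # Hypothetical message shape: {"text": str, "is_final": bool}
        if message["is_final"]:
            self.committed.append(message["text"])
            self.pending = ""
        else:
            # A newer partial replaces the previous one
            self.pending = message["text"]

    def display(self) -> str:
        parts = self.committed + ([self.pending] if self.pending else [])
        return " ".join(parts)
```

In a setup like this, only committed phrases would normally be sent on for translation if stable output matters more than latency; translating the pending phrase as well gives faster but less stable results.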

Vendor Support

Adapters currently exist for Azure, AWS, Google, Speechmatics and DeepL, along with a development version of an interface for the open-source Marian engine (https://marian-nmt.github.io/).

In total, this means that EuroVOX can transcribe over 65 different languages and variants, translate content between around 20,000 different language pairs, and create synthetic speech in over 80 different languages using 450 different voices.

Vendor adapters abstract support for the direct language functions, as well as any specifics of the platform, including media handling (e.g. transcoding, cloud storage handling, limits and quotas).

Our flexible configuration allows the same vendor adapter to be used multiple times in one instance, e.g. for splitting different functions (transcribe, translate, voicing) across different accounts, or for using different storage systems.
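
To make the idea concrete, here is a minimal Python sketch of what such a configuration could look like: two instances of the same vendor adapter, each bound to its own account and storage backend. Every key, instance name and the `instances_for` helper are hypothetical and chosen only for illustration; the actual EuroVOX configuration format may differ.

```python
# Hypothetical configuration: two instances of the same vendor adapter,
# each with its own account and storage backend. All keys are illustrative.
ADAPTER_INSTANCES = {
    "azure-transcribe": {
        "adapter": "azure",
        "functions": ["transcribe"],
        "account": "account-a",
        "storage": "local",
    },
    "azure-translate": {
        "adapter": "azure",
        "functions": ["translate", "voicing"],
        "account": "account-b",
        "storage": "cloud",
    },
}


def instances_for(function: str) -> list[str]:
    """Return the names of configured instances that serve a function."""
    return [
        name
        for name, cfg in ADAPTER_INSTANCES.items()
        if function in cfg["functions"]
    ]
```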

Media storage is handled automatically, and by default any processed media can be stored either locally or with your chosen cloud vendor.

Additional vendors are currently in development. Depending on the complexity of the vendor, an adapter can be written in a matter of days, enabling rapid integration.

Common data formats

One goal is to develop openly standardised data formats for interoperability between language tools. Each vendor has its own native behaviours and formats, which are abstracted by the EuroVOX API. The common format supports functionality provided by each of the major cloud vendors:

Individual word timings
Punctuation highlighting
Speaker identification
Partial transcriptions (for streaming transcriptions)
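
The features listed above could be carried by a vendor-neutral structure along the lines of the Python sketch below. This is only an illustration under stated assumptions: the `Word` and `Segment` classes and all their field names are hypothetical, "punctuation highlighting" is interpreted here as flagging punctuation tokens, and none of this is the actual EuroVOX data format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Word:
    text: str
    start: float  # seconds from start of media
    end: float
    speaker: Optional[str] = None  # speaker label, if identified
    is_punctuation: bool = False  # marks punctuation tokens


@dataclass
class Segment:
    words: list[Word]
    is_partial: bool = False  # True for streaming results that may change

    def text(self) -> str:
        out = ""
        for w in self.words:
            # Attach punctuation to the previous word without a space
            if w.is_punctuation or not out:
                out += w.text
            else:
                out += " " + w.text
        return out
```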

Flexible deployment and configuration

The EuroVOX API is deployed as a Docker image, and each vendor adapter is deployed as a standalone package, so a deployment can include only the adapters it requires.

Designed to be deployable in a variety of different scenarios, it can be run locally on your laptop, entirely on-premise or in a number of different cloud infrastructures.

Options include using an external identity provider (Keycloak) to federate with your own Single Sign-On (SSO) system, LDAP, other external identity providers, entirely local credentials, or any combination of these. Media processing tasks can be run in-thread or using an external message broker (e.g. RabbitMQ) through pluggable interfaces.
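
The pluggable task-running idea can be sketched in Python as a small interface with two interchangeable implementations. The names `TaskRunner`, `InThreadRunner` and `QueuedRunner` are hypothetical, and the queued variant uses an in-memory queue purely as a stand-in for a real message broker; this is not the EuroVOX implementation.

```python
import queue
from typing import Callable, Protocol


class TaskRunner(Protocol):
    def submit(self, task: Callable[[], None]) -> None: ...


class InThreadRunner:
    """Runs each task immediately in the calling thread."""

    def submit(self, task: Callable[[], None]) -> None:
        task()


class QueuedRunner:
    """Stand-in for a broker-backed runner: tasks wait until drained."""

    def __init__(self) -> None:
        self._queue: queue.Queue = queue.Queue()

    def submit(self, task: Callable[[], None]) -> None:
        self._queue.put(task)

    def drain(self) -> None:
        # In a real deployment a separate worker process would consume tasks
        while not self._queue.empty():
            self._queue.get()()
```

Because both classes expose the same `submit` method, calling code does not need to know which one is configured.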

Vendor metrics

The EuroVOX API can automatically choose the "best vendor" for a given metric, given the right benchmark information. Implementors who already benchmark different vendors can provide these metrics to the API, enabling it to choose a vendor automatically for each function: transcribe, translate, voicing.

This allows a multi-vendor deployment to automatically track the changes in functionality without any human intervention or the need to "know" the best vendor ahead of time.

The supported metrics are currently:

Quality
Speed
Cost
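
A minimal Python sketch of metric-based selection follows. The benchmark table, vendor names and scores are placeholders rather than measurements, and the `best_vendor` helper is hypothetical. The sketch also assumes that quality is a score where higher is better, while speed (expressed as processing time) and cost are values where lower is better; the actual EuroVOX selection logic may differ.

```python
# Hypothetical benchmark scores; the numbers are placeholders, not measurements.
# Quality: higher is better. Speed (processing time) and cost: lower is better.
BENCHMARKS = {
    "transcribe": {
        "vendor-a": {"quality": 0.92, "speed": 1.8, "cost": 0.024},
        "vendor-b": {"quality": 0.89, "speed": 0.9, "cost": 0.016},
    },
    "translate": {
        "vendor-a": {"quality": 0.81, "speed": 0.4, "cost": 0.020},
        "vendor-c": {"quality": 0.88, "speed": 0.6, "cost": 0.025},
    },
}

HIGHER_IS_BETTER = {"quality"}


def best_vendor(function: str, metric: str) -> str:
    """Pick the vendor with the best score for one function and metric."""
    scores = BENCHMARKS[function]
    pick = max if metric in HIGHER_IS_BETTER else min
    return pick(scores, key=lambda vendor: scores[vendor][metric])
```

Updating the benchmark table is then enough to change which vendor is selected, with no change to calling code.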