AI Enhancing Entertainment Audio

Aioniq, in partnership with TV3 Group and OChK, successfully tested Google Cloud Platform's Generative AI, achieving high-quality transcription, translation, and speech generation. This reduced movie translation time from two weeks to five minutes, confirmed the feasibility and operational efficiency of AI solutions, and demonstrated technical reliability with minimal issues due to user-friendly APIs and comprehensive documentation.
Automatic Transcription: Efficiently converting content into text
Automatic Translation and Subtitles: Seamlessly translating and generating subtitles
Speech Generation: Accurately converting translated text into spoken words
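To make the subtitle step concrete, the following is a minimal sketch of how timed, already translated dialogue could be rendered as a SubRip (.srt) file. The source does not say which subtitle format the project produced, so the format choice, the `Segment` type, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timed piece of (already translated) dialogue; times in seconds.

    Hypothetical type for illustration, not taken from the project.
    """
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SubRip files."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SubRip cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text}\n"
        )
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [Segment(0.0, 2.5, "Labdien!"), Segment(2.5, 4.0, "Kā klājas?")]
    print(to_srt(demo))
```

The segment timings would come from the transcription stage, which is why timing accuracy there matters for everything downstream.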
The outcomes look promising! We're encouraged by the results! Currently, we're closely examining the experiment findings and considering a more in-depth analysis of whether AI could be introduced into translation processes at the production stage.

Ivars Lubāns
Business insight
TV3 Group, the premier media company in the Baltics, achieved remarkable success in 2023 with its OTT service, Go3, reaching 500,000 subscribers and covering over 20% of Baltic households. This growth, which put Go3 ahead of international giants in subscriber count as noted by Dataxis, brought the challenge of keeping content accessible through multilingual subtitles and narration. To address this challenge, TV3 Group decided to explore advanced AI solutions.
We aimed to assess the feasibility of addressing operational challenges using existing Generative AI solutions. Video-on-demand providers must ensure content accessibility for subscribers, making multilingual subtitles and narration crucial. However, this process is both time-consuming and costly.

Ivars Lubāns
Partnership and implementation
Partnering with Aioniq, TV3 Group aimed to develop an efficient system for automatic transcription, translation, and speech generation, ensuring all their content remains accessible to a diverse audience. This involved:
Crafting the solution and architectural framework
Designing and developing the necessary infrastructure
Proposing a methodology to generate robust data for informed decision-making
The primary goal was to evaluate the quality of Generative AI services in transcription, translation, and speech generation. This assessment would determine whether to progress the project to a production environment.
TV3 Group, the leading media group in the Baltics, owns top commercial TV channels, DTH and OTT platforms, premium film and sports channels, and radio stations.

Business outcomes
By analyzing the outcomes, we gauge the output quality at each PoC stage and ensure that the infrastructure and architecture can effectively handle the task. This makes estimating production operational costs much more straightforward.
Wojciech Doganowski
The Proof-of-Concept encompassed a series of experiments on carefully selected control content across three stages:
Transcription: Evaluating the system's proficiency in recognizing spoken words, including speaker identification and timing accuracy
Translation: Assessing the quality of translating English-based transcriptions into various languages
Speech Generation: Testing the capability to produce near-natural audio narration
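The source does not state how output quality was scored at each stage. As one illustration for the transcription stage only, a common metric is word error rate (WER): the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The sketch below is an assumption about how such a check could look, not the project's actual methodology, and it does not cover speaker identification or timing accuracy.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative metric only; the source does not say WER was used.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

A lower value means the transcript is closer to the human reference. Translation and speech quality would need different measures, such as human review.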
Key benefits
Quality and Efficiency: High-quality transcription, translation, and speech generation drastically reduced time and cost compared to traditional methods. For instance, translating a single movie, usually a two-week task, was completed in approximately five minutes
Feasibility: The PoC confirmed the feasibility of implementing AI solutions in production, showcasing substantial improvements in operational efficiency and cost reduction
Technical Reliability: Utilizing Google's Generative AI components, the project encountered minimal technical challenges. This reliability, combined with comprehensive documentation and user-friendly APIs, ensured smooth integration and effective control over experimental parameters
These results underscore the immense potential of Generative AI to revolutionize content production, offering substantial time and cost savings while maintaining high standards of quality.
The findings are truly fascinating. Although not perfect, the quality is generally acceptable, especially considering that Generative AI is still in its infancy. The most significant advantage is the dramatic reduction in time and cost for transcription, translation, and speech generation compared to our current processes.
Arturs Zingis
Technological insight
From the project's inception, the Google Cloud Platform was our primary focus due to Google's extensive experience in translation services.
Key Components of Google's Generative AI
Speech-To-Text is the critical foundation for high-quality transcription, essential for effective translation and speech generation.
We prioritized Speech-To-Text for its pivotal role in the Proof-of-Concept (PoC). Successful transcription is crucial for ensuring high-quality outcomes in subsequent stages.
To maximize control over the Speech-To-Text API's numerous parameters, we developed:
Backend System: For triggering experiments using the Google API
Frontend Application: For executing, storing, and analyzing experiments and their parameters
This comprehensive setup ensured precise control, enabling us to achieve the best possible outcomes in our experiments.
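As a rough sketch of what controlling experiment parameters can look like, the snippet below expands a small parameter grid into one experiment record per combination, ready to be stored and compared. The source does not list which Speech-To-Text parameters were varied. The keys here are modeled on fields of the Speech-to-Text `RecognitionConfig`, while the values, the `Experiment` type, the `build_experiments` helper, and the bucket path are hypothetical.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    """One transcription run and the parameters it was triggered with.

    Hypothetical record type for illustration.
    """
    experiment_id: str
    content_uri: str
    params: dict


def build_experiments(content_uri: str, grid: dict[str, list]) -> list[Experiment]:
    """Create one Experiment per combination of the values in the grid."""
    names = sorted(grid)  # fixed order keeps experiment ids reproducible
    combinations = itertools.product(*(grid[name] for name in names))
    return [
        Experiment(
            experiment_id=f"exp-{number:03d}",
            content_uri=content_uri,
            params=dict(zip(names, values)),
        )
        for number, values in enumerate(combinations, start=1)
    ]


# Assumed grid: which parameters the project actually varied is not stated.
GRID = {
    "language_code": ["en-US"],
    "model": ["default", "video"],
    "enable_automatic_punctuation": [True, False],
}

# Placeholder URI, not a real bucket.
experiments = build_experiments("gs://example-bucket/control/clip-01.wav", GRID)

if __name__ == "__main__":
    for experiment in experiments:
        print(experiment.experiment_id, experiment.params)
```

Keeping each parameter set alongside its result is what lets a frontend like the one described above compare runs side by side.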
For our infrastructure needs, we partnered with OChK, a Google Cloud Strategic Partner renowned for delivering Google Cloud services in the Warsaw region. Choosing a local Data Center was essential for ensuring optimal performance and swift access to technical support.
This collaboration is an excellent example of OChK's technical expertise and cloud service delivery in the CEE region. We are thrilled to participate in this proof of concept, showcasing our competence and dedication to innovative solutions.
Paweł Ławecki
Technical outcomes
Leveraging Google's Generative AI posed minimal technical challenges, thanks to comprehensive documentation and user-friendly APIs. The system developed by Aioniq includes several key modules:
Cloud Speech API: For transcription and speech generation
Translate API: For translation
Cloud Functions: For communication between the Orchestration App and Google API
Cloud Storage: For storing raw content, pre-processed content, experiments, experiment parameters, and results
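The way these modules fit together can be sketched as a three-stage pipeline that stores every intermediate result. This is an assumed structure for illustration, not Aioniq's actual code: the `Pipeline` class, the storage key layout, and the stand-in stage functions are hypothetical, and an in-memory dictionary takes the place of Cloud Storage. In a real deployment each callable would wrap a call to the corresponding Google API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PipelineResult:
    transcript: str
    translation: str
    audio: bytes


@dataclass
class Pipeline:
    """Chains transcription, translation and speech generation.

    Each stage is injected as a callable so real API clients can be
    swapped in for the stand-ins used below.
    """
    transcribe: Callable[[bytes], str]
    translate: Callable[[str, str], str]
    synthesize: Callable[[str, str], bytes]
    storage: dict = field(default_factory=dict)  # stand-in for a storage bucket

    def run(self, content_id: str, raw_audio: bytes, target_language: str) -> PipelineResult:
        # Persist the input and every intermediate artifact for later analysis.
        self.storage[f"{content_id}/raw"] = raw_audio
        transcript = self.transcribe(raw_audio)
        self.storage[f"{content_id}/transcript"] = transcript
        translation = self.translate(transcript, target_language)
        self.storage[f"{content_id}/{target_language}/translation"] = translation
        audio = self.synthesize(translation, target_language)
        self.storage[f"{content_id}/{target_language}/narration"] = audio
        return PipelineResult(transcript, translation, audio)


# Stand-in stages so the sketch runs without credentials or network access.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str, language: str) -> str:
    return f"[{language}] {text}"


def fake_synthesize(text: str, language: str) -> bytes:
    return text.encode("utf-8")


pipeline = Pipeline(fake_transcribe, fake_translate, fake_synthesize)
result = pipeline.run("clip-01", b"hello world", "lv")

if __name__ == "__main__":
    print(result.translation)
    print(sorted(pipeline.storage))
```

Injecting the stages as callables keeps the orchestration logic testable on its own, separate from any cloud service.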