Exploring Innovative Use Cases and Future Possibilities with ReadSpeaker

Modev News, voiceandai. Apr 16, 2024 2:57:39 PM. By Modev Staff Writers.


Unleashing the Power of Synthetic Voices

Tom DiPietropolo, Business Development Director at ReadSpeaker, recently gave a presentation at the VOICE & AI 2023 conference on the evolution of synthetic voices and their growing importance across industries. As a company specializing in text-to-speech (TTS) technology, ReadSpeaker has been at the forefront of developing high-quality synthetic voices for multiple applications, from accessibility and learning to voice interactions in devices and video games.

The talk opened by highlighting how far TTS technology has advanced over the past few years. With the help of AI and deep neural networks, it is now possible to build realistic synthetic voices from just a few minutes of audio. That is a significant leap from the traditional unit selection synthesis method, which required voice actors to record extensive libraries of speech fragments.

From Entertainment to the Classroom

One particularly interesting use case discussed was implementing ReadSpeaker's TTS technology in the popular video game "The Last of Us Part 2." The game developers used ReadSpeaker to create static audio files in 18 languages for UI narration prompts, specifically designed to assist blind or visually impaired players.

This allowed these players to understand their surroundings better and enjoy the game more fully. The application was limited to predetermined gameplay scenarios, where the developers could anticipate what needed to be said and when to say it, but the technology's potential was obvious.

In education, ReadSpeaker's voices have been integrated into various learning platforms, enabling students to listen to content while following along with highlighted text. This feature particularly benefits students with learning disabilities or those learning English as a second language.

Case in point: a recent advancement in TTS allows students to select a sentence or paragraph, translate it using Google Translate or another API, and have it read aloud in their native language in real time. This approach gives students more control over their learning experience and helps them better understand the content presented to them.
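The select, translate, and read-aloud flow described above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function name `translate_and_speak` and the two stand-in callables are hypothetical, and they do not represent ReadSpeaker's or any translation provider's actual API. A real integration would pass in functions that call the chosen translation service and TTS engine.

```python
from typing import Callable

# Hypothetical type aliases: (text, language code) -> result.
Translator = Callable[[str, str], str]
Synthesizer = Callable[[str, str], bytes]


def translate_and_speak(
    selection: str,
    target_lang: str,
    translate: Translator,
    synthesize: Synthesizer,
) -> bytes:
    """Translate the selected text, then synthesize it in the target language."""
    text = selection.strip()
    if not text:
        raise ValueError("nothing selected")
    translated = translate(text, target_lang)
    # The same language code is passed on so the engine can pick a matching voice.
    return synthesize(translated, target_lang)


# Stand-ins so the sketch runs without any network access (assumptions only).
def fake_translate(text: str, lang: str) -> str:
    return f"[{lang}] {text}"


def fake_synthesize(text: str, lang: str) -> bytes:
    return text.encode("utf-8")  # a real engine would return audio data


audio = translate_and_speak(
    "  The cell is the basic unit of life. ", "es", fake_translate, fake_synthesize
)
```

Passing the translator and synthesizer in as arguments keeps the flow independent of any one vendor, which mirrors the article's "Google Translate or another API" wording.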

Quality Control

With TTS, it's critical to be mindful of voice quality and how it can impact user experiences, both positively and negatively. As voice technology becomes more prevalent in our daily lives, companies must focus on improving voice quality while offering multiple deployment options, such as cloud-based, on-premise, and embedded on-device solutions. ReadSpeaker's unique approach supports all three delivery methods, ensuring that customers can interact with the technology in the way that best suits their needs.

One of the most exciting implementations of synthetic voices is their application in video game NPCs (non-player characters). DiPietropolo described a project where game developers use generative AI programs like ChatGPT to create dynamic, real-time responses for NPCs based on player interactions. Since the content is constantly changing, traditional voice acting is not feasible. Instead, ReadSpeaker's voice engine can be plugged in to read the dynamic content during gameplay, creating a more immersive and engaging experience for players. 

None of this is possible without high voice quality and robust deployment options.

More Use Cases Than You Might Think

Use cases for synthetic voices are plentiful, and many are inventive. Think translation devices for travelers, virtual assistants in cruise ship cabins, or interactive bartenders. ReadSpeaker and Sonos recently co-developed a custom voice for Sonos's voice control feature, based on the voice of actor Giancarlo Esposito. This partnership demonstrates the growing trend of brands leveraging celebrity voices to create unique and recognizable user experiences.

Another fascinating example is creating a synthetic voice for the iconic character Hello Kitty. Due to the voice actor's limited availability, ReadSpeaker had to build the voice model using existing studio recordings rather than having the actor record a specific script. Despite this challenge, the resulting synthetic voice was successfully integrated into the "Hello Kitty's Room" game for Android devices.

ReadSpeaker's commitment to ethical voice creation is unwavering. They only build voices when the voice actor is involved and a contractual agreement is in place. This approach helps to address concerns around deep fakes and content ownership while ensuring a higher level of quality control.

Forward Vision

Looking to the future, DiPietropolo outlined some of the exciting developments on the horizon for synthetic voices. These include voice models that can handle emotions, emphasize certain words, and speak with more realistic intonation and proper pausing. And it's not just words either. For example, the use of paralinguistics in video games, where voice actors record various non-speech sounds like breathing or coughing, is increasing. These can then be seamlessly integrated into the generated content using AI.
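To make the ideas of word emphasis and deliberate pausing concrete, here is a minimal sketch that builds a fragment of SSML (Speech Synthesis Markup Language, a W3C standard that many TTS engines accept). The talk summary does not say how ReadSpeaker exposes these controls, so treat the use of SSML here, the helper name `to_ssml`, and the sample line of dialogue as assumptions for illustration only.

```python
from xml.sax.saxutils import escape


def to_ssml(parts):
    """Build an SSML fragment from (kind, value) pairs.

    Supported kinds (hypothetical convention for this sketch):
      ("text", str)      plain speech
      ("emphasis", str)  strongly emphasized speech
      ("pause_ms", int)  a silent pause in milliseconds
    """
    out = ["<speak>"]
    for kind, value in parts:
        if kind == "text":
            out.append(escape(value))  # escape &, <, > so the markup stays valid
        elif kind == "emphasis":
            out.append(f'<emphasis level="strong">{escape(value)}</emphasis>')
        elif kind == "pause_ms":
            out.append(f'<break time="{int(value)}ms"/>')
        else:
            raise ValueError(f"unknown part kind: {kind}")
    out.append("</speak>")
    return "".join(out)


ssml = to_ssml([
    ("text", "Stay close. "),
    ("pause_ms", 400),
    ("emphasis", "Now"),
    ("text", " run & hide."),
])
print(ssml)
```

How closely an engine follows such markup varies by product and voice, so the rendered audio should always be checked by ear.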

Another promising advancement is the development of multilingual voices, allowing a single voice profile to speak multiple languages while maintaining the same tone and delivery. This innovation will give brands more flexibility in how they want to sound and where they want to reach their audiences.

Wrap Up

DiPietropolo's presentation at VOICE & AI 2023 provided a comprehensive overview of the current state and future potential of synthetic voices. As ReadSpeaker continues to push the boundaries of TTS technology, we anticipate synthetic voices will become an integral part of our daily lives, from accessibility and learning to entertainment and beyond. The key to success, as DiPietropolo emphasized, lies in voice quality, brand awareness, and the ability to deliver custom solutions that meet each customer's unique needs.

We look forward to seeing what ReadSpeaker develops next in synthetic voices.

Modev Staff Writers

Modev staff includes a talented group of developers and writers focused on the industry and trends. We include Staff when several contributors join forces to produce an article.
