Podcasting Platform Podcastle Unveils Text-to-Speech Model Featuring Over 450 AI Voices

by admin

Podcastle, the podcast recording and editing platform, is entering the competitive AI text-to-speech market with the launch of its proprietary model, Asyncflow v1.0. The company will also make an API available so developers can integrate the text-to-speech technology into their own applications.

The new model lets the company offer more than 450 distinct AI voices that can narrate written content. The startup says it designed the technology and model to keep both training and inference costs low, which it argues gives it a competitive edge.

By taking this step, Podcastle joins a range of startups, such as ElevenLabs, Speechify, and WellSaid, all of which have crafted AI models and technologies to transform any text into AI-narrated voice clips. The applications of this technology include marketing, advertising, content creation, education, and corporate training.

According to Podcastle’s founder, Arto Yeritsyan, the company had ambitions to create a text-to-speech model from its early days, but the associated costs of training and data were prohibitively high.

“Since our inception, we aimed to develop a robust text-to-speech model. However, the development costs were substantial. Fortunately, advancements in large language models enabled us to achieve a breakthrough last year, allowing us to create a high-quality voice model with significantly less data,” Yeritsyan explained.

The company received further support from a $13.5 million Series A funding round secured last year.

Yeritsyan said Podcastle offers text-to-speech conversion at approximately $40 for every 500 minutes, while ElevenLabs charges $99 for the same number of minutes.
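To put the quoted prices on a per-minute basis, here is a minimal sketch of the arithmetic. It uses only the two figures cited above ($40 and $99 for 500 minutes); the function and variable names are illustrative, and the calculation assumes flat pricing across the full 500 minutes, which may not match how either company actually structures its plans.

```python
# Per-minute cost comparison using the prices quoted in the article.
# Assumption: flat pricing over 500 minutes; real plans may differ.
MINUTES = 500


def cost_per_minute(price_usd: float, minutes: int = MINUTES) -> float:
    """Return the cost in USD for one minute of generated speech."""
    return price_usd / minutes


podcastle = cost_per_minute(40.0)   # approximate figure quoted by Yeritsyan
elevenlabs = cost_per_minute(99.0)  # figure quoted for ElevenLabs

# Relative saving of the cheaper price against the more expensive one.
savings_pct = (1 - podcastle / elevenlabs) * 100

print(f"Podcastle:  ${podcastle:.3f} per minute")
print(f"ElevenLabs: ${elevenlabs:.3f} per minute")
print(f"Difference: {savings_pct:.1f}% lower")
```

On these figures, Podcastle works out to about 8 cents per minute against roughly 20 cents for ElevenLabs, a difference of about 60 percent.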

The platform’s voice cloning feature is also set for an upgrade to streamline the training process.

Previously, training required reading around 70 different sentences, but now it only demands a few seconds of recording to produce a voice clone. This updated process utilizes Podcastle’s Magic Dust AI, introduced last year, to enhance the quality of audio recordings.

In our tests, voices generated with the new method sounded somewhat mechanical, though they retained the intended tone. The company says it plans to improve the feature over time. Users can also train clones on different voice samples to get different results.

Podcastle asserts that beyond cost advantages, offering a comprehensive suite of tools for audio, video, podcasts, and AI-based narration on a unified and revamped platform will provide it with a competitive advantage. Yeritsyan noted that while most users utilize Podcastle primarily for audio projects, video usage is rapidly gaining traction as well.

Compiled by Techarena.au.