The two talk about the underlying components of the Papercup workflow and outline the role that humans play in the creation of multilingual videos alongside three technologies: speech recognition (ASR), machine translation (MT), and text-to-speech (TTS).
Simon, a Professor of Speech Processing at the University of Edinburgh, discusses the evolution of text-to-speech technology, the main technical hurdles in producing highly natural, emotional voices, and the adoption and acceptance curve for synthetic voices.
Jesse shares some of Papercup’s company milestones, which include raising a total of ca. USD 14m in seed and series A rounds. He also explains why there is room for many different startups in the multilingual speech and video translation space.
While Papercup has an ambitious goal of making videos accessible in any language, Jesse says startups will likely expand the market rather than replace traditional dubbing, particularly for high-end production environments.
First up, Florian and Esther discuss the latest language industry news, with a multilingual speech technology slant this week. The duo touch on NVIDIA’s real-time MT offering, a mouse that transcribes and translates your voice at the press of a button, and Microsoft’s USD 19.7bn acquisition of AI speech technology firm Nuance.
In language industry-adjacent funding, the two discuss data-for-AI leader and Appen rival Scale, which doubled its valuation (to a whopping USD 7bn) after announcing it had raised a further USD 325m in funding.
Returning to the core of translation and localization, they talk about signs of a boom in the language industry, pointing to Super Agencies reporting strong results, anecdotal evidence from busier-than-ever LSP staff, a soaring Language Industry Job Index, and RWS shares (SlatorPro) that serve as a bellwether given the company’s broad sector exposure.