User Profile
HeikoRa
Joined 7 years ago
Recent Discussions
We have updated Speech Studio
Speech Studio (https://speech.microsoft.com) has been updated to a new version. Besides enabling you to create custom models, we have started adding ways to try out our out-of-the-box capabilities. Check out the "Real-time Speech-to-text" area, which lets you quickly test your audio without writing any code, as well as our new Pronunciation Assessment feature, which lets you evaluate a speaker's fluency, accuracy, and pronunciation, which is great for language-learning scenarios.
Recent Blog posts for Speech Services

Here are some recent blog posts related to Speech Services:

- Enable read-aloud for your application with Azure neural TTS
- Add voice activation to your product with Custom Keyword
- Build an application that transcribes speech
- Azure Neural Text-to-Speech extended to support lip sync with viseme
Re: Speech Model Limitations

Nathan Hess, see the documentation here for recommended amounts of training data: Prepare data for Custom Speech - Speech service - Azure Cognitive Services | Microsoft Docs. For audio training data the limit is 20 hours. For text data you could technically use more than the recommended amounts, but the gains would typically be minimal, if any.
Re: Can you clarify the flow of messages from a Direct Line Speech client to the bot service?

nsouth1625
1. Yes, the Direct Line Speech channel facilitates communication between your bot and the app that uses the Speech SDK to send/receive audio and bot messages. It handles any conversion of audio to text (and the other way around) where needed, and otherwise just routes the data to the appropriate place (the speech service or your bot).
2. If there is no text to render as audio, we won't call the text-to-speech service.
Re: Learning/Training Different Accents

AlimaSam, you can automate the training using our REST API; see here: https://westus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-0. You would need to create an endpoint for each user, though, which comes with some hosting cost.
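To make the automation idea concrete, here is a minimal sketch of how you might script model creation against the Speech-to-text REST API v3.0 linked above. The region, key, and dataset URLs are hypothetical placeholders, and the exact request-body fields are an assumption based on the v3.0 docs; treat this as a starting point, not a definitive implementation.

```python
import json
import urllib.request

# Hypothetical placeholders -- substitute your own region, key, and dataset URLs.
REGION = "westus"
SUBSCRIPTION_KEY = "<your-speech-resource-key>"
BASE_URL = f"https://{REGION}.api.cognitive.microsoft.com/speechtotext/v3.0"

def build_create_model_request(display_name, locale, dataset_urls):
    """Build (but do not send) the POST request that starts a custom model
    training run. One such model per user would be created for per-user
    accent adaptation."""
    body = {
        "displayName": display_name,
        "locale": locale,
        # Each training dataset is referenced by its own API URL.
        "datasets": [{"self": url} for url in dataset_urls],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/models",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To actually submit: urllib.request.urlopen(build_create_model_request(...))
```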
Re: Cost of Custom Neural Voice

AlimaSam, we recently GA'ed the Custom Neural Voice creation capability. See docs here: Custom neural voice overview - Speech service - Azure Cognitive Services | Microsoft Docs. Pricing information can be found here: Cognitive Speech Services Pricing | Microsoft Azure

- Model training: $52 per compute hour
- Real-time synthesis: $24 per 1M characters
- Endpoint hosting: $4.04 per model per hour
- Long audio creation: $100 per 1M characters
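Putting those rates together, a small helper makes it easy to estimate a bill. This is only a sketch with the figures above hard-coded; actual billing can differ, so check the pricing page.

```python
def custom_neural_voice_cost_usd(training_compute_hours=0,
                                 realtime_chars=0,
                                 hosting_model_hours=0,
                                 long_audio_chars=0):
    """Estimate Custom Neural Voice cost from the rates quoted above:
    $52/compute hour training, $24 per 1M characters real-time synthesis,
    $4.04 per model-hour hosting, $100 per 1M characters long audio."""
    return (training_compute_hours * 52.0
            + realtime_chars / 1_000_000 * 24.0
            + hosting_model_hours * 4.04
            + long_audio_chars / 1_000_000 * 100.0)

# One endpoint hosted for a 30-day month (720 hours) plus 2M characters
# of real-time synthesis:
# 720 * 4.04 + 2 * 24 = 2908.80 + 48.00 = 2956.80
```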
Re: Short phrase language detection

pauliom, I'm not sure exactly which text language detection capability you are using, but Text Analytics has a default language hint capability that you could populate with the previously detected language. See here: Detect language with the Text Analytics REST API - Azure Cognitive Services | Microsoft Docs
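As a sketch of that hint-carrying approach, here is how the request body for the Text Analytics v3.0 /languages endpoint could be built. Note that countryHint takes an ISO 3166-1 country code (e.g. "FR"), and how you map a previously detected language to a country hint is up to your application; the function name is my own.

```python
import json

def build_language_detection_body(texts, country_hint=None):
    """Build the JSON body for POST {endpoint}/text/analytics/v3.0/languages.
    A countryHint biases detection, which helps with short, ambiguous
    phrases where the language is otherwise hard to pin down."""
    documents = []
    for i, text in enumerate(texts, start=1):
        doc = {"id": str(i), "text": text}
        if country_hint:
            # e.g. carried over from the previous detection for this speaker
            doc["countryHint"] = country_hint
        documents.append(doc)
    return json.dumps({"documents": documents})
```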
Re: Can you clarify the flow of messages from a Direct Line Speech client to the bot service?

Hi nsouth1625, here is a link to the Direct Line Speech docs with a diagram that shows the flow: Direct Line Speech - Speech service - Azure Cognitive Services | Microsoft Docs. You are correct in that audio is sent to our Azure Speech Service and then from there, via the Direct Line Speech channel, to your bot. The bot is hosted in your app service. On the way back, the response is sent via the channel, and any text that should be rendered as audio is sent to our Text to Speech service. Our services comply with the various security and privacy certifications. Have a look here: Cognitive Services Compliance and Privacy | Microsoft Azure, as well as: Speech service encryption of data at rest - Azure Cognitive Services | Microsoft Docs
Re: Special Symbols

Or, if you are talking about Text to Speech, you can use SSML to provide the pronunciation: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup?tabs=csharp#use-phonemes-to-improve-pronunciation
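To illustrate the phoneme approach from the linked docs, here is a small helper that wraps one word in an SSML phoneme element with an IPA pronunciation. The helper name and the specific voice are just examples I chose; any neural voice works, and the IPA string is something you would supply.

```python
def ssml_with_phoneme(prefix, word, ipa, voice="en-US-JennyNeural"):
    """Return an SSML document in which `word` is spoken with the IPA
    pronunciation `ipa` instead of the voice's default pronunciation."""
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f'<voice name="{voice}">'
        f'{prefix} <phoneme alphabet="ipa" ph="{ipa}">{word}</phoneme>'
        '</voice></speak>'
    )
```

The resulting string can then be passed to the Speech SDK's SSML synthesis call (e.g. speak_ssml_async in Python) instead of plain text.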
Re: Special Symbols

Hi LeonhardBleicher, Custom Speech may be able to let you do that. You would need training text that contains terms like "nm", and you would also need to provide a pronunciation file that looks something like:

nm<tab>nanometer
...

I'm not 100% sure whether that also works for "µ", as it is a special character. See docs here: Custom Speech overview - Speech service - Azure Cognitive Services | Microsoft Docs
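To make the tab-delimited format concrete, here is a tiny helper that builds the contents of such a pronunciation file. This is only a sketch; the exact rules for accepted display and spoken forms are in the Custom Speech docs, and the example entries are my own.

```python
def pronunciation_file_text(entries):
    """Build the contents of a Custom Speech pronunciation file:
    one '<display form><TAB><spoken form>' pair per line."""
    return "".join(f"{display}\t{spoken}\n" for display, spoken in entries)

# Example: map written abbreviations to how they are actually spoken.
# pronunciation_file_text([("nm", "nanometer"), ("CNTK", "c n t k")])
```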