
Real-Time Audio Transcription Solution: Tips for Using the Google Cloud Speech-to-Text API

Size: 25 KB | Updated: 2025-09-01
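### Key point 1: Google Cloud Speech-to-Text API

Google Cloud Speech-to-Text is a speech recognition service that converts spoken audio into text. Developers can use the API to turn the speech in an audio stream into text for search, content excerpting, voice commands, and similar applications. The API supports many languages and can transcribe in real time, which makes it well suited to live transcription.

### Key point 2: Real-time audio transcription

Real-time audio transcription means converting a speaker's words into text as they are spoken. This requires handling continuous speech input with very low latency so that the user gets near-real-time feedback. The Google Cloud Speech-to-Text API supports this mode and is suitable for phone conversations, live meeting notes, and other live speech workloads.

### Key point 3: The API's 60-second limit

According to the project description, the API imposes a limit on real-time transcription: a single request can process at most 60 seconds of audio. Audio beyond the 60-second mark cannot be transcribed within the same request, so developers need a mechanism for handling longer input.

### Key point 4: Design and implementation of the transcription buffer

To work around the 60-second limit, the script uses a buffer. Audio input is first written to the buffer rather than sent directly to the API, which allows it to be split into blocks of no more than 60 seconds each. When the API returns a timeout error during transcription, the existing API client is reinitialized, and the new client continues pulling audio blocks from the buffer. In this way, input longer than 60 seconds is still transcribed continuously.

### Key point 5: Environment variable configuration

Development against the Google Cloud Speech-to-Text API requires project authentication through an environment variable. Authentication normally uses a GCP (Google Cloud Platform) credentials JSON file, which contains the information needed to authorize API calls. Before starting, set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of that JSON file so the application can authenticate and call the API.

### Key point 6: Tag breakdown

- **machine-learning**: the project involves machine learning, since speech recognition is one of its applications.
- **google cloud**: the service belongs to Google Cloud Platform.
- **gcp**: the abbreviation for Google Cloud Platform, Google's cloud offering.
- **speech transcription**: the process of converting speech into text.
- **google-cloud-speech**: refers directly to Google Cloud's speech recognition service.
- **live-audio**: the project deals with real-time audio processing.
- **indefinite-duration**: the project can handle audio input of unbounded length.
- **GoogleJavaScript**: the project uses JavaScript to work with Google's services or APIs.

### Key point 7: Project file structure

- **transcribe-live-audio-master**: the project name. Judging by the archive name, it contains the code and related resources that implement the live transcription feature described above. "master" usually denotes the main branch in Git, so the archive probably holds the main version of the project's code.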
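The buffer-and-restart mechanism described in key point 4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code (the project itself appears to be JavaScript): `TranscriptionBuffer`, `StreamLimitReached`, `transcribe_all`, and the `make_client` factory are all hypothetical names, and the "client" is any callable that takes a block of audio bytes and returns text. The sketch also retries the block that hit the timeout, which is a design choice made here so that no audio is lost.

```python
from collections import deque
from typing import Callable, Deque, List

MAX_SECONDS = 60  # per-request limit described in key point 3


class StreamLimitReached(Exception):
    """Hypothetical stand-in for the API's timeout error."""


class TranscriptionBuffer:
    """Queues incoming audio and hands it out in blocks of bounded length."""

    def __init__(self, bytes_per_second: int, max_seconds: int = MAX_SECONDS):
        self.max_bytes = bytes_per_second * max_seconds
        self._chunks: Deque[bytes] = deque()

    def write(self, data: bytes) -> None:
        """Append raw audio to the buffer instead of sending it directly."""
        self._chunks.append(data)

    def __bool__(self) -> bool:
        return bool(self._chunks)

    def next_block(self) -> bytes:
        """Pop up to max_bytes of audio; any remainder stays queued."""
        out = bytearray()
        while self._chunks and len(out) < self.max_bytes:
            room = self.max_bytes - len(out)
            chunk = self._chunks.popleft()
            out += chunk[:room]
            if len(chunk) > room:
                self._chunks.appendleft(chunk[room:])
        return bytes(out)


def transcribe_all(
    buffer: TranscriptionBuffer,
    make_client: Callable[[], Callable[[bytes], str]],
) -> List[str]:
    """Drain the buffer, recreating the client whenever it times out."""
    client = make_client()
    results: List[str] = []
    while buffer:
        block = buffer.next_block()
        try:
            results.append(client(block))
        except StreamLimitReached:
            client = make_client()          # reinitialize the client
            results.append(client(block))   # retry the same block
    return results
```

In a real integration, `make_client` would wrap whatever call creates a new recognition stream, and `bytes_per_second` would follow from the audio format (for example, 16-bit mono at 16 kHz is 32000 bytes per second).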

Related recommendations


2025-07-30 10:09:31.617300 98.33% [DEBUG] apt_task.c:335 () Process Message [MRCPv2ConnectionAgent] [0x7fa6f0001860;1;0] 2025-07-30 10:09:31.617300 98.33% [INFO] mrcp_client_connection.c:551 (ASR-41) Send MRCPv2 Data 172.29.121.237:58334 <-> 172.29.121.237:1544 [291 bytes] MRCP/2.0 291 RECOGNIZE 1 Channel-Identifier: 1e7deafcb1bb43e9@speechrecog Content-Type: text/uri-list Cancel-If-Queue: false Start-Input-Timers: true No-Input-Timeout: 5000 Vendor-Specific-Parameters: barge_in=true;break_on_speech=false Content-Length: 25 builtin:speech/transcribe 2025-07-30 10:09:31.617300 98.33% [DEBUG] apt_poller_task.c:244 () Wait for Messages [MRCPv2ConnectionAgent] timeout [5000] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_poller_task.c:267 () Process Signalled Descriptor [MRCPv2ConnectionAgent] 2025-07-30 10:09:32.157297 98.33% [INFO] mrcp_client_connection.c:656 () Receive MRCPv2 Data 172.29.121.237:58334 <-> 172.29.121.237:1544 [83 bytes] MRCP/2.0 83 1 200 IN-PROGRESS Channel-Identifier: 1e7deafcb1bb43e9@speechrecog 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6ec001630;2;3] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_poller_task.c:249 () Wait for Messages [MRCPv2ConnectionAgent] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6ec001630;2;3] 2025-07-30 10:09:32.157297 98.33% [INFO] mrcp_client_session.c:498 (ASR-41) Raise App MRCP Response ASR-41 <1e7deafcb1bb43e9> 2025-07-30 10:09:32.157297 98.33% [DEBUG] mod_unimrcp.c:3642 (ASR-41) RECOGNIZE IN PROGRESS 2025-07-30 10:09:32.157297 98.33% [DEBUG] mod_unimrcp.c:1589 (ASR-41) READY ==> PROCESSING 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:32.157297 98.33% [INFO] mod_unimrcp.c:1636 speech_handle: name = unimrcp, rate = 8000, speed = 0, samples = 160, voice = , engine = unimrcp, param = (null) 2025-07-30 10:09:32.157297 98.33% [INFO] 
mod_unimrcp.c:1639 voice = (null), rate = 8000 2025-07-30 10:09:32.157297 98.33% [DEBUG] mod_unimrcp.c:688 (TTS-42) audio queue created 2025-07-30 10:09:32.157297 98.33% [NOTICE] mrcp_application.c:117 (TTS-42) Create MRCP Handle 0x7fa6e8070348 [uni2] 2025-07-30 10:09:32.157297 98.33% [INFO] mrcp_client_session.c:131 (TTS-42) Create Channel TTS-42 <new> 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6e8011190;4;0] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6e8011190;4;0] 2025-07-30 10:09:32.157297 98.33% [INFO] mrcp_client_session.c:385 (TTS-42) Receive App Request TTS-42 <new> [2] 2025-07-30 10:09:32.157297 98.33% [INFO] mrcp_client.c:697 (TTS-42) Add MRCP Handle TTS-42 <new> 2025-07-30 10:09:32.157297 98.33% [DEBUG] mrcp_client_session.c:1277 (TTS-42) Dispatch App Request TTS-42 <new> [2] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MRCPv2ConnectionAgent] [0x7fa6f0001860;1;0] 2025-07-30 10:09:32.157297 98.33% [NOTICE] mrcp_client_session.c:717 (TTS-42) Add Control Channel TTS-42 <new@speechsynth> 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_poller_task.c:259 () Process Poller Wakeup [MRCPv2ConnectionAgent] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:335 () Process Message [MRCPv2ConnectionAgent] [0x7fa6f0001860;1;0] 2025-07-30 10:09:32.157297 98.33% [DEBUG] mrcp_client_session.c:743 (TTS-42) Add Media Termination TTS-42 <new@media-tm> 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6ec001630;2;0] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_poller_task.c:249 () Wait for Messages [MRCPv2ConnectionAgent] 2025-07-30 10:09:32.157297 98.33% [DEBUG] mrcp_client_session.c:772 (TTS-42) Add Media Termination TTS-42 <new@rtp-tm> 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MediaEngine] [0x7fa6f0008640;1;0] 2025-07-30 10:09:32.157297 98.33% 
[DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6ec001630;2;0] 2025-07-30 10:09:32.157297 98.33% [DEBUG] mrcp_client_session.c:292 (TTS-42) Control Channel Added TTS-42 <new@speechsynth> 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:335 () Process Message [MediaEngine] [0x7fa6f0008640;1;0] 2025-07-30 10:09:32.157297 98.33% [DEBUG] mpf_context.c:180 () Add Media Context TTS-42 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6c00037d0;3;0] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6c00037d0;3;0] 2025-07-30 10:09:32.157297 98.33% [DEBUG] mrcp_client_session.c:939 (TTS-42) Media Termination Added TTS-42 <new@media-tm> 2025-07-30 10:09:32.157297 98.33% [DEBUG] mrcp_client_session.c:939 (TTS-42) Media Termination Added TTS-42 <new@rtp-tm> 2025-07-30 10:09:32.157297 98.33% [INFO] mrcp_client_session.c:409 (TTS-42) Send Offer TTS-42 <new> [c:1 a:1 v:0] to 172.29.121.237:8060 2025-07-30 10:09:32.157297 98.33% [INFO] mrcp_sofiasip_client_agent.c:357 (TTS-42) Local SDP TTS-42 <new> v=0 o=FreeSWITCH 0 0 IN IP4 172.29.121.237 s=- c=IN IP4 172.29.121.237 t=0 0 m=application 9 TCP/MRCPv2 1 a=setup:active a=connection:existing a=resource:speechsynth a=cmid:1 m=audio 4084 RTP/AVP 0 8 96 a=rtpmap:0 PCMU/8000 a=rtpmap:8 PCMA/8000 a=rtpmap:96 L16/8000 a=recvonly a=mid:1 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:32.157297 98.33% [INFO] mrcp_sofiasip_client_agent.c:617 () Receive SIP Event [nua_i_state] Status 0 INVITE sent [uni2] 2025-07-30 10:09:32.157297 98.33% [NOTICE] mrcp_sofiasip_client_agent.c:555 (TTS-42) SIP Call State TTS-42 [calling] 2025-07-30 10:09:32.157297 98.33% [INFO] 
mrcp_sofiasip_client_agent.c:617 () Receive SIP Event [nua_r_invite] Status 200 OK [uni2] 2025-07-30 10:09:32.157297 98.33% [INFO] mrcp_sofiasip_client_agent.c:617 () Receive SIP Event [nua_i_state] Status 200 OK [uni2] 2025-07-30 10:09:32.157297 98.33% [NOTICE] mrcp_sofiasip_client_agent.c:555 (TTS-42) SIP Call State TTS-42 [ready] 2025-07-30 10:09:32.157297 98.33% [INFO] mrcp_sofiasip_client_agent.c:441 (TTS-42) Remote SDP TTS-42 <new> v=0 o=UniMRCPServer 2123169851781563314 2271541858673750442 IN IP4 172.29.121.237 s=- c=IN IP4 172.29.121.237 t=0 0 m=application 1544 TCP/MRCPv2 1 a=setup:passive a=connection:existing a=channel:8e98b0a10aca453c@speechsynth a=cmid:1 m=audio 5078 RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=sendonly a=mid:1 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6f8005ec0;1;0] 2025-07-30 10:09:32.157297 98.33% [INFO] mrcp_sofiasip_client_agent.c:617 () Receive SIP Event [nua_i_active] Status 200 Call active [uni2] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6f8005ec0;1;0] 2025-07-30 10:09:32.157297 98.33% [INFO] mrcp_client_session.c:149 (TTS-42) Receive Answer TTS-42 <new> [c:1 a:1 v:0] Status 200 2025-07-30 10:09:32.157297 98.33% [DEBUG] mrcp_client_session.c:1136 (TTS-42) Modify Control Channel TTS-42 <8e98b0a10aca453c> 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MRCPv2ConnectionAgent] [0x7fa6f0001860;1;0] 2025-07-30 10:09:32.157297 98.33% [DEBUG] mrcp_client_session.c:1174 (TTS-42) Modify Media Termination TTS-42 <8e98b0a10aca453c@rtp-tm> 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MediaEngine] [0x7fa6f0008780;1;0] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_poller_task.c:259 () Process Poller Wakeup [MRCPv2ConnectionAgent] 2025-07-30 10:09:32.157297 98.33% [DEBUG] 
apt_task.c:335 () Process Message [MRCPv2ConnectionAgent] [0x7fa6f0001860;1;0] 2025-07-30 10:09:32.157297 98.33% [INFO] mrcp_client_connection.c:456 (TTS-42) Add Control Channel <8e98b0a10aca453c@speechsynth> 172.29.121.237:58334 <-> 172.29.121.237:1544 [2] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6ec001630;2;1] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_poller_task.c:249 () Wait for Messages [MRCPv2ConnectionAgent] 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6ec001630;2;1] 2025-07-30 10:09:32.157297 98.33% [DEBUG] mrcp_client_session.c:309 (TTS-42) Control Channel Modified TTS-42 <8e98b0a10aca453c@speechsynth> 2025-07-30 10:09:32.157297 98.33% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_task.c:335 () Process Message [MediaEngine] [0x7fa6f0008780;1;0] 2025-07-30 10:09:32.177292 98.33% [INFO] mpf_rtp_stream.c:331 () Enable RTP Session 172.29.121.237:4084 2025-07-30 10:09:32.177292 98.33% [DEBUG] mpf_bridge.c:149 () Create Linear Audio Bridge TTS-42 2025-07-30 10:09:32.177292 98.33% [INFO] mpf_rtp_stream.c:505 () Open RTP Receiver 172.29.121.237:4084 <- 172.29.121.237:5078 playout [0 ms] bounds [0 - 600 ms] adaptive [0] skew detection [1] 2025-07-30 10:09:32.177292 98.33% [INFO] mpf_bridge.c:111 () Media Path TTS-42 Source->[PCMU/8000/1]->Decoder->[LPCM/8000/1]->Bridge->[LPCM/8000/1]->Sink 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6c00037d0;3;0] 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6c00037d0;3;0] 2025-07-30 10:09:32.177292 98.33% [DEBUG] mrcp_client_session.c:980 (TTS-42) Media Termination Modified TTS-42 <8e98b0a10aca453c@rtp-tm> 2025-07-30 10:09:32.177292 98.33% [INFO] mrcp_client_session.c:453 (TTS-42) Raise App Response TTS-42 <8e98b0a10aca453c> [2] SUCCESS [0] 2025-07-30 
10:09:32.177292 98.33% [DEBUG] mod_unimrcp.c:1905 (TTS-42) SYNTHESIZER channel is ready, codec = LPCM, sample rate = 8000 2025-07-30 10:09:32.177292 98.33% [DEBUG] mod_unimrcp.c:1589 (TTS-42) CLOSED ==> READY 2025-07-30 10:09:32.177292 98.33% [DEBUG] mod_unimrcp.c:1067 (TTS-42) channel is ready 2025-07-30 10:09:32.177292 98.33% [DEBUG] switch_ivr_play_say.c:3114 OPEN TTS unimrcp 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:32.177292 98.33% [DEBUG] switch_ivr_play_say.c:3124 Raw Codec Activated 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6e802a770;4;0] 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6e802a770;4;0] 2025-07-30 10:09:32.177292 98.33% [INFO] mrcp_client_session.c:390 (TTS-42) Receive App MRCP Request TTS-42 <8e98b0a10aca453c> 2025-07-30 10:09:32.177292 98.33% [INFO] mrcp_client_session.c:620 (TTS-42) Send MRCP Request TTS-42 <8e98b0a10aca453c@speechsynth> [1] 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MRCPv2ConnectionAgent] [0x7fa6f0001860;1;0] 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_poller_task.c:259 () Process Poller Wakeup [MRCPv2ConnectionAgent] 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_task.c:335 () Process Message [MRCPv2ConnectionAgent] [0x7fa6f0001860;1;0] 2025-07-30 10:09:32.177292 98.33% [INFO] mrcp_client_connection.c:551 (TTS-42) Send MRCPv2 Data 172.29.121.237:58334 <-> 172.29.121.237:1544 [348 bytes] MRCP/2.0 348 SPEAK 1 Channel-Identifier: 8e98b0a10aca453c@speechsynth Content-Type: text/plain Content-Length: 227 '您好,我是沧州福居家博会的客服,请问您是张先生吗?张先生您好!您报名了咱们的展会,我们的参展时间是七月三十日至八月二日,咱们到时候凭短信入场就可以了!' 
2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_poller_task.c:244 () Wait for Messages [MRCPv2ConnectionAgent] timeout [5000] 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_poller_task.c:267 () Process Signalled Descriptor [MRCPv2ConnectionAgent] 2025-07-30 10:09:32.177292 98.33% [INFO] mrcp_client_connection.c:656 () Receive MRCPv2 Data 172.29.121.237:58334 <-> 172.29.121.237:1544 [83 bytes] MRCP/2.0 83 1 200 IN-PROGRESS Channel-Identifier: 8e98b0a10aca453c@speechsynth 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6ec0018e0;2;3] 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_poller_task.c:249 () Wait for Messages [MRCPv2ConnectionAgent] 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6ec0018e0;2;3] 2025-07-30 10:09:32.177292 98.33% [INFO] mrcp_client_session.c:498 (TTS-42) Raise App MRCP Response TTS-42 <8e98b0a10aca453c> 2025-07-30 10:09:32.177292 98.33% [DEBUG] mod_unimrcp.c:1978 (TTS-42) REQUEST IN PROGRESS 2025-07-30 10:09:32.177292 98.33% [DEBUG] mod_unimrcp.c:1589 (TTS-42) READY ==> PROCESSING 2025-07-30 10:09:32.177292 98.33% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:32.177292 98.33% [DEBUG] switch_ivr_play_say.c:2832 Speaking text: '您好,我是沧州福居家博会的客服,请问您是张先生吗?张先生您好!您报名了咱们的展会,我们的参展时间是七月三十日至八月二日,咱们到时候凭短信入场就可以了!' 2025-07-30 10:09:32.197293 98.33% [DEBUG] switch_rtp.c:7698 Correct audio ip/port confirmed. 2025-07-30 10:09:32.197293 98.33% [DEBUG] switch_core_io.c:448 Setting BUG Codec PCMU:0 2025-07-30 10:09:32.217292 98.33% [DEBUG] switch_rtp.c:1934 rtcp_stats_init: audio ssrc[700391578] base_seq[27299] 2025-07-30 10:09:32.577293 98.37% [DEBUG] switch_rtp.c:7128 Correct audio RTCP ip/port confirmed. 
2025-07-30 10:09:43.177293 98.23% [DEBUG] apt_poller_task.c:267 () Process Signalled Descriptor [MRCPv2ConnectionAgent] 2025-07-30 10:09:43.177293 98.23% [INFO] mrcp_client_connection.c:656 () Receive MRCPv2 Data 172.29.121.237:58334 <-> 172.29.121.237:1544 [94 bytes] MRCP/2.0 94 START-OF-INPUT 1 IN-PROGRESS Channel-Identifier: 1e7deafcb1bb43e9@speechrecog 2025-07-30 10:09:43.177293 98.23% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6ec0010e0;2;3] 2025-07-30 10:09:43.177293 98.23% [DEBUG] apt_poller_task.c:249 () Wait for Messages [MRCPv2ConnectionAgent] 2025-07-30 10:09:43.177293 98.23% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6ec0010e0;2;3] 2025-07-30 10:09:43.177293 98.23% [INFO] mrcp_client_session.c:514 (ASR-41) Raise App MRCP Event ASR-41 <1e7deafcb1bb43e9> 2025-07-30 10:09:43.177293 98.23% [DEBUG] mod_unimrcp.c:3734 (ASR-41) START OF INPUT 2025-07-30 10:09:43.177293 98.23% [DEBUG] mod_unimrcp.c:2623 (ASR-41) start of input 2025-07-30 10:09:43.177293 98.23% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:43.197293 98.23% [DEBUG] mod_unimrcp.c:2567 (ASR-41) SUCCESS, start of input 2025-07-30 10:09:43.197293 98.23% [DEBUG] mod_unimrcp.c:2567 (ASR-41) SUCCESS, start of input 2025-07-30 10:09:43.197293 98.23% [DEBUG] mod_unimrcp.c:2813 (ASR-41) start of input 2025-07-30 10:09:43.217294 98.23% [INFO] switch_ivr_async.c:4842 (sofia/internal/[email protected]:14527) START OF SPEECH 2025-07-30 10:09:43.217294 98.23% [DEBUG] switch_ivr_play_say.c:2996 done speaking text 2025-07-30 10:09:43.217294 98.23% [DEBUG] mod_unimrcp.c:1400 (TTS-42) Stopping SYNTHESIZER 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6e8011190;4;0] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6e8011190;4;0] 2025-07-30 10:09:43.217294 98.23% [INFO] mrcp_client_session.c:390 (TTS-42) Receive App MRCP Request TTS-42 
<8e98b0a10aca453c> 2025-07-30 10:09:43.217294 98.23% [INFO] mrcp_client_session.c:620 (TTS-42) Send MRCP Request TTS-42 <8e98b0a10aca453c@speechsynth> [2] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:263 () Signal Message to [MRCPv2ConnectionAgent] [0x7fa6f0001860;1;0] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_poller_task.c:259 () Process Poller Wakeup [MRCPv2ConnectionAgent] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:335 () Process Message [MRCPv2ConnectionAgent] [0x7fa6f0001860;1;0] 2025-07-30 10:09:43.217294 98.23% [INFO] mrcp_client_connection.c:551 (TTS-42) Send MRCPv2 Data 172.29.121.237:58334 <-> 172.29.121.237:1544 [72 bytes] MRCP/2.0 72 STOP 2 Channel-Identifier: 8e98b0a10aca453c@speechsynth 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_poller_task.c:244 () Wait for Messages [MRCPv2ConnectionAgent] timeout [5000] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_poller_task.c:267 () Process Signalled Descriptor [MRCPv2ConnectionAgent] 2025-07-30 10:09:43.217294 98.23% [INFO] mrcp_client_connection.c:656 () Receive MRCPv2 Data 172.29.121.237:58334 <-> 172.29.121.237:1544 [108 bytes] MRCP/2.0 108 2 200 COMPLETE Channel-Identifier: 8e98b0a10aca453c@speechsynth Active-Request-Id-List: 1 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6ec0010e0;2;3] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_poller_task.c:249 () Wait for Messages [MRCPv2ConnectionAgent] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6ec0010e0;2;3] 2025-07-30 10:09:43.217294 98.23% [INFO] mrcp_client_session.c:498 (TTS-42) Raise App MRCP Response TTS-42 <8e98b0a10aca453c> 2025-07-30 10:09:43.217294 98.23% [DEBUG] mod_unimrcp.c:1990 (TTS-42) COMPLETE 2025-07-30 10:09:43.217294 98.23% [DEBUG] mod_unimrcp.c:1589 (TTS-42) PROCESSING ==> DONE 2025-07-30 10:09:43.217294 98.23% [DEBUG] 
apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:43.217294 98.23% [DEBUG] mod_unimrcp.c:1422 (TTS-42) SYNTHESIZER stopped 2025-07-30 10:09:43.217294 98.23% [DEBUG] mod_unimrcp.c:1589 (TTS-42) DONE ==> READY 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6e8011190;4;0] 2025-07-30 10:09:43.217294 98.23% [DEBUG] mod_unimrcp.c:931 (TTS-42) Waiting for MRCP session to terminate 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6e8011190;4;0] 2025-07-30 10:09:43.217294 98.23% [INFO] mrcp_client_session.c:385 (TTS-42) Receive App Request TTS-42 <8e98b0a10aca453c> [1] 2025-07-30 10:09:43.217294 98.23% [DEBUG] mrcp_client_session.c:1277 (TTS-42) Dispatch App Request TTS-42 <8e98b0a10aca453c> [1] 2025-07-30 10:09:43.217294 98.23% [INFO] mrcp_client_session.c:828 (TTS-42) Terminate Session TTS-42 <8e98b0a10aca453c> 2025-07-30 10:09:43.217294 98.23% [DEBUG] mrcp_client_session.c:849 (TTS-42) Remove Control Channel TTS-42 <8e98b0a10aca453c> 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:263 () Signal Message to [MRCPv2ConnectionAgent] [0x7fa6f0001860;1;0] 2025-07-30 10:09:43.217294 98.23% [DEBUG] mrcp_client_session.c:859 (TTS-42) Subtract Media Termination TTS-42 <8e98b0a10aca453c@media-tm> 2025-07-30 10:09:43.217294 98.23% [DEBUG] mrcp_client_session.c:880 (TTS-42) Subtract Media Termination TTS-42 <8e98b0a10aca453c@rtp-tm> 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_poller_task.c:259 () Process Poller Wakeup [MRCPv2ConnectionAgent] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:263 () Signal Message to [MediaEngine] [0x7fa6f0008b20;1;0] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:335 () Process Message [MRCPv2ConnectionAgent] [0x7fa6f0001860;1;0] 2025-07-30 10:09:43.217294 98.23% [INFO] mrcp_client_connection.c:480 (TTS-42) Remove Control Channel <8e98b0a10aca453c@speechsynth> [1] 2025-07-30 10:09:43.217294 98.23% [DEBUG] 
apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6ec0010e0;2;2] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_poller_task.c:249 () Wait for Messages [MRCPv2ConnectionAgent] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6ec0010e0;2;2] 2025-07-30 10:09:43.217294 98.23% [DEBUG] mrcp_client_session.c:329 (TTS-42) Control Channel Removed TTS-42 <8e98b0a10aca453c@speechsynth> 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:43.217294 98.23% [INFO] mrcp_sofiasip_client_agent.c:617 () Receive SIP Event [nua_r_bye] Status 200 OK [uni2] 2025-07-30 10:09:43.217294 98.23% [INFO] mrcp_sofiasip_client_agent.c:617 () Receive SIP Event [nua_i_state] Status 200 to BYE [uni2] 2025-07-30 10:09:43.217294 98.23% [NOTICE] mrcp_sofiasip_client_agent.c:555 (TTS-42) SIP Call State TTS-42 [terminated] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6f8005ec0;1;1] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6f8005ec0;1;1] 2025-07-30 10:09:43.217294 98.23% [INFO] mrcp_client_session.c:207 (TTS-42) Session Terminated TTS-42 <8e98b0a10aca453c> 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:335 () Process Message [MediaEngine] [0x7fa6f0008b20;1;0] 2025-07-30 10:09:43.217294 98.23% [DEBUG] mpf_bridge.c:118 () Destroy Audio Bridge TTS-42 2025-07-30 10:09:43.217294 98.23% [INFO] mpf_rtp_stream.c:537 () Close RTP Receiver 172.29.121.237:4084 <- 172.29.121.237:5078 [r:552 l:0 j:50 p:0 d:0 i:0] 2025-07-30 10:09:43.217294 98.23% [DEBUG] mpf_context.c:236 () Remove Media Context TTS-42 2025-07-30 10:09:43.217294 98.23% [INFO] mpf_rtp_stream.c:418 () Remove RTP Session 
172.29.121.237:4084 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6c00037d0;3;0] 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6c00037d0;3;0] 2025-07-30 10:09:43.217294 98.23% [DEBUG] mrcp_client_session.c:1009 (TTS-42) Media Termination Subtracted TTS-42 <8e98b0a10aca453c@media-tm> 2025-07-30 10:09:43.217294 98.23% [DEBUG] mrcp_client_session.c:1009 (TTS-42) Media Termination Subtracted TTS-42 <8e98b0a10aca453c@rtp-tm> 2025-07-30 10:09:43.217294 98.23% [INFO] mrcp_client.c:707 (TTS-42) Remove MRCP Handle TTS-42 <8e98b0a10aca453c> 2025-07-30 10:09:43.217294 98.23% [INFO] mrcp_client_session.c:453 (TTS-42) Raise App Response TTS-42 <8e98b0a10aca453c> [1] SUCCESS [0] 2025-07-30 10:09:43.217294 98.23% [DEBUG] mod_unimrcp.c:1840 (TTS-42) Destroying MRCP session 2025-07-30 10:09:43.217294 98.23% [NOTICE] mrcp_application.c:211 (TTS-42) Destroy MRCP Handle TTS-42 2025-07-30 10:09:43.217294 98.23% [DEBUG] mod_unimrcp.c:1589 (TTS-42) READY ==> CLOSED 2025-07-30 10:09:43.217294 98.23% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:43.217294 98.23% [DEBUG] mod_unimrcp.c:856 (TTS-42) audio queue destroyed 2025-07-30 10:09:43.217294 98.23% [INFO] switch_ivr_async.c:4938 (sofia/internal/[email protected]:14527) WAITING FOR RESULT 2025-07-30 10:09:46.317293 98.23% [DEBUG] apt_poller_task.c:267 () Process Signalled Descriptor [MRCPv2ConnectionAgent] 2025-07-30 10:09:46.317293 98.23% [INFO] mrcp_client_connection.c:656 () Receive MRCPv2 Data 172.29.121.237:58334 <-> 172.29.121.237:1544 [416 bytes] MRCP/2.0 416 RECOGNITION-COMPLETE 1 COMPLETE Channel-Identifier: 1e7deafcb1bb43e9@speechrecog Completion-Cause: 000 success Content-Type: application/x-nlsml Content-Length: 231 
{"resp_type":"RESULT","trace_id":"ee496c56-91d9-4e03-9f85-dad51b21b86a","segments":[{"start_time":0,"end_time":13500,"is_final":true,"result":{"text":"不是不是不是不是不是不是不是不是","score":0.7839646935462952}}]} 2025-07-30 10:09:46.317293 98.23% [DEBUG] apt_task.c:263 () Signal Message to [MRCP Client] [0x7fa6ec0010e0;2;3] 2025-07-30 10:09:46.317293 98.23% [DEBUG] apt_poller_task.c:249 () Wait for Messages [MRCPv2ConnectionAgent] 2025-07-30 10:09:46.317293 98.23% [DEBUG] apt_task.c:335 () Process Message [MRCP Client] [0x7fa6ec0010e0;2;3] 2025-07-30 10:09:46.317293 98.23% [INFO] mrcp_client_session.c:514 (ASR-41) Raise App MRCP Event ASR-41 <1e7deafcb1bb43e9> 2025-07-30 10:09:46.317293 98.23% [DEBUG] mod_unimrcp.c:3709 (ASR-41) RECOGNITION COMPLETE, Completion-Cause: 000 2025-07-30 10:09:46.317293 98.23% [DEBUG] mod_unimrcp.c:3718 (ASR-41) Recognition result is not null-terminated. Appending null terminator. 2025-07-30 10:09:46.317293 98.23% [DEBUG] mod_unimrcp.c:2756 (ASR-41) ASR adding result headers 2025-07-30 10:09:46.317293 98.23% [DEBUG] mod_unimrcp.c:2651 (ASR-41) result: {"resp_type":"RESULT","trace_id":"ee496c56-91d9-4e03-9f85-dad51b21b86a","segments":[{"start_time":0,"end_time":13500,"is_final":true,"result":{"text":"不是不是不是不是不是不是不是不是","score":0.7839646935462952}}]} 2025-07-30 10:09:46.317293 98.23% [DEBUG] mod_unimrcp.c:1589 (ASR-41) PROCESSING ==> READY 2025-07-30 10:09:46.317293 98.23% [DEBUG] apt_consumer_task.c:135 () Wait for Messages [MRCP Client] 2025-07-30 10:09:46.337293 98.23% [DEBUG] mod_unimrcp.c:2564 (ASR-41) SUCCESS, have result 2025-07-30 10:09:46.337293 98.23% [DEBUG] mod_unimrcp.c:2564 (ASR-41) SUCCESS, have result 2025-07-30 10:09:46.337293 98.23% [DEBUG] mod_unimrcp.c:2809 (ASR-41) result: {"resp_type":"RESULT","trace_id":"ee496c56-91d9-4e03-9f85-dad51b21b86a","segments":[{"start_time":0,"end_time":13500,"is_final":true,"result":{"text":"不是不是不是不是不是不是不是不是","score":0.7839646935462952}}]} 2025-07-30 10:09:46.357292 98.23% [INFO] 
switch_ivr_async.c:4829 (sofia/internal/[email protected]:14527) DETECTED SPEECH

1. Even with `break_on_speech=false`, playback is still interrupted automatically. I don't think the parameter failed to be parsed because of ordering; on the contrary, the log suggests it was parsed correctly?

2. With the following polling loop, `detect_speech_result_type` still comes back empty:

    -- Note: os.clock() reports CPU time, so this bound may not
    -- correspond to 30 seconds of wall-clock time.
    while os.clock() - start < 30 do  -- wait at most 30 seconds
        local speech_status = session:getVariable("detect_speech_result_type")
        if speech_status == "break" then
            local status = {
                type = session:getVariable("detect_speech_result_type") or "",
                result = session:getVariable("detect_speech_result") or "",
                cause = session:getVariable("detect_speech_cause") or ""
            }
            freeswitch.consoleLog("NOTICE", "Detect Status: "
                .. "type=" .. status.type
                .. ", cause=" .. status.cause
                .. ", result=" .. status.result:sub(1, 50) .. "\n")
            handle_barge_in()
            return "bargein", session:getVariable("detect_speech_result")
        end
        session:sleep(100)
    end


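Solution:

1. Choose a speech recognition technology: the aelos robot supports several, such as Google Cloud Speech-to-Text and Microsoft Azure Speech Services. Pick one according to your needs and budget.
2. Configure the speech recognition model: set up the model and parameters required by the chosen technology. For example, Google Cloud Speech-to-Text needs the language, model type, sampling rate, and similar parameters.
3. Implement the speech recognition interface using the chosen technology, for example through the Google Cloud Speech-to-Text RESTful API.
4. Implement command execution: have the robot carry out the command that corresponds to the recognition result, for example through the aelos robot's API.
5. Test and optimize: test the recognition model and command execution, then tune the model and parameters to improve recognition accuracy and command execution speed.

Core code:

    import speech_recognition as sr

    # Create the recognizer object
    r = sr.Recognizer()

    # Configure recognition thresholds
    r.energy_threshold = 300
    r.pause_threshold = 0.8

    # Speech recognition interface
    def recognize_speech(audio):
        try:
            # recognize_google calls the Google Web Speech API,
            # not the Cloud Speech-to-Text product
            return r.recognize_google(audio, language='zh-CN')
        except sr.UnknownValueError:
            return None  # speech could not be understood
        except sr.RequestError:
            return None  # the recognition service could not be reached

    # Robot command execution
    def execute_command(text):
        # e.g. call the aelos robot's API to execute the command
        # ...
        pass

    # Test recognition and command execution.
    # An AudioFile must be read into AudioData before it can be recognized.
    with sr.AudioFile('test.wav') as source:
        audio = r.record(source)
    text = recognize_speech(audio)
    if text:
        execute_command(text)

(Give me a hand-holding, step-by-step tutorial, detailed down to which software each step is performed in.)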
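The command-execution step is left as a stub in the snippet above. One possible way to fill it in is a keyword-to-action table, sketched below. Everything here is an assumption for illustration: the keywords, the action names (`walk_forward` and so on), and the `send` callback are hypothetical, and no real aelos API is implied; `send` stands in for whatever call actually delivers an action to the robot.

```python
from typing import Callable, Dict, Optional

# Hypothetical mapping from spoken keywords to robot action names.
COMMANDS: Dict[str, str] = {
    "前进": "walk_forward",
    "后退": "walk_backward",
    "停": "stop",
}


def match_command(text: Optional[str]) -> Optional[str]:
    """Return the first action whose keyword appears in the recognized text."""
    if not text:
        return None
    for keyword, action in COMMANDS.items():
        if keyword in text:
            return action
    return None


def execute_command(text: Optional[str], send: Callable[[str], None]) -> bool:
    """Send the matched action through `send`; report whether anything matched."""
    action = match_command(text)
    if action is None:
        return False
    send(action)
    return True
```

Keeping the matching separate from the sending makes the recognition-to-command logic testable without a robot attached: pass a list's `append` method as `send` and inspect what was collected.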

Uploaded by: 可吸不是泥