Uploaded by
Korakot Chaovavanich
Speech-to-Text API - Thai NLP Meetup #2
Experience in using Google Speech-to-Text API for transcription.
1.
Speech-to-Text API: Experience learning to use it for transcribing Thai speech
2.
Outline
• Theory
• Speech-to-Text API
• The transcription problem
• Practice - Colab
• Libraries & Data
• pydub, SpeechClient()
• config, response
• Interface design
3.
Theory
4.
Google Speech-to-Text
• Formerly named Speech API, renamed Speech-to-Text
• Part of the Cloud ML family of ready-made (pre-trained) APIs
• ASR: Automatic Speech Recognition
• Supports 120 languages (the most of the options listed)
• Other options: Nuance, AmiVoice, Tellvoice (MS and AWS do not support Thai)
5.
Capabilities
• Converts audio to text in 3 modes
• Synchronous (short audio)
• Asynchronous (long audio)
• Streaming (real-time)
• Available through the client library, gcloud, and curl
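As a rough illustration of choosing between the two modes for pre-recorded audio, the sketch below measures a WAV file's duration with Python's standard `wave` module and picks a client method name. `recognize` and `long_running_recognize` are methods of the google-cloud-speech `SpeechClient`; the 60-second threshold is an assumption based on the roughly one-minute limit on synchronous requests, and the helper names are hypothetical.

```python
import wave

# Synchronous requests are limited to roughly one minute of audio;
# treat this threshold as an assumption and check the current quota docs.
SYNC_LIMIT_SECONDS = 60.0


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def choose_method(duration_seconds):
    """Pick a SpeechClient method name for pre-recorded audio."""
    if duration_seconds <= SYNC_LIMIT_SECONDS:
        return "recognize"               # synchronous, short audio
    return "long_running_recognize"      # asynchronous, long audio
```

Streaming is left out here because it applies to live audio rather than to files of a known length.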
6.
Advanced Features
• Alternatives
• Timestamps
• Separate Speakers
• Identify Language
• Enhanced Models (English only)
• Word-level Confidence
7.
Pricing
• Free: 60 minutes/month
• Speech audio: $0.006 = 0.20฿ per 15 sec
• Video: 2x the price
• Example: transcribing 1 hour ≈ 50 baht
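The one-hour figure can be checked with a few lines of arithmetic. This is a minimal sketch using the prices quoted on the slide; rounding up to whole 15-second units and the way the free allowance is subtracted are assumptions, and the function name is hypothetical.

```python
import math

PRICE_BAHT_PER_INCREMENT = 0.20   # from the slide: $0.006 = 0.20 baht
INCREMENT_SECONDS = 15            # billing unit quoted on the slide
VIDEO_MULTIPLIER = 2              # video costs twice as much


def estimate_cost_baht(duration_seconds, video=False, free_seconds=0):
    """Estimate transcription cost in baht, rounding up to 15-second units."""
    billable = max(0, duration_seconds - free_seconds)
    increments = math.ceil(billable / INCREMENT_SECONDS)
    cost = increments * PRICE_BAHT_PER_INCREMENT
    if video:
        cost *= VIDEO_MULTIPLIER
    return round(cost, 2)


one_hour = estimate_cost_baht(3600)                    # 240 units x 0.20 baht
one_hour_video = estimate_cost_baht(3600, video=True)  # twice the audio price
```

One hour is 240 units of 15 seconds, so the audio estimate comes to 48 baht, which matches the slide's "≈ 50 baht".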
8.
Transcription
• Why transcribe?
• To make content searchable
• To create and translate subtitles (accessibility)
• Text mining to find insights
• To convert unstructured data into structured data
• Problem: very labor-intensive, and therefore expensive
9.
Requirements
• A transcription system built on the Speech-to-Text API
• Not too expensive
• Accurate enough, with a way to check for wrong words
• Has timestamps so the output can be converted to subtitles
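To make the subtitle requirement concrete, here is a minimal sketch that turns timed text segments into SubRip (SRT) text. SRT is an assumed target format (the slides only say "subtitle"), the function names are hypothetical, and the sample segments are invented for illustration.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


# Hypothetical segments; real ones would come from the API's time offsets.
example = to_srt([(0.0, 2.5, "สวัสดีครับ"), (2.5, 5.04, "วันนี้เราจะพูดถึง ASR")])
```

Grouping individual words into subtitle-sized segments is a separate step that this sketch does not cover.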
10.
Practice
11.
Google Colab
• https://colab.research.google.com
• Free cloud instance
• CPU: Xeon 2.3 GHz
• RAM: 12.6 GB
• Disk: 33 GB
• GPU: Tesla K80, 2496 cores, 12 GB VRAM
• Idle cutoff: 90 minutes (max 12 hours)
12.
Import & Install
• from IPython.display import HTML, Audio
• !pip install youtube-dl
• !pip install pydub
• !apt install ffmpeg
• !pip install google-cloud-speech
• from google.cloud import speech_v1p1beta1 as speech
13.
youtube-dl
• Download from YouTube in many formats
• !youtube-dl -F [youtube_url] (list the available formats)
• !youtube-dl -f bestaudio -o 'audio.%(ext)s' [url]
14.
pydub
15.
librosa, peaks.js
• Librosa: waveform, spectrogram, CQT
• Peaks.js: interaction
16.
Cloud Authentication
• Register for Google Cloud (free $200 credit)
• Create a new project
• Enable the Speech API for the project
• Download the credential file and save it to Google Drive
• Load the credential file into Colab and set the environment
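The last step can be sketched as follows. `GOOGLE_APPLICATION_CREDENTIALS` is the environment variable Google's client libraries read to find a service-account key; the helper name is hypothetical, and the path you pass in depends on where the file sits in your own Drive.

```python
import os


def use_credentials(path):
    """Point Google client libraries at a service-account key file."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"credential file not found: {path}")
    # GOOGLE_APPLICATION_CREDENTIALS is the variable Google's client
    # libraries read to locate the key file.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = path
    return path
```

In Colab, Google Drive is usually mounted first with `google.colab.drive.mount('/content/drive')`, after which the key file can be referenced by a path under `/content/drive`.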
17.
gcloud speech recognize
• For the command line, you can also use 'curl'
18.
Python Library
• Call the API with audio and config
19.
The simplest form
20.
Multiple alternatives
• Set max_alternatives=n in the config
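A call with audio and config, requesting several alternatives, might look like the sketch below. It is not the code from the slides. It assumes a recent google-cloud-speech release in which `recognize()` accepts plain dicts for `config` and `audio`, and WAV or FLAC input so that encoding and sample rate can be read from the file header. The helper names are hypothetical, and `th-TH` is the language code for Thai.

```python
def build_config(language_code="th-TH", max_alternatives=1):
    """Return a recognition config as a plain dict.

    Field names follow RecognitionConfig. Encoding and sample rate are
    left out on the assumption that the audio is WAV or FLAC, where the
    service can read them from the file header.
    """
    return {
        "language_code": language_code,
        "max_alternatives": max_alternatives,
    }


def transcribe_short(path, config):
    """Synchronously transcribe a short audio file (needs credentials)."""
    # Imported lazily so build_config() works without the library installed.
    from google.cloud import speech_v1p1beta1 as speech

    client = speech.SpeechClient()
    with open(path, "rb") as audio_file:
        audio = {"content": audio_file.read()}
    response = client.recognize(config=config, audio=audio)
    # Each result holds up to max_alternatives candidate transcripts.
    return [
        [alt.transcript for alt in result.alternatives]
        for result in response.results
    ]


config = build_config(max_alternatives=3)
```

`transcribe_short("audio.wav", config)` would then return one list of candidate transcripts per recognized segment, given valid credentials and network access.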
21.
Time offset, word confidence
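As a sketch of how per-word offsets and confidence could be used to flag likely errors, the code below flattens one alternative into rows and picks out low-confidence words. `enable_word_time_offsets` and `enable_word_confidence` are RecognitionConfig fields. The code assumes the Python client exposes `start_time` and `end_time` as `datetime.timedelta` (true of recent releases, as far as I know). The 0.8 threshold, the helper names, and the hand-built stand-in result are all hypothetical.

```python
from datetime import timedelta
from types import SimpleNamespace

WORD_CONFIG = {
    "language_code": "th-TH",
    "enable_word_time_offsets": True,   # adds start/end time per word
    "enable_word_confidence": True,     # adds a confidence score per word
}


def word_rows(alternative):
    """Flatten an alternative's words into (word, start_s, end_s, confidence)."""
    rows = []
    for info in alternative.words:
        rows.append((
            info.word,
            info.start_time.total_seconds(),
            info.end_time.total_seconds(),
            info.confidence,
        ))
    return rows


def low_confidence(rows, threshold=0.8):
    """Return rows whose confidence falls below the threshold, for review."""
    return [row for row in rows if row[3] < threshold]


# Stand-in for a real API result, built by hand for illustration only.
fake_alternative = SimpleNamespace(words=[
    SimpleNamespace(word="สวัสดี", start_time=timedelta(seconds=0.0),
                    end_time=timedelta(seconds=0.6), confidence=0.95),
    SimpleNamespace(word="ครับ", start_time=timedelta(seconds=0.6),
                    end_time=timedelta(seconds=0.9), confidence=0.62),
])
rows = word_rows(fake_alternative)
suspects = low_confidence(rows)
```

The low-confidence rows are natural candidates for a human to check first.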
22.
python property()
23.
Click & Play
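One way such a click-and-play view could be built is sketched below: each word becomes a clickable span carrying its start time, and a small script seeks an audio element to that time. This is an illustrative guess at the idea, not the interface shown in the talk; the function name, element id, and sample words are hypothetical.

```python
from html import escape


def click_to_play_html(audio_src, rows):
    """Build HTML where clicking a word seeks the audio to its start time."""
    spans = " ".join(
        f'<span class="w" data-start="{start:.2f}" '
        f'style="cursor:pointer">{escape(word)}</span>'
        for word, start in rows
    )
    script = (
        "<script>"
        "document.querySelectorAll('span.w').forEach(function (el) {"
        "  el.addEventListener('click', function () {"
        "    var player = document.getElementById('player');"
        "    player.currentTime = parseFloat(el.dataset.start);"
        "    player.play();"
        "  });"
        "});"
        "</script>"
    )
    src = escape(audio_src, quote=True)
    audio = f'<audio id="player" controls src="{src}"></audio>'
    return f"{audio}<p>{spans}</p>{script}"


page = click_to_play_html("audio.wav", [("hello", 0.0), ("world", 0.75)])
```

In a notebook the string could be shown with `IPython.display.HTML(page)`, provided the audio source is a URL or data URI the browser can actually load.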
24.
Next
• client.long_running_recognize()
• Collect statistics on difficult words that get misspelled, then correct them
• Context hints via speechContexts phrases (500 words)
• Speech analysis (waveform, spectrogram) to correct word boundaries
• Use the labels to train an ASR system such as KALDI or Deep Speech
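The idea of collecting misspelling statistics and then correcting could start from something like this minimal sketch. It assumes an editor logs (recognized, corrected) pairs; the function names, the `min_count` cutoff, and the sample log are hypothetical.

```python
from collections import Counter, defaultdict


def build_corrections(pairs, min_count=2):
    """Map each misrecognized word to its most frequent manual correction.

    pairs: iterable of (recognized, corrected) strings logged by an editor.
    Only corrections seen at least min_count times are kept.
    """
    counts = defaultdict(Counter)
    for recognized, corrected in pairs:
        if recognized != corrected:
            counts[recognized][corrected] += 1
    table = {}
    for recognized, options in counts.items():
        corrected, seen = options.most_common(1)[0]
        if seen >= min_count:
            table[recognized] = corrected
    return table


def apply_corrections(text, table):
    """Replace known misrecognitions, longest first to avoid partial hits."""
    for recognized in sorted(table, key=len, reverse=True):
        text = text.replace(recognized, table[recognized])
    return text


# Hypothetical correction log, invented for illustration.
log = [("คาลดี", "KALDI"), ("คาลดี", "KALDI"), ("โคแลบ", "Colab")]
table = build_corrections(log)
fixed = apply_corrections("ลอง train ด้วย คาลดี", table)
```

Because Thai is written without spaces between words, plain substring replacement is used here; a real system would probably need word segmentation to avoid replacing inside longer words.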
25.
Thank you!