[B! TTS] arrowKato's bookmarks

GitHub - FunAudioLLM/CosyVoice: Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

arrowKato 2025/06/02

TTS

hexgrad/Kokoro-82M · Hugging Face

📣 Jan 12 Status: Intent to improve the base model https://hf.co/hexgrad/Kokoro-82M/discussions/36 ❤️ Kokoro Discord Server: https://discord.gg/QuGxSWBfQy Kokoro is a frontier TTS model for its size of 82 million parameters (text in/audio out). On 25 Dec 2024, Kokoro v0.19 weights were permissively released in full fp32 precision under an Apache 2.0 license. As of 2 Jan 2025, 10 unique Voicepacks

arrowKato 2025/01/20

Supports Japanese. Very lightweight, with reasonably good accuracy.

GitHub - svc-develop-team/so-vits-svc: SoftVC VITS Singing Voice Conversion

arrowKato 2024/12/16

Can synthesize singing voices.

TTS

GitHub - fishaudio/fish-speech: SOTA Open Source TTS

Zero-shot & Few-shot TTS: Input a 10 to 30-second vocal sample to generate high-quality TTS output. For detailed guidelines, see Voice Cloning Best Practices. Multilingual & Cross-lingual Support: Simply copy and paste multilingual text into the input box—no need to worry about the language. Currently supports English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish. No Phoneme Depe

arrowKato 2024/08/26

TTS

GitHub - 2noise/ChatTTS: A generative speech model for daily dialogue.

arrowKato 2024/08/26

TTS

Seed-TTS

arrowKato 2024/07/01

TTS

GitHub - fishaudio/Bert-VITS2: vits2 backbone with multilingual-bert

arrowKato 2024/02/27

TTS

GitHub - RVC-Boss/GPT-SoVITS: 1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

arrowKato 2024/02/06

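Give it about 10 seconds of audio and specify some text, and it reads the text aloud in a voice that resembles the input audio. Separately from that, fine-tuning is also possible. The model size appears to be under 7B.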

`large-v3` release · openai/whisper · Discussion #1762

We're pleased to announce the latest iteration of Whisper, called large-v3. Whisper-v3 has the same architecture as the previous large models except the following minor differences: the input uses 128 Mel frequency bins instead of 80; a new language token for Cantonese. The large-v3 model is trained on 1 million hours of weakly labeled audio and 4 million hours of pseudolabeled audio collected using

arrowKato 2023/11/08

WhisperV3

arrowKato's bookmarks about TTS (9)
