[B! multimodal] arrowKatoのブックマーク

arrowKato id:arrowKato

multimodalに関するarrowKatoのブックマーク (4)

Vidore Leaderboard - a Hugging Face Space by vidore
Discover amazing ML apps made by the community
arrowKato 2024/12/24
マルチモーダル用にドキュメントを検索するためのembeddingを作るモデルのリーダーボード

multimodal

RAG
リンク
langchain/cookbook/Multi_modal_RAG.ipynb at master · langchain-ai/langchain
arrowKato 2024/12/24
multimodal

RAG
リンク
Multimodal RAG を実装してみる
昨日の記事の続き。 Multimodal RAG のアプローチのうち、マルチモーダル埋め込みを用いるもの（Multi-Vector Retriever for RAG on tables, text, and images のOption 1）の具体的な実装を考えてみる。前提：非マルチモーダルRAG 通常のRAG でドキュメントを格納する場合、以下のようなコードを用いるのが一般的だと思う。 #ドキュメントを分割してチャンクにする text_splitter = RecursiveCharacterTextSplitter(chunks_size=300, chunk_overlap=30) chunks = text_splitter.split_documents(document) #ベクトルインデックスを作成（Google Vertex AI） vector_store = Vec
arrowKato 2024/11/21
RAG

multimodal
リンク
Introduction to GPT-4o and GPT-4o mini | OpenAI Cookbook
GPT-4o ("o" for "omni") and GPT-4o mini are natively multimodal models designed to handle a combination of text, audio, and video inputs, and can generate outputs in text, audio, and image formats. GPT-4o mini is the lightweight version of GPT-4o. Background Before GPT-4o, users could interact with ChatGPT using Voice Mode, which operated with three separate models. GPT-4o integrates these capabil
arrowKato 2024/11/19
画像を入力にするときのサンプルコードあり

LLM

GPT-4o

GPT-4o-mini

multimodal
リンク
1

お知らせ

もっと読む

公式Twitter

@HatenaBookmark
リリース、障害情報などのサービスのお知らせ
@hatebu
最新の人気エントリーの配信

キーボードショートカット一覧

j次のブックマーク

k前のブックマーク

lあとで読む

eコメント一覧を開く

oページを開く

設定を変更しましたx