[Page 5] Popular articles for "DL": 216 items - Hatena Bookmark

161 - 200 of 216 items

OpenXLA is available now to accelerate and simplify machine learning
- 6 users
- opensource.googleblog.com
- Technology
- 2023/03/09
ML development and deployment today suffer from fragmented and siloed infrastructure that can differ by framework, hardware, and use case. Such fragmentation restrains developer velocity and imposes barriers to model portability, efficiency, and productionization. Today, we’re taking a signi
- AI
- Google
Transformer models: an introduction and catalog
- 5 users
- arxiv.org
- Technology
- 2023/02/17
In the past few years we have seen the meteoric appearance of dozens of foundation models of the Transformer family, all of which have memorable and sometimes funny, but not self-explanatory, names. The goal of this paper is to offer a somewhat comprehensive but simple catalog and classification of the most popular Transformer models. The paper also includes an introduction to the most important a
- Read later
Tips for fine-tuning BERT learned on Kaggle, Part 5: Using unlabeled data | AI Shift, Inc.
- 5 users
- www.ai-shift.co.jp
- Technology
- 2023/01/17
Hello! This is Toda from the AI team! In this article I share tips for fine-tuning Transformer-based pretrained models that I picked up by participating in Kaggle competitions. I have written several articles on the same theme before: Tips for fine-tuning BERT learned on Kaggle, Part 1: training efficiency; Part 2: accuracy improvement; Part 3: overfitting suppression; Part 4: adversarial training. This time I write about making use of unlabeled data. Many attempts have been made to solve all kinds of real-world problems with supervised learning on large amounts of accumulated data…
- BERT
- NLP
- *tips
Paper introduction / Llama 2: Open Foundation and Fine-Tuned Chat Models
- 5 users
- speakerdeck.com/kyoun
- Technology
- 2023/09/02
The 15th Cutting-Edge NLP Study Group (最先端NLP勉強会)
- Read later

ggml.ai
- 5 users
- ggml.ai
- Technology
- 2023/06/12
GGML - AI at the edge. ggml is a tensor library for machine learning to enable large models and high performance on commodity hardware. It is used by llama.cpp and whisper.cpp. Features: low-level cross-platform implementation; integer quantization support; broad hardware support; automatic differentiation; ADAM and L-BFGS optimizers; no third-party dependencies; zero memory allocations during runtime. The ggml way
- C++
- library
What is an autoencoder? Meaning, mechanism, types, and use cases explained | Ledge.ai
- 5 users
- ledge.ai
- Technology
- 2020/10/22
Multilingual CLIP with Huggingface + PyTorch Lightning 🤗 ⚡
- 5 users
- sachinruk.github.io
- Technology
- 2021/03/09
Transformer models: an introduction and catalog — 2023 Edition
- 5 users
- amatria.in
- Technology
- 2022/07/22
January 16, 2023. This post is now an arXiv paper that you can print and cite. Update 05/2023: Another pretty large update after 4 months. I was invited to submit the article to a journal, so I decided to enlist some help from some LinkedIn colleagues and completely revamp it. First off, we added a whole lot of new models,
- Transformer
- Read later
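Humbleicons: an open-source SVG icon set designed to be versatile and easy to use - かちびと.net
- 5 users
- kachibito.net
- Technology
- 2022/05/25
Humbleicons is an open-source SVG icon set designed to be versatile and easy to use. Its styling is simple and neutral, and the icons were reportedly designed with care so that they can be dropped straight into a project. You can download (DL) all icons as a set or copy the code for individual icons. The code is provided both inline and as an SVG sprite. The code also carries classes, so the icons can be styled together with CSS. The license is reportedly MIT. Humbleicons
- SVG
- icons
- css
- design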
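An overview of the Neural Tangent Kernel (NTK)
- 5 users
- medium.com
- Technology
- 2020/09/02
This is a Japanese-language explainer of the Neural Tangent Kernel (NTK), a theory that is currently taking the neural network field by storm and that may become mainstream or may fade away. The goal is to understand what NTK is. Introduction (an English lesson): "tangent" here means a tangent line. If you thought of sine, cosine, tangent, you are mistaken! The Neural Tangent Kernel (NTK) was proposed at the end of 2018; its theory is said to be close to the truth of machine learning, yet it has not quite translated into results. A neural network model can be expressed as a function y=f(x,θ) (y = output, x = input, θ = the set of weights), and in general it is an algorithm that arrives at the correct y by adjusting the individual weights. As for NTK, adjusting the weights…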
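For readers who want the object behind the name: the excerpt is cut off before it defines the kernel, so the following is the standard textbook definition rather than anything quoted from the article. It reuses the excerpt's notation f(x,θ); the symbol Θ and the second input x' are labels introduced here.

```latex
\Theta(x, x') \;=\; \nabla_\theta f(x,\theta)^{\top}\, \nabla_\theta f(x',\theta)
```

The "tangent" in the name refers to these gradients with respect to the weights: the kernel measures how similarly a small weight update moves the outputs at x and x'.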
GitHub - google/flax: Flax is a neural network library for JAX that is designed for flexibility.
- 5 users
- github.com/google
- Technology
- 2021/07/25
Released in 2024, Flax NNX is a new simplified Flax API that is designed to make it easier to create, inspect, debug, and analyze neural networks in JAX. It achieves this by adding first class support for Python reference semantics. This allows users to express their models using regular Python objects, enabling reference sharing
- library
- *Read later
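Gold medal in the patent competition: two new Kaggle Masters at Nikkei - HACK The Nikkei
- 5 users
- hack.nikkei.com
- Technology
- 2022/07/06
Why I joined the competition. There are two reasons I took part. First, I had promised in advance to work on a competition with the members of this team. My teammates this time, Masuda and Aota, and I had also entered another competition in May of this year, where we earned a silver medal (our write-up from then is here). Through the May competition I came to enjoy working on a competition in a friendly atmosphere with colleagues from the same workplace, and we had promised each other that "PPPM is next." Second, the domain was personally familiar and interesting to me. I studied industrial property rights at university, so I had a strong attachment to the patent field. In the end I did not use any of my domain knowledge, but I stayed interested in the competition to the very end, so I count it as all's well that ends well. Team N's solution. As noted in the preface, we achieved a strong result of 8th place out of 1,889 teams…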
Looking into NVIDIA Dynamo - NTT Communications Engineers' Blog
- 5 users
- engineers.ntt.com
- Technology
- 2025/05/20
Hello, this is Tsuyuzaki from NTT Communications. This post introduces Dynamo, the NVIDIA OSS presented at GTC in March 2025. Contents: Introduction; Features; Installation and basic operation; Dynamo Run; Dynamo Serve; Inference graph and components; Startup flow of dynamo serve (1. start nats/etcd, 2. start dynamo serve, 3. verify operation, 4. shut down); How distributed processing works; Summary. NVIDIA Dynamo was announced only recently and is under active development and change. This post covers 0.2.0, the latest version as of May 2025; please refer to the official sources for up-to-date information.
- library
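What is Data-centric AI? | Idein Inc.
- 5 users
- note.com/idein
- Technology
- 2022/09/29
Introduction. This is Watanabe from the R&D office. Today I would like to introduce Data-Centric AI, advocated by Andrew Ng, a familiar name in the machine learning community, focusing on the content of the video "A Chat with Andrew on MLOps: From Model-centric to Data-centric AI". The topic emerged around 2021, so this is far from the first write-up, but I hope you will bear with me. From Model-centric AI to Data-centric AI. First, what is Data-centric AI? From the words Data, Centric, and AI you can probably guess that it means something like AI centered on data. That's right: Data-centric AI is an approach that places the emphasis on data…
- artificial intelligence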
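Steps to train and save a machine learning model, turn it into an API server, and talk to it from the browser via JSON - Qiita
- 4 users
- qiita.com/hira03
- Technology
- 2020/06/21
Introduction. I took a model produced by machine learning, turned it into an API server, and had the browser send data as JSON and receive a predicted value in return. This machine-learning API server is implemented mainly by three programs. First, XGBoost performs the training, and the trained model is generated and saved. Next, Flask implements the API server for the trained model. Finally, an HTML file contains a form tag, and the data obtained from the form is sent as JSON using JavaScript Ajax. With these three programs you can build something that sends data from the browser to the API server and returns a predicted value. Environment required to run the program: Anaconda, XGBoost, joblib, Flask, flask-cors, and similar libraries are installed. Main process: this machine…
- machine learning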
Solving a machine-learning mystery
- 4 users
- news.mit.edu
- Technology
- 2023/02/09
MIT researchers found that massive neural network models that are similar to large language models are capable of containing smaller linear models inside their hidden layers, which the large models could train to complete a new task using simple learning algorithms. Large language models like OpenAI’s GPT-3 are massive neural networks that can generate human-like text, from poetry to programming c
- machine learning
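COCO dataset: a large-scale color-photo image dataset usable for segmentation and more
- 4 users
- atmarkit.itmedia.co.jp
- Technology
- 2021/09/08
From the series "Dataset dictionary for AI and machine learning". Explains the "COCO" dataset. About 330,000 color photos (more than 200,000 of them labeled) and their annotations (that is, the labels) can be downloaded free of charge and used for object detection/segmentation, keypoint detection/pose estimation, caption generation, and more.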
- dataset
- photo
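2nd Computational Science Forum of FY2022 | HPCIC Computational Science Forum
- 4 users
- hpcic-kkf.com
- Technology
- 2023/04/10
Rio Yokota, Professor, Global Scientific Information and Computing Center, Tokyo Institute of Technology: "Distributed parallel training of large language models using Fugaku". In recent years, large language models such as GPT have increasingly been used in everyday settings and hold potential for a wide range of industrial applications. However, the ability to train such models is limited to giant companies such as Google, OpenAI, and Meta, and the distinctive know-how for training large models is also held exclusively by those companies. Large-scale distributed parallel training requires an accumulation of many techniques, and falling behind in this field is a serious concern for Japan's economic security. This talk introduces some of the technical challenges in distributed parallel training of large language models on Fugaku. Presentation materials (PDF: 4.97MB)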
GitHub - artidoro/qlora: QLoRA: Efficient Finetuning of Quantized LLMs
- 4 users
- github.com/artidoro
- Technology
- 2023/06/03
We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance. QLoRA backpropagates gradients through a frozen, 4-bit quantized pretrained language model into Low Rank Adapters (LoRA). Our best model family, which we name Guanaco, outperforms all previous openly rel
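The "65B parameters on a single 48GB GPU" claim is easier to appreciate with a back-of-the-envelope calculation. The sketch below counts only the memory for the frozen base weights; it is an illustration, not the authors' accounting, and it ignores quantization constants, the LoRA adapter weights, activations, and optimizer state. The function name `weight_memory_gb` is hypothetical, and 1 GB is assumed to be 1e9 bytes.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory needed to store the weights alone, in GB (assuming 1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


params = 65e9  # 65B parameters, as stated in the excerpt

fp16_gb = weight_memory_gb(params, 16)  # weights held in 16-bit precision
int4_gb = weight_memory_gb(params, 4)   # weights held in 4-bit precision

# Only the 4-bit weights leave headroom on a 48 GB device.
fits_48gb = int4_gb < 48 < fp16_gb

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"4-bit weights:  {int4_gb:.1f} GB")
print(f"4-bit fits in 48 GB while 16-bit does not: {fits_48gb}")
```

At 16 bits the weights alone exceed the card several times over; at 4 bits they leave room for the adapters and activations that the calculation deliberately leaves out.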
BioGPT: generative pre-trained transformer for biomedical text generation and mining
- 4 users
- academic.oup.com
- Technology
- 2023/01/27
Introduction Text mining and knowledge discovery from biomedical literature play important roles in drug discovery, clinical therapy, pathology research, etc. Typical tasks include recognizing named entities in the articles, mining the interaction between drugs and proteins/diseases/other drugs, answering questions given reference text, generating abstracts for given phrases/words, etc. People hav
- Read later
Interactive Topic Modeling with BERTopic
- 4 users
- towardsdatascience.com
- Technology
- 2021/01/09
Every day, businesses deal with large volumes of unstructured text. From customer interactions in emails to online feedback and reviews. To deal with this large amount of text, we look towards topic modeling. A technique to automatically extract…
3 deep learning mysteries: Ensemble, knowledge- and self-distillation
- 4 users
- www.microsoft.com
- Technology
- 2021/01/29
Three mysteries in deep learning: Ensemble, knowledge distillation, and self-distillation Published January 19, 2021 By Zeyuan Allen-Zhu , Senior Researcher Yuanzhi Li , Assistant Professor, Carnegie Mellon University Under now-standard techniques, such as over-parameterization, batch-normalization, and adding residual links, “modern age” neural network training—at least for image classification t
Transformer and Graph Neural Network
- 4 users
- speakerdeck.com/liberalarts
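- Technology
- 2021/02/25
This is a simplified explanation of the Transformer as covered in the following publication: "Transformer and image processing" https://lib-arts.booth.pm/items/2741653 The Transformer has a wide range of applications, for example as the basis of LLMs, so it is worth getting a firm grasp of.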
Predictions and hopes for Geometric & Graph ML in 2022
- 4 users
- towardsdatascience.com
- Society
- 2022/01/26
This post was co-authored with Petar Veličković. See also my last year’s prediction, Michael Galkin’s excellent post on the current state of affairs in Graph ML, a deeper dive into subgraph GNNs, techniques inspired by PDEs and differential geometry and algebraic topology, and how the concepts of symmetry and invariance form the picture of modern deep learning. Summing up impres
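Improving generalization by combining learning-rate decay / batch-size increase with EarlyStopping @tensorflow2.0
- 4 users
- akichan-f.medium.com
- Technology
- 2020/07/18
About this article. This article explains two things: how to implement, with tensorflow2.x, a method that adaptively changes the learning rate in combination with early stopping; and an explanation and tensorflow2.x implementation of a method that, in combination with early stopping, adaptively changes the batch size instead of the learning rate to obtain the same effect as learning rate decay. What is learning rate decay? Learning rate decay is a technique often used to improve the generalization performance of deep learning: the learning rate is lowered once training has progressed to some extent. As the article's figure shows, accuracy is known to improve sharply when the learning rate is dropped. So when should the learning rate be decayed? As of July 2020, it seems common to decay the learning rate to about 1/10 to 1/5 of its value once training has converged to some degree. Looking at papers, with the commonly used CIFAR10 and Imag…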
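The idea of decaying the learning rate whenever early stopping would otherwise fire can be sketched without any framework. The code below is a minimal, framework-free illustration and not the article's tensorflow2.x implementation; the function name `schedule_with_early_stopping` and the parameters `factor`, `patience`, and `max_decays` are assumptions introduced here. The default factor of 0.1 matches the "about 1/10" figure in the excerpt.

```python
def schedule_with_early_stopping(val_losses, lr=0.1, factor=0.1,
                                 patience=2, max_decays=2):
    """Return the learning rate used at each epoch until training stops.

    Whenever the validation loss has not improved for `patience` epochs,
    the learning rate is multiplied by `factor` instead of stopping.
    Training stops only once `max_decays` decays have already been spent.
    """
    best = float("inf")
    stale = 0    # epochs since the last improvement
    decays = 0   # number of decays applied so far
    history = []
    for loss in val_losses:
        history.append(lr)
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
        if stale >= patience:
            if decays == max_decays:
                break  # no decay budget left: stop early
            lr *= factor
            decays += 1
            stale = 0
    return history


# Made-up validation losses, purely for illustration.
losses = [1.0, 0.8, 0.9, 0.85, 0.5, 0.6, 0.7, 0.65, 0.66]
history = schedule_with_early_stopping(losses, max_decays=1)
print(history)
```

The batch-size variant mentioned in the excerpt would follow the same control flow, growing the batch size at the point where this sketch shrinks the learning rate.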
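Put yourself in the model's shoes and give it information
- 4 users
- medium.com/@junkoda
- Technology
- 2023/12/10
This is the 12/10 entry of Kaggle Advent Calendar 2023. What should you think about in deep learning? People talk about "how the model feels", but what does that mean more concretely? I am an amateur myself and do not really know, but using solutions I read in competitions I joined this year as material, I try to infer how the model feels from the viewpoint of "giving it information". There will be plenty of cases of "hey, I wrote that too", but since this is an advent calendar, please bear with me. I am writing roughly from memory and have not re-checked anything. Giving the grounds for classification via segmentation. G2Net 2 Detecting Continuous Gravitational Waves was a binary classification competition to judge whether data buried in noise contains a gravitational-wave signal. The input is spectrogr…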
- Kaggle
GitHub - prophesier/diff-svc: Singing Voice Conversion via diffusion model
- 4 users
- github.com/prophesier
- Technology
- 2023/01/19
- music
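[Paper explanation] MAML: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks - Qiita
- 4 users
- qiita.com/ku2482
- Technology
- 2020/07/13
This is an explanation (summary) of the following paper: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. The paper is by Chelsea Finn and was accepted at ICML 2017. It proposes a turning-point method that greatly improved the generality of meta-learning and is very interesting; because there was no Japanese explanation that properly conveyed the paper's advantages, I decided to introduce it here. The model proposed in the paper is a method called MAML (Model-Agnostic Meta-Learning). Unless otherwise noted, all figures in the article are taken from the paper. If you find any errors in the article, I would appreciate your pointing them out. Overview. This paper is Model-Agnostic: apart from requiring differentiability, the form of the model and the loss function is not assum…
- machine learning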
How does in-context learning work? A framework for understanding the differences from traditional supervised learning
- 4 users
- ai.stanford.edu
- Technology
- 2023/02/10
In this post, we provide a Bayesian inference framework for in-context learning in large language models like GPT-3 and show empirical evidence for our framework, highlighting the differences from traditional supervised learning. This blog post primarily draws from the theoretical framework for in-context learning from An Explanation of In-context Learning as Implicit Bayesian Inference 1 and expe
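The excerpt stops before stating the framework itself. As a rough sketch of what "in-context learning as implicit Bayesian inference" usually means, the model's prediction can be written as a marginalization over a latent concept inferred from the prompt. The variable names below are labels chosen here for illustration and are not quoted from the excerpt.

```latex
p(\text{output} \mid \text{prompt})
  \;=\; \int_{\text{concept}}
        p(\text{output} \mid \text{concept}, \text{prompt})\,
        p(\text{concept} \mid \text{prompt})\; d(\text{concept})
```

Read this way, the in-context examples do not update any weights; they sharpen the posterior over the concept, in contrast to supervised learning, where the examples change the parameters.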
Learning theory of dataset shift
- 4 users
- speakerdeck.com/mkimura
- Technology
- 2021/06/07
Equivalence of Geodesics and Importance Weighting from the Perspective of Information Geometry
Top Applications of Graph Neural Networks 2021
- 4 users
- techblog.criteo.com
- Learning
- 2021/01/16
Chinese translation is available here. At the beginning of the year, I have a feeling that Graph Neural Nets (GNNs) became a buzzword. As a researcher in this field, I feel a little bit proud (at least not ashamed) to say that I work on this. It was not always the case: three years ago when I was talking to my peers, who got busy working on GANs and Transformers, the general impression that they g
Top 8 Hands-On Books For Machine Learning Practitioners
- 4 users
- analyticsindiamag.com
- Technology
- 2020/08/04
Machine learning is a vast field. Thanks to the internet, there are plenty of resources available to get your hands on it — from books to blogs to vlogs. Analytics India Magazine has been compiling learning resources for the ML community for quite some time now. In this article, we list down top machine learning books for those who want to get practical with algorithms. (The books below are listed
[Reading group material] Language-agnostic BERT Sentence Embedding
- 3 users
- speakerdeck.com/hpprc
- Technology
- 2022/05/25
Slides explaining the paper on Language-agnostic BERT Sentence Embedding (LaBSE), a multilingual sentence embedding method.
- NLP
- machine learning
AITemplate: Unified inference engine on GPUs from NVIDIA and AMD
- 3 users
- ai.meta.com
- Technology
- 2022/10/04
Faster, more flexible inference on GPUs using AITemplate, a revolutionary new inference engine GPUs play an important role in the delivery of the compute needed for deploying AI models, especially for large-scale pretrained models in computer vision, natural language processing, and multimodal learning. Currently, AI practitioners have very limited flexibility when choosing a high-performance GPU
- machine learning
- library
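LUKE, a knowledge-enhanced language model
- 3 users
- speakerdeck.com/ikuyamada
- Technology
- 2023/03/18
JLR2023, a workshop co-located with the 29th Annual Meeting of the Association for Natural Language Processing (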
Attention? Attention!
- 3 users
- lilianweng.github.io
- Society
- 2023/03/22
Date: June 24, 2018 | Estimated Reading Time: 21 min | Author: Lilian Weng [Updated on 2018-10-28: Add Pointer Network and the link to my implementation of Transformer.] [Updated on 2018-11-06: Add a link to the implementation of Transformer model.] [Updated on 2018-11-18: Add Neural Turing Machines.] [Updated on 2019-07-18: Correct the mistake on using the term “self-attention” when introducing t
http://ibis.t.u-tokyo.ac.jp/suzuki/lecture/2020/intensive2/KyusyuStatZemi2020.pdf
- 3 users
- ibis.t.u-tokyo.ac.jp
- Technology
- 2020/09/13
- math
- machine learning
- mathematics
Japan PR slides
- 3 users
- www.deepspeed.ai
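- Technology
- 2023/06/08
DeepSpeed: a framework that dramatically accelerates deep learning training and inference. Microsoft DeepSpeed Team, June 7, 2023. These slides give an overview of DeepSpeed, a framework we are researching and developing. Overview: • Software with a variety of features that make large-scale, fast deep learning easy to achieve • Published on GitHub as open source software • DeepSpeed (the main repository) • DeepSpeedExamples (usage examples) • Megatron-DeepSpeed (combined with NVIDIA's Megatron-LM) • DeepSpeed-MII (a tool that makes DeepSpeed's fast inference easy to use). URL of the main repository. The DeepSpeed project is, at Microsoft's AI…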
Deep learning to translate between programming languages
- 3 users
- ai.meta.com
- Technology
- 2020/07/23
Migrating a codebase from an archaic programming language such as COBOL to a modern alternative like Java or C++ is a difficult, resource-intensive task that requires expertise in both the source and target languages. COBOL, for example, is still widely used today in mainframe systems around the world, so companies, governments, and others often must choose whether to manually translate their code
GitHub - facebookresearch/mae: PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
- 3 users
- github.com/facebookresearch
- Technology
- 2022/01/07