About
I am a tech lead of the Qwen Team at Alibaba Group, responsible for building Qwen, the…
Activity
-
A big upgrade for our Qwen Deep Research: it now creates webpages and podcasts! A beautiful ensemble of models and agents. Feel free to give it a try!
Shared by Junyang Lin
-
Excited to announce the launch of Qwen3-VL-Flash on Alibaba Cloud Model Studio! 🚀 A powerful new vision-language model that combines reasoning and…
Liked by Junyang Lin
Experience
Education
Publications
-
Qwen Technical Report
Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct models with varying parameter counts. It includes Qwen, the base pretrained language models, and Qwen-Chat, the chat models finetuned with human alignment techniques. The base language models consistently demonstrate superior performance across a multitude of downstream tasks, and the chat models, particularly those trained using Reinforcement Learning from Human Feedback (RLHF), are highly competitive. The chat models possess advanced tool-use and planning capabilities for creating agent applications, showcasing impressive performance even when compared to bigger models on complex tasks like utilizing a code interpreter. Furthermore, we have developed coding-specialized models, Code-Qwen and Code-Qwen-Chat, as well as mathematics-focused models, Math-Qwen-Chat, which are built upon base language models. These models demonstrate significantly improved performance in comparison with open-source models, and slightly fall behind the proprietary models.
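As a quick, hedged illustration (not part of the report itself): the open-source chat checkpoints can be tried through Hugging Face transformers. The repo id and the `chat()` helper below follow the public Qwen-7B-Chat model card; exact usage may differ across releases.

```python
# Minimal sketch: chatting with an open-source Qwen chat checkpoint via
# Hugging Face transformers. The repo id and chat() helper follow the public
# Qwen-7B-Chat model card; details may vary by release.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-7B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", trust_remote_code=True
).eval()

# chat() is provided by the checkpoint's remote code and tracks multi-turn history.
response, history = model.chat(tokenizer, "Give me a short introduction to LLMs.", history=None)
print(response)
```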
-
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond
In this work, we introduce the Qwen-VL series, a set of large-scale vision-language models (LVLMs) designed to perceive and understand both texts and images. Starting from the Qwen-LM as a foundation, we endow it with visual capacity by the meticulously designed (i) visual receptor, (ii) input-output interface, (iii) 3-stage training pipeline, and (iv) multilingual multimodal cleaned corpus. Beyond the conventional image description and question-answering, we implement the grounding and text-reading ability of Qwen-VLs by aligning image-caption-box tuples. The resulting models, including Qwen-VL and Qwen-VL-Chat, set new records for generalist models under similar model scales on a broad range of visual-centric benchmarks (e.g., image captioning, question answering, visual grounding) and different settings (e.g., zero-shot, few-shot). Moreover, on real-world dialog benchmarks, our instruction-tuned Qwen-VL-Chat also demonstrates superiority compared to existing vision-language chatbots.
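For flavor, a minimal sketch of the grounding usage described above, following the public Qwen-VL-Chat model card (`from_list_format` and `chat` come from the checkpoint's remote code; the image URL is a placeholder):

```python
# Minimal sketch: visual grounding with Qwen-VL-Chat per its public model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-VL-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", trust_remote_code=True
).eval()

query = tokenizer.from_list_format([
    {"image": "https://example.com/demo.jpeg"},  # placeholder image URL
    {"text": "Describe the image, then locate the dog with a bounding box."},
])
response, history = model.chat(tokenizer, query=query, history=None)
# Grounded spans come back as tags like <ref>the dog</ref><box>(x1,y1),(x2,y2)</box>.
print(response)
```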
-
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
ICML 2022
In this work, we pursue a unified paradigm for multimodal pretraining to break the scaffolds of complex task/modality-specific customization. We propose OFA, a Task-Agnostic and Modality-Agnostic framework that supports Task Comprehensiveness. OFA unifies a diverse set of cross-modal and unimodal tasks, including image generation, visual grounding, image captioning, image classification, language modeling, etc., in a simple sequence-to-sequence learning framework. OFA follows the instruction-based learning in both pretraining and finetuning stages, requiring no extra task-specific layers for downstream tasks. In comparison with the recent state-of-the-art vision & language models that rely on extremely large cross-modal datasets, OFA is pretrained on only 20M publicly available image-text pairs. Despite its simplicity and relatively small-scale training data, OFA achieves new SOTAs in a series of cross-modal tasks while attaining highly competitive performances on uni-modal tasks. Our further analysis indicates that OFA can also effectively transfer to unseen tasks and unseen domains.
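To make the unification concrete, here is a purely hypothetical sketch (not OFA's actual code) of the core idea: heterogeneous tasks are all cast into instruction-style (source, target) text pairs handled by one seq2seq model, with boxes serialized into the same token space:

```python
# Hypothetical formatter illustrating task unification; names and prompt
# wording are invented for illustration, not taken from the OFA codebase.
def to_seq2seq(task: str, **fields) -> tuple[str, str]:
    """Cast a task instance into an instruction-style (source, target) pair."""
    if task == "caption":
        return (f"what does the image describe? image: {fields['image_id']}",
                fields["caption"])
    if task == "grounding":
        return (f"which region does the text \"{fields['phrase']}\" describe? "
                f"image: {fields['image_id']}",
                fields["box_tokens"])  # boxes serialized as discrete location tokens
    if task == "lm":
        return (f"continue the text: {fields['prefix']}", fields["continuation"])
    raise ValueError(f"unknown task: {task}")

src, tgt = to_seq2seq("grounding", image_id="<img_017>",
                      phrase="a red car",
                      box_tokens="<loc_12> <loc_88> <loc_240> <loc_301>")
```

Because every task shares this interface, pretraining and finetuning can use the same model and loss with no task-specific heads.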
-
M6: A Chinese Multimodal Pretrainer
KDD 2021
In this work, we construct the largest dataset for multimodal pretraining in Chinese, which consists of over 1.9TB images and 292GB texts that cover a wide range of domains. We propose a cross-modal pretraining method called M6, referring to Multi-Modality to Multi-Modality Multitask Mega-transformer, for unified pretraining on the data of single modality and multiple modalities. We scale the model size up to 10 billion and 100 billion parameters, and build the largest pretrained model in Chinese. We apply the model to a series of downstream applications, and demonstrate its outstanding performance in comparison with strong baselines. Furthermore, we specifically design a downstream task of text-guided image generation, and show that the finetuned M6 can create high-quality images with high resolution and abundant details.
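A purely hypothetical sketch of the unified-pretraining idea, mixing single-modality and multimodal samples into one multitask stream (all names and fields illustrative):

```python
# Hypothetical multitask mixing: unimodal and cross-modal samples share one
# training stream so a single transformer is optimized over all of them.
import random

text_only = [{"task": "lm", "text": "..."}]
image_text = [{"task": "caption", "image": "img_001", "text": "..."},
              {"task": "text2image", "image": "img_002", "text": "..."}]

def sample_batch(batch_size: int) -> list[dict]:
    """Draw a mixed batch across modalities and tasks."""
    pool = text_only + image_text
    return [random.choice(pool) for _ in range(batch_size)]
```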
-
Diversity-Promoting Generative Adversarial Network for Generating Informative and Diversified Text
EMNLP 2018
Existing text generation methods tend to produce repeated and "boring" expressions. To tackle this problem, we propose a new text generation model, called Diversity-Promoting Generative Adversarial Network (DP-GAN). The proposed model assigns low reward for repeatedly generated text and high reward for "novel" and fluent text, encouraging the generator to produce diverse and informative text. Moreover, we propose a novel language-model based discriminator, which can better distinguish novel text from repeated text without the saturation problem compared with existing classifier-based discriminators. The experimental results on review generation and dialogue generation tasks demonstrate that our model can generate substantially more diverse and informative text than existing baselines.
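A simplified sketch of the reward idea as I read the abstract (not the paper's exact formulation): the discriminator is itself a language model, each generated token's reward grows with that discriminator's surprisal, so repeated "boring" continuations earn little reward, and a REINFORCE-style update passes the reward back to the generator:

```python
# Simplified DP-GAN-style reward sketch; an interpretation of the abstract,
# not the paper's exact equations.
import torch
import torch.nn.functional as F

def token_rewards(disc_logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """Per-token reward = discriminator LM surprisal (-log p) of each token.

    disc_logits: (seq_len, vocab) logits from the discriminator language model.
    tokens:      (seq_len,) generated token ids.
    """
    log_probs = F.log_softmax(disc_logits, dim=-1)
    picked = log_probs.gather(-1, tokens.unsqueeze(-1)).squeeze(-1)
    return -picked  # large when the discriminator finds the token novel

def reinforce_loss(gen_log_probs: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
    """REINFORCE-style generator loss: maximize reward-weighted log-likelihood."""
    return -(gen_log_probs * rewards.detach()).mean()
```

Because the reward is a graded per-token score rather than a binary real/fake decision, it sidesteps the saturation that classifier-based discriminators suffer once they become overconfident.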
-
Global Encoding for Abstractive Summarization
ACL 2018
In neural abstractive summarization, the conventional sequence-to-sequence (seq2seq) model often suffers from repetition and semantic irrelevance. To tackle the problem, we propose a global encoding framework, which controls the information flow from the encoder to the decoder based on the global information of the source context. It consists of a convolutional gated unit to perform global encoding to improve the representations of the source-side information. Evaluations on the LCSTS and the English Gigaword both demonstrate that our model outperforms the baseline models, and the analysis shows that our model is capable of reducing repetition.
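A minimal sketch of a convolutional gated unit in this spirit (simplified: the paper's unit also incorporates self-attention): a 1-D convolution over the encoder states captures source-side context and emits a sigmoid gate that filters the representations before they reach the decoder:

```python
# Minimal convolutional gated unit sketch; simplified relative to the paper.
import torch
import torch.nn as nn

class ConvGatedUnit(nn.Module):
    def __init__(self, hidden: int, kernel: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(hidden, hidden, kernel, padding=kernel // 2)

    def forward(self, enc_states: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, seq_len, hidden) from the seq2seq encoder
        gate = torch.sigmoid(self.conv(enc_states.transpose(1, 2))).transpose(1, 2)
        return enc_states * gate  # gated source representations for attention

refined = ConvGatedUnit(hidden=512)(torch.randn(2, 30, 512))
```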
Languages
-
English
Professional working proficiency
-
French
Limited working proficiency
-
Japanese
Elementary proficiency
-
Russian
Elementary proficiency
-
Chinese
Native or bilingual proficiency
-
German
Elementary proficiency
More activity by Junyang
-
Qwen3-VL 4B and 8B models, the first shot this week. Small VL models are good for deployment and especially significant for phones and robots. Previously…
Shared by Junyang Lin
-
Introducing the compact, dense versions of Qwen3-VL — now available in 4B and 8B pairs, each with both Instruct and Thinking variants. ✅ Lower VRAM…
Liked by Junyang Lin
-
🚀 Qwen3-VL-30B-A3B-Instruct & Thinking are here! Smaller size, same powerhouse performance 💪—packed with all the capabilities of Qwen3-VL! 🔧 With…
Liked by Junyang Lin
-
This is the last shot! Qwen3-Max, no preview. What I can say is "Just Scale it"!
Shared by Junyang Lin
-
🚀 Qwen3-Max is here—no preview, just power! Qwen Chat:https://chat.qwen.ai/ Blog: https://lnkd.in/g9Du6Sfu API: https://lnkd.in/gu9K98cH We’ve…
Liked by Junyang Lin
-
This is the 5th shot! Super crazy! We open-sourced the 235B-A22B Instruct and Thinking Qwen3-VL models under Apache 2.0! Qwen3-VL, the new generation…
Shared by Junyang Lin
-
🚀 We're thrilled to unveil Qwen3-VL — the most powerful vision-language model in the Qwen series yet! 🔥 The flagship model Qwen3-VL-235B-A22B is…
Liked by Junyang Lin
-
This is the 3rd shot! Live translation is something I find very interesting and important! It supports many languages and it is even visually…
Shared by Junyang Lin
-
🚀 Introducing Qwen3-LiveTranslate-Flash — Real‑Time Multimodal Interpretation — See It, Hear It, Speak It! 🌐 Wide language coverage — Understands…
Liked by Junyang Lin
-
This is the 2nd shot! Travel Planner, available on Qwen Chat, is a new product for the upcoming holiday. It is a travel agent that can help you make…
Shared by Junyang Lin
-
This is the 1st shot! For a long time, people just haven't had any idea about the safety work we have put effort into. This time, we show you…
Shared by Junyang Lin
-
Qwen3-Next, or rather a preview of our next generation (3.5?), is out! This time we tried to be bold, but actually we have been doing experiments on…
Shared by Junyang Lin
-
🚀 Introducing Qwen3-Next-80B-A3B — the FUTURE of efficient LLMs is here! 🔹 80B params, but only 3B activated per token → 10x cheaper training, 10x…
Liked by Junyang Lin
-
Yes, this is maybe our first released ASR model. It is just good, a SOTA-level model!
Shared by Junyang Lin
-
🎙️ Meet Qwen3-ASR — the all-in-one speech recognition model! ✅ High-accuracy EN/CN + 9 more languages: ar, de, en, es, fr, it, ja, ko, pt, ru, zh ✅…
Liked by Junyang Lin
Other similar profiles
Other members named Junyang Lin in China
There are 42 other members named Junyang Lin on LinkedIn in China