Agents are getting most of the attention in the Generative AI space lately, and fairly so. But when it comes to enhancing LLM performance, there are other powerful levers we shouldn't overlook. One of them is Fine-Tuning, and techniques like QLoRA are key for hyper-personalization, especially when working with domain-specific or user-centric data.

In this video, I walk through how you can leverage Amazon SageMaker's Multi-Adapter Inference feature, combined with vLLM's Async Engine running in the new Large Model Inference (LMI) container, to serve multiple adapters efficiently at scale.

👇 Check it out below and feel free to adapt it for your own use cases. I've also attached the code sample if you want to experiment directly.

#GenerativeAI #SageMaker #vLLM #QLoRA #MachineLearning #AWS #FineTuning

https://lnkd.in/eEFAgFBF
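To give a feel for the setup before you watch: below is a minimal sketch of the base deployment with boto3. The endpoint, model, and component names are placeholders, the instance type and LoRA/async options are assumptions based on the LMI documentation, and the exact configuration lives in the code sample linked in the comments.

```python
import boto3

sm = boto3.client("sagemaker")

role_arn = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"  # placeholder
lmi_image_uri = "<LMI (djl-inference) image URI for your region>"   # placeholder

# Base model container: LMI running vLLM's async engine with LoRA serving enabled.
# The OPTION_* names mirror LMI's serving.properties options; treat them as assumptions.
sm.create_model(
    ModelName="llama-base",
    ExecutionRoleArn=role_arn,
    PrimaryContainer={
        "Image": lmi_image_uri,
        "Environment": {
            "HF_MODEL_ID": "meta-llama/Llama-3.1-8B-Instruct",  # example base model
            "OPTION_ASYNC_MODE": "true",        # vLLM async engine (LMI v15+)
            "OPTION_ROLLING_BATCH": "disable",  # async mode replaces rolling batch
            "OPTION_ENTRYPOINT": "djl_python.lmi_vllm.vllm_async_service",
            "OPTION_ENABLE_LORA": "true",       # turn on multi-LoRA serving
            "OPTION_MAX_LORAS": "4",            # adapters kept hot in GPU memory
        },
    },
)

# Inference-component endpoint: the variant reserves instance capacity,
# but no model is bound at the endpoint level.
sm.create_endpoint_config(
    EndpointConfigName="multi-adapter-config",
    ExecutionRoleArn=role_arn,
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "InstanceType": "ml.g5.2xlarge",
        "InitialInstanceCount": 1,
    }],
)
sm.create_endpoint(
    EndpointName="multi-adapter-ep",
    EndpointConfigName="multi-adapter-config",
)

# Base inference component that the LoRA adapters will attach to.
sm.create_inference_component(
    InferenceComponentName="base-ic",
    EndpointName="multi-adapter-ep",
    VariantName="AllTraffic",
    Specification={
        "ModelName": "llama-base",
        "ComputeResourceRequirements": {
            "NumberOfAcceleratorDevicesRequired": 1,
            "MinMemoryRequiredInMb": 16384,
        },
    },
    RuntimeConfig={"CopyCount": 1},
)
```

The key design point is that the endpoint no longer owns the model: capacity belongs to inference components, which is what later lets each LoRA adapter register as its own lightweight component on top of the base.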
Customizing LLMs at Scale with SageMaker Multi-Adapter Inference
Ram, nice video with an excellent explanation of multi-LoRA adapter hosting on Amazon SageMaker AI.
Code Sample: https://github.com/RamVegiraju/SageMaker-Deployment/tree/master/SM-Inference-Video-Series/Part13-Multi-Adapter-Inference
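For a quick preview of what's in the sample: each adapter registers as its own inference component layered on the base, and requests pick an adapter by component name. A minimal sketch, reusing the hypothetical endpoint and base component names from the snippet above; the adapter name and S3 path are placeholders too.

```python
import json
import boto3

sm = boto3.client("sagemaker")
smr = boto3.client("sagemaker-runtime")

# Register one fine-tuned adapter as a lightweight inference component.
# Only the LoRA weights ship in the tarball; the base model is shared.
sm.create_inference_component(
    InferenceComponentName="support-adapter",  # placeholder adapter name
    EndpointName="multi-adapter-ep",
    Specification={
        "BaseInferenceComponentName": "base-ic",
        "Container": {"ArtifactUrl": "s3://<your-bucket>/adapters/support.tar.gz"},
    },
)

# Invoke a specific adapter by naming its inference component.
response = smr.invoke_endpoint(
    EndpointName="multi-adapter-ep",
    InferenceComponentName="support-adapter",
    ContentType="application/json",
    Body=json.dumps({
        "inputs": "How do I reset my password?",
        "parameters": {"max_new_tokens": 128},
    }),
)
print(response["Body"].read().decode())
```

Adding a new domain or persona then becomes a single create_inference_component call rather than a new endpoint, which is what makes the pattern practical at hyper-personalization scale.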