Building a Generative AI Application on AWS with the Amazon SageMaker Machine Learning Platform (with Llama Model Deployment and Test Code)

Project overview:

In this series I walk through one cutting-edge AI solution built on the Amazon Web Services (AWS) cloud each day, so you can quickly pick up AWS AI best practices and apply them in your own work. This installment shows how to serve the Meta Llama 7B large language model on Amazon SageMaker as a scalable, secure AI solution, and how to integrate the model into an application through Amazon API Gateway and AWS Lambda. The solution architecture is shown below:

This solution is built mainly on Amazon SageMaker, AWS's managed service for hosting machine learning models and large models. Let's briefly introduce the service.

What is Amazon SageMaker?

Amazon SageMaker is a fully managed model-hosting service from AWS, designed to help developers and data scientists build, train, and deploy machine learning models quickly and easily. SageMaker natively integrates the machine learning industry's popular tools and frameworks, letting users focus on model development and optimization without worrying about infrastructure management.

With SageMaker, users can train with prebuilt algorithms and frameworks, or bring their own custom code onto the platform. Its automated data processing and model training features greatly reduce the time and complexity of model development. SageMaker also provides cloud-hosted Jupyter Notebooks, so users can keep the tools they are used to locally and migrate model training to the cloud seamlessly.

Once training is complete, SageMaker can deploy the model as a managed inference service with an API endpoint, ensuring high availability and scalability. This makes it straightforward to integrate machine learning models into applications, for example by serving real-time inference through API Gateway and AWS Lambda.

Amazon SageMaker also ships with built-in monitoring and tuning features that make model performance optimization and management more efficient. For beginners and expert data scientists alike, it is a solid choice for realizing the value of AI and machine learning quickly.

What this solution covers:

1. Deploy a foundation AI/ML model (Meta Llama 7B) on Amazon SageMaker as an inference endpoint.

2. Deploy code on AWS Lambda, the serverless compute service, to invoke SageMaker inference.

3. Run functional tests against the deployed model with a test application.
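The Lambda piece in step 2 can be sketched as a handler that forwards the API Gateway request body to the SageMaker endpoint through the `sagemaker-runtime` API. This is a minimal sketch under assumptions: the endpoint name `meta-llama-7b-endpoint` is a hypothetical placeholder, and the injectable `runtime` parameter is my own testing convenience, not part of the walkthrough.

```python
import json

def lambda_handler(event, context, runtime=None,
                   endpoint_name="meta-llama-7b-endpoint"):
    """Forward an API Gateway request body to a SageMaker inference endpoint.

    endpoint_name is a hypothetical placeholder -- use the name of the
    endpoint created in the deployment step. runtime is injectable for
    testing and defaults to the real boto3 client inside Lambda.
    """
    if runtime is None:
        import boto3  # available by default in the Lambda runtime
        runtime = boto3.client("sagemaker-runtime")

    # Pass the client's JSON payload straight through to the model endpoint.
    payload = json.loads(event["body"])
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    result = json.loads(response["Body"].read())
    return {"statusCode": 200, "body": json.dumps(result)}
```

With this shape, API Gateway proxies the request, the Lambda function stays stateless, and the model endpoint scales independently behind SageMaker.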

Step-by-step walkthrough:

Follow along to build a generative AI application on AWS around the Meta Llama 7B model, and to test how the model performs across a variety of scenarios.

1. First, open SageMaker.

2. Go to JumpStart -> Foundation Models to browse the open-source ML models currently available on SageMaker.

3. Go to Studio and click "Open Studio" on the domain that has already been created.

4. Click "Studio Classic", then click Open on the right.

5. This opens the SageMaker Studio console.

6. Download the ML project code stored in S3:

aws s3 sync s3://<Replace with lab-code bucket name> .

7. Set up the Python environment for model training and inference.

8. In the first code cell, update and install the SageMaker SDK:

!pip install sagemaker --quiet --upgrade --force-reinstall

9. Configure the ID and version of the open-source model. Note that although this walkthrough targets Meta Llama 7B, the ID below is actually the JumpStart Falcon-7B-Instruct model; to deploy Llama 2 7B instead, you would use the JumpStart ID `meta-textgeneration-llama-2-7b` (which additionally requires accepting Meta's EULA at deploy time).

model_id, model_version = (
    "huggingface-llm-falcon-7b-instruct-bf16",
    "*",
)

10. Create the SageMaker endpoint that the application will call:

%%time
from sagemaker.jumpstart.model import JumpStartModel

my_model = JumpStartModel(model_id=model_id, instance_type="ml.g5.2xlarge")
predictor = my_model.deploy()

Testing the project:

11. Run a question-answering test against the deployed model with a prepared prompt:

%%time


prompt = "Tell me about Amazon SageMaker."

payload = {
    "inputs": prompt,
    "parameters": {
        "do_sample": True,
        "top_p": 0.9,
        "temperature": 0.8,
        "max_new_tokens": 1024,
        "stop": ["<|endoftext|>", "</s>"]
    }
}

response = predictor.predict(payload)
print(response[0]["generated_text"])

Test output:

Amazon SageMaker is a machine learning platform provided by Amazon Web Services that enables users to train and deploy machine learning models without having to build, train, and manage a machine learning infrastructure.
CPU times: user 13.5 ms, sys: 6.66 ms, total: 20.1 ms
Wall time: 1.38 s
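The test cells from step 12 onward call a `query_endpoint` helper whose definition is not shown in the article. Here is an assumed minimal version that wraps `predictor.predict` and prints results in the Input/Output format seen in the outputs below; the factory shape is my own choice to make the helper easy to test, and in the notebook you would bind it once with the deployed predictor.

```python
def make_query_endpoint(predictor):
    """Build the query_endpoint helper used by the test cells.

    predictor is the JumpStart predictor returned by my_model.deploy().
    """
    def query_endpoint(payload):
        response = predictor.predict(payload)
        # Print in the " Input: / Output:" format shown in the test results.
        print(" Input:", payload["inputs"])
        print(" Output:", response[0]["generated_text"])
        return response
    return query_endpoint

# In the notebook: query_endpoint = make_query_endpoint(predictor)
```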

12. Test the model's code-generation ability:

payload = {"inputs": "Write a program to compute factorial in python:", "parameters":{"max_new_tokens": 200}}
query_endpoint(payload)

Test output:

 Input: Write a program to compute factorial in python:
 Output: 
Here is a Python program to compute factorial:

```python
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)

print(factorial(5)) # Output: 120
```

13. Test the model's reasoning and task-completion ability:

payload = {
    "inputs": "Building a website can be done in 10 simple steps:",
    "parameters":{
        "max_new_tokens": 110,
        "no_repeat_ngram_size": 3
        }
}
query_endpoint(payload)

Test output:

 Input: Building a website can be done in 10 simple steps:
 Output: 
1. Choose a domain name
2. Register a domain name
3. Choose a web hosting provider
4. Create a website design
5. Add content to your website
6. Test your website
7. Optimize your website for search engines
8. Promote your website
9. Update your website regularly
10. Monitor your website for security

14. Test the model's language-translation ability:

payload = {
    "inputs": """Translate English to French:

    sea otter => loutre de mer

    peppermint => menthe poivrée

    plush girafe => girafe peluche

    cheese =>""",
    "parameters":{
        "max_new_tokens": 3
    }
}

query_endpoint(payload)

Test output:

 Input: Translate English to French:

    sea otter => loutre de mer

    peppermint => menthe poivrée

    plush girafe => girafe peluche

    cheese =>
 Output:  fromage

15. Test the model's text-based sentiment-analysis ability:

payload = {
    "inputs": """"I hate it when my phone battery dies."
                Sentiment: Negative
                ###
                Tweet: "My day has been :+1:"
                Sentiment: Positive
                ###
                Tweet: "This is the link to the article"
                Sentiment: Neutral
                ###
                Tweet: "This new music video was incredibile"
                Sentiment:""",
    "parameters": {
        "max_new_tokens":2
    }
}
query_endpoint(payload)

Test output:

 Input: "I hate it when my phone battery dies."
                Sentiment: Negative
                ###
                Tweet: "My day has been :+1:"
                Sentiment: Positive
                ###
                Tweet: "This is the link to the article"
                Sentiment: Neutral
                ###
                Tweet: "This new music video was incredibile"
                Sentiment:
 Output:  Positive

16. Run a second question-answering test:

payload = {
    "inputs": "Could you remind me when was the C programming language invented?",
    "parameters":{
        "max_new_tokens": 50
    }
}
query_endpoint(payload)

Test output:

 Input: Could you remind me when was the C programming language invented?
 Output: 
The C programming language was invented in 1972 by Dennis Ritchie at Bell Labs.

17. Use the model to generate a recipe:

payload = {"inputs": "What is the recipe for a delicious lemon cheesecake?", "parameters":{"max_new_tokens": 400}}
query_endpoint(payload)

Test output:

Input: What is the recipe for a delicious lemon cheesecake?
 Output: 
Here is a recipe for a delicious lemon cheesecake:

Ingredients:
- 1 1/2 cups graham cracker crumbs
- 4 tablespoons butter, melted
- 2 (8 ounce) packages cream cheese, softened
- 1/2 cup granulated sugar
- 2 eggs
- 1/2 cup lemon juice
- 1/2 teaspoon salt
- 1/2 teaspoon vanilla extract
- 1/2 cup heavy cream
- 1/2 cup granulated sugar
- 1/2 teaspoon lemon zest

Instructions:
1. Preheat oven to 350 degrees F.
2. In a medium bowl, mix together the graham cracker crumbs and melted butter. Press the mixture onto the bottom and sides of a 9-inch springform pan.
3. In a large bowl, beat the cream cheese and sugar until smooth. Add the eggs, lemon juice, salt, vanilla, and heavy cream. Beat until well combined.
4. Pour the mixture into the prepared pan.
5. Bake for 30 minutes or until the cheesecake is set.
6. Let cool for 10 minutes before serving.
7. In a small bowl, mix together the lemon zest and sugar. Sprinkle over the cheesecake before serving.

18. Test the model's text-summarization ability:

payload = {
    "inputs":"""Amazon SageMaker is a fully managed machine learning service. With SageMaker, 
    data scientists and developers can quickly and easily build and train machine learning models, 
    and then directly deploy them into a production-ready hosted environment. It provides an 
    integrated Jupyter authoring notebook instance for easy access to your data sources for 
    exploration and analysis, so you don't have to manage servers. It also provides common 
    machine learning algorithms that are optimized to run efficiently against extremely 
    large data in a distributed environment. With native support for bring-your-own-algorithms 
    and frameworks, SageMaker offers flexible distributed training options that adjust to your 
    specific workflows. Deploy a model into a secure and scalable environment by launching it 
    with a few clicks from SageMaker Studio or the SageMaker console. Summarize the article above:""",
    "parameters":{
        "max_new_tokens":200
        }
    }
query_endpoint(payload)

Test output:

 Input: Amazon SageMaker is a fully managed machine learning service. With SageMaker, 
    data scientists and developers can quickly and easily build and train machine learning models, 
    and then directly deploy them into a production-ready hosted environment. It provides an 
    integrated Jupyter authoring notebook instance for easy access to your data sources for 
    exploration and analysis, so you don't have to manage servers. It also provides common 
    machine learning algorithms that are optimized to run efficiently against extremely 
    large data in a distributed environment. With native support for bring-your-own-algorithms 
    and frameworks, SageMaker offers flexible distributed training options that adjust to your 
    specific workflows. Deploy a model into a secure and scalable environment by launching it 
    with a few clicks from SageMaker Studio or the SageMaker console. Summarize the article above:
 Output:  SageMaker is a cloud-based machine learning platform that provides a range of tools and services to help data scientists and developers build, train, and deploy machine learning models. It offers an integrated Jupyter notebook environment, optimized algorithms, and flexible distributed training options.

19. With all tests complete, retrieve the model's endpoint URL from SageMaker to serve as an API for applications.
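Once the endpoint is fronted by API Gateway, as in the architecture described at the start, a client can call it over plain HTTPS. A minimal sketch, assuming a hypothetical invoke URL and the same payload shape used in the tests above:

```python
import json
from urllib import request

# Hypothetical invoke URL -- replace with your API Gateway stage URL.
API_URL = "https://<api-id>.execute-api.<region>.amazonaws.com/prod/generate"

def build_request(prompt, max_new_tokens=256):
    """Serialize a request body in the payload shape used by the tests."""
    body = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return json.dumps(body).encode("utf-8")

def call_api(prompt):
    """POST the prompt to the API and return the parsed JSON response."""
    req = request.Request(
        API_URL,
        data=build_request(prompt),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Separating `build_request` from the network call keeps the payload shape testable without a live endpoint.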

That wraps up deploying an open-source large model on AWS and testing it across multiple scenarios. Follow along for more cutting-edge generative AI development solutions.
