AWS-Proposal

Uploaded by

shiuleec

Proposal to Purchase Amazon SageMaker and S3 Bucket Services

Date: 25/09/2024
Prepared by: Dr. Shiulee, Data Science-Senior Manager, People Analytics

1. Executive Summary

This proposal outlines the necessity of procuring Amazon SageMaker and
Amazon S3 bucket services to enhance the data science and machine
learning capabilities within People Analytics. Amazon SageMaker will allow
us to build, train, and deploy machine learning models at scale, while
Amazon S3 will serve as a scalable storage solution for data and model
artifacts. This solution aligns with our strategic initiatives to leverage
machine learning for predictive analytics and other business use cases.

In addition, we propose integrating advanced security measures to align
with the standards set by our Data Governance team, ensuring
compliance and safeguarding sensitive data across both services.

2. Objectives and Benefits

• Scalable ML Infrastructure: Amazon SageMaker provides a fully
managed service to develop, train, and deploy machine learning
models without needing to manage the underlying infrastructure.
This will reduce time-to-market for our models and enhance our
ability to scale as business demands grow.

• Efficient Data Storage: Amazon S3 will offer secure and scalable
storage for all data, including training datasets, models, and logs,
with flexible cost options based on storage usage.

• Data Governance and Security: Both SageMaker and S3 will be
configured to meet our data governance policies, using encryption,
access control, and monitoring to prevent unauthorized access and
ensure data protection.

• Enhanced Productivity: Automating key tasks, such as
hyperparameter tuning, model monitoring, and experiment tracking,
through SageMaker will allow the data science team to focus on
developing models and driving business impact.

• Cost Efficiency: Pay-as-you-go pricing for both SageMaker and S3
ensures we only pay for the resources we use, minimizing overhead
and upfront costs.

3. Security and Data Governance Measures

Amazon SageMaker Security Features:

• Role-Based Access Control (RBAC): Implement AWS Identity and
Access Management (IAM) roles to limit access to the SageMaker
environment based on user roles and responsibilities.

• Encryption at Rest and In Transit: Encrypt data at rest with keys
managed in AWS Key Management Service (KMS) and encrypt data in
transit with TLS, to secure sensitive datasets used for training and
inference.

• VPC Integration: Deploy SageMaker in a Virtual Private Cloud
(VPC) to isolate the environment and restrict network access.

• Audit Logs: Enable AWS CloudTrail to log API calls made to
SageMaker, ensuring transparency and accountability.
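
The VPC and encryption measures above could be enforced through an IAM policy rather than left to convention. The following is a minimal sketch, not our agreed policy: it builds a policy document that denies training jobs launched without VPC subnets or without a KMS key for the training volume. The function name is hypothetical, and the condition keys (sagemaker:VpcSubnets, sagemaker:VolumeKmsKey) should be checked against the current IAM documentation for SageMaker before use.

```python
import json


def build_sagemaker_guardrail_policy() -> dict:
    """Return an IAM policy document (as a dict) with two Deny guardrails.

    Assumption: the condition keys below are the ones SageMaker exposes
    for training jobs; verify them before attaching this policy.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Deny training jobs that do not specify VPC subnets.
                "Sid": "DenyTrainingOutsideVpc",
                "Effect": "Deny",
                "Action": ["sagemaker:CreateTrainingJob"],
                "Resource": "*",
                "Condition": {"Null": {"sagemaker:VpcSubnets": "true"}},
            },
            {
                # Deny training jobs that do not specify a volume KMS key.
                "Sid": "DenyTrainingWithoutVolumeKmsKey",
                "Effect": "Deny",
                "Action": ["sagemaker:CreateTrainingJob"],
                "Resource": "*",
                "Condition": {"Null": {"sagemaker:VolumeKmsKey": "true"}},
            },
        ],
    }


policy = build_sagemaker_guardrail_policy()
print(json.dumps(policy, indent=2))
```

Explicit Deny statements are used because they override any Allow granted elsewhere, so the guardrails hold regardless of which role a data scientist assumes.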

Amazon S3 Security Features:

• S3 Bucket Policies: Apply strict bucket policies to control access
based on the principle of least privilege, ensuring that only
authorized users can read, write, or manage data.

• Server-Side Encryption: Enable server-side encryption (SSE)
using either AWS KMS keys (SSE-KMS) or S3-managed keys (SSE-S3) to
protect data at rest.

• Access Logging and Monitoring: Enable S3 server access logging
and integrate with AWS CloudTrail and Amazon CloudWatch to
monitor all data access and modifications.

• Data Lifecycle Policies: Define lifecycle policies to automatically
transition or delete data based on its retention requirements,
aligning with governance policies.

• Object Lock: Implement S3 Object Lock to prevent deletion or
overwriting of critical data, ensuring compliance with regulatory
requirements or internal governance.
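
As an illustration of the bucket policy and encryption measures above, the sketch below builds a bucket policy that rejects requests not sent over TLS and rejects uploads that do not request SSE-KMS. The bucket name and function name are hypothetical placeholders, and the statements are a starting point under those assumptions rather than a reviewed policy.

```python
import json

# Hypothetical bucket name, used for illustration only.
BUCKET_NAME = "people-analytics-ml-example"


def build_bucket_policy(bucket: str) -> dict:
    """Return an S3 bucket policy (as a dict) with two Deny statements."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Reject any request that is not made over HTTPS/TLS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                # Reject uploads that do not ask for SSE-KMS encryption.
                "Sid": "DenyUploadsWithoutSseKms",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
        ],
    }


bucket_policy = build_bucket_policy(BUCKET_NAME)
print(json.dumps(bucket_policy, indent=2))
```

Note that the second statement also rejects uploads that omit the encryption header and rely on the bucket's default encryption, so client tooling would need to set the header explicitly.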

4. Amazon SageMaker Services

Amazon SageMaker Features:

• Model Building and Training: Build models with pre-configured
Jupyter notebooks and use built-in algorithms or custom models.

• Hyperparameter Optimization: Automate the search for the best
model configurations.

• Deploy at Scale: Deploy models in production at any scale with
one-click deployment features.

• Model Monitoring and Management: Automate model
monitoring and retraining to ensure models remain accurate and
perform well over time.
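
To make the hyperparameter optimization feature concrete, here is a small, self-contained sketch of random search, one common way to automate the search for a good model configuration. It is an illustration of the idea only and does not use SageMaker's tuning API; the scoring function, parameter names, and ranges are all hypothetical stand-ins for a real training job.

```python
import random


def validation_score(learning_rate: float, max_depth: int) -> float:
    """Hypothetical stand-in for training a model and scoring it.

    Higher is better; the best possible score here is 0.0.
    """
    return -((learning_rate - 0.1) ** 2) - 0.01 * (max_depth - 6) ** 2


def random_search(n_trials: int, seed: int = 0) -> tuple[dict, float]:
    """Sample n_trials configurations and keep the best-scoring one."""
    rng = random.Random(seed)  # seeded so repeated runs match
    best_params: dict = {}
    best_score = float("-inf")
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.01, 0.3),
            "max_depth": rng.randint(2, 10),
        }
        score = validation_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(n_trials=50)
print(best_params, round(best_score, 4))
```

A managed tuning service runs the same loop with real training jobs in parallel, which is where the productivity gain for the team comes from.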

SageMaker Cost Estimate:

Estimated monthly cost for SageMaker: $996 - $1,500, assuming
moderate usage of training and deployment resources.

Cost Breakup:

SageMaker On-Demand Notebook Instances feature

3 data scientist(s) x 10 On-Demand Notebook instances = 30.00
On-Demand Notebook instances

30.00 On-Demand Notebook instances x 8 hours per day x 25 days per
month = 6,000.00 SageMaker On-Demand Notebook instance hours per
month

6,000.00 hours per month x 0.121 USD per hour instance cost = 726.00
USD (monthly On-Demand cost)

ML Storage

100 GB per month x 0.168 USD = 16.80 USD

Storage pricing (monthly): 16.80 USD

SageMaker Training feature

10 jobs per month x 10 instances per job x 8 hours per job = 800.00
SageMaker Training hours per month

800.00 hours per month x 0.408 USD per hour instance cost = 326.40
USD (monthly On-Demand cost)

Total cost for SageMaker Training (monthly): 326.40 USD

ML Storage

100 GB per month x 0.168 USD = 16.80 USD

Storage pricing (monthly): 16.80 USD

Cost Summary:

Total cost for On-Demand Notebooks (monthly): 726.00 USD

Storage pricing for Notebooks (monthly): 16.80 USD

Total cost for SageMaker Training (monthly): 326.40 USD

Storage pricing for Training (monthly): 16.80 USD

Total Monthly cost: 1,086.00 USD (approx.)

Total 12 months cost: 13,032.00 USD (approx.)
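
The SageMaker figures above can be reproduced with a short script, which also makes it easy to re-run the estimate if headcount, hours, or rates change. This is a sketch using only the quantities and hourly rates quoted in this section; the variable names are ours, and the rates should be re-checked against current AWS pricing before the numbers are reused.

```python
from decimal import Decimal

# Rates quoted in this proposal (USD); not fetched from AWS.
NOTEBOOK_RATE_PER_HOUR = Decimal("0.121")
TRAINING_RATE_PER_HOUR = Decimal("0.408")
ML_STORAGE_RATE_PER_GB = Decimal("0.168")

# On-Demand Notebooks: 3 data scientists x 10 instances x 8 h x 25 days.
notebook_hours = 3 * 10 * 8 * 25
notebook_cost = notebook_hours * NOTEBOOK_RATE_PER_HOUR

# Training: 10 jobs x 10 instances per job x 8 hours per job.
training_hours = 10 * 10 * 8
training_cost = training_hours * TRAINING_RATE_PER_HOUR

# ML storage: 100 GB, charged once for notebooks and once for training.
storage_cost = 100 * ML_STORAGE_RATE_PER_GB

monthly_total = notebook_cost + training_cost + 2 * storage_cost
annual_total = monthly_total * 12

print(f"Notebooks (monthly):  {notebook_cost:.2f} USD")
print(f"Training (monthly):   {training_cost:.2f} USD")
print(f"ML storage (monthly): {2 * storage_cost:.2f} USD")
print(f"Total (monthly):      {monthly_total:.2f} USD")
print(f"Total (12 months):    {annual_total:.2f} USD")
```

Decimal is used instead of float so that the per-hour rates multiply out to exact cent values.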

5. Amazon S3 Services

Amazon S3 Features:

• Durable and Secure Storage: 99.999999999% (11 9's) of
durability with encryption and access control.

• Storage Classes for Cost Optimization: Store infrequently
accessed data in S3 Standard-Infrequent Access (S3 Standard-IA) or
archive data in S3 Glacier for further cost savings.

• Flexible Integration: Seamless integration with SageMaker and
other AWS services for data pipelines.
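
The storage-class tiering described above is normally expressed as a lifecycle configuration on the bucket. The sketch below shows one possible rule in the dictionary shape that boto3's put_bucket_lifecycle_configuration accepts for its LifecycleConfiguration argument. The rule ID, prefix, and day counts are hypothetical and would need to be set from our actual retention requirements; the helper function is ours and only sanity-checks the ordering.

```python
# Hypothetical rule: values are placeholders, not agreed retention periods.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "tier-then-expire-training-data",
            "Status": "Enabled",
            "Filter": {"Prefix": "training-data/"},
            "Transitions": [
                # Move to S3 Standard-IA after 30 days.
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                # Archive to S3 Glacier after 90 days.
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            # Delete objects one year after creation.
            "Expiration": {"Days": 365},
        }
    ]
}


def check_rule_ordering(rule: dict) -> bool:
    """Check that transitions are strictly increasing and precede expiry."""
    days = [t["Days"] for t in rule["Transitions"]]
    strictly_increasing = all(a < b for a, b in zip(days, days[1:]))
    expires_last = rule["Expiration"]["Days"] > days[-1]
    return strictly_increasing and expires_last


for rule in lifecycle_configuration["Rules"]:
    print(rule["ID"], "ordering ok:", check_rule_ordering(rule))
```

Keeping the rule as plain data means the Data Governance team can review the retention periods without reading any AWS-specific code.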

S3 Cost Estimate:

Estimated monthly cost for S3: $100 - $500, depending on data volume
(e.g., 1–5 TB stored).

Cost Breakup:

S3 Standard feature:

Tiered price for: 10,000 GB

10,000 GB x 0.025 USD = 250.00 USD

Total tier cost = 250.00 USD (S3 Standard storage cost)

100,000 PUT requests for S3 Standard Storage x 0.000005 USD per
request = 0.50 USD (S3 Standard PUT requests cost)

100,000 GET requests in a month x 0.0000004 USD per request = 0.04
USD (S3 Standard GET requests cost)

10,000 GB x 0.0009 USD = 9.00 USD (S3 Select returned cost)

10,000 GB x 0.0025 USD = 25.00 USD (S3 Select scanned cost)

250.00 USD + 0.04 USD + 0.50 USD + 9.00 USD + 25.00 USD = 284.54 USD
(Total S3 Standard Storage, data requests, S3 Select cost)

S3 Standard cost (monthly): 284.54 USD

S3 Object Lambda feature:

10 GET requests in a month x 0.0000004 USD per request = 0.000004
USD (S3 Standard GET requests charges)

Amazon S3 GET request charges: 0.000004 USD

10 GET requests in a month x 600 ms x 0.001 ms to sec conversion factor
= 6.00 total compute (seconds)

1 GB x 6.00 seconds = 6.00 total compute (GB-s)

6.00 GB-s x 0.0000166667 USD = 0.0001 USD (monthly compute charges)

10 GET requests in a month x 0.0000002 USD = 0.000002 USD (monthly
request charges)

0.0001 USD + 0.000002 USD = 0.000102 USD

AWS Lambda charges: 0.000102 USD

10 GET requests in a month x 100 GB x 0.005 USD = 5.00 USD (monthly
S3 Object Lambda charges)

S3 Object Lambda charges: 5.00 USD

0.000004 USD + 0.000102 USD + 5.00 USD = 5.000106 USD (monthly
total charges)

S3 Object Lambda cost (monthly): 5.00 USD (rounded)

Data Transfer feature:

Inbound:

Internet: 100 GB x 0 USD per GB = 0.00 USD

Outbound:

Internet: 100 GB x 0.1093 USD per GB = 10.93 USD

Data Transfer cost (monthly): 10.93 USD

S3 Access Grants feature:

15 requests x 0.00003 USD per request = 0.00045 USD

S3 Access Grants cost (monthly): 0.00 USD (rounded)

Total Monthly cost: 300.47 USD (approx.)

Total 12 months cost: 3,605.64 USD (approx.)
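
The S3 breakup above can be re-derived with the script below, which may help when the data volume or request counts are revised. It is a sketch that uses only the quantities and unit prices quoted in this section; the variable and function names are ours. Following the figures above, the monthly total is rounded to cents before the 12-month figure is computed.

```python
from decimal import Decimal, ROUND_HALF_UP


def usd(amount: Decimal) -> Decimal:
    """Round an amount to whole cents."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# S3 Standard: storage, PUT and GET requests, S3 Select returned/scanned.
s3_standard = (
    10_000 * Decimal("0.025")
    + 100_000 * Decimal("0.000005")
    + 100_000 * Decimal("0.0000004")
    + 10_000 * Decimal("0.0009")
    + 10_000 * Decimal("0.0025")
)

# S3 Object Lambda: GET requests, Lambda compute and requests, data returned.
compute_gb_seconds = 10 * 600 * Decimal("0.001") * 1  # 10 calls x 600 ms x 1 GB
object_lambda = (
    10 * Decimal("0.0000004")
    + compute_gb_seconds * Decimal("0.0000166667")
    + 10 * Decimal("0.0000002")
    + 10 * 100 * Decimal("0.005")
)

# Data transfer: inbound is free, outbound is charged per GB.
data_transfer = 100 * Decimal("0") + 100 * Decimal("0.1093")

# S3 Access Grants requests.
access_grants = 15 * Decimal("0.00003")

monthly_total = usd(s3_standard + object_lambda + data_transfer + access_grants)
annual_total = monthly_total * 12

print(f"S3 Standard (monthly):      {usd(s3_standard)} USD")
print(f"S3 Object Lambda (monthly): {usd(object_lambda)} USD")
print(f"Data Transfer (monthly):    {usd(data_transfer)} USD")
print(f"S3 Access Grants (monthly): {usd(access_grants)} USD")
print(f"Total (monthly):            {monthly_total} USD")
print(f"Total (12 months):          {annual_total} USD")
```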


6. Total Estimated Costs

Service            Estimated Monthly Cost              Estimated Yearly Cost

Amazon SageMaker   $1,500 - $2,000                     $18,000 - $24,000

Amazon S3          $300 - $500                         $3,600 - $6,000

Total              $1,800 - $2,500 (£1,345 - £1,900)   $21,600 - $30,000 (£16,000 - £22,500)

7. Implementation Plan

• Phase 1: Setup and Configuration (1-2 weeks):
Set up the Amazon SageMaker environment and S3 buckets, configure
security settings (including encryption and access controls), and
integrate with existing infrastructure.

• Phase 2: Model Development and Training (Ongoing):
Begin model development and training on SageMaker using data
stored in S3, with full audit logging enabled for monitoring and
compliance.

• Phase 3: Scaling and Deployment (Ongoing):
Scale resources as needed for additional models and production
deployments, while optimizing storage in S3 for cost savings and
governance compliance.

8. Conclusion

Investing in Amazon SageMaker and S3 with enhanced security measures
will allow the People Analytics team to work more efficiently and
effectively, accelerating the development of machine learning solutions
while ensuring full compliance with our data governance standards. The
pay-as-you-go model offers flexibility, allowing us to scale resources based
on business needs while maintaining control over costs. Tableau also has a
built-in connector for S3, which would eliminate the manual effort of
downloading data from Workday Prism and uploading it to Tableau. We
recommend moving forward with the procurement of these services to meet
our growing data science and analytics requirements securely.

9. Approvals
• Prepared by: Dr. Shiulee

• Reviewed by: Paul Scott

• Approved by: Justine Thompson
