Shamee K Sharma - IR

The document is an internship summary report detailing a two-month virtual internship focused on data engineering using AWS. It outlines the objectives, tasks, skills acquired, and project deliverables, emphasizing the design and implementation of data pipelines and the use of various AWS services. The report concludes with reflections on the overall internship experience, highlighting the technical growth and industry insights gained.

Uploaded by

try.kushagra2004


SCHOOL OF COMPUTING SCIENCE AND ENGINEERING

GREATER NOIDA, UTTAR PRADESH
2024 – 2025

INDUSTRY INTERNSHIP
SUMMARY REPORT

AWS Data Engineering Virtual Internship Report

BACHELOR OF TECHNOLOGY

COMPUTER SCIENCE AND ENGINEERING

Submitted by

Shamee K Sharma (22SCSE1012596)

Vth Sem III Year


CERTIFICATE

I hereby certify that the work presented in the internship project report entitled
"AWS Data Engineering Virtual Internship Report", in partial fulfillment of the
requirements for the award of the degree of Bachelor of Technology in the School of
Computing Science and Engineering of Galgotias University, Greater Noida, is an
authentic record of my own work carried out in the industry.
To the best of my knowledge, the matter embodied in the project report has not been
submitted to any other University/Institute for the award of any Degree.

Shamee K Sharma (22SCSE1012596)

This is to certify that the above statement made by the candidate is correct and
true to the best of my knowledge.

Signature of Internship Reviewer Signature of Dean (SCSE)


TABLE OF CONTENTS

Abstract
List of Figures & List of Tables
List of Abbreviations

1 Introduction
1.1 Objective of the Internship Project
1.2 Problem statement and research objectives of this Internship
1.3 Description of Internship Domain and brief introduction about an internship organization

2 Internship Activities
2.1 Detailed description of tasks and responsibilities.
2.2 Daily/Weekly progress (students can provide a log or journal of activities).
2.3 Skills or tools used (e.g., programming languages, frameworks, software, etc.).

3 Learning Outcomes
3.1 Skills acquired (technical and soft skills).
3.2 Knowledge gained about the industry/domain.
3.3 Problem-solving or challenges faced during the internship and how they were addressed.

4 Project/Work Deliverables
4.1 Details of the main project(s) or tasks completed.
4.2 Outcomes or results of the work done.
4.3 Links or attachments to work products (if applicable, e.g., reports, presentations, or code).

5 Conclusion
5.1 Reflections on the overall internship experience.
5.2 Internship certificate.


ABSTRACT

This report details the experiences and outcomes of a two-month virtual internship focused
on data engineering using Amazon Web Services (AWS). The internship encompassed the
design and implementation of data pipelines, data modelling, and the utilization of various
AWS services to manage and process large datasets. Key deliverables included the
development of scalable data solutions and the application of best practices in data
engineering.

The primary goal of the internship was to design, implement, and optimize data pipelines
capable of handling large and complex datasets. This included tasks such as data ingestion,
transformation, and storage, which are essential for enabling data-driven decision-making
in modern organizations. Leveraging AWS services such as S3 for storage, Redshift for data
warehousing, Glue for ETL processes, and Lambda for automation, the internship
emphasized building scalable and efficient data solutions.

A key aspect of the program was understanding and applying data modelling techniques to
ensure data integrity and efficiency. Participants were introduced to industry-standard
practices, including schema design, data partitioning, and query optimization. These
practices were implemented to address real-world challenges such as performance
bottlenecks and data security concerns.

The internship also highlighted the importance of adopting best practices in data
engineering, such as using IAM roles for secure access, employing serverless computing for
cost-effectiveness, and optimizing Spark jobs for large-scale data processing. The
deliverables included functional data pipelines and documentation that showcased a deep
understanding of the AWS ecosystem and its applications in solving business challenges.

By the end of the internship, participants had gained not only technical proficiency in AWS
tools but also valuable insights into the broader domain of data engineering. This experience
equipped them with the skills to build reliable, scalable, and efficient data systems, making
significant contributions to the field of cloud-based data management. The report
summarizes this transformative journey, emphasizing the practical applications of AWS
technologies and the critical lessons learned during the program.


LIST OF FIGURES

S. No.   Fig. No.   Title
1        1          Tools and Technologies Used
2        2          Daily/Weekly Progress Summary
3        3          Skills Acquired During the Internship
4        4          Project Deliverables Overview


LIST OF ABBREVIATIONS

Abbreviation   Definition

AWS            Amazon Web Services
EMR            Elastic MapReduce
RDS            Relational Database Service
S3             Simple Storage Service
SQL            Structured Query Language
NoSQL          Not Only SQL (non-relational databases)
ETL            Extract, Transform, Load
BI             Business Intelligence

CHAPTER 1

INTRODUCTION


1.1 Objective of the Internship Project

• The primary objective of this internship was to gain practical experience in data
engineering by designing and implementing data pipelines using AWS services. This
involved understanding data warehousing concepts, data modelling, and the
deployment of scalable data solutions in a cloud environment.

1.2 Problem Statement and Research Objectives

• With the increasing volume of data generated by businesses, there is a pressing need
for efficient data processing and analysis tools. The internship aimed to address this
challenge by developing data pipelines capable of handling large datasets, ensuring
data integrity, and enabling data-driven decision-making.

1.3 Description of Internship Domain and Organization

• The internship was conducted under the AWS Data Engineering Virtual Internship
program, facilitated by EduSkills Foundation in collaboration with AICTE. The
program focused on cloud-based data engineering, providing exposure to the AWS tools
and services essential for building data infrastructure.

CHAPTER 2

INTERNSHIP ACTIVITIES

2.1 Tasks and Responsibilities

• Designed and implemented analytical data platform solutions to facilitate data-driven
decisions and insights.
• Developed data schemas and managed internal data warehouses and SQL/NoSQL
database systems.
• Collaborated with cross-functional teams to extract, transform, and load data from
diverse sources using AWS big data technologies.
• Engaged in data model design, architecture discussions, and optimizations to enhance
data processing efficiency.
• Explored and utilized AWS services such as S3, Redshift, Lambda, and Glue to build
and maintain data pipelines.
• Participated in mentoring sessions conducted by industry experts to gain insights into
real-world data engineering challenges.
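
The extract, transform, and load steps named in the tasks above can be illustrated with a minimal sketch. It is not the pipeline built during the internship: the sample data, the table name `orders`, and the function names are hypothetical, and an in-memory SQLite database stands in for a warehouse such as Redshift so that the example runs without an AWS account.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; the report does not describe the actual datasets.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,alice,4.25
"""


def extract(text):
    """Extract: read CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and cast column types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append((int(row["order_id"]), row["customer"], float(row["amount"])))
    return cleaned


def load(conn, records):
    """Load: write cleaned records into a table (SQLite stands in for a warehouse)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(text):
    """Run the three stages in order and return the open connection."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(text)))
    return conn


conn = run_pipeline(RAW_CSV)
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Keeping the three stages as separate functions means each one can be tested on its own, and the load step could later be pointed at a different target without touching the extraction or cleaning logic.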

2.2 Daily/Weekly Progress

• One module was completed each week so that the expected output was delivered on time.
• Progress was recorded weekly, and the work plan was adjusted where needed to keep
the process on track.

2.3 Skills or Tools Used

• Programming Languages: Python, SQL
• AWS Services: S3, Redshift, EMR, RDS, Lambda, Glue
• Data Processing Frameworks: Apache Spark, Hive
• Data Modelling Tools: ERD tools
• Version Control: Git

CHAPTER 3

LEARNING OUTCOMES

3.1 Skills Acquired

• Proficiency in designing and implementing data pipelines using AWS services.
• Enhanced understanding of data warehousing concepts and data modelling techniques.
• Improved programming skills in Python and SQL for data processing tasks.
• Experience with big data technologies and frameworks such as Apache Spark and Hive.
• Development of soft skills including teamwork, communication, and problem-solving.

3.2 Knowledge Gained

• In-depth understanding of AWS cloud services and their applications in data engineering.
• In-depth understanding of AWS data warehousing and data modelling.
• Solid working knowledge of SQL and Python.
• Deep understanding of cloud-based data engineering concepts.
• Insight into data lifecycle management, including ingestion, transformation, and storage.
• Practical experience in optimizing cloud-based data solutions for scalability.

CHAPTER 4

PROJECT/WORK DELIVERABLES

4.1 Details of the main project(s) or tasks completed.

• Developed an API extraction system to pull data from a website at regular
intervals.
• Built a robust system to authenticate, send requests, and parse the API
response into structured formats (e.g., JSON, CSV).
• Automated the data extraction process and scheduled periodic API calls to
update the data.
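
The flow described above (authenticate, request, parse, repeat on a schedule) can be sketched as follows. This is an illustrative outline under stated assumptions, not the code in the project repository: the report does not name the website, its endpoint, or its authentication scheme, so bearer-token headers, the `items`/`id`/`name` response shape, and every function name here are hypothetical. The HTTP call is replaced by a stand-in function so the sketch runs offline.

```python
import csv
import io
import json
import time


def build_headers(token):
    """Build request headers; bearer-token auth is an assumption."""
    return {"Authorization": f"Bearer {token}", "Accept": "application/json"}


def parse_response(body):
    """Parse a JSON response body into a list of flat records."""
    payload = json.loads(body)
    return [{"id": item["id"], "name": item["name"]} for item in payload["items"]]


def to_csv(records):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "name"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def run_periodically(fetch, runs, interval_seconds, sleep=time.sleep):
    """Call fetch() a fixed number of times, pausing between calls."""
    outputs = []
    for i in range(runs):
        outputs.append(to_csv(parse_response(fetch())))
        if i < runs - 1:
            sleep(interval_seconds)
    return outputs


def fake_fetch():
    """Stand-in for the real HTTP call, which the report does not specify."""
    return '{"items": [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]}'


results = run_periodically(fake_fetch, runs=2, interval_seconds=0)
print(results[0])
```

In a real deployment, `fake_fetch` would be replaced by a function that sends the request with the headers from `build_headers`, and the loop would more likely be driven by an external scheduler (for example a cron job or a scheduled Lambda function) than by an in-process sleep.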

4.2 Outcomes or results of the work done.

• Improved data retrieval efficiency, reducing manual effort and increasing the
frequency of data updates.
• Delivered up-to-date insights from the regularly refreshed data to support
decision-making processes.
• Scalable and Reliable Solutions:
The API extraction process was designed for scalability, so that it can
accommodate growth in the data volume and complexity of the website's API
over time.

4.3 Links or attachments to work products (if applicable, e.g., reports, presentations, or code).

• Documentation outlining the architecture, setup process, and data extraction
methodology.
• Presentation:
A concise presentation summarizing the project's objectives, implementation
strategy, results, and future scalability potential. This was shared with
stakeholders to demonstrate the value of the automated API extraction
solution.
• Repository with API extraction scripts and configuration files
(https://github.com/shamee12312/porject_aicte/tree/main)

CHAPTER 5

CONCLUSION

5.1 Reflections on the overall internship experience.

• The AWS Data Engineering Virtual Internship provided a comprehensive
learning experience in cloud-based data engineering. It not only enhanced
technical proficiency in AWS tools but also fostered problem-solving and
analytical skills. The opportunity to work on real-world challenges has been
instrumental in preparing for a career in data engineering.
• Technical Growth
The internship allowed hands-on exposure to various AWS services like S3,
Redshift, Glue, Lambda, and EMR, which are foundational for modern data
engineering workflows. The ability to work with tools like Apache Spark and
Python further enhanced my capacity to manage, process, and analyze large
datasets efficiently. Designing and optimizing ETL processes, a core part of
the program, helped me understand the intricacies of data ingestion,
transformation, and storage.
• Industry Insights
Through this internship, I gained valuable insights into the data engineering
domain and the best practices followed in the industry. I learned about the
significance of data-driven decision-making and the role of robust data
pipelines in achieving business objectives. Understanding how large
organizations use cloud platforms to scale and secure their data infrastructure
was an eye-opener.
• Overall Reflection
The AWS Data Engineering Virtual Internship was more than just a learning
opportunity: it was an experience that bridged the gap between academic
concepts and industry practices. By tackling real-world problems and
delivering tangible results, I have grown both professionally and personally.
This journey has solidified my interest in data engineering and affirmed my
commitment to contributing to the field.
