Shamee K Sharma - IR
INDUSTRY INTERNSHIP
SUMMARY REPORT
BACHELOR OF TECHNOLOGY
in
Submitted by
CERTIFICATE
I hereby certify that the work which is being presented in the Internship project
This is to certify that the above statement made by the candidate is correct and
true to the best of my knowledge.
TABLE OF CONTENTS
List of Abbreviations
1 Introduction
1.1 Objective of the Internship Project
2 Internship Activities
2.1 Detailed description of tasks and responsibilities.
3 Learning Outcomes
4 Project/Work Deliverables
5 Conclusion
ABSTRACT
This report details the experiences and outcomes of a two-month virtual internship focused
on data engineering using Amazon Web Services (AWS). The internship encompassed the
design and implementation of data pipelines, data modelling, and the use of AWS services
to manage and process large datasets. Key deliverables included the
development of scalable data solutions and the application of best practices in data
engineering.
The primary goal of the internship was to design, implement, and optimize data pipelines
capable of handling large and complex datasets. This included tasks such as data ingestion,
transformation, and storage, which are essential for enabling data-driven decision-making
in modern organizations. Leveraging AWS services such as S3 for storage, Redshift for data
warehousing, Glue for ETL processes, and Lambda for automation, the internship
emphasized building scalable and efficient data solutions.
A key aspect of the program was understanding and applying data modelling techniques to
ensure data integrity and efficiency. Participants were introduced to industry-standard
practices, including schema design, data partitioning, and query optimization. These
practices were implemented to address real-world challenges such as performance
bottlenecks and data security concerns.
The internship also highlighted the importance of adopting best practices in data
engineering, such as using IAM roles for secure access, employing serverless computing for
cost-effectiveness, and optimizing Spark jobs for large-scale data processing. The
deliverables included functional data pipelines and documentation that showcased a deep
understanding of the AWS ecosystem and its applications in solving business challenges.
By the end of the internship, participants had gained technical proficiency in AWS
tools as well as insight into the broader domain of data engineering. This experience
equipped them with the skills to build reliable, scalable, and efficient data systems for
cloud-based data management. The report
summarizes this experience, emphasizing the practical applications of AWS
technologies and the lessons learned during the program.
LIST OF FIGURES
LIST OF ABBREVIATIONS
CHAPTER 1
INTRODUCTION
The primary objective of this internship was to gain practical experience in data
engineering by designing and implementing data pipelines using AWS services. This
involved understanding data warehousing concepts, data modelling, and the
deployment of scalable data solutions in a cloud environment.
With the increasing volume of data generated by businesses, there is a pressing need
for efficient data processing and analysis tools. The internship aimed to address this
challenge by developing data pipelines capable of handling large datasets, ensuring
data integrity, and enabling data-driven decision-making.
The internship was conducted under the AWS Data Engineering Virtual Internship
program, facilitated by EduSkills Foundation in collaboration with AICTE. The
program focused on cloud-based data engineering, providing exposure to AWS tools
and services essential for building data infrastructure.
CHAPTER 2
INTERNSHIP ACTIVITIES
CHAPTER 3
LEARNING OUTCOMES
Improved programming skills in Python and SQL for data processing tasks.
Experience with big data technologies and frameworks such as Apache Spark and
Hive.
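As an illustration of combining Python and SQL in a data processing task, the following is a minimal sketch and not code from the internship itself. The sample records, the `orders` table, and the function names are hypothetical, and an in-memory SQLite database stands in for a cloud data warehouse so that the example runs without any AWS account.

```python
import sqlite3

# Hypothetical raw records, standing in for files ingested from storage.
raw_orders = [
    {"order_id": 1, "region": " north ", "amount": "120.50"},
    {"order_id": 2, "region": "South", "amount": "80.00"},
    {"order_id": 3, "region": "north", "amount": "not-a-number"},
    {"order_id": 4, "region": "SOUTH", "amount": "19.50"},
]


def transform(records):
    """Normalize region names and drop rows whose amount is not numeric."""
    cleaned = []
    for rec in records:
        try:
            amount = float(rec["amount"])
        except ValueError:
            continue  # skip malformed rows instead of failing the whole batch
        cleaned.append((rec["order_id"], rec["region"].strip().lower(), amount))
    return cleaned


def load_and_summarize(rows):
    """Load cleaned rows into an in-memory table and aggregate them with SQL."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite as a local stand-in
    try:
        conn.execute(
            "CREATE TABLE orders ("
            "order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM orders "
            "GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


summary = load_and_summarize(transform(raw_orders))
print(summary)
```

The cleaning step is kept in Python and the aggregation in SQL, which mirrors the usual split between transformation logic and warehouse queries. In a real pipeline the table would live in a managed warehouse instead of SQLite.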
CHAPTER 4
PROJECT/WORK DELIVERABLES
CHAPTER 5
CONCLUSION