Cloud computing U-2

The document provides an overview of data centre design, interconnection networks, and cloud computing, detailing the components and considerations for effective data management. It discusses the architecture of cloud storage, the differences between parallel and distributed computing, and introduces the MapReduce paradigm for processing large datasets. Additionally, it highlights the challenges faced in cloud programming, including cybersecurity, cost management, and skill gaps.

Data Centre Design and Interconnection Network

What is a Data Centre?


A data centre is a centralized facility where computing resources such as servers, storage systems, networking
equipment, and supporting infrastructure are organized to store, process, and manage data. Data centres are the
backbone of IT operations, used by businesses, governments, cloud providers (like AWS, Azure, Google Cloud), and more.

Data Centre Design – Key Aspects


Data Centre design involves planning the layout, power supply, cooling systems, security, and networking to
ensure high availability, scalability, and efficiency.

A. Physical Infrastructure Components:


 Servers: Process data and run applications

 Storage Systems: Store files, databases, and backup copies

 Racks & Enclosures: Organize hardware in rows for accessibility and airflow

 Power Supply (UPS): Provides consistent power; includes battery backups and generators

 Cooling System: Prevents overheating (CRAC units, chillers, hot/cold aisle setup)

 Cabling: Structured cabling to prevent clutter and improve airflow

 Security Systems: Physical (CCTV, biometrics) and network (firewalls, intrusion detection)

B. Design Considerations:

 Redundancy: Duplicate components (power, cooling, network paths) to avoid downtime

 Scalability: Easy addition of servers, storage, or network equipment

 Availability: Systems designed for 24x7 uptime (Tier 1 to Tier 4 ratings)

 Energy Efficiency: Green data centres use renewable energy and low-power hardware

 Disaster Recovery: Backup locations, cloud failover, and data replication strategies

Data Centre Interconnection Network (DCIN)


This refers to how various components in a data centre communicate — both internally (within the centre) and
externally (to users and other centres).

A. Three-Tier Network Architecture:

1. Core Layer – High-speed backbone; connects to internet/external data centres


2. Aggregation (Distribution) Layer – Manages traffic from multiple access layers
3. Access Layer – Connects directly to servers and devices
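
As a rough, hedged sketch (not tied to any particular vendor's design), the three tiers above can be modelled as a small topology in Python; all switch and server names below are invented for illustration.

```python
# Minimal sketch of a three-tier data centre topology (illustrative names only).
# Core switches uplink to the internet, aggregation switches sit between the
# core and the access layer, and access switches connect directly to servers.
topology = {
    "core-1": {"layer": "core",        "uplinks": ["internet"], "downlinks": ["agg-1", "agg-2"]},
    "agg-1":  {"layer": "aggregation", "uplinks": ["core-1"],   "downlinks": ["acc-1"]},
    "agg-2":  {"layer": "aggregation", "uplinks": ["core-1"],   "downlinks": ["acc-2"]},
    "acc-1":  {"layer": "access",      "uplinks": ["agg-1"],    "downlinks": ["server-1", "server-2"]},
    "acc-2":  {"layer": "access",      "uplinks": ["agg-2"],    "downlinks": ["server-3", "server-4"]},
}

def path_to_internet(switch):
    """Walk the uplinks from an access switch up through the core to the internet."""
    path = [switch]
    while "internet" not in topology[switch]["uplinks"]:
        switch = topology[switch]["uplinks"][0]
        path.append(switch)
    return path + ["internet"]

print(path_to_internet("acc-1"))   # ['acc-1', 'agg-1', 'core-1', 'internet']
```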
Role in Cloud Computing:

 Host Cloud Services:

Data centers serve as the foundation for cloud services, providing the resources needed to run virtual machines, store
data, and deliver applications.

 Resource Sharing:

Cloud data centers are designed to accommodate multiple clients, allowing them to rent computing resources (like
storage or processing power) on demand.

 Scalability and Elasticity:

Cloud data centers enable cloud providers to rapidly scale up or down computing resources based on client needs,
offering flexibility and cost-effectiveness.

 Accessibility:

Data centers enable users to access cloud services from anywhere with an internet connection.
Architecture of a Data Center

Challenges in Cloud Programming and Software Development:
Cloud programming and software development face challenges in several areas,
including cybersecurity, cost management, skill gaps, governance, and compliance. These
issues can lead to fractured operations, hindering companies' ability to effectively manage
application delivery infrastructure.
Cybersecurity: Data breaches and security vulnerabilities remain a significant concern in
cloud computing. Companies need to ensure robust security measures, including secure
access control, encryption, and vulnerability management.
Cost Management: Cloud costs can be difficult to predict and manage, especially when using
multiple cloud providers or services. Companies need to implement strategies for cost
optimization, such as leveraging reserved instances, right-sizing resources, and monitoring
usage patterns.
Skill Gaps: The rapid growth of cloud computing has led to a shortage of skilled professionals,
particularly in areas like cloud engineering, development, and security. Organizations need to
invest in training and development programs to address this skill gap.
Governance and Compliance: Managing cloud environments requires establishing clear
governance policies and procedures, including data residency, access controls, and
compliance with industry regulations.
Fractured Operations: When different teams manage different aspects of the application
delivery infrastructure (e.g., traditional teams managing on-premise infrastructure while
DevOps teams manage cloud infrastructure), it can lead to operational inefficiencies, security
risks, and compliance issues. Companies need to unify their approach to application delivery
infrastructure management, whether on-premise or in the cloud, to avoid fractured operations.
Types of Data Centers:
 On-premises data centers:
Owned and operated by a single organization for their own internal use.
 Colocation data centers:
Provide space and resources for other companies to host their servers and equipment.
 Managed data centers:
Offer data storage, computing, and other services as a managed service to customers.
 Cloud data centers:
Owned and operated by cloud service providers and used to deliver cloud services to a wide
range of users.

What is Cloud Storage Architecture?


Cloud storage architecture involves designing and arranging components to provide scalable,
reliable, and secure storage services within a cloud computing environment. It typically
includes a front-end API for storage access, a middleware layer for features like data reduction
and replication, and a back-end for physical storage. This architecture is crucial for delivering
storage on-demand, in a multi-tenant and highly scalable manner.
Here's a more detailed breakdown:

Key Components:
Front-end (API):
This layer provides the interface for users to access the storage. It can offer various APIs like file
service, web service, or traditional protocols like iSCSI.
Middleware (Storage Logic):
This layer handles various storage-related functionalities, such as data replication, data reduction, and
access control.
Back-end (Physical Storage):
This layer implements the physical storage devices where data is actually stored. It can involve various
storage technologies, including physical disks, object storage, or network protocols.
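A minimal sketch of these three layers, assuming nothing about any real provider's API (all class and method names below are invented for illustration):

```python
# Illustrative sketch of the front-end / middleware / back-end layers described above.
class BackEnd:
    """Back-end: the physical storage layer, modelled here as one dict per 'disk'."""
    def __init__(self, disks=3):
        self.disks = [{} for _ in range(disks)]

    def write(self, disk_id, key, data):
        self.disks[disk_id][key] = data

    def read(self, disk_id, key):
        return self.disks[disk_id].get(key)

class Middleware:
    """Middleware: adds replication and simple access control on top of the back-end."""
    def __init__(self, backend, replicas=2):
        self.backend, self.replicas = backend, replicas
        self.acl = {}                                  # key -> tenants allowed to read

    def put(self, tenant, key, data):
        for disk_id in range(self.replicas):           # replicate across disks
            self.backend.write(disk_id, key, data)
        self.acl.setdefault(key, set()).add(tenant)

    def get(self, tenant, key):
        if tenant not in self.acl.get(key, set()):     # access control check
            raise PermissionError("tenant not authorised for this object")
        return self.backend.read(0, key)

class FrontEndAPI:
    """Front-end: the interface tenants call, e.g. via a web or file service."""
    def __init__(self, middleware):
        self.mw = middleware

    def upload(self, tenant, key, data):
        self.mw.put(tenant, key, data)

    def download(self, tenant, key):
        return self.mw.get(tenant, key)

api = FrontEndAPI(Middleware(BackEnd()))
api.upload("tenant-a", "report.txt", b"quarterly numbers")
print(api.download("tenant-a", "report.txt"))          # b'quarterly numbers'
```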
Key Characteristics:
Scalability:
Cloud storage architectures are designed to handle increasing storage demands by adding more
resources as needed.
Reliability:
Data replication and redundancy mechanisms ensure data is available even if one or more storage
components fail.
Security:
Security measures, such as encryption and access control, protect data from unauthorized access.
Multi-tenancy:
Cloud storage platforms allow multiple users or organizations to share the same infrastructure, enabling
efficient resource utilization.

Examples of Cloud Storage Architectures:


 Object Storage: Stores data as objects (files) in a scalable and cost-effective manner, often used for
unstructured data like images, videos, and documents.
 Block Storage: Provides raw storage space, often used for virtual machines and databases.
 File Storage: Allows users to access data through a traditional file system interface, similar to a network
drive.
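
For example, object storage is usually accessed through a provider API. The sketch below assumes AWS S3 via the boto3 library with credentials already configured; the bucket name, key, and file name are placeholders.

```python
# Hedged example of object storage access using AWS S3 via boto3.
import boto3

s3 = boto3.client("s3")

# Store an object: unstructured data (e.g. an image) under a key in a bucket.
with open("cat.jpg", "rb") as f:                       # placeholder local file
    s3.put_object(Bucket="example-bucket", Key="photos/cat.jpg", Body=f)

# Retrieve the same object later from anywhere with an internet connection.
response = s3.get_object(Bucket="example-bucket", Key="photos/cat.jpg")
data = response["Body"].read()
```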

Difference between Parallel Computing and Distributed Computing
What is Parallel Computing?
In parallel computing, multiple processors perform multiple tasks assigned to them
simultaneously. Memory in parallel systems can either be shared or distributed. Parallel
computing provides concurrency and saves time and money.
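A minimal parallel computing sketch in Python, using the standard multiprocessing module to split one task across several processors on the same machine:

```python
# Several worker processes compute partial results simultaneously,
# and the results are combined at the end.
from multiprocessing import Pool

def square(n):
    return n * n                             # CPU-bound work done in a worker process

if __name__ == "__main__":
    numbers = range(1_000)
    with Pool(processes=4) as pool:          # 4 processors working concurrently
        results = pool.map(square, numbers)  # the work is split across the pool
    print(sum(results))                      # combined result: 332833500
```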
Examples
Blockchains, smartphones, laptop computers, the Internet of Things, artificial intelligence and
machine learning, space shuttles, and supercomputers are technologies that use parallel
computing.
Advantages of Parallel Computing
 Increased Speed: Several calculations are executed concurrently, reducing the computation
time required to complete large-scale problems.
 Efficient Use of Resources: Takes full advantage of all the processing units available,
making the best use of the machine's computational power.
 Scalability: The more processors built into the system, the more complex the problems
that can be solved within a short time.
 Improved Performance for Complex Tasks: Best suited for work that involves heavy
numerical calculation, such as numerical simulation, scientific analysis and modelling, and
data processing.
Disadvantages of Parallel Computing
 Complexity in Programming: Writing programs that organize tasks to run in parallel is
more difficult than writing serial programs.
 Synchronization Issues: Processors operating concurrently must be kept synchronized,
and the communication this requires can become a bottleneck.
 Hardware Costs: Parallel computing may require components such as multi-core processors,
which can be more costly than ordinary systems.
What is Distributed Computing?
In distributed computing, we have multiple autonomous computers which appear to the user
as a single system. In distributed systems there is no shared memory, and computers
communicate with each other through message passing. In distributed computing, a single
task is divided among different computers.
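A minimal message-passing sketch in Python: two "nodes" exchange a task and a result over a socket. Here both run on localhost purely for demonstration; in a real distributed system they would be separate machines with no shared memory, and the address below is a placeholder.

```python
# Two nodes cooperate by message passing: one delegates work, the other computes.
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 5050            # placeholder address of the receiving node

def node_b():
    """Receiving node: waits for a message and replies with a computed result."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((HOST, PORT))
        srv.listen(1)
        conn, _ = srv.accept()
        with conn:
            task = conn.recv(1024).decode()        # receive the task as a message
            conn.sendall(str(len(task)).encode())  # send back the result as a message

threading.Thread(target=node_b, daemon=True).start()
time.sleep(0.5)                                    # give the receiving node time to start

# Sending node: delegates part of the work and collects the answer.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
    cli.connect((HOST, PORT))
    cli.sendall(b"count the characters in this message")
    print(cli.recv(1024).decode())                 # result computed on the other node
```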
Examples
Artificial intelligence and machine learning, scientific research and high-performance
computing, the financial sector, the energy and environment sectors, the Internet of Things,
and blockchain and cryptocurrencies are areas where distributed computing is used.
Advantages of Distributed Computing
 Fault Tolerance: If one node fails, it simply drops out of the computation; this is not
fatal for the computation as a whole, since the other participating computers can continue,
making the system more reliable.
 Cost-Effective: Builds on existing hardware and can flexibly use commodity machines
instead of requiring expensive, specialized processors.
 Scalability: Distributed systems can scale horizontally by adding more machines to the
network, allowing them to take on greater workloads.
 Geographic Distribution: Distributed computing makes it possible to execute tasks at
different locations, reducing latency for users.
Disadvantages of Distributed Computing
 Complexity in Management: Managing a distributed system is more difficult, since it may
require handling network latency and failures as well as keeping the distributed data
synchronized.
 Communication Overhead: Inter-node communication, especially between geographically
distant nodes, can be slow and significantly reduce overall performance.
 Security Concerns: Distributed systems are generally less secure than centralized systems
because they depend heavily on a network.
Difference between Parallel Computing and Distributed Computing:

1. Parallel computing: Many operations are performed simultaneously. Distributed computing: System components are located at different locations.

2. Parallel computing: A single computer is required. Distributed computing: Uses multiple computers.

3. Parallel computing: Multiple processors perform multiple operations. Distributed computing: Multiple computers perform multiple operations.

4. Parallel computing: May have shared or distributed memory. Distributed computing: Has only distributed memory.

5. Parallel computing: Processors communicate with each other through a bus. Distributed computing: Computers communicate with each other through message passing.

6. Parallel computing: Improves system performance. Distributed computing: Improves system scalability, fault tolerance, and resource-sharing capabilities.

MAPREDUCE PARADIGM:
The MapReduce paradigm is a programming model for processing large datasets in a
massively parallel, distributed fashion. It simplifies the development of data-intensive
applications by dividing tasks into two main phases: Map and Reduce. The Map function
processes data in parallel, while the Reduce function combines the results from the Map
phase.
Here's a more detailed explanation:

Key Concepts:
Parallel Processing:
MapReduce is designed to leverage the power of distributed computing, processing data across multiple
machines simultaneously.
Data Splitting:
The input data is divided into smaller chunks, which are then processed independently by different map
tasks.
Map Function:
The Map function takes a key-value pair as input and produces a set of intermediate key-value pairs.
Reduce Function:
The Reduce function takes the intermediate key-value pairs and merges them based on the key,
producing the final output.
Fault Tolerance:
MapReduce is designed to be resilient to machine failures, allowing the system to continue operating
even if some nodes go down.
Abstraction:
The framework handles the complexities of data distribution, scheduling, and communication, allowing
developers to focus on the algorithms themselves.
How it Works:
1. Input Data Splitting: The input data is divided into smaller, manageable chunks.
2. Map Phase: Each chunk is processed by a Map task, which generates a set of intermediate key-value
pairs.
3. Shuffle and Sort: The intermediate key-value pairs are shuffled and sorted based on the key.
4. Reduce Phase: The Reduce tasks process the grouped key-value pairs and produce the final output.
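A minimal word-count sketch of these four steps in plain Python. A real MapReduce framework (for example, Hadoop) would run the map and reduce tasks on many machines; here everything runs in one process purely to show the data flow.

```python
# Word count expressed as the four MapReduce steps: split, map, shuffle/sort, reduce.
from collections import defaultdict
from itertools import chain

documents = ["the quick brown fox", "the lazy dog", "the quick dog"]

# 1-2. Split the input and run a Map task per chunk: emit (word, 1) pairs.
def map_task(doc):
    return [(word, 1) for word in doc.split()]

intermediate = list(chain.from_iterable(map_task(d) for d in documents))

# 3. Shuffle and sort: group all values that share the same key.
groups = defaultdict(list)
for key, value in intermediate:
    groups[key].append(value)

# 4. Reduce phase: merge the values for each key into the final output.
def reduce_task(key, values):
    return key, sum(values)

result = dict(reduce_task(k, v) for k, v in groups.items())
print(result)   # {'the': 3, 'quick': 2, 'brown': 1, 'fox': 1, 'lazy': 1, 'dog': 2}
```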
Benefits of MapReduce:
 Scalability: Can handle massive datasets that would be impossible to process on a single machine.
 Efficiency: Parallel processing significantly reduces processing time.
 Simplicity: The MapReduce model is relatively simple to understand and use, making it easier to develop
and maintain applications.
 Fault Tolerance: The system can continue running even if some nodes fail.
Use Cases:

MapReduce is widely used in various applications, including:


 Big Data Analytics: Processing and analyzing large datasets for insights and trends.
 Web Search: Indexing and searching the web.
 Social Media: Analyzing user data and content.
 Financial Modeling: Performing complex calculations and simulations.
 Log Analysis: Processing and analyzing log data for troubleshooting and monitoring.

MapReduce Architecture:
In a typical MapReduce implementation such as Hadoop, a master node (job tracker) splits the input, schedules map and reduce tasks onto worker nodes (task trackers), monitors their progress, and re-runs any tasks that fail.
