
3 GCC

The document outlines a course on Grid and Cloud Computing, covering topics such as the evolution of distributed computing, grid services, virtualization, programming models, and security. It includes detailed units on Open Grid Services Architecture, cloud deployment models, and the use of various toolkits and frameworks. The course aims to equip students with the ability to apply grid computing techniques, utilize virtualization, and implement security models in cloud environments.

Uploaded by

vinayagam.ta
GRT INSTITUTE OF ENGINEERING AND TECHNOLOGY


DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

CS6703 GRID AND CLOUD COMPUTING L T P C

3 0 0 3
UNIT I INTRODUCTION 9
Evolution of Distributed computing: Scalable computing over the Internet – Technologies for
network based systems – clusters of cooperative computers - Grid computing Infrastructures
– cloud computing - service oriented architecture – Introduction to Grid Architecture and
standards – Elements of Grid – Overview of Grid Architecture.

UNIT II GRID SERVICES 9


Introduction to Open Grid Services Architecture (OGSA) – Motivation – Functionality
Requirements –Practical & Detailed view of OGSA/OGSI – Data intensive grid service
models – OGSA services.

UNIT III VIRTUALIZATION 9


Cloud deployment models: public, private, hybrid, community – Categories of cloud
computing: Everything as a service: Infrastructure, platform, software – Pros and Cons of
cloud computing – Implementation levels of virtualization – virtualization structure –
virtualization of CPU, Memory and I/O devices – virtual clusters and Resource Management –
Virtualization for data center automation.

UNIT IV PROGRAMMING MODEL 9


Open source grid middleware packages – Globus Toolkit (GT4) Architecture , Configuration
– Usage of Globus – Main components and Programming model - Introduction to Hadoop
Framework - Mapreduce, Input splitting, map and reduce functions, specifying input and
output parameters, configuring and running a job – Design of Hadoop file system, HDFS
concepts, command line and java interface, dataflow of File read & File write.

UNIT V SECURITY 9
Trust models for Grid security environment – Authentication and Authorization methods –
Grid security infrastructure – Cloud Infrastructure security: network, host and application
level – aspects of data security, provider data and its security, Identity and access
management architecture, IAM practices in the cloud, SaaS, PaaS, IaaS availability in the
cloud, Key privacy issues in the cloud.
TOTAL: 45 PERIODS
OUTCOMES:
At the end of the course, the student should be able to:
• Apply grid computing techniques to solve large scale scientific problems.
• Apply the concept of virtualization.
• Use the grid and cloud tool kits.
• Apply the security models in the grid and the cloud environment.

TEXT BOOK:
1. Kai Hwang, Geoffrey C. Fox and Jack J. Dongarra, “Distributed and Cloud Computing:
Clusters, Grids, Clouds and the Future of Internet”, First Edition, Morgan Kaufmann
Publishers, an Imprint of Elsevier, 2012.


REFERENCES:
1. Jason Venner, “Pro Hadoop: Build Scalable, Distributed Applications in the Cloud”,
Apress, 2009.
2. Tom White, “Hadoop: The Definitive Guide”, First Edition, O’Reilly, 2009.
3. Bart Jacob (Editor), “Introduction to Grid Computing”, IBM Red Books, Vervante, 2005
4. Ian Foster, Carl Kesselman, “The Grid: Blueprint for a New Computing Infrastructure”,
2nd Edition, Morgan Kaufmann.
5. Frederic Magoules and Jie Pan, “Introduction to Grid Computing” CRC Press, 2009.
6. Daniel Minoli, “A Networking Approach to Grid Computing”, John Wiley Publication,
2005.
7. Barry Wilkinson, “Grid Computing: Techniques and Applications”, Chapman and Hall,
CRC, Taylor and Francis Group, 2010.

2 MARK QUESTIONS WITH ANSWERS


UNIT I - INTRODUCTION
1. What is Grid Computing?
Grid computing is a processor architecture that combines computer resources from
various domains to reach a main objective. In grid computing, the computers on the
network can work on a task together, thus functioning as a supercomputer.
2. What is QoS?
Quality of Service (QoS) is the ability of a grid computing system to provide the service
levels that the end-user community requires. The QoS provided by the grid covers
performance, availability, management aspects, business value and flexibility in pricing.

3. What are the derivatives of grid computing?


There are 8 derivatives of grid computing. They are as follows:
a) Compute grid
b) Data grid
c) Science grid
d) Access grid
e) Knowledge grid
f) Cluster grid
g) Terra grid
h) Commodity grid

4. What are the features of data grids?


The ability to integrate multiple distributed, heterogeneous and independently
managed data sources. The ability to provide data caching and/or replication
mechanisms to minimize network traffic. The ability to provide the necessary data
discovery mechanisms, which allow the user to find data based on characteristics
of the data.

5. Define – Cloud Computing.


Cloud computing, often referred to as simply “the cloud,” is the delivery of on-demand
computing resources (everything from applications to data centers) over the Internet on a
pay-for-use basis. It means storing and accessing data and programs over the Internet
instead of on your computer's hard drive.


6. What is business on demand?


Business On Demand is not just about utility computing as it has a much broader
set of ideas about the transformation of business practices, process transformation, and
technology implementations.
The essential characteristics of on-demand businesses are responsiveness to the
dynamics of business, adapting to variable cost structures, focusing on core business
competency, and resiliency for consistent availability.

7. What are the facilities provided by virtual organization?


The formation of virtual task forces, or groups, to solve specific problems associated
with the virtual organization.

The dynamic provisioning and management capabilities of the resources required
to meet the SLAs.
8. What are the properties of Cloud Computing?
There are six key properties of cloud computing. Cloud computing is:
• user-centric
• task-centric
• powerful
• accessible
• intelligent
• programmable

9. Sketch the architecture of Cloud.

10. What are the types of Cloud service development?

• Software as a Service
• Platform as a Service
• Web Services
• On-Demand Computing

11. What is meant by scheduler?


Schedulers are types of applications responsible for the management of jobs, such
as allocating resources needed for any specific job, partitioning of jobs to schedule parallel
execution of tasks, data management, event correlation, and service-level management
capabilities.

12. What is meant by resource broker?


A resource broker provides pairing services between the service requester and the
service provider. This pairing enables the selection of the best available resources from
the service provider for the execution of a specific task.
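
The pairing step can be illustrated with a minimal sketch. The Resource fields, the node names and the "most spare CPUs" selection rule are assumptions made for illustration; real brokers apply richer policies.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """A provider's offer (hypothetical fields, for illustration only)."""
    name: str
    free_cpus: int
    free_memory_gb: int


def select_resource(resources, cpus_needed, memory_gb_needed):
    """Return the offer with the most spare CPUs that satisfies the request, or None."""
    candidates = [
        r for r in resources
        if r.free_cpus >= cpus_needed and r.free_memory_gb >= memory_gb_needed
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.free_cpus - cpus_needed)


offers = [Resource("node-a", 4, 16), Resource("node-b", 16, 64), Resource("node-c", 8, 8)]
best = select_resource(offers, cpus_needed=6, memory_gb_needed=12)
print(best.name)
```

Only node-b has both enough CPUs and enough memory for this request, so it is the offer the broker hands back to the requester.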

13. What is load balancing?


Load balancing is concerned with integrating the system so as to avoid
processing delays and over-commitment of resources. It involves partitioning of jobs,
identifying the resources and queuing the jobs.
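
One simple way to picture the queuing step is a greedy "least-loaded worker first" policy. This is only a sketch under that assumption; the job names, costs and worker names are hypothetical, and production grid schedulers use more elaborate policies.

```python
import heapq


def balance(jobs, workers):
    """Assign each (name, cost) job to the currently least-loaded worker."""
    heap = [(0, w) for w in workers]          # entries are (current load, worker name)
    heapq.heapify(heap)
    assignment = {w: [] for w in workers}
    # Placing the largest jobs first keeps the final loads closer together.
    for name, cost in sorted(jobs, key=lambda j: j[1], reverse=True):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(name)
        heapq.heappush(heap, (load + cost, worker))
    return assignment


jobs = [("j1", 5), ("j2", 3), ("j3", 8), ("j4", 2), ("j5", 4)]
plan = balance(jobs, ["w1", "w2"])
print(plan)
```

With these costs both workers end up with a total load of 11, so neither resource is over-committed.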
14. What is grid infrastructure?
Grid infrastructure forms the core foundation for successful grid applications.
This infrastructure is a complex combination of a number of capabilities and resources
identified for the specific problem and environment being addressed.

15. Define – Distributed Computing.


Distributed computing is a field of computer science that studies distributed
systems. A distributed system is a software system in which components located on
networked computers communicate and coordinate their actions by passing messages. The
components interact with each other in order to achieve a common goal.

16. List the challenges of P2P computing.


P2P computing faces three types of heterogeneity problems in hardware, software, and
network requirements. There are too many hardware models and architectures to select from;
incompatibility exists between software and the OS; and different network connections and
protocols make it too complex to apply in real applications. We need system scalability as the
workload increases.

17. Define IoT.

The Internet of Things (IoT) is the new dynamic network of networks into which the
dynamic connections are expected to grow exponentially.

18. List the Technologies for Network-Based Systems.

• Multicore CPUs and Multithreading Technologies
• Multicore CPU and Many-Core GPU Architectures
• Multithreading Technology

19. How do GPUs work?

Modern GPUs are not restricted to accelerated graphics or video coding. They are
used in HPC systems to power supercomputers with massive parallelism at the multicore and
multithreading levels. GPUs are designed to handle large numbers of floating-point
operations in parallel. In a way, the GPU offloads the CPU from all data-intensive
calculations, not just those that are related to video processing. Conventional GPUs are
widely used in mobile phones, game consoles, embedded systems, PCs, and servers. The
NVIDIA CUDA Tesla or Fermi is used in GPU clusters or in HPC systems for parallel
processing of massive floating-point data.

20. Draw the cluster Architecture.


UNIT II - GRID SERVICES

1. Define – OGSA.
Open Grid Services Architecture (OGSA) is a set of standards defining the way in
which information is shared among diverse components of large, heterogeneous grid systems.
In this context, a grid system is a scalable wide area network (WAN) that supports resource
sharing and distribution. OGSA is a trademark of the Open Grid Forum.

2. Define – OGSI.
The Open Grid Services Infrastructure (OGSI) was published by the Global Grid
Forum (GGF) as a proposed recommendation in June 2003. It was intended to provide an
infrastructure layer for the Open Grid Services Architecture (OGSA). OGSI takes the
statelessness issues (along with others) into account by essentially extending Web services to
accommodate grid computing resources that are both transient and stateful.

3. Define – Peer to Peer Computing.


Peer to Peer computing is a relatively new computing discipline in the realm of
distributed computing. P2P system defines collaboration among a larger number of
individuals and/or organizations, with a limited set of security requirements and a less
complex resource-sharing topology.

4. What is Dynamic Accounting System?

DAS provides the following enhanced categories of accounting functionality to the
IPG community:
• Allows a grid user to request access to a local resource via the presentation of grid
credentials.
• Determines and grants the appropriate authorizations for a user to access a local
resource without requiring a preexisting account on the resource to govern local
authorizations.

5. Define – SOA.


A service-oriented architecture is intended to define loosely coupled and
interoperable services/applications, and to define a process for integrating these
interoperable components.

6. What are the major goals of OGSA?

• Identify the use cases that can drive the OGSA platform components.
• Identify and define the core OGSA platform components.
• Define hosting and platform specific bindings.
• Define resource models and resource profiles with interoperable solutions.

7. What are the layers available in OGSA architectural organizations?

• Native platform services and transport mechanisms.
• OGSA hosting environment.
• OGSA transport and security.
• OGSA infrastructure (OGSI).
• OGSA basic services (meta-OS and domain services).

8. What is meant by grid infrastructure?


Grid infrastructure is a complex combination of a number of capabilities and resources
identified for the specific problem and environment being addressed. It forms the core
foundations for successful grid applications.

9. List some grid computing toolkits and frameworks.

• Globus Toolkit
• Globus Resource Allocation Manager (GRAM)
• Grid Security Infrastructure (GSI)
• Information Services
• Legion
• Condor and Condor-G
• NIMROD
• UNICORE
• NMI

10. Define - GRAM.


GRAM provides resource allocation, process creation, monitoring, and management
services. The most common use of GRAM is the remote job submission and control facility.
GRAM simplifies the use of remote systems.

11. What is the role of the grid computing organization?

• Organizations developing grid standards and best practices guidelines.
• Organizations developing grid computing toolkits, frameworks and middleware
solutions.
• Organizations building and using grid-based solutions to solve their computing, data,
and network requirements.
• Organizations working to adopt grid concepts into commercial products, via utility
computing and business on demand computing.

12. What are the different layers of grid architecture?


• Fabric Layer: Interface to local resources
• Connectivity Layer: Manages communications
• Resource Layer: Sharing of single resources
• Collective Layer: Coordinating multiple resources
• Application Layer: User-defined application

13. What are the fundamental components of SOAP specification?

• An envelope that defines a framework for describing message structure.
• A set of encoding rules for expressing instances of application-defined data types.
• A convention for representing remote procedure calls (RPC) and responses.
• A set of rules for using SOAP with HTTP.
• Message exchange patterns (MEP) such as request-response, one-way and peer-to-peer
conversations.

14. Define - SOAP.


SOAP is a simple and lightweight XML-based mechanism for creating structured data
packages that can be exchanged between network applications. SOAP provides a simple
enveloping mechanism and has proven able to work with existing networking service
technologies such as HTTP. SOAP is also flexible and extensible. SOAP builds upon the
XML Infoset.
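
The enveloping mechanism can be seen in a small sketch that builds a SOAP 1.1 message with Python's standard library. The application namespace, the GetJobStatus operation and the jobId element are hypothetical names invented for this example; only the envelope namespace comes from SOAP 1.1 itself.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope namespace
APP_NS = "http://example.org/grid"                       # hypothetical application namespace


def build_request(job_id):
    """Wrap a hypothetical GetJobStatus call in an Envelope/Header/Body structure."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")      # optional header, left empty here
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}GetJobStatus")
    ET.SubElement(request, f"{{{APP_NS}}}jobId").text = job_id
    return ET.tostring(envelope, encoding="unicode")


message = build_request("job-42")
print(message)
```

The resulting string is what would typically be sent as the body of an HTTP POST to the service endpoint.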

15. Define WSDL.


WSDL is an XML Infoset-based document format, which provides a model and XML format for
describing web services. This enables services to be described and enables the client to
consume these services in a standard way without knowing much about the lower-level protocol
exchange bindings, including SOAP and HTTP. This high-level abstraction of the service
limits human interaction and enables the automatic generation of proxies for web services;
these proxies can be static or dynamic. WSDL allows both document-oriented and RPC-oriented
messages.

16. What are the various levels of Policy Abstraction?

• Business Level
• Domain Level
• Device Level

17. What do you mean by the term flattening?

Basically, GWSDL extensions are to be transformed to WSDL. All the “extends” port
types and their operations are brought down to a single most derived portType. This
process is called “flattening” of the interface hierarchy to the most derived type.
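
As a rough illustration of flattening, the sketch below walks an "extends" hierarchy and collects every inherited operation into one list for the most derived portType. The portType names, the operation names and the dictionary layout are hypothetical; this is not a GWSDL parser.

```python
def flatten(port_types, name):
    """Collect the operations of `name` and of every portType it extends."""
    operations = []
    seen = set()

    def visit(current):
        if current in seen:              # guard against repeated or cyclic "extends"
            return
        seen.add(current)
        definition = port_types[current]
        for parent in definition.get("extends", []):
            visit(parent)                # inherited operations are listed first
        for op in definition["operations"]:
            if op not in operations:
                operations.append(op)

    visit(name)
    return operations


# Hypothetical three-level interface hierarchy.
port_types = {
    "BaseService": {"operations": ["findData", "destroy"]},
    "Notifier": {"extends": ["BaseService"], "operations": ["subscribe"]},
    "CounterService": {"extends": ["Notifier"], "operations": ["add", "getValue"]},
}
flat = flatten(port_types, "CounterService")
print(flat)
```

After flattening, CounterService exposes all five operations directly, with no remaining reference to its parents.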

18. What are the lifetime properties of OGSI specification?

• The time from which the contents of this element are valid (ogsi:goodFrom).
• The time until which the contents of this element are valid (ogsi:goodUntil).
• The time until which this element itself is available (ogsi:availableUntil).
19. What is soft-state lifetime management?


The soft-state lifetime management approach is a recommended method in the grid service
life-cycle management process. Every grid service has a termination time set by the service
creator. This soft-state life cycle is controlled by the appropriate security and policy
decisions of the service, and the service has the authority to control this behavior.
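
The idea can be sketched as a service object that expires unless its termination time is pushed forward. The class name, the method names and the 60-second cap are assumptions for illustration; times are plain numbers of seconds rather than real timestamps.

```python
class SoftStateService:
    """A service instance that expires unless its termination time is extended."""

    def __init__(self, now, lifetime):
        self.termination_time = now + lifetime

    def request_extension(self, now, lifetime, max_lifetime=60):
        # The service, not the client, decides how much extra time to grant.
        granted = min(lifetime, max_lifetime)
        self.termination_time = max(self.termination_time, now + granted)
        return self.termination_time

    def is_alive(self, now):
        return now < self.termination_time


svc = SoftStateService(now=0, lifetime=30)
alive_at_20 = svc.is_alive(20)                    # inside the initial lifetime
svc.request_extension(now=20, lifetime=100)       # client asks for 100 s, service grants 60 s
alive_at_70 = svc.is_alive(70)
alive_at_90 = svc.is_alive(90)
print(alive_at_20, alive_at_70, alive_at_90)
```

If the client stops sending extension requests, the instance simply lapses at its termination time and its resources can be reclaimed without an explicit destroy call.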
20. Explain about MembershipContentRule:
Deriving a service from the ServiceGroup portType and utilizing the
“MembershipContentRule” service data for the classification mechanisms can create a
grouping concept similar to a registry. This “rule” service data is used to restrict the
membership of a grid service in the group.

UNIT III - VIRTUALIZATION

1. What is the working principle of Cloud Computing?


The cloud is a collection of computers and servers that are publicly accessible via the
Internet. This hardware is typically owned and operated by a third party on a consolidated
basis in one or more data center locations. The machines can run any combination of
operating systems.

2. What is Virtualization?
Virtualization is a foundational element of cloud computing and helps deliver on the
value of cloud computing. Cloud computing is the delivery of shared computing resources,
software or data as a service and on demand through the Internet.

3. Define Cloud services with example.


Any web-based application or service offered via cloud computing is called a cloud
service. Cloud services can include anything from calendar and contact applications to word
processing and presentations.

4. What are the types of Cloud service development?

• Software as a Service
• Platform as a Service
• Infrastructure as a Service

5. Explain cloud provider and cloud broker?

Cloud Provider: A company that offers some component of cloud computing, typically
Infrastructure as a Service, Software as a Service or Platform as a Service. It is sometimes
referred to as a cloud service provider (CSP).

Cloud Broker: A third-party individual or business that acts as an intermediary between
the purchaser of a cloud computing service and the sellers of that service.

6. Define - Private Cloud.


The private cloud is built within the domain of an intranet owned by a single
organization. Therefore, they are client owned and managed. Their access is limited to the
owning clients and their partners. Their deployment was not meant to sell capacity over the
Internet through publicly accessible interfaces. Private clouds give local users a flexible and
agile private infrastructure to run service workloads within their administrative domains.

7. Define - Public Cloud.


A public cloud is built over the Internet, which can be accessed by any user who has paid
for the service. Public clouds are owned by service providers. They are accessed by
subscription. Many companies have built public clouds, namely Google App Engine,
Amazon AWS, Microsoft Azure, IBM Blue Cloud, and Salesforce Force.com. These are
commercial providers that offer a publicly accessible remote interface for creating and
managing VM instances within their proprietary infrastructure.

8. Define - Hybrid Cloud.

A hybrid cloud is built with both public and private clouds. Private clouds can also support a
hybrid cloud model by supplementing local infrastructure with computing capacity from an
external public cloud. For example, the Research Compute Cloud (RC2) is a private cloud built
by IBM.

9. Define anything-as-a-service.
Providing services to clients on the basis of meeting their demands at some pay-per-use cost,
such as data storage as a service, network as a service, communication as a service, etc. It is
generally denoted as anything as a service (XaaS).
10. What is meant by SaaS?
Software as a Service refers to browser-initiated application software offered to thousands of
paying customers. The SaaS model applies to business processes, industry applications, customer
relationship management (CRM), enterprise resource planning (ERP), human resources (HR) and
collaborative applications.
11. What is meant by IaaS?
The Infrastructure as a Service model puts together the infrastructure demanded by the user,
namely servers, storage, networks and the data center fabric. The user can deploy and run
specific applications on multiple VMs running guest OSes.

12. Explain PaaS.

The Platform as a Service model enables the user to deploy user-built applications onto a
virtualized cloud platform. It includes middleware, databases, development tools and some
runtime support such as Web 2.0 and Java. The platform includes both hardware and software
integrated with specific programming interfaces.
13. List out the advantages of Cloud Computing.

• Lower IT Infrastructure Costs


• Fewer Maintenance Issues
• Lower Software Costs


• Instant Software Updates


• Increased Computing Power
• Unlimited Storage Capacity
• Increased Data Safety
• Improved Compatibility Between Operating Systems
• Improved Document Format Compatibility
• Easier Group Collaboration
• Universal Access to Documents
• Latest Version Availability
• Removes the Tether to Specific Devices

14. List out the disadvantages of Cloud Computing.

• Requires a Constant Internet Connection


• Doesn’t Work Well with Low-Speed Connections
• Can Be Slow
• Features Might Be Limited
• Stored Data Might Not Be Secure
• If the Cloud Loses Your Data, It May Be Unrecoverable

15. What is Hypervisor?

A hypervisor or virtual machine monitor (VMM) is a piece of computer software, firmware or
hardware that creates and runs virtual machines. A computer on which a hypervisor is running one
or more virtual machines is defined as a host machine. Each virtual machine is called a guest
machine.
16. What are the types of hypervisor?
There are two types of hypervisors:
Type 1 (bare-metal)
Type 2 (hosted)

Type 1 hypervisors run directly on the system hardware. They are often referred to as
"native", "bare metal" or "embedded" hypervisors in vendor literature.

Type 2 hypervisors run on a host operating system. When the virtualization
movement first began to take off, Type 2 hypervisors were most popular. Administrators
could buy the software and install it on a server they already had.

17. What are the benefits of virtualization?


Virtualization is the creation of virtual machines and their management from one place. It
allows resources to be shared among a large number of network resources. Virtualization
has many benefits, which are as follows:
1. It saves cost and makes maintenance easier and cheaper.
2. It allows multiple operating systems on one virtualization platform.
3. It removes the dependency on heavy hardware to run an application.
4. It consolidates servers, which helps keep services available when a server crashes.
5. It reduces the amount of space taken up by data centres and company data.


18. What are the types of hardware virtualization?


Full virtualization: Almost complete simulation of the actual hardware to allow
software, which typically consists of a guest operating system, to run unmodified.
Partial virtualization: Some but not all of the target environment is simulated. Some guest
programs, therefore, may need modifications to run in this virtual environment.
Paravirtualization: A hardware environment is not simulated; however, the guest programs
are executed in their own isolated domains, as if they are running on a separate system. Guest
programs need to be specifically modified to run in this environment.

19. What are the different components used in VMware infrastructure?

The major components used in VMware infrastructure are as follows:
• The lowest layer, which acts as the ESX server host.
• The virtual centre server, which keeps track of all the VM-related images and manages
them from one point.
• The VMware Infrastructure (VI) client, which allows the client to interact with the user's
applications that are running on VMware.
• A web browser, which is used to access the virtual machines.
• A license server, which provides licensing to the applications.
• Database servers, which are used to maintain a database.

20. What is QEMU?


QEMU is a generic and open source machine emulator and virtualizer. When used as
a machine emulator, QEMU can run OSes and programs made for one machine (e.g. an ARM
board) on a different machine (e.g. your own PC). By using dynamic translation, QEMU
achieves very good performance.

21. What is virtual Machine Cloning?


Virtual Machine Cloning is a method of creating a copy of an existing virtual machine
with the same configuration and installed software as the original. The existing virtual
machine is called the parent of the clone. When the cloning operation is complete, the clone
is a separate virtual machine.

UNIT IV - PROGRAMMING MODEL

1. What is Hadoop development?


Apache Hadoop is an open-source software framework written in Java for distributed
storage and distributed processing of very large data sets on computer clusters built from
commodity hardware.

2. Explain what the NameNode in Hadoop is.

The NameNode in Hadoop is the node where Hadoop stores all the file location
information in HDFS (Hadoop Distributed File System). In other words, the NameNode is the
centrepiece of an HDFS file system. It keeps a record of all the files in the file system and
tracks the file data across the cluster or multiple machines.

3. Define – GT4.


Globus Toolkit 4 is an open-source toolkit developed to build grids. It provides full
capabilities for sharing computing power and databases. Usage of Globus is extensive
throughout the scientific community within NSF, DOE, DARPA, IBM, NASA, and
Microsoft projects.

4. Define – MapReduce Computation.

MapReduce is designed to continue to work in the face of system failures. When a job
is running, MapReduce monitors the progress of each of the servers participating in the job. If
one of them is slow in returning an answer or fails before completing its work, MapReduce
automatically starts another instance of that task on another server that has a copy of the data.
The complexity of the error handling mechanism is completely hidden from the programmer.

5. Mention what are the three modes in which Hadoop can be run?
The three modes in which Hadoop can be run are
• Pseudo distributed mode
• Standalone (local) mode
• Fully distributed mode

6.What are the characteristics of Cloud Programming Model?


• Cost model
• Scalability
• Fault-tolerance
• Support for specific services
• Control model
• Data model
• Synchronization mode

7. What are the phases in the MapReduce Programming Model?


• Map Phase:
Processes input key/value pair
Produces set of intermediate pair
map (in_key, in_value) -> list(out_key, interm_value)


• Reduce Phase:
Combines all intermediate values for a given key
Produces a set of merged output values
reduce(out_key, list(interm_value)) -> list(out_value)
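
The two signatures above can be tried out with a single-process word count. This is a teaching sketch in plain Python, not the Hadoop API; map_fn, reduce_fn and run_job are names chosen here, and the grouping loop stands in for the shuffle that a real framework performs across machines.

```python
from collections import defaultdict


def map_fn(in_key, in_value):
    """map(in_key, in_value) -> list(out_key, interm_value)"""
    return [(word.lower(), 1) for word in in_value.split()]


def reduce_fn(out_key, interm_values):
    """reduce(out_key, list(interm_value)) -> list(out_value)"""
    return [sum(interm_values)]


def run_job(records):
    groups = defaultdict(list)
    for key, value in records:                          # map phase
        for out_key, interm_value in map_fn(key, value):
            groups[out_key].append(interm_value)        # group intermediate values by key
    # reduce phase: one call per distinct key
    return {k: reduce_fn(k, v) for k, v in sorted(groups.items())}


result = run_job([("doc1", "grid cloud grid"), ("doc2", "cloud Grid")])
print(result)
```

Here the input key is a document name and the input value is its text; each reduce call receives every count emitted for one word.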
8. Explain the function of the MapReduce partitioner.
The function of the MapReduce partitioner is to make sure that all the values of a single
key go to the same reducer, which in turn helps distribute the map output evenly
over the reducers.
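
A hash-based partitioner is a common way to get this behaviour. The sketch below uses CRC32 as a stand-in hash so that results are repeatable between runs; the choice of hash function and the sample pairs are assumptions, not Hadoop's actual implementation.

```python
import zlib


def partition(key, num_reducers):
    """Map a key to a reducer index; equal keys always get the same index."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


pairs = [("grid", 1), ("cloud", 1), ("grid", 1), ("hdfs", 1)]
buckets = {r: [] for r in range(3)}
for key, value in pairs:
    buckets[partition(key, 3)].append((key, value))

print(buckets)
```

Because the index depends only on the key, both ("grid", 1) pairs land in the same bucket, whichever map task produced them.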
9. Mention what the distributed cache in Hadoop is.
The distributed cache in Hadoop is a facility provided by the MapReduce framework. It is
used to cache files at the time of execution of a job. The framework copies the necessary
files to the slave node before the execution of any task at that node.
10. Mention what rack awareness is.
Rack awareness is the way in which the namenode determines how to place blocks based
on the rack definitions.
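
A simplified placement rule for three replicas can be sketched as follows: the first copy on the writer's node, the second on a node in a different rack, and the third on another node of that second rack. This follows the commonly described HDFS default only loosely; the real namenode also weighs load and chooses nodes randomly. The topology and node names are hypothetical, and the sketch assumes at least two racks with two nodes on the remote rack.

```python
def place_replicas(topology, writer_node):
    """Pick three nodes for a block. `topology` maps rack name -> list of node names."""
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    local_rack = rack_of[writer_node]
    first = writer_node
    # Second replica: the first node found on any other rack.
    remote_rack = next(r for r in topology if r != local_rack)
    second = topology[remote_rack][0]
    # Third replica: a different node on the same remote rack.
    third = next(n for n in topology[remote_rack] if n != second)
    return [first, second, third]


topology = {"rack1": ["n1", "n2"], "rack2": ["n3", "n4"]}
replicas = place_replicas(topology, "n1")
print(replicas)
```

Spreading the copies over two racks means the block survives the loss of a whole rack, while keeping two copies on one rack limits cross-rack write traffic.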

11. What happens when a datanode fails?

When a datanode fails:
• The jobtracker and namenode detect the failure.
• All tasks on the failed node are rescheduled.
• The namenode replicates the user's data to another node.

12. Define – Hadoop Scheduler.

• A job is divided into several independent tasks that are executed in parallel.
• The input file is split into chunks of 64 or 128 MB.
• Each chunk is assigned to a map task.
• Reduce tasks aggregate the output of the map tasks.
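
The chunking step amounts to simple arithmetic, sketched below. The function name is hypothetical and the sketch ignores record boundaries and the other adjustments that Hadoop's input formats make when computing splits.

```python
import math

MB = 1024 * 1024


def plan_map_tasks(file_size_bytes, chunk_size_bytes=128 * MB):
    """Return one (offset, length) pair per map task."""
    count = math.ceil(file_size_bytes / chunk_size_bytes)
    return [
        (i * chunk_size_bytes,
         min(chunk_size_bytes, file_size_bytes - i * chunk_size_bytes))
        for i in range(count)
    ]


splits = plan_map_tasks(300 * MB)
print(len(splits), splits[-1])
```

A 300 MB file with 128 MB chunks therefore yields three map tasks, the last of which covers only the remaining 44 MB.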

13. Define – HDFS.
The Hadoop Distributed File System (HDFS) was developed using a distributed file system
design and runs on commodity hardware. Unlike other distributed systems, HDFS is highly
fault tolerant and designed to use low-cost hardware. HDFS holds very large amounts of data
and provides easier access. To store such huge data, the files are stored across multiple
machines.

14. What are the features of HDFS?

• It is suitable for distributed storage and processing.
• Hadoop provides a command interface to interact with HDFS.
• The built-in servers of the name-node and data-node help users to easily check the status
of the cluster.
• Streaming access to data in the file system.
• HDFS provides file permissions and authentication.

15.Sketch the HDFS Architecture.


16.What is Cloud Dataflow Programming Model?


The Dataflow programming model is designed to simplify the mechanics of large-
scale data processing. When you program with a Dataflow SDK, you are essentially creating
a data processing job to be executed by one of the Cloud Dataflow runner services.
This model lets you concentrate on the logical composition of your data processing
job, rather than the physical orchestration of parallel processing. You can focus on what you
need your job to do instead of exactly how that job gets executed.

17. What is Java Cloud service?

Oracle Java Cloud Service is a subscription-based, self-service, reliable, scalable, and
elastic enterprise-grade cloud platform that enables businesses to securely develop and
deploy Java applications. Its features include:
• Dedicated virtual machines for running your entire WebLogic Server cluster.
• Pre-configured WebLogic Server software, with your choice of the 11g or 12c
version.
• Choice of virtual machine size (virtual cores, memory), as well as the size of the
WebLogic cluster.
• Self-managed, with fully automated cloud tooling for administrative and lifecycle
operations, such as patching, scaling, and backup.
• Fully automated, one-click, point-in-time restore for the entire service.

18. What is AIM?
AOL Instant Messenger (www.aim.com), also known as AIM, was one of the most widely
used instant messaging programs until AOL discontinued it in December 2017. AIM supported
many special features in addition to basic text messaging: file sharing, RSS feeds, group
chats, text messaging to and from mobile phones, voice chat, video chat, and a mobile
client. Users could also enhance the basic AIM experience with a variety of official and
user-created plug-ins.

19.Define- Multi-tenancy.
Multi-tenancy can be defined as a principle in software architecture, where a single
instance of a vendor’s offering runs on the vendor’s servers, serving multiple client
organizations (tenants). Often these tenants will pay a fee for this.

In practice, multi-tenancy allows a cloud provider to provide a service to
organizations that have users of their own. In certain cases a tenant could have
only one user; the important point is that the cloud provider has taken the tenant concept
into account and provides, for example, access and billing based on the tenant concept.
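
A minimal sketch of the idea, assuming a hypothetical in-memory service: one shared instance serves every tenant, while data access and billing are both keyed by tenant. The class name, tenant names, and the per-request price are illustrative only.

```python
from collections import defaultdict


class MultiTenantStore:
    """One shared instance; records and usage counters are keyed by tenant."""

    def __init__(self):
        self._data = defaultdict(dict)     # tenant_id -> {key: value}
        self._requests = defaultdict(int)  # tenant_id -> request count

    def put(self, tenant_id, key, value):
        self._requests[tenant_id] += 1
        self._data[tenant_id][key] = value

    def get(self, tenant_id, key):
        self._requests[tenant_id] += 1
        # A tenant can only ever look inside its own partition.
        return self._data[tenant_id].get(key)

    def bill(self, tenant_id, price_per_request=0.01):
        # Billing based on the tenant concept: usage is metered per tenant.
        return round(self._requests[tenant_id] * price_per_request, 2)


store = MultiTenantStore()
store.put("acme", "report", "Q1 figures")
store.put("globex", "report", "launch plan")
print(store.get("acme", "report"))
```

Both tenants use the same key, yet neither can read the other's value, because every lookup is scoped by the tenant identifier.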

20.Define- GFS.
Google File System (GFS or GoogleFS) is a proprietary distributed file system
developed by Google for its own use.
It is designed to provide efficient, reliable access to data using large clusters of
commodity hardware. The successor to GFS, codenamed Colossus, was released in 2010.

21.Define- OGF.
OGF is an open global community committed to driving the rapid evolution and
adoption of modern advanced applied distributed computing, including cloud, grid and
associated storage, networking and workflow methods.
OGF is focused on developing and promoting innovative scalable techniques,
applications and infrastructures to improve productivity in the enterprise and within the
international research, science and business communities.

UNIT V - SECURITY

1. What are the functions in the Grid Security Model?

● Multiple security mechanisms
● Dynamic creation of services
● Dynamic establishment of trust domains

2. What are the OGSA security services?

● Credential processing service
● Authorization service
● Credential conversion service
● Identity mapping service
● Audit

3. What are the high-level services included in the Globus Toolkit?

● Globus Resource Allocation Manager (GRAM)
● Grid Security Infrastructure (GSI)
● Information Services

4. What are the most common GT3 handlers?

● Authentication Service Handler
● WS Security Handler
● Authorization Handler
● X509 Sign Handler
● GSS Handler

5. Define- GSI.
The Grid Security Infrastructure (GSI), formerly called the Globus Security
Infrastructure, is a specification for secret, tamper-proof, delegatable communication between
software in a grid computing environment. Secure, authenticatable communication is
enabled using asymmetric encryption.
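
To make the role of asymmetric keys concrete, here is a toy sign-and-verify sketch using textbook RSA with tiny primes. It is an assumption-laden illustration, not GSI code: GSI relies on X.509 certificates, and keys this small offer no security. The message text and helper names are hypothetical.

```python
import hashlib

# Toy key pair built from small textbook primes; far too small for real use.
p, q, e = 61, 53, 17
n = p * q                              # public modulus
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (Python 3.8+)


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    """Only the holder of the private exponent d can produce this value."""
    return pow(digest(message), d, n)


def verify(message: bytes, signature: int) -> bool:
    """Anyone with the public pair (e, n) can check the signature."""
    return pow(signature, e, n) == digest(message)


job = b"submit job 42 as alice"
sig = sign(job)
print(verify(job, sig))
# A tampered message should fail verification; with a modulus this small,
# an accidental collision is possible, so no output is claimed here.
print(verify(b"submit job 42 as mallory", sig))
```

The private key signs and the public key verifies, which is what lets a grid service authenticate a caller without sharing a secret with it.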

6. What are the high-level grid security requirement aspects?

● Authentication
● Authorization
● Delegation
● Message integrity
● Single logon
● Confidentiality
● Privacy
● Policy exchange
● Credential life span and renewal
● Secure logging
● Assurance
● Manageability

7. What are the Cisco Connected Grid security principles?

Cisco integrates security as a fundamental building block of any network architecture,
whether for the field area network, the transmission and substation network, or the
intra-control center tier. The primary principles behind Cisco Connected Grid security include:
● Access control
● Data integrity, confidentiality, and privacy
● Threat detection and mitigation
● Device and platform integrity

8. What are the risks of storing data in the Cloud?

● Reliability
● Security
● User error
● Access problems

9. What are the factors to identify the threats in cloud?

● Failures in Provider Security
● Attacks by Other Customers
● Availability and Reliability Issues
● Legal and Regulatory Issues
● Perimeter Security Model Broken
● Integrating Provider and Customer Security Systems

10. What are the phases in the data security life cycle?

Create: Creation is the generation of new digital content, or the alteration/updating of
existing content.
Store: Storing is the act of committing the digital data to some sort of storage repository, and
typically occurs nearly simultaneously with creation.
Use: Data is viewed, processed, or otherwise used in some sort of activity.
Share: Information is made accessible to others, such as between users, to customers, and to
partners.
Archive: Data leaves active use and enters long-term storage.
Destroy: Data is permanently destroyed using physical or digital means (e.g.,
cryptoshredding).

11. Define- DLP.


Data Loss Prevention (DLP) is defined as: Products that, based on central policies,
identify, monitor, and protect data at rest, in motion, and in use, through deep content
analysis.
DLP is typically used for content discovery and to monitor data in motion using the
following options:

● Dedicated appliance/server: Standard hardware placed at a network chokepoint
between the cloud environment and the rest of the network/Internet, or within
different cloud segments.
● Virtual appliance
● Endpoint agent
● Hypervisor agent: The DLP agent is embedded or accessed at the hypervisor level, as
opposed to running in the instance.
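
Whichever deployment option is used, the core of DLP is content analysis driven by a central policy. The sketch below is one possible rule, assumed for illustration: it flags text that contains a payment card number by combining a pattern match with the Luhn checksum. The pattern, function names, and sample text are hypothetical.

```python
import re

# Candidate card numbers: 13 to 16 digits, optionally separated by space or dash.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for i, dgt in enumerate(reversed(digits)):
        if i % 2 == 1:          # double every second digit from the right
            dgt *= 2
            if dgt > 9:
                dgt -= 9
        total += dgt
    return total % 10 == 0


def scan(text: str) -> list:
    """Return the substrings that look like valid card numbers."""
    return [m.group() for m in CARD_PATTERN.finditer(text)
            if luhn_valid(m.group())]


message = "Card 4111 1111 1111 1111 on file; ref 1234 5678 9012 3456."
print(scan(message))
```

The checksum step filters out digit runs such as reference numbers, which reduces false alerts compared with pattern matching alone.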

12. What is PaaS Encryption?


Since PaaS is so diverse, the following list may not cover all potential options:

● Client/application encryption: Data is encrypted in the PaaS application, or the client
accessing the platform.
● Database encryption: Data is encrypted in the database using encryption built in and
supported by the database platform.
● Proxy encryption: Data passes through an encryption proxy before being sent to the
platform.


13. Define- Database Activity Monitoring (DAM).


Database Activity Monitors capture and record, at a minimum, all Structured Query
Language (SQL) activity in real time or near real time, including database administrator
activity, across multiple database platforms; and can generate alerts on policy violations.
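
The capture-and-alert behaviour can be sketched with Python's built-in `sqlite3` module, whose `set_trace_callback` hook reports each SQL statement the connection executes. The policy (alerting on `DROP TABLE` and `DELETE FROM`) and the table are assumptions for illustration; a production monitor covers multiple database platforms and runs outside the database it watches.

```python
import sqlite3

captured = []   # every SQL statement seen by the monitor
alerts = []     # statements that violate the (hypothetical) policy

FORBIDDEN = ("DROP TABLE", "DELETE FROM")


def monitor(statement: str) -> None:
    """Record the statement and raise an alert if it breaks the policy."""
    captured.append(statement)
    if statement.lstrip().upper().startswith(FORBIDDEN):
        alerts.append(statement)


conn = sqlite3.connect(":memory:")
conn.set_trace_callback(monitor)

conn.execute("CREATE TABLE accounts (id INTEGER, owner TEXT)")
conn.execute("INSERT INTO accounts VALUES (1, 'alice')")
conn.execute("DELETE FROM accounts")
conn.commit()

print(len(captured), "statements captured;", len(alerts), "alert(s)")
```

The captured list may also contain transaction statements issued implicitly by the driver, which is why the sketch alerts on policy matches rather than on statement counts.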

14. What are the two aspects involved in GRAM?

● Job submission: a user starts job scheduling by creating a managed job
service.
● Resource management: a client knows about the master host environment and the
master managed factory service.

15. What are the two kinds of lifecycle models associated with state data recovery?
● Persistent lifecycle model.
● Transient lifecycle model.

16. What components make up the Globus GT3 toolkit?

o GT3 core.
o Base services.
o User-defined services.

18. What is a GT3 core?


It provides a framework to host the high-level services.
The core consists of the OGSI reference implementation, the security infrastructure, and
system-level services.

19. What are the major components of the default server-side framework?
Web service engine provided by the Apache AXIS framework: the GT3 software uses the
Apache AXIS framework to deal with normal web services.

Globus container framework: the GT3 software provides a container to manage stateful
web services through a unique instance handle, an instance repository, and lifecycle management.

20. Write notes on the Grid container.

The Globus container model is derived from the J2EE managed container model, where
the components are freed from complex resource management. The container provides:

● Lightweight service introspection and discovery.
● Dynamic deployment and soft-state management of stateful grid services.


21. What are the two levels of security available in GT3?

● Transport-level security: based on the GSI security mechanism.
● Message-level security: implemented at the SOAP message level.

22. How can the operations involved in service activation be treated?
● Activation using the lazy creation mechanism.
● Activation on service startup.
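
The difference between the two options can be sketched with a toy container. This is not GT3 code; the `ServiceContainer` class, its method names, and the service names are hypothetical and only contrast eager activation at startup with lazy creation on first use.

```python
class ServiceContainer:
    """Toy container showing the two activation options."""

    def __init__(self):
        self._factories = {}   # name -> callable that builds the service
        self._instances = {}   # name -> live service object

    def register(self, name, factory, activate_on_startup=False):
        self._factories[name] = factory
        if activate_on_startup:
            self._instances[name] = factory()     # activation on startup

    def lookup(self, name):
        if name not in self._instances:           # lazy creation on first use
            self._instances[name] = self._factories[name]()
        return self._instances[name]

    def is_active(self, name):
        return name in self._instances


container = ServiceContainer()
container.register("index", dict, activate_on_startup=True)
container.register("report", list)

print(container.is_active("index"), container.is_active("report"))
container.lookup("report")
print(container.is_active("report"))
```

Lazy creation defers the cost of building a service until a client actually asks for it, at the price of a slower first request.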

23. What are the problems with the operation providers?

● Due to the unavailability of multiple inheritance in Java, service developers must use
the default interface hierarchy provided by the framework.
● Some of the behaviors implemented by the aforementioned classes are specific to the
GT3 container.
● Dynamic configuration of service behaviors is not possible.

24. What are the expression evaluators supported in GT3?

● Service Data Name Evaluator.
● Service Data Name Set Evaluator.
● Service Data Name Delete Evaluator.
● Service Data XPath Evaluator.
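
To show the difference between a by-name query and an XPath query over service data, the sketch below uses Python's `xml.etree.ElementTree`, which supports a limited XPath subset. The XML layout, element names, and helper functions are hypothetical and do not reproduce the GT3 evaluator interfaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical service data document; element names are illustrative only.
SERVICE_DATA = """
<serviceDataValues>
  <serviceData name="cpuLoad">0.42</serviceData>
  <serviceData name="queueLength">7</serviceData>
</serviceDataValues>
"""

root = ET.fromstring(SERVICE_DATA)


def find_by_name(name: str):
    """Name-style query: fetch one service data element by its name."""
    node = root.find(f"./serviceData[@name='{name}']")
    return None if node is None else node.text


def find_by_xpath(expression: str):
    """XPath-style query: evaluate an arbitrary path over the service data."""
    return [node.text for node in root.findall(expression)]


print(find_by_name("cpuLoad"))
print(find_by_xpath("./serviceData"))
```

A name query returns one known element, while an XPath query lets the client select any subset of the service data in a single call.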

25. What are the two different message-level authentication mechanisms provided by the
GT3 framework?
● GSI Secure Conversation: a secure context is established between the client and the
service.
● GSI XML Signature: a message is signed with a given set of credentials.

26. What are the three steps to create and add service data to a service data set?
● Get the service data wrapper class from the service data set using the QName of the
service data element as defined in WSDL.
● Create the value for that service data element.
● Update the service data set with the service data wrapper and the new value.

27. What are the steps involved in creating an SDE?

● Create a new SDE by calling the create method of the service instance’s service data
set with a unique name or QName.
● Set a value for the SDE. The value of the SDE is of type My Service Data Type; set the
initial value of My Service Data Type.
● Add the SDE to the service data set.
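
The three steps above can be sketched as follows. The classes are hypothetical stand-ins written for illustration, not the GT3 Java API, and the namespace in the QName is an invented example.

```python
class ServiceData:
    """A service data element (SDE): a named slot holding a value."""

    def __init__(self, qname):
        self.qname = qname
        self.value = None


class ServiceDataSet:
    """Collection of SDEs owned by one service instance."""

    def __init__(self):
        self._entries = {}

    def create(self, qname):
        if qname in self._entries:
            raise ValueError(f"SDE name not unique: {qname}")
        return ServiceData(qname)          # step 1: new SDE, not yet added

    def add(self, sde):
        self._entries[sde.qname] = sde     # step 3: publish it in the set

    def get(self, qname):
        return self._entries[qname]


data_set = ServiceDataSet()
sde = data_set.create("{http://example.org/ns}MyServiceData")
sde.value = {"status": "idle"}             # step 2: set the initial value
data_set.add(sde)
print(data_set.get(sde.qname).value)
```

Until the final `add` call, the element exists but is not visible to clients querying the service data set.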

28. What are the most common GT3 security handlers?
● Authentication Service Handler
● WS Security Handler
● Security Policy Handler
● Authorization Handler
● X509 Sign Handler
● GSS Handler


29. What are the client-side security handlers?

● X509SignHandler
● SecContextHandler
● GSSHandler
● WSSecurityClientHandler

PART - B
UNIT – 1
1) Explain in detail about virtual organization. (16)
2) Write about the scope of grid computing in business areas. (16)
3) Explain some of the grid application and their usage patterns. (16)
4) Write short notes on. (16)
a) Schedulers
b) Resource broker
c) Load balancing
d) Grid portals
5) What are the data and functional requirements of grid computing? (16)
6) Explain briefly about grid infrastructure. (16)
7) Describe in detail the technologies for network-based systems. (16)

UNIT – II
1) Write short notes on Open Grid Service Architecture. (16)
2) Explain in detail, the functional requirements of OGSA. (16)
3) Explain Practical & Detailed view of OGSA/OGSI. (16)
4) Explain in detail, OGSA services. (16)
5) Describe about the relation of grid architecture with other distributed technologies.
(16)
6) What are the third generation initiatives of grid computing?
7) Discuss briefly about organization building and using grid based solution to
solve their computing data and network requirements.

UNIT III
1) Write short notes on cloud deployment model. (16)
2) Explain in detail, categories of cloud. (16)
3) Explain in detail, pros and cons of cloud. (8)
4) Explain in detail, the different implementation levels of virtualization. (16)
5) Write short notes on OS level virtualization. List the pros and cons of OS level
virtualization. (16)
6) Explain in detail, the virtualization of CPU, Memory and I/O devices. (16)
7) Write short notes on virtual clusters. (8)
8) Explain in detail, the virtualization for data center automation. (16)

UNIT IV
1) Explain in detail, the architecture and working principle of MapReduce?

2) Explain in detail, the dataflow and control flow of MapReduce?
3) Write short notes on iterative MapReduce.
4) Explain in detail, the architecture of MapReduce in Hadoop?
5) Explain in detail, programming on the Google App Engine?
6) Explain in detail, the architecture of the Google File System?
7) Explain in detail, the structure of the BigTable data model?
8) Explain in detail, programming on Amazon EC2?
9) Explain in detail, the architecture of Amazon EC2?
10) Explain in detail, the Microsoft Azure programming support?

UNIT V
1) Explain the Security challenges in cloud computing in detail?
2) Explain the security architecture in detail?
3) Write short notes on,
a. Security governance
b. Security monitoring
c. Risk management
4) Explain the Secure Software Development Life Cycle?
5) Explain in detail about Software-as-a-Service security?
6) Explain the application security in detail?
7) Explain the data security and virtual machine security in detail?
8) Explain the identity management and access control in detail?

UNIVERSITY QUESTION PAPERS

B.E. DEGREE EXAMINATION MAY 2011


Computer Science and Engineering
CS2254 Cloud Computing

PART - A (10 x 2 = 20 MARKS)


ANSWER ALL THE QUESTIONS

1. What are the properties of Cloud Computing?


2. What are the advantages of cloud services?
3. List the companies who offer cloud service development?
4. What is pre cloud computing?
5. Give the various schedules in Collaborating on schedule.
6. What are the modules in the Conference.com?
7. Give some online to-do list application.
8. Who should use a Web-Based Word Processor?
9. List the web-based spreadsheet applications
10. What is web conferencing?

PART - B (5 x 16 = 80 MARKS)
ANSWER ALL THE QUESTIONS

11. a. (i) Discuss about the Pros and Cons of Cloud Computing. (8)
(ii) Explain briefly about who get benefits from Cloud Computing (8)


(OR)
b. (i) Explain the types of Cloud service development in detail (8)
(ii) Explain the architecture of Cloud computing in detail. (8)
12. a. (i) What are the collaboration schedules in communicating across the (8)
community?
(ii) Explain the activities on cloud computing for the corporation. (8)
(OR)
b. Explain in detail about Centralizing email communication. (16)
13. a. (i) Discuss about Collaborating on calendars (8)
(ii) How to explore on line scheduling and planning. Explain with example. (8)
(OR)
b. (i) Explain in detail about collaborating on word processing, (8)
(ii) Explain collaborating on event management (8)
14. a. (i) How to create groups on social networks? Explain with example. (8)
(ii) Explain about evaluating web conference tools. (8)
(OR)
b. (i) Explain in detail about Evaluating on line groupware. (8)
(ii) Explain about evaluating web conference tools. (8)

15. a. (i) Discuss about the online photo editing applications. (8)
(ii) Explain about the photo sharing communities. (8)

(OR)
b. Explain in detail about understanding the cloud storage. (16)

B.E. DEGREE EXAMINATION MAY 2012


Computer Science and Engineering

CS2254 Cloud Computing

PART - A (10 x 2 = 20 MARKS)


ANSWER ALL THE QUESTIONS

1. Draw the architecture of Cloud


2. Define Cloud services with example.
3. Who get benefits from Cloud Computing?
4. What are the types of Cloud service development?
5. How to manage the web based projects?
6. What is virtual community?
7. List the web-based presentation programs.
8. What is Hunt calendar?
9. How Online Databases Work?
10. List some of the web conferencing tools.

PART - B (5 x 16 = 80 MARKS)
ANSWER ALL THE QUESTIONS


11. a. (i) Explain the Cloud service development. (8)
(ii) Explain briefly about who get benefits from Cloud Computing (8)
(OR)
b. (i) How to discover cloud service development services and tools? (8)
(ii) Explain collaboration to cloud. (8)
12. a. (i) Explain Collaborating on Group Projects and Events. (8)
(ii) Explain the activities on cloud computing for the corporation. (8)
(OR)
b. (i) What are the collaboration schedules in communicating across the (8)
community?
(ii) Explain in detail about Centralizing email communication. (8)
13. a. (i) Explain in detail about collaborating on spreadsheets. (8)
(ii) Discuss about Collaborating on Schedules. (8)
(OR)
b. (i) Explain collaborating on contact management. (8)
(ii) How to explore on line scheduling and planning. Explain with example. (8)
14. a. (i) Explain about Evaluating instant messaging. (8)
(ii) Explain in detail about Evaluating on line groupware. (8)
(OR)
b. (i) How to create groups on social networks? Explain with example. (8)
(ii) Explain about evaluating web mail services (8)
15. a. (i) Explain about online bookmarking services. (8)
(ii) Discuss about the online file storage and sharing services. (8)
(OR)
b. (i) Explain in detail about understanding the cloud storage. (8)
(ii) Discuss about the online photo editing applications. (8)

CLOUD COMPUTING
IMPORTANT QUESTIONS

CLOUD COMPUTING
IMPORTANT QUESTIONS
1 Explain any six benefits of Software as a Service in Cloud computing? 12M
2 List the different cloud applications available in the market? Briefly explain the
scenarios/situations of “when to not use clouds”. 12M
3 a) Explain the tasks performed by Google App Engine? 6M
b) Write a short note on IBM offerings towards Cloud computing? 6M
4 Explain the different operational and economical benefits of using clouds? 12M
5 a) Describe any six design principles of Amazon S3 Cloud computing model? 6M
b) What is REST in Web services? List the different benefits of REST. 6M
6 a) What is SaaS in Cloud computing? Explain different categories of SaaS? 6M
b) List the prevalent companies and their offerings towards software plus services via Cloud
computing? 6M
7 What is the need of virtualization? Define Server virtualization, Application virtualization,
Presentation Virtualization. 12M
8 Discuss the various migration issues of the organization towards Clouds? 12M

SECTION - I
Q1) a) Define cloud computing. Enlist and explain the essential characteristics of cloud
computing. [8]
b) Explain the services provided by the Amazon infrastructure cloud from a user
perspective. [8]
c) What is self-service provisioning? [2]
OR
Q2) a) What is cloud computing? Enlist and explain three service models, and four
deployment models of cloud computing. [8]
b) Explain a user view of Google App Engine with suitable block schematic. [8]
c) Explain in brief, how cloud helps reducing capital expenditure? [2]
Q3) a) What is the difference between process virtual machines, host VMMs and native
VMMs? [8]
b) Enlist and explain some of the common pitfalls that come with virtualization. [8]
OR
Q4) a) What are the fundamental differences between the virtual machine as perceived by
traditional operating system processes and a system VM? [8]
b) Compare the SOAP and REST paradigms in the context of programmatic communication
between applications deployed on different cloud providers, or between cloud applications
and those deployed in-house. [8]
Explain the architecture of cloud file systems (GFS, HDFS). [8]
Explain with suitable example, how a relational join could be executed in parallel using
MapReduce. [8]
Explain how Big tables are stored on a distributed file system such as GFS or HDFS. [8]


Explain with suitable example the MapReduce model. [8]


SECTION – II
Why does cloud computing bring new threats? [6]
What is secure execution environment and communication in cloud? [6]
Explain different threats and vulnerabilities specific to virtual machines. [6]
OR
Explain the two fundamental functions, identity management and access control, which are
required for secure cloud computing. [7]
Explain risks from multi-tenancy, with respect to various cloud environments. [7]
What is trusted cloud computing? [4]
Explain issues in cloud computing with respect to implementing real time application over
cloud platform. [8]
Enlist and explain the principal design issues that are to be addressed while designing a QoS-
aware distributed (middleware) architecture for cloud. [8]
OR
What is quality of service (QoS) monitoring in cloud computing? [8]
Enlist and explain different issues in inter-cloud environments. [8]
Explain conceptual representation of the Eucalyptus Cloud. Explain in brief the components
within the Eucalyptus system. [8]
What is Nimbus? What is the main way to deploy Nimbus Infrastructure? What is the
difference between cloudinit.d and the Context Broker? [8]
OR
What is OpenNebula Cloud? Explain main components of OpenNebula. [8]
Explain Xen Cloud Platform (XCP) with suitable block diagram. [8]
