Cloud Digital Leader
Study Guide
v2.0
Contents
Introduction
Learn more about the exam
Outline of learning path content
Course 1: Digital Transformation with Google Cloud
Course 2: Exploring Data Transformation with Google Cloud
Course 3: Innovating with Google Cloud Artificial Intelligence
Course 4: Modernize Infrastructure and Applications with Google Cloud
Course 5: Trust and Security with Google Cloud
Course 6: Scaling with Google Cloud Operations
Glossary
List of Google products and solutions
Introduction
The Google Cloud Digital Leader training and exam are intended for tech-adjacent individuals who
want to demonstrate an overall knowledge of cloud technology concepts and Google Cloud.
The exam validates a candidate’s ability to complete the following course objectives:
● Identify Google Cloud products and solutions that support digital transformation.
● Explain how cloud technology and data can be used to innovate within organizations.
● Identify how organizations can innovate using Google Cloud’s artificial intelligence and
machine learning solutions.
● Explain how to optimize cloud costs and achieve operational excellence with Google Cloud.
The Cloud Digital Leader exam is job-role independent. It assesses the knowledge and skills of any individual who wants (or is required) to understand the purpose and application of Google Cloud products.
Sign up for the Cloud Digital Leader Learning Path through Google Cloud Skills Boost, Coursera, or
Pluralsight.
Prepare for the exam with sample questions.
Learn more about how and where to take the exam on the Cloud Digital Leader website.
Course 1
Digital Transformation with Google Cloud
Module 1: Why Cloud Technology is Transforming Business
Course 2
Exploring Data Transformation with Google Cloud
Module 1: The Value of Data
Course 3
Innovating with Google Cloud Artificial Intelligence
Module 1: AI and ML Fundamentals
Course 4
Modernize Infrastructure and Applications
with Google Cloud
Module 1: Important Cloud Migration Terms
Course 5
Trust and Security with Google Cloud
Module 1: Trust and Security in the Cloud
Course 6
Scaling with Google Cloud Operations
Module 1: Financial Governance and Managing Cloud Costs
Key terms: data, data management, data value chain, data governance, structured data, unstructured data, semi-structured data, databases, data warehouses, data lakes, database migration, business intelligence, streaming analytics

More resources: Cloud locations, What is a data lake?, What is a data warehouse?, What is Cloud Storage?, What is object storage?, What is ETL?, Google security
Glossary
Course 1
Bandwidth: A measure of how much data a network can transfer in a given time.
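As a rough illustration, transfer time can be estimated as data size divided by bandwidth (the figures below are hypothetical, for illustration only):

```python
# Estimate how long a transfer takes at a given bandwidth.
# All figures are hypothetical, for illustration only.

def transfer_time_seconds(size_gb: float, bandwidth_mbps: float) -> float:
    """Return seconds needed to move size_gb gigabytes over a bandwidth_mbps link."""
    size_megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB = 8000 megabits
    return size_megabits / bandwidth_mbps

# Moving 10 GB over a 100 Mbps link:
print(round(transfer_time_seconds(10, 100)))  # 800 seconds
```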
Capital expenditures (CapEx): Upfront business expenses put toward fixed assets. Organizations
buy these items once, and they benefit their business for years.
Cloud technology/computing: The technology and processes needed to store, manage, and access data that is transferred over the cloud (as opposed to data that remains on your computer's hard drive).
Computing: A machine’s ability to process, store, retrieve, compare and analyze information, and to
automate tasks often done by computer programs (otherwise known as software or applications).
Data: Any information that is useful to an organization. It can be numbers on a spreadsheet, text in an email, audio or video recordings, images, or even ideas in employees' heads, and it includes both internal and external information.
Digital transformation: When an organization uses new technologies to redesign and redefine
relationships with their customers, employees, and partners. Digital transformation uses modern
digital technologies—including all types of public, private, and hybrid cloud platforms—to create or
modify business processes, culture, and customer experiences to meet changing business and
market dynamics.
Infrastructure as a service (IaaS): A computing model that offers the on-demand availability of
almost infinitely scalable infrastructure resources, such as compute, networking, storage, and
databases as services over the internet.
Network latency: The time it takes for data to travel from one point to another. Often measured in
milliseconds, latency, sometimes called lag, describes delays in communication over a network.
On-premises IT infrastructure: Hardware and software applications that are hosted on-site and operated within an organization's own data center to serve its unique needs.
Open source: Software with source code that is publicly accessible and free for anyone to use,
modify, and share.
Open standard: Software that follows particular specifications that are openly accessible and
usable by anyone.
Operating expenses (OpEx): Recurring costs for a more immediate benefit. This represents the
day-to-day expenses to run a business.
Platform as a service (PaaS): A computing model that offers a cloud-based platform for
developing, running, and managing applications.
Private cloud: When an organization has virtualized servers in its own data centers, or those of a
private cloud provider, to create its own private dedicated environment.
Public cloud: Where on-demand computing services and infrastructure are managed by a
third-party provider, such as Google Cloud, and shared with multiple organizations or “tenants”
through the public internet.
Regions: Independent geographic areas where Google Cloud resources are deployed, composed of
zones.
Shared responsibility model: A model in which the responsibility to secure data is shared between
a business and the cloud provider. The cloud service provider is the data processor, whereas the
organization is the data controller.
Software as a service (SaaS): A computing model that offers an entire application, managed by a
cloud provider, through a web browser.
The cloud: A metaphor for the network of data centers that store and compute information available
through the internet. It includes the complex web of software, computers, networks, and security
systems involved.
Total cost of ownership (TCO): A comprehensive assessment of all layers within the infrastructure
and other associated costs across the business over time. Includes acquiring hardware and
software, management and support, communications, and user expenses, and the cost of service
downtime, training, and other productivity losses.
Course 2
Business intelligence: The process of collecting, analyzing, and interpreting data to make better
business decisions.
Database: An organized collection of data generally stored in tables and accessed electronically
from a computer system. Built and optimized to enable the efficient ingestion of large amounts of
data from many different sources.
Data lake: A repository designed to store, process, and secure large amounts of structured, semi-structured, and unstructured data. It can store data in its native format, process any variety of it regardless of size, and serve many purposes, such as exploratory data analysis.
Dataset: Aggregated data points of one category (for example, customer transactions).
Data value chain: The sequence of activities involved in transforming data into value for an
organization.
Data warehouse: The central hub for all business data, it assembles data from multiple sources,
including databases. When combined with connector tools, it can transform unstructured data into
semi-structured data that can be used for analysis. Data warehouses are built to rapidly analyze and
report massive and multi-dimensional datasets on an ongoing basis, in real-time.
Object storage: A data storage architecture for large stores of unstructured data, designating
each piece of data as an object (for example, audio or multimedia files).
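A minimal sketch of the idea: each object pairs the raw bytes with metadata and a unique identifier. The names here are illustrative, not a real storage API.

```python
import uuid

# Toy in-memory object store: each object = raw bytes + metadata + unique ID.
# Illustrative only; real object stores (such as Cloud Storage) work over an API.
store = {}

def put_object(data: bytes, **metadata) -> str:
    object_id = str(uuid.uuid4())  # unique identifier for the object
    store[object_id] = {"data": data, "metadata": metadata}
    return object_id

oid = put_object(b"\x00\x01", content_type="audio/mpeg", title="intro")
print(store[oid]["metadata"]["content_type"])  # audio/mpeg
```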
Semi-structured data: Data that falls somewhere between structured and unstructured data. It’s
organized into a hierarchy, but without full differentiation or any particular ordering. Examples
include emails, HTML, JSON, and XML files.
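For example, a JSON document is semi-structured: it has a hierarchy of keys, but fields can vary from record to record. The fields below are made up for illustration.

```python
import json

# A JSON record: hierarchical, but with no fixed schema across records.
record = json.loads('{"name": "Ada", "orders": [{"id": 1, "total": 9.5}]}')
print(record["orders"][0]["total"])  # 9.5
```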
Streaming analytics: The process of analyzing data in real time as it is being generated.
Structured data: Highly organized, quantitative data (for example, names or credit card numbers).
Easily stored and managed in databases.
Unstructured data: Data that has no predefined organization and tends to be qualitative (for example, word processing documents or images). It can be stored as objects, which consist of the data in its native format along with metadata such as unique identifiers.
Course 3
Artificial intelligence (AI): A broad field or term that describes any kind of machine capable of a
task that normally requires human intelligence, such as visual perception, speech recognition,
decision-making, or translation between languages.
Data quality: The degree to which data is complete, unique, timely, valid, accurate, and consistent.
Explainable AI: Techniques that make AI models more transparent and understandable to humans.
Machine learning (ML): A branch within the field of AI in which computers "learn" from data and make predictions or decisions without being explicitly programmed to do so, by using algorithms or models to analyze data. These algorithms use historical data as input to predict new output values.
ML models: Mathematical models that are used to make predictions or decisions based on data.
Responsible AI: An approach to AI development and deployment that considers the ethical, social,
and environmental implications of AI.
Course 4
Application (or app): A computer program or software that is designed to perform a specific digital
task, typically used or run by an end-user. In this digital age, customers expect applications to be
intuitive, well-functioning, and efficient.
Application programming interface (API): A piece of software that interfaces with or connects
different applications and enables information to flow between systems. Unlike a user interface,
which connects a computer to a person, an API connects computers or pieces of software to each
other. One purpose of APIs is to hide the internal details of how a system works, exposing only those
parts a developer wants to allow a user or program to interface with. In this way APIs can help
organizations to adapt to modern business needs by allowing access to older legacy systems.
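The same idea in miniature: a class exposes only the methods callers should use and hides how the system works internally, so the implementation can change without breaking callers. This is a hypothetical example, not any specific product's API.

```python
# A tiny API surface: callers use deposit()/balance() and never touch the
# internal ledger, so the implementation can change without breaking them.
class Account:
    def __init__(self):
        self._ledger = []  # internal detail, hidden from callers

    def deposit(self, amount: float) -> None:
        self._ledger.append(amount)

    def balance(self) -> float:
        return sum(self._ledger)

acct = Account()
acct.deposit(25.0)
print(acct.balance())  # 25.0
```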
Container: Follows the same principle as a VM, providing an isolated environment to run software
services and optimize resources from one piece of hardware. Containers are more efficient than VMs
because they do not recreate a full representation of the hardware, but only recreate or virtualize the
operating system.
Kubernetes: An open source cluster management system that provides automated container
orchestration.
Multi-cloud: An IT infrastructure that uses multiple public cloud providers, such as Google Cloud, to
achieve greater flexibility, scalability, and cost savings.
Rehosting: Moving an application or system from one environment to another, such as from
on-premises to the cloud, without making any changes to the application or system itself.
Serverless computing: A cloud computing execution model in which the cloud provider allocates
machine resources on demand and takes care of the servers on behalf of their customers.
Businesses provide code for the function that they want to run and the cloud provider handles all
infrastructure management. Resources such as compute power are automatically provisioned behind
the scenes as needed.
Virtual machine (VM): A virtualized instance of a server that re-creates the functionality of a dedicated physical server. It uses a partitioned space inside a physical server, which makes it easy to optimize and reallocate resources and allows multiple systems to run on the same hardware.
Course 5
Availability: The duration for which the cloud service provider guarantees that a client's data and services are up and running or accessible.
Defense-in-depth: The cloud service provider manages the security of its infrastructure and its
data centers, and customers gain the benefits of their infrastructure’s multiple built-in security
layers.
Encryption: The process of encoding data stored in the cloud to safeguard it from unauthorized
access.
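As a toy illustration of the principle only: the XOR cipher below is not secure and must never protect real data; production systems use vetted algorithms such as AES.

```python
# Toy XOR "encryption": shows encoding and decoding with a shared key.
# NOT secure -- for illustrating the concept only; real systems use AES etc.
def xor_cipher(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

ciphertext = xor_cipher(b"customer record", b"secret")
print(ciphertext != b"customer record")   # True: data is obscured
print(xor_cipher(ciphertext, b"secret"))  # b'customer record'
```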
Least privilege model: A security principle that grants users only the minimum permissions
necessary to perform their tasks.
Malware: Software designed to harm a computer system, such as viruses, ransomware, and
spyware.
Privacy: The data an organization or an individual has access to, and who they can share that data
with.
SecOps: A collaborative approach to security that combines IT security and operations teams.
Security: The policies, procedures, and controls put in place to keep data and infrastructure safe.
Two-step verification: A security measure that requires users to enter a second verification code,
such as a code sent to their phone, in addition to their password to log in.
Zero trust model: A security approach that assumes no entity or user is trustworthy and requires
continuous verification before granting access.
Course 6
DevOps: Short for development and operations. A philosophy that seeks to create a more collaborative and
accountable culture within developer and operations teams. Five objectives of DevOps include
reducing silos, accepting failure as normal, implementing gradual change, leveraging tooling and
automation, and measuring everything.
Log file: A text file where applications (including the operating system) write events. Log files make it
easier for developers, DevOps, and system administrators to get insights and identify the root cause
of issues within applications and the infrastructure.
Logging: A process that allows IT teams to analyze selected logs and accelerate application
troubleshooting.
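For instance, Python's standard logging module writes timestamped events that tooling can later collect and analyze. A minimal sketch, writing to an in-memory stream instead of a file:

```python
import io
import logging

# Write log events to an in-memory stream (a real app would log to a file
# or to the cloud provider's logging agent).
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))

logger = logging.getLogger("checkout")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("order %s processed", "A-1001")
print(stream.getvalue().strip())  # INFO checkout order A-1001 processed
```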
Monitoring: Gathering predefined sets of metrics or logs. Monitoring is the foundation for site
reliability engineering because it provides visibility into the performance, uptime, and overall health
of cloud powered applications.
Resource hierarchy: How an IT team can organize a business’s Google Cloud environment and how
that service structure maps to the organization’s actual structure. It determines what resources
users can access.
Saturation: The point at which a system is no longer able to handle any more requests.
SLA (Service Level Agreement): A contract between a service provider and a customer that
specifies the level of service that will be provided.
SLO (Service Level Objective): A target value for a service level indicator (SLI), such as a maximum latency of 200 milliseconds.
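For example, checking a set of measured latencies (a service level indicator) against a 200 ms target; all numbers here are made up:

```python
# Does the 95th-percentile latency meet a 200 ms SLO? Hypothetical numbers.
def p95(samples_ms):
    ordered = sorted(samples_ms)
    index = max(0, int(len(ordered) * 0.95) - 1)  # simple nearest-rank p95
    return ordered[index]

latencies_ms = [120, 130, 150, 160, 180, 190, 195, 198, 199, 250]
slo_ms = 200
print(p95(latencies_ms) <= slo_ms)  # True
```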
SRE: Site reliability engineering. A discipline that applies aspects of software engineering to
operations. The goals of SRE are to create scalable and highly reliable software systems. Best
practices central to SRE align with DevOps objectives.
List of Google products and solutions
App Engine: A platform for building scalable web applications and mobile backends.
Cloud Functions: An event-driven compute platform for cloud services and apps.
Cloud Identity: A unified platform for IT administrators to manage user devices and apps.
Cloud Monitoring: A tool for monitoring infrastructure and application health with rich metrics.
Cloud Profiler: Continuous CPU and heap profiling to improve performance and reduce costs.
Cloud Spanner: A fully managed Google Cloud database service designed for global scale.
Cloud SQL: Google Cloud’s fully managed relational database service.
Cloud Storage: Google Cloud’s object storage service for structured, semi-structured, and unstructured data. One of several products used in data lake solutions.
Cost Management: Tools for monitoring, controlling, and optimizing business costs.
Dataflow: A fully managed streaming analytics service that creates a pipeline to process both
streaming data and batch data.
Firebase: An app development software to build, improve, and grow mobile and web apps.
Google Cloud console: A web-based interface for managing and monitoring cloud apps.
Google Kubernetes Engine: A managed service for running Kubernetes, the open source container orchestration system that automates application deployment, scaling, and management.
Pub/Sub: A distributed messaging service that can receive messages from various device streams
such as gaming events, IoT devices, and application streams. The name is short for
Publisher/Subscriber.
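The underlying pattern in miniature: publishers push messages onto a topic, and subscribers consume them without either side knowing about the other. This is a toy sketch of the pattern, not the actual Pub/Sub client library.

```python
from collections import defaultdict

# Toy publish/subscribe: topics fan messages out to registered subscribers.
subscribers = defaultdict(list)
received = []

def subscribe(topic, callback):
    subscribers[topic].append(callback)

def publish(topic, message):
    for callback in subscribers[topic]:
        callback(message)

subscribe("game-events", lambda msg: received.append(msg))
publish("game-events", {"player": "p1", "score": 10})
print(received)  # [{'player': 'p1', 'score': 10}]
```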
TensorFlow: An end-to-end open source platform for machine learning, with a comprehensive,
flexible ecosystem of tools, libraries and community resources, originally created by Google.
Vertex AI: A unified platform for training, hosting and managing ML models. Features include
AutoML and custom training.
VMware Engine: An engine for migrating and running VMware workloads natively on Google Cloud.