
SECTION 5

What is BigQuery?

BigQuery is a fully managed, serverless data warehouse that allows for super-fast SQL queries using Google's infrastructure. It lets you analyze large datasets in real time using familiar SQL syntax, making it ideal for quick querying, real-time analytics, and machine learning.

How BigQuery Works

• Data Storage: Data is stored in tables with schemas that define field names and types.
• SQL Queries: You can run SQL queries to join tables, aggregate data, and perform complex calculations (see the sketch after this list).
• Resource Management: BigQuery automatically provisions resources based on load, so you don't need to manage infrastructure.
• Distributed Architecture: Queries run in parallel across many machines, which lets BigQuery process large datasets quickly.
• Data Loading: You can load data from sources like Google Cloud Storage and export data to various destinations.
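To make the "SQL Queries" point above concrete, here is a minimal sketch using the google-cloud-bigquery Python client. It is illustrative only: the project, dataset, and table names are hypothetical, and it assumes application-default credentials are configured.

    from google.cloud import bigquery

    # Hypothetical project name; assumes application-default credentials.
    client = bigquery.Client(project="my-project")

    query = """
        SELECT first_name, COUNT(*) AS employee_count
        FROM `my-project.my_dataset.employees`
        GROUP BY first_name
        ORDER BY employee_count DESC
    """

    # BigQuery executes the query on its distributed backend; the client
    # just submits the job and iterates over the results.
    for row in client.query(query).result():
        print(row.first_name, row.employee_count)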

Getting Started with BigQuery

1. Accessing BigQuery:
   o Click on the navigation menu and select BigQuery.
2. Creating a Dataset:
   o Click on the three dots next to your project ID, select "Create dataset," name it (e.g., "my_dataset"), choose the location (e.g., us-central1), and optionally enable table expiration.
3. Creating a Table:
   o Within your dataset, click on the three dots, select "Create table," choose to create an empty table, and name it (e.g., "employees").
   o Add fields to the schema, such as "employee_id" (integer), "first_name" (string), and "last_name" (string).
   o Click "Create table." (The same steps are scripted in the sketch after this list.)
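If you prefer to script this setup instead of clicking through the console, here is a minimal sketch with the google-cloud-bigquery Python client. The project ID is hypothetical; the dataset name, location, table name, and schema mirror the steps above.

    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")  # hypothetical project ID

    # Step 2: create the dataset in the chosen location.
    dataset = bigquery.Dataset(f"{client.project}.my_dataset")
    dataset.location = "us-central1"
    client.create_dataset(dataset, exists_ok=True)

    # Step 3: create an empty "employees" table with the schema above.
    schema = [
        bigquery.SchemaField("employee_id", "INTEGER"),
        bigquery.SchemaField("first_name", "STRING"),
        bigquery.SchemaField("last_name", "STRING"),
    ]
    table = bigquery.Table(f"{client.project}.my_dataset.employees", schema=schema)
    client.create_table(table, exists_ok=True)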


What is Cloud Composer?

Cloud Composer acts like a conductor for your data workflows. It helps automate and
manage tasks such as data pipelines, ETL processes, and data transformations across
different services in Google Cloud. Imagine it as a tool that ensures your data
operations run smoothly and on time, much like how a conductor manages various
instruments in a musical performance.

Key Features:

• Workflow Definition: Workflows are defined using Directed Acyclic Graphs (DAGs), which outline tasks and their dependencies.
• Integration with Google Cloud: Found under the Data Analytics section in Google Cloud Platform, it seamlessly integrates with other Google services.
• Environment Setup: You can create environments (like Composer 2) to configure machine types, locations, and networking specifics for optimal performance.

Example Workflow:

• Task 1 - Extract Customer Data: Queries customer data from Google BigQuery using SQL to fetch details like customer IDs and emails of recent purchasers.
• Task 2 - Generate Email Content: Processes the retrieved data to create personalized email content thanking customers for their purchases.
• Task 3 - Send Email: Sends out personalized emails using SMTP server settings, ensuring each email reaches the correct customer.
Creating a DAG:

To automate this workflow, you'd define a DAG in a Python file named "send_emails_DAG.py". This script specifies tasks, their sequence, and scheduling intervals (e.g., daily execution). Tasks are executed in the order defined, ensuring dependencies are met.
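Here is a minimal sketch of what send_emails_DAG.py might look like, assuming Airflow 2 (the version used by Composer 2 environments). The task bodies are hypothetical placeholders; only the structure, ordering, and daily schedule come from the example above.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    # Placeholder task logic: in a real DAG these would call BigQuery,
    # build the message bodies, and talk to an SMTP server.
    def extract_customer_data():
        ...  # Task 1: fetch customer IDs and emails of recent purchasers

    def generate_email_content():
        ...  # Task 2: create a personalized thank-you email per customer

    def send_email():
        ...  # Task 3: send each email through the configured SMTP server

    with DAG(
        dag_id="send_emails_DAG",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",  # daily execution, as described above
        catchup=False,
    ) as dag:
        extract = PythonOperator(task_id="extract_customer_data",
                                 python_callable=extract_customer_data)
        generate = PythonOperator(task_id="generate_email_content",
                                  python_callable=generate_email_content)
        send = PythonOperator(task_id="send_email", python_callable=send_email)

        # Dependencies: each task runs only after the previous one succeeds.
        extract >> generate >> send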

Monitoring and Execution:

Once configured, Cloud Composer handles the execution of these workflows. It provides monitoring and logging capabilities to track task completion, errors, and overall workflow performance.

What is Google Cloud Data Fusion?

Google Cloud Data Fusion is a managed service that helps organizations integrate data without needing extensive coding. It provides a graphical interface to design, deploy, and manage ETL (Extract, Transform, Load) pipelines. These pipelines move and transform data across Google Cloud services efficiently.

Example Use Case:

Imagine you have customer data stored in a MySQL database and want to move it to Google BigQuery for analysis. With Data Fusion, you can extract the data, transform it to fit BigQuery's format, and then load it into BigQuery seamlessly.

Using Data Fusion:

1. Creating an Instance: Start by creating an instance in Data Fusion, which provides an environment to work on your data integration tasks.
2. Using the Interface: Data Fusion offers tools like "Wrangler" for transforming data without coding. You can upload data from sources like Google Cloud Storage, apply transformations (e.g., masking sensitive information), and prepare it for loading into BigQuery.
3. Deploying Pipelines: Once transformations are defined, you can create and deploy ETL pipelines to automate these processes. Data Fusion handles the execution and scaling of these pipelines.

What is Pub/Sub?
Google Cloud Pub/Sub is a messaging service designed for asynchronous
communication between different parts of an application or between services. It
enables scalable and decoupled communication through a publish-subscribe model.
Imagine it like a bulletin board where friends share updates: anyone interested can
read messages without waiting for direct communication.

How Pub/Sub Works

In Pub/Sub, messages are published to topics by publishers (like applications or services) without knowing who subscribes or how many subscribers there are. Subscribers express interest by creating subscriptions to topics, receiving all messages sent to that topic. This setup allows for efficient, real-time communication across systems.

Publishers: Send messages to topics without needing to know about subscribers.

Subscribers: Receive and process messages from topics they subscribe to.

Topics: Named resources where messages are sent by publishers and received by
subscribers.

Subscriptions: Linked to topics, subscriptions receive messages from topics and have
unique identifiers.
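To make the publisher side concrete, here is a minimal sketch using the google-cloud-pubsub Python client. The project ID and topic name are hypothetical, and the topic is assumed to already exist.

    from google.cloud import pubsub_v1

    # Hypothetical project and topic names; the topic must already exist.
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path("my-project", "news-updates")

    # Payloads are bytes; attributes (such as a category) are optional
    # strings that subscribers can use to filter or route messages.
    future = publisher.publish(topic_path, b"Local team wins the cup!",
                               category="sports")
    print(f"Published message ID: {future.result()}")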

Pub/Sub is ideal for:

Event-Driven Systems: Like microservices, where services publish events and others
react.

Real-Time Data Processing: Ingesting high volumes of data from diverse sources for
analysis or storage.

Messaging Applications: Where users or systems publish to their own topics and
others subscribe to receive messages.

1. Creating Topics and Subscriptions:
   o Create a topic for "news updates".
   o Establish subscriptions like "sports", "tech", and "general news" to receive specific updates.
2. Configuring Subscriptions:
   o Choose between push (real-time delivery to an endpoint) or pull (subscribers fetch messages when ready).
   o Specify endpoints like URLs for push subscriptions to receive messages immediately.
3. Writing Subscriber Scripts:
   o Develop scripts for each subscription (e.g., sports, tech, general) to process and respond to news updates (see the sketch after this list).
4. Running the System:
   o Execute scripts to see how messages flow from publishers (sending news updates to the "news updates" topic) to subscribers (receiving and processing updates based on their interests).
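Here is a minimal sketch of a subscriber script for step 3, using the google-cloud-pubsub Python client with a pull subscription. The project and subscription names are hypothetical; the "sports" subscription is assumed to already exist on the "news updates" topic (step 1).

    from concurrent.futures import TimeoutError

    from google.cloud import pubsub_v1

    # Hypothetical names; the pull subscription must already exist.
    subscriber = pubsub_v1.SubscriberClient()
    subscription_path = subscriber.subscription_path("my-project", "sports")

    def callback(message: pubsub_v1.subscriber.message.Message) -> None:
        # Process the update, then acknowledge it so Pub/Sub
        # stops redelivering the message.
        print(f"Sports update: {message.data.decode('utf-8')}")
        message.ack()

    # Pull delivery: the client opens a streaming pull and invokes the
    # callback for every message that arrives on the subscription.
    streaming_pull_future = subscriber.subscribe(subscription_path,
                                                 callback=callback)

    with subscriber:
        try:
            # Block for a while; a real service would run indefinitely.
            streaming_pull_future.result(timeout=30)
        except TimeoutError:
            streaming_pull_future.cancel()
            streaming_pull_future.result()  # wait for the shutdown to finish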
