0% found this document useful (0 votes)

33 views27 pages

Professional-Machine-Learning-Engineer

The document provides a series of questions and answers related to building and managing machine learning operations (MLOps) on Google Cloud, focusing on best practices for storing artifacts, managing permissions, training models, and deploying them. Key topics include using Vertex AI for managing ML workflows, leveraging AutoML Tables for regression tasks, and optimizing hyperparameters for model performance. The answers emphasize the importance of using appropriate Google Cloud services and roles to ensure efficient and effective machine learning processes.

Uploaded by

asif ali

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

33 views27 pages

Professional-Machine-Learning-Engineer

Uploaded by

asif ali

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 27

Free Exam/Cram Practice Materials - Best Exam Practice Materials

IT Certification Guaranteed, The Easy Way!

NO.1 You are building a MLOps platform to automate your company's ML experiments and model
retraining. You need to organize the artifacts for dozens of pipelines How should you store the
pipelines' artifacts'?
A. Store parameters in Cloud SQL and store the models' source code and binaries in GitHub
B. Store parameters in Cloud SQL store the models' source code in GitHub, and store the models'
binaries in Cloud Storage.
C. Store parameters in Vertex ML Metadata store the models' source code in GitHub and store the
models' binaries in Cloud Storage.
D. Store parameters in Vertex ML Metadata and store the models source code and binaries in
GitHub.
Answer: C
Explanation:
To organize the artifacts for dozens of pipelines, you should store the parameters in Vertex ML
Metadata, store the models' source code in GitHub, and store the models' binaries in Cloud Storage.
This option has the following advantages:
* Vertex ML Metadata is a service that helps you track and manage the metadata of your ML
workflows, such as datasets, models, metrics, and parameters1. It can also help you with data
lineage, model versioning, and model performance monitoring2.
* GitHub is a popular platform for hosting and collaborating on code repositories. It can help you
manage the source code of your models, as well as the configuration files, scripts, and notebooks
that are part of your ML pipelines3.
* Cloud Storage is a scalable and durable object storage service that can store any type of data,
including model binaries4. It can also integrate with other services, such as Vertex AI, Cloud
Functions, and Cloud Run, to enable easy deployment and serving of your models5.
References:
* 1: Introduction to Vertex ML Metadata | Vertex AI | Google Cloud
* 2: Manage metadata for ML workflows | Vertex AI | Google Cloud
* 3: GitHub - Where the world builds software
* 4: Cloud Storage | Google Cloud
* 5: Deploying models | Vertex AI | Google Cloud

NO.2 You recently created a new Google Cloud Project After testing that you can submit a Vertex Al
Pipeline job from the Cloud Shell, you want to use a Vertex Al Workbench user-managed notebook
instance to run your code from that instance You created the instance and ran the code but this time
the job fails with an insufficient permissions error. What should you do?
A. Ensure that the Workbench instance that you created is in the same region of the Vertex Al
Pipelines resources you will use.
B. Ensure that the Vertex Al Workbench instance is on the same subnetwork of the Vertex Al Pipeline
resources that you will use.
C. Ensure that the Vertex Al Workbench instance is assigned the Identity and Access Management
(1AM) Vertex Al User rote.
D. Ensure that the Vertex Al Workbench instance is assigned the Identity and Access Management
(1AM) Notebooks Runner role.
Answer: C
Explanation:

2 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 1
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

Vertex AI Workbench is an integrated development environment (IDE) that allows you to create and
run Jupyter notebooks on Google Cloud. Vertex AI Pipelines is a service that allows you to create and
manage machine learning workflows using Vertex AI components. To submit a Vertex AI Pipeline job
from a Vertex AI Workbench instance, you need to have the appropriate permissions to access the
Vertex AI resources. The Identity and Access Management (IAM) Vertex AI User role is a predefined
role that grants the minimum permissions required to use Vertex AI services, such as creating and
deploying models, endpoints, and pipelines. By assigning the Vertex AI User role to the Vertex AI
Workbench instance, you can ensure that the instance has sufficient permissions to submit a Vertex
AI Pipeline job. You can assign the role to the instance by using the Cloud Console, the gcloud
command-line tool, or the Cloud IAM API. References: The answer can be verified from official
Google Cloud documentation and resources related to Vertex AI Workbench, Vertex AI Pipelines, and
IAM.
* Vertex AI Workbench | Google Cloud
* Vertex AI Pipelines | Google Cloud
* Vertex AI roles | Google Cloud
* Granting, changing, and revoking access to resources | Google Cloud

NO.3 You need to train a regression model based on a dataset containing 50,000 records that is
stored in BigQuery.
The data includes a total of 20 categorical and numerical features with a target variable that can
include negative values. You need to minimize effort and training time while maximizing model
performance. What approach should you take to train this regression model?
A. Create a custom TensorFlow DNN model.
B. Use BQML XGBoost regression to train the model
C. Use AutoML Tables to train the model without early stopping.
D. Use AutoML Tables to train the model with RMSLE as the optimization objective
Answer: D
Explanation:
AutoML Tables is a service that allows you to automatically build, analyze, and deploy machine
learning models on tabular data. It is suitable for large-scale regression and classification problems,
and it supports various optimization objectives, data splitting methods, and hyperparameter tuning
algorithms. AutoML Tables can handle both categorical and numerical features, and it can also handle
missing values and outliers.
AutoML Tables is a good choice for this problem because it minimizes the effort and training time
required to train a regression model, while maximizing the model performance.
RMSLE stands for Root Mean Squared Logarithmic Error, and it is a metric that measures the average
difference between the logarithm of the predicted values and the logarithm of the actual values.
RMSLE is useful for regression problems where the target variable can include negative values, and
where large differences between small values are more important than large differences between
large values. For example, RMSLE penalizes underestimating a value of 10 by 2 more than
overestimating a value of 1000 by
20. RMSLE is a good optimization objective for this problem because it can handle negative values in
the target variable, and it can reduce the impact of outliers and large errors.
For more information about AutoML Tables and RMSLE, see the following references:
* AutoML Tables: end-to-end workflows on AI Platform Pipelines
* Predict workload failures before they happen with AutoML Tables

3 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 2
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

* How to Calculate RMSE in R

NO.4 You are tasked with building an MLOps pipeline to retrain tree-based models in production.
The pipeline will include components related to data ingestion, data processing, model training,
model evaluation, and model deployment. Your organization primarily uses PySpark-based workloads
for data preprocessing. You want to minimize infrastructure management effort. How should you set
up the pipeline?
A. Set up a TensorFlow Extended (TFX) pipeline on Vertex Al Pipelines to orchestrate the MLOps
pipeline. Write a custom component for the PySpark-based workloads on Dataproc.
B. Set up a Vertex Al Pipelines to orchestrate the MLOps pipeline. Use the predefined Dataproc
component for the PySpark-based workloads.
C. Set up Cloud Composer to orchestrate the MLOps pipeline. Use Dataproc workflow templates for
the PySpark-based workloads in Cloud Composer.
D. Set up Kubeflow Pipelines on Google Kubernetes Engine to orchestrate the MLOps pipeline. Write
a custom component for the PySpark-based workloads on Dataproc.
Answer: A

NO.5 You are a data scientist at an industrial equipment manufacturing company. You are
developing a regression model to estimate the power consumption in the company's manufacturing
plants based on sensor data collected from all of the plants. The sensors collect tens of millions of
records every day. You need to schedule daily training runs for your model that use all the data
collected up to the current date. You want your model to scale smoothly and require minimal
development work. What should you do?
A. Develop a custom TensorFlow regression model, and optimize it using Vertex Al Training.
B. Develop a regression model using BigQuery ML.
C. Develop a custom scikit-learn regression model, and optimize it using Vertex Al Training
D. Develop a custom PyTorch regression model, and optimize it using Vertex Al Training
Answer: B
Explanation:
BigQuery ML is a powerful tool that allows you to build and deploy machine learning models directly
within BigQuery, Google's fully-managed, serverless data warehouse. It allows you to create
regression models using SQL, which is a familiar and easy-to-use language for many data scientists. It
also allows you to scale smoothly and require minimal development work since you don't have to
worry about cluster management and it's fully-managed by Google.
BigQuery ML also allows you to run your training on the same data where it's stored, this will
minimize data movement, and thus minimize cost and time.
References:
* BigQuery ML
* BigQuery ML for regression
* BigQuery ML for scalability

NO.6 You have recently created a proof-of-concept (POC) deep learning model. You are satisfied
with the overall architecture, but you need to determine the value for a couple of hyperparameters.
You want to perform hyperparameter tuning on Vertex AI to determine both the appropriate
embedding dimension for a categorical feature used by your model and the optimal learning rate.

4 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 3
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

You configure the following settings:

For the embedding dimension, you set the type to INTEGER with a minValue of 16 and maxValue of
64.
For the learning rate, you set the type to DOUBLE with a minValue of 10e-05 and maxValue of 10e-
02.
You are using the default Bayesian optimization tuning algorithm, and you want to maximize model
accuracy.
Training time is not a concern. How should you set the hyperparameter scaling for each
hyperparameter and the maxParallelTrials?
A. Use UNIT_LINEAR_SCALE for the embedding dimension, UNIT_LOG_SCALE for the learning rate,
and a large number of parallel trials.
B. Use UNIT_LINEAR_SCALE for the embedding dimension, UNIT_LOG_SCALE for the learning rate,
and a small number of parallel trials.
C. Use UNIT_LOG_SCALE for the embedding dimension, UNIT_LINEAR_SCALE for the learning rate,
and a large number of parallel trials.
D. Use UNIT_LOG_SCALE for the embedding dimension, UNIT_LINEAR_SCALE for the learning rate,
and a small number of parallel trials.
Answer: A
Explanation:
The best option for performing hyperparameter tuning on Vertex AI to determine the appropriate
embedding dimension and the optimal learning rate is to use UNIT_LINEAR_SCALE for the embedding
dimension, UNIT_LOG_SCALE for the learning rate, and a large number of parallel trials. This option
has the following advantages:
* It matches the appropriate scaling type for each hyperparameter, based on their range and
distribution.
The embedding dimension is an integer hyperparameter that varies linearly between 16 and 64, so
using UNIT_LINEAR_SCALE makes sense. The learning rate is a double hyperparameter that varies
exponentially between 10e-05 and 10e-02, so using UNIT_LOG_SCALE is more suitable.
* It maximizes the exploration of the hyperparameter space, by using a large number of parallel
trials.
Since training time is not a concern, using more trials can help find the best combination of
hyperparameters that maximizes model accuracy. The default Bayesian optimization tuning algorithm
can efficiently sample the hyperparameter space and converge to the optimal values.
The other options are less optimal for the following reasons:
* Option B: Using UNIT_LINEAR_SCALE for the embedding dimension, UNIT_LOG_SCALE for the
learning rate, and a small number of parallel trials, reduces the exploration of the hyperparameter
space, by using a small number of parallel trials. Since training time is not a concern, using fewer
trials can miss some potentially good combinations of hyperparameters that maximize model
accuracy. The default Bayesian optimization tuning algorithm can benefit from more trials to sample
the hyperparameter space and converge to the optimal values.
* Option C: Using UNIT_LOG_SCALE for the embedding dimension, UNIT_LINEAR_SCALE for the
learning rate, and a large number of parallel trials, mismatches the appropriate scaling type for each
hyperparameter, based on their range and distribution. The embedding dimension is an integer
hyperparameter that varies linearly between 16 and 64, so using UNIT_LOG_SCALE is not suitable.
The learning rate is a double hyperparameter that varies exponentially between 10e-05 and 10e-02,

5 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 4
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

so using UNIT_LINEAR_SCALE makes less sense.

* Option D: Using UNIT_LOG_SCALE for the embedding dimension, UNIT_LINEAR_SCALE for the
learning rate, and a small number of parallel trials, combines the drawbacks of option B and option C.
It mismatches the appropriate scaling type for each hyperparameter, based on their range and
distribution, and reduces the exploration of the hyperparameter space, by using a small number of
parallel trials.
References:
* [Vertex AI: Hyperparameter tuning overview]
* [Vertex AI: Configuring the hyperparameter tuning job]

NO.7 You are creating a model training pipeline to predict sentiment scores from text-based product
reviews. You want to have control over how the model parameters are tuned, and you will deploy the
model to an endpoint after it has been trained You will use Vertex Al Pipelines to run the pipeline You
need to decide which Google Cloud pipeline components to use What components should you
choose?
A.

Answer: A
Explanation:
According to the web search results, Vertex AI Pipelines is a serverless orchestrator for running ML
pipelines, using either the KFP SDK or TFX1. Vertex AI Pipelines provides a set of prebuilt components
that can be used to perform common ML tasks, such as training, evaluation, deployment, and more2.
Vertex AI ModelEvaluationOp and ModelDeployOp are two such components that can be used to
evaluate and deploy a model to an endpoint for online inference3. However, Vertex AI Pipelines does
not provide a prebuilt component for hyperparameter tuning. Therefore, to have control over how
the model parameters are tuned, you need to use a custom component that calls the Vertex AI
HyperparameterTuningJob service4. Therefore, option A is the best way to decide which Google
Cloud pipeline components to use for the given use case, as it includes a custom component for
hyperparameter tuning, and prebuilt components for model evaluation and deployment. The other
options are not relevant or optimal for this scenario. References:
* Vertex AI Pipelines
* Google Cloud Pipeline Components
* Vertex AI ModelEvaluationOp and ModelDeployOp
* Vertex AI HyperparameterTuningJob
* Google Professional Machine Learning Certification Exam 2023
* Latest Google Professional Machine Learning Engineer Actual Free Exam Questions

NO.8 You recently trained an XGBoost model on tabular data You plan to expose the model for

6 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 5
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

internal use as an HTTP microservice After deployment you expect a small number of incoming
requests. You want to productionize the model with the least amount of effort and latency. What
should you do?
A. Deploy the model to BigQuery ML by using CREATE model with the BOOSTED-THREE- REGRESSOR
statement and invoke the BigQuery API from the microservice.
B. Build a Flask-based app Package the app in a custom container on Vertex Al and deploy it to Vertex
Al Endpoints.
C. Build a Flask-based app Package the app in a Docker image and deploy it to Google Kubernetes
Engine in Autopilot mode.
D. Use a prebuilt XGBoost Vertex container to create a model and deploy it to Vertex Al Endpoints.
Answer: D
Explanation:
XGBoost is a popular open-source library that provides a scalable and efficient implementation of
gradient boosted trees. You can use XGBoost to train a classification or regression model on tabular
data. You can also use Vertex AI to productionize the model and expose it for internal use as an HTTP
microservice. Vertex AI is a service that allows you to create and train ML models using Google Cloud
technologies. You can use a prebuilt XGBoost Vertex container to create a model and deploy it to
Vertex AI Endpoints. A prebuilt Vertex container is a container image that contains the dependencies
and libraries needed to run a specific ML framework, such as XGBoost. You can use a prebuilt Vertex
container to simplify the model creation and deployment process, without having to build your own
custom container. Vertex AI Endpoints is a service that allows you to serve your ML models online
and scale them automatically. You can use Vertex AI Endpoints to deploy the model from the prebuilt
Vertex container and expose it as an HTTP microservice.
You can also configure the endpoint to handle a small number of incoming requests, and optimize
the latency and cost of serving the model. By using a prebuilt XGBoost Vertex container and Vertex AI
Endpoints, you can productionize the model with the least amount of effort and latency. References:
* XGBoost documentation
* Vertex AI documentation
* Prebuilt Vertex container documentation
* Vertex AI Endpoints documentation
* Preparing for Google Cloud Certification: Machine Learning Engineer Professional Certificate

NO.9 You are training an ML model on a large dataset. You are using a TPU to accelerate the training
process You notice that the training process is taking longer than expected. You discover that the TPU
is not reaching its full capacity. What should you do?
A. Increase the learning rate
B. Increase the number of epochs
C. Decrease the learning rate
D. Increase the batch size
Answer: D
Explanation:
The best option for training an ML model on a large dataset, using a TPU to accelerate the training
process, and discovering that the TPU is not reaching its full capacity, is to increase the batch size.
This option allows you to leverage the power and simplicity of TPUs to train your model faster and
more efficiently. A TPU is a custom-developed application-specific integrated circuit (ASIC) that can

7 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 6
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

accelerate machine learning workloads. A TPU can provide high performance and scalability for
various types of models, such as linear regression, logistic regression, k-means clustering, matrix
factorization, and deep neural networks. A TPU can also support various tools and frameworks, such
as TensorFlow, PyTorch, and JAX. A batch size is a parameter that specifies the number of training
examples in one forward/backward pass. A batch size can affect the speed and accuracy of the
training process. A larger batch size can help you utilize the parallel processing power of the TPU, and
reduce the communication overhead between the TPU and the host CPU. A larger batch size can also
help you avoid overfitting, as it can reduce the variance of the gradient updates. By increasing the
batch size, you can train your model on a large dataset faster and more efficiently, and make full use
of the TPU capacity1.
The other options are not as good as option D, for the following reasons:
* Option A: Increasing the learning rate would not help you utilize the parallel processing power of
the TPU, and could cause errors or poor performance. A learning rate is a parameter that controls
how much the model is updated in each iteration. A learning rate can affect the speed and accuracy
of the training process. A larger learning rate can help you converge faster, but it can also cause
instability, divergence, or oscillation. By increasing the learning rate, you may not be able to find the
optimal solution, and your model may perform poorly on the validation or test data2.
* Option B: Increasing the number of epochs would not help you utilize the parallel processing power
of the TPU, and could increase the complexity and cost of the training process. An epoch is a measure
of the number of times all of the training examples are used once in the training process. An epoch
can affect the speed and accuracy of the training process. A larger number of epochs can help you
learn more from the data, but it can also cause overfitting, underfitting, or diminishing returns. By
increasing the number of epochs, you may not be able to improve the model performance
significantly, and your training process may take longer and consume more resources3.
* Option C: Decreasing the learning rate would not help you utilize the parallel processing power of
the TPU, and could slow down the training process. A learning rate is a parameter that controls how
much the model is updated in each iteration. A learning rate can affect the speed and accuracy of the
training process. A smaller learning rate can help you find a more precise solution, but it can also
cause slow convergence or local minima. By decreasing the learning rate, you may not be able to
reach the optimal solution in a reasonable time, and your training process may take longer2.
References:
* Preparing for Google Cloud Certification: Machine Learning Engineer, Course 2: ML Models and
Architectures, Week 1: Introduction to ML Models and Architectures
* Google Cloud Professional Machine Learning Engineer Exam Guide, Section 2: Architecting ML
solutions, 2.1 Designing ML models
* Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 4: ML
Models and Architectures, Section 4.1: Designing ML Models
* Use TPUs
* Triose phosphate utilization and beyond: from photosynthesis to end ...
* Cloud TPU performance guide
* Google TPU: Architecture and Performance Best Practices - Run

NO.10 Your data science team has requested a system that supports scheduled model retraining,
Docker containers, and a service that supports autoscaling and monitoring for online prediction
requests. Which platform components should you choose for this system?
A. Vertex AI Pipelines and App Engine

8 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 7
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

B. Vertex AI Pipelines, Vertex AI Prediction, and Vertex AI Model Monitoring

C. Cloud Composer, BigQuery ML, and Vertex AI Prediction
D. Cloud Composer, Vertex AI Training with custom containers, and App Engine
Answer: B
Explanation:
* Option A is incorrect because Vertex AI Pipelines and App Engine do not meet all the requirements
of the system. Vertex AI Pipelines is a service that allows you to create, run, and manage ML
workflows using TensorFlow Extended (TFX) components or custom components1. App Engine is a
service that allows you to build and deploy scalable web applications using standard or flexible
environments2. However, App Engine does not support Docker containers in the standard
environment, and does not provide a dedicated service for online prediction and monitoring of ML
models3.
* Option B is correct because Vertex AI Pipelines, Vertex AI Prediction, and Vertex AI Model
Monitoring meet all the requirements of the system. Vertex AI Prediction is a service that allows you
to deploy and serve ML models for online or batch prediction, with support for autoscaling and
custom containers4. Vertex AI Model Monitoring is a service that allows you to monitor the
performance and fairness of your deployed models, and get alerts for any issues or anomalies5.
* Option C is incorrect because Cloud Composer, BigQuery ML, and Vertex AI Prediction do not meet
all the requirements of the system. Cloud Composer is a service that allows you to create, schedule,
and manage workflows using Apache Airflow. BigQuery ML is a service that allows you to create and
use ML models within BigQuery using SQL queries. However, BigQuery ML does not support custom
containers, and Vertex AI Prediction does not support scheduled model retraining or model
monitoring.
* Option D is incorrect because Cloud Composer, Vertex AI Training with custom containers, and App
Engine do not meet all the requirements of the system. Vertex AI Training is a service that allows you
to train ML models using built-in algorithms or custom containers. However, Vertex AI Training does
not support online prediction or model monitoring, and App Engine does not support Docker
containers in the standard environment or online prediction and monitoring of ML models3.
References:
* Vertex AI Pipelines overview
* App Engine overview
* Choosing an App Engine environment
* Vertex AI Prediction overview
* Vertex AI Model Monitoring overview
* [Cloud Composer overview]
* [BigQuery ML overview]
* [BigQuery ML limitations]
* [Vertex AI Training overview]

NO.11 You are an ML engineer at a global shoe store. You manage the ML models for the company's
website. You are asked to build a model that will recommend new products to the user based on
their purchase behavior and similarity with other users. What should you do?
A. Build a classification model
B. Build a knowledge-based filtering model
C. Build a collaborative-based filtering model

9 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 8
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

D. Build a regression model using the features as predictors

Answer: C
Explanation:
A recommender system is a type of machine learning system that suggests relevant items to users
based on their preferences and behavior. Recommender systems are widely used in e-commerce,
media, and entertainment industries to enhance user experience and increase revenue1 There are
different types of recommender systems that use different filtering methods to generate
recommendations. The most common types are:
* Content-based filtering: This method uses the features of the items and the users to find the
similarity between them. For example, a content-based recommender system for movies may use the
genre, director, cast, and ratings of the movies, and the preferences, demographics, and history of
the users, to recommend movies that are similar to the ones the user liked before2
* Collaborative filtering: This method uses the feedback and ratings of the users to find the similarity
between them and the items. For example, a collaborative filtering recommender system for books
may use the ratings of the users for different books, and recommend books that are liked by other
users who have similar ratings to the target user3
* Hybrid method: This method combines content-based and collaborative filtering methods to
overcome the limitations of each method and improve the accuracy and diversity of the
recommendations. For example, a hybrid recommender system for music may use both the features
of the songs and the artists, and the ratings and listening habits of the users, to recommend songs
that match the user's taste and preferences4
* Deep learning-based: This method uses deep neural networks to learn complex and non-linear
patterns from the data and generate recommendations. Deep learning-based recommender systems
can handle large-scale and high-dimensional data, and incorporate various types of information, such
as text, images, audio, and video. For example, a deep learning-based recommender system for
fashion may use the images and descriptions of the products, and the profiles and feedback of the
users, to recommend products that suit the user's style and preferences.
For the use case of building a model that will recommend new products to the user based on their
purchase behavior and similarity with other users, the best option is to build a collaborative-based
filtering model. This is because collaborative filtering can leverage the implicit feedback and ratings of
the users to find the items that are most likely to interest them. Collaborative filtering can also help
discover new products that the user may not be aware of, and increase the diversity and serendipity
of the recommendations3 The other options are not as suitable for this use case. Building a
classification model or a regression model using the features as predictors is not a good idea, as these
models are not designed for recommendation tasks, and may not capture the preferences and
behavior of the users. Building a knowledge-based filtering model is not relevant, as this method uses
the explicit knowledge and requirements of the users to find the items that meet their criteria, and
does not rely on the purchase behavior or similarity with other users.
References: 1: Recommender system 2: Content-based filtering 3: Collaborative filtering 4: Hybrid
recommender system : [Deep learning for recommender systems] : [Knowledge-based recommender
system]

NO.12 You work for a magazine publisher and have been tasked with predicting whether customers
will cancel their annual subscription. In your exploratory data analysis, you find that 90% of
individuals renew their subscription every year, and only 10% of individuals cancel their subscription.
After training a NN Classifier, your model predicts those who cancel their subscription with 99%

10 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 9
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

accuracy and predicts those who renew their subscription with 82% accuracy. How should you
interpret these results?
A. This is not a good result because the model should have a higher accuracy for those who renew
their subscription than for those who cancel their subscription.
B. This is not a good result because the model is performing worse than predicting that people will
always renew their subscription.
C. This is a good result because predicting those who cancel their subscription is more difficult, since
there is less data for this group.
D. This is a good result because the accuracy across both groups is greater than 80%.
Answer: B
Explanation:
This is not a good result because the model is performing worse than predicting that people will
always renew their subscription. This option has the following reasons:
* It indicates that the model is not learning from the data, but rather memorizing the majority class.
Since
90% of the individuals renew their subscription every year, the model can achieve a 90% accuracy by
simply predicting that everyone will renew their subscription, without considering the features or the
patterns in the data. However, the model's accuracy for predicting those who renew their
subscription is only 82%, which is lower than the baseline accuracy of 90%. This suggests that the
model is overfitting to the minority class (those who cancel their subscription), and underfitting to
the majority class (those who renew their subscription).
* It implies that the model is not useful for the business problem, as it cannot identify the customers
who are at risk of churning. The goal of predicting whether customers will cancel their annual
subscription is to prevent customer churn and increase customer retention. However, the model's
accuracy for predicting those who cancel their subscription is 99%, which is too high and unrealistic,
as it means that the model can almost perfectly identify the customers who will churn, without any
false positives or false negatives. This may indicate that the model is cheating or exploiting some
leakage in the data, such as a feature that reveals the outcome of the prediction. Moreover, the
model's accuracy for predicting those who renew their subscription is 82%, which is too low and
unreliable, as it means that the model can miss many customers who will churn, and falsely label
them as renewing customers. This can lead to losing customers and revenue, and failing to take
proactive actions to retain them.
References:
* How to Evaluate Machine Learning Models: Classification Metrics | Machine Learning Mastery
* Imbalanced Classification: Predicting Subscription Churn | Machine Learning Mastery

NO.13 You work on a team that builds state-of-the-art deep learning models by using the
TensorFlow framework.
Your team runs multiple ML experiments each week which makes it difficult to track the experiment
runs.
You want a simple approach to effectively track, visualize and debug ML experiment runs on Google
Cloud while minimizing any overhead code. How should you proceed?
A. Set up Vertex Al Experiments to track metrics and parameters Configure Vertex Al TensorBoard for
visualization.
B. Set up a Cloud Function to write and save metrics files to a Cloud Storage Bucket Configure a

11 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 10
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

Google Cloud VM to host TensorBoard locally for visualization.

C. Set up a Vertex Al Workbench notebook instance Use the instance to save metrics data in a Cloud
Storage bucket and to host TensorBoard locally for visualization.
D. Set up a Cloud Function to write and save metrics files to a BigQuery table. Configure a Google
Cloud VM to host TensorBoard locally for visualization.
Answer: A
Explanation:
Vertex AI Experiments is a service that allows you to track, compare, and optimize your ML
experiments on Google Cloud. You can use Vertex AI Experiments to log metrics and parameters from
your TensorFlow models, and then visualize them in Vertex AI TensorBoard. Vertex AI TensorBoard is
a managed service that provides a web interface for viewing and debugging your ML experiments.
You can use Vertex AI TensorBoard to compare different runs, inspect model graphs, analyze scalars,
histograms, images, and more.
By using Vertex AI Experiments and Vertex AI TensorBoard, you can simplify your ML experiment
tracking and visualization workflow, and avoid the overhead of setting up and maintaining your own
Cloud Functions, Cloud Storage buckets, or VMs. References:
* [Vertex AI Experiments documentation]
* [Vertex AI TensorBoard documentation]
* Preparing for Google Cloud Certification: Machine Learning Engineer Professional Certificate

NO.14 You are an AI architect at a popular photo-sharing social media platform. Your organization's
content moderation team currently scans images uploaded by users and removes explicit images
manually. You want to implement an AI service to automatically prevent users from uploading
explicit images. What should you do?
A. Develop a custom TensorFlow model in a Vertex AI Workbench instance. Train the model on a
dataset of manually labeled images. Deploy the model to a Vertex AI endpoint. Run periodic batch
inference to identify inappropriate uploads and report them to the content moderation team.
B. Train an image clustering model using TensorFlow in a Vertex AI Workbench instance. Deploy this
model to a Vertex AI endpoint and configure it for online inference. Run this model each time a new
image is uploaded to identify and block inappropriate uploads.
C. Create a dataset using manually labeled images. Ingest this dataset into AutoML. Train an image
classification model and deploy it to a Vertex AI endpoint. Integrate this endpoint with the image
upload process to identify and block inappropriate uploads. Monitor predictions and periodically
retrain the model.
D. Send a copy of every user-uploaded image to a Cloud Storage bucket. Configure a Cloud Run
function that triggers the Cloud Vision API to detect explicit content each time a new image is
uploaded. Report the classifications to the content moderation team for review.
Answer: D
Explanation:
Cloud Vision API offers pre-trained models specialized in identifying explicit or inappropriate content.
By sending a copy of each image to a Cloud Storage bucket and triggering Cloud Vision through Cloud
Run, the detection of explicit content is automated with minimal development time. Vertex AI
custom models require more training data and infrastructure management, while AutoML-based
solutions add more complexity.
Cloud Vision's existing capabilities meet the requirement effectively and are highly scalable for real-

12 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 11
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

time moderation needs.

NO.15 You work for a manufacturing company. You need to train a custom image classification
model to detect product defects at the end of an assembly line Although your model is performing
well some images in your holdout set are consistently mislabeled with high confidence.
You want to use Vertex Al to understand your model's results.
What should you do?
A.

Answer: C
Explanation:
Vertex Explainable AI is a set of tools and frameworks to help you understand and interpret
predictions made by your machine learning models, natively integrated with a number of Google's
products and services1. With Vertex Explainable AI, you can generate feature-based explanations
that show how much each input feature contributed to the model's prediction2. This can help you
debug and improve your model performance, and build confidence in your model's behavior.
Feature-based explanations are supported for custom image classification models deployed on
Vertex AI Prediction3. References:
* Explainable AI | Google Cloud
* Introduction to Vertex Explainable AI | Vertex AI | Google Cloud
* Supported model types for feature-based explanations | Vertex AI | Google Cloud

NO.16 You have recently trained a scikit-learn model that you plan to deploy on Vertex Al. This
model will support both online and batch prediction. You need to preprocess input data for model
inference. You want to package the model for deployment while minimizing additional code What
should you do?
A. 1 Upload your model to the Vertex Al Model Registry by using a prebuilt scikit-learn prediction
container
2 Deploy your model to Vertex Al Endpoints, and create a Vertex Al batch prediction job that uses the
instanceConfig.inscanceType setting to transform your input data
B. 1 Wrap your model in a custom prediction routine (CPR). and build a container image from the
CPR local model
2 Upload your sci-kit learn model container to Vertex Al Model Registry
3 Deploy your model to Vertex Al Endpoints, and create a Vertex Al batch prediction job
C. 1. Create a custom container for your sci-kit learn model,
2 Define a custom serving function for your model

13 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 12
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

3 Upload your model and custom container to Vertex Al Model Registry

4 Deploy your model to Vertex Al Endpoints, and create a Vertex Al batch prediction job
D. 1 Create a custom container for your sci-kit learn model.
2 Upload your model and custom container to Vertex Al Model Registry
3 Deploy your model to Vertex Al Endpoints, and create a Vertex Al batch prediction job that uses the
instanceConfig. instanceType setting to transform your input data
Answer: B
Explanation:
The best option for deploying a scikit-learn model on Vertex AI with minimal additional code is to
wrap the model in a custom prediction routine (CPR) and build a container image from the CPR local
model. Upload your scikit-learn model container to Vertex AI Model Registry. Deploy your model to
Vertex AI Endpoints, and create a Vertex AI batch prediction job. This option allows you to leverage
the power and simplicity of Google Cloud to deploy and serve a scikit-learn model that supports both
online and batch prediction. Vertex AI is a unified platform for building and deploying machine
learning solutions on Google Cloud. Vertex AI can deploy a trained scikit-learn model to an online
prediction endpoint, which can provide low-latency predictions for individual instances. Vertex AI can
also create a batch prediction job, which can provide high- throughput predictions for a large batch of
instances. A custom prediction routine (CPR) is a Python script that defines the logic for
preprocessing the input data, running the prediction, and postprocessing the output data. A CPR can
help you customize the prediction behavior of your model, and handle complex or non- standard data
formats. A CPR can also help you minimize the additional code, as you only need to write a few
functions to implement the prediction logic. A container image is a package that contains the model,
the CPR, and the dependencies. A container image can help you standardize and simplify the
deployment process, as you only need to upload the container image to Vertex AI Model Registry,
and deploy it to Vertex AI Endpoints. By wrapping the model in a CPR and building a container image
from the CPR local model, uploading the scikit-learn model container to Vertex AI Model Registry,
deploying the model to Vertex AI Endpoints, and creating a Vertex AI batch prediction job, you can
deploy a scikit-learn model on Vertex AI with minimal additional code1.
The other options are not as good as option B, for the following reasons:
* Option A: Uploading your model to the Vertex AI Model Registry by using a prebuilt scikit-learn
prediction container, deploying your model to Vertex AI Endpoints, and creating a Vertex AI batch
prediction job that uses the instanceConfig.instanceType setting to transform your input data would
not allow you to preprocess the input data for model inference, and could cause errors or poor
performance.
A prebuilt scikit-learn prediction container is a container image that is provided by Google Cloud, and
contains the scikit-learn framework and the dependencies. A prebuilt scikit-learn prediction container
can help you deploy a scikit-learn model without writing any code, but it also limits your
customization options. A prebuilt scikit-learn prediction container can only handle standard data
formats, such as JSON or CSV, and cannot perform any preprocessing or postprocessing on the input
or output data. If your input data requires any transformation or normalization before running the
prediction, you cannot use a prebuilt scikit-learn prediction container. The
instanceConfig.instanceType setting is a parameter that determines the machine type and the
accelerator type for the batch prediction job. The instanceConfig.instanceType setting can help you
optimize the performance and the cost of the batch prediction job, but it cannot help you transform
your input data2.
* Option C: Creating a custom container for your scikit-learn model, defining a custom serving

14 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 13
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

function for your model, uploading your model and custom container to Vertex AI Model Registry,
and deploying your model to Vertex AI Endpoints, and creating a Vertex AI batch prediction job
would require more skills and steps than using a CPR and a container image. A custom container is a
container image that contains the model, the dependencies, and a web server. A custom container
can help you customize the prediction behavior of your model, and handle complex or non-standard
data formats. A custom serving function is a Python function that defines the logic for running the
prediction on the model. A custom serving function can help you implement the prediction logic of
your model, and handle complex or non-standard data formats. However, creating a custom
container and defining a custom serving function would require more skills and steps than using a
CPR and a container image.
You would need to write code, build and test the container image, configure the web server, and
implement the prediction logic. Moreover, creating a custom container and defining a custom serving
function would not allow you to preprocess the input data for model inference, as the custom serving
function only runs the prediction on the model3.
* Option D: Creating a custom container for your scikit-learn model, uploading your model and
custom container to Vertex AI Model Registry, deploying your model to Vertex AI Endpoints, and
creating a Vertex AI batch prediction job that uses the instanceConfig.instanceType setting to
transform your input data would not allow you to preprocess the input data for model inference, and
could cause errors or poor performance. A custom container is a container image that contains the
model, the dependencies, and a web server. A custom container can help you customize the
prediction behavior of your model, and handle complex or non-standard data formats. However,
creating a custom container would require more skills and steps than using a CPR and a container
image. You would need to write code, build and test the container image, and configure the web
server. The instanceConfig.
instanceType setting is a parameter that determines the machine type and the accelerator type for
the batch prediction job. The instanceConfig.instanceType setting can help you optimize the
performance and the cost of the batch prediction job, but it cannot help you transform your input
data23.
References:
* Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3: Production ML
Systems, Week 2: Serving ML Predictions
* Google Cloud Professional Machine Learning Engineer Exam Guide, Section 3: Scaling ML models in
production, 3.1 Deploying ML models to production
* Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 6:
Production ML Systems, Section 6.2: Serving ML Predictions
* Custom prediction routines
* Using pre-built containers for prediction
* Using custom containers for prediction

NO.17 You are deploying a new version of a model to a production Vertex Al endpoint that is serving
traffic You plan to direct all user traffic to the new model You need to deploy the model with minimal
disruption to your application What should you do?
A. 1 Create a new endpoint.
2 Create a new model Set it as the default version Upload the model to Vertex Al Model Registry.
3. Deploy the new model to the new endpoint.
4 Update Cloud DNS to point to the new endpoint

15 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 14
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

B. 1. Create a new endpoint.

2. Create a new model Set the parentModel parameter to the model ID of the currently deployed
model and set it as the default version Upload the model to Vertex Al Model Registry
3. Deploy the new model to the new endpoint and set the new model to 100% of the traffic
C. 1 Create a new model Set the parentModel parameter to the model ID of the currently deployed
model Upload the model to Vertex Al Model Registry.
2 Deploy the new model to the existing endpoint and set the new model to 100% of the traffic.
D. 1, Create a new model Set it as the default version Upload the model to Vertex Al Model Registry
2 Deploy the new model to the existing endpoint
Answer: C
Explanation:
The best option for deploying a new version of a model to a production Vertex AI endpoint that is
serving traffic, directing all user traffic to the new model, and deploying the model with minimal
disruption to your application, is to create a new model, set the parentModel parameter to the
model ID of the currently deployed model, upload the model to Vertex AI Model Registry, deploy the
new model to the existing endpoint, and set the new model to 100% of the traffic. This option allows
you to leverage the power and simplicity of Vertex AI to update your model version and serve online
predictions with low latency. Vertex AI is a unified platform for building and deploying machine
learning solutions on Google Cloud. Vertex AI can deploy a trained model to an online prediction
endpoint, which can provide low-latency predictions for individual instances. A model is a resource
that represents a machine learning model that you can use for prediction. A model can have one or
more versions, which are different implementations of the same model. A model version can have
different parameters, code, or data than another version of the same model. A model version can
help you experiment and iterate on your model, and improve the model performance and accuracy.
A parentModel parameter is a parameter that specifies the model ID of the model that the new
model version is based on. A parentModel parameter can help you inherit the settings and metadata
of the existing model, and avoid duplicating the model configuration. Vertex AI Model Registry is a
service that can store and manage your machine learning models on Google Cloud. Vertex AI Model
Registry can help you upload and organize your models, and track the model versions and metadata.
An endpoint is a resource that provides the service endpoint (URL) you use to request the prediction.
An endpoint can have one or more deployed models, which are instances of model versions that are
associated with physical resources. A deployed model can help you serve online predictions with low
latency, and scale up or down based on the traffic. By creating a new model, setting the parentModel
parameter to the model ID of the currently deployed model, uploading the model to Vertex AI Model
Registry, deploying the new model to the existing endpoint, and setting the new model to
100% of the traffic, you can deploy a new version of a model to a production Vertex AI endpoint that
is serving traffic, direct all user traffic to the new model, and deploy the model with minimal
disruption to your application1.
The other options are not as good as option C, for the following reasons:
* Option A: Creating a new endpoint, creating a new model, setting it as the default version,
uploading the model to Vertex AI Model Registry, deploying the new model to the new endpoint, and
updating Cloud DNS to point to the new endpoint would require more skills and steps than creating a
new model, setting the parentModel parameter to the model ID of the currently deployed model,
uploading the model to Vertex AI Model Registry, deploying the new model to the existing endpoint,
and setting the new model to 100% of the traffic. Cloud DNS is a service that can provide reliable and
scalable Domain Name System (DNS) services on Google Cloud. Cloud DNS can help you manage your

16 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 15
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

DNS records, and resolve domain names to IP addresses. By updating Cloud DNS to point to the new
endpoint, you can redirect the user traffic to the new endpoint, and avoid breaking the existing
application. However, creating a new endpoint, creating a new model, setting it as the default
version, uploading the model to Vertex AI Model Registry, deploying the new model to the new
endpoint, and updating Cloud DNS to point to the new endpoint would require more skills and steps
than creating a new model, setting the parentModel parameter to the model ID of the currently
deployed model, uploading the model to Vertex AI Model Registry, deploying the new model to the
existing endpoint, and setting the new model to 100% of the traffic. You would need to write code,
create and configure the new endpoint, create and configure the new model, upload the model to
Vertex AI Model Registry, deploy the model to the new endpoint, and update Cloud DNS to point to
the new endpoint. Moreover, this option would create a new endpoint, which can increase the
maintenance and management costs2.
* Option B: Creating a new endpoint, creating a new model, setting the parentModel parameter to
the model ID of the currently deployed model and setting it as the default version, uploading the
model to Vertex AI Model Registry, and deploying the new model to the new endpoint and setting
the new model to 100% of the traffic would require more skills and steps than creating a new model,
setting the parentModel parameter to the model ID of the currently deployed model, uploading the
model to Vertex AI Model Registry, deploying the new model to the existing endpoint, and setting the
new model to
100% of the traffic. A parentModel parameter is a parameter that specifies the model ID of the model
that the new model version is based on. A parentModel parameter can help you inherit the settings
and metadata of the existing model, and avoid duplicating the model configuration. A default version
is a model version that is used for prediction when no other version is specified. A default version can
help you simplify the prediction request, and avoid specifying the model version every time. By
setting the parentModel parameter to the model ID of the currently deployed model and setting it as
the default version, you can create a new model that is based on the existing model, and use it for
prediction without specifying the model version. However, creating a new endpoint, creating a new
model, setting the parentModel parameter to the model ID of the currently deployed model and
setting it as the default version, uploading the model to Vertex AI Model Registry, and deploying the
new model to the new endpoint and setting the new model to 100% of the traffic would require
more skills and steps than creating a new model, setting the parentModel parameter to the model ID
of the currently deployed model, uploading the model to Vertex AI Model Registry, deploying the
new model to the existing endpoint, and setting the new model to 100% of the traffic. You would
need to write code, create and configure the new endpoint, create and configure the new model,
upload the model to Vertex AI Model Registry, and deploy the model to the new endpoint. Moreover,
this option would create a new endpoint, which can increase the maintenance and management
costs2.
* Option D: Creating a new model, setting it as the default version, uploading the model to Vertex AI
Model Registry, and deploying the new model to the existing endpoint would not allow you to inherit
the settings and metadata of the existing model, and could cause errors or poor performance. A
default version is a model version that is used for prediction when no other version is specified. A
default version can help you simplify the prediction request, and avoid specifying the model version
every time. By setting the new model as the default version, you can use the new model for
prediction without specifying the model version. However, creating a new model, setting it as the
default version, uploading the model to Vertex AI Model Registry, and deploying the new model to
the existing endpoint would not allow you to inherit the settings and metadata of the existing model,

17 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 16
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

and could cause errors or poor performance. You would need to write code, create and configure the
new model, upload the model to Vertex AI Model Registry, and deploy the model to the existing
endpoint. Moreover, this option would not set the parentModel parameter to the model ID of the
currently deployed model, which could prevent you from inheriting the settings and metadata of the
existing model, and cause inconsistencies or conflicts between the model versions2.
References:
* Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3: Production ML
Systems, Week 2: Serving ML Predictions
* Google Cloud Professional Machine Learning Engineer Exam Guide, Section 3: Scaling ML models in
production, 3.1 Deploying ML models to production
* Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 6:
Production ML Systems, Section 6.2: Serving ML Predictions
* Vertex AI
* Cloud DNS

NO.18 You have been given a dataset with sales predictions based on your company's marketing
activities. The data is structured and stored in BigQuery, and has been carefully managed by a team
of data analysts. You need to prepare a report providing insights into the predictive capabilities of the
data. You were asked to run several ML models with different levels of sophistication, including
simple models and multilayered neural networks.
You only have a few hours to gather the results of your experiments. Which Google Cloud tools
should you use to complete this task in the most efficient and self-serviced way?
A. Use BigQuery ML to run several regression models, and analyze their performance.
B. Read the data from BigQuery using Dataproc, and run several models using SparkML.
C. Use Vertex AI Workbench user-managed notebooks with scikit-learn code for a variety of ML
algorithms and performance metrics.
D. Train a custom TensorFlow model with Vertex AI, reading the data from BigQuery featuring a
variety of ML algorithms.
Answer: A
Explanation:
* Option A is correct because using BigQuery ML to run several regression models, and analyze their
performance is the most efficient and self-serviced way to complete the task. BigQuery ML is a
service that allows you to create and use ML models within BigQuery using SQL queries1. You can use
BigQuery ML to run different types of regression models, such as linear regression, logistic regression,
or DNN regression2. You can also use BigQuery ML to analyze the performance of your models, such
as the mean squared error, the accuracy, or the ROC curve3. BigQuery ML is fast, scalable, and easy
to use, as it does not require any data movement, coding, or additional tools4.
* Option B is incorrect because reading the data from BigQuery using Dataproc, and running several
models using SparkML is not the most efficient and self-serviced way to complete the task. Dataproc
is a service that allows you to create and manage clusters of virtual machines that run Apache Spark
and other open-source tools5. SparkML is a library that provides ML algorithms and utilities for Spark.
However, this option requires more effort and resources than option A, as it involves moving the data
from BigQuery to Dataproc, creating and configuring the clusters, writing and running the SparkML
code, and analyzing the results.
* Option C is incorrect because using Vertex AI Workbench user-managed notebooks with scikit-learn
code for a variety of ML algorithms and performance metrics is not the most efficient and self-

18 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 17
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

serviced way to complete the task. Vertex AI Workbench is a service that allows you to create and
use notebooks for ML development and experimentation. Scikit-learn is a library that provides ML
algorithms and utilities for Python. However, this option also requires more effort and resources than
option A, as it involves creating and managing the notebooks, writing and running the scikit-learn
code, and analyzing the results.
* Option D is incorrect because training a custom TensorFlow model with Vertex AI, reading the data
from BigQuery featuring a variety of ML algorithms is not the most efficient and self-serviced way to
complete the task. TensorFlow is a framework that allows you to create and train ML models using
Python or other languages. Vertex AI is a service that allows you to train and deploy ML models using
built-in algorithms or custom containers. However, this option also requires more effort and
resources than option A, as it involves writing and running the TensorFlow code, creating and
managing the training jobs, and analyzing the results.
References:
* BigQuery ML overview
* Creating a model in BigQuery ML
* Evaluating a model in BigQuery ML
* BigQuery ML benefits
* Dataproc overview
* [SparkML overview]
* [Vertex AI Workbench overview]
* [Scikit-learn overview]
* [TensorFlow overview]
* [Vertex AI overview]

NO.19 You work for a magazine distributor and need to build a model that predicts which customers
will renew their subscriptions for the upcoming year. Using your company's historical data as your
training set, you created a TensorFlow model and deployed it to AI Platform. You need to determine
which customer attribute has the most predictive power for each prediction served by the model.
What should you do?
A. Use AI Platform notebooks to perform a Lasso regression analysis on your model, which will
eliminate features that do not provide a strong signal.
B. Stream prediction results to BigQuery. Use BigQuery's CORR(X1, X2) function to calculate the
Pearson correlation coefficient between each feature and the target variable.
C. Use the AI Explanations feature on AI Platform. Submit each prediction request with the 'explain'
keyword to retrieve feature attributions using the sampled Shapley method.
D. Use the What-If tool in Google Cloud to determine how your model will perform when individual
features are excluded. Rank the feature importance in order of those that caused the most significant
performance drop when removed from the model.
Answer: C
Explanation:
* Option A is incorrect because using AI Platform notebooks to perform a Lasso regression analysis on
your model, which will eliminate features that do not provide a strong signal, is not a suitable way to
determine which customer attribute has the most predictive power for each prediction served by the
model. Lasso regression is a method of feature selection that applies a penalty to the coefficients of
the linear model, and shrinks them to zero for irrelevant features1. However, this method assumes

19 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 18
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

that the model is linear and additive, which may not be the case for a TensorFlow model. Moreover,
this method does not provide feature attributions for each prediction, but rather for the entire
dataset.
* Option B is incorrect because streaming prediction results to BigQuery, and using BigQuery's CORR
(X1, X2) function to calculate the Pearson correlation coefficient between each feature and the target
variable, is not a valid way to determine which customer attribute has the most predictive power for
each prediction served by the model. The Pearson correlation coefficient is a measure of the linear
relationship between two variables, ranging from -1 to 12. However, this method does not account
for the interactions between features or the non-linearity of the model. Moreover, this method does
not provide feature attributions for each prediction, but rather for the entire dataset.
* Option C is correct because using the AI Explanations feature on AI Platform, and submitting each
prediction request with the 'explain' keyword to retrieve feature attributions using the sampled
Shapley method, is the best way to determine which customer attribute has the most predictive
power for each prediction served by the model. AI Explanations is a service that allows you to get
feature attributions for your deployed models on AI Platform3. Feature attributions are values that
indicate how much each feature contributed to the prediction for a given instance4. The sampled
Shapley method is a technique that uses the Shapley value, a game-theoretic concept, to measure
the contribution of each feature to the prediction5. By using AI Explanations, you can get feature
attributions for each prediction request, and identify the most important features for each customer.
* Option D is incorrect because using the What-If tool in Google Cloud to determine how your model
will perform when individual features are excluded, and ranking the feature importance in order of
those that caused the most significant performance drop when removed from the model, is not a
practical way to determine which customer attribute has the most predictive power for each
prediction served by the model. The What-If tool is a tool that allows you to visualize and analyze
your ML models and datasets. However, this method requires manually editing or removing features
for each instance, and observing the change in the prediction. This method is not scalable or efficient,
and may not capture the interactions between features or the non-linearity of the model.
References:
* Lasso regression
* Pearson correlation coefficient
* AI Explanations overview
* Feature attributions
* Sampled Shapley method
* [What-If tool overview]

NO.20 You are working on a binary classification ML algorithm that detects whether an image of a
classified scanned document contains a company's logo. In the dataset, 96% of examples don't have
the logo, so the dataset is very skewed. Which metrics would give you the most confidence in your
model?
A. F-score where recall is weighed more than precision
B. RMSE
C. F1 score
D. F-score where precision is weighed more than recall
Answer: A
Explanation:
* Option A is correct because using F-score where recall is weighed more than precision is a suitable

20 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 19
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

metric for binary classification with imbalanced data. F-score is a harmonic mean of precision and
recall, which are two metrics that measure the accuracy and completeness of the positive class1.
Precision is the fraction of true positives among all predicted positives, while recall is the fraction of
true positives among all actual positives1. When the data is imbalanced, the positive class is the
minority class, which is usually the class of interest. For example, in this case, the positive class is the
images that contain the company's logo, which are rare but important to detect. By weighing recall
more than precision, we can emphasize the importance of finding all the positive examples, even if
some false positives are included2.
* Option B is incorrect because using RMSE (root mean squared error) is not a valid metric for binary
classification with imbalanced data. RMSE is a metric that measures the average magnitude of the
errors between the predicted and actual values3. RMSE is suitable for regression problems, where
the target variable is continuous, not for classification problems, where the target variable is
discrete4.
* Option C is incorrect because using F1 score is not the best metric for binary classification with
imbalanced data. F1 score is a special case of F-score where precision and recall are equally
weighted1. F1 score is suitable for balanced data, where the positive and negative classes are equally
important and frequent5. However, for imbalanced data, the positive class is more important and
less frequent than the negative class, so F1 score may not reflect the performance of the model
well2.
* Option D is incorrect because using F-score where precision is weighed more than recall is not a
good metric for binary classification with imbalanced data. By weighing precision more than recall,
we can emphasize the importance of minimizing the false positives, even if some true positives are
missed2. However, for imbalanced data, the true positives are more important and less frequent
than the false positives, so this metric may not reflect the performance of the model well2.
References:
* Precision, recall, and F-measure
* F-score for imbalanced data
* RMSE
* Regression vs classification
* F1 score
* [Imbalanced classification]
* [Binary classification]

NO.21 You have trained a deep neural network model on Google Cloud. The model has low loss on
the training data, but is performing worse on the validation data. You want the model to be resilient
to overfitting. Which strategy should you use when retraining the model?
A. Apply a dropout parameter of 0 2, and decrease the learning rate by a factor of 10
B. Apply a L2 regularization parameter of 0.4, and decrease the learning rate by a factor of 10.
C. Run a hyperparameter tuning job on Al Platform to optimize for the L2 regularization and dropout
parameters
D. Run a hyperparameter tuning job on Al Platform to optimize for the learning rate, and increase the
number of neurons by a factor of 2.
Answer: C
Explanation:
Overfitting occurs when a model tries to fit the training data so closely that it does not generalize
well to new data. Overfitting can be caused by having a model that is too complex for the data, such

21 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 20
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

as having too many parameters or layers. Overfitting can lead to poor performance on the validation
data, which reflects how the model will perform on unseen data1 To prevent overfitting, one strategy
is to use regularization techniques that penalize the complexity of the model and encourage it to
learn simpler patterns. Two common regularization techniques for deep neural networks are L2
regularization and dropout. L2 regularization adds a term to the loss function that is proportional to
the squared magnitude of the model's weights. This term penalizes large weights and encourages the
model to use smaller weights. Dropout randomly drops out some units in the network during
training, which prevents co-adaptation of features and reduces the effective number of parameters.
Both L2 regularization and dropout have hyperparameters that control the strength of the
regularization effect23 Another strategy to prevent overfitting is to use hyperparameter tuning,
which is the process of finding the optimal values for the parameters of the model that affect its
performance. Hyperparameter tuning can help find the best combination of hyperparameters that
minimize the validation loss and improve the generalization ability of the model. AI Platform provides
a service for hyperparameter tuning that can run multiple trials in parallel and use different search
algorithms to find the best solution.
Therefore, the best strategy to use when retraining the model is to run a hyperparameter tuning job
on AI Platform to optimize for the L2 regularization and dropout parameters. This will allow the
model to find the optimal balance between fitting the training data and generalizing to new data. The
other options are not as effective, as they either use fixed values for the regularization parameters,
which may not be optimal, or they do not address the issue of overfitting at all.
References: 1: Generalization: Peril of Overfitting 2: Regularization for Deep Learning 3: Dropout: A
Simple Way to Prevent Neural Networks from Overfitting : [Hyperparameter tuning overview]

NO.22 You need to use TensorFlow to train an image classification model. Your dataset is located in
a Cloud Storage directory and contains millions of labeled images Before training the model, you
need to prepare the data.
You want the data preprocessing and model training workflow to be as efficient scalable, and low
maintenance as possible. What should you do?
A. 1 Create a Dataflow job that creates sharded TFRecord files in a Cloud Storage directory.
2 Reference tf .data.TFRecordDataset in the training script.
3. Train the model by using Vertex Al Training with a V100 GPU.
B. 1 Create a Dataflow job that moves the images into multiple Cloud Storage directories, where
each directory is named according to the corresponding label.
2 Reference tfds.fclder_da-asst.imageFclder in the training script.
3. Train the model by using Vertex AI Training with a V100 GPU.
C. 1 Create a Jupyter notebook that uses an n1-standard-64, V100 GPU Vertex Al Workbench
instance.
2 Write a Python script that creates sharded TFRecord files in a directory inside the instance
3. Reference tf. da-a.TFRecrrdDataset in the training script.
4. Train the model by using the Workbench instance.
D. 1 Create a Jupyter notebook that uses an n1-standard-64, V100 GPU Vertex Al Workbench
instance.
2 Write a Python scnpt that copies the images into multiple Cloud Storage directories, where each
directory is named according to the corresponding label.
3 Reference tf ds. f older_dataset. imageFolder in the training script.
4. Train the model by using the Workbench instance.

22 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 21
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

Answer: A
Explanation:
TFRecord is a binary file format that stores your data as a sequence of binary strings1. TFRecord files
are efficient, scalable, and easy to process1. Sharding is a technique that splits a large file into smaller
files, which can improve parallelism and performance2. Dataflow is a service that allows you to
create and run data processing pipelines on Google Cloud3. Dataflow can create sharded TFRecord
files from your images in a Cloud Storage directory4.
tf.data.TFRecordDataset is a class that allows you to read and parse TFRecord files in TensorFlow. You
can use this class to create a tf.data.Dataset object that represents your input data for training.
tf.data.Dataset is a high-level API that provides various methods to transform, batch, shuffle, and
prefetch your data.
Vertex AI Training is a service that allows you to train your custom models on Google Cloud using
various hardware accelerators, such as GPUs. Vertex AI Training supports TensorFlow models and can
read data from Cloud Storage. You can use Vertex AI Training to train your image classification model
by using a V100 GPU, which is a powerful and fast GPU for deep learning.
References:
* TFRecord and tf.Example | TensorFlow Core
* Sharding | TensorFlow Core
* Dataflow | Google Cloud
* Creating sharded TFRecord files | Google Cloud
* [tf.data.TFRecordDataset | TensorFlow Core v2.6.0]
* [tf.data: Build TensorFlow input pipelines | TensorFlow Core]
* [Vertex AI Training | Google Cloud]
* [NVIDIA Tesla V100 GPU | NVIDIA]

NO.23 You developed an ML model with Al Platform, and you want to move it to production. You
serve a few thousand queries per second and are experiencing latency issues. Incoming requests are
served by a load balancer that distributes them across multiple Kubeflow CPU-only pods running on
Google Kubernetes Engine (GKE). Your goal is to improve the serving latency without changing the
underlying infrastructure.
What should you do?
A. Significantly increase the max_batch_size TensorFlow Serving parameter
B. Switch to the tensorflow-model-server-universal version of TensorFlow Serving
C. Significantly increase the max_enqueued_batches TensorFlow Serving parameter
D. Recompile TensorFlow Serving using the source to support CPU-specific optimizations Instruct GKE
to choose an appropriate baseline minimum CPU platform for serving nodes
Answer: D
Explanation:
TensorFlow Serving is a service that allows you to deploy and serve TensorFlow models in a scalable
and efficient way. TensorFlow Serving supports various platforms and hardware, such as CPU, GPU,
and TPU.
However, the default TensorFlow Serving binaries are built with generic CPU instructions, which may
not leverage the full potential of the CPU architecture. To improve the serving latency and
performance, you can recompile TensorFlow Serving using the source code and enable CPU-specific
optimizations, such as AVX, AVX2, and FMA1. These optimizations can speed up the computation and

23 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 22
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

inference of the TensorFlow models, especially for deep neural networks.

Google Kubernetes Engine (GKE) is a service that allows you to run and manage containerized
applications on Google Cloud using Kubernetes. GKE supports various types and sizes of nodes, which
are the virtual machines that run the containers. GKE also supports different CPU platforms, which
are the generations and models of the CPUs that power the nodes. GKE allows you to choose a
baseline minimum CPU platform for your node pool, which is a group of nodes with the same
configuration. By choosing a baseline minimum CPU platform, you can ensure that your nodes have
the CPU features and capabilities that match your workload requirements2.
For the use case of serving a few thousand queries per second and experiencing latency issues, the
best option is to recompile TensorFlow Serving using the source to support CPU-specific
optimizations, and instruct GKE to choose an appropriate baseline minimum CPU platform for serving
nodes. This option can improve the serving latency and performance without changing the underlying
infrastructure, as it only involves rebuilding the TensorFlow Serving binary and selecting the CPU
platform for the GKE nodes. This option can also take advantage of the CPU-only pods that are
running on GKE, as it can optimize the CPU utilization and efficiency. Therefore, recompiling
TensorFlow Serving using the source to support CPU-specific optimizations and instructing GKE to
choose an appropriate baseline minimum CPU platform for serving nodes is the best option for this
use case.
References:
* Building TensorFlow Serving from source
* Specifying a minimum CPU platform for a node pool

NO.24 You need to build classification workflows over several structured datasets currently stored in
BigQuery.
Because you will be performing the classification several times, you want to complete the following
steps without writing code: exploratory data analysis, feature selection, model building, training, and
hyperparameter tuning and serving. What should you do?
A. Configure AutoML Tables to perform the classification task
B. Run a BigQuery ML task to perform logistic regression for the classification
C. Use Al Platform Notebooks to run the classification model with pandas library
D. Use Al Platform to run the classification model job configured for hyperparameter tuning
Answer: A
Explanation:
AutoML Tables is a service that allows you to automatically build and deploy state-of-the-art machine
learning models on structured data without writing code. You can use AutoML Tables to perform the
following steps for the classification task:
* Exploratory data analysis: AutoML Tables provides a graphical user interface (GUI) and a command-
line interface (CLI) to explore your data, visualize statistics, and identify potential issues.
* Feature selection: AutoML Tables automatically selects the most relevant features for your model
based on the data schema and the target column. You can also manually exclude or include features,
or create new features from existing ones using feature engineering.
* Model building: AutoML Tables automatically builds and evaluates multiple machine learning
models using different algorithms and architectures. You can also specify the optimization objective,
the budget, and the evaluation metric for your model.
* Training and hyperparameter tuning: AutoML Tables automatically trains and tunes your model
using the best practices and techniques from Google's research and engineering teams. You can

24 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 23
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

monitor the training progress and the performance of your model on the GUI or the CLI.
* Serving: AutoML Tables automatically deploys your model to a fully managed, scalable, and secure
environment. You can use the GUI or the CLI to request predictions from your model, either online
(synchronously) or offline (asynchronously).
References:
* [AutoML Tables documentation]
* [AutoML Tables overview]
* [AutoML Tables how-to guides]

NO.25 You work for an online retailer. Your company has a few thousand short lifecycle products.
Your company has five years of sales data stored in BigQuery. You have been asked to build a model
that will make monthly sales predictions for each product. You want to use a solution that can be
implemented quickly with minimal effort. What should you do?
A. Use Prophet on Vertex Al Training to build a custom model.
B. Use Vertex Al Forecast to build a NN-based model.
C. Use BigQuery ML to build a statistical AR1MA_PLUS model.
D. Use TensorFlow on Vertex Al Training to build a custom model.
Answer: C
Explanation:
According to the web search results, BigQuery ML1 is a service that allows you to create and execute
machine learning models in BigQuery using SQL queries. BigQuery ML supports various types of
models, such as linear regression, logistic regression, k-means clustering, matrix factorization, deep
neural networks, and time series forecasting1. ARIMA_PLUS2 is a statistical model for time series
forecasting that is built in to BigQuery ML. ARIMA_PLUS stands for AutoRegressive Integrated Moving
Average with eXogenous regressors. ARIMA_PLUS models the relationship between a target variable
and its past values, as well as other external factors that might influence the target variable.
ARIMA_PLUS can handle multiple time series, seasonality, holidays, and missing values2. Therefore,
option C is the best way to use a solution that can be implemented quickly with minimal effort for the
given use case, as it allows you to use SQL queries to build and run a forecasting model in BigQuery
without moving the data or writing custom code. The other options are not relevant or optimal for
this scenario. References:
* BigQuery ML
* ARIMA_PLUS
* Google Professional Machine Learning Certification Exam 2023
* Latest Google Professional Machine Learning Engineer Actual Free Exam Questions

NO.26 You work for a retail company that is using a regression model built with BigQuery ML to
predict product sales. This model is being used to serve online predictions Recently you developed a
new version of the model that uses a different architecture (custom model) Initial analysis revealed
that both models are performing as expected You want to deploy the new version of the model to
production and monitor the performance over the next two months You need to minimize the impact
to the existing and future model users How should you deploy the model?
A. Import the new model to the same Vertex Al Model Registry as a different version of the existing
model. Deploy the new model to the same Vertex Al endpoint as the existing model, and use traffic
splitting to route 95% of production traffic to the BigQuery ML model and 5% of production traffic to

25 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 24
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

the new model.

B. Import the new model to the same Vertex Al Model Registry as the existing model Deploy the
models to one Vertex Al endpoint Route 95% of production traffic to the BigQuery ML model and 5%
of production traffic to the new model
C. Import the new model to the same Vertex Al Model Registry as the existing model Deploy each
model to a separate Vertex Al endpoint.
D. Deploy the new model to a separate Vertex Al endpoint Create a Cloud Run service that routes the
prediction requests to the corresponding endpoints based on the input feature values.
Answer: A
Explanation:
Vertex AI Model Registry is a central repository where you can manage the lifecycle of your ML
models1. You can import models from various sources, such as BigQuery ML, AutoML, or custom
models, and assign them to different versions and aliases1. You can also deploy models to endpoints,
which are resources that provide a service URL for online prediction2.
By importing the new model to the same Vertex AI Model Registry as a different version of the
existing model, you can keep track of the model versions and compare their performance metrics1.
You can also use aliases to label the model versions according to their readiness for production, such
as default or staging1.
By deploying the new model to the same Vertex AI endpoint as the existing model, you can use traffic
splitting to gradually shift the production traffic from the old model to the new model2. Traffic
splitting is a feature that allows you to specify the percentage of prediction requests that each
deployed model in an endpoint should handle2. This way, you can minimize the impact to the
existing and future model users, and monitor the performance of the new model over time2.
The other options are not suitable for your scenario, because they either require creating a separate
endpoint or a Cloud Run service, which would increase the complexity and maintenance of your
deployment, or they do not allow you to use traffic splitting, which would create a sudden change in
your prediction results.
References:
* Introduction to Vertex AI Model Registry | Google Cloud
* Deploy a model to an endpoint | Vertex AI | Google Cloud

NO.27 While running a model training pipeline on Vertex Al, you discover that the evaluation step is
failing because of an out-of-memory error. You are currently using TensorFlow Model Analysis
(TFMA) with a standard Evaluator TensorFlow Extended (TFX) pipeline component for the evaluation
step. You want to stabilize the pipeline without downgrading the evaluation quality while minimizing
infrastructure overhead. What should you do?
A. Add tfma.MetricsSpec () to limit the number of metrics in the evaluation step.
B. Migrate your pipeline to Kubeflow hosted on Google Kubernetes Engine, and specify the
appropriate node parameters for the evaluation step.
C. Include the flag -runner=DataflowRunner in beam_pipeline_args to run the evaluation step on
Dataflow.
D. Move the evaluation step out of your pipeline and run it on custom Compute Engine VMs with
sufficient memory.
Answer: C
Explanation:

26 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 25
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

The best option to stabilize the pipeline without downgrading the evaluation quality while minimizing
infrastructure overhead is to use Dataflow as the runner for the evaluation step. Dataflow is a fully
managed service for executing Apache Beam pipelines that can scale up and down according to the
workload. Dataflow can handle large-scale, distributed data processing tasks such as model
evaluation, and it can also integrate with Vertex AI Pipelines and TensorFlow Extended (TFX). By using
the flag - runner=DataflowRunner in beam_pipeline_args, you can instruct the Evaluator component
to run the evaluation step on Dataflow, instead of using the default DirectRunner, which runs locally
and may cause out- of-memory errors. Option A is incorrect because adding tfma.MetricsSpec() to
limit the number of metrics in the evaluation step may downgrade the evaluation quality, as some
important metrics may be omitted.
Moreover, reducing the number of metrics may not solve the out-of-memory error, as the evaluation
step may still consume a lot of memory depending on the size and complexity of the data and the
model. Option B is incorrect because migrating the pipeline to Kubeflow hosted on Google
Kubernetes Engine (GKE) may increase the infrastructure overhead, as you need to provision,
manage, and monitor the GKE cluster yourself.
Moreover, you need to specify the appropriate node parameters for the evaluation step, which may
require trial and error to find the optimal configuration. Option D is incorrect because moving the
evaluation step out of the pipeline and running it on custom Compute Engine VMs may also increase
the infrastructure overhead, as you need to create, configure, and delete the VMs yourself.
Moreover, you need to ensure that the VMs have sufficient memory for the evaluation step, which
may require trial and error to find the optimal machine type. References:
* Dataflow documentation
* Using DataflowRunner
* Evaluator component documentation
* Configuring the Evaluator component

NO.28 You are training a custom language model for your company using a large dataset. You plan
to use the ReductionServer strategy on Vertex Al. You need to configure the worker pools of the
distributed training job. What should you do?
A. Configure the machines of the first two worker pools to have GPUs and to use a container image
where your training code runs Configure the third worker pool to have GPUs: and use the reduction
server container image.
B. Configure the machines of the first two worker pools to have GPUs and to use a container image
where your training code runs. Configure the third worker pool to use the reductionserver container
image without accelerators, and choose a machine type that prioritizes bandwidth.
C. Configure the machines of the first two worker pools to have TPUs and to use a container image
where your training code runs Configure the third worker pool without accelerators, and use the
reductionserver container image without accelerators and choose a machine type that prioritizes
bandwidth.
D. Configure the machines of the first two pools to have TPUs. and to use a container image where
your training code runs Configure the third pool to have TPUs: and use the reductionserver container
image.
Answer: B
Explanation:
According to the web search results, Reduction Server is a faster GPU all-reduce algorithm developed

27 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 26
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html
Free Exam/Cram Practice Materials - Best Exam Practice Materials
IT Certification Guaranteed, The Easy Way!

at Google that uses a dedicated set of reducers to aggregate gradients from workers12. Reducers are
lightweight CPU VM instances that are significantly cheaper than GPU VMs2. Therefore, the third
worker pool should not have any accelerators, and should use a machine type that has high network
bandwidth to optimize the communication between workers and reducers2. TPUs are not supported
by Reduction Server, so the first two worker pools should have GPUs and use a container image that
contains the training code12. The reductionserver container image is provided by Google and should
be used for the third worker pool2.

NO.29 You recently trained a XGBoost model that you plan to deploy to production for online
inference Before sending a predict request to your model's binary you need to perform a simple data
preprocessing step This step exposes a REST API that accepts requests in your internal VPC Service
Controls and returns predictions You want to configure this preprocessing step while minimizing cost
and effort What should you do?
A. Store a pickled model in Cloud Storage Build a Flask-based app packages the app in a custom
container image, and deploy the model to Vertex Al Endpoints.
B. Build a Flask-based app. package the app and a pickled model in a custom container image, and
deploy the model to Vertex Al Endpoints.
C. Build a custom predictor class based on XGBoost Predictor from the Vertex Al SDK. package it and
a pickled model in a custom container image based on a Vertex built-in image, and deploy the model
to Vertex Al Endpoints.
D. Build a custom predictor class based on XGBoost Predictor from the Vertex Al SDK and package
the handler in a custom container image based on a Vertex built-in container image Store a pickled
model in Cloud Storage and deploy the model to Vertex Al Endpoints.
Answer: D
* Option A is not the best answer because it requires storing the pickled model in Cloud Storage,
which may incur additional cost and latency for loading the model. It also requires building a Flask-
based app, which may not be necessary for a simple data preprocessing step.
* Option B is not the best answer because it requires building a Flask-based app, which may not be
necessary for a simple data preprocessing step. It also requires packaging the app and the pickled
model in a custom container image, which may increase the size and complexity of the image.
* Option C is not the best answer because it requires packaging the pickled model in a custom
container image, which may increase the size and complexity of the image. It also does not leverage
the Vertex built-in container image, which may provide some optimizations and integrations for
XGBoost models.
* Option D is the best answer because it leverages the Vertex built-in container image, which may
provide some optimizations and integrations for XGBoost models. It also allows storing the pickled
model in Cloud Storage, which may reduce the size and complexity of the image. It also allows
building a custom predictor class based on XGBoost Predictor from the Vertex AI SDK, which may
simplify the data preprocessing step and the prediction logic.

28 from Freecram.net.
Get Latest & Valid Google Exam's Question and Answers 27
https://www.freecram.net/exam/Professional-Machine-Learning-Engineer-google-professional-machine-learning-engineer-
e12287.html