System Design - ML Design 1 PDF
Objectives/Goals
● Design large-scale systems end to end. A strong performance is one whose approach is replicable to many systems at
Facebook.
● Sample Questions:
○ Design a key-value store
○ Design Google search
○ Architect a world-wide video distribution system
○ Build Facebook chat
○ Google Search vs Twitter Search vs FB Search: Google’s index-building layer
has many more components for document understanding. It would need
components for extracting deep links, contact information, and referrals (for
PageRank). Twitter’s index building, on the other hand, should be simpler because
tweets are small and carry only some rich media information for attached media.
Twitter’s search is head heavy, so the bulk of the engineering effort in designing
their search should go to rapidly indexing new tweets and making them searchable.
● Expectations:
○ What we’re looking for:
■ Can you arrive at an answer in the face of unusual constraints?
■ Can you visualize the entire problem and solution space?
■ Can you make trade-offs among consistency, availability, partition tolerance,
and performance?
■ Can you give ballpark numbers on QPS supported, # of machines needed
using a modern computer?
■ How much have you thought about Facebook and some of the unique
problems we face?
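The ballpark-numbers expectation above can be practiced with a short back-of-envelope calculation. The sketch below is only an illustration: the traffic figures, the peak factor of 2, and the 1,000 QPS per machine are assumptions chosen for the example, not numbers from these notes.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate_capacity(daily_active_users, requests_per_user_per_day,
                      peak_factor=2.0, qps_per_machine=1000):
    """Return (average QPS, peak QPS, machines needed at peak).

    peak_factor and qps_per_machine are assumed values; in an interview,
    state your own assumptions out loud before computing.
    """
    avg_qps = daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY
    peak_qps = avg_qps * peak_factor
    machines = math.ceil(peak_qps / qps_per_machine)
    return avg_qps, peak_qps, machines


# Hypothetical service: 100M daily active users, 20 requests per user per day.
avg_qps, peak_qps, machines = estimate_capacity(100_000_000, 20)
print(f"avg ~{avg_qps:,.0f} QPS, peak ~{peak_qps:,.0f} QPS, ~{machines} machines")
```

A handy shortcut is that a day has roughly 10^5 seconds, so 1 million requests per day is on the order of 10 QPS.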
The CAP theorem implies that in the presence of a network partition, one has to choose between
consistency and availability.
Scalability
Reliability
● Availability is the time a system remains operational to perform its required function in a
specific period.
● Measured by the percentage of time that a system remains operational under normal
conditions.
● A reliable system is available.
● An available system is not necessarily reliable.
○ A system with a security hole is available when there is no security attack.
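Since availability is measured as a percentage of time, it helps to translate that percentage into allowed downtime. The following is a minimal sketch; the function names are made up for illustration, and the redundancy formula assumes replicas fail independently, which real systems only approximate.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_minutes_per_year(availability_percent):
    """Minutes per year a system may be down at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def redundant_availability(availability_percent, replicas):
    """Availability (in percent) of N redundant replicas.

    Assumes independent failures: the system is down only when
    every replica is down at the same time.
    """
    failure = 1 - availability_percent / 100
    return 100 * (1 - failure ** replicas)


for nines in (99.0, 99.9, 99.99):
    print(f"{nines}% -> {downtime_minutes_per_year(nines):.1f} min/year down")
print(f"two 99% replicas -> {redundant_availability(99.0, 2):.2f}% available")
```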
Efficiency
● Latency: response time, the delay to obtain the first piece of data.
● Bandwidth: throughput, amount of data delivered in a given time.
Serviceability / Manageability
- CAP Theorem:
As mentioned above
- RDBMS vs NoSQL:
- https://github.com/chagri/CP/blob/master/system_design/System_Design_Databases/All_DBs_Cloud_And_Deployment_Systems.md
- Microservices: An architecture in which an application is built from small,
independently deployable components/services that work together. Example of Instagram microservice arch:
https://www.youtube.com/watch?v=qYhRvH9tJKw
https://www.n-ix.com/microservices-vs-monolith-which-architecture-best-choice-your-business/
Serverless architecture is a way to build and run applications and services
without having to manage infrastructure: servers, software, tools, backup, and
scaling are part of the platform. Serverless does not mean that servers are no
longer involved; your application still runs on servers, but all the server
management is done by a cloud provider such as AWS.
- Load Balancer:
AWS Elastic Load Balancing automatically distributes incoming application traffic
across multiple targets, such as Amazon EC2 instances, containers, IP
addresses, and Lambda functions. It can handle the varying load of your
application traffic in a single Availability Zone or across multiple Availability
Zones. Elastic Load Balancing offers three types of load balancers that all feature
the high availability, automatic scaling, and robust security necessary to make
your applications fault-tolerant. Generally speaking, load balancers fall into three
categories:
● DNS Round Robin (rarely used): clients get a randomly-ordered list of IP
addresses.
pros: easy to implement and free
cons: hard to control and not responsive, since DNS cache needs time to
expire
● L3/L4 Load Balancer: traffic is routed by IP address and port. L3 is the network
layer (IP). L4 is the transport layer (TCP).
pros: better granularity, simple, responsive
● L7 Load Balancer: traffic is routed by what is inside the HTTP protocol. L7 is
the application layer (HTTP).
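To make the difference between the categories above concrete, here is a small sketch of two selection strategies: round robin, which ignores request content (as an L3/L4 balancer would), and path-prefix routing, which inspects the HTTP request (as an L7 balancer would). The class names, backend names, and routes are hypothetical; production balancers such as NGINX or HAProxy also handle health checks, weights, and connection draining.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order, ignoring request content."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        return next(self._cycle)


class PathRouter:
    """L7-style routing: choose a backend pool by URL path prefix."""

    def __init__(self, routes, default):
        # Check longer prefixes first so "/api/video" wins over "/api".
        self._routes = sorted(routes.items(),
                              key=lambda item: len(item[0]), reverse=True)
        self._default = default

    def pick(self, path):
        for prefix, pool in self._routes:
            if path.startswith(prefix):
                return pool.pick()
        return self._default.pick()


# Hypothetical pools: names are placeholders, not real hosts.
router = PathRouter(
    {"/api": RoundRobinBalancer(["api-1", "api-2"]),
     "/api/video": RoundRobinBalancer(["video-1"])},
    default=RoundRobinBalancer(["web-1"]),
)
print(router.pick("/api/users"), router.pick("/api/video/42"), router.pick("/"))
```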
Redis is an in-memory data structure store, used as a database, cache, and message
broker. ... While that's all that Memcached is, it's only the tip of the Redis iceberg.
Memcached is a volatile in-memory key/value store. Redis can act like one (and do that
job as well as Memcached), but it is a data structure server.
Redis can handle up to 2^32 keys and was tested in practice to handle at least 250
million keys per instance. Every hash, list, set, and sorted set can hold 2^32
elements (~4B). Redis Strings are binary safe, which means that a Redis string can
contain any kind of data, for instance, a JPEG image or a serialized Ruby object.
A String value can be at most 512 megabytes in length.
https://www.infoworld.com/article/3063161/why-redis-beats-memcached-for-caching.html
- Streaming DBs: They help distribute data between several producers and many
consumers (e.g. Mongo, MySQL, Redshift, Dynamo) easily. Here Apache Kafka serves
as a data integration message bus.
Potential tools: Apache Kafka, AWS Kinesis
Long polling and BOSH are other bidirectional options over HTTP.
- REST vs RPC/GRPC:
- REST messages typically contain JSON. gRPC, on the other hand, accepts and
returns Protobuf messages.
- gRPC uses HTTP/2, while REST typically uses HTTP/1.1. HTTP/2 can multiplex
multiple requests over a single TCP connection instead of opening a new connection
each time, so gRPC is generally faster. This is useful for FB/IG kinds of apps with
multi-service support. Other factors that HTTP/2 addresses:
- The growth of page size and number of objects per request
- Latency
- Messages vs. Resources and Verbs: gRPC comes with clear interfaces and
structured messages for requests and responses. This model translates directly
from programming concepts like interfaces, functions, methods, and data
structures. It also allows gRPC to automatically generate client libraries for you.
- Streaming vs. Request-Response: REST request-response only, gRPC
streaming as well.
- gRPC is strongly typed, i.e. more verbose to define but less bug-prone, especially
compared to JSON, where correctness depends entirely on the developer.
- More info:
https://code.tutsplus.com/tutorials/rest-vs-grpc-battle-of-the-apis--cms-30711
- Hashing Algorithms:
- Base62: uses 62 characters (26 uppercase, 26 lowercase, 10 digits). A 7-character
Base62 string gives 62^7 ~ 3.5 trillion combinations.
- MD5 Hash
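The Base62 arithmetic above can be checked with a short sketch. It assumes one common URL-shortener approach, which these notes do not spell out: each URL gets a numeric ID (for example from a counter or a database sequence) and the ID is encoded in Base62. The alphabet order is also an arbitrary choice for this example.

```python
import string

# 10 digits + 26 lowercase + 26 uppercase = 62 symbols (order is a convention).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(number):
    """Encode a non-negative integer as a Base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(text):
    """Decode a Base62 string back to the integer it represents."""
    number = 0
    for ch in text:
        number = number * BASE + ALPHABET.index(ch)
    return number


print(62 ** 7)                   # number of distinct 7-character codes
print(encode_base62(123456789))  # short code for a hypothetical numeric ID
```

Encoding a sequential ID avoids collisions entirely, whereas truncating an MD5 hash to 7 characters requires a collision check before a short code is accepted.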
- Message Queue:
A message queue is a form of asynchronous service-to-service communication used in
serverless and microservices architectures. Messages are stored on the queue until they
are processed and deleted. Each message is normally processed once, by a single
consumer (some queues, such as SQS standard queues, guarantee at-least-once delivery,
so consumers should tolerate occasional duplicates). Message queues can be used to
decouple heavyweight processing, to buffer or batch work, and to smooth spiky
workloads. E.g.: Amazon Simple Queue Service (SQS).
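The decoupling described above can be illustrated in-process with Python's standard `queue.Queue`. This is only a sketch of the pattern: a real message queue such as SQS or Kafka is a separate networked service with persistence, and the `run_pipeline` helper and the uppercase "processing" step are invented for the example.

```python
import queue
import threading


def run_pipeline(messages, num_workers=2):
    """Push messages onto a queue and let worker threads consume them.

    Each message is taken off the queue by exactly one worker.
    """
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            message = work.get()
            if message is None:        # sentinel: no more work for this worker
                work.task_done()
                break
            processed = message.upper()  # stand-in for heavyweight processing
            with results_lock:
                results.append(processed)
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for message in messages:           # the producer returns immediately
        work.put(message)
    for _ in threads:                  # one sentinel per worker
        work.put(None)
    work.join()                        # wait until every item is processed
    for thread in threads:
        thread.join()
    return results


print(sorted(run_pipeline(["resize image", "send email", "update index"])))
```

The producer never waits for processing to finish, which is how a queue smooths spiky workloads: bursts accumulate in the queue while workers drain it at their own pace.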
- Availability vs Reliability:
- Reliability:
- Reliability is the probability that a system will not fail in a given period.
- A distributed system is reliable if it keeps delivering its service even
when one or multiple components fail.
- Reliability is achieved through redundancy of components and data
(remove every single point of failure).
- Availability:
- Availability is the time a system remains operational to perform its
required function in a specific period.
- Measured by the percentage of time that a system remains
operational under normal conditions.
- A reliable system is available.
- An available system is not necessarily reliable.
- A system with a security hole is available when there is no
security attack.
- Efficiency:
- Latency: response time, the delay to obtain the first piece of data.
- Bandwidth (QPS): throughput, amount of data delivered in a given time
- Zookeeper:
- ZooKeeper is a centralized service for maintaining configuration
information, naming, providing distributed synchronization, and providing
group services. All of these kinds of services are used in some form or
another by distributed applications.
- Paxos: A consensus algorithm for agreeing on a value across distributed hosts; ZooKeeper solves a similar problem with its own protocol (Zab).
- On-device Scalability Inference, Federated Learning:
- Why: privacy, GDPR?
- Model save format for cross-platform inference:
- ONNX vs PMML: ONNX, the Open Neural Network Exchange
format, is an open format that supports the storing and porting of
predictive models across libraries and languages. ... PMML, or
Predictive Model Markup Language, is another interchange format
for predictive models. ONNX is preferred and supported for NN
frameworks (Torch, TF, etc.). PMML is more traditional for sklearn
and classical ML models.
- Tools:
- TF lite
- PyTorch Mobile: https://pytorch.org/mobile/home/
- Core ML: For Apple platforms (e.g. iOS) only
- API: Amazon Sagemaker
- Caching at ISPs (Internet Service Providers), e.g. via Netflix Open Connect, for
different regions. YouTube/Netflix cache content per region (e.g. India) so that
requests do not go to the origin (netflix.com) every time; they are served from a
cache close to the ISP, which fetches the data much faster. Popular content, such as
Bollywood movies/videos, is stored in these caches, which serve 90% of their traffic.
Synchronous vs asynchronous transmission: the major difference lies in how they
are synchronized. Synchronous transmissions are synchronized by an external clock,
whereas asynchronous transmissions are synchronized by special signals along the
transmission medium.
- CDN and Edge: A Content Delivery Network (CDN) uses dedicated servers in each
region to serve data (mostly videos, as with Netflix). Edge is similar to a CDN but
more local, with a dedicated line between the edge and the consumer that avoids the
busy public internet, so data transfer is faster than with a CDN.
- HTTPS = HTTP + TLS (Transport Layer Security): More secure with TLS
protocols
- Pub-Sub and Queue: Note that customer-facing requests through app/UI should
not be directly exposed to Pub-Sub.
- LRU Cache: LRU stands for least recently used and the idea is to remove the
least recently used data to free up space for the new data.
- Solr and Elasticsearch are built on top of Lucene: highly available, scalable, and
support full-text search.
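The LRU eviction policy described in the list above can be sketched with `collections.OrderedDict`, which remembers insertion order and can move a key to the end in constant time. This is one common way to implement it, not the only one; interview answers often use a hash map plus a doubly linked list, which is what `OrderedDict` provides internally.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # reading a key makes it most recent
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used

    def __contains__(self, key):
        return key in self._items


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now more recent than "b"
cache.put("c", 3)  # over capacity, so "b" is evicted
print("b" in cache, "a" in cache, "c" in cache)
```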
Approach
1. Questions to ask:
a. Input:
i. App type
ii. List possible actions (such as Instagram: upload, like, share, comment,
etc.)
iii. Data: kind of data, GDPR/Privacy
iv. Users: kind of users, demographics
b. Optimizing for:
i. Consistency, Availability, Performance, Partitioning?
ii. ACID (Atomicity, Consistency, Isolation, Durability) vs BASE (Basically Available,
Soft state, Eventual consistency)
c. Traffic:
i. # Users
ii. # active users
iii. QPS, Per day, per year
iv. Lifecycle?
3. Service:
a. Continuous integration and deployment: Kubernetes, Docker
b. Microservices/Monolithic
c. Synchronous, Asynchronous
d. API: REST, GRPC
e. ZooKeeper/Paxos: Managing distributed
5. Model:
a. Deployment:
i. On Device: Look at concepts
ii. API: gRPC/Protobuf/REST
b. Active Learning
6. ML Design:
a. Questions:
i. Usecase:
ii. Data: Annotation: Size
iii. Metrics
iv. Active learning
v. Scalable: training vs inference
b. Model:
i. Regression, Classification, Supervised/Unsupervised
c. Training:
i. Distributed Training: PyTorch.data-parallel, Distributed TF
Relevant Resources
- System Design Basics:
- https://github.com/chagri/grokking-system-design/tree/master/basics
- Cracking the Coding Interview
- Glossary of terms:
- https://www.youtube.com/watch?v=UzLMhqg3_Wc&list=PL73KFetZlkJSZ9vTDSJ1swZhe6CIYkqTL&index=3&t=0s
- Grokking the system design interview:
- Must Watch: https://github.com/lei-hsia/grokking-system-design
- LC:
- Design Youtube:
https://leetcode.com/discuss/interview-question/system-design/496042/Design-video-sharing-platform-like-Youtube
- CP Git:
- All DBs summary:
https://github.com/chagri/CP/blob/master/2020_practice/System_Design_Databases/All_DBs_Cloud_And_Deployment_Systems.md
- https://github.com/chagri/CP/tree/master/system_design
- Kafka Theory and basics brush up:
- https://learning.oreilly.com/videos/apache-kafka-series/9781789342604/9781789342604-video2_1
- Distributed system basics:
- https://learning.oreilly.com/videos/distributed-systems-in/9781491924914
- Cassandra:
- https://learning.oreilly.com/videos/mastering-cassandra-essentials/9781491994122
Examples
1. URL Shortening:
a. CTCI
b. https://www.youtube.com/watch?v=JQDHz72OA3c&list=PL73KFetZlkJSZ9vTDSJ1swZhe6CIYkqTL&index=26&t=180s
c. https://blog.codinghorror.com/url-shortening-hashes-in-practice/
2. Video Streaming:
a. YouTube:
i. https://leetcode.com/discuss/interview-question/system-design/496042/Design-video-sharing-platform-like-Youtube
b. Netflix:
https://www.youtube.com/watch?v=x9Hrn0oNmJM
Glossary
https://www.youtube.com/watch?v=UzLMhqg3_Wc&list=PL73KFetZlkJSZ9vTDSJ1swZhe6CIYkqTL&index=3&t=0s
Things to consider
● Features
● API
● Availability
● Latency
● Scalability
● Durability
● Class Diagram
● Security and Privacy
● Cost-effective
Concepts to know
● Vertical vs horizontal scaling
● CAP theorem
● ACID vs BASE
● Partitioning/Sharding
● Consistent Hashing
● Optimistic vs pessimistic locking
● Strong vs eventual consistency
● RelationalDB vs NoSQL
● Types of NoSQL
○ Key value
○ Wide column
○ Document-based
○ Graph-based
● Caching
● Data center/racks/hosts
● CPU/memory/Hard drives/Network bandwidth
● Random vs sequential read/writes to disk
● HTTP vs HTTP/2 vs WebSocket
● TCP/IP model
● IPv4 vs IPv6
● TCP vs UDP
● DNS lookup
● HTTP & TLS
● Public key infrastructure and certificate authority (CA)
● Symmetric vs asymmetric encryption
● Load Balancer
● CDNs & Edges
● Bloom filters and Count-Min sketch
● Paxos
● Leader election
● Design patterns and Object-oriented design
● Virtual machines and containers
● Pub-sub architecture
● MapReduce
● Multithreading, locks, synchronization, CAS(compare and set)
Tools
● Cassandra
● MongoDB/Couchbase
● MySQL
● Memcached
● Redis
● Zookeeper
● Kafka
● NGINX
● HAProxy
● Solr, Elasticsearch
● Amazon S3
● Docker, Kubernetes, Mesos
● Hadoop/Spark and HDFS
ML Design
Objective/Goals
Concepts
● Ranking:
FB example for News Feed:
For each user-post pair, obtain the predicted probability of each engagement event,
calculate a relevancy score by multiplying each event's value by its probability, and
then rank and showcase the posts by that score.
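The "values multiplied by probabilities" idea above amounts to an expected-value score. The sketch below assumes a model has already produced per-event probabilities for each user-post pair; the event names, the values assigned to them, and the example posts are all hypothetical and chosen only to show the arithmetic.

```python
def relevance_score(event_probs, event_values):
    """Expected value of showing a post: sum of P(event) * value(event)."""
    return sum(event_probs.get(event, 0.0) * value
               for event, value in event_values.items())


def rank_posts(candidates, event_values):
    """Sort candidate posts from most to least relevant for one user."""
    return sorted(
        candidates,
        key=lambda post: relevance_score(post["probs"], event_values),
        reverse=True,
    )


# Hypothetical values: how much each engagement type is worth.
EVENT_VALUES = {"like": 1.0, "comment": 4.0, "share": 8.0}

# Hypothetical model outputs for one user and two candidate posts.
candidates = [
    {"id": "post_a", "probs": {"like": 0.30, "comment": 0.05, "share": 0.01}},
    {"id": "post_b", "probs": {"like": 0.10, "comment": 0.10, "share": 0.05}},
]

for post in rank_posts(candidates, EVENT_VALUES):
    print(post["id"], round(relevance_score(post["probs"], EVENT_VALUES), 2))
```

Here post_b ranks first even though it is less likely to be liked, because comments and shares carry larger values in this example.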
Approach
https://research.fb.com/blog/2018/05/the-facebook-field-guide-to-machine-learning-video-series/
2. Feature Engineering:
a. Data:
3. Training:
a. Model architecture: Interpretable?
b. Cross-Validation
c. Baseline Model:
i. Get a simple baseline, such as a random model (evaluated with normalized
entropy) or one that always predicts the general likelihood of click/not-click,
i.e. the average click rate.
ii. Or train on first-time clickers, i.e. with fewer features and no
context/history, giving simpler inputs.
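The average-click-rate baseline above can be made concrete with normalized entropy. The sketch assumes the usual definition from click-prediction work: the model's average log loss divided by the log loss of always predicting the average click rate. Under that definition the baseline scores exactly 1, and a useful model scores below 1. The labels and predictions are made-up toy data.

```python
import math


def log_loss(labels, predictions, eps=1e-12):
    """Average binary cross-entropy; predictions are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(labels, predictions):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


def normalized_entropy(labels, predictions):
    """Model log loss divided by the log loss of the average-CTR baseline.

    Assumes the labels contain both clicks and non-clicks; otherwise the
    baseline loss is close to zero and the ratio is not meaningful.
    """
    average_ctr = sum(labels) / len(labels)
    baseline = log_loss(labels, [average_ctr] * len(labels))
    return log_loss(labels, predictions) / baseline


labels = [1, 0, 0, 0]                      # toy data: 25% click rate
baseline_preds = [0.25] * len(labels)      # always predict the average CTR
model_preds = [0.9, 0.1, 0.1, 0.1]         # hypothetical model output

print(normalized_entropy(labels, baseline_preds))  # baseline compared with itself
print(normalized_entropy(labels, model_preds))     # below 1 means better than baseline
```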
4. Optimization function:
a. Based on the type and metrics: Ranking loss, RMSE, Cross-Entropy
b. Define based on performance wrt baseline model (RMSE in test data wrt avg
CTR, etc.)
https://medium.com/analytics-vidhya/calibration-in-machine-learning-e7972ac93555
iv. Convert Binary to multi-class and evaluate how and where the
performance is coming from, also to debug.
One way to test whether an A/B test is configured properly is to compare control with
control (an A/A test): the two groups should show statistically indistinguishable
performance. If that’s not the case, then the experiment is not set up fairly.
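The control-versus-control check above can be sketched as a two-proportion z-test on click-through rates. This is one possible way to run the check, not a prescribed method; the 1.96 threshold (roughly a 5% two-sided significance level) and the click counts are assumptions for the example.

```python
import math


def two_proportion_z(clicks_a, users_a, clicks_b, users_b):
    """z-statistic for the difference between two click-through rates."""
    rate_a = clicks_a / users_a
    rate_b = clicks_b / users_b
    pooled = (clicks_a + clicks_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    if std_err == 0:
        return 0.0  # both groups have identical all-or-nothing rates
    return (rate_a - rate_b) / std_err


def aa_test_looks_healthy(clicks_a, users_a, clicks_b, users_b,
                          z_threshold=1.96):
    """True if two control groups are statistically indistinguishable."""
    z = two_proportion_z(clicks_a, users_a, clicks_b, users_b)
    return abs(z) < z_threshold


# Hypothetical counts for two control buckets of 10,000 users each.
print(aa_test_looks_healthy(500, 10_000, 505, 10_000))  # small gap: looks fine
print(aa_test_looks_healthy(500, 10_000, 650, 10_000))  # large gap: suspicious
```

Even a correctly configured experiment will fail this check about 5% of the time at this threshold, so a single failure is a prompt to investigate rather than proof of a broken setup.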
6. Hybrid:
a. Model + Rules (such as cache and latest seen items/interests)
Examples
1. Scalable Collaborative Filtering for FB Ads:
https://engineering.fb.com/core-data/recommending-items-to-more-than-a-billion-people/
2. Pinterest Recommendation System using CNNs:
Efficient convolutional network for recommender systems
Patent: Look at images