Week7 Slides
Memory Hierarchy
Types of storage elements
● On-chip registers: 10s-100s of bytes
● SRAM (cache): 0.1 - 1 MB
● DRAM: 0.1 - 10 GB
● Solid-state disk (SSD) - Flash: 1-100 GB
● Magnetic disk (HDD, hard disk drive): 0.1 - 10 TB
● Optical, magnetic, holographic, . . .
Storage Parameters
● Latency: time to read first value from a storage location (lower is better)
○ Register < SRAM < DRAM < SSD < HDD
● Throughput: number of bytes/second that can be read (higher is better)
○ DRAM > SSD > HDD (regs, SRAM limited capacity)
● Density: number of bits stored per unit area / cost (higher is better)
○ Volume manufacture important
○ HDD > SSD > DRAM > SRAM > Regs
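A rough way to see why latency and throughput are separate parameters: as a simple approximation, the time to read a block is the latency plus the size divided by the throughput. The sketch below uses two hypothetical devices with made-up numbers (they are illustrative placeholders, not measurements of any real hardware) to show that the better device depends on how much data is read.

```python
def read_time(latency_s: float, throughput_bytes_per_s: float, n_bytes: int) -> float:
    """Approximate read time: wait for the first value, then stream the rest."""
    return latency_s + n_bytes / throughput_bytes_per_s

# Hypothetical devices; the numbers are assumptions chosen for illustration only.
low_latency_device = {"latency_s": 0.0001, "throughput_bytes_per_s": 100e6}
high_throughput_device = {"latency_s": 0.01, "throughput_bytes_per_s": 200e6}

small, large = 4_000, 1_000_000_000  # a few KB vs about 1 GB

for n in (small, large):
    a = read_time(n_bytes=n, **low_latency_device)
    b = read_time(n_bytes=n, **high_throughput_device)
    print(f"{n:>13} bytes: low-latency {a:.5f} s, high-throughput {b:.5f} s")
```

With these assumed numbers, the small read is dominated by latency and the large read by throughput.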
Computer Organization
● CPU has as many registers as possible
● Backed by L1, L2, L3 cache (SRAM)
● Backed by several GB of DRAM working memory
● Backed by SSD for high throughput
● Backed by HDD for high capacity
● Backed by long-term storage, backup
Cold storage (?)
● Backups and archives:
○ Huge amounts of data
○ Not read very often
○ Can tolerate high read latency
● Amazon Glacier, Google, Azure Cold/Archive storage classes
● High latency of retrieval: up to 48 hours
● Very high durability
● Very low cost
Impact on application development
● Plan the storage needs based on application growth
● Speed of app determined by the types of data stored and how they are stored
● Some data stores are more efficient for some types of read/write operations
Developers must be aware of the choices and of which kind of database to choose for a given application
Data Search
O() notation
● Used in study of algorithmic complexity: beyond scope of this course
● Rough approximation: “order of magnitude”, “approximately” etc.
● Main concepts here:
○ O(1) - constant time independent of input size - excellent!
○ O(log N) - logarithmic in input size - grows slowly with input - very good
○ O(N) - linear in input size - often the baseline - would like to do better
○ O(N^k) - polynomial (quadratic, cubic etc.) - not good as input size grows
○ O(k^N) - exponential - VERY bad: won’t work even for reasonably small inputs
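To get a feel for these growth rates, the sketch below tabulates a representative function from each class for a few input sizes. The choice of N^2 for the polynomial case and 2^N for the exponential case is an assumption made for illustration; the function name growth is hypothetical.

```python
import math

def growth(n: int) -> dict:
    """Representative step counts for each complexity class at input size n."""
    return {
        "O(1)": 1,
        "O(log N)": math.log2(n),
        "O(N)": n,
        "O(N^2)": n ** 2,   # polynomial with k = 2
        "O(2^N)": 2 ** n,   # exponential with k = 2
    }

for n in (8, 16, 32, 64):
    row = growth(n)
    print(n, {name: round(value) for name, value in row.items()})
```

Doubling N adds one step to the logarithmic column, doubles the linear column, quadruples the quadratic column, and squares the exponential column.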
Searching for element in memory
● Unsorted data in a linked list: O(N)
● Unsorted data in array: O(N)
● Sorted data in array: O(N) with a linear scan, but O(log N) with binary search
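The difference between the O(N) and O(log N) cases can be sketched by counting comparisons. The functions below are a minimal illustration, not a tuned implementation; the names linear_search and binary_search and the test data are assumptions.

```python
def linear_search(items, target):
    """Check each element in turn; works on unsorted data. Returns (index, steps)."""
    steps = 0
    for i, value in enumerate(items):
        steps += 1
        if value == target:
            return i, steps
    return -1, steps

def binary_search(sorted_items, target):
    """Halve the search range each step; requires sorted data. Returns (index, steps)."""
    lo, hi = 0, len(sorted_items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps

data = list(range(0, 2_000_000, 2))  # 1,000,000 sorted even numbers
target = data[-1]                    # worst case for the linear scan

print(linear_search(data, target))   # needs 1,000,000 steps
print(binary_search(data, target))   # needs at most 20 steps
```

In practice Python's standard bisect module provides the binary search; it is written out here only to make the halving visible.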
Problems with arrays
● Size must be fixed ahead of time
● Adding new entries requires resizing - can try oversize, but eventually ...
● Maintaining sorted order O(N):
○ find location to insert
○ move all further elements by 1 to create a gap
○ insert
● Deleting
○ find location, delete
○ move all entries down by 1 step
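The insert and delete steps above can be sketched as follows. Finding the location is fast with binary search, but the shift that follows touches every later element, which is where the O(N) comes from. Note that a Python list is a resizable array, so the fixed-size problem is hidden here; the helper names are hypothetical.

```python
import bisect

def insert_sorted(arr, value):
    """Insert value into the sorted list arr; return how many elements had to move."""
    pos = bisect.bisect_left(arr, value)  # find location to insert: O(log N)
    moved = len(arr) - pos                # every later element shifts up by 1: O(N)
    arr.insert(pos, value)
    return moved

def delete_sorted(arr, value):
    """Delete value from the sorted list arr; return how many elements had to move."""
    pos = bisect.bisect_left(arr, value)
    if pos == len(arr) or arr[pos] != value:
        raise ValueError(f"{value!r} not found")
    del arr[pos]
    return len(arr) - pos                 # every later element shifts down by 1

arr = [10, 20, 30, 40, 50]
print(insert_sorted(arr, 5), arr)   # inserting at the front moves all 5 elements
print(delete_sorted(arr, 10), arr)  # deleting near the front moves the 4 after it
```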
Alternatives
● Binary search tree
○ Maintaining sorted order is easier: growth of tree
● Self-Balancing
○ BST can easily tilt to one side and grow downwards
○ Red-black, AVL, B-tree… more complex, but still reasonable
● Hash tables
○ Compute an index for an element: O(1)
○ Hope the index for each element is unique!
■ Difficult but doable in many cases
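A small sketch of two of the points above, under stated assumptions: a plain (non-balancing) binary search tree tilts into a chain when keys arrive in sorted order, and a hash table computes a bucket index directly but two keys can land on the same index. The class and function names are hypothetical, and the hash function (key modulo the number of buckets) is a toy chosen for integer keys only.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def bst_insert(root, key):
    """Insert key into an unbalanced BST and return the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right

def build(keys):
    root = None
    for key in keys:
        root = bst_insert(root, key)
    return root

def height(root):
    """Number of levels in the tree, counted level by level."""
    levels = 0
    frontier = [root] if root is not None else []
    while frontier:
        levels += 1
        frontier = [c for n in frontier for c in (n.left, n.right) if c is not None]
    return levels

sorted_keys = list(range(1, 16))
mixed_keys = [8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]
print(height(build(sorted_keys)))  # 15: the tree has tilted into a chain
print(height(build(mixed_keys)))   # 4: same keys, balanced shape

def bucket_index(key, n_buckets):
    """Toy hash function for integer keys."""
    return key % n_buckets

buckets = [[] for _ in range(8)]
for key in (3, 11, 20):
    buckets[bucket_index(key, 8)].append(key)
print(buckets[3])  # 3 and 11 collide in the same bucket
```

Self-balancing trees exist to keep the height near the second figure regardless of insertion order; real hash tables need a strategy for the collision shown in the last line.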
Database Search
Databases (tabular)
● Tables with many columns
● Want to search quickly on some columns
● Maintain “INDEX” of columns to search on
○ Store a sorted version of column
○ Needs column to be “comparable”: integer, short string, date/time etc.
■ Long text fields are not good for index
■ Binary data not good
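A minimal sketch of the effect of an index, using Python's built-in sqlite3 module and an in-memory database. The table people, its columns, and the index name idx_people_name are hypothetical. EXPLAIN QUERY PLAN is SQLite's way of reporting how it intends to run a statement; the exact wording of its output varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, bio TEXT)")
conn.executemany(
    "INSERT INTO people (name, bio) VALUES (?, ?)",
    [(f"user{i:05d}", "long free text ...") for i in range(10_000)],
)

def plan(sql, params=()):
    """Return the description lines of SQLite's query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

query = "SELECT * FROM people WHERE name = ?"

before = plan(query, ("user00042",))   # no index yet: full table scan
conn.execute("CREATE INDEX idx_people_name ON people (name)")
after = plan(query, ("user00042",))    # the index on name is now used

print(before)
print(after)
```

The short, comparable name column is indexed; the long bio text is deliberately left out of the index.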
Example: MySQL
https://dev.mysql.com/doc/refman/8.0/en/index-btree-hash.html
Index-friendly query
SELECT * FROM tbl_name WHERE key_col LIKE 'Patrick%';
/* index = 1 OR index = 2 */
... WHERE index=1 OR A=10 AND index=2
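One way to see why a pattern such as 'Patrick%' is index-friendly: in sorted data, all values sharing a prefix sit next to each other, so they can be located with two binary searches. A pattern with a leading wildcard has no such range and needs a full scan. The sketch below illustrates the idea on a sorted Python list; it is an analogy, not how any particular database implements LIKE, and the names and sample data are assumptions.

```python
import bisect

def prefix_range(sorted_names, prefix):
    """Return all names starting with prefix, using two binary searches."""
    lo = bisect.bisect_left(sorted_names, prefix)
    # Upper bound: the prefix followed by the largest possible code point.
    hi = bisect.bisect_left(sorted_names, prefix + "\U0010ffff")
    return sorted_names[lo:hi]

def suffix_scan(names, suffix):
    """Names ending with suffix: sorted order does not help, so check every entry."""
    return [name for name in names if name.endswith(suffix)]

names = sorted(
    ["Patricia", "Patrick", "Patrick Stewart", "Paul", "Kirkpatrick", "Peter"]
)

print(prefix_range(names, "Patrick"))  # like key_col LIKE 'Patrick%'
print(suffix_scan(names, "patrick"))   # like key_col LIKE '%patrick'
```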
● https://dev.mysql.com/doc/refman/8.0/en/index-btree-hash.html
● https://www.sqlite.org/optoverview.html
● Postgres:
Summary
● Setting up queries properly impacts application performance
● Building proper indexes crucial to good search
● Compound indexes, multiple indexes etc. possible
○ Too many can be a waste of space
● Make use of structure in data to organize it properly
SQL vs NoSQL
SQL
● Structured Query Language
○ Used to query databases that have structure
○ Could also be used for CSV files, spreadsheets etc.
● Closely tied to RDBMS - relational databases
○ Columns / Fields
○ Tables of data hold relationships
○ All entries in a table must have same set of columns
● Tabular databases
○ Efficient indexing possible - use specified columns
○ Storage efficiency: prior knowledge of data size
Problem with tabular databases
● Structure (good? bad?)
● All rows in table must have same set of columns
Example
pswd = form.request["name"]
Result???
Example input vs SQL
Input:
Query:
App developers should be very careful of their code, but also aware of problems at
other levels of the stack
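The slide's own input and query are not reproduced in this text. As an assumed illustration of how form input can change a query, the sketch below builds a SQL string by pasting user input into it, then shows the same lookup with placeholders. The table users, the function names, and the crafted input are all hypothetical; sqlite3 is Python's built-in SQLite module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret')")

def login_unsafe(name, pswd):
    # Unsafe: the input becomes part of the SQL text itself.
    sql = f"SELECT * FROM users WHERE name = '{name}' AND password = '{pswd}'"
    return conn.execute(sql).fetchall()

def login_safe(name, pswd):
    # Safer: placeholders keep the input as data, never as SQL.
    sql = "SELECT * FROM users WHERE name = ? AND password = ?"
    return conn.execute(sql, (name, pswd)).fetchall()

crafted = "' OR '1'='1"  # hypothetical input typed into the password field

print(login_unsafe("alice", crafted))  # returns alice's row without the password
print(login_safe("alice", crafted))    # returns no rows
```

With the crafted input, the unsafe query's WHERE clause ends in OR '1'='1', which is always true, so the password check no longer matters.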