Unit 5-Key - Value Store Database
Parul Pandey
Key – Value Database
• A key-value store is a simple hash table,
primarily used when all access to the database
is via primary key
Introduction
• Key-value stores are the simplest NoSQL data stores to use
from an API perspective.
• The client can either get the value for the key, put a value
for a key, or delete a key from the data store.
• The value is a blob that the data store just stores, without
caring or knowing what’s inside; it’s the responsibility of
the application to understand what was stored.
• Since key-value stores always use primary-key access, they
generally have great performance and can be easily scaled.
• E.g. Riak, Redis, Memcached DB, Berkeley DB,
HamsterDB, Amazon DynamoDB, Project Voldemort
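The three operations above (get, put, delete) can be sketched with a plain hash table. This is a minimal in-memory illustration, not the API of Riak, Redis, or any other product listed; the class name SimpleKeyValueStore and its methods are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store; it treats every value as an opaque byte[].
class SimpleKeyValueStore {
    private final Map<String, byte[]> data = new HashMap<>();

    // put: store (or overwrite) the value under the key.
    void put(String key, byte[] value) {
        data.put(key, value.clone());
    }

    // get: return a copy of the stored bytes, or null if the key is absent.
    byte[] get(String key) {
        byte[] value = data.get(key);
        return value == null ? null : value.clone();
    }

    // delete: remove the key; returns true if it was present.
    boolean delete(String key) {
        return data.remove(key) != null;
    }

    public static void main(String[] args) {
        SimpleKeyValueStore store = new SimpleKeyValueStore();
        store.put("session:42", "cart=3".getBytes(StandardCharsets.UTF_8));
        // Only the application knows that these bytes are UTF-8 text.
        System.out.println(new String(store.get("session:42"), StandardCharsets.UTF_8));
        System.out.println(store.delete("session:42"));
    }
}
```

The store never inspects the bytes; decoding them is left to the caller, which mirrors the "value is a blob" point above.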
Riak
• Riak lets us store keys into buckets, which are just a way to
segment the keys—think of buckets as flat namespaces for the
keys.
• If we wanted to store user session data, shopping cart
information, and user preferences in Riak, we could just store
all of them in the same bucket with a single key and single value
for all of these objects. In this scenario, we would have a single
object that stores all the data and is put into a single bucket.
• Key-value stores such as Redis also support storing richer data
structures, such as lists, sets, hashes, and strings. This
feature can be used to store lists of things, like states or
addressTypes, or an array of user’s visits.
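The idea of buckets as flat namespaces can be sketched as a map of maps: the same key may appear in two buckets without colliding. The class BucketedStore and the bucket names are hypothetical, and real Riak buckets also carry properties (such as the replica count) that this sketch ignores.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: bucket name -> (key -> opaque value).
class BucketedStore {
    private final Map<String, Map<String, String>> buckets = new HashMap<>();

    // Store a value under a key inside the named bucket, creating the bucket on first use.
    void store(String bucket, String key, String value) {
        buckets.computeIfAbsent(bucket, name -> new HashMap<>()).put(key, value);
    }

    // Fetch the value, or null if the bucket or the key does not exist.
    String fetch(String bucket, String key) {
        Map<String, String> keys = buckets.get(bucket);
        return keys == null ? null : keys.get(key);
    }

    public static void main(String[] args) {
        BucketedStore store = new BucketedStore();
        // One object holding session, cart and preferences under a single key.
        store.store("userData", "a7e618d9db25",
                "{\"session\":{},\"cart\":[],\"preferences\":{}}");
        // The same key in another bucket is a different entry.
        store.store("audit", "a7e618d9db25", "created");
        System.out.println(store.fetch("userData", "a7e618d9db25"));
        System.out.println(store.fetch("audit", "a7e618d9db25"));
    }
}
```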
Key-Value Store Features - Consistency
• Consistency is applicable only for operations on a single
key, since these operations are either a get, put, or delete
on a single key.
• In distributed key-value store implementations like Riak,
the eventual consistency model is used.
• Since the value may have already been replicated to other
nodes, Riak has two ways of resolving update conflicts:
either the newest write wins and older writes lose, or
both (all) values are returned allowing the client to resolve
the conflict.
Bucket bucket = connection.createBucket(bucketName)
.withRetrier(attempts(3))
.allowSiblings(siblingsAllowed)
.nVal(numberOfReplicasOfTheData)
.w(numberOfNodesToRespondToWrite)
.r(numberOfNodesToRespondToRead)
.execute();
If we need data in every node to be consistent, we can increase
the numberOfNodesToRespondToWrite set by w to be the same
as nVal. Doing that will decrease the write performance of the cluster.
To reduce write or read conflicts, we can change the
allowSiblings flag during bucket creation: if it is set to false, we let
the last write win and do not create siblings.
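The effect of the allowSiblings flag can be illustrated with a toy model of a single key. This is a sketch, not Riak's implementation: the class ConflictingKey is hypothetical, and it assumes conflicting writes arrive in timestamp order, so "last" means "newest".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-key model of the two conflict policies.
class ConflictingKey {
    private final boolean allowSiblings;
    private final List<String> values = new ArrayList<>();

    ConflictingKey(boolean allowSiblings) {
        this.allowSiblings = allowSiblings;
    }

    // Apply a write that conflicts with whatever is already stored.
    void conflictingWrite(String value) {
        if (!allowSiblings) {
            values.clear(); // last write wins: older values are discarded
        }
        values.add(value); // with siblings allowed, every conflicting value is kept
    }

    // One value under last-write-wins; otherwise all siblings, for the client to resolve.
    List<String> read() {
        return new ArrayList<>(values);
    }

    public static void main(String[] args) {
        ConflictingKey lastWriteWins = new ConflictingKey(false);
        ConflictingKey withSiblings = new ConflictingKey(true);
        for (String cart : new String[] {"cart=1", "cart=2"}) {
            lastWriteWins.conflictingWrite(cart);
            withSiblings.conflictingWrite(cart);
        }
        System.out.println(lastWriteWins.read());
        System.out.println(withSiblings.read());
    }
}
```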
Transactions
• Riak uses the concept of quorum, implemented by
passing the W value (the number of replicas that must
acknowledge a write, at most the replication factor N)
during the write API call.
• Assume we have a Riak cluster with a replication
factor of 5 and we supply the W value of 3. When
writing, the write is reported as successful only
when it is written and reported as a success on at
least three of the nodes. This allows Riak to have
write tolerance; in our example, with N equal to 5
and with a W value of 3, the cluster can tolerate N -
W = 2 nodes being down for write operations,
though we would still have lost some data on those
nodes for read.
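The quorum arithmetic in this example can be checked with a few lines of code. The class WriteQuorum and its method names are hypothetical; the sketch only models counting acknowledgements, not replication itself.

```java
// Hypothetical helper for the N / W arithmetic described above.
class WriteQuorum {
    private final int n; // replication factor
    private final int w; // replicas that must acknowledge a write

    WriteQuorum(int n, int w) {
        if (w < 1 || w > n) {
            throw new IllegalArgumentException("need 1 <= w <= n");
        }
        this.n = n;
        this.w = w;
    }

    // Number of replica nodes that can be down while writes still succeed: N - W.
    int toleratedFailures() {
        return n - w;
    }

    // A write succeeds when at least W of the N replicas are up to acknowledge it.
    boolean writeSucceeds(int nodesDown) {
        if (nodesDown < 0 || nodesDown > n) {
            throw new IllegalArgumentException("nodesDown must be between 0 and n");
        }
        return n - nodesDown >= w;
    }

    public static void main(String[] args) {
        WriteQuorum quorum = new WriteQuorum(5, 3);
        System.out.println("tolerated failures: " + quorum.toleratedFailures());
        System.out.println("write with 2 nodes down: " + quorum.writeSucceeds(2));
        System.out.println("write with 3 nodes down: " + quorum.writeSucceeds(3));
    }
}
```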
Query Features
• All key-value stores can query by the key.
• If you need to query by some attribute of the value, a plain
key-value store cannot do it for you: the application has to
read the value and check the attribute itself.
• What if we don’t know the key, especially during ad-hoc
querying during debugging?
• Most of the data stores will not give you a list of all the
primary keys; even if they did, retrieving lists of keys and
then querying for the value would be very cumbersome.
• The design of the key therefore matters:
– Can the key be generated using some algorithm?
– Can the key be provided by the user (user ID, email, etc.)?
– Or derived from timestamps or other data that can be
derived outside of the database?
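The key-design questions above can be made concrete with two small helpers: one builds a key from something the user provides, the other derives it from a timestamp known outside the database. The class KeyDesign and the key formats ("user:...", "visits:...") are hypothetical illustrations, not a convention of any particular store.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.Locale;

// Hypothetical key-building helpers.
class KeyDesign {
    // Key provided by the user: normalise the email so repeated lookups hit the same key.
    static String fromEmail(String email) {
        return "user:" + email.trim().toLowerCase(Locale.ROOT);
    }

    // Key derived from data known outside the database: user ID plus the UTC day of the visit.
    static String dailyVisits(String userId, Instant when) {
        String day = when.atZone(ZoneOffset.UTC).toLocalDate().toString(); // ISO date, yyyy-MM-dd
        return "visits:" + userId + ":" + day;
    }

    public static void main(String[] args) {
        System.out.println(fromEmail("  Buyer@Example.com "));
        System.out.println(dailyVisits("91cfdf5bcb7c", Instant.ofEpochMilli(1324669989288L)));
    }
}
```

Because both keys can be recomputed by the application, no key listing or value scan is needed to find the data again.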
Query Feature
• These query characteristics make key-value
stores likely candidates for storing session
data (with the session ID as the key), shopping
cart data, user profiles, and so on.
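• expiry_secs: where the store supports such a property, keys
can be set to expire after a given number of seconds, which
suits session and shopping cart data.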
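The slide does not expand on expiry_secs. Assuming it denotes a per-key time-to-live in seconds, expiry of session data could be sketched as follows. The class ExpiringStore is hypothetical, and the current time is passed in explicitly (nowSecs) so that the behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical expiring store; assumes expiry_secs means a time-to-live in seconds.
class ExpiringStore {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();

    // Store the value and remember the second at which it stops being readable.
    void put(String key, String value, long expirySecs, long nowSecs) {
        values.put(key, value);
        expiresAt.put(key, nowSecs + expirySecs);
    }

    // Return the value, or null if the key is absent or its time-to-live has elapsed.
    String get(String key, long nowSecs) {
        Long deadline = expiresAt.get(key);
        if (deadline == null) {
            return null;
        }
        if (nowSecs >= deadline) { // lazily drop the expired entry on read
            values.remove(key);
            expiresAt.remove(key);
            return null;
        }
        return values.get(key);
    }

    public static void main(String[] args) {
        ExpiringStore sessions = new ExpiringStore();
        sessions.put("a7e618d9db25", "cart=3", 1800, 1000); // expires at second 2800
        System.out.println(sessions.get("a7e618d9db25", 2799));
        System.out.println(sessions.get("a7e618d9db25", 2800));
    }
}
```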
Query
• For Storing - store API
– Bucket bucket = getBucket(bucketName);
– IRiakObject riakObject = bucket.store(key,
value).execute();
• For retrieving - fetch API
– Bucket bucket = getBucket(bucketName);
– IRiakObject riakObject = bucket.fetch(key).execute();
– byte[] bytes = riakObject.getValue();
– String value = new String(bytes);
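One detail of the fetch snippet: new String(bytes) decodes with the platform default charset, which was platform-dependent before Java 18. A small sketch of encoding and decoding the value with an explicit charset follows; the helper class ValueCodec is hypothetical and independent of the Riak client.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: the store only ever sees the byte[], so both sides must agree on the charset.
class ValueCodec {
    // Encode text to the bytes that would be handed to the store.
    static byte[] encode(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Decode bytes returned by the store back to text, using the same charset.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] stored = encode("caf\u00e9");
        System.out.println(stored.length + " bytes");
        System.out.println(decode(stored));
    }
}
```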
Query
• HTTP-based interface, so that all operations can
be performed from the web browser or on the
command line using curl
• To store data using curl:
curl -v -X POST -d '
{ "lastVisit":1324669989288,
"user":{"customerId":"91cfdf5bcb7c",
"name":"buyer",
"countryCode":"US",
"tzOffset":0}
}' \
-H "Content-Type: application/json" \
http://localhost:8098/buckets/session/keys/a7e618d9db25
• To fetch data using curl:
curl -i http://localhost:8098/buckets/session/keys/a7e618d9db25
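The two curl commands on this slide can be mirrored from Java with the standard java.net.http API (Java 11 or later). This sketch only builds the requests and does not send them, so no Riak node is needed to run it. The class RiakHttpRequests is hypothetical, and the base URL assumes a node on localhost:8098 as in the curl commands.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds the HTTP requests but does not send them.
class RiakHttpRequests {
    // Assumed base URL, taken from the curl commands.
    static final String BASE = "http://localhost:8098/buckets/";

    static URI keyUri(String bucket, String key) {
        return URI.create(BASE + bucket + "/keys/" + key);
    }

    // Counterpart of: curl -X POST -d '<json>' -H "Content-Type: application/json" <url>
    static HttpRequest store(String bucket, String key, String json) {
        return HttpRequest.newBuilder(keyUri(bucket, key))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // Counterpart of: curl -i <url>
    static HttpRequest fetch(String bucket, String key) {
        return HttpRequest.newBuilder(keyUri(bucket, key)).GET().build();
    }

    public static void main(String[] args) {
        HttpRequest post = store("session", "a7e618d9db25", "{\"lastVisit\":1324669989288}");
        System.out.println(post.method() + " " + post.uri());
        HttpRequest get = fetch("session", "a7e618d9db25");
        System.out.println(get.method() + " " + get.uri());
    }
}
```

To actually send a request, it would be passed to HttpClient.send together with a body handler; that step is left out because it needs a running server.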
Structure of Data
• Key-value databases don’t care what is stored
in the value part of the key-value pair.
• The value can be a blob, text, JSON, XML, and
so on. In Riak, we can use the Content-Type in
the POST request to specify the data type.
Scaling