MySQL and Memcached Whitepaper
June, 2008
Table of Contents
Introduction
Example Architectures
Conclusion
Additional Resources
Memcached Overview
Memcached is an actively maintained open source project distributed under the BSD license
with regular contributions from not only a community of developers, but also corporate
organizations like Facebook and Sun Microsystems. Danga Interactive originally developed
memcached to improve the performance and scalability characteristics of the blogging and
social networking site LiveJournal. At the time, the site was delivering over 20 million
dynamic page views per day to over 1 million users. With LiveJournal’s implementation of
memcached as a caching tier, the existing load on the databases was significantly reduced.
This meant faster page loads for users, more efficient resource utilization, and faster access
to the underlying databases whenever data could not be fetched directly from memcached.
Memcached is designed to take advantage of free memory on any system running Linux,
OpenSolaris, BSD or Windows, with very low CPU overhead. It can be installed on dedicated
servers or co-located with web, application or database servers. Out of the box, memcached
is designed to scale from a single server to dozens or even hundreds of servers. Facebook
currently has the largest known deployment of memcached servers in the world.1
1 http://highscalability.com/strategy-break-memcache-dog-pile
Memcached Server
As previously mentioned, there are two core components to memcached: the server and the
client. In this section we cover some of the core capabilities of the memcached server,
including how it handles memory allocation and data caching. Overall, the memcached
server is implemented as a non-blocking, event-based server with an emphasis on
scalability and low resource consumption.
Memory Allocation
By default, the memcached server allocates memory using what is internally referred to as
a “slab allocator”. The reason this internal slab allocator is used instead of malloc/free (the
standard C library routines for dynamic memory allocation) is to avoid fragmentation and to
spare the operating system from spending cycles searching for contiguous blocks of
memory; overall, these tasks tend to consume more resources than the memcached process
itself. With the slab allocator, memory is allocated in chunks which are constantly reused.
Because memory is allocated into different sized slabs, there is some potential to waste
memory if the data being cached does not fit perfectly into a slab.
There are also some practical limits to be aware of concerning key and data sizes. For
example, keys are restricted to 250 characters and cached data cannot exceed the largest
slab size currently supported, 1 megabyte.
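As a rough illustration of where the wasted memory comes from, the following Python sketch mimics the idea behind slab classes; the base chunk size and growth factor used here are illustrative assumptions rather than memcached's actual defaults.

    # Illustrative sketch only: chunk sizes grow geometrically from a base size
    # up to the 1 MB ceiling, mirroring the idea behind memcached's slab classes.
    def slab_chunk_sizes(base=88, factor=1.25, max_size=1024 * 1024):
        sizes = []
        size = base
        while size < max_size:
            sizes.append(int(size))
            size *= factor
        sizes.append(max_size)
        return sizes

    def chunk_for(item_size, sizes):
        # An item is stored in the smallest chunk that can hold it; the gap
        # between the item and its chunk is memory that sits unused.
        for chunk in sizes:
            if item_size <= chunk:
                return chunk
        raise ValueError("item exceeds the largest slab size (1 MB)")

    sizes = slab_chunk_sizes()
    print(chunk_for(1000, sizes))  # a 1000-byte item occupies a slightly larger chunk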
Caching Structure
When memcached’s distributed hash table becomes full, subsequent inserts force older
cached data to be cycled out in least recently used (LRU) order, with associated expiration
timeouts. How long data is “valid” within the cache is set via configuration options; this
“validity” time may be short, long or permanent. As mentioned, when the memcached
server exhausts its memory allocation, expired slabs are cycled out first, with the next
oldest, unused slabs queued up for expiration.
Memcached makes use of lazy expiration. This means it does not make use of additional
CPU cycles to expire items. When data is requested via a ‘get’ request, memcached
references the expiration time to confirm if the data is valid before returning it to the client
requesting the data. When new data is being added to the cache via a ‘set’, and memcached
is unable to allocate an additional slab, expired data will be cycled out prior to any data that
qualifies for the LRU criteria.
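The Python sketch below is a simplification of this behavior (memcached itself is written in C); it shows how an expiration timestamp stored with each item can be checked only at read time, so no background cycles are spent reaping expired entries.

    import time

    class LazyCache:
        """Toy cache illustrating lazy expiration; not memcached's implementation."""

        def __init__(self):
            self.store = {}  # key -> (value, expires_at)

        def set(self, key, value, ttl):
            # Every item records the moment after which it is no longer valid.
            self.store[key] = (value, time.time() + ttl)

        def get(self, key):
            entry = self.store.get(key)
            if entry is None:
                return None
            value, expires_at = entry
            if time.time() >= expires_at:
                # Expired items are only discovered (and discarded) on access.
                del self.store[key]
                return None
            return value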
A typical interaction between the application, memcached and the database proceeds as follows:
• The server first checks whether a memcached value with a unique key exists, for
example “user:userid”, where userid is a number.
• If the result is not cached, the request will issue a select on the database, and set
the unique key using the memcached ‘add’ function call.
• If this were the only change made, the application would eventually fetch stale data
once the underlying database was updated, so in addition to using the ‘add’ function,
an update is also required, using the ‘set’ function.
• This ‘set’ function call updates the currently cached data so that it is synchronized
with the new data in the database. (Another way to achieve similar behavior is to
invalidate the cache using the ‘delete’ function, so that subsequent fetches result in
a cache miss, forcing a reload of the data.) This pattern is sketched in the example
immediately below.
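A minimal sketch of this read/update pattern using the python-memcached client listed later in this paper; the key format follows the “user:userid” example above, while the server address and fetch_user_from_db are hypothetical placeholders for the real deployment and MySQL calls.

    import memcache

    mc = memcache.Client(['127.0.0.1:11211'])

    def fetch_user_from_db(userid):
        # Hypothetical placeholder for the real SELECT against MySQL.
        return {'id': userid, 'name': 'example'}

    def get_user(userid):
        key = 'user:%d' % userid
        user = mc.get(key)
        if user is None:
            # Cache miss: read from the database and populate the cache.
            user = fetch_user_from_db(userid)
            mc.add(key, user)
        return user

    def update_user(userid, user):
        # The UPDATE against MySQL would happen here (omitted), followed by a
        # 'set' so the cached copy stays in step with the database. Calling
        # mc.delete('user:%d' % userid) instead would force the next read to
        # miss and reload the fresh row.
        mc.set('user:%d' % userid, user)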
At this point, whenever the database is updated, the cache also needs to be updated in
order to maintain the desired degree of consistency between the cache and the source
database. This can be achieved by tagging the cached data with a very low expiration time.
However, this does mean that there will be a delay between the update occurring on the
database and the expiration time on the cached data being reached. Once the expiration
time is reached, a subsequent request for the data will force an update to the cache.
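With the python-memcached client, for example, the expiration time is simply the third argument to ‘set’; the 60-second window below is an arbitrary choice that trades up to a minute of potential staleness for less database traffic.

    import memcache

    mc = memcache.Client(['127.0.0.1:11211'])
    user_row = {'id': 42, 'name': 'example'}  # hypothetical row read from MySQL

    # Cache the row for at most 60 seconds; once it expires, the next 'get'
    # returns None and the application re-reads the row from the database.
    mc.set('user:42', user_row, time=60)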
In the event one of the memcached servers suffers a loss of data, under normal
circumstances the application should still be able to retrieve its data from the original
source database. A prudent caching design involves ensuring that your application can
continue to function without the availability of one or more memcached nodes. One
precaution to take in order not to suddenly overwhelm the database(s) in the event of a
memcached failure is to add additional memcached nodes, minimizing the impact of any
individual node failure. Another option is to leverage a “hot backup”, in other words a server
that can take over the IP address of the failed memcached server.
Also, by design, memcached does not have any built-in failover capabilities. However, there
are some strategies one can employ to help minimize the impact of failed memcached
nodes.
The first technique involves simply having an (over) abundance of nodes. Because
memcached is designed to scale straight out-of-the-box, this is a key characteristic to
exploit. Having plenty of memcached nodes minimizes the overall impact an outage of one
or more nodes will have on the system as a whole.
One can also remove failed nodes from the server list against which the memcached clients
hash. A practical consideration here is that when clients add or remove servers from the
server list, they effectively invalidate the entire cache, since the majority of the keys will in
turn hash to different servers. Essentially, this forces all the data to be re-keyed into
memcached from the source database(s). As mentioned previously, leveraging a “hot
backup” server which can take over the IP address of the failed memcached server can go a
long way toward minimizing the impact of having to reload an invalidated cache.
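The scale of this effect is easy to demonstrate. The sketch below uses naive modulo hashing (individual clients may hash differently, and some offer consistent hashing precisely to soften this) and counts how many keys map to a different server once one node is dropped from the list; the server names are hypothetical.

    import hashlib

    def server_for(key, servers):
        # Naive modulo hashing: the chosen server depends on the list length,
        # so changing the list remaps most keys.
        digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return servers[digest % len(servers)]

    before = ['ms1:11211', 'ms2:11211', 'ms3:11211', 'ms4:11211']
    after = before[:-1]  # one failed node removed from the client's server list

    keys = ['user:%d' % i for i in range(10000)]
    moved = sum(1 for k in keys if server_for(k, before) != server_for(k, after))
    print('%d of %d keys now hash to a different server' % (moved, len(keys)))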
In situations where there are large amounts of fairly static or permanent data to be cached,
using a data dump/load can be useful for warming up the cache quickly.
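A sketch of such a warm-up pass with python-memcached, assuming the static data has already been dumped from MySQL into an iterable of (key, value) pairs; set_multi loads the cache in batches rather than one item at a time.

    import memcache

    mc = memcache.Client(['127.0.0.1:11211'])

    def warm_cache(rows, batch_size=500):
        # 'rows' is assumed to be an iterable of (key, value) pairs produced by
        # a dump of the static reference data.
        batch = {}
        for key, value in rows:
            batch[key] = value
            if len(batch) >= batch_size:
                mc.set_multi(batch)
                batch = {}
        if batch:
            mc.set_multi(batch)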
Memcached Clients
In a typical memcached cluster, the application will access one or more memcached servers
via a memcached client library. Memcached currently has over a dozen client libraries
available, including Perl, PHP, Java, C#, C/C++, Lua and a native MySQL API. A complete
list of client APIs can be found at:
http://www.danga.com/memcached/apis.bml
From a security standpoint, it bears mentioning that memcached does not possess any
authentication or security features. Best practice in this area dictates that memcached
should only be run on systems within a firewall. By default, memcached listens on port
11211.
PHP PECL
For information concerning the PHP extension, which provides object-oriented and
procedural interfaces for working with memcached, please see:
http://pecl.php.net/package/memcache
Java Client Library
http://www.whalin.com/memcached/
Python Client Library
ftp://ftp.tummy.com/pub/python-memcached/
Perl Client Library
http://search.cpan.org/dist/Cache-Memcached/
C Client Library
There are currently three C libraries for memcached. For more information, please see:
apr_memcache
http://www.outoforder.cc/projects/libs/apr_memcache/
libmemcached
http://tangent.org/552/libmemcached.html
libmemcache
http://people.freebsd.org/~seanc/libmemcache/
MySQL API
The memcache_engine allows memcached to work as a storage engine for MySQL. This
allows SELECT/UPDATE/INSERT/DELETE statements to be performed on memcached as if it
were a table in MySQL.
http://tangent.org/index.pl?node_id=506
Also, there is a set of MySQL UDFs (user defined functions) to work with memcached using
libmemcached.
http://tangent.org/586/Memcached_Functions_for_MySQL.html
Example Architectures
There are various ways to design scalable architectures using memcached and MySQL.
Below we illustrate several of these architectures.
Figure: Web Servers and Application Servers (memcached clients, mc) reading and writing
against multiple Memcached Servers (ms), with cache updates flowing between the
memcached servers and a single MySQL Server.
Figure 3: Multiple Memcached Servers with a Master and multiple Slave MySQL Servers
Figure 4: Sharding, multiple Memcached Servers with a Master and multiple Slave MySQL
Servers
Together, memcached and MySQL allow organizations to:
• Implement a scalable, high performance data caching solution for their online
applications
• Reduce database Total Cost of Ownership (TCO) by eliminating licensing costs for
proprietary data caching software
• Reduce system TCO by making better use of resources like idle or spare RAM on
existing systems
• Incrementally add/remove data caching capacity on-demand to quickly meet
changing requirements.
http://www.mysql.com/products/enterprise/features.html
On-Demand Scalability
Memcached can be incrementally scaled out in an on-demand fashion. Because memcached
can scale to support dozens of nodes with minimal overhead, anywhere spare memory
resources exist there is an opportunity to scale your application further. Memcached is
designed as a non-blocking, event-based server with no special networking or interconnect
requirements.
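In practice, scaling out is largely a matter of extending the server list handed to the client. The sketch below uses python-memcached, whose set_servers call replaces the list at runtime; the addresses are hypothetical stand-ins for the new nodes.

    import memcache

    # Start with three memcached nodes...
    mc = memcache.Client(['10.0.0.1:11211', '10.0.0.2:11211', '10.0.0.3:11211'])

    # ...and later fold in a fourth node with spare RAM. Note that changing the
    # server list remaps keys, so expect a temporary rise in cache misses.
    mc.set_servers(['10.0.0.1:11211', '10.0.0.2:11211',
                    '10.0.0.3:11211', '10.0.0.4:11211'])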
http://www.mysql.com/consulting/
Conclusion
In this whitepaper we have provided a basic introduction to memcached and how it can
improve application performance while minimizing database load. Memcached increases the
scalability of dynamic, data-driven applications, making it possible to support more
concurrent users and spikes in traffic while making the most efficient use of existing
computing infrastructure. We explored how the memcached server allocates memory,
implements hashing and interacts with memcached clients. We have also examined several
example architectures that combine memcached with MySQL.
Additional Resources
White Papers
http://www.mysql.com/why-mysql/white-papers/
Case Studies
http://www.mysql.com/why-mysql/case-studies/
Live Webinars
http://www.mysql.com/news-and-events/web-seminars/
Webinars on Demand
http://www.mysql.com/news-and-events/on-demand-webinars/
Download Memcached
http://www.danga.com/memcached/download.bml
Installation Tutorial
http://blog.ajohnstone.com/archives/installing-memcached/
Memcached FAQ
http://www.socialtext.net/memcached/index.cgi?faq
To discover how Sun’s offerings can help you harness the power of next-generation Web
capabilities, please visit http://www.sun.com/web .