
Tuesday, April 03, 2018

Test MySQL 8.0 right in your computer

MySQL 8.0 GA is right around the corner. I don't have precise information about its release, as I don't work at Oracle. If I did, I would probably know, but I couldn't say when the release is scheduled to appear, because of company policies. I can, however, speculate and infer, based on my experience with previous releases. My personal assessment is that the release will appear before 9:00am PT on April 24, 2018. The "before" can be anything from a few minutes to one week in advance.
Then, again, it may not happen at all if someone finds an atrocious bug that needs to be fixed asap.

Either way, users are keen on testing the new release in its current release-candidate state. Here I show a few methods that allow you to have a taste of the new goodies without waiting for the triumphal (keynote) announcement.


1. Docker containers

If you are a Docker user, using a container to test MySQL is a no-brainer. Unlike virtual machines or standalone servers, a Docker container comes ready to use, with nothing to configure. All you need to do is pull the right image. As with every Docker image, you pull once and then use it as many times as you need.

There are two reliable images that contain the latest MySQL. One is called mysql:8.0 and is tagged as official, which means that it is released by the Docker maintenance team. The other one, which is released by the MySQL team, is called mysql/mysql-server:8.0.

$ docker pull mysql:8.0
8.0: Pulling from library/mysql
Digest: sha256:7004063f8bd0c7bade8d1c526b9b8f5188c8288f411d76ee4ba83131e00c6f02
Status: Downloaded newer image for mysql:8.0

$ docker pull mysql/mysql-server:8.0
8.0: Pulling from mysql/mysql-server
Digest: sha256:e81d95f788adb04a4d2fa5f6f7e9283ca0f6360fb518efe65af5a7377a4ec282
Status: Downloaded newer image for mysql/mysql-server:8.0

The mysql image is based on Debian, while the original one, as you would expect, is based on Oracle Linux.

Let's see how to run MySQL in a container.

$ docker run --name official  -e MYSQL_ROOT_PASSWORD=secret -d mysql:8.0
60ec307578a139f5083ded07e94d737690d287b1b95093878675983a5cc40174

$ docker run --name original -e MYSQL_ROOT_PASSWORD=secret \
    -d mysql/mysql-server:8.0
0c93bb4a97ffa53232a69732d3ae45413a443e38fa43ad6fdc4057168cba42d2

With the above commands we get two containers, one for the official image and one for the original one.
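
A quick docker ps confirms that both containers are running:

$ docker ps --format '{{.Names}}\t{{.Image}}\t{{.Status}}'
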
We can't use them straight away, though. We need to wait for the servers to be ready. An easy way to check the status of a server is to look at its docker logs:

$ docker logs original --tail 1
2018-04-01T21:23:30.395461Z 0 [System] [MY-010931] /usr/sbin/mysqld: ready for connections. Version: '8.0.4-rc-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL).
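
If you script your tests, a small polling loop avoids guessing when the server is up (a sketch; adjust the container name and timeout to taste):

# wait up to 60 seconds for the container to report readiness
for attempt in $(seq 1 30)
do
    if docker logs original 2>&1 | grep -q 'ready for connections'
    then
        echo "# server ready"
        break
    fi
    sleep 2
done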

Here, after about 10 seconds, the server is ready to use, and the same check works for the official container. We can now access the servers. One easy method is through docker exec:

$ docker exec -ti original mysql -psecret
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 15
Server version: 8.0.4-rc-log MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

A similar command gives us access to the other container:
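
$ docker exec -ti official mysql -psecret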

If you want to try replication, more work is needed; more details on Docker operations, and examples of advanced deployments, are available in other articles on this blog.


2. Sandboxes

A sandboxed database is deployed in a non-dedicated box, with its configuration altered in such a way that it runs independently from other similar deployments, and even from databases running in the main space.
The granddaddy of the sandbox deployer was MySQL-Sandbox, which has recently evolved into the more powerful and easier to use dbdeployer.
You can use MySQL-Sandbox to test a MySQL 8.0 tarball on macOS:

$ make_sandbox --export_binaries  mysql-8.0.4-rc-macos10.13-x86_64.tar.gz

This command unpacks the tarball into $HOME/opt/mysql and deploys the database in $HOME/sandboxes/msb_8_0_4.
Until recently, the same command would work on Linux without modifications. With MySQL 8.0.4, though, the tarball organization for Linux has changed: there are symbolic links for SSL libraries inside the ./bin directory. Those symlinks are extracted only if you use the option --keep-directory-symlink when opening the tarball. MySQL-Sandbox doesn't do that, partly because this option is not standard across versions of tar.

Thus, if you want to use the old MySQL-Sandbox, you need to run the extraction manually.

$ cd $HOME/opt/mysql
$ tar --keep-directory-symlink -xzf /tmp/mysql-8.0.4-rc-linux-glibc2.12-x86_64.tar.gz
$ mv mysql-8.0.4-rc-linux-glibc2.12-x86_64 8.0.4
$ make_sandbox 8.0.4

I don't recommend the above procedure, on either Linux or macOS. The main reason, in addition to the manual operations involved, is that MySQL-Sandbox is not going to be updated for the time being. Instead, you should use dbdeployer, which has all the main features of MySQL-Sandbox plus many new ones. Here's the equivalent procedure:

$ dbdeployer unpack /tmp/mysql-8.0.4-rc-linux-glibc2.12-x86_64.tar.gz
$ dbdeployer deploy single 8.0.4
Database installed in $HOME/sandboxes/msb_8_0_4
run 'dbdeployer usage single' for basic instructions
. sandbox server started

dbdeployer uses a different method to initialize the database server, which makes the initialization more visible and at the same time avoids the problem of the phantom SSL libraries.

Note: Tarballs for recent MySQL versions are really big. MySQL 8.0.4 binaries expand to 1.9 GB. If storage is an issue, you should get the tarballs from a collection of minimised tarballs (Linux only) for most MySQL versions. For now, it's maintained by me, but I hope that the MySQL team will release something similar.

Once you have deployed a sandbox with MySQL 8.0, using it is easy:

$ cd $HOME/sandboxes/msb_8_0_4
$ ./use
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.4-rc-log MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql [localhost] {msandbox} ((none)) >

dbdeployer creates several shortcuts for the most common database commands. ./use is the most frequently used: it provides access to the MySQL client with all the options needed to use it correctly. For more information on what is available, run

$ dbdeployer usage single

This functionality alone would be enough reason to choose a sandbox as your preferred method for testing. However, this is only a tiny portion of what you can do with dbdeployer on your own computer. With a single command, you can test master/slave replication, multi-primary group replication, single-primary group replication, fan-in, and all-masters topologies.

You can try the following commands:

$ dbdeployer deploy single 8.0.4
$ dbdeployer deploy replication 8.0.4
$ dbdeployer deploy replication 8.0.4 --topology=group
$ dbdeployer deploy replication 8.0.4 --topology=group --single-primary
$ dbdeployer deploy replication 8.0.4 --topology=all-masters
$ dbdeployer deploy replication 8.0.4 --topology=fan-in

If you have enough RAM, all these deployments can run in parallel.
On my desktop, I can run:

$ dbdeployer sandboxes --header
name                        type                    version  ports
----------------            -------                 -------  -----
all_masters_msb_8_0_4     : all-masters               8.0.4 [15001 15002 15003]
fan_in_msb_8_0_4          : fan-in                    8.0.4 [14001 14002 14003]
group_msb_8_0_4           : group-multi-primary       8.0.4 [20009 20134 20010 20135 20011 20136]
group_sp_msb_8_0_4        : group-single-primary      8.0.4 [21405 21530 21406 21531 21407 21532]
msb_8_0_4                 : single                    8.0.4 [8004]
rsandbox_8_0_4            : master-slave              8.0.4 [19009 19010 19011]

When MySQL 8.0.11 is released, you can replace "8.0.4" with "8.0.11" and get a similar result.

BTW, you may have noticed that deploying replication sandboxes can take a long time. Try adding --concurrent to each command, and enjoy a notable speed increase.
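
For example:

$ dbdeployer deploy replication 8.0.4 --topology=group --concurrent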

What else can you do with the sandboxes you have just deployed? Plenty! For a complete list, have a look at the online documentation. But for the moment, you may try this:

$ dbdeployer global status
$ dbdeployer global test
$ dbdeployer global test-replication

3. Other methods

Besides the methods that I recommend, there are others that you could use, but I won't give advice about them, as there are people more qualified than me for that.

  • Standalone server. If you have the luxury of having one or more standalone servers sitting in a lab, by all means go for it. Just follow the instructions about installing MySQL on your lucky server. Be advised, though, that depending on the method you choose and the version of your operating system, you may face compatibility issues (.rpm or .deb dependencies).
  • Virtual machines. VMs share with standalone servers the same ease of installation (and the same dependency issues), only a bit slower. They are convenient, as you can use them to test in conditions that more closely resemble production settings, and if you use a configuration server such as Puppet or Ansible, your task of testing the new version could be greatly simplified. The instructions for the virtual machines are the same seen for standalone servers.

Monday, May 08, 2017

Getting to know MySQL InnoDB cluster, the new kid in the block


InnoDB Cluster was released as GA a few weeks ago. I remember the initial announcement of the product at OOW 2016, promising a seamless solution for replication and high availability with great ease of use. I was a bit disappointed to see that, at GA release time, the InnoDB Cluster is a patchwork of three separate products (Group Replication, MySQL Router, MySQL Shell) that users have to collect and install separately.

Given this situation, I was very pleased when Matthew Lord published Docker-InnoDB-Cluster, an image for Docker that contains everything you need to get the system up and running. The associated scripts make the experience even easier: not only do we not have to hunt for components, but the cluster deployment procedure is completely automated.

Installation

The process is painless. After cloning the repository, the start script takes care of everything. It creates a network, deploys three database nodes, and fires up the router.

$ ./start_three_node_cluster.sh
Creating dedicated grnet network...
# network grnet already exists
NETWORK ID          NAME                DRIVER              SCOPE
8fa365076198        grnet               bridge              local
Bootstrapping the cluster...
12fb4bd975c2fb2e7152ed64e12d2d212bbc9f1d3b39d715ea0c73eeb37fed45
Container mysqlgr1 is up at Sun May  7 22:02:38 CEST 2017
Starting mysqlgr1 container...
Starting mysqlgr1 container...
MySQL init process done. Ready for start up.
Getting GROUP_NAME...
Adding second node...
a2b504ea1920d35b1555f65de24cd364fc1bc7a6ac87ca4eb32f4c02f5afce7c
Container mysqlgr2 is up at Sun May  7 22:02:48 CEST 2017
Starting mysqlgr2 container...
Starting mysqlgr2 container...
MySQL init process done. Ready for start up.
Adding third node...
393d46b9a1795531d99f68645087393a54b2463ef88b9b3c4cbe735c1527fe57
Container mysqlgr3 is up at Sun May  7 22:02:58 CEST 2017
Starting mysqlgr3 container...
Starting mysqlgr3 container...
MySQL init process done. Ready for start up.
Sleeping 10 seconds to give the cluster time to sync up
Adding a router...
830c3125bad70b09b057cee370ee490bcb88b1d4a1bfec347cda847942f3b56e
Container mysqlrouter1 is up at Sun May  7 22:03:17 CEST 2017
Done!
Connecting to the InnoDB cluster...

Most of the configuration (which has been simplified thanks to the use of MySQL Shell to add nodes) is handled inside the container initialization script. Only a few details are needed in the cluster deployment script to get this result.

The deployment script will also invoke the mysql shell in one of the nodes to show the status of the cluster:

Creating a Session to 'root@mysqlgr1:3306'
Classic Session successfully established. No default schema selected.
{
    "clusterName": "testcluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "mysqlgr1:3306",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "mysqlgr1:3306": {
                "address": "mysqlgr1:3306",
                "mode": "R/W",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "mysqlgr2:3306": {
                "address": "mysqlgr2:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "mysqlgr3:3306": {
                "address": "mysqlgr3:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            }
        }
    }
}

The above status is the result of dba.getCluster().status(), which is a convenient way of collecting a bunch of information about the cluster and presenting it in a compact JSON structure. If you enable the general log prior to running this command, you will see something like this:

select count(*) from performance_schema.replication_group_members where MEMBER_ID = @@server_uuid AND MEMBER_STATE IS NOT NULL AND MEMBER_STATE != 'OFFLINE';
select count(*) from mysql_innodb_cluster_metadata.instances where mysql_server_uuid = @@server_uuid;
SELECT @@server_uuid, VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME = 'group_replication_primary_member';
SELECT MEMBER_STATE FROM performance_schema.replication_group_members WHERE MEMBER_ID = '0030396b-3300-11e7-a8b6-0242ac170002';
SELECT CAST(SUM(IF(member_state = 'UNREACHABLE', 1, 0)) AS SIGNED) AS UNREACHABLE,  COUNT(*) AS TOTAL FROM performance_schema.replication_group_members;
select count(*) from performance_schema.replication_group_members where MEMBER_ID = @@server_uuid AND MEMBER_STATE IS NOT NULL AND MEMBER_STATE != 'OFFLINE';
select count(*) from mysql_innodb_cluster_metadata.instances where mysql_server_uuid = @@server_uuid;
SELECT @@server_uuid, VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME = 'group_replication_primary_member';
SELECT MEMBER_STATE FROM performance_schema.replication_group_members WHERE MEMBER_ID = '0030396b-3300-11e7-a8b6-0242ac170002';
SELECT CAST(SUM(IF(member_state = 'UNREACHABLE', 1, 0)) AS SIGNED) AS UNREACHABLE,  COUNT(*) AS TOTAL FROM performance_schema.replication_group_members;
SELECT cluster_id, cluster_name, default_replicaset, description, options, attributes FROM mysql_innodb_cluster_metadata.clusters WHERE attributes->'$.default' = true;
show databases like 'mysql_innodb_cluster_metadata';
SELECT replicaset_name, topology_type FROM mysql_innodb_cluster_metadata.replicasets WHERE replicaset_id = 7;
select count(*) from performance_schema.replication_group_members where MEMBER_ID = @@server_uuid AND MEMBER_STATE IS NOT NULL AND MEMBER_STATE != 'OFFLINE';
select count(*) from mysql_innodb_cluster_metadata.instances where mysql_server_uuid = @@server_uuid;
SELECT @@server_uuid, VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME = 'group_replication_primary_member';
SELECT MEMBER_STATE FROM performance_schema.replication_group_members WHERE MEMBER_ID = '0030396b-3300-11e7-a8b6-0242ac170002';
SELECT CAST(SUM(IF(member_state = 'UNREACHABLE', 1, 0)) AS SIGNED) AS UNREACHABLE,  COUNT(*) AS TOTAL FROM performance_schema.replication_group_members;
SELECT @@group_replication_single_primary_mode;
SHOW STATUS LIKE 'group_replication_primary_member';
SELECT mysql_server_uuid, instance_name, role, MEMBER_STATE, JSON_UNQUOTE(JSON_EXTRACT(addresses, "$.mysqlClassic")) as host FROM mysql_innodb_cluster_metadata.instances LEFT JOIN performance_schema.replication_group_members ON `mysql_server_uuid`=`MEMBER_ID` WHERE replicaset_id = 7;

In short, these commands check that the cluster is resilient. The result is summarized in the statusText field, which says that we can lose up to one node and the cluster will keep working.
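
If you want to produce this report yourself, you can run the shell against any node. A sketch, assuming the container names used by the deployment script and the mysqlsh binary bundled in the image:

$ docker exec -it mysqlgr1 mysqlsh --uri root@mysqlgr1:3306
mysql-js> dba.getCluster().status()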

High Availability

What we have after deployment is a system that is highly available:

  • Group replication with one primary node;
  • Access to the cluster through the router, which provides one port for Read/write and one for Read-Only;
  • Automatic failover. When the primary node fails, another one is promoted on the spot, without any manual labor.

Let's start a test. We can check whether the data inserted from the R/W port is then retrieved by other nodes using the R/O port.

$ docker exec -it mysqlrouter1 /opt/ic/tests/test_router.sh
Server ID of current master
--------------
SELECT @@global.server_id
--------------

+--------------------+
| @@global.server_id |
+--------------------+
|                100 |
+--------------------+
Create content using router
--------------
create schema if not exists test
--------------

--------------
create table t1(id int not null primary key, name varchar(50))
--------------

--------------
insert into t1 values (1, "aaa")
--------------

The first part of the test shows the server ID of the primary node, by using the router R/W port (6446). Then it creates a table and inserts one record.

Server ID of a RO node
--------------
SELECT @@global.server_id
--------------

+--------------------+
| @@global.server_id |
+--------------------+
|                200 |
+--------------------+
retrieving contents using router
--------------
SELECT * from test.t1
--------------

+----+------+
| id | name |
+----+------+
|  1 | aaa  |
+----+------+

Using the read-only port (6447), we get a different node, and we retrieve the data created in the primary node.
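
You can also run these checks by hand, pointing a client at the router's two ports. A sketch, assuming the grnet network created by the start script and any image that ships the mysql client:

$ docker run -it --rm --net=grnet mysql/mysql-server:5.7 \
    mysql -h mysqlrouter1 -P 6446 -u root -psecret   # R/W port
$ docker run -it --rm --net=grnet mysql/mysql-server:5.7 \
    mysql -h mysqlrouter1 -P 6447 -u root -psecret   # R/O port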

Now we can test the high availability. Since we are using Docker, instead of simply killing the MySQL service, we can simulate an anvil falling on the server by wiping away the container:

$ docker rm -f -v mysqlgr1
mysqlgr1

The primary node is gone for good. Let's see what the cluster status says now:

$ ./tests/check_cluster.sh 2
Creating a Session to 'root@mysqlgr2:3306'
Classic Session successfully established. No default schema selected.
{
    "clusterName": "testcluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "mysqlgr2:3306",
        "status": "OK_NO_TOLERANCE",
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active",
        "topology": {
            "mysqlgr1:3306": {
                "address": "mysqlgr1:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "(MISSING)"
            },
            "mysqlgr2:3306": {
                "address": "mysqlgr2:3306",
                "mode": "R/W",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "mysqlgr3:3306": {
                "address": "mysqlgr3:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            }
        }
    }
}

There are a few differences compared to the initial report:

  • The primary is now node 2 (mysqlgr2);
  • Node 1 is marked as MISSING;
  • The cluster has lost its resilience. Unless we add another node, no further failures will be handled automatically.

We can run the router test again, and it will work just as well, with the differences reported below:

Server ID of current master
--------------
SELECT @@global.server_id
--------------

+--------------------+
| @@global.server_id |
+--------------------+
|                200 |
+--------------------+
Create content using router
--------------
create schema if not exists test
--------------

--------------
drop table if exists t1
--------------

--------------
create table t1(id int not null primary key, name varchar(50))
--------------

--------------
insert into t1 values (1, "aaa")
--------------

Server ID of a RO node
--------------
SELECT @@global.server_id
--------------

+--------------------+
| @@global.server_id |
+--------------------+
|                300 |
+--------------------+

We see that the primary now has ID 200, and the R/O node is 300 (the only other node that has survived).

Summarizing

  • The good

    • I can see that some of the ease of use promised in San Francisco is already available. We can create a cluster with little effort.
    • The recovery from the master failure is transparent.
    • The cluster status gives clear information about the system.
  • The bad

    • MySQL shell is difficult to use. The command line help is insufficient: some options require trial and error to work correctly. It also does not use an options file like other MySQL clients.
    • Adding a node after the primary has become unavailable is harder than it should be, and the manual does not contemplate this case. It only mentions a server that can be restarted.
    • Restarting the router after the primary died is impossible with the current configuration.
    • The metadata for replication is now in three different schemas: mysql, performance_schema, and mysql_innodb_cluster_metadata. I understand the reasons, but I believe that a simplification would be possible.
  • The bottom line: quite good to start a cluster, but not enough to deal effectively with simple HA cases. Possibly released too early.

Sunday, January 15, 2017

MySQL group replication: installation with Docker

Overview

MySQL Group Replication was released as GA with MySQL 5.7.17. It is essentially a plugin that, when enabled, allows users to set up replication in this new way.
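
For reference, enabling the plugin takes a single statement, plus a quick check that it is active (we will run the INSTALL statement in context later in this post):

mysql> INSTALL PLUGIN group_replication SONAME 'group_replication.so';
mysql> SELECT plugin_name, plugin_status FROM information_schema.plugins WHERE plugin_name = 'group_replication';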

There has been some confusion about the stability and usability of this release. Until recently, MySQL Group Replication (MGR) was only available in the Labs, which traditionally denotes a preview or a use-at-your-own-risk feature. Several months ago we saw the release of Group Replication as a Docker image, which allowed users to deploy a peer-to-peer cluster (every node is a master). However, about one month after that release, word came from Oracle discouraging this setup, and inviting users to run Group Replication in Single-Primary mode, which is functionally equivalent to traditional replication, with a few more synchronous components. There hasn't been an update of MGR for Docker since.

BTW, some more confusion came from the use of "synchronous replication" to refer to Group Replication operations. In reality, what many presentations called synchronous replication is only a synchronous transfer of binary log data. The replication itself, i.e. the operation that makes a node able to retrieve the data inserted in the master, is completed asynchronously. Therefore, if you looked at MGR as a way of using multiple masters without conflicts, this is not the solution.

What we have is a way of replicating from a node that is the Primary in the group, with some features designed to facilitate high availability solutions. And all eyes are on the next product, based on MGR, named MySQL InnoDB Cluster, which is MGR + a hormone-pumped MySQL Shell (released with the same version number 1.0.5 in two different packages) + MySQL Router.

MGR has several limitations, mostly related to multi-primary mode.

Another thing that users should know is that the performance of MGR is inferior to that of asynchronous replication, even in Single-Primary mode. As an example, loading the employees test database takes 92 seconds in MGR, against 49 seconds in asynchronous replication (same OS, same MySQL version, same server setup).

Installing MySQL Group Replication

One of the biggest issues with MGR has been the quality of its documentation, which for a while was no documentation at all. What we have now is a set of instructions that refers to installing group replication on three nodes on the same host. You know, sandboxes, although without the benefit of a tool to simplify operations. It's just three servers on the same host, and you drive with stick shift.

What we'll see in this post is how to set up group replication using three servers in Docker. The advantage of this approach is that the servers look and feel like real ones. Since the official instructions assume that you are only playing with sandboxes (an odd assumption for a GA product), we lack instructions for a real-world setup. The closest thing to a useful manual is the tutorial given by Frédéric Descamps and Kenny Gryp at PerconaLive Amsterdam in October. Those instructions, however, are muddled by the fact that they were using the still unreliable InnoDB Cluster instead of bare-bones Group Replication. What follows is my own expansion of the sandboxed rules as applied to distinct servers.

The environment:

I am using Docker 1.12.6 on Linux, and the mysql/mysql-server:5.7.17 image. I deploy three containers, with a customized my.cnf containing the bare minimum options to run Group Replication. Here's the template for the configuration files:

$ cat my-template.cnf
[mysqld]
user=mysql
server_id=_SERVER_ID_
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
log_slave_updates=ON
log_bin=mysql-bin
relay-log=relay
binlog_format=ROW
log-error=mysqld.err

transaction_write_set_extraction=XXHASH64
loose-group_replication_group_name="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
loose-group_replication_start_on_boot=off
loose-group_replication_local_address= "172.19.0._IP_END_:6606"
loose-group_replication_group_seeds= "172.19.0.2:6606,172.19.0.3:6606,172.19.0.4:6606"
loose-group_replication_ip_whitelist="172.19.0.2,172.19.0.3,172.19.0.4,127.0.0.1"
loose-group_replication_bootstrap_group= off

Here I take a shortcut. Recent versions of Docker assign a predictable IP address to new containers. To make sure I get the right IPs, I use a private network to deploy the containers. In a perfect world, I would use the container names for this purpose, but the manual lacks instructions for setting up the cluster progressively. For now, this method requires full knowledge of the IPs of the nodes, and I play along with what I have.

This is the deployment script:

#!/bin/bash
# Create a dedicated network for the group, unless it already exists
exists_net=$(docker network ls | grep -w group1 )
if [ -z "$exists_net" ]
then
    docker network create group1
fi
docker network ls

for node in 1 2 3
do
    # Fill in the template: server_id is the node number, and the last
    # octet of the IP is node+1 (the network gateway takes .1)
    export SERVERID=$node
    export IPEND=$(($SERVERID+1))
    perl -pe 's/_SERVER_ID_/$ENV{SERVERID}/;s/_IP_END_/$ENV{IPEND}/' my-template.cnf > my${node}.cnf
    # Each node keeps its data directory on the host
    datadir=ddnode${node}
    if [ ! -d $datadir ]
    then
        mkdir $datadir
    fi
    unset SERVERID
    docker run -d --name=node$node --net=group1 --hostname=node$node \
        -v $PWD/my${node}.cnf:/etc/my.cnf \
        -v $PWD/data:/data \
        -v $PWD/$datadir:/var/lib/mysql \
        -e MYSQL_ROOT_PASSWORD=secret \
        mysql/mysql-server:5.7.17

    # Show the IP that Docker assigned to the new container
    ip=$(docker inspect --format '{{ .NetworkSettings.Networks.group1.IPAddress}}' node${node})
    echo "${node} $ip"
done

This script deploys three nodes, called node1, node2, and node3. For each one, the template is modified to use a different server ID. The nodes use an external data directory created in the current directory (see Customizing MySQL in Docker for more details on this technique). Moreover, each node can access the folder /data, which contains this set of SQL commands:

reset master;
SET SQL_LOG_BIN=0;
CREATE USER rpl_user@'%';
GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%' IDENTIFIED BY 'rpl_pass';
SET SQL_LOG_BIN=1;
CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='rpl_pass' FOR CHANNEL 'group_replication_recovery';
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

Operations

After deploying the containers using the above script, I wait a few seconds to give the servers time to be ready. I can peek at the error logs, which are in the directories ddnode1, ddnode2, and ddnode3, as defined in the installation command. Then I run the SQL code:

$ for N in 1 2 3; do docker exec -ti node$N bash -c 'mysql -psecret < /data/user.sql' ; done

At this stage, the plugin is installed in all three nodes. I can start the cluster:

$ docker exec -ti node1 mysql -psecret
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.7.17-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SET GLOBAL group_replication_bootstrap_group=ON;
Query OK, 0 rows affected (0.00 sec)

mysql> START GROUP_REPLICATION;
Query OK, 0 rows affected (1.14 sec)

mysql> SET GLOBAL group_replication_bootstrap_group=OFF;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | ecba1582-db68-11e6-a492-0242ac130002 | node1       |        3306 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
1 row in set (0.00 sec)

The above operations have started the replication with the bootstrap, an operation that must be executed only once, and that defines the primary node.
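
To double-check which node is acting as the primary, you can ask for the corresponding status variable (in single-primary mode it returns the UUID of the primary; in multi-primary mode it is empty):

mysql> SHOW STATUS LIKE 'group_replication_primary_member';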

After setting the replication, I can enter some data, and then see what happens in the other nodes:

mysql> create schema test;
Query OK, 1 row affected (0.01 sec)

mysql> use test
Database changed
mysql> create table t1 (id int not null primary key, msg varchar(20));
Query OK, 0 rows affected (0.06 sec)

mysql> insert into t1 values (1, 'hello from node1');
Query OK, 1 row affected (0.01 sec)

mysql> show binlog events;
+------------------+------+----------------+-----------+-------------+----------------------------------------------------------------------------+
| Log_name         | Pos  | Event_type     | Server_id | End_log_pos | Info                                                                       |
+------------------+------+----------------+-----------+-------------+----------------------------------------------------------------------------+
| mysql-bin.000001 |    4 | Format_desc    |         1 |         123 | Server ver: 5.7.17-log, Binlog ver: 4                                      |
| mysql-bin.000001 |  123 | Previous_gtids |         1 |         150 |                                                                            |
| mysql-bin.000001 |  150 | Gtid           |         1 |         211 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:1'          |
| mysql-bin.000001 |  211 | Query          |         1 |         270 | BEGIN                                                                      |
| mysql-bin.000001 |  270 | View_change    |         1 |         369 | view_id=14845163185775300:1                                                |
| mysql-bin.000001 |  369 | Query          |         1 |         434 | COMMIT                                                                     |
| mysql-bin.000001 |  434 | Gtid           |         1 |         495 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:2'          |
| mysql-bin.000001 |  495 | Query          |         1 |         554 | BEGIN                                                                      |
| mysql-bin.000001 |  554 | View_change    |         1 |         693 | view_id=14845163185775300:2                                                |
| mysql-bin.000001 |  693 | Query          |         1 |         758 | COMMIT                                                                     |
| mysql-bin.000001 |  758 | Gtid           |         1 |         819 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:3'          |
| mysql-bin.000001 |  819 | Query          |         1 |         912 | create schema test                                                         |
| mysql-bin.000001 |  912 | Gtid           |         1 |         973 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:4'          |
| mysql-bin.000001 |  973 | Query          |         1 |        1110 | use `test`; create table t1 (id int not null primary key, msg varchar(20)) |
| mysql-bin.000001 | 1110 | Gtid           |         1 |        1171 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:5'          |
| mysql-bin.000001 | 1171 | Query          |         1 |        1244 | BEGIN                                                                      |
| mysql-bin.000001 | 1244 | Table_map      |         1 |        1288 | table_id: 219 (test.t1)                                                    |
| mysql-bin.000001 | 1288 | Write_rows     |         1 |        1341 | table_id: 219 flags: STMT_END_F                                            |
| mysql-bin.000001 | 1341 | Xid            |         1 |        1368 | COMMIT /* xid=144 */                                                       |
+------------------+------+----------------+-----------+-------------+----------------------------------------------------------------------------+
19 rows in set (0.00 sec)

The binary log events show that we are replicating using the ID of the group, instead of the ID of the single server.

In the other two nodes I run the operation a bit differently:

$ docker exec -ti node2 mysql -psecret
mysql> select * from performance_schema.global_variables where variable_name in ('read_only', 'super_read_only');
+-----------------+----------------+
| VARIABLE_NAME   | VARIABLE_VALUE |
+-----------------+----------------+
| read_only       | OFF            |
| super_read_only | OFF            |
+-----------------+----------------+
2 rows in set (0.01 sec)

mysql>  START GROUP_REPLICATION;
Query OK, 0 rows affected (5.62 sec)

mysql> select * from performance_schema.global_variables where variable_name in ('read_only', 'super_read_only');
+-----------------+----------------+
| VARIABLE_NAME   | VARIABLE_VALUE |
+-----------------+----------------+
| read_only       | ON             |
| super_read_only | ON             |
+-----------------+----------------+
2 rows in set (0.01 sec)

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | ecba1582-db68-11e6-a492-0242ac130002 | node1       |        3306 | ONLINE       |
| group_replication_applier | ecf2eae5-db68-11e6-a492-0242ac130003 | node2       |        3306 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
2 rows in set (0.01 sec)

Now the cluster has two nodes, and I've seen that the nodes are automatically defined as read-only. I can repeat the same operation in the third one.

$ docker exec -ti node3 mysql -psecret
mysql> START GROUP_REPLICATION;
Query OK, 0 rows affected (2.35 sec)

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | ecba1582-db68-11e6-a492-0242ac130002 | node1       |        3306 | ONLINE       |
| group_replication_applier | ecf2eae5-db68-11e6-a492-0242ac130003 | node2       |        3306 | ONLINE       |
| group_replication_applier | ed259dfc-db68-11e6-a4a6-0242ac130004 | node3       |        3306 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)

What about the data? It's been replicated:

mysql> show schemas;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.00 sec)

mysql> show tables from test;
+----------------+
| Tables_in_test |
+----------------+
| t1             |
+----------------+
1 row in set (0.01 sec)

Monitoring

In this flavor of replication there is no SHOW SLAVE STATUS. Everything I've got is in performance_schema tables and in mysql.slave_master_info and mysql.slave_relay_log_info, and sadly it is not a lot.

mysql> select * from replication_group_member_stats\G
*************************** 1. row ***************************
                      CHANNEL_NAME: group_replication_applier
                           VIEW_ID: 14845163185775300:3
                         MEMBER_ID: ecba1582-db68-11e6-a492-0242ac130002
       COUNT_TRANSACTIONS_IN_QUEUE: 0
        COUNT_TRANSACTIONS_CHECKED: 3
          COUNT_CONFLICTS_DETECTED: 0
COUNT_TRANSACTIONS_ROWS_VALIDATING: 0
TRANSACTIONS_COMMITTED_ALL_MEMBERS: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:1-6
    LAST_CONFLICT_FREE_TRANSACTION: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:5
1 row in set (0.00 sec)


mysql> select * from replication_connection_status\G
*************************** 1. row ***************************
             CHANNEL_NAME: group_replication_recovery
               GROUP_NAME:
              SOURCE_UUID:
                THREAD_ID: NULL
            SERVICE_STATE: OFF
COUNT_RECEIVED_HEARTBEATS: 0
 LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00
 RECEIVED_TRANSACTION_SET:
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 2. row ***************************
             CHANNEL_NAME: group_replication_applier
               GROUP_NAME: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
              SOURCE_UUID: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
                THREAD_ID: NULL
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 0
 LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00
 RECEIVED_TRANSACTION_SET: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:1-6
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
2 rows in set (0.00 sec)

Compared to regular replication, we lose the ID of the node where the data originated. Instead, we get the ID of the replication group (which we set in the configuration file). This is useful for a smoother operation of replacing the primary node (a.k.a. the master) with another node, but we have lost some valuable information that could have been added to the output rather than simply replaced. Another valuable piece of information that is missing is the set of transactions that were executed (we only see RECEIVED_TRANSACTION_SET). As in regular replication, we can get this information with SHOW MASTER STATUS or SELECT @@global.gtid_executed, but as mentioned in improving the design of MySQL replication, there are several flaws in this paradigm. What we see in MGR is a reduction of replication monitoring data, while we would have expected some improvement, given the complexity of the operations in this new technology.
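
For example, on any node the executed set looks like this (output illustrative, matching the GTID set we saw in the stats table above):

mysql> SELECT @@global.gtid_executed\G
*************************** 1. row ***************************
@@global.gtid_executed: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:1-6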

Summing up

MySQL Group Replication is an interesting technology. If we consider it in the framework of a component for high availability (which will be completed when the InnoDB Cluster is released) it might improve the workflow of many database users.

As it is now, however, it gives the feeling of being a rushed piece of software that does not offer any noticeable advantage to users, especially considering that the documentation released with it is far below the standard of other MySQL products.

Monday, September 26, 2016

PerconaLive Amsterdam 2016 - Talks and shows

With Oracle Open World behind us, we are now getting ready for the next big event, i.e. the European edition of PerconaLive. I am going to be a presenter three times:

  • MySQL operations in Docker is a three-hour tutorial, and it will be an expansion of the talk by the same title presented at OOW. Attendees who want to play along can do so by coming prepared with Docker 1.11 or later and the following images already pulled (images with [+] are mandatory, while [-] are optional):

    • mysql/mysql-server [+]
    • mysql/mysql-gr [+]
    • mysql [-]
    • percona/percona-server [-]
    • mariadb [-]
    • mysql/shell [-]
    • datacharmer/mysql-minimal-5.5 [-]
    • datacharmer/mysql-minimal-5.6 [-]
    • datacharmer/mysql-minimal-5.7 [-]
    • datacharmer/mysql-minimal-8.0 [-]
    • datacharmer/my-ubuntu [-]
    • datacharmer/my-debian [-]
    • datacharmer/my-centos [-]
  • The fastest customized MySQL deployment ever is a presentation where I show two ways of deploying MySQL on a custom server, with MySQL Sandbox and Docker.

  • The lightning talks will be held during the Community dinner at Booking.com, diamond sponsor of the conference. If you want to attend, you need to register, and if you want a free ticket for that dinner, there is still ONE SLOT for the lightning talks. Contact me directly with a title and an abstract if you want to be considered for that slot (finding my email is part of the challenge, but it should not be that difficult).

UPDATE: here's the lightning talks program. Percona should eventually add it to the schedule.

  • 19:00: arrival at Booking.com by boat
  • 19:15: welcoming speech and beginning of the LT
  • 20:00: food buffet opens, LT are not finished yet
  • 20:30: LT are done, buffet still open
  • 21:15: buffet closes

The talks accepted are the following:

  • Jaime Crespo
    dbhell: a tiny Python framework for the administration and monitoring of farms of MySQL servers
  • Federico Razzoli
    How to write slow and buggy stored procedures
  • Art Van Scheppingen
    Simple household rules that keep MySQL running
  • Pavel Tru
    What internal statistics every self-respecting dbms should have!
  • Giuseppe Maxia
    Fastest, smallest, newest: download and deploy MySQL at high speed
  • Ronald Bradford
    An Awe-Inspiring Journey Through the World of Numbers

Sunday, September 25, 2016

MySQL at Oracle Open World 2016

MySQL is a growing presence at Oracle Open World. While most of the headlines belong to the main products, where Oracle services are aiming at world domination, MySQL shared the spotlight, as it was announced to be part of Oracle database cloud. It seems a logical move for Oracle: after all the effort to make MySQL 5.7 the biggest release ever, it stands to reason that it is offered as a competitive feature in its own database as a service.

With this offer, Oracle is applying enterprise pricing and methodologies to a target of emerging companies. MySQL in the Oracle cloud differs from the competition by a few key points:

  • It's only MySQL 5.7. While this is the most advanced MySQL server available, users with existing deployments may exercise caution before adopting this cloud service. Companies that are starting now, instead, may take advantage of using the latest and greatest.
  • It's MySQL Enterprise edition, with a wealth of additional monitoring and management tools that are missing in other cloud offerings.
  • Unlike some popular competitors, it includes access to the server host, allowing DBAs to enable plugins, fine tune the deployment, and explore the operating system in addition to the database.
  • It includes Oracle support.

Time will tell if this offering will be successful. It may not appeal to all the users, but there is surely a category of power users who can take advantage of these features.

The surprises of the MySQL keynote did not stop at the cloud. We had already seen, one week before the conference, that MySQL 8.0 was released, with many juicy features. What came unexpected were two announcements:

  • MySQL group replication, a product that has been living in MySQL Labs for quite a while, was declared "release candidate" despite lacking documentation and being released in a format that discourages adoption, except by diehard hackers.
  • Another product has been released, again in the Labs, with an ambitious mission. The MySQL InnoDB Cluster is based on group replication and aims to be a 15-minute deployment of a highly available and scalable system, thanks to an enhanced version of MySQL Shell (the same used for the document store). Its feature set is exciting, but what we have seen in the demos suggests that the product is still in the early stages of development.

With these moves, Oracle is showing two paths of MySQL development:

  • in the main path, which has produced the GA of MySQL 5.5, 5.6, and 5.7, the MySQL team is showing the positive influence of Oracle engineering, with focus on security, performance, and stability.
  • in a parallel course, which started last April with the announcement of MySQL document store and its related shell, the team wants to introduce new features to a GA release as plugins, with the reasoning that the main release will not be touched (thus avoiding the taboo of altering a stable product) but users are free to enable plugins and unleash new functionalities.

The mix of traditional and agile releases is provoking exciting thoughts, albeit moderated by the fear of using experimental code together with a GA deployment.

The methodology of these releases is also baffling. It is unclear how mature the document store is. The plugin comes with the server, and it is accompanied by a huge set of documentation, which implies that it has been designed extensively and tested internally for a while; but the main tool for the feature, MySQL Shell, is labeled as a development preview: not very encouraging. On the other hand, the latest plugin addition, MySQL Group Replication, which has been declared of release-candidate quality, is still in the Labs (no plugin in the server release), and without documentation.

All considered, while it is clear that Oracle is putting an enormous engineering effort into growing MySQL, I have the feeling that the replication features have been neglected, and the mixed fortunes of the group replication announcement confirm this belief.

The conference was useful to me. I had the chance of meeting many Oracle engineers and users, and discuss technical and community matters at length. My own presentation, a two-hour tutorial on MySQL operations in Docker was highly satisfactory, as it proved to be an interesting topic that was actively discussed by the audience.

Sunday, July 10, 2016

The fastest MySQL Sandbox setup ever!

MySQL-Sandbox 3.1.11 introduces a new utility, different from anything I have put before in the MySQL Sandbox toolkit.

make_sandbox_from_url downloads a tiny MySQL tarball from a repository and installs it straight away.

As of today, the following packages are available

Major release   version   package size          expanded size    original size
                          (what you download)   (storage used)   (not included)
5.0             5.0.96    20M                   44M              371M
5.1             5.1.72    23M                   59M              485M
5.5             5.5.50    15M                   49M              690M
5.6             5.6.31    18M                   61M              1.1G
5.7             5.7.13    33M                   108M             2.5G

The tarballs mentioned in the table above are much smaller than the original packages. The binaries have been stripped of debug info, compressed whenever possible, and purged of all binaries that are not needed for sandbox operations. This means that:

  • You can download the needed tarball very fast;
  • The storage needed for the binaries is reduced immensely.


Here is an example of the script in action. We download and install MySQL 5.0.96 in one go:

$ make_sandbox_from_url 5.0 -- --no_show
wget -O 5.0.96.tar.gz
'http://github.com/datacharmer/mysql-docker-minimal/blob/master/dbdata/5.0.96.tar.gz?raw=true'
URL transformed to HTTPS due to an HSTS policy
--2016-07-10 17:59:33--
https://github.com/datacharmer/mysql-docker-minimal/blob/master/dbdata/5.0.96.tar.gz?raw=true
Resolving github.com (github.com)... 192.30.253.112
Connecting to github.com (github.com)|192.30.253.112|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location:
https://github.com/datacharmer/mysql-docker-minimal/raw/master/dbdata/5.0.96.tar.gz
[following]
--2016-07-10 17:59:33--
https://github.com/datacharmer/mysql-docker-minimal/raw/master/dbdata/5.0.96.tar.gz
Reusing existing connection to github.com:443.
HTTP request sent, awaiting response... 302 Found
Location:
https://raw.githubusercontent.com/datacharmer/mysql-docker-minimal/master/dbdata/5.0.96.tar.gz
[following]
--2016-07-10 17:59:34--
https://raw.githubusercontent.com/datacharmer/mysql-docker-minimal/master/dbdata/5.0.96.tar.gz
Resolving raw.githubusercontent.com (raw.githubusercontent.com)...
151.101.12.133
Connecting to raw.githubusercontent.com
(raw.githubusercontent.com)|151.101.12.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 20052235 (19M) [application/octet-stream]
Saving to: ‘5.0.96.tar.gz’

5.0.96.tar.gz
100%[=================================================================================>]
19.12M  15.2MB/s    in 1.3s

2016-07-10 17:59:37 (15.2 MB/s) - ‘5.0.96.tar.gz’ saved [20052235/20052235]

    The MySQL Sandbox,  version 3.1.11
    (C) 2006-2016 Giuseppe Maxia
# Starting server
. sandbox server started
# Loading grants
Your sandbox server was installed in $HOME/sandboxes/msb_5_0_96

If you call the same command twice, you will get a message saying that you can now use make_sandbox x.x.xx to install your sandbox.

The script does what I should probably have made the default behavior from the beginning: it expands the tarball in $SANDBOX_BINARY (by default $HOME/opt/mysql), from where it is easy to reuse with minimum typing.
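
After the first run, the binaries stay available for reuse. For example, to create a second sandbox from the same binaries in a different directory (using the regular make_sandbox options):

$ ls $HOME/opt/mysql
5.0.96
$ make_sandbox 5.0.96 -- --sandbox_directory=msb_5_0_96_b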

As of today, the binaries are Linux ONLY. I made this repository to use with Docker (I will write about it soon), and that means using Linux. This is still part of an experiment that so far is working well. The project can either evolve in smarter directions or merge with clever containers. It's too early to say. For now, enjoy the fastest setup that MySQL Sandbox can offer!

Wednesday, May 04, 2016

Docker for Mac beta and MySQL - First impressions

Using Docker for development is a great way of ensuring that what you develop will be the same that you deploy in production. This is true for almost everything. If you develop on Linux, the above statement holds. If you develop on a different operating system (OSX or Windows) there are several restrictions.

I showed one of those issues in a recent article (MySQL and Docker on a Mac: networking oddity). When you want to export a port from a service running in the container, the exported port is not available on your Mac, but in the virtual machine that runs the Docker services. This happens with any application that listens to a port.

The second limitation I found affects only MySQL, and it is related to using volumes. The proper way of achieving data persistence with containers is through volumes, i.e. telling the container to run the data directory in a virtual path that refers to some safe place in the host computer. That can't be done on a Mac, because the host computer is a virtual machine, and even though Docker can access a folder in your Mac, the server installation fails for lack of permissions.

Both the above restrictions are lifted if you use the beta release of Docker for Mac and Windows. It's a private beta: you need to apply and wait to be given an operational token, but once you are in, you notice the differences between the beta and the "old" Docker-Toolbox:


  • The Docker app is a native app, which you install by copying its icon to the /Applications folder;
  • You don't need Virtualbox or VMware Fusion. It comes with its own lightweight VM based on xhyve.
  • There is no need to run docker-machine start xxx and eval $(docker-machine env xxx). The new app is fully integrated with the OS.
  • Ports exported from a container are available in your Mac.
  • You can keep both the Docker Toolbox and the new Docker app in the same host, provided that you don't run them both in the same terminal session.

Back to our claim of lifted limitations: let's try a full installation on a Mac as we would do it on Linux.

$ docker run --name mybox -e MYSQL_ROOT_PASSWORD=secret -d \
    -v ~/docker/mysql/single:/var/lib/mysql \
    -p 5000:3306 mysql/mysql-server
72ca99918076ff0e5702514311cc706ffcc27f98917f211e98ed187dfda3b47b

$ ls  ~/docker/mysql/single/
auto.cnf    client-key.pem  ibdata1     mysql.sock.lock    server-cert.pem
ca-key.pem  ib_buffer_pool  ibtmp1      performance_schema server-key.pem
ca.pem      ib_logfile0     mysql       private_key.pem    sys
client-cert.pem             ib_logfile1 mysql.sock         public_key.pem

We created a MySQL server container with the internal port 3306 exposed to the external port 5000, and with the data directory running in the host directory $HOME/docker/mysql/single. It seems that the data directory was created correctly. Now we use a MySQL client on the Mac to connect to the container, using port 5000 on the local network address. (Note: there is NO database server running on my Mac, only in the container.)

$ sudo netstat -atn  |grep LISTEN | grep 5000
tcp4       0      0  *.5000                 *.*                    LISTEN

$ ~/opt/mysql/5.7.12/bin/mysql -h 127.0.0.1  -u root -psecret -P 5000
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.12 MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create schema hello_docker;
Query OK, 1 row affected (0.01 sec)

mysql> use hello_docker
Database changed
mysql> create table i_am_here(id int);
Query OK, 0 rows affected (0.03 sec)

mysql> exit
Bye

$ ls  ~/docker/mysql/single/hello_docker/
db.opt        i_am_here.frm    i_am_here.ibd

This is full integration! Using a Mac client we connected to the container, where we created a table, which then appeared inside the data directory in the Mac host!

It's still early to say if this beta is ready for more serious work, but the first impressions are really good!

Monday, February 29, 2016

MySQL Sandboxes in Docker

Overview

When I got interested in Docker, I started playing idly with the idea of integrating containers and MySQL Sandbox. My first experiments were not encouraging. Using a container the same way I would use a regular server produced horrible results. I started by creating a Debian or CentOS container, installing MySQL Sandbox, and then importing an expanded tarball into the container. What happens is that tarballs of recent MySQL versions expand to roughly 2 GB of binaries. When you try to put that into a container you get a bloated file system. If you want to expand more than one tarball, you get an enormous unusable blob that is contrary to what containers should be used for. There is, of course, the possibility of using volumes, which would avoid the problem of making the container too big, but would not make it easier for users. The current difficulty of MySQL Sandbox is its non-intuitiveness. I want to use containers to make things easier for users. Getting the tarballs for more than one version and using them transparently is possible, but not simple. This was my starting point.

Shrinking MySQL tarballs

I read two articles recently about reducing the size of a MySQL distribution. I hadn't given much thought to this issue, other than noticing that the binaries are taking more and more storage in my hosts. While this has not been a huge problem in my servers so far, it would be nice to save a few gigabytes in my laptop. So these two posts made me think again.

The procedure is simple. What you need to run a MySQL server in a sandbox is only a handful of files. For example, for MySQL 5.7.11, you need, from the ./bin directory:

3.5M my_print_defaults
4.4M mysql
4.1M mysqladmin
4.8K mysql_config
 25M mysqld
 26K mysqld_safe

This is about 33M. The sizes are small because I run strip on all binaries. Additionally, I need some files from the ./share directory, from which I shave away all languages except English, leaving me with about 3.5M.

To complete the usability of the sandbox, I need these files, which add 8.6M to the total.

4.4M mysqlbinlog
4.2M mysqldump

This brings the grand total for MySQL 5.7 to ~ 47M. Compared to the original size of the tarball at 2GB, it's an impressive reduction. However, I need to add something if I plan to use plugins in the sandbox: the ./lib directory, with the contents of ./lib/plugin. This roughly doubles the size of the repository, but 90M is hardly a problem compared to dealing with the size of the original tarballs.
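
The process is easy to script. Here is a minimal sketch for 5.7.11 (the tarball directory name and the exact file list are illustrative; my actual scripts use a per-version file list):

#!/bin/bash
# Sketch: reduce an expanded MySQL tarball to a minimal, sandbox-ready set
VERSION=5.7.11
SOURCE=mysql-$VERSION-linux-glibc2.5-x86_64   # hypothetical expanded tarball
DEST=reduced/$VERSION
mkdir -p $DEST/bin $DEST/share

# client and server binaries needed to run a sandbox
for f in my_print_defaults mysql mysqladmin mysql_config mysqld mysqld_safe \
         mysqlbinlog mysqldump
do
    cp $SOURCE/bin/$f $DEST/bin/
done

# remove debug symbols (scripts such as mysqld_safe are skipped with a warning)
strip $DEST/bin/* 2>/dev/null

# keep the bootstrap SQL files and the English error messages only
cp $SOURCE/share/*.sql $DEST/share/
cp -r $SOURCE/share/english $DEST/share/

tar -czf $VERSION.tar.gz -C reduced $VERSION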

Of course, I have automated the process, with a file list for every version of MySQL. After the reduction, I got the following (sizes in MB):

Version   original size   reduced size
5.0       371             44
5.1       485             59
5.5       687             49
5.6       1100            61
5.7       2000            91



Building the images

My goal with using Docker in conjunction with MySQL Sandbox is to simplify the user experience, while keeping intact the advanced functionalities. After a few unsatisfactory attempts that allowed me to learn a few tricks and avoid pitfalls, I came up with the plan of producing three images:

  • One with the MySQL Sandbox software and the libraries to support it;
  • One that also includes the MySQL tarballs, compressed;
  • And one that includes the tarballs expanded.

The main image mysql-sb-base

This image, datacharmer/mysql-sb-base, is the base for the others, but it can also be used stand-alone with tarballs already in your host computer. The image is relatively light: it is based on Debian, and its size is 167 MB.

For example, suppose you have expanded tarballs in $HOME/opt/mysql:

$ ls $HOME/opt/mysql
5.6.28  5.7.11

$ docker run -ti --name sbox -v $HOME/opt/mysql:/opt/mysql datacharmer/mysql-sb-base bash
msandbox@3d4a8d9ca186:~$ make_sandbox 5.6.28 -- --no_show
The MySQL Sandbox,  version 3.1.05
(C) 2006-2016 Giuseppe Maxia
loading grants
. sandbox server started
Your sandbox server was installed in $HOME/sandboxes/msb_5_6_28
msandbox@3d4a8d9ca186:~$ ~/sandboxes/msb_5_6_28/use
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.6.28 MySQL Community Server (GPL)

Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql [localhost] {msandbox} ((none)) >

Or, you may have a tarball in a directory:

$ ls $HOME/downloads
mysql-5.1.72-linux-x86_64-glibc23.tar.gz

$ docker run -ti --name sbox -v $HOME/downloads:/opt/mysql datacharmer/mysql-sb-base bash

msandbox@2b0a53ad8e71:~$ make_sandbox ~/opt/mysql/mysql-5.1.72-linux-x86_64-glibc23.tar.gz -- --no_show
unpacking /home/msandbox/opt/mysql/mysql-5.1.72-linux-x86_64-glibc23.tar.gz
    The MySQL Sandbox,  version 3.1.05
    (C) 2006-2016 Giuseppe Maxia
loading grants
. sandbox server started
Your sandbox server was installed in $HOME/sandboxes/msb_5_1_72

msandbox@2b0a53ad8e71:~$ ~/sandboxes/msb_5_1_72/use
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.1.72 MySQL Community Server (GPL)

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql [localhost] {msandbox} ((none)) >

The intermediate image mysql-sb-gz

If you want to use the reduced tarballs that I have prepared, the most economical image is datacharmer/mysql-sb-gz. It contains all the binaries mentioned above (from 5.0 to 5.7), still compressed. The image weighs 272 MB, and you can expand the binaries with the help of a script that ships with the image.

$ docker run -ti --name sbox datacharmer/mysql-sb-gz bash

msandbox@b0672d141e3d:~$ ls
opt  setup.sh  README
msandbox@b0672d141e3d:~$ ./setup.sh
# expanding 5.0.96.tar.gz
# expanding 5.1.72.tar.gz
# expanding 5.5.48.tar.gz
# expanding 5.6.28.tar.gz
# expanding 5.7.11.tar.gz

msandbox@b0672d141e3d:~$ ls ~/opt/mysql/
5.0.96  5.1.72  5.5.48  5.6.28  5.7.11

msandbox@b0672d141e3d:~$ make_sandbox 5.7.11 -- --no_show
    The MySQL Sandbox,  version 3.1.05
    (C) 2006-2016 Giuseppe Maxia
loading grants
. sandbox server started
Your sandbox server was installed in $HOME/sandboxes/msb_5_7_11
msandbox@b0672d141e3d:~$
msandbox@b0672d141e3d:~$ ~/sandboxes/msb_5_7_11/use
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.7.11 MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql [localhost] {msandbox} ((none)) >

The complete image mysql-sb-full

The image that allows you to use a sandbox without any extra work is datacharmer/mysql-sb-full. This one contains the expanded binaries that are ready for consumption.

$ docker run -ti datacharmer/mysql-sb-full bash
msandbox@ed3cbbc088a7:~$ ls ~/opt/mysql
5.0.96  5.1.72  5.5.48  5.6.28  5.7.11  COPYING
msandbox@ed3cbbc088a7:~$ make_sandbox 5.5.48 -- --no_show
    The MySQL Sandbox,  version 3.1.05
    (C) 2006-2016 Giuseppe Maxia
loading grants
.. sandbox server started
Your sandbox server was installed in $HOME/sandboxes/msb_5_5_48
msandbox@ed3cbbc088a7:~$ make_replication_sandbox 5.6.28
installing and starting master
installing slave 1
installing slave 2
starting slave 1
. sandbox server started
starting slave 2
. sandbox server started
initializing slave 1
initializing slave 2
replication directory installed in $HOME/sandboxes/rsandbox_5_6_28
msandbox@ed3cbbc088a7:~$ ~/sandboxes/use_all 'select version()'
version()
5.5.48
# master
version()
5.6.28-log
# server: 1:
version()
5.6.28-log
# server: 2:
version()
5.6.28-log
msandbox@ed3cbbc088a7:~$

What's next

This is a first attempt, and I am sure it can be improved. Things that come to mind, in no particular order:

  • Making the container run a sandbox transparently in the host computer. This is already feasible, but not out-of-the-box: it will require a wrapper script (see the sketch after this list);
  • Adding binaries for more versions, such as Percona and MariaDB servers, or preview releases (MySQL Labs);
  • Finding a way of adding binaries seamlessly, without making the images unnecessarily big;
  • Getting the binaries from a centralised repository instead of shipping them with the image. This is also feasible, but not immediately reliable as of today. The ideal solution would be for Oracle, Percona, and MariaDB to release and maintain reduced binaries of every shipped version.
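
As an example of the first point, a wrapper could be as simple as this rough sketch (assuming the datacharmer/mysql-sb-full image; 5711 is the default sandbox port for 5.7.11, and msandbox/msandbox are the default sandbox credentials):

#!/bin/bash
# Hypothetical wrapper: build a sandbox inside a container and expose its port
docker run -d --name sb_5_7_11 -p 5711:5711 datacharmer/mysql-sb-full \
    bash -c 'make_sandbox 5.7.11 -- --no_show --remote_access=% ; \
             tail -f /dev/null'    # keep the container alive after installation

# Then, from the host:
#   mysql -h 127.0.0.1 -P 5711 -u msandbox -pmsandbox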

The code to create the images is already available on GitHub. Contributions are welcome!

Monday, February 01, 2016

A safer MySQL box in Docker

The MySQL team has been maintaining a MySQL image on Docker. They have been listening to requests from the community and reacting quickly. So far, they have fixed two bugs that I reported and implemented a feature that I suggested to make the server more secure.

Thanks, folks!

My latest request was about password management in a MySQL container. I have mentioned in previous posts the compatibility problems introduced by MySQL 5.7 security enhancements. Let me recap the main issues here:

MySQL is secure by default.

The recommended method to install MySQL is mysqld --initialize, which generates a random password that the DBA will then use to access the server and change. This method would be acceptable if we were still in the 1990s, when installing servers one by one was common practice. In the world of automated server deployments, this practice would kill productivity in most environments.

MySQL provides a script-friendly alternative, mysqld --initialize-insecure, which does not generate any password, and leaves the server with a root user without a password, ready for another automated process to set a secure one. Despite the intimidating name, this method is no less secure than before, because it is meant for automated processes, where security does not depend on the vendor's decisions but on a careful deployment plan that includes password generation and management at company level.
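
In practice, an automated deployment would follow a pattern like this (just a sketch: the waiting logic and the secret management are simplified, and /path/to/secret is a placeholder):

# initialize without a root password: nothing to read from the logs
mysqld --initialize-insecure --user=mysql --datadir=/var/lib/mysql

# start the server (waiting until it accepts connections is omitted for brevity)
mysqld_safe --user=mysql --datadir=/var/lib/mysql &

# set the real password from a company-managed secret
mysql -u root -e "ALTER USER root@localhost IDENTIFIED BY '$(cat /path/to/secret)'"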

MySQL on Docker is insecure by default

The above problem came to bite the MySQL team when they released an image for Docker. Here, to deploy a container you need to execute:

docker run -e MYSQL_ROOT_PASSWORD=something -d mysql/mysql-server

Meaning that you need to pass your valuable password on the command line. There is an alternative: instead of MYSQL_ROOT_PASSWORD, you can pass the variable MYSQL_RANDOM_ROOT_PASSWORD, which makes the server generate a random password that you can then retrieve from the logs.
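
The invocation is the same as before, except for the variable (a sketch: in the mysql/mysql-server image, any non-empty value should enable the feature):

$ docker run --name mybox -e MYSQL_RANDOM_ROOT_PASSWORD=true -d mysql/mysql-server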

$ docker logs mybox
Initializing database
Database initialized
MySQL init process in progress...
GENERATED ROOT PASSWORD: Ix.oqdovoj!AGSEtYBnOnebag]u

Here the situation is worse than in the regular case where we install with an automated tool, because containers are not supposed to be modified after deployment: they should work out of the box. The password that we pass on the command line should be the real one, but given the exposure it can only be temporary, requiring a change with a further operation. And a random password in the logs does not make the process any simpler for a script.

The latest change by the MySQL team, though, fixes the problem. Now you can pass a file name inside MYSQL_ROOT_PASSWORD, and the server installation procedure will read the password from that file. This way, there is no more exposure, and a reliable password can be used from the beginning.

Here is an example of a script that creates a container with a secure password, which nobody needs to see on screen at any moment:

#!/bin/bash

# Generate a random password (for demo purposes only: there are more secure methods)
RANDOM_PASSWORD=$(echo $RANDOM | sha256sum | cut -c 1-10)

# Save the random password to a file
echo "$RANDOM_PASSWORD" > secretpassword

# Create a client options file in .my.cnf format
echo '[client]' > home_my_cnf
echo 'user=root' >> home_my_cnf
echo "password=$RANDOM_PASSWORD" >> home_my_cnf

# Mount both files into the container; the installer reads the password
# from the file named by MYSQL_ROOT_PASSWORD
docker run --name mybox \
    -v "$PWD/secretpassword":/root/secretpassword \
    -v "$PWD/home_my_cnf":/root/home_my \
    -e MYSQL_ROOT_PASSWORD=/root/secretpassword -d mysql/mysql-server "$@"

This script creates a random password, saves it to a file, and mounts the file inside the container (using the -v volume option). Inside the container, the installer reads the password from the file and proceeds as before. The difference is that the password is never shown on screen or in the logs. Notice that the script also writes the password to a second file, in the format required by MySQL for its option files (the same as /etc/my.cnf). We can use this file to connect without typing a password:

$ docker exec -ti mybox mysql --defaults-file=~/home_my
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.10 MySQL Community Server (GPL)

Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

We can't deploy the configuration file as /root/.my.cnf directly, because the server would read it before installing, i.e. before the password has been set, and the installation would fail. Once the installation is concluded, though, we can copy it into place and access mysql without a password.

$ docker exec -ti mybox sh -c 'cp ~/home_my ~/.my.cnf'
$ docker exec -ti mybox mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.10 MySQL Community Server (GPL)

Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

Sunday, November 22, 2015

MySQL and Docker on a Mac: networking oddity

This is a quick post only indirectly related to the series of articles about Docker that I have written recently.

Yesterday I was giving a presentation about Docker in Buenos Aires, and as usual I included a long live demo. Almost everything went as expected. There was one step that I had tried some time before, and that had always worked well, but when I repeated it on stage, it failed miserably:

  • Step 1: run the container
$ docker run  -d --name mybox -e MYSQL_ROOT_PASSWORD=secret mysql/mysql-server
50acff1c7d237b2944f3fbdd5d230e4ecca2bcccd4e4ba06aa778ee20d41631e
  • Step 2: get the IP address
$ docker inspect --format '{{ .NetworkSettings.IPAddress}}'  mybox
172.17.0.2
  • Step 3: connect to the container from a local client.

$ mysql -h 172.17.0.2 -u root -psecret
Enter Password:

And here the connection hangs.

That was odd. This operation had worked fine at home; I even have it pasted into the first article of this series. Why did it not work here?

I could not solve the issue on stage. I had to go on with my presentation. The installation of three nodes in replication went well. Everything that I tried at home was working, except connecting from the external client to the server inside a container.

When I came back to the hotel, I realised what was different: I had done all the tests for the first article on Linux only, and I started testing on a Mac only when preparing the material for the second article. Thus I hadn't tried this particular example on a Mac until I went on stage. And of course it did not work: I was trying to connect from my Mac to a container inside a virtual machine, which uses a different subnet and cannot be reached without indirection.

The only way to connect a local client to the container is to expose the MySQL port explicitly to the outside (where, in this case, the "outside" is the virtual machine itself) and then connect to the virtual machine.

This approach works:

$ docker rm -v -f mybox
mybox

$ docker run -p 3306:3306 -d --name mybox -e MYSQL_ROOT_PASSWORD=secret mysql/mysql-server
2798f4d1b9abb023453b1141a808014fdf7666221498b42c96e2a5d58fc6ce65

Here port 3306 of the container is attached to port 3306 of the host (= the virtual machine).


$ docker-machine ip default
192.168.99.100

Now we get the address of the virtual machine, so that we can connect to its MySQL port.


$ mysql -h 192.168.99.100 -u root -psecret
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.9-log MySQL Community Server (GPL)

Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

And the connection now works!
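
For scripting, the two steps can be combined into a single command (assuming the default machine):

$ mysql -h "$(docker-machine ip default)" -u root -psecret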

I owe it to the attendees of my talk to explain the oddity, but I believe it could be useful for others out there as well.

Monday, November 16, 2015

MySQL-Docker operations. - Part 4: Sandboxes, virtual machines, containers.

Previous episodes:

We're going to explore the choices and the differences between various types of deployments. We will consider four use cases:

  1. [Friendly]: Testing an application on a server where a different version of the same application is already installed (examples: a Python app requiring many libraries, a MySQL server);
  2. [Intrusive]: Testing a potentially intrusive application (anything that changes your general settings in /usr or /etc);
  3. [Conflicting]: Running a service that has lots of conflicting dependencies (an updated database driver compiled with a version of MySQL different from what you have installed);
  4. [Intractable]: Running an intractable service, one of those that require a specific user to run and assume they have full control of the operating system (e.g. PostgreSQL, Oracle).

For each case, we need to determine the impact on our well-being. We assume that the user starts with one reasonably powerful server.

The method used will affect our operations in several ways:

  1. Cost: How much would it cost to implement this method?
  2. Time: How much time will be needed to get things done?
  3. Performance: Can we run things as fast as we need?
  4. Ease of use: Can we get things done without reading a lengthy manual or using an unforgiving and complicated procedure?
  5. Isolation: Can we run our server without affecting other servers?
  6. Storage: Can we add or change storage easily?
  7. Scalability: Can we easily repeat the procedure as many times as needed?
  8. Availability: Can we run any service using this method?
  9. Portability: Can we run this service on several operating systems?
  10. Networking: Can we use this method to run operations that require a network?

Running servers on a regular host

The first possible solution to our problem is simple. Take an empty server, install the service, run it. Until not long ago, before the advent of cloud computing, this was the only way to run operations: if your server is not enough, buy a bigger one, or buy many small ones and get smarter with them. But inevitably, whether we wanted to install a new service or test a new version of a known application, we needed to find money and physical space to get the job done.


Figure 1 : Applications within a server share the operating system and library resources

In this configuration, everything is by the book. We assume that we will use one physical host to run a main service, using the best configuration we can get to achieve the purpose.

The evaluations in the following table are based on my own experience and may differ from what others feel or need.

Requirement score notes
Cost –10 You need to own a new server
Time 8 You need to install it, but it can be easily automated
Performance 10 Nothing beats bare metal
Ease of use 8 As easy as the installation procedure makes it.
Isolation 10 Not going to affect services in other machines.
Storage –10 Changing storage requires physical manipulation
Scalability –10 Every new server requires a new purchase
Availability 10 We can run anything.
Portability 10 We can install the O.S. that we need, and the services on top of it
Networking 0 We can use, but can't create or simulate networking.
Total +56 / -30

The negative results should be considered separately from the positive ones. What could be a prohibitive condition for an individual could be merely a nuisance for someone in a stronger position. For example, if you already have access to bare metal servers for the next two years, thanks to an advantageous merger, you may not feel the cost factor to affect you too much.

Evaluation for the bare metal servers usability:

  • [Friendly]: easily used. No problems here.
  • [Intrusive]: difficult to use. Installing one of those means that you may have trouble installing anything else.
  • [Conflicting]: Extremely difficult. You may find yourself unable to upgrade a given service unless you also upgrade all its dependencies, and end up upgrading the whole operating system out of desperation.
  • [Intractable]: Extremely difficult. Once you install one of those, you may not be able to use the server for anything else.

Running servers in a sandbox

In this context, by sandbox I mean an application that runs on a server with strict configuration settings that prevent it from misbehaving. One example for this category is MySQL-Sandbox, where one or more MySQL servers are installed in a host, each of them configured in such a way that it does not clash with the others.
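
For example, two versions can coexist on the same host, because each sandbox derives its port and data directory from the version number (a sketch, assuming both tarballs are already expanded):

$ make_sandbox 5.6.28 -- --no_show    # creates $HOME/sandboxes/msb_5_6_28, port 5628
$ make_sandbox 5.7.11 -- --no_show    # creates $HOME/sandboxes/msb_5_7_11, port 5711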


Figure 2 : Sandboxes are regular applications that were carefully configured to behave well without disturbing the neighbors.

While MySQL-Sandbox is designed for testing, deploying several production servers on the same host is a common practice. The main reason is that commodity servers have become more and more powerful, while the software hasn't caught up to utilise such power to its fullest. In this context, using a single server on such powerful hosts would be a waste, while installing two or three servers makes better use of the hardware.

This type is similar to running a plain bare metal server. You are running your MySQL server very close to the metal, as there is no software layer between the server and the operating system. Applications configured this way are as fast as the hardware allows. However, they are not as secure. While a lonely server running inside its dedicated host does not have to worry about clashing, a sandbox shares libraries and other operating system resources with other similar servers, and a clash is easy to provoke: it would be enough to mix up the configuration settings, and one or more of the servers would either stop working or corrupt data. Or a sandbox could drain a shared resource (e.g. the main memory), leaving all the other contenders out in the cold.

Requirement score notes
Cost 10 No investment required
Time 8 As easy as the installation procedure makes it
Performance 10 Still bare metal, even if there is potential concurrency.
Ease of use 10 As easy as the manual says it is
Isolation -5 Depends on the service configuration: although each service is functionally independent, the services can clash.
Storage 5 Sandboxes can be resized at will (within the limits of existing storage).
Scalability 10 Deployment of new instances is only limited by the host resources.
Availability 5 We can run only applications that are fully configurable.
Portability -5 We can only run applications for the host O.S.
Networking 0 We can use, but can't create or simulate networking.
Total +58 / -10

There are several advantages to using sandboxes instead of a dedicated host, such as being able to deploy multiple servers without buying new hardware or installing virtual machines. There are, however, obvious limitations, like the lack of isolation mentioned above and the fact that only applications compiled for the host operating system can run in this fashion.

Evaluation for the sandboxes usability:

  • [Friendly]: easily used. This is the strong point of sandboxed applications.
  • [Intrusive]: Difficult to use. Sometimes impossible.
  • [Conflicting]: Difficult but possible to use. It's one of the case where having a conflicting application used in a parallel environment could be beneficial.
  • [Intractable]: Almost impossible to reduce to a sandboxed environment.

Running servers in virtual machines


Figure 3 : A virtual machine isolates the application and the operating system.

Virtual machines are the heart of current cloud computing strategies. The ability to create servers that behave almost like bare metal ones, without the need to physically buy them and transport them into a data center, has changed the economy of most companies in the past decade.

Requirement score notes
Cost -5 Moderate investment required
Time 0 As easy as the installation procedure makes it, but the O.S. must be installed as well.
Performance -10 There is much overhead from the additional layers and the need for a full O.S.
Ease of use 8 Everything that is allowed through the interface.
Isolation 9 It can be as good as a physical host, though there is still the risk of a VM negatively affecting others.
Storage 5 V.M.s can be resized at will (within the limits of existing storage).
Scalability 10 Deployment of new instances is only limited by the host resources.
Availability 10 We can run anything.
Portability 10 We can install the O.S. that we need, and the services on top of it
Networking 10 We can use and create networks.
Total +62 / -15

Compared to bare metal, virtual machines can scale at will. You can deploy in a few minutes a new VM of the size that is needed for your current business, and get rid of it when the need ends. Unlike sandboxes, you can run any operating system and any application. In addition, you can have a network for public and private communication between servers.

There are prices to pay. First of all, it will cost you: depending on the usage, virtual machines can be much cheaper than buying and storing your own physical servers, but they won't be free. Sure, you can install a virtual machine on your initial server, the same way you can create a sandbox, but then you run into the second great limitation: performance. Even with the best software available today, the performance of a server running in a VM is greatly inferior to that of a server on bare metal.

You can compensate for performance by splitting the job into many parts and deploying many small virtual machines that work in parallel. When such a solution is successfully deployed, the performance of the group of virtual machines can surpass that of a single bare metal server. Unfortunately, to achieve this goal you would incur more costs than buying a single server, and your application would need to be adapted to work in a distributed environment. This solution can work, and it has been deployed successfully in many cases, but it is not one-size-fits-all, and with poor planning it can backfire.

Evaluation for the virtual machines usability:

  • [Friendly]: easily used. No problems here.
  • [Intrusive]: Easily used with overhead. Just install another virtual machine.
  • [Conflicting]: Easily used with overhead.
  • [Intractable]: Easily used with overhead.

Running servers in containers


Figure 4 : Docker containers are thin layers of libraries and applications on top of a common kernel.

Containers are a growing trend in the virtualization ecosystem. If, after reading the previous statement, you believe that containers are virtual machines, you need to reconsider immediately, or risk failing to understand this technology. Containers are not virtual machines, although they have many things in common. Like virtual machines, containers are self-contained entities that are deployed as a package, started, and whose inner service can be used more or less like a server on bare metal.

The differences between virtual machines and containers are few, but very important:

  • A container does not pack a full operating system, but just a thin layer with the libraries needed to run the service inside it;
  • The service itself is often a stripped down version of the original application.
  • Most important, the software in the container uses the host kernel directly, without any intermediate layer.
  • For the above reasons, while a virtual machine starts up in minutes, a container starts up in less than a second.

A container is a well packaged application that can be downloaded very quickly, and once downloaded can be instantiated several times with incredible speed.

Another notable difference between containers and virtual machines is that containers are less isolated, because they use the same kernel as the host, rather than a virtualized one. On the one hand, this makes containers less secure; on the other hand, it makes them blazingly fast.


Figure 5 : Docker containers can share libraries and other image layers

There is another reason for containers' speed and low storage occupancy: Docker containers are deployed in layers. Some of those layers belong to a single container; others are shared between two or more containers. While a virtual machine is an enormous blob that can reach several GB, a container can be a thin modification of an existing image, and thus can be downloaded in seconds and deployed even faster.
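
You can inspect this layered structure with docker history, which lists the layers of an image along with their sizes; layers shared with other images are downloaded and stored only once:

$ docker history mysql/mysql-server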

Requirement score notes
Cost 10 No investment required
Time 10 Fast, fast, fast!
Performance 9 Almost as fast as running on bare metal. Tiny overhead.
Ease of use 3 Requires some learning and new workflows.
Isolation 7 Much better than a sandbox; less than a V.M., because containers use the same kernel.
Storage 5 Containers can be resized at will (within the limits of existing storage).
Scalability 10 Deployment of new instances is only limited by the host resources.
Availability 3 We can run only applications that have been adapted for containers.
Portability -5 We can only run applications for the host O.S.
Networking 10 We can create and use networks.
Total +67 / -5

What are the strong points of containers? Low cost (or no cost, if all you need is what fits in your current server), good performance, private networking, easy to scale.

The limitations, as of today, are portability (applications can only run on the same OS as the host) and ease of use. The latter is a point that is going to change. Using containers requires some changes to the applications (or finding ready-made images) and an understanding of the environment, which can be intimidating for people used to the old ways. But once you get past the initial learning phase, everything feels very easy, and eventually the usage will be far easier than the old ways.

Evaluation for the containers usability:

  • [Friendly]: easily used. No problems here.
  • [Intrusive]: easily used, with little or no overhead.
  • [Conflicting]: easily used with little or no overhead.
  • [Intractable]: difficult to use, sometimes impossible if the intractable application or service was built without flexibility in mind.

All solutions comparison

For convenience, I made a table with a comparison of the solutions examined above.

I must stress that these evaluations are my own, very subjective, and based on my experience. They may differ from others', and possibly from my own in a few months or years. Talking about Docker is like catching eels: it's a moving target, where the technology evolves and improves daily. This fluidity is possibly the most appealing characteristic of Docker and container-related technology: its evolution has been, and continues to be, fast and effective, addressing users' needs at incredible speed.

Requirement Bare metal Sandbox Virtual machine Container
Cost –10 10 –5 10
Time 8 8 0 10
Performance 10 10 –10 9
Ease of use 8 10 8 3
Isolation 10 –5 9 7
Storage –10 5 5 5
Scalability –10 10 10 10
Availability 10 5 10 3
Portability 10 –5 10 –5
Networking 0 0 10 10
Total +56 –30 +58 –10 +62 –15 +67 –5

I believe we haven't seen the end of this trend yet. What we have seen so far with containers and virtual machines seems to aim at an architecture built on micro services. Containers could take a substantial role in the transition towards that reality.

What can we take away from this analysis?

  • Bare metal servers are not outdated yet. There are still cases where they are irreplaceable. Despite the cost associated with their usage, they are not extinct yet, but only just.
  • Virtual machines are still in charge of the scalability department in many cases. However, they feel the advance of containers and need to either evolve or merge into a more flexible architecture to deal with increasing demands from users.
  • Containers are the new force in IT. They can play well with both bare metal servers and virtual machines, waiting for the rise of container-oriented operating systems, which already exist and aim at world domination in a not-too-distant future.

I see a future where the rise of containers and micro services will force software makers to simplify their products and make them more modular and easy to play with. This trend is important in the current cloud architecture and will become vital when containers take over.

In the meantime, I am not giving up MySQL-Sandbox, which is still indispensable in most scenarios, but I am starting to rethink its architecture to fit smarter future uses.

MySQL deployment summary

With all the above considerations, where do we stand with MySQL? My view is that we're still on middle ground. MySQL is still used heavily on bare metal, either as a stand-alone server or as part of multi-server deployments on the same host.

It is also massively employed in the cloud, where it offers many advantages for deployment flexibility and ease of scalability. Yet it still lacks the agility necessary to be a native cloud component. There are several attempts at creating a better cloud player out of MySQL, some successful, some less so.

When it comes to containers, MySQL still has much work to do to become an efficient building block in this ebullient, expanding architecture. The MySQL team provides an official image, which is a first step towards becoming a good player. But in the near future there will be demands for more integration and better modularity than what's available today. Looking at the internals of a MySQL deployment in a container shows that the system is struggling to adapt to the new medium. I see the container revolution as an opportunity for established applications like MySQL to improve their usability and their ability to play well with other components of the emerging IT infrastructure.

What's next

In the next (and last) episode we will see MySQL, Docker and orchestrating tools playing together to deliver faster and more powerful operations.