One of the uses for mysqlnd’s plugin architecture that I found most interesting was query caching on the client.
The traditional MySQL Query Cache
The MySQL query cache is a really nice simple feature which helps speed up repeated queries of the same data. A certain portion of memory can be set aside to hold results sets in a look table against the query string. When a SELECT query comes in which the MySQL server has seen before it looks in the query cache and if it has seen the query before it doesn’t touch the table data or indexes instead it simply returns the result set it has in memory which is much faster. This is particularly useful in web applications where repeated SELECT queries are very common.
Here’s a little script to demo the effect query cache can have
<?php
$mysqli = new mysqli('127.0.0.1', 'root', '', 'war_and_peace');
$mysqli->query('RESET QUERY CACHE');
$querySQL = "SELECT * FROM words WHERE word = 'sausage' ";
$time_start = microtime(true);
if ($results = $mysqli->query($querySQL)) {
echo "Cold got {$results->num_rows} results in ";
echo microtime(true) - $time_start,"s",PHP_EOL;
$results->close();
}
$time_start = microtime(true);
if ($results = $mysqli->query($querySQL)) {
echo "Server cache got {$results->num_rows} results in ";
echo microtime(true) - $time_start,"s",PHP_EOL;
$results->close();
}
Running this script against a completely un-normalised table with the 600 thousand odd words of war and peace (thanks Project Gutenberg) in it gives you
$ php MySQL_QueryCache.php
Cold got 5 results in 0.13019204139709s
Server cache got 5 results in 0.00035595893859863s
Why not put this in the client?
What would make it even faster is if the client did not need to connect to the server at all. If the client had a cache of all queries it performed it could save the time taken for the network round trip and help take load off the server by not even asking the question.
mysqlnd_qc attempts to do this.
There is, however, a major problem though, cache invalidation. In the traditional server based cache a very simple table based cache invalidation algorithm is used. If a query which alters the data in a table is detected all caches of data with that table in the result set are invalidated. This will even work for servers which are part of a replication cluster since the update sql will propagate from the master where the data was updated to the slaves and the slaves’ caches will then be invalidated.
But what if there are multiple clients? One client could update a table while another client still thinks its cache is still valid and so continues to serve it. The situation of two clients is actually relevant even if you don’t have multiple web servers talking to the same database server. It could occur by using something other than PHP and mysqlnd to connect the database. For instance using the command line mysql client (which uses libmysql) any updates made through this would not invalidate other clients query caches.
This problem is seemingly insoluble without some complex mechanism of feeding back to every client which caches are still valid. So the simple answer is to only cache for a few seconds and then only use the caching where slightly stale data is not hugely important.
Therefore by default after installing the extension with a quick
phpize
./configure
make
sudo make install
(you will need to be using mysqlnd obviously to be able to use it)
You will need to prefix all queries you want to cache with
“/*qc=on*/ SELECT * from some_table”
which for portability can be written as “/*” . MYSQLND_QC_ENABLE_SWITCH . “*/”;
e.g.
<?php
$mysqli = new mysqli('127.0.0.1', 'root', '', 'war_and_peace');
$mysqli->query('RESET QUERY CACHE');
$querySQL = "SELECT * FROM words WHERE word = 'sausage' ";
$time_start = microtime(true);
if ($results = $mysqli->query($querySQL)) {
echo "Cold got {$results->num_rows} results in ";
echo microtime(true) - $time_start,"s",PHP_EOL;
$results->close();
}
$querySQL = "SELECT * FROM words WHERE word = 'sausage' ";
$time_start = microtime(true);
if ($results = $mysqli->query($querySQL)) {
echo "Server cache got {$results->num_rows} results in ";
echo microtime(true) - $time_start,"s",PHP_EOL;
$results->close();
}
$querySQL = "/*" . MYSQLND_QC_ENABLE_SWITCH . "*/";
$querySQL .= "SELECT * FROM words WHERE word = 'sausage' ";
$mysqli->query($querySQL);
$time_start = microtime(true);
if ($results = $mysqli->query($querySQL)) {
echo "Client cache got {$results->num_rows} results in ";
echo microtime(true) - $time_start,"s",PHP_EOL;
$results->close();
}
Running this script will show an even better speed improvement.
$ php MySQL_QueryCache_mysqlnd_qc.php
Cold got 5 results in 0.1253068447113s
Server cache got 5 results in 0.00036311149597168s
Client cache got 5 results in 9.8943710327148E-5s
Win Win Win!
This is pretty impressive! This is using the default settings for the length of cache or time to live. The default setting is 30 seconds and it is set with the ini setting mysqlnd_qc.ttl. Of course after our 30 seconds are up the server cache is still there so we will still be able to make use of that. So in a way we can think of mysqlnd_qc as a way of protecting the db server from excessive repeated queries.
More Options
We are using the default handler which behaves in a similar way to the query cache in the mysql server. There are also other backends available such as APC, memcache and sqlite. Sqlite may well be an excellent choice even though using another db may seem a little weird, the reason is that the results sets are in the form of table data which fits well with sqlite. There is also the possibility of creating you own user handler which can use a custom invalidation algorithm there is an example at http://uk3.php.net/manual/en/mysqlnd-qc.set_user_handlers.php