You're viewing a single post. I have much more to say! The main blog page is a good starting point.
Wednesday, February 25, 2009 ★ 17:39 ★ Category Programming
After suffering from severe performance problems for a rather data intensive application (which was explicitly designed with scalability in mind), the culprit was found: a often executed piece of code did not execute
SELECT LAST_INSERT_ID(), but
SELECT LAST_INSERT_ID() FROM some_table. Hard to spot, but very important.
The problem? Well, the
LAST_INSERT_ID (see the MySQL manual on LAST_INSERT_ID) function just returns the resulting primary key value from the last
INSERT statement in the current MySQL connection thread, regardless of the table into which the row was inserted. Erroneously appending
FROM some_table to this statement will result in a query that returns rows from a table, and these rows only contain the same constant value over and over again, once for each row in the table. However, if you only retrieve one row from the result set, you won’t initially notice the bug… at least not until your application mysteriously gets slower. If the table grows significantly over time, which was clearly the case here, since many records are inserted every day, executing this very simple query will become slower and slower, since it returns as many rows as are in the table, and each subsequent row yields exactly the same information as the previous one. How much slower? Well, up to the point that multiple quad core machines with plenty of memory were constantly suffering from such a high load that they became almost unresponsive. Yes, we were running this piece of code many times in parallel since it is supposed to be not CPU bound, but network latency bound.
The solution? Well, a very easy fix: after dropping the
FROM some_table part from the query the load on the machines instantly dropped back to almost 0, which was how this applications was designed since it’s mostly network latency bound, which does not use much resources other than occupying some file descriptors for open sockets.
The lesson learnt here? A speed-up of many orders of magnitude, just by removing two words in the code… yes, sometimes performance optimization is completely in the nitty-gritty details. Unfortunately we had to find out the hard way.
(Yay for the person who found this. You know who you are.)
Wouter Bolsterlee, also known as uws, a postmodern geek living in the Netherlands. Read more about me…
Unless stated otherwise, all material on this site is available under a Creative Commons Share-Alike license.