MySQL Prepared Statements and PHP : A small experiment

Consider a PHP-MySQL application where the information of 1000 users is being retrieved from the database by running a for loop:

for($i = 1; $i <= 1000; $i++){

$query = "SELECT * FROM user WHERE user_id = $i";

//run the query and fetch data

}

In each iteration, the first thing the MySQL engine does is to parse the query for syntax check. Then it sets up the query and runs it. Since the query remains unchanged during each iteration(except for the value of user_id), parsing the the query each time is definitely an overhead. In such cases use of prepared statements is most convenient. A prepared statement is just like a typical query, except that it has ‘placeholders’ that are supplied values at run time. The prepared statement in this case will look like this:

"SELECT * FROM user WHERE user_id = ?"

Notice the placeholder(‘?’) for the value of user_id in the query. Now MySQL engine needs to parse the query only once, then execute it 1000 times by binding the placeholder with PHP script supplied value for user_id. This pre-parsing of the query results in a significant performance boost.

The MySQL Improved extension in PHP, more commonly known as MySQLi, provides an API to work with prepared statements. The documentation at the online PHP manual is good enough to get you started on how to use them on your PHP application, so I’ll not go through it. Instead, I am going to share the results of my personal experiments on comparing performances of traditional and prepared SQL statements.

I conducted the experiment on a demo project which has large amount of data. I wrote two separate scripts on our development server, both of which performed the same operation: joining two related tables (one of which has over 150,000 records, the other has 350,000) and fetching some data . One script used regular SQL statement, the other employed prepared statement techniques. Each script was executed three times and the time required to fetch the data was measured at each pass.

The First script: traditional SQL statement

//Get the Database link
$dbLink = getDBLink();


$timeStart = microtime(true);


for($i = 0; $i < 162038; $i++){


$query = "SELECT article_id, article_name, username as author FROM articles a LEFT JOIN user u ON (a.author_id = u.user_id) WHERE article_id = $i";


if($result = $dbLink->query($query))

$obj = $result->fetch_object();

else die("Failed to execute query: $dbLink->error");


$result->close();


}


$timeEnd = microTime(true);
$dbLink->close();


//measure the time difference
$timeDiff = $timeEnd - $timeStart;

echo "Total time: $timeDiff seconds";

Output:

First Pass -> Total time: 25.5793459415 seconds
Second Pass -> Total time: 25.1708009243 seconds
Third Pass -> Total time: 25.2259421349 seconds

Average: 25.32536300023 seconds

The Second Script : using prepared statement

$dbLink = getDBLink();

$query = "SELECT article_id, article_name, username as author FROM article a LEFT JOIN user u ON (a.author_id = u.user_id) WHERE article_id = ?";


$stmt = $dbLink->stmt_init();


if(!$stmt->prepare($query))
die("Failed to prepare statement: ".$dbLink->error);

$timeStart = microtime(true);

for($i = 0; $i < 162038; $i++) {

//bind the parameter
$stmt->bind_param('i',$i);
//execute the statement
$stmt->execute();
//bind the result, fetch it, then free it
$stmt->bind_result($articleId, $articleName, $author);
$stmt->fetch();
$stmt->free_result();

}

$timeEnd = microTime(true);

$stmt->close();
$dbLink->close();

//measure the time difference
$timeDiff = $timeEnd - $timeStart;

echo "Total time: $timeDiff seconds";

Output:

First Pass -> Total time: 20.1434290409 seconds
Second Pass -> Total time: 20.182309866 seconds
Third Pass -> Total time: 20.6448199749 seconds

Average: 20.32351962726 seconds

The task takes 20% less time for prepared statement, a significant performance boost.

Other than performance, it can also improve application security by guarding against SQL Injections. Check out this informative blog post on that topic.

Harrison Fisk at MySQL AB wrote a very good article on MySQL prepared statements. Don’t forget to check out the section ‘When should you use prepared statement?’ if you read it.

About these ads

17 thoughts on “MySQL Prepared Statements and PHP : A small experiment

  1. My solution would be not to do a for loop, but to remake the query that makes sense to do..

    $query = “SELECT `article_id`, `article_name`, `username` AS `author` FROM `article` `a` LEFT JOIN `user` `u` ON (`a`.`author_id` = `u`.`user_id`) WHERE `article_id` >= $i AND `article_id` <= $j”;

    Or…

    “SELECT `article_id`, `article_name`, `username` AS `author` FROM `article` `a` LEFT JOIN `user` `u` ON (`a`.`author_id` = `u`.`user_id`) LIMIT $i, $j”;

    Then I would throw it in while loop..

    Of course you could convert the above to a prepared statement.

  2. by “Then I would throw it in while loop..” I mean
    $qry = mysql_query($sql);

    while($row = mysql_fetch_assoc($qry))
    {
    ….
    }

    Or even faster:

    $qry = mysql_query($sql) or die($sql.’ ::: ‘.mysql_error());
    $x = mysql_num_rows($qry);

    if($x > 1)
    {
    do
    {
    $row = mysql_fetch_assoc($qry);
    …. Code ….
    }while(++$b<$x);
    }

  3. I believe there’s a slight flaw in that script of yours. Feel free to correct me if I am wrong.

    You prepared the query before doing the bind, which is the one being timed. In MySQL, when doing a prepared query, what happens is PHP sends one request to the server to issue the PREPARE statement (which is the $stmt->prepare($query) ) and then another for the EXECUTE statement (the $stmt->execute() ). Since not that many web pages will call the same SELECT query that many times, your result is inconclusive.

    Prepared queries are still good for their security reasons though.

  4. Kudos on the coding examples EllisGL. However the idea was to prevent MySQL from parsing the same query again and again, which I’m not sure being handled in your script.

  5. Michael, the query written above was just to demonstrate the performance gains of prepared queries over to traditional ones. Of course, in a real world situation one must think 10 times before running a query in a for loop.

    I think a pagination scenario would have been a much more appropriate example, where you run the same query multiple times just changing the LIMIT parameters.

  6. Rubayeet – Yeah I think pagination would have been a great example to add to the end of the article. Real world scenario.

  7. I agree with EllisGL, I’ll never use a code such:

    for($i = 1; $i <= 1000; $i++){

    $query = “SELECT * FROM user WHERE user_id = $i”;

    //run the query and fetch data
    }

    Rather SELECT * FROM user WHERE user_id<=1000
    and then
    while($d = mysql_fetch_object($query)) {
    //process the script logic
    }

    Anyway I understand this is just an example, the post is very interesting because it increases the awareness for programmers to keep up to date in using best practices in their daily work

  8. Hi, results of this test can be wrong. I recommend you using Apache Benchamark (AB.exe) – small program provided with apache. It allows you to run many tests with diffrent load (many requests at once). I think it should be a bit more accurate in this case :).

    Sorry for my english, i hope you colud understand me ;)

  9. Have you tested this script to see where the breakpoint in efficiency is? It’s one thing to do more than 100,000 iterations of the select and another ifyou only do 100 or 10. I’m curious to see what you find.

  10. Pingback: Rubayeet Islam’s Blog: MySQL Prepared Statements and PHP : A small experiment : Dragonfly Networks

  11. Pingback: Rubayeet Islam’s Blog: MySQL Prepared Statements and PHP : A small experiment : WebNetiques

  12. @ John J. : I ran the same scripts for 1000, 100 and 10 iterations. Interestingly prepared statement beat traditional query by a marginal 20% in all cases. I’m yet to find a breakpoint. Why don’t you give it a try too?

  13. Pingback: Enlaces rápidos (12-11-08) | Tecnolink Informática Castellón

  14. Both tests must begin and end at the same point. Both tests should use:

    $dbLink = getDBLink();
    $timeStart = microtime(true);

    and

    $timeEnd = microTime(true);
    $dbLink->close();

    … because that sequence correctly identifies both the beginning and the end of the test period, from the point at which the db connection is established to its release, each of which could be significant variables unrelated to the test.

    You use these correctly when testing the first approach, but then allow several operations to occur before starting the timer and a significant operation to occur after ending the timer when testing the second approach. Your end result is therefore imprecise.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s