I just found
this really interesting analysis on this topic written by Jan Kneschke (author of the excellent
LightTPD web server), in which he suggests the following alternative to the ORDER BY RAND() method. I already tested it (i also just needed that functionality) and found that it, as promised, just works perfectly (at least for me).
$battleR = BattleRecord::finder()->findBySql('
SELECT participant_a_id, participant_b_id FROM Battle AS r1 JOIN(
SELECT(RAND() * (SELECT MAX(id) FROM Battle)) AS id
)
AS r2 WHERE r1.id >= r2.id ORDER BY r1.id ASC LIMIT 1;
');
He also posted a proper benchmark
PerformanceNow let's see what happends to our performance. We have 3 different queries for solving our problems.
* Q1. ORDER BY RAND()
* Q2. RAND() * MAX(ID)
* Q3. RAND() * MAX(ID) + ORDER BY ID
Q1 is expected to cost N * log2(N), Q2 and Q3 are nearly constant.
The get real values we filled the table with N rows ( one thousand to one million) and executed each query 1000 times.
100 1.000 10.000 100.000 1.000.000
Q1 0:00.718s 0:02.092s 0:18.684s 2:59.081s 58:20.000s
Q2 0:00.519s 0:00.607s 0:00.614s 0:00.628s 0:00.637s
Q3 0:00.570s 0:00.607s 0:00.614s 0:00.628s 0:00.637s
As you can see the plain ORDER BY RAND() is already behind the optimized query at only 100 rows in the table.
Read the analysis. I think it is a really interesting read for every developer that has to write SQL and it also offers a really nice solution for database tables with ID holes. There are also a few other interesting essays about various other MySQL problems on his
Playing with MySQL page.
Greetings from Hamburg / Germany
- rojaro -