15 Apr 2009

Optimal SQL to page through large record set

This is a tip for web application developers who want to page through results coming back from a MS SQL database. If there are 100 or 500 records coming back and you just want to show 25 at a time to the user then you can achieve that through your application code (CFML, PHP, etc). But what if there are 1000’s of records, it’s deeply inefficient to request that many records from your database if you only want to display 25 every time the user requests the next page, or jumps forwards several pages. If you want the results to be ordered by a user-defined column and not by the Row ID then the solution may not be obvious.

MySQL can use limit(start,count) but MS SQL doesn’t support that. Someone suggested using a memory table, someone else suggested a cursor. A bit of Googling revealed MS SQL 2005 onwards has a new function called row_number(). Here’s how it works in this example:



SELECT * FROM (
SELECT row_number() over (
order by users.surname, users.firstname
) AS rowNumber, users.id, users.firstname, users.surname
FROM users
) table1
WHERE table1.rowNumber BETWEEN 150 AND 175

It will only return records 150 to 175 and importantly it will order it by surname then firstname. And it’s very, very fast.

2 comments:

  1. I'm not sure how the performance compares, but here's some other ways: http://cfzen.instantspot.com/blog/2008/06/09/Paginating-Records-in-CF-with-One-SQL-Server-2005-call-update-1 . There are some more examples in the comments, too. :)

    ReplyDelete
  2. Hi Aaron. Their solution still uses row_number() but they're creating dynamic sql which isn't necessary to do the job - there's more code too so it won't be as fast.

    If you want to get a total record count in the same query (i.e. so you can output "showing records 1 to 25 out of 500") then it can easily be added in a new column in the same select statement:

    Count(*) OVER() AS totalRows

    Dan Switzer (in the comments of the blog you refer to) had a similar solution as me but used a virtual table so he can then query it for a 2nd result set to get a record count. I don't know if there's much of a penality for creating a memory table containing the entire record set, but if you have 1000's of records it must be a tad slower.

    If Dan's blog turned up in the search results when I was trying to find a solution I wouldn't have blogged about row_number() here. But now there's twice the chance that someone else can find a good solution on either blog! :-)

    ReplyDelete