Sunday 24 February 2008

AJAX and viewing very large datasets (50K+ documents)

Problem: How to facilitate a user request to get rid of the "paging" options at the bottom of an AJAX-enabled web view and allow them to use only the scroll bar to navigate very large datasets (50K+ documents).

I have been thinking about this all weekend and have drawn numerous wee drawings trying to get my head around what is required. The bull-in-a-china-shop approach would be to download the 50K+ view data and then let the JS or XSLT render 50,000 documents... Tried it. OOOOO it was slow, and it required lots of answers to "this script is taking a long time, it may be looping, do you want to end it?" messages.

So the next option, and the currently active one, is to have 20 records per page and the standard sort of page that shows "Page x of 2500" and allows the user to set x; a background process then goes off, gets that page from the server and displays it. This has the benefit that the browser only has to get 20 documents from the server and render them, which is quick and easy. BUT the user group doesn't really like this and reports back that they prefer a UI that has a scroll bar. Their main complaint is that when you are looking at a Categorized view and an expanded category is large, the data may take up several pages, which you have to shuttle back and forth between.

And finally the last option... what the user wants... a scroll bar option like in the client. This is the idea I am playing with at the minute, albeit as a mind game. My thoughts are these.

  1. To ensure the Scrollbar accurately reflects the size of the dataset the holding element must be as near as possible the correct size to contain the dataset, but it does not necessarily need to be full of data. The size of this containing object will equal
    height of a single row * total number of rows in the dataset

  2. The viewport need only contain the grid of visible rows, say for example a table of 20 rows. The size of the viewport, if defined in rows, is
    height of a single row * total number of rows in the viewport

  3. When a scroll bar is clicked on either the down or up arrows the viewport is moved up or down by one row.

  4. When the scroll bar is clicked either above or below the scroll tab the viewport is moved up or down by one page.

  5. The current pixel offset into the holding element is returned by the scrollTop value of the viewport. This can be used to work out the row that is at the top of the viewport
    current position as returned by scrollTop / height of a row, rounded down to an integer value
So... let's say we have a view that is 50,000 rows long. We want to display this in a grid that has 20 visible rows, and each row has a height of 14px. The holding element would then need to have a height of 50,000 * 14 = 700,000px.
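
The arithmetic above can be sketched in a few lines of TypeScript. This is only a rough illustration: the names (GridGeometry, holderHeightPx, viewportHeightPx, topRowFromScroll) are made up for this post, rows are counted from zero, and clamping the start row so the last page stays full is my own assumption rather than something worked out above.

```typescript
// Illustrative names only; none of these come from a library.
interface GridGeometry {
  rowHeightPx: number;  // height of a single row
  totalRows: number;    // number of rows in the whole dataset
  visibleRows: number;  // number of rows shown in the viewport
}

// Point 1: height the holding element needs so the scroll bar is to scale.
function holderHeightPx(g: GridGeometry): number {
  return g.rowHeightPx * g.totalRows;
}

// Point 2: height of the viewport when defined in rows.
function viewportHeightPx(g: GridGeometry): number {
  return g.rowHeightPx * g.visibleRows;
}

// Point 5: zero-based row at the top of the viewport for a given scrollTop.
function topRowFromScroll(g: GridGeometry, scrollTop: number): number {
  const row = Math.floor(scrollTop / g.rowHeightPx);
  // Assumption: never start so late that fewer than visibleRows remain.
  const lastStart = Math.max(0, g.totalRows - g.visibleRows);
  return Math.min(Math.max(row, 0), lastStart);
}
```

With 14px rows, 50,000 rows in total and 20 visible, holderHeightPx gives 700,000, viewportHeightPx gives 280, and a scrollTop of 1400 maps to row 100.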

Step 1 - Create a viewport which we know needs to have a height of 20*14 = 280px
Step 2 - Inside the viewport create a holder [div] which has a height of 700000px (50,000*14)
Step 3 - Place a grid inside the holder [div] which contains 20 rows 14px high
Step 4 - Register an event handler on the "scroll" event of the viewport [div], since that is the element that actually scrolls
Step 5 - When this event is triggered, read the scrollTop and calculate the new starting row.
Step 6 - Ask the server to return 20 view entries from this starting row
Step 7 - Populate the table with these entries.
Step 8 - Position the table at scrollTop,0 within the holder [div]
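
Steps 4 to 8 could be sketched roughly as below. Everything named here (ViewEntry, LoadRows, ScrollableViewport, PositionedGrid, attachVirtualScroll) is hypothetical, the viewport and grid are described by small interfaces so the logic can be tried without a browser, and Steps 1 to 3 are assumed to have been done already in markup and CSS. Ignoring replies that have been overtaken by a later scroll is my own addition, not part of the steps.

```typescript
// Hypothetical shapes for illustration; a real page would pass DOM elements.
interface ViewEntry { cells: string[]; }

// Asks the server for `count` entries from `startRow`, then calls `done`.
type LoadRows = (
  startRow: number,
  count: number,
  done: (rows: ViewEntry[]) => void
) => void;

interface ScrollableViewport {
  scrollTop: number;
  addEventListener(type: "scroll", listener: () => void): void;
}

interface PositionedGrid {
  style: { top: string };
}

function attachVirtualScroll(
  viewport: ScrollableViewport,
  grid: PositionedGrid,
  rowHeightPx: number,
  visibleRows: number,
  loadRows: LoadRows,
  render: (rows: ViewEntry[]) => void
): void {
  let latestRequest = 0;

  // Step 4: listen for scrolling on the element that scrolls.
  viewport.addEventListener("scroll", () => {
    // Step 5: read scrollTop and work out the new starting row.
    const scrollTop = viewport.scrollTop;
    const startRow = Math.floor(scrollTop / rowHeightPx);
    const requestId = ++latestRequest;

    // Step 6: ask the server for one viewport's worth of entries.
    loadRows(startRow, visibleRows, (rows) => {
      // Assumption: drop replies that a newer scroll has superseded.
      if (requestId !== latestRequest) return;
      render(rows);                       // Step 7: populate the table
      grid.style.top = `${scrollTop}px`;  // Step 8: move the table to scrollTop
    });
  });
}
```

Every scroll event triggers a server request in this sketch; in practice some throttling or caching of nearby rows would probably be wanted, but that is left out here.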

Basically this will move the table, which holds only the data visible in the UI, so that it always lines up with the viewport [div]. The holder [div] has a height of 700,000px but only contains the table of visible data, thus preserving the scale of the scroll bar without the requirement for having it full of data.

I have had a play and this seems to be "do-able" although there are some browser specific niceties that I have to get my head around as well as how to do interesting things like sortable columns, searches and how to return categorized views.

Has anyone else done something similar I wonder or would anyone be interested in the proof of concept I am working on? If so let me know ....

Ohh I wonder does Ext.ND cope with large datasets?

UPDATE - In the comments below Joerg Michael didn't think this was a good idea... and Nathan Freeman followed up with the point that perhaps the Notes Client view should be redesigned too :) Bum... I was distracted by a user saying something that sounded like a good idea. LOL.
Oh well... being distracted is what Sunday afternoons in Feb are for :) However, a non-Notes solution of a similar nature has been put forward as a user extension for Ext, as pointed out by Rich Waters, as something to look at and as a point to think on... :)


Joerg Michael said...

One word: Don't. I've never seen a situation where a user really needs or wants to navigate in 50K documents. I have had situations where the user asking for crazy stuff like this just had never thought about other, better ways.

Usually, it makes more sense to add a few dropdown lists containing various filtering options. These filters can then narrow down the displayed data to a human-friendly level. And a level that pretty much any neat JS library/framework has to offer. Have you played around with e.g. Dojo or other whizbang-UI libraries?

Steve McDonagh said...

:) generally i would agree with you, users do have a habit of shooting off at odd tangents into the realms of fantasy.

The reason this one stuck with me was they have a Notes-client based view of the dataset which currently has 48,726 docs, all neatly categorized to 3 levels, and it was put to me "why can't it be like Notes?" which struck me as being a fair enough comment to make.

Unfortunately, due to the nature of the data, the drop-down lists at the second category level can contain up to 150 items, which makes the drop-downs a little hard to handle, and it was user feedback on the drop-down filters method that prompted this idea.

I have been playing with YUI for about 2 years and Dojo/Prototype for 8 months, and we have around 30 or so AJAXed applications currently live with the users.

Just out of interest, what functions of the frameworks were you thinking about as an alternate filter method?


Nate said...

"they have a notes-client based view of the dataset which currently has 48,726 docs, all neatly categorized to 3 levels and it was put to me 'why cant it be like notes?' which struck me as being a fair enough comment to make."

Heh. Actually, that would have led me to say "good point, the Notes UI is bad, too... let's fix THAT instead." And then just drive them both from selection filtering criteria instead of using categories.

Joerg Michael said...

Steve, it was more than a year ago that I looked into AJAXy ways of making Notes views look nice. Things were pretty much getting started in the Domino world.

The first one I really tried was Bob Obringer's Ultimate View Navigator. A true joy, but it couldn't do categorized views back then.

Apparently, things have greatly improved since then. I wasn't really thinking about any particular part of Dojo or another tool (don't really do that much web development), I was just thinking that they probably offer some nice ways of displaying even large views.

I was looking at the problem more from a UI view. In terms of "What would Chris Blatnick or Nathan Freeman do" ;-)

Steve McDonagh said...


LOL.. good point, perhaps the shock of a user saying something that appeared to make sense made me miss the obvious :) It is a legacy app that has been around since V4.5 and is in need of an overhaul at the client end.


Looks like you and I got the answer to "what would..." and he agrees with you but with a bigger stick. ;)

Yep I think back to the drawing board for that one...well it filled in a boring Sunday if nothing else :) although I may still pursue it just for the fun of it.

Rich Waters said...

Definitely something I've thought of and looked into before. An Ext user has submitted a really slick User Extension that implements functionality like you talked about. There's a thread in the forums about it here -
and a live demo here -

It is actually quite nice and pre-loads a configurable amount of records above/below your current location.

It would take a good amount of work to get this working in Ext.nd, but it's definitely something on my list.

Steve McDonagh said...


OOOO now that is interesting!
I take it that the data that is read includes some of the above and below data so that single increment reads are coped with by cache and not DB reads but does not end up with the 50k in memory if a user scrolls the whole way thru the dataset?

I am probably going to have a play with the idea just for fun. Would you like me to feed back my findings, particularly on the thorny problem of multi-level categorised views in various states of expand and contract?

Rich Waters said...

I'd always be glad to look at what others have done to deal with view display and categorization.

As for the livegrid, it only maintains a set number of records in its 'BufferedStore'. Aside from configuring how many records to load above and below your location, you tell it how many total records to cache. It even goes as far as to implement a new selection model that maintains selected nodes even if they are removed from the buffer.

jonvon said...

the Great Code Giveaway session at LS this year had some code working the way you want it to, scrolling produced more results as you went down the page. they were using dojo.

Steve McDonagh said...


Thanks! I will look that up when I get home. I have no flag to wave as yet for any of the frameworks. So Dojo is as good for me as YUI :)

Steve McDonagh said...


Perfect just what i was thinking about! Thanks for the heads up, LS was a non-runner for me this year :(

jonvon said...

yeah i was pretty impressed by their demo! i like the way they implemented the solution too, very nice.
