Solving the live-search/slow mongrel process problem

I’ve been using the excellent nginx ever since I put Autopendium :: Stuff about old cars, the classic car community website I run, into production.

Now there’s another reason for using it: a sensible balancer that delivers requests only to those mongrels that aren’t busy. However, as Ezra Zygmuntowicz (whose post put me on to nginx in the first place) says:

“Now we all know that you should not have actions that take this long in our apps right? Of course we do, but that doesn’t stop it from happening quite a bit in a lot of apps I’ve seen.”

However, one place this problem can easily occur (even if you haven’t got any requests that you’d consider as ‘long-running’) is live-search.

All the standard Rails recipes for livesearch work by observing a text input box and sending the query to the app each time the text changes.

This is all fine and dandy in theory (well, apart from the repeated requests to the server, which is somewhat wasteful), and works fine in development where you’ve only got a single mongrel handling the requests of a single user. However, start adding some more mongrels (or FCGI processes, or whatever), and you get into all sorts of problems.

First, there’s the problem that in the user’s eyes, a live-search only works if the response is pretty much instantaneous. If the request is served to a mongrel that’s already handling someone else’s request and that request takes a couple of seconds to complete, then it’s not working for them.

However, another problem is maintaining the order of requests and responses. Say you’ve got a reasonably designed app, and are using a cluster of mongrels and caching to ensure that no page takes more than, say, half a second to process. Chances are, if you’re using live-search the standard way, you’ll still get some unexpected behaviour (from the user’s perspective).

An example: your user is searching for information on the Volvo Amazon and starts typing at normal speed in the livesearch box: A-m-a-z-o. The whole thing takes less than a second, but what they see isn’t what they’d expect:

[Image: Livesearch example]

Huh? How’d that happen? I type in Amazo, and get results for Ama? From the user’s point of view, it’s at best puzzling (and knocks the site down a notch in their eyes), and at worst useless (if they were unlucky they might have ended up with this):

[Image: Livesearch problem 2]

The problem is, the mongrel that dealt with the request “/search/livesearch?term=Am” took a fraction of a second to get around to dealing with the request because it was still finishing off a previous request (the ‘dumb’ round-robin proxy delivering the requests to the mongrels will not know this, delivering requests to each mongrel in turn). Because of this it returned the response after the other mongrels had returned theirs.
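To make the reordering concrete, here is a rough simulation of it in plain JavaScript. All the numbers are made up for illustration (requests 150 ms apart, 100 ms per search, four backends, one of them tied up until 700 ms), and roundRobinArrivalOrder is just a name I've picked for the sketch, not anything from nginx or mongrel:

```javascript
// Hypothetical timings: one request every 150 ms, each search taking 100 ms
// once a backend actually starts working on it.
function roundRobinArrivalOrder(terms, backendBusyUntil) {
  var gapMs = 150, serviceMs = 100;
  var busyUntil = backendBusyUntil.slice(); // copy, so the caller's array is untouched
  var responses = terms.map(function (term, i) {
    var sentAt = i * gapMs;
    var backend = i % busyUntil.length;                  // dumb round-robin: next backend in turn
    var startAt = Math.max(sentAt, busyUntil[backend]);  // wait if that backend is still busy
    var doneAt = startAt + serviceMs;
    busyUntil[backend] = doneAt;
    return { term: term, doneAt: doneAt };
  });
  // Sort by completion time to get the order in which the browser sees the responses.
  responses.sort(function (a, b) { return a.doneAt - b.doneAt; });
  return responses.map(function (r) { return r.term; });
}

// Backend 0 is busy with someone else's request until t = 700 ms.
var order = roundRobinArrivalOrder(['Am', 'Ama', 'Amaz', 'Amazo'], [700, 0, 0, 0]);
console.log(order.join(' -> ')); // Ama -> Amaz -> Amazo -> Am
```

The last response to arrive is the one left on screen, so in this made-up run the user who typed "Amazo" ends up looking at the results for "Am".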

How do you deal with this? For most CS graduates, this is probably a basic first-year problem, complete with the appropriate jargon. For me, a self-taught, greying old-car junkie, there appear to be three solutions:

  1. Make sure the requests are only passed to those mongrels that are free to deal with them. This is what the fair proxy balancer for nginx and mongrel mentioned at the top does. The bonus is that this will improve the apparent responsiveness of your whole app. The only problem, I guess, could come if the later requests (i.e. those with more letters) take less time to complete than the earlier ones.
  2. Pass the requests to a faster backend server, one that isn’t handling more ‘meaty’ requests. After all, livesearch isn’t doing anything complicated, just searching the db, parsing and then returning the results. Perhaps this is a job for a custom mongrel handler or merb?
  3. Make just a single trip to the server, and then use local javascript to reduce the results on each successive letter.
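For what it's worth, the idea behind option 1 can be sketched in a few lines. This is not how the nginx fair balancer is actually implemented (I haven't read its source); pickBackend and the busyUntil array are hypothetical names, and the timings are invented:

```javascript
// Pick an idle backend if there is one; otherwise the one that frees up soonest.
// busyUntil[i] is the (hypothetical) time in ms at which backend i finishes its current work.
function pickBackend(busyUntil, now) {
  var best = 0;
  for (var i = 0; i < busyUntil.length; i++) {
    if (busyUntil[i] <= now) return i;             // idle: use it straight away
    if (busyUntil[i] < busyUntil[best]) best = i;  // otherwise remember the soonest-free one
  }
  return best;
}

console.log(pickBackend([700, 0, 0], 0));       // 1
console.log(pickBackend([700, 300, 500], 100)); // 1
```

In the first call backend 0 is skipped because it is busy; in the second nothing is idle, so the request goes to the backend that will be free first. As noted in the list, this on its own doesn't guarantee that responses come back in the order the requests were sent.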

Long story short, I went for option 3 with Autopendium, but delayed doing so for quite a while because I thought it would be very difficult. It turns out not to be.

First, I cleaned up the code at the server end. No more templates, no more converting the results to HTML, just return the search results as JSON.

def live_search
    @results = Modtype.find_trusted( :select => "id, title",
       :conditions => [ 'LOWER(title) LIKE ?', '%' + params[:term].downcase + '%' ],
       :order =>"title").collect { |m| {:title => m.title, :id => m.id} }
    render :nothing => true, :status => 404 and return if @results.empty?
    render(:text => @results.to_json)
end

As you can see, we just return a 404 if there are no results (we’ll use that later).

Then in the application.js file you need to add the observer to the text input. I’m a great believer in unobtrusive javascript and use Dan Webb’s excellent Lowpro library which I’ve written about before:

Event.addBehavior({
    "#header-livesearch-term": function(event) {
        new Form.Element.Observer('header-livesearch-term', 0.5, liveSearch );
    }
})

You could also use Rails’ built-in observe_field in the layout or template to achieve the same result if you’re into that sort of thing. Something like:

<%= observe_field :suggest, :function => "liveSearch(element, value)",
:frequency => 0.5,
:with => 'term'  %>
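Either way, as I understand it, the observer boils down to polling the field every half second and calling your function only when the value has changed since the last poll. A library-free sketch of that idea (makeChangeDetector is a hypothetical helper, not part of Prototype or Rails, and it passes only the value to the callback, whereas Prototype passes the element as well):

```javascript
// Returns a function that should be called with the field's current value on
// every poll; it fires the callback only when that value differs from the last one seen.
function makeChangeDetector(initialValue, callback) {
  var lastValue = initialValue;
  return function check(currentValue) {
    if (currentValue !== lastValue) {
      lastValue = currentValue;
      callback(currentValue);
    }
  };
}

var seen = [];
var check = makeChangeDetector('', function (value) { seen.push(value); });

// Simulate four polls of the text box. In a real page a timer would do this,
// calling check() with the input's value every 0.5 s.
['', 'Am', 'Am', 'Amaz'].forEach(function (polledValue) { check(polledValue); });
console.log(seen); // [ 'Am', 'Amaz' ]
```

Note that with a 0.5 second interval, quick typing skips intermediate values altogether: in the simulated run above the callback never sees "Ama".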

Then you’ve got the javascript to make the Ajax call to your app, parse the results, and render them on to the screen.

Alert: I’m very much a javascript hacker, picking up bits as I go along, so use this at your own risk:

Towards the top of my application.js file:

modtypeSearch = null; // setup global variables for the live-search results
initialModtypeTerm = null;

Then, a little later:

function liveSearch(element, value)
{
    // Don't do anything unless we've got at least two characters to search for.
    // You can change or delete this.
    if (value.length > 1) {

        // Check whether we've already got search results.
        // Also check that the term those results were generated from
        // is still valid for what we're currently searching for, i.e. 'am'
        // is good for 'amaz' but not 'mot'

        if (modtypeSearch && value.match(initialModtypeTerm) ) {
            var termRegexp = RegExp(value, "i");
            // Given we've got the basic results, search within those
            var subSetResults = modtypeSearch.findAll(function(n) {return n.title.match(termRegexp);});
            // Convert the results to a list of links
            var htmlResults = resultsToLinks(subSetResults,termRegexp);
            // Update the results div
            $('live_search_results').update(htmlResults);
        }
        else {
            // We've got no valid results already, so make an Ajax request to the app
            new Ajax.Request('/search/live_search', {asynchronous:true, evalScripts:true, parameters:'term=' + encodeURIComponent(value),
                onFailure: function(transport) {
                    // if there are no results the app returns a 404. We can use that to display a no results message
                    $('live_search_results').update('<p class="highlight">"' + value + '" not found!</p>');
                },
                onSuccess: function(transport) {
                    modtypeSearch = eval(transport.responseText); // update the global modtypeSearch array with results
                    initialModtypeTerm = value;  // and also the term that was used to find them
                    var termRegexp = RegExp(value, "i");
                    var htmlResults = resultsToLinks(modtypeSearch,termRegexp);
                    $('live_search_results').update(htmlResults);
                }
            });
        }
    }
}
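The reason the cached results can be reused is that the server does a substring match: any title containing "amaz" must also contain "am", so it is already in the list that came back for "am". Stripped of the DOM and Ajax parts, that decision can be sketched on its own. filterCached, escapeRegExp and the cache object are names I've made up for this sketch, and the regexp escaping is an extra that the function above doesn't do (without it, typing something like "(" into the box would make RegExp throw):

```javascript
// Escape characters that are special inside a regular expression, so that
// typing "(" or "+" searches for the literal character instead of throwing.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// cache is null or { term: "...", results: [{ id: ..., title: "..." }] }.
// Returns the narrowed results, or null when the server has to be asked.
function filterCached(cache, value) {
  if (!cache) return null;
  // The cached set is only reusable if the new text still contains the
  // term that produced it ('am' covers 'amaz', but not 'mot').
  if (value.toLowerCase().indexOf(cache.term.toLowerCase()) === -1) return null;
  var pattern = new RegExp(escapeRegExp(value), 'i');
  return cache.results.filter(function (r) { return pattern.test(r.title); });
}

// Invented sample data; the ids and the second title are made up for illustration.
var cache = {
  term: 'am',
  results: [
    { id: 1, title: 'Volvo Amazon' },
    { id: 2, title: 'AMC Rambler' }
  ]
};

var narrowed = filterCached(cache, 'Amaz');
console.log(narrowed.map(function (r) { return r.title; })); // [ 'Volvo Amazon' ]
console.log(filterCached(cache, 'mot'));                      // null
```

A null return corresponds to the else branch above, i.e. time to go back to the server.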

resultsToLinks is just a simple function that converts the results array into a set of links, highlighting the search term using the em tag (I’ve CSS styled this to be highlighted with the standard bright yellow background):

function resultsToLinks(resultsArray,rExp) {
    var resultString = '<h4>"' + rExp.source + '" found in: </h4>\n<ul>' +
    resultsArray.collect(function(s) {return resultToLink(s,rExp);}).join("\n") + "</ul>";
    return resultString;
}
function resultToLink(r,term){
    var l='<li><a href="/modtypes/' + r.id + '">' + r.title.sub(term, '<em>#{0}</em>') + '</a></li>';
    return l;
}
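If you're not using Prototype, the highlighting step (wrap the first match in an em tag) is easy enough to do by hand. This is only a sketch: highlightFirst and escapeHtml are hypothetical names, and the HTML escaping is an addition of mine that the function above doesn't do, so that a title containing "<" or "&" can't break the markup:

```javascript
// Replace the characters that have a special meaning in HTML.
function escapeHtml(text) {
  return text.replace(/&/g, '&amp;')
             .replace(/</g, '&lt;')
             .replace(/>/g, '&gt;')
             .replace(/"/g, '&quot;');
}

// Wrap the first case-insensitive occurrence of term in <em>...</em>,
// leaving the title's own capitalisation untouched.
function highlightFirst(title, term) {
  var at = title.toLowerCase().indexOf(term.toLowerCase());
  if (at === -1) return escapeHtml(title);
  var end = at + term.length;
  return escapeHtml(title.slice(0, at)) +
         '<em>' + escapeHtml(title.slice(at, end)) + '</em>' +
         escapeHtml(title.slice(end));
}

console.log(highlightFirst('Volvo Amazon', 'ama')); // Volvo <em>Ama</em>zon
```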

And that’s pretty much it. The whole thing sidesteps the problems stated above, makes far fewer calls on the server, uses less bandwidth, and feels much faster to the user.

Written by ctagg

December 18, 2007 at 11:12 am