Using jQuery Data for Easy Data Associations

One of my favorite but often overlooked features of jQuery is the data method, which allows you to associate arbitrary data with a DOM element. This is particularly useful when dealing with a list of elements that carry a lot of metadata. Without the data method, you could do this by storing everything in an array or adding funky attributes to an element, but that can get fairly convoluted. Not a lot of developers are aware of the power of jQuery data, so this post will cover a basic use case.

Let’s say we have a site which has a list of blog posts on the left-hand side. When one is clicked, the right side is populated with the blog post contents. The interaction is very similar to what we’ve done on the pulse.me site, where we use jQuery data for storing information about a story. The template fields in the HTML below would be populated using a templating framework.

HTML:

    <ul id="blog-posts">
        <li id="blog-post-0" class="blog-post-item">
            <h1>{{ post_title }}</h1>
            <h2>{{ post_date }}</h2>
            <h3>{{ post_author }}</h3>    
            <p>{{ post_snippet }}</p>
        </li>
    ...
    </ul>
   
    <article id="blog-post">
        <section id="blog-post-header">
            <h1>{{ post_title }}</h1>
            <h2>{{ post_date }}</h2>
            <h3>{{ post_author }}</h3>
            <h4>{{ post_category }}</h4>
        </section>
        <section>
            <img alt="{{ post_image_caption }}" src="{{ post_image }}">
            <p id="blog-post-content">{{ post_content }}</p>
            <span id="blog-post-tags">{{ post_tags }}</span>
        </section>
    </article>

Once we have the JSON object with the post data, we can save it with jQuery data. The data is attached to the <body> element, and the id of the post’s <li> is used as the key to reference the post’s data later.

JavaScript:

Update: Thanks to the Hacker News community (Xurinos, rimantas) for some tips on improving performance. See commented out lines for old and less optimal code.

    // Post data
    var post = {
        'post_title' : '125 Days at a Startup',
        'post_date' : '2011/06/21 10:00:00 -0700',
        'post_author' : 'Filip Mares',
        'post_category' : 'Pulse',
        'post_tags' : 'experience, startups, updates',
        'post_image' : 'http://posterous.com/getfile/files.posterous.com/temp-2011-06-20/koJbhHJJHJgDmpxbrshCjdDoFsktnbmoJspkkIJgBoDJqyFwcxIFpIGgHsti/IMG_0385.JPG.scaled1000.jpg',
        'post_image_caption' : '125 Days at a Startup',
        'post_content' : 'This is the content...',
        'post_snippet' : 'This is the snippet...',
        'post_url' : 'http://filipmares.com/125-days-at-a-startup'
    }

    $(document).ready(function ()
    {
        // id of the <li> node
        var blog_post_id = 'blog-post-0';

        // Saving the incoming data in association to the <body> element
        //$('body').data(blog_post_id, post);
        $.data(document.body, blog_post_id, post);
       
        // Function for appending post data to list of posts
        PopulatePostsList(post);
    });

In order to query the post from memory we bind a click event function to retrieve the contents for the <li> id in question. Afterwards, we call a function that uses a templating framework to populate the DOM according to the HTML above. The ‘DeletePost()’ method is an example of how to delete the data from memory.

JavaScript:

    // Retrieve post data on list element click
    $('.blog-post-item').click(function () {
        //var id = $(this).attr('id');
        var id = this.id;

        // Look up the post stored on <body> under this <li>'s id
        var post = $.data(document.body, id);

        // Function that outputs post data to article on DOM
        PopulatePost(post);
    });

    // Delete post with id passed
    function DeletePost(id) {
        // $.removeData does not throw for a missing key, so check first
        if ($.data(document.body, id) === undefined) {
            console.error('Post does not exist');
            return;
        }
        // Remove post from memory and terminate association
        //$('body').removeData(id);
        $.removeData(document.body, id);
    }

Although this is fairly straightforward, please keep a few things in mind. This is not an alternative to local storage, and it can potentially lead to memory leaks when dealing with a lot of stored data. Be responsible when using the call and remove data that isn’t being used anymore. Furthermore, jQuery data calls are synchronous, so a value can be read back immediately after it is stored; retrieving a key that was never stored (or has already been removed) returns undefined, so check the result before using it.

Happy coding,

Filip

Introducing Livecount

Analytics is ideally a combination of real-time and batch processing. Batch processing, with something like Hadoop, is great for digging into large amounts of past data and asking questions that cannot be anticipated. However, when it is known ahead of time that certain aggregates will be required, the best solution is often to count each event as it happens. Most analytics dashboards are backed by this kind of real-time data.

Nine months ago, Pulse was just starting to experiment with real-time event counting. We didn’t have much server infrastructure yet and were using Google AppEngine to host a few simple APIs. We started reading about the various ways to implement counters on AppEngine and came across two frequently recommended solutions.

Existing Solutions

Sharded counters were the first approach we tried. To compensate for our write-heavy counter workload, we split each counter into several datastore entities. This allowed writes to be parallelized and avoided the single-entity write limits of the AppEngine datastore. To read a single counter value, we queried all shards and summed up the values. This worked well, but required hitting the datastore on every request. Our tests still showed unacceptably high latencies. More shards sped things up a bit, but performance was always bottlenecked on the datastore.
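To make the sharding idea concrete, here is a minimal sketch in plain Python. It is not AppEngine code: a dictionary stands in for the datastore entities, and the class name, shard count, and key format are assumptions made for illustration.

```python
import random


class ShardedCounter:
    """Toy sharded counter; a dict stands in for datastore entities."""

    def __init__(self, name, num_shards=20):
        self.name = name
        self.num_shards = num_shards
        # Hypothetical storage: shard key -> count
        self.shards = {}

    def increment(self, delta=1):
        # Pick a random shard so concurrent writers rarely hit the same entity.
        index = random.randrange(self.num_shards)
        key = f"{self.name}-{index}"
        self.shards[key] = self.shards.get(key, 0) + delta

    def value(self):
        # Reading means visiting every shard and summing the partial counts.
        return sum(self.shards.values())


counter = ShardedCounter("page-views", num_shards=5)
for _ in range(1000):
    counter.increment()
print(counter.value())  # 1000
```

Writes spread across shards, but every read still touches all of them, which is why the datastore remained the bottleneck.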

To avoid datastore latencies, counting in memory seemed like an obvious solution. Given a distributed key-value store like Memcached, counting in memory should be quite scalable, while improving both read and write performance. Of course, memcache data is vulnerable to loss when a server goes down, as well as being subject to eviction if available memory runs short. Unfortunately, the eviction problem is amplified in shared environments where it is always possible to be evicted by memory pressure from another app. While we were willing to accept some risk of data loss, our tests showed the probability was too high on AppEngine’s memcache.

Implementing Write-behind Counters

Livecount was developed to leverage the performance of AppEngine’s memcache, while making an effort to maintain the durability of counts. AppEngine’s task queues turned out to be perfect for the job. Each time a count is updated (or optionally when a multiple is reached), Livecount creates a worker task to write that count from the memcache to the datastore in the background. If the count is ever evicted, it is reloaded from the datastore on the next read or write.
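The write-behind flow can be sketched roughly as follows. This is an illustration only, not Livecount’s actual source: plain dictionaries stand in for memcache and the datastore, a deque stands in for the task queue, and the function names are hypothetical.

```python
from collections import deque

datastore = {}        # stand-in for the durable datastore
cache = {}            # stand-in for memcache (entries may vanish at any time)
task_queue = deque()  # stand-in for the background task queue


def load_and_increment(name, delta=1):
    # On a cache miss (first use or eviction), reload the last durable value.
    if name not in cache:
        cache[name] = datastore.get(name, 0)
    cache[name] += delta
    # Schedule a background write instead of writing during the request.
    task_queue.append(name)
    return cache[name]


def run_pending_tasks():
    # Each task copies the current cached count to the datastore.
    while task_queue:
        name = task_queue.popleft()
        if name in cache:
            datastore[name] = cache[name]


load_and_increment("story-1")
load_and_increment("story-1")
run_pending_tasks()   # the datastore now holds 2
cache.clear()         # simulate an eviction
load_and_increment("story-1")
run_pending_tasks()
print(datastore["story-1"])  # 3
```

The request path only touches the cache; durability comes from the queued writes, and an eviction costs one datastore read on the next access.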

Performance

Since counter updates are usually written back to the datastore within seconds, the risk of loss is minimal. Write performance is excellent, since only the memcache must be updated before completing a request. Most reads can also be served from the memcache. Load on the datastore is further reduced by storing a dirty flag along with each memcached count. If more increment events come in than can be written back in real time, only one write is needed to update the datastore with the latest count. After a successful write, the dirty bit is cleared and the other backlogged write tasks for that counter are skipped.
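As a rough sketch of how a dirty flag can collapse a backlog of write tasks into a single datastore write, consider the following. The class and attribute names are assumptions for illustration, and in-memory structures stand in for memcache, the datastore, and the task queue.

```python
from collections import deque


class WriteBehindStore:
    def __init__(self):
        self.datastore = {}       # stand-in for the durable datastore
        self.cache = {}           # name -> {"count": int, "dirty": bool}
        self.tasks = deque()      # stand-in for the task queue
        self.datastore_writes = 0

    def increment(self, name, delta=1):
        entry = self.cache.setdefault(
            name, {"count": self.datastore.get(name, 0), "dirty": False}
        )
        entry["count"] += delta
        entry["dirty"] = True     # the cached count is ahead of the datastore
        self.tasks.append(name)

    def run_tasks(self):
        while self.tasks:
            name = self.tasks.popleft()
            entry = self.cache.get(name)
            # Skip backlogged tasks once an earlier task wrote the latest count.
            if entry is None or not entry["dirty"]:
                continue
            self.datastore[name] = entry["count"]
            entry["dirty"] = False
            self.datastore_writes += 1


store = WriteBehindStore()
for _ in range(100):
    store.increment("story-1")
store.run_tasks()
print(store.datastore["story-1"], store.datastore_writes)  # 100 1
```

One hundred increments queue one hundred tasks, but only the first task to run finds the flag set, so the datastore sees a single write.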

Using Livecount

This simple solution has allowed Pulse’s backend to easily scale to counting hundreds of events per second, with minimal cost and complexity. Livecount’s API requires nothing more than a simple string counter name.

    from livecount.counter import load_and_increment_counter

    # url is the string being counted, used as the counter name
    load_and_increment_counter(name=url)

For more advanced use cases, namespaces are supported to keep counters organized and easy to query. Recently, we also added support for time period fields to help with hourly/daily/weekly/monthly aggregates. Here’s a more advanced example.

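    from datetime import datetime

    from livecount.counter import PeriodType, load_and_increment_counter

    # Count the event in both the daily and weekly buckets of the "starred" namespace
    load_and_increment_counter(name=url, period=datetime.now(),
                               period_types=[PeriodType.DAY, PeriodType.WEEK],
                               namespace="starred", delta=1)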
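One way to think about period-based counters is that each event increments one counter per requested period, with the period start folded into the counter key. The sketch below illustrates that idea only; the key format and helper names are assumptions and may differ from how Livecount stores periods internally.

```python
from datetime import datetime, timedelta


def period_key(moment, period_type):
    """Truncate a datetime to the start of its period and format it as a key part."""
    if period_type == "hour":
        start = moment.replace(minute=0, second=0, microsecond=0)
    elif period_type == "day":
        start = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    elif period_type == "week":
        midnight = moment.replace(hour=0, minute=0, second=0, microsecond=0)
        start = midnight - timedelta(days=moment.weekday())  # back to Monday
    elif period_type == "month":
        start = moment.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    else:
        raise ValueError(f"unknown period type: {period_type}")
    return f"{period_type}:{start.isoformat()}"


def counter_keys(namespace, name, moment, period_types):
    # One key per period type; every event in the same period maps to the same key.
    return [f"{namespace}|{period_key(moment, p)}|{name}" for p in period_types]


when = datetime(2011, 6, 21, 10, 30)
for key in counter_keys("starred", "http://example.com/story", when, ["day", "week"]):
    print(key)
```

Because all events within a period share a key, a daily or weekly aggregate is a single counter read rather than a scan over raw events.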

Livecount is open-source and easily deployable on Google AppEngine. If you have something to count, give Livecount a try. We’d love to hear your thoughts or suggestions for improvement!

Best Practices for Releasing Android App Updates

Since the Android version of Pulse was released in July of last year, we’ve released some 40 updates. Here are some of the basic and more advanced best practices that we’ve accumulated:

The Basics

To publish an update, you must sign it with the same private key you used when you initially published the app. After you publish the initial version, back up your key and make sure you don’t forget the keystore password; otherwise your app can no longer be updated. Your only option would be to change the package name and publish as a separate app, which is a really bad situation to be in if you have any users. Make sure to back up your keystore!

In general, you want to test the new version on a variety of devices with different screen sizes and running different versions of Android. Building a community of beta-testers can be a great way to get some feedback and testing on a bunch of devices.

Testing Upgrades

Make sure to test various upgrade combinations. For example, if you’re going to be releasing version 2.0, test upgrades from the last few versions (1.7 → 2.0, 1.8 → 2.0, 1.9 → 2.0), because a lot of users don’t install updates immediately after they are published. To facilitate upgrade testing, you can keep an internal-only Dropbox folder with all the previous Market versions, and upload a new one whenever you publish an updated version of the app.

When you have a previous version running on your device, add a few widgets (if your app has them) and shortcuts to the home screen, and make sure they stick around and keep working after upgrades. Double-check that you are not changing any of the things that cannot change, such as the package name or the class names of the components that existing widgets and shortcuts point to. Also make sure user settings are persisted through upgrades.

Another thing to watch out for when upgrading is database changes. If you use the standard SQLite database, make sure your onUpgrade method handles upgrading your database appropriately from each older version. If you have any tutorials when the app first opens, check that the messaging is appropriate for upgrades as well as new installs. Someone who has been using your app for a year shouldn’t get a welcome message or another walkthrough of how your app works.
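The database upgrade logic is easier to get right when migrations are applied one version at a time, so that every older version reaches the latest schema through the same steps. Below is a minimal, language-agnostic sketch of that pattern using Python’s built-in sqlite3 module rather than Android code; the table, columns, and version numbers are hypothetical.

```python
import sqlite3

# Hypothetical schema history: MIGRATIONS[n] upgrades the schema from version n to n + 1.
MIGRATIONS = {
    1: "ALTER TABLE stories ADD COLUMN starred INTEGER NOT NULL DEFAULT 0",
    2: "CREATE INDEX idx_stories_starred ON stories (starred)",
}
LATEST_VERSION = 3


def upgrade(conn, old_version, new_version=LATEST_VERSION):
    # Apply each step in order so a jump from 1 to 3 runs the same code as 1 -> 2 -> 3.
    for version in range(old_version, new_version):
        conn.execute(MIGRATIONS[version])
    conn.execute(f"PRAGMA user_version = {new_version}")
    conn.commit()


# Simulate a database created by an old release (schema version 1) with user data in it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stories (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("PRAGMA user_version = 1")
conn.execute("INSERT INTO stories (title) VALUES ('125 Days at a Startup')")

old_version = conn.execute("PRAGMA user_version").fetchone()[0]
upgrade(conn, old_version)
rows = conn.execute("SELECT title, starred FROM stories").fetchall()
print(rows)
```

Testing then amounts to starting from a database at each older schema version and checking that existing rows survive the upgrade.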

By accessing the version name (use PackageManager to get the app’s PackageInfo, which includes the version name) and saving the current version name somewhere like SharedPreferences, you can easily figure out whether it’s a new install, an upgrade, or the same version, and handle messaging or anything else appropriately.
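The comparison itself is simple. Here is a language-agnostic sketch in Python of the decision logic, assuming purely numeric dotted version names such as "1.9" and "2.0"; the function names are hypothetical, and on Android the stored value would come from SharedPreferences and the current one from PackageInfo.

```python
def parse_version(version_name):
    # "1.10" -> (1, 10), so that 1.10 compares as newer than 1.9.
    return tuple(int(part) for part in version_name.split("."))


def classify_launch(stored_version, current_version):
    """Return 'new_install', 'upgrade', 'downgrade', or 'same_version'."""
    if stored_version is None:
        # Nothing saved yet: the app has never run on this device.
        return "new_install"
    stored = parse_version(stored_version)
    current = parse_version(current_version)
    if stored < current:
        return "upgrade"
    if stored > current:
        return "downgrade"
    return "same_version"


print(classify_launch(None, "2.0"))    # new_install
print(classify_launch("1.8", "2.0"))   # upgrade
print(classify_launch("2.0", "2.0"))   # same_version
```

After handling the result, save the current version name so that the next launch is classified as the same version.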

Communication with Users

The changelog is the main place where you can communicate with users about changes in the recent version. Even if it’s a small update with trivial bug fixes, you definitely need to put something in the changelog. Many Android users will give you 1-star reviews for an empty changelog, and even after you fill it in, Android Market can take up to a few hours to show the change.

We tend to keep change history around for the last few updates, since not all users are on the current version.

Post-Publication

Even if you’ve done your best right up until you click “Publish,” bugs will probably show up in production. For the next few days, keep an eye on crash reports through the Android Developer dashboard, and be ready to release a quick fix if anything urgent comes up. Whichever version control system you use, always keep a branch synced to the latest published version. That way, if crash reports or email feedback require a quick fix, you can make it on that branch and won’t ship unfinished work or introduce new bugs into the app.

If you’ve found any other useful guidelines or tips for releasing updates, we’d love to hear them!