Savvy shoppers have always been on the lookout for new deals, but sites such as Groupon, Gilt City, and LivingSocial have made deal-hunting even more popular. For the last few weeks, we’ve been working hard to make finding great deals easier for our users. By aggregating local deals from these sources, we hope to make them less intrusive (than email alerts) and easy to peruse. It should also be painless to scan deals from several different sites at once by simply adding each of them in Pulse.
Where are the deals?
Since most deal sites are location-based, we want to allow our users see deals that are nearby. The first step is to know where the deals are. We can get a list of cities (either through API calls to the deal sites or RSS feeds provided by them) and store the longitude and latitude of these cities. If the sites are unable to provide a list of cities and/or their coordinates, it is relatively straightforward to scrape the list of cities from their website (with permission of course!) and find the coordinates of the cities through services such as the Google Geocoding API. Once we have this data, we store the city/coordinate pairs in the Google AppEngine datastore (Figure 1).
Where is the user?
On GPS enabled devices, location coordinates are relatively easy to come by. A client application can simply query the server with a user’s coordinates (with user’s permission) using an API call such as:
http://<server>/api/nearby?long=-73&lat=42&max=10. After some server-side calculations, this call will return a list of nearby cities that should be relevant to the user because they are nearby the location the user just provided.
What is nearby?
Given that we have each city’s coordinates stored on the server and the user’s coordinates from the client application, how do we determine which cities are closest? We use the following algorithm to figure out which cities we should show to the user:
dist = math.pow(city.long - user.long, 2) + math.pow(city.lat - user.lat, 2)
heapq.heappush(result, (dist, city))
The result is a min heap that ranks the cities from the closest to the farthest. We return this list (or a subset of it) to the client application and then allow the user to choose to add deals from a city near them.
The key to making this operation efficient is loading the city coordinates into memory (in our case, into the AppEngine Memcache). This simple optimization works well for any small/medium sized list of locations and turns a resource intensive calculation (that can be quite involved to implement on AppEngine’s datastore) into simple and scaleable operation that requires no datastore calls.
We hope to give our users a better experience by offering the ability to aggregate deal sites in Pulse. This was a short insight into how we implemented this feature. If you have great ideas that can help further improve this experience, let us know!