Recap: Tech & Templeton

Yesterday’s Tech Talk was a special event focusing on collaboration between designers and engineers. After a tasting of Al Capone’s favorite, Templeton Rye, Head of Product Design Stuart Norrie and Lead Android Developer Albert Lai presented tips, tricks, and common misconceptions about working with designers and engineers. From Photoshop to machine learning, these two blurred the boundaries and examined the benefits of multifaceted product development.

What would you like to see from our future tech talks? We’d love to hear your feedback, speaker suggestions, and more. Talk to us on Twitter, Facebook, or leave a comment here on the Engineering Blog.

Check out our presentation slides below, and stay tuned for news about
our next Pulse event!

Designing Engineers
By Stuart Norrie

How far do designers and developers need to go into each other’s worlds? What things can each camp learn about the other to make the transition from sketches and mocks to working builds as smooth as possible? This talk will outline what works and what doesn’t work from a designer’s perspective.

Engineering Designers
By Albert Lai

Get a look into the mind of an engineer! Learn about the various systems they use and how those systems could impact the design of a product. We will discuss popular frameworks, how mobile apps are architected, common problems that keep engineers up at night, and the real reasons behind those “are you crazy?!” looks we sometimes give you when you ask if something is feasible.

For more on all things Pulse, check out our main blog at blog.pulse.me.

Join Us for Tech & Templeton!


After last month’s awesome (and jam-packed!) Tech Talk, we have high hopes for Wednesday’s Tech & Templeton event. Join us at Pulse HQ for an evening of talks on a crucial piece of the app puzzle: collaboration between engineers and designers.

Make sure to RSVP here!

You know them, you love them, but how do you work with them? Hear from our Android and Design teams to get insider tips and tricks for building the best products with the other side of the aisle. We look forward to seeing you at Pulse HQ!

Your evening will start off at 6:45pm with a tasting of Al Capone’s “good stuff”—the deliciously infamous Templeton Rye. Our first presentation will begin at 7:15pm. Read more about this unique set of talks below:

Engineering Designers
By Lead Android Developer Albert Lai
Get a look into the mind of an engineer! Learn about the various systems they use and how those systems could impact the design of a product. We will discuss popular frameworks, how mobile apps are architected, common problems that keep engineers up at night, and the real reasons behind those “are you crazy?!” looks we sometimes give you when you ask if something is feasible.

Designing Engineers
By Lead Product Designer Stuart Norrie
How far do designers and developers need to go into each other’s worlds? What things can each camp learn about the other to make the transition from sketches and mocks to working builds as smooth as possible? This talk will outline what works and what doesn’t work from a designer’s perspective.

Recap: Pulse’s Tech & Tonic

Wednesday saw Pulse’s second official Tech Talk, Tech & Tonic!

The event was packed to the gills, and we thank those of you who made it out to our San Francisco HQ for presentations, absinthe, and great conversation. We were also honored to have people following along on Twitter, located everywhere from the Mission to Malaysia. We appreciate the support, and we’d love to hear any questions or feedback you have about the event! Let us know on Facebook and Twitter.

For those of you who couldn’t make it, check out our recaps:

Speed Up Your Web App By Asynchronously Loading Resources
By Lead Web Developer Filip Mares

From Filip Mares: “The Pulse web app is built with a mixture of backbone.js/Django/YepNope.js. The old ways of packaging your static resources are not optimal for single page apps. Using yepnope.js you can load all of these files asynchronously and speed up load times for the core of your app. At Pulse we lowered our bandwidth use and sped up our initial load times through the use of asynchronous resource loading.”

Syncing Non-Trivial User Data Across Mobile Devices
By Pulse Co-Founder & CTO Ankit Gupta & Lead Backend Engineer Greg Bayer

From Greg Bayer: “The core concepts we covered are:
1) Mobile & offline support means not having a single source of truth. Implications of this.
2) Making sure the user’s mental model matches what the client and syncing service does.
3) Some challenges and techniques for making syncing data across mobile devices fast.”

Thanks for joining us, and stay tuned for our next event!

Announcing: Tech & Tonic!

We’re bringing a winning combination to 2 Shaw Alley: Tech and Tonic! Join us for a gin and absinthe tasting courtesy of Raff Distillerie, then hear from members of Pulse’s iOS and Web teams.

Make sure to RSVP here, and read more about the tech talks below:

Speed Up Your Web App By Asynchronously Loading Resources
By Lead Web Developer Filip Mares

The Pulse web app is built with a mixture of backbone.js/Django/YepNope.js. This talk will give an overview of our architecture and conventions for developing the app, as well as discuss some limitations in performance and how we’ve overcome them.

Syncing Non-Trivial User Data Across Mobile Devices
By Pulse Co-Founder & CTO Ankit Gupta and Lead Backend Engineer Greg Bayer

One of the most delightful features in Pulse is the automatic syncing of user’s sources across their devices. This tech talk will cover the architecture and implementation of our syncing service, from both app and server perspectives. We will go over real world considerations, including speed, efficiency, offline access and user interface decisions. We will provide specific best practices for iOS and Android apps as well.

We can’t wait to see you!

Wednesday, January 30th
6:45 PM
2 Shaw Alley, 5th Floor
San Francisco

As always, get more updates about all things Pulse at Facebook and Twitter.

Recap: Pulse’s First Tech Talk

Pulse had its first Tech Talk last Wednesday at our HQ in San Francisco! Featuring a tequila tasting and two presentations from members of the team, the event was a success—and the first of many Tech Talks to come. Thank you to everyone who attended!

If you’d like to hear about our upcoming events, keep your eyes on this page, follow us on Facebook and Twitter, or send us an email to be added to our priority mailing list.

If you missed the presentations, check out the slides and recaps below:

Building on the Shoulders Of Giants:
How We Bootstrapped an MVP Data Product on AWS and GAE

 

From Backend Engineer Greg Bayer: “Pulse’s backend runs on both Amazon Web Services (AWS) and Google App Engine (GAE). Our engineering philosophy is to minimize ops and low-level system development, so we can focus on core product features. To that end, we try to build things on GAE’s more managed infrastructure first and switch over to AWS if a particular feature doesn’t fit well with Google’s architecture.

In Wednesday’s talk we highlighted our Scribe-based event log collection and EMR-based analysis systems on AWS. We also talked about using GAE to manage user accounts, and how we repurpose GAE’s map reduce to efficiently send email to large sets of users.”

From Product Engineer Elliot Babchick: “We covered the steps involved in Pulse’s data pipeline for creating an email digest. We covered the raw logging of our user events all the way up to our App Engine map reduce job to send the emails. Results of the effectiveness were presented (~2x engagement compared with our non-personalized emails), along with several tips about how to make the process easier to debug.”

 

One Screen, Two Screen, Red Screen, Blue Screen:
Designing and Engineering a Mobile Application for Multiple Screen Sizes

 

From Lead Android Developer Albert Lai: “Creating an app is more complicated now than it was three years ago; you can’t just design for the iPhone screen and be done with it! You’ll have to be prepared to run your app on a diverse array of screen sizes, and lucky for you, there are some guidelines to help your code be as adaptable as you are.

1. use space judiciously
2. use popups for small actions in tablets
3. you’ll have to write size dependent code and use dynamic layouts

Best practices:
1. Use Fragments in Android (they’re in the compatibility library) and ModalViewControllers in iOS
2. There are classes in the respective SDKs that tell you what kind of device your code is running on. Use them.
3. Never use absolute coordinates to lay out your app
4. Use RelativeLayouts in Android and Autolayout in iOS
5. Calculate dimensions for UI elements on-the-fly in your app”

 

Stay tuned for our next event!

Join us at Pulse’s First Tech Talk!

Join us at Pulse HQ on December 12th from 6:30 to 8:00 PM to learn about the technology that drives Pulse, from the people who make it happen. Meet the team, check out our San Francisco office space, and even enjoy a tequila tasting. Make sure to RSVP here!

The presentations include:

Building on the Shoulders Of Giants

How We Bootstrapped an MVP Data Product on AWS and GAE

This talk, led by backend lead Greg Bayer, will attempt to cover several major components of Pulse’s backend infrastructure, including an overview of the services we leverage, and how we specifically make use of them. Along the way, product engineer Elliot Babchick will be relating how each one of these pieces allowed us to create a product feature driven by user generated data, at scale, in just under a week, under a single engineer’s supervision.

One Screen, Two Screen, Red Screen, Blue Screen

Designing and Engineering a Mobile Application for Multiple Screen Sizes

Screens are everywhere; they’re even on little devices inside our pockets. Yet how do we know how big they are when we load our software onto those portable screens? Android lead Albert Lai will walk through some of the key heuristics and techniques used to ensure a seamless Pulse experience that adapts to virtually any mobile device size.

______________

Immediately following our tech talk, we’ll be discussing (and tasting) the technology of premium tequilas, from a blanco to a reposado to an añejo. Our brand representative will walk you through the tasting profiles and teach you what makes an alcohol a tequila, what makes a tequila a blanco vs. a reposado, and more. For guests 21 and up.

Backend Tips – App Engine, meet Redis on AWS

Since snappy performance is critical to providing a good user experience, we try to keep the latency of all common Pulse backend API requests under 500ms. Most of the time we achieve this by using Google App Engine’s memcache to cache all data which might be reused by many requests. Less commonly requested data is pulled from the datastore, resulting in such requests taking a bit longer than we like.

When these slower requests are rare, we accept them. However, for features that access a broad range of data, the likelihood of missing the cache increases. Some data required for a request may be cached, but some will almost always not be, resulting in high latency for most requests.
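
A rough back-of-the-envelope model shows why broad requests suffer. The sketch below assumes each item a request needs is found in the cache independently and with the same probability. Both that independence assumption and the 95% hit rate are illustrative, not measurements from our system.

```python
def all_hits_probability(hit_rate, items_needed):
    """Chance that every item a request needs is already cached,
    assuming independent lookups that share the same hit rate."""
    return hit_rate ** items_needed


# With a 95% per-item hit rate, a request touching many items
# rarely finds all of them in the cache.
for n in (1, 10, 50):
    print(n, round(all_hits_probability(0.95, n), 3))
```

Under these assumptions, a request that needs 10 items is served fully from cache only about 60% of the time, and one that needs 50 items less than 8% of the time.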

To implement these types of features efficiently, one option is to dramatically increase the size of our memcache. This would allow us to keep all required data in cache. However, it would be expensive and is somewhat at odds with the LRU cache policy we like to use for other features. This approach is also currently unsupported on Google App Engine (since memcache capacity is not directly tunable).

We investigated several other options and finally settled on using Redis as a persistent, in-memory datastore. Redis strikes a great balance between simplicity, powerful primitives, and proven stability. Instead of increasing our memcache or switching entirely to a larger in-memory store, we created a second Redis-based system on AWS. This system is specifically designed to hold data which is important to have available at in-memory speeds (with no expected misses). Achieving this is more expensive than providing a similar LRU cache (which could be smaller), so we reserve it specifically for features that require such guarantees.

Architecture

We wanted to use Redis, but also to make sure that our implementation was both scalable and easily recoverable in the case of failure. From here on out, we will discuss the infrastructure and tools we use to build this system. Here’s a visual overview of the system:

 

Amazon Elastic Load Balancer

This is a really nice utility that AWS gives us. We set up an ELB that points to as many EC2 machines as we need (we’ll call them redis frontends). We get automatic round-robin balancing across those machines, and the ELB also detects failing machines, gives us a warning, and transfers the load to the machines that are still running. Some important dos:

  1. The load balancer can deal with HTTPS requests, so use them! Some security is always better than none.
  2. You should make sure that the machines you provide to the load balancer are distributed among the different availability zones that AWS offers.
  3. You can also scale dynamically by putting instances into an Auto Scaling group and attaching the group to the load balancer.


HA Proxy

Our redis frontend machines use Tornado as the webserver. Tornado is fast (great!) and single threaded. Being single threaded prevents many headaches, scales predictably, and has minimal overhead, but it doesn’t benefit from multiple cores on a machine. The larger Amazon machines have multiple cores, so we really want to use that to our advantage. Enter HA Proxy, a nice utility that allows you to build a reverse proxy. Here’s a bare-bones version of the configuration we use:

global
    maxconn 1024
    daemon
    log 127.0.0.1 local0

frontend load_balancer
    # We process all requests hitting port 8080
    bind *:8080
    # We will point them to the backend we describe later
    default_backend tornado_servers
    mode http
    option httplog
    option dontlognull
    clitimeout 20000

backend tornado_servers
    # The balancing strategy
    balance roundrobin
    # The tornado servers, one per core; in this case, the machine has 4 cores.
    # Each server needs its own name.
    server tornado_1 127.0.0.1:13371 check rise 2 fall 5
    server tornado_2 127.0.0.1:13372 check rise 2 fall 5
    server tornado_3 127.0.0.1:13373 check rise 2 fall 5
    server tornado_4 127.0.0.1:13374 check rise 2 fall 5
    retries 1
    mode http
    contimeout 5000
    srvtimeout 20000
    # We also get stats from HA Proxy about our tornado servers
    stats enable
    stats uri /lb?stats
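
Because the backend needs one server line per Tornado process, the list can be generated rather than maintained by hand. This is a minimal sketch, not part of our tooling: `render_backend` is a hypothetical helper, and the naming scheme, base port, and check parameters simply follow the pattern of the server lines above, with one uniquely named entry per core.

```python
def render_backend(num_cores, base_port=13371, host="127.0.0.1"):
    """Return one HAProxy 'server' line per Tornado process.

    Each process gets a unique name and its own port, starting at base_port.
    """
    lines = []
    for i in range(num_cores):
        lines.append(
            "server tornado_%d %s:%d check rise 2 fall 5"
            % (i + 1, host, base_port + i)
        )
    return lines


print("\n".join(render_backend(4)))
```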

Tornado Frontends

Each of these Tornado instances provides a thin Python API layer. The implementation is both simple and very specific to our own use cases. I won’t go into the specific details, but the frontend takes care of all of the security and implements the internal API we provide to our client teams. It also handles general tasks such as deserialization, error handling, and batching requests before hitting the backend. We run enough instances to match the number of cores on the machine, and they all rely on the sharded redis interface to actually access the data.

Sharded Redis Interface

This is based heavily off of redis-py by Andy McCurdy, so many thanks to him. You can take a look at https://github.com/andymccurdy/redis-py/

The thing we needed to add was the ability to split our data amongst several different machines. Andy is working on a general solution for this called cluster redis, but we opted to go with something simpler in the meantime.

The first thing was to implement the actual sharding, something like:

def find_shard(key):
    hash_value = some_consistent_hash_function(key)
    return hash_value % num_machines

With that little snippet, it was pretty easy to send operations to a wrapper class of StrictRedis (look at redis-py), and just have all the tornado frontends behave as if there was a single machine serving the data. This works as long as you don’t want to use pipelines.
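
To make the idea concrete, here is a minimal runnable sketch of such a wrapper. It is not our production code: `ShardedClient` and `FakeRedis` are hypothetical names, `zlib.crc32` stands in for whatever stable hash function you prefer, and the in-memory `FakeRedis` replaces a real StrictRedis connection so the example runs without a server.

```python
import zlib


def find_shard(key, num_machines):
    # zlib.crc32 gives the same value in every process, unlike Python's
    # built-in hash(), which may differ between runs.
    hash_value = zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF
    return hash_value % num_machines


class ShardedClient(object):
    """Hypothetical wrapper: routes each call to the client owning the key."""

    def __init__(self, clients):
        self.clients = clients  # one client object per machine

    def _client_for(self, key):
        return self.clients[find_shard(key, len(self.clients))]

    def get(self, key):
        return self._client_for(key).get(key)

    def set(self, key, value):
        return self._client_for(key).set(key, value)


class FakeRedis(object):
    """In-memory stand-in so the sketch runs without a Redis server."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value
        return True


shards = [FakeRedis(), FakeRedis()]
client = ShardedClient(shards)
client.set("user:42", "sources")
print(client.get("user:42"))
```

One consequence of plain modulo sharding is that changing the number of machines moves most keys to a different shard, so adding capacity takes some planning.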

However, it turns out that you really do want to use pipelines. Whenever you have multiple requests that you can send out at the same time, a pipeline saves you the round-trip time of each individual request. Without pipelines, it doesn’t matter how blazingly fast redis is: you are stuck on network I/O latency.

Getting pipelines to work is a little bit more involved. When a request comes in on a pipeline, we record its position in the incoming order and store that index alongside the individual machine pipeline it is routed to. An example with two machines:

command1 key1 value1 (key1 -> machine 1)
command2 key2 value2 (key2 -> machine 2)
command3 key3 value3 (key3 -> machine 1)
command4 key4 value4 (key4 -> machine 1)

We will remember it like this:
Pipeline index for machine 1:
[1, 3, 4]
Pipeline for machine 1 will contain:
command1 key1 value1
command3 key3 value3
command4 key4 value4
Pipeline index for machine 2:
[2]
Pipeline for machine 2 will contain:
command2 key2 value2

Now when we execute all the pipelines, we can reconstitute the return values in the order the commands came in to the sharded_redis interface. With solutions to both the sharding and pipelines, we now have an interface that hides the fact that we actually need multiple machines to serve all the data. Note that since each tornado frontend uses the interface independently, we need to update them all at the same time when we make changes!
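
The bookkeeping described above can be sketched in a few dozen lines. This is an illustration under stated assumptions rather than our actual interface: `ShardedPipeline` and `FakePipeline` are hypothetical names, plain dicts stand in for the Redis machines, and only `get` and `set` are modeled.

```python
import zlib


def find_shard(key, num_machines):
    return (zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF) % num_machines


class FakePipeline(object):
    """Stand-in for a per-machine pipeline: queues commands, runs them on execute()."""

    def __init__(self, store):
        self.store = store
        self.queued = []

    def queue(self, command, key, value=None):
        self.queued.append((command, key, value))

    def execute(self):
        results = []
        for command, key, value in self.queued:
            if command == "set":
                self.store[key] = value
                results.append(True)
            else:  # "get"
                results.append(self.store.get(key))
        self.queued = []
        return results


class ShardedPipeline(object):
    def __init__(self, stores):
        self.pipelines = [FakePipeline(s) for s in stores]
        self.indexes = [[] for _ in stores]  # original positions, per machine
        self.count = 0

    def queue(self, command, key, value=None):
        shard = find_shard(key, len(self.pipelines))
        self.indexes[shard].append(self.count)
        self.pipelines[shard].queue(command, key, value)
        self.count += 1

    def execute(self):
        ordered = [None] * self.count
        for index, pipeline in zip(self.indexes, self.pipelines):
            # One execute() per machine; results come back in queue order.
            for position, result in zip(index, pipeline.execute()):
                ordered[position] = result
        self.indexes = [[] for _ in self.pipelines]
        self.count = 0
        return ordered


pipe = ShardedPipeline([{}, {}])
pipe.queue("set", "key1", "value1")
pipe.queue("set", "key2", "value2")
pipe.queue("get", "key1")
pipe.queue("get", "key2")
print(pipe.execute())
```

Whichever machine each key lands on, the caller gets the four results back in the order the commands were queued.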

Redis Backend

Here are a few tips for setting up redis:

  1. Use a password, and make it a long password
  2. Set a memory limit and a reasonable policy to deal with exceeding max memory
  3. Change your machine overcommit_memory setting to 1
    sysctl -w vm.overcommit_memory=1
  4. Don’t run anything except redis on this machine
  5. If you are using AOF files and backup machines (recommended), don’t bother with persistence on the master! Instead, make sure you have an aggressive fsync policy (everysec works) for the slave.

For those who want the “why” behind each of the tips:
  1. From Redis Documentation:

    The password is set by the system administrator in clear text inside the redis.conf file. It should be long enough to prevent brute force attacks for two reasons:

    • Redis is very fast at serving queries. Many passwords per second can be tested by an external client.
    • The Redis password is stored inside the redis.conf file and inside the client configuration, so it does not need to be remembered by the system administrator, and thus it can be very long.

    The goal of the authentication layer is to optionally provide a layer of redundancy. If firewalling or any other system implemented to protect Redis from external attackers fail, an external client will still not be able to access the Redis instance without knowledge of the authentication password.

    Note: The AUTH command, like every other Redis command, is sent unencrypted, so it does not protect against an attacker that has enough access to the network to perform eavesdropping.

  2. We actually monitor the machine memory usage as well as the redis memory usage so that we can shard our redis backend further as needed. Even so, it’s safer to set a reasonable limit on the memory that redis may use, so that we don’t have a scenario where redis uses all available memory on a machine and then crashes.
  3. From Redis Documentation:

    Redis background saving schema relies on the copy-on-write semantic of fork in modern operating systems: Redis forks (creates a child process) that is an exact copy of the parent. The child process dumps the DB on disk and finally exits. In theory the child should use as much memory as the parent being a copy, but actually thanks to the copy-on-write semantic implemented by most modern operating systems the parent and child process will share the common memory pages. A page will be duplicated only when it changes in the child or in the parent. Since in theory all the pages may change while the child process is saving, Linux can’t tell in advance how much memory the child will take, so if the overcommit_memory setting is set to zero fork will fail unless there is as much free RAM as required to really duplicate all the parent memory pages, with the result that if you have a Redis dataset of 3 GB and just 2 GB of free memory it will fail.

    Setting overcommit_memory to 1 says Linux to relax and perform the fork in a more optimistic allocation fashion, and this is indeed what you want for Redis.

  4. Because of the large memory footprint we expect redis to use and the fact that we have to use an optimistic memory allocation setting, running anything else that might use up a lot of memory on the same machine can lead to failures.
  5. This is an optimization to make sure the master Redis instance does not bottleneck because of disk writes. The work associated with persistence is offloaded as much as possible to a backup machine. That being said, it’s important that the slave/backup machine is robust.

Backup

This is simply a second machine running Redis that is set as a slave to the master Redis instance. In AWS, remember to use internal ip addresses when setting this up, since it saves you money. Backups are a must when you are running redis in production for several reasons:

  1. It’s a backup! If your machine in front goes down, you fail over to the backup while you try to fix the first machine. More often than not, when you are running on AWS you can actually just promote the backup and set up a new backup.
  2. If you ever need to expand the number of machines used for serving, you can just promote your backup to a serving machine and set up new backups for both machines. I would be remiss not to mention that you do have to then go through both machines to delete the extra keys later, or else you really won’t have expanded your memory limit.
  3. You can run data analytics on the backup without affecting the all-important performance of the actual serving machine.
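
The cleanup mentioned in point 2 can be sketched as follows. This assumes the simple modulo sharding described earlier and the easiest case, where a single serving machine is split into two (the old master becomes shard 0 and the promoted backup becomes shard 1). `keys_to_delete` is a hypothetical helper, and the key names are made up for illustration.

```python
import zlib


def find_shard(key, num_machines):
    return (zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF) % num_machines


def keys_to_delete(keys, machine_index, num_machines):
    """Keys held on this machine that now belong to a different shard."""
    return [k for k in keys if find_shard(k, num_machines) != machine_index]


# Right after the split, both machines still hold the full key set.
all_keys = ["user:%d" % i for i in range(1000)]
stale_on_0 = keys_to_delete(all_keys, 0, 2)
stale_on_1 = keys_to_delete(all_keys, 1, 2)
print(len(stale_on_0), len(stale_on_1))
```

Every key is stale on exactly one of the two machines, so once both lists are deleted the data is split with no duplicates and no losses.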

Backend Tips – Conquering Big Tables with MapReduce

As some of our readers already know, Pulse uses Google App Engine (GAE) to serve content from thousands of publishers to millions of users. We have been very happy with the minimal operational overhead App Engine requires and were thrilled to see App Engine scale without hiccups when we were preloaded on the Kindle Fire.

Backend engineering inevitably involves some heavy data processing. In our case, this often happens on data in the App Engine datastore. We have always relied on the very flexible and easy-to-use remote shell for this type of work. However, this approach is too slow for many use cases, especially those touching millions of records.

For larger tasks, App Engine’s built-in MapReduce is often the right tool. It allows us to operate on millions of datastore entities in a very short amount of time. To give a few examples, we use MapReduce to migrate existing data from legacy datastore models to new models after architectural changes, to load test our system with hundreds of shards simulating millions of users, and to inform our users of Pulse’s latest updates by sending out millions of emails or push notifications.

Data Migration

When making product changes, we sometimes move large amounts of data away from a legacy django-nonrel model. The speed of MapReduce ensures that minimal transition time is required and that the transition is painless enough that it is preferred over simply living with the wrong data model.

Load Testing

We use MapReduce to simulate load tests that would otherwise be unrealistic if we only used a few physical machines. A simple load test might use MapReduce to make thousands of requests within a very short period. These requests can simulate millions of users using Pulse throughout a day.

Lessons Learned

You should plan and test any large Map Reduce task that will consume quota-limited resources before running the full job. It’s a good idea to estimate the amount of datastore reads/writes, url fetch calls, and other API requests beforehand. In some cases, it may be necessary to contact App Engine support to ask for increased quotas (for those that cannot be increased in the admin console).

For those using a framework on top of App Engine, make sure you initialize at the top of your handler file (see below). In some cases, you may also need to add the initialization code to the mapreduce module (at the top of mapreduce/main.py). In Django-nonrel, the init line you’ll need looks like this.

from djangoappengine import main

Getting Started

For those of you new to Map Reduce on App Engine, here’s how to create jobs of your own. The App Engine team has made it pretty easy.

Download the mapreduce library via svn and add it to your app:

 svn checkout http://appengine-mapreduce.googlecode.com/svn/trunk/python/src/mapreduce

Register the MapReduce handler in your app.yaml:

handlers:
- url: /mapreduce(/.*)?
  script: mapreduce/main.py
  login: admin

url – The MapReduce endpoints.
script – The MapReduce library’s request handler, which serves the endpoints above.
login – Restricts access to app admins only.

Create the mapper file (mr_email_users.py, referenced by the handler setting in mapreduce.yaml in the next step) and define the function that receives each entity of the model you want to map over:

def run(user_entity):
    send_email(user_entity.email)

Note: See the official Map Reduce guide below for more advanced options & examples.

Register and configure the MapReduce job in mapreduce.yaml:

mapreduce:
- name: MapReduce Email Users Job
  mapper:
    input_reader: mapreduce.input_readers.DatastoreInputReader
    handler: mr_email_users.run
    params:
    - name: entity_kind
      default: user
    - name: shard_count
      default: 50
    - name: processing_rate
      default: 1000

input_reader – The input reader for this job; you can find other types here.
handler – The entry point to this MapReduce job.
entity_kind – The datastore model being mapped over.
shard_count – The number of mapper workers to run concurrently.
processing_rate – The aggregated maximum number of inputs processed per second by all mappers. Can be used to avoid using up all quota, interfering with online users.
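
Before launching a large job, it helps to estimate how long it will run and how many quota-limited calls it will make. The sketch below is simple arithmetic under stated assumptions: the rate limit is the only bottleneck, each entity triggers a fixed number of API calls, and the figure of two million users is illustrative rather than a real Pulse number. Both function names are hypothetical.

```python
def min_job_seconds(entity_count, processing_rate):
    """Lower bound on run time when processing_rate is the bottleneck."""
    return entity_count / float(processing_rate)


def estimated_api_calls(entity_count, calls_per_entity):
    """Rough total of quota-limited calls (e.g. one email per user)."""
    return entity_count * calls_per_entity


users = 2000000  # illustrative figure only
print(min_job_seconds(users, 1000) / 60.0)  # minutes at 1000 entities/second
print(estimated_api_calls(users, 1))
```

At the processing_rate of 1000 shown above, two million entities take at least 2000 seconds (about 33 minutes), regardless of the shard count.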

Access the MapReduce admin console panel to view and launch jobs:

http://(your app name).appspot.com/mapreduce/status

More Info

You may be interested in the official MapReduce Get Started Guide for Python or Java. In addition, this 2011 Google IO talk includes many new useful MapReduce tips. Please leave any questions and comments below, and we will be happy to answer / discuss!

Backend Tips – Google Cloud Storage

Google App Engine’s datastore meets most of our backend storage needs, but we sometimes find ourselves limited by the maximum entity size of one megabyte. One option for storing larger files is to build a separate system on top of Amazon S3. A downside of this approach, however, is that we cannot take advantage of Google’s edge cache, which acts as a free CDN.

A second option is the new Google Cloud Storage service. Google Cloud Storage is the unofficial successor to the Google App Engine Blobstore, and both services are built on the same underlying infrastructure. Yet unlike the Blobstore, which is bundled with App Engine, Google Cloud Storage is a standalone service for storing and managing data. As such, Cloud Storage is Google’s attempt to roll out an Infrastructure as a Service (IaaS) offering that can compete with Amazon S3.

Getting Started

In order to use Google Cloud Storage with App Engine, the first step is to grant your application access to your storage bucket. The documentation instructs you to add the application’s service account name (application-id@appspot.gserviceaccount.com) as a team member to your Google APIs console project.

However, since we created our project with a Google Apps account, this takes a bit more effort. Only users from our domain (xxx@yourdomain.com) could be added to the team via the console. The solution is to use the GSUtil command line tool to edit the storage bucket’s Access Control List (ACL).

Run the following command to retrieve your bucket’s current ACL: gsutil getacl gs://bucketname > acl.txt. Then add an entry that looks like this:

<Entry>
<Scope type="UserByEmail">
<EmailAddress>application-id@appspot.gserviceaccount.com</EmailAddress>
<Name>Service Account</Name>
</Scope>
<Permission>FULL_CONTROL</Permission>
</Entry>

Finally, run this command to set the new ACL: gsutil setacl acl.txt gs://bucketname.
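
If you would rather not edit acl.txt by hand, the entry can be appended with a short script. This is a sketch under an assumption: it expects the file’s root element to contain an Entries element that holds the Entry items, so check your own acl.txt before relying on it. `add_service_account` is a hypothetical helper, and the sample string below is a minimal stand-in for a real ACL file.

```python
import xml.etree.ElementTree as ET


def add_service_account(acl_xml, email, permission="FULL_CONTROL"):
    """Return ACL XML with one extra <Entry> for the given email."""
    root = ET.fromstring(acl_xml)
    entries = root.find("Entries")  # assumed container for <Entry> items
    entry = ET.SubElement(entries, "Entry")
    scope = ET.SubElement(entry, "Scope", {"type": "UserByEmail"})
    ET.SubElement(scope, "EmailAddress").text = email
    ET.SubElement(scope, "Name").text = "Service Account"
    ET.SubElement(entry, "Permission").text = permission
    return ET.tostring(root, encoding="unicode")


# Minimal stand-in for the contents of acl.txt (real files contain more).
sample = "<AccessControlList><Entries></Entries></AccessControlList>"
print(add_service_account(sample, "application-id@appspot.gserviceaccount.com"))
```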

Storing Data

Google provides an experimental API to integrate Cloud Storage with App Engine. This API allows for reading and writing of files to a storage bucket. While testing, I had already preloaded some test files into our bucket using the (barebones, but functional) Cloud Storage Manager web application. I could also have used the GSUtil tool.

Moving forward, we wanted to start loading files programmatically from within App Engine. The API documentation clearly explains how to create, write to, save, and read from Cloud Storage objects. Note that the function provided by the API to create a Google Cloud Storage object, files.gs.create(), takes a number of useful parameters. For instance, this is where you can specify the ACL and Cache-Control header for the object.

The documentation does not address the case in which the object you wish to save is a user upload. Storing uploaded files in a bucket can be accomplished using the Blobstore, as suggested by this StackOverflow answer. The blobstore_helper module is useful for adapting this code for Django.  Simply replace self.get_uploads('file') with blobstore_helper.get_uploads(request, 'file') in order to retrieve the uploaded files.

Serving Content

The Cloud Storage API does not offer a way to serve files directly from a storage bucket. Instead, you can use the Blobstore API to create a url that points at your file.

First, generate a blob key for the Cloud Storage object using the Blobstore API’s create_gs_key() function. Then serve the object as you would a traditional blobstore object. The example given for the Blobstore Python API assumes use of Google’s webapp framework, which provides helper functions (such as self.send_blob()) that obscure the underlying implementation. This makes it a little tricky to understand how to port the code to a different framework, but once again the blobstore_helper module offers some insight. The module defines its own send_blob function, in which the key line of code is response[blobstore.BLOB_KEY_HEADER] = str(blob_key). Essentially, if you put a special header in the response containing the blob key, then App Engine will automatically fill the body of the response with the content of the blob.

To properly serve the blob, it is also necessary to set a correct Content-Type header for the response. Although the Cloud Storage REST API does support retrieving an object’s metadata, it seems that the API for App Engine does not. Currently, we rely on Python’s mimetypes module, which can guess content type from a filename: response['Content-Type'] = mimetypes.guess_type(filename)[0].
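
Putting the two headers together, a framework-independent sketch might look like the following. It builds a plain dict instead of a real response object so that it runs anywhere. The header name is what I believe blobstore.BLOB_KEY_HEADER resolves to, so treat it as an assumption and use the SDK constant in real code. `blob_response_headers` is a hypothetical helper, and the fallback content type is my own addition.

```python
import mimetypes

# Assumed value of blobstore.BLOB_KEY_HEADER in the App Engine SDK.
BLOB_KEY_HEADER = "X-AppEngine-BlobKey"


def blob_response_headers(blob_key, filename):
    """Headers asking App Engine to fill the response body with the blob."""
    content_type = mimetypes.guess_type(filename)[0]
    if content_type is None:
        # guess_type returns None when it cannot infer a type.
        content_type = "application/octet-stream"
    return {BLOB_KEY_HEADER: str(blob_key), "Content-Type": content_type}


print(blob_response_headers("example-blob-key", "photo.jpg"))
```

The fallback matters because guess_type returns None for filenames it cannot classify, and a response without a usable Content-Type may be handled inconsistently by clients.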

An alternative approach to serving files from Cloud Storage, which applies to images only, is to use App Engine’s Image API. As of App Engine version 1.7.0, it is possible to use the get_serving_url() function with Cloud Storage objects. Simply generate the blob key as before, and plug into this function to generate a url for the image. One benefit of using this approach is that the serving url supports cropping and resizing on the fly by supplying optional parameters.

We will continue to investigate the best practices for using Google Cloud Storage with App Engine as a service for storing and serving large files. For others who might be interested, there was a helpful session at Google IO, entitled Storing Your Application’s Data in the Google Cloud, that covers the basics of this new service. Of course, there are other options to consider as well, such as the Blobstore or Amazon S3. It remains to be seen which service will best meet our needs, but we’re glad that there is now a strong option on the Google side.

Backend Tips – The Free CDN

New Blog Post Series

This is the first in a series of blog posts in which we will offer a peek into some of the challenges we tackle on the Backend Team and discuss some tips and tricks we have discovered. These posts will focus on the ways in which we use GAE and AWS to build simple features that have helped us to deliver an amazing product. We plan to dive a little deeper into topics we’ve covered before, as well as highlighting some new ones. Upcoming topics will include GAE MapReduce, Redis, Google Cloud Storage, and duplicate detection via TF-IDF. Our first entry in the series discusses how to use Google’s edge cache as a free content delivery network (CDN).

The Free CDN

At the end of last year, we briefly mentioned Google’s edge cache as a useful feature as part of our guest post on the App Engine blog. Since this is one of our favorite services, I’d like to take a few minutes to explain it in more detail. It is an extremely simple feature that has the potential to significantly improve content serving latency and can be very valuable in terms of cost savings over other CDNs. Hopefully it will be clear by the end of this post why you should think about using it for your next project.

Content Delivery Networks

Content Delivery Networks (CDNs) offer several benefits that are typically desired for both web and mobile apps. They are designed to cache content on many geographically distributed servers, as close to the end user as possible, thereby minimizing latency for requests to the cached content. There are several major CDN providers, but the big ones that come to mind are Akamai and Amazon’s Cloudfront. CDNs vary in quality and price, but generally one should expect to pay a premium for this type of service.

Google’s Edge Cache (aka. CDN)

It turns out that if you’re using Google App Engine (or other Google services like the newly announced Google Cloud Storage) and you configure things correctly, you get the same service for free. By simply setting public cache control headers wherever possible, you allow Google’s edge caches to serve unchanged content directly to users. Here’s an example of a set of response headers that will activate the cache:

 Cache-Control: public, max-age=900, must-revalidate

The most important component of the header is the word ‘public’. It tells Google’s network that the content in this response is not specific to a particular user or private in any way, so it’s safe to cache it as aggressively as possible. ‘max-age’ allows you to decide how often this content will be refreshed from your servers, and ‘must-revalidate’ is just telling the server (or client cache) to strictly follow this timeout.
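
As a small illustration, the header value can be built by a helper so that every cacheable handler uses the same policy. `public_cache_control` is a hypothetical function name, and the plain dict stands in for whatever response object your framework provides.

```python
def public_cache_control(max_age_seconds, must_revalidate=True):
    """Build a Cache-Control value that shared caches are allowed to store."""
    parts = ["public", "max-age=%d" % max_age_seconds]
    if must_revalidate:
        parts.append("must-revalidate")
    return ", ".join(parts)


headers = {"Cache-Control": public_cache_control(900)}
print(headers["Cache-Control"])
```

Only set this on responses that are identical for every user. Anything personalized should not be marked public.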

This technique has been mentioned in at least one Google IO talk, but for some reason hasn’t been widely publicized. Because of the scale of Google’s network, this is perhaps the best CDN available. Best of all, there is no cost for this caching. It’s actually a win-win for both you and Google, since it minimizes the traffic that has to cross their internal networks and servers.

At Pulse we use this feature very heavily. It lets us serve high quality, mobile optimized images at < 50ms latency, while also saving us lots of App Engine instance hours by preventing these requests from hitting our frontend servers. As you can see from the graph below, for this particular App Engine app, we are serving the majority of requests out of Google’s edge cache (labeled red). I encourage you to try it out. It’s almost too easy to be true! If you have questions, feel free to leave comments below or ping me @gregbayer.