Tips for improving performance of your iOS application

Any iOS application worthy of a spot on their user’s home screen is made of 3 key ingredients: a great idea, stunning design and smooth performance. In a previous post, we shared a few guidelines to make your app look pretty. Today, we have some simple tips on how to improve the performance of your iOS application. At Pulse, we obsess over every small hiccup in the application and spend countless nights staring at Instruments at the end of our release cycles. Here are some of our insights that might help you in your development process.

Downsize your image assets

Apps with good visual design always delight users. To achieve pixel perfect graphics, every iOS application ships with several image assets. It is crucial that these images are as small in size as possible. Let me elaborate with an example.

It is common practice to add a button to a nib file and set its background to point to an image. When the nib file is read from disk, iOS instantiates all the individual objects in the file, including that button. When it notices that the button’s background points to an image, it reads the image from disk, inflates it in memory and renders it as the background. The bigger the image, the slower it is to read it from disk. Since all this happens synchronously on the main thread, it slows down the app. Tip #1: Once you are satisfied with an asset, remember to always compress it to the smallest size possible, without any loss in quality, before adding it to the bundle. As a rule of thumb, I have always been able to compress icons down to at most 4kb on disk. Check out Core Animation in Practice, Part 2 from WWDC 2010 for more info on optimizing graphics on screen.

Defer main thread operations

It goes without saying that any task that doesn’t need to be executed on the main thread should be shipped to a background thread. NSOperationQueues or Grand Central Dispatch are two great tools for such tasks. With tasks running on the main thread, you need to be very careful that they don’t interfere with a user’s touches. Such tasks can be roughly classified into two groups:

  • View Updates: Any changes to your views need to happen on the main thread. iOS makes it very easy to defer these changes by the simple, do not call us, we’ll call you rule – Never call drawRect yourself. Just call setNeedsDisplay and iOS will re-render your view when the user has stopped scrolling.
  • Processing: There are some critical processing tasks that cannot be performed on a background thread, like saving a Core Data database, changing in-memory state, etc. Tip #2: Group such tasks into independent chunks and execute them in the Default Runloop mode. Eg:
[self performSelectorOnMainThread:@selector(processDataOnMainThread:)
withObject:dictionaryOfParameters
waitUntillDone:NO
modes:[NSArray arrayWithObject:NSDefaultRunLoopMode]]

When the user starts scrolling a scrollview or a tableview, the run loop mode is set to the Common modes. When the user stops scrolling, it is reset to the Default mode. Thus, if you use the vanilla [self processDataOnMainThread:dictionaryOfParams] call, the function will start executing regardless of whether the user is scrolling or not. But, with the API call above, iOS will wait for the user to stop scrolling before executing your function.

Avoid Memory Spikes

Every iOS developer dreads the ominous “Low Memory Warning”. In addition to being delivered if the app uses a lot of memory, Low Memory Warnings can also arise if the application’s memory suddenly spikes, even though the overall memory usage is quite small. If your application’s memory doesn’t go down after repeated memory warnings, iOS will kill your app! Tip #3: Always strive to keep your memory profile smooth. Some typical hot spots for memory spikes are:

  • App Launch: Load as few objects as you need. This will speed up launch and prevent memory warnings!
  • View Controller Initialization: New view controller objects are instantiated when they are pushed on the navigation stack or presented modally. Try to use as few views as possible. Or instantiate some views lazily, if you can.
  • UIWebview: UIWebview is notorious for using up a lot of memory very quickly, especially when loading HTML content with heavy images/videos. Its hard to completely control the memory profile with a UIWebview in your application, but loading data lazily is always a good rule of thumb.

Remember, If you keep your application’s memory profile steady and consistent, it will lead a long and healthy life! Check out Advanced Memory Analysis with Instruments for more info.

Avoid unnecessary caching of images

Throughout an iOS application, we need to refer to images in the bundle. More often than not, imageNamed: is an extremely simple and efficient way to do so. But, you should be aware that imageNamed: also caches any image it imports from the bundle. Thus, it is highly efficient for images that need to be reused throughout your application (like icons, background images for buttons etc.). But it can be an unnecessary memory hog for images that are used sparingly. Tip #4: For loading such images, we should instead read them directly from disk and release the memory when we are done using the image.

NSString *path = [[NSBundle mainBundle] pathForResource:fileName ofType:fileType];
UIImage *image = [[UIImage alloc] initWithContentsOfFile:path];

[image release];

As a rule of thumb, use imageNamed: with images that are used in UI elements and initWithContentsOfFile: for everything else. Here is a handy category we wrote on UIImage that automatically chooses the right image for retina display screens and reads them from disk.

UImage+ImageNamedFromDisk.h
UImage+ImageNamedFromDisk.m

I hope you find these tips useful in your own development. Please share your own insights into optimizing iOS applications by leaving comments below!

Scaling to 10M on AWS

This post complements the recent article about Pulse on the Amazon Web Services Blog.

As Pulse crosses the 10M user mark (up 10x since last year), we’d like to share a bit more about how we’ve built and scaled Pulse’s backend systems. In this article we will discuss the important role AWS plays in our infrastructure.

Today there are more infrastructure choices than ever. They include running your own hardware, leasing virtual machines, subscribing to higher level platforms and software services, and often a combination of all of the above. It is important to consider the trade-offs and choose the right tool for the job. In our experience, AWS provides an exceptional capability to build systems as close to the metal as you like, while still avoiding the burden and inelasticity of owning your own hardware. It also provides some useful abstraction layers and services above the machine level.

Event Logging

Amazon’s Elastic Compute Cloud (Amazon EC2) instances make it easy to run low level processes that can write directly to disk, and its Amazon Simple Storage System (Amazon S3) provides great long-term file storage. This combination makes an excellent choice for most flat-file logging systems. At Pulse, we’ve built a simple logging system that is blazingly fast on one machine and easy to scale horizontally. Using Tornado to handle HTTP requests and Scribe to buffer and write files, we are able to store logs at near-disk speeds (more than 50 MB/s per instance). Once the logs have been written to disk, we regularly move them to Amazon S3 for reliable long-term storage and easy access. Amazon S3′s low cost and scalable nature allows us to save all of our data without worrying too much about size.

By provisioning one of Elastic Load Balancer (ELB) instances, we are able to easily divide our load over as many logging servers as necessary and automatically direct load away from failing machines. Provisioning these machines in multiple AWS availability zones also makes it easy to achieve fault tolerance.

Pulse’s implementation easily handles millions of events per hour and has been running continuously for over a year without any downtime.

Data Analytics

Another major reason we decided to build our event logging system on Amazon S3 was to leverage Amazon Elastic MapReduce  and Apache Hive. Now that our data is getting bigger, it is much more efficient to query with a cluster of machines. Without having to configure and maintain our own Hadoop cluster or having to move our data from Amazon S3, AWS allows us to quickly spin up a cluster of 10s to 100s of machines.

With a large cluster, we are able to query a significant portion of our data in minutes instead of hours or days. Because the AWS cluster can simply be turned off when we are done, the cost to run big queries is usually quite reasonable. Consider a cluster of 100 m1.large machines. A set of queries that takes 45 minutes to run on this cluster would cost us $11 – $34 (depending on whether we bid on spot instances or use regular on-demand instances). Assuming you’re not running jobs all the time, this is preferable to the cost of buying and continuously maintaining your own cluster.

Apache Hive makes this process even easier by taking simple SQL queries and converting them into what would often be relatively complex, multi-step Amazon Elastic MapReduce jobs. These SQL queries can be run directly by our business team, avoiding the need for engineering support.

For batch jobs, such as regularly extracting the top read and shared stories, the Pulse backend team likes to use mrjob, an open source framework developed at Yelp. Mrjob allows us to write mappers and reducers in Python (instead of Java) and integrates seamlessly with Amazon Elastic MapReduce. Python is our language of choice because it is more consistent with our codebase and it provides a simple representation for common MapReduce data structures such as tuples and dictionaries. Because our jobs are usually IO-bound, the interpreted runtime doesn’t slow things down much.

Recommendations

Beyond curating our top story feeds, we’ve recently started developing several exciting new user-facing features using Amazon Elastic MapReduce, mrjob, and our data on Amazon S3. As part of our last major release, we announced a new feature called Smart Dock, which recommends new sources to millions of users based on their reading history. This feature makes it much easier to discover relevant content and has been extremely well received by our users. Our newest full-time backend engineer, Leonard Wei, led this project and built it almost entirely on AWS.

Our recommendations pipeline processes over 250GB of the raw log data we have in Amazon S3. We reduce this data down to about 1GB of relevant features via an Amazon Elastic MapReduce job. We then use an LDA-based approach to predict which sources a user is likely to add next. We run this portion of the pipeline on AWS using a single High-CPU Extra Large instance.

Once the model is generated for each user and some additional post-processing is complete, we upload each user’s recommended sources to our serving infrastructure on App Engine. From there, the recommendations are combined with the latest catalog data and sent to the app to be presented in the Smart Dock. One run of the whole pipeline costs us a very reasonable $20 of AWS compute time.

Other Tasks

Beyond event logging, analytics and recommendations, we also use AWS for lots of smaller tasks that just make sense to run directly on one or more machines, rather than through a higher level service. Some examples include parsing html pages with node-readability and continuously monitoring all of our systems to make sure we’re aware of any problems. Recently, we also started working on a new real-time analytics infrastructure based on Redis, which will leverage the High-Memory instances Amazon EC2 offers.

To learn more about Pulse’s infrastructure check out some of the backend team’s other posts. Our recent article on how we scaled up for the Kindle Fire launch compliments this one and talks more about our content serving, client APIs and Pulse.me web hosting.