Women in Startups: Mythbusters

Last week the women on the Pulse team (Lili, Ketaki and Cristina) held an event for Women in Computer Science at Stanford. The topic of the event was “Mythbusters: Women at Early-Stage Startups”. Startups are an exciting place to work, but some (especially new grads!) have concerns and reservations about the culture and expectations at such a job. We explored and debunked some of the myths we often hear from students and non-students alike:

1. You will work crazy hours and have no life
This is simply not true. Even when you are working at a startup, you can choose your own work hours according to your lifestyle. Want to come in late and work late? That’s okay. Want to take a run in the middle of the day to clear your thoughts? That’s okay. Working crazy hours will burn you out and startups want to prevent that. At some startups, you’ll work more than 40 hours a week, and at others, you’ll work more standard hours. The focus is not placed on the number of hours you work, but rather on getting the job done.

2. It’s too late to learn a skill
“I am a web developer and will remain so throughout my career.” Not true. If you want to learn iOS or Android while working as a web developer, you can teach yourself a new skill. Communicating this to the right person is key, because he or she may be able to help you switch positions if you find something you’re passionate about. Of course, learning in the working world is not as structured as it would be in school. You will be responsible for teaching yourself at the same time as performing your current job, but this is certainly possible!

3. As a woman, you will live in a bro-culture
If you choose your startup wisely, you should not run into this problem. In our careers, we have found that fields like gaming and finance can be male-dominated, but spending a day at a company will show you whether its culture is open and whether you would feel at home there.

4. You should only work at a startup when you’re young
You can work at a startup at any age or stage of life. Depending on the company, the age of employees can vary widely. As long as you are passionate about the job, your age is not important. For those with families, remember that working long hours is not a guarantee of producing the best work. Making a big impact is satisfying at any age!

Beyond the myths, here’s what you can look forward to at the right startup:

  • Impact on a product and end-users
  • Flexible work hours and schedule
  • You’ll know everyone at your company – you’re not a cog
  • A flat structure where feedback is coming your way from all directions
  • You can experiment and make mistakes
  • High volume testing – testing on production
  • Tons of responsibility
  • Interfacing and collaborating with other teams (design, product, business)
  • You have a say in your career goals and future

Looking forward to more such events in the future.

Tips for improving performance of your iOS application

Any iOS application worthy of a spot on its users’ home screens is made of three key ingredients: a great idea, stunning design and smooth performance. In a previous post, we shared a few guidelines to make your app look pretty. Today, we have some simple tips on how to improve the performance of your iOS application. At Pulse, we obsess over every small hiccup in the application and spend countless nights staring at Instruments at the end of our release cycles. Here are some of our insights that might help you in your development process.

Downsize your image assets

Apps with good visual design always delight users. To achieve pixel perfect graphics, every iOS application ships with several image assets. It is crucial that these images are as small in size as possible. Let me elaborate with an example.

It is common practice to add a button to a nib file and set its background to point to an image. When the nib file is read from disk, iOS instantiates all the individual objects in the file, including that button. When it notices that the button’s background points to an image, it reads the image from disk, inflates it in memory and renders it as the background. The bigger the image, the slower it is to read it from disk. Since all this happens synchronously on the main thread, it slows down the app. Tip #1: Once you are satisfied with an asset, remember to always compress it to the smallest size possible, without any loss in quality, before adding it to the bundle. As a rule of thumb, I have always been able to compress icons down to at most 4 KB on disk. Check out Core Animation in Practice, Part 2 from WWDC 2010 for more info on optimizing graphics on screen.

Defer main thread operations

It goes without saying that any task that doesn’t need to be executed on the main thread should be shipped to a background thread. NSOperationQueue and Grand Central Dispatch are two great tools for such tasks. With tasks that do run on the main thread, you need to be very careful that they don’t interfere with the user’s touches. Such tasks can be roughly classified into two groups:

  • View Updates: Any changes to your views need to happen on the main thread. iOS makes it very easy to defer these changes with a simple “don’t call us, we’ll call you” rule: never call drawRect: yourself. Just call setNeedsDisplay, and iOS will redraw your view during its next drawing cycle.
  • Processing: Some critical processing tasks cannot be performed on a background thread, such as saving a Core Data database or changing in-memory state. Tip #2: Group such tasks into independent chunks and execute them in the default run loop mode. For example:
// Runs only while the main run loop is in the default mode
[self performSelectorOnMainThread:@selector(processDataOnMainThread:)
                       withObject:dictionaryOfParameters
                    waitUntilDone:NO
                            modes:[NSArray arrayWithObject:NSDefaultRunLoopMode]];

While the user is scrolling a scroll view or a table view, the main run loop runs in the tracking mode (UITrackingRunLoopMode). When the user stops scrolling, it returns to the default mode (NSDefaultRunLoopMode). Thus, if you use the vanilla [self processDataOnMainThread:dictionaryOfParameters] call, the function will start executing regardless of whether the user is scrolling or not. With the API call above, however, iOS will wait for the user to stop scrolling before executing your function.

Avoid Memory Spikes

Every iOS developer dreads the ominous “Low Memory Warning”. In addition to being delivered if the app uses a lot of memory, Low Memory Warnings can also arise if the application’s memory suddenly spikes, even though the overall memory usage is quite small. If your application’s memory doesn’t go down after repeated memory warnings, iOS will kill your app! Tip #3: Always strive to keep your memory profile smooth. Some typical hot spots for memory spikes are:

  • App Launch: Load as few objects as you need. This will speed up launch and prevent memory warnings!
  • View Controller Initialization: New view controller objects are instantiated when they are pushed on the navigation stack or presented modally. Try to use as few views as possible. Or instantiate some views lazily, if you can.
  • UIWebView: UIWebView is notorious for using up a lot of memory very quickly, especially when loading HTML content with heavy images/videos. It’s hard to completely control the memory profile with a UIWebView in your application, but loading data lazily is always a good rule of thumb.

Remember, if you keep your application’s memory profile steady and consistent, it will lead a long and healthy life! Check out Advanced Memory Analysis with Instruments for more info.

Avoid unnecessary caching of images

Throughout an iOS application, we need to refer to images in the bundle. More often than not, imageNamed: is an extremely simple and efficient way to do so. But, you should be aware that imageNamed: also caches any image it imports from the bundle. Thus, it is highly efficient for images that need to be reused throughout your application (like icons, background images for buttons etc.). But it can be an unnecessary memory hog for images that are used sparingly. Tip #4: For loading such images, we should instead read them directly from disk and release the memory when we are done using the image.

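// Read the image straight from the bundle, bypassing the imageNamed: cache.
NSString *path = [[NSBundle mainBundle] pathForResource:fileName ofType:fileType];
UIImage *image = [[UIImage alloc] initWithContentsOfFile:path];

// ... use the image ...

// Under manual reference counting, release it as soon as you are done.
[image release];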

As a rule of thumb, use imageNamed: with images that are used in UI elements and initWithContentsOfFile: for everything else. Here is a handy category we wrote on UIImage that automatically chooses the right image for retina display screens and reads them from disk.

UImage+ImageNamedFromDisk.h
UImage+ImageNamedFromDisk.m

I hope you find these tips useful in your own development. Please share your own insights into optimizing iOS applications by leaving comments below!

Scaling to 10M on AWS

This post complements the recent article about Pulse on the Amazon Web Services Blog.

As Pulse crosses the 10M user mark (up 10x since last year), we’d like to share a bit more about how we’ve built and scaled Pulse’s backend systems. In this article we will discuss the important role AWS plays in our infrastructure.

Today there are more infrastructure choices than ever. They include running your own hardware, leasing virtual machines, subscribing to higher level platforms and software services, and often a combination of all of the above. It is important to consider the trade-offs and choose the right tool for the job. In our experience, AWS provides an exceptional capability to build systems as close to the metal as you like, while still avoiding the burden and inelasticity of owning your own hardware. It also provides some useful abstraction layers and services above the machine level.

Event Logging

Amazon’s Elastic Compute Cloud (Amazon EC2) instances make it easy to run low level processes that can write directly to disk, and its Amazon Simple Storage Service (Amazon S3) provides great long-term file storage. This combination makes an excellent choice for most flat-file logging systems. At Pulse, we’ve built a simple logging system that is blazingly fast on one machine and easy to scale horizontally. Using Tornado to handle HTTP requests and Scribe to buffer and write files, we are able to store logs at near-disk speeds (more than 50 MB/s per instance). Once the logs have been written to disk, we regularly move them to Amazon S3 for reliable long-term storage and easy access. Amazon S3’s low cost and scalable nature allow us to save all of our data without worrying too much about size.

By provisioning an Elastic Load Balancer (ELB), we are able to easily divide our load over as many logging servers as necessary and automatically direct load away from failing machines. Provisioning these machines in multiple AWS availability zones also makes it easy to achieve fault tolerance.

Pulse’s implementation easily handles millions of events per hour and has been running continuously for over a year without any downtime.

Data Analytics

Another major reason we decided to build our event logging system on Amazon S3 was to leverage Amazon Elastic MapReduce and Apache Hive. Now that our data is getting bigger, it is much more efficient to query with a cluster of machines. Without having to configure and maintain our own Hadoop cluster or having to move our data from Amazon S3, AWS allows us to quickly spin up a cluster of 10s to 100s of machines.

With a large cluster, we are able to query a significant portion of our data in minutes instead of hours or days. Because the AWS cluster can simply be turned off when we are done, the cost to run big queries is usually quite reasonable. Consider a cluster of 100 m1.large machines. A set of queries that takes 45 minutes to run on this cluster would cost us $11 – $34 (depending on whether we bid on spot instances or use regular on-demand instances). Assuming you’re not running jobs all the time, this is preferable to the cost of buying and continuously maintaining your own cluster.
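To make the arithmetic concrete, here is a rough sketch of how such an estimate works out. The hourly rates ($0.11 for spot, $0.34 for on-demand, per m1.large) are assumptions backed out of the $11 to $34 range quoted above rather than taken from a price list, and the sketch assumes that every started hour is billed in full and ignores any per-instance Elastic MapReduce surcharge.

```python
import math

# Assumed per-instance hourly rates for m1.large, inferred from the
# $11 - $34 range in the text; real prices vary by region and over time.
ASSUMED_SPOT_RATE = 0.11
ASSUMED_ON_DEMAND_RATE = 0.34


def cluster_cost(num_machines, runtime_minutes, hourly_rate):
    """Estimate the cost of a transient cluster, billing each started hour in full."""
    billed_hours = math.ceil(runtime_minutes / 60)
    return num_machines * billed_hours * hourly_rate


if __name__ == "__main__":
    rates = (("spot", ASSUMED_SPOT_RATE), ("on-demand", ASSUMED_ON_DEMAND_RATE))
    for label, rate in rates:
        print(f"{label}: ${cluster_cost(100, 45, rate):.2f}")
```

Because the 45-minute job is rounded up to one billed hour, the estimate is the same for any run that finishes within the hour.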

Apache Hive makes this process even easier by taking simple SQL queries and converting them into what would often be relatively complex, multi-step Amazon Elastic MapReduce jobs. These SQL queries can be run directly by our business team, avoiding the need for engineering support.

For batch jobs, such as regularly extracting the top read and shared stories, the Pulse backend team likes to use mrjob, an open source framework developed at Yelp. Mrjob allows us to write mappers and reducers in Python (instead of Java) and integrates seamlessly with Amazon Elastic MapReduce. Python is our language of choice because it is more consistent with our codebase and it provides a simple representation for common MapReduce data structures such as tuples and dictionaries. Because our jobs are usually IO-bound, the interpreted runtime doesn’t slow things down much.
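As an illustration of the mapper and reducer shape described above, here is a minimal plain-Python sketch of a "top read stories" count. It deliberately does not use mrjob or Elastic MapReduce, and it is not Pulse's job: the tab-separated log format (user, event, story), the event name "read" and all function names are hypothetical, chosen only for this example.

```python
from collections import defaultdict
from heapq import nlargest


def mapper(log_line):
    """Yield (story_id, 1) for each 'read' event in a tab-separated log line."""
    fields = log_line.rstrip("\n").split("\t")
    if len(fields) == 3 and fields[1] == "read":
        yield fields[2], 1


def reducer(story_id, counts):
    """Sum the counts emitted for one story."""
    return story_id, sum(counts)


def top_stories(log_lines, n):
    """Run the map, shuffle and reduce steps in memory and return the top n stories."""
    grouped = defaultdict(list)
    for line in log_lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    totals = [reducer(key, values) for key, values in grouped.items()]
    # Highest count first; on equal counts the larger story id comes first.
    return nlargest(n, totals, key=lambda pair: (pair[1], pair[0]))
```

The intermediate data is just tuples and a dictionary of lists, which is the kind of simple representation the paragraph above refers to. On a real cluster the grouping step is done by the framework rather than by an in-memory dictionary.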

Recommendations

Beyond curating our top story feeds, we’ve recently started developing several exciting new user-facing features using Amazon Elastic MapReduce, mrjob, and our data on Amazon S3. As part of our last major release, we announced a new feature called Smart Dock, which recommends new sources to millions of users based on their reading history. This feature makes it much easier to discover relevant content and has been extremely well received by our users. Our newest full-time backend engineer, Leonard Wei, led this project and built it almost entirely on AWS.

Our recommendations pipeline processes over 250GB of the raw log data we have in Amazon S3. We reduce this data down to about 1GB of relevant features via an Amazon Elastic MapReduce job. We then use an LDA-based approach to predict which sources a user is likely to add next. We run this portion of the pipeline on AWS using a single High-CPU Extra Large instance.

Once the model is generated for each user and some additional post-processing is complete, we upload each user’s recommended sources to our serving infrastructure on App Engine. From there, the recommendations are combined with the latest catalog data and sent to the app to be presented in the Smart Dock. One run of the whole pipeline costs us a very reasonable $20 of AWS compute time.

Other Tasks

Beyond event logging, analytics and recommendations, we also use AWS for lots of smaller tasks that just make sense to run directly on one or more machines, rather than through a higher level service. Some examples include parsing html pages with node-readability and continuously monitoring all of our systems to make sure we’re aware of any problems. Recently, we also started working on a new real-time analytics infrastructure based on Redis, which will leverage the High-Memory instances Amazon EC2 offers.

To learn more about Pulse’s infrastructure, check out some of the backend team’s other posts. Our recent article on how we scaled up for the Kindle Fire launch complements this one and talks more about our content serving, client APIs and Pulse.me web hosting.

 

Three tips to make your CSS more manageable

As we grow our web team at Pulse, we’ve begun to document a lot of our common practices in web development. This goes beyond a general style guideline and includes commonly used code to solve everyday problems. When it comes to CSS, there are many approaches to fairly simple problems. However, when collaborating on large projects it’s important that these problems are documented and the solution reused instead of having code written more than once.

1. CSS Reset

Whenever we start something, we always use a CSS reset stylesheet. This has become necessary since many browsers render elements differently. We patched ours together from multiple projects and cut it down to suit our needs. This is of course done on purpose in order to minimize the size of the CSS file. Remember that every byte counts on mobile devices! There is no point in including resets for elements that will not be used in the particular project. If you’re starting off with a new project, we recommend either the Eric Meyer CSS Reset or Normalize CSS. Below is an example CSS reset snippet that we use in some of our projects.

 

2. Centering a block

One of the questions I get asked all the time by my developer friends (not front-end) is how to center a block in the page. “But why doesn’t text-align: center work?” It’s not text… Generally, the more information you have about the block you’re trying to center, the easier it is to center it. The following code is for a foolproof solution to horizontally centering DIVs. This method works if the block has a set width or height. I recently came across Chris Coyier’s solution for “centering a block in the unknown”. He does a great job of explaining how to center a block with unknown width or height. You can find two solutions in his post: http://css-tricks.com/14745-centering-in-the-unknown/

CSS:

3. Working with floating blocks

I find that the biggest problem developers struggle with when using CSS is the float property. It isn’t explained well and a lot of devs end up falling back to tables. Tables are gross and should only be used for presenting tabular data. Floats are magical and can be mastered rather easily. The main problem experienced when floating blocks is the parent height not adjusting to its children’s heights. This is easily remedied by the following 2 methods.
CSS:

HTML:

Preview:
floating blocks

Some other recommended sources for CSS tips are CSS-Tricks and RedTeamDesign. We read those every day. Happy Styling!

Scaling with the Kindle Fire

This post was originally published as a guest post on the Google App Engine blog.

As part of the much anticipated Kindle Fire launch, Pulse was announced as one of the few preloaded apps. When you first unbox the Fire, Pulse will be there waiting for you on the home row, next to Facebook and IMDB!

Scale

The Kindle Fire is projected to sell over five million units this quarter alone. This means that those of us who work on backend infrastructure at Pulse have had to prepare for nearly doubling our user-base in a very short period. We also need to be ready for spikes in load due to press events and the holiday season.

Architecture

As I’ve discussed previously on the Pulse Engineering Blog, Pulse’s infrastructure has been designed with scalability in mind from the beginning. We’ve built our web site and client APIs on top of Google App Engine, which has allowed us to grow steadily from 10s to many 1000s of requests per second, without needing to re-architect our systems.

While restrictive in some ways, we’ve found App Engine’s frontend serving instances (running Python in our case) to be extremely scalable, with minimal operational support from our team. We’ve also found the datastore, memcache, and task queue facilities to be equally scalable.

Pulse’s backend infrastructure provides many critical services to our native applications and web site. For example, we cache and serve optimized feed and image data for each source in our catalog. This allows us to minimize latency and data transfer and is especially important to providing an exceptional user experience on limited mobile connections. Providing this service for millions of users requires us to serve hundreds of millions of requests per day. As with any well designed App Engine app, the vast majority of these requests are served out of memcache and never hit the datastore. Another useful technique we use is to set public cache control headers wherever possible, to allow Google’s edge cache (shown as cached requests on the graph below) and ISP / mobile carrier caches to serve unchanged content directly to users.
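The memcache-first read path and the public cache headers described above follow a common pattern, sketched below. This is not Pulse's code and it does not call the App Engine APIs: plain dictionaries stand in for memcache and the datastore, and `FeedStore`, `get_feed`, `cache_headers` and the 300-second lifetime are hypothetical names and values chosen for illustration.

```python
class FeedStore:
    """Serve feeds from a cache, touching the backing store only on a miss."""

    def __init__(self, datastore):
        self.datastore = datastore  # stand-in for the datastore (assumption)
        self.memcache = {}          # stand-in for memcache (assumption)
        self.datastore_reads = 0    # counts how often the cache was bypassed

    def get_feed(self, source_id):
        cached = self.memcache.get(source_id)
        if cached is not None:
            return cached
        self.datastore_reads += 1
        feed = self.datastore.get(source_id)
        if feed is not None:
            # Populate the cache so later requests skip the datastore.
            self.memcache[source_id] = feed
        return feed


def cache_headers(max_age_seconds=300):
    """Build a response header that lets shared caches serve unchanged content."""
    return {"Cache-Control": f"public, max-age={max_age_seconds}"}
```

Only the first request for a source reaches the backing store; repeat requests are answered from the cache, and the `public` directive additionally allows intermediaries to answer without reaching the application at all.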

Costs

Based on App Engine’s projected billing statements leading up to the recent pricing changes, we were concerned that our costs might increase significantly. To prepare for these changes and the expected additional load from Kindle Fire users, we invested some time in diagnosing and reducing these costs. In most cases, the increases turned out to be an indicator of inefficiencies in our code and/or in the App Engine scheduler. With a little optimization, we have reduced these costs dramatically.

The new tuning sliders for the scheduler make it possible to rein in overly aggressive instance allocation. In the old pricing structure, idle instance time wasn’t charged for at all, so these inefficiencies were usually ignored. Now App Engine charges for all instance time by default. However, any time App Engine runs more idle instances than you’ve allowed, those hours are free. This acts as a hint to the scheduler, helping it reduce unneeded idle instances. By doing some testing to find the optimal cost vs spike latency tolerance and setting the sliders to those levels, we were able to reduce our frontend instance costs to near original levels. Our heavy usage of memcache (which is still free!) also helps keep our instance hours down.

Since datastore operations used to be charged under the umbrella of CPU hours, it was difficult to know the cost of these operations under the old pricing structure. This meant it was easy to miss application inefficiencies, especially for write-heavy workloads where additional indexes can have a multiplicative effect on costs. In our case, the new datastore write operations metric led us to notice some inefficiencies in our design and a tendency to overuse indexes. We are now working to minimize the number of indexes our queries rely on, and this has started to reduce our write costs.

Preparing for the Kindle Fire Launch

We took a few additional steps to prepare for the expected load increase and spikes associated with the Fire’s launch. First, we contacted App Engine’s support team to warn them of the expected increase. This is recommended for any app at or near 10,000 requests per second (to make sure your application is correctly provisioned). We also signed up for a Premier account which gets us additional support and simpler billing.

Architecturally, we decided to split our load across three primary applications, each serving different use cases. While this makes it harder to access data across these applications, those same boundaries serve to isolate potential load-related problems and make tuning simpler. In our case, we were able to divide certain parts of our infrastructure, where cross application data access was less important and load would be significant. Until App Engine provides more visibility into and control of memcache eviction policies, this approach also helps prevent lower priority data from evicting critical data.

I’m hopeful that in the near future such division of services will not be required. Individually tunable load isolation zones and memcache controls would certainly make it a lot more appealing to have everything in a single application. Until then, this technique works quite well, and helps to simplify how we think about scaling.

Optimizing for Screen Sizes on Android

In previous posts we outlined the key guidelines for designing phone and tablet apps. Then we followed up with some secret tips for making them shine! Of course, bringing these apps to life is easier said than done so today we’ll explore the technical adventures in developing for the myriad screens on Android.

Resource Folders Make Your Life Easier

By far, the easiest way to ensure your app looks the way you intended is to use Resource Folders (1.6+). With Honeycomb 3.2, the ability to distinguish between screen sizes becomes more granular, granting the developer greater control. However, as of this post’s writing Ice Cream Sandwich is not out yet, so we’ll discuss the pre-Honeycomb version of resource folders.

Screens are split up into four categories: small, normal, large, and xlarge. These correspond to general form factors like phones, small tablets, and tablets. Each classification has a screen size range detailed below:

 

 

Resource files (XML files that describe things like layouts, dimensions, styles, etc.) are placed in folders in your Android project under the res directory. You can add modifiers to the name of a resource folder which declare under what circumstances its files should be used.

For example, if you normally put your awsome_layout.xml file in layout, you can also place a version designed for large screens in a folder called layout-large. Thus, when the app runs on a tablet-sized device, the app will automatically use the awsome_layout.xml file found in ‘layout-large’ through no additional effort. Magic! We don’t go into the details of naming your folders, but a handy guide can be found here.

Be careful, though: if your layouts are drastically different, make sure your code doesn’t refer to views that exist in only one layout file without first checking that they exist. This can be prevented with thorough testing and good software design.

Detecting Screen Size In Code

You may also want to exhibit different behavior on larger screens in addition to having a separate layout. For tablets, one can allot more space for buttons on the Action Bar; on phones it is preferable to keep the layout uncluttered. An example taken from Pulse is in the tablet’s landscape mode. Clicking on a story causes the article to slide in from the right rather than completely covering the screen. This takes advantage of the extra real estate to browse stories while reading an article. To do this we need a way to tell if the device is xlarge, large, or normal in the code.

There is a class called DisplayMetrics that can give us some basic information about the device we’re running on. While this may seem like a great place to start, it could also lead to many layout bugs. Don’t simply use the screen width in pixels as a measure of device size; advances in screen density toss this assumption out the window. A 4” phone can have a screen that is 540 pixels across, whereas a 7” tablet’s screen width is a mere 60 pixels wider at 600px. If you’re not careful you could end up with behavior intended for tablets on a phone, which would be wonky to say the least.
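A quick calculation shows why raw pixel width misleads. Android defines density-independent pixels as dp = px × 160 / dpi. The sketch below applies that formula to the two devices mentioned above; the density buckets (240 dpi for the phone, 160 dpi for the tablet) are assumptions, since the post does not state them.

```python
def px_to_dp(pixels, dpi):
    """Convert physical pixels to density-independent pixels (160 dpi baseline)."""
    return pixels * 160 / dpi


# Assumed densities: an hdpi (240 dpi) phone and an mdpi (160 dpi) tablet.
phone_width_dp = px_to_dp(540, 240)
tablet_width_dp = px_to_dp(600, 160)

if __name__ == "__main__":
    print(f"phone: {phone_width_dp:.0f}dp, tablet: {tablet_width_dp:.0f}dp")
```

Under these assumptions the two screens are 360dp and 600dp wide, a far bigger gap than the 60 physical pixels suggest.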

Instead, the screen size bucket a particular device falls into (the same one that determines which resource folder modifier gets chosen) can be found in the Configuration class, which you obtain by calling the getConfiguration() method on Resources. This is the same Configuration you use to see if the device is in landscape or portrait. From the configuration object, read the screenLayout field, mask it with Configuration.SCREENLAYOUT_SIZE_MASK, and compare the result against constants such as Configuration.SCREENLAYOUT_SIZE_LARGE. With this knowledge, your app can decide how to behave properly.
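The comparison itself is a bit-mask check. In an Android app you would write it in Java against the constants on Configuration; the sketch below mirrors the same logic in Python so that it can be run anywhere. The numeric values are copied from android.content.res.Configuration, while `size_bucket` and `is_tablet` are hypothetical helper names, and treating large and xlarge as "tablet" is an assumption made for illustration.

```python
# Values mirrored from android.content.res.Configuration.
SCREENLAYOUT_SIZE_MASK = 0x0F
SCREENLAYOUT_SIZE_SMALL = 0x01
SCREENLAYOUT_SIZE_NORMAL = 0x02
SCREENLAYOUT_SIZE_LARGE = 0x03
SCREENLAYOUT_SIZE_XLARGE = 0x04

_BUCKET_NAMES = {
    SCREENLAYOUT_SIZE_SMALL: "small",
    SCREENLAYOUT_SIZE_NORMAL: "normal",
    SCREENLAYOUT_SIZE_LARGE: "large",
    SCREENLAYOUT_SIZE_XLARGE: "xlarge",
}


def size_bucket(screen_layout):
    """Return the size bucket encoded in a screenLayout value."""
    return _BUCKET_NAMES.get(screen_layout & SCREENLAYOUT_SIZE_MASK, "undefined")


def is_tablet(screen_layout):
    """Hypothetical helper: treat large and xlarge screens as tablets."""
    return (screen_layout & SCREENLAYOUT_SIZE_MASK) >= SCREENLAYOUT_SIZE_LARGE
```

The mask matters because the upper bits of screenLayout carry other flags (for example, whether the screen is "long"), so comparing the raw field directly against a size constant would fail on many devices.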

But what about dynamic values?

Using resources is a very painless way to incorporate device-dependent dimensions, but sometimes you want the layout to be more adaptable. For example, in Pulse there are horizontally scrolling tiles with each square taking up 1/3 of the screen width; even when the app is in landscape, the tile widths are the same.

Since we can’t possibly know what the screen width of the device is beforehand, we use a helper class to store these precomputed constants. Our class follows the singleton pattern and is used whenever this parameter is needed. The parameters are initialized with the class and are available whenever they’re needed. Here is a super simple example of such a class:

import android.content.res.Resources;
import android.util.DisplayMetrics;

/**
 * Sample class from Pulse
 *
 * Class to store and provide useful dimensions
 */
public class DimensionCalculator {

  private int mScreenWidth;
  private int mTileWidth;

  /**
   * This class is a singleton
   */
  public static DimensionCalculator getInstance() {
    return SingletonHolder.instance;
  }

  /**
   * We use the SingletonHolder solution which is widely considered to be the
   * standard implementation in Java. Thanks to Fredia from the comments!
   */
  private static class SingletonHolder {
    public static final DimensionCalculator instance = new DimensionCalculator();
  }

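  /**
   * Private constructor: instances are created only through SingletonHolder.
   */
  private DimensionCalculator() {
     DisplayMetrics dm = Resources.getSystem().getDisplayMetrics();
     // Use the shorter side so the tile width is the same in portrait and landscape
     mScreenWidth = Math.min(dm.widthPixels, dm.heightPixels);

     int numTiles = 3;
     int tileGap = 2;
     // A row of numTiles tiles is separated and bordered by numTiles + 1 gaps
     mTileWidth = (mScreenWidth - (numTiles + 1) * tileGap) / numTiles;
  }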

  /**
   * Return the appropriate tile size for this device
   */
  public int getTileWidth() {
    return mTileWidth;
  }
}

 

Now that you have the tools to help create specialized layouts and designs for phones and tablets, you have absolutely no excuse for creating a tablet app that is just a blown up version of the phone app! Happy coding!

Five Lessons Learned from Migrating to iOS Core Data

A couple of weeks ago we decided to fully switch to Core Data for persistent data storage on our iOS apps. This was a bold decision that would bring us more coherent data representations, performance improvements and make future data migrations a breeze…or so we thought. Whilst there certainly were a lot of gains from switching to Core Data it took us a lot longer than anticipated, and we ran into some unexpected issues and limitations. Since we are surely not the last ones moving existing apps to Core Data, we would like to share some of our lessons learned.

Note: This post assumes that you are already familiar with the concepts of Core Data and its classes. If you are new to the subject, we recommend reading the Core Data programming guide by Apple.

Lesson 1: The more you respect MVC, the easier your migration will be

Core Data knows no mercy with developers who disrespect the MVC principles. Because model objects can get deallocated and reloaded at the will of their managing NSManagedObjectContext, you can never rely on them being around unless you specifically retain them (which isn’t a good idea). It is thus unwise to set the model objects as the direct receivers of callbacks (for example, for an image download). Instead, route all calls through the appropriate controller, which manages the NSManagedObjectContext of the data object, and retrieve the data objects at the time when you need them to store the downloaded information. This not only solves the availability issue but also makes for much cleaner code, as you are forced to adhere closely to the MVC pattern.

Lesson 2: Data migrations are a breeze…if you have Xcode >= 4.0.2

If you are already partly using Core Data in your application you will probably need to update your data model, and chances are that you need to provide a migration mapper. Luckily there are some great tutorials on the subject out there; however, you can still lose a lot of time if you forget either of the following:

  1. Turn off automatic data migration (when adding the persistent store to the persistent store coordinator, set NSMigratePersistentStoresAutomaticallyOption to NO)
  2. If you are still running Xcode 4.0.0, upgrade to at least Xcode 4.0.2. Earlier versions have a bug that prevents the migration mappers from being placed in the proper location at link time. Believe us, we learned the hard way.

Lesson 3: Multithreading and Core Data can work very well if you plan ahead

Perhaps the most annoying part about Core Data is that, in its current implementation, the two objects you interact with the most (NSManagedObjectContext and NSManagedObject) are not thread-safe. However, there are established best practices and workarounds for both updating and loading objects on secondary threads. If you familiarize yourself with them before implementing your designs, you will find it pretty easy to create concurrent applications using Core Data. Apple has a great guide on Core Data and concurrency which is well worth the read and nicely explains what to do in both situations.

Lesson 4: NSFetchedResultsController is not always useful

In order to make it easier to use UITableView together with Core Data, Apple provides NSFetchedResultsController, which manages the NSManagedObjects associated with a particular fetch request and monitors an associated NSManagedObjectContext for changes to those objects. While this setup is great if there is only one UITableView being displayed and there are few changes being made to the underlying data objects, we’ve found that using NSFetchedResultsController can, under certain conditions, adversely affect the (perceived) performance of your app. When you have data sets with many changes, or if there are frequent changes being made to underlying data objects, it is advisable to manually “bundle” those changes and entirely reload the UITableView instead of incrementally reloading the affected cells. If you meet one of those two cases, consider implementing your own update routine and temporarily deactivating the update mechanism of the NSFetchedResultsController by setting its delegate to nil.

Lesson 5: Performance, performance, performance!

To get the best performance out of Core Data, keep the following in mind as you sketch out your refactoring:

  • Multiple small queries are slower than one larger query, since every call to executeFetchRequest:error: hits the disk
  • Complex filter predicates can significantly hamper the performance of your queries; keep them as simple as possible
  • Fetching relationships is expensive; avoid them if they do not provide a significant advantage. Flatten your data model where you need to load data together.
  • Only store small data objects directly in Core Data; store larger objects (e.g. images) in files and save only the path to them in Core Data

We hope these insights will help you when planning your own migration to Core Data. Let us know if you find other valuable tricks for dealing with Core Data!

Thanks for reading!
~The Pulse iOS team

Design Secrets for Engineers

 

If you are a designer like me, you must be asked on a regular basis to “make it look pretty.” The request can stroke your designer ego, making you feel like a design rockstar with super powers to make this world a more beautiful place. This is especially true at startups, where you are one of the few, maybe the only designer there. However, it can also be really annoying–almost degrading at times. Thoughts like “why the hell can’t engineers do this on their own? It’s all common sense” always go through my head. If only engineers knew how to do visual design, designers would have more time to focus on cooler, more exciting problems like future product concepts.

And if you are an engineer, you might wonder how designers pull off their tricks (and why they’re in such huge demand right now). Is it genetic? Do design schools teach them top secret design tips? Or did they make a deal with the devil to get designers’ eyes in exchange for their souls?

Well, I’m here to bring you some good news: engineers don’t need to drink unicorn blood just to be good at visual design. I am a strong believer that good design is a highly learnable skill, like riding a bike, playing a piano or learning Spanish. If you practice often enough, you’ll become better and better at it, and once you’ve got the hang of it, you’ll never go back. I can say this because I too once sucked at design. But then I learned a few tips from my graphic design friends, and a few years later, I could proudly say that I was a design expert. Today, I want to share these not-so-secret tips with you.  The first five are more specific to visual design while the next three are geared towards interaction design.

 

1. Line things up.

Good example: beautiful sites and apps usually have an underlying grid behind them.

This rule is the mother of all graphic design rules. Unless you’re recreating the Mona Lisa on MS Paint, please line things up. Our brains just like it better that way. The slightly more advanced version of lining things up is called the grid system, which is essentially lining more things up. Kindergarten kids can do it and so can you.

 

2. Design the white space.

Bad example: this is what happens when you try to fill in your white space with information.

When you’re in an elevator with 15 other people, it’s not so easy to breathe…especially when someone farts. When you design a layout or UI, try not to jam too many elements into a page; it increases the chance of having one of the elements stink the whole thing up. Leave some white space for the eye to breathe. I often find myself designing the space in between elements, making sure elements aren’t too far apart or too close together.

 

3. Use designer fonts.

Bad example: use these fonts and designers will make fun of you.

In the design world, there are good fonts and bad fonts. Good fonts like Gotham, Trade Gothic Bold Condensed or Garamond please your eyes and make you feel like you’re having a frosty cold mojito on the beach. Bad fonts, on the other hand, make us designers cringe and feel like we’ve vomited from our eyes. Try to avoid super default fonts like Impact, Curlz, or Comic Sans, to name a few. If you must use a preloaded font, Helvetica and Georgia are two exceptions–they’re classic and restrained enough to be inoffensive. If you want designer fonts that play nicely with the web, try Typekit. Oh…and please don’t use WordArt. Ever.

 

4. Keep it consistent.

Good example: pick a few and run with them.

Use no more than two fonts and three colors in your designs. And keep them consistent throughout your sketches. Each time something changes, our brain has to go “whaaaa?” for a moment before figuring things out, so let’s give our brains a rest and keep things consistent. Also, let’s try not to stretch logos or images.  Imagine if someone took your face and stretched it horizontally by 5%. Still happy with the way you look?

 

5. Keep visual hierarchy in check.

Good example: squint your eyes. What do you see?

I don’t know about you, but when I cook, I always do a little tasting from time to time to make sure that my seasoning is on track. When you design, check yourself from time to time. Squint your eyes every now and then and look at the screen. What pops out at you first? What do you see second? Third? Walk away from the screen, then look at it from 10 feet away. Believe it or not, designers and architects do this all the time to keep things in perspective (literally).  It’s a good way to keep yourself from getting lost in little details or adding unnecessary buttons to the screen.

 

6. Set priorities and stick to them.

Bad example: don't allow this to happen in your product.

“Let’s put a help button there just in case the user is curious what’s going on. Oh…and let’s make the button look a little more like a button. And before I forget, can you make that tab pop a bit more?” Sometimes, I wish there was a robot that would bitchslap the one asshole in the room who keeps bringing up corner cases (cases that apply to only one very specific scenario).  Until this bitchslap machine is built, however, we can get by with a list of what’s important and what isn’t, backed by some data if possible. It will save you time and energy, and shut that asshole up.

 

7. Check the physicality of the UI.

WTF example: imagine your interface lived in a little box. Try not to make impossible things happen.

A lot of what makes a UI successful is how familiar it seems to users when they first encounter it. Most users don’t have exhaustive experience with mobile apps, and will assume that they follow the same rules as the real world. When you’re making design decisions, ask yourself what sorts of physical analogs each element has. If the UI were re-created in the real world, would it make sense?

 

8. Use Keynote.

Best. Prototyping. Tool. Ever.

I love Keynote.  I don’t know how else I would have come this far in life without this magical program that lines things up automatically and makes it easy to make things look good. Beyond just making slide decks, Keynote is a great way to mock up UI flows. Do a quick web search for “Keynote mockup templates” and you’ll find a number of great starting points for building good-looking prototype apps quickly and easily.

 

And remember to listen to others! It’s natural that confidence comes with the thought that you’re right and everyone else is wrong. But just admit it: you’re not always right.

Chances are you won’t become a design supreme being overnight. It takes some practice and confidence that you’ll get good at it. Ira Glass from NPR sums it up pretty well: http://www.mymodernmet.com/profiles/blogs/great-advice-for-creatives

Just keep repeating “I, too, can be a designer”–eventually you will become one.

 

ps: we are hiring designers and designer-wannabe engineers!

Bringing Location-based Deals to Your Phone

Savvy shoppers have always been on the lookout for new deals, but sites such as Groupon, Gilt City, and LivingSocial have made deal-hunting even more popular. For the last few weeks, we’ve been working hard to make finding great deals easier for our users. By aggregating local deals from these sources, we hope to make them less intrusive (than email alerts) and easy to peruse. It should also be painless to scan deals from several different sites at once by simply adding each of them in Pulse.

Where are the deals?

Since most deal sites are location-based, we want to allow our users to see deals that are nearby. The first step is to know where the deals are. We can get a list of cities (either through API calls to the deal sites or RSS feeds provided by them) and store the longitude and latitude of these cities. If the sites are unable to provide a list of cities and/or their coordinates, it is relatively straightforward to scrape the list of cities from their website (with permission of course!) and find the coordinates of the cities through services such as the Google Geocoding API. Once we have this data, we store the city/coordinate pairs in the Google AppEngine datastore (Figure 1).

Figure 1: City coordinates in datastore

Where is the user?

On GPS-enabled devices, location coordinates are relatively easy to come by. A client application can simply query the server with a user’s coordinates (with the user’s permission) using an API call such as: http://<server>/api/nearby?long=-73&lat=42&max=10. After some server-side calculations, this call returns a list of cities close to the location the user just provided.
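
As a rough illustration of the server side of that call, here is a minimal sketch of reading the three query parameters. The function name parse_nearby_query, the example.com host, the default of 10 results and the range check are assumptions made for this example; the post does not show how the real endpoint handles its input.

```python
from urllib.parse import urlparse, parse_qs


def parse_nearby_query(url):
    """Extract (longitude, latitude, max_results) from a nearby-cities URL.

    Hypothetical helper: the parameter names come from the example URL in
    the post, everything else is an assumption.
    """
    params = parse_qs(urlparse(url).query)
    longitude = float(params["long"][0])
    latitude = float(params["lat"][0])
    # Assume a default of 10 results when "max" is not supplied.
    max_results = int(params.get("max", ["10"])[0])
    if not -180.0 <= longitude <= 180.0 or not -90.0 <= latitude <= 90.0:
        raise ValueError("coordinates out of range")
    return longitude, latitude, max_results


# example.com stands in for the <server> placeholder used above.
request = parse_nearby_query("http://example.com/api/nearby?long=-73&lat=42&max=10")
print(request)
```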

What is nearby?

Given that we have each city’s coordinates stored on the server and the user’s coordinates from the client application, how do we determine which cities are closest? We use the following algorithm to figure out which cities we should show to the user:

import heapq
result = []
for city in cities:
    # Squared distance in degrees; enough for ranking, so no square root is needed
    dist = (city.long - user.long) ** 2 + (city.lat - user.lat) ** 2
    heapq.heappush(result, (dist, city))

The result is a min-heap keyed on distance: popping items from it yields the cities from the closest to the farthest. We return this list (or a subset of it) to the client application and then allow the user to choose to add deals from a city near them.

The key to making this operation efficient is loading the city coordinates into memory (in our case, into the AppEngine Memcache). This simple optimization works well for any small or medium-sized list of locations and turns a resource-intensive calculation (that can be quite involved to implement on AppEngine’s datastore) into a simple and scalable operation that requires no datastore calls.
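
To make the in-memory ranking concrete, here is a minimal, self-contained sketch. The City record, the nearest_cities function and the sample coordinates are assumptions for illustration only; a plain Python list stands in for whatever is cached in Memcache, and heapq.nsmallest replaces the explicit push loop to pick the closest entries.

```python
import heapq
from collections import namedtuple

# Hypothetical record type; the post does not show its actual city model.
City = namedtuple("City", ["name", "lat", "long"])


def nearest_cities(cities, user_lat, user_long, max_results=10):
    """Return up to max_results cities ordered from closest to farthest."""
    def squared_distance(city):
        # Same squared-difference measure as above, in degrees.
        return (city.long - user_long) ** 2 + (city.lat - user_lat) ** 2

    return heapq.nsmallest(max_results, cities, key=squared_distance)


# Approximate coordinates, used only as sample data.
cities = [
    City("New York", 40.71, -74.01),
    City("Boston", 42.36, -71.06),
    City("Albany", 42.65, -73.76),
    City("San Francisco", 37.77, -122.42),
]

nearby = nearest_cities(cities, user_lat=42.0, user_long=-73.0, max_results=3)
print([city.name for city in nearby])
```

Ranking by squared differences in degrees treats latitude and longitude as a flat grid, which is usually good enough for picking the nearest few cities; a great-circle formula would be the more accurate choice over long distances or near the poles.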

We hope to give our users a better experience by offering the ability to aggregate deal sites in Pulse. This was a short insight into how we implemented this feature.  If you have great ideas that can help further improve this experience, let us know!

Using jQuery Data for Easy Data Associations

One of my favorite but often overlooked features of jQuery is the data method, which allows you to associate arbitrary data with a DOM element. This is particularly useful when dealing with a list of elements that carry a lot of metadata. Without the data method, you could do this by storing everything in an array or adding funky attributes to an element, but that can get fairly convoluted. Not a lot of developers are aware of the power of jQuery data, so this post will cover a basic use case.

Let’s say we have a site which has a list of blog posts on the left-hand side. When one is clicked, the right side is populated with the blog post contents. The interaction is very similar to what we’ve done on the pulse.me site, where we use jQuery data for storing information about a story. The template fields in the HTML below would be populated using a templating framework.

HTML:

    <ul id="blog-posts">
        <li id="blog-post-0" class="blog-post-item">
            <h1>{{ post_title }}</h1>
            <h2>{{ post_date }}</h2>
            <h3>{{ post_author }}</h3>    
            <p>{{ post_snippet }}</p>
        </li>
    ...
    </ul>
   
    <article id="blog-post">
        <section id="blog-post-header">
            <h1>{{ post_title }}</h1>
            <h2>{{ post_date }}</h2>
            <h3>{{ post_author }}</h3>
            <h4>{{ post_category }}</h4>
        </section>
        <section>
            <img alt="{{ post_image_caption }}" src="{{ post_image }}">
            <p id="blog-post-content">{{ post_content }}</p>
            <span id="blog-post-tags">{{ post_tags }}</span>
        </section>
    </article>

Once we have the JSON object with the post data, we can save it. The data is attached to the <body> element, and the <li> id is used as the key to reference the post’s data.

JavaScript:

Update: Thanks to the Hacker News community (Xurinos, rimantas) for some tips on improving performance. See commented out lines for old and less optimal code.

    // Post data
    var post = {
        'post_title' : '125 Days at a Startup',
        'post_date' : '2011/06/21 10:00:00 -0700',
        'post_author' : 'Filip Mares',
        'post_category' : 'Pulse',
        'post_tags' : 'experience, startups, updates',
        'post_image' : 'http://posterous.com/getfile/files.posterous.com/temp-2011-06-20/koJbhHJJHJgDmpxbrshCjdDoFsktnbmoJspkkIJgBoDJqyFwcxIFpIGgHsti/IMG_0385.JPG.scaled1000.jpg',
        'post_image_caption' : '125 Days at a Startup',
        'post_content' : 'This is the content...',
        'post_snippet' : 'This is the snippet...',
        'post_url' : 'http://filipmares.com/125-days-at-a-startup'
    }

    $(document).ready(function ()
    {
        // id of the <li> node
        var blog_post_id = 'blog-post-0';

        // Save the incoming data in association with the <body> element
        //$('body').data(blog_post_id, post);
        $.data(document.body, blog_post_id, post);
       
        // Function for appending post data to list of posts
        PopulatePostsList(post);
    });

In order to query the post from memory we bind a click event function to retrieve the contents for the <li> id in question. Afterwards, we call a function that uses a templating framework to populate the DOM according to the HTML above. The ‘DeletePost()’ method is an example of how to delete the data from memory.

JavaScript:

    // Retrieve post data on list element click
    $('.blog-post-item').click(function () {
        //var id = $(this).attr('id');
        var id = this.id;
        // Look up the post stored under this <li> id
        var post = $.data(document.body, id);

        // Function that outputs post data to article on DOM
        PopulatePost(post);
    });

    // Delete post with id passed
    function DeletePost(id) {
        try {
            // Remove post from memory and terminate association
            //$('body').removeData(id);
            $.removeData(document.body, id);
        } catch (err) {
            console.error('Post does not exist');
        }
    }

Although this is fairly straightforward, please keep a few things in mind: this is not an alternative to local storage, and it can potentially lead to memory leaks when dealing with a lot of stored data. Be responsible when using the call and remove data that isn’t being used anymore. Furthermore, looking up a key that has not been stored yet returns undefined, so make sure the data has been saved (for example, after an asynchronous request completes) before you retrieve it. There are currently no plans to add a callback to jQuery data that I’m aware of.

Happy coding,

Filip