Line-by-line Speed Analysis for iOS Apps
[This is a guest post written by Tyler Neylon. Tyler is an iOS programmer and founder of Bynomial. He regularly blogs about iOS coding tips at the Bynomial blog.]
The iOS SDK comes bundled with tools in the Instruments app for finding speed bottlenecks in your code. These tools work by regularly inspecting the call stack of your app, and then providing an aggregate view of these call stacks. This is a good method for doing top-down speed analysis, where you want to find the methods eating up the most CPU time on a very large scale.
This post introduces some easy-to-use tools for speed analysis from a bottom-up perspective. Let’s say you’re writing or improving a particular time-critical function. For example, you may be writing a function that reacts to a user’s tap and needs to quickly compute and display results from that tap. In this case, the time profiler in Instruments is not ideal, since it’s hard to isolate a useful amount of data around the exact call stacks and time interval you care about. Instead, you can use CodeTimestamps, the open source tool we’re introducing here.
How to use CodeTimestamps
CodeTimestamps contains two macro sets. There is the simple LogTimestamp macro, which is useful for quick speed checks, and then the LogTimestamp{Start,Mid,End}Chunk macros, which are better for in-depth analysis, such as comparing many runs of multiple functions or loop bodies.
Simple case
Let’s say you want a detailed picture of how much time certain lines in your code take. You can add calls to LogTimestamp like this:
LogTimestamp;
// do some CPU-heavy stuff
LogTimestamp; // Line 44.
}
When this code executes, every line with LogTimestamp after the first one will produce a debug log line like this:
This tells you the class and method name, the line number, and the number of nanoseconds passed since the previous timestamp. In other words, you get to know exactly how long myMethod takes, down to the nanosecond. It can be useful to throw in a bunch of LogTimestamp lines to get a line-by-line breakdown of what parts of your method are the slowest, so you know where to focus your speed-up efforts.
Advanced case
The above technique is great for looking at a small number of specific code segments, and when no data aggregation is needed. But let’s say you want to find out which iteration of a function out of 10,000 calls was the slowest, or perhaps you will be using many timestamps within a function, and would like to quickly locate the slowest piece. Time to bust out the big guns.
You use the LogTimestamp{Start,Mid,End}Chunk macros similarly to LogTimestamp, except that, within a function, they are expected to be executed in order – Start – Mid, Mid, Mid, (etc.) – End. You can have as many midpoints as you want, including zero. The data you get from these macros is best explained by an simple example. Suppose we want to know which line of the function MyFunction is slowest. Here’s our code:
sleep(1); // pause for 1 sec
}
void FastCall() {
usleep(1e5); // pause for 0.1 sec
}
void MyFunction() {
LogTimestampStartChunk;
FastCall();
LogTimestampMidChunk;
SlowCall();
LogTimestampMidChunk;
FastCall();
LogTimestampEndChunk;
}
And here’s the resulting debug log lines when we run the code with one call to MyFunction:
Temp20[36546:207] --- Data for thread 0x4b0dbb0 ---
Temp20[36546:207] + Chunk = MyFunction:28 - MyFunction:34, time = 1.2003s
Temp20[36546:207] 83% in MyFunction:30 - MyFunction:32
Temp20[36546:207] 8% in MyFunction:32 - MyFunction:34
Temp20[36546:207] 8% in MyFunction:28 - MyFunction:30
Temp20[36546:207] ++ Chunk = MyFunction:28 - MyFunction:34, avg time = 1200328449 nsec
Temp20[36546:207] ==== Slowest chunks so far ====
Temp20[36546:207] # Chunk = MyFunction:28 - MyFunction:34, time = 1.2003s
Temp20[36546:207] ==== End timestamp data ====
If we add these chunk macros to another function called MyFunction2, and execute both MyFunction and MyFunction2 about 1,000 times each, then we can quickly find out which one is slower by looking at the “Slowest chunks so far” aggregated data, which is a summary based on all previous calls to all chunks.
The debug output is not immediately supplied — it is deferred by up to 10 seconds. This is because NSLog calls can be notoriously slow. If data were logged immediately, all the timing information would be hijacked by NSLog’s inexplicable processor waylaying. Why, NSLog? But alas, the best we can do is work around NSLog’s lackadaisical lollygagging ways by gathering the timing information in memory, and reporting it periodically outside of the timestamp chunks.
This output is not as beautiful as the graphs in Instruments, but I personally find this approach much easier to use for the purpose of speeding up specific sections of my code.
How to get the code
The files CodeTimestamps.{h,m} are freely available as open source code under the Apache 2 license. They can be downloaded as part of the moriarty library. Click on “Downloads” (near the upper-right corner on the github page), unzip the file you download, and copy the two files (CodeTimestamps.{h,m}) into your Xcode project, making sure Xcode knows about these files (right-click on Classes, then Add > Existing Files… to add them within Xcode). From there, just #import "CodeTimestamps.h" in whichever file you want to add timestamps to, and use them as described above.
I wrote this code while working with Ankit to speed up Pulse on the iPad. I’ve always been impressed by the Pulse team’s steadfast dedication to setting new standards in mobile app user experience, and building custom performance tools like this is just one of many techniques used. I also think it’s awesome that Pulse chose to offer this code as open source. Keep up the great work!
Also, we're hiring! Check out what it's like to work at Pulse and our current openings--we'd love to hear from you!
You can leave a response, or trackback from your own site.
Tweet