How Facebook engineers conquer elusive app memory leaks

[Image: Facebook logo sign. Credit: Matt Kapko]

They've learned that it's best not to be so fussy when asking for memory from the operating system.


Programmers needing a break from marathon debugging sessions may want to peruse a blog post from two Facebook software engineers, who offer tips and war stories about rooting out elusive memory errors in the social network's iOS app.

Memory errors can be a special nuisance for developers, given how hard they are to debug. If you've ever had the Facebook app, or any other running app, just stop running and disappear, it is very likely because of a memory error.

"With some tooling, migration to the newest iOS technologies, and a bit of cleverness to measure the problem in the first place, we were able to make our app more reliable," wrote Ali Ansari and Greg Pstrucha, in a new post on the Facebook Engineering Blog.

In short, a running program disappears because the operating system killed it, most likely because the app started doing things outside of its allotted space of memory. The operating system assigns every running program a range of a system's memory to do its work.

The operating system can also terminate a program if it suddenly starts requesting large amounts of additional memory, which can happen when a memory leak threatens to eventually consume all the system's memory. It can even kill a perfectly working program when the system itself is running out of memory for other reasons.

[Figure: A diagram of Facebook app memory crashes. Credit: Facebook]

In Facebook engineering parlance, a BOOM (background out-of-memory error) is when a program dies in the background, and a FOOM (foreground out-of-memory error) is when a program on the screen suddenly dies.

iOS does send a message to the app warning that a shutdown is imminent, but there is no guarantee that the app will log that message before it is dispatched into the void.

"This leaves us with no easy way to know that the app was killed by the OS due to memory pressure," the engineers wrote.

Nonetheless, the Facebook engineers have developed a number of techniques that have lowered the overall rate of memory crashes of its iOS app.

One technique they found to be handy is to make the app less fussy in terms of asking for and then relinquishing memory from the operating system.

Facebook engineers were initially conscientious about using only the amount of memory needed. Whenever the app needed more memory to carry out an action, such as viewing a Web page, it would ask the operating system for more, and then immediately relinquish that memory when the task was finished.

This approach, however, didn't reduce the number of crashes in the app. In many cases, that relinquished memory was not even reclaimed by iOS.

Instead, what seemed to help lower the crash rate was changing the amount of allotted memory as little as possible. The app would ask for all the memory it needed when it started, then try to work within these confines.

This approach alone reduced program crashes by about 30 percent.
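The allocate-once idea can be illustrated with a generic buffer pool. This is only a minimal sketch in Python, not Facebook's code: the engineers' post is not quoted here on what the app preallocates or how, and the `BufferPool` class, its method names, and the buffer sizes are all hypothetical.

```python
class BufferPool:
    """Fixed set of reusable buffers, all allocated once at startup.

    Hypothetical illustration of "ask for memory up front, then work
    within those confines"; not taken from the Facebook app.
    """

    def __init__(self, count: int, size: int) -> None:
        # Allocate everything up front; the pool never grows or shrinks.
        self._free = [bytearray(size) for _ in range(count)]
        self.capacity = count

    def acquire(self) -> bytearray:
        # Hand out a preallocated buffer instead of asking for new memory.
        if not self._free:
            raise RuntimeError("pool exhausted: work within the fixed budget")
        return self._free.pop()

    def release(self, buf: bytearray) -> None:
        # Zero the buffer in place and keep it for reuse rather than freeing it.
        buf[:] = bytes(len(buf))
        self._free.append(buf)

    @property
    def available(self) -> int:
        # Number of buffers currently free to hand out.
        return len(self._free)


# Assumed sizes, chosen only for the example.
pool = BufferPool(count=4, size=1024)
page = pool.acquire()
page[:5] = b"hello"
pool.release(page)
```

The trade-off in a design like this is that the program holds its full budget for its whole lifetime, but the amount it requests from the operating system stays flat instead of rising and falling with each task.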

Apple provided some additional help in memory allocation as well. In version 8 of iOS, it introduced a new programming class called WKWebView, which offloads the viewing of Web pages to a separate process.

The Facebook team also used a number of internally developed tools to detect potential memory errors.

One was CT-Scan, a scanning tool that Facebook developed internally and released as open source, originally built for tuning the performance of mobile apps. It turned out to be effective for pinpointing memory leaks as well.

The team also developed a new in-app memory profiler, which tracks all the memory allocations made by the program without adding overhead to the program itself. This allows Facebook to collect operational characteristics of a test copy of the program as it is running.

As the program gets updated, the team can compare the amount of memory allocated by different processes, between the new version and older versions. A giant discrepancy between the two may point to a heretofore undiscovered memory leak.
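A version-to-version comparison of this kind might look roughly like the sketch below. It assumes each profile is a simple mapping from a component name to bytes allocated; the function name, the thresholds, the component names, and the figures are all invented for illustration and do not come from Facebook's profiler.

```python
def find_regressions(old, new, threshold=2.0, min_bytes=1_000_000):
    """Return (name, old_bytes, new_bytes) for entries that grew sharply.

    An entry is flagged when it allocates at least `min_bytes` in the new
    version and either did not exist before or grew by `threshold` times.
    Both cutoffs are assumed values, not figures from the source.
    """
    flagged = []
    for name, new_bytes in new.items():
        old_bytes = old.get(name, 0)
        if new_bytes < min_bytes:
            continue  # too small to matter
        if old_bytes == 0 or new_bytes / old_bytes >= threshold:
            flagged.append((name, old_bytes, new_bytes))
    # Largest absolute growth first.
    return sorted(flagged, key=lambda row: row[2] - row[1], reverse=True)


# Made-up profiles for two builds of the same app.
old_profile = {"feed": 40_000_000, "image_cache": 25_000_000, "web_view": 10_000_000}
new_profile = {"feed": 42_000_000, "image_cache": 90_000_000, "web_view": 9_000_000}

suspects = find_regressions(old_profile, new_profile)
for name, before, after in suspects:
    print(f"{name}: {before} -> {after} bytes")
```

With these made-up numbers only `image_cache` is flagged, since it is the only entry that more than doubled. A flagged entry is a lead to investigate rather than proof of a leak.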
