WebKit2 - Memory consumption of Process Models vol. #1
It has been a long time since I last wrote, but lately I did some interesting memory consumption measurements on QtWebKit’s WebKit2 implementation. First, let’s talk a little about WebKit2’s process models. There are 2 process models in WebKit (more precisely, there are 3, but for now I don’t take thread-based solutions into account): the default is the secondary process model, and we can easily switch to the shared process model as well. Balazs experimented with a third one called the service model, which works like a classic client-server connection. For a clearer view, let’s see the picture below.
Based on the structure of the models, we expect that the service model will consume the least memory.
Well, last summer I modified the smaps part of the Linux kernel a bit to be able to provide exact results about cross-process memory consumption. The problem is that several different processes can use the same shared libraries, but they can use different parts of them. Since only the accessed memory pages are loaded into memory, the measurement is not trivial at first sight. As you can see in the picture above, every orange rectangle represents a unique process, so without this kernel modification we couldn’t provide precise results. We could read the memory usage of the shared libraries from the smaps file in Linux’s proc filesystem for MiniBrowser and for QtWebProcess, but we couldn’t find out how much of that memory is really shared and how much is really distinct.
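To make the limitation more concrete, here is a minimal sketch of what an unmodified kernel's smaps file can tell us. It is not the modified kernel interface used for the measurements in this post; it only sums the standard per-mapping fields (Rss, Pss, Shared_Clean, Shared_Dirty, Private_Clean, Private_Dirty). The function names summarize_smaps and summarize_pid are hypothetical helpers made up for this illustration.

```python
from collections import defaultdict

# Standard per-mapping fields of a stock /proc/<pid>/smaps file (values in kB).
FIELDS = ("Rss", "Pss", "Shared_Clean", "Shared_Dirty",
          "Private_Clean", "Private_Dirty")


def summarize_smaps(text):
    """Sum the standard smaps fields over all mappings of one process.

    Hypothetical helper: takes the raw text of an smaps file and returns
    a dict of totals in kB, plus combined "Shared" and "Private" totals.
    """
    totals = defaultdict(int)
    for line in text.splitlines():
        parts = line.split()
        # Field lines look like "Rss:   452 kB"; mapping headers and
        # "VmFlags:" lines do not match this shape and are skipped.
        if (len(parts) == 3 and parts[0].endswith(":")
                and parts[1].isdigit() and parts[2] == "kB"):
            key = parts[0][:-1]
            if key in FIELDS:
                totals[key] += int(parts[1])
    result = {field: totals[field] for field in FIELDS}
    result["Shared"] = result["Shared_Clean"] + result["Shared_Dirty"]
    result["Private"] = result["Private_Clean"] + result["Private_Dirty"]
    return result


def summarize_pid(pid):
    """Read /proc/<pid>/smaps (Linux only) and summarize it."""
    with open("/proc/%d/smaps" % pid) as smaps_file:
        return summarize_smaps(smaps_file.read())
```

Calling summarize_pid once with the PID of MiniBrowser and once with the PID of QtWebProcess gives one set of totals per process. The "Shared" total only says that those pages are mapped by more than one process; it does not say which processes share them, so adding up the per-process numbers cannot give an exact total for a group of cooperating processes.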
Okay, we have a fancy and exact measuring environment on my dedicated Suttyo™ memory benchmarking machine (x86, dual-core 2.33 GHz, Slackware 13.1). Now we need a benchmark suite.
Fortunately, I collected a new set of websites for our Methanol benchmark, which now contains the first 25 unique websites from the Alexa Top 500. (Currently, we run it with only 24 sites, because of an assertion: bug #51990.)
The pages are completely stripped of external references, so the MiniBrowser doesn’t use the network during the measurements. After filling the caches, Methanol loads every page five times, one by one. This is how we simulate live browsing.
I used the official r75063 WebKit revision, with Qt 4.7.1. Now, we have the infrastructure to measure memory consumption, and we have a live browsing simulation benchmark as well, so let’s check what the numbers show us!
Testing process models with a single Methanol run
This test case is a simple comparison between the process model implementations: we run Methanol once for each process model.
In this simple case the Secondary Process Model consumes 10% less than the other two. We should notice that we loaded 24 sites in 1 MiniBrowser window with 1 WebProcess. Personally, I expected identical results from the models since in this case they all have the same number of UI and Web processes. However, the actual results differ from my expectations.
Testing process models with 2 simultaneously running Methanols
In this case we run 1 MiniBrowser process with 2 BrowserWindows. In the case of the Shared and Service models only 1 WebProcess runs, but the Secondary process model runs with 2, one for each BrowserWindow.
In this case, the Secondary process model wins again, by 2 megabytes against the Shared model and by 4 megabytes against the Service model. The Shared model consumed 2 megabytes less memory than the Service model. Don’t forget that in this case 2 Methanol instances were running at the same time in different BrowserWindows, but in 1 UI process. I think it is interesting that the Secondary model consumed the least memory, even though it had the extra RSS cost of the extra WebProcess.
Testing process models with 2x2 simultaneously running Methanols
This test case is very exciting, because we have a lot of differences between the behavior of the models, which show up very well on the chart. First, let’s talk about the test.
The Secondary process model consumes 40% more memory than the Shared process model, and 72% more than the Service process model. In this test case the Service model is the absolute winner: it consumes 18% less memory than the Shared process model. These numbers show a really significant difference between the models. I think the last run was the most realistic test case. (Usually, I have ~20 open BrowserWindows (or tabs) while surfing. What about you?)
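As a quick sanity check, the three percentages above should agree with each other. The short sketch below uses only the two ratios quoted in the text (Secondary is 1.40 times Shared and 1.72 times Service) and derives the Service to Shared ratio from them. The variable names are made up for this illustration, and no absolute memory figures are assumed.

```python
# Ratios taken from the text: Secondary uses 40% more memory than Shared
# and 72% more memory than Service.
secondary_over_shared = 1.40
secondary_over_service = 1.72

# Service / Shared = (Secondary / Shared) / (Secondary / Service)
service_over_shared = secondary_over_shared / secondary_over_service

# How much less memory Service uses than Shared, in percent.
service_saving_percent = (1.0 - service_over_shared) * 100.0

print("Service uses %.1f%% less memory than Shared" % service_saving_percent)
```

The derived saving lands between 18% and 19%, which matches the 18% quoted above once rounding of the published percentages is taken into account.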
This is a summary table of the results.