A new tool to extract meaningful memory consumption values. Valgrind shines again.

Every program allocates memory, regardless of its size. Reducing it is an important task and an eternal struggle. Using advanced memory profiling tools, it is easy to see that the most memory is consumed by ... hash tables, vectors, unicode strings, mmaped memory regions, etc. But is THIS what you are interested in? Isn't it would be better to know how much memory consumed by WebCore, JavaScriptCore, Qt, or by a particular component, like WebCore/css?

About the freya tool

Ever heard about valgrind? Many knows it is a leak hunting tool. Well, that is half (or less) of the story. Valgring is a JIT engine, which can fully recompile existing binaries. Why is that useful? Because you can add custom code modifications to existing binary programs, like calling a user callback function, whenever a memory read or write happens, and capturing the address and size of that particular operation. Although the generated code is slower than the original one, it is still way faster than debugging the program step-by-step. Several tools built upon valgrind, like the famous memcheck tool. However valgrind has several other tools, like dynamic call graph tracer and memory profiler.

I developed a new tool to push further the limits of valgrind, called freya. Freya is memory allocation tracker, both for explicit (malloc, posix_memalign) and implicit (allocating 4k pages from an address space returned by mmap). It not just records the allocations, but records their stack trace as well. The real interesting feature of freya is that you can define rules about how the extracted information should be organized to get meaningful values.

Rules

The rules are divided into two groups:

  • First, we can define rules to discard the top items of the stack trace. This is useful, if we don't interested in hash table and vector allocations, only those functions, which actually called them. The trace cannot be fully discarded, at least the last item remains.
  • The reamining stack trace is stored in a tree, so the allocations in the same code position are grouped together. We can define rules where the stack trace should be inserted into the tree. (The concept is similar to file system mounting). These rules are evaluated in first come first served basis.
 +--+--+--+--+--+--+-           +--+--+-
 | 0| 1| 2| 3| 4| 5| ...   ->   | 4| 5| ...
 +--+--+--+--+--+--+-           +--+--+-
   ^        ^
   |        |

For example, if a discard rule matches to the first and fourth items of the stack trace, the first four items of the trace are discarded. The position of the fifth element is determined by the second set of rules, and the remaining stack trace is inserted to that position in the tree.

 +--+--+--+--+
 | 0| 1| 2|3a| - Stack trace 1
 +--+--+--+--+

 +--+--+--+--+--+
 | 0| 1| 2|3b|4b| - stack trace 2
 +--+--+--+--+--+

 Generated tree:

       +--+
       | 0|
       +--+
        |
       +--+
       | 1|
       +--+
        |
       +--+
       | 2|
       +--+
     /      \
   +--+    +--+
   |3a|    |3b|
   +--+    +--+
            |
           +--+
           |4b|
           +--+

The items in the stack trace become tree nodes, creating natural groups. These groups collect the allocations on the same code locations.

Compiling the program

We should note first, that valgrind has its own application binary interface, so WebKit JIT should be disabled when WebKit is run by valgrind. Moreover, to evaluating the rules, we need debug information (-g for gcc).

In short, you have to pass the following arguments to build-webkit (under Qt):
QMAKE_CXXFLAGS+=-g QMAKE_LFLAGS+=-g DEFINES+=ENABLE_JIT=0 DEFINES+=ENABLE_YARR=0 DEFINES+=ENABLE_YARR_JIT=0 DEFINES+=USE_SYSTEM_MALLOC

Configuring freya

The rules are stored in a config file, each line contains either a rule, or a tree node. The lines which started by hash-mark (#) or only contains spaces are ignored.

The rules have the following format: rule_name function_name path

The rule_name can be a unique identifier, or a hypen, which means this rule is a discard rule. If a discard rule matches to an item in the stack trace, all items are removed to that point in the stack trace.

Both function_name and path are optional, but at least one of them must be appear in the line. The function_name can be a name of a function enclosed in parentheses (), or a namespace name enclosed in braces {}. The function_name is omitted, if neither parentheses nor braces are found. The name of a function must be a fully qualified name, and must match to the current function name. The name of a namespace must match to the beginning of the current function name, followed by two colons ::.

The path is created from the remaining characters of the line. The spaces before the path are omitted. The path contains directory names separated by slash /, and they must partially match to the path of the current file name. I.e: dir1/dir2 does not match to /home/user/dir1/dir22/main.cpp, but dir1/main.cpp matches to both /home/user/dir1/main.cpp and /home/user/dir1/main.cpp/real_main.cpp.

If both function_name and path are specified, they both must be match to the current function.

# Examples:

# Matches to main (...)
  rule1   (main)

# Matches to main (...) in main.cpp
# Note: this rule never matches, since rule1 already captures all main(...)-s
  rule2   (main)   main.cpp

# Discard all functions which qualified name starts with NS1::NS2::
  -  {NS1::NS2}

# Mathes to all functions in dir1/main.cpp
  rule3   dir1/main.cpp

The tree nodes must be enclosed in square brackets, braces, or parentheses.

[name] defines a new tree node. In case of tree nodes, the indentation is very important, since its also represents the depth in the tree (the nodes which have the same indentation have the same depth). The tree nodes collects the memory consumption information (total, peak, etc.).

{name} attach an already defined rule to the last tree node. One rule can only be attached to at most one tree node. If any of the attached rule matches, the current stack trace is inserted to that tree node.

(name) is just an abbreviation of [name] followed by a {name}

To make a node to the default tree node (for functions which does not match to any rule), specify a plus sign + after the node definition.

# Example:
  [Total]
     (rule1)
     [Sub-total]+
        [for rule2]
           {rule2}
  (rule3)
     [Empty node]

Note: Only spaces can be used for separation, tabs are threated as normal characters!

Where can I get the source code?

The source code of freya is attached to this post.

Update (01.04.2011)

Freya is updated to valgrind 3.6.1 I upload the whole source code, as updating the Makefiles are not that easy, since the description is rather old.
Run configure and make.

AttachmentSize
valgrind-3.6.1-freya.tgz7.51 MB
WebKit_conf.txt1.59 KB

lu_zero (not verified) - 02/16/2010 - 17:25

Would be helpful if you update this post with a mention about how to run freya and how to feed the configuration file

zoltan.herczeg - 02/17/2010 - 08:15

Just like any other tools:
valgrind --tool=freya --config="config file path" [... other options ...]

Help is also available:
valgrind --tool=freya --help

Anonymous (not verified) - 06/24/2010 - 10:45

Getting error after running freya with valgrind.

valgrind: mmap(0x8048000, 12288) failed in UME with error 22 (Invalid argument).
valgrind: this can be caused by executables with very large text, data or bss segments.

Anonymous (not verified) - 08/12/2010 - 22:42

$ inst/bin/valgrind --tool=freya date
valgrind: mmap(0x8048000, 61440) failed in UME with error 22 (Invalid argument).
valgrind: this can be caused by executables with very large text, data or bss segments.

Getting the same problem as previous commenter
using/building valgrind from latest svn checkout on fedora9!
Any solution to this?

zoltan.herczeg - 08/13/2010 - 07:25

Is that a 32 or 64 bit system? On 64 bit systems mmap tracking is not working (I have to implement it someday) and should be disabled (--mmap=no)

Arturo (not verified) - 08/13/2010 - 18:57

It is 32bit installation on i686- fedora9 installed:
Linux els-e71812 2.6.27.12-78.2.8.fc9.i686 #1 SMP Mon Jan 19 20:14:35 EST 2009 i686 i686 i386 GNU/Linux
Valgrind version is valgrind-3.6.0.SVN
Every other Valgrind tool is working except Freya!
I used Valgrind tool development instruction modifing configure.in by adding Freya files both into AC_OUTPUT
and AC_CONFIG_FILES, but not able to make it worked!

This error means valgrind is not able to allocate
/right to allocate space for freya, right? I build it
as a normal user, I do have only sudo privileges on
the machine! Do I need to be root to build it?
Thanks

zoltan.herczeg - 08/13/2010 - 19:34

I don't think so. Valgrind works for me with simple user privileges. And I don't think your poject has that big code or data sections as well. "with error 22 (Invalid argument)" - This "Invalid argument" is strange. Could you try the tool with the stable 3.5.0 revision? I also used top-of-tree Valgrind once before, and it yielded strange error messages (and they disappeared in the stable revision). Is there a backtrace or other helpful information where the issue happens?

Arturo (not verified) - 08/14/2010 - 02:49

Hi Zoltan,
You are right, 3.5.0 works with Freya :)I am still having problems to write config file though! Could you more specific examples of config file somewhere? Thanks a lot for your help.
For people having the same problem, I obtained 3.5.0 from
http://valgrind.org/downloads/current.html
And you still need autogen.sh from svn to be able to build Freya.

zoltan.herczeg - 08/16/2010 - 07:19

I have attached a config file I use for WebKit. Probably better than any explanation :)

lu_zero (not verified) - 12/15/2010 - 18:22

now we have valgrind 3.6.0 out, are there new plans for freya?

zoltan.herczeg - 12/20/2010 - 13:22

Thanks for letting me know. I will check the patch soon, and will update it.

Anonymous (not verified) - 01/23/2012 - 13:30

Great, great tool, thanks :)

Luca Barbato (not verified) - 12/25/2012 - 17:33

I started keeping freya in a git tree https://github.com/lu-zero/valgrind/tree/freya

The master branch contains a hourly sync of valgrind, VEX is kept in a separate submodule and the freya branch contains freya obviously.

I hope we could try to get it in valgrind by 2013 =)

zoltan.herczeg - 12/25/2012 - 17:44

Wow, nice job! Thank you for letting us know. I am really happy. Excellent xmas present :)

Post new comment

The content of this field is kept private and will not be shown publicly.
  • Web page addresses and e-mail addresses turn into links automatically.
  • No HTML tags allowed
  • Lines and paragraphs break automatically.

More information about formatting options

CAPTCHA
This question is for testing whether you are a human visitor and to prevent automated spam submissions.
Fill in the blank