Computers, Science, Technology, Xen Virtualization, Hosting, Photography, The Internet, Geekdom And More

Linux’s inner child is a serial killer

Posted on | March 23, 2009 | 1 Comment

No, not Linus’ (though I do wonder about that guy sometimes). I’m talking about the Linux OOM (out of memory) killer, which I am seriously considering classifying as sentient.

I’m finishing up some Xen-related programs. One of them is a special watchdog daemon that runs on Xen guests and exports system vitals via Xenbus. I’m building a small set of rules that lets the watchdog work out which events are recoverable when it decides whether sending the watchdog ping back to softdog is a good idea.

Naturally, one would want to know a tally of how many victims (I mean processes) the OOM killer has claimed. Of course, no such statistics are readily available via the /proc interface (that I can find, anyway). This is the second time today that I’ve encountered this psychotic pest. The first time was doing a live migration test to see if I could reproduce a friend’s bug … migrating a domain that has more RAM than dom-0 sometimes results in dom-0’s OOM killer sending xend to a watery grave.

As for the watchdog, I can work around the lack of any centralized statistics for OOM events: it’s quite cheap to just iterate through the process list and collect scores while figuring out how many victims are likely to be ‘next’. But still, it would be wonderful if we knew how many victims there were (readily via /proc) :)
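A minimal sketch of that workaround, in Python for brevity rather than whatever the watchdog itself is written in. It walks the numeric entries under /proc, reads each process’s oom_score, and sorts so the likeliest victims come first. The function names and the `proc_root` parameter are made up for illustration, and the `comm` file used for the process name only exists on newer kernels, so the sketch falls back to "?" when it is missing.

```python
import os


def collect_oom_scores(proc_root="/proc"):
    """Return (score, pid, name) tuples for every process, highest score first."""
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "oom_score")) as f:
                score = int(f.read().strip())
        except (OSError, ValueError):
            continue  # process exited mid-scan, or the file is unreadable
        try:
            # Assumption: "comm" is present; older kernels do not provide it.
            with open(os.path.join(base, "comm")) as f:
                name = f.read().strip()
        except OSError:
            name = "?"
        results.append((score, int(entry), name))
    # Highest score first; ties broken by lowest pid for a stable order.
    results.sort(key=lambda r: (-r[0], r[1]))
    return results


def likely_victims(scores, count=3):
    """Return the `count` processes the OOM killer would most likely pick."""
    return scores[:count]


if __name__ == "__main__":
    for score, pid, name in likely_victims(collect_oom_scores()):
        print(f"{pid:>7} {score:>6} {name}")
```

This only tells you who is likely to be next. It still says nothing about how many processes have already been killed, which is the number I actually want.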

I may make a patch that adds oom_count: to the bottom of /proc/stat, but man, I really hate touching the kernel for such a simple need, never mind portability. I also don’t want to parse system messages once every 45 seconds just to count OOM events.

So, to summarize, Linux’s inner child (named OOM) is not only a very efficient killer, it’s very good at concealing its body count :)

One of the most fun aspects of writing software to run on a privileged Xen domain is that you must be extra defensive: the privileged domain typically has less memory than old commodity Pentium Pro desktops. Yet some things designed to run on the privileged domain allocate memory as if they were a relational database server. So perhaps this lunatic killer should be elusive; it keeps everyone else in check :)


One Response to “Linux’s inner child is a serial killer”

  1. tinkertim
    March 23rd, 2009 @ 1:20 pm

    Prior to the flames and links to Documentation/filesystems/proc.txt, I am aware that the OOM score per process can be adjusted, and that -17 (or more properly, OOM_DISABLE) is like police protection for a process. I’m also aware that each process can find out how likely it is to become the OOM’s next victim.

    I just want more global (not just per-process) stats for this beast in procfs, in addition to the per-process ones, so I can watch this thing in action (and take measures from userspace).
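    For reference, the per-process side mentioned above can be sketched like this in Python. It reads a process’s oom_score and its legacy oom_adj value and reports whether the process carries the -17 (OOM_DISABLE) exemption. The helper names and the `proc_root` parameter are hypothetical, and this assumes the kernel still exposes oom_adj; newer kernels favour a separate oom_score_adj file with a different range, which this sketch does not handle.

```python
import os

OOM_DISABLE = -17  # legacy oom_adj value that exempts a process from the OOM killer


def read_int(path):
    """Read a single integer from a procfs-style file, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def oom_status(pid="self", proc_root="/proc"):
    """Return the OOM score, adjustment and protection flag for one process."""
    base = os.path.join(proc_root, str(pid))
    adj = read_int(os.path.join(base, "oom_adj"))
    return {
        "score": read_int(os.path.join(base, "oom_score")),
        "adj": adj,
        "protected": adj == OOM_DISABLE,
    }


if __name__ == "__main__":
    print(oom_status())  # by default, inspects the calling process via /proc/self
```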
