NO CARRIER

Computers, Science, Technology, Xen Virtualization, Hosting, Photography, The Internet, Geekdom And More

Almost Paravirtualized Introspection, But Not Quite

Posted on | February 17, 2010 | 1 Comment

In the world of virtualization, introspection refers to examining the memory of a running guest from the privileged domain at key addresses. For instance, if you know where to find the running kernel's task structure, you can read it while the guest is completely oblivious that you are doing so. In other words, it allows you to observe a running application without interfering with it at all. In an ideal world, this would be an easy process. Unfortunately, it's not.

The Xen Introspection Project has made amazing strides in its attempt to simplify the process. Unfortunately, this means keeping a list of kernels and interesting addresses up to date for every possible operating system. That’s quite a chore. Needle in a haystack mean anything to you?

One of the reasons that I love running mostly paravirtualized domains is the benefit of Xenbus, a communication channel between the privileged and guest domains. Using this, ‘split’ drivers can easily communicate with their backing counterparts on the privileged side, and vice versa. My goal a few months ago was to find a way to monitor the health and status of a large paravirtualized farm without relying on something like SNMP. In fact, some of the guests don’t have networking; all they do is crunch data on a clustered file system.

I first looked at real introspection, but this led me to several insurmountable obstacles. I needed a lot more than just a process tree and load average, and collecting the proper address offsets for each of the kernels I use would have taken months, if not longer. What I needed was really just the results of sysinfo(), combined with some information on mounts and other indicators usually found in /proc. Watching and metering IOWAIT across virtual CPUs on big SMP hosts was also a requirement (I did mention data crunching, remember?).
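To give a feel for the IOWAIT part, here is a minimal Python sketch (not my actual agent code) that works out the iowait share per CPU from two snapshots of /proc/stat. The function names are made up for illustration, and it assumes the usual Linux /proc/stat layout where iowait is the fifth counter on each cpuN line.

```python
import time


def parse_cpu_times(stat_text):
    """Return {cpu_name: (iowait_ticks, total_ticks)} for each per-CPU line."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Per-CPU lines look like "cpu0 user nice system idle iowait irq ...".
        # Skip the aggregate "cpu" line and anything that is not a CPU line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        ticks = [int(value) for value in fields[1:]]
        if len(ticks) < 5:
            continue
        # Sum only the first eight counters; guest time (fields 9 and 10)
        # is already included in user and nice.
        times[fields[0]] = (ticks[4], sum(ticks[:8]))
    return times


def iowait_percent(before, after):
    """Return {cpu_name: percent of elapsed ticks spent in iowait}."""
    result = {}
    for cpu, (iowait_after, total_after) in after.items():
        if cpu not in before:
            continue
        iowait_before, total_before = before[cpu]
        elapsed = total_after - total_before
        if elapsed > 0:
            result[cpu] = 100.0 * (iowait_after - iowait_before) / elapsed
        else:
            result[cpu] = 0.0
    return result


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and compare them."""
    with open(path) as handle:
        before = parse_cpu_times(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        after = parse_cpu_times(handle.read())
    return iowait_percent(before, after)
```

Calling sample_iowait() inside a guest returns one percentage per virtual CPU, which is the kind of number that is painful to dig out through raw memory introspection.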

The answer was pretty obvious: use Xenbus. This offered two directions: write a kernel module, or use Xenstore and do it in user space. I opted for user space and it’s working out quite well. My very beta and somewhat incomplete code can be found here. It consists of several components:

  • An agent that runs at a very high priority on the guest (optionally) that continually writes system vitals to xenstore
  • A daemon that runs on dom-0, detects guests starting and stopping, and automatically grants them write permissions to a special location in xenstore. It also watches for abusive guests that try to crash the store and automatically pauses them.
  • A FUSE file system enabling Xenstore to be navigated pretty much like /proc. Brendan Cully wrote the original version as an amusing proof of concept; I built on it to make it a little more serious for production.
  • A mini library to simplify reading and writing to Xenstore. Mostly, it’s just a wrapper around libxenstore.
  • The start of another mini library to implement something like libxenstat that also knows about the additional stuff the dom-u agent exports
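
To make the agent idea a bit more concrete, here is a rough Python sketch of the guest side. It is not the code from my repo: it shells out to the stock xenstore-write tool instead of going through libxenstore, and the "vitals" prefix and key names are hypothetical placeholders for whatever location the dom-0 daemon has opened up for the guest.

```python
import subprocess

# Hypothetical key prefix. A relative path is assumed to resolve under the
# guest's own home in the store, where dom-0 has granted write access.
VITALS_PREFIX = "vitals"


def parse_meminfo(text):
    """Return {field_name: value_in_kB} from /proc/meminfo style text."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def collect_vitals(loadavg_text, meminfo_text):
    """Flatten a few vitals into {xenstore_key: string_value} pairs."""
    load1, load5, load15 = loadavg_text.split()[:3]
    mem = parse_meminfo(meminfo_text)
    return {
        VITALS_PREFIX + "/load/1min": load1,
        VITALS_PREFIX + "/load/5min": load5,
        VITALS_PREFIX + "/load/15min": load15,
        VITALS_PREFIX + "/mem/total_kb": str(mem.get("MemTotal", 0)),
        VITALS_PREFIX + "/mem/free_kb": str(mem.get("MemFree", 0)),
    }


def publish(pairs):
    """Write every pair with one xenstore-write call (key value key value ...)."""
    args = ["xenstore-write"]
    for key, value in sorted(pairs.items()):
        args += [key, value]
    subprocess.run(args, check=True)


def run_once():
    """Read the real /proc files and push one snapshot to the store."""
    with open("/proc/loadavg") as handle:
        loadavg_text = handle.read()
    with open("/proc/meminfo") as handle:
        meminfo_text = handle.read()
    publish(collect_vitals(loadavg_text, meminfo_text))
```

Run run_once() in a loop with a short sleep and you have the basic shape of the agent. Keeping collect_vitals() separate from publish() means the parsing can be exercised on any machine, with or without Xen.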

All in all I’m quite happy with it. I’m far from done, but it is already doing the job I needed it to do and more. I also added some hooks to proxy writes to sysctl from dom-0 to a dom-u, and I’ll soon be adding code to allow dom-0 to change the root password of any dom-u by writing it to a special place in xenstore.

As with most things, I realized too late that I had boxed myself into some bad ideas. As such, I plan to do some major refactoring once I get everything written and working. Unfortunately, the repo doesn’t even have install targets in the makefiles. To build it, you need the common Xen libs and libfuse. The documentation is sparse, but it should show you where to put things and how to get them going.

Hopefully, by next month I will have a fully finished, rpm/deb’able package with a proper client (vs just using the FUSE implementation for now).

Comments

One Response to “Almost Paravirtualized Introspection, But Not Quite”

  1. Chris B
    January 18th, 2011 @ 5:28 am

    Hi,

    It seems your link to the code above is broken. Is there some other place to look? I too am in the process of building something similar as a component to a larger project, and would love to see what other people do.
