Riemann - a Clojure Application on OSv

By Tzach Livyatan

Clojure applications run on the JVM, so they’re usually simple to run on OSv. We have hello world in Clojure running, but this time I wanted to port a real, non-toy, Clojure application. I chose Riemann, a widely-used application for aggregating system events (and more).

I used Capstan, a tool for building and running applications on OSv. Jump to the end result, or follow the steps I took:

Following the Capstan guidelines, I added a Capstanfile to the project. Here are the parts of the Capstanfile you need to know about:

  • Set the base image. In this case I chose a base image with Java (OpenJDK):
    base: cloudius/osv-openjdk
    
  • Build the jar file, taking advantage of the lein uberjar command, which packages the application with all dependencies into one jar file.
    build: lein uberjar
  • Copy the build artifacts to the base image, producing a new image:
    files:
      /riemann.jar: ./target/riemann-0.2.5-SNAPSHOT-standalone.jar
      /riemann.config: ./riemann.config
    

I also copy the config file, which Riemann will look for.

  • The run command for the VM is executed when the VM starts.
    cmdline: /java.so -jar /riemann.jar
    

That’s it. Done with the Capstanfile.

Let’s test it!

>capstan run
WARN [2014-04-13 14:11:22,029] Thread-9 - riemann.core - instrumentation service caught
java.io.IOException: Cannot run program "hostname": error=0, vfork failed
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)
	at java.lang.Runtime.exec(Runtime.java:617)
	at clojure.java.shell$sh.doInvoke(shell.clj:116)
	at clojure.lang.RestFn.invoke(RestFn.java:408)

No luck. It turns out that Riemann is using

(sh "hostname")

which uses vfork to run a child process. On any OS it’s not very efficient to fork just to get the hostname, and on current OSv it simply won’t work. To bypass the problem, I replaced this call with:

(.getHostName (java.net.InetAddress/getLocalHost))

which uses Java’s getHostName method and never leaves the JVM.
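For readers who are more at home in Java than in Clojure interop syntax, here is a minimal sketch of the same lookup. The class name HostnameLookup and the "localhost" fallback are my own assumptions for illustration; neither comes from Riemann.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper class; the name is not part of Riemann.
class HostnameLookup {
    // Ask the JVM's resolver for the local host name instead of
    // forking a "hostname" child process.
    static String localHostName() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            // Assumed fallback for this sketch only.
            return "localhost";
        }
    }

    public static void main(String[] args) {
        System.out.println(localHostName());
    }
}
```

Nothing here spawns a process, which is why the same approach works on OSv.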

Let’s try again:

>capstan run

This time it works, but how do I test it and connect to it?

Let’s use Capstan port forwarding:

capstan run -f 5555:5555 -f 5556:5556

This will forward host ports 5555 and 5556 to the corresponding ports on the OSv VM.
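If you want a quick check that the forwarded ports accept connections before reaching for the Riemann tools, a plain TCP connect is enough. This is only a sketch: the class name PortCheck and the one-second timeout are assumptions, and a successful connect only shows that something is listening on the host side of the forward. It does not speak Riemann's protocol.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper class for this sketch.
class PortCheck {
    // Returns true if a TCP connection to host:port is accepted
    // within timeoutMillis, false on refusal, timeout or any I/O error.
    static boolean isOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The two ports forwarded by the capstan run command above.
        for (int port : new int[] {5555, 5556}) {
            String state = isOpen("localhost", port, 1000) ? "reachable" : "not reachable";
            System.out.println("localhost:" + port + " is " + state);
        }
    }
}
```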

Success :)

Now we can switch to another terminal and run:

riemann-health

to generate traffic for Riemann and

riemann-dash

to launch a Riemann web GUI. Here is how it looks:

[Image: Riemann GUI (riemann-dash)]

Now we’re ready to do further stress testing. If you do find any problem, or have any question, you’re invited to join the osv-dev list and ask, or post an issue to the GitHub repository.


Spinlock-free OS Design for Virtualization

Designing an OS to run specifically as a cloud guest doesn’t just mean stripping out features. There are some other important problems with running virtualized that a conventional guest OS doesn’t address. In this post we’ll cover one of them.

Little spinlocks, big problem

In any situation where code running on multiple CPUs might read or write the same data, typical SMP operating systems use spinlocks. One CPU acquires the lock using an atomic test-and-set operation, and other CPUs that need the data must execute a busy-loop until they can acquire the lock themselves. Can I have the data? No. Can I have the data? No. Can I have the data? No. When an OS runs on bare hardware, a spinlock might just waste a little electricity. OS developers often use other more sophisticated locking techniques where they can, and try to reserve spinlocks for short-term locking of critical items.
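To make the "Can I have the data? No." loop concrete, here is a minimal test-and-set spinlock sketched in Java. It illustrates the general technique only and is not code from any kernel; the names SpinLockDemo, SpinLock and countWithSpinLock are made up for this example.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; not taken from any operating system.
class SpinLockDemo {
    static final class SpinLock {
        private final AtomicBoolean held = new AtomicBoolean(false);

        void lock() {
            // Atomic test-and-set: keep retrying until we are the one
            // that flips the flag from false to true.
            while (!held.compareAndSet(false, true)) {
                // Busy-wait. The waiting CPU does no useful work here.
            }
        }

        void unlock() {
            held.set(false);
        }
    }

    // Runs `threads` workers that each increment a shared counter
    // `incrementsPerThread` times under the spinlock.
    static int countWithSpinLock(int threads, int incrementsPerThread) {
        final SpinLock lock = new SpinLock();
        final int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < incrementsPerThread; n++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithSpinLock(4, 10000));
    }
}
```

Every pass through the empty loop body is a failed attempt to take the lock, so the cost of waiting grows with however long the holder keeps it.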

[Image: OSv hacking at ApacheCon 2014. Caption: Getting some high-performance web applications running on OSv at ApacheCon 2014]

The problem comes when you add virtualization. A physical CPU that holds a spinlock is actually working. The other CPUs in the system, “spinning” away waiting for the lock, are at least waiting for something that’s actually in progress. On a well-designed OS, the lock holder will be done quickly. When the OS is running under virtualization, though, it’s another story. The hypervisor might pause a virtual CPU at times that the guest OS can’t predict. As Thomas Friebel and Sebastian Biemueller described in “How to Deal with Lock Holder Preemption” (PDF):

Lock holder preemption describes the situation when a VCPU is preempted inside the guest kernel while holding a spinlock. As this lock stays acquired during the preemption any other VCPUs of the same guest trying to acquire this lock will have to wait until the VCPU is executed again and releases the lock. Lock holder preemption is possible if two or more VCPUs run on a single CPU concurrently. And the more VCPUs of a guest are running in parallel the more VCPUs have to wait if trying to acquire a preempted lock. And as spinlocks imply active waiting the CPU time of waiting VCPUs is simply wasted.

If the hypervisor pauses a virtual CPU while that VCPU holds a spinlock, you get into the bad situation where the other virtual CPUs on your guest are just spinning, and it’s possible that no useful work is getting done in that guest, only electricity being wasted. Friebel and Biemueller describe a solution to the problem involving a hypercall to complain about the wait. But the OSv solution to the problem is to remove spinlocks from the guest OS entirely.

Why going spinlock-free matters

As a first step, OSv does almost all of its kernel-level work in threads. Threads, which are allowed to sleep, can use lock-based algorithms. They use mutexes, not spinlocks, to protect shared data. The mutex implementation itself, however, has to use a lock-free algorithm. OSv’s mutex implementation is based on a lock-free design by Gidenstam & Papatriantafilou, covered in “LFTHREADS: A lock-free thread library” (PDF).
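To show the difference between sleeping and spinning, here is a rough Java sketch of a mutex whose waiters go to sleep, built only from an atomic flag and a lock-free queue. It is not OSv's mutex and not the LFTHREADS algorithm. It is adapted from the FIFO mutex example in the Java LockSupport documentation, and the names ParkingMutexDemo, ParkingMutex and countWithMutex are made up for this example.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Hypothetical demo class; an illustration, not OSv code.
class ParkingMutexDemo {
    static final class ParkingMutex {
        private final AtomicBoolean locked = new AtomicBoolean(false);
        // Lock-free queue of waiting threads, in arrival order.
        private final Queue<Thread> waiters = new ConcurrentLinkedQueue<>();

        void lock() {
            boolean wasInterrupted = false;
            Thread current = Thread.currentThread();
            waiters.add(current);
            // Sleep while we are not first in line or the flag is taken.
            while (waiters.peek() != current || !locked.compareAndSet(false, true)) {
                LockSupport.park(this);
                if (Thread.interrupted()) {
                    wasInterrupted = true; // remember, but keep waiting
                }
            }
            waiters.remove();
            if (wasInterrupted) {
                current.interrupt(); // restore the interrupt status
            }
        }

        void unlock() {
            locked.set(false);
            // Wake the next waiter, if any. unpark(null) is a no-op.
            LockSupport.unpark(waiters.peek());
        }
    }

    // Same workload as a spinlock test: shared counter, many threads.
    static int countWithMutex(int threads, int incrementsPerThread) {
        final ParkingMutex mutex = new ParkingMutex();
        final int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < incrementsPerThread; n++) {
                    mutex.lock();
                    try {
                        counter[0]++;
                    } finally {
                        mutex.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithMutex(4, 10000));
    }
}
```

A waiter that can't take the lock parks itself and gives up the CPU. If the holder is paused, the others sleep instead of burning cycles.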

One other place that can’t run as a thread, because it has to handle the low-level switching among threads, is the scheduler. The scheduler uses per-cpu run queues, so that almost all scheduling operations do not require coordination among CPUs, and lock-free algorithms when a thread must be moved from one CPU to another.
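The run-queue idea can be sketched in a few lines of Java. This is a toy model under my own assumptions, not the OSv scheduler: threads are represented by plain strings, and the names PerCpuScheduler, Cpu, enqueueLocal, migrateHere and pickNext are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical toy model of per-CPU run queues; not OSv code.
class PerCpuScheduler {
    static final class Cpu {
        // Touched only by the owning CPU, so it needs no synchronization.
        private final ArrayDeque<String> runQueue = new ArrayDeque<>();
        // Lock-free inbox for threads handed over by other CPUs.
        private final ConcurrentLinkedQueue<String> incoming = new ConcurrentLinkedQueue<>();

        // Called by the owning CPU only.
        void enqueueLocal(String thread) {
            runQueue.addLast(thread);
        }

        // Called by any other CPU to move a thread here.
        void migrateHere(String thread) {
            incoming.add(thread);
        }

        // Called by the owning CPU only: pull in migrated threads,
        // then pick the next one to run (null if the CPU is idle).
        String pickNext() {
            String migrated;
            while ((migrated = incoming.poll()) != null) {
                runQueue.addLast(migrated);
            }
            return runQueue.pollFirst();
        }
    }

    public static void main(String[] args) {
        Cpu cpu0 = new Cpu();
        Cpu cpu1 = new Cpu();
        cpu0.enqueueLocal("thread-a");
        cpu0.enqueueLocal("thread-b");
        // Move the first runnable thread from CPU 0 to CPU 1.
        cpu1.migrateHere(cpu0.pickNext());
        System.out.println("cpu1 runs " + cpu1.pickNext());
        System.out.println("cpu0 runs " + cpu0.pickNext());
    }
}
```

The common path, enqueueLocal and pickNext on a CPU's own queue, involves no shared state at all. Only migration goes through the lock-free inbox.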

Lock-free design is just one example of the kind of thing that we mean when talking about how OSv is “designed for the cloud”. Because we can’t assume that a CPU is always running or available to run, the low-level design of the OS needs to be cloud-aware to prevent performance degradation and resource waste.

We’ve been posting benchmarks that show sizeable performance increases running memcached and other programs. If you’re curious about whether OSv can make your application faster, please try it out from the OSv home page or join the osv-dev mailing list.

A Simple Capstan Example

(Updated 14 April 2014: Add new URL for osv-base image.)

Capstan is a new tool for building OSv virtual machine images. If you have worked with other tools for making VMs, you’ll find that Capstan is really simple. It’s actually a lot like Docker, only you get a complete VM out of it and not just a container.

You’re probably used to blogs from sneaky tech evangelists who claim that something is simple and then post some complicated set of instructions. So just to keep your finger off the close button, here’s all you need to do.

  • Add a Make target to build your application as a shared object.

  • Write a short Capstanfile (8 lines, not counting comments).

  • Run Capstan.

That’s all there is to it. Finger off the close button now? Good. Ready?

Let’s make a VM that does something useful, say, serve this article to the entire Internet. Go ahead and git clone Capstan and follow along.

An easy example, plus Makefile work

Just to keep it simple, let’s borrow the short HTTP server example from libevent. The libevent project is a wrapper for convenient event-driven programming, and the library is used in high-profile projects such as Tor, the anonymous communications system, and Chromium, the basis for the Google Chrome web browser.

Best of all, libevent includes an easy-to-use HTTP implementation and sample code for using it. So I’ll copy their web server sample code, tweak it a little to make the web server I need, and set up a simple Makefile.

Those steps are all done in the code for this article, which is at dmarti/http-server.

You’ll need the development package for libevent installed. On my system, it’s called libevent-devel.

Here’s the target to pay attention to:

http-server.so : http-server.c
	$(CC) -o $@ -std=gnu99 -fPIC -shared -levent $<

Yes, that’s right, we’re using -fPIC (position-independent code) and -shared (passed to the linker to make it build a shared library). And http-server.c has a function called main. What’s going on? This is because of the way OSv works: your application on OSv isn’t a conventional ELF executable, but a .so file.

Besides building the actual HTTP server, I’ll also put in a Make target to create the HTML version of this article from the README, because I can. So I type make to build the web content and the web server.

Of course you can expand on this to build as complicated an application and data set as you want. This is just an example to show you Capstan for now.

Step two: Add a Capstanfile

Now it’s time to tell Capstan how to create the virtual machine image. Building it is easy–just run make–so there’s the build section right there. Now we need to tell Capstan what files go into the image, so we populate the files section with the name of our web server (http-server.so), the libevent shared library, and some web content–just the HTML version of this article, plus a favicon.ico file. (For now I’m just copying my development system’s copy of libevent into the image. For real use, I’ll come up with a more consistent way to keep track of build artifacts like this, probably borrowing them from some helpful Linux distribution. Yes, OSv can use libraries built on and for your 64-bit Linux box.)

Easy so far. Now for the cmdline option, which is like Docker’s CMD: the command that gets run when the image starts. The HTTP server just takes its DocumentRoot entry from the command line, so the command comes out as:

cmdline: /tools/http-server.so /www

There’s one more section in the Capstanfile: base. That’s a pre-built OSv image, which is available from Amazon S3. Capstan will automatically download this for you. It lives under .capstan in your home directory.

Putting it all together

Now, when we type capstan build, Capstan invokes make, then creates the VM image. It lives under .capstan in your home directory, at:

.capstan/repository/http-server/http-server.qemu

This is a QCOW2 image, ready to run under KVM or convert to your favorite format. That’s it. Told you it was simple. You can just do capstan run and point your browser to http://localhost:8080/ to see the site.

In an upcoming blog post, I’ll cover the recently added VirtualBox support in Capstan (hint: try -p vbox) and some other fun things you can do.

If you have any Capstan questions, please join the osv-dev mailing list on Google Groups. You can get updates on new OSv and Capstan progress by subscribing to this blog or following @CloudiusSystems on Twitter.

New OSv Blog

Welcome to the OSv blog.

A feed is available, so please subscribe in your RSS reader for updates, or, if you prefer to use Twitter, you can follow @CloudiusSystems there.

We’re running Octopress on GitHub Pages, so that we can use the same easy Git contribution process for the blog as for our code.

Watch for a simple introduction to the Capstan devops tool, coming this week.

If you have any questions about the blog, you can post a comment here, or mail Don Marti.