Improve virtual server performance with zram

TLDR: zram swap increases usable memory to between 102% and 158% of physical RAM. It's easy to enable on Linux kernels 3.14 and later.

Get to it man! At the bottom, I tell you how to set up zram swap. Scroll down if you don't want to hear me go on about the history of virtual memory.

In today's world, the virtual server is king. It's where most newcomers to operations start and where many oldcomers move to. It's so pervasive at this point that most operations folks have to justify NOT using virtual servers.

I'm not going to debate the merits of physical servers vs virtual servers. Many, much more informed people do that already. Instead I'm going to give you a tip on how to improve those virtual servers.

It begins with virtual memory

One of the biggest advances in the 386 architecture was the inclusion of paging. This allowed the OS to describe to the CPU how to use the physical memory available to the system and brought with it things like memory protection.

It also gave us "swap". OS designers quickly realized they could use this new virtual memory system to save memory that wasn't actively being used out to disk and load it back later. If programs wanted 4Mb of RAM but only 2Mb was available, the OS could shuffle data between disk and memory as different programs needed it, allowing all of them to run.

Recall, these were the days when a 2Mb RAM module cost $300 (thanks again mom and dad for buying that for me, 4Mb really did make all the difference), so even the ability to use 2Mb of disk meant doubling the amount of RAM that processes could use.

[Image: the exquisite 386 with 4Mb of RAM Evan had]

We fast forward a bit and find Linux evolving very nicely on the 386 architecture. When you first installed the Slackware disk sets you'd downloaded over AOL (thanks again mom and dad), it would ask: "how much swap space do you want?" The answer to this question was initially puzzling, because at first blush a user has little context to answer it, sort of like asking "how many pounds of tuna does a shark enjoy before lunch?"

And so the people largely coalesced around a simple metric: half of your total RAM. If you had 16Mb of RAM, 8Mb of swap. Most Linux distros began to default to this value and so it was for a while, even as systems began to push 1Gb of RAM.

And continues to virtual servers

The advent of widespread virtualization changed things.

The original design of swap assumed that the disk storing the overflow pages is local: even though it's 1000 times slower than accessing RAM, it's only 1000 times slower, and the latency is typically fixed. This gave systems predictable performance as they began to use swap. But virtual servers had virtual disks, and sometimes those disks were located over a network! To make matters worse, you might be charged for doing IO in the first place! Swap began to look like a burden, something that, when you happened to use it, hurt a lot in terms of performance and pocketbook.

Therefore, most people, including distro makers, began to disable swap entirely if a server was virtual. This seemed like a decent thing to do rather than have users accidentally trip swap and watch their servers grind to a halt.

This left users learning all about Linux Out-Of-Memory (OOM) errors. It's common to see random processes die because the Linux kernel needs to free up pages, so it typically kills the process using the most RAM. And so users set up process monitors to restart things, or watch them crash and start a virtual server with more RAM to try to complete the job.

A better way

As I began thinking about this problem, I kept coming back to the fact that RAM pages are used very inefficiently. They're optimized for individual programs' needs, which typically means loosely packed data that's easy to access on word (4 byte) boundaries. That means they should compress very well.

I knew there had been discussion of page compression in Linux, and that other OSes have this ability, but I hadn't yet seen anyone use or discuss it beyond mild curiosity. Some quick searches turned up some older Linux kernel modules as well as what I'd been looking for: zram, merged into the Linux mainline in 3.14.

Zram

Zram is a Linux kernel module that, when activated, creates block devices that store the data written to them compressed in memory. So if you write a file full of zeros to a zram device, it takes up only a fraction of that space in RAM. (You may be thinking, “this sounds a lot like ‘RAM Doubler’ and other such products from the mid-90s.” You'd be correct; everything old is new again.)

Because zram creates a block device, Linux can use it as a swap device! "But Evan," you may be asking, "you're going to offload pages in memory back to memory? Won't that just create more problems?" I understand your hesitation, I had my own. That's why I employed some science to find out!

First, I wrote this script:
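(A minimal sketch in Python of what such a script can look like; all it needs to do is append 1Mb space-filled strings to a list and print a running count:)

    # Allocate 1Mb space-filled strings until the OOM killer steps in.
    chunks = []
    count = 0
    while True:
        chunks.append(b" " * (1024 * 1024))  # 1Mb of spaces: trivially compressible
        count += 1
        print(count)  # the last number printed is how many Mb we managed to get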

This allocates 1Mb strings in a loop forever. Forever actually means until the OOM killer decides to kill the process, which it does without fail.

On a VM with 1Gb of RAM, running this code with no swap gives a final count of 825. That means the OS was able to allocate 825Mb of RAM to the process before there was simply no more to give out and the process was killed.

Now we enable a zram device with a size of 512Mb, i.e. half our available RAM:
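(Something along these lines, run as root; this assumes the zram module creates a single device at /dev/zram0, which is its default:)

    # Load the module; by default it creates one device, /dev/zram0.
    modprobe zram
    # Size the device at 512Mb.
    echo $((512 * 1024 * 1024)) > /sys/block/zram0/disksize
    # Format it as swap and enable it at a higher priority than any disk swap.
    mkswap /dev/zram0
    swapon -p 100 /dev/zram0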

Running free now reports that our swap is active, so we rerun our script. And, look at this, now it outputs 1308! This means that even though we allowed half of total memory to be used for compressed pages, the system was able to increase the total amount of usable memory by almost 500Mb. That's a 58% increase!

Now, if you look at the script, you'll notice this is a best-case scenario. Pages and pages filled with spaces are trivial to compress. So we can consider this our upper bound, but what about the lower bound?

We write this script:
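(The same sketch as before, with the spaces swapped out for random bytes:)

    import os

    # Allocate 1Mb chunks of random data, which is effectively incompressible.
    chunks = []
    count = 0
    while True:
        chunks.append(os.urandom(1024 * 1024))  # 1Mb of random bytes
        count += 1
        print(count)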

The script allocates 1Mb chunks of memory filled with random data, which should make them basically incompressible. If we run it with no swap, we get a result of 815. It's 10Mb less than before, but it's nearly the same. (Why 10 less? Good question. Perhaps the kernel was doing some duplicate page collapsing before?)

So, now let's turn zram back on. We're looking for a lower bound here, so we'd like the result to stay as close to 815 as possible; anything above 815 seems unlikely, but possible. The result? 846! That means there was some small amount of compression even with these random pages, enough that across the total amount of memory we achieved a savings of about 30Mb. A 2% increase!

The big takeaway is that enabling zram never hurts a system's usable RAM; it only helps. It incurs a mild overhead in the CPU cycles required to perform the compression, but these days that's a pretty minimal hit on most servers, where CPU cycles are rarely the bottleneck.

Have I sold you on using compressed swap yet? Great! So here's how to use it:

  1. Get Linux kernel 3.14 or later. For Debian/Ubuntu, there is one available in our apt repo: deb [arch=amd64] http://apt-public.vektra.io vektra main.
  2. On Ubuntu 14.04, you can now simply do apt-get install zram-config. This will add an init script to switch on zram for swap at 50% of total RAM.
  3. For other distros, you can add the snippet below to your system startup. Built-in ways to enable this are undoubtedly starting to appear, so check for those too.

Simple zram-as-swap script:
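(A sketch, assuming the same single-device default as above; it sizes the zram device at half of total RAM as reported by /proc/meminfo:)

    #!/bin/sh
    # MemTotal is reported in Kb; take half of it.
    half_ram_kb=$(awk '/^MemTotal:/ { print int($2 / 2) }' /proc/meminfo)

    # Create the zram device and size it.
    modprobe zram
    echo "${half_ram_kb}K" > /sys/block/zram0/disksize

    # Format it as swap and enable it ahead of any disk-backed swap.
    mkswap /dev/zram0
    swapon -p 100 /dev/zram0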

Now you've got some new tools for your journey, good luck!