Scaling techniques for cyber emulation

John Floren

11 Aug 2017

Funding statement

Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc. for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. SAND2017-8709 C.

Introduction

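One of the biggest reasons for using Emulytics toolsets for experimentation is that they can handle large-scale virtual environments easily. In this talk, we discuss some of the techniques we have used to achieve scale with minimega, from memory optimizations such as KSM and hugepages to containers, scheduling, networking choices, and adding nodes.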

Scaling

In creating an Emulytics environment, we aim for:

Defining the environment involves striking a balance: pack enough VMs onto each host to accomplish your goal, but not so tightly that nothing can actually get done.

Resources

Our VMs compete for host resources:

If we wanted to be extra careful, we'd allocate VMs something like this:

Oversubscription

Oversubscription is core to Emulytics, but you have to be smart about it:

General rule: start cautious and scale from there.
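
One way to make "start cautious" concrete is to compute oversubscription ratios before launching anything. The following is a minimal sketch, not part of minimega; the `Host` and `VMSpec` types and the ratio ceilings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    cores: int       # physical CPU cores on the host
    memory_mb: int   # physical RAM in MB


@dataclass
class VMSpec:
    vcpus: int       # virtual CPUs given to each VM
    memory_mb: int   # RAM given to each VM in MB


def oversubscription(host: Host, vm: VMSpec, count: int) -> dict:
    """Return CPU and memory oversubscription ratios for `count` identical VMs.

    A ratio above 1.0 means the VMs have been promised more than the host has.
    """
    return {
        "cpu": (vm.vcpus * count) / host.cores,
        "memory": (vm.memory_mb * count) / host.memory_mb,
    }


def max_vms(host: Host, vm: VMSpec, cpu_ratio: float, mem_ratio: float) -> int:
    """Largest VM count that stays within both ratio ceilings."""
    by_cpu = int(host.cores * cpu_ratio // vm.vcpus)
    by_mem = int(host.memory_mb * mem_ratio // vm.memory_mb)
    return min(by_cpu, by_mem)
```

With a ceiling of 1.0 on both resources this gives the "extra careful" allocation; raising the CPU ceiling first and watching how the experiment behaves is one cautious way to scale up from there.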

KSM

Kernel Samepage Merging (KSM) is a capability in the Linux kernel which can improve memory density. It scans memory pages that processes have marked as mergeable (QEMU does this for guest memory by default) and merges any duplicate pages into a single copy-on-write page.

KSM

In a simple experiment, 500 VMs were booted from a Debian disk image. Without KSM, each VM consumed 200 MB of memory. With KSM enabled, each consumed only 80 MB.

Note that if you go in and start different programs on each VM (e.g. Chrome on one, Firefox on another, MS Word on a third, etc.), memory use will balloon and could cause an Out-Of-Memory condition. Use KSM when you know what programs VMs will run and are confident they won't suddenly exceed available memory.
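
The arithmetic behind the experiment and the out-of-memory warning can be worked through in a few lines. This is a sketch only: the per-VM figures come from the experiment above, while the 64 GiB host size is a hypothetical value picked to show a case that fits only while KSM is merging pages.

```python
VM_COUNT = 500
MB_PER_VM_NO_KSM = 200  # measured without KSM (from the experiment above)
MB_PER_VM_KSM = 80      # measured with KSM (from the experiment above)
HOST_MB = 64 * 1024     # hypothetical host with 64 GiB of RAM


def total_mb(vm_count: int, mb_per_vm: int) -> int:
    """Total memory consumed by `vm_count` VMs at `mb_per_vm` each."""
    return vm_count * mb_per_vm


def fits(vm_count: int, mb_per_vm: int, host_mb: int) -> bool:
    """True if the VMs fit in host memory at the given per-VM footprint."""
    return total_mb(vm_count, mb_per_vm) <= host_mb


without_ksm = total_mb(VM_COUNT, MB_PER_VM_NO_KSM)       # 100000 MB
with_ksm = total_mb(VM_COUNT, MB_PER_VM_KSM)             # 40000 MB
saved_fraction = (without_ksm - with_ksm) / without_ksm  # 0.6

# Fits while pages are merged, but not if the VMs diverge and pages unmerge.
fits_merged = fits(VM_COUNT, MB_PER_VM_KSM, HOST_MB)       # True
fits_unmerged = fits(VM_COUNT, MB_PER_VM_NO_KSM, HOST_MB)  # False
```

The gap between the two checks is the risk described above: a host sized for the merged footprint has no headroom if the VMs start running different workloads.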

minimega allows you to turn KSM on and off with the `optimize ksm [true,false]` command.
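
To check how much KSM is actually saving on a host, the kernel exposes counters under `/sys/kernel/mm/ksm`. The sketch below reads the standard `pages_sharing` counter; it is independent of minimega, the function names are made up for illustration, and the 4096-byte page size is an assumption that holds for default AMD64 pages.

```python
from pathlib import Path

KSM_DIR = Path("/sys/kernel/mm/ksm")  # standard Linux sysfs location for KSM


def read_ksm_counter(name: str, ksm_dir: Path = KSM_DIR) -> int:
    """Read one integer KSM counter, e.g. 'pages_sharing' or 'pages_shared'."""
    return int((Path(ksm_dir) / name).read_text().strip())


def ksm_saved_mb(ksm_dir: Path = KSM_DIR, page_size: int = 4096) -> float:
    """Approximate memory saved by KSM, in MB.

    `pages_sharing` counts the additional mappings that point at an already
    shared page, so each one represents a page that did not need its own copy.
    """
    pages_sharing = read_ksm_counter("pages_sharing", ksm_dir)
    return pages_sharing * page_size / (1024 * 1024)
```

Calling `ksm_saved_mb()` on a host before and after diversifying the VM workloads gives a rough view of how quickly the savings erode.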

Hugepages

The default page size on AMD64 is 4 KB. Linux can also back a program's memory with 2 MB pages, known as hugepages.

Using hugepages for VM processes has several advantages:

minimega can run VMs with 2 MB pages using the `optimize hugepages <path>` command, where `<path>` is the path to a mounted hugetlbfs.
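
Hugepages have to be reserved on the host before VMs can use them, so it helps to know how many are needed. The following sketch estimates the count and parses the hugepage fields from `/proc/meminfo` text; the function names are illustrative, and it counts guest RAM only, ignoring any per-process overhead.

```python
import math


def hugepages_needed(vm_count: int, vm_memory_mb: int, hugepage_mb: int = 2) -> int:
    """Number of hugepages required to back guest RAM for `vm_count` VMs."""
    return vm_count * math.ceil(vm_memory_mb / hugepage_mb)


def parse_hugepages(meminfo_text: str) -> dict:
    """Extract HugePages_* counters and Hugepagesize (in kB) from meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.startswith("HugePages_") or key == "Hugepagesize":
            fields[key] = int(value.split()[0])
    return fields


def enough_free(meminfo_text: str, vm_count: int, vm_memory_mb: int) -> bool:
    """True if the host's free hugepages cover the requested VMs."""
    info = parse_hugepages(meminfo_text)
    page_mb = info["Hugepagesize"] // 1024
    needed = hugepages_needed(vm_count, vm_memory_mb, page_mb)
    return info["HugePages_Free"] >= needed
```

On a real host, `meminfo_text` would come from reading `/proc/meminfo`; passing the text in keeps the sketch easy to try with sample data.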

Containers

Full-weight VMs are not always necessary. When possible, we deploy lightweight Linux containers in our experiments for greater density. Containers offer some compelling advantages:

Containers also have some disadvantages:

Smart scheduling

VM scheduling is the process by which the orchestration system distributes VMs to physical hosts. Although automatic scheduling may be possible, it is sometimes useful to add some human intervention for better performance/scale:

We've started adding more advanced scheduling to minimega.
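
To illustrate the general idea of combining automatic placement with human hints, here is a minimal greedy sketch. It is not minimega's scheduler: the `schedule` function, its `pinned` argument, and the use of memory as the only load metric are all assumptions made for the example.

```python
def schedule(vms: dict, hosts: list, pinned: dict = None) -> dict:
    """Assign VMs to hosts, balancing total memory per host.

    vms:    VM name -> memory in MB
    hosts:  list of host names
    pinned: optional VM name -> host name, placed exactly as requested
    Returns host name -> list of VM names.
    """
    pinned = pinned or {}
    load = {h: 0 for h in hosts}
    placement = {h: [] for h in hosts}

    # Human intervention first: honor explicit placements.
    for name, host in pinned.items():
        placement[host].append(name)
        load[host] += vms[name]

    # Then place the remaining VMs, largest first, on the least-loaded host.
    remaining = sorted(
        (n for n in vms if n not in pinned),
        key=lambda n: vms[n],
        reverse=True,
    )
    for name in remaining:
        host = min(hosts, key=lambda h: load[h])
        placement[host].append(name)
        load[host] += vms[name]
    return placement
```

Pinning a heavy VM (a router image, for example) to one host and letting the rest fill in around it is the kind of manual hint that a purely automatic scheduler would not know to apply.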

Networking options

minimega provides three ways to move virtual network traffic over the cluster's physical network. Each option has tradeoffs between performance, realism, and ease of use.

VLAN tagging is fast and easy to set up, and makes hardware-in-the-loop easy, but it means no VLAN tagging inside the experiment. GRE and VXLAN tunnels require more configuration but can allow a more flexible virtual environment or enable federation of clusters over the Internet.
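
One practical consequence of choosing a tunnel is reduced MTU inside the experiment. The sketch below uses the usual header sizes for an IPv4 underlay carrying Ethernet frames; whether these match a particular minimega deployment is an assumption (a GRE key, for instance, adds 4 more bytes), and the function name is illustrative.

```python
# Bytes consumed inside the physical MTU by each encapsulation, assuming an
# IPv4 underlay and no optional header fields.
TUNNEL_OVERHEAD = {
    # The 802.1Q tag sits in the Ethernet header, so the IP MTU is unchanged
    # as long as NICs and switches accept the slightly larger tagged frame.
    "vlan": 0,
    # outer IPv4 (20) + GRE (4) + inner Ethernet header (14)
    "gre": 20 + 4 + 14,
    # outer IPv4 (20) + UDP (8) + VXLAN (8) + inner Ethernet header (14)
    "vxlan": 20 + 8 + 8 + 14,
}


def guest_mtu(physical_mtu: int, encapsulation: str) -> int:
    """Largest MTU guests can use without fragmenting on the physical network."""
    return physical_mtu - TUNNEL_OVERHEAD[encapsulation]


for encap in ("vlan", "gre", "vxlan"):
    print(encap, guest_mtu(1500, encap))
```

On a cluster network with jumbo frames enabled, the same calculation shows that guests can keep a standard 1500-byte MTU even over a tunnel.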

More nodes

Sometimes the only answer is to throw more nodes at the problem. The orchestration tooling needs to be able to scale smoothly as the number of nodes increases.

We launched 1,000 KVM VMs across 1, 2, 4, 8, 16, and 32 nodes, then repeated the experiment with containers. The wall-clock times to boot (in seconds) were:

We also tried launching 1,000 VMs per node and found smooth scaling:

Putting it together

Here's the network at SC 2016:

Putting it together

We used our tooling to generate a model from packet capture, router configs, and other data sources. The model contains about 9,500 endpoints, including laptops and mobile devices.

The model can then be tweaked based on what we want to accomplish.

We could boot every endpoint as a container, but use real Cisco/Juniper/Brocade router images to experiment with topology and router configurations.

We could use minimobile to boot Android devices and emulate the physical locations of the WiFi infrastructure while leaving the other endpoints as containers.

We could 'zoom in' on one subnet, instantiating it as KVM VMs running full-size Linux and Windows, while using containers in the other subnets to generate traffic and provide a milieu.

With enough nodes (perhaps a dozen), we could simply instantiate every VM as a full-size Linux, Windows, or Android device.
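
A quick back-of-the-envelope check shows what "a dozen nodes" implies per node. This sketch uses the roughly 9,500 endpoints from the model; the per-VM memory figure passed in is purely illustrative, since full-size Windows or Android guests would need far more than a minimal Linux VM.

```python
import math

ENDPOINTS = 9500  # approximate endpoint count from the model


def vms_per_node(endpoints: int, nodes: int) -> int:
    """VMs each node must host if endpoints are spread evenly."""
    return math.ceil(endpoints / nodes)


def memory_per_node_mb(endpoints: int, nodes: int, mb_per_vm: int) -> int:
    """Memory each node needs at a given (assumed) per-VM footprint."""
    return vms_per_node(endpoints, nodes) * mb_per_vm


per_node = vms_per_node(ENDPOINTS, 12)              # 792 VMs per node
light = memory_per_node_mb(ENDPOINTS, 12, 200)      # 158400 MB at 200 MB/VM
heavy = memory_per_node_mb(ENDPOINTS, 12, 2048)     # 1622016 MB at 2 GB/VM
```

The spread between the two memory figures is why the per-VM footprint, not the node count alone, decides whether a full-fidelity instantiation is feasible.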

Audience participation time

Are you using the techniques we described?

Are you using any other techniques to achieve large-scale environments?

Any questions / comments?

Thank you
