Beware the NUMA in 2008R2 SP1 Hyper-V

That is the warning I bring regarding Windows Server 2008 R2 SP1 Hyper-V.

Recently, we updated a client’s server to Windows Server 2008 R2 SP1, and immediately after the reboot, the Windows Server 2003 R2 SP2 Guest VM began to misbehave.

From a user’s point of view, opening a network drive took an inordinate amount of time and, in general, everything ran like molasses.

The Host OS and the Guest OS both showed nominal CPU, RAM and network usage. Disk activity was higher than usual, but not so high as to make the Guest VM grind to a halt.

After re-installing and updating the Hyper-V Integration Tools on the Guest OS, everything began to behave. That is, until the Guest OS was rebooted, at which point it went back to trudging through quicksand.

After many hours of troubleshooting and numerous hot-fixes, the problem was still not resolved. The only thing we tried that worked, albeit temporarily, was installing the Integration Tools on the Guest OS without rebooting the Guest OS.

This of course was not a solution, not by a long shot. Everything seemed to point to the updated Hyper-V drivers not being compatible with 2003 R2 SP2, although we could find no articles warning of such an incompatibility.

After many hours of research failed to enlighten us as to the cause, we discussed rolling back SP1, or perhaps moving the VM to another 2008 R2 system that wasn’t running SP1. Needless to say, either option was time-consuming and unappealing, and would have involved many hours of downtime.

We sat and asked ourselves: what did SP1 do to Hyper-V?
Windows history dictates that a Service Pack largely consists of the accumulated fixes since the RTM or the last Service Pack, without many new features on offer.

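In this instance, though, we found out that SP1 introduced a new feature in Hyper-V called NUMA Spanning. To simplify: on a NUMA (Non-Uniform Memory Access) server, each processor has its own bank of local memory, and memory attached to another processor is slower to reach. Each processor together with its local memory forms a NUMA node. NUMA Spanning lets a Guest VM draw its memory from more than one node, which in theory allows for more scalability, at the cost of slower access to the memory that sits on a remote node. Those who wish to know more can read about it here.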
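To make the idea concrete, here is a toy Python model of placing a VM's memory on NUMA nodes with spanning on and off. It is only a sketch for illustration: the function `place_vm` and its parameters are invented here, and it makes no claim to match how Hyper-V actually allocates memory.

```python
def place_vm(free_mb_per_node, vm_mb, spanning_enabled):
    """Toy model: return {node_index: megabytes} for a VM, or None if it cannot start.

    free_mb_per_node is a list of free memory per NUMA node (hypothetical input).
    """
    # Best case: one node can hold the whole VM, so all memory access is local.
    for node, free in enumerate(free_mb_per_node):
        if free >= vm_mb:
            return {node: vm_mb}

    # No single node is big enough. Without spanning, the VM cannot be placed.
    if not spanning_enabled:
        return None

    # With spanning, the VM may be split across nodes if the total is sufficient.
    if sum(free_mb_per_node) < vm_mb:
        return None

    placement = {}
    remaining = vm_mb
    for node, free in enumerate(free_mb_per_node):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            placement[node] = take
            remaining -= take
    return placement


if __name__ == "__main__":
    free = [4096, 4096]  # two nodes with 4 GB free each (made-up numbers)
    print(place_vm(free, 6144, spanning_enabled=True))
    print(place_vm(free, 6144, spanning_enabled=False))
    print(place_vm(free, 2048, spanning_enabled=False))
```

In this model, a 6 GB VM on a host with two 4 GB nodes only fits when spanning is enabled, and part of its memory then lives on a second node. A 2 GB VM fits on a single node either way.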

Here is where things got interesting. For starters, NUMA Spanning is enabled by default, but it depends on certain hardware capabilities being available on the server to work correctly. We found no indication of what Hyper-V does if the hardware is not NUMA capable. We suspect this is what was causing the problem.

We disabled NUMA spanning within Hyper-V, which required a Host reboot. After the system was back up and running, everything started to work correctly.

Moral of the Story?
Disable NUMA Spanning in Hyper-V immediately after upgrading 2008 R2 to SP1, and leave it off until you are certain the Host Server supports it.
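As a starting point for that check, the sketch below asks Windows how many NUMA nodes it sees, using the Win32 function `GetNumaHighestNodeNumber` through Python's `ctypes`. The helper name `numa_node_count` is our own, and we are assuming that what the Host OS reports reflects the physical hardware. A result of 1 suggests the server presents no NUMA topology at all. It is not a full compatibility test.

```python
import ctypes
import sys


def numa_node_count():
    """Return the number of NUMA nodes Windows reports, or None when not on Windows."""
    if sys.platform != "win32":
        return None

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    highest = ctypes.c_ulong(0)

    # GetNumaHighestNodeNumber writes the highest node index; nodes count from 0.
    if not kernel32.GetNumaHighestNodeNumber(ctypes.byref(highest)):
        raise ctypes.WinError(ctypes.get_last_error())
    return highest.value + 1


if __name__ == "__main__":
    count = numa_node_count()
    if count is None:
        print("Not running on Windows; nothing to check.")
    else:
        print("NUMA nodes reported by the OS:", count)
```

Run it on the Host OS, not inside a Guest VM, since the guest may not see the host's real topology.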