Nick Rintalan and I recently delivered a session at BriForum 2014 in Boston entitled “The Worst Citrix and Microsoft Best Practices of all Time”. It was a fun session, and if you attended BriForum but missed it, I would encourage you to watch the video. If you didn’t attend BriForum, you will have to wait until they post the videos to YouTube, probably sometime early next year.

One of our “honorable mention” slides dealt with LoginVSI and the VSIMax score.  Before I beat up on the VSIMax number, I do need to say that I am a HUGE fan of LoginVSI.  It is a fantastic tool that not only I, but many of my customers find extremely valuable. If you are an enterprise customer with XenApp/RDS users or VDI users, then you absolutely need this tool! I encourage you to check them out. For good reason LoginVSI has become the de facto tool for establishing baseline numbers. More specifically, it seems everyone is using the LoginVSI Medium workload as the baseline test. You can find more information about LoginVSI here.

RDS vs VDI – The numbers are almost the same???

I have been hearing for some time from our internal Citrix folks as well as some other popular scalability reports that RDS desktops and VDI desktops are getting to the point where they scale almost the same.  Basically, if you can get 120 VDI users on a given server, then you probably only get about 140 – 150 RDS users on the same hardware (only 20% improvement with RDS). My initial gut reaction was that there is no way this could be true. The old consensus was that RDS scaled at least 100%+ better than VDI. Then I started running some tests for myself with LoginVSI in my own lab as well as onsite recently at two large customers. To my amazement, the difference between VDI and RDS was only about 20%. At both customers where I ran the tests we had very similar hardware using 48 core HP blades.  Here are the server specs from one of the customers.

  • HP BL 685 G8 Blades
  • Quad – 12 Core AMD CPUs (48 Cores) at 2.2 GHz
  • 512 GB RAM

On a server like this I was expecting to get about 150 Windows 7 VDI VMs and about 300 2008 R2 RDS desktops. So I set things up and kicked off some tests. All of my infrastructure servers were running on other blades so that I could dedicate an entire blade to either VDI or RDS. Here are the VM configurations that I ran on the blade.

RDS Test

VDI Test

  • 150 Windows 7 x64 VMs
  • 2 vCPUs and 3 GB RAM per Windows 7 VM
  • VMs delivered via PVS 7.1 with RAM Cache with Disk Overflow (512 MB for caching)
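
As a rough sanity check on the VDI configuration above, the sketch below totals the vCPUs and RAM that 150 such VMs would claim on the 48 core, 512 GB blade. It is only back-of-the-envelope arithmetic: the class and function names are made up for this example, and it ignores hypervisor overhead, which real sizing would have to account for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    cores: int
    ram_gb: int


@dataclass(frozen=True)
class VmSpec:
    vcpus: int
    ram_gb: int


def footprint(host: Host, vm: VmSpec, vm_count: int) -> dict:
    """Summarize what vm_count identical VMs would ask of one host.

    Assumption: hypervisor overhead is ignored, so the real headroom
    is smaller than what this reports.
    """
    total_vcpus = vm.vcpus * vm_count
    total_ram_gb = vm.ram_gb * vm_count
    return {
        "vcpu_per_core": total_vcpus / host.cores,
        "ram_used_gb": total_ram_gb,
        "ram_free_gb": host.ram_gb - total_ram_gb,
        "max_vms_by_ram": host.ram_gb // vm.ram_gb,
    }


blade = Host(cores=48, ram_gb=512)   # the HP blade from the spec list
win7 = VmSpec(vcpus=2, ram_gb=3)     # the Windows 7 VM from the VDI test

for name, value in footprint(blade, win7, vm_count=150).items():
    print(f"{name}: {value}")
```

With these figures, 150 VMs use 450 GB of the 512 GB, and RAM alone caps the host at roughly 170 VMs before any overhead is subtracted.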

When Medium isn’t truly Medium

I took a close look at the LoginVSI Medium workload and here is what I observed:

  • The workflow lasts approximately 48 minutes before it starts over and loops again
  • During the 48 minute workflow the user is idle less than 8 minutes.  This means the user is active about 85% of the time.
  • About every 4 minutes (12 times per loop) the user runs a VSI_Timer test.
  • The user is constantly running multiple applications at the same time, including watching numerous videos as well as playing a heavy Flash based video game.
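
The percentages in the list above follow from simple arithmetic on the loop length. Here is a minimal sketch of that arithmetic; the function names are mine, not anything LoginVSI exposes.

```python
def active_fraction(loop_minutes: float, idle_minutes: float) -> float:
    """Fraction of one workload loop during which the simulated user is busy."""
    if loop_minutes <= 0 or not 0 <= idle_minutes <= loop_minutes:
        raise ValueError("idle time must fit inside a positive loop length")
    return (loop_minutes - idle_minutes) / loop_minutes


def timers_per_loop(loop_minutes: float, timer_interval_minutes: float) -> float:
    """How many timer runs fit in one loop at a fixed interval."""
    if timer_interval_minutes <= 0:
        raise ValueError("timer interval must be positive")
    return loop_minutes / timer_interval_minutes


# A 48 minute loop with 8 idle minutes is the least active case;
# anything under 8 idle minutes pushes the figure toward 85%.
print(f"{active_fraction(48, 8):.0%}")   # 83%
print(f"{active_fraction(48, 7):.0%}")   # 85%
print(timers_per_loop(48, 4))            # 12.0
```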

You can find more details about the workload here.

After thoroughly reviewing this workload, I noticed three major issues.

1)      The workload is heavy, not medium.

2)      The users are active 85% of the time, which is nowhere near real world.

3)      The VSI_Timer (used in determining VSIMax) is what actually kills performance.

By most customer standards, if you watch the medium workload play out there is no way you can call it a medium workload. It is definitely a heavy workload. Nick Rintalan actually pointed out this fact during some of the XenApp load testing that he performed with LoginVSI. You can read his results here.

After 16+ years in the Citrix computing space, one of the things I like to do at all the customers I visit is actually look at real world utilization on their production XenApp/RDS and VDI systems. I like to look at the active/idle ratio in their environments. You can easily do this by taking a quick look in the Citrix XenApp Console and sorting sessions by Idle Time. Unfortunately, this cannot be done as easily with XenDesktop VMs :(  However, you can write a script to pull this information. What I find over and over is that on average 60% of all users are idle for 1 minute or longer and only 40% of all users have actually been active in the last minute. That is a far cry from the 85% active percentage that the LoginVSI workload uses. The only environments where I have seen greater than 60% of all logged on users active at the exact same time are call center environments. In that type of scenario it is common to see 75% of all sessions having activity within the last minute. However, for the vast majority of environments with office workers, the number of sessions on a server that have had actual keyboard/mouse interaction within the last minute is usually 50% or less. This can make a big difference in calculating the number of users a server can support since in most cases idle users consume very little CPU and CPU is typically our bottleneck.
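
If you can export per-session idle times from your environment, the active/idle ratio is a one-line calculation. The sketch below assumes you already have idle times in seconds as a plain list. How you collect them is up to you, and the sample numbers are invented purely for illustration.

```python
from typing import Iterable


def activity_ratio(idle_seconds: Iterable[float], active_within: float = 60.0) -> float:
    """Share of sessions with keyboard/mouse input inside the last
    `active_within` seconds (default: one minute)."""
    idle = list(idle_seconds)
    if not idle:
        raise ValueError("need at least one session")
    active = sum(1 for seconds in idle if seconds < active_within)
    return active / len(idle)


# Invented sample: idle time in seconds for ten sessions on one server.
sample_idle = [5, 12, 45, 30, 75, 120, 300, 600, 900, 1800]

ratio = activity_ratio(sample_idle)
print(f"active: {ratio:.0%}, idle: {1 - ratio:.0%}")   # active: 40%, idle: 60%
```

Taking this snapshot several times during the business day and averaging the results gives a better picture than a single reading.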

Even after considering that the LoginVSI workload is too active and is actually heavy, I was still a little surprised by my results. I was expecting to go north of 300 users on this 48 core beast of a blade, but I fell far short. Here are my results.

XenApp RDS – 200 User Medium Test – VSIMax 192 Users

As you can see from the graph, when we launched a default LoginVSI Medium 200 user test against one of the 48 Core blades, we hit the VSIMax score at 192 sessions.

As I dug deeper into what was actually consuming so much of the CPU during the test, I kept seeing multiple instances of 7za.exe eating up a lot of CPU time. 7za.exe is the 7-Zip process. Below is a screen shot of a LoginVSI load test where I had two XenApp VMs running LoginVSI with 45 users on each VM. Take a look at the Task Manager screen shots from each of the XenApp VMs below.

You can see from the screen shots that 7-Zip is eating a lot of CPU. As I watched Task Manager for several minutes, there were always a bunch of 7-Zip processes running, and this is what was actually bringing down the server, as the 7-Zip processes were often eating up 30% or more of the server's total CPU. Since I was curious as to why 7-Zip was running, I began digging into how LoginVSI calculates the VSIMax score. What they do is periodically run a function called VSI_Timer and track how long various processes take to complete. One of the functions that they run as part of the timer is a 7-Zip compression operation on a random 5 MB Outlook PST file using heavy compression.

You can find more information on the VSIMax calculation and the VSI_Timer function here.

This VSI_Timer function is actually what brings down the RDS server. If on average each user runs the VSI_Timer every 4 minutes and you put 300 users on an RDS host, that means the VSI_Timer is running 75 times every minute! I don’t know about you, but I have never seen users on a real production XenApp server execute 75 heavy compression zip operations every minute! I think the concept of the VSI_Timer is valuable and an effective way to calculate user experience; however, it probably makes sense to run the VSI_Timer once or twice per host per minute as opposed to 75 times per minute!
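
The 75 per minute figure is just the user count divided by the timer interval. A small sketch of that calculation, plus the inverse (how far apart each user's timers would need to be to hit a target host-wide rate), is below. The helper names are illustrative and not part of LoginVSI.

```python
def timers_per_minute(users: int, interval_minutes: float) -> float:
    """Average host-wide timer runs per minute when every user
    fires one timer every `interval_minutes`."""
    if interval_minutes <= 0:
        raise ValueError("interval must be positive")
    return users / interval_minutes


def interval_for_target(users: int, target_per_minute: float) -> float:
    """Per-user interval (minutes) that yields the target host-wide rate."""
    if target_per_minute <= 0:
        raise ValueError("target rate must be positive")
    return users / target_per_minute


print(timers_per_minute(300, 4))      # 75.0 runs per minute on the host
print(interval_for_target(300, 2))    # 150.0 minutes between runs per user
```

In other words, holding a 300 user host to two timer runs per minute would mean each user runs the timer only once every two and a half hours.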

Since I knew that it was all the VSI_Timer events and not the user applications that actually tipped the server over, I decided to modify the Medium workload to reduce the number of VSI_Timer events being run. For this next test I simply commented out all VSI_Timers except for one. This would mean that each user would run one VSI_Timer every 48 minutes. Since LoginVSI randomizes the segment at which each user’s script begins executing, the VSI_Timer runs would be spread out nicely as users were logged on. I logged the users on over the course of 2500 seconds (about 42 minutes) and I let the test run for an additional 30 minutes after all users logged on. This reduced the number of VSI_Timer operations executing on the host to about 3 per minute during the later stages of the logon phase. Now let’s take a look at the results!

XenApp RDS – 300 User Medium Test with reduced Timers – VSIMax 296 Users

You can see how simply reducing the Timer operations had a major impact on the VSIMax and the number of users supported on the server! We went from 192 users to 296 users! That is an increase of 54%!

I think it is safe to say that RDS with 296 users will scale much better than VDI on this same host! We all know there is no way we can get 296 Windows 7 VMs running on this host. However, I did want to test VDI, so I ran the LoginVSI Medium workload test against the same host using Windows 7 VDI. In order to keep it fair, I ran the same modified workload where each user would only run one VSI_Timer per 48 minute loop. Since going much beyond 150 VMs would simply not be possible as the host would run out of memory, I configured the host with 150 VMs. I ramped up all 150 users over the course of 30 minutes and let all the sessions run for an additional 30 minutes after the last user had logged on. Here are the results.

Windows 7 VDI – 150 User Medium Test with reduced Timers – VSIMax 146 Users

As you can see, we were able to get 146 users before hitting the VSIMax. Basically, this server does comfortably support about 150 Windows 7 VDI sessions as expected.  However, there is no way it would come anywhere near the 300 sessions that RDS supported!
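
To put the VSIMax scores so far side by side, here is a minimal sketch that computes the relative gain between two scores. The scores are the ones reported above; the function name is mine.

```python
def percent_gain(baseline: float, improved: float) -> float:
    """Relative increase of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100


# RDS, default Medium workload vs. Medium with reduced timers.
print(f"{percent_gain(192, 296):.1f}%")   # 54.2%

# VDI vs. RDS on the same blade, both with reduced timers.
print(f"{percent_gain(146, 296):.1f}%")   # 102.7%
```

On these numbers, RDS carried roughly twice as many users as VDI on the same blade.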

LoginVSI 4.1

LoginVSI 4.1 was just released and they have made improvements to both their workloads and the VSI_Timer operation. Instead of the three workloads of Light, Medium and Heavy, they now have four workloads: Task, Office, Knowledge and Power. Essentially, they removed the excessive video playback and Flash video games from the Medium workload and now call it the Office workload. The old Medium workload is now called the Knowledge Worker workload. I applaud them for adding greater granularity and improving the workloads. However, these workloads are still 85% active and a bit heavier than what a typical Office or Knowledge worker in the real world would actually do. Additionally, while improvements have been made to the VSI_Timer, it still runs every 4 minutes for every user and it still uses a heavy 7-Zip operation. For more information on the new LoginVSI 4.1 workloads check out this link.

I decided to put the new LoginVSI 4.1 Office workload to the test. Since I had to use my home lab to execute this test, I no longer had access to that awesome HP blade with 48 cores :(  Here are the specifications of my home lab server upon which I ran the test.

Hardware

  • Intel Quad Core i7 3.4 GHz with Hyper-threading (8 logical cores)
  • 32 GB RAM

Software

  • Windows Hyper-V 2012 R2 Core Edition
  • 2 Windows 2008 R2 VMs running XenApp 6.5 each with the following specs
    • 4 vCPU
    • 14 GB RAM

The server was dedicated to running only the two XenApp VMs. All other infrastructure VMs were hosted on other servers. For all my LoginVSI tests I launched 90 sessions (45 per XenApp VM) using the new Office workload. I took 30 minutes to ramp up and start all 90 sessions and had all of the sessions run for an additional 30 minutes after the last user had logged on.

XenApp RDS – 90 User Office Workload – VSIMax 56 Users

According to the VSIMax score from this test we were able to get 56 users before hitting the limit with the new Office workload. From the chart you can see that at about 63 users we really start to increase our delays.

Since I could see from Task Manager that 7-Zip was still the process that was causing the server to tip over, I decided to modify the VSI_Timers and repeat the test.

What I did for this next test was create a copy of the Office workload and comment out all the VSI_Timer operations. I used the VSI Workload Mashup feature and added 1 Office workload (with all the timers) and 9 modified Office workloads with no timers. With this configuration, 10% of the users would be running the workload with a timer every 4 minutes while 90% of the users would simply execute the full script without the timers. The net result would mean that the VSI_Timer would run on average 2+ times per minute on the host once all users had logged on.
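
The "2+ per minute" estimate can be reproduced with a small calculation over the workload mix. The sketch below models a mashup as a list of (weight, timer interval) pairs, where an interval of None means the timers were commented out. This data model is my own simplification and not how LoginVSI itself stores a mashup.

```python
from typing import List, Optional, Tuple

# (weight in the mashup, minutes between timers, or None for no timers)
Mix = List[Tuple[int, Optional[float]]]


def mashup_timer_rate(total_users: int, mix: Mix) -> float:
    """Average host-wide timer runs per minute for a weighted workload mix."""
    total_weight = sum(weight for weight, _ in mix)
    if total_weight <= 0:
        raise ValueError("mix must have a positive total weight")
    rate = 0.0
    for weight, interval in mix:
        if interval is None:
            continue  # this workload copy runs no timers at all
        users = total_users * weight / total_weight
        rate += users / interval
    return rate


stock_office = [(1, 4.0)]                 # every user runs a timer every 4 minutes
reduced_mix = [(1, 4.0), (9, None)]       # 1 part with timers, 9 parts without

print(mashup_timer_rate(90, stock_office))   # 22.5 per minute
print(mashup_timer_rate(90, reduced_mix))    # 2.25 per minute
```

So the mashup cuts the timer load on the host by a factor of ten while leaving the application workload itself unchanged.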

XenApp RDS – 90 User Office Workload with reduced Timers – VSIMax 79 Users

According to the VSIMax score from this test we were able to get 79 users before hitting the limit. From the chart you can see that at about 83 users we really start to increase our delays. These results are still pretty impressive as we were able to get an additional 23 users on the host simply by modifying the frequency of the VSI_Timer! This is a 41% increase in user density!

The Bottom Line

LoginVSI is a fantastic tool.  They have made great improvements over the years and version 4.1 has many worthwhile enhancements as well. It is for good reason that LoginVSI has become the de facto tool for testing RDS and VDI workloads. I would encourage every enterprise customer to purchase and use LoginVSI as your scalability and testing tool. However, like any tool or piece of software (Citrix included), you must really tweak and tune it if you want to get the most out of it and if you really want to get accurate and real world data.

As for the myth that VDI scales almost as well as RDS, well I think we can officially call that myth busted!!!  Anyone who says RDS only scales 20% better than VDI is dead wrong!!! RDS is still king and in most real world environments will give you 100 – 200% greater user density than VDI!

Cheers,

Dan Allen