A colleague of mine, Nick Rintalan, wrote an excellent blog post on whether one should virtualize Provisioning Server. If you have not read it, then I recommend you check it out here: http://blogs.citrix.com/2011/02/22/should-i-virtualize-provisioning-server/
I am an avid supporter of virtualizing Provisioning Server. Servers today are just too powerful to run things on bare metal. Let’s face it, the average enterprise 1U rack or blade server has at least 2 sockets, 8+ cores and tons of RAM. Running a single instance of Windows on one of these servers is a complete waste of resources. I have often heard people say that you can only get 300 – 500 targets on a virtual PVS. I have also seen customers who think they have to place a virtual PVS on each hypervisor host along with the target devices, so that the number of targets per PVS is limited and all traffic remains on the physical host and virtual switch. I would like to finally debunk these myths and let you know that PVS virtualizes just fine, even in large environments, and that you do not have to treat it any differently than other infrastructure servers that run as virtual machines. To back that up, I would like to share a real world customer example showing that Provisioning Server is an excellent candidate to virtualize for all environments, even large ones.
Real World Example
First, for the sake of privacy I will not be disclosing the name or any other identifying information about the customer, but I will provide some basic technical details about the virtual PVS deployment, as well as some data showing how well virtual PVS is scaling.
- Hypervisor is VMware 4.1 for both servers and Windows 7 desktops
- PVS 5.6 SP1 is virtualized on same hosts along with other supporting server VMs
- Windows 7 32-bit is virtualized on separate VMware hosts dedicated to desktop VMs
- All hosts are connected with 10Gb Ethernet
- There are 5000+ concurrent Windows 7 virtual machines being delivered by virtual PVS
- All virtual machines (both Windows 7 and PVS) have one NIC. PVS traffic and production Windows traffic traverse the same network path
- Each virtual PVS was configured as a Windows 2008 R2 VM with 4 vCPUs and 40 GB RAM
- The PVS Store is a local disk (VMDK) unique to each PVS server
- Each Windows 7 VM has a unique hard disk (VMDK) that hosts the PVS write cache
So, how many target devices do you think we could successfully get on a single virtual PVS: 300, 500, 1000? Well, check out the screen shot below, which was taken in the middle of the afternoon during peak workload time:
As you can see, on the first three PVS servers, we are running almost 1500 concurrent target devices. How is performance holding up from a network perspective? The console screen shot was taken from PVS 01 so the task manager data represents 1482 connected target devices. From the task manager graph, you can see that we are averaging 7% network utilization with occasional spikes of 10%. Since this is a 10Gb interface, that means sustained networking for 1500 Windows 7 target devices is 700 – 1000 Mb/s. In theory, a single 1 Gig interface would support this load.
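To make the arithmetic above easy to re-run with your own numbers, here is a minimal Python sketch. It only converts the Task Manager utilization percentages quoted above into Mb/s and divides by the connected target count; the function name `utilization_to_mbps` is my own invention for illustration, and the per-target figure is a simple average, not a measured value.

```python
def utilization_to_mbps(link_gbps: float, utilization_pct: float) -> float:
    """Convert a utilization percentage on a link of the given speed into Mb/s."""
    return link_gbps * 1000 * utilization_pct / 100


# Figures quoted in the text: a 10Gb interface, 7% average, 10% spikes,
# and 1482 connected target devices on PVS 01.
connected_targets = 1482
average_mbps = utilization_to_mbps(10, 7)
spike_mbps = utilization_to_mbps(10, 10)

# Simple average per target; real traffic is bursty, so treat this as a rough guide.
per_target_average_mbps = average_mbps / connected_targets

print(f"average: {average_mbps:.0f} Mb/s, spikes: {spike_mbps:.0f} Mb/s")
print(f"roughly {per_target_average_mbps:.2f} Mb/s per target on average")
```

Note that the 10% spikes work out to the full line rate of a 1 Gig interface, which is why the 1 Gig statement above is "in theory" only.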
How about memory and CPU usage? Check out the task manager screen shot below, taken from PVS 01 at the same time as the previous screen shot:
From a CPU perspective, you can see that we are averaging 13% CPU utilization with 1482 concurrently connected target devices. Memory usage is only showing 6.74 GB committed; however, take note of the Cached memory (a.k.a. System Cache or File Cache). The PVS server has used just under 34 GB RAM for file caching. This extreme use of file cache is due to the fact that there are multiple different Windows 7 VHD files being hosted on the PVS server. Windows will use all available free memory to cache the blocks of data being requested from these VHD files, thus reducing and almost eliminating the disk I/O on the virtual PVS servers.
At 1500 active targets, these virtual PVS servers are not even breaking a sweat. So how many target devices could one of these virtual PVS servers support? My customer has told me that they have seen it comfortably support 2000+ with plenty of head room still available. It will obviously take more real world testing to validate where the true limit will be, but I would be very comfortable saying that each one of these virtual PVS servers could support 3000 active targets.
It is important to note that this customer is very proficient in all aspects of infrastructure and virtualization. In fact, in my 13+ years of helping customers deploy Citrix solutions, the team working at this customer is by far the most proficient that I have ever worked with. They properly designed and optimized their network, storage and VMware environment to get the best performance possible. While I will not be able to go into deep details about their configuration, I will provide some of the specific Citrix/PVS optimizations that have been implemented.
There are Advanced PVS Stream Service settings that can be configured on the PVS server. These settings control the threads and ports available to service target devices. For an optimal configuration, it is recommended that there be at least one thread per active target device. For more information on this setting, refer to Thomas Berger’s blog post: http://blogs.citrix.com/2011/07/11/pvs-secrets-part-3-ports-threads/
For this customer we increased the port range so that 58 UDP ports were used along with 48 threads per port for a total of 2784 threads. Below is a screen shot of the settings that were implemented:
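The thread math above is simple enough to check in a few lines. The sketch below multiplies ports by threads per port and compares the result with the "one thread per active target" guideline; the helper names `total_stream_threads` and `covers_targets` are hypothetical and are not part of any PVS tooling.

```python
def total_stream_threads(udp_ports: int, threads_per_port: int) -> int:
    """Total Stream Service threads is the port count times threads per port."""
    return udp_ports * threads_per_port


def covers_targets(total_threads: int, active_targets: int) -> bool:
    """Check the guideline of at least one thread per active target device."""
    return total_threads >= active_targets


# Values from this customer's configuration: 58 UDP ports, 48 threads per port.
total_threads = total_stream_threads(58, 48)

# Target counts mentioned in this post: the observed 1482 and the 3000 estimate.
for targets in (1482, 2000, 3000):
    status = "meets" if covers_targets(total_threads, targets) else "falls short of"
    print(f"{total_threads} threads {status} one thread per target at {targets} targets")
```

If you plan for a higher target count per server, the same check tells you whether the port range or threads per port would need to grow to keep one thread per target.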
We also gave 3GB RAM to each Windows 7 32-bit VM. It is important to make sure that you do not starve your target devices for memory. In the same way that the PVS server will use its System Cache RAM so that it does not have to keep reading the VHD blocks from disk, the Windows target devices will use System Cache RAM so that they do not have to keep requesting the same blocks of data from the PVS server. Too little RAM in the target means that the network load on the PVS server will increase. For more detailed information on how System Cache memory on PVS and target devices can affect performance, I highly recommend you read my white paper entitled Advanced Memory and Storage Considerations for Provisioning Services: http://support.citrix.com/article/ctx125126
Based on this real world example, you should not be afraid to virtualize Provisioning Server. If you are virtualizing Provisioning Server make sure you take the following into consideration:
- Give plenty of RAM to both PVS and your target devices
- Give the proper number of vCPUs to the PVS VM and tune the ports and threads
- Plan on supporting about 1000 active targets per 1 Gig of network throughput
- Use 10 Gig networking infrastructure, if you can.
- If you are going to use NAS (CIFS) for the PVS Store, then read and follow the instructions in my blog: http://blogs.citrix.com/2010/11/05/provisioning-services-and-cifs-stores-tuning-for-performance/
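As a rough planning aid for the network guideline in the list above, here is a small Python sketch. It assumes the rule of thumb of about 1000 active targets per 1 Gig of throughput; the function names, the `spare_servers` parameter and the example of 1500 targets per server are my own illustrative assumptions, not official sizing guidance.

```python
import math


def required_gbps(active_targets: int, targets_per_gbps: int = 1000) -> float:
    """Estimate streaming throughput from the ~1000 targets per 1 Gig rule of thumb."""
    return active_targets / targets_per_gbps


def servers_needed(active_targets: int, targets_per_server: int,
                   spare_servers: int = 1) -> int:
    """Servers required at a chosen per-server target count.

    spare_servers is a hypothetical redundancy allowance (N+1 by default).
    """
    return math.ceil(active_targets / targets_per_server) + spare_servers


# Example: 5000 concurrent targets, planning for 1500 targets per virtual PVS.
print(f"aggregate throughput to plan for: {required_gbps(5000):.1f} Gb/s")
print(f"virtual PVS servers including one spare: {servers_needed(5000, 1500)}")
```

Adjust `targets_per_server` and `spare_servers` to match your own testing and availability requirements.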
It is also important that all of our other best practices for PVS and VDI are not overlooked. In this real world example, we also followed and implemented the applicable best practices as defined in the two links below:
- Provisioning Services 5.6 Best Practices
- Windows 7 Optimization Guide
As a final note before I wrap up, I would like to address XenServer, as I know that I will get countless questions since this real world example used VMware. There have been discussions in the past that seem to suggest that XenServer does not virtualize PVS very well. However, it is important to note that XenServer has made some significant improvements over the last year, which enable it to virtualize PVS just fine. If you are using XenServer, then make sure you do the following:
- Use the latest version of XenServer: 5.6 SP2 (giving Dom0 4 vCPUs)
- Use IRQBalance. You can find more details on it here:
- Use SR-IOV, if you can (but not required). You can find more details on it here:
I hope you find that this real world example is useful and helps to eliminate some of the misconceptions about the ability to virtualize PVS.