Overview
NOTE: Part 2 of this Article is available here.
Similar to what I did for farm and zone design, I thought it was time to take a "fresh" look at XenApp scalability and user density (hence "Version 2013"). A lot of the best practices in this area have changed, because we have seen some dramatic hardware improvements and hypervisor advances in the last 2-3 years in particular. And after all, one of the top questions I still get asked is "How many users can I get on a box?". I had one answer 10 years ago when I started, another answer about 3 years ago, and now I have yet another answer as we approach 2014. Much has changed over the last decade, but I think XA scalability in particular has changed the most in recent years, as things like NUMA were introduced and hypervisor CPU schedulers were rewritten. Before we dive in, if you have not read Andy Baker's 3-part series on Hosted Shared Desktop Scalability, I suggest you do - it's really great, and I want to show how much things have changed in the year or so since he wrote those articles. And if you have not read the Project VRC white papers, particularly "Phase 2" where they looked at XA scalability on the three major hypervisors in 2010, I suggest you do - they are very informative, and I will review what has changed since they conducted that testing.
Results
I decided to do this article a little differently. I'll give you the "Results" or "Key Findings" first in Part 1, then explain how we arrived at these results and cover concepts like NUMA and CPU over-subscription in Part 2 (now published!). I recently conducted some pretty intense XenApp scalability testing at a customer, and some of these results come from that engagement (I also added a couple more so that I can speak to one or two other things in this article). Without further ado:
The following conclusions were drawn from the XenApp scalability tests conducted for ABC Company:
- 130 users per physical host; CPU-bound workload. Exactly 130 users or XenApp sessions could be run with an acceptable user experience on each physical host, using the default or "Medium" Login VSI workload on a single piece of ABC's hardware. This is slightly lower than originally planned, but after further investigation into the new default workload used by Login VSI 4.0.x, it is expected given the high activity ratio and intensity of that workload. The workload was also CPU-bound, as expected and as is typical of XA workloads running on 64-bit operating systems.
- A 4 vCPU VM specification resulted in optimal user density. Citrix and ABC Company tested various configurations with 2 vCPUs, 4 vCPUs and even 8 vCPUs to determine the optimal resource allocation for a CPU-intensive VM workload on the hardware chosen by ABC. It was determined that the 4 vCPU VM specification resulted in the highest user density while still providing a good user experience. Citrix believes this is due to the underlying NUMA architecture of the Intel chipset used in the test: each socket (with 8 cores) is divided into two NUMA nodes, each with 4 cores.
- Modest CPU over-subscription resulted in optimal user density while maintaining a good user experience. Citrix also tested different levels of CPU over-subscription, i.e. 1:1, 1.5:1 and 2:1. It was determined that a modest level of CPU over-subscription (1.5:1) resulted in the highest user density while still providing a good user experience. This can be largely attributed to Hyper-Threading being enabled, which typically provides performance gains of 20-30%.
- Small performance benefit from Hyper-Threading because of the high activity ratio. The performance gains from Hyper-Threading were relatively low (~10%) in ABC's environment compared to other customers and industry norms (~20-30%). After further investigation of the default workload that Login VSI has configured, it appears that the "activity ratio" (the amount of active processing time versus idle time) is very high, on the order of ~85%. In most XenApp environments Citrix Consulting has seen, the activity ratio is closer to 50-60% in practice. Citrix believes this high activity ratio negatively affects Hyper-Threading performance on a workload that is already CPU-bound. If ABC chooses, the Login VSI script can be modified to include more idle or sleep time, and the 4 vCPU test can be re-run to likely achieve higher user density and a larger performance benefit from Hyper-Threading.
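To make the "activity ratio" in the last finding concrete, here is a minimal sketch of the arithmetic, assuming the ratio is simply active time divided by total (active plus idle) time. The function names and the 51-second example are hypothetical and only for illustration; they are not taken from the Login VSI workload scripts.

```python
def activity_ratio(active_seconds: float, idle_seconds: float) -> float:
    """Fraction of time a simulated user is actively working.

    Assumption: ratio = active / (active + idle), as described in the text.
    """
    total = active_seconds + idle_seconds
    if total <= 0:
        raise ValueError("total time must be positive")
    return active_seconds / total


def idle_needed(active_seconds: float, target_ratio: float) -> float:
    """Idle (sleep) time required so the same active work hits a target ratio."""
    if not 0 < target_ratio <= 1:
        raise ValueError("target_ratio must be in (0, 1]")
    return active_seconds * (1.0 / target_ratio - 1.0)


if __name__ == "__main__":
    # Hypothetical workload step: 51 s of activity, 9 s of idle time.
    print(f"ratio: {activity_ratio(51, 9):.2f}")
    # Idle time needed to bring the same activity down to the 50-60% range.
    for target in (0.60, 0.50):
        print(f"target {target:.0%}: {idle_needed(51, target):.1f} s idle")
```

The second helper shows why adding sleep time to the script lowers the ratio: the active work stays the same, and only the idle portion grows.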
Very interesting, huh? Now let's talk about some of those results. But first, let me give you some background on the hardware and software we used in these tests, which our Consulting team led at that particular customer.
Test Overview
We used a hardware specification that is quite popular at the moment (and also a very good "sweet spot" for XA workloads in terms of CPU and RAM, I might add): Dell R810s with 2 sockets (8 cores each) and 128 GB of RAM. We used Intel chips, and Hyper-Threading was enabled in all tests. We were running XenApp 6.5 on Windows Server 2008 R2 (both fully patched) with vSphere 4.1 as the hypervisor (fully patched at that level - I think "Update 3a"). We used Login VSI 4.0.x as the load testing tool for all tests. We did not customize the workload; we only used the default "Medium" workload, which is essentially a fairly heavy desktop user (more on that later). We monitored everything and then some. We examined a variety of VM specifications (2 vCPUs vs. 4 vCPUs vs. 8 vCPUs) and CPU over-subscription ratios (using only the physical cores, using all logical processors, and using somewhere in between). We measured the user experience in a variety of ways with the Login VSI Analyzer. The results confirmed a lot of what we on the Consulting team are preaching these days and are also seeing in the field:
- Hyper-Threading still provides performance gains - you should enable it 99.9% of the time when virtualizing XA.
- You should not use only the physical cores when sizing an XA workload, but you should not use all of the logical/virtual cores either.
- You should definitely consider using 4, 6 or 8 vCPUs for your XA VM spec instead of 2 vCPUs.
- Users actually "work" less than you think - activity ratios should reflect the way users really work and interact with applications in the real world.
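The sizing arithmetic behind these points can be sketched as follows. This is a rough illustration only, assuming the over-subscription ratio is expressed as vCPUs to physical cores (so 1:1 uses only the physical cores and 2:1 uses every logical processor with Hyper-Threading on), and assuming the 130-user figure corresponds to the 4 vCPU, 1.5:1 configuration with sessions spread evenly across VMs. The function names are made up for this example.

```python
# Host described in the article: 2 sockets x 8 cores, Hyper-Threading enabled.
SOCKETS = 2
CORES_PER_SOCKET = 8
USERS_PER_HOST = 130  # reported density; assumed to be the 4 vCPU, 1.5:1 case


def total_vcpus(ratio: float) -> int:
    """Total vCPUs allocated across all XenApp VMs on one host.

    Assumption: ratio means vCPUs : physical cores.
    """
    physical_cores = SOCKETS * CORES_PER_SOCKET
    return int(physical_cores * ratio)


def vm_count(ratio: float, vcpus_per_vm: int) -> int:
    """Number of XenApp VMs of a given size that fit the vCPU budget."""
    return total_vcpus(ratio) // vcpus_per_vm


def users_per_vm(ratio: float, vcpus_per_vm: int) -> float:
    """Average sessions per VM if users are spread evenly (an assumption)."""
    return USERS_PER_HOST / vm_count(ratio, vcpus_per_vm)


if __name__ == "__main__":
    for ratio in (1.0, 1.5, 2.0):
        for size in (2, 4, 8):
            print(f"{ratio}:1, {size} vCPU VMs -> "
                  f"{total_vcpus(ratio)} vCPUs, {vm_count(ratio, size)} VMs")
    print(f"users per VM at 1.5:1 with 4 vCPUs: {users_per_vm(1.5, 4):.1f}")
```

Under these assumptions, a 4 vCPU VM also lines up with the 4-core NUMA nodes mentioned in the findings, which is the reason Citrix gives for that size performing best.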
In the next article, we'll really dig into these results and talk about things like how NUMA affects VM specifications, pCPUs vs. vCPUs, why we did not get 192 users per box as we did a year ago on the same hardware, and more. Stay tuned, and I will update this article with the link to "Part 2" when it is published.
Cheers, Nick
Nick Rintalan, Lead Architect, Citrix Consulting