I really love my job. I work closely with Citrix partners (usually over a period of months) to build desktop and application virtualization reference architectures, conduct performance testing, and determine how each solution scales. When we come to the end of a validation, it's incredibly rewarding to understand the nuances of a configuration and see the final results published in our work. And it is particularly satisfying when our tests reveal that new software and hardware releases can sustain substantial increases in user density.
Perhaps this is why I am delighted with the release of a new Cisco® Validated Design (CVD), which documents a reference architecture and the scalability testing we conducted last month. The architecture combines Citrix® XenDesktop®, Cisco UCS blade servers, NetApp storage, and VMware vSphere to build a ready-to-deploy desktop virtualization solution for the enterprise. Under test, it showed excellent scalability with a heavy mixed workload of 2,000 seats spanning hosted virtual desktops (VDI) and hosted shared desktops (RDS). We capped our tests at 2,000 users, but the architecture can scale beyond that to meet future growth needs. If you want to read the full CVD, Cisco has published it here. This blog post highlights and summarizes our design and test results.
Building a Scalable Architecture
The reference architecture is based on a turnkey FlexPod infrastructure (in this case, the FlexPod Data Center with VMware vSphere 5.1 design). This enterprise solution combines a new generation of hardware and software components: Citrix XenDesktop 7.1 and Provisioning Services 7.1, VMware ESXi 5.1, Cisco UCS B200 M3 blades with Intel® Xeon® E5-2680 v2 ("Ivy Bridge") processors, and NetApp FAS3240 shared storage running the clustered Data ONTAP® 8.2 storage OS. For high availability, the server architecture uses an N+1 design spanning two chassis, with two Cisco UCS blades for infrastructure servers, four blades for the VDI workload, and eight blades for the RDS hosted shared desktop workload, as shown in the diagram below.
The turnkey nature of the infrastructure creates a compact, affordable, and flexible platform for Citrix XenDesktop 7.1 software. XenDesktop 7.1 unifies the functionality of earlier XenApp and XenDesktop releases, providing a single management framework and common policies to deliver both hosted virtual desktops and hosted shared desktops. The CVD documents a design that we stress-tested and validated with 550 hosted virtual desktops (running Windows 7) and 1,450 hosted shared desktops (using Microsoft Windows Server 2012). NetApp FAS3240 storage provided easily managed, scale-out shared storage.
Test Methodology
To understand the methodology we used for testing, you can watch this short, 3-minute video.
We set up the test configuration in the NetApp lab in Research Triangle Park, North Carolina. During each test, we captured metrics across the end-to-end virtual desktop lifecycle: virtual desktop boot and user desktop login (ramp-up), the simulated user workload (steady state), and user log-offs. To generate load in the environment, we used Login VSI 3.7 software from Login VSI Inc. (http://www.loginvsi.com). This load-simulation software generates desktop connections, simulates workloads, and tracks application responsiveness.
To begin testing, we started the performance monitoring scripts to record resource consumption for the hypervisor, virtual desktops, storage, and load-generation software. At the start of each test, we took the desktops out of maintenance mode, started the virtual machines, and waited for them to register. Login VSI launchers then initiated desktop sessions and started user logins, which constitutes the ramp-up phase. Once all users were logged in, the steady-state portion of the test began, in which Login VSI ran the application workload (the default Login VSI medium workload). The medium workload represents the office productivity tasks of a "normal" knowledge worker and includes Microsoft Office, Internet Explorer with Flash, PDF viewing, and printing.
Login VSI loops through specific operations and measures response times at regular intervals. VSImax response times determine the maximum number of users that the test environment can support before performance degrades consistently. Because baseline response times can vary with the virtualization technology in use, a dynamically calculated threshold based on weighted measurements provides greater accuracy for comparisons between vendors. For this reason, we configured the Login VSI software to calculate and report the dynamic VSImax response time.
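Login VSI's exact weighting scheme is the vendor's own, but the idea behind a dynamically calculated threshold is easy to illustrate. The short Python sketch below shows one way such a baseline-relative cutoff could work; the quartile-based baseline, the 1.25 multiplier, and the 500 ms headroom are illustrative assumptions, not Login VSI's actual formula.

```python
# Illustrative sketch of a dynamically calculated response-time threshold,
# in the spirit of Login VSI's dynamic VSImax. The sampling rule and the
# constants below are assumptions for illustration only.

def dynamic_threshold(baseline_samples, multiplier=1.25, headroom_ms=500):
    """Derive a response-time cutoff from the environment's own baseline.

    baseline_samples: response times (ms) measured under light load.
    """
    ordered = sorted(baseline_samples)
    # Weight the fastest quartile most heavily so a single slow outlier
    # cannot inflate the baseline.
    fastest = ordered[: max(1, len(ordered) // 4)]
    baseline_ms = sum(fastest) / len(fastest)
    return baseline_ms * multiplier + headroom_ms

def vsimax_reached(avg_response_ms, threshold_ms):
    # VSImax is the session count at which average response times remain
    # above the derived threshold.
    return avg_response_ms > threshold_ms

if __name__ == "__main__":
    idle = [710, 695, 760, 705, 820, 900, 715]   # hypothetical samples
    cutoff = dynamic_threshold(idle)
    print(f"cutoff = {cutoff:.0f} ms")            # ~1369 ms here
    print("VSImax reached:", vsimax_reached(1460, cutoff))
```

The point is that the pass/fail line is derived from each environment's own baseline rather than from a fixed value, which is what makes comparisons across virtualization stacks fairer.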
We ran both single-server and multiple-server scalability tests, performing three functional test runs during each test cycle to verify the consistency of our results. The test phases included:
- Determining single-server scalability limits. This phase calculated Login VSImax for each scenario (VDI or RDS) on a single blade. In each case, user density was scaled up until Login VSImax was reached, which occurs when CPU utilization hits 100%.
- Validating single-server scalability at the recommended maximum density with VDI and RDS loads. The recommended maximum load for a single blade occurs when CPU usage spikes at 90-95%.
- Validating multiple-server scalability for each workload cluster. Using multiple blade servers, we established a separate baseline for the VDI and RDS workloads, testing each workload group independently.
- Validating multiple-server scalability with a combined workload. After determining the baseline for each workload group, we combined the workloads to achieve a full-scale, mixed-workload result.
Key Results
Phase 1: Single-Server Scalability Testing
In the first series of tests, the single-server scalability tests, we determined Login VSImax for hosted virtual desktops (VDI) and hosted shared desktop sessions (RDS) on a single blade. The test configurations are summarized below:
| Parameter | Hosted Virtual Desktops | Hosted Shared Desktops |
| --- | --- | --- |
| Virtual CPUs | 1 vCPU | 5 vCPUs |
| Memory | 1.5 GB | 24 GB |
| vDisk size | 40 GB | 60 GB |
| Write cache size | 6 GB (thin) | 50 GB (thin) |
| Virtual NICs | 1 virtual VMXNET3 NIC | 1 virtual VMXNET3 NIC |
| vDisk OS | Microsoft Windows 7 Enterprise (x86) | Windows Server 2012 |
| Additional software | Microsoft Office 2010, Login VSI 3.7 | Microsoft Office 2010, Login VSI 3.7 |
| Test workload | Login VSI "Medium" workload | Login VSI "Medium" workload |
To find Login VSImax for hosted virtual desktops on a single blade, we ran a test workload of 220 users in Windows 7 SP1 sessions with the medium workload (including Adobe Flash content). As Figure 1 shows, Login VSImax was reached at 202 users.
Note: The hosted virtual desktops were configured with the Windows Desktop Basic theme. By default, Windows 7 enables the Aero Glass theme, which translates to roughly 30% lower user density because it consumes more host CPU resources. XenDesktop 7.x includes a software virtual GPU (vGPU) as part of the Virtual Desktop Agent, which can be used for Windows application compatibility purposes. When Windows detects the vGPU, it enables GPU hardware acceleration, causing XenDesktop to render Aero (and DirectX) in software, which consumes additional host server CPU resources. Unless desktop composition (e.g., Aero DWM) is required, we recommend not using Aero in order to reach the solution's full scalability potential. The Windows Desktop Basic theme can be configured using the following policy:
- User Configuration \ Policies \ Administrative Templates \ Control Panel \ Personalization \ Load a specific theme
- Theme file path: "C:\Windows\Resources\Ease of Access Themes\basic.theme"
- User profiles must be recreated after enabling or disabling this policy setting
Figure 1: Hosted Virtual Desktops, Single-Server Results
We then examined the scalability of hosted shared desktops delivering Windows Server 2012 desktop sessions on a single blade. We launched 240 user sessions to test the scalability of hosted shared desktops on a single Cisco UCS B200 M3 blade, and achieved a VSImax score of 211 users (Figure 2).
Figure 2: Hosted Shared Desktops, Single-Server Results
With all test configurations optimally tuned, CPU resources are usually the limiting factor in scalability, as we confirmed in the VDI and RDS single-server tests. Compared to past tests on previous-generation Intel Xeon "Sandy Bridge" processors, the same Cisco UCS B200 M3 hardware with dual 10-core 2.7 GHz Intel Xeon E5-2680 v2 "Ivy Bridge" processors supported approximately 25% greater user density. Thus, Cisco blades with the new processors enable an ultra-compact solution that can support up to 2,000 users.
Phase 2: Single-Server Scalability at Recommended Maximum Density
After testing single-server scalability, the next step was to determine the recommended maximum capacity for a single blade. The recommended maximum density occurs when CPU utilization spikes at 90-95%.
Under a VDI workload, the recommended maximum density is 180 hosted virtual machines on a single Cisco UCS B200 M3 blade server with dual Intel Xeon E5-2680 v2 processors and 384 GB of RAM. Running Microsoft Windows 7 (32-bit), each virtual machine was configured with 1.5 GB of RAM and 1 vCPU. Figure 3 shows how VSImax scales under this load, along with the CPU, memory, and network metrics. (Storage metrics are discussed in a separate section near the end of this blog.)
Figure 3: Hosted Virtual Desktops, Single-Server Results Under Recommended Load
For hosted shared desktops (RDS), the recommended maximum load is 220 users per Cisco UCS B200 M3 blade (with dual E5-2680 v2 processors and 256 GB of RAM). On each blade we configured eight Windows Server 2012 virtual machines, each with 5 vCPUs and 24 GB of RAM. Figure 4 shows how VSImax scales under this load, along with the CPU, memory, and network metrics.
Figure 4: Hosted Shared Desktops, Single-Server Results Under Recommended Load
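A quick back-of-the-envelope check shows why these recommended densities fit the hardware described above. The sketch below uses only figures from the text (dual 10-core processors, 180 VDI VMs at 1 vCPU and 1.5 GB each, eight RDS VMs at 5 vCPUs and 24 GB each); the vCPU-to-core ratio is plain arithmetic, not a Cisco sizing rule.

```python
# Resource footprint implied by the recommended maximum densities above.
physical_cores = 2 * 10  # dual 10-core Intel Xeon E5-2680 v2 per blade

# VDI blade: 180 Windows 7 VMs, 1 vCPU / 1.5 GB each, 384 GB installed
vdi_vcpus, vdi_mem_gb = 180 * 1, 180 * 1.5
print(f"VDI: {vdi_vcpus / physical_cores:.0f}:1 vCPU:core ratio, "
      f"{vdi_mem_gb:.0f} of 384 GB RAM")

# RDS blade: 8 Windows Server 2012 VMs, 5 vCPUs / 24 GB each, 256 GB installed
rds_vcpus, rds_mem_gb = 8 * 5, 8 * 24
print(f"RDS: {rds_vcpus / physical_cores:.0f}:1 vCPU:core ratio, "
      f"{rds_mem_gb} of 256 GB RAM")
```

That works out to 270 of 384 GB used on a VDI blade and 192 of 256 GB on an RDS blade, leaving memory headroom in both cases and reinforcing that CPU, not memory, is the binding constraint.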
Phase 3: Workload Cluster Scalability Testing
In this phase, we created separate workload clusters, one for VDI and one for RDS, and tested them independently. In an N+1 configuration, if one blade is unavailable due to scheduled maintenance or unplanned downtime, the remaining servers can absorb the extra load. Our goal was to create a highly available architecture so that the design could sustain acceptable performance even if a single blade failed.
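Here is a small sanity check of that goal using only numbers reported in this post: 550 VDI users on four blades, 1,450 RDS users on eight, and the Phase 1 VSImax scores of 202 and 211. Treating "per-blade load after one failure stays below single-server VSImax" as the pass condition is my own framing of the design intent.

```python
import math

# N+1 check: if one blade fails, does per-blade load stay under the
# measured single-server VSImax (202 VDI, 211 RDS)?
clusters = {
    # name: (users, blades, recommended max/blade, VSImax/blade)
    "VDI": (550, 4, 180, 202),
    "RDS": (1450, 8, 220, 211),
}

for name, (users, blades, rec_max, vsimax) in clusters.items():
    normal = math.ceil(users / blades)          # all blades healthy
    degraded = math.ceil(users / (blades - 1))  # one blade lost
    verdict = "OK" if degraded <= vsimax else "OVER"
    print(f"{name}: {normal}/blade normally, {degraded}/blade after a "
          f"failure (recommended {rec_max}, VSImax {vsimax}) -> {verdict}")
```

After a failure, the VDI cluster lands at 184 users per blade (just over the recommended 180 but well below the VSImax of 202), and the RDS cluster at 208 (still under the recommended 220), so both clusters stay within their measured single-server ceilings.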
To support a total of 550 VDI users, we used four Cisco UCS B200 M3 blades, configured with virtual machines as in the single-server test at recommended maximum load (1 vCPU and 1.5 GB of RAM per VM). For the VDI workload cluster test, Figure 5 shows VSImax along with the CPU, memory, and network metrics for a representative blade.
Figure 5: Hosted Virtual Desktops, Multiple-Server/Workload Cluster Baseline
To provide 1,450 hosted shared desktops, we used eight Cisco UCS B200 M3 blades, configured as in the single-server test at recommended maximum load (each blade was configured with eight Windows Server 2012 virtual machines, and each VM was allocated 5 vCPUs and 24 GB of RAM). Again, our goal was to design a configuration that could tolerate a single blade failure and continue to support 1,450 RDS users. Figure 6 shows VSImax and the metrics collected for a representative single blade.
Figure 6: Hosted Shared Desktops, Multiple-Server/Workload Cluster Baseline
Phase 4: Full-Scale Mixed Workload Testing
In this final phase of testing, we validated the full-scale solution by launching Login VSI sessions against both the VDI and RDS clusters simultaneously. Cisco's test protocol requires that all sessions be launched within 30 minutes and that all launched sessions become active within 32 minutes.
Our validation tests imposed aggressive boot and login-storm scenarios with the full 2,000-seat mixed desktop workload. All VDI and RDS VMs booted and registered with the XenDesktop 7.1 Delivery Controllers in under 15 minutes, proving how quickly this desktop virtualization solution could become available after a cold start. Our test then simulated a login storm for all 2,000 simulated users, and every user connected and began running workloads (reaching steady state) within 30 minutes without exhausting CPU, memory, or storage resources.
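For a sense of the launch rate this implies, the arithmetic behind starting 2,000 sessions inside the 30-minute protocol window is simple (assuming a uniform launch rate, which is a simplification of how the Login VSI launchers actually pace sessions):

```python
# Implied launch rate for the login storm: 2,000 mixed sessions within
# the 30-minute window required by Cisco's test protocol.
sessions, window_min = 2000, 30
rate_per_sec = sessions / (window_min * 60)
print(f"{rate_per_sec:.2f} sessions/second "
      f"(~{sessions // window_min} logins per minute sustained)")
```

That is just over one login per second, or about 66 logins per minute, sustained across the combined VDI and RDS clusters.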
Figure 7 shows the results of the full-scale, mixed-workload tests, including graphs of VSImax and the CPU, memory, and network metrics collected on representative VDI and RDS servers.
Figure 7: Full-Scale, Mixed-Workload Scalability Results
Storage Performance
The storage configuration included a two-node NetApp FAS3240 cluster running clustered Data ONTAP 8.2 with four shelves of 15K RPM 450 GB SAS drives. Each node was configured with a 512 GB Flash Cache PCI card. User home directories were configured on CIFS shares, while the PVS write caches for the VDI and RDS workloads were hosted on NFS volumes.
Because the testing took place at the NetApp facility, we captured extended storage performance data at each stage: boot, login, steady state, and logoff. When testing the large-scale combined VDI and RDS workloads, we recorded the storage metrics on the NetApp filer that are summarized below.
Overall, the storage configuration easily handled the 2,000-seat workload, with average read latency of less than 3 ms and average write latency of less than 1 ms. The PVS write cache showed a peak average of 10 IOPS per desktop during the login storm, with steady state showing 15-20% less I/O across all configurations. The average write cache I/O was 8K in size, with 0% of the write cache I/O being writes. Using NetApp Flash Cache decreased the number of IOPS required during the I/O-intensive boot and login phases once the cache was warm.
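Scaling the quoted per-desktop figures up to the full 2,000 seats gives a rough picture of the aggregate load the FAS3240 pair absorbed. The sketch below just multiplies out the numbers reported above; it is not data from the CVD itself.

```python
# Aggregate I/O implied by the per-desktop figures quoted above: a peak
# average of 10 write-cache IOPS per desktop during the login storm,
# with steady state running 15-20% lower.
desktops, peak_iops_per_desktop = 2000, 10

peak_total = desktops * peak_iops_per_desktop
steady_low, steady_high = peak_total * 0.80, peak_total * 0.85
print(f"Login-storm peak: ~{peak_total:,} write-cache IOPS cluster-wide")
print(f"Steady state:     ~{steady_low:,.0f}-{steady_high:,.0f} IOPS")
```

In other words, on the order of 20,000 write-cache IOPS at the login-storm peak, easing to roughly 16,000-17,000 IOPS at steady state.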
Conclusion
The validation testing demonstrated the linear scalability of the Cisco, NetApp, VMware, and Citrix reference architecture. The tested configuration easily supported a 2,000-seat XenDesktop mixed user load of VDI and RDS without overtaxing CPU, memory, network, or storage resources. In addition, the reference architecture defines an N+1 blade configuration, creating a highly reliable, fault-tolerant design for hosted virtual desktops, hosted shared desktops, and infrastructure services. And the new generation of Intel "Ivy Bridge" processors supports about 25% more capacity per server than previous-generation processors, allowing the system under test to support 2,000 users in only 32 rack units of a single rack, conserving data center power and floor space.
Maybe it's because I love what I do, but I think these results are quite exciting. Check out the full CVD posted on the Cisco website.
- Frank Anderson, Principal Solutions Architect, Citrix Worldwide Alliances