Dell Acquires AppAssure

Dell today announced its acquisition of AppAssure, a software-based enterprise backup solution for physical, virtual and cloud infrastructures. AppAssure offers both local (DAS, SAN, remote disk) and cloud-based backup, replication and recovery.

This further supplants the idea of Dell as a hardware company and continues them on their path to a more solutions and services focus. As some colleagues in the field (@plankers, @tscalzott) and I were just discussing, hardware (especially storage/server/network) is becoming ubiquitous, so the real value will be in features and software innovations.

Here's the press release from Dell:


Posted by Matt Vogt
 

HP Tech Day 2012

This week I'm excited to be flying to Ft. Collins, Colorado for an HP Tech Day that will be hosting independent bloggers to take a look at the LeftHand and 3Par products as well as their VMware integration. I've been to a couple of demos, read a couple of papers and have had some conversations with people about these products, so what makes this trip special is that we get some good old-fashioned hands-on-lab experience. There's a chasm of a difference between seeing the product in a slide deck and being able to kick the tires yourself.

I'm also excited to meet a group of new bloggers/storage-geeks. I've met a few of the guys at different events (Tech Field Day, VMworld, HP Cloud Tech Day, etc.) and on Twitter, and I'm excited to meet the rest:

Alastair Cooke, @DemitasseNZ, www.demitasse.co.nz
Brian Knudtson, @bknudtson, www.knudt.net/vblog
Ray Lucchesi, @raylucchesi, www.silvertonconsulting.com/blog
Howard Marks, @DeepStorageNet, www.deepstorage.net/WP-Save
John Obeto, @johnobeto, www.absolutelywindows.com
Justin Paul, @recklessop, www.jpaul.me
Jeffery Powers, @geekazine, www.geekazine.com 
Derek Schauland, @webjunkie, techhelp.cybercreations.net
Rick Schlander, @vmrick, www.vmbulletin.com
Justin Vashisht, @3cVguy, 3cvguy.blog.com

The crew will be hosted by HP Storage Guru and all-around good guy Calvin Zito (@HPStorageGuy).

As is all the rage for conferences and other intimate gatherings, a live stream of the event will be attempted. Keep an eye out on Twitter for the hashtags #HPTechDay and/or #HPCI for the latest information and buzz about the event.

Can't wait.

 

Posted by Matt Vogt
 

HP Cloud Tech Day

I had the privilege of attending an HP Cloud Tech Day this past week in Houston, organized by Ivy Communications. Tom, Chris and Halley did a great job gathering some pretty cool and smart bloggers and thinkers to hear about and give feedback on HP's cloud offerings and aspirations. The attendees were:

Patrick Pushor, @CloudChronicle
Christopher White, @Fezmid
Rich Miller, @datacenter
Phillip Sellers, @pbsellers
Phillip Jaenke, @rootwyrm
Bob Stein, @ActiveWin
John Obeto, @JohnObeto
Chris Wahl, @chriswahl
Frank Owen, @fowen
Michael Letschin, @mletschin
Ofir Nachmani, @iamondemand

I highly recommend you check out their stuff. Super smart guys. A great mix, too, of sys admins, cloud evangelists, service providers, etc.

I'll follow up with some specific posts about the topics we covered while I was there, but here's what we covered on day 1: HP Enterprise Business Cloud Strategy, HP View of Cloud Futures, Hyperscale for Cloud, Inner Workings and Building of a CloudSystem Infrastructure, Performance Optimized Datacenter Overview and Tour.

Overall I was quite impressed. My regret is that I could only attend one full day. I will be following the rest of the action on Twitter (hashtag #hpci) and on www.hp.com/go/hpcloudday (live video, Twitter feed and chat).

Posted by Matt Vogt
 

Dell Management Plug-in for VMware vCenter Update 1 Released

Today, Dell released Update 1 to the 1.0.1 version of their Management Plug-in for VMware vCenter. The biggest highlight among the fixes and changes is the added support for ESXi 5 (vCenter 5). If you're currently running the 1.0.1 plug-in under a vCenter 5 environment (which 'works', just not in a supported kind of way), you'll need to unregister and re-register the Dell Management Plug-in after upgrading (see the Release Notes for all issues/resolutions).

One of the major changes from the original 1.0 to the 1.0.1 plug-in was the promise that updates to the appliance/software would come as an RPM patch and not tied to re-deploying another OVF. I'm glad to report that this worked wonderfully. You can find full instructions in the Dell Management Plug-in for VMware vCenter User Guide (page 41), but here's the quick and dirty:

  1. Always back up your appliance. Always back up pre-upgrade. When? Always.
  2. Open up and log into the web admin portal (https://myApplianceHostname/)
  3. Click on 'Appliance Management' in the left menu
  4. Click 'Upgrade'
    - This will boot you out of the portal, upgrade the software and reboot the VM (the User Guide makes no mention that it reboots the VM, so just know that it does).
    - I recommend opening up a VM Console so you don't have to just sit and refresh the page to see if it's back up or not
  5. Restart your vCenter Client (this might just be me because I was having some DNS issues at the time on my desktop)
The whole process took about 10 minutes for me. It took about 7 minutes before I saw the appliance reboot.
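
If you'd rather not sit and refresh the page (or keep a VM Console open), a small polling script can tell you when the portal is answering again. This is only a rough sketch: the `https_probe` and `wait_until_up` names are made up for illustration, the 15-minute timeout is a guess based on the timings above, and it assumes the appliance presents a self-signed certificate, which is why verification is switched off.

```python
import ssl
import time
import urllib.error
import urllib.request
from typing import Callable


def https_probe(url: str, timeout_s: float = 5.0) -> Callable[[], bool]:
    """Build a probe that reports whether anything answers at `url`."""
    # Assumption: the appliance uses a self-signed certificate, so
    # verification is disabled. Do not do this against untrusted hosts.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    def probe() -> bool:
        try:
            with urllib.request.urlopen(url, timeout=timeout_s, context=ctx):
                return True
        except urllib.error.HTTPError:
            return True   # the web server answered, even if with an error status
        except OSError:
            return False  # refused, timed out, DNS failure: probably still rebooting

    return probe


def wait_until_up(
    probe: Callable[[], bool],
    timeout_s: float = 900.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call `probe` every `interval_s` seconds until it succeeds or time runs out."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Example, using the placeholder hostname from step 2:
# wait_until_up(https_probe("https://myApplianceHostname/"))
```

One caveat: the portal may keep answering for a while after you click 'Upgrade', so start polling only once the appliance has actually dropped, otherwise the script will report success straight away.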

Happy upgrading.

 

Posted by Matt Vogt
 

Dell Management Plug-In for VMware vCenter Review

OK, I've had the plug-in running for a few weeks and have gone through some of its primary functions (firmware updates, inventory, monitoring, warranty retrieval, creating a hardware profile for deployment).

I'm not going to go through the initial setup; that's been covered pretty well on DellTechCenter.com.

Here are the claimed major features with my notes on day-to-day usage, followed by some miscellaneous thoughts at the end.

Deep-level detail from Dell servers
The level of detail here is quite good; much deeper information and more clearly laid out than the basic 'Hardware' tab in vCenter. But what stands out to me is the efficiency of not having to rely on another tool, be it OpenManage, iDRAC, IT Assistant, etc. I spend a lot of time in vCenter and it's fantastic to not have to leave that for another program.
The amount of detail for hardware information is ridiculous. All of this information is available if you have the Enterprise iDRAC in your server, but to be able to get the serial number and manufacturing date of your RAM in the same place that you can check your warranty status is just beautiful.
Deploy BIOS and firmware updates within vCenter
This is a wizard-based process that requires you to have a CIFS or NFS repository, which the initial setup walks you through configuring. I've found it pretty straightforward, easy and quick. Well, the wizard is quick. While this feature is fantastic and works very well, the actual upgrade takes quite a while. The server goes through multiple reboots throughout the process. After the updates are downloaded to the repository, the server is automatically put into Maintenance Mode and then reboots into an EFI environment to do the updates. After each update, the server reboots and re-enters the System Update environment to continue with the next update (firmware/BIOS). If you attempt to perform many updates at once (NIC firmware, BIOS, HD firmware, etc.), be prepared to wait.

Build hardware and hypervisor profiles and deploy any combination of the two on bare-metal Dell PowerEdge™ servers without a preboot execution environment (PXE)
This is accomplished through the magic of the combination of the Lifecycle Controller and the iDRAC. While I've built the profile, which seems very straightforward, I've yet to be able to test this (spare Gen11 PowerEdge servers are hard to come by, though if one were donated, I would not complain). Although I have a new server coming to replace an out-of-warranty cluster host that I was planning on testing on, I found this little nugget in the Admin Guide:
The system needs to have a Virtual Disk for installation of the OS.
The Plug-in will not install the hypervisor to an internal SD card.
Bummer. This is the standard config for my cluster going forward. No Hard Disks. My great hope is that this is resolved in the next version. If not, this is a huge feature and potentially massive time saver that's not available to me.

Automatically perform Dell recommended vCenter actions based on Dell hardware alerts
The Plug-in adds a whole host of new Dell server-specific alarms to vCenter. These range from power consumption to OS driver version monitoring. If something critical enough happens, say a single power supply in a dual power supply system dies, the Plug-in will automatically put the host in maintenance mode until the issue is fixed. This can theoretically save you from encountering an HA event, which, while cool, is never fun.

When I first installed the plug-in, I was immediately alerted to the fact that I was running quite an old RAID controller driver. Handy.

Receive proactive renewal alerts from Dell before your warranty expires and access the Dell hardware warranty page online 
I've always been bad at doing this myself. It seems easy to track on my own, but we're all lazy in some areas, and I guess this is one of mine. So, thanks, Dell, for enabling me to not have to come up with a better solution on my own :) I have yet to receive one of these alerts because the server I'm testing on still has almost 1500 days of warranty left. But I see the link to click to renew it if I like, and its status is on the Overview page in the Dell Server Management tab.

Misc Thoughts and Issues

  • Hardware Provisioning and Deployment
    • Unfortunately, v1.0.1 cannot deploy a hypervisor to an internal SD card. This is how we plan to move forward with our ESXi installs (including the R610 I just ordered)
  • Pricing
    • Retail pricing is $299.00 for up to 3 hosts, $799.00 for up to 10, $1,799.00 for up to 50 and $2,999.00 for up to 1000 hosts
      • If you have 1000 hosts, you can probably afford this. It might be hard to sell $800 to my management to manage my 5 hosts. Essentially, we'd have to save about 25 hours of work to break even
      • I'm not asking for it to be free. It does too much to be free and is really bordering on what you can define as a plug-in. What I'd like to see is up to 3 for free (throw the SMBs a bone and gain market share in the process), $300 for 5 hosts, etc.
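
To make the break-even math above concrete, here's a quick sketch. The tier table comes straight from the retail pricing listed above; the $32/hour figure is not a quoted number, it's simply what "about 25 hours" against roughly $800 implies, so treat it as an assumption. Function names are made up for illustration.

```python
# Retail tiers from the pricing bullet: (maximum hosts covered, price in USD).
PRICE_TIERS = [(3, 299.00), (10, 799.00), (50, 1799.00), (1000, 2999.00)]


def tier_price(hosts: int) -> float:
    """Return the price of the smallest tier that covers `hosts`."""
    if hosts < 1:
        raise ValueError("need at least one host")
    for max_hosts, price in PRICE_TIERS:
        if hosts <= max_hosts:
            return price
    raise ValueError("no listed tier covers that many hosts")


def break_even_hours(price: float, hourly_rate: float) -> float:
    """Hours of admin work the plug-in must save to pay for itself."""
    return price / hourly_rate


price = tier_price(5)                        # 5 hosts land in the 10-host tier
hours = break_even_hours(price, 32.0)        # assumed hourly cost, see above
print(f"${price:.2f} for 5 hosts = ${price / 5:.2f} per host")
print(f"break-even after about {hours:.0f} hours saved")
```

The per-host number is what makes the 10-host tier a tough sell for a 5-host shop: you pay for twice the capacity you use.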

 

 

 

Posted by Matt Vogt
 

New Dell EqualLogic Arrays

Dell unveiled an update to 2 of their EqualLogic PS series array platforms today along with their first sub-$10k array. The new PS6100 and PS4100 series arrays are a refresh of their PS6000 and PS4000 units. The new boxes are being touted as having up to a 67% improvement in I/O performance. 

Here are the major new features for each:
PS4100
- shrinks down to 2U
- 24 x 2.5" drives - up to 21.6TB
- 12 x 3.5" drives - up to 32TB
- Now starting at under $10,000

PS6100
- 2U version with 24 x 2.5" drives - up to 21.6TB
- New 4U design with 24 x 3.5" drives - up to 72TB
- NEW Dedicated management port

Both arrays will ship with the latest 5.1 firmware and are certified for VMware's vSphere 5.0 storage APIs (VASA, VAAI, etc.). The SSD options will go up to 400GB per drive, which I'm sure will be slightly over the $10,000 starting price in the PS4100. 

This may sound lame, but the addition of the dedicated management port on the PS6100 is something that I'm very excited about. I never understood why there was one on the PS4000 but not the PS6000. It was maddening to lose 25% of my total network throughput on an array if I needed to attach it to a dedicated management network.

Being in the market for a Sumo (Dell's EqualLogic Monster PS6500 series array), I was hoping that those would get the same refresh, and even though I knew it wasn't going to be refreshed yet, I'm still a bit bummed that I may have to purchase it just before it gets its own upgrade.

Posted by Matt Vogt
 

New Role and Opportunity

For the last 4 years I've operated as a Windows Systems Administrator, primarily focusing on (surprise!) Microsoft technologies - patching, security, Active Directory, Group Policy, etc. When I took this position, our virtualization environment was quite small, not very complex, not needing a lot of love or development, and not really my job. We had about 30 virtual machines, 4 hosts running ESX 2.5 all with internal or direct attached storage, 3 hosts running ESX 3.5 with still more internal storage and one single-controller NetApp FAS270 with a whopping 1.25TB of iSCSI storage! These ESX 3.5 hosts were also running un-clustered.

With demands growing much faster than our budget (centralized backup, Antivirus, patching, deployment, file and print services, CMS, LMS, better-than-just-pop-email), it was obvious that we could no longer afford physical servers. We had neither the budget nor the physical space, power, cooling, etc and had to come up with a better plan. Virtualization was the answer, and somebody had to do it. I fell in love with the technology and jumped right in. As most of you have probably experienced, it soon became the majority of my daily functions.

We quickly added one more ESX 3.5 host, consolidated 2 of the ESX 2.5 hosts into the 3.5 hosts, added a second shelf to the NetApp (now all of 3.5TB) and added a Dell PowerVault MD1000 attached to a PowerEdge 1950 running Red Hat serving as an NFS store (3TB also).

Sounds great. We should be set, right? Boy, was I wrong. I had no idea how fast we could chew through storage and host resources. With our NetApp nearing End of Life (not to mention being well out of warranty), it was time to consider new storage and another host or 2. While we loved the performance of our NetApp, we couldn't afford a system with multiple controllers, couldn't afford death by licensed features and found it difficult to administer. Through a process I won't detail here, and with a price my Dell AE swore me to protect, we decided to migrate to and standardize on EqualLogic. So we purchased a PS6000XV for primary storage (6.5TB usable) and a PS4000X for replication.

We're now sitting with a single ESXi 4.1 cluster with 5 hosts and 3 EqualLogic arrays in two groups. We're still using the old NetApp iSCSI and MD1000 NFS SANs as tier 2 storage and now have a grand total of 26TB of storage (96TB more coming).

With the evolution of my workload and focus, as well as a new project building a remote data center in Houston as both a multi-site cluster and DR site, I was offered the new position of Sr. Systems Administrator - Virtualization and Storage, which I gladly accepted. While this in part realigns my job title and description with what I actually do and where the Datacenter and IT services field is headed, it also adds more opportunities for growth. I will be taking on the role of Scrum Master (Scrum is our internal project management framework), operating as lead/backup technician for the rest of the Sys Admin team and taking responsibility for server/service patch management oversight.

It's big and a little bit scary, but if I'm not a little bit scared of what I'm doing, I get complacent and don't learn nearly as much.

Here's to being scared.

Posted by Matt Vogt
 

LA VMUG - vCenter Operations

The Los Angeles VMUG was held today at the DoubleTree Hotel at LAX and the primary topic was a product discussion and demo of vCenter Operations. Much of the time was dedicated to what needs and gaps it fills.

The dilemma now is that we have essentially 3 layers: Hardware, Hypervisor, OS/App. For each of those 3 layers there are a multitude of ways to monitor capacity, get health checks and gain deep visibility into performance metrics and bottlenecks. Pulling all of that together is the goal of vCenter Operations, along with the promise of capacity planning, compliance checks and change management.

What I'm impressed with, though, is the robust three-vectored system for scoring the overall health of the ESX environment (each score runs 0-100), which is at the core of the vCenter Operations management system.

Workload
0 means that the object (Guest, Host, Cluster or Data Center) is using none of the resources that have been allocated to it.
100 means that the object is consuming all of at least one resource. This can be higher than 100 in the case of RAM utilization, since a VM can use more RAM than you've allocated to it.
The overall number for the object is bound by the highest metric (RAM, storage, CPU, network). That means that if a guest's CPU utilization is sitting at 23%, but the RAM usage is at 75%, the Workload score would be 75.
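
The "bound by the highest metric" rule is easy to picture in code. This is just my reading of it as a sketch, not how vCenter Operations is implemented; the function name is made up, and the storage and network numbers below are invented to round out the CPU/RAM example.

```python
from typing import Mapping, Tuple


def workload_score(utilization_pct: Mapping[str, float]) -> Tuple[str, float]:
    """Return the binding metric and its utilization, which becomes the score."""
    if not utilization_pct:
        raise ValueError("need at least one metric")
    binding = max(utilization_pct, key=utilization_pct.get)
    return binding, utilization_pct[binding]


# CPU and RAM values are from the example above; the other two are illustrative.
guest = {"cpu": 23.0, "ram": 75.0, "storage": 40.0, "network": 10.0}
metric, score = workload_score(guest)
print(f"Workload {score:.0f}, bound by {metric}")
```

Returning the binding metric alongside the number matters: a score of 75 tells you the guest is busy, but knowing it's RAM tells you what to fix.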

Health
Here, higher means the system workload is following normal patterns and lower can indicate abnormalities. "Normal" is learned as vCenter Operations observes workload trends over time. For example, month end for an accounting office application will have higher utilization than any other time. Overall CPU/RAM/Disk/Network usage will spike, but it's normal and expected.

This to me is one of the biggest advantages over something like SCOM or Nagios. Just because something spikes doesn't mean that I should get an email alerting me (or in the case of our current Nagios implementation, spamming me every 30 minutes).

So when the Health score of an object lowers, key metrics and workload are getting further from what's been previously observed to be normal. This is what I want to know.

Capacity
Based on utilization, when will I run out of resources (CPU, RAM, Storage, Network)?
Pretty straightforward here: high is good, low is bad. The number is based on the binding metric for a capacity breach: based on the current trend, when will I run out of storage? Like the other two scores, this is calculated for each resource at each level - Data Center, Cluster, Host, Guest.
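
To get a feel for the "when will I run out" question, here's a back-of-the-napkin sketch. It assumes a simple straight-line trend, which is my simplification; the demo didn't spell out how vCenter Operations actually models the trend or turns it into a 0-100 score. All names and numbers here are invented for illustration.

```python
from typing import Mapping, Tuple


def days_until_full(capacity: float, used: float, growth_per_day: float) -> float:
    """Days left before `used` reaches `capacity` at a constant daily growth."""
    if growth_per_day <= 0:
        return float("inf")  # flat or shrinking usage never fills up
    return max(capacity - used, 0.0) / growth_per_day


def binding_resource(
    resources: Mapping[str, Tuple[float, float, float]]
) -> Tuple[str, float]:
    """Pick the resource that runs out first: (capacity, used, growth_per_day)."""
    days = {name: days_until_full(*values) for name, values in resources.items()}
    first = min(days, key=days.get)
    return first, days[first]


# Illustrative numbers only, in GB and GB/day.
cluster = {
    "storage": (1000.0, 800.0, 5.0),
    "ram": (256.0, 140.0, 0.5),
}
name, days = binding_resource(cluster)
print(f"{name} is the binding resource: about {days:.0f} days left")
```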

Of course, as it seems with every piece of licensed software on the planet, there are 3 versions: Standard, Advanced and Enterprise. I won't go into all the details of the differences between them (you can check them out here), but here are the highlights (each one builds on the previous):

Standard
  • Dashboard with Health Scores
  • Behavioral Analysis and Trending
  • Heat Maps
  • Estimated time remaining until capacity is full (CPU, Memory, Disk, Network)
  • Configuration Change Visibility
  • (no alerts)

Advanced
  • Capacity Bottlenecks
  • Resource wastage analysis and trending - including recommendations for right-sizing
  • What-if capacity modeling
  • Custom reports
  • Support for HA, FT and Linked Clones
  • (no alerts)

Enterprise
  • Smart Alerts, including Email and SNMP
  • 3rd party plug-in reporting and data analysis (Nagios, SCOM, Tivoli, etc.)
  • Regulation and Industrial Compliance checks and scans
  • Change Alerts
  • Distribution (OS, Hypervisor, Applications)

I think with the advent of the new licensing model based on vRAM, capacity planning and right-sizing virtual machines will become imperative to every Virtualization Infrastructure admin.

I have yet to try any of these out, but VMware allows you to download and try the Standard edition free for 60 days. As performance and metrics are becoming more integral to my job, I plan to take a good look at this. The product itself looks to be pretty solid and you can count me as impressed.

 

Posted by Matt Vogt
 

vSphere 5 Licensing - post grief post

There have been gobs of reactions to VMware's new license model that was announced last week, and the vast majority of them were negative. I will admit that I took part in some of the initial backlash. We sysadmins don't like change, especially when we've engineered systems to maximize a certain licensing model and then that model changes. But then I started to think it's possible, even likely, that I'm overreacting. Maybe it won't have any effect on us. So I started doing the math.

Licenses are still purchased by the CPU, at minimum one for each socket in your system, with each license having a vRAM Entitlement. For reference, one of the best license summaries I've found is on Alan Renouf's blog http://www.virtu-al.net. (The following is borrowed from his blog post http://www.virtu-al.net/2011/07/14/vsphere-5-license-entitlements/)

License Type       vRAM Entitlement per license
Essentials         24GB
Essentials Plus    24GB
Standard           24GB
Enterprise         32GB
Enterprise Plus    48GB

 

We currently have Enterprise Licensing with the following specs:

5 hosts - each with 2 CPUs, and a total of 256GB physical RAM (pRAM from here on out)

70 Virtual Machines with a total of 140GB Virtual RAM allocated (vRAM). But we also have about 20 powered-off virtual machines for test/dev with an average of 4GB RAM each (another 80GB). So the worst case scenario for vRAM is 220GB

The Math

10 Licensed CPUs @ 32GB Entitlement per CPU  = 320GB of RAM Entitlements.

So as you can see, at the moment we're totally fine, primarily because we have too many hosts. I plan to fix that by eliminating 2 hosts, leaving 3 hosts with 2 CPUs each (pRAM is much cheaper these days than when we started building our cluster, which allows for greater consolidation). This works out like so:

6 Licensed CPUs @ 32GB Entitlement per CPU  = 192GB of RAM Entitlements

So I'll still have 52GB of vRAM headroom over the 140GB currently allocated, but since the 220GB worst case is more than 192GB, I will have to be careful with how many test/dev servers we turn on at the same time. I'm just glad there isn't a 'hard stop' when you hit your entitlement limit. I'm just not excited to one day tell my CIO that I have to purchase more CPU licenses than we have CPUs.
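
Here's the same math as a short script, in case you want to plug in your own numbers. The entitlement values come from the table above; the function names are made up, and the "one license per socket minimum, then enough licenses to cover vRAM" rule is my reading of the model as described in this post, not an official calculator.

```python
import math

# vRAM entitlement per license in GB, from the table above.
VRAM_PER_LICENSE_GB = {
    "Essentials": 24,
    "Essentials Plus": 24,
    "Standard": 24,
    "Enterprise": 32,
    "Enterprise Plus": 48,
}


def pooled_entitlement_gb(edition: str, licenses: int) -> int:
    """Total vRAM covered by a pool of licenses of one edition."""
    return VRAM_PER_LICENSE_GB[edition] * licenses


def licenses_required(edition: str, sockets: int, vram_gb: float) -> int:
    """At least one license per socket, and enough to cover the vRAM in use."""
    by_vram = math.ceil(vram_gb / VRAM_PER_LICENSE_GB[edition])
    return max(sockets, by_vram)


current = pooled_entitlement_gb("Enterprise", 5 * 2)   # today: 5 hosts x 2 CPUs
planned = pooled_entitlement_gb("Enterprise", 3 * 2)   # after dropping 2 hosts
print(f"current pool: {current}GB, planned pool: {planned}GB")
print(f"headroom at 140GB allocated: {planned - 140}GB")
print(f"licenses for the 220GB worst case: {licenses_required('Enterprise', 6, 220)}")
```

The last line is the uncomfortable one: with everything powered on, the vRAM side of the calculation asks for more licenses than there are sockets.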

Not All Bad

I understand where VMware was going with this new model. One of the major tenets of the Cloud is 'pay for what you need/use, not what you don't'. VMware's philosophy is now 'license only how much vRAM you need, not your pRAM'. I've lamented that new virtual machines are becoming more of a business decision, but this isn't all bad. VM sprawl is very much real. Especially with RAM and CPU speeds and feeds exploding for such a minimal cost increase, we vAdmins have to think less and less about what resources our machines actually need. New Windows Server 2008 R2? Ah, just go ahead and give it 8GB off the bat. Why not?

I think it will help admins and users think critically about the resources they allocate to machines and force people to 'right-size' them. Really, we'll be thankful 5-10 years down the road when/if we migrate virtualization platforms. (oh no he didn't)

For those that are adopting or have adopted a chargeback model, this will make it much easier to manage and explain the costs of your tiered environment. You want a big beefy server with tons of RAM? Groovy. That'll be $90/1GB RAM/year. But you can have all the CPU you need.
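
As a quick illustration of how simple that conversation becomes, here's a tiny sketch. The $90 per GB per year rate is just the example number from the paragraph above, and the function name is made up; your own rate would come out of your actual license and hardware costs.

```python
RATE_PER_GB_YEAR = 90.00  # example rate from the paragraph above, in USD


def annual_ram_charge(ram_gb: float, rate: float = RATE_PER_GB_YEAR) -> float:
    """Yearly chargeback for a VM, billed purely on allocated RAM."""
    return ram_gb * rate


for ram_gb in (2, 4, 8):
    yearly = annual_ram_charge(ram_gb)
    print(f"{ram_gb}GB VM: ${yearly:.2f}/year (${yearly / 12:.2f}/month)")
```

Seen this way, the 'just give it 8GB' habit has a price tag attached, which is exactly the kind of nudge toward right-sizing described above.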

Left Over Beef

  • VMware says they removed "two physical constraints (core and physical RAM)", but they introduced a virtual constraint and I never had either of those 2 previous ones.
  • The 8GB vRAM limit on free ESXi might be a home/test lab buster. Aren't all the other restrictions enough?
  • I think VMware could do a lot by increasing the entitlement just a bit: 48GB for Enterprise, 64GB or 96GB for Enterprise+ would silence a lot of critics (but also cut into their profit margins)

But these are just my thoughts as a sysadmin in an SMB shop. There doesn't seem to be a huge impact on us, yet. Oh, did I mention we're in the beginning phases of a DR site? Yeah, this will influence some discussions there now.

Posted by Matt Vogt
 

vSphere 5 Fab 2

Well, the announcement came and went for vSphere 5.0 yesterday and a lot of new technology and new capability was put out there. You may have also heard of the new licensing scheme, but I'm not going to cover that yet as I want to take more time to evaluate how it will impact me (but I'm currently in stage 2 of The Five Stages of VMware Licensing Grief). Here are some quick hits on the 2 pieces of new tech that will primarily affect me, a small shop in a small EDU:

New vMotion (aka Storage DRS goodness)

svMotion has a new copy mechanism that now allows for migrating storage for guests that have snapshots or have linked clones. A Mirror Drive was also created on the destination datastore that holds all the changes during a copy so when the copy is done, the changes are synced from the Mirror Drives rather than having to make several passes back to the original datastore. This should decrease svMotion times by quite a bit.

Expanding on the amazing DRS feature for VM/host load balancing, Storage DRS brings the same capability to storage. Although this is all wrapped up in the new and improved Storage vMotion, it could stand alone as quite the feature. As introduced with vSphere 4.1, if your storage vendor of choice supports VAAI (storage acceleration APIs), this all happens on the SAN rather than over the network, bringing joy to your network admins.

VMFS-5

Lots of new features here. 

  • 1MB block size - gone are the 1, 2, 4 and 8M block sizes
  • 60TB datastores. Yes, 60. Yes, Terabytes
  • Sub-blocks down to 8k from 64k. Smaller files stay small
  • Speaking of smaller files, files smaller than 1k are now kept in the file descriptor location until they're bigger than 1k
  • 100,000 file limit up from 30,000
  • ATS (part of the locking feature of VAAI) improvements. Should lend itself to more VMs per datastore

VMFS-3 file systems can be upgraded straight to VMFS-5 while the VMs are still running. VMware is calling this an "online & non-disruptive upgrade operation".

A couple of holdover limitations for a VMFS-5 datastore:

  • 2TB file size limit for a single VMDK and non-passthru RDM drives (passthru RDM can be the full 60TB)
  • Max LUNs is still 256 per host (I personally could never see hitting this, but I'm sure larger implementations can)

More vSphere 5 posts will be coming, but these are the 2 things that got me the most excited.

Posted by Matt Vogt