Downgrade firmware on a Dell ESXi host

An issue came up on one of my Dell PowerEdge R610 ESXi hosts, and I wanted to try downgrading the firmware on my Broadcom BCM5709 network adapters for troubleshooting, but I couldn't find an easy way to do it among the Server Update Utilities, OpenManage Essentials, etc.
This didn't fix my issue, but figuring out the best/fastest/easiest way to get it done was enough of a PITA that I thought it worthwhile to share.
My first thought was to use the Firmware Upgrade wizard built into the Dell Management Plug-in for VMware vCenter, because it offers an option to select an update executable from a CIFS share. Unfortunately, that just threw me an error even though I was using a valid DUP file.
Failed sending update file: (NETW_FRMW_WIN_R299290.EXE) to iDRAC - Details: The update package (NETW_FRMW_WIN_R299290.EXE) is not supported via 1x1 update feature. Use the repository method to update this device. This error can also be seen if package is not named according to Dell naming standards.
So I decided to build my own repository and point the Plug-in to that, and here's that process:

Read the rest of this post »

Posted by Matt Vogt
 

Dell Management Plug-in for VMware vCenter Update 1 Released

Today, Dell released Update 1 to the 1.0.1 version of their Management Plug-in for VMware vCenter. The biggest highlight among the fixes and changes would be the added support of ESX5 (vCenter 5). If you're currently running the 1.0.1 plugin under a vCenter 5 environment (which 'works', just not in a supported kind of way), you'll need to unregister and re-register the Dell Management Plugin after upgrading (see the Release Notes for all issues/resolutions). 

One of the major changes from the original 1.0 to the 1.0.1 plug-in was the promise that updates to the appliance/software would come as an RPM patch and not tied to re-deploying another OVF. I'm glad to report that this worked wonderfully. You can find full instructions in the Dell Management Plug-in for VMware vCenter User Guide (page 41), but here's the quick and dirty:

  1. Always back up your appliance. Always back up pre-upgrade. When? Always.
  2. Open up and log into the web admin portal (https://myApplianceHostname/)
  3. Click on 'Appliance Management' in the left menu
  4. Click 'Upgrade'
    - This will boot you out of the portal, upgrade the software and reboot the VM (the User Guide makes no mention that it reboots the VM, so just know that it does).
    - I recommend opening up a VM Console so you don't have to just sit and refresh the page to see if it's back up or not
  5. Restart your vCenter Client (this might just be me because I was having some DNS issues at the time on my desktop)
The whole process took about 10 minutes for me. It took about 7 minutes before I saw the appliance reboot.
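
If you'd rather not sit and refresh the page (or watch the VM Console), a small polling loop can do the waiting. This is only a rough sketch in Python, not anything from the User Guide. The function names, the 15 minute timeout and the 15 second interval are my own assumptions, and it assumes the appliance uses a self-signed certificate, so certificate checks are turned off.

```python
import ssl
import time
import urllib.error
import urllib.request
from typing import Callable


def portal_is_up(url: str = "https://myApplianceHostname/") -> bool:
    """Return True if the web admin portal answers at all (hypothetical helper)."""
    # Assumption: the appliance presents a self-signed certificate.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=5, context=ctx):
            return True
    except urllib.error.HTTPError:
        return True  # the web server answered, even if with an error page
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timed out, DNS failure, etc.


def wait_until_up(
    probe: Callable[[], bool],
    timeout_s: float = 900.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call `probe` until it returns True or `timeout_s` seconds have passed."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Usage (run by hand once the portal has kicked you out):
#   wait_until_up(portal_is_up)
```

Keep in mind the portal stays reachable for a few minutes before the appliance actually reboots, so start polling only after it has gone down.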

Happy upgrading.

 

Posted by Matt Vogt
 

Dell Management Plug-In for VMware vCenter Review

Ok, I've had the plug-in running for a few weeks and have gone through some of its primary functions (firmware updates, inventory, monitoring, warranty retrieval, creating a hardware profile for deployment).

I'm not going to go through the initial setup; that's been covered pretty well on DellTechCenter.com.

Here are the major advertised features, with my notes on day-to-day usage, followed by some miscellaneous thoughts at the end.

Deep-level detail from Dell servers
The level of detail here is quite good; much deeper information and more clearly laid out than the basic 'Hardware' tab in vCenter. But what stands out to me is the efficiency of not having to rely on another tool, be it OpenManage, iDRAC, IT Assistant, etc. I spend a lot of time in vCenter and it's fantastic to not have to leave that for another program.
[Screenshot: Dell plug-in tab overview]
The amount of detail for hardware information is ridiculous. All of this information is available if you have the Enterprise iDRAC in your server, but to be able to get the serial number and manufacturing date of your RAM in the same place that you can check your warranty status is just beautiful.
Deploy BIOS and firmware updates within vCenter
This is a wizard-based process that requires a CIFS or NFS repository, which the initial setup walks you through configuring. I've found it pretty straightforward, easy and quick. Well, the wizard is quick. The feature is fantastic and works very well, but the actual upgrade takes quite a while. The server goes through multiple reboots throughout the process. After the updates are downloaded to the repository, the server is automatically put into Maintenance Mode and then reboots into an EFI environment to do the updates. After each update, the server reboots and re-enters the System Update environment to continue with the next update (firmware/BIOS). If you attempt to perform many updates at once (NIC firmware, BIOS, HD firmware, etc.), be prepared to wait.

Build hardware and hypervisor profiles and deploy any combination of the two on bare-metal Dell PowerEdge™ servers without a preboot execution environment (PXE)
This is accomplished through the magic of the combination of the Lifecycle Controller and the iDRAC. I've built the profile, which seems very straightforward, but I've yet to be able to test this (spare Gen11 PowerEdge servers are hard to come by, though if one were donated, I would not complain). I have a new server coming to replace an out-of-warranty cluster host, and I was planning on testing with it, but then I found this little nugget in the Admin Guide:
The system needs to have a Virtual Disk for installation of the OS.
The Plug-in will not install the hypervisor to an internal SD card.
Bummer. This is the standard config for my cluster going forward. No Hard Disks. My great hope is that this is resolved in the next version. If not, this is a huge feature and potentially massive time saver that's not available to me.

Automatically perform Dell recommended vCenter actions based on Dell hardware alerts
The Plug-in adds a whole host of new Dell server specific alarms to vCenter. These range from power consumption to OS driver version monitoring. If something critical enough happens, say a single power supply in a dual power supply system dies, the Plug-in will automatically put the host in maintenance mode until the issue is fixed. This can theoretically save you from encountering an HA event, which, while cool, is never fun.

When I first installed the plug-in, I was immediately alerted to the fact that I was running quite an old RAID controller driver. Handy.

Receive proactive renewal alerts from Dell before your warranty expires and access the Dell hardware warranty page online 
I've always been bad at doing this myself. It seems easy to track on my own, but we're all lazy in some areas, and I guess this is one of mine. So, thanks, Dell, for enabling me to not have to come up with a better solution on my own :) I have yet to receive one of these alerts because the server I'm testing on still has almost 1500 days of warranty left. But I see the link to click to renew it if I like, and its status is on the Overview page in the Dell Server Management tab.

Misc Thoughts and Issues

  • Hardware Provisioning and Deployment
    • Unfortunately, v1.0.1 cannot deploy a hypervisor to an internal SD card. This is how we plan to move forward with our ESXi installs (including the R610 I just ordered)
  • Pricing
    • Retail pricing is $299.00 for up to 3 hosts, $799.00 for up to 10, $1,799.00 for up to 50 and $2,999.00 for up to 1000 hosts
      • If you have 1000 hosts, you can probably afford this. It might be hard to sell $800 to my management to manage my 5 hosts. Essentially, we'd have to save about 25 hours of work to break even
      • I'm not asking for it to be free. It does too much to be free and is really bordering on what you can define as a plug-in. What I'd like to see is up to 3 for free (throw the SMBs a bone and gain market share in the process), $300 for 5 hosts, etc.
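
To make the break-even math above a bit more concrete, here's a quick sketch in Python. The tier table comes straight from the retail prices listed above. The $32/hour labor rate is purely my assumption, picked because it is roughly what "about 25 hours" on an $800 purchase implies, and the function names are made up for illustration.

```python
# (max hosts covered, retail price in USD), as listed above
PRICE_TIERS = [(3, 299.00), (10, 799.00), (50, 1799.00), (1000, 2999.00)]


def license_price(hosts: int) -> float:
    """Return the retail price of the smallest tier that covers `hosts`."""
    if hosts < 1:
        raise ValueError("need at least one host")
    for max_hosts, price in PRICE_TIERS:
        if hosts <= max_hosts:
            return price
    raise ValueError("no listed tier covers more than 1000 hosts")


def break_even_hours(price: float, hourly_rate: float) -> float:
    """Hours of admin time the plug-in has to save to pay for itself."""
    return price / hourly_rate


price = license_price(5)                  # 5 hosts land in the 10-host tier
hours = break_even_hours(price, 32.0)     # assumed $32/hour, not a real figure
print(f"${price:.2f} for 5 hosts, break-even after about {hours:.0f} hours")
```

Plug in your own hourly rate; the point is only that 5 hosts get billed as if they were 10.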

 

 

 

Posted by Matt Vogt
 

New Dell EqualLogic Arrays

Dell unveiled an update to 2 of their EqualLogic PS series array platforms today along with their first sub-$10k array. The new PS6100 and PS4100 series arrays are a refresh of their PS6000 and PS4000 units. The new boxes are being touted as having up to a 67% improvement in I/O performance. 

Here are the major new features for each:
PS4100
- shrinks down to 2U
- 24 x 2.5" drives - up to 21.6TB
- 12 x 3.5" drives - up to 32TB
- Now starting at under $10,000

PS6100
- 2U version with 24 x 2.5" drives - up to 21.6TB
- New 4U design with 24 x 3.5" drives - up to 72TB
- NEW Dedicated management port

Both arrays will ship with the latest 5.1 firmware and are certified for VMware's vSphere 5.0 storage APIs (VASA, VAAI, etc.). The SSD options will go up to 400GB per drive, which I'm sure will be slightly over the $10,000 starting price in the PS4100. 

This may sound lame, but the addition of the dedicated management port on the PS6100 is something that I'm very excited about. I never understood why there was one on the PS4000 but not the PS6000. It was maddening to lose 25% of my total network throughput on an array if I needed to attach it to a dedicated management network.

Being in the market for a Sumo (Dell's EqualLogic Monster PS6500 series array), I was hoping that those would get the same refresh, and even though I knew it wasn't going to be refreshed yet, I'm still a bit bummed that I may have to purchase it just before it gets its own upgrade.

Posted by Matt Vogt
 

LA VMUG - vCenter Operations

The Los Angeles VMUG was held today at the DoubleTree Hotel at LAX and the primary topic was a product discussion and demo of vCenter Operations. Much of the time was dedicated to what needs and gaps it fills.

The dilemma now is that we have essentially 3 layers: Hardware, Hypervisor, OS/App. For each of those 3 layers there are a multitude of ways to monitor capacity, get health checks and gain deep visibility into performance metrics and bottlenecks. Pulling all of that together is the goal of vCenter Operations, along with the promise of capacity planning, compliance checks and change management.

What I'm impressed with, though, is the robust 3-vector system for scoring the overall health of the ESX environment (each score runs 0-100), which is at the core of the vCenter Operations management system.

Workload
0 means that the object (Guest, Host, Cluster or Data Center) is using none of the resources that have been allocated to it.
100 means that the object is consuming all of at least one resource. This can be higher than 100 in the case of RAM utilization. A VM can use more RAM than you've allocated to it.
The overall number for the object is bound by the highest metric (RAM, storage, CPU, network). That is, if a guest's CPU utilization is sitting at 23%, but the RAM usage is at 75%, the Workload score would be 75.
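
The "bound by the highest metric" idea can be sketched in a few lines of Python. This is only my reading of what was described in the demo, not VMware's actual implementation; the function name and the dictionary layout are hypothetical.

```python
from typing import Dict, Tuple


def workload_score(utilization_pct: Dict[str, float]) -> Tuple[str, float]:
    """Return the binding metric and its utilization, which becomes the score.

    No clamping to 100 is applied, since RAM usage can exceed the allocation.
    """
    binding = max(utilization_pct, key=utilization_pct.get)
    return binding, utilization_pct[binding]


# The guest from the example above: CPU at 23%, RAM at 75%
metric, score = workload_score({"cpu": 23.0, "ram": 75.0})
print(metric, score)  # ram 75.0
```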

Health
Here, higher means the system workload is following normal patterns and lower can indicate abnormalities. Normal is defined as vCenter Operations observes workload trends over time. For example, month end for an accounting office application will have higher utilization than any other time. This means overall CPU/RAM/Disk/Network usage will spike, but it's normal and expected.

This to me is one of the biggest advantages over something like SCOM or Nagios. Just because something spikes doesn't mean that I should get an email alerting me (or in the case of our current Nagios implementation, spamming me every 30 minutes).

So when the Health score of an object lowers, key metrics and workload are getting further from what's been previously observed to be normal. This is what I want to know.

Capacity
Based on utilization, when will I run out of resources (CPU, RAM, Storage, Network)?
Pretty straightforward here: high is good, low is bad. The number is based on the binding metric for a capacity breach, meaning the resource that, on its current trend, will run out first (for example, storage). Like the other two scores, this is calculated for each resource on each level - Data Center, Cluster, Host, Guest.
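
To illustrate the "based on the current trend, when will I run out" idea, here's a deliberately naive sketch in Python. It is not how vCenter Operations actually calculates anything; it just assumes straight-line growth between the first and last sample, and all names and numbers are made up for illustration.

```python
from typing import Dict, Optional, Sequence


def days_until_full(daily_usage: Sequence[float], capacity: float) -> Optional[float]:
    """Estimate days until `capacity` is reached, assuming linear growth.

    `daily_usage` holds one sample per day, oldest first. Returns None when
    there is no upward trend (or too few samples) to extrapolate from.
    """
    if len(daily_usage) < 2:
        return None
    growth_per_day = (daily_usage[-1] - daily_usage[0]) / (len(daily_usage) - 1)
    if growth_per_day <= 0:
        return None
    remaining = max(capacity - daily_usage[-1], 0.0)
    return remaining / growth_per_day


def binding_resource(estimates: Dict[str, Optional[float]]) -> Optional[str]:
    """Return the resource that runs out first, i.e. the binding metric."""
    finite = {name: days for name, days in estimates.items() if days is not None}
    return min(finite, key=finite.get) if finite else None


estimates = {
    "storage_gb": days_until_full([700, 710, 720, 730], capacity=1000),  # 27.0 days
    "ram_gb": days_until_full([100, 101, 102, 103], capacity=256),       # 153.0 days
}
print(binding_resource(estimates))  # storage_gb
```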

Of course, as it seems with every licensed software on the planet, there are 3 versions: Standard, Advanced and Enterprise. I won't go into all the details of the differences between them (you can check them out here), but here are the highlights (each one builds on the previous):

Standard
  • Dashboard with Health Scores
  • Behavioral Analysis and Trending
  • Heat Maps
  • Estimated time remaining until capacity is full (CPU, Memory, Disk, Network)
  • Configuration Change Visibility
  • (no alerts)

Advanced
  • Capacity Bottlenecks
  • Resource wastage analysis and trending - including recommendations for right-sizing
  • What-if capacity modeling
  • Custom reports
  • Support for HA, FT and Linked Clones
  • (no alerts)

Enterprise
  • Smart Alerts, including Email and SNMP
  • 3rd party plug-in reporting and data analysis (Nagios, SCOM, Tivoli, etc.)
  • Regulation and Industrial Compliance checks and scans
  • Change Alerts
  • Distribution (OS, Hypervisor, Applications)

I think with the advent of the new licensing model based on vRAM, capacity planning and right-sizing virtual machines will become imperative to every Virtualization Infrastructure admin.

I have yet to try any of these out, but VMware allows you to download and try the Standard edition free for 60 days. As performance and metrics are becoming more integral to my job, I plan to take a good look at this. The product itself looks to be pretty solid and you can count me as impressed.

 

Posted by Matt Vogt
 

vSphere 5 Licensing - post grief post

There have been gobs of reactions to VMware's new license model that was announced last week, and the vast majority of them were negative. I will admit that I took part in some of the initial backlash. We sysadmins don't like change, especially when we've engineered systems to maximize a certain licensing model and then that model changes. But then I started to think it's possible, even likely, that I'm overreacting. Maybe it won't have any effect on us. So I started doing the math.

Licenses are still purchased by the CPU, at minimum one for each socket in your system, with each license having a vRAM Entitlement. For reference, one of the best license summaries I've found is on Alan Renouf's blog http://www.virtu-al.net. (The following is borrowed from his blog post http://www.virtu-al.net/2011/07/14/vsphere-5-license-entitlements/)

License Type       vRAM Entitlement per license
Essentials         24GB
Essentials Plus    24GB
Standard           24GB
Enterprise         32GB
Enterprise Plus    48GB

 

We currently have Enterprise Licensing with the following specs:

5 hosts - each with 2 CPUs, and a total of 256GB physical RAM (pRAM from here on out)

70 Virtual Machines with a total of 140GB Virtual RAM allocated (vRAM). But we also have about 20 powered-off virtual machines for test/dev with an average of 4GB RAM each. So the worst-case scenario for vRAM is 220GB

The Math

10 Licensed CPUs @ 32GB Entitlement per CPU  = 320GB of RAM Entitlements.

So as you can see, at the moment, we're totally fine. That's primarily because we have too many hosts, which I plan to fix by eliminating 2 hosts (pRAM is much cheaper these days than when we started building our cluster, which allows for greater consolidation). This works out like so:

6 Licensed CPUs @ 32GB Entitlement per CPU  = 192GB of RAM Entitlements

So I'll still have 52GB of vRAM headroom, but will have to be careful with how many test/dev servers we turn on at the same time. I'm just glad there isn't a 'hard stop' when you hit your entitlement limit. I'm just not excited to one day tell my CIO that I have to purchase more CPU licenses than we have CPUs.
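
For anyone who wants to run the same numbers for their own cluster, here's the math above as a small Python sketch. The entitlement table is the one borrowed from Alan's post and the host/VM figures are mine from above; the function and variable names are just made up for illustration.

```python
import math

# vRAM entitlement per CPU license, in GB (from the table above)
ENTITLEMENT_GB = {
    "Essentials": 24,
    "Essentials Plus": 24,
    "Standard": 24,
    "Enterprise": 32,
    "Enterprise Plus": 48,
}


def vram_pool_gb(hosts: int, cpus_per_host: int, edition: str) -> int:
    """Pooled vRAM entitlement with one license per physical CPU."""
    return hosts * cpus_per_host * ENTITLEMENT_GB[edition]


def licenses_needed(vram_gb: int, edition: str) -> int:
    """Minimum number of licenses whose pooled entitlement covers `vram_gb`."""
    return math.ceil(vram_gb / ENTITLEMENT_GB[edition])


allocated_gb = 140                     # 70 powered-on VMs
worst_case_gb = allocated_gb + 20 * 4  # plus 20 test/dev VMs at ~4GB each

today = vram_pool_gb(5, 2, "Enterprise")    # 320GB
planned = vram_pool_gb(3, 2, "Enterprise")  # 192GB after dropping 2 hosts

print("pool today:", today)
print("pool after consolidation:", planned)
print("headroom with current VMs:", planned - allocated_gb)          # 52
print("headroom with everything on:", planned - worst_case_gb)       # -28
print("licenses for worst case:", licenses_needed(worst_case_gb, "Enterprise"))  # 7
```

That last number is the uncomfortable one: with everything powered on, 3 dual-socket hosts would need 7 Enterprise licenses for 6 CPUs.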

Not All Bad

I understand where VMware was going with this new model. One of the major tenets of the Cloud is 'pay for what you need/use, not what you don't'. VMware's philosophy is now 'license only how much vRAM you need, not your pRAM'. I've lamented that new virtual machines are becoming more of a business decision, but that isn't all bad. VM sprawl is very much real. Especially with RAM and CPU speeds and feeds exploding for such a minimal cost increase, we vAdmins have had to think less and less about what resources our machines actually need. New Windows Server 2008 R2? Ah, just go ahead and give it 8GB off the bat. Why not?

I think it will help admins and users think critically about the resources they allocate to machines and force people to 'right-size' them. Really, we'll be thankful 5-10 years down the road if we migrate virtualization platforms. (oh no he didn't)

For those that are adopting or have adopted a chargeback model, this will make it much easier to manage and explain the costs of your tiered environment. You want a big beefy server with tons of RAM? Groovy. That'll be $90/1GB RAM/year. But you can have all the CPU you need.

Left Over Beef

  • VMware says they removed "two physical constraints (core and physical RAM)", but they introduced a virtual constraint, and I was never limited by either of those 2 previous ones.
  • The 8GB vRAM limit on free ESXi might be a home/test lab buster. Aren't all the other restrictions enough?
  • I think VMware could do a lot by increasing the entitlement just a bit: 48GB for Enterprise, 64GB or 96GB for Enterprise+ would silence a lot of critics (but also cut into their profit margins)

But these are just my thoughts as a sysadmin in an SMB shop. There doesn't seem to be a huge impact on us, yet. Oh, did I mention we're in the beginning phases of a DR site? Yeah, this will influence some discussions there now.

Posted by Matt Vogt
 

vSphere 5 Fab 2

Well, the announcement came and went for vSphere 5.0 yesterday and a lot of new technology and new capability was put out there. You may have also heard of the new licensing scheme, but I'm not going to cover that yet as I want to take more time to evaluate how it will impact me (but I'm currently in stage 2 of The Five Stages of VMware Licensing Grief). Here are some quick hits on the 2 pieces of new tech that will primarily affect me, as a small shop in a small EDU:

New vMotion (aka Storage DRS goodness)

svMotion has a new copy mechanism that now allows for migrating storage for guests that have snapshots or linked clones. A Mirror Drive is also created on the destination datastore that holds all the changes made during a copy, so when the copy is done, the changes are synced from the Mirror Drive rather than having to make several passes back to the original datastore. This should decrease svMotion times by quite a bit.

Expanding on the amazing DRS feature for VM/host load balancing, Storage DRS brings the same capability to storage. Although this is all wrapped up in the new and improved Storage vMotion, it could stand alone as quite the feature. As introduced with vSphere 4.1, if your storage vendor of choice supports VAAI (storage acceleration APIs), this all happens on the SAN rather than over the network, bringing joy to your network admins.

VMFS-5

Lots of new features here. 

  • 1MB block size - gone are the 1, 2, 4 and 8M block sizes
  • 60TB datastores. Yes, 60. Yes, Terabytes
  • Sub-blocks down to 8k from 64k. Smaller files stay small
  • Speaking of smaller files, files smaller than 1k are now kept in the file descriptor location until they're bigger than 1k
  • 100,000 file limit up from 30,000
  • ATS (part of the locking feature of VAAI) improvements. Should lend itself to more VMs per datastore

VMFS-3 file systems can be upgraded straight to VMFS-5 while the VMs are still running. VMware is calling this an "online & non-disruptive upgrade operation".

A couple of holdover limitations for a VMFS-5 datastore:

  • 2TB file size limit for a single VMDK and non-passthru RDM drives (passthru RDM can be the full 60TB)
  • Max LUNs is still 256 per host (I personally could never see hitting this, but I'm sure larger implementations can)

More vSphere 5 posts will be coming, but these are the 2 things that got me the most excited.

Posted by Matt Vogt
 

Dell Management Plug-in for vSphere

With the ever-growing complexity within our virtualization environment, it's getting a bit unwieldy to manage all the disparate pieces (physical servers, virtual servers, storage, network, etc.). Actually, managing the pieces is getting easier. It's managing the management pieces that's becoming difficult. I've got SANHQ and Group Manager for my SAN, vCenter/Veeam for my vSphere, OpenManage for my Dell servers, and on and on. Anything that cuts down on the number of management infrastructure components is a godsend.

Enter the Dell™ Management Plug-In for VMware vCenter, which is billed as a way to "seamlessly manage both your physical and virtual infrastructure". I've downloaded the trial (version 1.0.1) and will blog about my experience with it after I run it through its paces. The initial difference I see from the older one is that the older version's download (1.0.0.40) came with the User's Guide built into the extract, but the new one did not. I had to go find it here, along with the Quick Install Guide and the Release Notes.

 

Posted by Matt Vogt