Monday, February 28, 2011

Problem with initial Cisco UCS 6100 series fabric interconnect setup with error message: “Failed to configure given password on the switch, system will be rebooted , please wait… !”

The following was probably one of the most annoying errors I’ve ever come across while trying to update the firmware on a Cisco UCS 6120XP fabric interconnect. What basically happened was that the fabric interconnects we got were at firmware version 1.3 and, because we were using the new B230 blades, we wanted to get them updated to 1.4 before we began building the cluster. The plan was to temporarily set up each fabric interconnect as a standalone, update the firmware, then erase the configuration and set up the cluster. I managed to get the firmware for the fabric interconnect updated to 1.4(1j) but had some issues with the UCS Manager. To make a long story short, the firmware was at:

UCS Manager: 1.3(1c)

Fabric Interconnect: 1.4(1j)

image

… the configuration was erased, setup was rerun, and the following error message was displayed:

Failed to configure given password on the switch, system will be rebooted , please wait… !

image

This problem was extremely annoying because the 6120XP essentially reboots automatically and you are asked to run the initial setup again. Each reboot takes upwards of 10 minutes, so every reboot wastes a lot of time.

I went through the exercise of trying to run initial setup 3 times and then thought I could try doing a full state restore but then ran into the following error:

Error: End point timed out. Please retry

image

The TFTP server was running on my laptop, and I had been able to perform a full state backup successfully the day before. Since this workaround wasn’t working and my colleagues were waiting for me to leave the datacenter (it was a Sunday and we had been working for the past 2 weekends), I rebooted the fabric interconnect and decided to write up a list of “things to try” that night and tackle it the next day.

As I thought about the problem at night, I realized that the only other configuration parameter I could change during the initial setup that I hadn’t tried was the Enforce strong password setting. The setting I chose for this option was NO so I could just use the password: “password”, and I repeated this choice in all 3 of the failed setups I ran.

image

Fast forward to the next day and, as stupid as this sounds, when I chose YES for the Enforce strong password option, the setup completed and I was able to successfully configure the fabric interconnect as a standalone setup.

I’m not sure what might have caused this but I do suspect that it may be a mismatch between the UCS Manager (1.3) and the Fabric Interconnect (1.4).  I didn’t have time to confirm this so I hope that the information I’ve provided here may help anyone out there that comes across the exact same problem.
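Since I suspect the mismatch between the two components, here is a minimal Python sketch of how the release strings shown above could be compared before erasing a configuration. The function names are my own and purely illustrative, and the pattern only assumes the `major.minor(maintenance+letter)` shape seen in strings such as 1.3(1c) and 1.4(1j); it is not anything provided by Cisco.

```python
import re

# Shape of the release strings used in this post, e.g. "1.4(1j)":
# major.minor(maintenance number followed by an optional patch letter).
_RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def parse_release(text):
    """Split a release string such as '1.4(1j)' into (major, minor, maintenance, patch)."""
    match = _RELEASE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised release string: {text!r}")
    major, minor, maintenance, patch = match.groups()
    return int(major), int(minor), int(maintenance), patch


def same_release_train(first, second):
    """Return True when both components are on the same major.minor release."""
    return parse_release(first)[:2] == parse_release(second)[:2]


if __name__ == "__main__":
    # The combination described in this post.
    components = {"UCS Manager": "1.3(1c)", "Fabric Interconnect": "1.4(1j)"}
    if not same_release_train(*components.values()):
        print("Release mismatch:", components)
```

Running it with the two versions from this post flags the mismatch, which is at least a reminder to bring both components to the same release before blowing away the configuration.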

Saturday, February 26, 2011

New “Server Virtualization in Microsoft Lync Server 2010” virtualization guide document

I've been checking Microsoft's site for the new virtualization guide ever since I was notified that the old one was pulled because a new one was being developed, and it looks like one was released on February 21, 2011:

Server Virtualization in Microsoft Lync Server 2010

http://www.microsoft.com/downloads/en/details.aspx?FamilyID=2905FD33-E29C-4709-A012-E55EA8DB63E4&displaylang=en

Unfortunately, I've been stuck in North Carolina since last Monday and have to work through the weekend until Wednesday, when I fly back to Toronto, so I'll try to do a breakdown as I did with the original one: http://terenceluk.blogspot.com/2010/12/microsoft-lync-server-2010.html over the next weekend (assuming I don’t have to work again). Though I didn’t read the document in depth, I took a quick scroll through it and the content looks very promising.

Friday, February 25, 2011

Problem installing vCenter Server 4.1 with error: “The following port numbers are either invalid or already in use. VMware VirtualCenter HTTP Port: 80”

It’s been a while since I’ve done a vCenter server install where a full SQL Server 2008 instance is installed locally on the same server, so while I’ve done quite a few 4.0 installs, I’ve never really done one with 4.1. As you’ve probably already guessed, I had to do one today and ran into the error message included in the title of this post, and I would like to share ONE of the MANY reasons why you would encounter this.

Problem

You’ve decided to colocate the SQL Server instance for your vCenter 4.1 install on the same Windows Server 2008 R2 64-bit server. Because SQL Server 2008 R2 requires .NET 3.5.1 to be installed as a feature on the server, which also installs the IIS role, you will be warned with the following message upon launching the vCenter Server 4.1 installation:

VMware vCenter Server

Setup has detected that a web server, such as IIS, is installed on this machine.

vCenter Server uses common web service ports for communication. To avoid resource conflicts, a dedicated machine for vCenter Server is recommended. Do you still want to continue?

image

Proceeding with the install will throw no errors or warnings until you reach the following Configure Ports window:

image

Upon clicking the Next button, the following error will be displayed:

The following port numbers are either invalid or already in use.

VMware VirtualCenter HTTP Port: 80

Note:

The vCenter Server ports must be in [1 - 65535] range.

The LDAP and SSL ports for the Directory Services instance must be 389 and 636 respectively or in the range of [1025 - 65535].

image

Clicking the OK button will return you to the previous window but you will not be allowed to proceed.

Troubleshooting

As noted in the warning that is presented when you initially launch the vCenter 4.1 install, we have the IIS Web Server role installed on the server because it is a dependency of the .NET 3.5.1 feature we need for SQL Server 2008.

image

What should be noted is that if you try to remove the Web Server (IIS) role, you will be warned that all the other features that are dependent on this role will also need to be removed:

image 

image

If you try to remove just some of the role services for the Web Server (IIS) role, there are really only 2 that you can remove:

image

.NET Extensibility

image

Which will also force you to remove .NET Framework 3.5.1 Features:

image

Request Filtering

image

Which will also force you to remove .NET Framework 3.5.1 Features:

image

Seeing how we cannot simply remove the role services, we can turn our attention to determining which service is holding the port with native Windows commands such as NETSTAT with the “-abo” switch:

image

… as well as the Process Explorer tool provided by Windows Sysinternals http://technet.microsoft.com/en-us/sysinternals/bb896653:

image

Through the tracking information collected with these tools, you’ll realize that the service that’s locking port 80 is actually the World Wide Web Publishing Service:

image
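If you would rather not read through the NETSTAT output by eye, the lookup can be sketched in a few lines of Python. This is only a sketch under assumptions: it expects text captured with `netstat -ano` (numeric addresses plus the owning PID), the function name is my own, and the sample text is illustrative rather than copied from the server in this post.

```python
def listeners_on_port(netstat_output, port):
    """Return the PIDs of TCP listeners on `port` found in `netstat -ano` style text."""
    pids = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto, Local Address, Foreign Address, State, PID.
        # Header lines and anything else with a different shape are skipped.
        if len(fields) != 5 or fields[0] != "TCP" or fields[3] != "LISTENING":
            continue
        # The port is whatever follows the last colon (works for IPv4 and [::] forms).
        local_port = fields[1].rsplit(":", 1)[-1]
        if local_port == str(port) and fields[4].isdigit():
            pids.add(int(fields[4]))
    return sorted(pids)


# Illustrative sample only; the real output on your server will differ.
SAMPLE = """
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING       4
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       716
  TCP    [::]:80                [::]:0                 LISTENING       4
  TCP    0.0.0.0:8080           0.0.0.0:0              LISTENING       1234
"""

if __name__ == "__main__":
    print(listeners_on_port(SAMPLE, 80))
```

A PID of 4 is the System process, which is why a tool such as Process Explorer or the Services console is still needed to tie the port back to the actual service.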

Unfortunately, because this service is installed and set to automatically start when you install the Web Server (IIS) role which cannot be removed, the way around this is to actually stop and disable the service within the Services Console:

image

image

Once the service has been stopped, you will now be able to proceed with the install:

image

image

Reviewing the information provided by Process Explorer will show that port 80 is no longer locked by the PID 4 (System) process:

image

For more information about the World Wide Web Publishing Service, see the following link: http://technet.microsoft.com/en-us/library/cc734944(WS.10).aspx.

Please keep in mind that if you have other application services on the server that are dependent on the World Wide Web Publishing Service, those applications will no longer function once you’ve disabled this service, so ensure that you do not inadvertently affect another service.

If you ever run into an issue where you cannot disable the World Wide Web Publishing Service, the alternative is to use a port other than port 80 for vCenter.
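Whichever route you take, it can save a trip through the installer to confirm that the port you intend to use is actually free first. Here is a minimal Python sketch of that check; the function name is my own, and it simply tries to bind a TCP socket to the port, so treat it as a rough pre-check rather than what the vCenter installer itself does.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Covers "address already in use" as well as permission errors,
            # e.g. binding to ports below 1024 without privileges on Unix-like systems.
            return False
    return True


if __name__ == "__main__":
    for candidate in (80, 8080):
        state = "free" if port_is_free(candidate) else "in use or not bindable"
        print(f"TCP port {candidate}: {state}")
```

The socket is closed again as soon as the check finishes, so the probe itself does not hold the port.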

Thursday, February 24, 2011

Updating Cisco UCS B Series infrastructure firmware from 1.4(1i) to 1.4(1j)

It’s been a while since I’ve written a post about updating Cisco UCS B Series Infrastructure firmware and, while preparing to upgrade the firmware for a client I’m working with, I noticed that the last post I wrote was:

Updating Cisco UCS firmware from 1.3(1c) to 1.3(1n)

http://terenceluk.blogspot.com/2010/10/updating-cisco-ucs-firmware-from-131c.html 

Seeing how it’s been a while and I don’t have a post for 1.4, I figured I should take the opportunity to grab some screenshots and write this.

Downloading UCS B Series Firmware

Keep in mind that the way in which Cisco bundles the UCS B Series firmware has changed; for more information about this, see one of my previous posts: http://terenceluk.blogspot.com/2011/01/missing-14-firmware-option-when.html. Go to the Cisco software download page and navigate to:

Products –> Unified Computing –> Cisco UCS Infrastructure Software

image

Once you see the folders with the various versions, choose the one you would like to download (in this example, we’re using 1.4(1j)):

image

Since the components required to update the UCS B Series Infrastructure are split into 2 packages, continue by selecting Select Another Product:

image

Navigate on the Cisco software download page to:

Products –> Unified Computing –> Cisco UCS Manager Server Software

image

Select UCS B-Series Blade Server Software:

image

Once you see the folders with the various versions, choose the one you would like to download (in this example, we’re using 1.4(1j)):

image

Proceed with downloading the 2 packages queued:

  1. UCS B-Series Blade Server Software
  2. Cisco UCS Infrastructure Software

image

image

Uploading UCS B Series Firmware

Once you have the packages downloaded, check to ensure that your fabric interconnects have sufficient space:

image

Once you’ve confirmed you have enough space on the bootflash, proceed with uploading the firmware packages to the 6100 series Fabric Interconnects:

image 

image 

UCS B Series Firmware Update Guide

Unfortunately, there isn’t an upgrade guide available for 1.4 to 1.4 upgrades so I will simply use the 1.3 to 1.4 guide as a reference: http://www.cisco.com/en/US/products/ps10281/prod_installation_guides_list.html

image 

image 

image

As shown in the update guide, the following will be the update order for the devices in the UCS B Series Infrastructure:

  1. Adapter (interface card)
  2. CIMC
  3. I/O module
  4. Cisco UCS Manager
  5. Fabric Interconnect
  6. Host firmware package

image

image 

In case you’re interested in comparing the ordering provided by the 1.2 to 1.3 guide, it’s actually the same other than the reference to CIMC as BMC:

  1. Adapter (interface card)
  2. BMC
  3. I/O module
  4. Cisco UCS Manager
  5. Fabric Interconnect
  6. Host firmware package

image 

Also note that if we were updating the firmware from 1.3 to 1.4, this snippet would be important:

image

However, because this example is from 1.4(1i) to 1.4(1j), we can safely ignore this.

Updating UCS B Series Firmware

Once you’ve completed the upload, you will need to update the UCS B Series Infrastructure components from the firmware version currently in use to the new version that was uploaded. Proceed by navigating to the Equipment tab –> Equipment node –> Firmware Management tab –> Installed Firmware then click on the Update Firmware button:

image 

In the Update Firmware window, set the Filter to All and set the Set Version to the version you would like to update to (in this example, it will be 1.4(1j)):

image 

image 

Pay close attention to the Backup Version of each component and you will see the new version listed:

image

Click on the Apply button and you will see the Update Status column change to updating:

image 

image

If you’re interested in seeing the update status of a component, simply navigate to that component under the Equipment tab, click on the General tab and scroll to the section Update Status:

image

As seen in the following screenshot, the FSM tab isn’t going to tell you much:

image

Navigate back to the Firmware Management tab –> Installed Firmware and wait for all of the components to be listed as ready in the Update Status column:

image

Once all of the components are listed as ready in the Update Status column, proceed with clicking on the Activate Firmware button:

image

Activate Interface Card Firmware

Within the Activate Firmware window, set:

Filter: Interface Cards

Set Version: 1.4(1j)

Ignore Compatibility Check: Checked

Set Startup Version Only: Checked

image

I find that I get asked a lot about why we’re supposed to check the Ignore Compatibility Check checkbox so I’ve included the explanation from the upgrade guide:

Check the Ignore Compatibility Check check box.
The firmware for this release is not compatible with previous releases. Therefore, you must check the
Ignore Compatibility Check check box to ensure that the activation succeeds.

image

Once you proceed and click on the OK button, the interface cards’ firmware activation will begin:

image

Once the activation completes, the Activate Status will be listed as: pending-next-reboot:

image

Activate CIMC Firmware

Once the firmware for the interface cards has been activated, we can proceed with the activation of the CIMC. Within the Activate Firmware window, set:

Filter: CIMC

Set Version: 1.4(1j)

Ignore Compatibility Check: Checked

Set Startup Version Only: Checked

image

Once you proceed and click on the OK button, the CIMC’s firmware activation will begin and once it has completed, it will be rebooted.  Once the reboot completes, the Activate Status will be listed as: ready:

image 

image

image

Activate IO Module Firmware

Once the firmware for the CIMC has been activated, we can proceed with the activation of the IO Module but before we continue, please be aware of the following snippet from the upgrade guide:

When you configure Set Startup Version Only for an I/O module, the I/O module is rebooted
when the fabric interconnect in its data path is rebooted. If you do not configure Set Startup
Version Only for an I/O module, the I/O module reboots and disrupts traffic. In addition, if
Cisco UCS Manager detects a protocol and firmware version mismatch between it and the
I/O module, Cisco UCS Manager automatically updates the I/O module with the firmware
version that matches its own and then activates the firmware and reboots the I/O module
again.

image

When ready to activate the I/O module firmware, click on the Activate Firmware button:

image

Within the Activate Firmware window, set:

Filter: IO Module

Set Version: 1.4(1j)

Ignore Compatibility Check: Checked

Set Startup Version Only: Checked

image

Once you proceed and click on the OK button, the IO Module firmware activation will begin and once the activation completes, the Activate Status will be listed as: pending-next-reboot:

image

As noted in the snippet I included above, you can safely leave the IO Modules with the Activate Status as: pending-next-reboot.
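With 6 chassis' worth of components, I find it helps to jot down the Activate Status values and check them in bulk before moving on to the next component type. The Python sketch below does that for the three states seen in this post (ready, activating and pending-next-reboot); the component names in the example are hypothetical and the values would have to be copied from the Installed Firmware tab by hand, since the sketch does not talk to UCS Manager.

```python
from collections import defaultdict


def group_by_activate_status(statuses):
    """Group component names by their Activate Status value."""
    groups = defaultdict(list)
    for component, status in statuses.items():
        groups[status.strip().lower()].append(component)
    return dict(groups)


def ready_for_next_step(statuses, allowed=("ready", "pending-next-reboot")):
    """Return True when no component is in a state outside `allowed` (e.g. activating)."""
    return all(status.strip().lower() in allowed for status in statuses.values())


if __name__ == "__main__":
    # Hypothetical component names; statuses typed in from the Installed Firmware tab.
    observed = {
        "Chassis 1 / IO Module 1": "pending-next-reboot",
        "Chassis 1 / IO Module 2": "activating",
        "Chassis 1 / Server 1 / CIMC": "ready",
    }
    print(group_by_activate_status(observed))
    print("Safe to continue:", ready_for_next_step(observed))
```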

Activate Board Controller Firmware

Once the firmware for the IO Modules has been activated, we can proceed with the activation of the Board Controller but before we continue, please be aware of the following snippet from the upgrade guide:

Activating the Board Controller Firmware on a Server to Release 1.4(1)
Only certain servers, such as the Cisco UCS B440 High Performance blade server and the Cisco UCS B230
blade server, have board controller firmware. The board controller firmware controls many of the server
functions, including eUSBs, LEDs, and I/O connectors.
This procedure continues from the previous one and assumes that you are on the Installed Firmware tab.
This activation procedure causes the server to reboot. Depending upon whether or not the service profile
associated with the server includes a maintenance policy, the reboot can occur immediately. To reduce
the number of times a server needs to be rebooted during the upgrade process, we recommend that you
upgrade the board controller firmware through the host firmware package in the service profile.

The snippet above pertains to the UCS Infrastructure used in this example because all of the servers in the 6 chassis are actually Cisco UCS B230 blades.

When ready to activate the Board Controller firmware, click on the Activate Firmware button:

image

Within the Activate Firmware window, set:

Filter: Board Controller

Set Version: This option is grayed out

Ignore Compatibility Check: Checked

Now from the Startup Version column, select the version B230100A:

image

Once you proceed and click on the OK button, the Board Controller firmware activation will change from ready to activating:

image

… then ready:

image

Activate UCS Manager Firmware

Once the firmware for the Board Controllers has been activated, we can proceed with the activation of the UCS Manager, so proceed with clicking on the Activate Firmware button:

image

Within the Activate Firmware window, set:

Filter: UCS Manager

Set Version: 1.4(1j)

Ignore Compatibility Check: Checked

image

Once you proceed with clicking the OK button, you’ll briefly get a warning message about ignoring the compatibility check.  Continue by clicking the Yes button to proceed:

image 

You will be kicked off of UCS Manager soon after the activation begins:

image

Wait till you are able to hit the UCS Manager webpage:

image

Continue to log in:

image

Once you’ve successfully logged back into UCS Manager, navigate to the Installed Firmware tab to make sure UCS Manager has been updated:

image

Activate Clustered Fabric Interconnect Firmware

Activate Subordinate Fabric Interconnect:

As noted in the upgrade guide:

image

… it is important to perform the firmware activation on the subordinate fabric interconnect first, before you activate the primary, to avoid unplanned disruption.

Also, prior to activating the firmware on the subordinate fabric interconnect, it is important that you verify that the cluster’s High Availability Details state is:

Ready: Yes

State: Up

Cluster Link State: Full
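The same three values are worth re-checking before every fabric interconnect activation, so here is a minimal Python sketch of the comparison. It assumes you have copied the High Availability Details into plain “Field: Value” lines as shown in this post; the function names are my own and nothing here queries UCS Manager directly.

```python
# Values the cluster should report before a fabric interconnect is activated.
EXPECTED_HA_STATE = {"Ready": "Yes", "State": "Up", "Cluster Link State": "Full"}


def parse_ha_details(text):
    """Turn 'Field: Value' lines into a dictionary, ignoring lines without a colon."""
    details = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if separator:
            details[key.strip()] = value.strip()
    return details


def ha_problems(details):
    """Return the fields whose value differs from the expected healthy state."""
    return {
        field: details.get(field)
        for field, expected in EXPECTED_HA_STATE.items()
        if details.get(field) != expected
    }


if __name__ == "__main__":
    copied = """
    Ready: Yes
    State: Up
    Cluster Link State: Full
    """
    problems = ha_problems(parse_ha_details(copied))
    print("Cluster healthy" if not problems else f"Do not activate yet: {problems}")
```

An empty result means all three fields match; anything else, such as the Ready: No and State: Down values seen while a node reboots, is a sign to wait.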

image

Once you’ve validated that the cluster is in good health, proceed with the activation of the subordinate fabric interconnect by clicking on the Activate Firmware button:

image

Within the Activate Firmware window, set:

Filter: Fabric Interconnects

Set Version: <do not set>

Ignore Compatibility Check: Checked

Now from the Startup Version column of the subordinate fabric interconnect, select the version you’re updating to for the Kernel and System:

image

Once you proceed with clicking the OK button, you’ll get a warning message that activating the fabric interconnects will cause them to reboot. Continue by clicking the Yes button to proceed:

image

After you’ve answered Yes to the warning message, you will see the Activate Status of the subordinate fabric interconnect listed as activating:

image

You’ll notice that during the initial start of the activation, the cluster state of your primary and subordinate fabric interconnects will still be listed as being in good health:

image

image

If you would like to review the status of the update, you can navigate to the subordinate fabric interconnect’s General tab and scroll down to the Update Status section:

image

As the activation proceeds, you’ll see the status of the subordinate fabric interconnect turn red:

image

The High Availability Details for the subordinate will also change to:

Ready: No

State: Down

Failure Reason: Node down

Leadership: Inapplicable

Cluster Link State: Full

image

The High Availability Details for the primary will also change to:

Ready: No

State: Up

Failure Reason: Peer node down

Leadership: Primary

Cluster Link State: Full

image

Note that the activation of the new firmware on fabric interconnects takes quite a bit of time so be patient:

image

You’ll see the fabric interconnect eventually complete a reboot and go through various states (i.e. election) with states such as:

Failure Reason: Chassis configuration incomplete

Leadership: Election in progress

… but remember that no action is required:

image

image

You’ll also notice that even when the Activate Status in the Installed Firmware tab lists the subordinate as ready, you may still see a yellow bracket around the subordinate fabric interconnect:

image

Waiting for a bit longer will show it change to blue:

image

Then finally change to normal (no bracket) but the High Availability Details may still list:

Ready: Downgraded

State: Up

Failure Reason: Chassis configuration incomplete on peer node

Leadership: Primary

Cluster Link State: Full

image

A few more seconds and the yellow bracket may come back:

image

image

The bottom line is that the errors and faults will eventually clear:

image

image

Activate Primary Fabric Interconnect:

As with updating the subordinate, prior to activating the firmware on the active fabric interconnect, it is important that you verify that the cluster’s High Availability Details state is:

Ready: Yes

State: Up

Cluster Link State: Full

image

Once you’ve validated that the cluster is in good health, proceed with the activation of the active fabric interconnect by clicking on the Activate Firmware button:

image

Within the Activate Firmware window, set:

Filter: Fabric Interconnects

Set Version: <do not set>

Ignore Compatibility Check: Checked

Now from the Startup Version column of the primary fabric interconnect, select the version you’re updating to for the Kernel and System:

image

Once you proceed with clicking the OK button, you’ll get a warning message that activating the fabric interconnects will cause them to reboot. Continue by clicking the Yes button to proceed:

image

image

After you’ve answered OK to the informational message, you will see the Activate Status of the primary fabric interconnect listed as activating:

image

You’ll notice that during the initial start of the activation, the cluster state of your primary and subordinate fabric interconnects will still be listed as being in good health:

image 

If you would like to review the status of the update, you can navigate to the primary fabric interconnect’s General tab and scroll down to the Update Status section:

image

image

Once the activation of the firmware completes for the primary fabric interconnect, it will reboot and kick you out of UCS Manager:

image

As with the subordinate fabric interconnect, the state of the primary fabric interconnect will start with a red bracket indicating there are major faults then eventually cycle through the less critical faults and warnings:

image

image

You’ll also notice that the fabric interconnect is still going through the update process after logging back into UCS Manager:

image

image

The firmware activation will eventually complete and you’ll notice that your previously primary fabric interconnect is now the subordinate:

image

Completing Interface Cards’ “pending-next-reboot” status

The last task is to reboot your servers so that the firmware activation for the interface cards will complete:

image

Note that even if you don’t have a service profile associated to the server and it’s powered off, you will still need to power cycle the server to complete the update process:

image

image

The server will go through the regular boot cycle:

image

… and when it completes, will have the interface card updated:

image

If you’re dealing with a new deployment and nothing is currently running on any of your blades, you can actually navigate to the Chassis tab –> Blade Servers tab, then highlight all the servers, right click and choose Reset:

image

This will reset all of the servers to complete the update.

--------------------------------------------------------------------------------------------------------------------------------------------------------------------

This concludes the update process for Cisco UCS B Series Infrastructure.