I ran into a problem last week while updating the firmware on a Cisco UCS chassis and its blade servers, and I suspect other professionals may run into it as well. I had successfully updated the firmware within UCS Manager and proceeded to activate it:
…but one of the servers within the chassis stayed stuck on "updating" for its Update Status:
As shown in the screenshot above, all the other servers were ready except for server 2 in chassis 1. Opening the properties window of the interface card shows:
FSM Status: UpdateAdaptorFail
Retry #: 20
Current Stage Description: power on the blade(FSM-STAGE:sam:dme:ComputeBladeUpdateAdapter:BladePowerOn)
Description: update backup image of Adaptor(FSM:sam:dme:ComputeBladeUpdateAdaptor)
Time of Last Operation: 2010-10-04T11:38:15
Status of Last Operation: UpdateAdaptorFail
Remote Invocation Result: end-point-unavailable
Remote Invocation Error Code: 1002
Remote Invocation Description: no connection to MC endpoint
Progress Status: 3%
After reading this message, I expanded the Equipment node on the left, and that’s when I saw a red line around the Chassis 1 node, which meant there were errors.
As I continued to expand the nodes, I immediately noticed that Server 2 had critical errors.
Long story short, the blade server had at some point been swapped from one chassis to another, no one had acknowledged the move, and so the blade had not been operational since.
Once I resolved the slot issue, the activation of the firmware continued and was successful.
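If you would rather catch this state from a script than from the properties window, the FSM fields quoted above are plain "key: value" lines and are easy to parse. The following is only a minimal sketch in Python: the function names are my own, the input is the text copied from this post (not a UCS Manager API response), and treating any FSM Status ending in "Fail" as a failure is an assumption based on this single sample.

```python
# Text copied from the FSM properties shown earlier in this post.
FSM_TEXT = """\
FSM Status: UpdateAdaptorFail
Retry #: 20
Current Stage Description: power on the blade(FSM-STAGE:sam:dme:ComputeBladeUpdateAdapter:BladePowerOn)
Description: update backup image of Adaptor(FSM:sam:dme:ComputeBladeUpdateAdaptor)
Time of Last Operation: 2010-10-04T11:38:15
Status of Last Operation: UpdateAdaptorFail
Remote Invocation Result: end-point-unavailable
Remote Invocation Error Code: 1002
Remote Invocation Description: no connection to MC endpoint
Progress Status: 3%
"""


def parse_fsm_properties(text):
    """Split each 'key: value' line on the first colon only,
    so values that contain colons (timestamps, FSM stage names) stay intact."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon at all
            fields[key.strip()] = value.strip()
    return fields


def fsm_failed(fields):
    """Assumption: a status ending in 'Fail' means the FSM gave up."""
    return fields.get("FSM Status", "").endswith("Fail")


fields = parse_fsm_properties(FSM_TEXT)
if fsm_failed(fields):
    print("FSM failed:", fields["Remote Invocation Description"])
```

In my case the useful hint was the Remote Invocation Description, since "no connection to MC endpoint" pointed at the blade itself rather than at the firmware image.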
Lesson learned: Remember to check for any faults and errors before trying to upgrade firmware on a UCS system. I don’t manage this environment, and I was rushing the upgrade to get 2 blade servers ready for VIEW, so I never took the time to do the sanity checks.
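The sanity check itself can be reduced to a simple gate: collect the current faults and refuse to start if any are serious. Here is a rough sketch in Python, assuming the faults have already been exported into a list of dictionaries. The record layout, the sample entries, and the choice to block only on "critical" and "major" are all assumptions for illustration, not something UCS Manager prescribes.

```python
# Assumption: severities that should stop a firmware upgrade.
BLOCKING_SEVERITIES = {"critical", "major"}


def blocking_faults(faults):
    """Return the fault records whose severity should block the upgrade."""
    return [f for f in faults if f["severity"].lower() in BLOCKING_SEVERITIES]


def safe_to_upgrade(faults):
    """True only when no blocking faults are present."""
    return not blocking_faults(faults)


# Hypothetical sample records, loosely modeled on the situation in this post.
sample_faults = [
    {"dn": "sys/chassis-1/blade-2", "severity": "critical",
     "descr": "illustrative: blade not acknowledged after move"},
    {"dn": "sys/chassis-1/blade-5", "severity": "info",
     "descr": "illustrative: informational event"},
]

if not safe_to_upgrade(sample_faults):
    for fault in blocking_faults(sample_faults):
        print("Resolve before upgrading:", fault["dn"], "-", fault["descr"])
```

Had I run something like this first, the critical fault on Server 2 would have shown up before the activation ever started.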