UCS Blades Power Off Unexpectedly

I was called into an interesting issue this past week. I was told that a chassis' worth of UCS blades had powered off for no apparent reason, bringing down part of production. Initial troubleshooting turned up no real culprits: UCSM was clean of errors except for an IOM POST error. A show-tech command was initiated and a sev1 was opened with Cisco TAC. The on-call technician attempted to power on the servers by selecting them all in UCSM, right-clicking them, and selecting reset. The blades powered on and came back online without issue.

So what caused the blades to power off unexpectedly?

WARNING: CpuSched: XXXX: processor apparently halted for XXXX ms

While I have seen others discuss this error message and its solution, I figured it would be worth discussing in the context of a specific configuration: Cisco hardware running VMware virtualization. It is important to understand the implications of the error message and how much BIOS configuration matters.

First, the issue: Cisco UCS B230-M2 blades (dual 10-core = 20 ‘processors’) running ESXi were throwing processor-halted log messages. While this by itself may or may not be a problem, under light load from VMware clone operations the ESXi hosts were disconnecting from vCenter Server (vCS) and becoming unresponsive for several minutes. Further digging uncovered that whenever an ESXi host disconnected from vCS, the logs showed all processors on the host halting at exactly the same time.
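One way to confirm that kind of pattern is to group the halt messages by timestamp and look for seconds in which every processor reported a halt. The following is only a rough sketch: the log line layout (an ISO timestamp followed by `cpuN:world)`), the one-second bucketing, and the helper names `halts_by_second` and `host_wide_halts` are my assumptions, not something taken from the actual hosts.

```python
import re
from collections import defaultdict

# Assumed line shape (illustrative, not copied from the affected hosts):
#   2012-03-01T12:00:00.123Z cpu3:4099)WARNING: CpuSched: 8905: processor apparently halted for 1234 ms
HALT_RE = re.compile(
    r"^(?P<ts>\S+)\s+cpu(?P<cpu>\d+):\d+\)WARNING: CpuSched: \d+: "
    r"processor apparently halted for (?P<ms>\d+) ms"
)


def halts_by_second(lines):
    """Map each one-second bucket to the set of CPUs that logged a halt in it."""
    buckets = defaultdict(set)
    for line in lines:
        match = HALT_RE.match(line)
        if match:
            # Truncate the timestamp to whole seconds: 'YYYY-MM-DDTHH:MM:SS'
            buckets[match.group("ts")[:19]].add(int(match.group("cpu")))
    return dict(buckets)


def host_wide_halts(lines, cpu_count=20):
    """Return the sorted buckets in which at least cpu_count distinct CPUs halted."""
    return sorted(
        ts for ts, cpus in halts_by_second(lines).items() if len(cpus) >= cpu_count
    )
```

Feeding it the lines of a host's kernel log and comparing the returned timestamps against the vCS disconnect times would show whether the two line up. The default of 20 matches the processor count above and would need adjusting for other blades.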

Cisco UCS B230 M2 Caveats

The latest half-width Cisco UCS B-series servers are beautiful pieces of machinery. In such a small footprint it is possible to get two processors each with 10 cores and 20 threads. In addition, the blade supports an amazing 512GB of memory and features the ability to support up to two hot-swappable SSD drives.

Cisco UCS B230 M2

Unfortunately, while working with this bleeding-edge technology, I have run into several limitations. I would like to share some of the limitations I have experienced and the workarounds available today.

UCSM KVM Console Displays Black Screen

Over the past several weeks, I have been standing up new hardware in order to expand a cloud infrastructure. During this process, I was tasked with standing up new UCS hardware. I configured the static IP addresses on the N6K devices, configured the firmware to the appropriate versions and created the appropriate service profiles. The next step was to install the base operating systems.

With PXE in place, I opened the KVM console from UCSM and waited for the server to power on. Once the server was powered on, I saw the Cisco splash screen as expected. The problem was that once the system completed POST, the console showed only a black screen. The blade appeared to be functioning and UCSM showed no error messages, yet the screen was still blank 30 minutes later. I opened the console on several other blades and, to my surprise, they had the same problem.

When I plugged a physical KVM directly into the server to check the console, the screen displayed as expected. What was going on?
