Monday, May 9, 2011

Upgrading Firmware on System P5


Overview

The goal of this blog post is to provide easy-to-read instructions so that readers can quickly update system firmware on any p5 server. It is assumed the reader has basic p5 skills.

I will cover the three most widely used update methods in this post:


1. Update on an HMC managed system
2. Update on a standalone server via the OS, without an HMC
3. Update on a standalone server using the Diagnostic CD, without an HMC

It is vital to read the prerequisites before proceeding with any of the above methods. Then refer to one section, or all of them, when choosing the preferred method for updating system firmware.

Hardware used in this example


The hardware used in developing this paper was a p5 550Q (type-model 9133-55A). HMC (version 5.2.0 including fix MH00586) was also used when needed.

System firmware fixes and upgrades


Firmware, also known as microcode, is Licensed Internal Code that fixes problems and enables new system features as they are introduced.

New features introduced are supported by new firmware release levels. In between new hardware introductions, there are fixes or updates to the supported features. These fixes are often bundled into service packs. A service pack is referred to as an update level. A new release is referred to as an upgrade level.

Both levels are represented by a file name of the form PPMMXXX_YYY_ZZZ. PP and MM are the package and machine type identifiers. PP is 01 for a managed system or 02 for a power subsystem. MM is SF for p5 systems and BP for the Bulk Power Controller. The firmware file applicable to p5 machines therefore has the form 01SFXXX_YYY_ZZZ.

Decoding firmware names

The file naming convention for system firmware is:
01SFXXX_YYY_ZZZ, where
XXX is the stream release level
YYY is the service pack level
ZZZ is the last disruptive service pack level

For example, the system firmware 01SF235_185 would be described as release level 235, service pack 185.
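
To make the naming convention concrete, here is a minimal Python sketch that splits a firmware name into its parts. The names parse_firmware_name and FirmwareLevel are my own, chosen for illustration, and are not part of any IBM tool. The pattern assumes names shaped like 01SF235_185 or 01SF240_202_201.

```python
import re
from typing import NamedTuple, Optional


class FirmwareLevel(NamedTuple):
    package: str                    # PP: "01" managed system, "02" power subsystem
    machine: str                    # MM: "SF" for p5 systems, "BP" for Bulk Power Controller
    release: int                    # XXX: stream release level
    service_pack: int               # YYY: service pack level
    last_disruptive: Optional[int]  # ZZZ: None when the name omits it


# Assumed shape: optional two-digit PP, two-letter MM, then XXX_YYY[_ZZZ].
_NAME = re.compile(r"(\d{2})?([A-Z]{2})(\d+)_(\d+)(?:_(\d+))?")


def parse_firmware_name(name: str) -> FirmwareLevel:
    """Split a name such as '01SF240_202_201' into its components."""
    match = _NAME.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not a recognised firmware name: {name!r}")
    pp, mm, xxx, yyy, zzz = match.groups()
    return FirmwareLevel(pp or "", mm, int(xxx), int(yyy),
                         int(zzz) if zzz is not None else None)


if __name__ == "__main__":
    print(parse_firmware_name("01SF235_185"))
    print(parse_firmware_name("01SF240_202_201"))
```

Keeping the levels as integers makes later comparisons (for example, service pack 185 against 190) straightforward.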

Each stream release level supports new machine types and/or new features.

Firmware updates can be disruptive or concurrent. A disruptive upgrade is defined as one that requires the target system to be shut down and powered off prior to activating the new firmware level. A new release level upgrade will always be disruptive. All other upgrades are defined as concurrent, meaning that they can be applied while the system is running. Concurrent updates require an HMC but are not guaranteed to be non-disruptive.

In general, a firmware upgrade is disruptive if:

1. The release levels (XXX) are different.
Example: Currently installed release is SF230, new release is SF235

2. The service pack level (YYY) and the last disruptive service pack level (ZZZ) are equal.
Example: SF235_180_180 is disruptive, no matter what level of SF235 is currently installed on the system

3. The service pack level (YYY) currently installed on the system is lower than the last disruptive service pack level (ZZZ) of the new service pack to be installed.
Example: Currently installed service pack is SF235_180_180 and the new service pack is SF235_190_185

An installation is concurrent if:
1. The service pack level (YYY) is higher than the service pack level currently installed on your system.
Example: Currently installed service pack is SF235_180_160, new service pack is SF235_185_160.
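
The rules above can be sketched as a small Python check. This is only an illustration of the three naming rules as stated here. The function names split_level and is_disruptive are hypothetical, and the sketch does not cover the other cases mentioned in this post (systems without an HMC are always disruptive, and concurrent updates are not guaranteed to be non-disruptive).

```python
import re


def split_level(name):
    """Return (release, service_pack, last_disruptive) from e.g. 'SF235_180_160'."""
    match = re.fullmatch(r"(?:\d{2})?[A-Z]{2}(\d+)_(\d+)_(\d+)", name)
    if match is None:
        raise ValueError(f"expected a name like SF235_180_160, got {name!r}")
    return tuple(int(part) for part in match.groups())


def is_disruptive(installed, new):
    """Apply the three naming rules to an installed and a new firmware level."""
    cur_release, cur_sp, _ = split_level(installed)
    new_release, new_sp, new_last_disruptive = split_level(new)
    if new_release != cur_release:        # rule 1: release levels differ
        return True
    if new_sp == new_last_disruptive:     # rule 2: YYY equals ZZZ in the new level
        return True
    if cur_sp < new_last_disruptive:      # rule 3: installed YYY below new ZZZ
        return True
    return False


if __name__ == "__main__":
    # The two same-release examples from the text above.
    print(is_disruptive("SF235_180_180", "SF235_190_185"))
    print(is_disruptive("SF235_180_160", "SF235_185_160"))
```

The first call falls under rule 3 and the second matches the concurrent example, so the function returns True and False respectively.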

Why are there multiple firmware streams

Multiple firmware streams (eg, SF230, SF235, SF240, etc) are available for a given type-model (eg, 9117-570). IBM maintains multiple parallel firmware streams so customers can install firmware fixes while avoiding a scheduled p5 server outage.

IBM releases parallel firmware streams (release levels) which are analogous to the AIX V5.2 and V5.3 release levels. Like AIX V5.2 and V5.3, the firmware streams perform similar functions on the same machine type-models, but SF240 (for example) may have some functions which SF235 (for example) does not.

There is usually little or no reason to upgrade from one release to another (eg, SF235 to SF240), but if the upgrade is performed, it requires an outage of the entire p5 server (every LPAR at once). In contrast, for systems managed by an HMC, updating from one service pack level to another (within the same stream release level) is often not disruptive.

Decoding the operator control panel


When the system is powered on, note the operator control panel. It should appear similar to the image below.
 
01    N     V=F

HMC=1       T
 
In this example, the letter T on the control panel indicates that the system is booted from the temporary side of the firmware image. N indicates the system is booted in normal mode. V=F indicates the boot speed is set to Fast. HMC=1 indicates that the server is managed by and connected to one HMC.

If the server has recently been managed by an HMC and no HMC is connected, the panel displays HMC=0. If no HMC is available and the server should be set to unmanaged, it might be necessary to reset the service processor to factory defaults using ASMI.


Temporary versus Permanent Firmware sides


The Service Processor maintains two copies of firmware, the temporary and permanent side, to help manage and reduce the frequency of downtime for maintenance. The permanent side is also known as the "P" side. The temporary side is also known as the "T" side. Server firmware fixes are installed on the temporary side. Copying the temporary firmware level to the permanent side is known as committing or accepting the fix. Conversely, rejecting, or removing the current firmware level consists of copying the permanent firmware image to the temporary side.

Note: It is recommended to use a new firmware fix for a period of time before committing (or accepting) it.

If firmware fixes are applied consecutively, the first fix will, by default, be copied from the temporary to the permanent side, or accepted. Using an HMC, it is possible to simply replace the temporary image by doing an Install and Activate of the new firmware and indicating that the firmware should not be accepted.
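
The relationship between the two sides can be modelled in a few lines of Python. This is a toy model of the behaviour described above, not a representation of how the service processor is implemented. The class and method names are hypothetical, and the levels in the usage example are only sample values.

```python
from dataclasses import dataclass


@dataclass
class FlashSides:
    temporary: str   # T side: where new fixes are installed
    permanent: str   # P side: the backup copy

    def install(self, level, accept_previous=True):
        """Install a fix on the T side; by default the old T level is accepted first."""
        if accept_previous:
            self.permanent = self.temporary
        self.temporary = level

    def commit(self):
        """Commit (accept): copy the T side to the P side."""
        self.permanent = self.temporary

    def reject(self):
        """Reject (remove): copy the P side back over the T side."""
        self.temporary = self.permanent


if __name__ == "__main__":
    sides = FlashSides(temporary="SF235_185", permanent="SF235_185")
    sides.install("SF240_202")
    print(sides)   # T holds the new level, P still holds SF235_185
    sides.reject()
    print(sides)   # both sides are back at SF235_185
```

The accept_previous flag stands in for the HMC option of indicating that the current temporary image should not be accepted before it is replaced.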

Be Advised

During a firmware update, the flashing of the NVRAM might take anywhere from ten minutes to one hour. In general, updating to a new release level will take longer. Ensure the system is not interrupted before the flash process is completed. Interrupting this process could result in a service call.
For systems that are not managed by an HMC, the installation of system firmware is always disruptive.

During the update_flash process, the console output will be displayed. Again, do not interrupt this process.
 
Restarting system.
FLASH: preparing saved firmware image for flash
FLASH: flash image is 35191632 bytes
FLASH: performing flash and reboot
FLASH: this will take several minutes.  Do not power off!
 
Requirements
Software requirements 

The table below is a summary of the minimum components required for each method covered in this paper:

Method: Update via HMC
Minimum requirements:
1. A compatible version of HMC.
2. An Ethernet connection from the HMC to the p5 server (HMC1 port).
3. Desired firmware image on CD. The rpm and XML files are required.

Method: Update via running AIX or Linux operating system
Minimum requirements:
1. A running AIX or supported Linux operating system in a single-LPAR environment, i.e., no attached HMC.
2. Firmware image on a CD or file system. Only the rpm file is needed.
3. update_flash executable. For AIX, it is part of the diagnostic aids tool in the /usr/lpp/diagnostics/bin directory. For Linux, it is part of the Service and Productivity Tools.
4. Serial console and connection.

Method: Update via Standalone Diagnostic CD
Minimum requirements:
1. Diagnostic CD.
2. Firmware image (.img) file on a CD. Remember, the rpm file is not directly compatible.
3. Serial console and connection.


There are two very helpful sites that will assist in gathering the components necessary to update firmware. Visiting the Microcode downloads site (Figure 1) is recommended before performing any updates.



To download the rpm and XML files, input the server machine type and model number and select the latest firmware components based on the below requirements table (option 1 in Fig 3).

If planning to use the Diagnostic CD method, use option 4 to download the firmware ISO image. The rpm files are not directly compatible with the Diagnostics CD. For the smallest ISO image, take the following path using option 4:
Obtain ISO Image -> Download P5 Microcode -> Select one -> GO

Next, go to the Power5 Code matrix site (Figure 2) to ensure the existing code levels support the downloaded firmware release. For the purposes of this paper, this applies mostly to the HMC version level. If an HMC is not being used, this is for information only.


How to Determine Currently Installed Firmware Levels

The installed firmware level can be determined from the operating system with the command: lsmcode -c

Determining the Installed Firmware Level using the HMC console
The HMC interface refers to server firmware as Licensed Internal Code.
1. Expand the Licensed Internal Code Maintenance folder.
2. Click the Licensed Internal Code Updates icon.
3. In the Contents area, click Change Internal Code.
4. In the Target Object Selection window, click the target system, and click OK.
5. In the Change Internal Code window, select View system information and click OK.
6. In the Specify LIC Repository window, select None and click OK.
(For more information about each of the repositories, click Help)
The following (for example) information is displayed 

EC Number   LIC Type          Installed Level   Activated Level   Accepted Level
01SF220     Managed System    49                49                49
02BP220     Power Subsystem   49                49                49

Where:
- Installed Level is the Temporary Firmware
- Accepted Level is the Permanent Firmware
- Activated Level is the booting Firmware
If the Temporary firmware image is lower than SF230_158, you should consider installing the update.
Downloading the Firmware Package
The two sites mentioned above are for users who already have an IBM sign-in to download the required package.

Any other user can follow the procedure shown in the screenshots below to download it.

Go to Fix Central using the link below.
Use the option below to download the firmware.










Continue with the menu options presented to select the required firmware version. For example, in this document I have chosen SF240_358, which is a concurrent update.



This is a concurrent FW update and does NOT require shutdown of LPARs or quiescing of processes. However, shutdown of LPARs or quiescing of processes can be performed if you feel it necessary.





You can download the firmware to a CD-ROM or to an FTP site.
Making a CD-ROM (Never use a CD R/W)
Create the CD-ROM from the <firmware>.iso file. The CD-ROM contains both the system and power subsystem firmware files.

Another method is to download the individual .rpm and .xml files from this location to a CD-ROM. Both the .xml and .rpm files are required: 01SF2xx_yyy_zzz.xml, 01SF2xx_yyy_zzz.rpm, 02BP2xx_yyy_zzz.xml, 02BP2xx_yyy_zzz.rpm.

For PL3250R and PL6450R models the CD-ROM or ftp site must contain both the system and power subsystem firmware files.


Pre-checks before the Firmware Upgrade 
The "Check System Readiness" task is run when the firmware update is started.
On the HMC GUI, select "Licensed Internal Code Maintenance".
Select "Licensed Internal Code Updates".

Select "Check System Readiness".
Select the correct target system and click OK.
The "Check System Readiness" task reports PASS or FAIL. To verify the results, it can be run a second time.

 

Updating the Firmware -- Concurrent
The method used to install new firmware will depend on the release level of firmware which is currently installed on your server. The release level can be determined by the prefix of the new firmware's filename.
Example: 01SFXXX_YYY_ZZZ
Where XXX = release level
Updating the Firmware
If the release level will stay the same (Example: Level 01SF222_075_075 is currently installed and you are attempting to install level 01SF222_081_081) this is considered an update.
Updating the Temporary firmware
From the HMC display:
- select Licensed Internal Code Maintenance
- select Licensed Internal Code Updates
- select Change Licensed Internal Code for the current release
- select the system to be updated and click OK
- select Start Change Internal Code Wizard and click OK
- select DVD drive or FTP site according to the method you want and click OK



 






 




 

Upgrading the System Firmware - Disruptive


Remember, upgrading to a new release is a disruptive upgrade. Start by opening the Server and Partition folder in the HMC. Then, click on Server Management in that folder. If the state of the machine is Power off, Ready, or Standby, then proceed. Setting the state to Power off is recommended when performing a firmware upgrade, although it is not required. Note: only HMC managed systems can perform firmware upgrades with target system set to Power off state.




Next, open the Licensed Internal Code Maintenance folder on the Hardware Management Console. Then, click on Licensed Internal Code Updates in that folder. In this example, the update will be from our current firmware level 01SF235_185 to 01SF240_202, so the normal "Change Licensed Internal Code for the current release" feature will not work. Select Upgrade Licensed Internal Code to a new release.


Select the desired target managed system and click OK.
 


Insert the CD with the rpm and XML files into the drive. On the Specify LIC Repository panel, select DVD drive and click OK.
Next, a Select LIC level panel is shown. Click OK.


The next prompt will be to accept the LIC license agreement for machine code. Read the license and click OK to accept. After accepting the license agreement, confirm the disruptive upgrade action. Click OK to proceed. When the firmware is flashed, the FSP will restart and activate the new firmware level.
 



A dialog box will appear showing the elapsed time and status of the firmware upgrade.
WARNING - During a disruptive update, the flashing of NVRAM might take from ten minutes to two hours. Do not interrupt the process before the flash process is complete.


Once the firmware upgrade has completed, view the system firmware information to see how the upgrade has changed what is available.
 
Reject the installed firmware using an HMC


From the HMC left navigation area, select Licensed Internal Code Updates.
In the content area, click Change Licensed Internal Code for the current release. In the Target Object Selection window, click the target system, and click OK.
The main panel then displays three options: start the Change Internal Code wizard, view system information, and select advanced features.



The Remove and activate feature is a one-step process that removes the currently active T-side firmware and rolls back to (copies) the firmware from the P-side. Essentially, it undoes the last firmware update and restores the T-side with the P-side firmware version.


 
Upgrade firmware without an HMC
Access ASMI via serial console
A system with no HMC is also known as an unmanaged system. ASMI is used to power on the system and perform other useful functions. Using a serial cable and a program like HyperTerminal on Windows, ASMI and the active console can be accessed. Other communication programs should work.





Back view of p550Q server with ports for (left to right)
serial, SPCN, HMC, USB and Ethernet.
Once the serial connection to the system is established, press the Enter key to be presented with the following ASMI login screen.
Welcome

Machine type-model: 9133-55A
Serial number: 10B7D4G
Date: 2006-4-21
Time: 20:12:48

Service Processor: Primary
User ID: admin
Password: *****

User ID to change: admin
Current password for user ID admin: *****
New password for user: ******

New password again: ******
Operation completed successfully.

PRESS ENTER TO CONTINUE:
Number of columns [80-255, Currently: 80]:
Number of lines [24-255, Currently: 24]:

Type the User ID and password to log in to ASMI. If this is the first time logging into ASMI, it might be required to change the default password. The default password is admin.

Checking the current firmware level


Upon logging into ASMI, the firmware level will be clearly displayed as shown below.
System name: Server-9133-55A-SN10B7D4G

Version: SF235_185
User: admin
 
Copyright © 2002-2005 IBM Corporation. All rights reserved.

1. Power/Restart Control
2. System Service Aids
3. System Information
4. System Configuration
5. Network Services
6. Performance Setup
7. On Demand Utilities
8. Concurrent Maintenance
9. Login Profile

99. Log out
 
The current activated firmware level is also shown in the ASMI web interface after logging in. To see the firmware level, look in the upper-right corner. Basically, to access the ASMI web interface, connect an Ethernet cable directly to the HMC1 port. For details on accessing the ASMI web interface,
 


The current active firmware level can also be seen from the output of the Display Microcode Level selection on the Diagnostics CD. Details of how to get to the following screen are provided

The current active firmware level as seen from the SMS screen



Finally, if the server has a running AIX or Linux operating system with the Service and Productivity tool lsvpd installed, the lsmcode command can be used as shown below. More details are provided in section 4.4.2.
linux:~ #/tmp/fwupdate # lsmcode

Version of System Firmware is SF235_185 (t) SF235_185 (p) SF235_185 (b)
Version of PFW is 17112005111681CF0681
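
If you want to pick these values up in a script, the one-line Linux format shown above is easy to parse. The following Python sketch assumes the output always looks like the sample in this post, with each level followed by a (t), (p) or (b) tag. The function name parse_linux_lsmcode is hypothetical.

```python
import re

# Sample output copied from the lsmcode run shown in this post.
SAMPLE = (
    "Version of System Firmware is SF235_185 (t) SF235_185 (p) SF235_185 (b)\n"
    "Version of PFW is 17112005111681CF0681\n"
)

_TAGS = {"t": "temporary", "p": "permanent", "b": "booted"}


def parse_linux_lsmcode(output):
    """Map 'temporary', 'permanent' and 'booted' to the reported firmware levels."""
    for line in output.splitlines():
        if line.startswith("Version of System Firmware is"):
            pairs = re.findall(r"(\S+)\s+\(([tpb])\)", line)
            return {_TAGS[tag]: level for level, tag in pairs}
    raise ValueError("no system firmware line found in lsmcode output")


if __name__ == "__main__":
    print(parse_linux_lsmcode(SAMPLE))
```

Comparing the booted level with the temporary and permanent levels gives a hint of which side the system is running from, although it cannot tell the sides apart when both hold the same level.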


Power on using ASMI


From the ASMI main menu select 1. Power/Restart Control to get to this screen:
Power/Restart Control

1. Power On/Off System
2. Auto Power Restart
3. Immediate Power Off
4. System Reboot
5. Wake On LAN
98. Return to previous menu
99. Log out

S1> 1 
Select 1. Power On/Off System.
On the next screen select 8. Power on. Wait for a few seconds to be logged out as the system powers on. Watch boot progress codes as the system comes up.
 
Power On/Off System
Current system power state: Off
Current firmware boot side: Permanent
Current system server firmware state: Not running
 
1. System boot speed
       Currently: Fast
2. Firmware boot side for the next boot
       Currently: Permanent
3. System operating mode
       Currently: Normal
4. AIX/Linux partition mode boot
       Currently: Continue to operating system
5. Boot to system server firmware
       Currently: Standby
6. System power off policy
       Currently: Stay on
7. i5/OS partition mode boot
       Currently: A
8. Power on
98. Return to previous menu
99. Log out
 
S1>8 

When the system completes the boot process, note the operator control panel. Note that there should be no indication of HMC=. This indicates that the service processor does not expect to be managed by an HMC (see below).
 
01    N     V=F

           T   

Upgrade system firmware via running operating system

The rpm file for the firmware fix can be stored either in the file system or on a mounted CD.

For this example, the rpm file is in the /tmp directory.

Run the command below to extract the flash image file in the rpm file:

rpm -Uvh --ignoreos /tmp/01SF240_202_201.rpm

The flash image file is put into a newly created directory, /tmp/fwupdate.
To install the server firmware through a running OS, use the update_flash command. Root authority is required to run this command. Since installing server firmware fixes through the operating system is a disruptive process, shut down any running applications and log out any non-root users.

On AIX, the update_flash command is located in the /usr/lpp/diagnostics/bin directory. If this directory does not exist, install the AIX diagnostics to run this command.

On Linux, the update_flash command is located in the /usr/sbin directory. A separate installation of Service and Productivity Tools may be required.

The command syntax is as follows (for both AIX and Linux):
update_flash [-f file_name] | [-c] | [-r]
Attention: The update_flash command reboots the entire system. Do not use this command if more than one user is logged in to the system.

Flag   Description
-f     Flash update image file source. The file_name variable specifies the fully qualified path of the flash update image file.
-c     Commit temporary image to permanent side.
-r     Reject temporary image.
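
Because update_flash reboots the system, it can help to build and review the command line before anything is executed. The Python sketch below only assembles the argument list from the three flags in the table above and never runs it. The function name, the action names and the default path (the AIX location mentioned in this post) are my own assumptions for illustration.

```python
AIX_UPDATE_FLASH = "/usr/lpp/diagnostics/bin/update_flash"   # AIX location from this post
LINUX_UPDATE_FLASH = "/usr/sbin/update_flash"                # Linux location from this post


def build_update_flash_argv(action, image_path=None, binary=AIX_UPDATE_FLASH):
    """Return the argument list for one update_flash action without running it."""
    if action == "update":
        # -f expects the fully qualified path of the flash update image file.
        if not image_path or not image_path.startswith("/"):
            raise ValueError("the -f flag needs a fully qualified image path")
        return [binary, "-f", image_path]
    if action == "commit":
        return [binary, "-c"]
    if action == "reject":
        return [binary, "-r"]
    raise ValueError(f"unknown action: {action!r}")


if __name__ == "__main__":
    argv = build_update_flash_argv("update", "/tmp/fwupdate/01SF240_202_201")
    print(" ".join(argv))
```

Keeping the three actions mutually exclusive mirrors the syntax line, where -f, -c and -r are alternatives rather than options to combine.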

Upgrade firmware image using AIX

Before installing, check the existing firmware level. From AIX, use the command lsmcode. This command resides in the diagnostic directory. An example of the output of the lsmcode command is as follows:
 
DISPLAY MICROCODE LEVEL  802811
IBM,9133-55A
The current permanent system firmware image is SF235_185
The current temporary system firmware image is SF235_185
The system is currently booted from the temporary firmware image.

Enter to continue.
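
The AIX screen above carries three useful facts: the permanent level, the temporary level and the side the system booted from. Here is a minimal Python sketch that extracts them, assuming the wording of the screen matches the sample in this post exactly. The function name parse_aix_lsmcode is hypothetical.

```python
import re

# Sample text copied from the lsmcode screen shown in this post.
SAMPLE = """\
DISPLAY MICROCODE LEVEL  802811
IBM,9133-55A
The current permanent system firmware image is SF235_185
The current temporary system firmware image is SF235_185
The system is currently booted from the temporary firmware image.
"""


def parse_aix_lsmcode(text):
    """Return the permanent level, temporary level and booted side."""
    perm = re.search(r"current permanent system firmware image is (\S+)", text)
    temp = re.search(r"current temporary system firmware image is (\S+)", text)
    side = re.search(r"booted from the (temporary|permanent) firmware image", text)
    if not (perm and temp and side):
        raise ValueError("unrecognised lsmcode output")
    return {
        "permanent": perm.group(1).rstrip("."),   # tolerate a trailing full stop
        "temporary": temp.group(1).rstrip("."),
        "booted_side": side.group(1),
    }


if __name__ == "__main__":
    print(parse_aix_lsmcode(SAMPLE))
```

The booted_side value is the one to check before a reject, since rejecting without an HMC requires the server to be booted from the permanent side.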

Next, run the update_flash command to upgrade firmware:
[c73m5lr01][/]> ls /tmp/fwupdate
01SF240_202_201
[c73m5lr01][/]> /usr/lpp/diagnostics/bin/update_flash -f /tmp/fwupdate/01SF240_202_201
The image is valid and would update the temporary image to SF240_202.

The new firmware level for the permanent image would be SF235_185.
The current permanent system firmware image is SF235_185.
The current temporary system firmware image is SF235_185.
***** WARNING: Continuing will reboot the system! *****
Do you wish to continue?
Enter 1=Yes or 2=No
 
Reject installed firmware without an HMC  
 
There are times when it may be necessary to reject a firmware update. The reject function copies the P-side firmware to the T-side and activates it. The rejected T-side firmware is removed completely. Without an HMC, it is possible to reject the firmware in the T-side and roll back to the firmware in the P-side using the OS update_flash command or a Diagnostics CD.
Note: If rejecting firmware without an HMC, the server must be booted from the P-side copy of the firmware prior to performing this action.

Boot to the permanent side

Power on the system through ASMI with the options below.

System name: Server-9133-55A-SN10B7D2G

Version: SF235_185
User: admin
Copyright © 2002-2006 IBM Corporation. All rights reserved.

1. Power/Restart Control
2. System Service Aids
3. System Information
4. System Configuration
5. Network Services
6. Performance Setup
7. On Demand Utilities
8. Concurrent Maintenance
9. Login Profile
99. Log out
 
S1> 1
Select 1. Power On/Off System
Power/Restart Control
1. Power On/Off System
2. Auto Power Restart
3. Immediate Power Off
4. System Reboot
5. Wake On LAN
98. Return to previous menu
99. Log out
 
S1> 1
Select 2. Firmware boot side for the next boot
Power On/Off System
Current system power state: Off
Current firmware boot side: Temporary
Current system server firmware state: Not running

1. System boot speed
       Currently: Fast
2. Firmware boot side for the next boot
       Currently: Temporary
3. System operating mode
       Currently: Normal
4. AIX/Linux partition mode boot
       Currently: Continue to operating system
5. Boot to system server firmware
       Currently: Running
6. System power off policy
       Currently: Automatic
7. i5/OS partition mode boot
       Currently: A
8. Power on
98. Return to previous menu
99. Log out

S1> 2
Select 1. Permanent
Firmware boot side for the next boot
Currently: Temporary
1. Permanent
2. Temporary
98. Return to previous menu without saving changes
99. Log out
 
S1> 1

Note that the firmware boot side for the next boot is now set to Permanent, which is our backup copy of the firmware flash.
Select 8. Power on. Press the Enter key. Wait for a few seconds to be logged out as the system powers on.
 
Power On/Off System
Current system power state: Off
Current firmware boot side: Temporary
Current system server firmware state: Not running
1. System boot speed
       Currently: Fast
2. Firmware boot side for the next boot
       Currently: Permanent
3. System operating mode
       Currently: Normal
4. AIX/Linux partition mode boot
       Currently: Continue to operating system
5. Boot to system server firmware
       Currently: Running
6. System power off policy
       Currently: Automatic
7. i5/OS partition mode boot
       Currently: A
8. Power on
98. Return to previous menu
99. Log out
 
S1> 8
The system is powering on.
PRESS ENTER TO CONTINUE:
 
Reject the installed firmware using an OS
 
To reject firmware using AIX or Linux, use the update_flash command with the -r option. The system is running on the P-side firmware, as noted in the lsmcode output SF235_185 (b); (b) denotes booted.

Reject the installed firmware using a diagnostic CD
With the system powered on from the P-side (see 5.1), boot the Diagnostics CD as described in section 4.5. Then select Update and Manage Flash from the Diagnostics CD's Task Selection List. Note: the system is currently booted from the permanent firmware side.
 
UPDATE AND MANAGE FLASH                                                  
The current permanent system firmware image is SF235_185
The current temporary system firmware image is SF240_202
The system is currently booted from the permanent firmware image.
Move cursor to selection, then press 'Enter'.
Validate and Update System Firmware
Validate System Firmware
Reject the Temporary Image  

Move the highlighted cursor to Reject the Temporary Image and press the Enter key to proceed with rejecting the installed firmware on the T-side.
 
UPDATE AND MANAGE FLASH                                                  
The reject operation was successful.  
 
 
