RAID controller commands

Managing 3ware RAID Cards

To enter the 3ware command line interface, type tw_cli. Enter ? or help to view help. For more information about the 3ware CLI, see the man page.

First we need some info on our setup:

Getting info on setup
3ware CLI> info
Ctl   Model        Ports   Drives   Units   NotOpt   RRate   VRate   BBU
------------------------------------------------------------------------
c0    8006-2LP     2       2        1       0        2       -       - 

Here we see a single controller, c0, with two drives. Let's get more info on controller 0.
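If you want to check this summary from a script, the table is easy to split into columns. The following is a minimal sketch, assuming the output has the same whitespace-separated layout as the sample above; the function name parse_controllers is hypothetical and not part of tw_cli.

```python
import re

def parse_controllers(info_output: str) -> list[dict]:
    """Turn the 'info' summary table into one dict per controller row."""
    header = None
    rows = []
    for line in info_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Ctl":
            # Header row: remember the column names.
            header = fields
        elif header and re.fullmatch(r"c\d+", fields[0]):
            # Controller row such as: c0 8006-2LP 2 2 1 0 2 - -
            rows.append(dict(zip(header, fields)))
    return rows

sample_info = """\
Ctl   Model        Ports   Drives   Units   NotOpt   RRate   VRate   BBU
------------------------------------------------------------------------
c0    8006-2LP     2       2        1       0        2       -       -
"""

controllers = parse_controllers(sample_info)
print(controllers[0]["Ctl"], controllers[0]["Model"], controllers[0]["Drives"])
```

Keying each row by the header names means the script does not depend on column positions, which may differ between firmware versions.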

Getting info on controller 0
3ware CLI> info c0

On a degraded array you'll see something like this:

Unit     UnitType  Status         %Cmpl  Port  Stripe  Size(GB)  Blocks
-----------------------------------------------------------------------
u0       RAID-1    DEGRADED       -      -     -       149.05    312579760  
u0-0     DISK      OK             -      p0    -       149.05    312579760  
u0-1     DISK      DEGRADED       -      p1    -       149.05    312579760

If the array is rebuilding it will show the rebuild status.

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-1    REBUILDING     8      -       149.05    ON     -        -       

Now we can see the RAID at unit 0 is degraded. Let's find out which disk is giving us a problem.

Finding out which disk is degraded
info c0 u0

Unit     UnitType  Status         %Cmpl  Port  Stripe  Size(GB)  Blocks
-----------------------------------------------------------------------
u0       RAID-1    DEGRADED       -      -     -       149.05    312579760  
u0-0     DISK      OK             -      p0    -       149.05    312579760  
u0-1     DISK      DEGRADED       -      p1    -       149.05    312579760

Ok, port 1 is the disk that needs our attention.
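The same lookup can be done by a script instead of by eye. This is a minimal sketch, assuming the "info c0 u0" output keeps the column order shown above (unit, type, status, %Cmpl, port); degraded_ports is a hypothetical helper name.

```python
def degraded_ports(info_output: str) -> list[str]:
    """Return the ports (e.g. 'p1') of unit members whose status is not OK."""
    ports = []
    for line in info_output.splitlines():
        fields = line.split()
        # Member rows look like: u0-1 DISK DEGRADED - p1 - 149.05 312579760
        if len(fields) >= 5 and "-" in fields[0] and fields[1] == "DISK":
            if fields[2] != "OK":
                ports.append(fields[4])
    return ports

sample_unit = """\
Unit     UnitType  Status         %Cmpl  Port  Stripe  Size(GB)  Blocks
-----------------------------------------------------------------------
u0       RAID-1    DEGRADED       -      -     -       149.05    312579760
u0-0     DISK      OK             -      p0    -       149.05    312579760
u0-1     DISK      DEGRADED       -      p1    -       149.05    312579760
"""

print(degraded_ports(sample_unit))  # ['p1']
```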

Rebuild The RAID

First we need to remove the drive. This is like unmounting a filesystem so that we can work on it.

Removing the drive
maint remove c0 p1
Exporting port /c0/p1 ... Done.

Now let's have the software rescan the drives.

Rescanning the drives
maint rescan c0
Rescanning controller /c0 for units and drives ...Done.
Found the following unit(s): [none].
Found the following drive(s): [/c0/p1].

At this point, we need to see if the rescan picked the drive back up. If the port shows N/A, the drive is likely bad and isn't responding, and you'll need to replace it.

Checking if the rescan picked the drive back up
Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-1    DEGRADED       -      -       149.05    ON     -        -       
Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     149.05 GB   312581808     5LS38DGP     
p1     OK               -      149.05 GB   312581808     4LS0H80S

If the drives show up then rebuild the array.
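The check described above (the port is listed, its status is OK, and it is not yet assigned to a unit) can be sketched as follows. This assumes the port table layout shown in the sample output; ready_for_rebuild is a hypothetical helper, not a tw_cli feature.

```python
def ready_for_rebuild(info_output: str, port: str) -> bool:
    """True if `port` is listed with status OK and no unit assigned ('-')."""
    for line in info_output.splitlines():
        fields = line.split()
        # Port rows look like: p1 OK - 149.05 GB 312581808 4LS0H80S
        if len(fields) >= 3 and fields[0] == port:
            return fields[1] == "OK" and fields[2] == "-"
    return False  # the port was not listed at all

sample_ports = """\
Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     149.05 GB   312581808     5LS38DGP
p1     OK               -      149.05 GB   312581808     4LS0H80S
"""

print(ready_for_rebuild(sample_ports, "p1"))  # True
print(ready_for_rebuild(sample_ports, "p0"))  # False, already part of u0
```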

Rebuilding the drive
//host01> maint rebuild c0 u0 p1
Sending Rebuild-Start request to /c0/u0 on 1 disk(s) [1] ... Done.

To check the status, run the info command on the unit again.

Checking the status
info c0 u0
Unit     UnitType  Status         %Cmpl  Port  Stripe  Size(GB)  Blocks
-----------------------------------------------------------------------
u0       RAID-1    REBUILDING     7      -     -       149.05    312579760  
u0-0     DISK      OK             -      p0    -       149.05    312579760  
u0-1     DISK      DEGRADED       -      p1    -       149.05    312579760

Because the array rebuilds while the system is running, this can take a very long time. If the rebuild fails, the drive will need to be replaced.
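Since a rebuild can run for hours, it can help to pull the %Cmpl value out of the status output and log it. A minimal sketch, assuming the "info c0 u0" layout shown above; rebuild_percent is a hypothetical helper name.

```python
from typing import Optional

def rebuild_percent(info_output: str, unit: str = "u0") -> Optional[int]:
    """Return the %Cmpl value for `unit` while it is REBUILDING, else None."""
    for line in info_output.splitlines():
        fields = line.split()
        # Unit row looks like: u0 RAID-1 REBUILDING 7 - - 149.05 312579760
        if len(fields) >= 4 and fields[0] == unit and fields[2] == "REBUILDING":
            return int(fields[3])
    return None

sample_rebuild = """\
Unit     UnitType  Status         %Cmpl  Port  Stripe  Size(GB)  Blocks
-----------------------------------------------------------------------
u0       RAID-1    REBUILDING     7      -     -       149.05    312579760
u0-0     DISK      OK             -      p0    -       149.05    312579760
u0-1     DISK      DEGRADED       -      p1    -       149.05    312579760
"""

print(rebuild_percent(sample_rebuild))  # 7
```

The function returns None when the unit is not rebuilding, so a polling loop can use that as its signal to stop and inspect the final status.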

Managing Adaptec RAID Cards

To identify the problem drive:

  • get current tasks (rebuild, etc)
    /usr/StorMan/arcconf getstatus 1
  • get current logical device
    /usr/StorMan/arcconf getconfig 1 ld
  • look at physical devices
    /usr/StorMan/arcconf getconfig 1 pd
  • look at dead drives
    /usr/StorMan/arcconf getlogs 1 dead tabular
  • look at devices with problems
    /usr/StorMan/arcconf getlogs 1 device tabular

When the drive has been replaced you should be able to see it by looking at the logical device again.
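To collect all of the diagnostics above in one pass, the commands can be generated for a given controller number. This is only a sketch: it builds the argument lists exactly as listed above and prints them. The install path /usr/StorMan/arcconf comes from this page; adaptec_diagnostics is a hypothetical helper name.

```python
ARCCONF = "/usr/StorMan/arcconf"  # path as used on this page; adjust if installed elsewhere

def adaptec_diagnostics(controller: int = 1) -> list[list[str]]:
    """Build the argv lists for the arcconf diagnostic commands listed above."""
    c = str(controller)
    return [
        [ARCCONF, "getstatus", c],                   # current tasks (rebuild, etc.)
        [ARCCONF, "getconfig", c, "ld"],             # logical device
        [ARCCONF, "getconfig", c, "pd"],             # physical devices
        [ARCCONF, "getlogs", c, "dead", "tabular"],  # dead drives
        [ARCCONF, "getlogs", c, "device", "tabular"],  # devices with problems
    ]

for argv in adaptec_diagnostics(1):
    # To actually run a command, pass argv to subprocess.run(argv) on a host with arcconf.
    print(" ".join(argv))
```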

Managing LSI RAID Cards

To identify the problem drive:

  • Gather information:
    MegaCli64 -LDInfo -Lall -aALL
    MegaCli64 -PDList -a0
    MegaCli64 -CfgDsply -aALL
    MegaCli64 -AdpGetProp AlarmDsply -aAll
    MegaCli64 -LDCC -ShowProg -L0 -a0 -Nolog
    MegaCli64 -FwTermLog -Dsply -a0

  • Locate a physical drive (the enclosure ID is provided by the -PDList command):
    MegaCli64 -PdLocate -start -physdrv [E:S] -a0
  • To Clear Terminal Log
    MegaCli64 -FwTermLog -Clear -aN
  • Check battery status
    MegaCli64 -AdpBbuCmd -aAll
  • To find enclosure IDs:
    MegaCli64 -EncInfo -aN
  • To view a particular physical drive:
    MegaCli64 -pdinfo -PhysDrv [E:S] -aN
  • To replace a drive:
    MegaCli64 -PDOffline -PhysDrv [E:S] -aN
    MegaCli64 -PDMarkMissing -PhysDrv [E:S] -aN
    MegaCli64 -PDPrpRmv -PhysDrv [E:S] -aN
    ** swap the drive **
    MegaCli64 -PdReplaceMissing -PhysDrv [E:S] -ArrayN -rowN -aN
Note: The number N of the array parameter is the Span Reference you get using MegaCli64 -CfgDsply -aALL and the number N of the row parameter is the Physical Disk in that span which you are replacing, not the slot number. S is the slot number of the NEW disk.

For example: to replace physical disk 0 in span 1 using disk ID 25 you would use the following command.

MegaCli64 -PdReplaceMissing -PhysDrv [:25] -Array1 -row0 -a0
Warning: Drive rebuilds do not start automatically! You have to actually tell the controller to start rebuilding the drive.
MegaCli64 -PDRbld -Start -PhysDrv [E:S] -aN
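Because the replacement takes several commands with the same [E:S] and adapter values, it is easy to mistype one of them. The sketch below only generates the command lines from the steps above so they can be reviewed before being run. It assumes the new disk goes into the same slot as the old one; replace_drive_commands is a hypothetical helper name.

```python
def replace_drive_commands(enclosure, slot, array, row, adapter=0):
    """Return (before_swap, after_swap) MegaCli64 command lines for a drive replacement.

    Assumption: the replacement disk is inserted into the same slot as the old one.
    Pass enclosure="" for setups without an enclosure ID, which yields [:S].
    """
    drv = f"[{enclosure}:{slot}]"
    adp = f"-a{adapter}"
    before_swap = [
        f"MegaCli64 -PDOffline -PhysDrv {drv} {adp}",
        f"MegaCli64 -PDMarkMissing -PhysDrv {drv} {adp}",
        f"MegaCli64 -PDPrpRmv -PhysDrv {drv} {adp}",
    ]
    after_swap = [
        f"MegaCli64 -PdReplaceMissing -PhysDrv {drv} -Array{array} -row{row} {adp}",
        # Rebuilds do not start automatically, so the start command is included.
        f"MegaCli64 -PDRbld -Start -PhysDrv {drv} {adp}",
    ]
    return before_swap, after_swap

before, after = replace_drive_commands(enclosure="", slot=25, array=1, row=0)
for cmd in before:
    print(cmd)
print("** swap the drive **")
for cmd in after:
    print(cmd)
```

With the values from the example above (disk ID 25, span 1, physical disk 0), the first command after the swap matches the -PdReplaceMissing example line.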

To rebuild a drive that has a "foreign" configuration:

  • Mark the drive as good
    MegaCli64 -PDMakeGood -PhysDrv [E:S] -aALL
  • Clear the foreign setting
    MegaCli64 -CfgForeign -Clear -a0
  • Set global hot spare - drive should start rebuilding after this
    MegaCli64 -PDHSP -Set -PhysDrv [E:S] -aN
  • Stop consistency check:
    MegaCli64 -LDCC -Abort -L0 -a0 -Nolog
  • Silence Alarm Beeps
    MegaCli64 -AdpSetProp -AlarmSilence -aN
  • Show Rebuild Status
    watch -n 10 /opt/MegaRAID/MegaCli/MegaCli64 -PDRbld -ShowProg -PhysDrv [32:0] -a0

Batteries used for each card type (to identify the card, run arcconf getconfig 1 ad or MegaCli64 -CfgDsply -aALL | head):

  • Adaptec 5805 -> BBU001
  • LSI MegaRAID SAS 84016E -> LSIBBU001
  • LSI MegaRAID SAS 9260-16i -> iBBU08

MegaRAID SCSI/SATA controller

If the 'megarc' executable isn't installed on the server, you can install it with:

cd /usr/local/bin
wget http://layer3.example.com/scripts/dsinstall/raid/megarc.bin
wget http://layer3.example.com/scripts/dsinstall/raid/megarc
chmod u+x mega*
Warning: The following command creates a new configuration and will wipe out the current config.

/usr/local/bin/megarc -spannewcfg -a0 WT RAA CIO -R10 -strpsz64 -array0[0:0,0:1] -array1[0:2,0:3]

Display logical drive info:

megarc -ldInfo -l0 -a0

Get a listing of the drives and their status (0 is usually the controller ID):

megarc -dispCfg -a0

Let's get a listing of the drives on the controller.

megarc -phys -chAll -idAll -a0

Get the serial numbers for the drives.

megarc -physdrvSerialInfo -chAll -idAll -a0

You can get a report of what has happened with the failed drive with this command (if there is anything to report):

megarc -pdfailinfo -a0 -chall -idall

You can try to turn on a drive that is in a failed state by issuing this command, where x is the channel (normally 0) and y is the ID of the physical drive:

megarc -physOn -a0 pd[x:y]

You can also turn the drive off:

megarc -physOff -a0 pd[x:y]

Show the status of a rebuilding drive that has been replaced:

megarc -showRbld -a0

Change the rebuild rate of the drive if you would like. Going over 40% is not recommended, and anything over 90% will render the disk unusable for anything other than rebuilding, leaving the system inaccessible (I did this!).

megarc -setRbldRate 35 -a0
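If the rate is ever set from a script, a guard against the too-high values mentioned above is cheap insurance. This is a minimal sketch that only builds the command line; the 40% ceiling is the recommendation from this page, and set_rebuild_rate_command is a hypothetical helper name.

```python
MAX_RECOMMENDED_RATE = 40  # ceiling recommended on this page, not a megarc limit

def set_rebuild_rate_command(rate: int, adapter: int = 0) -> str:
    """Build the megarc rebuild-rate command, refusing rates above the recommended ceiling."""
    if not 0 <= rate <= 100:
        raise ValueError(f"rebuild rate must be a percentage between 0 and 100, got {rate}")
    if rate > MAX_RECOMMENDED_RATE:
        raise ValueError(
            f"rebuild rate {rate}% is above the recommended {MAX_RECOMMENDED_RATE}%"
        )
    return f"megarc -setRbldRate {rate} -a{adapter}"

print(set_rebuild_rate_command(35))  # megarc -setRbldRate 35 -a0
```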

Let's rebuild the array. This command runs a rebuild on adapter 0 (-a0) for the disk at channel 0, ID 2 ([0:2]). You can leave off -ShowProg if you don't want to monitor the rebuild progress.

megarc -doRbld -a0 -RbldArray[0:2] -ShowProg

Silence (not disable!) the RAID alarm:

megarc -silenceAlarm -a0