A short introduction to the
SUN STORAGETEK 6140 array
AGR 2017
Overview
• Dual controller
• 1GB cache (2 FC host ports) or 2 GB cache (4 FC host ports) per controller
• 5 to 16 drives (FC or SATA) per tray
• Up to 7 trays (controller tray + 6 expansion trays)
• Up to 112 drives (112 TB raw capacity)
• RAID 0, 1, 1+0, 3, 5 and 6
• Multipathing
• 2 main firmware versions :
• [Link]
• [Link]
• Optional functionalities (Premium Features)
Front view
Front LED
Rear view
Controller
• 2 FC host ports / 1GB cache (375-3375) : LSI 3992
• 4 FC host ports / 2GB cache (375-3335 / 375-3581) : LSI 3994
Rear LED
Expansion trays
• Tray ID
• I/O module (ESM) : P/N 375-3336
• 2GB controller : up to 6 expansion trays (7x16 = 112 disks)
• 1GB controller : up to 3 expansion trays (4x16 = 64 disks)
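The drive limits above follow from simple arithmetic: one controller tray plus the allowed expansion trays, at 16 drives per tray. A minimal sketch (the names `MAX_EXPANSION_TRAYS` and `max_drives` are illustrative, not part of any Sun tool):

```python
DRIVES_PER_TRAY = 16  # maximum drives per tray, from the overview

# Maximum number of expansion trays, keyed by controller cache size in GB
MAX_EXPANSION_TRAYS = {1: 3, 2: 6}


def max_drives(cache_gb: int) -> int:
    """Return the maximum drive count: controller tray plus expansion trays."""
    trays = 1 + MAX_EXPANSION_TRAYS[cache_gb]
    return trays * DRIVES_PER_TRAY


if __name__ == "__main__":
    for cache_gb in sorted(MAX_EXPANSION_TRAYS):
        print(f"{cache_gb} GB controller: up to {max_drives(cache_gb)} drives")
```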
Disks
• FC : 73 / 146 / 300 / 400 / 600 GB, 10k or 15k RPM
• SATA : 500 GB / 1TB / 2TB, 7200 RPM
• SATA drives use a SATA/FC interposer board included within the bracket (P/N 541-1406)
Batteries
• Non-SMART batteries
• To be replaced once expired
• Life expectancy : 1170 days (> 3 years)
• Age can be reset
• No learn cycle
• Hot-swappable
• Externally accessible
• No need to offline/remove the controller
• P/N 371-0717 (LSI 13695-0x)
Management Interfaces (1)
• Common Array Manager (CAM)
• BUI (Browser User Interface) : [Link]
• CLI (SSCS, and other commands set)
• Solaris, Windows, Linux (CAM 6.10 : Solaris only)
• SANtricity (no longer supported by Sun/Oracle)
• GUI (Java-based)
• CLI (SMcli)
• One can use any SANtricity-based Storage Manager from any other vendor (IBM DS Storage Manager, DELL MD
Storage Manager, SGI IS Storage Manager …)
• CAM / SANtricity / Storage Manager are software that must be installed on a host
• Serial connection :
• Menu for configuration (controller IP, array password)
• Full shell access (not documented by Sun/Oracle)
Management Interfaces (2)
• Management methods :
• Out-of-band (network)
• Both controllers must be accessible for some operations
• The same CAM / SANtricity management station can manage several arrays
• Preferred method
• In-band
• Using the direct SCSI connection between host and array
• Requires special LUN (Universal X-port, usually target 31) to be mapped to the host
• Using a special agent, an in-band managed array can be managed by a remote CAM / SANtricity management station
CAM : BUI
SANtricity : GUI
Diagnostic : Support Data (1)
• Collecting Support Data
• CAM (BUI)
[Link]
--> Sun StorageTek Common Array Manager
--> select the array (left pane)
--> Service Advisor button (FR: Grille de Service)
--> (new window) Array Troubleshooting and Recovery
(FR: Procédures de dépannage et de reprise de baie)
--> Collecting Support Data (FR: Collecte des données de support)
Then follow the instructions
• CAM (CLI)
--> Identify the array : # ras_admin device_list
--> Collect the Data : # supportData -d <identifier> -p <path> -o <filename>
where <identifier> may be the array name or the IP of one of the controllers
Path to the commands :
Solaris : /opt/SUNWsefms/bin/
Linux : /opt/sun/cam/private/fms/bin/
Windows (CAM <= 5.0.2) : C:\Program Files\Sun_Microsystems\StorageTek_Mgmt\Component\fms\bin\
Windows (CAM >= 5.1) : C:\Program Files\Sun\Common Array Manager\Component\fms\bin\
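When the collection is scripted, the command line can be assembled from the paths listed above. The sketch below only builds the argument list and does not run anything. The function name `support_data_command` is hypothetical, the Windows entry assumes CAM >= 5.1, and the executable is assumed to be named exactly `supportData` on every platform.

```python
import platform
from pathlib import PurePosixPath, PureWindowsPath

# Directories taken from the list above; keys match platform.system() values.
CAM_BIN_DIRS = {
    "SunOS": PurePosixPath("/opt/SUNWsefms/bin"),
    "Linux": PurePosixPath("/opt/sun/cam/private/fms/bin"),
    "Windows": PureWindowsPath(
        r"C:\Program Files\Sun\Common Array Manager\Component\fms\bin"
    ),
}


def support_data_command(identifier, out_dir, filename, system=None):
    """Build the supportData argument list for the given (or current) OS.

    identifier is the array name or the IP address of one controller.
    """
    system = system or platform.system()
    executable = CAM_BIN_DIRS[system] / "supportData"
    return [str(executable), "-d", identifier, "-p", out_dir, "-o", filename]


if __name__ == "__main__":
    print(support_data_command("array1", "/var/tmp", "array1-sd", system="Linux"))
```

The resulting list can be passed as-is to `subprocess.run`, which avoids any shell quoting issue.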
Diagnostic : Support Data (2)
• Collecting Support Data
• SANtricity / Storage Manager GUI
--> double-click on the array to launch the Array Management window
--> Advanced menu
--> Troubleshooting
--> Collect All Support Data
--> Specify a directory and a filename
--> Start
• SANtricity / Storage Manager CLI (SMcli)
--> Identify the array : # SMcli -d -i
--> Collect the Data : # SMcli -n <array_name> -c "save storageArray supportData filename=\"array_name-[Link]\";"
Path to the commands :
Solaris : /opt/SMgr/client/
Linux : /opt/SMgr/client/
Windows : C:\Program Files\StorageManager\client\
Troubleshooting : listing failures (1)
• LED
• Amber LED for tray / controller / IOM / disk / PSU
• CAM BUI
• Alarms pane
• SANtricity / SM
• Recovery Guru window
Troubleshooting : listing failures (2)
• Support Data from CAM
• File [Link]
• CAM always raises an alarm when the installed firmware revisions do not match the baseline expected by that version of CAM (harmless, can be ignored)
Alarm list for device SUN.54065460150.0716AWF00B
Alarm ID : alarm1
Description: [Link].B is at revision "[Link]" baseline version is "[Link]"
[Link].B is at revision "98C1" baseline version is "98D3"
[Link].16 is at revision "3092" baseline version is "3292"
Severity : Major
Element : SUN.54065460150.0716AWF00B
GridCode : 57.75.42
Date : 2014-12-03 [Link]
Alarm ID : alarm14
Description: A hot spare is in use. The affected virtual disk is vdisk.1, failed drive(s) [Link].03,
spare(s) used [Link].16, the affected volume(s) Volume_tray:[Link].lun:0
Severity : Major
Element : t0drive3
GridCode : 57.66.1021
Date : 2017-02-03 [Link]
• Support Data from SANtricity / SM
• File [Link] (to be opened by a browser)
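The CAM alarm listing shown above is regular enough to be parsed, which helps to filter out the harmless firmware baseline alarms. The sketch below is illustrative only: the sample text is abridged from the listing (component names replaced by the word "component", times dropped), and the helper names are hypothetical.

```python
import re

FIELD_RE = re.compile(
    r"^(Alarm ID|Description|Severity|Element|GridCode|Date)\s*:\s*(.*)$"
)

# Abridged sample modelled on the listing above (not a verbatim file).
SAMPLE = """\
Alarm list for device SUN.54065460150.0716AWF00B
Alarm ID : alarm1
Description: component is at revision "98C1" baseline version is "98D3"
component is at revision "3092" baseline version is "3292"
Severity : Major
Element : SUN.54065460150.0716AWF00B
GridCode : 57.75.42
Date : 2014-12-03
Alarm ID : alarm14
Description: A hot spare is in use. The affected virtual disk is vdisk.1
Severity : Major
Element : t0drive3
GridCode : 57.66.1021
Date : 2017-02-03
"""


def parse_alarms(text):
    """Return one dict per alarm; unlabelled lines extend the previous field."""
    alarms, current, last_key = [], None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = FIELD_RE.match(line)
        if match:
            key, value = match.groups()
            if key == "Alarm ID":
                current = {}
                alarms.append(current)
            if current is not None:
                current[key] = value
                last_key = key
        elif current is not None and last_key is not None:
            current[last_key] += " " + line
    return alarms


def is_baseline_mismatch(alarm):
    """True for the firmware baseline alarms described above as harmless."""
    return "baseline version" in alarm.get("Description", "")


if __name__ == "__main__":
    for alarm in parse_alarms(SAMPLE):
        if not is_baseline_mismatch(alarm):
            print(alarm["Alarm ID"], alarm["Element"], alarm["Description"])
```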
TS : configuration and status
• Support Data : file [Link]
• Detailed information on configuration and status
• Support Data : file [Link]
• List of all events on the array
• Support Data : file stateCaptureData.[dmp|txt]
• Results of controller shell commands (low level – for advanced analysis)
TS : [Link]
• Contains configuration/status information about :
• Controllers
• Virtual Disks (aka Volume Groups, aka Arrays)
• Volumes (aka Logical Drives)
• Drives
• Channels (Hosts & drives)
• Trays (including batteries, PSUs, etc.)
• ...
• Content differs slightly depending on whether the file comes from CAM or SANtricity
Serial Port : cable
• Mini-DIN / RJ45 cable
• P/N 530-3544
IBM : 13N1932 or 39M5908
DELL : CT109 or MN657
• Some 530-3544 cables are miswired and require the RJ45/DB9 connector P/N 530-3100 (straight-through) instead of connector 371-1107 (null modem)
• Mini-DIN / DB9 (RS232)
• Netapp P/N 23698-00
Serial Port : setting & Service Menu
• Setting :
• Baud rate = 38400
• Data bits = 8
• Stop bits = 1
• Parity = None
• Flow control = None
• Establishing a connection :
• Send BREAK until you get the message
Press the space bar within 5 seconds: <S> for Service Interface. <BREAK> for baud rate
• Press SPACE to set the baud rate
• Send another BREAK
• BREAK until you get the message
Press the space bar within 5 seconds: <S> for Service Interface. <BREAK> for baud rate
• Press « S » to get the Service Interface, and « ESC » to reach the shell
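For scripted access, the serial settings above map directly onto the keyword arguments of the third-party pyserial package. The following is a rough sketch under several assumptions: pyserial is installed, the port name is supplied by the caller, and the baud-rate negotiation and password prompt are not handled. The function name `open_service_interface` is hypothetical.

```python
# pyserial keyword arguments matching the settings listed above
SERIAL_SETTINGS = {
    "baudrate": 38400,
    "bytesize": 8,
    "parity": "N",
    "stopbits": 1,
    "xonxoff": False,  # no software flow control
    "rtscts": False,   # no hardware flow control
}

# Fragment of the prompt quoted above
PROMPT = b"for Service Interface"


def open_service_interface(port, attempts=10):
    """Send BREAK until the prompt appears, then answer with 'S'."""
    import serial  # third-party package: pyserial

    console = serial.Serial(port, timeout=5, **SERIAL_SETTINGS)
    for _ in range(attempts):
        console.send_break(duration=0.25)
        banner = console.read(256)
        if PROMPT in banner:
            console.write(b"S")
            return console
    console.close()
    raise RuntimeError("no Service Interface prompt received")
```

A call such as `open_service_interface("/dev/ttyS0")` would return the open port, ready for the password prompt.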
Serial Port : Service Menu
• Password : kra16wen
• Service Interface :
• Showing/setting controller IP address
• Resetting array password (SYMbol password, used for communication between
CAM/SANtricity and the array)
Service Interface Main Menu
==============================
1) Display IP Configuration
2) Change IP Configuration
3) Reset Storage Array (SYMbol) Password
Q) Quit Menu
Enter Selection:
Serial Port : shell (advanced)
• Login / password : shellUsr / y2llojp
• Firmware [Link] requires only the password, no login
• Command sets differ between fw [Link] and fw [Link]
• Do not use shell commands unless you know what you are doing
Usual interventions
• Look for alarms
• Either from CAM (select the array, then Alarms) or SANtricity (blinking stethoscope)
• Support Data :
• file [Link] (from CAM)
• file [Link] (from SANtricity)
• Follow guidelines from Service Advisor (CAM) or Recovery
Guru (SANtricity)
Usual interventions : batteries (1)
• Info about batteries in Support Data :
• SD from CAM : file [Link]
• SD from SANtricity : file [Link]
• File [Link] : look for « Battery »
Battery: [Link].A
Status: Optimal
Age in days: 1033
Days until replacement: 137
Battery: [Link].B
Status: Optimal
Age in days: 1034
Days until replacement: 136
• File [Link] :
• Fw [Link] : look for « BATTERY »
• Fw [Link] : look for « bmgrShow »
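The « Battery » blocks shown above can be extracted with a small parser, for instance to check several arrays at once. In this sketch the battery names are placeholders (the real names are not recoverable here), and the 30-day threshold in `needs_attention` is an arbitrary assumption.

```python
import re

BATTERY_RE = re.compile(
    r"Battery:\s*(?P<name>\S+)\s+"
    r"Status:\s*(?P<status>[^\n]+)\n\s*"
    r"Age in days:\s*(?P<age>\d+)\s+"
    r"Days until replacement:\s*(?P<remaining>\d+)"
)

# Values from the excerpt above; the names are placeholders.
SAMPLE = """\
Battery: battery.A
Status: Optimal
Age in days: 1033
Days until replacement: 137
Battery: battery.B
Status: Optimal
Age in days: 1034
Days until replacement: 136
"""


def parse_batteries(profile_text):
    """Return one dict per battery block found in the text."""
    return [
        {
            "name": m["name"],
            "status": m["status"].strip(),
            "age_days": int(m["age"]),
            "days_left": int(m["remaining"]),
        }
        for m in BATTERY_RE.finditer(profile_text)
    ]


def needs_attention(battery, threshold_days=30):
    """Flag a battery that is not Optimal or is close to replacement."""
    return battery["status"] != "Optimal" or battery["days_left"] <= threshold_days


if __name__ == "__main__":
    for battery in parse_batteries(SAMPLE):
        print(battery, "attention" if needs_attention(battery) else "ok")
```

In both sample blocks, age plus remaining days equals 1170, the life expectancy given on the Batteries slide.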
Usual interventions : batteries (2)
• Non-SMART batteries : to be replaced when they are Failed, Near Expiration or Expired
• Neither downtime nor controller failover is required
• CAM : follow instructions from Service Advisor (FR: Grille de service)
• SANtricity : follow instructions from Recovery Guru
• Reset the age once replaced
Usual interventions : batteries (3)
• Resetting the age (GUI) :
• CAM : select array -> Service Advisor -> array Troubleshooting & Recovery
-> Resetting the Controller Battery Age
• SANtricity : click on the Components icon then Batteries
Usual interventions : batteries (4)
• Resetting the age (CLI) :
• CAM CLI :
service -d arrayname -c reset -t tXbatY
X : tray ID (usually 85) ; Y : slot ID (1 = Ctler A ; 2 = Ctler B)
Solaris: /opt/SUNWsefms/bin
Linux: /opt/sun/cam/private/fms/bin
Windows: c:\Program Files\Sun\Common Array Manager\Component\fms\bin
• SANtricity SMcli :
SMcli -n arrayname [-p password] -c "reset storageArray batteryInstallDate controller=X;"
SMcli @IP_A [@IP_B] [-p password] -c "reset storageArray batteryInstallDate controller=X;"
X : either A or B
Solaris, Linux : /opt/SMgr/client/
Windows : C:\Program Files\StorageManager\client\
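Both CLI forms above can be generated from the same inputs. The sketch below only builds argument lists; the function names are hypothetical, and the executables are assumed to be reachable as `service` and `SMcli` (otherwise prefix them with the directories listed above).

```python
CONTROLLER_SLOT = {"A": 1, "B": 2}  # slot IDs as given above


def _check_controller(controller):
    if controller not in CONTROLLER_SLOT:
        raise ValueError(f"controller must be 'A' or 'B', got {controller!r}")


def cam_reset_battery_age(array_name, tray_id, controller):
    """CAM CLI form: service -d <array> -c reset -t tXbatY."""
    _check_controller(controller)
    target = f"t{tray_id}bat{CONTROLLER_SLOT[controller]}"
    return ["service", "-d", array_name, "-c", "reset", "-t", target]


def smcli_reset_battery_age(array_name, controller, password=None):
    """SMcli form, with the optional array password."""
    _check_controller(controller)
    command = ["SMcli", "-n", array_name]
    if password is not None:
        command += ["-p", password]
    script = f"reset storageArray batteryInstallDate controller={controller};"
    return command + ["-c", script]


if __name__ == "__main__":
    print(cam_reset_battery_age("array1", 85, "A"))
    print(smcli_reset_battery_age("array1", "B"))
```

Passing the script as a single list element avoids the nested quoting that the shell form requires.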
Usual interventions : batteries (5)
• Resetting the age (serial shell) :
• Serial shell :
• menu « M » (Boot Operation Menu) -> 8) Special Services Menu -> 6) Install Battery
-> M

NOTICE: The BOOT OPERATIONS MENU has been invoked too late for
proper operation of some activities, including Isolation Diagnostics.
You may wish to restart this controller again and press Control-B
IMMEDIATELY after seeing the start-up indicator ("-=<###>=-").

BOOT OPERATIONS MENU

1) Perform Isolation Diagnostics     10) Serial Interface Mode Menu
2) Download Permanent File           11) Display Hardware Configuration
3) Reserved                          12) Change Hardware Configuration Menu
4) Dump NVSRAM Group                 13) Development Options Menu
5) Patch NVSRAM Group                14) Display Memory Error Log
6) Set Real Time Clock               15) Manufacturing Setup Menu
7) Display Board Configuration       R) Restart Controller
8) Special Services Menu             Q) Quit Menu
9) Display Exception Message

Enter Selection: 8

BOOT SPECIAL OPERATIONS MENU

1) Change Board Serial Number
2) Reinitialize All NVSRAM
3) Change Password
4) Change Ethernet Node Address
5) Change Subsystem Name
6) Install Battery
7) Reserved
Q) Quit Menu

Enter Selection: 6
Please enter battery number to set installation date(0 or 1):0

Current date: 09/28/2017
Current battery 0 installation date: 09/15/2015

Use this operation to inform the controller that the batteries for the
cache memory have been replaced, and to identify the date the new
batteries were installed. (Avoid using this function if the batteries
have not been replaced; otherwise, data still remaining in cache may be lost.)

Do you wish to continue to set the battery installation date? (y/n) y
Enter installation date (mm/dd/yyyy): 09/28/2017
New battery installation date: 09/28/2017
New battery expiration date: 12/11/2020
New battery expiration warning date: 10/30/2020
Press <Enter> to continue
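The dates printed by the controller are consistent with the 1170-day life expectancy given on the Batteries slide. The sketch below reproduces them. The 42-day warning lead is an assumption inferred from this single transcript, not a documented value.

```python
from datetime import date, timedelta

BATTERY_LIFE_DAYS = 1170  # life expectancy from the Batteries slide
WARNING_LEAD_DAYS = 42    # assumption: inferred from the transcript above


def battery_dates(installed):
    """Return (expiration date, expiration warning date) for an install date."""
    expiration = installed + timedelta(days=BATTERY_LIFE_DAYS)
    warning = expiration - timedelta(days=WARNING_LEAD_DAYS)
    return expiration, warning


if __name__ == "__main__":
    expiration, warning = battery_dates(date(2017, 9, 28))
    print(expiration.strftime("%m/%d/%Y"))  # 12/11/2020
    print(warning.strftime("%m/%d/%Y"))     # 10/30/2020
```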
Usual interventions : disks (1)
• Info about disks in Support Data :
• SD from CAM : file [Link]
• SD from SANtricity : file [Link]
• File [Link] : look for « DRIVES---- »
DRIVES-----------------------
TRAY,SLOT  STATUS   CAPACITY  CURRENT DATA RATE  PRODUCT ID        FIRMWARE VERSION
---------  -------  --------  -----------------  ----------------  ----------------
0, 1       Optimal  279 GB    2 Gbit/s           HUS103030FLF21    JFQ8
0, 2       Optimal  279 GB    2 Gbit/s           HUS103030FLF21    JFQ8
0, 3       Optimal  279 GB    2 Gbit/s           HUS103030FLF21    JFQ8
• File [Link] :
• Fw [Link] : look for « cfgPhyList »
• Fw [Link] : look for « vdmShowDriveList »
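The « DRIVES » table in the profile file can be read line by line to list drives that are not Optimal. This is a sketch only: the pattern assumes the column layout of the excerpt above and product IDs without spaces, and `parse_drives` is a hypothetical helper.

```python
import re

DRIVE_RE = re.compile(
    r"^\s*(?P<tray>\d+),\s*(?P<slot>\d+)\s+"
    r"(?P<status>.+?)\s+"
    r"(?P<capacity>\d+(?:\.\d+)?\s+[GT]B)\s+"
    r"(?P<rate>\d+\s+Gbit/s)\s+"
    r"(?P<product>\S+)\s+(?P<firmware>\S+)\s*$"
)

# Rows taken from the excerpt above
SAMPLE = """\
DRIVES-----------------------
TRAY,SLOT STATUS CAPACITY CURRENT DATA RATE PRODUCT ID FIRMWARE VERSION
--------- ------- -------- ----------------- ---------------- ----------------
0, 1 Optimal 279 GB 2 Gbit/s HUS103030FLF21 JFQ8
0, 2 Optimal 279 GB 2 Gbit/s HUS103030FLF21 JFQ8
0, 3 Optimal 279 GB 2 Gbit/s HUS103030FLF21 JFQ8
"""


def parse_drives(profile_text):
    """Return one dict per drive row; header and separator lines are skipped."""
    drives = []
    for line in profile_text.splitlines():
        match = DRIVE_RE.match(line)
        if match:
            drive = match.groupdict()
            drive["tray"] = int(drive["tray"])
            drive["slot"] = int(drive["slot"])
            drives.append(drive)
    return drives


if __name__ == "__main__":
    for drive in parse_drives(SAMPLE):
        if drive["status"] != "Optimal":
            print("check drive", drive["tray"], drive["slot"], drive["status"])
```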
Usual interventions : disks (2)
• To be replaced when status is Failed or Impending Drive Failure
• A drive marked as bypassed requires further investigation
• If Impending Drive Failure, drive must be manually failed before replacement
• CAM : select array -> Physical Devices -> Disks -> select a disk -> Fail
• SANtricity : select disk -> Advanced -> Recovery -> Fail Drive
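The replacement rules above amount to a small decision table. A minimal sketch, with the status strings taken from this slide and a hypothetical function name and action wording:

```python
def drive_action(status):
    """Map a drive status to the action described above."""
    normalized = status.strip().lower()
    if normalized == "failed":
        return "replace"
    if normalized == "impending drive failure":
        return "fail manually, then replace"
    if normalized == "bypassed":
        return "investigate"
    return "none"


if __name__ == "__main__":
    for status in ("Optimal", "Failed", "Impending Drive Failure", "Bypassed"):
        print(f"{status}: {drive_action(status)}")
```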
Usual interventions : disks (3)
• If Impending Drive Failure, drive must be manually failed before replacement
• CAM CLI :
service -d arrayname -c fail -t tXdriveY
X : tray ID (usually 85) ; Y : slot ID
Solaris: /opt/SUNWsefms/bin
Linux: /opt/sun/cam/private/fms/bin
Windows: c:\Program Files\Sun\Common Array Manager\Component\fms\bin
• SANtricity SMcli :
SMcli -n arrayname [-p password] -c "set physicalDisk [TrayID,slotID] operationalState=failed;"
SMcli @IP_A [@IP_B] [-p password] -c "set physicalDisk [TrayID,slotID] operationalState=failed;"
Solaris, Linux : /opt/SMgr/client/
Windows : C:\Program Files\StorageManager\client\
Usual interventions : controllers (1)
• Info about controllers in Support Data :
• SD from CAM : file [Link]
• SD from SANtricity : file [Link]
• File [Link] : look for « CONTROLLERS---- »
• File [Link] :
• Fw [Link] : look for « getObjectGraph_MT 99 »
• Fw [Link] : look for « [Controller] »
Usual interventions : controllers (2)
• Follow Service Advisor / Recovery Guru instructions
• Replacement is performed online
• if other controller is online, of course
• if multipathing (from host end) is correctly configured
• If not failed, controller has to be offlined manually (from GUI or CLI)
Usual interventions : controllers (3)
• Failing (offlining) a controller (GUI) :
• CAM : Service Advisor -> Array Troubleshooting and Recovery
-> Place a Controller Offline -> select the controller -> follow the instructions
• SANtricity : select the controller -> menu Advanced -> Recovery -> Place Controller -> Offline
Usual interventions : controllers (4)
• Failing (offlining) a controller (CLI) :
• CAM CLI :
service -d arrayname -c fail -t X
X : a or b
Solaris: /opt/SUNWsefms/bin
Linux: /opt/sun/cam/private/fms/bin
Windows: c:\Program Files\Sun\Common Array Manager\Component\fms\bin
• SANtricity SMcli :
SMcli -n arrayname [-p password] -c "set controller [X] availability=offline;"
SMcli @IP_A [@IP_B] [-p password] -c "set controller [X] availability=offline;"
X : a or b
Solaris, Linux : /opt/SMgr/client/
Windows : C:\Program Files\StorageManager\client\
Usual interventions : controllers (5)
• Failing (offlining) a controller (serial shell) :
• From the other controller (the one to keep online) :
• Fw [Link] and [Link] :
-> ld </Debug
-> setControllerToFailed_MT 1
• Fw [Link] only :
-> cmgrSetAltToFailed
Advanced topics, questions, …
• Issues :
• Controller lockdown
• Controller held in reset
• Unreadable sectors
• RDAC / AVT
• Volume recovery
• …
• Questions
On [Link]
- SANtricity 11.30 (for Windows, Linux, Solaris SPARC & x86)
- IBM DS Storage Manager 11.20 and 10.86 for Windows
- Four Support Data archives (from CAM and SANtricity, fw [Link] and [Link])
- This PowerPoint file