SERVERware 4 StandAlone Edition Storage Pool Faulty Disk Replacement
When one of the disks in the storage pool is damaged, follow the procedure below.
If a disk fails, the zpool will be in the DEGRADED state.
Log in to the storage server over SSH and check the status of the zpool:
~ # zpool status
pool: NETSTOR
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://zfsonlinux.org/msg/ZFS-8000-4J
scan: none requested
config:
NAME STATE READ WRITE CKSUM
NETSTOR DEGRADED 0 0 0
mirror-0 ONLINE 0 0 0
NETSTOR1 ONLINE 0 0 0
NETSTOR2 ONLINE 0 0 0
mirror-1 DEGRADED 0 0 0
NETSTOR3 ONLINE 0 0 0
NETSTOR4 UNAVAIL 0 158 0 corrupted data
In the output of zpool status, we can see that NETSTOR4 is corrupted:
NETSTOR4 UNAVAIL 0 158 0 corrupted data
If this is the case, we need to replace the disk labeled NETSTOR4 with a new one and add it to the zpool mirror.
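On a pool with many disks, the failed device can be picked out of the zpool status output automatically. The following is a minimal sketch, not part of SERVERware; the function name list_failed_devices and the set of states treated as failed are our assumptions.

```shell
# Print the names of devices that zpool status reports as unusable.
# Reads `zpool status` output on stdin.
list_failed_devices() {
    awk '
        # The "NAME STATE ..." header marks the start of the device table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # A blank line ends the device table.
        in_config && NF == 0 { in_config = 0 }
        # Report devices in a state we assume means "needs replacement".
        in_config && ($2 == "UNAVAIL" || $2 == "FAULTED" || $2 == "REMOVED" || $2 == "OFFLINE") { print $1 }
    '
}
```

Usage: zpool status NETSTOR | list_failed_devices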
First, physically remove the faulty disk from the server and replace it with a new disk.
After the replacement, the new disk should appear in /dev/disk/by-id/ (in this example it is the entry created at 08:57, pointing to sdb):
# ls -lah /dev/disk/by-id
total 0
drwxr-xr-x 2 root root 480 Srp 27 08:57 .
drwxr-xr-x 7 root root 140 Srp 27 08:13 ..
lrwxrwxrwx 1 root root 9 Srp 27 08:13 ata-INTEL_SSDSC2CW060A3_CVCV308402M3060AGN -> ../../sde
lrwxrwxrwx 1 root root 10 Srp 27 08:13 ata-INTEL_SSDSC2CW060A3_CVCV308402M3060AGN-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 Srp 27 08:13 ata-INTEL_SSDSC2CW060A3_CVCV308402M3060AGN-part2 -> ../../sde2
lrwxrwxrwx 1 root root 10 Srp 27 08:13 ata-INTEL_SSDSC2CW060A3_CVCV308402M3060AGN-part9 -> ../../sde9
lrwxrwxrwx 1 root root 9 Srp 27 08:13 ata-ST31000520AS_5VX0BZN0 -> ../../sda
lrwxrwxrwx 1 root root 10 Srp 27 08:13 ata-ST31000520AS_5VX0BZN0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 9 Srp 27 08:13 ata-WDC_WD10JFCX-68N6GN0_WD-WX61A465TH1Y -> ../../sdc
lrwxrwxrwx 1 root root 10 Srp 27 08:13 ata-WDC_WD10JFCX-68N6GN0_WD-WX61A465TH1Y-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 9 Srp 27 08:13 ata-WDC_WD10JFCX-68N6GN0_WD-WX81EC512Y4H -> ../../sdd
lrwxrwxrwx 1 root root 10 Srp 27 08:13 ata-WDC_WD10JFCX-68N6GN0_WD-WX81EC512Y4H-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 9 Srp 27 08:57 ata-WDC_WD10JFCX-68N6GN0_WD-WXK1E6458WKX -> ../../sdb
lrwxrwxrwx 1 root root 9 Srp 27 08:13 wwn-0x10076999618641940481x -> ../../sdd
lrwxrwxrwx 1 root root 10 Srp 27 08:13 wwn-0x10076999618641940481x-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 9 Srp 27 08:13 wwn-0x11689569317835657217x -> ../../sdc
lrwxrwxrwx 1 root root 10 Srp 27 08:13 wwn-0x11689569317835657217x-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 9 Srp 27 08:57 wwn-0x11769037186453098497x -> ../../sdb
lrwxrwxrwx 1 root root 9 Srp 27 08:13 wwn-0x12757853320186451405x -> ../../sde
lrwxrwxrwx 1 root root 10 Srp 27 08:13 wwn-0x12757853320186451405x-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 Srp 27 08:13 wwn-0x12757853320186451405x-part2 -> ../../sde2
lrwxrwxrwx 1 root root 10 Srp 27 08:13 wwn-0x12757853320186451405x-part9 -> ../../sde9
lrwxrwxrwx 1 root root 9 Srp 27 08:13 wwn-0x7847552951345238016x -> ../../sda
lrwxrwxrwx 1 root root 10 Srp 27 08:13 wwn-0x7847552951345238016x-part1 -> ../../sda1
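In a long listing the new disk is easy to miss. A fresh disk normally has no partitions yet, so one way to spot it is to list the whole-disk entries that have no "-partN" sibling. This is only a sketch under that assumption; the helper name list_unpartitioned_disks is ours, and a replacement disk that was used before may still carry old partitions and would not be listed.

```shell
# List by-id entries for whole disks that have no "-partN" sibling,
# i.e. disks that look unpartitioned. The directory defaults to /dev/disk/by-id.
list_unpartitioned_disks() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        case "$link" in
            *-part[0-9]*) continue ;;   # skip the partition entries themselves
        esac
        # A partitioned disk has at least one "<name>-part..." sibling.
        if ! ls "$link"-part* >/dev/null 2>&1; then
            printf '%s -> %s\n' "${link##*/}" "$(readlink "$link")"
        fi
    done
}
```

Usage: list_unpartitioned_disks
With the listing from this example, only the ata- and wwn- entries that point to ../../sdb would be printed.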
Now that we have the block device name (/dev/sdb in this example), we can create a partition table and a partition on it and prepare the drive for use.
Use parted to create a GPT partition table on the new drive:
~# parted /dev/sdb --script -- mktable gpt
Next, create a partition with a label.
IMPORTANT: the label must be named in the following format: NETSTORx,
where “NETSTOR” is the name of the storage pool on the server and “x” is the drive number.
So, in our example (NETSTOR4):
1. NETSTOR - the virtual pool on the storage server
2. 4 - the number of the disk (disk 4)
Now add the label to the new drive:
~# parted /dev/sdb --script -- mkpart "NETSTOR4" 1 -1
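Because a wrong label keeps the disk from being matched later, it can help to check the label format before touching the drive. The sketch below wraps the two parted commands shown above; the function names is_valid_netstor_label and label_new_disk are hypothetical, and the check assumes the pool is named NETSTOR as in this example.

```shell
# Succeeds only if the label is "NETSTOR" followed by a drive number.
is_valid_netstor_label() {
    printf '%s\n' "$1" | grep -Eq '^NETSTOR[0-9]+$'
}

# Create the GPT table and the labeled partition, refusing bad labels.
# WARNING: this destroys any existing partition table on the device.
label_new_disk() {
    device="$1"
    label="$2"
    if ! is_valid_netstor_label "$label"; then
        echo "invalid label: $label (expected NETSTORx)" >&2
        return 1
    fi
    parted "$device" --script -- mktable gpt &&
    parted "$device" --script -- mkpart "$label" 1 -1
}
```

Usage: label_new_disk /dev/sdb NETSTOR4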
To see the changes, use the following command:
~ # ls -lah /dev/disk/by-partlabel/
total 0
drwxr-xr-x 2 root root 160 Srp 27 09:07 .
drwxr-xr-x 7 root root 140 Srp 27 08:13 ..
lrwxrwxrwx 1 root root 10 Srp 27 08:13 grub -> ../../sde2
lrwxrwxrwx 1 root root 10 Srp 27 08:13 NETSTOR1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Srp 27 08:13 NETSTOR2 -> ../../sdd1
lrwxrwxrwx 1 root root 10 Srp 27 08:13 NETSTOR3 -> ../../sda1
lrwxrwxrwx 1 root root 10 Srp 27 09:07 NETSTOR4 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Srp 27 08:13 zfs-67f0216ac0cdfd49 -> ../../sde1
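The same check can be scripted, for example to confirm that the label points to the partition we just created. A small sketch follows; the helper name partlabel_device is ours.

```shell
# Print the device name (e.g. sdb1) that a partition label points to.
# The directory defaults to /dev/disk/by-partlabel.
partlabel_device() {
    label="$1"
    dir="${2:-/dev/disk/by-partlabel}"
    if [ ! -L "$dir/$label" ]; then
        echo "label $label not found in $dir" >&2
        return 1
    fi
    target=$(readlink "$dir/$label")
    printf '%s\n' "${target##*/}"   # strip the ../../ prefix
}
```

Usage: partlabel_device NETSTOR4
In this example the label should resolve to sdb1.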
We have added a new disk to the system and created a partition and a label. Now we have to add this disk to the storage pool.
The zpool still refers to the failed disk by its guid, so we need to tell it to replace the device with that guid by the new disk.
To do this, first find the guid that the pool configuration records for NETSTOR4.
The zdb command shows it:
~ # zdb
NETSTOR:
version: 5000
name: 'NETSTOR'
state: 0
txg: 3690
pool_guid: 1509362615723986299
errata: 0
hostname: 'PoosyNebulaC'
vdev_children: 2
vdev_tree:
type: 'root'
id: 0
guid: 1509362615723986299
children[0]:
type: 'mirror'
id: 0
guid: 9114691413196561361
metaslab_array: 35
metaslab_shift: 33
ashift: 12
asize: 1000197849088
is_log: 0
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 17673948060813426502
path: '/dev/disk/by-partlabel/NETSTOR1'
whole_disk: 1
create_txg: 4
children[1]:
type: 'disk'
id: 1
guid: 15948717213456048677
path: '/dev/disk/by-partlabel/NETSTOR2'
whole_disk: 1
create_txg: 4
children[1]:
type: 'mirror'
id: 1
guid: 4857650377060724882
metaslab_array: 58
metaslab_shift: 33
ashift: 12
asize: 1000198373376
is_log: 0
create_txg: 224
children[0]:
type: 'disk'
id: 0
guid: 7084287257001519327
path: '/dev/disk/by-partlabel/NETSTOR3'
whole_disk: 0
create_txg: 224
children[1]:
type: 'disk'
id: 1
guid: 3813061267827485888
path: '/dev/disk/by-partlabel/NETSTOR4'
whole_disk: 0
create_txg: 224
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
The important lines from the zdb output:
guid: 3813061267827485888
path: '/dev/disk/by-partlabel/NETSTOR4'
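Instead of reading the guid off the screen, it can be extracted from the zdb output. The sketch below relies on the layout shown above, where each disk entry lists its guid on the line before its path; the function name guid_for_path is hypothetical.

```shell
# Print the guid recorded for the vdev with the given path.
# Reads zdb output on stdin; expects "guid:" to precede "path:" in each entry.
guid_for_path() {
    awk -v want="$1" -v q="'" '
        $1 == "guid:" { guid = $2 }        # remember the most recent guid
        $1 == "path:" {
            gsub(q, "", $2)                # strip the surrounding quotes
            if ($2 == want) print guid
        }
    '
}
```

Usage: zdb | guid_for_path /dev/disk/by-partlabel/NETSTOR4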
Using this guid, we replace the old NETSTOR4 disk in the zpool with the new NETSTOR4 disk.
We can do this with the command:
~# zpool replace NETSTOR <old_guid> <path_to_HDD> -f
Example:
~# zpool replace NETSTOR 3813061267827485888 /dev/disk/by-partlabel/NETSTOR4 -f
This completes the disk replacement procedure.
To see the status of the pool, type:
~ # zpool status
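After a replace, the pool may stay DEGRADED for a while as data is copied to the new disk, so the status may need to be checked more than once. A minimal sketch for reading only the pool state from the output; the helper name pool_state is ours.

```shell
# Print the value of the first "state:" line from zpool status output (stdin).
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}
```

Usage: zpool status NETSTOR | pool_state
The replacement can be considered finished once this prints ONLINE.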
Now restart the swhspared daemon to update the GUI information:
~# /etc/init.d/swhspared restart