
SERVERware 3 Cluster Mirror Edition Storage Pool Faulty Disk Replacement

When one of the disks in the storage pool is damaged, the following procedure should be followed:

If a disk fails, the zpool on the primary server will be in the DEGRADED state:

~# zpool status
pool: NETSTOR
state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 0 in 0h0m with 0 errors on Tue Dec  6 15:10:59 2016
config:
        NAME                    STATE     READ WRITE CKSUM
        NETSTOR                 DEGRADED     0     0     0
          mirror-0              DEGRADED     0     0     0
            SW3-NETSTOR-SRV1-1  ONLINE       0     0     0
            SW3-NETSTOR-SRV2-1  FAULTED      3     0     0  too many errors
errors: No known data errors

First, we have to make sure the damaged disk is on the secondary server, not the primary. We can determine this from the output above:

SW3-NETSTOR-SRV2-1  FAULTED

SRV2 means that Server 2 has the damaged disk.

If this is the case, we can proceed to the next step. If the damaged disk is on the primary server (SRV1), we should first perform a manual takeover and switch the primary role to the secondary server. To switch manually, SSH to the secondary server and execute the following command:

killall -SIGUSR1 sysmonit
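
After the takeover completes, you can optionally confirm that the NETSTOR pool is now imported on the new primary server before continuing (zpool list is a standard ZFS command; this check is not part of the original procedure):

~# zpool list NETSTOR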


Next, we should physically replace the damaged disk in the server.

In the output from zpool status we can see that SW3-NETSTOR-SRV2-1 is faulted:

  SW3-NETSTOR-SRV2-1  FAULTED      3     0     0  too many errors

If this is the case, we need to replace the disk labeled SW3-NETSTOR-SRV2-1 with a new one and add it to the zpool mirror.

First, physically remove the faulty disk from the server and replace it with a new disk.

After replacement, we should see the new disk in /dev/disk/by-id/:


# ls -lah /dev/disk/by-id
total 0
drwxr-xr-x 2 root root 480 Srp 27 08:57 .
drwxr-xr-x 7 root root 140 Srp 27 08:13 ..
lrwxrwxrwx 1 root root   9 Srp 27 08:13 ata-INTEL_SSDSC2CW060A3_CVCV308402M3060AGN -> ../../sde
lrwxrwxrwx 1 root root  10 Srp 27 08:13 ata-INTEL_SSDSC2CW060A3_CVCV308402M3060AGN-part1 -> ../../sde1
lrwxrwxrwx 1 root root  10 Srp 27 08:13 ata-INTEL_SSDSC2CW060A3_CVCV308402M3060AGN-part2 -> ../../sde2
lrwxrwxrwx 1 root root  10 Srp 27 08:13 ata-INTEL_SSDSC2CW060A3_CVCV308402M3060AGN-part9 -> ../../sde9
lrwxrwxrwx 1 root root   9 Srp 27 08:13 ata-ST31000520AS_5VX0BZN0 -> ../../sda
lrwxrwxrwx 1 root root  10 Srp 27 08:13 ata-ST31000520AS_5VX0BZN0-part1 -> ../../sda1
lrwxrwxrwx 1 root root   9 Srp 27 08:13 ata-WDC_WD10JFCX-68N6GN0_WD-WX61A465TH1Y -> ../../sdc
lrwxrwxrwx 1 root root  10 Srp 27 08:13 ata-WDC_WD10JFCX-68N6GN0_WD-WX61A465TH1Y-part1 -> ../../sdc1
lrwxrwxrwx 1 root root   9 Srp 27 08:13 ata-WDC_WD10JFCX-68N6GN0_WD-WX81EC512Y4H -> ../../sdd
lrwxrwxrwx 1 root root  10 Srp 27 08:13 ata-WDC_WD10JFCX-68N6GN0_WD-WX81EC512Y4H-part1 -> ../../sdd1
lrwxrwxrwx 1 root root   9 Srp 27 08:57 ata-WDC_WD10JFCX-68N6GN0_WD-WXK1E6458WKX -> ../../sdb

lrwxrwxrwx 1 root root   9 Srp 27 08:13 wwn-0x10076999618641940481x -> ../../sdd
lrwxrwxrwx 1 root root  10 Srp 27 08:13 wwn-0x10076999618641940481x-part1 -> ../../sdd1
lrwxrwxrwx 1 root root   9 Srp 27 08:13 wwn-0x11689569317835657217x -> ../../sdc
lrwxrwxrwx 1 root root  10 Srp 27 08:13 wwn-0x11689569317835657217x-part1 -> ../../sdc1
lrwxrwxrwx 1 root root   9 Srp 27 08:57 wwn-0x11769037186453098497x -> ../../sdb
lrwxrwxrwx 1 root root   9 Srp 27 08:13 wwn-0x12757853320186451405x -> ../../sde
lrwxrwxrwx 1 root root  10 Srp 27 08:13 wwn-0x12757853320186451405x-part1 -> ../../sde1
lrwxrwxrwx 1 root root  10 Srp 27 08:13 wwn-0x12757853320186451405x-part2 -> ../../sde2
lrwxrwxrwx 1 root root  10 Srp 27 08:13 wwn-0x12757853320186451405x-part9 -> ../../sde9
lrwxrwxrwx 1 root root   9 Srp 27 08:13 wwn-0x7847552951345238016x -> ../../sda
lrwxrwxrwx 1 root root  10 Srp 27 08:13 wwn-0x7847552951345238016x-part1 -> ../../sda1
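
If the listing is long, sorting by modification time can make the newly added disk easier to spot (an optional convenience, not part of the original procedure):

~# ls -ltr /dev/disk/by-id/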

In this example, the new disk is /dev/sdb, the entry added at 08:57 that has no partitions yet. Now that we have the block device name, we can create a partition table and a partition, and prepare the drive for use.

To create the partition table, use parted:


~# parted /dev/<name of your drive> --script -- mktable gpt
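
For example, assuming the new disk is /dev/sdb as identified above, the command would be:

~# parted /dev/sdb --script -- mktable gpt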

Create a new label.


IMPORTANT: The label must be named in the following format: SW3-NETSTOR-SRVx-y,
where “SRVx” is the server number and “y” is the disk number.

So, in our example (SW3-NETSTOR-SRV2-1):


1. SW3-NETSTOR-SRV2 - this means virtual disk on SERVER 2
2. -1               - this is the number of the disk (disk 1)


Now add a label to the new drive.


Create the partition with a name that matches the faulty partition on the server. We have this name from the zpool status output above:

SW3-NETSTOR-SRV2-1  FAULTED      3     0     0  too many errors

Our command in this case will be:

 
~# parted /dev/<name of your drive> --script -- mkpart "SW3-NETSTOR-SRV2-1" 1 -1
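
Optionally, verify that the new partition label was created; it should appear under /dev/disk/by-partlabel/, which is the same path the pool references later in this procedure (this check is not part of the original procedure):

~# ls -l /dev/disk/by-partlabel/ | grep SW3-NETSTOR-SRV2-1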

We have now added a new partition and created a label. Next, we need to edit the mirror configuration file: /etc/tgt/mirror/SW3-NETSTOR-SRV2.conf.


IMPORTANT: Before we can edit the configuration file, we need to log out of the iSCSI session on the primary server.

Connect through SSH to the primary server and use the iscsiadm command to log out:

 
~# iscsiadm -m node -T <logical drive name> --logout

Example:

~# iscsiadm -m node -T SW3-NETSTOR-SRV2-1 --logout
Logging out of session [sid: 1, target: SW3-NETSTOR-SRV2-1, portal: 192.168.1.47,3259]
Logout of [sid: 1, target: SW3-NETSTOR-SRV2-1, portal: 192.168.1.47,3259] successful.

Now we can proceed with editing the configuration file on the secondary server.


The contents of the file look like this:

<target SW3-NETSTOR-SRV2-1>
        <direct-store /dev/disk/by-id/scsi-3600508b1001cccb52dca8f7cfea0d8df>
                write-cache on
                bs-type rdwr
        </direct-store>
        initiator-address 192.168.1.46
</target>

We need to replace the SCSI ID so that it matches the ID of the new disk.

To see the new ID use this command:

~# ls -lah /dev/disk/by-id/
lrwxrwxrwx 1 root root   9 Pro  6 15:54 scsi-3600508b1001cb960a9daa8733452c470 -> ../../sdd

Now edit the configuration file:

~# nano /etc/tgt/mirror/SW3-NETSTOR-SRV2.conf

Replace the ID with the new one and save the file.
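
With the new ID from the listing above, the edited file should look like this:

<target SW3-NETSTOR-SRV2-1>
        <direct-store /dev/disk/by-id/scsi-3600508b1001cb960a9daa8733452c470>
                write-cache on
                bs-type rdwr
        </direct-store>
        initiator-address 192.168.1.46
</target>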


Next, we need to update the target from the configuration file we have just edited.

To update the target information from the configuration file, we will use the following command:

~# tgt-admin -C 2 --update ALL -c /etc/tgt/mirror.conf -v
Using /etc/tgt/mirror.conf as configuration file
default-driver not defined, defaulting to iscsi.
Removing target: SW3-NETSTOR-SRV2-1
tgtadm -C 2 --mode target --op delete --tid=1
Adding target: SW3-NETSTOR-SRV2-1
tgtadm -C 2 --lld iscsi --op new --mode target --tid 1 -T SW3-NETSTOR-SRV2-1
tgtadm -C 2 --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 -b /dev/disk/by-id/scsi-3600508b1001cb960a9daa8733452c470  --bstype rdwr   --blocksize 512
Write cache is enabled (default) for lun 1.
tgtadm -C 2 --lld iscsi --op update --mode logicalunit --tid 1 --lun=1 --params vendor_id="HP      "
tgtadm -C 2 --lld iscsi --op update --mode logicalunit --tid 1 --lun=1 --params product_id="LOGICAL VOLUME  "
tgtadm -C 2 --lld iscsi --op update --mode logicalunit --tid 1 --lun=1 --params product_rev="6.64"
tgtadm -C 2 --lld iscsi --op update --mode logicalunit --tid 1 --lun=1 --params scsi_sn="500143802592D000"
tgtadm -C 2 --lld iscsi --op update --mode logicalunit --tid 1 --lun=1 --params lbppbe="3"
tgtadm -C 2 --lld iscsi --op update --mode logicalunit --tid 1 --lun=1 --params la_lba="0"
tgtadm -C 2 --lld iscsi --op bind --mode target --tid 1 -I 192.168.1.46
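
Optionally, you can verify that the target is exported again and backed by the new device (the tgtadm show operation lists the configured targets and their LUNs; this check is not part of the original procedure):

~# tgtadm -C 2 --lld iscsi --mode target --op show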

This ends our procedure on the secondary server.

Next, on the primary server, add the newly created virtual disk to the ZFS pool.


First, we need to log back in to the iSCSI target we exported earlier:

~# iscsiadm -m node -T <remote logical disk name> --login

Example:

~# iscsiadm -m node -T SW3-NETSTOR-SRV2-1 --login

Logging in to [iface: default, target: SW3-NETSTOR-SRV2-1, portal: 192.168.1.47,3259] (multiple)
Login to [iface: default, target: SW3-NETSTOR-SRV2-1, portal: 192.168.1.47,3259] successful.
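
Optionally, confirm that the iSCSI session has been re-established (iscsiadm -m session is standard usage and not part of the original procedure):

~# iscsiadm -m session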

We can now check the zpool status:

~# zpool status
  pool: NETSTOR
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 0 in 0h0m with 0 errors on Tue Dec  6 15:10:59 2016
config:


        NAME                    STATE     READ WRITE CKSUM
        NETSTOR                 DEGRADED     0     0     0
          mirror-0              DEGRADED     0     0     0
            SW3-NETSTOR-SRV1-1  ONLINE       0     0     0
            SW3-NETSTOR-SRV2-1  FAULTED      3     0     0  too many errors


errors: No known data errors


From the output, we can see that the secondary disk is still FAULTED:

SW3-NETSTOR-SRV2-1  FAULTED


Now we need to replace the old disk, referenced by its GUID, with the new disk, so that the zpool can identify the new device.

To do that, we first need to find the GUID that the pool has recorded for the faulted disk.

We can use the zdb command to find it:

~# zdb
NETSTOR:
    version: 5000
    name: 'NETSTOR'
    state: 0
    txg: 15
    pool_guid: 14112818788567273316
    errata: 0
    hostname: 'HydraA-1'
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 14112818788567273316
        children[0]:
            type: 'mirror'
            id: 0
            guid: 17350955661294397060
            metaslab_array: 34
            metaslab_shift: 33
            ashift: 12
            asize: 1000164294656
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 11541101181530606692
                path: '/dev/disk/by-partlabel/SW3-NETSTOR-SRV1-1'
                whole_disk: 1
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 12365645279327980714
                path: '/dev/disk/by-partlabel/SW3-NETSTOR-SRV2-1'
                whole_disk: 1
                create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data

The important lines from the zdb output are:

guid: 12365645279327980714
path: '/dev/disk/by-partlabel/SW3-NETSTOR-SRV2-1'
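
On a larger pool, you can locate these two lines directly with grep, for example (an optional shortcut, not part of the original procedure):

~# zdb | grep -B1 SW3-NETSTOR-SRV2-1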

This GUID is the one we need to pass to zpool when replacing the device.

We can do this with the following command:

~# zpool replace NETSTOR <guid> <path_to_HDD> -f


Example:

~# zpool replace NETSTOR 12365645279327980714 /dev/disk/by-partlabel/SW3-NETSTOR-SRV2-1 -f


Now check zpool status:

~# zpool status
  pool: NETSTOR
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Dec  6 16:12:53 2016
    591M scanned out of 728M at 65,6M/s, 0h0m to go
    590M resilvered, 81,14% done
config:


        NAME                      STATE     READ WRITE CKSUM
        NETSTOR                   DEGRADED     0     0     0
          mirror-0                DEGRADED     0     0     0
            SW3-NETSTOR-SRV1-1    ONLINE       0     0     0
            replacing-1           UNAVAIL      0     0     0
              old                 UNAVAIL      0     0     0    corrupted data
              SW3-NETSTOR-SRV2-1  ONLINE       0     0     0  (resilvering)


errors: No known data errors

You need to wait for the zpool to finish resilvering.
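
You can monitor progress by re-running zpool status periodically, for example with the standard watch utility (an optional convenience, not part of the original procedure):

~# watch -n 10 zpool status NETSTOR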

Now restart the swhspared daemon to update the GUI information.

~# /etc/init.d/swhspared restart 
 

This ends our replacement procedure.