{"id":4124,"date":"2015-01-27T19:01:32","date_gmt":"2015-01-27T11:01:32","guid":{"rendered":"http:\/\/rmohan.com\/?p=4124"},"modified":"2015-01-27T19:01:32","modified_gmt":"2015-01-27T11:01:32","slug":"recovering-a-lost-lvm-volume-disk","status":"publish","type":"post","link":"https:\/\/mohan.sg\/?p=4124","title":{"rendered":"Recovering a Lost LVM Volume Disk"},"content":{"rendered":"<p>Overview<\/p>\n<p>Logical Volume Management (LVM) provides a high-level, flexible view of a server&#8217;s disk storage. Though LVM is robust, problems can occur. The purpose of this document is to review the recovery process when a disk is missing or damaged, and then apply that process to plausible examples. When a disk is accidentally removed or damaged in some way that adversely affects the logical volume, the general recovery process is:<\/p>\n<p>Replace the failed or missing disk<br \/>\nRestore the missing disk&#8217;s UUID<br \/>\nRestore the LVM meta data<br \/>\nRepair the file system on the LVM device<br \/>\nThe recovery process will be demonstrated in three specific cases:<\/p>\n<p>A disk belonging to a logical volume group is removed from the server<br \/>\nThe LVM meta data is damaged or corrupted<br \/>\nOne disk in a multi-disk volume group has been permanently removed<br \/>\nThis article discusses how to restore the LVM meta data. This is a risky proposition. If you restore invalid information, you can lose all the data on the LVM device. An important part of LVM recovery is having backups of the meta data to begin with, and knowing how it&#8217;s supposed to look when everything is running smoothly. LVM keeps backup and archive copies of its meta data in \/etc\/lvm\/backup and \/etc\/lvm\/archive. Back up these directories regularly, and be familiar with their contents. 
You should also manually back up the LVM meta data with vgcfgbackup before starting any maintenance projects on your LVM volumes.<\/p>\n<p>If you are planning on removing a disk from the server that belongs to a volume group, you should refer to the LVM HOWTO before doing so.<\/p>\n<p>Server Configuration<\/p>\n<p>In all three examples, a server with SUSE Linux Enterprise Server 10 with Service Pack 1 (SLES10 SP1) will be used with LVM version 2. The examples will use a volume group called &#8220;sales&#8221; with a linear logical volume called &#8220;reports&#8221;. The logical volume and its mount point are shown below. You will need to substitute your mount points and volume names as needed to match your specific environment.<\/p>\n<p>ls-lvm:~ # cat \/proc\/partitions<br \/>\nmajor minor  #blocks  name<\/p>\n<p>   8     0    4194304 sda<br \/>\n   8     1     514048 sda1<br \/>\n   8     2    1052257 sda2<br \/>\n   8     3          1 sda3<br \/>\n   8     5     248976 sda5<br \/>\n   8    16     524288 sdb<br \/>\n   8    32     524288 sdc<br \/>\n   8    48     524288 sdd<\/p>\n<p>ls-lvm:~ # pvcreate \/dev\/sda5 \/dev\/sd[b-d]<br \/>\n  Physical volume &#8220;\/dev\/sda5&#8221; successfully created<br \/>\n  Physical volume &#8220;\/dev\/sdb&#8221; successfully created<br \/>\n  Physical volume &#8220;\/dev\/sdc&#8221; successfully created<br \/>\n  Physical volume &#8220;\/dev\/sdd&#8221; successfully created<\/p>\n<p>ls-lvm:~ # vgcreate sales \/dev\/sda5 \/dev\/sd[b-d]<br \/>\n  Volume group &#8220;sales&#8221; successfully created<\/p>\n<p>ls-lvm:~ # lvcreate -n reports -L +1G sales<br \/>\n  Logical volume &#8220;reports&#8221; created<\/p>\n<p>ls-lvm:~ # pvscan<br \/>\n  PV \/dev\/sda5   VG sales   lvm2 [240.00 MB \/ 240.00 MB free]<br \/>\n  PV \/dev\/sdb    VG sales   lvm2 [508.00 MB \/ 0    free]<br \/>\n  PV \/dev\/sdc    VG sales   lvm2 [508.00 MB \/ 0    free]<br \/>\n  PV \/dev\/sdd    VG sales   lvm2 [508.00 MB \/ 500.00 MB free]<br \/>\n  
Total: 4 [1.72 GB] \/ in use: 4 [1.72 GB] \/ in no VG: 0 [0   ]<\/p>\n<p>ls-lvm:~ # vgs<br \/>\n  VG    #PV #LV #SN Attr   VSize VFree<br \/>\n  sales   4   1   0 wz--n- 1.72G 740.00M<\/p>\n<p>ls-lvm:~ # lvs<br \/>\n  LV      VG    Attr   LSize Origin Snap%  Move Log Copy%<br \/>\n  reports sales -wi-ao 1.00G<\/p>\n<p>ls-lvm:~ # mount | grep sales<br \/>\n\/dev\/mapper\/sales-reports on \/sales\/reports type ext3 (rw)<\/p>\n<p>ls-lvm:~ # df -h \/sales\/reports<br \/>\nFilesystem            Size  Used Avail Use% Mounted on<br \/>\n\/dev\/mapper\/sales-reports<br \/>\n                     1008M   33M  925M   4% \/sales\/reports<br \/>\nDisk Belonging to a Volume Group Removed<\/p>\n<p>Removing a disk that belongs to a logical volume group from the server may sound a bit strange, but with Storage Area Networks (SAN) or fast-paced schedules, it happens.<\/p>\n<p>Symptom:<\/p>\n<p>The first thing you may notice when the server boots are messages like:<\/p>\n<p>&#8220;Couldn&#8217;t find all physical volumes for volume group sales.&#8221;<br \/>\n&#8220;Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.&#8221;<br \/>\n&#8216;Volume group &#8220;sales&#8221; not found&#8217;<\/p>\n<p>Type root&#8217;s password.<br \/>\nEdit the \/etc\/fstab file.<br \/>\nComment out the line with \/dev\/sales\/reports<br \/>\nReboot<br \/>\nThe LVM symptom is a missing sales volume group. 
Typing cat \/proc\/partitions confirms the server is missing one of its disks.<\/p>\n<p>ls-lvm:~ # cat \/proc\/partitions<br \/>\nmajor minor  #blocks  name<\/p>\n<p>   8     0    4194304 sda<br \/>\n   8     1     514048 sda1<br \/>\n   8     2    1052257 sda2<br \/>\n   8     3          1 sda3<br \/>\n   8     5     248976 sda5<br \/>\n   8    16     524288 sdb<br \/>\n   8    32     524288 sdc<\/p>\n<p>ls-lvm:~ # pvscan<br \/>\n  Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.<br \/>\n  Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.<br \/>\n  Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.<br \/>\n  Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.<br \/>\n  Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.<br \/>\n  Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.<br \/>\n  PV \/dev\/sda5        VG sales   lvm2 [240.00 MB \/ 240.00 MB free]<br \/>\n  PV \/dev\/sdb         VG sales   lvm2 [508.00 MB \/ 0    free]<br \/>\n  PV unknown device   VG sales   lvm2 [508.00 MB \/ 0    free]<br \/>\n  PV \/dev\/sdc         VG sales   lvm2 [508.00 MB \/ 500.00 MB free]<br \/>\n  Total: 4 [1.72 GB] \/ in use: 4 [1.72 GB] \/ in no VG: 0 [0   ]<br \/>\nSolution:<\/p>\n<p>Fortunately, the meta data and file system on the disk that was \/dev\/sdc are intact.<br \/>\nSo the recovery is to just put the disk back.<br \/>\nReboot the server.<br \/>\nThe \/etc\/init.d\/boot.lvm start script will scan and activate the volume group at boot time.<br \/>\nDon&#8217;t forget to uncomment the \/dev\/sales\/reports device in the \/etc\/fstab file.<br \/>\nIf this procedure does not work, then you may have corrupt LVM meta data.<\/p>\n<p>Corrupted LVM Meta Data<\/p>\n<p>The LVM meta data does not get corrupted very often; but when it 
does, the file system on the LVM logical volume should also be considered unstable. The goal is to recover the LVM volume, and then check file system integrity.<\/p>\n<p>Symptom 1:<\/p>\n<p>Attempting to activate the volume group gives the following:<\/p>\n<p>ls-lvm:~ # vgchange -ay sales<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  Couldn&#8217;t read volume group metadata.<br \/>\n  Volume group sales metadata is inconsistent<br \/>\n  Volume group for uuid not found: m4Cg2vkBVSGe1qSMNDf63v3fDHqN4uEkmWoTq5TpHpRQwmnAGD18r44OshLdHj05<br \/>\n  0 logical volume(s) in volume group &#8220;sales&#8221; now active<br \/>\nThis symptom is the result of a minor change in the meta data. In fact, only three bytes were overwritten. Since only a portion of the meta data was damaged, LVM can compare its internal checksum against the meta data on the device and know it&#8217;s wrong. 
There is enough meta data for LVM to know that the &#8220;sales&#8221; volume group and devices exist, but are unreadable.<\/p>\n<p>ls-lvm:~ # pvscan<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  PV \/dev\/sda5   VG sales   lvm2 [240.00 MB \/ 240.00 MB free]<br \/>\n  PV \/dev\/sdb    VG sales   lvm2 [508.00 MB \/ 0    free]<br \/>\n  PV \/dev\/sdc    VG sales   lvm2 [508.00 MB \/ 0    free]<br \/>\n  PV \/dev\/sdd    VG sales   lvm2 [508.00 MB \/ 500.00 MB free]<br \/>\n  Total: 4 [1.72 GB] \/ in use: 4 [1.72 GB] \/ in no VG: 0 [0   ]<br \/>\nNotice pvscan shows all devices present and associated with the sales volume group. It&#8217;s not the device UUID that is not found, but the volume group UUID.<\/p>\n<p>Solution 1:<\/p>\n<p>Since the disk was never removed, leave it as is.<br \/>\nThere were no device UUID errors, so don&#8217;t attempt to restore the UUIDs.<br \/>\nThis is a good candidate to just try restoring the LVM meta data.<br \/>\nls-lvm:~ # vgcfgrestore sales<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  \/dev\/sdc: Checksum error<br \/>\n  Restored volume group sales<\/p>\n<p>ls-lvm:~ # vgchange -ay sales<br \/>\n  1 logical volume(s) in volume group &#8220;sales&#8221; now active<\/p>\n<p>ls-lvm:~ # pvscan<br \/>\n  PV \/dev\/sda5   VG sales   lvm2 [240.00 MB \/ 240.00 MB free]<br \/>\n  PV \/dev\/sdb    VG sales   lvm2 [508.00 MB \/ 0    free]<br \/>\n  PV \/dev\/sdc    VG sales   lvm2 [508.00 MB \/ 0    free]<br \/>\n  PV \/dev\/sdd    VG sales   lvm2 [508.00 MB \/ 500.00 MB free]<br \/>\n  Total: 4 [1.72 GB] \/ in use: 4 [1.72 GB] \/ in no VG: 0 [0   ]<br \/>\nRun a file system check on \/dev\/sales\/reports.<br \/>\nls-lvm:~ # e2fsck \/dev\/sales\/reports<br \/>\ne2fsck 1.38 
(30-Jun-2005)<br \/>\n\/dev\/sales\/reports: clean, 961\/131072 files, 257431\/262144 blocks<\/p>\n<p>ls-lvm:~ # mount \/dev\/sales\/reports \/sales\/reports\/<\/p>\n<p>ls-lvm:~ # df -h \/sales\/reports\/<br \/>\nFilesystem            Size  Used Avail Use% Mounted on<br \/>\n\/dev\/mapper\/sales-reports<br \/>\n                     1008M  990M     0 100% \/sales\/reports<br \/>\nSymptom 2:<\/p>\n<p>Minor damage to the LVM meta data is easily fixed with vgcfgrestore. If the meta data is gone, or severely damaged, then LVM will consider that disk as an &#8220;unknown device.&#8221; If the volume group contains only one disk, then the volume group and its logical volumes will simply be gone. In this case the symptom is the same as if the disk was accidentally removed, with the exception of the device name. Since \/dev\/sdc was not actually removed from the server, the devices are still labeled a through d.<\/p>\n<p>ls-lvm:~ # pvscan<br \/>\n  Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.<br \/>\n  Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.<br \/>\n  Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.<br \/>\n  Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.<br \/>\n  Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.<br \/>\n  Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.<br \/>\n  PV \/dev\/sda5        VG sales   lvm2 [240.00 MB \/ 240.00 MB free]<br \/>\n  PV \/dev\/sdb         VG sales   lvm2 [508.00 MB \/ 0    free]<br \/>\n  PV unknown device   VG sales   lvm2 [508.00 MB \/ 0    free]<br \/>\n  PV \/dev\/sdd         VG sales   lvm2 [508.00 MB \/ 500.00 MB free]<br \/>\n  Total: 4 [1.72 GB] \/ in use: 4 [1.72 GB] \/ in no VG: 0 [0   ]<br \/>\nSolution 2:<\/p>\n<p>First, replace the disk. 
Most likely the disk is already there, just damaged.<br \/>\nSince the UUID on \/dev\/sdc is not there, a vgcfgrestore will not work.<br \/>\nls-lvm:~ # vgcfgrestore sales<br \/>\n  Couldn&#8217;t find device with uuid &#8217;56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu&#8217;.<br \/>\n  Couldn&#8217;t find all physical volumes for volume group sales.<br \/>\n  Restore failed.<br \/>\nComparing the output of cat \/proc\/partitions and pvscan shows the missing device is \/dev\/sdc, and pvscan shows which UUID it needs for that device. So, copy and paste the UUID that pvscan shows for \/dev\/sdc.<br \/>\nls-lvm:~ # pvcreate --uuid 56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu \/dev\/sdc<br \/>\n  Physical volume &#8220;\/dev\/sdc&#8221; successfully created<br \/>\nRestore the LVM meta data<br \/>\nls-lvm:~ # vgcfgrestore sales<br \/>\n  Restored volume group sales<\/p>\n<p>ls-lvm:~ # vgscan<br \/>\n  Reading all physical volumes.  This may take a while...<br \/>\n  Found volume group &#8220;sales&#8221; using metadata type lvm2<\/p>\n<p>ls-lvm:~ # vgchange -ay sales<br \/>\n  1 logical volume(s) in volume group &#8220;sales&#8221; now active<br \/>\nRun a file system check on \/dev\/sales\/reports.<br \/>\nls-lvm:~ # e2fsck \/dev\/sales\/reports<br \/>\ne2fsck 1.38 (30-Jun-2005)<br \/>\n\/dev\/sales\/reports: clean, 961\/131072 files, 257431\/262144 blocks<\/p>\n<p>ls-lvm:~ # mount \/dev\/sales\/reports \/sales\/reports\/<\/p>\n<p>ls-lvm:~ # df -h \/sales\/reports<br \/>\nFilesystem            Size  Used Avail Use% Mounted on<br \/>\n\/dev\/mapper\/sales-reports<br \/>\n                     1008M  990M     0 100% \/sales\/reports<\/p>\n<p>Disk Permanently Removed<\/p>\n<p>This is the most severe case. Obviously if the disk is gone and unrecoverable, the data on that disk is likewise unrecoverable. This is a great time to feel good knowing you have a solid backup to rely on. 
However, if the good feelings are gone, and there is no backup, how do you recover as much data as possible from the remaining disks in the volume group? No attempt will be made to address the data on the unrecoverable disk; this topic will be left to the data recovery experts.<\/p>\n<p>Symptom:<\/p>\n<p>The symptom will be the same as Symptom 2 in the Corrupted LVM Meta Data section above. You will see errors about an &#8220;unknown device&#8221; and missing device with UUID.<\/p>\n<p>Solution:<\/p>\n<p>Add a replacement disk to the server. Make sure the disk is empty.<br \/>\nCreate the LVM meta data on the new disk using the old disk&#8217;s UUID that pvscan displays.<br \/>\nls-lvm:~ # pvcreate --uuid 56ogEk-OzLS-cKBc-z9vJ-kP65-DUBI-hwZPSu \/dev\/sdc<br \/>\n  Physical volume &#8220;\/dev\/sdc&#8221; successfully created<br \/>\nRestore the backup copy of the LVM meta data for the sales volume group.<br \/>\nls-lvm:~ # vgcfgrestore sales<br \/>\n  Restored volume group sales<\/p>\n<p>ls-lvm:~ # vgscan<br \/>\n  Reading all physical volumes.  This may take a while...<br \/>\n  Found volume group &#8220;sales&#8221; using metadata type lvm2<\/p>\n<p>ls-lvm:~ # vgchange -ay sales<br \/>\n  1 logical volume(s) in volume group &#8220;sales&#8221; now active<br \/>\nRun a file system check to rebuild the file system.<br \/>\nls-lvm:~ # e2fsck -y \/dev\/sales\/reports<br \/>\ne2fsck 1.38 (30-Jun-2005)<br \/>\n--snip--<br \/>\nFree inodes count wrong for group #5 (16258, counted=16384).<br \/>\nFix? yes<\/p>\n<p>Free inodes count wrong (130111, counted=130237).<br \/>\nFix? yes<\/p>\n<p>\/dev\/sales\/reports: ***** FILE SYSTEM WAS MODIFIED *****<br \/>\n\/dev\/sales\/reports: 835\/131072 files (5.7% non-contiguous), 137213\/262144 blocks<br \/>\nMount the file system and recover as much data as possible.<br \/>\nNOTE: If the missing disk contains the beginning of the file system, then the file system&#8217;s superblock will be missing. 
You will need to rebuild or use an alternate superblock. Restoring a file system superblock is outside the scope of this article; please refer to your file system&#8217;s documentation.<br \/>\nConclusion<\/p>\n<p>LVM by default keeps backup copies of its meta data for all LVM devices. These backup files are stored in \/etc\/lvm\/backup and \/etc\/lvm\/archive. If a disk is removed or the meta data gets damaged in some way, it can be easily restored, if you have backups of the meta data. This is why it is highly recommended to never turn off LVM&#8217;s auto backup feature. Even if a disk is permanently removed from the volume group, it can be reconstructed, and often the remaining data on the file system can be recovered.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Overview<\/p>\n<p>Logical Volume Management (LVM) provides a high-level, flexible view of a server&#8217;s disk storage. Though LVM is robust, problems can occur. The purpose of this document is to review the recovery process when a disk is missing or damaged, and then apply that process to plausible examples. 
When a disk is accidentally removed or damaged [&#8230;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,17],"tags":[],"_links":{"self":[{"href":"https:\/\/mohan.sg\/index.php?rest_route=\/wp\/v2\/posts\/4124"}],"collection":[{"href":"https:\/\/mohan.sg\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mohan.sg\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mohan.sg\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mohan.sg\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4124"}],"version-history":[{"count":1,"href":"https:\/\/mohan.sg\/index.php?rest_route=\/wp\/v2\/posts\/4124\/revisions"}],"predecessor-version":[{"id":4125,"href":"https:\/\/mohan.sg\/index.php?rest_route=\/wp\/v2\/posts\/4124\/revisions\/4125"}],"wp:attachment":[{"href":"https:\/\/mohan.sg\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4124"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mohan.sg\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4124"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mohan.sg\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4124"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}