I have an interesting problem.
Linux box. /, /boot, and swap are on 3 RAID1 partitions across 2 disks.
Currently the RAID is degraded, just using one 750G disk. / (ext3) is 700G, only 200G used.
No LVM.
I added a 500G SSD. I want that to be the new 2nd RAID1 disk.
I need to shrink / to be ~450G before I can do this.
resize2fs can only shrink an unmounted filesystem. That's a problem. RAID1 can shrink the RAID block dev once I get the fs shrunk, so that's not a problem.
I'm offsite and want to find a way to do this without going onsite and using single-user mode or a boot cd.
Options?
I guess I could make a new, smaller RAID1 / on the SSD, quiet down all services, and do a cp -a or cpio or something? Then get the system to boot off the new / and ignore the old one, and reboot. Besides in theory, has anyone actually done a whole cpio or cp -a of an entire *running* / and been successful? Sample command lines? I guess nowadays there would be zero dev files that need to be copied because udev recreates them all? So it's literally just files, dirs, links and fifos that need copying?
Any options using just ext3 and mdadm tools? Surely there must be some way... unless the no-shrink-mounted makes that impossible.
Maybe I'm missing something.
Thanks!
It's been a while since I've had to do this, but I have successfully replicated a live root volume on a few occasions using rsync. It could be as simple as doing this:
rsync -aHXx /. /mnt/newdisk/.
I've adopted the "/." at the end of both paths to rsync as a defensive measure, because otherwise rsync can make some funky assumptions about where you want source directories put in the target directory. I've often ended up with the target having an extra level of directory when I tell rsync to copy one directory to another without the /. at the end. The advantage of using rsync is that you can rerun it to more quickly update the target if you suspect the source changed during the first copy, which can happen even on a relatively quiet system. To make the subsequent update even quicker, you can just run it on directories that are likely to have changed, e.g. /etc.
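To see the difference the trailing "/." makes, here is a minimal sketch that copies a throwaway tree both ways. The function name demo_trailing_dot and the scratch layout are made up for illustration, and it assumes rsync and mktemp are installed.

```shell
#!/bin/sh
# Sketch only: compare "src" with "src/." as the rsync source, using a
# scratch directory so nothing real is touched.
demo_trailing_dot() {
    work=$(mktemp -d) || return 1
    mkdir -p "$work/src/sub" "$work/a" "$work/b"
    echo hello > "$work/src/sub/file"

    # No trailing slash: the directory itself is copied, giving a/src/sub/file
    rsync -a "$work/src" "$work/a/"

    # "/." on both sides: only the contents are copied, giving b/sub/file
    rsync -a "$work/src/." "$work/b/."

    # Show where the file landed in each target
    (cd "$work" && find a b -type f | sort)
    rm -rf "$work"
}
```

Running demo_trailing_dot lists the file once under a/src/ (the extra directory level) and once directly under b/.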
I don't recall if I had to play around with --exclude options to avoid paths I didn't want traversed, or if the -x took care of all of these. I do recall that on a few occasions I played it safe by using a for loop to only copy specific subdirectories and avoid things like /sys and /proc altogether, but that was a bit more complicated and error-prone (easy to miss something important).
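For what it's worth, the per-directory loop could look roughly like the sketch below. It only prints the rsync commands so they can be reviewed first. The function name plan_root_copy, the SKIP list, and the target path are assumptions for illustration, not what I actually ran back then.

```shell
#!/bin/sh
# Sketch only: print (do not run) one rsync command per top-level entry,
# leaving out pseudo-filesystems and scratch mount points.
# The SKIP list is an assumption; adjust it for the system at hand.
SKIP="proc sys dev run tmp mnt media lost+found"

plan_root_copy() {
    src=${1%/}     # source root, e.g. / (becomes empty, so the glob is /*)
    dest=${2%/}    # mounted target, e.g. /mnt/newdisk
    for path in "$src"/*; do
        # Ignore the unexpanded pattern if the glob matched nothing
        [ -e "$path" ] || [ -L "$path" ] || continue
        name=${path##*/}
        case " $SKIP " in
            *" $name "*) echo "# skipped: $path"; continue ;;
        esac
        # Without a trailing slash rsync recreates the entry itself
        # (directory, symlink, or file) inside the target.
        echo "rsync -aHXx $path $dest/"
    done
}
```

Calling plan_root_copy / /mnt/newdisk prints the plan. Note that the glob does not match dot-files at the top level, and the skipped directories still have to be created by hand on the target as empty mount points, which is exactly the sort of thing that is easy to miss.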
Note that if you're using SELinux on your system, you'll want to verify that the target's contexts are correctly set, and chcon any that aren't. rsync with the -X option should copy contexts accurately, but you may want to manually override the context for any mount points on the target volume, and make sure the volume's root directory is set to system_u:object_r:root_t. ls -Zld dir is the way I examine contexts (in addition to modes & ownership).
Hope this helps.
Gilles
On 2016-03-30 20:56, Trevor Cordes wrote:
[...]
_______________________________________________
Roundtable mailing list
Roundtable@muug.mb.ca
http://www.muug.mb.ca/mailman/listinfo/roundtable
I 100% agree about defensive use of trailing "/." in rsync... I just gave up trying to figure out what combination of magical trailing slashes and lack thereof caused which behaviour and *always* terminate with a final "/." now.
On most modern-ish systems with modern-ish versions of rsync, you may want to consider "-vaxHAXES". Some of those options won't exist on older systems, adjust as necessary. If you have sparse files and *don't* use -S you're in for a looooooooong copy session.
Generally "-x" makes dev, dev/pts, dev/shm, proc, sys, etc., etc., get handled correctly: the mountpoints should get copied but not traversed. And as the operator, you'll generally know if there are any really, really "special" directories that rsync can't detect, but --exclude shouldn't be needed nowadays.
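If you want to check beforehand what "-x" will leave alone, a rough sketch like this lists every mount point other than / from the kernel's mount table. The function name list_skipped_mounts is made up for illustration. It is only an approximation: rsync decides by device number, so a bind mount of the root filesystem onto itself would still be traversed, and mount points containing spaces show up escaped as \040.

```shell
#!/bin/sh
# Sketch only: print the mount points that an "rsync -x" started at /
# would copy as empty directories instead of descending into.
# Input is in /proc/self/mounts format:
#   device mountpoint fstype options dump pass
list_skipped_mounts() {
    mounts_file=${1:-/proc/self/mounts}
    awk '$2 != "/" { print $2 }' "$mounts_file" | sort -u
}
```

Run it with no argument on the live system, or point it at a saved copy of the mount table.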
I use this quite frequently to migrate systems; for some KVM setups, it can be *faster* than doing a live migration of an oversized raw-provisioned disk. I've also used this to transition from thick- to thin-provisioned disks.
The much bigger, harder problem lies in convincing GRUB to install its bootloader correctly on the second volume (and point to the second volume for stage1, stage1.5, stage2, /boot, kernel, etc.). I have *once* managed to do this successfully without using a rescue CD, and I sure wish I could figure out how I did it! The one pointer I can provide is that physically removing the first disk before trying to reinstall GRUB from rescue media does make the job a heck of a lot easier!
Remember to pre-create the correct RAID layout with a "missing" disk *before* rsync'ing anything. And then good luck convincing the system that that's really md0, not md1 or md127 or ... Use LABEL= in your boot command lines and fstabs, not UUIDs or raw device nodes for scenarios like this. At least you can sanely control LABELs, and they're easy to change as needed.
As far as SELinux goes, I still refuse to waste my time creating custom SELinux policies to allow me to do what I consider to be fairly "normal" things, so on systems that only run RedHat-provided daemons, it's still enabled (because it doesn't break anything) and on every other system it's either disabled altogether, or that system just doesn't run RHEL/CentOS any more.
IMHO SELinux itself probably shouldn't be trusted, given what we now know about the NSA. In any case, it's about as useful as systemd to me - yet another "improvement" forced on me as a sysadmin. *grumble*
-Adam
On 16-03-31 11:47 AM, Gilles Detillieux wrote:
[...]
On 2016-03-31 Adam Thompson wrote:
> On most modern-ish systems with modern-ish versions of rsync, you may want to consider "-vaxHAXES". Some of those options won't exist on older systems, adjust as necessary. If you have sparse files and *don't* use -S you're in for a looooooooong copy session.
> [...]
> The much bigger, harder problem lies in convincing GRUB to install its bootloader correctly on the second volume (and point to the second volume for stage1, stage1.5, stage2, /boot, kernel, etc.). I have *once* managed to do this successfully without using a rescue CD, and I sure wish I could figure out how I did it! The one pointer I can provide is that physically removing the first disk before trying to reinstall GRUB from rescue media does make the job a heck of a lot easier!
Thanks to all who responded! Gave me some other ideas to play with. Combining the brains together gave me.... success! And no drive to onsite required!
Here's what I did for posterity:
HOW TO MOVE A RAID-1 SYSTEM FROM A BIGGER DISK SETUP TO A SMALLER DISK SETUP WITHOUT BEING ONSITE (where you want to replace one of 2 big disks in a RAID-1 with a smaller disk for whatever reason; which will require a shrinking/resize of root):
0. The new SSD was physically installed a while ago (/dev/sdb). Partition it with fdisk to match the original big disk's layout (for me: boot, root, swap), but of course make root (/dev/sdb3) smaller. For boot and swap, I just added the new partitions to the existing arrays as normal:
     mdadm --add /dev/mdX /dev/sdbY
1. Create a new RAID device on the SSD's big partition (the default metadata version is OK):
     mdadm --create -l1 -n2 /dev/md55 /dev/sdb3 missing
   (md55 is just any spare md number; it's irrelevant.) Create an ext4 fs on the new array and mount it (at /mnt/misc in the commands below).
2. Initial rsync, "priming" the copy, which is really handy for making the final pre-switch rsync much faster:
     rsync -axHAXES /. /mnt/misc/.
3. My root fs is identified by UUID in /etc/fstab. Use blkid to get the UUID of the old md array and also of the new one.
4. Edit /etc/fstab. Change the / root fs entry from the old UUID to the new UUID. (If you use labels, set it to the new label. Experience has taught me UUIDs are the way to go, so I use UUIDs.)
5. Edit /boot/grub2/grub.cfg (location might be different on other distros). Search & replace every old UUID with the new UUID. I wasn't sure about a few of the matches in the OS menu entry label, but in the end changing everything seemed to work fine, so why not.
6. Get the md UUIDs:
     mdadm -D /dev/mdOLD |grep UUID | tr -d ':'
     mdadm -D /dev/mdNEW |grep UUID | tr -d ':'
   Edit grub.cfg again and search & replace all the old md-UUIDs with the new one.
DO NOT REBUILD YOUR grub.cfg with grub2-mkconfig (or other tools) yet!!
7. Rebuild the initramfs:
     dracut -f --fstab
   (Other distros might use other tools to make the initramfs.) Since dracut & systemd do mystical voodoo to handle all the fs's rather than just plunking /etc/fstab in the initramfs, the --fstab is required to tell dracut to use your current fstab (with the new UUID) to decide what voodoo to do on boot, rather than /proc/self/mountinfo. I have a hunch dracut/initrd just take their cue from grub, but I'm not certain.
8. turn off nearly every daemon, quiet down system
9. Do the final rsync, which should be fast:
     rsync -axHAXES /. /mnt/misc/.
10. reboot, cross your fingers, and sweat some bullets
11. Cheer and rejoice that the box came up 2 minutes later! First try!
12. Make sure things are actually sane. Let the system do its normal grub rebuild and dracut build:
     grub2-mkconfig -o /boot/grub2/grub.cfg
     dracut -f
   Reboot again, it's all good! Yay!
13. mdadm --stop and disable/hose the old root md array. Eventually re-add it to the new root array. I wouldn't do it immediately, as it makes a nice backup in case the entire above procedure in fact mangled everything and I need to revert.
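The search & replace in steps 4 to 6 can be sketched in shell as below. The helper names swap_uuid and md_uuid_plain are made up for illustration, and the assumption that the UUID is the last field of the "UUID :" line in "mdadm -D" output should be checked against your own output. I did the real edits by hand in an editor.

```shell
#!/bin/sh
# Sketch only: the helper names are hypothetical.

# Replace every occurrence of OLD with NEW in FILE, keeping FILE.bak.
# UUIDs contain only hex digits, dashes and colons, so they are safe
# to use directly in a sed expression with "/" as the delimiter.
swap_uuid() {
    file=$1 old=$2 new=$3
    cp -p "$file" "$file.bak" || return 1
    sed "s/$old/$new/g" "$file.bak" > "$file"
}

# Read "mdadm -D /dev/mdX" output on stdin and print the array UUID
# with the colons stripped, as in step 6.
md_uuid_plain() {
    awk '/UUID/ { print $NF; exit }' | tr -d ':'
}
```

For example: swap_uuid /etc/fstab "$OLD_FS_UUID" "$NEW_FS_UUID", then the same for grub.cfg, and once more with the two values from mdadm -D /dev/mdOLD | md_uuid_plain and mdadm -D /dev/mdNEW | md_uuid_plain. The .bak copies give you something to diff against before rebooting.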
> Use LABEL= in your boot command lines and fstabs, not UUIDs or raw devices nodes for scenarios like this. At least you can sanely control LABELs, and they're easy to change as needed.
I went through a device-path phase, then a label phase, and now am firmly in the camp that only UUIDs should be used. Labels bit me when I'd take a HDD from one of my client systems and load it on another (or mine) for diagnostics, only to discover they both had the same label, like "trevorsroot". Seriously foobar'd booting!! UUIDs are fairly easy to deal with once you understand everywhere they are referenced.
My biggest complaint about the whole process is that grub.cfg is too opaque and the docs are horrible. Like what on earth is "$menuentry_id_option 'gnulinux-simple-...'" with a copy of the UUID after gnulinux-simple? Huh? Where is this stuff documented? search root with a boot UUID? Huh? Oh well, global search & replace hammer worked anyway. Also, I un-cpio'd the initramfs and grep'd the whole thing for the UUIDs referenced in fstab and they aren't anywhere in there! Huh? How is it figuring out what to mount? Docs? Fuggitaboutit. Cross your fingers and pray. And keep the car keys handy!
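One cheap sanity check before rebooting is to list which files still mention the old UUID. This is a minimal sketch, with find_uuid_refs being a made-up helper name:

```shell
#!/bin/sh
# Sketch only: print the names of the given files that still contain
# the UUID as a literal string. Prints nothing and returns non-zero
# if no file mentions it.
find_uuid_refs() {
    uuid=$1
    shift
    grep -lF -- "$uuid" "$@" 2>/dev/null
}
```

For example: find_uuid_refs "$OLD_UUID" /etc/fstab /boot/grub2/grub.cfg. Run it for both the filesystem UUID and the colon-free md UUID. Any other files you add to the list are your call, since I only know for sure about the two I edited.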