I'm soliciting opinions on whether this is a bug or not.
I had this wacky setup on my Fedora 21 (don't ask why!):
                 RAID1
               on top of
  1st HALF:               2nd HALF:
  partitions              partitions
  on top of               on top of
  RAID0                   raw disk
  on top of
  partitions
  on top of
  raw disk
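(Roughly how I built it; treat the device names below as illustrative rather than my exact command history:)

  # one big RAID0 across a partition on each of two disks
  mdadm --create /dev/md9 --level=0 --raid-devices=2 /dev/sdb2 /dev/sdc2

  # partition the RAID0 itself (fdisk/parted), which gives md9p1..md9p4

  # mirror each RAID0 partition against a plain partition on a third disk
  mdadm --create /dev/md126 --level=1 --raid-devices=2 /dev/md9p2 /dev/sda2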
And it all worked great. Until I rebooted. Then the RAID1s would come up, but degraded, with only their non-RAID0 half ("2nd HALF" above) in the array and the RAID0 parts "gone". I could then re-add the RAID0 half, reboot, and get exactly the same thing again.
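(The re-add was just the obvious thing, something like:)

  mdadm /dev/md126 --re-add /dev/md9p2    # or --add, if --re-add is refused
  cat /proc/mdstat                        # mirror resyncs and looks healthy again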
No matter what I did (messing with mdadm, dracut, grub2, etc.), I couldn't get these arrays to assemble properly on boot.
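(The sort of thing I mean, with placeholder UUIDs:)

  # /etc/mdadm.conf
  DEVICE partitions
  ARRAY /dev/md9   UUID=<uuid of the RAID0>
  ARRAY /dev/md126 UUID=<uuid of the root RAID1>

  # rebuild the initramfs so dracut picks the config up
  dracut -f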
There are lots of bug reports on the net about simpler nested-RAID cases hitting the same problem, and many bz's about this were fixed years ago. I checked, and their fixes are in my distro.
I got some help from the people involved in those old bugs, and when I redid my setup to be layered like this instead:
                 RAID1
  1st HALF:               2nd HALF:
  RAID0                   partitions
  partitions              raw disk
  raw disk
... it all magically worked, and they came up on boot, and the bug disappeared.
The only difference is whether I partition my RAID0 array, or make the partitions first and then put multiple separate RAID0s directly into each RAID1 array. (The reason I didn't want to do this in the first place is that I was making 5 of these groupings and I didn't want to manage 10 arrays, just 6. And it was only temporary anyhow. I know the buggy setup is bad design.)
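(Concretely, the working layering is built more like this; again the device names are just illustrative:)

  # one small RAID0 per slice, from matching partitions on the two disks
  mdadm --create /dev/md2 --level=0 --raid-devices=2 /dev/sdb2 /dev/sdc2

  # mirror that whole RAID0 against a plain partition on the third disk
  mdadm --create /dev/md126 --level=1 --raid-devices=2 /dev/md2 /dev/sda2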
My question is: is this a bug I should report? In theory, in my mind, RAID1 on top of a partitioned RAID0 should work fine. The fact that mdadm and the kernel happily support it leans me towards answering "yes". If I can have it live in the kernel, why not have it survive reboots? I thought you could nest arbitrary combinations and levels (to X depth) of md, lvm, partitions, etc. (And yes, I really did try everything to make it boot, from insanely detailed mdadm.conf and grub2 boot lines down to minimal configs, and yes, my partitions are type fd.) However, I want to make sure I'm not doing something here that is completely insane and shouldn't be supported.
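(For the record, the "detailed" end of those attempts looked roughly like this, with placeholder UUIDs:)

  # /etc/default/grub -- spell out the arrays for dracut on the kernel command line
  GRUB_CMDLINE_LINUX="... rd.auto=1 rd.md.uuid=<RAID0 uuid> rd.md.uuid=<RAID1 uuid>"

  # regenerate grub.cfg
  grub2-mkconfig -o /boot/grub2/grub.cfg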
It appears dracut, udev and mdadm are responsible for all of this. That's where the other similar bugs were fixed.
Details: here's what it looked like when it was buggy. md9 was the big RAID0 array that was then partitioned into 4. Note how md9 does come up and gets recognized, but the boot machinery doesn't "recurse" into its partitions to see that they themselves are array components. So md126 (root, by the way) only comes up with 1 of its 2 components, after an annoyingly long delay.
Oct 23 02:37:57 pog kernel: [ 19.542533] sd 8:0:3:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 23 02:37:57 pog kernel: [ 19.542624] sd 8:0:2:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 23 02:37:57 pog kernel: [ 19.553021] sdb: sdb1 sdb2
Oct 23 02:37:57 pog kernel: [ 19.553835] sda: sda1 sda2 sda3 sda4
Oct 23 02:37:57 pog kernel: [ 19.554991] sdc: sdc1 sdc2
Oct 23 02:37:57 pog kernel: [ 19.556918] sd 8:0:2:0: [sdb] Attached SCSI disk
Oct 23 02:37:57 pog kernel: [ 19.558970] sd 8:0:3:0: [sdc] Attached SCSI disk
Oct 23 02:37:57 pog kernel: [ 19.559332] sd 8:0:0:0: [sda] Attached SCSI disk
Oct 23 02:37:57 pog kernel: [ 19.610787] random: nonblocking pool is initialized
Oct 23 02:37:57 pog kernel: [ 19.737894] md: bind<sdb1>
Oct 23 02:37:57 pog kernel: [ 19.742379] md: bind<sdb2>
Oct 23 02:37:57 pog kernel: [ 19.744213] md: bind<sdc2>
Oct 23 02:37:57 pog kernel: [ 19.748375] md: raid0 personality registered for level 0
Oct 23 02:37:57 pog kernel: [ 19.748619] md/raid0:md9: md_size is 285371136 sectors.
Oct 23 02:37:57 pog kernel: [ 19.748623] md: RAID0 configuration for md9 - 1 zone
Oct 23 02:37:57 pog kernel: [ 19.748625] md: zone0=[sdb2/sdc2]
Oct 23 02:37:57 pog kernel: [ 19.748631]       zone-offset= 0KB, device-offset= 0KB, size= 142685568KB
Oct 23 02:37:57 pog kernel: [ 19.748633]
Oct 23 02:37:57 pog kernel: [ 19.748650] md9: detected capacity change from 0 to 146110021632
Oct 23 02:37:57 pog kernel: [ 19.752284] md9: p1 p2 p3 p4
Oct 23 02:37:57 pog kernel: [ 19.776482] md: bind<sda1>
Oct 23 02:37:57 pog kernel: [ 19.786343] md: raid1 personality registered for level 1
Oct 23 02:37:57 pog kernel: [ 19.786643] md/raid1:md127: active with 2 out of 2 mirrors
Oct 23 02:37:57 pog kernel: [ 19.786669] md127: detected capacity change from 0 to 419364864
Oct 23 02:37:57 pog kernel: [ 19.823529] md: bind<sda2>
Oct 23 02:37:57 pog kernel: [ 143.320688] md/raid1:md126: active with 1 out of 2 mirrors
Oct 23 02:37:57 pog kernel: [ 143.320733] md126: detected capacity change from 0 to 36350984192
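(The resulting state is easy to confirm after such a boot with something like:)

  cat /proc/mdstat              # md126 shows only one active member
  mdadm --detail /dev/md126     # degraded, with one slot "removed"
  mdadm --examine /dev/md9p2    # the md superblock on the RAID0 partition is still there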