Hello,

I am going to upgrade my server. Since the new one will have room for more hard disks, I want to take the opportunity to give my data a little more protection against loss.

Currently I have 2 ext4 hard drives holding data, and I want to buy a third (all three of the same capacity) and put them in RAID5, so that in the future I can add more hard drives and increase the capacity.

For budget reasons, right now I can only buy what would be the third disk, so it is impossible for me to back up the data I currently have.

The data itself is not valuable. If any file got corrupted, I could download it again. However, there are enough terabytes (20) that downloading everything again would be madness.

My plan is to install DietPi (a trimmed-down Debian) on this server (a PC) and maybe build the RAID with mdadm. I have seen tutorials on how to do it (this one, for example: https://ruan.dev/blog/2022/06/29/create-a-raid5-array-with-mdadm-on-linux ).

The question is: is there any way to do this without having to format the hard drives that already hold data?

Thank you, and sorry for any mistakes; English is not my native language.

EDIT:

Thanks for your answers!! I have several paths to investigate.

20 points

This is madness, but since this is a hobby project and not a production server, there is a way:

  • Shrink the filesystems on the existing disks to free up as much space as possible, and shrink their partitions.
  • Add a new partition to each of the three disks, and make a RAID5 volume from those partitions.
  • Move as many files as possible to the new RAID5 volume to free up space in the old filesystems.
  • Shrink the old filesystems/partitions again.
  • Expand each RAID component partition one at a time by removing it from the array, resizing it into the empty space, and re-adding it to the array, giving plenty of time for the array to rebuild.
  • Move files, shrink the old partitions, and expand the new array partitions as many times as needed until all the files are moved.

This could take several days to accomplish, because of the RAID5 rebuild times. The less free space, the more iterations and the longer it will take.
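To get a feel for how many iterations that means, here is a rough estimate in Python. It is only a sketch under simplifying assumptions: three equal disks, both old disks holding the same amount of data, the new disk starting empty, and no filesystem or RAID metadata overhead. The function name `migration_rounds` and its parameters are made up for illustration.

```python
def migration_rounds(disk_gb: int, data_per_old_disk_gb: int) -> int:
    """Count move/shrink/expand rounds under a simplified model (assumption)."""
    if not 0 <= data_per_old_disk_gb < disk_gb:
        raise ValueError("each old disk needs at least some free space")
    remaining = data_per_old_disk_gb  # data still on each old filesystem
    rounds = 0
    while remaining > 0:
        # Grow every RAID member into all space the old filesystem does not need.
        member = disk_gb - remaining
        # A 3-disk RAID5 stores two members' worth of data; one is parity.
        usable = 2 * member
        already_moved = 2 * (data_per_old_disk_gb - remaining)
        room = usable - already_moved
        # Move the same amount off each old disk, limited by the room in the array.
        remaining -= min(remaining, room // 2)
        rounds += 1
    return rounds


print(migration_rounds(10_000, 9_000))  # 9
print(migration_rounds(10_000, 9_999))  # 9999
```

In this model every round frees exactly as much space as the disks had free at the start, so the count is roughly the data per disk divided by the free space per disk.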

7 points

That is madness. I love it

1 point

He said the two drives are mostly full. It’s not a partitioning issue at that point.

7 points

Even if you could free up only 1GB on each of the drives, you could start the process with a RAID5 of 1GB per disk, migrate 2GB of data into it, free up those 2GB on the old disks to expand the RAID, then rinse and repeat. It will take a very long time, and carries a lot of risk due to the increased stress on the old drives, but it is certainly something that’s theoretically achievable.

9 points

Technically, he would have three drives and only two drives of data. So he could move 1/3 of the data off each of the two drives onto the third and then start off with RAID 5 across the remaining 1/3 of each drive.
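As a quick sanity check of that arithmetic, here is a small Python sketch. It assumes two completely full 10 TB drives plus one empty 10 TB drive (the sizes are a guess based on the 20 TB mentioned in the post) and ignores filesystem overhead; `start_after_spreading` is a name invented for this example.

```python
from fractions import Fraction


def start_after_spreading(disk_tb, data_per_old_disk_tb):
    """Return (raid_member_size, raid5_usable) once the data is spread evenly."""
    disk = Fraction(disk_tb)
    total_data = 2 * Fraction(data_per_old_disk_tb)
    per_disk = total_data / 3  # data held by each of the three disks
    member = disk - per_disk   # free space left on every disk
    usable = 2 * member        # 3-disk RAID5: one member's worth goes to parity
    return member, usable


member, usable = start_after_spreading(10, 10)
print(member, usable)  # 10/3 20/3
```

So each drive ends up with a third of its capacity free, and the first array can hold two thirds of one drive's worth of data.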

-2 points

Not at all possible whatsoever though. If he has two drives nearly full, he would never be able to fit all replicable data on a RAID 5 of any kind.

What you’re describing as a solution is the “3 jugs of water” problem. The difference is that you need one coherent set of data in order to even start a RAID array. Juggling between disks in this case would never produce the solution OP is asking for if all the data can’t fit on one single drive, due to the limitation of the smallest drive’s capacity. You can’t just swap things around and eventually come up with a viable array if ALL the data can’t be in one place at one time.

10 points

Not really with mdadm RAID5. But it sounds like you like to live dangerously. You could always go the BTRFS route. Yeah, I know BTRFS Raid56 “will eat your data”, but you said it’s nothing that important anyway. There are some things to keep in mind when running BTRFS in Raid5, for example: scrub each disk individually, and use Raid1c3 for metadata.

But basically, BTRFS is one of the only filesystems that allows you to add disks of any size or number, and you can convert the profile on the fly, while in use. So in this case, you could format the new disk with BTRFS as a single disk. Copy over the data from one of your other disks, then once that disk is empty, add it as an additional device to your existing BTRFS volume. Then do the same with the last disk. Once that is done, you can run a balance convert to convert the single profile into a raid5 data profile.
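The sequence above could look roughly like the following. This is a dry-run sketch in Python that only builds and prints the commands, it does not run them. The device names (`/dev/sda`, `/dev/sdb`, `/dev/sdc`), the mount points, the use of `rsync -a` for copying, and the assumption that the old ext4 filesystems sit directly on the devices are all placeholders for illustration; check every command against your own layout before running anything.

```python
import shlex


def btrfs_migration_plan(new_disk, old_disks, mountpoint="/mnt/pool"):
    """Build (not run) the commands for the single-disk-first BTRFS migration."""
    plan = [
        ["mkfs.btrfs", new_disk],         # new disk becomes a single-device volume
        ["mount", new_disk, mountpoint],
    ]
    for i, old in enumerate(old_disks, start=1):
        src = f"/mnt/old{i}"              # hypothetical mount point for the old ext4 disk
        plan.append(["mount", old, src])
        plan.append(["rsync", "-a", f"{src}/", f"{mountpoint}/"])
        plan.append(["umount", src])
        # Verify the copy before this step: -f overwrites the old ext4 filesystem.
        plan.append(["btrfs", "device", "add", "-f", old, mountpoint])
    # Convert data to raid5 and metadata to raid1c3 once all three disks are in.
    plan.append(["btrfs", "balance", "start",
                 "-dconvert=raid5", "-mconvert=raid1c3", mountpoint])
    return plan


for cmd in btrfs_migration_plan("/dev/sdc", ["/dev/sda", "/dev/sdb"]):
    print(shlex.join(cmd))
```

Printing the plan first makes it easy to review the order of operations; the destructive step is the `btrfs device add -f`, which should only happen after the copied data has been checked.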

That being said, there are quite a few caveats to be aware of. Even though it’s improved a lot, BTRFS’s Raid56 implementation is still not recommended for production use. https://lore.kernel.org/linux-btrfs/20200627032414.GX10769@hungrycats.org/

Also, I would STRONGLY recommend against connecting disks via USB. USB HD adapters are notorious for causing all kinds of issues when used in any sort of advanced setup, apart from temporary single disk usage.

1 point

Interesting, I think it could be a good fit for my use case. I’ll check it out.

Thanks for your answer!!

5 points

Traditional RAID isn’t very flexible and is meant/easiest for fresh disks without data. Since you’ve already got data in place, look into something like SnapRAID.

2 points

And mergerFS

3 points

I’d suggest you move toward a backup approach (“RAID is not a backup”) first. Assuming you have 2x10TB, get a third drive, copy half of your files to it, disconnect it, and now half your files are protected. Save up, get another, copy the other half, and now all your files are protected. If you’re trying to do RAID over USB, don’t; stop there, you are already done. Otherwise (using SATA or better) you can proceed to build your array in an orderly fashion.

3 points

I know it’s not a backup, but for me it’s the sweet spot between cost and security. Not only for these 2 hard disks, but also for the ability to add more drives later without needing full redundancy.

Thanks for your answer!!

1 point

I will say it three times, Raid isn’t a backup

Raid isn’t a backup

Raid isn’t a backup

Seriously though, it shouldn’t give much peace of mind. All RAID does is add a little resistance to hardware failures. If you mistakenly delete files, you are hosed. If your hardware causes corruption, you are hosed. If something happens to your computer, such as physical abuse, your drives are likely going to be damaged, which also means that you may be hosed. If one drive dies and then another drive dies before you move your data over, you are also hosed.

The big takeaway is that RAID only really buys time. It can prevent downtime, but it will not save you.

3 points

My recommendation would be to utilize LVM. Set up a PV on the new drive and create an LV filling the drive (with an FS), then move all the data off of one drive onto this new drive, reformat the first old drive as a second PV in the volume group, and expand the size of the LV. Repeat the process for the second old drive. Then, instead of extending the LV, set the parity option on the LV to 1. You can add further disks, increasing the LV size or adding parity or mirroring in the future, as needed. This also gives you the advantage that you can (once you have some free space) create another LV that has different mirroring or parity requirements.
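The first half of that procedure might look like this, written as a dry-run Python sketch that only prints the commands. The device names, the volume group name `vg_data`, the LV name `lv_data`, and the choice of ext4 are placeholders, not something from the comment above. The final parity step is left out on purpose: the exact `lvconvert` invocation depends on the LVM version and is best taken from its man page.

```python
import shlex


def lvm_migration_plan(new_disk, first_old_disk, vg="vg_data", lv="lv_data"):
    """Build (not run) the LVM commands up to the first LV extension."""
    lv_path = f"/dev/{vg}/{lv}"
    return [
        ["pvcreate", new_disk],
        ["vgcreate", vg, new_disk],
        ["lvcreate", "-n", lv, "-l", "100%FREE", vg],  # LV fills the new drive
        ["mkfs.ext4", lv_path],
        # ... mount the LV, copy the first old disk's data over, verify it ...
        ["pvcreate", first_old_disk],  # destroys the old filesystem; may ask to confirm
        ["vgextend", vg, first_old_disk],
        ["lvextend", "-r", "-l", "+100%FREE", lv_path],  # -r also grows the filesystem
    ]


for cmd in lvm_migration_plan("/dev/sdc", "/dev/sda"):
    print(shlex.join(cmd))
```

After this, the enlarged LV has room for the second old disk's data, and that disk can then join the volume group the same way.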
