menga.net

how to split a tar (linux)

This is a computer nerdy thing, but every now and again you will have to archive some files. Most people use ZIP because that is baked into Windows and most Linux distributions. In Linux specifically, TAR a.k.a. "tarball" is better because it has way more options compared to ZIP...

...except the option to split into multiple files during archive creation. Not easily, anyway. However, with a very simple one-liner Terminal command, TAR can be piped into split so the archive gets cut into pieces on the fly as it's being created.

Why split?

If you have ever had a very large file botch during a transfer to a USB stick or external hard drive, then you know why. But in case you haven't, I'll spell out the scenario for you.

Even with all the lovely colossal cheap storage available these days (have you seen the 12TB, 18TB and 20TB hard drives?), a legitimate problem is that a giant whopper of a file can get corrupted while being sent over the wire, over the air or direct-to-stick just because of how big it is.

Why does this happen? The simple answer is that the bigger the file, the more "opportunities" it has to botch during transfer in several places. It could be a RAM issue, local drive issue, external drive issue, network interference, whatever. Instead of trying to figure out where the problem is taking place, it's just easier to split the file to decrease the likelihood of a botched transfer.

Split a TAR as it's being created

tar -cvf - bigfile.ext | split -b 4092M - archive.tar.part

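The above splits a TAR into pieces of just under 4GB (4092 MiB each) as it's being created, each ending with the .part extension followed by two letters, such as archive.tar.partaa, archive.tar.partab, archive.tar.partac, and so on. The last piece is whatever is left over, so it will usually be smaller. A size of 4092M also keeps every piece below the 4 GiB per-file limit of FAT32, which a lot of USB sticks still use.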
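If you want to see the naming scheme before committing to a multi-gigabyte run, here is a minimal sketch of the same pipeline at toy scale. The scratch directory, the 3 MiB random test file and the 1M piece size are all stand-ins picked for the demo, not part of the recipe above.

```shell
#!/bin/sh
# Toy-scale demo: 1 MiB pieces instead of 4092 MiB so it finishes instantly.
set -eu
workdir=$(mktemp -d)   # throwaway scratch directory (assumption for the demo)
cd "$workdir"

# Make a 3 MiB file of random data as a stand-in for bigfile.ext.
head -c 3145728 /dev/urandom > bigfile.ext

# Same pipeline as above, only with a smaller piece size.
tar -cf - bigfile.ext | split -b 1M - archive.tar.part

# List the pieces; the suffixes run aa, ab, ac, ...
ls -l archive.tar.part*
```

The tar stream is slightly bigger than the file itself because of headers and padding, so expect a small extra piece at the end.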

If you have a bunch of files in one folder, use this instead:

tar -cvf - /path/to/folder/ | split -b 4092M - archive.tar.part
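One thing to know about absolute paths: GNU tar strips the leading / and stores the rest (path/to/folder/...) inside the archive, so that whole directory chain gets recreated on extraction. If you would rather store just the folder name, a possible variation is tar's -C option, which changes into a directory before archiving. This is a sketch with made-up names (parent, folder) in a throwaway directory and a tiny piece size, so it is safe to run as-is.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)   # throwaway scratch directory (assumption for the demo)
mkdir -p "$workdir/parent/folder"
echo "hello" > "$workdir/parent/folder/a.txt"
echo "world" > "$workdir/parent/folder/b.txt"
cd "$workdir"

# -C switches into parent/ first, so only "folder/..." is stored.
tar -C "$workdir/parent" -cf - folder | split -b 1M - archive.tar.part

# List the contents straight from the pieces to check the stored paths.
listing=$(cat archive.tar.part* | tar -tf -)
echo "$listing"
```

For the real thing that would look like tar -C /path/to -cvf - folder piped into the same split command.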

Split a TAR as it's being created with GZIP compression (slower to create)

tar -cvzf - bigfile.ext | split -b 4092M - archive.tar.gz.part

Split a TAR as it's being created with XZ compression (much better compression, but very slow to create)

tar -cvJf - bigfile.ext | split -b 4092M - archive.tar.xz.part

Yes, the J is case sensitive. A small j would result in bzip2 compression instead, which is nowhere near as good as XZ.

Split a TAR as it's being created with ZST compression (fast and good compression)

tar --zstd -cvf - bigfile.ext | split -b 4092M - archive.tar.zst.part

If you're going to use compression, I highly recommend using ZST just so you're not waiting around forever for the process to complete. ZST almost always compresses better than GZIP and is much, much faster than XZ. Note that the --zstd option needs the zstd program installed and a reasonably recent tar (GNU tar 1.31 or newer).
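Since the whole point of splitting is to survive flaky transfers, it can help to know which piece (if any) got botched. One way to do that, not part of the recipe above, is to record a checksum for every piece before copying and check them again at the destination. This sketch assumes sha256sum from GNU coreutils is available and uses a toy-sized archive in a scratch directory.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)   # throwaway scratch directory (assumption for the demo)
cd "$workdir"
head -c 2097152 /dev/urandom > bigfile.ext
tar -cf - bigfile.ext | split -b 1M - archive.tar.part

# Record a checksum for every piece before transferring.
sha256sum archive.tar.part* > archive.tar.sha256

# After the transfer, run this in the destination folder.
# Each intact piece is reported as OK, a damaged one as FAILED.
sha256sum -c archive.tar.sha256
```

Copy the .sha256 file along with the pieces. If one piece fails, only that piece needs to be copied again instead of the whole archive.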

Combine split pieces

To extract the split TAR, you have to combine the pieces first.

cat archive.tar.part* > archive.tar

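If you used a different extension like .tar.gz or .tar.zst, substitute it in both places, for example archive.tar.gz.part* and archive.tar.gz. The shell expands the * in alphabetical order, so the pieces (aa, ab, ac...) get joined in the right sequence.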

Extract afterward with:

tar -xvf archive.tar

or...

Combine split pieces and extract as a one-liner

cat archive.tar.part* | tar -xvf -

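Yes, the hyphen at the end needs to be there. It tells tar to read the archive from the pipe instead of from a file. Don't forget that part. Also, if the pieces are compressed, add the matching option (z, J or --zstd) to the extract command, because GNU tar can only auto-detect compression when reading from an actual file, not from a pipe.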
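To convince yourself the one-liner really gives back the same data, here is a small round-trip sketch using GZIP. The file names, scratch directory, restored folder and 1M piece size are stand-ins for the demo. The -z is spelled out on extraction because GNU tar generally can't auto-detect compression on piped input.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)   # throwaway scratch directory (assumption for the demo)
cd "$workdir"
head -c 2097152 /dev/urandom > bigfile.ext

# Create a gzip-compressed split archive in 1 MiB pieces.
tar -czf - bigfile.ext | split -b 1M - archive.tar.gz.part

# Extract into a separate folder straight from the pieces.
mkdir restored
cat archive.tar.gz.part* | tar -xzf - -C restored

# cmp exits 0 only when the two files are byte-for-byte identical.
cmp bigfile.ext restored/bigfile.ext && echo "round trip OK"
```

The same pattern should work for the other formats by swapping -z for -J or --zstd on both the create and extract side.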

And that's it.

Published 2024 Sep 19