macOS tar destroys files on Linux: I validated it in my real Railway pipeline and documented the 3 cases nobody mentions
macOS tar Linux extraction errors in production: the context that matters
The 3 real cases where macOS tar breaks a Railway pipeline
Case 1: Apple metadata ._* files
Case 2: Permissions that change silently
Case 3: Paths with spaces in filenames
The mistakes I made before I understood the pattern
My current Railway setup
This connects to something bigger than tar
FAQ: macOS tar Linux extraction errors in production
The fix is easy. The trap isn't technical.

A Hacker News thread resurfaced this week, with 107 points, about a 2024 article: tar on macOS creates archives that Linux can't extract cleanly. The community reacted the way it always does: "use GNU tar", "install gtar with Homebrew", "this has been known for years." And yeah, all of that is correct. But there's something nobody's saying: the 3 specific scenarios where this actually breaks production are not the same as each other, and each one has a different fix. I learned this the hard way, with a failed deploy at 11pm that took two hours to diagnose. My thesis is that "use GNU tar" is necessary but not sufficient if you don't know exactly why your particular case is exploding.

Ever since I migrated from Vercel to Railway in 2024 (a weekend that taught me more about real infrastructure than months of tutorials), my deployment pipeline has depended on .tar.gz artifacts I generate on macOS and extract in Linux containers. For months it worked fine. Until it didn't.

macOS tar Linux extraction errors in production: the context that matters

The core problem is that BSD tar (the one that ships with macOS) and GNU tar (the one running on Ubuntu, Alpine, Debian) are not the same program. They share a name and basic syntax, but differ in how they handle extended metadata. macOS adds HFS+/APFS filesystem metadata that GNU tar doesn't expect to find, and when it does find it, it can silently ignore it, fail with warnings that don't interrupt the process, or, in the worst case, extract corrupted files without telling you. Check which version of tar you have with tar --version: stock macOS reports bsdtar, while GNU tar reports "tar (GNU tar)". They're not the same program. They never were.

The 3 real cases where macOS tar breaks a Railway pipeline

Case 1: Apple metadata ._* files

This was the first bug I hit. When macOS creates a .tar.gz from a folder you've touched with Finder (or that had extended attributes at some point), it includes ._filename files with HFS metadata. They're invisible in Finder, but they're sitting right there in the tar. In my specific case, I had a Railway script that grabbed the first .js file in the directory to calculate a verification hash.
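You can reproduce the trap on any machine. This is a sketch, not my original script: only BSD tar on macOS actually emits the AppleDouble entries, so here the ._* file is created by hand, and only the file names come from the anecdote above.

```shell
# Simulate what BSD tar on macOS packs: an AppleDouble sibling next to
# the real file. On Linux we create it manually for the demo.
mkdir -p dist
printf 'real bundle\n'    > dist/chunk-abc123.js
printf 'apple metadata\n' > dist/._chunk-abc123.js
tar -czf build.tar.gz -C dist .

mkdir -p extracted
tar -xzf build.tar.gz -C extracted

# A naive "grab the first .js file" for hashing:
first=$(find extracted -name '*.js' | LC_ALL=C sort | head -n 1)
echo "hashing: $first"
# -> hashing: extracted/._chunk-abc123.js   (the metadata, not the bundle,
#    because "._" sorts before "c" in byte order)
```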
The script found ._chunk-abc123.js before chunk-abc123.js and the hash failed. The deploy completed, but the post-deploy verification fired an alert. It took me 90 minutes to connect those dots.

The COPYFILE_DISABLE=1 environment variable is the cleanest fix because it acts at creation time. But if you already have old .tar.gz files in storage, you also need extraction-side filtering (GNU tar's --exclude='._*').

Case 2: Permissions that change silently

This one cost me more because there was no error. The deploy completed green, the app came up, but certain endpoints were returning 403s. The container couldn't read files that, on my local machine, had 644 permissions. The problem: BSD tar on macOS can serialize permissions differently for files with APFS ACLs (Access Control Lists). When GNU tar extracts them, it interprets those permissions in a way that can result in different bits than the originals. It doesn't happen every time. It happens when the file had some extended attribute at some point in its history on the macOS filesystem. The kind of bug that shows up in production but not in staging, because staging has a different file history.

You can fix it at the destination (normalize permissions after extraction, in the Dockerfile) or at the source (normalize them when creating the archive). The second option is superior because it fixes the problem at the source, not the destination. If you fix it at the destination, you're depending on every Dockerfile having that fix, and eventually someone will create a new one without it.

Case 3: Paths with spaces in filenames

This is the quietest one, and the one the original HN article doesn't cover in nearly enough detail. If you pack from macOS and any file in the path has a space (which Finder makes completely normal), extraction behavior on Linux depends on the exact version of GNU tar and on how you process the file list. The tar itself extracts correctly with tar -xzf. The problem appears when any downstream script processes the file list assuming no spaces. In my case it was a CDN invalidation script that read paths from the tar to know which caches to flush.

Renaming at the source is more robust because you eliminate the root cause.
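The two approaches can be sketched like this. The directory and file names are illustrative, not from my pipeline; the point is the quoting discipline around spaces.

```shell
# Demo setup: a packed asset whose name contains a space.
mkdir -p demo/assets
printf 'img\n' > "demo/assets/hero image.png"
tar -czf demo/build.tar.gz -C demo assets

# Approach 1: rename at the source, eliminating spaces before packing.
# -depth lists children before parents, so renames don't invalidate
# paths we haven't visited yet.
find demo/assets -depth -name '* *' | while IFS= read -r f; do
  mv "$f" "$(dirname "$f")/$(basename "$f" | tr ' ' '_')"
done

# Approach 2: make the downstream consumer space-safe. Iterate line by
# line with read; an unquoted $(tar -tzf ...) would word-split
# "assets/hero image.png" into two bogus paths.
tar -tzf demo/build.tar.gz | while IFS= read -r path; do
  echo "would invalidate cache for: $path"
done
```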
The read-based iteration is a patch that works, but the next developer will break it when they copy the loop without understanding why it was written that way.

The mistakes I made before I understood the pattern

Mistake 1: Trusting that "tar worked before, it'll always work." The ._* files appeared after I started opening that assets folder with Finder for previews. Before Finder touched it, no metadata. After Finder, yes. The pipeline was the same; the filesystem wasn't.

Mistake 2: Only reading exit codes. GNU tar extracts ._* files with exit code 0. No error. Your deploy is "green", and in production you've got garbage stuffed into your build directory. You need post-extraction validation, not just exit codes.

Mistake 3: Installing gtar but still using tar in the scripts. After brew install gnu-tar, the binary on macOS is called gtar, not tar. If you keep writing tar in your build script, you're still using BSD tar. I did this for a week. What I ended up using, to keep existing scripts untouched, is a PATH override: prepend Homebrew's gnubin directory (e.g. export PATH="$(brew --prefix)/opt/gnu-tar/libexec/gnubin:$PATH") so that plain tar resolves to GNU tar.

My current Railway setup

After validating all three cases, my macOS build pipeline sets COPYFILE_DISABLE=1, archives with GNU tar, and checks the resulting artifact for Apple metadata before uploading it. And in the Railway Dockerfile, the extraction step has its own verification. This double verification, at creation time and at extraction time, is what gives me actual confidence. I don't trust that the process is always perfect; I trust that if it fails, I'll know before the deploy and not after.

This connects to something bigger than tar

A few weeks ago I wrote about my YAML specs for agents and about migrating from pgbackrest to Barman. In both cases the pattern was the same: a standard tool that "works" in most cases, until it hits the specific production edge case. Tar is just another instance of this. The real risk isn't that tar is hard. It's that tar is so familiar that nobody considers it a failure point. When the deploy breaks at 11pm, nobody thinks "probably tar." And that's exactly why these bugs hurt more than they should.

FAQ: macOS tar Linux extraction errors in production

Why does macOS tar generate ._* files, and when do they appear?
The ._filename files are Resource Forks from HFS+/APFS, a legacy mechanism for storing file metadata. They appear when a file had extended attributes at any point: special permissions, Finder metadata, color tags, or simply Finder opening the folder to show previews. They don't appear on all files; they appear on the ones the macOS filesystem touched in certain ways. From the developer's perspective, it's non-deterministic.

Is COPYFILE_DISABLE=1 enough, or do I still need GNU tar?

COPYFILE_DISABLE=1 prevents BSD tar from including extended metadata at creation time. It's sufficient for Case 1 (the ._* files). For Case 2 (permissions with ACLs) and Case 3 (paths with spaces in downstream scripts), you need GNU tar and permission normalization. In practice I use both together, because the cost is zero and the combination covers more ground.

Does GNU tar on macOS via Homebrew have any tradeoffs?

The only real tradeoff is that it installs as gtar, not tar, to avoid breaking the system. If you override PATH so that tar points to GNU tar, be aware that some macOS system tools assume BSD tar with specific behaviors. In practice, after 18 months using the PATH override I haven't had a single problem, but it's something you should know going in.

Does this affect GitHub Actions, or only local builds?

It mainly affects local builds on macOS and any CI runner running on macOS. GitHub Actions runners on Ubuntu already use GNU tar, so the problem doesn't show up there. The real risk is when you compress on local macOS and upload the artifact for a Linux system to extract, which is exactly the workflow for manual or semi-manual deployments.

Is there a way to detect whether an existing tar.gz has Apple metadata without extracting it?

Yes, one line: tar -tzf file.tar.gz | grep "^\._" lists the archive contents without extracting and prints any AppleDouble entries. You can include this as a CI validation before publishing the artifact.

Why doesn't Docker build protect against this?
Docker build copies files into the build context, but if the .tar.gz already has Apple metadata inside, that metadata travels inside the tar; Docker doesn't inspect tar contents when copying it. The problem happens when your Dockerfile does RUN tar -xzf and extracts the corrupted tar inside the container. Docker sees a command that exits with code 0 and assumes everything is fine.

GNU tar + COPYFILE_DISABLE=1 + post-extraction verification solves all three cases. The technical part is documented above and you can copy it in five minutes. The real trap is attitudinal: tar is so old and so familiar that nobody adds it to the list of things that can fail. I didn't have it on the list either, until I had a broken deploy at 11pm with completely green logs and half an hour of staring at code that had no bugs in it.

If you're working with Kimi K2, Claude, or any LLM for code generation, none of them will warn you about this problem unless you already know about it and ask explicitly. If your stack touches Railway or any containerized infra, the problem can appear with no visible indicator.

My concrete recommendation: audit the tars you currently have in production or in storage with tar -tzf file.tar.gz | grep "^\._". If it returns results, you have work to do. If it returns nothing, good, but add the verification to your pipeline anyway so the next local macOS build doesn't silently break that guarantee. This is exactly the kind of problem that shows up in Railway logs as a symptom of something else entirely. And that's precisely what makes it expensive.
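To close, here is a minimal sketch of the double verification described under "My current Railway setup". It is not my exact script: the TAR variable, the out/ paths, and the file name are illustrative, and the extraction half stands in for the RUN step in the Dockerfile.

```shell
#!/usr/bin/env sh
set -eu

# On macOS: brew install gnu-tar, then run with TAR=gtar.
# On Linux, plain tar is already GNU tar.
TAR=${TAR:-tar}

# Stand-in for the real build output.
mkdir -p out/dist
printf 'bundle\n' > out/dist/chunk-abc123.js

# ---- creation side (macOS build machine) --------------------------
export COPYFILE_DISABLE=1   # stop BSD tar from writing ._* AppleDouble entries
"$TAR" -czf out/build.tar.gz -C out/dist .

# Creation-time verification: fail the build if Apple metadata slipped in.
if "$TAR" -tzf out/build.tar.gz | grep -q '\._'; then
  echo "ERROR: Apple metadata inside build.tar.gz" >&2
  exit 1
fi

# ---- extraction side (the RUN step in the Railway Dockerfile) -----
mkdir -p out/app
"$TAR" -xzf out/build.tar.gz -C out/app --exclude='._*' --no-same-owner

# Post-extraction verification: a zero exit code is not enough.
if find out/app -name '._*' | grep -q .; then
  echo "ERROR: AppleDouble files were extracted" >&2
  exit 1
fi
echo "verified at creation and at extraction"
```

Run it on macOS as TAR=gtar sh verify.sh; the same checks work unchanged inside the Linux container, which is the whole point of verifying on both sides.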