Lost Your EC2 SSH Key? Here’s Every Way I Recovered Access

Source: Dev.to

The first time you lose an SSH key for an EC2 instance, it feels final. The server is running, your application is still alive, but the door is locked and the key is gone. I learned very quickly that AWS does not keep private keys for you, and once a .pem file is lost, it is lost forever.

Instead of panicking, I decided to turn this moment into learning. I deliberately walked through every realistic recovery method available in Amazon Web Services, starting from zero, creating instances from scratch, breaking access on purpose, and then recovering it again.

What I discovered is that “losing an EC2 key” is not a dead end. It’s a branching path with multiple recovery strategies, each suited for a different situation.

Understanding the Core Truth About EC2 Keys

Before touching any recovery method, there is one truth that must be clear. AWS never stores your private SSH key. The .pem file lives only on your local machine. The EC2 instance never sees it. What the instance stores instead is the public version of that key inside a file called authorized_keys.

This means recovery is never about getting your old .pem file back. Recovery is always about getting temporary access and then adding a new public key. Once that clicked, everything else made sense.

Temporary Access with EC2 Instance Connect

The first recovery path I explored was EC2 Instance Connect. This method works only when the instance is running, the AMI supports it, the instance has network access, and port 22 is open. When all those conditions are met, AWS can temporarily inject a one-time public key into the instance and open a browser-based SSH session.

What surprised me most is that this access is not time-limited in the way people assume. The injected key itself lives for about a minute, but once the SSH session starts, it behaves like any normal SSH connection. I stayed logged in for as long as I wanted.
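
The same one-minute key injection can also be requested from a script instead of the browser console. What follows is a minimal sketch, not the exact steps used in this walkthrough: it assumes boto3's `ec2-instance-connect` client and its `send_ssh_public_key` call, and the function name `push_temporary_key`, the default user `ec2-user`, and the instance ID in the comments are hypothetical. The client is passed in as a parameter so the function can be exercised without touching AWS.

```python
from typing import Any, Optional


def push_temporary_key(
    client: Any,
    instance_id: str,
    public_key: str,
    os_user: str = "ec2-user",  # assumption: default user on Amazon Linux
    availability_zone: Optional[str] = None,
) -> bool:
    """Ask EC2 Instance Connect to publish a public key for a short window.

    `client` is expected to behave like boto3.client("ec2-instance-connect").
    Returns True when the service reports success.
    """
    params = {
        "InstanceId": instance_id,
        "InstanceOSUser": os_user,
        "SSHPublicKey": public_key.strip(),
    }
    if availability_zone:
        params["AvailabilityZone"] = availability_zone
    response = client.send_ssh_public_key(**params)
    return bool(response.get("Success", False))


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("ec2-instance-connect")
#   with open("new_key.pub") as fh:
#       push_temporary_key(client, "i-0123456789abcdef0", fh.read())
#   # then connect with the matching NEW private key within about a minute
```

The public key pushed here belongs to a freshly generated key pair, not the lost one, and the caller needs IAM permission for the `ec2-instance-connect:SendSSHPublicKey` action.
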
When I disconnected, nothing was left behind: the temporary key had already expired on its own. This method doesn’t recover your old key, but it gives you a crucial foothold. From inside the instance, you can add a brand-new public key and restore permanent access. It’s fast, clean, and perfect for emergencies, but it depends heavily on networking and AMI support.

The “Surgery” Method: Detaching the Root Volume

Next, I practiced the most powerful and most intimidating method: detaching the root EBS volume. This approach works even when SSH is completely broken. The instance can be stopped, misconfigured, or unreachable, and recovery is still possible.

The process feels like surgery. You stop the broken instance, detach its root volume, and attach that volume to a second helper instance in the same Availability Zone. From there, you mount the disk, navigate into the filesystem, and manually edit the authorized_keys file.

While doing this on Amazon Linux 2023, I ran into a real-world issue that many tutorials skip. The filesystem is XFS, and because both disks were created from the same AMI, they shared the same filesystem UUID. XFS refuses to mount a filesystem whose UUID is already in use unless you explicitly tell it to. Passing the -o nouuid option to mount was what made the mount succeed.

After adding a new public key, fixing permissions, and reattaching the volume to the original instance as its root device, I started it again and logged in successfully. This method taught me more about Linux, filesystems, and AWS storage than any lab ever could. It’s not fast, but it works even when everything else fails.

Recovering Access Without SSH Using Systems Manager

The cleanest recovery experience came from AWS Systems Manager Session Manager. Instead of relying on SSH at all, this method uses IAM and an encrypted control channel managed by AWS. There is no port 22, no .pem file, and no public SSH exposure.

I launched an instance using Amazon Linux 2023, which ships with the SSM Agent preinstalled, and attached an IAM role with the AmazonSSMManagedInstanceCore policy.
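
For readers who prefer to see what that role consists of, here is a rough sketch of the pieces: a trust policy that lets EC2 assume the role, the AWS-managed AmazonSSMManagedInstanceCore policy, and an instance profile to carry the role onto the instance. It assumes a boto3-style IAM client passed in by the caller. The function names and the role name are hypothetical, and the console performs some of these steps automatically when you create a role for EC2.

```python
import json
from typing import Any

# ARN of the AWS-managed policy named in the text above.
SSM_CORE_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"


def ec2_trust_policy() -> dict:
    """Trust policy allowing the EC2 service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_ssm_instance_profile(iam: Any, name: str) -> None:
    """Create a role plus an instance profile of the same name.

    `iam` is expected to behave like boto3.client("iam").
    """
    iam.create_role(
        RoleName=name,
        AssumeRolePolicyDocument=json.dumps(ec2_trust_policy()),
    )
    iam.attach_role_policy(RoleName=name, PolicyArn=SSM_CORE_POLICY_ARN)
    iam.create_instance_profile(InstanceProfileName=name)
    iam.add_role_to_instance_profile(InstanceProfileName=name, RoleName=name)


# Hypothetical usage (requires boto3 and IAM permissions):
#   import boto3
#   create_ssm_instance_profile(boto3.client("iam"), "ssm-recovery-role")
```

The resulting instance profile is what gets selected as the IAM role at launch, or attached to an instance that is already running.
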
After a short delay, the instance appeared as “managed” in Systems Manager Fleet Manager. From there, I opened a browser-based shell using Session Manager and gained access immediately.

Inside the session, I verified that I was logged in as ssm-user, then elevated privileges and manually added an SSH public key. This meant I could later SSH normally if I wanted, but I no longer had to depend on SSH for access at all.

This method feels like how EC2 is meant to be managed in production environments. It’s secure, auditable, and resilient to lost keys. If I had to choose one approach to standardize on, this would be it.

Starting Fresh with an AMI

Finally, I explored the simplest reset option: creating an AMI and launching a new instance. This method assumes you don’t care about preserving the original instance identity. You simply create an image of the server, launch a new instance from that image, and select a new key pair during launch.

What I appreciated here is the simplicity. There is no filesystem mounting, no Linux surgery, and no emergency access needed. The tradeoff is that the instance ID and public IP change, but the operating system, software, and data remain intact.

I also learned an important cleanup lesson. Deregistering an AMI does not remove the EBS snapshot behind it, so cleanup is not complete until that snapshot is deleted as well. Forgetting that step leaves behind storage charges that quietly accumulate.

What This Taught Me About Cloud Engineering

Losing an EC2 SSH key stopped feeling scary once I understood that access and identity are separate concepts in AWS. The private key is just one authentication mechanism, not the server itself. Every recovery method I practiced reinforced that cloud infrastructure is designed to be recoverable, provided you understand the tools.

More importantly, this journey shifted how I think about “best practice.” In learning environments, EC2 Instance Connect and AMI recovery are convenient.
In real systems, Systems Manager Session Manager is the safest long-term strategy. And when everything is broken, volume attachment remains the ultimate escape hatch.

If you’re learning AWS and haven’t practiced these scenarios yet, I strongly recommend doing so before you need them in real life. The first time you recover a server you thought was lost, something clicks and cloud engineering starts to feel real.
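
Most of the recovery paths above end with the same step: appending a new public key to authorized_keys and fixing permissions. As a minimal sketch of that step (the function name `add_public_key` and the example paths are hypothetical, and this is not a record of the exact commands I typed), the helper below works on any home directory path. That includes one under a mounted rescue volume, such as a hypothetical /mnt/rescue/home/ec2-user.

```python
import os
from pathlib import Path


def add_public_key(home_dir: str, public_key: str) -> bool:
    """Append one public key to <home_dir>/.ssh/authorized_keys.

    Returns True if the key was added, False if it was already present.
    """
    key = public_key.strip()
    if not key or "\n" in key:
        raise ValueError("expected exactly one public key line")

    ssh_dir = Path(home_dir) / ".ssh"
    ssh_dir.mkdir(parents=True, exist_ok=True)
    auth_file = ssh_dir / "authorized_keys"

    current = auth_file.read_text() if auth_file.exists() else ""
    already_present = key in (line.strip() for line in current.splitlines())

    if not already_present:
        with auth_file.open("a") as handle:
            # Keep existing keys intact even if the file lacks a final newline.
            if current and not current.endswith("\n"):
                handle.write("\n")
            handle.write(key + "\n")

    # sshd (with StrictModes) rejects key files that others can write to.
    os.chmod(ssh_dir, 0o700)
    os.chmod(auth_file, 0o600)
    # Note: when run as root, newly created files are owned by root and
    # still need a chown to the login user; that step is left out here.
    return not already_present
```

Called twice with the same key, the second call returns False and leaves the file unchanged, which makes it safe to re-run during a stressful recovery.
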