Wednesday, July 28, 2010

Verify SSH Keys on EC2 Instances

Like every server, every EC2 instance should have a unique SSH host key fingerprint. On "real servers" this fingerprint is generated when the openssh-server package is first installed. On EC2, instead, it is generated on first boot of an instance, because each instance starts as a byte-for-byte copy of a registered image.

What this means to you is that when you launch an instance and then connect with ssh, you'll see something like:


$ ssh -F /tmp/smoser/foo ec2-67-202-47-56.compute-1.amazonaws.com
The authenticity of host 'ec2-67-202-47-56.compute-1.amazonaws.com (67.202.47.56)' can't be established.
RSA key fingerprint is f1:40:a7:4e:0f:28:8d:12:21:59:f1:ff:03:5f:63:54.
Are you sure you want to continue connecting (yes/no)?


The ssh client is informing you that you are connecting to a host for which you have no stored host key. In short, it cannot confirm the identity of 'ec2-67-202-47-56'. There could be a "Man in the Middle" attempting to trick you. Just as with "real servers", you should verify the identity of that remote system via an out-of-band method. Outside of EC2, you might call your hosting provider and ask them to verify the fingerprint that you see. On EC2, the only out-of-band transport is the instance's console output.

To provide you with that information, the SSH host key fingerprints are written to the console when the instance boots. You can see them with ec2-get-console-output.

As the results of Eric's poll on alestic.com show, this is a little-known and little-used piece of information: over 50% of alestic.com voters have "never verified the fingerprint".


$ euca-get-console-output i-72bf1518 | grep ^ec2:
ec2:
ec2: #############################################################
ec2: -----BEGIN SSH HOST KEY FINGERPRINTS-----
ec2: 2048 f1:40:a7:4e:0f:28:8d:12:21:59:f1:ff:03:5f:63:54 /etc/ssh/ssh_host_rsa_key.pub (RSA)
ec2: 1024 28:f3:ef:a6:86:05:50:33:76:16:24:32:56:14:06:13 /etc/ssh/ssh_host_dsa_key.pub (DSA)
ec2: -----END SSH HOST KEY FINGERPRINTS-----
ec2: #############################################################


Note that the ssh fingerprint reported on the console matches the one the ssh client asked me to confirm above. So I now know that the host I've connected to is the one I just started.

Putting this all together, let's say you have booted a new EC2 instance with instance-id i-72bf1518 and hostname ec2-67-202-47-56.compute-1.amazonaws.com.

First, we will use ssh-keyscan to collect the host key reported by the remote host, have ssh-keygen compute its fingerprint, and store that in a shell variable 'fp':


$ iid=i-72bf1518
$ ihost=ec2-67-202-47-56.compute-1.amazonaws.com

$ ssh-keyscan ${ihost} 2>/dev/null > ${iid}.keys
$ ssh-keygen -lf ${iid}.keys > ${iid}.fprint
$ read length fp hostname id < ${iid}.fprint
$ echo $fp
f1:40:a7:4e:0f:28:8d:12:21:59:f1:ff:03:5f:63:54



This fingerprint should also appear on the console output of the instance. If it doesn't, then something is wrong. So, we'll get the console output, and grep through it looking for the fingerprint:


$ euca-get-console-output ${iid} > ${iid}.console
$ grep "ec2: ${length} ${fp}" ${iid}.console
ec2: 2048 f1:40:a7:4e:0f:28:8d:12:21:59:f1:ff:03:5f:63:54 /etc/ssh/ssh_host_rsa_key.pub (RSA)


We've now verified that the host we're connecting to is the host we just launched, so we can connect safely. Now you can clean out any old occurrences of that host in known_hosts and tell the ssh client that this is a "known host":


# remove existing entries in ~/.ssh/known_hosts for this host
$ ssh-keygen -R "${ihost}"

# hash the hostnames in the keys file. This prevents someone from
# reading known_hosts as a simple list of remote hosts you have
# access to in the event that one of your keys is compromised.

$ ssh-keygen -H -f ${iid}.keys

# Add the key to your known_hosts
$ cat ${iid}.keys >> ~/.ssh/known_hosts

# remove the temporary files we created
$ shred -u "${iid}."*


There, we've now verified that the remote host is the instance we started and told the ssh client about it.
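
With the key now in known_hosts, subsequent connections should go straight through without a fingerprint prompt. As a quick check (the 'ubuntu' login name is an assumption here; use whatever user your image provides), you can tell ssh to fail rather than prompt if the key were not already known:


$ ssh -o StrictHostKeyChecking=yes ubuntu@${ihost} true && echo "host key already trusted"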

Unfortunately, console output on EC2 is only updated approximately every 4 minutes, so you can't run through this process until console output is available to check.
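
If you are scripting this, you can simply poll until the fingerprint block shows up before doing the comparison. A rough sketch, assuming euca-get-console-output and the ${iid} variable from above:


# poll every 30 seconds until the host key fingerprints appear on the
# console, saving the output for the grep step shown earlier
while :; do
    euca-get-console-output "${iid}" > "${iid}.console" 2>/dev/null
    grep -q "END SSH HOST KEY FINGERPRINTS" "${iid}.console" && break
    echo "waiting for console output of ${iid} ..."
    sleep 30
done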

Updates
  • [2010-09-22]: fix mismatched use of 'iid' and 'ihost'

10 comments:

  1. Great article. Thanks to you, I was able to automate adding my new server instance to my known_hosts in a matter of minutes. You saved me hours of research and trial and error.

    I did notice a few weird lines in your code, probably from copy/paste typos. Check your ${iid} and ${ihost} references, because I think you got a few mixed up.

    Thanks for writing this!

  2. Another problem is that if you include a script in user-data that generates a lot of output, the ssh fingerprint may not be included when get-console-output returns its initial data. This is because Amazon only returns the most recent 64K of console output.

    My user-data script builds a very complete Rails stack. It takes 3-4 minutes to get the initial get-console-output, and by that time my script has generated a ton of console messages, so the initial data from get-console-output is missing a bunch of the early messages. One hack is to add a 3-4 minute delay at the beginning of my script. Major hack, though.

  3. @Athir
    The output to the console of "user data scripts" has (unfortunately) been inconsistent across hardy, karmic, and lucid. In maverick and forward, "by default" the output of user data scripts (or cloud-config 'runcmd') should go to the console.

    As you pointed out, that can be a pain. However, there is no reason that it *has* to go to the console. You can very easily redirect the output of your script/program to a file. That will generally make it easier for you to get at, and has the added benefit of not filling up that 64K buffer.
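
    For example, a user-data script can start by redirecting its own output; a minimal sketch (the log file path here is just an arbitrary example):

    #!/bin/sh
    # send all further output (stdout and stderr) of this user-data
    # script to a file on the instance instead of the console
    exec >> /var/log/user-data.log 2>&1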

  4. Nice post!! This worked great for me :)

    Thanks for taking the time to write it!
    Jamie

  5. This comment has been removed by the author.

  6. Great post. I used the information to write a script that fully automates the process of starting an EC2 instance and verifying the SSH fingerprints. I posted the whole thing here:

    http://xocoatl.blogspot.com/2011/03/starting-ec2-instance.html

    Thanks Scott!

  7. Excellent post. Just one correction, perhaps: on AWS the command should be the following (on Euca it perhaps works as shown above):
    ec2-get-console-output --verbose | grep ^ec2:

  8. Thanks a ton, great to know how to verify that, much appreciated.

  9. Can you explain how using the console is out of band? When you launch an EC2 instance, the only way to use the euca-get-console-output suggestion you make is, well, by logging into the console and then doing that. That doesn't sound out of band to me. I'm a novice so forgive me if I'm missing something obvious here.

  10. John,
    You're right that it's not as much "out of band" as it would be to call a service provider on the phone. That said, it is probably good enough, and is much better than just accepting the key. It's reasonable primarily because 'euca-get-console-output' (and 'view console output' in the web UI) are both (most likely) transported over SSL/https. So you get to piggyback on the trust built into that secure transport mechanism (which you possibly accepted a certificate for previously).
