Cluster SSH

One of the most important parts of a working cluster is communication between its nodes. This post does not cover the networking side, but it does cover one very important piece: passwordless SSH.

Inter-node SSH

The first step toward easy access between nodes is making sure every node can reach every other node over SSH.

While not necessary, I recommend adding all your nodes to the /etc/hosts file on each node. For example, the /etc/hosts file might look like

127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

to which I would add (using the actual IPs of the nodes)

192.168.0.11 node01
192.168.0.12 node02
192.168.0.13 node03
192.168.0.14 node04
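
One way to append these entries on every node in one go, run from node01 (this assumes node01 can already resolve the other node names; if not, use their IP addresses in the loop):
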
# Assumes sudo on each node does not prompt for a password: stdin is taken by the here-document.
for node in localhost node02 node03 node04; do
  ssh "$node" "sudo tee -a /etc/hosts > /dev/null" << EOF

192.168.0.11 node01
192.168.0.12 node02
192.168.0.13 node03
192.168.0.14 node04
EOF
done

Once these entries are in the hosts file on all your nodes, you should be able to run ssh node01 from any of them and log in after entering your password.
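
Because the loop appends with tee -a, running it a second time would duplicate the entries. A minimal sketch of an idempotent variant follows; add_host_entry is a hypothetical helper name, not something from the original setup, and it only appends a line when the host name is not already listed in the file.

```shell
# add_host_entry FILE IP NAME (hypothetical helper)
# Appends "IP NAME" to FILE unless NAME already appears as a host name
# on a non-comment line.
add_host_entry() {
  # $1 = hosts file, $2 = IP address, $3 = host name
  if awk -v name="$3" '
      /^[[:space:]]*#/ { next }                                  # skip comment lines
      { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }    # field 1 is the IP
      END { exit !found }                                        # exit 0 only if found
    ' "$1"; then
    return 0
  fi
  printf '%s %s\n' "$2" "$3" >> "$1"
}
```

Called as add_host_entry /etc/hosts 192.168.0.11 node01, it leaves the file untouched when node01 is already present. Writing to /etc/hosts itself needs root, so on a real node this would have to run under sudo.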

Passwordless SSH

To be able to SSH between nodes without the need for a password, you will need to create an SSH key. This will allow SSH to work in scripts and tools (MPI) without needing user interaction.

First, we need to create a key. SSH supports several key types. RSA was the ssh-keygen default for a long time, but it is generally considered a weaker choice than modern alternatives. Therefore, these instructions show how to create an Ed25519 key. This will work on your cluster, but some (very) old systems may not support Ed25519 keys; RSA keys generally work everywhere, even though they are considered less secure.

To create a key, use this command on one of your nodes:

ssh-keygen -t ed25519 -a 100 -f ~/.ssh/id_ed25519 -C "Inter-node cluster ssh"

This article does a good job of breaking down what all the arguments are used for.
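
If a key already exists at that path, ssh-keygen asks whether to overwrite it, and overwriting would invalidate a key that has already been distributed. A small guard, sketched here with the hypothetical function name ensure_key, runs the same command as above only when the file is missing.

```shell
# ensure_key KEYFILE (hypothetical helper)
# Creates an Ed25519 key pair at KEYFILE unless a file already exists there.
ensure_key() {
  if [ -e "$1" ]; then
    echo "key already exists, leaving it alone: $1" >&2
    return 0
  fi
  # Same options as the command above; ssh-keygen will prompt for a passphrase.
  ssh-keygen -t ed25519 -a 100 -f "$1" -C "Inter-node cluster ssh"
}
```

It would be called as ensure_key ~/.ssh/id_ed25519.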

Next, we need our nodes to trust the key we just created. We’ll start with getting the current node to trust the key.

ssh-copy-id -i ~/.ssh/id_ed25519 localhost

Now we can authorize the key on all the other nodes and copy the private key to them, so that every node both trusts this key and can use it.

for node in node02 node03 node04; do # list all the nodes that should get the key
  ssh-copy-id -i ~/.ssh/id_ed25519 $node # you will need to enter your password for this step
  scp ~/.ssh/id_ed25519 $node:.ssh/
  ssh $node "chmod 600 ~/.ssh/id_ed25519" # ensure the key is locked down so SSH will accept it.
done

And to make all the nodes trust each other's host fingerprints, copy this node's known_hosts file to the others:

for node in node02 node03 node04; do
  scp ~/.ssh/known_hosts $node:.ssh/
done
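
The copied known_hosts file only lists hosts this node has already connected to, so it may lack an entry for node01 under that name. One possible way to fill the gap is ssh-keyscan, sketched below. The function names collect_host_keys and dedupe_lines are hypothetical, and note that ssh-keyscan does not authenticate the keys it collects, so this amounts to trusting the network on first use.

```shell
# dedupe_lines FILE (hypothetical helper)
# Removes repeated lines from FILE in place, keeping first occurrences in order.
dedupe_lines() {
  tmp=$(mktemp) || return 1
  # Write back with cat so FILE keeps its original permissions.
  awk '!seen[$0]++' "$1" > "$tmp" && cat "$tmp" > "$1"
  rm -f "$tmp"
}

# collect_host_keys FILE NODE... (hypothetical helper)
# Appends the Ed25519 host keys of the given nodes to FILE, then drops duplicates.
collect_host_keys() {
  out=$1
  shift
  ssh-keyscan -t ed25519 "$@" >> "$out"
  dedupe_lines "$out"
}
```

For example, collect_host_keys ~/.ssh/known_hosts node01 node02 node03 node04 could be run before the scp loop so that every node receives the full list.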

We can check that we can SSH into all the nodes without having to enter a password:

for node in node02 node03 node04; do
  ssh -o BatchMode=yes "$node" "hostname" # BatchMode makes ssh fail rather than prompt for a password