How does SSH’s ProxyCommand actually work?
I have been writing support for SSH’s ProxyCommand into a product at work over
the past month or so. I find protocols fascinating, particularly those as old as
SSH and with its pedigree.
ProxyCommand is an interesting corner in the set of (de facto) SSH standards.
It is used to establish a secure connection to a destination host, where that
destination host cannot be directly accessed by the local machine. It is mostly
used when you want to tunnel through a bastion host. It specifies a command
that your SSH client executes to create a proxied connection (hence the name)
to the destination host through the bastion.
I’m writing about it because my experience over the last few months suggests
that ProxyCommand can be slightly mind-bending. Certainly it took me a little
while to wrap my head around what is going on.
So let’s take a look at how it works.
netcat
To start understanding what’s going on, we’ll first look at something else:
netcat, or nc. nc can do a lot of things. Its man page begins:
The nc (or netcat) utility is used for just about anything under the sun involving TCP or UDP. It can open TCP connections, send UDP packets, listen on arbitrary TCP and UDP ports, do port scanning, and deal with both IPv4 and IPv6.
But we will use nc to send and receive data between a local terminal and a
server. If we execute nc HOST PORT, nc will remain running and will relay the
bytes we send on stdin to the remote host, and write whatever the host sends
back to stdout. In other words, it’ll send what we type in our terminal to the
remote host, and print out what the host returns.
By using nc to make an HTTP request to the dev Hugo server that I run when
building this site, we can see this in action.
In this example, the first three lines, starting GET /, are what I type to
make an HTTP request; this is about the simplest complete HTTP request you can
make. nc forwards that to Hugo, and then it prints out the response from the
server to stdout, and so to my terminal.
> nc localhost 1313
GET / HTTP/1.1                                  \
Host: localhost                                 |- what I typed
                                                /
HTTP/1.1 200 OK                                 \
Accept-Ranges: bytes                            |
Content-Length: 15825                           |
Content-Type: text/html; charset=utf-8          |
Last-Modified: Sun, 04 Aug 2024 09:30:58 GMT    |- what the server
Date: Sun, 04 Aug 2024 09:34:06 GMT             |  sent back
                                                |
<!DOCTYPE html>                                 |
<html>                                          |
                                                |
<head>                                          |
                                                |
... more page content ...                       /
What is happening in detail:
- nc opens a network connection to localhost. This could be any non-TLS host.
- nc waits to receive data from stdin (ie, the terminal in this case) or from the network.
- I type the first three lines, the HTTP request, terminated by a double newline.
- nc receives what I type as bytes over stdin from my terminal. (The terminal handles echoing the content I’m typing back to me.)
- nc sends those bytes over its TCP connection to localhost:1313 (Hugo).
- Hugo receives the request and sends a response.
- nc receives the response and forwards it byte-by-byte to stdout.
- Because nc inherited its stdout from the shell that started it, that output goes to the terminal.
- The terminal shows the response.
The thing to bear in mind is that nc is proxying stdin/out over a network
connection to a remote host. stdin becomes the data that is sent to the remote
host, and stdout is used to return data to the process that started nc.
In the example above, I use an interactive shell to start nc and so the stdin
and stdout are connected to my terminal. This is why my typing is sent to the
remote server (albeit that server is on localhost) and why the returned data
is printed to my terminal. This works because HTTP is a text-based protocol, so
we are sending and receiving data the terminal and shell can process. A binary
protocol would likely send control codes that would break our shell or terminal.
But, in the general case, any program that starts an nc child process and
binds its stdin and stdout can then use the nc process to send and receive
any bytes to and from a remote server. Unlike in the terminal case above, those
bytes need not be printable, so binary protocols like SSH can be used.
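To make the relaying idea concrete, here is a minimal Python sketch of an nc-like relay. It is not how nc is implemented; the function names and the upper-casing fake server are made up for illustration, and a socketpair stands in for a real TCP connection so that nothing touches the network.

```python
import io
import socket
import threading


def pump_to_socket(src, sock):
    """Copy bytes from a file-like object (our stand-in for stdin) to the socket."""
    while True:
        chunk = src.read(4096)
        if not chunk:
            break
        sock.sendall(chunk)
    # No more input: tell the peer we are done sending, but keep receiving.
    sock.shutdown(socket.SHUT_WR)


def pump_from_socket(sock, dst):
    """Copy bytes from the socket to a file-like object (our stand-in for stdout)."""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        dst.write(chunk)


def relay(sock, stdin, stdout):
    """Shuttle bytes both ways until the remote side closes, roughly as nc does."""
    sender = threading.Thread(target=pump_to_socket, args=(stdin, sock))
    sender.start()
    pump_from_socket(sock, stdout)
    sender.join()


def demo():
    # socketpair() gives two connected sockets, so no real network is needed.
    local_end, remote_end = socket.socketpair()

    def fake_server():
        # Hypothetical server: read everything, then reply with it upper-cased.
        received = b""
        while True:
            chunk = remote_end.recv(4096)
            if not chunk:
                break
            received += chunk
        remote_end.sendall(received.upper())
        remote_end.close()

    server = threading.Thread(target=fake_server)
    server.start()
    captured = io.BytesIO()
    # The input includes non-printable bytes to show the relay does not care.
    relay(local_end, io.BytesIO(b"hello \x00\xff"), captured)
    server.join()
    local_end.close()
    return captured.getvalue()


if __name__ == "__main__":
    print(demo())
```

The relay never inspects the bytes it moves, which is the property that lets a binary protocol ride over it.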
What SSH expects from ProxyCommand
When we use a ProxyCommand with SSH, SSH does very specific things:
- SSH executes a child process using the command in ProxyCommand.
- SSH binds the child process stdin and stdout.
- SSH uses the child process’s stdin and stdout instead of creating its own network connection.
- At this point, SSH starts the standard SSH handshake and authentication over the stdin/stdout network-like socket, assuming there is an sshd at the other end of it.
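The steps above can be sketched in Python as a parent that starts a child and then treats the child's pipes as its connection. This is only an illustration under assumptions: FAKE_PROXY_COMMAND and exchange_banners are hypothetical names, and the child is a tiny stand-in script rather than a real proxy such as nc, answering with a made-up SSH-style identification line.

```python
import subprocess
import sys

# Hypothetical stand-in for a real proxy command such as `nc HOST 22`.
# It reads the client's first line and answers with a made-up server banner.
FAKE_PROXY_COMMAND = [
    sys.executable,
    "-c",
    "import sys\n"
    "sys.stdin.buffer.readline()\n"
    "sys.stdout.buffer.write(b'SSH-2.0-FakeServer\\r\\n')\n"
    "sys.stdout.buffer.flush()\n",
]


def exchange_banners(proxy_argv, client_banner):
    """Start the proxy as a child and talk to it over pipes, not a socket."""
    with subprocess.Popen(
        proxy_argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    ) as child:
        # What would normally be written to a network socket goes to the
        # child's stdin instead...
        child.stdin.write(client_banner)
        child.stdin.flush()
        # ...and what would normally be read from the socket comes from
        # the child's stdout.
        return child.stdout.readline()


if __name__ == "__main__":
    print(exchange_banners(FAKE_PROXY_COMMAND, b"SSH-2.0-SketchClient\r\n"))
```

The parent has no idea what the child does with the bytes. It only needs the child to behave like a connection to a server.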
That final step should sound very familiar from that last paragraph on nc 😀
SSH assumes the child process is routing the parent SSH process’s network traffic all the way to the destination.
That means that the SSH parent process will assume that it is talking to
DESTINATIONHOST when executing the following command, but in fact it will have
a connection to ANOTHERHOST because ANOTHERHOST is specified in the
ProxyCommand string:
ssh -o ProxyCommand='nc ANOTHERHOST 22' DESTINATIONHOST
This can happen through misconfiguration, for example.
To counter this threat, the parent SSH process will validate the host key it
receives from the server on the other end of the ProxyCommand against the key
it expects for DESTINATIONHOST. In this case, SSH will then realise it is
talking to the wrong server and terminate the connection.
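A toy Python model of that check may help. It covers only the lookup: a real client also verifies that the server holds the matching private key, and keeps its records in a known_hosts file. The dictionary, the placeholder key bytes and the function name are all hypothetical.

```python
import hmac

# Hypothetical records of previously trusted keys, indexed by the host name
# the user asked for. The byte strings are placeholders, not real keys.
KNOWN_HOST_KEYS = {
    "DESTINATIONHOST": b"key-of-destination",
    "ANOTHERHOST": b"key-of-another",
}


def host_key_is_trusted(requested_host, presented_key):
    """Check the presented key against the record for the host we asked for."""
    expected = KNOWN_HOST_KEYS.get(requested_host)
    if expected is None:
        # Unknown host: a real client would typically prompt the user instead.
        return False
    return hmac.compare_digest(expected, presented_key)


if __name__ == "__main__":
    # The proxy routed us to ANOTHERHOST, but we asked for DESTINATIONHOST.
    print(host_key_is_trusted("DESTINATIONHOST", b"key-of-another"))
```

The lookup uses the name on the command line, not the name inside the ProxyCommand string, which is why a misrouted connection shows up as a key mismatch.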
Once I’d internalised that all ProxyCommand does from SSH’s point of view is
to start a child process that SSH uses as a network connection via the child’s
stdin/out, things started falling into place. Specifically, I understood that
the child only has to open a connection to the remote sshd and proxy raw bytes,
nothing more, because the parent ssh process takes care of the actual
ssh-ness of things.
The simplest ProxyCommand
ProxyCommand is usually used to tunnel into a network via a remote bastion
host. But to show the simplest ProxyCommand, let’s proxy an SSH connection
over a local nc child process.
These commands have the same effect, in terms of the network connections created:
ssh mike@destination
# - and -
ssh -o ProxyCommand='nc destination 22' mike@destination
In the first case, ssh creates its own network connection to destination.
In the second, ssh starts a child nc process. Then nc creates the network
connection and waits for data on its stdin and from the network. ssh then uses
the stdin/out of the child nc process as its network connection.
But in both cases, the machine only has one TCP connection to destination.
The nc process will exit when the destination host closes its connection,
which will happen once ssh completes the SSH session with the destination
host. Again, nc doesn’t have any smarts about when to exit; it just waits
for the parent SSH process to finish doing what it needs to do.
The parent ssh process sends exactly the same bytes over the nc stdin/out
pipes that it would send over its own network connection. As far as ssh is
concerned, the stdin/out is just a network connection. Specifically, it expects
the thing that is responding over the stdin/out pipes to be an sshd server.
ssh will execute the usual SSH handshake, authentication and so on itself,
using the exact same sent and received bytes, just sending them to the nc
stdin/out pipes instead of its own network socket.
There’s nothing special happening from the parent ssh process’s point of view,
except this (odd) way to send its usual network traffic!
It’s important to remember that the ProxyCommand string is executed completely
locally. The child process may send commands to a remote host (eg, to tell
that remote host to proxy to the destination host) but the process itself is
entirely executed locally. Given that ssh wants to use the process’s stdin/out
as its network connection, by definition the process must be local 😬
Using remote servers in ProxyCommand
The main use-case for ProxyCommand isn’t using nc as a local network proxy,
of course. That’s … kinda pointless. Instead, it’s using a remote bastion host
to proxy to another “destination” host within the same network as the bastion.
This is where ProxyCommand starts to get confusing, because we stack a few
things together:
- That ssh can start and bind a remote program’s stdin/out.
- That nc can be used as that remote program to create an onward connection to the host inside the bastion’s network.
Binding a remote process’s stdin/out
To start a remote program and bind its stdin/out (and stderr), we do this:
> ssh mike@destination_host ls
This command does a lot of stuff:
1. The local ssh client creates a network connection to sshd on destination_host.
2. The local ssh client requests that sshd run ls.
3. The remote sshd runs ls, binding the ls process’s stdin/out to itself.
4. The remote sshd receives the result of ls over the bound stdout.
5. The remote sshd sends the resulting bytes to the local ssh client.
6. The local ssh client prints them out to the local terminal.
7. The remote ls process exits.
8. The remote sshd sends the exit code over to the local ssh client.
9. The remote sshd and the local ssh close the network connection.
10. The local ssh exits with the ls exit code (ie, it mirrors the remote process’s exit code).
Note that during steps 3 and 4, the local ssh client could send data to the
remote sshd which the remote sshd would pass to the executed command (ls)
via its stdin.
So that’s our first layer of stdin/out “proxying”. We can get the remote
sshd to proxy the stdin/out of a command to our local ssh client.
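A purely local analogue of what sshd does for a requested command can be sketched in Python: start the command, feed it stdin, collect its stdout, and pass along its exit code. Nothing here involves SSH or a network, and run_and_mirror and STAND_IN_COMMAND are hypothetical names used only for this sketch.

```python
import subprocess
import sys


def run_and_mirror(argv, stdin_bytes=b""):
    """Run a command, feed its stdin, collect its stdout, report its exit code."""
    finished = subprocess.run(argv, input=stdin_bytes, stdout=subprocess.PIPE)
    return finished.stdout, finished.returncode


# Hypothetical stand-in for the remote command: it upper-cases whatever
# arrives on stdin and exits with code 3 so the mirroring is visible.
STAND_IN_COMMAND = [
    sys.executable,
    "-c",
    "import sys\n"
    "sys.stdout.buffer.write(sys.stdin.buffer.read().upper())\n"
    "sys.stdout.buffer.flush()\n"
    "sys.exit(3)\n",
]

if __name__ == "__main__":
    output, exit_code = run_and_mirror(STAND_IN_COMMAND, b"hello")
    print(output, exit_code)
```

With ssh the same three things (stdin, stdout, exit code) cross the network inside the SSH session rather than staying on one machine.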
Proxying via a remote nc process
Let’s look at how we take remote execution and turn it into a proxy running on the bastion host.
What we do is:
- Use nc as our remote program.
- Embed that remote execution of nc in a ProxyCommand.
The command line looks like this:
ssh -o ProxyCommand='ssh mike@bastion nc destination 22' mike@destination
Let’s look at how this command gets to the position where the parent local SSH
process is able to start a handshake with the sshd server running on
destination.
The first stage is setting up the connection to the destination:
- The ssh client parent process creates a child ssh client process:
  - The child process is ssh mike@bastion nc destination 22, ie the exact ProxyCommand string. Note this child ssh process is connecting to bastion.
  - The parent binds to the child’s stdin/out.
- The child ssh client process opens an SSH connection (runs the SSH handshake, authenticates) to sshd on bastion.
- The child ssh process requests that the bastion’s sshd starts nc destination 22.
- The bastion sshd starts the nc process, binding the nc stdin/out to itself.
- The nc process running on the bastion creates a TCP network connection to port 22 on destination.
Side note: to avoid typos where the wrong destination host ends up being
used (as noted above), ProxyCommand allows substitution of the destination
host and port using %-variables:
ssh -o ProxyCommand='ssh mike@bastion nc %h %p' mike@destination
This is especially useful in configuration files, where the exact destination host isn’t known in advance.
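The substitution itself is simple string templating. Here is a minimal Python sketch of the idea, not OpenSSH's implementation: expand_tokens is a hypothetical helper that handles only %h, %p, %r (the remote user) and %% (a literal percent sign), and raising an error on anything else is a choice made for this sketch.

```python
import re


def expand_tokens(template, host, port, user=""):
    """Replace %h, %p, %r and %% in a ProxyCommand-style template."""
    values = {"h": host, "p": str(port), "r": user, "%": "%"}

    def replace(match):
        token = match.group(1)
        if token not in values:
            # Sketch-only behaviour: reject tokens this helper does not know.
            raise ValueError("unknown token: %" + token)
        return values[token]

    # Each match is a percent sign plus the single character that follows it.
    return re.sub(r"%(.)", replace, template)


if __name__ == "__main__":
    print(expand_tokens("ssh mike@bastion nc %h %p", "destination", 22))
```

Because the host and port are filled in from the destination given on the command line, the two can no longer drift apart.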
At this point, the parent ssh client process is able to start sending bytes to
destination, to start the SSH handshake and authentication process with the
sshd on destination. Let’s see how those bytes make their way to destination.
- The parent ssh client process sends the bytes forming the first part of the new SSH handshake over the child ssh client’s stdin.
- The child ssh receives the bytes over stdin and sends them over the network to sshd on the bastion.
- The bastion’s sshd receives the bytes and sends them to its child nc process over stdin.
- The nc process receives the bytes over stdin and sends them over the network to destination:22.
- The sshd server on destination receives the bytes.
The destination server’s response mirrors that:
- The destination sshd sends bytes across the network to the bastion’s nc process.
- nc outputs those bytes over stdout to its parent sshd process.
- The bastion’s sshd sends the bytes over the network to the child ssh client process.
- The child ssh client outputs the bytes over stdout, where they are received by the parent ssh client process.
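The hop-by-hop forwarding in both directions can be imitated locally with a chain of small relay processes. In this Python sketch every hop is joined by a pipe, whereas in the real setup two of the hops cross the network. The three relays loosely stand in for the child ssh, the bastion's sshd and nc, and the final script stands in for the destination sshd. All names are hypothetical.

```python
import subprocess
import sys

# One hop: start the next command, copy our stdin to it and its stdout to ours.
RELAY = r"""
import subprocess, sys, threading
nxt = subprocess.Popen(sys.argv[1:], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
def forward_stdin():
    for chunk in iter(lambda: sys.stdin.buffer.read1(4096), b""):
        nxt.stdin.write(chunk)
        nxt.stdin.flush()
    nxt.stdin.close()
threading.Thread(target=forward_stdin, daemon=True).start()
for chunk in iter(lambda: nxt.stdout.read1(4096), b""):
    sys.stdout.buffer.write(chunk)
    sys.stdout.buffer.flush()
sys.exit(nxt.wait())
"""

# Hypothetical end of the chain, standing in for the destination sshd.
DESTINATION = r"""
import sys
sys.stdout.buffer.write(b"destination saw: " + sys.stdin.buffer.read())
sys.stdout.buffer.flush()
"""


def hop(next_argv):
    """Wrap a command in one more relay process."""
    return [sys.executable, "-c", RELAY] + next_argv


def send_through_chain(payload, hops=3):
    """Play the parent: write to the first hop's stdin, read its stdout."""
    argv = [sys.executable, "-c", DESTINATION]
    for _ in range(hops):
        argv = hop(argv)
    done = subprocess.run(argv, input=payload, stdout=subprocess.PIPE, timeout=60)
    return done.stdout


if __name__ == "__main__":
    print(send_through_chain(b"hello"))
```

The parent only ever touches the first hop's stdin and stdout, yet its bytes reach the last process and the reply comes back the same way.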
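Now we can put this together into a single data flow, with responses taking the
same path in reverse:
parent ssh -(stdin/out)-> child ssh -(network)-> bastion sshd -(stdin/out)-> nc -(network)-> destination sshd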
Summary
Overall, it’s somewhat convoluted! But this is because ProxyCommand is
designed to provide flexibility to the user by allowing the network connection
to be created in any way the user wants. Other ways include:
- The SSH client’s man page gives the example of using an HTTP proxy rather than using an SSH server as the proxy.
- More intelligent proxies can use the SSH subsystem mechanism to provide extra functionality.
The key thing that ProxyCommand does is to abstract the routing of the network
traffic to the destination, whether that’s via a local nc, a remote bastion
or something completely different, away from the user’s SSH session (ie, the
“parent” ssh client process).
Importantly, because a new SSH session is established by the parent over the
network path ProxyCommand creates, SSH is able to validate the connection to
the destination using its host key and all data in the user’s SSH session is
securely encrypted, so any proxies along the way cannot read it.