How SSH works, under the hood
I have used SSH as a way to get a shell on a remote machine for over twenty years, but I’ve never given that much thought to how the protocol works. In retrospect, I find this a little surprising as I tend to love this stuff.
But I got a chance to dig into it at work recently. In doing so, I found that my remote shells used a significantly more sophisticated protocol than I imagined. Instead of being super-specific, SSH turns out to be a general purpose, multiplexing, secure connection protocol, whose killer app appears to have been remote shells. I wanted to write a bit about it, to cement my understanding and give an introduction to the power SSH has.
The aim of this post is to give a working understanding of how SSH works one level down from how we typically see it. We’ll not cover the setting up of the SSH connection, but we will cover how the SSH client asks the server to do things like open a shell or run a program, and how data is moved between the two.
Go has an SSH client and server in its extended standard library, golang.org/x/crypto/ssh. We can use this to explore the SSH protocol in more detail. We’ll do that by building a simple SSH server that can run a single command, like when we run ssh mike@myserver.com ls -l / (that is, run ls in the root directory on the remote server). As we are doing this, we will log activity around SSH’s underlying primitives to peek under the covers.
The protocol in outline
First, let’s look at what happens when one opens a shell on a remote machine using SSH. In doing this, we’ll encounter SSH’s two core primitives: channels and requests.
- When you run ssh mike@myserver.com, SSH first creates a TCP session.
- Next, SSH negotiates a secure (encrypted and authenticated) SSH connection between the two machines over that TCP connection. The server and client run a handshake and authentication protocol between them. If this fails, the client is disconnected. Otherwise we continue.
- The client creates a channel on the connection. A channel contains a data stream and control message layer. Before sending data, the client will send requests on the control layer — such as “start a shell and use this channel as stdin/out/err”.
- The client can create further channels after this. Channels are multiplexed on the connection, a bit like how HTTP/2 multiplexes HTTP requests over a single TCP connection. SSH agent forwarding uses a second, specific channel alongside the typical shell channel.
- After the client is done with its channels — eg, the shell session is complete and terminated — the client and server will disconnect.
An established SSH connection with three channels.
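To make the multiplexing idea concrete, here’s a toy sketch in Go. It is not the SSH wire format: the frame type and the demux function are names I’ve made up for illustration, and real SSH adds flow control and much more. It only shows interleaved data for several channels arriving over one connection and being split back into per-channel streams.

```go
package main

import "fmt"

// frame is a toy stand-in for a channel data message: a channel
// number plus some bytes. Hypothetical type, for illustration only.
type frame struct {
	channel uint32
	data    string
}

// demux splits one ordered stream of frames into per-channel streams,
// keeping the order of the data within each channel.
func demux(frames []frame) map[uint32]string {
	streams := make(map[uint32]string)
	for _, f := range frames {
		streams[f.channel] += f.data
	}
	return streams
}

func main() {
	// Data for three channels, interleaved on a single connection.
	wire := []frame{
		{0, "ls "}, {1, "agent "}, {0, "-l /"}, {2, "fwd"}, {1, "request"},
	}
	streams := demux(wire)
	for id := uint32(0); id < 3; id++ {
		fmt.Printf("channel %d: %q\n", id, streams[id])
	}
}
```

Each channel sees only its own bytes, in order, even though they shared the connection with everyone else’s.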
Let’s accept an SSH connection in Go
First, we need to write the Go code that will accept an SSH connection and run the handshake. As soon as we have accepted the connection, we’ll put the server into an infinite loop to see what happens.
For quicker debug cycles, we will create a server that requires no authentication. This means that we can connect to it without typing a password every time. Obviously this would be silly for any other purpose.
Regardless of whether the server asks for any credentials from the client, we still need to provide a private key for the server itself, otherwise Go won’t start the server. This is just a foible of the Go implementation, and so we must generate one.
We’ll write one to the file id_rsa using openssl:
openssl genrsa -out id_rsa 2048
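If openssl isn’t to hand, something similar can be sketched with Go’s standard library. This is a minimal sketch, assuming a PEM-encoded PKCS #1 RSA key is acceptable to the server (ParsePrivateKey should be able to read that format). generateHostKeyPEM is a helper name I’ve invented here.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"encoding/pem"
	"log"
	"os"
)

// generateHostKeyPEM returns a new RSA private key, PEM-encoded in the
// traditional "RSA PRIVATE KEY" (PKCS #1) form. Hypothetical helper.
func generateHostKeyPEM(bits int) ([]byte, error) {
	key, err := rsa.GenerateKey(rand.Reader, bits)
	if err != nil {
		return nil, err
	}
	block := &pem.Block{
		Type:  "RSA PRIVATE KEY",
		Bytes: x509.MarshalPKCS1PrivateKey(key),
	}
	return pem.EncodeToMemory(block), nil
}

func main() {
	pemBytes, err := generateHostKeyPEM(2048)
	if err != nil {
		log.Fatal("failed to generate host key: ", err)
	}
	// Print the key to stdout; redirect it to a file to keep it.
	if _, err := os.Stdout.Write(pemBytes); err != nil {
		log.Fatal("failed to write host key: ", err)
	}
}
```

Saved as, say, genkey.go (a file name chosen just for this example), it could be run as go run genkey.go > id_rsa.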
Once we have an id_rsa in the current directory, we can write some Go:
package main

import (
	"log"
	"net"
	"os"

	"golang.org/x/crypto/ssh"
)

func main() {
	// An SSH server is represented by a ServerConfig, which holds
	// certificate details and handles authentication of ServerConns.
	config := &ssh.ServerConfig{
		NoClientAuth: true,
	}

	// We need to set a host key, or the SSH server won't
	// start.
	privateBytes, err := os.ReadFile("id_rsa")
	if err != nil {
		log.Fatal("Failed to load private key: ", err)
	}
	private, err := ssh.ParsePrivateKey(privateBytes)
	if err != nil {
		log.Fatal("Failed to parse private key: ", err)
	}
	config.AddHostKey(private)

	// Once a ServerConfig has been configured, connections can be
	// accepted.
	listener, err := net.Listen("tcp", "0.0.0.0:2022")
	if err != nil {
		log.Fatal("failed to listen for connection: ", err)
	}
	log.Println("Waiting for connections")
	nConn, err := listener.Accept()
	if err != nil {
		log.Fatal("failed to accept incoming connection: ", err)
	}
	defer nConn.Close()

	// Normally here one would start up a goroutine so the
	// server could accept more connections. But for ease,
	// we won't.

	// We wrap the socket in a server, which performs the
	// protocol's handshake and accepts our ssh connection.
	_, _, _, err = ssh.NewServerConn(nConn, config)
	if err != nil {
		log.Fatal("failed to handshake: ", err)
	}

	for {} // wait forever
}
We can start the server:
go run sshserver.go
Running ssh will execute the handshake in NewServerConn. As we’ve not connected to our server before, ssh will first ask whether we want to trust the key we just generated with openssl. However, after we agree to trust it, ssh just hangs:
> ssh mike@localhost -p 2022 ls -l /
The authenticity of host '[localhost]:2022 ([::1]:2022)' can't be established.
RSA key fingerprint is SHA256:qck74l1PNHZdehGQPthEGJQ5ELvpB2bt2DRCPekvkBk.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '[localhost]:2022' (RSA) to the list of known hosts.
# hangs here
Let’s do some work over SSH
At this point we’ve negotiated the secure SSH connection, but the server is not responding to anything beyond that. The client wants to start asking the server to execute ls -l /, but the server isn’t playing ball.
So what we need to do is to add more code to the server, so the SSH client can continue running the protocol. Our server needs to allow the client to create new SSH channels, send requests on those channels and, finally, send and receive data on them.
A bit more detail on the bytes
I love me a protocol, so let’s dive down a little further for a moment to see what SSH is sending over the wire.
Opening channels, making requests on them and sending data are all done using SSH messages. These start with a byte defining the type of message. Following the type is a set of type specific fields. For example, an exec request on a channel is:
byte SSH_MSG_CHANNEL_REQUEST
uint32 recipient channel
string "exec"
boolean want reply
string command
SSH_MSG_CHANNEL_REQUEST is defined to be 98. So when the Go server is waiting for a new request, and sees a 98 byte on the wire, it will attempt to read a new request off the wire, and pass it to our application.
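To make that layout concrete, here is a rough sketch that builds the bytes of such a message by hand. It assumes the field encodings from the SSH RFCs (a string is a big-endian uint32 length followed by the bytes, a boolean is a single byte). The helper names are mine, and on a real connection these bytes would additionally be wrapped and encrypted by the transport layer.

```go
package main

import "fmt"

// msgChannelRequest is the SSH_MSG_CHANNEL_REQUEST message number.
const msgChannelRequest = 98

// appendUint32 appends v in big-endian (network) byte order.
func appendUint32(b []byte, v uint32) []byte {
	return append(b, byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
}

// appendString appends an SSH "string": a uint32 length, then the bytes.
func appendString(b []byte, s string) []byte {
	b = appendUint32(b, uint32(len(s)))
	return append(b, s...)
}

// encodeExecRequest builds an exec channel request, field by field,
// in the order listed above. Hypothetical helper for illustration.
func encodeExecRequest(recipient uint32, wantReply bool, command string) []byte {
	msg := []byte{msgChannelRequest}    // byte    SSH_MSG_CHANNEL_REQUEST
	msg = appendUint32(msg, recipient)  // uint32  recipient channel
	msg = appendString(msg, "exec")     // string  "exec"
	if wantReply {                      // boolean want reply
		msg = append(msg, 1)
	} else {
		msg = append(msg, 0)
	}
	return appendString(msg, command)   // string  command
}

func main() {
	msg := encodeExecRequest(0, true, "ls -l /")
	fmt.Printf("type byte: %d\n", msg[0])
	fmt.Printf("message:   % x\n", msg)
}
```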
Thankfully the Go server handles the details of the actual messages on the wire, and provides us Go-friendly abstractions to handle them, often using Go channels given the SSH protocol is relatively asynchronous.
It gets a bit confusing in code that Go and SSH both have a thing named channels — be alert for this when reading!
The Go server makes handling new SSH channels and requests quite easy to do. The NewServerConn function returns a Go channel called chans. Receiving from that Go channel in a loop returns new requests from the SSH client for new SSH channels.
Each SSH channel that the client opens has a type. Within the underlying protocol, the type is just an ASCII string. For most interactive sessions, including running shells and single commands, the type is session. The other common type is for ssh-agent; conventionally the type for this is auth-agent@openssh.com.
For our server, we want to support SSH’s single command execution feature. Therefore, we set up a for loop to read from the chans channel returned from NewServerConn, and accept channels of type session.
To allow us to support multiple channels over the same SSH connection, each time we accept a channel, we will start a new goroutine to serve the requests on that channel. For executing a single command, the client sends a request to the channel with request type set to exec, again an ASCII string.
Again, the Go server wraps the underlying SSH requests into a Go channel that we can receive on. In the new goroutine, we will loop over the channel, waiting for an exec request and rejecting others. The client usually expects a reply as to whether the server has accepted each request.
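As far as I read RFC 4254, the reply itself is tiny: a message type of 99 (SSH_MSG_CHANNEL_SUCCESS) or 100 (SSH_MSG_CHANNEL_FAILURE) followed by the recipient channel number, and it is only sent when the request asked for one via its want reply flag. Here is a small sketch of that rule. channelReply is a made-up helper, not something from the Go library, which handles all of this for us.

```go
package main

import "fmt"

const (
	msgChannelSuccess = 99  // SSH_MSG_CHANNEL_SUCCESS
	msgChannelFailure = 100 // SSH_MSG_CHANNEL_FAILURE
)

// channelReply returns the reply message for a channel request, or nil
// when the requester did not set want reply and so expects no answer.
func channelReply(wantReply, accepted bool, recipient uint32) []byte {
	if !wantReply {
		return nil
	}
	msgType := byte(msgChannelFailure)
	if accepted {
		msgType = msgChannelSuccess
	}
	return []byte{
		msgType,
		byte(recipient >> 24), byte(recipient >> 16),
		byte(recipient >> 8), byte(recipient),
	}
}

func main() {
	fmt.Printf("accepted:        % x\n", channelReply(true, true, 0))
	fmt.Printf("rejected:        % x\n", channelReply(true, false, 0))
	fmt.Printf("no reply wanted: % x\n", channelReply(false, false, 0))
}
```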
Let’s extend our server to handle the client requesting channels, and then we will listen for the exec request on each channel. If we get the exec request, rather than executing the command, for now we will just send a short message back to the user:
func main() {
	// ... code above to accept the TCP socket and set up the
	// server config; "sync" must also be added to the imports ...

	conn, chans, reqs, err := ssh.NewServerConn(nConn, config)
	if err != nil {
		log.Fatal("failed to handshake: ", err)
	}
	defer conn.Close()

	// We will use wg to wait on the goroutines that we
	// start to service each channel.
	var wg sync.WaitGroup
	defer wg.Wait()

	// The incoming Request channel must be serviced.
	// This contains just heartbeats for shell sessions.
	wg.Add(1)
	go func() {
		ssh.DiscardRequests(reqs)
		wg.Done()
	}()

	// In the Go server implementation, we receive new
	// SSH multiplexed channels via the chans ... go channel.
	for newChannel := range chans {
		// Reject any channels that are not "session" channels
		if newChannel.ChannelType() != "session" {
			newChannel.Reject(
				ssh.UnknownChannelType,
				"unknown channel type",
			)
			continue
		}
		channel, requests, err := newChannel.Accept()
		if err != nil {
			log.Fatalf("Could not accept channel: %v", err)
		}
		log.Println("Accepted a new channel!")

		// Now we have accepted the channel, we have to
		// be ready to handle requests on the channel.
		// We will fire up a goroutine for that, so we
		// can handle other channels that come in over the
		// connection.
		// Finally, we can spy on what the SSH client sends:
		wg.Add(1)
		go func(in <-chan *ssh.Request) {
			for req := range in {
				log.Printf("received req %s", req.Type)
				// We reject every request but an exec request.
				switch req.Type {
				case "exec":
					log.Printf(
						"req exec with payload (cmd): %s",
						string(req.Payload),
					)
					// Accept the exec request
					req.Reply(true, nil)
					// Send some pretend command output
					channel.Write([]byte("Hello SSH\r\n"))
				default:
					req.Reply(false, nil)
				}
			}
			wg.Done()
		}(requests)
	}
}
Running the client against our new server
Now, let’s run ssh mike@localhost -p 2022 ls -l / again.
On the server side, we will see a new channel created and some requests arriving:
> go run sshserver.go
2024/06/11 21:16:55 Waiting for connections
2024/06/11 21:17:00 Accepted a new channel!
2024/06/11 21:17:00 received req env
2024/06/11 21:17:00 received req exec
2024/06/11 21:17:00 req exec with payload (cmd): ls -l /
Nice! We’re able to read the command off the wire that ssh is sending. That’s much better!
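One detail hiding in that log line: the exec payload is an SSH string, so it starts with a four-byte length before the command text. Those bytes are unprintable, which is why the log still looks right. Here’s a minimal standard-library sketch of pulling the command out properly. parseExecPayload is a hypothetical helper, and the ssh package’s own Unmarshal function could likely do the same job.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// parseExecPayload extracts the command from an exec request payload:
// a big-endian uint32 length followed by exactly that many bytes.
func parseExecPayload(payload []byte) (string, error) {
	if len(payload) < 4 {
		return "", errors.New("payload too short for a length prefix")
	}
	n := binary.BigEndian.Uint32(payload[:4])
	if uint64(n) != uint64(len(payload)-4) {
		return "", errors.New("length prefix does not match payload size")
	}
	return string(payload[4:]), nil
}

func main() {
	// The payload we would expect for: ssh ... ls -l /
	payload := append([]byte{0, 0, 0, 7}, "ls -l /"...)
	cmd, err := parseExecPayload(payload)
	if err != nil {
		fmt.Println("bad payload:", err)
		return
	}
	fmt.Printf("command: %q\n", cmd)
}
```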
If we re-run ssh with debug level 2 (-vv) we can even see these requests being sent, and the responses:
> ssh mike@localhost -p 2022 -vv ls -l /
... a lot of login text ...
debug2: channel 0: send open
debug1: Entering interactive session.
debug1: pledge: filesystem
debug2: channel_input_open_confirmation: channel 0: callback start
debug2: fd 6 setting TCP_NODELAY
debug2: client_session2_setup: id 0
debug1: Sending environment.
debug1: channel 0: setting env LANG = "en_GB.UTF-8"
debug2: channel 0: request env confirm 0
debug1: Sending command: ls -l /
debug2: channel 0: request exec confirm 1
debug2: channel_input_open_confirmation: channel 0: callback done
debug2: channel 0: open confirm rwindow 2097152 rmax 32768
debug2: channel_input_status_confirm: type 99 id 0
debug2: exec request accepted on channel 0
Hello SSH
Let’s pull that apart a little:
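- First we can see that the client makes an env request. The confirm 0 in the client’s log is the request’s want reply flag, not the server’s answer: the client does not ask for a reply to env, so although our server rejects the request, the client never hears about it.

  ```
  debug1: channel 0: setting env LANG = "en_GB.UTF-8"
  debug2: channel 0: request env confirm 0
  ```
- Then we can see the command being sent, this time with a reply requested (confirm 1). The server’s acceptance shows up a few lines later as message type 99 (SSH_MSG_CHANNEL_SUCCESS) and the line exec request accepted on channel 0.

  ```
  debug1: Sending command: ls -l /
  debug2: channel 0: request exec confirm 1
  ```
- Finally we see the “Hello SSH” string the server writes to the channel after it gets the exec request.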
But the client still hangs at Hello SSH. Why? 😠
Debug level 2 (-vv) is the most useful verbosity level when debugging what’s going on here. Level 3 seems a bit much, level 1 misses useful stuff.
Why are we left hanging at Hello SSH?
In the end it’s quite simple. The SSH client doesn’t know that the command execution (our pretend one) is complete. So it’s sat waiting for new output.
To tell the client to stop waiting, we need to close the channel using channel.Close():
// ... code above to accept new server connection
// and handle new channels ...
wg.Add(1)
go func(in <-chan *ssh.Request) {
	for req := range in {
		log.Printf("received req %s", req.Type)
		switch req.Type {
		case "exec":
			log.Printf(
				"req exec with payload (cmd): %s",
				string(req.Payload),
			)
			// Accept the exec request
			req.Reply(true, nil)
			// Send some pretend command output
			channel.Write([]byte("Hello SSH\r\n"))
			// Close the channel
			channel.Close() // <<<----------------
		default:
			req.Reply(false, nil)
		}
	}
	wg.Done()
}(requests)
> ssh mike@localhost -p 2022 ls -l /
Hello SSH
Connection to localhost closed.
[255]>
Turns out that while this terminates the connection, it doesn’t close it cleanly. Why? Because the ssh client is expecting an exit status for our shell (that we never really started!).
SSH client exit codes
255 is the ssh client’s generic “something went wrong” exit code.
If the ssh connection is successful, the exit code is instead the remote shell or executed command’s exit code.
Properly closing out our exec request
Rather than an abrupt channel closure, the SSH client expects a proper sequence of things to happen during its exec request:
1. The command is executed.
2. When the command stops, we tell the SSH client the process exited by sending the client either the process exit code or signal.
3. Finally, we can go ahead and close the channel.
For step (1), the server is pretending to execute something, and returning the pretend output Hello SSH. So we’re okay there.
However, the server is missing step (2). It must send an exit code or signal. To send an exit code, the server sends a request to the client (recall the original exec came from the client to the server; requests go both ways).
The exit code request has type exit-status and its payload is the exit status as a big endian uint32. We use the arbitrary exit code 7 to make it easier to confirm that our exit code got to the client, as we will see it as the ssh client’s exit code.
The server needs to explicitly encode the exit status integer using Go’s encoding/binary package:
buf := make([]byte, 4)
binary.BigEndian.PutUint32(buf, 7) // put the exit code into buf
channel.SendRequest("exit-status", false, buf[0:4])
After we have sent the exit-status request, we can close out the channel with channel.Close(). The client won’t complain about this now that we’ve sent the exit status.
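The other option mentioned earlier, reporting a signal, uses an exit-signal request instead. As I understand RFC 4254, its payload is the signal name without the SIG prefix, a core dumped boolean, an error message and a language tag. Here is a sketch of building that payload, where exitSignalPayload and appendString are helper names of my own:

```go
package main

import "fmt"

// appendString appends an SSH "string": a big-endian uint32 length
// followed by the raw bytes.
func appendString(b []byte, s string) []byte {
	n := uint32(len(s))
	b = append(b, byte(n>>24), byte(n>>16), byte(n>>8), byte(n))
	return append(b, s...)
}

// exitSignalPayload builds the request-specific fields of an
// exit-signal request. Hypothetical helper for illustration.
func exitSignalPayload(signal string, coreDumped bool, message, langTag string) []byte {
	payload := appendString(nil, signal) // e.g. "KILL", not "SIGKILL"
	if coreDumped {
		payload = append(payload, 1)
	} else {
		payload = append(payload, 0)
	}
	payload = appendString(payload, message)
	return appendString(payload, langTag)
}

func main() {
	payload := exitSignalPayload("KILL", false, "killed", "")
	fmt.Printf("% x\n", payload)
	// In the server above, this would presumably be sent with:
	//   channel.SendRequest("exit-signal", false, payload)
}
```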
Here’s the request handling code in full:
// ... code above to accept new server connection
// and handle new channels; "encoding/binary" must
// also be added to the imports ...
wg.Add(1)
go func(in <-chan *ssh.Request) {
	for req := range in {
		log.Printf("received req %s", req.Type)
		switch req.Type {
		case "exec":
			log.Printf(
				"req exec with payload (cmd): %s",
				string(req.Payload),
			)
			// Accept the request
			req.Reply(true, nil)
			// Write some pretend command output
			channel.Write([]byte("Hello SSH\r\n"))
			// Send an exit code of 7
			buf := make([]byte, 4)
			binary.BigEndian.PutUint32(buf, 7)
			channel.SendRequest(
				"exit-status",
				false,
				buf[0:4],
			)
			// Finally, close the channel
			channel.Close()
		default:
			req.Reply(false, nil)
		}
	}
	wg.Done()
}(requests)
And now that works. Note the reported exit code, [7]:
> ssh mike@localhost -p 2022 ls -l /
Hello SSH
Connection to localhost closed.
[7]>
And so it works (kind of)
We’ve got a working SSH server (that does nothing!). But we can see from this the bare bones of handling SSH channels and requests, and we’ve got the standard ssh client to work with our server. From here, I feel that I could really start to dig in and get to know the protocol much better.
We can see that by defining our own channel and request types (the protocol has extensibility built in!) we could enact arbitrarily complicated protocols over the secure channel that SSH gives us. Indeed, tools like sftp and scp do exactly that to leverage the underlying security of ssh to secure their data transfers. Even further, I could imagine leveraging SSH’s secure connections to run my own protocols, such as running a protocol buffer client/server over a channel type I define.
I found learning about SSH pretty amazing, and it’s reinforced my belief that most things in computers can be understood with time. It’s deeper and more powerful than I thought, although it seems to have settled mostly into the niche it’s named after, being a secure shell.
For reference, I found the RFCs to be great, at least once I’d learned the basic connection/channel/request structure:
- RFC 4254 - The Secure Shell (SSH) Connection Protocol - this defines the channels and request types we dealt with here. I found this one the most useful when writing the code above.
- RFC 4251 - The Secure Shell (SSH) Protocol Architecture - this talks at a higher level, and I found its primary use to be the security properties section.
- There are two other RFCs that I didn’t use much, as Go’s server handles most of the details from them: RFC 4253 - The Secure Shell (SSH) Transport Layer Protocol and RFC 4252 - The Secure Shell (SSH) Authentication Protocol.