How SSH works, under the hood

I have used SSH as a way to get a shell on a remote machine for over twenty years, but I’ve never given that much thought to how the protocol works. In retrospect, I find this a little surprising as I tend to love this stuff.

But I got a chance to dig into it at work recently. In doing so, I found that my remote shells used a significantly more sophisticated protocol than I imagined. Instead of being super-specific, SSH turns out to be a general purpose, multiplexing, secure connection protocol, whose killer app appears to have been remote shells. I wanted to write a bit about it, to cement my understanding and give an introduction to the power SSH has.

The aim of this post is to give a working understanding of how SSH works one level down from how we typically see it. We’ll not cover the setting up of the SSH connection, but we will cover how the SSH client asks the server to do things like open a shell or run a program, and how data is moved between the two.

Go has an SSH client and server in its extended standard library, golang.org/x/crypto/ssh. We can use this to explore the SSH protocol in more detail. We’ll do that by building a simple SSH server that can run a single command, like when we run ssh mike@myserver.com ls -l / — ie, run ls in the root directory on the remote server. As we are doing this, we will log activity around SSH’s underlying primitives to peek under the covers.

The protocol in outline

First, let’s look at what happens when one opens a shell on a remote machine using SSH. In doing this, we’ll encounter SSH’s two core primitives: channels and requests.

When you ssh mike@myserver.com, SSH first creates a TCP session.
Next, SSH negotiates a secure — encrypted and authenticated — SSH connection between the two machines over that TCP connection. The server and client run a handshake and authentication protocol between them. If this fails, the client is disconnected. Otherwise we continue.
The client creates a channel on the connection. A channel contains a data stream and control message layer. Before sending data, the client will send requests on the control layer — such as “start a shell and use this channel as stdin/out/err”.
The client can create further channels after this. Channels are multiplexed on the connection, a bit like how HTTP/2 multiplexes HTTP requests over a single TCP connection. SSH agent forwarding uses a second, specific channel alongside the typical shell channel.
After the client is done with its channels — eg, the shell session is complete and terminated — the client and server will disconnect.

An established SSH connection with three channels.
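
To make the multiplexing idea concrete, here is a minimal sketch, using only the Go standard library, of how messages tagged with a channel number can share one byte stream. The layout mirrors SSH_MSG_CHANNEL_DATA from RFC 4254 (message type 94, a uint32 recipient channel, then a length-prefixed string), but it leaves out the encryption, packet framing and flow-control windows that real SSH adds. The function names encodeData and demux are made up for illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// msgChannelData is the SSH_MSG_CHANNEL_DATA message number from RFC 4254.
const msgChannelData = 94

// appendUint32 appends v to b in big-endian (network) byte order.
func appendUint32(b []byte, v uint32) []byte {
	return append(b, byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
}

// encodeData builds one simplified channel-data message:
// type byte, uint32 recipient channel, uint32 length, payload.
func encodeData(channel uint32, data []byte) []byte {
	msg := []byte{msgChannelData}
	msg = appendUint32(msg, channel)
	msg = appendUint32(msg, uint32(len(data)))
	return append(msg, data...)
}

// demux walks a stream of back-to-back messages and collects the
// payload bytes for each channel number.
func demux(stream []byte) (map[uint32][]byte, error) {
	out := make(map[uint32][]byte)
	for len(stream) > 0 {
		if len(stream) < 9 || stream[0] != msgChannelData {
			return nil, errors.New("malformed message")
		}
		channel := binary.BigEndian.Uint32(stream[1:5])
		n := binary.BigEndian.Uint32(stream[5:9])
		if uint64(len(stream)-9) < uint64(n) {
			return nil, errors.New("truncated payload")
		}
		out[channel] = append(out[channel], stream[9:9+n]...)
		stream = stream[9+n:]
	}
	return out, nil
}

func main() {
	// Interleave data for two channels on one "connection".
	var stream []byte
	stream = append(stream, encodeData(0, []byte("shell "))...)
	stream = append(stream, encodeData(1, []byte("agent"))...)
	stream = append(stream, encodeData(0, []byte("output"))...)

	perChannel, err := demux(stream)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("channel 0: %q\n", perChannel[0])
	fmt.Printf("channel 1: %q\n", perChannel[1])
}
```

Each channel sees only its own bytes, in order, even though the messages were interleaved on the shared stream.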

Let’s accept an SSH connection in Go

First, we need to write the Go code that will accept an SSH connection and run the handshake. As soon as we have accepted the connection, we’ll put the server into an infinite loop to see what happens.

For quicker debug cycles, we will create a server that requires no authentication. This means that we can connect to it without typing a password every time. Obviously this would be silly for any other purpose.

Regardless of whether the server asks for any credentials from the client, we still need to provide a private key for the server itself, otherwise Go won’t start the server. This is just a foible of the Go implementation, and so we must generate one.

We’ll write one to the file id_rsa using openssl:

openssl genrsa -out id_rsa 2048
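
If openssl isn’t to hand, a key can also be produced with Go’s standard library. This is a sketch rather than part of the setup above: the helper name generateHostKeyPEM is made up, and it assumes the PKCS#1 "RSA PRIVATE KEY" PEM format, which is one of the formats ssh.ParsePrivateKey accepts.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"encoding/pem"
	"log"
	"os"
)

// generateHostKeyPEM creates an RSA key of the given size and returns it
// PEM-encoded in PKCS#1 form ("RSA PRIVATE KEY").
func generateHostKeyPEM(bits int) ([]byte, error) {
	key, err := rsa.GenerateKey(rand.Reader, bits)
	if err != nil {
		return nil, err
	}
	block := &pem.Block{
		Type:  "RSA PRIVATE KEY",
		Bytes: x509.MarshalPKCS1PrivateKey(key),
	}
	return pem.EncodeToMemory(block), nil
}

func main() {
	pemBytes, err := generateHostKeyPEM(2048)
	if err != nil {
		log.Fatal("failed to generate key: ", err)
	}
	// 0600: a private key should only be readable by its owner.
	if err := os.WriteFile("id_rsa", pemBytes, 0o600); err != nil {
		log.Fatal("failed to write key: ", err)
	}
	log.Println("wrote id_rsa")
}
```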

Once we have an id_rsa in the current directory, we can write some Go:

package main

import (
	"log"
	"net"
	"os"

	"golang.org/x/crypto/ssh"
)

func main() {
	// An SSH server is represented by a ServerConfig, which holds
	// certificate details and handles authentication of ServerConns.
	config := &ssh.ServerConfig{
		NoClientAuth: true,
	}

	// We need to set a host key, or the SSH server won't
	// start.
	privateBytes, err := os.ReadFile("id_rsa")
	if err != nil {
		log.Fatal("Failed to load private key: ", err)
	}
	private, err := ssh.ParsePrivateKey(privateBytes)
	if err != nil {
		log.Fatal("Failed to parse private key: ", err)
	}
	config.AddHostKey(private)

	// Once a ServerConfig has been configured, connections can be
	// accepted.
	listener, err := net.Listen("tcp", "0.0.0.0:2022")
	if err != nil {
		log.Fatal("failed to listen for connection: ", err)
	}

	log.Println("Waiting for connections")
	nConn, err := listener.Accept()
	if err != nil {
		log.Fatal("failed to accept incoming connection: ", err)
	}
	defer nConn.Close()

	// Normally here one would start up a goroutine so the
	// server could accept more connections. But for ease,
	// we won't.

	// We wrap the socket in a server, which performs the
	// protocol's handshake and accepts our ssh connection.
	_, _, _, err = ssh.NewServerConn(nConn, config)
	if err != nil {
		log.Fatal("failed to handshake: ", err)
	}

	select {} // wait forever (like for {}, but without spinning a CPU core)
}

We can start the server:

go run sshserver.go

Running ssh will execute the handshake in NewServerConn. As we’ve not connected to our server before, ssh will first ask whether we want to trust the key we just generated with openssl.

However, after we agree to trust it, ssh just hangs:

> ssh mike@localhost -p 2022 ls -l /
The authenticity of host '[localhost]:2022 ([::1]:2022)' can't be established.
RSA key fingerprint is SHA256:qck74l1PNHZdehGQPthEGJQ5ELvpB2bt2DRCPekvkBk.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '[localhost]:2022' (RSA) to the list of known hosts.

# hangs here

Let’s do some work over SSH

At this point we’ve negotiated the secure SSH connection, but the server is not responding to anything beyond that. The client wants to start asking the server to execute ls -l /, but the server isn’t playing ball.

So what we need to do is to add more code to the server, so the SSH client can continue running the protocol. Our server needs to allow the client to create new SSH channels, send requests on those channels and, finally, send and receive data on them.

A bit more detail on the bytes

I love me a protocol, so let’s dive down a little further for a moment to see what SSH is sending over the wire.

Opening channels, making requests on them and sending data are all done using SSH messages. These start with a byte defining the type of message. Following the type is a set of type specific fields. For example, an exec request on a channel is:

byte      SSH_MSG_CHANNEL_REQUEST
uint32    recipient channel
string    "exec"
boolean   want reply
string    command

SSH_MSG_CHANNEL_REQUEST is defined to be 98. So when the Go server is waiting for a new request, and sees a 98 byte on the wire, it will attempt to read a new request off the wire, and pass it to our application.
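
To see what those fields look like as bytes, here is a small standard-library sketch that lays out an exec request by hand. It assumes the RFC 4254 encodings (uint32 values are big-endian, a string is a uint32 length followed by its bytes, a boolean is a single byte); the helper names are made up, and a real message would then be wrapped in an encrypted SSH packet.

```go
package main

import "fmt"

// msgChannelRequest is SSH_MSG_CHANNEL_REQUEST from RFC 4254.
const msgChannelRequest = 98

// appendUint32 appends v in big-endian (network) byte order.
func appendUint32(b []byte, v uint32) []byte {
	return append(b, byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
}

// appendString appends an SSH "string": a uint32 length, then the bytes.
func appendString(b []byte, s string) []byte {
	b = appendUint32(b, uint32(len(s)))
	return append(b, s...)
}

// appendBool appends an SSH boolean: 1 for true, 0 for false.
func appendBool(b []byte, v bool) []byte {
	if v {
		return append(b, 1)
	}
	return append(b, 0)
}

// encodeExecRequest lays out the fields of an exec channel request
// in the order listed above.
func encodeExecRequest(channel uint32, wantReply bool, command string) []byte {
	msg := []byte{msgChannelRequest}
	msg = appendUint32(msg, channel)
	msg = appendString(msg, "exec")
	msg = appendBool(msg, wantReply)
	msg = appendString(msg, command)
	return msg
}

func main() {
	msg := encodeExecRequest(0, true, "ls -l /")
	fmt.Printf("% x\n", msg)
	// prints: 62 00 00 00 00 00 00 00 04 65 78 65 63 01 00 00 00 07 6c 73 20 2d 6c 20 2f
}
```

The leading 62 is 98 in hex, and the two 00 00 00 xx runs are the length prefixes for "exec" and for the command.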

Thankfully the Go server handles the details of the actual messages on the wire, and provides us Go-friendly abstractions to handle them, often using Go channels given the SSH protocol is relatively asynchronous.

It gets a bit confusing in code that Go and SSH both have a thing named channels — be alert for this when reading!

The Go server makes handling new SSH channels and requests quite easy to do. The NewServerConn function returns a Go channel, which our code will call chans. Receiving from that Go channel in a loop yields each request from the SSH client to open a new SSH channel.

Each SSH channel that the client opens has a type. Within the underlying protocol, the type is just an ASCII string. For most interactive sessions, including running shells and single commands, the type is session. The other common type is for ssh-agent; conventionally the type for this is auth-agent@openssh.com.

For our server, we want to support SSH’s single command execution feature. Therefore, we set up a for loop to read from the chans channel returned from NewServerConn, and accept channels of type session.

To allow us to support multiple channels over the same SSH connection, each time we accept a channel, we will start a new goroutine to serve the requests on that channel. For executing a single command, the client sends a request to the channel with request type set to exec, again an ASCII string.

Again, the Go server wraps the underlying SSH requests into a Go channel that we can receive on. In the new goroutine, we will loop over the channel, waiting for an exec request and rejecting others. The client usually expects a reply as to whether the server has accepted each request.
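
On the wire, that reply is one of two tiny messages defined in RFC 4254: SSH_MSG_CHANNEL_SUCCESS (99) or SSH_MSG_CHANNEL_FAILURE (100), each followed by the recipient channel number. A reply is only sent when the request’s want reply flag was set. In the Go server, req.Reply takes care of this for us; the sketch below (encodeReply is a made-up name) just shows the bytes involved.

```go
package main

import "fmt"

// Message numbers from RFC 4254.
const (
	msgChannelSuccess = 99
	msgChannelFailure = 100
)

// encodeReply returns the reply message for a channel request, or nil
// when the request did not ask for a reply.
func encodeReply(recipientChannel uint32, wantReply, accepted bool) []byte {
	if !wantReply {
		return nil
	}
	msgType := byte(msgChannelFailure)
	if accepted {
		msgType = msgChannelSuccess
	}
	return []byte{
		msgType,
		byte(recipientChannel >> 24),
		byte(recipientChannel >> 16),
		byte(recipientChannel >> 8),
		byte(recipientChannel),
	}
}

func main() {
	fmt.Printf("accepted: % x\n", encodeReply(0, true, true))
	fmt.Printf("rejected: % x\n", encodeReply(0, true, false))
	fmt.Printf("no reply wanted: %d bytes\n", len(encodeReply(0, false, false)))
}
```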

Let’s extend our server to handle the client requesting channels, and then we will listen for the exec request on each channel. If we get the exec request, rather than executing the command, for now we will just send a short message back to the user:

func main() {

	// ... code above to accept the TCP socket and set up the
	//     server config (plus "sync" added to the imports) ...
	
	conn, chans, reqs, err := ssh.NewServerConn(nConn, config)
	if err != nil {
		log.Fatal("failed to handshake: ", err)
	}
	defer conn.Close()

	// We will use wg to wait on the goroutines that we 
	// start to service each channel.
	var wg sync.WaitGroup
	defer wg.Wait()

	// The incoming Request channel must be serviced.
	// This contains just heartbeats for shell sessions.
	wg.Add(1)
	go func() {
		ssh.DiscardRequests(reqs)
		wg.Done()
	}()
	
	// In the Go server implementation, we receive new
	// SSH multiplexed channels via the chans ... go channel.
	for newChannel := range chans {
		// Reject any channels that are not "session" channels
		if newChannel.ChannelType() != "session" {
			newChannel.Reject(
				ssh.UnknownChannelType,
				"unknown channel type",
			)
			continue
		}
		channel, requests, err := newChannel.Accept()
		if err != nil {
			log.Fatalf("Could not accept channel: %v", err)
		}
		log.Println("Accepted a new channel!")

		// Now we have accepted the channel, we have to
		// be ready to handle requests on the channel.
		// We will fire up a goroutine for that, so we
		// can handle other channels that come in over the
		// connection.

		// Finally, we can spy on what the SSH client sends:
		wg.Add(1)
		go func(in <-chan *ssh.Request) {
			for req := range in {
				log.Printf("received req %s", req.Type)

				// We reject every request but an exec request.
				switch req.Type {
				case "exec":
					log.Printf(
						"req exec with payload (cmd): %s",
						string(req.Payload),
					)
					// Accept the exec request
					req.Reply(true, nil)
					// Send some pretend command output
					channel.Write([]byte("Hello SSH\r\n"))
				default:
					req.Reply(false, nil)
				}
			}
			wg.Done()
		}(requests)
	}
}

Running the client against our new server

Now, let’s run ssh mike@localhost -p 2022 ls -l / again.

On the server side, we will see a new channel created and some requests arriving:

> go run sshserver.go
2024/06/11 21:16:55 Waiting for connections
2024/06/11 21:17:00 Accepted a new channel!
2024/06/11 21:17:00 received req env
2024/06/11 21:17:00 received req exec
2024/06/11 21:17:00 req exec with payload (cmd): ls -l /

Nice! We’re able to read the command off the wire that ssh is sending. That’s much better!
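
One detail the log line hides: as far as I can tell, req.Payload holds the request’s type-specific fields exactly as they appear on the wire, so for exec it is an SSH string (a 4-byte big-endian length followed by the command) rather than bare text. Printing it with %s only looks right because the length bytes are unprintable. Here is a minimal standard-library sketch of pulling the command out; parseExecPayload is a made-up helper, and the ssh package’s Unmarshal function can do the same job with a struct.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// parseExecPayload extracts the command from an exec request payload,
// which is a uint32 length followed by that many bytes of command.
func parseExecPayload(payload []byte) (string, error) {
	if len(payload) < 4 {
		return "", errors.New("payload too short for length prefix")
	}
	n := binary.BigEndian.Uint32(payload[:4])
	if uint64(n) != uint64(len(payload)-4) {
		return "", errors.New("length prefix does not match payload size")
	}
	return string(payload[4:]), nil
}

func main() {
	// The bytes an exec request for "ls -l /" would carry.
	payload := append([]byte{0, 0, 0, 7}, "ls -l /"...)

	cmd, err := parseExecPayload(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("command: %q\n", cmd)
}
```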

If we re-run ssh with debug level 2 (-vv) we can even see these requests being sent, and the responses:

> ssh mike@localhost -p 2022 -vv ls -l /
... a lot of login text ...
debug2: channel 0: send open
debug1: Entering interactive session.
debug1: pledge: filesystem
debug2: channel_input_open_confirmation: channel 0: callback start
debug2: fd 6 setting TCP_NODELAY
debug2: client_session2_setup: id 0
debug1: Sending environment.
debug1: channel 0: setting env LANG = "en_GB.UTF-8"
debug2: channel 0: request env confirm 0
debug1: Sending command: ls -l /
debug2: channel 0: request exec confirm 1
debug2: channel_input_open_confirmation: channel 0: callback done
debug2: channel 0: open confirm rwindow 2097152 rmax 32768
debug2: channel_input_status_confirm: type 99 id 0
debug2: exec request accepted on channel 0
Hello SSH

Let’s pull that apart a little:

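First we can see that the client makes an env request, which our server’s default case rejects. As far as I can tell from OpenSSH’s output, the confirm 0 in the client’s log is the request’s want reply flag: the client isn’t asking for an answer, so our req.Reply(false, nil) sends nothing back over the wire: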
```
debug1: channel 0: setting env LANG = "en_GB.UTF-8"
debug2: channel 0: request env confirm 0
```

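Then we can see the command being sent. Here confirm 1 means the client does want a reply, and the exec request accepted line further down the log is our server’s req.Reply(true, nil) arriving: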

```
debug1: Sending command: ls -l /
debug2: channel 0: request exec confirm 1
```

Finally we see the “Hello SSH” string the server writes to the channel after it gets the exec request.

But the client still hangs at Hello SSH. Why? 😠

I find level 2 debug — -vv — the most useful verbosity level when debugging what’s going on here. Level 3 seems a bit much, level 1 misses useful stuff.

Why are we left hanging at `Hello SSH`?

In the end it’s quite simple. The SSH client doesn’t know that the command execution (our pretend one) is complete. So it’s sat waiting for new output.

To tell the client to stop waiting, we need to close the channel using channel.Close():

// ... code above to accept new server connection
//     and handle new channels ...
wg.Add(1)
go func(in <-chan *ssh.Request) {
	for req := range in {
		log.Printf("received req %s", req.Type)

		switch req.Type {
		case "exec":
			log.Printf(
				"req exec with payload (cmd): %s",
				string(req.Payload),
			)
			// Accept the exec request
			req.Reply(true, nil)
			// Send some pretend command output
			channel.Write([]byte("Hello SSH\r\n"))
			// Close the channel
			channel.Close() // <<<----------------
		default:
			req.Reply(false, nil)
		}
	}
	wg.Done()
}(requests)

> ssh mike@localhost -p 2022 ls -l /
Hello SSH
Connection to localhost closed.
[255]>

Turns out that while this terminates the connection, it doesn’t close it cleanly. Why? Because the ssh client is expecting an exit status for our shell (that we never really started!).

SSH client exit codes

255 is the ssh client’s generic “something went wrong” exit code.

If the ssh connection is successful, the exit code is instead the remote shell or executed command’s exit code.

Properly closing out our `exec` request

Rather than an abrupt channel closure, the SSH client expects a proper sequence of things to happen during its exec request:

1. The command is executed.
2. When the command stops, we tell the SSH client the process exited by sending the client either the process exit code or signal.
3. Finally, we can go ahead and close the channel.

For step (1), the server is pretending to execute something, and returning the pretend output Hello SSH. So we’re okay there.

However, the server is missing step (2). It must send an exit code or signal. To send an exit code, the server sends a request to the client (recall the original exec came from the client to the server; requests go both ways).

The exit code request has type exit-status and its payload is the exit status as a big endian uint32. We use the arbitrary exit code 7 to make it easier to confirm that our exit code got to the client, as we will see it as the ssh client’s exit code.

The server needs to explicitly encode the exit status integer using Go’s encoding/binary package:

buf := make([]byte, 4)
binary.BigEndian.PutUint32(buf, 7) // put the exit code into buf
channel.SendRequest("exit-status", false, buf[0:4])

After we have sent the exit-status request, we can close out the channel with channel.Close(). The client won’t complain about this now that we’ve sent the exit status.
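
The signal case mentioned above uses a different request type, exit-signal, whose payload RFC 4254 defines as a signal name without the SIG prefix, a core-dumped flag, an error message and a language tag. The sketch below builds both payloads with only the standard library so the bytes are visible. The helper names are invented, and in the server the resulting slice would be passed to channel.SendRequest as the payload.

```go
package main

import "fmt"

// appendUint32 appends v in big-endian (network) byte order.
func appendUint32(b []byte, v uint32) []byte {
	return append(b, byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
}

// appendString appends an SSH "string": a uint32 length, then the bytes.
func appendString(b []byte, s string) []byte {
	b = appendUint32(b, uint32(len(s)))
	return append(b, s...)
}

// exitStatusPayload encodes an exit code as a big-endian uint32.
func exitStatusPayload(code uint32) []byte {
	return appendUint32(nil, code)
}

// exitSignalPayload encodes the fields of an exit-signal request:
// signal name (no "SIG" prefix), core-dumped flag, error message and
// language tag (left empty here).
func exitSignalPayload(signal string, coreDumped bool, message string) []byte {
	p := appendString(nil, signal)
	if coreDumped {
		p = append(p, 1)
	} else {
		p = append(p, 0)
	}
	p = appendString(p, message)
	return appendString(p, "")
}

func main() {
	fmt.Printf("exit-status 7:    % x\n", exitStatusPayload(7))
	fmt.Printf("exit-signal KILL: % x\n", exitSignalPayload("KILL", false, ""))
}
```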

Here’s the request handling code in full:

// ... code above to accept new server connection
//     and handle new channels ...
wg.Add(1)
go func(in <-chan *ssh.Request) {
	for req := range in {
		log.Printf("received req %s", req.Type)

		switch req.Type {
		case "exec":
			log.Printf(
				"req exec with payload (cmd): %s",
				string(req.Payload),
			)
			// Accept the request
			req.Reply(true, nil)
			// Write some pretend command output
			channel.Write([]byte("Hello SSH\r\n"))
			// Send an exit code of 7
			buf := make([]byte, 4)
			binary.BigEndian.PutUint32(buf, 7)
			channel.SendRequest(
				"exit-status",
				false,
				buf[0:4],
			)
			// Finally, close the channel
			channel.Close()
		default:
			req.Reply(false, nil)
		}
	}
	wg.Done()
}(requests)

And now that works. Note the exit code reported, [7]:

> ssh mike@localhost -p 2022 ls -l /
Hello SSH
Connection to localhost closed.
[7]>

And so it works (kind of)

We’ve got a working SSH server (that does nothing!). But we can see from this the bare bones of handling SSH channels and requests, and we’ve got the standard ssh client to work with our server. From here, I feel that I could really start to dig in and get to know the protocol much better.

We can see that by defining our own channel and request types (the protocol has extensibility built in!) we could enact arbitrarily complicated protocols over the secure channel that SSH gives us. Indeed, tools like sftp and scp do exactly that to leverage the underlying security of ssh to secure their data transfers. Even further, I could imagine leveraging SSH’s secure connections to run my own protocols — such as running a protocol buffer client/server over a channel type I define.

I found learning about SSH pretty amazing, and it’s reinforced my belief that most things in computers can be understood with time. It’s deeper and more powerful than I thought, although it seems to have settled mostly into the niche it’s named after, being a secure shell.

For reference, I found the RFCs to be great, at least once I’d learned the basic connection/channel/request structure:

RFC 4254 - The Secure Shell (SSH) Connection Protocol - this defines the channels and request types we dealt with here. I found this one the most useful when writing the code above.
RFC 4251 - The Secure Shell (SSH) Protocol Architecture - this talks at a higher level, and I found its primary use to be the security properties section.
There are two other RFCs that I didn’t use much, as Go’s server handles most of the details from them: RFC 4253 - The Secure Shell (SSH) Transport Layer Protocol and RFC 4252 - The Secure Shell (SSH) Authentication Protocol.