A TCP connection is a bi-directional data pipe between a client and a server. After the initial 3-way handshake of SYN, SYN-ACK, and ACK, both sides can send data arbitrarily to the other: first slowly during the slow-start phase, then ramping up to a rate the network in between can sustain, as governed by the sender's congestion window (cwnd) and the receiver's advertised receive window (rwnd).
The exchange of data is in the form of messages. In most cases, this follows a request/response model, whereby the client sends a request message and the server replies with a response message. In some circumstances, the server can also send unsolicited messages.
How these messages are defined and exchanged forms the communication protocol. In some protocols, messages are sequential: the client has to wait for the server's response before sending the next request. This suffers from high latency, so some protocols allow pipelining or multiplexing, whereby multiple request messages can be sent without waiting for responses. Depending on the protocol, the server may respond in order (and suffer from head-of-line blocking), or it may respond in any order.
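As a sketch of the difference, consider a toy newline-terminated protocol (not any real one) over a local socket pair: the client pipelines two requests before reading either response, and the server replies strictly in order, so the second response queues behind the first.

```python
import socket
import threading

def serve(conn):
    """Toy server: read newline-terminated requests, reply strictly in order."""
    f = conn.makefile("rwb")
    for line in f:
        req = line.strip().decode()
        f.write(f"echo:{req}\n".encode())
        f.flush()

client, server = socket.socketpair()
threading.Thread(target=serve, args=(server,), daemon=True).start()

cf = client.makefile("rwb")
# Pipelining: send both requests up front, without waiting for responses.
cf.write(b"one\n")
cf.write(b"two\n")
cf.flush()
# Responses arrive in order: "two" queues behind "one" (head-of-line blocking).
r1 = cf.readline()
r2 = cf.readline()
print(r1, r2)  # b'echo:one\n' b'echo:two\n'
```

A multiplexing protocol would instead tag each message with an identifier so responses could be matched to requests in any order.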
In simpler protocols that cannot handle multiple concurrent message exchanges over a single TCP connection, the application may open a pool of parallel connections to accelerate the message exchange. Messages are queued and distributed across this pool of connections. However, every additional connection comes with a memory cost, and the number of concurrent connections should be kept low to minimise resource usage.
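A minimal sketch of such a pool, assuming only a generic connect() factory; the names here are illustrative rather than from any particular library:

```python
import queue

class ConnectionPool:
    """Keep at most max_size idle connections; reuse them instead of reconnecting."""
    def __init__(self, connect, max_size=4):
        self._connect = connect          # factory that opens a new connection
        self._idle = queue.Queue(max_size)

    def acquire(self):
        try:
            return self._idle.get_nowait()  # reuse an idle connection if available
        except queue.Empty:
            return self._connect()          # otherwise pay the cost of a new one

    def release(self, conn):
        try:
            self._idle.put_nowait(conn)     # keep alive for later reuse
        except queue.Full:
            conn.close()                    # pool is full; drop the extra connection

# Illustrative usage with a counting stand-in for a real connection
opened = []
class FakeConn:
    def __init__(self): opened.append(self)
    def close(self): pass

pool = ConnectionPool(FakeConn, max_size=2)
c1 = pool.acquire()
pool.release(c1)
c2 = pool.acquire()   # reuses c1; no second connection is opened
print(len(opened))    # 1
```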
If more messages are expected to be sent in the future, whether over a single connection or a pool, the connection(s) can be kept alive while idle, ready for reuse without the penalty of the initial handshaking and the corresponding TCP slow start.
To help understand this process, consider netcat, which can send and receive raw TCP payloads. The first command below sets up the server side: it listens on a port, stores the incoming request message to a file, and replies with the contents of a response message file. The second command is the corresponding client. (Flag spellings vary between netcat variants; the long options here are GNU netcat's.)
# Server: reply with the canned response, save the incoming request
cat $RESPONSE_MESSAGE | nc --listen --local-port=$PORT > $REQUEST_MESSAGE
# Client: send the request, save the server's response
cat $REQUEST_MESSAGE | nc $SERVER_NAME $PORT > $RESPONSE_MESSAGE
A TCP channel is bidirectional, and the shape and order of the messages exchanged are what make up the protocol. Let's have a look at some common TCP protocols used in building web applications.
Web browsers use HTTP (HyperText Transfer Protocol) over TCP port 80 (or 443 with TLS) to communicate with servers in a request/response model. In HTTP/1.1, message headers are uncompressed and multiplexing is not possible, so older browsers resorted to several parallel connections or pipelining for performance. Newer browsers can use HTTP/2, which compresses headers and supports multiplexing, significantly improving performance over a single TCP connection.
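HTTP/1.1 keep-alive can be demonstrated with Python's standard library alone; this sketch stands up a throwaway local server and reuses one TCP connection for two sequential exchanges (the handler and port are purely illustrative):

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Hello(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # enables keep-alive responses
    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):   # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Hello)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# One TCP connection carries several sequential request/response exchanges.
conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
for _ in range(2):
    conn.request("GET", "/")
    resp = conn.getresponse()
    data = resp.read()   # must drain the body before the next request
print(resp.status, data)  # 200 b'hello'
conn.close()
server.shutdown()
```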
Different media types can be sent over HTTP in both directions by specifying the Content-Type header in the message format. A special Content-Type: text/event-stream response enables the server to begin a Server-Sent Events session, drip-feeding the client blocks of text separated by a pair of newlines for an arbitrarily long time. To keep the mostly idle connection from being disconnected, a comment line, beginning with a colon (:), can be sent periodically.
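A sketch of that wire format: each event is a block of field lines separated from the next by a blank line, and comment lines beginning with a colon are ignored by clients, which makes them handy keep-alives. The helper names here are illustrative:

```python
def format_event(data, event=None):
    """Serialise one Server-Sent Event: field lines, then a blank line."""
    lines = ([f"event: {event}"] if event else [])
    lines += [f"data: {line}" for line in data.splitlines()]
    return "\n".join(lines) + "\n\n"

KEEPALIVE = ": keep-alive\n\n"   # comment line, skipped by clients

stream = format_event("hello") + KEEPALIVE + format_event("world", event="greet")

def parse_data(stream):
    """Collect the data payloads of each event, ignoring comments."""
    out = []
    for block in stream.strip().split("\n\n"):
        data = [line.split(":", 1)[1].strip()
                for line in block.split("\n") if line.startswith("data:")]
        if data:
            out.append("\n".join(data))
    return out

print(parse_data(stream))  # ['hello', 'world']
```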
An HTTP/1.1 connection can also be upgraded to WebSockets for full two-way messaging over the underlying bi-directional TCP connection.
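The upgrade handshake is simple to make concrete: per RFC 6455, the server proves it understood the WebSocket request by concatenating the client's Sec-WebSocket-Key with a fixed GUID, hashing with SHA-1, and returning the Base64-encoded digest in the Sec-WebSocket-Accept header:

```python
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed by RFC 6455

def websocket_accept(key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for an Upgrade handshake."""
    digest = hashlib.sha1((key + WS_GUID).encode()).digest()
    return base64.b64encode(digest).decode()

# The example key/accept pair from RFC 6455 itself
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```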
HTTP can be easily chained. A client may connect to an application server through intermediate caching servers, load-balancers, and so on. Each intermediate node establishes its own HTTP connections to its immediate upstream server, and may thus use its own HTTP version. Each also sets up its own TCP connection, so connection pooling and keep-alive techniques should be applied throughout to ensure high performance.
HTTP is general enough that it can be used to layer special-purpose protocols on top of it, effectively using HTTP as a substrate. Indeed, when building applications using the micro-services architecture, a common protocol for server-to-server communication is HTTP. This saves having to design a custom TCP protocol; tunnelling through HTTP is often easier, though possibly costlier. Further, debugging is easier as there are already plenty of tools that understand HTTP. An example of this is GraphQL.
ElasticSearch follows a request/response model, with watcher actions to provide subscriptions. It therefore makes sense to offer a REST API over an HTTP/1.1 substrate on TCP port 9200. Payloads are exchanged as Content-Type: application/json, so any HTTP client can be used. In general though, direct use of the REST API is discouraged; users should consider one of the officially supported clients, which handle connection pooling and keep-alive considerations, among others.
Redis also follows a request/response model, and additionally supports multiplexing and subscriptions for server push. It does this using RESP (REdis Serialization Protocol) over TCP port 6379. The message format is human-readable, fast to parse, and simple to implement. Messages start with one of +, -, :, $ or * depending on the data type, and the parts of a message are terminated with \r\n. Many clients in different programming languages are officially supported.
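A sketch of RESP framing, assuming simple ASCII arguments: a command is sent as an array (*) of bulk strings ($), while the simplest replies are one-line simple strings (+), errors (-), or integers (:):

```python
def encode_command(*args):
    """Encode a Redis command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n"]
    for arg in args:
        parts.append(f"${len(arg)}\r\n{arg}\r\n")
    return "".join(parts).encode()

def decode_reply(data):
    """Decode the simplest one-line RESP replies: +simple, -error, :integer."""
    kind, body = chr(data[0]), data[1:-2].decode()  # strip the trailing \r\n
    if kind == "+": return body
    if kind == "-": raise ValueError(body)
    if kind == ":": return int(body)
    raise NotImplementedError(kind)

print(encode_command("SET", "key", "hi"))
# b'*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$2\r\nhi\r\n'
print(decode_reply(b"+OK\r\n"), decode_reply(b":42\r\n"))  # OK 42
```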
Like Redis, Postgres primarily follows a request/response model, with multiplexing and subscriptions. Postgres uses its own frontend/backend protocol over TCP port 5432. The first byte of a message indicates the message type, followed by 4 bytes giving the message length (a count that includes the length field itself, but not the type byte), and then the message contents. To avoid losing synchronization, both servers and clients typically buffer the entire message before processing its contents.
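A sketch of just that framing layer (not the full protocol), using the Query message type 'Q', which carries a NUL-terminated SQL string, as the example:

```python
import struct

def frame(msg_type: bytes, payload: bytes) -> bytes:
    """Frame a message: 1-byte type, 4-byte big-endian self-inclusive length, payload."""
    return msg_type + struct.pack("!I", 4 + len(payload)) + payload

def unframe(buf: bytes):
    """Parse one message off the front of the buffer; return (type, payload, rest)."""
    msg_type = buf[0:1]
    (length,) = struct.unpack("!I", buf[1:5])
    payload = buf[5:1 + length]       # length counts itself, so payload = length - 4
    return msg_type, payload, buf[1 + length:]

wire = frame(b"Q", b"SELECT 1\x00")
t, p, rest = unframe(wire)
print(t, p)  # b'Q' b'SELECT 1\x00'
```

Buffering the full `length` bytes before acting on the payload is what keeps both sides in sync, as the text above describes.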
Most CRUD APIs are thin wrappers around databases. PostgREST allows one to write microservices in SQL DDL by providing an automatic HTTP wrapper around Postgres based on the database schema itself, without having to write any custom code. With this perspective, PostgREST is a protocol translator.
Regardless of protocol, TCP connections are the foundation of reliable communication. Optimising performance for these protocols follows the same principles: reuse the same connection to avoid the handshake and slow start, and multiplex requests in parallel to improve throughput. This is especially true in serverless architectures, where the runtime may be lost from one invocation to another; care must be taken to preserve the connection pool for warm lambdas, and only set up new connections for cold lambdas.
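A minimal sketch of that pattern, with a stand-in object in place of a real client: module-level state survives between warm invocations of the same runtime, so the client and its connection pool are created only on a cold start. All names here are illustrative.

```python
# Module scope persists across warm invocations of the same runtime instance.
_client = None
created = 0   # counter, for illustration only

def get_client():
    global _client, created
    if _client is None:        # cold start: pay the connection cost once
        created += 1
        _client = object()     # stand-in for a real pooled client
    return _client             # warm start: reuse the existing client/pool

def handler(event):
    client = get_client()
    return id(client)

# Two "invocations" in the same warm runtime share one client
a, b = handler({}), handler({})
print(a == b, created)  # True 1
```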