This week marks the release of a significant refactor, part of a focussed round of efficiency improvements targeting the low-level communication interfaces used by Gateway and Host. This specific change alters the RPC (Remote Procedure Call) protocol used to distribute requests to, and receive responses from, Host. RPC uses the client-server model to define procedures which suspend the caller whilst executing on the receiver. It allows two machines to perform a single synchronous operation when connected over a network, where latency is otherwise a crippling factor.
In the last round of improvements we significantly reduced the number of RPC connections between Gateway and Host by streaming requests to the Host rather than sending them one by one. This reduced latency, but didn’t affect the way Host returned payloads to Gateway, which still required a connection per response.
In this release we removed the response method and adjusted the protocol that streams requests to the Host so that it also accepts a stream of responses back, reducing the entire process to a single stream. This is made possible by gRPC's support for bi-directional streaming, which takes advantage of the multiplexing capabilities of the HTTP/2 protocol whilst structuring payloads with Protocol Buffers, a serialization format with its own interface definition language.
In slightly less technical terms, this allows Gateway to asynchronously send unfulfilled requests to each Host whilst simultaneously receiving responses, all over a single connection. This reduces the number of open connections to the Gateway, removes the latency caused by establishing a new HTTP/2 connection, and means headers only need to be sent once.
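To make the shape of that exchange concrete, here is a minimal sketch in Go. It is not the Gateway or Host code and it does not use the gRPC API: a pair of plain Go channels stands in for the single bi-directional stream, and every name (`Stream`, `ServeHost`, `Exchange`) is hypothetical. The point is only that sending and receiving happen concurrently over one long-lived stream.

```go
package main

import "sync"

// Request and Response are hypothetical payload types for this sketch.
type Request struct{ ID int }
type Response struct {
	ID   int
	Body string
}

// Stream stands in for one bi-directional stream: requests flow one way,
// responses flow back the other, and nothing is re-established per message.
type Stream struct {
	Requests  chan Request
	Responses chan Response
}

// ServeHost plays the Host side. It handles each request in its own
// goroutine, so responses may come back in any order, and it closes the
// response side once every request has been answered.
func ServeHost(s *Stream) {
	var wg sync.WaitGroup
	for req := range s.Requests {
		wg.Add(1)
		go func(r Request) {
			defer wg.Done()
			s.Responses <- Response{ID: r.ID, Body: "ok"}
		}(req)
	}
	wg.Wait()
	close(s.Responses)
}

// Exchange plays the Gateway side. One goroutine sends requests while the
// caller's goroutine receives responses at the same time.
func Exchange(ids []int) map[int]string {
	s := &Stream{
		Requests:  make(chan Request),
		Responses: make(chan Response),
	}
	go ServeHost(s)

	go func() {
		for _, id := range ids {
			s.Requests <- Request{ID: id}
		}
		close(s.Requests) // no more requests; Host can finish up
	}()

	out := make(map[int]string)
	for resp := range s.Responses {
		out[resp.ID] = resp.Body
	}
	return out
}
```

Calling `Exchange([]int{1, 2, 3})` returns a map with one entry per request ID. In a real gRPC service the two channels would be replaced by the send and receive halves of the generated stream.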
One obstacle when moving to a fully bi-directional stream was migrating the concurrent request limiter from Host to Gateway. With a single stream it was possible for a Host to suspend the RPC method by limiting the number of asynchronous operations using channel buffering. Once the stream became bi-directional, the logic which previously provided a strict coupling between the RPC method and the Host's local request buffer became too fragile, so it was moved to Gateway.
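For readers unfamiliar with the technique, the following is a rough sketch of how a buffered channel can cap the number of concurrent operations in Go. It illustrates the general pattern only, not the original Host code, and `Limiter` and `RunAll` are hypothetical names. Once the buffer is full, `Acquire` blocks, which is what suspends the loop that is reading requests.

```go
package main

import "sync"

// Limiter caps in-flight operations with a buffered channel used as a
// counting semaphore. Hypothetical type for illustration.
type Limiter struct {
	slots chan struct{}
}

func NewLimiter(max int) *Limiter {
	return &Limiter{slots: make(chan struct{}, max)}
}

// Acquire blocks while max operations are already in flight.
func (l *Limiter) Acquire() { l.slots <- struct{}{} }

// Release frees one slot so a blocked Acquire can proceed.
func (l *Limiter) Release() { <-l.slots }

// RunAll runs `jobs` units of work with at most `max` running at once and
// returns the highest concurrency it observed.
func RunAll(jobs, max int, work func(id int)) int {
	lim := NewLimiter(max)
	var wg sync.WaitGroup
	var mu sync.Mutex
	inFlight, peak := 0, 0

	for i := 0; i < jobs; i++ {
		lim.Acquire() // the receiving loop pauses here when the buffer is full
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			defer lim.Release()

			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()

			work(id)

			mu.Lock()
			inFlight--
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	return peak
}
```

The coupling is simple when requests only flow one way: the loop that reads from the stream is the same loop that blocks on `Acquire`. With responses travelling back over the same stream, that blocking point no longer maps cleanly onto the RPC method, which is the fragility described above.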
The new logic stores the number of requests that are reserved by each Host, pausing the RPC method until the Host has satisfied enough requests to fall below the threshold. As some of you may have noticed last week, there were some teething problems with the first release. Under certain conditions, where latency from one Host causes a request to be reassigned to another, Gateway failed to deduct the request from the first Host's count, eventually causing the RPC method to lock indefinitely. After a quick refactor I can now report that the process is working as expected, with a significant positive effect on latency and a reduction in Gateway CPU usage.
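A Gateway-side counter of this kind could look something like the sketch below. This is an assumption about the general shape, not the actual implementation: `HostTracker` and its methods are hypothetical names, and the limit is a single fixed threshold for simplicity. The `Reassign` method shows the step at the heart of last week's bug: when a request moves to another Host, the first Host's count has to be deducted, otherwise it creeps up to the limit and never comes back down.

```go
package main

import "sync"

// HostTracker records how many requests each Host currently has reserved.
// Hypothetical type for illustration.
type HostTracker struct {
	mu       sync.Mutex
	cond     *sync.Cond
	limit    int
	reserved map[string]int // Host ID -> reserved request count
}

func NewHostTracker(limit int) *HostTracker {
	t := &HostTracker{limit: limit, reserved: make(map[string]int)}
	t.cond = sync.NewCond(&t.mu)
	return t
}

// Reserve waits until the Host is below the limit, then counts the request.
func (t *HostTracker) Reserve(host string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	for t.reserved[host] >= t.limit {
		t.cond.Wait()
	}
	t.reserved[host]++
}

// Release deducts one request from the Host's count, for example when a
// response arrives, and wakes anything waiting in Reserve.
func (t *HostTracker) Release(host string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.reserved[host] > 0 {
		t.reserved[host]--
	}
	t.cond.Broadcast()
}

// Reassign moves a request from one Host to another. Skipping the Release
// on the original Host would leak its count and eventually block Reserve
// for that Host forever.
func (t *HostTracker) Reassign(from, to string) {
	t.Release(from)
	t.Reserve(to)
}

// Reserved reports the current count for a Host.
func (t *HostTracker) Reserved(host string) int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.reserved[host]
}
```

With a limit of 2, reserving twice on one Host and then reassigning one of those requests leaves each Host with a count of 1, so a further `Reserve` on the first Host returns immediately rather than blocking.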