Alternative WebSockets for netty/java: doubling throughput of small messages

January 9, 2023
netty websocket java

TL;DR

This post is a brief presentation of netty-websocket-http1, an alternative netty/java implementation of RFC 6455, the WebSocket protocol.

Its advantage is a significant per-core throughput improvement (1.8–2x) for small frames compared to netty’s out-of-the-box websocket codecs, with minimal heap allocations on the frame path. The library may also be combined with netty-websocket-http2.

Its purpose is to serve as the basis for a high-performance RPC transport of small binary messages (protocol buffers), mainly for cross-datacenter communication over the internet, with both http1 and http2.

A preliminary performance evaluation of netty’s out-of-the-box codec showed only ~1M 120-byte messages per core over a non-TLS connection, which is very modest for such a simple wire format, judging from experience with other protocols.

Additionally, there are unnecessary per-frame allocations: netty’s codec expects binary payloads wrapped as WebSocketFrame messages (which are likely not useful for the user’s application), and it allocates an array per frame for payload masking (the latter was recently improved).

use case & scope

FrameFactory / Callbacks API

1. WebSocketFrameFactory creates outbound frames as plain byte buffers, which reduces pressure on the memory allocator and avoids either two tiny buffers (“header” plus payload) or redundant memory copies for each frame.

It is the library user’s responsibility to mask the outbound frame once the payload is written: ByteBuf WebSocketFrameFactory.mask(ByteBuf)

public interface WebSocketFrameFactory {

  ByteBuf createBinaryFrame(ByteBufAllocator allocator, int binaryDataSize);
  
  // create*Frame are omitted for control frames, created in similar fashion

  ByteBuf mask(ByteBuf frame);
}
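To make the single-buffer idea concrete, here is a minimal sketch of RFC 6455 framing written against plain java.nio.ByteBuffer rather than netty's ByteBuf. It is not the library's implementation: the class FrameSketch, its method names and the fixed mask key are hypothetical, chosen only to mirror the createBinaryFrame / mask split described above. Header and payload share one buffer, and masking happens in place after the payload is written.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of RFC 6455 client framing; not the library's code.
final class FrameSketch {
  static final int OPCODE_BINARY = 0x2;

  /** Size of a masked frame header: 2 bytes + extended length + 4-byte mask key. */
  static int headerSize(int payloadSize) {
    if (payloadSize <= 125) return 2 + 4;
    if (payloadSize <= 0xFFFF) return 2 + 2 + 4;
    return 2 + 8 + 4;
  }

  /** Allocates ONE buffer for header + payload; position is left at the payload start. */
  static ByteBuffer createBinaryFrame(int payloadSize, int maskKey) {
    ByteBuffer frame = ByteBuffer.allocate(headerSize(payloadSize) + payloadSize);
    frame.put((byte) (0x80 | OPCODE_BINARY)); // FIN=1, RSV=0, opcode=binary
    if (payloadSize <= 125) {
      frame.put((byte) (0x80 | payloadSize)); // MASK=1, 7-bit length
    } else if (payloadSize <= 0xFFFF) {
      frame.put((byte) (0x80 | 126)).putShort((short) payloadSize); // 16-bit length
    } else {
      frame.put((byte) (0x80 | 127)).putLong(payloadSize); // 64-bit length
    }
    frame.putInt(maskKey); // mask key sits right before the payload
    return frame;
  }

  /** XORs the written payload in place with the mask key, then flips the buffer for reading. */
  static ByteBuffer mask(ByteBuffer frame) {
    int lengthCode = frame.get(1) & 0x7F;
    int payloadStart = lengthCode == 126 ? 8 : lengthCode == 127 ? 14 : 6;
    int end = frame.position(); // payload was written up to here
    for (int i = payloadStart; i < end; i++) {
      byte keyByte = frame.get(payloadStart - 4 + ((i - payloadStart) & 3));
      frame.put(i, (byte) (frame.get(i) ^ keyByte));
    }
    frame.flip();
    return frame;
  }

  public static void main(String[] args) {
    byte[] payload = {1, 2, 3, 4, 5};
    // a real client would use a fresh random mask key for every frame
    ByteBuffer frame = createBinaryFrame(payload.length, 0x11223344);
    frame.put(payload); // write payload directly after the header
    mask(frame);        // no extra array or second buffer is needed
    System.out.println("frame bytes: " + frame.remaining());
  }
}
```

The point of the sketch is the allocation pattern: one buffer per frame, no wrapper object around the payload, and no temporary array for masking.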

2. WebSocketFrameListener receives inbound frames

public interface WebSocketFrameListener {

  void onChannelRead(ChannelHandlerContext ctx, boolean finalFragment,
                     int rsv, int opcode, ByteBuf payload);
   
  // netty handler callbacks are omitted for brevity

  // lifecycle
  default void onOpen(ChannelHandlerContext ctx) {}

  default void onClose(ChannelHandlerContext ctx) {}
}
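The finalFragment, rsv and opcode arguments correspond to fields of the first frame header byte in RFC 6455. Below is a small sketch of how those fields can be extracted; the class FrameHeaderSketch and its helpers are hypothetical, and treating rsv as the 3-bit RSV1..RSV3 value is an assumption about how the library reports it.

```java
// Hypothetical helpers showing RFC 6455 header bit layout; not the library's decoder.
final class FrameHeaderSketch {
  /** FIN bit: set on the last (or only) fragment of a message. */
  static boolean finalFragment(int b0) { return (b0 & 0x80) != 0; }

  /** RSV1..RSV3 as a 3-bit value; zero unless an extension is negotiated. */
  static int rsv(int b0) { return (b0 & 0x70) >> 4; }

  /** Opcode: 0x0 continuation, 0x1 text, 0x2 binary, 0x8 close, 0x9 ping, 0xA pong. */
  static int opcode(int b0) { return b0 & 0x0F; }

  /** MASK bit of the second byte: set on client-to-server frames. */
  static boolean masked(int b1) { return (b1 & 0x80) != 0; }

  /** 7-bit length code: 0..125 is the length itself, 126/127 announce an extended length. */
  static int lengthCode(int b1) { return b1 & 0x7F; }

  public static void main(String[] args) {
    int b0 = 0x82; // FIN=1, RSV=0, opcode=binary
    int b1 = 0x05; // MASK=0, length=5
    System.out.println(finalFragment(b0) + " " + rsv(b0) + " "
        + opcode(b0) + " " + masked(b1) + " " + lengthCode(b1));
    // prints "true 0 2 false 5"
  }
}
```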

3. WebSocketCallbacksHandler exchanges a WebSocketFrameListener for a WebSocketFrameFactory on successful WebSocket handshake

public interface WebSocketCallbacksHandler {

  WebSocketFrameListener exchange(
      ChannelHandlerContext ctx, WebSocketFrameFactory webSocketFrameFactory);
}

4. Similar to Netty, this library has WebSocketClientProtocolHandler & WebSocketServerProtocolHandler for end users.

These handlers are responsible for the whole WebSocket http handshake process, up until the WebSocketCallbacksHandler exchange on successful handshake completion.

It is common for a WebSocketCallbacksHandler to also implement WebSocketFrameListener, so users typically have a single handler class:

class FrameHandler implements WebSocketCallbacksHandler, 
                              WebSocketFrameListener {

  WebSocketFrameFactory webSocketFrameFactory;

  @Override
  public WebSocketFrameListener exchange(
        ChannelHandlerContext ctx, 
        WebSocketFrameFactory webSocketFrameFactory) {
    // keep the factory for creating outbound frames later
    this.webSocketFrameFactory = webSocketFrameFactory;
    // this handler is also the listener for inbound frames
    return this;
  }

  @Override
  public void onChannelRead(ChannelHandlerContext ctx, 
      boolean finalFragment, int rsv, int opcode, ByteBuf payload) {
    // read inbound frames, write outbound frames with webSocketFrameFactory
  }
}

The performance test module serves as a good API showcase for both client and server.

performance

Below is a per-core throughput comparison with netty’s out-of-the-box WebSocket handlers: non-masked frames with 8, 64, 125, 1000 bytes of randomized payload over encrypted and non-encrypted connections.

websocket-over-http2

One drawback of websocket-over-http2 support with OOTB netty codecs is that it is much slower than either http2 or WebSocket alone (two protocols to decode from the byte stream).

This library helps ease the problem, as it may be combined with jauntsdn/websocket-http2 using the http1 codec API for a comparable benefit. With frames of 8, 125, 1000 bytes of randomized payload over an encrypted connection, results are as follows:
