Exchanging millions of buffers per second with plain RSockets is not useful for business applications: they need structured data rather than raw bytes. That’s why there was RSocket-RPC - a remote procedure call system on top of Protocol Buffers, with language-agnostic service definitions. It relies on code generation for performance equivalent to hand-written code.
Protocol Buffers has tiny on-wire overhead, acceptable performance, and is the native format of Grpc - important for interoperability, and for the available tooling that can be used directly or easily adapted.
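To make "language-agnostic service definitions" concrete: a service is described in an ordinary Protocol Buffers IDL file, from which stubs are generated per language. The Greeter service below is a hypothetical example used throughout the sketches in this post, not part of the project:

```proto
syntax = "proto3";

package example;

option java_package = "com.example.greeter";

// Hypothetical service used by the sketches below.
service Greeter {
  // request-response interaction
  rpc SayHello (HelloRequest) returns (HelloResponse);
  // server-stream interaction
  rpc StreamGreetings (HelloRequest) returns (stream HelloResponse);
}

message HelloRequest {
  string name = 1;
}

message HelloResponse {
  string message = 1;
}
```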
Alternative RSocket-RPC
Original RSocket/RSocket-RPC is based on the problematic RSocket/RSocket-java, and does not seem to be supported anymore (generated RPC stubs do not compile). The approach based on code generation and a single efficient data format was dropped in favour of data abstraction and runtime tricks to translate between serialized and structured messages.
For the alternative RSocket-RPC, code generation was redone to align it with jauntsdn/RSocket-reactor.
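A minimal sketch of calling such a generated stub, assuming the usual RSocket-RPC convention of a generated client wrapping a connected RSocket; the GreeterClient type, its constructor, and the com.jauntsdn.rsocket.RSocket import are illustrative assumptions, not the exact jauntsdn API:

```java
import com.example.greeter.HelloRequest;
import com.example.greeter.HelloResponse;
import com.jauntsdn.rsocket.RSocket;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

class GreeterCalls {

  // Assumed shape: GreeterClient is generated from the hypothetical
  // Greeter service above and wraps an already connected RSocket.
  static void call(RSocket rSocket) {
    GreeterClient greeter = new GreeterClient(rSocket);

    // request-response: one Protobuf request, one Protobuf response
    Mono<HelloResponse> response =
        greeter.sayHello(HelloRequest.newBuilder().setName("rsocket").build());

    // request-stream: one request, a stream of responses
    Flux<HelloResponse> greetings =
        greeter.streamGreetings(HelloRequest.newBuilder().setName("rsocket").build());

    response.subscribe(r -> System.out.println(r.getMessage()));
    greetings.subscribe(g -> System.out.println(g.getMessage()));
  }
}
```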
Let’s enumerate the outstanding differences:

- Compact encoding of both RPC calls and user-provided metadata: in the majority of cases RPC overhead is only 3-4 bytes.
- Noticeably higher performance - the result of improvements in jauntsdn/RSocket-reactor, less pressure on the memory allocator, and compact metadata encoding: 1.3 million requests per second with request-response, 3 million messages per second with request-stream, per core.
- Native compatibility with Grpc, if combined with the respective transport. There is no need for separate binary or IDL sharing schemes.
GRPC transport
For internet clients, RSocket-RPC has limited utility because existing solutions (HTTP/Grpc) are mature and already good enough. This applies to both client libraries and API gateways - haproxy, nginx, and recently even AWS API Gateway support Grpc. Some of them have features and performance that would take dedicated teams years to match.
In a data center environment, RSocket/RSocket-RPC has several advantages over Http2/Grpc, chief among them considerable performance gains.
Fortunately, RSocket streams are able to accommodate Http2 streams, which may then be routed to RSocket-RPC origins over TCP.
This way RSocket-RPC services can be efficiently accessed by existing Grpc clients - the message format is the same, so there is no penalty of cross-format data translation.
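In practice this means a stock grpc-java client calls an RSocket-RPC origin exactly as it would call any Grpc server. A sketch against the hypothetical Greeter service from above (host and port are placeholders):

```java
import com.example.greeter.GreeterGrpc;
import com.example.greeter.HelloRequest;
import com.example.greeter.HelloResponse;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

class GrpcClientExample {

  public static void main(String[] args) {
    // Plain (non-TLS) channel towards the gateway/proxy in front of
    // the RSocket-RPC origin; host and port are placeholders.
    ManagedChannel channel =
        ManagedChannelBuilder.forAddress("localhost", 8309).usePlaintext().build();

    GreeterGrpc.GreeterBlockingStub stub = GreeterGrpc.newBlockingStub(channel);

    // Request-response call: the origin consumes the same Protobuf
    // message, so no cross-format translation happens on the way.
    HelloResponse response =
        stub.sayHello(HelloRequest.newBuilder().setName("grpc").build());
    System.out.println(response.getMessage());

    channel.shutdownNow();
  }
}
```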
Scope and features:

- Server-only transport - sufficient for supporting external Grpc clients.
- Only client-initiated requests are allowed - this is required by Http2 streams semantics. Requests support request-response, server-stream, client-stream and bidi-stream interactions (see the sketch after this list).
- An API gateway/proxy is assumed between the RSocket server and internet clients - no quic/http3 stack or message compression; this implies a fast, reliable network, for simpler HTTP2/RSocket flow control coordination.
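A sketch of the bidi-stream interaction from the Grpc client side, assuming the hypothetical Greeter service is extended with a Chat method; the grpc-java StreamObserver pattern below is the standard one:

```java
import com.example.greeter.GreeterGrpc;
import com.example.greeter.HelloRequest;
import com.example.greeter.HelloResponse;
import io.grpc.stub.StreamObserver;

class BidiStreamExample {

  // Assumes the Greeter service also defines:
  //   rpc Chat (stream HelloRequest) returns (stream HelloResponse);
  static void chat(GreeterGrpc.GreeterStub stub) {
    // The client initiates the stream, as Http2 semantics require;
    // responses arrive asynchronously on the observer below.
    StreamObserver<HelloRequest> requests =
        stub.chat(
            new StreamObserver<HelloResponse>() {
              @Override
              public void onNext(HelloResponse response) {
                System.out.println("received: " + response.getMessage());
              }

              @Override
              public void onError(Throwable t) {
                t.printStackTrace();
              }

              @Override
              public void onCompleted() {
                System.out.println("stream completed");
              }
            });

    for (int i = 0; i < 10; i++) {
      requests.onNext(HelloRequest.newBuilder().setName("msg-" + i).build());
    }
    requests.onCompleted();
  }
}
```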
Like the TCP and Unix sockets transports, the GRPC transport can be used with any jauntsdn/RSocket-JVM based implementation, and was verified with RSocket-reactor and RSocket-rxjava.
It is lean dependency-wise: only io.netty:netty-codec-http2 and jauntsdn:rsocket-transport, with no Protobuf-related libraries. The compiled JAR is less than 60 KBytes in size.
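In build terms that amounts to two coordinates. A sketch of the Gradle dependencies, with versions left as placeholders (exact group/artifact ids should be verified against the project’s published releases):

```groovy
dependencies {
    // http2 codec - the only netty module the transport needs
    implementation "io.netty:netty-codec-http2:${nettyVersion}"
    // common transport API of jauntsdn/RSocket-JVM
    implementation "jauntsdn:rsocket-transport:${rsocketVersion}"
    // note: no Protobuf-related libraries are pulled in
}
```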
Performance
Performance was evaluated on a commodity box running linux 5.4.0, with EPOLL IO and non-TLS TCP transport. The JVM is OpenJDK 11.0.11.
The purpose is a single-core throughput comparison against plain RSocket with RSocket-reactor.
This test estimates the per-request (with request-response) and per-message (request-stream, channel) RPC overhead.
The message size is deliberately small (less than 10 bytes), so most of the CPU time is spent on RPC request encoding & message serialization instead of memory copy - a better gauge of RPC efficiency.
RSocket-RPC
- REQUEST-STREAM
Single connection / single core stream throughput test.
16:55:09.484 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.stream.client.Main client received messages: 3094767
16:55:10.484 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.stream.client.Main client received messages: 2997843
16:55:11.484 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.stream.client.Main client received messages: 2996896
16:55:12.484 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.stream.client.Main client received messages: 3004344
16:55:13.484 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.stream.client.Main client received messages: 3000798
16:55:14.484 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.stream.client.Main client received messages: 3002806
3M msg/sec vs the 3.5M msg/sec demonstrated by plain RSocket, or 14% lower throughput.
This small overhead is caused by Protobuf serialization only, as there is a single continuous stream per connection (for N response messages there is 1 request).
- REQUEST-CHANNEL
Single connection / single core stream throughput test.
Server
16:56:20.876 rsocket-netty-io-transport-epoll-1-2 com.jauntsdn.rsocket.rpc.examples.channel.server.Main server received messages: 1726494
16:56:21.874 rsocket-netty-io-transport-epoll-1-2 com.jauntsdn.rsocket.rpc.examples.channel.server.Main server received messages: 1734271
16:56:22.877 rsocket-netty-io-transport-epoll-1-2 com.jauntsdn.rsocket.rpc.examples.channel.server.Main server received messages: 1734271
16:56:23.875 rsocket-netty-io-transport-epoll-1-2 com.jauntsdn.rsocket.rpc.examples.channel.server.Main server received messages: 1718717
16:56:24.875 rsocket-netty-io-transport-epoll-1-2 com.jauntsdn.rsocket.rpc.examples.channel.server.Main server received messages: 1710940
16:56:25.874 rsocket-netty-io-transport-epoll-1-2 com.jauntsdn.rsocket.rpc.examples.channel.server.Main server received messages: 1726494
Client
16:56:21.804 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.channel.client.Main client received messages: 1742048
16:56:22.804 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.channel.client.Main client received messages: 1726494
16:56:23.803 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.channel.client.Main client received messages: 1718717
16:56:24.804 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.channel.client.Main client received messages: 1718717
16:56:25.803 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.channel.client.Main client received messages: 1726494
16:56:26.803 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.channel.client.Main client received messages: 1734271
With request-channel, RSocket-RPC exchanges ~3.4M msg/sec, while plain RSocket is at 4.8M msg/sec - 29% slower, due to the additional serialization/deserialization of the request stream (for N response messages there are N request messages).
- REQUEST-RESPONSE
Single connection / single core continuous requests window throughput test.
16:58:16.609 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.response.client.Main client received messages: 1329611
16:58:17.607 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.response.client.Main client received messages: 1320154
16:58:18.608 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.response.client.Main client received messages: 1321358
16:58:19.609 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.response.client.Main client received messages: 1317221
16:58:20.607 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.response.client.Main client received messages: 1321927
16:58:21.607 rsocket-netty-io-transport-epoll-1-1 com.jauntsdn.rsocket.rpc.examples.response.client.Main client received messages: 1329611
RSocket-RPC demonstrates ~1.3M msg (requests)/sec, or 38% lower throughput compared to plain RSocket (2.1M requests/sec).
The substantial difference is due to the additional overhead of encoding request metadata for each outbound message under the request-response model (for N responses there are N requests).
RSocket-GRPC
The test consists of an RSocket-RPC server with the RSocket-GRPC transport, and a Grpc client based on grpc-java 1.37.0 over Netty, under OpenJDK 11.0.11.
Services and messages are identical to the ones from the RSocket-RPC test.
Each result is compared with a grpc-java-only setup.
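For reference, a sketch of how such per-second figures can be produced on the client side with plain grpc-java: count messages in the stream observer, and log-and-reset the counter once per second. It reuses the hypothetical Greeter service from above (names are illustrative):

```java
import com.example.greeter.GreeterGrpc;
import com.example.greeter.HelloRequest;
import com.example.greeter.HelloResponse;
import io.grpc.stub.StreamObserver;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class StreamThroughputExample {

  static void measure(GreeterGrpc.GreeterStub stub) {
    AtomicLong received = new AtomicLong();

    // log and reset the per-second counter
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleAtFixedRate(
        () -> System.out.println("Client received messages: " + received.getAndSet(0)),
        1, 1, TimeUnit.SECONDS);

    // single server-stream; each received message bumps the counter
    stub.streamGreetings(
        HelloRequest.newBuilder().setName("stream").build(),
        new StreamObserver<HelloResponse>() {
          @Override
          public void onNext(HelloResponse response) {
            received.incrementAndGet();
          }

          @Override
          public void onError(Throwable t) {
            t.printStackTrace();
          }

          @Override
          public void onCompleted() {
            scheduler.shutdown();
          }
        });
  }
}
```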
- SERVER-STREAM
Single stream throughput test.
17:52:03.752 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.stream.Main Client received messages: 1428347
17:52:04.752 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.stream.Main Client received messages: 1454406
17:52:05.752 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.stream.Main Client received messages: 1568482
17:52:06.752 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.stream.Main Client received messages: 1480556
17:52:07.752 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.stream.Main Client received messages: 1386296
17:52:08.752 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.stream.Main Client received messages: 1492142
Server-stream is 50% slower compared to RSocket-RPC over TCP: the reasons are the chattiness of Http2, the overhead of grpc-java on the client side, and the netty http2 codec on the server side.
On the other hand, the numbers correspond to the best case of the grpc-java-only setup, which fluctuates distinctly more, in the 0.8-1.5M msg/sec range.
- BIDI-STREAM
Single stream throughput test.
17:53:04.856 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.bidistream.Main ==> Bidi stream sent messages: 709548, received messages: 458496
17:53:05.856 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.bidistream.Main ==> Bidi stream sent messages: 790257, received messages: 291072
17:53:06.856 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.bidistream.Main ==> Bidi stream sent messages: 828460, received messages: 276224
17:53:07.856 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.bidistream.Main ==> Bidi stream sent messages: 831259, received messages: 275083
17:53:08.856 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.bidistream.Main ==> Bidi stream sent messages: 600732, received messages: 818881
17:53:09.856 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.bidistream.Main ==> Bidi stream sent messages: 664493, received messages: 656692
17:53:10.856 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.bidistream.Main ==> Bidi stream sent messages: 850823, received messages: 280517
Exchange is around 1.2M msg/sec, comparable to the grpc-java-only setup.
- REQUEST-RESPONSE
Continuous requests window throughput test.
10:52:47.579 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.response.Main Client received messages: 101663
10:52:48.579 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.response.Main Client received messages: 103388
10:52:49.579 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.response.Main Client received messages: 101563
10:52:50.579 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.response.Main Client received messages: 101165
10:52:51.579 pool-2-thread-1 com.jauntsdn.rsocket.showcase.grpc.response.Main Client received messages: 101255
The result is around 100K msg (requests)/sec, the same as the grpc-java-only setup.
The numbers do not look impressive compared to the order-of-magnitude faster RSocket-RPC counterpart (1.3M requests/sec), but this is what the current Http2/Grpc-java/Netty stack is able to offer.