gRPC:RPC 调用上的随机取消异常

gRPC: Random CANCELLED exception on RPC calls

我在调用 gRPC 方法时偶尔会遇到取消错误。

这是我的客户端代码(使用 grpc-java 1.22.0 库):

public class MyClient {
    private static final Logger logger = LoggerFactory.getLogger(MyClient.class);
    private ManagedChannel channel; 
    private FooGrpc.FooStub fooStub;

    private final StreamObserver<Empty> responseObserver = new StreamObserver<>() {
        @Override
        public void onNext(Empty value) {
        }

        @Override
        public void onError(Throwable t) {
            logger.error("Error: ", t);
        }

        @Override
        public void onCompleted() {
        }
    };

    public MyClient() {
        this.channel = NettyChannelBuilder
            .forAddress(host, port)
            .sslContext(GrpcSslContexts.forClient().trustManager(certStream).build())
            .build();
        var pool = Executors.newCachedThreadPool(
                new ThreadFactoryBuilder().setNameFormat("foo-pool-%d").build());
        this.fooStub = FooGrpc.newStub(channel)
                .withExecutor(pool);
    }

    public void callFoo() {
        fooStub.withDeadlineAfter(500L, TimeUnit.MILLISECONDS)
                .myMethod(whatever, responseObserver);
    }
}

当我调用 callFoo() 方法时,它通常有效。客户端发送消息,服务器无问题接收。

但是这个调用偶尔会给我一个错误:

io.grpc.StatusRuntimeException: CANCELLED: io.grpc.Context was cancelled without error
        at io.grpc.Status.asRuntimeException(Status.java:533) ~[handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:442) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.CensusStatsModule$StatsClientInterceptor.onClose(CensusStatsModule.java:700) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.CensusTracingModule$TracingClientInterceptor.onClose(CensusTracingModule.java:399) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:507) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ClientCallImpl.access0(ClientCallImpl.java:66) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:627) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access0(ClientCallImpl.java:515) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ClientCallImpl$ClientStreamListenerImplStreamClosed.runInternal(ClientCallImpl.java:686) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ClientCallImpl$ClientStreamListenerImplStreamClosed.runInContext(ClientCallImpl.java:675) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) [handler-0.0.1-SNAPSHOT.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]

奇怪的是,即使调用在客户端出现错误,但服务器确实收到了大部分请求。但有时服务器会漏掉它。

它甚至不是 DEADLINE_EXCEEDED 异常,它只是抛出 CANCELLED: io.grpc.Context was cancelled without error。没有提供其他描述,所以我无法弄清楚为什么会这样。

总结:

  1. 来自客户端的 gRPC 调用 随机 给出 CANCELLED 错误。
  2. 发生错误时,服务器 有时 会收到调用,但 有时 不会。

grpc-java 支持自动截止日期和取消传播。当入站 RPC 引起出站 RPC 时,这些出站 RPC 将继承入站 RPC 的截止日期。此外,如果入站 RPC 被取消,则出站 RPC 也将被取消。

这是通过 io.grpc.Context 实现的。如果你做一个出站 RPC,你希望比入站 RPC 活得更久,你应该使用 Context.fork().

public void myRpcMethod(Request req, StreamObserver<Response> observer) {
  // ctx has all the values as the current context, but
  // won't be cancelled
  Context ctx = Context.current().fork();
  // Set ctx as the current context within the Runnable
  ctx.run(() -> {
    // Can start asynchronous work here that will not
    // be cancelled when myRpcMethod returns
  });
  observer.onNext(generateReply());
  observer.onCompleted();
}