Zuul retry configuration is not working with Eureka

Question

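I have a scenario with a Spring Boot Zuul application as the external gateway and Eureka for service discovery, all running in Kubernetes.

I want to guarantee the availability of my service, so when one of its instances goes down, I expect Zuul to retry the call against one of the other instances registered in Eureka.

I tried following this post by Ryan Baxter, and I also tried the tips from here.

No matter what I do, Zuul does not seem to retry the call. When I delete one of my instances, Zuul keeps returning a timeout for that instance until the Eureka registry synchronizes.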

My application.yaml looks like this:

spring:
  cloud:
    loadbalancer:
      retry:
        enabled: true

zuul:
  stripPrefix: true
  ignoredServices: '*'
  routes:
    my-service:
      path: /my-service/**
      serviceId: my-service-api
  retryable: true

my-service:
  ribbon:
    maxAutoRetries: 3
    MaxAutoRetriesNextServer: 3
    OkToRetryOnAllOperations: true
    ReadTimeout: 5000
    ConnectTimeout: 3000

My services are using Camden SR7 (I have also tried SR6):

"org.springframework.cloud:spring-cloud-dependencies:Camden.SR7"

And Spring Retry:

org.springframework.retry:spring-retry:1.1.5.RELEASE

My application class looks like this:

@SpringBootApplication
@EnableEurekaClient
@EnableZuulProxy
@EnableRetry
public class MyZuulApplication

Edit:

Making the call through Postman returns:

{
    "timestamp": 1497959364819,
    "status": 500,
    "error": "Internal Server Error",
    "exception": "com.netflix.zuul.exception.ZuulException",
    "message": "TIMEOUT"
}

Looking at the Zuul logs, it printed:

{"level":"WARN","logger_name":"org.springframework.cloud.netflix.zuul.filters.post.SendErrorFilter","appName":...,"message":"Error during filtering","stack_trace":"com.netflix.zuul.exception.ZuulException: Forwarding error [... Stack Trace ...] Caused by: com.netflix.hystrix.exception.HystrixRuntimeException: my-service-api timed-out and no fallback available [... Stack Trace ...] Caused by: java.util.concurrent.TimeoutException: null

Another interesting log I found:

{"level":"INFO" [...] current list of Servers=[ip_address1:port, ip_address2:port, ip_address3:port],Load balancer stats=Zone stats: {defaultzone=[Zone:[ ... ];    Instance count:3;   Active connections count: 0;    Circuit breaker tripped count: 0;   Active connections per server: 0.0;]
},Server stats: [[Server:ip_address1:port;  [ ... ] Total Requests:0;   Successive connection failure:0;    Total blackout seconds:0;   [ ... ]
, [Server:ip_address2:port; [ ... ] Total Requests:0;   Successive connection failure:0;    Total blackout seconds:0;   [ ... ]
, [Server:ip_address3:port; [ ... ] Total Requests:0;   Successive connection failure:0;    Total blackout seconds:0;   [ ... ]

Answer 1

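The problem seems to be caused by the Hystrix timeout. The default timeout of a HystrixCommand is 1000 ms, which is not enough time for Ribbon to retry the HTTP request. Try increasing the Hystrix timeout as shown below.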

hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 20000

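This raises the timeout of the whole Hystrix command to 20 seconds. If it works, adjust the value for your environment. You are using fairly large read and connect timeouts, so you may need to tune those values together with the Hystrix timeout.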
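To see how the Ribbon settings and the Hystrix timeout relate, here is a rough back-of-the-envelope sketch. It assumes that Ribbon makes at most (MaxAutoRetries + 1) attempts on each server, on up to (MaxAutoRetriesNextServer + 1) servers, and that every attempt runs into the full connect plus read timeout. The class name `RetryBudget` and the method `worstCaseMillis` are hypothetical helpers for illustration, not part of Zuul, Ribbon, or Hystrix.

```java
public class RetryBudget {

    // Upper bound, in milliseconds, on the time Ribbon may spend on one request.
    // Assumption: each attempt can take up to connectTimeout + readTimeout, and
    // Ribbon makes (maxAutoRetries + 1) attempts per server on up to
    // (maxAutoRetriesNextServer + 1) servers.
    public static long worstCaseMillis(long readTimeout, long connectTimeout,
                                       int maxAutoRetries, int maxAutoRetriesNextServer) {
        long perAttempt = readTimeout + connectTimeout;
        long attempts = (long) (maxAutoRetries + 1) * (maxAutoRetriesNextServer + 1);
        return perAttempt * attempts;
    }

    public static void main(String[] args) {
        // Values taken from the ribbon section of the question's application.yaml.
        long budget = worstCaseMillis(5000, 3000, 3, 3);
        System.out.println("Worst-case Ribbon time: " + budget + " ms");
    }
}
```

With the values from the question this gives 128000 ms, so under these assumptions a 20000 ms Hystrix timeout leaves room for only about two full-length attempts. Lowering ReadTimeout, ConnectTimeout, or the retry counts brings the worst case closer to a Hystrix timeout you are willing to accept.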
