每个其他请求的 Kubernetes Traefik 内部服务器错误
Kubernetes Traefik internal server error on every other request
所以我使用的是 traefik 2.2,我 运行 一个带有单节点主节点的裸机 kubernetes 集群。我没有物理或虚拟负载均衡器,因此 traefik pod 接收端口 80 和 443 上的所有请求。我有一个安装了 helm 的示例 wordpress。正如您在这里看到的,其他所有请求都是 500 错误。 http://wp-example.cryptexlabs.com/feed/。我可以确认 500 错误的请求永远不会到达 wordpress 容器,所以我知道这与 traefik 有关。在 traefik 日志中它只显示有 500 错误。所以我在 traefik 命名空间中有 1 个 pod,在默认服务中有一个服务,在默认命名空间中有一个外部名称服务指向示例 wordpress 站点,其中有一个 wp-example 命名空间。
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: traefik
chart: traefik-0.2.0
heritage: Tiller
release: traefik
name: traefik
namespace: traefik
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: traefik
release: traefik
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: traefik
chart: traefik-0.2.0
heritage: Tiller
release: traefik
spec:
containers:
- args:
- --api.insecure
- --accesslog
- --entrypoints.web.Address=:80
- --entrypoints.websecure.Address=:443
- --providers.kubernetescrd
- --certificatesresolvers.default.acme.tlschallenge
- --certificatesresolvers.default.acme.email=foo@you.com
- --certificatesresolvers.default.acme.storage=acme.json
- --certificatesresolvers.default.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
image: traefik:2.2
imagePullPolicy: IfNotPresent
name: traefik
ports:
- containerPort: 80
hostPort: 80
name: web
protocol: TCP
- containerPort: 443
hostPort: 443
name: websecure
protocol: TCP
- containerPort: 8088
name: admin
protocol: TCP
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: traefik-service-account
serviceAccountName: traefik-service-account
terminationGracePeriodSeconds: 60
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
name: wp-example.cryptexlabs.com
namespace: wp-example
spec:
entryPoints:
- web
routes:
- kind: Rule
match: Host(`wp-example.cryptexlabs.com`)
services:
- name: wp-example
port: 80
- name: wp-example
port: 443
---
apiVersion: v1
kind: Service
metadata:
labels:
app.kubernetes.io/instance: wp-example
app.kubernetes.io/managed-by: Tiller
app.kubernetes.io/name: wordpress
helm.sh/chart: wordpress-9.3.14
name: wp-example-wordpress
namespace: wp-example
spec:
clusterIP: 10.101.142.74
externalTrafficPolicy: Cluster
ports:
- name: http
nodePort: 31862
port: 80
protocol: TCP
targetPort: http
- name: https
nodePort: 32473
port: 443
protocol: TCP
targetPort: https
selector:
app.kubernetes.io/instance: wp-example
app.kubernetes.io/name: wordpress
sessionAffinity: None
type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app.kubernetes.io/instance: wp-example
app.kubernetes.io/managed-by: Tiller
app.kubernetes.io/name: wordpress
helm.sh/chart: wordpress-9.3.14
name: wp-example-wordpress
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app.kubernetes.io/instance: wp-example
app.kubernetes.io/name: wordpress
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app.kubernetes.io/instance: wp-example
app.kubernetes.io/managed-by: Tiller
app.kubernetes.io/name: wordpress
helm.sh/chart: wordpress-9.3.14
spec:
containers:
- env:
- name: ALLOW_EMPTY_PASSWORD
value: "yes"
- name: MARIADB_HOST
value: wp-example-mariadb
- name: MARIADB_PORT_NUMBER
value: "3306"
- name: WORDPRESS_DATABASE_NAME
value: bitnami_wordpress
- name: WORDPRESS_DATABASE_USER
value: bn_wordpress
- name: WORDPRESS_DATABASE_PASSWORD
valueFrom:
secretKeyRef:
key: mariadb-password
name: wp-example-mariadb
- name: WORDPRESS_USERNAME
value: user
- name: WORDPRESS_PASSWORD
valueFrom:
secretKeyRef:
key: wordpress-password
name: wp-example-wordpress
- name: WORDPRESS_EMAIL
value: user@example.com
- name: WORDPRESS_FIRST_NAME
value: FirstName
- name: WORDPRESS_LAST_NAME
value: LastName
- name: WORDPRESS_HTACCESS_OVERRIDE_NONE
value: "no"
- name: WORDPRESS_HTACCESS_PERSISTENCE_ENABLED
value: "no"
- name: WORDPRESS_BLOG_NAME
value: "User's Blog!"
- name: WORDPRESS_SKIP_INSTALL
value: "no"
- name: WORDPRESS_TABLE_PREFIX
value: wp_
- name: WORDPRESS_SCHEME
value: http
image: docker.io/bitnami/wordpress:5.4.2-debian-10-r6
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 6
httpGet:
path: /wp-login.php
port: http
scheme: HTTP
initialDelaySeconds: 120
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
name: wordpress
ports:
- containerPort: 8080
name: http
protocol: TCP
- containerPort: 8443
name: https
protocol: TCP
readinessProbe:
failureThreshold: 6
httpGet:
path: /wp-login.php
port: http
scheme: HTTP
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
resources:
requests:
cpu: 300m
memory: 512Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /bitnami/wordpress
name: wordpress-data
subPath: wordpress
dnsPolicy: ClusterFirst
hostAliases:
- hostnames:
- status.localhost
ip: 127.0.0.1
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 1001
runAsUser: 1001
terminationGracePeriodSeconds: 30
volumes:
- name: wordpress-data
persistentVolumeClaim:
claimName: wp-example-wordpress
kubectl describe svc wp-example-wordpress -n wp-example
的输出
Name: wp-example-wordpress
Namespace: wp-example
Labels: app.kubernetes.io/instance=wp-example
app.kubernetes.io/managed-by=Tiller
app.kubernetes.io/name=wordpress
helm.sh/chart=wordpress-9.3.14
Annotations: <none>
Selector: app.kubernetes.io/instance=wp-example,app.kubernetes.io/name=wordpress
Type: LoadBalancer
IP: 10.101.142.74
Port: http 80/TCP
TargetPort: http/TCP
NodePort: http 31862/TCP
Endpoints: 10.32.0.17:8080
Port: https 443/TCP
TargetPort: https/TCP
NodePort: https 32473/TCP
Endpoints: 10.32.0.17:8443
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
josh@Joshs-MacBook-Pro-2:$ ab -n 10000 -c 10 http://wp-example.cryptexlabs.com/
This is ApacheBench, Version 2.3 <$Revision: 1874286 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking wp-example.cryptexlabs.com (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests
Server Software: Apache/2.4.43
Server Hostname: wp-example.cryptexlabs.com
Server Port: 80
Document Path: /
Document Length: 26225 bytes
Concurrency Level: 10
Time taken for tests: 37.791 seconds
Complete requests: 10000
Failed requests: 5000
(Connect: 0, Receive: 0, Length: 5000, Exceptions: 0)
Non-2xx responses: 5000
Total transferred: 133295000 bytes
HTML transferred: 131230000 bytes
Requests per second: 264.61 [#/sec] (mean)
Time per request: 37.791 [ms] (mean)
Time per request: 3.779 [ms] (mean, across all concurrent requests)
Transfer rate: 3444.50 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 2 6 8.1 5 239
Processing: 4 32 29.2 39 315
Waiting: 4 29 26.0 34 307
Total: 7 38 31.6 43 458
Percentage of the requests served within a certain time (ms)
50% 43
66% 49
75% 51
80% 52
90% 56
95% 60
98% 97
99% 180
100% 458 (longest request)
Traefik 调试日志:https://pastebin.com/QUaAR6G0 显示了一些关于 SSL 和 x509 证书的信息,尽管我是通过 http 而不是 https 发出请求的。
我对使用相同模式的 nginx 容器进行了测试,没有遇到任何问题。所以这个跟wordpress和traefik的关系具体有关系。
我在 traefik 上也看到了关于下游服务器未启用 Keep-Alive 并且 traefik 默认启用 Keep-Alive 的参考资料。我还尝试通过扩展 wordpress 图像并在 wordpress 上启用 Keep-Alive 来启用 Keep-Alive。当我通过`kubectl port-forward 访问 wordpress 容器时,我可以看到正在发送 Keep-Alive headers,所以我知道它已启用,但我仍然看到 50% 的请求失败。
我不太清楚为什么,我也无法解释,但是当我添加这个选项时,随机故障停止了:
--serversTransport.insecureSkipVerify=true
我在 traefik 日志中看到 HTTP 连接正常,但是当网站图标等发生 HTTPS 重定向时,您会得到 x509 证书无效。那是因为 wordpress pod 有无效的 ssl 证书。
您可以在集群内安全地使用 --serversTransport.insecureSkipVerify=true
,因为流量将被加密并且外部流量是 HTTP。
如果您将来需要使用受信任的证书,请使用 wordpress 应用程序部署它并使用带有 ssl 直通的 traefik,以便在 pod 级别解密流量。然后你可以删除 traefik 上的不安全选项。
所以我使用的是 traefik 2.2,我 运行 一个带有单节点主节点的裸机 kubernetes 集群。我没有物理或虚拟负载均衡器,因此 traefik pod 接收端口 80 和 443 上的所有请求。我有一个安装了 helm 的示例 wordpress。正如您在这里看到的,其他所有请求都是 500 错误。 http://wp-example.cryptexlabs.com/feed/。我可以确认 500 错误的请求永远不会到达 wordpress 容器,所以我知道这与 traefik 有关。在 traefik 日志中它只显示有 500 错误。所以我在 traefik 命名空间中有 1 个 pod,在默认服务中有一个服务,在默认命名空间中有一个外部名称服务指向示例 wordpress 站点,其中有一个 wp-example 命名空间。
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: traefik
chart: traefik-0.2.0
heritage: Tiller
release: traefik
name: traefik
namespace: traefik
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: traefik
release: traefik
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: traefik
chart: traefik-0.2.0
heritage: Tiller
release: traefik
spec:
containers:
- args:
- --api.insecure
- --accesslog
- --entrypoints.web.Address=:80
- --entrypoints.websecure.Address=:443
- --providers.kubernetescrd
- --certificatesresolvers.default.acme.tlschallenge
- --certificatesresolvers.default.acme.email=foo@you.com
- --certificatesresolvers.default.acme.storage=acme.json
- --certificatesresolvers.default.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
image: traefik:2.2
imagePullPolicy: IfNotPresent
name: traefik
ports:
- containerPort: 80
hostPort: 80
name: web
protocol: TCP
- containerPort: 443
hostPort: 443
name: websecure
protocol: TCP
- containerPort: 8088
name: admin
protocol: TCP
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: traefik-service-account
serviceAccountName: traefik-service-account
terminationGracePeriodSeconds: 60
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
name: wp-example.cryptexlabs.com
namespace: wp-example
spec:
entryPoints:
- web
routes:
- kind: Rule
match: Host(`wp-example.cryptexlabs.com`)
services:
- name: wp-example
port: 80
- name: wp-example
port: 443
---
apiVersion: v1
kind: Service
metadata:
labels:
app.kubernetes.io/instance: wp-example
app.kubernetes.io/managed-by: Tiller
app.kubernetes.io/name: wordpress
helm.sh/chart: wordpress-9.3.14
name: wp-example-wordpress
namespace: wp-example
spec:
clusterIP: 10.101.142.74
externalTrafficPolicy: Cluster
ports:
- name: http
nodePort: 31862
port: 80
protocol: TCP
targetPort: http
- name: https
nodePort: 32473
port: 443
protocol: TCP
targetPort: https
selector:
app.kubernetes.io/instance: wp-example
app.kubernetes.io/name: wordpress
sessionAffinity: None
type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app.kubernetes.io/instance: wp-example
app.kubernetes.io/managed-by: Tiller
app.kubernetes.io/name: wordpress
helm.sh/chart: wordpress-9.3.14
name: wp-example-wordpress
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app.kubernetes.io/instance: wp-example
app.kubernetes.io/name: wordpress
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app.kubernetes.io/instance: wp-example
app.kubernetes.io/managed-by: Tiller
app.kubernetes.io/name: wordpress
helm.sh/chart: wordpress-9.3.14
spec:
containers:
- env:
- name: ALLOW_EMPTY_PASSWORD
value: "yes"
- name: MARIADB_HOST
value: wp-example-mariadb
- name: MARIADB_PORT_NUMBER
value: "3306"
- name: WORDPRESS_DATABASE_NAME
value: bitnami_wordpress
- name: WORDPRESS_DATABASE_USER
value: bn_wordpress
- name: WORDPRESS_DATABASE_PASSWORD
valueFrom:
secretKeyRef:
key: mariadb-password
name: wp-example-mariadb
- name: WORDPRESS_USERNAME
value: user
- name: WORDPRESS_PASSWORD
valueFrom:
secretKeyRef:
key: wordpress-password
name: wp-example-wordpress
- name: WORDPRESS_EMAIL
value: user@example.com
- name: WORDPRESS_FIRST_NAME
value: FirstName
- name: WORDPRESS_LAST_NAME
value: LastName
- name: WORDPRESS_HTACCESS_OVERRIDE_NONE
value: "no"
- name: WORDPRESS_HTACCESS_PERSISTENCE_ENABLED
value: "no"
- name: WORDPRESS_BLOG_NAME
value: "User's Blog!"
- name: WORDPRESS_SKIP_INSTALL
value: "no"
- name: WORDPRESS_TABLE_PREFIX
value: wp_
- name: WORDPRESS_SCHEME
value: http
image: docker.io/bitnami/wordpress:5.4.2-debian-10-r6
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 6
httpGet:
path: /wp-login.php
port: http
scheme: HTTP
initialDelaySeconds: 120
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
name: wordpress
ports:
- containerPort: 8080
name: http
protocol: TCP
- containerPort: 8443
name: https
protocol: TCP
readinessProbe:
failureThreshold: 6
httpGet:
path: /wp-login.php
port: http
scheme: HTTP
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
resources:
requests:
cpu: 300m
memory: 512Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /bitnami/wordpress
name: wordpress-data
subPath: wordpress
dnsPolicy: ClusterFirst
hostAliases:
- hostnames:
- status.localhost
ip: 127.0.0.1
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 1001
runAsUser: 1001
terminationGracePeriodSeconds: 30
volumes:
- name: wordpress-data
persistentVolumeClaim:
claimName: wp-example-wordpress
kubectl describe svc wp-example-wordpress -n wp-example
Name: wp-example-wordpress
Namespace: wp-example
Labels: app.kubernetes.io/instance=wp-example
app.kubernetes.io/managed-by=Tiller
app.kubernetes.io/name=wordpress
helm.sh/chart=wordpress-9.3.14
Annotations: <none>
Selector: app.kubernetes.io/instance=wp-example,app.kubernetes.io/name=wordpress
Type: LoadBalancer
IP: 10.101.142.74
Port: http 80/TCP
TargetPort: http/TCP
NodePort: http 31862/TCP
Endpoints: 10.32.0.17:8080
Port: https 443/TCP
TargetPort: https/TCP
NodePort: https 32473/TCP
Endpoints: 10.32.0.17:8443
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
josh@Joshs-MacBook-Pro-2:$ ab -n 10000 -c 10 http://wp-example.cryptexlabs.com/
This is ApacheBench, Version 2.3 <$Revision: 1874286 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking wp-example.cryptexlabs.com (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests
Server Software: Apache/2.4.43
Server Hostname: wp-example.cryptexlabs.com
Server Port: 80
Document Path: /
Document Length: 26225 bytes
Concurrency Level: 10
Time taken for tests: 37.791 seconds
Complete requests: 10000
Failed requests: 5000
(Connect: 0, Receive: 0, Length: 5000, Exceptions: 0)
Non-2xx responses: 5000
Total transferred: 133295000 bytes
HTML transferred: 131230000 bytes
Requests per second: 264.61 [#/sec] (mean)
Time per request: 37.791 [ms] (mean)
Time per request: 3.779 [ms] (mean, across all concurrent requests)
Transfer rate: 3444.50 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 2 6 8.1 5 239
Processing: 4 32 29.2 39 315
Waiting: 4 29 26.0 34 307
Total: 7 38 31.6 43 458
Percentage of the requests served within a certain time (ms)
50% 43
66% 49
75% 51
80% 52
90% 56
95% 60
98% 97
99% 180
100% 458 (longest request)
Traefik 调试日志:https://pastebin.com/QUaAR6G0 显示了一些关于 SSL 和 x509 证书的信息,尽管我是通过 http 而不是 https 发出请求的。
我对使用相同模式的 nginx 容器进行了测试,没有遇到任何问题。所以这个跟wordpress和traefik的关系具体有关系。
我在 traefik 上也看到了关于下游服务器未启用 Keep-Alive 并且 traefik 默认启用 Keep-Alive 的参考资料。我还尝试通过扩展 wordpress 图像并在 wordpress 上启用 Keep-Alive 来启用 Keep-Alive。当我通过`kubectl port-forward 访问 wordpress 容器时,我可以看到正在发送 Keep-Alive headers,所以我知道它已启用,但我仍然看到 50% 的请求失败。
我不太清楚为什么,我也无法解释,但是当我添加这个选项时,随机故障停止了:
--serversTransport.insecureSkipVerify=true
我在 traefik 日志中看到 HTTP 连接正常,但是当网站图标等发生 HTTPS 重定向时,您会得到 x509 证书无效。那是因为 wordpress pod 有无效的 ssl 证书。
您可以在集群内安全地使用 --serversTransport.insecureSkipVerify=true
,因为流量将被加密并且外部流量是 HTTP。
如果您将来需要使用受信任的证书,请使用 wordpress 应用程序部署它并使用带有 ssl 直通的 traefik,以便在 pod 级别解密流量。然后你可以删除 traefik 上的不安全选项。