为什么从 WebServer01 到 DBServer 的 ping 超时?
Why is ping timing out from WebServer01 to DBServer?
我使用下面的 Terraform 代码创建了自定义 VPC。
我应用了下面的 Terraform 模块,其中:
- 添加 Public 和私有子网
- 配置 IGW
- 添加 NAT 网关
- 为 Web 和数据库服务器添加 SG
应用此后,我无法从 public ec2 实例 ping/ssh 到私有 ec2 实例。
不确定缺少什么。
# Custom VPC
resource "aws_vpc" "MyVPC" {
cidr_block = "10.0.0.0/16"
instance_tenancy = "default" # For Prod use "dedicated"
tags = {
Name = "MyVPC"
}
}
# Creates "Main Route Table", "NACL" & "default Security Group"
# Create Public Subnet, Associate with our VPC, Auto assign Public IP
resource "aws_subnet" "PublicSubNet" {
vpc_id = aws_vpc.MyVPC.id # Our VPC
availability_zone = "eu-west-2a" # AZ within London, 1 Subnet = 1 AZ
cidr_block = "10.0.1.0/24" # Check using this later > "${cidrsubnet(data.aws_vpc.MyVPC.cidr_block, 4, 1)}"
map_public_ip_on_launch = "true" # Auto assign Public IP for Public Subnet
tags = {
Name = "PublicSubNet"
}
}
# Create Private Subnet, Associate with our VPC
resource "aws_subnet" "PrivateSubNet" {
vpc_id = aws_vpc.MyVPC.id # Our VPC
availability_zone = "eu-west-2b" # AZ within London region, 1 Subnet = 1 AZ
cidr_block = "10.0.2.0/24" # Check using this later > "${cidrsubnet(data.aws_vpc.MyVPC.cidr_block, 4, 1)}"
tags = {
Name = "PrivateSubNet"
}
}
# Only 1 IGW per VPC
resource "aws_internet_gateway" "MyIGW" {
vpc_id = aws_vpc.MyVPC.id
tags = {
Name = "MyIGW"
}
}
# New Public route table, so we can keep "default main" route table as Private. Route out to MyIGW
resource "aws_route_table" "MyPublicRouteTable" {
vpc_id = aws_vpc.MyVPC.id # Our VPC
route { # Route out IPV4
cidr_block = "0.0.0.0/0" # IPV4 Route Out for all
# ipv6_cidr_block = "::/0" The parameter destinationCidrBlock cannot be used with the parameter destinationIpv6CidrBlock # IPV6 Route Out for all
gateway_id = aws_internet_gateway.MyIGW.id # Target : Internet Gateway created earlier
}
route { # Route out IPV6
ipv6_cidr_block = "::/0" # IPV6 Route Out for all
gateway_id = aws_internet_gateway.MyIGW.id # Target : Internet Gateway created earlier
}
tags = {
Name = "MyPublicRouteTable"
}
}
# Associate "PublicSubNet" with the public route table created above, removes it from default main route table
resource "aws_route_table_association" "PublicSubNetnPublicRouteTable" {
subnet_id = aws_subnet.PublicSubNet.id
route_table_id = aws_route_table.MyPublicRouteTable.id
}
# Create new security group "WebDMZ" for WebServer
resource "aws_security_group" "WebDMZ" {
name = "WebDMZ"
description = "Allows SSH & HTTP requests"
vpc_id = aws_vpc.MyVPC.id # Our VPC : SGs cannot span VPC
ingress {
description = "Allows SSH requests for VPC: IPV4"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # SSH restricted to my laptop public IP <My PUBLIC IP>/32
}
ingress {
description = "Allows HTTP requests for VPC: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # You can use Load Balancer
}
ingress {
description = "Allows HTTP requests for VPC: IPV6"
from_port = 80
to_port = 80
protocol = "tcp"
ipv6_cidr_blocks = ["::/0"]
}
egress {
description = "Allows SSH requests for VPC: IPV4"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # SSH restricted to my laptop public IP <My PUBLIC IP>/32
}
egress {
description = "Allows HTTP requests for VPC: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
description = "Allows HTTP requests for VPC: IPV6"
from_port = 80
to_port = 80
protocol = "tcp"
ipv6_cidr_blocks = ["::/0"]
}
}
# Create new EC2 instance (WebServer01) in Public Subnet
# Get ami id from Console
resource "aws_instance" "WebServer01" {
ami = "ami-01a6e31ac994bbc09"
instance_type = "t2.micro"
subnet_id = aws_subnet.PublicSubNet.id
key_name = "MyEC2KeyPair" # To connect using key pair
security_groups = [aws_security_group.WebDMZ.id] # Assign WebDMZ security group created above
# vpc_security_group_ids = [aws_security_group.WebDMZ.id]
tags = {
Name = "WebServer01"
}
}
# Create new security group "MyDBSG" for WebServer
resource "aws_security_group" "MyDBSG" {
name = "MyDBSG"
description = "Allows Public WebServer to Communicate with Private DB Server"
vpc_id = aws_vpc.MyVPC.id # Our VPC : SGs cannot span VPC
ingress {
description = "Allows ICMP requests: IPV4" # For ping,communication, error reporting etc
from_port = -1
to_port = -1
protocol = "icmp"
cidr_blocks = ["10.0.1.0/24"] # Public Subnet CIDR block, can be "WebDMZ" security group id too as below
security_groups = [aws_security_group.WebDMZ.id] # Tried this as above was not working, but still doesn't work
}
ingress {
description = "Allows SSH requests: IPV4" # You can SSH from WebServer01 to DBServer, using DBServer private ip address and copying private key to WebServer
from_port = 22 # ssh ec2-user@Private Ip Address -i MyPvKey.pem Private Key pasted in MyPvKey.pem
to_port = 22 # Not a good practise to use store private key on WebServer, instead use Bastion Host (Hardened Image, Secure) to connect to Private DB
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
ingress {
description = "Allows HTTP requests: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
ingress {
description = "Allows HTTPS requests : IPV4"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
ingress {
description = "Allows MySQL/Aurora requests"
from_port = 3306
to_port = 3306
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
egress {
description = "Allows ICMP requests: IPV4" # For ping,communication, error reporting etc
from_port = -1
to_port = -1
protocol = "icmp"
cidr_blocks = ["10.0.1.0/24"] # Public Subnet CIDR block, can be "WebDMZ" security group id too
}
egress {
description = "Allows SSH requests: IPV4" # You can SSH from WebServer01 to DBServer, using DBServer private ip address and copying private key to WebServer
from_port = 22 # ssh ec2-user@Private Ip Address -i MyPvtKey.pem Private Key pasted in MyPvKey.pem chmod 400 MyPvtKey.pem
to_port = 22 # Not a good practise to use store private key on WebServer, instead use Bastion Host (Hardened Image, Secure) to connect to Private DB
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
egress {
description = "Allows HTTP requests: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
egress {
description = "Allows HTTPS requests : IPV4"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
egress {
description = "Allows MySQL/Aurora requests"
from_port = 3306
to_port = 3306
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
}
# Create new EC2 instance (DBServer) in Private Subnet, Associate "MyDBSG" Security Group
resource "aws_instance" "DBServer" {
ami = "ami-01a6e31ac994bbc09"
instance_type = "t2.micro"
subnet_id = aws_subnet.PrivateSubNet.id
key_name = "MyEC2KeyPair" # To connect using key pair
security_groups = [aws_security_group.MyDBSG.id] # THIS WAS GIVING ERROR WHEN ASSOCIATING
# vpc_security_group_ids = [aws_security_group.MyDBSG.id]
tags = {
Name = "DBServer"
}
}
# Elastic IP required for NAT Gateway
resource "aws_eip" "nateip" {
vpc = true
tags = {
Name = "NATEIP"
}
}
# DBServer in private subnet cannot access internet, so add "NAT Gateway" in Public Subnet
# Add Target as NAT Gateway in default main route table. So there is route out to Internet.
# Now you can do yum update on DBServer
resource "aws_nat_gateway" "NATGW" { # Create NAT Gateway in each AZ so in case of failure it can use other
allocation_id = aws_eip.nateip.id # Elastic IP allocation
subnet_id = aws_subnet.PublicSubNet.id # Public Subnet
tags = {
Name = "NATGW"
}
}
# Main Route Table add NATGW as Target
resource "aws_default_route_table" "DefaultRouteTable" {
default_route_table_id = aws_vpc.MyVPC.default_route_table_id
route {
cidr_block = "0.0.0.0/0" # IPV4 Route Out for all
nat_gateway_id = aws_nat_gateway.NATGW.id # Target : NAT Gateway created above
}
tags = {
Name = "DefaultRouteTable"
}
}
为什么从 WebServer01 到 DBServer 的 ping 超时?
没有特定的 NACL,默认的 NACL 是完全开放的,因此它们在这里不相关。
为此,DBServer 上的安全组需要允许出口到 DBServer 的安全组或包含以下内容的 CIDR它。
aws_instance.DBServer 使用 aws_security_group.MyDBSG.
aws_instance.WebServer01 使用 aws_security_group.WebDMZ.
aws_security_group.WebDMZ的出口规则如下:
egress {
description = "Allows SSH requests for VPC: IPV4"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # SSH restricted to my laptop public IP <My PUBLIC IP>/32
}
egress {
description = "Allows HTTP requests for VPC: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
description = "Allows HTTP requests for VPC: IPV6"
from_port = 80
to_port = 80
protocol = "tcp"
ipv6_cidr_blocks = ["::/0"]
}
这些出口规则意味着:
- 允许 SSH 到所有 IPv4 主机
- 允许所有 IPv4 和 IPv6 主机的 HTTP
ICMP 未列出,因此 ICMP 回显请求在离开 aws_security_group.WebDMZ 之前将被丢弃。对于 aws_instance.WebServer01.
的 ENI,这应该在 VPC FlowLog 中显示为 REJECT
将这个 egress 规则添加到 aws_security_group.WebDMZ 应该可以解决这个问题:
egress {
description = "Allows ICMP requests: IPV4" # For ping,communication, error reporting etc
from_port = -1
to_port = -1
protocol = "icmp"
cidr_blocks = ["10.0.2.0/24"]
}
DBServer 可能不会响应 ICMP,因此在进行此更改后您可能仍会看到超时。参考 VPC FlowLog 将有助于确定差异。如果您在 VPC FlowLog 中看到 ICMP 流的 ACCEPT,则问题是 DBServer 没有响应 ICMP。
aws_security_group.WebDMZ 中没有任何内容阻止 SSH,所以问题一定出在其他地方。
aws_security_group.MyDBSG上的入口规则如下。
ingress {
description = "Allows ICMP requests: IPV4" # For ping,communication, error reporting etc
from_port = -1
to_port = -1
protocol = "icmp"
cidr_blocks = ["10.0.1.0/24"] # Public Subnet CIDR block, can be "WebDMZ" security group id too as below
security_groups = [aws_security_group.WebDMZ.id] # Tried this as above was not working, but still doesn't work
}
ingress {
description = "Allows SSH requests: IPV4" # You can SSH from WebServer01 to DBServer, using DBServer private ip address and copying private key to WebServer
from_port = 22 # ssh ec2-user@Private Ip Address -i MyPvKey.pem Private Key pasted in MyPvKey.pem
to_port = 22 # Not a good practise to use store private key on WebServer, instead use Bastion Host (Hardened Image, Secure) to connect to Private DB
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
这些出口规则意味着:
- 允许来自 Public 子网 (10.0.1.0/24)
的所有 ICMP
- 允许来自 Public 子网 (10.0.1.0/24) 的 SSH
SSH 应该可以正常工作。 DBServer 可能不接受 SSH 连接。
假设您的 DBServer 是 10.0.2.123,那么如果 SSH 不是 运行 它看起来像 WebServer01 运行 ssh -v 10.0.2.123
:
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 47: Applying options for *
debug1: Connecting to 10.0.2.123 [10.0.2.123] port 22.
debug1: connect to address 10.0.2.123 port 22: Operation timed out
ssh: connect to host 10.0.2.123 port 22: Operation timed out
参考 VPC FlowLog 将有助于确定差异。如果您在 VPC FlowLog 中看到 SSH 流的 ACCEPT,则问题是 DBServer 不接受 SSH 连接。
启用 VPC 流日志
由于这些年来已经发生了几次变化,我将向您指出 AWS' own maintained, step by step guide for creating a flow log that publishes to CloudWatch Logs。
我目前推荐 CloudWatch 而不是 S3,因为与使用 Athena 设置和查询 S3 相比,使用 CloudWatch Logs Insights 进行查询非常容易。您只需选择用于流日志的 CloudWatch 日志流并搜索感兴趣的 IP 或端口。
此示例 CloudWatch Logs Insights 查询将获得 eni-0123456789abcdef0 上最近的 20 个拒绝(不是真正的 ENI,使用您正在调试的实际 ENI ID):
fields @timestamp,@message
| sort @timestamp desc
| filter @message like 'eni-0123456789abcdef0'
| filter @message like 'REJECT'
| limit 20
在 VPC 流日志中,丢失的出口规则在源 ENI 上显示为拒绝。
在 VPC 流日志中,丢失的入口规则在目标 ENI 上显示为 REJECT。
安全组是有状态的
无状态数据包过滤器要求您处理神秘的事情,例如大多数(不是所有)操作系统使用端口 32767-65535 来处理 TCP 流中的回复流量。这是一个痛苦,它使 NACL(无状态)成为一个巨大的痛苦。
像安全组这样的有状态防火墙会自动跟踪连接(有状态的状态),因此您只需要在源 SG 的出口规则中允许服务端口到目标 SG 或 IP CIDR 块以及来自源 SG 或目标 SG 的入口规则中的 IP CIDR 块。
尽管安全组 (SG) 是有状态的,但它们默认为拒绝。这包括来自源的出站发起流量。如果源 SG 不允许流量出站到目的地,则即使目的地有允许它的 SG,也是不允许的。这是一个普遍的误解。 SG规则是不可传递的,需要双方制定。
AWS's Protecting Your Instance with Security Groups video 很好地直观地解释了这一点。
旁白:风格推荐
您应该使用 aws_security_group_rule 资源而不是内联的出站和入站规则。
NOTE on Security Groups and Security Group Rules: Terraform currently provides both a standalone Security Group Rule resource (a single ingress or egress rule), and a Security Group resource with ingress and egress rules defined in-line. At this time you cannot use a Security Group with in-line rules in conjunction with any Security Group Rule resources. Doing so will cause a conflict of rule settings and will overwrite rules.
使用带有内联规则的旧式 aws_security_group:
resource "aws_security_group" "WebDMZ" {
name = "WebDMZ"
description = "Allows SSH & HTTP requests"
vpc_id = aws_vpc.MyVPC.id
ingress {
description = "Allows HTTP requests for VPC: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # You can use Load Balancer
}
egress {
description = "Allows SSH requests for VPC: IPV4"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
并将其替换为这种现代风格的 aws_security_group,每个规则都有 aws_security_group_rule 资源:
resource "aws_security_group" "WebDMZ" {
name = "WebDMZ"
description = "Allows SSH & HTTP requests"
vpc_id = aws_vpc.MyVPC.id
}
resource "aws_security_group_rule" "WebDMZ_HTTP_in" {
security_group_id = aws_security_group.WebDMZ.id
type = "ingress"
description = "Allows HTTP requests for VPC: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "WebDMZ_SSH_out" {
security_group_id = aws_security_group.WebDMZ.id
type = "egress"
description = "Allows SSH requests for VPC: IPV4"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
允许下面粘贴的 "WebDMZ" 安全组的所有出站规则解决了问题。
egress { # Allow allow traffic outbound, THIS WAS THE REASON YOU WAS NOT ABLE TO PING FROM WebServer to DBServer
description = "Allows All Traffic Outbound from Web Server"
from_port = 0
to_port = 0
protocol = -1
cidr_blocks = ["0.0.0.0/0"]
}
我使用下面的 Terraform 代码创建了自定义 VPC。
我应用了下面的 Terraform 模块,其中:
- 添加 Public 和私有子网
- 配置 IGW
- 添加 NAT 网关
- 为 Web 和数据库服务器添加 SG
应用此后,我无法从 public ec2 实例 ping/ssh 到私有 ec2 实例。
不确定缺少什么。
# Custom VPC
resource "aws_vpc" "MyVPC" {
cidr_block = "10.0.0.0/16"
instance_tenancy = "default" # For Prod use "dedicated"
tags = {
Name = "MyVPC"
}
}
# Creates "Main Route Table", "NACL" & "default Security Group"
# Create Public Subnet, Associate with our VPC, Auto assign Public IP
resource "aws_subnet" "PublicSubNet" {
vpc_id = aws_vpc.MyVPC.id # Our VPC
availability_zone = "eu-west-2a" # AZ within London, 1 Subnet = 1 AZ
cidr_block = "10.0.1.0/24" # Check using this later > "${cidrsubnet(data.aws_vpc.MyVPC.cidr_block, 4, 1)}"
map_public_ip_on_launch = "true" # Auto assign Public IP for Public Subnet
tags = {
Name = "PublicSubNet"
}
}
# Create Private Subnet, Associate with our VPC
resource "aws_subnet" "PrivateSubNet" {
vpc_id = aws_vpc.MyVPC.id # Our VPC
availability_zone = "eu-west-2b" # AZ within London region, 1 Subnet = 1 AZ
cidr_block = "10.0.2.0/24" # Check using this later > "${cidrsubnet(data.aws_vpc.MyVPC.cidr_block, 4, 1)}"
tags = {
Name = "PrivateSubNet"
}
}
# Only 1 IGW per VPC
resource "aws_internet_gateway" "MyIGW" {
vpc_id = aws_vpc.MyVPC.id
tags = {
Name = "MyIGW"
}
}
# New Public route table, so we can keep "default main" route table as Private. Route out to MyIGW
resource "aws_route_table" "MyPublicRouteTable" {
vpc_id = aws_vpc.MyVPC.id # Our VPC
route { # Route out IPV4
cidr_block = "0.0.0.0/0" # IPV4 Route Out for all
# ipv6_cidr_block = "::/0" The parameter destinationCidrBlock cannot be used with the parameter destinationIpv6CidrBlock # IPV6 Route Out for all
gateway_id = aws_internet_gateway.MyIGW.id # Target : Internet Gateway created earlier
}
route { # Route out IPV6
ipv6_cidr_block = "::/0" # IPV6 Route Out for all
gateway_id = aws_internet_gateway.MyIGW.id # Target : Internet Gateway created earlier
}
tags = {
Name = "MyPublicRouteTable"
}
}
# Associate "PublicSubNet" with the public route table created above, removes it from default main route table
resource "aws_route_table_association" "PublicSubNetnPublicRouteTable" {
subnet_id = aws_subnet.PublicSubNet.id
route_table_id = aws_route_table.MyPublicRouteTable.id
}
# Create new security group "WebDMZ" for WebServer
resource "aws_security_group" "WebDMZ" {
name = "WebDMZ"
description = "Allows SSH & HTTP requests"
vpc_id = aws_vpc.MyVPC.id # Our VPC : SGs cannot span VPC
ingress {
description = "Allows SSH requests for VPC: IPV4"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # SSH restricted to my laptop public IP <My PUBLIC IP>/32
}
ingress {
description = "Allows HTTP requests for VPC: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # You can use Load Balancer
}
ingress {
description = "Allows HTTP requests for VPC: IPV6"
from_port = 80
to_port = 80
protocol = "tcp"
ipv6_cidr_blocks = ["::/0"]
}
egress {
description = "Allows SSH requests for VPC: IPV4"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # SSH restricted to my laptop public IP <My PUBLIC IP>/32
}
egress {
description = "Allows HTTP requests for VPC: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
description = "Allows HTTP requests for VPC: IPV6"
from_port = 80
to_port = 80
protocol = "tcp"
ipv6_cidr_blocks = ["::/0"]
}
}
# Create new EC2 instance (WebServer01) in Public Subnet
# Get ami id from Console
resource "aws_instance" "WebServer01" {
ami = "ami-01a6e31ac994bbc09"
instance_type = "t2.micro"
subnet_id = aws_subnet.PublicSubNet.id
key_name = "MyEC2KeyPair" # To connect using key pair
security_groups = [aws_security_group.WebDMZ.id] # Assign WebDMZ security group created above
# vpc_security_group_ids = [aws_security_group.WebDMZ.id]
tags = {
Name = "WebServer01"
}
}
# Create new security group "MyDBSG" for WebServer
resource "aws_security_group" "MyDBSG" {
name = "MyDBSG"
description = "Allows Public WebServer to Communicate with Private DB Server"
vpc_id = aws_vpc.MyVPC.id # Our VPC : SGs cannot span VPC
ingress {
description = "Allows ICMP requests: IPV4" # For ping,communication, error reporting etc
from_port = -1
to_port = -1
protocol = "icmp"
cidr_blocks = ["10.0.1.0/24"] # Public Subnet CIDR block, can be "WebDMZ" security group id too as below
security_groups = [aws_security_group.WebDMZ.id] # Tried this as above was not working, but still doesn't work
}
ingress {
description = "Allows SSH requests: IPV4" # You can SSH from WebServer01 to DBServer, using DBServer private ip address and copying private key to WebServer
from_port = 22 # ssh ec2-user@Private Ip Address -i MyPvKey.pem Private Key pasted in MyPvKey.pem
to_port = 22 # Not a good practise to use store private key on WebServer, instead use Bastion Host (Hardened Image, Secure) to connect to Private DB
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
ingress {
description = "Allows HTTP requests: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
ingress {
description = "Allows HTTPS requests : IPV4"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
ingress {
description = "Allows MySQL/Aurora requests"
from_port = 3306
to_port = 3306
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
egress {
description = "Allows ICMP requests: IPV4" # For ping,communication, error reporting etc
from_port = -1
to_port = -1
protocol = "icmp"
cidr_blocks = ["10.0.1.0/24"] # Public Subnet CIDR block, can be "WebDMZ" security group id too
}
egress {
description = "Allows SSH requests: IPV4" # You can SSH from WebServer01 to DBServer, using DBServer private ip address and copying private key to WebServer
from_port = 22 # ssh ec2-user@Private Ip Address -i MyPvtKey.pem Private Key pasted in MyPvKey.pem chmod 400 MyPvtKey.pem
to_port = 22 # Not a good practise to use store private key on WebServer, instead use Bastion Host (Hardened Image, Secure) to connect to Private DB
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
egress {
description = "Allows HTTP requests: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
egress {
description = "Allows HTTPS requests : IPV4"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
egress {
description = "Allows MySQL/Aurora requests"
from_port = 3306
to_port = 3306
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
}
# Create new EC2 instance (DBServer) in Private Subnet, Associate "MyDBSG" Security Group
resource "aws_instance" "DBServer" {
ami = "ami-01a6e31ac994bbc09"
instance_type = "t2.micro"
subnet_id = aws_subnet.PrivateSubNet.id
key_name = "MyEC2KeyPair" # To connect using key pair
security_groups = [aws_security_group.MyDBSG.id] # THIS WAS GIVING ERROR WHEN ASSOCIATING
# vpc_security_group_ids = [aws_security_group.MyDBSG.id]
tags = {
Name = "DBServer"
}
}
# Elastic IP required for NAT Gateway
resource "aws_eip" "nateip" {
vpc = true
tags = {
Name = "NATEIP"
}
}
# DBServer in private subnet cannot access internet, so add "NAT Gateway" in Public Subnet
# Add Target as NAT Gateway in default main route table. So there is route out to Internet.
# Now you can do yum update on DBServer
resource "aws_nat_gateway" "NATGW" { # Create NAT Gateway in each AZ so in case of failure it can use other
allocation_id = aws_eip.nateip.id # Elastic IP allocation
subnet_id = aws_subnet.PublicSubNet.id # Public Subnet
tags = {
Name = "NATGW"
}
}
# Main Route Table add NATGW as Target
resource "aws_default_route_table" "DefaultRouteTable" {
default_route_table_id = aws_vpc.MyVPC.default_route_table_id
route {
cidr_block = "0.0.0.0/0" # IPV4 Route Out for all
nat_gateway_id = aws_nat_gateway.NATGW.id # Target : NAT Gateway created above
}
tags = {
Name = "DefaultRouteTable"
}
}
为什么从 WebServer01 到 DBServer 的 ping 超时?
没有特定的 NACL,默认的 NACL 是完全开放的,因此它们在这里不相关。
为此,DBServer 上的安全组需要允许出口到 DBServer 的安全组或包含以下内容的 CIDR它。
aws_instance.DBServer 使用 aws_security_group.MyDBSG.
aws_instance.WebServer01 使用 aws_security_group.WebDMZ.
aws_security_group.WebDMZ的出口规则如下:
egress {
description = "Allows SSH requests for VPC: IPV4"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # SSH restricted to my laptop public IP <My PUBLIC IP>/32
}
egress {
description = "Allows HTTP requests for VPC: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
description = "Allows HTTP requests for VPC: IPV6"
from_port = 80
to_port = 80
protocol = "tcp"
ipv6_cidr_blocks = ["::/0"]
}
这些出口规则意味着:
- 允许 SSH 到所有 IPv4 主机
- 允许所有 IPv4 和 IPv6 主机的 HTTP
ICMP 未列出,因此 ICMP 回显请求在离开 aws_security_group.WebDMZ 之前将被丢弃。对于 aws_instance.WebServer01.
的 ENI,这应该在 VPC FlowLog 中显示为 REJECT将这个 egress 规则添加到 aws_security_group.WebDMZ 应该可以解决这个问题:
egress {
description = "Allows ICMP requests: IPV4" # For ping,communication, error reporting etc
from_port = -1
to_port = -1
protocol = "icmp"
cidr_blocks = ["10.0.2.0/24"]
}
DBServer 可能不会响应 ICMP,因此在进行此更改后您可能仍会看到超时。参考 VPC FlowLog 将有助于确定差异。如果您在 VPC FlowLog 中看到 ICMP 流的 ACCEPT,则问题是 DBServer 没有响应 ICMP。
aws_security_group.WebDMZ 中没有任何内容阻止 SSH,所以问题一定出在其他地方。
aws_security_group.MyDBSG上的入口规则如下。
ingress {
description = "Allows ICMP requests: IPV4" # For ping,communication, error reporting etc
from_port = -1
to_port = -1
protocol = "icmp"
cidr_blocks = ["10.0.1.0/24"] # Public Subnet CIDR block, can be "WebDMZ" security group id too as below
security_groups = [aws_security_group.WebDMZ.id] # Tried this as above was not working, but still doesn't work
}
ingress {
description = "Allows SSH requests: IPV4" # You can SSH from WebServer01 to DBServer, using DBServer private ip address and copying private key to WebServer
from_port = 22 # ssh ec2-user@Private Ip Address -i MyPvKey.pem Private Key pasted in MyPvKey.pem
to_port = 22 # Not a good practise to use store private key on WebServer, instead use Bastion Host (Hardened Image, Secure) to connect to Private DB
protocol = "tcp"
cidr_blocks = ["10.0.1.0/24"]
}
这些出口规则意味着:
- 允许来自 Public 子网 (10.0.1.0/24) 的所有 ICMP
- 允许来自 Public 子网 (10.0.1.0/24) 的 SSH
SSH 应该可以正常工作。 DBServer 可能不接受 SSH 连接。
假设您的 DBServer 是 10.0.2.123,那么如果 SSH 不是 运行 它看起来像 WebServer01 运行 ssh -v 10.0.2.123
:
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 47: Applying options for *
debug1: Connecting to 10.0.2.123 [10.0.2.123] port 22.
debug1: connect to address 10.0.2.123 port 22: Operation timed out
ssh: connect to host 10.0.2.123 port 22: Operation timed out
参考 VPC FlowLog 将有助于确定差异。如果您在 VPC FlowLog 中看到 SSH 流的 ACCEPT,则问题是 DBServer 不接受 SSH 连接。
启用 VPC 流日志
由于这些年来已经发生了几次变化,我将向您指出 AWS' own maintained, step by step guide for creating a flow log that publishes to CloudWatch Logs。
我目前推荐 CloudWatch 而不是 S3,因为与使用 Athena 设置和查询 S3 相比,使用 CloudWatch Logs Insights 进行查询非常容易。您只需选择用于流日志的 CloudWatch 日志流并搜索感兴趣的 IP 或端口。
此示例 CloudWatch Logs Insights 查询将获得 eni-0123456789abcdef0 上最近的 20 个拒绝(不是真正的 ENI,使用您正在调试的实际 ENI ID):
fields @timestamp,@message
| sort @timestamp desc
| filter @message like 'eni-0123456789abcdef0'
| filter @message like 'REJECT'
| limit 20
在 VPC 流日志中,丢失的出口规则在源 ENI 上显示为拒绝。
在 VPC 流日志中,丢失的入口规则在目标 ENI 上显示为 REJECT。
安全组是有状态的
无状态数据包过滤器要求您处理神秘的事情,例如大多数(不是所有)操作系统使用端口 32767-65535 来处理 TCP 流中的回复流量。这是一个痛苦,它使 NACL(无状态)成为一个巨大的痛苦。
像安全组这样的有状态防火墙会自动跟踪连接(有状态的状态),因此您只需要在源 SG 的出口规则中允许服务端口到目标 SG 或 IP CIDR 块以及来自源 SG 或目标 SG 的入口规则中的 IP CIDR 块。
尽管安全组 (SG) 是有状态的,但它们默认为拒绝。这包括来自源的出站发起流量。如果源 SG 不允许流量出站到目的地,则即使目的地有允许它的 SG,也是不允许的。这是一个普遍的误解。 SG规则是不可传递的,需要双方制定。
AWS's Protecting Your Instance with Security Groups video 很好地直观地解释了这一点。
旁白:风格推荐
您应该使用 aws_security_group_rule 资源而不是内联的出站和入站规则。
NOTE on Security Groups and Security Group Rules: Terraform currently provides both a standalone Security Group Rule resource (a single ingress or egress rule), and a Security Group resource with ingress and egress rules defined in-line. At this time you cannot use a Security Group with in-line rules in conjunction with any Security Group Rule resources. Doing so will cause a conflict of rule settings and will overwrite rules.
使用带有内联规则的旧式 aws_security_group:
resource "aws_security_group" "WebDMZ" {
name = "WebDMZ"
description = "Allows SSH & HTTP requests"
vpc_id = aws_vpc.MyVPC.id
ingress {
description = "Allows HTTP requests for VPC: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # You can use Load Balancer
}
egress {
description = "Allows SSH requests for VPC: IPV4"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
并将其替换为这种现代风格的 aws_security_group,每个规则都有 aws_security_group_rule 资源:
resource "aws_security_group" "WebDMZ" {
name = "WebDMZ"
description = "Allows SSH & HTTP requests"
vpc_id = aws_vpc.MyVPC.id
}
resource "aws_security_group_rule" "WebDMZ_HTTP_in" {
security_group_id = aws_security_group.WebDMZ.id
type = "ingress"
description = "Allows HTTP requests for VPC: IPV4"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "WebDMZ_SSH_out" {
security_group_id = aws_security_group.WebDMZ.id
type = "egress"
description = "Allows SSH requests for VPC: IPV4"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
允许下面粘贴的 "WebDMZ" 安全组的所有出站规则解决了问题。
egress { # Allow allow traffic outbound, THIS WAS THE REASON YOU WAS NOT ABLE TO PING FROM WebServer to DBServer
description = "Allows All Traffic Outbound from Web Server"
from_port = 0
to_port = 0
protocol = -1
cidr_blocks = ["0.0.0.0/0"]
}