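I am trying to set up a site-to-site VPN to a large telco.
We run VyOS 1.1.7 in AWS; they use a Cisco 5520.
Because the telco is big and we are small, they have dictated all the required settings to us and are unlikely to change anything on our behalf. Getting log files or further configuration details from them is also likely to be tedious.
From their side, they want to whitelist a single static IP of ours for all application traffic (HTTP). Any connection we initiate must therefore appear to come from that static IP, whatever the rest of our internal network architecture looks like, so we use NAT to meet this requirement.
In any case, we can successfully establish an IKEv1 connection and an ESP connection, and ICMP and TCP traffic flows in both directions. We can ping them, they can ping us, and inbound HTTP traffic to our application server works too.
To me, this means that our AWS security policy is configured correctly, our firewall rules are right, our VPN settings match, and routing for our subnets works.
The problem is that after a short time, about 10 minutes, we can no longer be reached from the telco's side.
On our side, the output of show vpn ipsec sa indicates that the tunnels are still up: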
vyos@VPN-FW01:~$ show vpn ipsec sa
Peer ID / IP Local ID / IP
------------ -------------
<TELCO IP> <MY IP>
Tunnel State Bytes Out/In Encrypt Hash NAT-T A-Time L-Time Proto
------ ----- ------------- ------- ---- ----- ------ ------ -----
1 up 0.0/0.0 aes128 sha1 no 2153 28800 all
2 up 0.0/0.0 aes128 sha1 no 1964 28800 all
3 up 0.0/0.0 aes128 sha1 no 1906 28800 all
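4 up 0.0/0.0 aes128 sha1 no 1864 28800 all
But as you can see, no traffic is passing through the tunnels.
The log files don't seem to contain anything useful either. The output of show log has many entries like these: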
Jun 17 11:46:06 VPN-FW01 pluto[18897]: "peer-TELCO-IP-tunnel-1" #347: sent QI2, IPsec SA established {ESP=>0x1e15be1f <0xc9355ae4}
Jun 17 11:54:26 VPN-FW01 pluto[18897]: "peer-TELCO-IP-tunnel-1" #14: received Delete SA payload: replace IPSEC State #338 in 10 seconds
Jun 17 11:54:36 VPN-FW01 pluto[18897]: "peer-TELCO-IP-tunnel-2" #348: initiating Quick Mode PSK+ENCRYPT+TUNNEL+PFS+UP to replace #338 {using isakmp#14}
Jun 17 11:54:36 VPN-FW01 pluto[18897]: "peer-TELCO-IP-tunnel-2" #348: sent QI2, IPsec SA established {ESP=>0xa48e62c1 <0xc67eaa07}
Jun 17 12:02:26 VPN-FW01 pluto[18897]: "peer-TELCO-IP-tunnel-1" #14: received Delete SA payload: replace IPSEC State #332 in 10 seconds
Jun 17 12:02:36 VPN-FW01 pluto[18897]: "peer-TELCO-IP-tunnel-3" #349: initiating Quick Mode PSK+ENCRYPT+TUNNEL+PFS+UP to replace #332 {using isakmp#14}
Jun 17 12:02:36 VPN-FW01 pluto[18897]: "peer-TELCO-IP-tunnel-3" #349: sent QI2, IPsec SA established {ESP=>0xe0d44968 <0xccc1945f}
Jun 17 12:03:56 VPN-FW01 pluto[18897]: "peer-TELCO-IP-tunnel-1" #14: received Delete SA payload: replace IPSEC State #333 in 10 seconds
Jun 17 12:04:06 VPN-FW01 pluto[18897]: "peer-TELCO-IP-tunnel-4" #350: initiating Quick Mode PSK+ENCRYPT+TUNNEL+PFS+UP to replace #333 {using isakmp#14}
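Jun 17 12:04:06 VPN-FW01 pluto[18897]: "peer-TELCO-IP-tunnel-4" #350: sent QI2, IPsec SA established {ESP=>0xad009d57 <0xc8b2287d}
But there are no other errors or anything else of note.
Running tcpdump on our end doesn't reveal anything either. There are many entries like the ones below, plus the usual ARP traffic, NTP, and so on.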
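To see how often the peer is tearing down SAs, the "received Delete SA payload" lines can be pulled out of the log and the gaps between them measured. The following is a minimal Python sketch, not part of the original setup: the helper names `delete_events` and `gaps_in_seconds` are hypothetical, and the year (2016) is an assumption, since syslog timestamps carry none.

```python
import re
from datetime import datetime

# Matches pluto lines of the form shown above, e.g.
#   Jun 17 11:54:26 VPN-FW01 pluto[18897]: "..." #14: received Delete SA payload: replace IPSEC State #338 in 10 seconds
DELETE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) .*"
    r"received Delete SA payload: replace IPSEC State #(?P<state>\d+)"
)


def delete_events(lines, year=2016):
    """Return (timestamp, replaced_state_number) for each Delete SA log line.

    The year is an assumption: syslog timestamps do not include one.
    """
    events = []
    for line in lines:
        match = DELETE_RE.search(line)
        if match:
            stamp = datetime.strptime(
                f"{year} {match.group('ts')}", "%Y %b %d %H:%M:%S"
            )
            events.append((stamp, int(match.group("state"))))
    return events


def gaps_in_seconds(events):
    """Seconds elapsed between consecutive Delete SA events."""
    return [
        int((later[0] - earlier[0]).total_seconds())
        for earlier, later in zip(events, events[1:])
    ]


if __name__ == "__main__":
    import sys

    # Usage: save the "show log" output to a file and pipe it in on stdin.
    found = delete_events(sys.stdin)
    for stamp, state in found:
        print(stamp.isoformat(), f"peer deleted IPsec state #{state}")
    print("gaps (s):", gaps_in_seconds(found))
```

On the excerpt above this reports deletes at 11:54:26, 12:02:26 and 12:03:56, so the peer is replacing SAs every few minutes even though the configured ESP lifetime is 28800 seconds.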
TELCO-IP.isakmp > ip-MY-IP.ec2.internal.isakmp: isakmp 1.0 msgid dd22ed6d: phase 2/others ? inf[E]: [encrypted hash]
12:06:59.148180 IP (tos 0x0, ttl 64, id 2672, offset 0, flags [DF], proto UDP (17), length 120)
ip-MY-IP.ec2.internal.isakmp > TELCO-IP.isakmp: isakmp 1.0 msgid f8f1d9ba: phase 2/others ? inf[E]: [encrypted hash]
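12:07:19.147638 IP (tos 0x0, ttl 234, id 31559, offset 0, flags [none], proto UDP (17), length 120)
But we no longer see any incoming ping or HTTP traffic.
One interesting point: if we ping the telco's subnet from our side, inbound traffic, including HTTP, works again for about 10 minutes before dropping off.
Any clues?
My VyOS configuration is here: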
firewall {
all-ping enable
broadcast-ping disable
config-trap disable
group {
address-group TELCO-HOSTS {
address 192.xx.yy.38
address 192.xx.yy.39
address 192.xx.yy.40
address 192.xx.yy.41
}
}
ipv6-receive-redirects disable
ipv6-src-route disable
ip-src-route disable
log-martians enable
name eth0in {
default-action reject
rule 20 {
action accept
description "accept ICMP pings"
icmp {
type-name echo-request
}
protocol icmp
}
rule 30 {
action accept
destination {
port 22
}
protocol tcp
}
rule 40 {
action accept
description "accept all internal traffic"
source {
address 10.113.0.0/16
}
}
rule 50 {
action accept
description "accept expected tunneled TCP traffic from TELCO"
destination {
port 5101,8310,8443,8080,9101,9107,9109
}
protocol tcp
source {
group {
address-group TELCO-HOSTS
}
}
}
rule 200 {
action drop
}
}
name eth0out {
default-action accept
}
receive-redirects disable
send-redirects enable
source-validation disable
state-policy {
established {
action accept
}
invalid {
action drop
}
related {
action accept
}
}
syn-cookies enable
twa-hazards-protection disable
}
interfaces {
ethernet eth0 {
address dhcp
duplex auto
firewall {
in {
name eth0in
}
out {
name eth0out
}
}
hw-id 0a:d2:b0:8e:53:f3
smp_affinity auto
speed auto
}
loopback lo {
}
}
nat {
source {
rule 10 {
description "US to TELCO"
destination {
address 192.xx.yy.0/24
}
outbound-interface eth0
translation {
address <MY-APP-SERVER>
}
}
rule 500 {
description "US to anywhere else"
outbound-interface eth0
source {
address 10.113.0.0/16
}
translation {
address masquerade
}
}
}
}
service {
ssh {
disable-password-authentication
port 22
}
}
vpn {
ipsec {
esp-group ESP {
compression disable
lifetime 28800
mode tunnel
pfs enable
proposal 1 {
encryption aes128
hash sha1
}
}
ike-group IKE {
key-exchange ikev1
lifetime 86400
proposal 1 {
dh-group 5
encryption aes256
hash sha1
}
}
ipsec-interfaces {
interface eth0
}
site-to-site {
peer <TELCO-STATIC-IP> {
authentication {
mode pre-shared-secret
pre-shared-secret ****************
}
connection-type initiate
ike-group IKE
local-address <MY-IP>
tunnel 1 {
allow-nat-networks disable
allow-public-networks disable
esp-group ESP
local {
prefix 10.113.0.0/24
}
remote {
prefix 192.xx.yy.38/32
}
}
tunnel 2 {
allow-nat-networks disable
allow-public-networks disable
esp-group ESP
local {
prefix 10.113.0.0/24
}
remote {
prefix 192.xx.yy.39/32
}
}
tunnel 3 {
allow-nat-networks disable
allow-public-networks disable
esp-group ESP
local {
prefix 10.113.0.0/24
}
remote {
prefix 192.xx.yy.40/32
}
}
tunnel 4 {
allow-nat-networks disable
allow-public-networks disable
esp-group ESP
local {
prefix 10.113.0.0/24
}
remote {
prefix 192.xx.yy.41/32
}
}
}
}
}
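}
Posted on 2016-06-21 17:54:17
Following up: we found that the AWS security policy was a factor here.
While we were seeing the odd "disconnects after 10 minutes" problem, the security policy was set to allow only inbound UDP on port 500 (whitelisted to the peer).
We changed the policy to allow all inbound traffic from the peer (on all ports), and the problem appears to be resolved.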
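Posted on 2019-01-12 08:44:00
Because VyOS is in AWS, there will always be NAT between your device and the internet. You therefore have to allow UDP port 4500, because when a device sits behind NAT, all IPsec traffic is carried over UDP 4500. To cover everything you need protocols 50 and 51 (ESP and AH) plus UDP 500 and 4500. That is why it worked once you opened everything up. How has it been going since? Is VyOS stable?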
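As a narrower alternative to allowing everything from the peer, the inbound rules listed above could be written out explicitly. The sketch below only builds the rule list in the `IpPermissions` shape that boto3's `authorize_security_group_ingress` accepts; it does not call AWS. The function name `ipsec_ingress_permissions` and the peer address 203.0.113.10/32 are hypothetical placeholders, not values from the thread. One possible reading of the symptoms (an assumption, not something the thread confirms) is that security groups are stateful, so with only UDP 500 open, inbound ESP was accepted only as return traffic for flows we had started ourselves, which would also fit the observation that an outbound ping revived inbound traffic for a while.

```python
def ipsec_ingress_permissions(peer_cidr, nat_traversal=True):
    """Build inbound security-group rules for an IPsec peer.

    Returns a list in the IpPermissions format used by boto3's
    ec2.authorize_security_group_ingress. Nothing is sent to AWS here.
    """

    def source():
        # A fresh list per rule so the rules do not share mutable state.
        return [{"CidrIp": peer_cidr}]

    rules = [
        # IKE (ISAKMP) key exchange
        {"IpProtocol": "udp", "FromPort": 500, "ToPort": 500, "IpRanges": source()},
        # ESP is IP protocol 50, AH is IP protocol 51; neither has ports
        {"IpProtocol": "50", "IpRanges": source()},
        {"IpProtocol": "51", "IpRanges": source()},
    ]
    if nat_traversal:
        # NAT-T: IKE and UDP-encapsulated ESP on port 4500
        rules.append(
            {"IpProtocol": "udp", "FromPort": 4500, "ToPort": 4500, "IpRanges": source()}
        )
    return rules


if __name__ == "__main__":
    # 203.0.113.10/32 is a documentation address standing in for the peer.
    for rule in ipsec_ingress_permissions("203.0.113.10/32"):
        print(rule)
    # To apply (assumption: boto3 installed and credentials configured):
    #   boto3.client("ec2").authorize_security_group_ingress(
    #       GroupId="<your security group id>",
    #       IpPermissions=ipsec_ingress_permissions("<peer ip>/32"),
    #   )
```

Note that the show vpn ipsec sa output in the question reports NAT-T as "no", so in that particular setup the protocol 50 rule is probably the one that matters most; the UDP 4500 rule is kept behind a flag for that reason.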
https://networkengineering.stackexchange.com/questions/32336