Active-active route-based Vyos 1.2 to Azure VPN (with BGP!)

VyOS is an open source network operating system that can be installed on physical hardware, on a virtual machine on your own server, or on a cloud platform. It is based on GNU/Linux and joins multiple applications such as Quagga, ISC DHCPD, OpenVPN, strongSwan and others under a single management interface.

https://vyos.io

I should add that it has a CLI that’s very similar to the one on Juniper SRX (and MX I believe), so if you’re familiar with those you’ll feel at home with Vyos.

This article will show you how to get a route-based VPN up and running between Vyos and Azure with redundant tunnels in an active-active setup. We will also run BGP over the tunnel, so buckle up because we’re going to have fun.

Settings

Let’s start by defining the settings needed to configure the tunnels on both sides. Some of them are known before you start, such as the IP address space on each side; others only become available once a resource has been created, such as the public IP addresses of the Azure VNet Gateways. In any case, the steps in this guide are ordered so that you never reach one whose settings you don’t yet have.

Item                              Address
Vyos device public IP             20.30.40.50
Vyos device private IP            10.10.0.4
Vyos VTI1 IP                      10.10.0.5
Vyos VTI2 IP                      10.10.0.6
Azure VGW first public IP         1.2.3.4
Azure VGW second public IP        4.3.2.1
Azure VNet Addressing             10.0.0.0/16
Azure GatewaySubnet Addressing    10.0.0.0/27
Azure BGP listener first IP       10.0.0.4
Azure BGP listener second IP      10.0.0.5
Azure ASN                         65521
Vyos ASN                          65522
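If you like to double-check an addressing plan before deploying anything, here is a minimal Python sketch that does so for the values in the table. The dictionary keys and the check_settings helper are names I made up for illustration, and the checks themselves (listeners inside the GatewaySubnet, no overlap between the Vyos addresses and the VNet, different ASNs on each side) are assumptions about what is worth verifying, not an official validation.

```python
import ipaddress

# Values copied from the settings table; the key names are illustrative only.
SETTINGS = {
    "vyos_private_ip": "10.10.0.4",
    "vyos_vti_ips": ["10.10.0.5", "10.10.0.6"],
    "azure_vnet": "10.0.0.0/16",
    "azure_gateway_subnet": "10.0.0.0/27",
    "azure_bgp_listeners": ["10.0.0.4", "10.0.0.5"],
    "azure_asn": 65521,
    "vyos_asn": 65522,
}


def check_settings(s):
    """Return a list of problems found; an empty list means the plan looks consistent."""
    problems = []
    vnet = ipaddress.ip_network(s["azure_vnet"])
    gw_subnet = ipaddress.ip_network(s["azure_gateway_subnet"])

    # The GatewaySubnet has to be carved out of the VNet address space.
    if not gw_subnet.subnet_of(vnet):
        problems.append("GatewaySubnet is not inside the VNet address space")

    # In the table above both BGP listeners live in the GatewaySubnet.
    for listener in s["azure_bgp_listeners"]:
        if ipaddress.ip_address(listener) not in gw_subnet:
            problems.append(f"BGP listener {listener} is outside the GatewaySubnet")

    # The Vyos side must not overlap the Azure VNet, or routing gets ambiguous.
    for ip in [s["vyos_private_ip"], *s["vyos_vti_ips"]]:
        if ipaddress.ip_address(ip) in vnet:
            problems.append(f"Vyos address {ip} overlaps the Azure VNet")

    # The two sides peer over eBGP, so the ASNs must differ.
    if s["azure_asn"] == s["vyos_asn"]:
        problems.append("both sides use the same ASN")
    return problems
```

Calling check_settings(SETTINGS) with the values from this article returns an empty list.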

Steps

Azure

The Azure side requires the creation of a number of resources, which are:

  • Azure VNet Gateway
  • Local network gateway (AKA remote gateway / VPN peer)
  • Connections

Azure VNet Gateway

The specific settings you need to configure when deploying the VNet Gateway are:

  • Enable BGP
  • Set the BGP ASN
  • Enable Active-Active feature
  • Choose at least VpnGw1 as the SKU or VpnGw1Az if you need zone redundancy

By the end of this step you should have deployed a new pair of VNet Gateway instances. If you already have a VNet Gateway and just want to connect it to a Vyos device, make sure it has the right SKU. The other options are optional if you don’t need or want BGP.

Important! Deploying these gateways may take up to 45 minutes. In the meantime you can carry on with the next step, but nothing beyond that. You may want to keep reading the article while you wait so you get familiar with the steps to follow.

Local network gateway

The specific settings you need to configure when deploying the local network gateway are:

  • Public IP address: Vyos public IP
  • Address space: Vyos private IP only
  • Check Configure BGP settings
  • ASN: Vyos ASN
  • BGP peer IP address: Vyos private IP as above

Connections

You need to create only one connection on the Azure side. The connection settings will be applied to each VNet Gateway instance, so you can configure two tunnels from Vyos to Azure.

The specific settings you need to configure here are:

  • Choose site-to-site as connection type
  • Choose the VNet Gateway created above as Virtual Network Gateway
  • Choose the local network gateway created above as local network gateway
  • Pre-shared key of your choice.
  • Check Enable BGP

And at this point you’re ready to configure two active-active tunnels resilient to gateway failures on the Azure side.

Vyos

VTIs

We need to configure one VTI interface per tunnel in Vyos so we can properly route traffic over the VPN without depending on updating local and remote prefixes. This way any BGP updates would dynamically work without requiring manual changes on the VPN tunnel. That’s the point of BGP anyway 🙂

set interfaces vti vti1 address '10.10.0.5/32'
set interfaces vti vti1 description 'Azure Primary Tunnel'
set interfaces vti vti2 address '10.10.0.6/32'
set interfaces vti vti2 description 'Azure Secondary Tunnel'

Important! Make sure you clamp your MSS to 1350 or else some traffic may be blackholed between you and Azure:

set firewall options interface vti1 adjust-mss 1350
set firewall options interface vti2 adjust-mss 1350
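To see what that number means in terms of packet size, here is a small arithmetic sketch in Python. It assumes plain IPv4 and TCP headers of 20 bytes each (no options); the function names are mine. The actual IPsec overhead depends on the ciphers and on whether NAT traversal is in use, so take the headroom figure as a rough guide only.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def packet_size_for_mss(mss):
    """Largest IPv4 packet a TCP segment with this MSS produces."""
    return mss + IPV4_HEADER + TCP_HEADER


def mss_for_mtu(mtu):
    """Largest MSS that fits in one unfragmented IPv4 packet of this MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


clamped_packet = packet_size_for_mss(1350)   # 1390 bytes
unclamped_mss = mss_for_mtu(1500)            # 1460 bytes
headroom = 1500 - clamped_packet             # bytes left for tunnel overhead
```

With a 1500-byte path MTU, clamping to 1350 leaves 110 bytes for the ESP encapsulation, whereas an unclamped 1460-byte MSS leaves none.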

IKE and ESP settings

Please note Azure only supports IKEv2 on their production ready VNet Gateway SKUs:

set vpn ipsec ike-group AZURE dead-peer-detection action 'restart'
set vpn ipsec ike-group AZURE dead-peer-detection interval '15'
set vpn ipsec ike-group AZURE dead-peer-detection timeout '30'
set vpn ipsec ike-group AZURE ikev2-reauth 'yes'
set vpn ipsec ike-group AZURE key-exchange 'ikev2'
set vpn ipsec ike-group AZURE lifetime '28800'
set vpn ipsec ike-group AZURE proposal 1 dh-group '2'
set vpn ipsec ike-group AZURE proposal 1 encryption 'aes256'
set vpn ipsec ike-group AZURE proposal 1 hash 'sha1'
set vpn ipsec esp-group AZURE lifetime '3600'
set vpn ipsec esp-group AZURE mode 'tunnel'
set vpn ipsec esp-group AZURE pfs 'dh-group2'
set vpn ipsec esp-group AZURE proposal 1 encryption 'aes256'
set vpn ipsec esp-group AZURE proposal 1 hash 'sha1'

The above is a fairly safe choice of crypto algorithms, however you can go for a different (and stronger) one if you please. Just make sure that whatever you configure on Vyos matches settings the Azure gateway supports.

Gluing things together

Now we need to glue all the above together and instruct Vyos to connect to the Azure gateways. To do that we’re going to configure two tunnels. The first tunnel will be tied to vti1:

set vpn ipsec site-to-site peer 1.2.3.4 authentication id '20.30.40.50'
set vpn ipsec site-to-site peer 1.2.3.4 authentication mode 'pre-shared-secret'
set vpn ipsec site-to-site peer 1.2.3.4 authentication pre-shared-secret 'xxxxxxxxx'
set vpn ipsec site-to-site peer 1.2.3.4 authentication remote-id '1.2.3.4'
set vpn ipsec site-to-site peer 1.2.3.4 connection-type 'respond'
set vpn ipsec site-to-site peer 1.2.3.4 description 'AZURE PRIMARY TUNNEL'
set vpn ipsec site-to-site peer 1.2.3.4 ike-group 'AZURE'
set vpn ipsec site-to-site peer 1.2.3.4 ikev2-reauth 'inherit'
set vpn ipsec site-to-site peer 1.2.3.4 local-address '10.10.0.4'
set vpn ipsec site-to-site peer 1.2.3.4 vti bind 'vti1'
set vpn ipsec site-to-site peer 1.2.3.4 vti esp-group 'AZURE'

The second tunnel will be tied to vti2:

set vpn ipsec site-to-site peer 4.3.2.1 authentication id '20.30.40.50'
set vpn ipsec site-to-site peer 4.3.2.1 authentication mode 'pre-shared-secret'
set vpn ipsec site-to-site peer 4.3.2.1 authentication pre-shared-secret 'xxxxxxxxx'
set vpn ipsec site-to-site peer 4.3.2.1 authentication remote-id '4.3.2.1'
set vpn ipsec site-to-site peer 4.3.2.1 connection-type 'respond'
set vpn ipsec site-to-site peer 4.3.2.1 description 'AZURE SECONDARY TUNNEL'
set vpn ipsec site-to-site peer 4.3.2.1 ike-group 'AZURE'
set vpn ipsec site-to-site peer 4.3.2.1 ikev2-reauth 'inherit'
set vpn ipsec site-to-site peer 4.3.2.1 local-address '10.10.0.4'
set vpn ipsec site-to-site peer 4.3.2.1 vti bind 'vti2'
set vpn ipsec site-to-site peer 4.3.2.1 vti esp-group 'AZURE'

Note how we’re basically using the same configuration, apart from pointing to the right remote public IP and tying each tunnel to a different VTI.

Important! You can’t have different pre-shared keys for each of the tunnels, as on the Azure side the configuration is shared between both!
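Since both tunnels share nearly everything, you could generate the commands instead of typing them twice. The following Python sketch does that; peer_commands is a helper I made up, and its defaults are simply the example values used in this article. It only builds strings and does not talk to Vyos.

```python
def peer_commands(remote_ip, vti, description,
                  local_id="20.30.40.50", local_address="10.10.0.4",
                  psk="xxxxxxxxx", group="AZURE"):
    """Build the site-to-site 'set' commands for one Azure gateway instance."""
    base = f"set vpn ipsec site-to-site peer {remote_ip}"
    return [
        f"{base} authentication id '{local_id}'",
        f"{base} authentication mode 'pre-shared-secret'",
        f"{base} authentication pre-shared-secret '{psk}'",
        f"{base} authentication remote-id '{remote_ip}'",
        f"{base} connection-type 'respond'",
        f"{base} description '{description}'",
        f"{base} ike-group '{group}'",
        f"{base} ikev2-reauth 'inherit'",
        f"{base} local-address '{local_address}'",
        f"{base} vti bind '{vti}'",
        f"{base} vti esp-group '{group}'",
    ]


# Same pre-shared key for both tunnels, as Azure shares it across instances.
primary = peer_commands("1.2.3.4", "vti1", "AZURE PRIMARY TUNNEL")
secondary = peer_commands("4.3.2.1", "vti2", "AZURE SECONDARY TUNNEL")
```

Printing each list with "\n".join(...) gives you the two blocks above, ready to paste in configuration mode.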

Static Routing

At this point we will have the tunnels up, however we have no routes pointing to Azure, so there’s no way for us to put traffic into either of the tunnels. Remember we wanted to run BGP with Azure, so our first requirement is to be able to reach the Azure BGP listeners 10.0.0.4 and 10.0.0.5. In order to accomplish that we will put static routes as we have no other choice 🙂

Important! If you go to your Azure portal and check on your VNet Gateways’ BGP configuration you’ll see the two BGP listeners addresses in a comma-separated list e.g. 10.0.0.4,10.0.0.5, however please note it could also be instead 10.0.0.5,10.0.0.4 and that order is in fact very important.

We have two tunnels to Azure and each of them is connecting to one of your VNet Gateway instances. The BGP listener IP addresses are also each tied to one of the VNet Gateway instances and they are in the same order as the public IP addresses. In our case we see 10.0.0.4,10.0.0.5 so we have this:

Public IP    BGP listener IP
1.2.3.4      10.0.0.4
4.3.2.1      10.0.0.5

That means we will have to add a static route to reach 10.0.0.4 on vti1 as that’s the VTI used for the 1.2.3.4 tunnel. Likewise for the second tunnel.

set protocols static interface-route 10.0.0.4/32 next-hop-interface vti1
set protocols static interface-route 10.0.0.5/32 next-hop-interface vti2
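Because the order of the listeners matters, here is a hedged Python sketch of the pairing logic: it takes the comma-separated list as shown in the portal plus your tunnels in public IP order, and produces the matching static routes. The static_routes helper and its argument shapes are made up for illustration.

```python
def static_routes(bgp_listeners_csv, tunnels):
    """Pair each BGP listener with the VTI of the gateway instance in the same position.

    bgp_listeners_csv: the list as displayed in the portal, e.g. "10.0.0.4,10.0.0.5"
    tunnels: (azure_public_ip, vti_name) tuples, in the order of the public IPs
    """
    listeners = [ip.strip() for ip in bgp_listeners_csv.split(",")]
    if len(listeners) != len(tunnels):
        raise ValueError("expected one BGP listener per tunnel")
    return [
        f"set protocols static interface-route {listener}/32 next-hop-interface {vti}"
        for listener, (_public_ip, vti) in zip(listeners, tunnels)
    ]


tunnels = [("1.2.3.4", "vti1"), ("4.3.2.1", "vti2")]
routes = static_routes("10.0.0.4,10.0.0.5", tunnels)
```

If the portal showed 10.0.0.5,10.0.0.4 instead, the same call would route 10.0.0.5 over vti1 and 10.0.0.4 over vti2.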

Dynamic Routing

Now we should be able to reach the Azure BGP listeners, so let’s configure BGP on the Vyos:

set protocols bgp 65522 neighbor 10.0.0.4 remote-as '65521'
set protocols bgp 65522 neighbor 10.0.0.5 remote-as '65521'

The above instructs Vyos to enable BGP, use 65522 as its own ASN and configure two new BGP neighbours that happen to have 65521 as their ASN, so any other BGP connection that’s either not coming from those IP addresses or not using 65521 as their ASN will be discarded.

And let’s add some reasonable keepalives and timers to make sure we minimise the impact of any BGP failures whilst not risking a continuous route flapping:

set protocols bgp 65522 neighbor 10.0.0.4 timers holdtime '30'
set protocols bgp 65522 neighbor 10.0.0.4 timers keepalive '10'

set protocols bgp 65522 neighbor 10.0.0.5 timers holdtime '30'
set protocols bgp 65522 neighbor 10.0.0.5 timers keepalive '10'

Important! There’s one gotcha here. Azure will advertise their networks by telling us the next hop is the private IP of the VNet Gateway, e.g. 10.0.0.4. This IP is not in a directly connected network, but it’s reachable with the static route we put up before, so we need to find a way to accept recursive routes in our routing table. In order to accomplish this we need to disable the feature that checks if the gateway of a route is in a connected network:

set protocols bgp 65522 neighbor 10.0.0.4 disable-connected-check
set protocols bgp 65522 neighbor 10.0.0.5 disable-connected-check

Without the above, the routes would be learned but never put in the routing table as they’re not valid! (the gateway is not directly reachable!).
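To picture what a recursive route is, here is a toy Python model of the lookup. It is only a sketch of the idea, not how Vyos implements its routing table: the table layout, the "dev"/"via" labels and both helper functions are made up for illustration.

```python
import ipaddress


def lookup(table, address):
    """Longest-prefix match: return the target of the most specific matching route."""
    addr = ipaddress.ip_address(address)
    matches = [
        (ipaddress.ip_network(prefix), target)
        for prefix, target in table.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def resolve_interface(table, destination, max_depth=4):
    """Follow 'via' next hops until an interface route is found, or give up."""
    target = lookup(table, destination)
    for _ in range(max_depth):
        if target is None:
            return None
        kind, value = target
        if kind == "dev":
            return value
        target = lookup(table, value)  # recurse on the next-hop address
    return None


table = {
    "10.0.0.4/32": ("dev", "vti1"),     # static interface route
    "10.0.0.5/32": ("dev", "vti2"),     # static interface route
    "10.0.0.0/16": ("via", "10.0.0.4"), # BGP route learned from Azure
}
```

Here resolve_interface(table, "10.0.1.10") ends up on vti1: the /16 points at 10.0.0.4, which in turn resolves through the static /32. Take the two static routes out of the table and the same lookup never reaches an interface, so it returns None.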

Notes

It’s important to understand some basics about Azure Networking:

Azure will always advertise over BGP only the VNet address space and not each subnet separately. If you have more than one address space defined, Azure will advertise those separately.

Azure will also advertise to you every single route it has in the VNet table, so if you have added static routes pointing to an NVA in your VNet, those networks will also be advertised.

There’s no 100% reliable way to get the VNet route table in one simple API query, as the table may differ among subnets or even among network interfaces / VMs. My suggestion is to always query the effective routing table of a VM you are familiar with, so that the results are predictable and any surprises stand out.

Make sure you understand how Azure routes traffic in case a BGP route and a static route have the same prefix: Static routes created using UDR take precedence over BGP learned routes (and BGP learned routes take precedence over system routes).
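The precedence rule can be sketched in a few lines of Python. This is a simplified model and not an Azure API: the route dictionaries, the "user"/"bgp"/"system" labels and select_route are made up for illustration. It also assumes that the most specific prefix is chosen first and that the source of the route only breaks ties between identical prefixes.

```python
import ipaddress

# Lower number wins when two routes have exactly the same prefix.
PRECEDENCE = {"user": 0, "bgp": 1, "system": 2}


def select_route(routes, destination):
    """Pick the route for a destination: longest prefix first, then route source."""
    addr = ipaddress.ip_address(destination)
    candidates = [r for r in routes if addr in ipaddress.ip_network(r["prefix"])]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r["prefix"]).prefixlen,
                       PRECEDENCE[r["source"]]),
    )


routes = [
    {"prefix": "10.10.0.0/16", "source": "bgp", "next_hop": "VNet Gateway"},
    {"prefix": "10.10.0.0/16", "source": "user", "next_hop": "NVA"},
    {"prefix": "10.10.5.0/24", "source": "bgp", "next_hop": "VNet Gateway"},
]
```

In this example traffic to 10.10.1.1 follows the UDR to the NVA, because both /16 routes match and the UDR takes precedence, whereas traffic to 10.10.5.1 follows the more specific /24 learned over BGP.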

Any address space defined as part of a local network gateway creates a route in the VNet table when the VPN that connects to it is UP.

Conclusion

At this point you have two tunnels from one Vyos device to two Azure VNet Gateway instances. This offers you resiliency against failures on one Azure Gateway, including zone failure resiliency if you chose VpnGw1Az as the SKU for your VNet Gateway.

Figure: Redundant tunnels from on-premises to Azure

I would recommend at least adding a second Vyos and repeating all the steps above, including the Azure steps except the creation of the VNet Gateway, to create a mesh of 4 tunnels between your premises and Azure so you can be resilient to failures on either side.

Figure: Dual redundant tunnels from on-premises to Azure

At that point you should carefully look into how you’d like your traffic to travel. Asymmetric routing is fine and in fact that’s how the Internet actually works, however if you have L4 devices (e.g. firewalls) that may be exposed to asymmetric routing they will probably drop the asymmetric traffic (as that’s the whole point of a stateful / L4 device).

There are different ways you can influence how the traffic is sent from/to each side, such as routing weight or longer AS Paths. We may cover those in a future article.

4 Replies to “Active-active route-based Vyos 1.2 to Azure VPN (with BGP!)”

  1. Hello!

    Is it possible to have this setup if I have a VyOS appliance on Hyper-V behind a residential gateway?

    1. I don’t see why not, however you want to make sure whichever NAT approaches you’re taking on both the Hyper-V and the residential gateway layers do not interfere. A usual problem is e.g. the residential gateway deciding to change the source port of your outbound connections (and it has to be 4500/UDP on either side for scenarios with the IPSec peer behind a NAT).
