Cloud network infrastructure with BGP

This page describes a network architecture designed especially for cloud networks: it maintains a completely dynamic, mobile IP infrastructure using BGP. This architecture has been in use at Fubra for almost 2 years now and has performed flawlessly for us.

The key advantages are:

IP Address Portability: This architecture provides a mechanism (other than ARP!) that servers can use to advertise their public IP addresses to the network. It is particularly useful for providing IP address portability to virtual servers and virtual machines without having to worry about ARP issues. Individual IP addresses can be transparently migrated between geographically separate data centres, as long as there is an internal AS path between the sites.

That means you can move a single IP address from one data centre to another and everything will "just work!".

Save IP addresses, be more secure: Why not kill two birds with one stone? By giving all of your physical host servers a non-routable address like 10.0.0.10 you can isolate them from the Internet altogether. Mount your public addresses inside your virtual servers and watch them spring into life when they start.

This also saves IP space as you no longer need to allocate public addresses for each physical node from your RIPE allocation ;)

Flat Network Structure reduces hops and latencies: All servers at one site have a direct route to all services on all hosts, enabling a completely "flat" architecture atop a suitably sized core network prefix. Packets get to their destinations faster as they no longer need to cross as many routers.

Zero configuration once BGP is set up: Once BGP is enabled, the whole network configures itself automatically. You can automate the BGP setup using something like Puppet.

Shared Address Binding: It enables multiple hosts to bind to shared IP addresses without having to deal with ARP issues, which is very useful for things like active/active load balancers and failover. Complex policies can be set up in BGP to influence network behaviour when the topology changes. Used in conjunction with multiple data centres you can create truly redundant services.

Diagram of the network architecture

AS65001                                                             Backbone
------------------+-----------------------+-----------------------> x.x.x.x/28
                  |                       |
          x.x.x.12|               x.x.x.11|         
         +-----------------+    +-----------------+
         |  border router  |    |  border router  |
         |      (bgp)      |    |      (bgp)      |
         +-----------------+    +-----------------+
          10.0.0.1|               10.0.0.2|
                  |                       |                          Core Network
------------------+---------+-+------+----+---+-+-------------------> 10.0.0.0/24
                            | |      |        | |       
                            | |      |        | |
 *LVS Range: 10.0.1.0/24*   | |      |        | |
+----------------+          | |  []--+--[]    | |          +----------------+
|  load balancer |10.0.0.7  | |      |        | |  10.0.0.5|  route server  |
|      (LVS)     |----------+ |  []--+--[]    | +----------|      (bgp)     |
+----------------+            |      |        |            +----------------+ 
+----------------+            |  []--+--[]    |            +----------------+ 
|  load balancer |10.0.0.8    |      |        |    10.0.0.6|  route server  | 
|      (LVS)     |------------+  []--+--[]    +------------|      (bgp)     | 
+----------------+                   |                     +----------------+
                                  Servers
                              (10.0.0.10->n)

Setting it all up

You will need to create a bgp-enabled interior network - this should be nice and easy, since ideally you will already have a BGP exterior and your own AS number. If you don't have BGP on your network yet, then you can still use it to create a dynamic interior, but your external connectivity will not be redundant.

Install Quagga on your border routers and route servers

We start with a pair of dedicated servers running CentOS 5.2 to be used as route servers, and another pair of BGP-enabled routers at the border. This HOWTO will assume the border routers are also running CentOS 5.2.

For BGP on Linux, Quagga is an excellent solution.

Install the Quagga suite, enable zebra and bgp daemons:

yum install quagga
chkconfig zebra on
chkconfig bgpd on

Next you'll need a valid startup configuration for zebra and bgpd.

Put the following in /etc/quagga/zebra.conf

hostname host.domain.tld
password XXXXXX

Put the following in /etc/quagga/bgpd.conf

hostname host.domain.tld
password XXXXXX
!
router bgp 65001
 bgp router-id 10.0.0.n

Where 10.0.0.n is the core network IP address of the host you are installing Quagga on (see diagram above).
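
Since the bare-bones bgpd.conf differs between hosts only in the hostname and router-id, it can be generated rather than typed by hand. The following is a minimal sketch in shell, not part of the original setup: the function name make_bgpd_conf and the example hostname are hypothetical, and the password placeholder is left as-is for you to fill in.

```shell
#!/bin/sh
# Hypothetical helper: print a bare-bones bgpd.conf for one host.
# Arguments: hostname, core network IP (used as router-id), AS number.
make_bgpd_conf() {
    cat <<EOF
hostname $1
password XXXXXX
!
router bgp $3
 bgp router-id $2
EOF
}

# Example: preview the configuration for border router 1.
make_bgpd_conf border1.domain.tld 10.0.0.1 65001
```

Redirecting the output to /etc/quagga/bgpd.conf on each host would give the same file as the manual step above; a tool like Puppet could do the same job with a template.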

Now chown the config files quagga:quagga so the routing daemons can save their configuration:

chown quagga:quagga /etc/quagga/zebra.conf
chown quagga:quagga /etc/quagga/bgpd.conf

Next, start up the zebra and bgpd services:

service zebra start
service bgpd start

Since both configurations are bare bones, telnet into zebra and bgpd and save the configurations:

telnet localhost zebra
enable
write
exit

telnet localhost bgpd
enable
write
exit

If you get any errors, it's most likely the file ownership or permissions on the configuration files. Go back and make sure the configuration files are owned by quagga:quagga and have mode 644.
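
The ownership and mode check can be scripted. This is a rough sketch that assumes GNU stat (as shipped with CentOS); the helper name conf_perms is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print "owner:group mode" for a file (GNU stat syntax).
conf_perms() {
    stat -c '%U:%G %a' "$1"
}

# Report any Quagga config file that is missing or is not quagga:quagga, mode 644.
for f in /etc/quagga/zebra.conf /etc/quagga/bgpd.conf; do
    [ -e "$f" ] || { echo "$f: missing"; continue; }
    [ "$(conf_perms "$f")" = "quagga:quagga 644" ] || echo "$f: $(conf_perms "$f")"
done
```

No output means both files look right.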

Repeat this process on all border routers and route servers (that should be 4 hosts minimum for redundancy).

Set up a full BGP mesh between border routers and route servers

A full BGP mesh is where every host participating in BGP has a session with every other host. We need a full BGP mesh between our border routers and route servers. Here is a diagram:

         +-----------------+   +-----------------+
         | border router 1 |   | border router 2 |
         |    10.0.0.1     |---|    10.0.0.2     |
         +-----------+---+-+   +-+---+-----------+
                     |    \     /    |
                     |     \   / BGP |
                     |      \ /      |
                     |       X   Full|
                     |      / \      |
                     |     /   \ Mesh|
                     |    /     \    |
         +-----------+---+-+   +-+---+-----------+
         |  route server 1 |---|  route server 2 |
         |    10.0.0.5     |   |    10.0.0.6     |
         +-----------------+   +-----------------+
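
To see how many sessions a full mesh implies, the pairs can be enumerated. The sketch below is illustrative only; the function name mesh_pairs is hypothetical and the addresses are the four core network IPs from the diagram.

```shell
#!/bin/sh
# Print every pair of hosts that needs a BGP session in a full mesh.
# Each unordered pair is listed once, so n hosts give n*(n-1)/2 sessions.
mesh_pairs() {
    while [ $# -gt 1 ]; do
        a=$1
        shift
        for b in "$@"; do
            echo "$a <-> $b"
        done
    done
}

# The four hosts from the diagram: two border routers, two route servers.
mesh_pairs 10.0.0.1 10.0.0.2 10.0.0.5 10.0.0.6
```

For the four hosts here that is six sessions, matching the six links drawn in the diagram.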

This step requires setting up BGP sessions on the border routers and route servers. The aim here is for each of the border routers to provide a default route to the core network via the route servers, and for the route servers to provide routes to all IP addresses in use on the network.

First configure the border routers

This will need to be done on both border routers:

Add some prefix-lists to filter everything and keep things safe:

ip prefix-list defaultroute seq 5 permit 0.0.0.0/0
ip prefix-list defaultroute seq 10 deny any
ip prefix-list hostroutes seq 5 permit 0.0.0.0/0 ge 32
ip prefix-list hostroutes seq 10 deny any
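
The defaultroute list permits only the exact prefix 0.0.0.0/0, while "0.0.0.0/0 ge 32" in hostroutes permits any IPv4 prefix whose length is at least 32, which in practice means single-host /32 routes. The shell sketch below mimics that matching logic to make it concrete. It is not how Quagga evaluates prefix-lists internally, the function name classify_prefix is hypothetical, and 192.0.2.10 is just a documentation address.

```shell
#!/bin/sh
# Hypothetical helper mimicking the two prefix-lists above.
# Prints which list (if any) would permit the given IPv4 prefix.
classify_prefix() {
    len=${1#*/}                      # prefix length after the slash
    if [ "$1" = "0.0.0.0/0" ]; then
        echo defaultroute            # exact match on the default route
    elif [ "$len" -ge 32 ]; then
        echo hostroutes              # "ge 32": host routes only
    else
        echo denied                  # caught by "deny any" in both lists
    fi
}

classify_prefix 0.0.0.0/0        # prints: defaultroute
classify_prefix 192.0.2.10/32    # prints: hostroutes
classify_prefix 10.0.0.0/24      # prints: denied
```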

Next set up a peer-group to apply common settings across all route servers:

router bgp 65001
 neighbor routeserver peer-group
 neighbor routeserver description route servers on core network
 neighbor routeserver update-source 10.0.0.1
 neighbor routeserver next-hop-self
 neighbor routeserver default-originate
 neighbor routeserver soft-reconfiguration inbound
 neighbor routeserver prefix-list defaultroute out

This config will work for border router 1. Change the update-source address to 10.0.0.2 for border router 2.

Next set up neighbor statements for each of your route servers:

router bgp 65001
 neighbor 10.0.0.5 remote-as 65001
 neighbor 10.0.0.5 peer-group routeserver
 neighbor 10.0.0.5 description route server 1
 neighbor 10.0.0.6 remote-as 65001
 neighbor 10.0.0.6 peer-group routeserver
 neighbor 10.0.0.6 description route server 2

Now the border router is ready and waiting to establish sessions with the route servers.

Configure the route servers

We want to receive a default gateway from the border routers, and propagate host routes learned from servers to the border routers.

First add some filters. These are the same ones as above, plus alldynamicroutes, which is applied outbound towards the border routers:

ip prefix-list alldynamicroutes seq 5 permit 0.0.0.0/0 ge 32
ip prefix-list alldynamicroutes seq 100 deny any
ip prefix-list defaultroute seq 5 permit 0.0.0.0/0
ip prefix-list defaultroute seq 10 deny any
ip prefix-list hostroutes seq 5 permit 0.0.0.0/0 ge 32
ip prefix-list hostroutes seq 10 deny any

Now add a peer-group for the border routers:

router bgp 65001
 neighbor border peer-group
 neighbor border soft-reconfiguration inbound
 neighbor border update-source 10.0.0.5
 neighbor border route-server-client
 neighbor border prefix-list defaultroute in
 neighbor border prefix-list alldynamicroutes out

This configuration will work on route server 1. Change the update-source address to 10.0.0.6 for route server 2.

And finally add the neighbor statements for the border routers:

router bgp 65001
 neighbor 10.0.0.1 remote-as 65001
 neighbor 10.0.0.1 peer-group border
 neighbor 10.0.0.1 description border router 1
 neighbor 10.0.0.2 remote-as 65001
 neighbor 10.0.0.2 peer-group border
 neighbor 10.0.0.2 description border router 2

Now you can look at your sessions on the route server:

routeserver1# sh ip bgp sum
BGP router identifier 10.0.0.5, local AS number 65001
1 BGP AS-PATH entries
0 BGP community entries

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.0.1        4 65001     182     186        0    0    0 02:59:13        1
10.0.0.2        4 65001     181     183        0    0    0 02:59:27        1
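
In the summary, an established session shows a prefix count in the last column, while a session that is down shows a state name such as Active or Idle instead. The sketch below filters summary output for neighbors that are not established. The function name down_neighbors is hypothetical, and the second neighbor line in the sample input is made up to show a failed session.

```shell
#!/bin/sh
# Read "show ip bgp summary" output on stdin and print each neighbor whose
# last column is not a number, i.e. whose session is not established.
down_neighbors() {
    awk '$1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && $NF !~ /^[0-9]+$/ { print $1, $NF }'
}

# Sample input: 10.0.0.1 is established, 10.0.0.2 is (hypothetically) down.
down_neighbors <<'EOF'
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.0.1        4 65001     182     186        0    0    0 02:59:13        1
10.0.0.2        4 65001       0       0        0    0    0 never    Active
EOF
```

The sample prints "10.0.0.2 Active". On a live route server, and assuming the vtysh shell from the Quagga package is installed, the same filter could be fed with: vtysh -c 'show ip bgp summary' | down_neighbors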

Next make sure the default routes are received from the border routers:

routeserver1# show ip bgp
BGP table version is 0, local router ID is 10.0.0.5
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*>i0.0.0.0          10.0.0.1                      100      0 i
* i                 10.0.0.2                      100      0 i

This output tells you that the router has received two default routes, one from each border router. You should see a similar output from both route servers.

Application servers

If you've got this far then you've got the basic infrastructure in place. Now you need to add some application servers to the network to make it useful.