BGP Explained

Right, BGP explained. Let me tell you what happened the first time I had to deal with it properly.

Got handed a service provider migration project. New job, few months in, boss drops it on my desk and says the edge routers need BGP sorting. Right. Fine. How hard can it be.

I’d done basic BGP before. Configured an internet handoff here and there. Knew the commands. Thought I was sorted.

Spent a week going through documentation. RFC 4271. Vendor guides. Every tutorial I could find online. Felt like I was getting somewhere. Then sat down in front of an actual broken BGP session at half eleven at night and had absolutely no idea what I was looking at.

The problem was that every resource either glossed over how BGP actually works or went straight into RFC text before you’d got any feel for what the protocol was actually doing. Neither was much use when the thing isn’t working and you’re getting calls.

So this is the version I wish I’d had.

[Figure: two autonomous systems (AS) sharing routes over BGP]

So What Is BGP Then

BGP – Border Gateway Protocol – is what holds the internet together. Every autonomous system on the internet uses it to exchange routing information with other autonomous systems. The global BGP routing table right now is sitting above 900,000 IPv4 prefixes.

OSPF is not built for that. EIGRP is not built for that. BGP is.

But it’s not just internet handoffs. You’ll find it in MPLS VPN architectures. Data centre fabrics running BGP EVPN. Large enterprise networks doing multi-site redundancy. SD-WAN overlays. Once you get past basic campus networking BGP is everywhere.

Current version is BGP-4, defined in RFC 4271. It’s a path vector protocol. Most explanations skip past what that actually means. Don’t. Understanding path vector is the key to understanding why BGP behaves the way it does when things go wrong.

What Path Vector Actually Means

OSPF builds a complete map of the network. Every router knows every link, every cost, every other router. Runs Dijkstra’s algorithm, calculates shortest paths. Clean and logical.

BGP doesn’t do any of that. Doesn’t know about links between routers at all.

What BGP knows about is paths. Specifically the sequence of autonomous systems a route passes through to reach its destination. When BGP advertises a route it includes the AS path – a list of every AS number the route has already travelled through.

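So your edge router receives a route for 8.8.8.0/24 and the advertisement contains something like: AS path 65001 15169. Read it right to left. The route was originated by AS15169 and then passed on by AS65001, which is the neighbour that handed it to you. Traffic heading for that prefix crosses them the other way round, AS65001 first and then AS15169. That’s it. No link metrics. No topology map. Just a list of AS numbers.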

Loop prevention falls out of this automatically too. BGP receives an advertisement, sees its own AS number already in the AS path, discards it. Route has already been through here. No extra mechanisms needed. Quite clever when you think about it.

eBGP and iBGP – Get This Straight

This is the one that trips people up most when they start. Get this clear in your head and everything else makes more sense.

eBGP – External BGP – is peering between routers in different autonomous systems. Your edge router peering with your ISP. Different AS numbers on each end.

iBGP – Internal BGP – is peering between routers within the same AS. Multiple BGP speakers inside your own network sharing BGP routing information with each other.

Similar but with critical differences that bite people constantly.

With eBGP your AS number gets prepended to the AS path when you advertise to a peer. Other ASes can see the route came from you.

With iBGP you don’t prepend your own AS because the route is still inside it. But here’s the rule that catches people out: a route learned via iBGP cannot be readvertised to another iBGP peer. That’s split horizon for iBGP. It’s why you need either a full mesh of iBGP sessions between all your internal BGP routers or you need route reflectors. More on those in a minute.

Administrative distance is different too. On Cisco the defaults are AD 20 for eBGP routes and AD 200 for iBGP routes. An iBGP-learned route loses to almost any IGP route for the same prefix if there’s a conflict. Worth knowing when you’re troubleshooting and routes aren’t installing where you expect.

BGP Explained: How It Actually Picks the Best Route

This is where BGP really differs from everything else. OSPF picks best route based on cost. One number. BGP evaluates a set of attributes in a specific order. First attribute that gives a clear winner, that’s your best path. If that attribute is identical on all paths it falls through to the next one.

| Order | Attribute | Rule |
|---|---|---|
| 1 | Weight | Cisco proprietary, local to the router. Higher wins. |
| 2 | Local Preference | Shared across AS via iBGP. Higher wins. |
| 3 | Locally Originated | Routes originated here win. |
| 4 | AS Path Length | Shorter wins. |
| 5 | Origin Code | IGP beats EGP beats Incomplete. |
| 6 | MED | Lower wins. Only compared between same AS. |
| 7 | eBGP over iBGP | External beats internal. |
| 8 | IGP Metric | Lowest cost to BGP next hop. |
| 9 | Oldest eBGP route | More stable path preferred. |
| 10 | Router ID | Lowest BGP router ID. |
| 11 | Neighbour IP | Last resort tiebreaker. |

The ones that matter day to day are weight, local preference, AS path, and MED.

Weight is Cisco only and local to one router. Nothing else in the decision process overrides it. Useful when you want one specific router using a specific path regardless of what the rest of your AS thinks.
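A rough sketch of what that looks like, assuming the ISP peer 203.0.113.1 and AS 65001 from the configuration section of this post. The value 200 is arbitrary. Routes learned from a neighbour get weight 0 by default, so anything higher does the job.

```
! Prefer every route learned from this neighbour, on this router only
router bgp 65001
 neighbor 203.0.113.1 weight 200
```

Nothing about this leaves the box. Other routers in the AS never see the weight.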

Local Preference gets shared across your whole AS via iBGP. Set local pref 200 on routes learned from your primary ISP and 100 on the backup. Every router in your AS prefers the primary for outbound. Standard approach and it works well.
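Something like the following would do it on the router facing the primary ISP. This is a minimal sketch, not a full policy. The route-map name PRIMARY-IN is made up for illustration, and the peer address and AS numbers are borrowed from the configuration section of this post. Routes from the backup ISP are left at the default local preference of 100.

```
! Tag everything learned from the primary ISP with local preference 200
route-map PRIMARY-IN permit 10
 set local-preference 200
!
router bgp 65001
 neighbor 203.0.113.1 remote-as 65002
 neighbor 203.0.113.1 route-map PRIMARY-IN in
```

An inbound policy change only applies to routes received after it. A soft inbound clear for that neighbour (clear ip bgp 203.0.113.1 soft in) re-runs the policy without dropping the session.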

AS Path prepending is the standard way of influencing inbound traffic. Make your AS path artificially longer when advertising to one ISP. That path looks less attractive. Remote networks use your other connection instead. Bit blunt as a tool but effective enough.
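A sketch of prepending towards a backup ISP. The backup peer 198.51.100.1, its AS number 65003 and the route-map name BACKUP-OUT are all assumptions for illustration. Only AS 65001 comes from the configs in this post.

```
! Add two extra copies of our own AS to everything sent to the backup ISP
route-map BACKUP-OUT permit 10
 set as-path prepend 65001 65001
!
router bgp 65001
 neighbor 198.51.100.1 remote-as 65003
 neighbor 198.51.100.1 route-map BACKUP-OUT out
```

Only ever prepend your own AS number. How many copies you need depends on how the paths look from the far side, so treat two as a starting point rather than a rule.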

MED is a hint you give to directly connected ASes about which path to use into your AS. Lower is preferred. Only compared between paths from the same neighbouring AS. More precise than prepending but less commonly used.
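A sketch of MED across two links into the same ISP. The second peer address 203.0.113.5 and both route-map names are hypothetical. The first peer and the AS numbers match the configuration section of this post.

```
! Lower MED on the link we want the ISP to prefer for traffic coming in
route-map MED-PREFERRED permit 10
 set metric 50
!
route-map MED-FALLBACK permit 10
 set metric 100
!
router bgp 65001
 neighbor 203.0.113.1 remote-as 65002
 neighbor 203.0.113.1 route-map MED-PREFERRED out
 neighbor 203.0.113.5 remote-as 65002
 neighbor 203.0.113.5 route-map MED-FALLBACK out
```

It’s only a hint. The other side is free to ignore or overwrite it, so check with the ISP before relying on it.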

[Figure: BGP best path selection flowchart]

Session States – What You’ll See in Show Commands

Configure a BGP peer and it goes through states before it’s up and exchanging routes. You’ll see these in show bgp summary. Worth knowing what each one means.

Idle – Not attempting to connect. Config issue or interface is down.

Connect – TCP connection attempt in progress.

Active – TCP connection failed, retrying. This is where most troubleshooting starts. Peer stuck in Active nearly always means the peer IP isn’t reachable, TCP port 179 is blocked somewhere, or the remote end isn’t configured yet. Check those three in order and you’ll find it almost every time.

OpenSent – TCP connected, OPEN message sent, waiting for response.

OpenConfirm – OPEN messages exchanged, waiting for KEEPALIVE.

Established – Up and exchanging routes. Where you want to be.
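For a peer stuck in Active, the three checks might look like this on Cisco IOS. It’s a sketch using the ISP peer address 203.0.113.1 from the configuration section, so swap in your own. The third check can only confirm your own side. Whether the remote end is configured is a question for whoever runs it.

```
! 1. Is there a route to the peer, and does it answer?
show ip route 203.0.113.1
ping 203.0.113.1
! 2. Can you open TCP 179 to the peer?
telnet 203.0.113.1 179
! 3. Does your own neighbour config match what the other side was given?
show running-config | section router bgp
```

If the ping works but the connection to port 179 times out, an ACL or firewall in the path is the usual suspect.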

Basic Configuration

eBGP to an ISP:

router bgp 65001
 bgp router-id 192.168.100.1
 neighbor 203.0.113.1 remote-as 65002
 neighbor 203.0.113.1 description ISP-PRIMARY-PEER
 network 192.0.2.0 mask 255.255.255.0

Different AS numbers on each end. That’s eBGP.

The network command catches people coming from OSPF. In OSPF it enables the protocol on matching interfaces. In BGP it tells BGP to take that exact prefix from the routing table and advertise it to peers. Exact match required. Not a summary, not a longer prefix. The exact prefix must exist in the routing table or BGP won’t advertise it. Cost me more troubleshooting time than I’d like to admit before I understood that properly.
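One common way to make sure the exact prefix is always there is a static route to Null0. A minimal sketch, assuming the 192.0.2.0/24 prefix and AS 65001 from the config above, and assuming the real subnets inside that /24 exist as more specific routes so traffic still gets where it’s going:

```
! Anchor the exact /24 in the routing table so the network statement has something to match
ip route 192.0.2.0 255.255.255.0 Null0
!
router bgp 65001
 network 192.0.2.0 mask 255.255.255.0
```

Anything arriving for an address in the /24 with no more specific route gets dropped at this router, which is normally what you want for space that isn’t in use.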

iBGP between two routers in the same AS:

router bgp 65001
 bgp router-id 192.168.100.1
 neighbor 192.168.100.2 remote-as 65001
 neighbor 192.168.100.2 update-source Loopback0

Same AS number both ends. That’s iBGP. Always use update-source pointing at the loopback. Loopbacks don’t go down when a physical interface fails so the iBGP session stays up through link failures as long as the router is still reachable via another path.
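The other end needs the mirror image. A sketch of the second router, assuming 192.168.100.1 and 192.168.100.2 are the Loopback0 addresses on each box and that your IGP carries both loopbacks:

```
! On the second router, whose Loopback0 is assumed to be 192.168.100.2
router bgp 65001
 bgp router-id 192.168.100.2
 neighbor 192.168.100.1 remote-as 65001
 neighbor 192.168.100.1 update-source Loopback0
```

If only one side sources from its loopback, the other side sees packets arriving from an address it has no neighbour statement for and the session won’t come up.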

Route Reflectors

With iBGP every BGP router needs a session to every other BGP router. Full mesh. That works out at n(n-1)/2 sessions for n routers. Three routers, that’s three sessions. Fine. Ten PE routers in a service provider network, that’s 45 sessions. Hundred PE routers? 4,950. Pain in the backside to manage and scale.

Route reflectors sort this. A route reflector can readvertise iBGP-learned routes to other iBGP peers – something that’s normally blocked by split horizon. Clients send routes to the reflector, reflector passes them to all other clients.

router bgp 65001
 neighbor 192.168.100.2 remote-as 65001
 neighbor 192.168.100.2 route-reflector-client
 neighbor 192.168.100.3 remote-as 65001
 neighbor 192.168.100.3 route-reflector-client

That’s on the reflector itself. Clients just configure a normal iBGP session pointing at it.

Always run two route reflectors minimum. I inherited a network once with a single route reflector and no redundancy. It went down on a Sunday morning. iBGP distribution stopped across the entire AS. Not a fun day. Don’t make that mistake.
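On the client side, two reflectors just means two ordinary iBGP sessions. A sketch, assuming the first reflector’s loopback is 192.168.100.1 and inventing 192.168.100.10 for a second one. Both reflectors would carry the route-reflector-client line for this router.

```
! On a client: plain iBGP sessions, one to each reflector
router bgp 65001
 neighbor 192.168.100.1 remote-as 65001
 neighbor 192.168.100.1 update-source Loopback0
 neighbor 192.168.100.10 remote-as 65001
 neighbor 192.168.100.10 update-source Loopback0
```

The client doesn’t know or care that its peers are reflectors. Lose one and the routes still arrive from the other.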

Checking What’s Going On

Start here every time:

show bgp ipv4 unicast summary

All configured neighbours, their AS numbers, current state, how many prefixes received. Established with a number in the prefixes column means it’s working. Anything else means there’s a problem somewhere.

Full routing table:

show bgp ipv4 unicast

Status codes – > is best path, i is iBGP learned, * is valid. Next hop, local pref, AS path, origin code all visible.

Specific prefix including all paths BGP knows about:

show bgp ipv4 unicast 8.8.8.0/24

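Shows every path BGP holds for that prefix with its attributes, and marks the one selected as best. Line the attributes up against the decision order and you can see why it beat the others. Very useful when traffic is going somewhere unexpected.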

What you’re actually advertising to a peer:

show bgp ipv4 unicast neighbors 203.0.113.1 advertised-routes

What you’re receiving from them:

show bgp ipv4 unicast neighbors 203.0.113.1 received-routes

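These two are my first stop when traffic isn’t going where it should. What you think you’re advertising and what BGP is actually sending are not always the same thing. Trust the output. One catch on Cisco IOS: received-routes only returns anything if the neighbour has soft-reconfiguration inbound configured. Without it, swap received-routes for routes, which shows what was accepted after your inbound policy.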

Things That Go Wrong Constantly

Neighbour stuck in Active – reachability, TCP 179, remote config. Check those three.

Routes not appearing in BGP table after a network command – the exact prefix isn’t in the routing table. Exact match or nothing.

Routes in BGP table but not installing in the routing table – next hop isn’t reachable via the IGP. Common with iBGP at the network edge. Add next-hop-self on iBGP peers so the next hop becomes something reachable.

Unexpected path selection – work through the attribute table in order. Use show bgp for the specific prefix. Something’s got a weight or local preference set somewhere and forgotten about. Almost always is.
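For the next hop problem in the list above, a sketch of next-hop-self on the edge router that holds the eBGP session. It assumes the iBGP peer 192.168.100.2 from the earlier config and a loopback-sourced session:

```
! On the edge router: advertise our own address as next hop to the iBGP peer
router bgp 65001
 neighbor 192.168.100.2 remote-as 65001
 neighbor 192.168.100.2 update-source Loopback0
 neighbor 192.168.100.2 next-hop-self
```

The internal router then only needs a route to the edge router’s loopback, which the IGP already carries, rather than to the ISP’s link address.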


External Link: RFC 4271 – A Border Gateway Protocol 4 (BGP-4) – the BGP specification covering path attributes and the decision process in full

