Saturday, November 12, 2011

aggregate-address ... summary-only-after-a-while

As it seems, there is always something that you think you know, until it's proven the other way around.

Some years ago, when i was studying for the CCIE, i knew that in order to suppress more specific routes from an aggregate advertisement in BGP, you could use the "aggregate-address .... summary-only" command. And i believed it until recently.

Let's suppose you have the following config in a ASR1k (10.1.1.1) running 15.1(2)S2:

router bgp 100
 bgp router-id 10.1.1.1
 neighbor 10.2.2.2 remote-as 100
 neighbor 10.2.2.2 update-source Loopback0
...
 address-family ipv4
  aggregate-address 10.10.10.0 255.255.255.0 summary-only
  redistribute connected
  neighbor 10.2.2.2 activate
...

Then you have 2 subscribers logging in.
With "show bgp" the 2 /32 routes under 10.10.10.0/24 seem to be suppressed and the /24 is in the BGP table as it should be:

*> 10.10.10.0/24    0.0.0.0                            32768 i
s> 10.10.10.3/32    0.0.0.0                  0         32768 ?
s> 10.10.10.4/32    0.0.0.0                  0         32768 ?

but doing some "debug bgp updates/events" reveals the following:

BGP(0): 10.2.2.2 send UPDATE (format) 10.10.10.3/32, next 10.1.1.1, metric 0, path Local
BGP(0): 10.2.2.2 send UPDATE (format) 10.10.10.4/32, next 10.1.1.1, metric 0, path Local

...and after a while:

BGP: aggregate timer expired

BGP: topo global:IPv4 Unicast:base Remove_fwdroute for 10.10.10.3/32
BGP: topo global:IPv4 Unicast:base Remove_fwdroute for 10.10.10.4/32

At the same time on the peer router (10.2.2.2) you can see the above 2 /32 routes being received:

RP/0/RP0/CPU0:ASR9k#sh bgp neighbor 10.1.1.1 routes | i /32
*>i10.10.10.3/32    10.1.1.1            0    100      0 ?
*>i10.10.10.4/32    10.1.1.1            0    100      0 ?

and immediately afterwards you can see the 2 /32 routes being withdrawn:

RP/0/RP0/CPU0:ASR9k#sh bgp neighbor 10.1.1.1 routes | i /32

Cisco is a little bit contradicting on this behavior, as usual.


According to Cisco tac, the default aggregation logic runs every 30 seconds, but bgp update processing will be done almost every 2 seconds. That's the reason the route is being updated initially and later withdrawn (due to the aggregation processing following the initial update). They also admit that the root cause of this problem is with the BGP code. The route will be advertised as soon as the best path is completed. It may take 30 seconds or more for the aggregation logic to complete and withdraw the more specific route.

Then we also have the following:

Bug CSCsu96698

BGP: /32 route being advertised while 'summary-only' is configured

Symptoms: More specific routes are advertised and withdrawn later even if config aggregate-address net mask summary-only is configured. The BGP table shows the specific prefixes as suppressed with s>
Conditions: This occurs only with very large configurations.
Workaround: Configure a distribute-list in BGP process that denies all of the aggregation child routes.

Related Bug Information

It takes 30 seconds for BGP to form aggregate route

Symptom: for approximately 30 seconds router announces specific prefixes instead of aggregate route
Conditions: bgp session up/down
Workaround: unknown yet


Release notes of 12.2SB and 12.0S

The periodic function is by default called at 60 second intervals. The aggregate processing is normally done based on the CPU load. If there is no CPU load, then the aggregate processing function would be triggered within one second. As the CPU load increases, this function call will be triggered at higher intervals and if the CPU load is very high it could go as high as the maximum aggregate timer value configured via command. By default this maximum value is 30 seconds and is configurable with a range of 6-60 seconds and in some trains 0. So, if default values are configured, then as the CPU load increases, the chances of hitting this defect is higher.


Release notes of 12.2(33)SXH6

CLI change to bgp aggregate-timer command to suppress more specific routes.

Old Behavior: More specific routes are advertised and withdrawn later, even if aggregate-address
summary-only is configured. The BGP table shows the specific prefixes as suppressed.

New Behavior: The bgp aggregate-timer command now accepts the value of 0 (zero), which
disables the aggregate timer and suppresses the routes immediately.


Command Reference for "bgp aggregate-timer"

To set the interval at which BGP routes will be aggregated or to disable timer-based route aggregation, use the bgp aggregate-timer command in address-family or router configuration mode. To restore the default value, use the no form of this command.

bgp aggregate-timer seconds
no bgp aggregate-timer

Syntax Description

seconds

Interval (in seconds) at which the system will aggregate BGP routes.

•The range is from 6 to 60 or else 0 (zero). The default is 30.
•A value of 0 (zero) disables timer-based aggregation and starts aggregation immediately.

Command Default

30 seconds

Usage Guidelines

Use this command to change the default interval at which BGP routes are aggregated.

In very large configurations, even if the aggregate-address summary-only command is configured, more specific routes are advertised and later withdrawn. To avoid this behavior, configure the bgp aggregate-timer to 0 (zero), and the system will immediately check for aggregate routes and suppress specific routes.


The interesting part is that the command reference for "aggregate-address ... summary-only" doesn't mention anything about this behavior in order to warn you.

Using the summary-only keyword not only creates the aggregate route (for example, 192.*.*.*) but also suppresses advertisements of more-specific routes to all neighbors. If you want to suppress only advertisements to certain neighbors, you may use the neighbor distribute-list command, with caution. If a more-specific route leaks out, all BGP or mBGP routers will prefer that route over the less-specific aggregate you are generating (using longest-match routing).


The following debug logs show the default aggregate-timer which is 30 secs, vs the default BGP scan timer which is 60 secs:

Nov 12 21:45:32.468: BGP: Performing BGP general scanning
Nov 12 21:45:38.906: BGP: aggregate timer expired
Nov 12 21:46:09.637: BGP: aggregate timer expired
Nov 12 21:46:32.487: BGP: Performing BGP general scanning
Nov 12 21:46:40.379: BGP: aggregate timer expired
Nov 12 21:47:11.099: BGP: aggregate timer expired
Nov 12 21:47:32.506: BGP: Performing BGP general scanning
Nov 12 21:47:41.828: BGP: aggregate timer expired
Nov 12 21:48:12.547: BGP: aggregate timer expired
Nov 12 21:48:32.525: BGP: Performing BGP general scanning
Nov 12 21:48:43.268: BGP: aggregate timer expired
Nov 12 21:49:13.989: BGP: aggregate timer expired
Nov 12 21:49:32.544: BGP: Performing BGP general scanning
Nov 12 21:49:44.765: BGP: aggregate timer expired
Nov 12 21:50:15.510: BGP: aggregate timer expired

Guess what! After changing the aggregate-timer to 0, the cpu load increases by a steady +10%, due to the BGP Router process!

ASR1k#sh proc cpu s | exc 0.00
CPU utilization for five seconds: 17%/6%; one minute: 16%; five minutes: 15%
PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process
 61   329991518  1083305044        304  4.07%  4.09%  3.93%   0 IOSD ipc task
340   202049403    53391862       3784  1.35%  2.10%  2.11%   0 VTEMPLATE Backgr
404    84594181  2432529294          0  1.19%  1.18%  1.14%   0 PPP Events
229    49275197     1710570      28806  0.71%  0.33%  0.39%   0 QoS stats proces
152    39536838   801801056         49  0.63%  0.56%  0.51%   0 SSM connection m
159    51982155   383585236        135  0.47%  0.66%  0.64%   0 SSS Manager

ASR1k#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
ASR1k(config)#router bgp 100
ASR1k(config-router)# address-family ipv4
ASR1k(config-router-af)# bgp aggregate-timer 0
ASR1k(config-router-af)#^Z

...and after a while:

ASR1k#sh proc cpu s | exc 0.00
CPU utilization for five seconds: 29%/6%; one minute: 26%; five minutes: 25%
PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process
394   143774989    19776654       7269  9.91%  7.93%  7.32%   0 BGP Router
 61   329983414  1083280872        304  4.79%  3.97%  3.90%   0 IOSD ipc task
340   202044852    53390504       3784  2.71%  2.22%  2.04%   0 VTEMPLATE Backgr
404    84591714  2432460324          0  1.03%  1.16%  1.09%   0 PPP Events
159    51980758   383575060        135  0.79%  0.66%  0.62%   0 SSS Manager
152    39535734   801779715         49  0.63%  0.54%  0.50%   0 SSM connection m

Conclusions

1) By default, the "aggregate-address ... summary-only" command doesn't immediately stop the announcement of more specific routes, as it's supposed to. You need to also change the BGP aggregate-timer to 0.
2) After changing the BGP aggregate-timer to 0, the announcement of more specific routes stops, but the cpu load increases by 10%.


C'mon Cisco! You gave us NH Address Tracking and PIC, and you can't fix the aggregation process?

14 comments:

  1. Great write up. What happens when we do a soft reset, does the same rule apply ?

    ReplyDelete
    Replies
    1. Soft reset doesn't have any effect.
      It's just a matter of local processing.

      Delete
  2. Hi Tassos,

    Nice article...!

    Just curious how come ASR1K is running 15.1(2)S2... As per my knowledge, it runs IOS XE and latest version is stil velow 4.0.

    Regards,
    Aditya Ghanekar...!

    ReplyDelete
    Replies
    1. Aditya,

      ASR1k runs IOS XE, which (in order to confuse you) is following 2 numbering schemes:
      IOS XE based one, like 3.x.x
      IOS based one, like 15.x (this is he Cisco IOS Software version from which the Cisco IOS XE Software image is derived)

      Delete
  3. Hi there, been reading through your blog and it has inspired me!

    I have decided that I must complete my CCIE this year 2012, I have setup a blog to track my progress, would you consider adding it to your blogroll?

    Many Thanks
    Roger
    http://www.rogerperkin.co.uk/ccie

    ReplyDelete
  4. Please, read carefully what TAC said; summary only route works, it doesn't when we've very large configurations.

    Say that Cisco is usual contradicting is false, I believe Cisco is working fine, more than other companies, and in this case I see clearly a bug rather than a contradiction.

    Regards

    ReplyDelete
    Replies
    1. I wish Cisco could so easily accept something as a bug. Their usual answer is "this is the expected behavior" or "this is a feature, not a bug".

      Delete
  5. Nice Article..... thanx for the useful updated information....keep posting

    ReplyDelete
  6. Thank you for such an in depth review and thank you for sharing this! You never know when you are going to need it, but when you do you realize it’s a life saver!

    ReplyDelete
  7. Dear Tassos,

    Thanks a lot for the post.

    Preparing for my CCNP Route, but im not able to understand the difference between aggregate routes and summary addresses?

    Both are same or not?

    Thanks in advance...

    CCNA Trainer

    ReplyDelete
  8. All code (independent of vendor) has bugs. Heavy is the head that wears the crown.

    ReplyDelete
  9. Hi,
    Could you please provide a path to prepare for CCIE in 3 months, Thanks in advance

    ReplyDelete

 
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Greece License.