From 24062df217da6327cf3d516a7c2a3d23083f8aeb Mon Sep 17 00:00:00 2001 From: Pierre Neltner Date: Thu, 16 Apr 2026 13:57:03 +0200 Subject: [PATCH 1/9] bgpd: Add neighbor X.X.X.X cluster-id CLUSTER_ID command Add the command : neighbor cluster-id CLUSTER_ID This command can be executed after router bgp AS, address-family AFI SAFI. This command assigns a per-neighbor cluster for that specific peer and afi safi combination. If the peer is not previously declared that command fails. For that purpose the peer struct has one new attribute per_neighbor_cluster that is an in_addr list the content of which can be accessed through afi and safi indexes. This list is associated with a new af_flag that indicates whether it is configured. Another new attribute is added: cluster_flags that is also accessed throug afi safi indexes and ensures that the checking of cluster_flags can be performed in O(1) in the hot-path. A new structure is added to represent the new per-neighbor clusters: struct cluster, it contains the cluster-id of the cluster, flags that define its behavior and a count of how many times a cluster is referenced in the configuration, when that count goes down to 0 the cluster is deleted. That count takes into account every peers and groups that are part of the cluster and any reflecetion policy for that group. The CLUSTER_FLAG_GLOBAL marks any per-neighbor cluster whose cluster-id is identical to the global cluster-id: the unique historical cluster-id of route-reflectors. Whenever a cluster is marked as global,its flags are ignored and the ones of the global cluster are used instead. bgp instances have a new attribute: per_neighbor_clusters: a list that contains cluster objects. A peer that is part of a per-neighbor cluster and is configured as a route-reflector-client, behaves as a classic route-reflector-client peer. However, prefixes received from that peer will be advertised to other client and non-client peers with the per-neighbor cluster-id of that peer in the prefix's cluster-list. In the same spirit prefix reflected to one such peer from a non-client peer will have the per-neighbor cluster-id of the client-peer added to the cluster list. When a prefix is reflected between two client peers part of different per-neighbor clusters only the cluster-id of the sender is added to the cluster-list. If not configured as route-reflector client, a peer with a per-neighbor cluster-id behaves as a normal peer. Because the per-neighbor cluster-ids have an influence on out-going prefixes, this commit also modifies update-groups more specificaly ```updgrp_hash_key_make``` and ```updgrp_hash_cmp``` in order to have peers from different per-neighbor clusters sorted in different update groups. In the spirit of the rfc4456 any incoming prefix that contains one of the cluster-ids of the route-reflector clients is dropped. This implies changes to the ```bgp_cluster_filter``` function in order to check all clusters and not only the global one. A cluster-id can be assigned to a peer that is not a route-reflector-client in that case that peer will behave normaly but the cluster-id will still be taken into account for filtering. This makes sure that commands do not depend on the execution order. Information about cluster-ids that are configured on peers can be found in the running config and in the display of that peer's information. Signed-off-by: Pierre Neltner --- bgpd/bgp_attr.c | 30 +++- bgpd/bgp_memory.c | 1 + bgpd/bgp_memory.h | 1 + bgpd/bgp_route.c | 11 ++ bgpd/bgp_updgrp.c | 15 +- bgpd/bgp_updgrp.h | 2 +- bgpd/bgp_vty.c | 87 +++++++++++ bgpd/bgpd.c | 373 +++++++++++++++++++++++++++++++++++++++++++++- bgpd/bgpd.h | 44 +++++- doc/user/bgp.rst | 10 ++ 10 files changed, 568 insertions(+), 6 deletions(-) diff --git a/bgpd/bgp_attr.c b/bgpd/bgp_attr.c index 721d78b72153..1b59de8d24ce 100644 --- a/bgpd/bgp_attr.c +++ b/bgpd/bgp_attr.c @@ -5748,9 +5748,35 @@ bgp_size_t bgp_packet_attribute(struct bgp *bgp, struct peer *peer, struct strea stream_putc(s, cluster ? cluster->length + 4 : 4); } + /* The route reflector has multiple clusters associated with it, + * put the one that has been configured by the user for that peer + * in the cluster-list. + * filtering has been updated to take these new clusters into account + * and prevent loops + */ + + /*if the originator of the prefix has a per-neighbor cluster configured*/ + if (CHECK_FLAG(from->af_flags[afi][safi], PEER_FLAG_REFLECTOR_CLIENT) && + CHECK_FLAG(from->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) + stream_put_in_addr(s, &from->cluster[afi][safi]); + + /*if the originator of the prefix is a client of the default cluster*/ + else if (CHECK_FLAG(from->af_flags[afi][safi], PEER_FLAG_REFLECTOR_CLIENT)) { + if (CHECK_FLAG(bgp->config, BGP_CONFIG_CLUSTER_ID)) + stream_put_in_addr(s, &bgp->cluster_id); + else + stream_put_in_addr(s, &bgp->router_id); + } + + /*if the destination of the id has a per-neighbor cluster configured*/ + else if (CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_REFLECTOR_CLIENT) && + CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) + stream_put_in_addr(s, &peer->cluster[afi][safi]); + /* If this peer configuration's parent BGP has - * cluster_id. */ - if (CHECK_FLAG(bgp->config, BGP_CONFIG_CLUSTER_ID)) + * cluster_id. + */ + else if (CHECK_FLAG(bgp->config, BGP_CONFIG_CLUSTER_ID)) stream_put_in_addr(s, &bgp->cluster_id); else stream_put_in_addr(s, &bgp->router_id); diff --git a/bgpd/bgp_memory.c b/bgpd/bgp_memory.c index ff9ed92b5c03..961bf63b812a 100644 --- a/bgpd/bgp_memory.c +++ b/bgpd/bgp_memory.c @@ -75,6 +75,7 @@ DEFINE_MTYPE(BGPD, COMMUNITY_LIST_HANDLER, "community-list handler"); DEFINE_MTYPE(BGPD, CLUSTER, "Cluster list"); DEFINE_MTYPE(BGPD, CLUSTER_VAL, "Cluster list val"); +DEFINE_MTYPE(BGPD, PER_NEIGHBOR_CLUSTER, "Per-neighbor Cluster"); DEFINE_MTYPE(BGPD, BGP_PROCESS_QUEUE, "BGP Process queue"); DEFINE_MTYPE(BGPD, BGP_CLEAR_NODE_QUEUE, "BGP node clear queue"); diff --git a/bgpd/bgp_memory.h b/bgpd/bgp_memory.h index 3cc2e1945053..47173f4a8f92 100644 --- a/bgpd/bgp_memory.h +++ b/bgpd/bgp_memory.h @@ -71,6 +71,7 @@ DECLARE_MTYPE(COMMUNITY_LIST_HANDLER); DECLARE_MTYPE(CLUSTER); DECLARE_MTYPE(CLUSTER_VAL); +DECLARE_MTYPE(PER_NEIGHBOR_CLUSTER); DECLARE_MTYPE(BGP_PROCESS_QUEUE); DECLARE_MTYPE(BGP_CLEAR_NODE_QUEUE); diff --git a/bgpd/bgp_route.c b/bgpd/bgp_route.c index dbd1aa3e6a05..81ae7a0d9f01 100644 --- a/bgpd/bgp_route.c +++ b/bgpd/bgp_route.c @@ -2055,7 +2055,12 @@ static bool bgp_cluster_filter(struct peer *peer, struct attr *attr) { struct in_addr cluster_id; struct cluster_list *cluster = bgp_attr_get_cluster(attr); + struct cluster *per_neighbor_cluster; + /* when receiving a route with any of the per-neighbor cluster-ids configured + * on the router, the route should be dropped to avoid loop regardless of the + * cluster that originated it + */ if (cluster) { if (CHECK_FLAG(peer->bgp->config, BGP_CONFIG_CLUSTER_ID)) cluster_id = peer->bgp->cluster_id; @@ -2064,6 +2069,12 @@ static bool bgp_cluster_filter(struct peer *peer, struct attr *attr) if (cluster_loop_check(cluster, cluster_id)) return true; + + frr_each (per_neighbor_cluster_list, &peer->bgp->per_neighbor_clusters, + per_neighbor_cluster) { + if (cluster_loop_check(cluster, per_neighbor_cluster->cluster_id)) + return true; + } } return false; } diff --git a/bgpd/bgp_updgrp.c b/bgpd/bgp_updgrp.c index 668f1c1451b8..f7993c0dbadd 100644 --- a/bgpd/bgp_updgrp.c +++ b/bgpd/bgp_updgrp.c @@ -134,6 +134,13 @@ static void conf_copy(struct peer *dst, struct peer *src, afi_t afi, dst->v_routeadv = src->v_routeadv; dst->flags = src->flags; dst->af_flags[afi][safi] = src->af_flags[afi][safi]; + dst->cluster[afi][safi] = src->cluster[afi][safi]; + /*do not need to increment refcount: + *as soon as there is no more peer in the cluster + *there there will be no more peer in the subgroup and + *the subgroup will be deleted + */ + dst->cluster_flags[afi][safi] = src->cluster_flags[afi][safi]; dst->pmax_out[afi][safi] = src->pmax_out[afi][safi]; dst->max_packet_size = src->max_packet_size; XFREE(MTYPE_BGP_PEER_HOST, dst->host); @@ -371,6 +378,8 @@ static unsigned int updgrp_hash_key_make(const void *p) CHECK_FLAG(peer->flags, PEER_FLAG_AS_LOOP_DETECTION), key); + if (CHECK_FLAG(flags, PEER_FLAG_CLUSTER_ID)) + key = jhash_1word(peer->cluster[afi][safi].s_addr, key); if (peer->group) key = jhash_1word(jhash(peer->group->name, strlen(peer->group->name), SEED1), @@ -582,7 +591,6 @@ static bool updgrp_hash_cmp(const void *p1, const void *p2) flags2 = pe2->af_flags[afi][safi]; fl1 = &pe1->filter[afi][safi]; fl2 = &pe2->filter[afi][safi]; - /* put EBGP and IBGP peers in different update groups */ if (pe1->sort != pe2->sort) return false; @@ -603,6 +611,11 @@ static bool updgrp_hash_cmp(const void *p1, const void *p2) if ((flags1 & PEER_UPDGRP_AF_FLAGS) != (flags2 & PEER_UPDGRP_AF_FLAGS)) return false; + /*if per neighbor cluster is configured it should match*/ + if (CHECK_FLAG(flags1, PEER_FLAG_CLUSTER_ID) && CHECK_FLAG(flags2, PEER_FLAG_CLUSTER_ID) && + !IPV4_ADDR_SAME(&pe1->cluster[afi][safi], &pe2->cluster[afi][safi])) + return false; + if (pe1->addpath_type[afi][safi] != pe2->addpath_type[afi][safi]) return false; diff --git a/bgpd/bgp_updgrp.h b/bgpd/bgp_updgrp.h index ffe0418f912e..421c6bcfe998 100644 --- a/bgpd/bgp_updgrp.h +++ b/bgpd/bgp_updgrp.h @@ -50,7 +50,7 @@ PEER_FLAG_REMOVE_PRIVATE_AS_ALL | PEER_FLAG_REMOVE_PRIVATE_AS_REPLACE | \ PEER_FLAG_REMOVE_PRIVATE_AS_ALL_REPLACE | PEER_FLAG_AS_OVERRIDE | \ PEER_FLAG_CONFIG_ENCAPSULATION_SRV6 | PEER_FLAG_CONFIG_ENCAPSULATION_SRV6_RELAX | \ - PEER_FLAG_CONFIG_ENCAPSULATION_MPLS) + PEER_FLAG_CONFIG_ENCAPSULATION_MPLS | PEER_FLAG_CLUSTER_ID) /* * PEER_CAP_LLGR_RCV is included so peers that differ in the received Long-Lived diff --git a/bgpd/bgp_vty.c b/bgpd/bgp_vty.c index 068b25d682b3..6436eb671261 100644 --- a/bgpd/bgp_vty.c +++ b/bgpd/bgp_vty.c @@ -5478,6 +5478,56 @@ DEFUN (neighbor_remote_as, argv[idx_remote_as]->arg); } + +DEFPY (neighbor_cluster_id, + neighbor_cluster_id_cmd, + "[no] neighbor $neighbor cluster-id [$id]", + NO_STR + NEIGHBOR_STR + NEIGHBOR_ADDR_STR2 + "Configure a cluster-id for the neighbor\n" + "Route-Reflector Cluster-id in IP address format\n" + "Route-Reflector Cluster-id as 32 bit quantity\n") +{ + VTY_DECLVAR_CONTEXT(bgp, bgp); + struct in_addr cluster; + int ret; + struct peer *peer; + afi_t afi = bgp_node_afi(vty); + safi_t safi = bgp_node_safi(vty); + + peer = peer_and_group_lookup_vty(vty, neighbor); + + if (!peer) + return CMD_WARNING_CONFIG_FAILED; + if (peer->sort == BGP_PEER_EBGP) { + vty_out(vty, "%% Invalid command. Not an internal neighbor\n"); + return CMD_WARNING_CONFIG_FAILED; + } + if (!no) { + if (!id) { + vty_out(vty, "%% Specify a cluster-id\n"); + return CMD_WARNING_CONFIG_FAILED; + } + ret = inet_aton(id, &cluster); + if (!ret) { + vty_out(vty, "%% Malformed bgp cluster identifier\n"); + return CMD_WARNING_CONFIG_FAILED; + } + } + /* There is no check to verify whether the peer is a route-reflector-client, + * this is intentional and will not affect routing for the peer until it is + * set as route-reflector-client, + * it will still affect inbound filtering based on cluster-list + */ + if (no) + bgp_neighbor_cluster_id_unset(bgp, peer, afi, safi); + else + bgp_neighbor_cluster_id_set(bgp, &cluster, peer, afi, safi); + + return CMD_SUCCESS; +} + DEFPY (bgp_allow_martian, bgp_allow_martian_cmd, "[no]$no bgp allow-martian-nexthop", @@ -13980,6 +14030,12 @@ DEFUN (show_bgp_memory, if ((count = mtype_stats_alloc(MTYPE_BGP_REGEXP))) vty_out(vty, "%ld compiled regexes, using %s of memory\n", count, mtype_memstr(memstrbuf, sizeof(memstrbuf), count * sizeof(struct frregex))); + + count = mtype_stats_alloc(MTYPE_PER_NEIGHBOR_CLUSTER); + if (count) + vty_out(vty, "%ld per-neighbor clusters, using %s of memory\n", count, + mtype_memstr(memstrbuf, sizeof(memstrbuf), count * sizeof(struct cluster))); + return CMD_SUCCESS; } @@ -15686,6 +15742,7 @@ static void bgp_show_peer_afi(struct vty *vty, struct peer *p, afi_t afi, struct peer_af *paf; char orf_pfx_name[BUFSIZ]; int orf_pfx_count; + char cluster_id[INET_ADDRSTRLEN]; json_object *json_af = NULL; json_object *json_prefA = NULL; json_object *json_addr = NULL; @@ -15773,6 +15830,12 @@ static void bgp_show_peer_afi(struct vty *vty, struct peer *p, afi_t afi, PEER_FLAG_REFLECTOR_CLIENT)) json_object_boolean_true_add(json_addr, "routeReflectorClient"); + + if (CHECK_FLAG(p->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) + json_object_string_add(json_addr, "clusterId", + inet_ntop(AF_INET, &p->cluster[afi][safi], + cluster_id, INET_ADDRSTRLEN)); + if (CHECK_FLAG(p->af_flags[afi][safi], PEER_FLAG_RSERVER_CLIENT)) json_object_boolean_true_add(json_addr, @@ -16102,6 +16165,8 @@ static void bgp_show_peer_afi(struct vty *vty, struct peer *p, afi_t afi, if (CHECK_FLAG(p->af_flags[afi][safi], PEER_FLAG_REFLECTOR_CLIENT)) vty_out(vty, " Route-Reflector Client\n"); + if (CHECK_FLAG(p->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) + vty_out(vty, " Cluster-id: %pI4\n", &p->cluster[afi][safi]); if (CHECK_FLAG(p->af_flags[afi][safi], PEER_FLAG_RSERVER_CLIENT)) vty_out(vty, " Route-Server Client\n"); @@ -21646,6 +21711,7 @@ static void bgp_config_write_peer_af(struct vty *vty, struct bgp *bgp, { struct peer *g_peer = NULL; char *addr; + struct in_addr cluster_id; bool flag_scomm, flag_secomm, flag_slcomm; /* skip hidden default vrf bgp instance */ @@ -21749,6 +21815,12 @@ static void bgp_config_write_peer_af(struct vty *vty, struct bgp *bgp, vty_out(vty, " neighbor %s route-reflector-client\n", addr); } + /* Per-neighbor cluster*/ + if (peergroup_af_flag_check(peer, afi, safi, PEER_FLAG_CLUSTER_ID)) { + cluster_id = peer->cluster[afi][safi]; + vty_out(vty, " neighbor %s cluster-id %pI4\n", addr, &cluster_id); + } + /* next-hop-self force */ if (peergroup_af_flag_check(peer, afi, safi, PEER_FLAG_FORCE_NEXTHOP_SELF)) { @@ -22115,6 +22187,7 @@ int bgp_config_write(struct vty *vty) { struct bgp *bgp; struct peer_group *group; + struct cluster *cluster; struct peer *peer; struct listnode *node, *nnode; struct listnode *mnode, *mnnode; @@ -23265,6 +23338,7 @@ void bgp_vty_init(void) /* "bgp cluster-id" commands. */ install_element(BGP_NODE, &bgp_cluster_id_cmd); install_element(BGP_NODE, &no_bgp_cluster_id_cmd); + install_element(BGP_NODE, &bgp_cluster_id_client_to_client_cmd); /* "bgp no-rib" commands. */ install_element(CONFIG_NODE, &bgp_norib_cmd); @@ -23944,6 +24018,19 @@ void bgp_vty_init(void) install_element(BGP_VPNV4_NODE, &neighbor_ecommunity_rpki_cmd); install_element(BGP_VPNV6_NODE, &neighbor_ecommunity_rpki_cmd); + /* "neighbor cluster-id" commands.*/ + install_element(BGP_IPV4_NODE, &neighbor_cluster_id_cmd); + install_element(BGP_IPV4M_NODE, &neighbor_cluster_id_cmd); + install_element(BGP_IPV4L_NODE, &neighbor_cluster_id_cmd); + install_element(BGP_IPV6_NODE, &neighbor_cluster_id_cmd); + install_element(BGP_IPV6M_NODE, &neighbor_cluster_id_cmd); + install_element(BGP_IPV6L_NODE, &neighbor_cluster_id_cmd); + install_element(BGP_VPNV4_NODE, &neighbor_cluster_id_cmd); + install_element(BGP_VPNV6_NODE, &neighbor_cluster_id_cmd); + install_element(BGP_FLOWSPECV4_NODE, &neighbor_cluster_id_cmd); + install_element(BGP_FLOWSPECV6_NODE, &neighbor_cluster_id_cmd); + install_element(BGP_EVPN_NODE, &neighbor_cluster_id_cmd); + /* "neighbor route-reflector" commands.*/ install_element(BGP_NODE, &neighbor_route_reflector_client_hidden_cmd); install_element(BGP_NODE, diff --git a/bgpd/bgpd.c b/bgpd/bgpd.c index 21ddf48097e7..9ae30f3a7b5d 100644 --- a/bgpd/bgpd.c +++ b/bgpd/bgpd.c @@ -364,7 +364,13 @@ static int bgp_router_id_set(struct bgp *bgp, const struct in_addr *id, if (bgp && bgp->ls_info && bgp->ls_info->enable_distribution) bgp_ls_export_bgp_topology(bgp); + /* Verify the global marking on per-neighbor clusters + * the router-id is used as global cluster-id when not configured + */ + per_neighbor_cluster_update_global_marking(bgp, false); + hook_call(bgp_routerid_update, bgp, false); + return 0; } @@ -655,6 +661,11 @@ void bgp_cluster_id_set(struct bgp *bgp, struct in_addr *cluster_id) IPV4_ADDR_COPY(&bgp->cluster_id, cluster_id); bgp_config_set(bgp, BGP_CONFIG_CLUSTER_ID); + /* update per neighbor cluster so the default flag matches + * the default cluster + */ + per_neighbor_cluster_update_global_marking(bgp, true); + /* Clear all IBGP peer. */ for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) { if (peer->sort != BGP_PEER_IBGP) @@ -676,6 +687,7 @@ void bgp_cluster_id_unset(struct bgp *bgp) bgp->cluster_id.s_addr = 0; bgp_config_unset(bgp, BGP_CONFIG_CLUSTER_ID); + per_neighbor_cluster_update_global_marking(bgp, true); /* Clear all IBGP peer. */ for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) { @@ -2332,6 +2344,8 @@ void peer_as_change(struct peer *peer, as_t as, enum peer_asn_type as_type, const char *as_str) { enum bgp_peer_sort origtype, newtype; + afi_t afi; + safi_t safi; /* Stop peer. */ if (!CHECK_FLAG(peer->sflags, PEER_STATUS_GROUP)) { @@ -2398,6 +2412,17 @@ void peer_as_change(struct peer *peer, as_t as, enum peer_asn_type as_type, PEER_FLAG_REFLECTOR_CLIENT); UNSET_FLAG(peer->af_flags[AFI_L2VPN][SAFI_EVPN], PEER_FLAG_REFLECTOR_CLIENT); + FOREACH_AFI_SAFI (afi, safi) { + if (CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) { + bgp_per_neighbor_cluster_id_delete(peer->bgp, + &peer->cluster[afi][safi]); + peer->cluster[afi][safi].s_addr = 0; + peer->cluster_flags[afi][safi] = 0; + UNSET_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID); + UNSET_FLAG(peer->af_flags_override[afi][safi], + PEER_FLAG_CLUSTER_ID); + } + } } } @@ -2488,10 +2513,12 @@ static void peer_group2peer_config_copy_af(struct peer_group *group, uint64_t pflags_ovrd; uint8_t *pfilter_ovrd; struct peer *conf; + int old_cluster_configured; conf = group->conf; pflags_ovrd = peer->af_flags_override[afi][safi]; pfilter_ovrd = &peer->filter_override[afi][safi][in]; + old_cluster_configured = CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID); /* peer af_flags apply */ flags_tmp = conf->af_flags[afi][safi] & ~pflags_ovrd; @@ -2527,6 +2554,20 @@ static void peer_group2peer_config_copy_af(struct peer_group *group, if (!CHECK_FLAG(pflags_ovrd, PEER_FLAG_WEIGHT)) PEER_ATTR_INHERIT(peer, group, weight[afi][safi]); + /*per neighbor cluster-id*/ + if (!CHECK_FLAG(pflags_ovrd, PEER_FLAG_CLUSTER_ID)) { + /* if we are overriding a previous cluster */ + if (old_cluster_configured) + bgp_per_neighbor_cluster_id_delete(peer->bgp, &peer->cluster[afi][safi]); + PEER_ATTR_INHERIT(peer, group, cluster[afi][safi]); + PEER_ATTR_INHERIT(peer, group, cluster_flags[afi][safi]); + if (CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) + /* the returned pointer is ignored since it will always be the same as + *the inherited one + */ + bgp_per_neighbor_cluster_id_add(peer->bgp, &peer->cluster[afi][safi]); + } + /* default-originate route-map */ if (!CHECK_FLAG(pflags_ovrd, PEER_FLAG_DEFAULT_ORIGINATE)) { PEER_STR_ATTR_INHERIT(peer, group, default_rmap[afi][safi].name, @@ -3120,6 +3161,12 @@ int peer_delete(struct peer *peer) XFREE(MTYPE_ROUTE_MAP_NAME, peer->allowas_in_rmap[afi][safi].name); } + /*reduce refcnt of per-neighbor*/ + FOREACH_AFI_SAFI (afi, safi) { + if (CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) + bgp_per_neighbor_cluster_id_delete(bgp, &peer->cluster[afi][safi]); + } + FOREACH_AFI_SAFI (afi, safi) peer_af_delete(peer, afi, safi); @@ -3717,6 +3764,107 @@ int peer_group_bind(struct bgp *bgp, union sockunion *su, struct peer *peer, return 0; } +static struct cluster *per_neighbor_cluster_new(void) +{ + struct cluster *cluster; + + cluster = XCALLOC(MTYPE_PER_NEIGHBOR_CLUSTER, sizeof(struct cluster)); + cluster->refcnt = 1; + return cluster; +} + +/* the cluster should always have been removed from the per-neighbor cluster list + * before calling per_neighbor_cluster_free + */ +static void per_neighbor_cluster_free(struct cluster *cluster) +{ + XFREE(MTYPE_PER_NEIGHBOR_CLUSTER, cluster); +} + +struct cluster *per_neighbor_cluster_lookup(struct bgp *bgp, const struct in_addr *cluster_id) +{ + struct cluster *cluster; + + frr_each (per_neighbor_cluster_list, &bgp->per_neighbor_clusters, cluster) { + if (IPV4_ADDR_SAME(&cluster->cluster_id, cluster_id)) + return cluster; + } + return NULL; +} + +uint8_t copy_global_cluster_flags(struct bgp *bgp) +{ + uint8_t flags = 0; + + if (CHECK_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER_CONFIGURED)) + SET_FLAG(flags, CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER_CONFIGURED); + if (CHECK_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER)) + SET_FLAG(flags, CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER); + SET_FLAG(flags, CLUSTER_FLAG_GLOBAL); + return flags; +} + +/*check all per-neighbor clusters if one has the same id as the global one then mark it as global + * configuration of clusters marked as global are ignored, the configuration of the global cluster + * is used instead + */ +void per_neighbor_cluster_update_global_marking(struct bgp *bgp, bool modified_cluster_id) +{ + struct in_addr *cluster_id; + struct cluster *cluster; + struct listnode *node, *nnode; + struct peer *peer; + afi_t afi; + safi_t safi; + + /*if there are no per-neighbor clusters nothing to do*/ + if (per_neighbor_cluster_list_count(&bgp->per_neighbor_clusters) == 0) + return; + + if (CHECK_FLAG(bgp->config, BGP_CONFIG_CLUSTER_ID)) { + /*if the attribute cluster-id hasn't been modified and is configured + * then the global cluster id hasn't changed and there is nothing to do + */ + if (!modified_cluster_id) + return; + cluster_id = &bgp->cluster_id; + } else + cluster_id = &bgp->router_id; + + frr_each (per_neighbor_cluster_list, &bgp->per_neighbor_clusters, cluster) { + if (IPV4_ADDR_SAME(&cluster->cluster_id, cluster_id)) + SET_FLAG(cluster->flags, CLUSTER_FLAG_GLOBAL); + else + UNSET_FLAG(cluster->flags, CLUSTER_FLAG_GLOBAL); + } + + /* Global cluster-id as changed : + * update flags of per-neighbor cluster members + * if the function is call from cluster_id_set or unset + * peers will be hard cleared anyway + * that is also true when the router-id as changed + */ + for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) { + if (peer->sort != BGP_PEER_IBGP) + continue; + FOREACH_AFI_SAFI (afi, safi) { + if (!peer->afc[afi][safi]) + continue; + if (CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) { + cluster = per_neighbor_cluster_lookup(bgp, + &peer->cluster[afi][safi]); + if (!cluster) + continue; + if (!CHECK_FLAG(cluster->flags, CLUSTER_FLAG_GLOBAL)) + peer->cluster_flags[afi][safi] = cluster->flags; + else + peer->cluster_flags[afi][safi] = + copy_global_cluster_flags(bgp); + } + } + } +} + static void bgp_startup_timer_expire(struct event *event) { struct bgp *bgp; @@ -3820,6 +3968,7 @@ static struct bgp *bgp_create(as_t *as, const char *name, SET_FLAG(bgp->peer_self->cap, PEER_CAP_AS4_ADV); bgp->peer = list_new(); + per_neighbor_cluster_list_init(&bgp->per_neighbor_clusters); peer_init: bgp->peer->cmp = (int (*)(void *, void *))peer_cmp; @@ -4382,6 +4531,7 @@ int bgp_delete(struct bgp *bgp) struct peer_connection *connection; uint32_t b_ann_cnt = 0, b_l2_cnt = 0; uint32_t a_ann_cnt = 0, a_l2_cnt = 0; + struct cluster *cluster; assert(bgp); @@ -4612,6 +4762,15 @@ int bgp_delete(struct bgp *bgp) update_bgp_group_free(bgp); + /*Free per-neighbor clusters*/ + if (per_neighbor_cluster_list_count(&bgp->per_neighbor_clusters)) { + frr_each_safe (per_neighbor_cluster_list, &bgp->per_neighbor_clusters, cluster) { + per_neighbor_cluster_list_del(&bgp->per_neighbor_clusters, cluster); + per_neighbor_cluster_free(cluster); + } + } + per_neighbor_cluster_list_fini(&bgp->per_neighbor_clusters); + /* Cancel peer connection errors event */ event_cancel(&bgp->t_conn_errors); @@ -5428,6 +5587,7 @@ static const struct peer_flag_action peer_af_flag_action_list[] = { { PEER_FLAG_CONFIG_ENCAPSULATION_SRV6, 0, peer_change_best_path }, { PEER_FLAG_CONFIG_ENCAPSULATION_SRV6_RELAX, 0, peer_change_best_path }, { PEER_FLAG_CONFIG_ENCAPSULATION_MPLS, 0, peer_change_best_path }, + { PEER_FLAG_CLUSTER_ID, 0, peer_change_none }, { 0, 0, 0 } }; @@ -5781,7 +5941,7 @@ static int peer_af_flag_modify(struct peer *peer, afi_t afi, safi_t safi, found = peer_flag_action_set(peer_af_flag_action_list, size, &action, flag); - /* Abort if flag action exists. */ + /* Abort if no flag action exists. */ if (!found) return BGP_ERR_INVALID_FLAG; @@ -6756,6 +6916,217 @@ void peer_on_policy_change(struct peer *peer, afi_t afi, safi_t safi, } } +/* BGP's cluster-id control. */ +void bgp_neighbor_cluster_id_set(struct bgp *bgp, struct in_addr *cluster_id, struct peer *peer, + afi_t afi, safi_t safi) +{ + struct listnode *node, *nnode; + struct peer *member; + struct cluster *cluster; + + /* do nothing when the value is already configured as desired*/ + if (CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID) && + IPV4_ADDR_SAME(&peer->cluster[afi][safi], cluster_id)) + return; + + /*if there was another cluster configured previously*/ + if (CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) { + bgp_per_neighbor_cluster_id_delete(bgp, &peer->cluster[afi][safi]); + + /*if it is a group does the same for each member*/ + if (CHECK_FLAG(peer->sflags, PEER_STATUS_GROUP)) { + for (ALL_LIST_ELEMENTS(peer->group->peer, node, nnode, member)) { + /* Skip peers with overridden configuration. */ + if (CHECK_FLAG(member->af_flags_override[afi][safi], + PEER_FLAG_CLUSTER_ID)) + continue; + + /* Set flag and configuration on peer-group member. */ + bgp_per_neighbor_cluster_id_delete(bgp, + &member->cluster[afi][safi]); + } + } + } + + /*create or add a new reference to the cluster and configure it*/ + cluster = bgp_per_neighbor_cluster_id_add(bgp, cluster_id); + IPV4_ADDR_COPY(&peer->cluster[afi][safi], cluster_id); + peer_af_flag_set(peer, afi, safi, PEER_FLAG_CLUSTER_ID); + if (!CHECK_FLAG(cluster->flags, CLUSTER_FLAG_GLOBAL)) + peer->cluster_flags[afi][safi] = cluster->flags; + else + peer->cluster_flags[afi][safi] = copy_global_cluster_flags(bgp); + + /* Skip peer-group mechanics for regular peers. */ + if (!CHECK_FLAG(peer->sflags, PEER_STATUS_GROUP)) { + /* Clear the one changed IBGP peer. */ + peer_set_last_reset(peer, PEER_DOWN_CLID_CHANGE); + peer_notify_config_change(peer->connection); + return; + } + /* + * clear all peer-group members, unless they are + * explicitly overriding peer-group configuration. + */ + for (ALL_LIST_ELEMENTS(peer->group->peer, node, nnode, member)) { + /* Skip peers with overridden configuration. */ + if (CHECK_FLAG(member->af_flags_override[afi][safi], PEER_FLAG_CLUSTER_ID)) + continue; + + /* Set configuration on peer-group member. */ + bgp_per_neighbor_cluster_id_add(bgp, cluster_id); + member->cluster[afi][safi] = *cluster_id; + member->cluster_flags[afi][safi] = peer->cluster_flags[afi][safi]; + peer_set_last_reset(member, PEER_DOWN_CLID_CHANGE); + peer_notify_config_change(member->connection); + } +} + +void bgp_neighbor_cluster_id_unset(struct bgp *bgp, struct peer *peer, afi_t afi, safi_t safi) +{ + struct listnode *node, *nnode; + struct peer *member; + struct in_addr cluster_id; + + /* do nothing when there is no locally configured cluster*/ + if ((!CHECK_FLAG(peer->af_flags_override[afi][safi], PEER_FLAG_CLUSTER_ID) && + !CHECK_FLAG(peer->sflags, PEER_STATUS_GROUP)) || + (CHECK_FLAG(peer->sflags, PEER_STATUS_GROUP) && + !CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID))) + return; + + cluster_id = peer->cluster[afi][safi]; + bgp_per_neighbor_cluster_id_delete(bgp, &peer->cluster[afi][safi]); + peer->cluster_flags[afi][safi] = 0; + peer->cluster[afi][safi].s_addr = 0; + /*unset the flag for both the peer and group members if it is a group*/ + peer_af_flag_unset(peer, afi, safi, PEER_FLAG_CLUSTER_ID); + + /* Inherit configuration from peer-group if peer is member. */ + if (peer_group_active(peer)) { + peer_af_flag_inherit(peer, afi, safi, PEER_FLAG_CLUSTER_ID); + PEER_ATTR_INHERIT(peer, peer->group, cluster[afi][safi]); + PEER_ATTR_INHERIT(peer, peer->group, cluster_ptr[afi][safi]); + + /*ensures refcnt is incremented when the cluster is inherited from the group*/ + if (CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) + bgp_per_neighbor_cluster_id_add(bgp, &peer->cluster[afi][safi]); + } + + /* Skip peer-group mechanics for regular peers. */ + if (!CHECK_FLAG(peer->sflags, PEER_STATUS_GROUP)) { + /* Clear the one changed IBGP peer. */ + peer_set_last_reset(peer, PEER_DOWN_CLID_CHANGE); + peer_notify_config_change(peer->connection); + return; + } + /* + * clear all peer-group members, unless they are + * explicitly overriding peer-group configuration. + */ + for (ALL_LIST_ELEMENTS(peer->group->peer, node, nnode, member)) { + /* Skip peers with overridden configuration. */ + if (CHECK_FLAG(member->af_flags_override[afi][safi], PEER_FLAG_CLUSTER_ID)) + continue; + /* clear all other member peers + * there flags are already down thanks to + * peer_af_flag_unset + */ + member->cluster[afi][safi].s_addr = 0; + member->cluster_flags[afi][safi] = peer->cluster_flags[afi][safi]; + bgp_per_neighbor_cluster_id_delete(bgp, &cluster_id); + peer_set_last_reset(member, PEER_DOWN_CLID_CHANGE); + peer_notify_config_change(member->connection); + } +} + +struct cluster *bgp_per_neighbor_cluster_id_add(struct bgp *bgp, struct in_addr *cluster_id) +{ + struct cluster *cluster; + struct in_addr *global_cluster_id; + struct listnode *node, *nnode; + struct peer *peer; + afi_t afi; + safi_t safi; + + cluster = per_neighbor_cluster_lookup(bgp, cluster_id); + if (cluster) { + cluster->refcnt++; + return cluster; + } + + cluster = per_neighbor_cluster_new(); + IPV4_ADDR_COPY(&cluster->cluster_id, cluster_id); + per_neighbor_cluster_list_add_tail(&bgp->per_neighbor_clusters, cluster); + + /* if the cluster id is the same as the global cluster + * set the flag to override that cluster with the global one + */ + if (CHECK_FLAG(bgp->config, BGP_CONFIG_CLUSTER_ID)) + global_cluster_id = &bgp->cluster_id; + else + global_cluster_id = &bgp->router_id; + if (IPV4_ADDR_SAME(global_cluster_id, cluster_id)) + SET_FLAG(cluster->flags, CLUSTER_FLAG_GLOBAL); + + /* Clear all IBGP peer: + * when a cluster is created routes containing its cluster-id + * should be filtered + */ + for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) { + if (peer->sort != BGP_PEER_IBGP) + continue; + if (CHECK_FLAG(peer->sflags, PEER_STATUS_GROUP)) + continue; + + FOREACH_AFI_SAFI (afi, safi) { + if (peer->afc[afi][safi]) + peer_clear_soft(peer, afi, safi, BGP_CLEAR_SOFT_IN); + } + } + return cluster; +} + +void bgp_per_neighbor_cluster_id_delete(struct bgp *bgp, struct in_addr *cluster_id) +{ + struct cluster *cluster; + struct listnode *node, *nnode; + struct peer *peer; + afi_t afi; + safi_t safi; + + cluster = per_neighbor_cluster_lookup(bgp, cluster_id); + if (!cluster) + return; + + /*decrease the number of references to the cluster*/ + cluster->refcnt--; + if (cluster->refcnt) + return; + + /*if there is no more references: delete the cluster*/ + per_neighbor_cluster_list_del(&bgp->per_neighbor_clusters, cluster); + per_neighbor_cluster_free(cluster); + + if (CHECK_FLAG(bgp->flags, BGP_FLAG_DELETE_IN_PROGRESS)) + return; + + /* Clear all IBGP peer: + * whenever the cluster is suppressed routes containing its cluster-id + * should not be filtered anymore + */ + for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) { + if (peer->sort != BGP_PEER_IBGP) + continue; + if (CHECK_FLAG(peer->sflags, PEER_STATUS_GROUP)) + continue; + + FOREACH_AFI_SAFI (afi, safi) { + if (peer->afc[afi][safi]) + peer_clear_soft(peer, afi, safi, BGP_CLEAR_SOFT_IN); + } + } +} /* neighbor weight. */ int peer_weight_set(struct peer *peer, afi_t afi, safi_t safi, uint16_t weight) diff --git a/bgpd/bgpd.h b/bgpd/bgpd.h index 11c937ea40a0..a8899e4e61e8 100644 --- a/bgpd/bgpd.h +++ b/bgpd/bgpd.h @@ -121,6 +121,9 @@ extern struct frr_pthread *bgp_pth_ka; /* FIFO list for peer connections */ PREDECL_LIST(peer_connection_fifo); +/* List of per-neighbor clusters, per bgp instance */ +PREDECL_DLIST(per_neighbor_cluster_list); + /* BGP master for system wide configurations and variables. */ struct bgp_master { /* BGP instance list. */ @@ -633,9 +636,11 @@ struct bgp { struct in_addr router_id_static; struct in_addr router_id_zebra; - /* BGP route reflector cluster ID. */ + /* BGP route reflector global cluster ID. */ struct in_addr cluster_id; + struct per_neighbor_cluster_list_head per_neighbor_clusters; + /* BGP confederation information. */ as_t confed_id; char *confed_id_pretty; @@ -1929,10 +1934,17 @@ struct peer { #define PEER_FLAG_CONFIG_ENCAPSULATION_MPLS (1ULL << 34) #define PEER_FLAG_BGP_LS_IPV4 (1ULL << 35) #define PEER_FLAG_BGP_LS_IPV6 (1ULL << 36) +#define PEER_FLAG_CLUSTER_ID (1ULL << 37) #define PEER_FLAG_ACCEPT_OWN (1ULL << 63) enum bgp_addpath_strat addpath_type[AFI_MAX][SAFI_MAX]; + /*cluster-id for each AFI-SAFI*/ + struct in_addr cluster[AFI_MAX][SAFI_MAX]; + + /*copy of the cluster configuration for each AFI-SAFI*/ + uint8_t cluster_flags[AFI_MAX][SAFI_MAX]; + /* MD5 password */ char *password; @@ -2347,6 +2359,25 @@ struct bgp_nlri { uint8_t *nlri; }; +/*This structure contains information about a per-neighbor cluster*/ +struct cluster { + /*cluster-id*/ + struct in_addr cluster_id; + + /*flag for client-to-client reflection*/ + uint8_t flags; +#define CLUSTER_FLAG_GLOBAL (1 << 2) + + /*count the number of times the cluster is referenced during + *configuration + the number of pointers toward it (one per member afi safi) + */ + int refcnt; + struct per_neighbor_cluster_list_item cluster_link; +}; + +/* List of per-neighbor clusters, per bgp instance */ +DECLARE_DLIST(per_neighbor_cluster_list, struct cluster, cluster_link); + /* BGP versions. */ #define BGP_VERSION_4 4 @@ -2742,8 +2773,19 @@ extern void bm_wait_for_fib_set(bool set, uint16_t adv_delay); extern void bgp_suppress_fib_pending_set(struct bgp *bgp, bool set, uint16_t adv_delay); extern uint16_t bgp_suppress_fib_get_adv_delay(struct bgp *bgp); +extern struct cluster *per_neighbor_cluster_lookup(struct bgp *bgp, + const struct in_addr *cluster_id); +extern uint8_t copy_global_cluster_flags(struct bgp *bgp); +extern void per_neighbor_cluster_update_global_marking(struct bgp *bgp, bool modified_cluster_id); +extern struct cluster *bgp_per_neighbor_cluster_id_add(struct bgp *bgp, struct in_addr *cluster_id); +extern void bgp_per_neighbor_cluster_id_delete(struct bgp *bgp, struct in_addr *cluster_id); extern void bgp_cluster_id_set(struct bgp *bgp, struct in_addr *cluster_id); extern void bgp_cluster_id_unset(struct bgp *bgp); +extern void bgp_neighbor_cluster_id_unset(struct bgp *bgp, struct peer *peer, afi_t afi, + safi_t safi); +extern void bgp_neighbor_cluster_id_set(struct bgp *bgp, struct in_addr *cluster_id, + struct peer *peer, afi_t afi, safi_t safi); + extern void bgp_confederation_id_set(struct bgp *bgp, as_t as, const char *as_str); diff --git a/doc/user/bgp.rst b/doc/user/bgp.rst index 146c198e67e4..997f7ec68ef6 100644 --- a/doc/user/bgp.rst +++ b/doc/user/bgp.rst @@ -5770,6 +5770,16 @@ setting. This command allows the BGP daemon to control, at a global level, the DSCP value used in outgoing packets for each BGP connection. +.. clicmd:: neighbor PEER cluster-id A.B.C.D + +This command sets the peer PEER as part of a per-neighbor cluster. +Per-neighbor clusters are additional clusters that allow a route-reflector +to manage client in multiple different clusters. Each per-neighbor cluster +is defined by its cluster-id. As long as a per-neighbor cluster has +the same id as the unique cluster of the route-reflector that we call global, +the settings of the global cluster will override the settings of that +per-neighbor cluster. + .. _bgp-suppress-fib: Suppressing routes not installed in FIB From 2dedac8fcca1a3a4e9b66e50779f59328374124d Mon Sep 17 00:00:00 2001 From: Pierre Neltner Date: Mon, 27 Apr 2026 18:07:31 +0200 Subject: [PATCH 2/9] tests: Add a topotest for all features linked to multiple clusters the topology of the test is the following : +------+ +------+ | R1--|- -|--R6 | | | \ R9 / | | C1 | | \ / \ / | | C3 | R2--|---R4--R5---|--R7 | +------+ / \ +------+ +------+ / \ +------+ C2 | R3--|- -|--R8 | C2 +------+ +------+ All routers are in AS 100 (IBGP). R4 and R5 are configured as a route reflector with their neighbors as clients. They form with r9 a full-mesh of non-client peers. They each oversee two clusters of clients Test that reflection in the default setup works as intended and that cluster-lists are correct, verifies that client-to-client configurations works well both in the per-neighbor and global clusters. Test that the non-client-to-client prefer-global-cluster-id and loose-cluster-list-check configuration work as intentded. These checks are achieved through verifying the prefixes in the bgp tables of each client router and on R9 as well as their cluster-lists. Signed-off-by: Pierre Neltner --- .../bgp_multiple_clusters/r1/frr.conf | 23 + .../bgp_multiple_clusters/r2/frr.conf | 23 + .../bgp_multiple_clusters/r3/frr.conf | 23 + .../bgp_multiple_clusters/r4/frr.conf | 57 ++ .../bgp_multiple_clusters/r5/frr.conf | 57 ++ .../bgp_multiple_clusters/r6/frr.conf | 23 + .../bgp_multiple_clusters/r7/frr.conf | 23 + .../bgp_multiple_clusters/r8/frr.conf | 23 + .../bgp_multiple_clusters/r9/frr.conf | 30 + .../test_bgp_multiple_clusters.py | 844 ++++++++++++++++++ 10 files changed, 1126 insertions(+) create mode 100644 tests/topotests/bgp_multiple_clusters/r1/frr.conf create mode 100644 tests/topotests/bgp_multiple_clusters/r2/frr.conf create mode 100644 tests/topotests/bgp_multiple_clusters/r3/frr.conf create mode 100644 tests/topotests/bgp_multiple_clusters/r4/frr.conf create mode 100644 tests/topotests/bgp_multiple_clusters/r5/frr.conf create mode 100644 tests/topotests/bgp_multiple_clusters/r6/frr.conf create mode 100644 tests/topotests/bgp_multiple_clusters/r7/frr.conf create mode 100644 tests/topotests/bgp_multiple_clusters/r8/frr.conf create mode 100644 tests/topotests/bgp_multiple_clusters/r9/frr.conf create mode 100644 tests/topotests/bgp_multiple_clusters/test_bgp_multiple_clusters.py diff --git a/tests/topotests/bgp_multiple_clusters/r1/frr.conf b/tests/topotests/bgp_multiple_clusters/r1/frr.conf new file mode 100644 index 000000000000..a018c394077f --- /dev/null +++ b/tests/topotests/bgp_multiple_clusters/r1/frr.conf @@ -0,0 +1,23 @@ +hostname r1 +! +interface lo + ip address 10.0.0.1/32 +! +! Connection to r4 +interface r1-eth0 + ip address 10.1.4.1/24 +! +router ospf + network 10.0.0.1/32 area 0 + network 10.1.4.0/24 area 0 +! +router bgp 100 + bgp router-id 10.0.0.1 + no bgp network import-check + neighbor 10.1.4.2 remote-as 100 + neighbor 10.1.4.2 update-source 10.1.4.1 + ! + address-family ipv4 unicast + network 10.0.0.1/32 + exit-address-family +! diff --git a/tests/topotests/bgp_multiple_clusters/r2/frr.conf b/tests/topotests/bgp_multiple_clusters/r2/frr.conf new file mode 100644 index 000000000000..e67de48fa4ba --- /dev/null +++ b/tests/topotests/bgp_multiple_clusters/r2/frr.conf @@ -0,0 +1,23 @@ +hostname r2 +! +interface lo + ip address 10.0.0.2/32 +! +! Connection to r4 +interface r2-eth0 + ip address 10.2.4.1/24 +! +router ospf + network 10.0.0.2/32 area 0 + network 10.2.4.0/24 area 0 +! +router bgp 100 + bgp router-id 10.0.0.2 + no bgp network import-check + neighbor 10.2.4.2 remote-as 100 + neighbor 10.2.4.2 update-source 10.2.4.1 + ! + address-family ipv4 unicast + network 10.0.0.2/32 + exit-address-family +! diff --git a/tests/topotests/bgp_multiple_clusters/r3/frr.conf b/tests/topotests/bgp_multiple_clusters/r3/frr.conf new file mode 100644 index 000000000000..adc0f9792c32 --- /dev/null +++ b/tests/topotests/bgp_multiple_clusters/r3/frr.conf @@ -0,0 +1,23 @@ +hostname r3 +! +interface lo + ip address 10.0.0.3/32 +! +! Connection to r4 +interface r3-eth0 + ip address 10.3.4.1/24 +! +router ospf + network 10.0.0.3/32 area 0 + network 10.3.4.0/24 area 0 +! +router bgp 100 + bgp router-id 10.0.0.3 + no bgp network import-check + neighbor 10.3.4.2 remote-as 100 + neighbor 10.3.4.2 update-source 10.3.4.1 + ! + address-family ipv4 unicast + network 10.0.0.3/32 + exit-address-family +! diff --git a/tests/topotests/bgp_multiple_clusters/r4/frr.conf b/tests/topotests/bgp_multiple_clusters/r4/frr.conf new file mode 100644 index 000000000000..bba271702d84 --- /dev/null +++ b/tests/topotests/bgp_multiple_clusters/r4/frr.conf @@ -0,0 +1,57 @@ +hostname r4 +! +interface lo + ip address 10.0.0.4/32 +! +! Connection to r1 +interface r4-eth0 + ip address 10.1.4.2/24 +! +! Connection to r2 +interface r4-eth1 + ip address 10.2.4.2/24 +! +! Connection to r3 +interface r4-eth2 + ip address 10.3.4.2/24 +! +! Connection to r5 +interface r4-eth3 + ip address 10.4.5.1/24 +! +! Connection to r9 +interface r4-eth4 + ip address 10.4.9.1/24 +! +router ospf + network 10.0.0.4/32 area 0 + network 10.1.4.0/24 area 0 + network 10.2.4.0/24 area 0 + network 10.3.4.0/24 area 0 + network 10.4.5.0/24 area 0 + network 10.4.9.0/24 area 0 +! +router bgp 100 + bgp router-id 10.0.0.4 + bgp cluster-id 10.0.0.4 + no bgp network import-check + neighbor cluster1 peer-group + neighbor cluster1 remote-as 100 + neighbor 10.1.4.1 peer-group cluster1 + neighbor 10.1.4.1 update-source 10.1.4.2 + neighbor 10.2.4.1 peer-group cluster1 + neighbor 10.2.4.1 update-source 10.2.4.2 + neighbor 10.3.4.1 remote-as 100 + neighbor 10.3.4.1 update-source 10.3.4.2 + neighbor 10.4.5.2 remote-as 100 + neighbor 10.4.5.2 update-source 10.4.5.1 + neighbor 10.4.9.2 remote-as 100 + neighbor 10.4.9.2 update-source 10.4.9.1 + ! + address-family ipv4 unicast + neighbor cluster1 route-reflector-client + neighbor cluster1 cluster-id 20.0.0.1 + neighbor 10.3.4.1 route-reflector-client + neighbor 10.3.4.1 cluster-id 20.0.0.2 + exit-address-family +! diff --git a/tests/topotests/bgp_multiple_clusters/r5/frr.conf b/tests/topotests/bgp_multiple_clusters/r5/frr.conf new file mode 100644 index 000000000000..869c0b1de49c --- /dev/null +++ b/tests/topotests/bgp_multiple_clusters/r5/frr.conf @@ -0,0 +1,57 @@ +hostname r5 +! +interface lo + ip address 10.0.0.5/32 +! +! Connection to r6 +interface r5-eth0 + ip address 10.5.6.1/24 +! +! Connection to r7 +interface r5-eth1 + ip address 10.5.7.1/24 +! +! Connection to r8 +interface r5-eth2 + ip address 10.5.8.1/24 +! +! Connection to r4 +interface r5-eth3 + ip address 10.4.5.2/24 +! +! Connection to r9 +interface r5-eth4 + ip address 10.5.9.1/24 +! +router ospf + network 10.0.0.5/32 area 0 + network 10.5.6.0/24 area 0 + network 10.5.7.0/24 area 0 + network 10.5.8.0/24 area 0 + network 10.4.5.0/24 area 0 + network 10.5.9.0/24 area 0 +! +router bgp 100 + bgp router-id 10.0.0.5 + bgp cluster-id 10.0.0.5 + no bgp network import-check + neighbor cluster3 peer-group + neighbor cluster3 remote-as 100 + neighbor 10.5.6.2 peer-group cluster3 + neighbor 10.5.6.2 update-source 10.5.6.1 + neighbor 10.4.5.1 remote-as 100 + neighbor 10.4.5.1 update-source 10.4.5.2 + neighbor 10.5.7.2 peer-group cluster3 + neighbor 10.5.7.2 update-source 10.5.7.1 + neighbor 10.5.8.2 remote-as 100 + neighbor 10.5.8.2 update-source 10.5.8.1 + neighbor 10.5.9.2 remote-as 100 + neighbor 10.5.9.2 update-source 10.5.9.1 + ! + address-family ipv4 unicast + neighbor cluster3 route-reflector-client + neighbor cluster3 cluster-id 20.0.0.3 + neighbor 10.5.8.2 route-reflector-client + neighbor 10.5.8.2 cluster-id 20.0.0.2 + exit-address-family +! diff --git a/tests/topotests/bgp_multiple_clusters/r6/frr.conf b/tests/topotests/bgp_multiple_clusters/r6/frr.conf new file mode 100644 index 000000000000..c9861383c905 --- /dev/null +++ b/tests/topotests/bgp_multiple_clusters/r6/frr.conf @@ -0,0 +1,23 @@ +hostname r6 +! +interface lo + ip address 10.0.0.6/32 +! +! Connection to r5 +interface r6-eth0 + ip address 10.5.6.2/24 +! +router ospf + network 10.0.0.6/32 area 0 + network 10.5.6.0/24 area 0 +! +router bgp 100 + bgp router-id 10.0.0.6 + no bgp network import-check + neighbor 10.5.6.1 remote-as 100 + neighbor 10.5.6.1 update-source 10.5.6.2 + ! + address-family ipv4 unicast + network 10.0.0.6/32 + exit-address-family +! diff --git a/tests/topotests/bgp_multiple_clusters/r7/frr.conf b/tests/topotests/bgp_multiple_clusters/r7/frr.conf new file mode 100644 index 000000000000..c1e9208a9d8f --- /dev/null +++ b/tests/topotests/bgp_multiple_clusters/r7/frr.conf @@ -0,0 +1,23 @@ +hostname r7 +! +interface lo + ip address 10.0.0.7/32 +! +! Connection to r5 +interface r7-eth0 + ip address 10.5.7.2/24 +! +router ospf + network 10.0.0.7/32 area 0 + network 10.5.7.0/24 area 0 +! +router bgp 100 + bgp router-id 10.0.0.7 + no bgp network import-check + neighbor 10.5.7.1 remote-as 100 + neighbor 10.5.7.1 update-source 10.5.7.2 + ! + address-family ipv4 unicast + network 10.0.0.7/32 + exit-address-family +! diff --git a/tests/topotests/bgp_multiple_clusters/r8/frr.conf b/tests/topotests/bgp_multiple_clusters/r8/frr.conf new file mode 100644 index 000000000000..46b5d83229fe --- /dev/null +++ b/tests/topotests/bgp_multiple_clusters/r8/frr.conf @@ -0,0 +1,23 @@ +hostname r8 +! +interface lo + ip address 10.0.0.8/32 +! +! Connection to r5 +interface r8-eth0 + ip address 10.5.8.2/24 +! +router ospf + network 10.0.0.8/32 area 0 + network 10.5.8.0/24 area 0 +! +router bgp 100 + bgp router-id 10.0.0.8 + no bgp network import-check + neighbor 10.5.8.1 remote-as 100 + neighbor 10.5.8.1 update-source 10.5.8.2 + ! + address-family ipv4 unicast + network 10.0.0.8/32 + exit-address-family +! diff --git a/tests/topotests/bgp_multiple_clusters/r9/frr.conf b/tests/topotests/bgp_multiple_clusters/r9/frr.conf new file mode 100644 index 000000000000..17622df0cea4 --- /dev/null +++ b/tests/topotests/bgp_multiple_clusters/r9/frr.conf @@ -0,0 +1,30 @@ +hostname r9 +! +interface lo + ip address 10.0.0.9/32 +! +! Connection to r4 +interface r9-eth0 + ip address 10.4.9.2/24 +! +! Connection to r5 +interface r9-eth1 + ip address 10.5.9.2/24 +! +router ospf + network 10.0.0.9/32 area 0 + network 10.4.9.0/24 area 0 + network 10.5.9.0/24 area 0 +! +router bgp 100 + bgp router-id 10.0.0.9 + no bgp network import-check + neighbor 10.5.9.1 remote-as 100 + neighbor 10.5.9.1 update-source 10.5.9.2 + neighbor 10.4.9.1 remote-as 100 + neighbor 10.4.9.1 update-source 10.4.9.2 + ! + address-family ipv4 unicast + network 10.0.0.9/32 + exit-address-family +! diff --git a/tests/topotests/bgp_multiple_clusters/test_bgp_multiple_clusters.py b/tests/topotests/bgp_multiple_clusters/test_bgp_multiple_clusters.py new file mode 100644 index 000000000000..37a2dec530b8 --- /dev/null +++ b/tests/topotests/bgp_multiple_clusters/test_bgp_multiple_clusters.py @@ -0,0 +1,844 @@ +#!/usr/bin/env python +# SPDX-License-Identifier: ISC + +# +# test_bgp_multiple_clusters.py +# Test BGP multiple cluster features in a two route reflector topology +# + +r""" +Test BGP multiple clusters feature in a multi-layer route reflector topology. + +Topology: + + +------+ +------+ + | R1--|- -|--R6 | + | | \ R9 / | | +C1 | | \ / \ / | | C3 + | R2--|---R4--R5---|--R7 | + +------+ / \ +------+ + +------+ / \ +------+ +C2 | R3--|- -|--R8 | C2 + +------+ +------+ + +All routers are in AS 100 (IBGP). +R4 and R5 are configured as a route reflector with their neighbors as clients. They form with r9 a full-mesh of non-client peers. +They each oversee two clusters of clients +""" + +import os +import sys +import json +import pytest +import functools + +CWD = os.path.dirname(os.path.realpath(__file__)) +sys.path.append(os.path.join(CWD, "../")) + +# pylint: disable=C0413 +from lib import topotest +from lib.topogen import Topogen, TopoRouter, get_topogen +from lib.topolog import logger + +pytestmark = [pytest.mark.bgpd] + + +def build_topo(tgen): + """Build the multi-layer RR topology""" + + # Create routers + for routern in range(1, 10): + tgen.add_router("r{}".format(routern)) + + # Layer 1 connections (r1-r2 to r3-r4) + # r1 to r4 + switch = tgen.add_switch("s1") + switch.add_link(tgen.gears["r1"]) + switch.add_link(tgen.gears["r4"]) + + # r2 to r4 + switch = tgen.add_switch("s2") + switch.add_link(tgen.gears["r2"]) + switch.add_link(tgen.gears["r4"]) + + # r3 to r4 + switch = tgen.add_switch("s3") + switch.add_link(tgen.gears["r3"]) + switch.add_link(tgen.gears["r4"]) + + # r5 to r6 + switch = tgen.add_switch("s5") + switch.add_link(tgen.gears["r5"]) + switch.add_link(tgen.gears["r6"]) + + # r5 to r7 + switch = tgen.add_switch("s6") + switch.add_link(tgen.gears["r5"]) + switch.add_link(tgen.gears["r7"]) + + # r5 to r8 + switch = tgen.add_switch("s7") + switch.add_link(tgen.gears["r5"]) + switch.add_link(tgen.gears["r8"]) + + # r4 to r5 + switch = tgen.add_switch("s4") + switch.add_link(tgen.gears["r4"]) + switch.add_link(tgen.gears["r5"]) + + # r4 to r9 + switch = tgen.add_switch("s9") + switch.add_link(tgen.gears["r4"]) + switch.add_link(tgen.gears["r9"]) + + # r5 to r9 + switch = tgen.add_switch("s8") + switch.add_link(tgen.gears["r5"]) + switch.add_link(tgen.gears["r9"]) + + +def setup_module(mod): + """Setup the test environment""" + tgen = Topogen(build_topo, mod.__name__) + tgen.start_topology() + + router_list = tgen.routers() + + for rname, router in router_list.items(): + router.load_frr_config("frr.conf") + + tgen.start_router() + + +def teardown_module(mod): + """Teardown the test environment""" + tgen = get_topogen() + tgen.stop_topology() + + +def test_bgp_convergence(): + """Test that BGP sessions are established""" + tgen = get_topogen() + + if tgen.routers_have_failure(): + pytest.skip(tgen.errors) + + logger.info("Checking BGP convergence") + + # Expected neighbor counts for each router + expected_neighbors = { + "r1": 1, # r4 + "r2": 1, # r4 + "r3": 1, # r4 + "r4": 5, # r1, r2, r3, r5, r9 + "r5": 5, # r4, r6, r7, r8, r9 + "r6": 1, # r5 + "r7": 1, # r5 + "r8": 1, # r5 + "r9": 2, # r4, r5 + } + + for rname, expected_count in expected_neighbors.items(): + router = tgen.gears[rname] + + def check_bgp_session(router, expected_count): + output = router.vtysh_cmd("show bgp summary json") + try: + parsed = json.loads(output) + ipv4_summary = parsed.get("ipv4Unicast", {}) + peers = ipv4_summary.get("peers", {}) + established_count = sum( + 1 for peer in peers.values() if peer.get("state") == "Established" + ) + + if established_count == expected_count: + logger.info( + "{}: {} BGP sessions established".format( + router.name, established_count + ) + ) + return True + else: + logger.info( + "{}: {}/{} BGP sessions established (waiting)".format( + router.name, established_count, expected_count + ) + ) + return False + except (json.JSONDecodeError, KeyError): + return False + test_func = functools.partial(check_bgp_session, router, expected_count) + success, result = topotest.run_and_expect(test_func, True, count=60, wait=1) + assert success, "{} BGP sessions did not converge".format(rname) + + + +def check_routes_cluster_list(router, prefixes, expected_cluster_lists): + """Check that router receives a prefix with the correct cluster list""" + + if len(prefixes) != len(expected_cluster_lists): + logger.info("wrong input for router {}".format(router.name)) + return None + + #Check that there are no unwanted prefix + output = router.vtysh_cmd("show ip bgp json") + parsed = json.loads(output) + totalRoutes = parsed.get("totalRoutes") + if totalRoutes != len(prefixes) + 1: # we don't check for the loopback address + logger.info( + "{}: expected {} prefixes, got {} (waiting)".format( + router.name,len(prefixes)+1,totalRoutes + ) + ) + return None + + #Check cluster lists + for i,prefix in enumerate(prefixes): + try: + expected_cluster_list = expected_cluster_lists[i] + output = router.vtysh_cmd("show ip bgp {} json".format(prefix)) + parsed = json.loads(output) + paths = parsed.get("paths", []) + + if not paths: + logger.info( + "{}: No paths found for {} (waiting for BGP convergence)".format( + router.name, prefix + ) + ) + return None + + #Check that there is only one path + if len(paths) > 1: + logger.info( + "{}: Expected 1 path, got {} (waiting)".format( + router.name, len(paths) + ) + ) + return None + + #Check that the cluster list is as intended + cluster_list = paths[0].get("clusterList", [])["list"] + if cluster_list != expected_cluster_list: + logger.info( + "{}: prefix {} Expected cluster list {}, got {} (waiting)".format( + router.name, prefix, expected_cluster_list,cluster_list + ) + ) + return None + + except (json.JSONDecodeError, KeyError) as e: + logger.info("{}: Error parsing BGP output: {}".format(router.name, e)) + return None + return True + +def test_bgp_multiple_cluster_cluster_lists(): + """Test that prefix are correctly reflected between per-neighbor clusters with correct cluster list""" + tgen = get_topogen() + + if tgen.routers_have_failure(): + pytest.skip(tgen.errors) + + logger.info("Testing cluster-lists of prefixes on r1, r2, r3, r6, r7, r8 and r9") + + r1 = tgen.gears["r1"] + r2 = tgen.gears["r2"] + r3 = tgen.gears["r3"] + r6 = tgen.gears["r6"] + r7 = tgen.gears["r7"] + r8 = tgen.gears["r8"] + r9 = tgen.gears["r9"] + + expected_results = { + r1:{ + "10.0.0.2":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r9:{ + "10.0.0.8":["20.0.0.2"], + "10.0.0.7":["20.0.0.3"], + "10.0.0.6":["20.0.0.3"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.2":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + }, + r2:{ + "10.0.0.1":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r3:{ + "10.0.0.2":["20.0.0.1"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.6":["20.0.0.2","20.0.0.3"], + "10.0.0.7":["20.0.0.2","20.0.0.3"], + "10.0.0.9":["20.0.0.2"], + }, + r6:{ + "10.0.0.7":["20.0.0.3"], + "10.0.0.8":["20.0.0.2"], + "10.0.0.1":["20.0.0.3","20.0.0.1"], + "10.0.0.2":["20.0.0.3","20.0.0.1"], + "10.0.0.9":["20.0.0.3"], + }, + r7:{ + "10.0.0.6":["20.0.0.3"], + "10.0.0.8":["20.0.0.2"], + "10.0.0.1":["20.0.0.3","20.0.0.1"], + "10.0.0.2":["20.0.0.3","20.0.0.1"], + "10.0.0.9":["20.0.0.3"], + }, + r8:{ + "10.0.0.7":["20.0.0.3"], + "10.0.0.6":["20.0.0.3"], + "10.0.0.1":["20.0.0.2","20.0.0.1"], + "10.0.0.2":["20.0.0.2","20.0.0.1"], + "10.0.0.9":["20.0.0.2"], + }, + } + for r in expected_results.keys(): + logger.info("Checking {} routes".format(r.name)) + + test_func = functools.partial( + check_routes_cluster_list, + r, + list(expected_results[r].keys()), + [expected_results[r][k] for k in expected_results[r].keys()] + ) + success, result = topotest.run_and_expect(test_func, True, count=20, wait=2) + assert success, "{}: incorrect cluster lists".format(r.name) + +def test_client_to_client_reflection_per_neighbor_cluster(): + """Test that client_to_client_reflection works as intended inside of and between per-neighbor clusters""" + tgen = get_topogen() + + if tgen.routers_have_failure(): + pytest.skip(tgen.errors) + + r1 = tgen.gears["r1"] + r2 = tgen.gears["r2"] + r3 = tgen.gears["r3"] + r4 = tgen.gears["r4"] + r9 = tgen.gears["r9"] + + logger.info("Testing cluster-lists of prefixes on r1, r2, r3 and r9 with no reflection in cluster 20.0.0.1 ") + + r4.vtysh_cmd("configure \n\ + router bgp 100 \n\ + bgp cluster-id per-neighbor 20.0.0.1 client-to-client-reflection never\n") + + expected_results = { + r1:{ + "10.0.0.3":["20.0.0.2"], + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r2:{ + "10.0.0.3":["20.0.0.2"], + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r3:{ + "10.0.0.2":["20.0.0.1"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.6":["20.0.0.2","20.0.0.3"], + "10.0.0.7":["20.0.0.2","20.0.0.3"], + "10.0.0.9":["20.0.0.2"], + }, + r9:{ + "10.0.0.8":["20.0.0.2"], + "10.0.0.7":["20.0.0.3"], + "10.0.0.6":["20.0.0.3"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.2":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + } + } + + for r in expected_results.keys(): + logger.info("Checking {} routes".format(r.name)) + + test_func = functools.partial( + check_routes_cluster_list, + r, + list(expected_results[r].keys()), + [expected_results[r][k] for k in expected_results[r].keys()] + ) + success, result = topotest.run_and_expect(test_func, True, count=20, wait=2) + assert success, "{}: incorrect cluster lists".format(r.name) + + logger.info("Testing cluster-lists of prefixes on r1, r2, r3 and r9 with no reflection in cluster 20.0.0.1 and no client-to-client reflection") + + r4.vtysh_cmd("configure \n\ + router bgp 100 \n\ + no bgp client-to-client reflection\n") + + expected_results = { + r1:{ + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r2:{ + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r3:{ + "10.0.0.6":["20.0.0.2","20.0.0.3"], + "10.0.0.7":["20.0.0.2","20.0.0.3"], + "10.0.0.9":["20.0.0.2"], + }, + r9:{ + "10.0.0.8":["20.0.0.2"], + "10.0.0.7":["20.0.0.3"], + "10.0.0.6":["20.0.0.3"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.2":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + } + } + + for r in expected_results.keys(): + logger.info("Checking {} routes".format(r.name)) + + test_func = functools.partial( + check_routes_cluster_list, + r, + list(expected_results[r].keys()), + [expected_results[r][k] for k in expected_results[r].keys()] + ) + success, result = topotest.run_and_expect(test_func, True, count=20, wait=2) + assert success, "{}: incorrect cluster lists".format(r.name) + + logger.info("Testing cluster-lists of prefixes on r1, r2, r3 and r9 with reflection in cluster 20.0.0.1 and no client-to-client reflection") + + r4.vtysh_cmd(""" + configure + router bgp 100 + bgp cluster-id per-neighbor 20.0.0.1 client-to-client-reflection always + """) + + expected_results = { + r1:{ + "10.0.0.2":["20.0.0.1"], + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r2:{ + "10.0.0.1":["20.0.0.1"], + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r3:{ + "10.0.0.6":["20.0.0.2","20.0.0.3"], + "10.0.0.7":["20.0.0.2","20.0.0.3"], + "10.0.0.9":["20.0.0.2"], + }, + r9:{ + "10.0.0.8":["20.0.0.2"], + "10.0.0.7":["20.0.0.3"], + "10.0.0.6":["20.0.0.3"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.2":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + } + } + for r in expected_results.keys(): + logger.info("Checking {} routes".format(r.name)) + + test_func = functools.partial( + check_routes_cluster_list, + r, + list(expected_results[r].keys()), + [expected_results[r][k] for k in expected_results[r].keys()] + ) + success, result = topotest.run_and_expect(test_func, True, count=20, wait=1) + assert success, "{}: incorrect cluster lists".format(r.name) + + #reset of the configuration + r4.vtysh_cmd(""" + configure + router bgp 100 + no bgp cluster-id per-neighbor 20.0.0.1 client-to-client-reflection + bgp client-to-client reflection + """ ) + +def test_client_to_client_reflection_global_cluster(): + """Test that client_to_client_reflection works as intended inside of and between per-neighbor clusters""" + tgen = get_topogen() + + if tgen.routers_have_failure(): + pytest.skip(tgen.errors) + + r1 = tgen.gears["r1"] + r2 = tgen.gears["r2"] + r3 = tgen.gears["r3"] + r4 = tgen.gears["r4"] + r9 = tgen.gears["r9"] + + logger.info("Testing cluster-lists of prefixes on r1, r2, r3 and r9 with no reflection in global cluster") + r4.vtysh_cmd(""" + configure + router bgp 100 + bgp cluster-id global client-to-client-reflection never + bgp cluster-id 20.0.0.1 + """) + + expected_results = { + r1:{ + "10.0.0.3":["20.0.0.2"], + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r2:{ + "10.0.0.3":["20.0.0.2"], + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r3:{ + "10.0.0.2":["20.0.0.1"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.6":["20.0.0.2","20.0.0.3"], + "10.0.0.7":["20.0.0.2","20.0.0.3"], + "10.0.0.9":["20.0.0.2"], + }, + r9:{ + "10.0.0.8":["20.0.0.2"], + "10.0.0.7":["20.0.0.3"], + "10.0.0.6":["20.0.0.3"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.2":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + } + } + + for r in expected_results.keys(): + logger.info("Checking {} routes".format(r.name)) + + test_func = functools.partial( + check_routes_cluster_list, + r, + list(expected_results[r].keys()), + [expected_results[r][k] for k in expected_results[r].keys()] + ) + success, result = topotest.run_and_expect(test_func, True, count=20, wait=1) + assert success, "{}: incorrect cluster lists".format(r.name) + + logger.info("Testing cluster-lists of prefixes on r1, r2, r3 and r9 with no reflection in global cluster and no client-to-client reflection") + + r4.vtysh_cmd(""" + configure + router bgp 100 + no bgp client-to-client reflection + """) + + expected_results = { + r1:{ + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r2:{ + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r3:{ + "10.0.0.6":["20.0.0.2","20.0.0.3"], + "10.0.0.7":["20.0.0.2","20.0.0.3"], + "10.0.0.9":["20.0.0.2"], + }, + r9:{ + "10.0.0.8":["20.0.0.2"], + "10.0.0.7":["20.0.0.3"], + "10.0.0.6":["20.0.0.3"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.2":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + } + } + + for r in expected_results.keys(): + logger.info("Checking {} routes".format(r.name)) + + test_func = functools.partial( + check_routes_cluster_list, + r, + list(expected_results[r].keys()), + [expected_results[r][k] for k in expected_results[r].keys()] + ) + success, result = topotest.run_and_expect(test_func, True, count=20, wait=1) + assert success, "{}: incorrect cluster lists".format(r.name) + + logger.info("Testing cluster-lists of prefixes on r1, r2, r3 and r9 with reflection in global cluster and no client-to-client reflection") + + r4.vtysh_cmd("""configure + router bgp 100 + bgp cluster-id global client-to-client-reflection always""") + + expected_results = { + r1:{ + "10.0.0.2":["20.0.0.1"], + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r2:{ + "10.0.0.1":["20.0.0.1"], + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r3:{ + "10.0.0.6":["20.0.0.2","20.0.0.3"], + "10.0.0.7":["20.0.0.2","20.0.0.3"], + "10.0.0.9":["20.0.0.2"], + }, + r9:{ + "10.0.0.8":["20.0.0.2"], + "10.0.0.7":["20.0.0.3"], + "10.0.0.6":["20.0.0.3"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.2":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + } + } + + for r in expected_results.keys(): + logger.info("Checking {} routes".format(r.name)) + + test_func = functools.partial( + check_routes_cluster_list, + r, + list(expected_results[r].keys()), + [expected_results[r][k] for k in expected_results[r].keys()] + ) + success, result = topotest.run_and_expect(test_func, True, count=20, wait=1) + assert success, "{}: incorrect cluster lists".format(r.name) + + #reset of the configuration + r4.vtysh_cmd("""configure + router bgp 100 + no bgp cluster-id global client-to-client-reflection + bgp client-to-client reflection + bgp cluster-id 10.0.0.4""" ) + +def test_bgp_multiple_cluster_prefer_global_cluster_configuration(): + """Test that prefix are correctly reflected between per-neighbor clusters with correct cluster list + whenever non-client-to-client prefer-global-cluster-id is configured""" + tgen = get_topogen() + + if tgen.routers_have_failure(): + pytest.skip(tgen.errors) + + + logger.info("Testing cluster-lists of prefixes on r1, r2, r3, r6, r7, r8 and r9") + + r1 = tgen.gears["r1"] + r2 = tgen.gears["r2"] + r3 = tgen.gears["r3"] + r4 = tgen.gears["r4"] + r5 = tgen.gears["r5"] + r6 = tgen.gears["r6"] + r7 = tgen.gears["r7"] + r8 = tgen.gears["r8"] + r9 = tgen.gears["r9"] + + r4.vtysh_cmd("""configure + router bgp 100 + bgp cluster-id non-client-to-client prefer-global-cluster-id""") + r5.vtysh_cmd("""configure + router bgp 100 + bgp cluster-id non-client-to-client prefer-global-cluster-id""") + + expected_results = { + r1:{ + "10.0.0.2":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + "10.0.0.7":["10.0.0.4","20.0.0.3"], + "10.0.0.6":["10.0.0.4","20.0.0.3"], + "10.0.0.9":["10.0.0.4"], + }, + r9:{ + "10.0.0.8":["20.0.0.2"], + "10.0.0.7":["20.0.0.3"], + "10.0.0.6":["20.0.0.3"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.2":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + }, + r2:{ + "10.0.0.1":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + "10.0.0.6":["10.0.0.4","20.0.0.3"], + "10.0.0.7":["10.0.0.4","20.0.0.3"], + "10.0.0.9":["10.0.0.4"], + }, + r3:{ + "10.0.0.2":["20.0.0.1"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.6":["10.0.0.4","20.0.0.3"], + "10.0.0.7":["10.0.0.4","20.0.0.3"], + "10.0.0.9":["10.0.0.4"], + }, + r6:{ + "10.0.0.7":["20.0.0.3"], + "10.0.0.8":["20.0.0.2"], + "10.0.0.1":["10.0.0.5","20.0.0.1"], + "10.0.0.2":["10.0.0.5","20.0.0.1"], + "10.0.0.9":["10.0.0.5"], + }, + r7:{ + "10.0.0.6":["20.0.0.3"], + "10.0.0.8":["20.0.0.2"], + "10.0.0.1":["10.0.0.5","20.0.0.1"], + "10.0.0.2":["10.0.0.5","20.0.0.1"], + "10.0.0.9":["10.0.0.5"], + }, + r8:{ + "10.0.0.7":["20.0.0.3"], + "10.0.0.6":["20.0.0.3"], + "10.0.0.1":["10.0.0.5","20.0.0.1"], + "10.0.0.2":["10.0.0.5","20.0.0.1"], + "10.0.0.9":["10.0.0.5"], + }, + } + for r in expected_results.keys(): + logger.info("Checking {} routes".format(r.name)) + + test_func = functools.partial( + check_routes_cluster_list, + r, + list(expected_results[r].keys()), + [expected_results[r][k] for k in expected_results[r].keys()] + ) + success, result = topotest.run_and_expect(test_func, True, count=20, wait=2) + assert success, "{}: incorrect cluster lists".format(r.name) + + #reset of the configuration + r4.vtysh_cmd("""configure + router bgp 100 + no bgp cluster-id non-client-to-client prefer-global-cluster-id""") + r5.vtysh_cmd("""configure + router bgp 100 + no bgp cluster-id non-client-to-client prefer-global-cluster-id""") + +def test_bgp_multiple_cluster_loose_cluster_check_configuration(): + """Test that prefix are correctly reflected between per-neighbor clusters with correct cluster list + whenever loose-cluster-list-check is configured""" + tgen = get_topogen() + + if tgen.routers_have_failure(): + pytest.skip(tgen.errors) + + + logger.info("Testing cluster-lists of prefixes on r1, r2, r3, r6, r7, r8 and r9") + + r1 = tgen.gears["r1"] + r2 = tgen.gears["r2"] + r3 = tgen.gears["r3"] + r4 = tgen.gears["r4"] + r5 = tgen.gears["r5"] + r6 = tgen.gears["r6"] + r7 = tgen.gears["r7"] + r8 = tgen.gears["r8"] + r9 = tgen.gears["r9"] + + r4.vtysh_cmd("""configure + router bgp 100 + bgp cluster-id loose-cluster-list-check""") + r5.vtysh_cmd("""configure + router bgp 100 + bgp cluster-id loose-cluster-list-check""") + + expected_results = { + r1:{ + "10.0.0.2":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + "10.0.0.8":["20.0.0.1","20.0.0.2"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.9":["20.0.0.1"], + }, + r9:{ + "10.0.0.8":["20.0.0.2"], + "10.0.0.7":["20.0.0.3"], + "10.0.0.6":["20.0.0.3"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.2":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + }, + r2:{ + "10.0.0.1":["20.0.0.1"], + "10.0.0.3":["20.0.0.2"], + "10.0.0.6":["20.0.0.1","20.0.0.3"], + "10.0.0.7":["20.0.0.1","20.0.0.3"], + "10.0.0.8":["20.0.0.1","20.0.0.2"], + "10.0.0.9":["20.0.0.1"], + }, + r3:{ + "10.0.0.2":["20.0.0.1"], + "10.0.0.1":["20.0.0.1"], + "10.0.0.6":["20.0.0.2","20.0.0.3"], + "10.0.0.7":["20.0.0.2","20.0.0.3"], + "10.0.0.9":["20.0.0.2"], + }, + r6:{ + "10.0.0.7":["20.0.0.3"], + "10.0.0.8":["20.0.0.2"], + "10.0.0.1":["20.0.0.3","20.0.0.1"], + "10.0.0.2":["20.0.0.3","20.0.0.1"], + "10.0.0.3":["20.0.0.3","20.0.0.2"], + "10.0.0.9":["20.0.0.3"], + }, + r7:{ + "10.0.0.6":["20.0.0.3"], + "10.0.0.8":["20.0.0.2"], + "10.0.0.1":["20.0.0.3","20.0.0.1"], + "10.0.0.2":["20.0.0.3","20.0.0.1"], + "10.0.0.3":["20.0.0.3","20.0.0.2"], + "10.0.0.9":["20.0.0.3"], + }, + r8:{ + "10.0.0.7":["20.0.0.3"], + "10.0.0.6":["20.0.0.3"], + "10.0.0.1":["20.0.0.2","20.0.0.1"], + "10.0.0.2":["20.0.0.2","20.0.0.1"], + "10.0.0.9":["20.0.0.2"], + }, + } + for r in expected_results.keys(): + logger.info("Checking {} routes".format(r.name)) + + test_func = functools.partial( + check_routes_cluster_list, + r, + list(expected_results[r].keys()), + [expected_results[r][k] for k in expected_results[r].keys()] + ) + success, result = topotest.run_and_expect(test_func, True, count=20, wait=2) + assert success, "{}: incorrect cluster lists".format(r.name) + + #reset of the configuration + r4.vtysh_cmd("""configure + router bgp 100 + no bgp cluster-id loose-cluster-list-check""") + r5.vtysh_cmd("""configure + router bgp 100 + no bgp cluster-id loose-cluster-list-check""") + +if __name__ == "__main__": + args = ["-s"] + sys.argv[1:] + sys.exit(pytest.main(args)) From 23ed94f3f54de3a055bb27fd06cf62bdd22c66fd Mon Sep 17 00:00:00 2001 From: Pierre Neltner Date: Thu, 11 Jun 2026 11:41:59 +0200 Subject: [PATCH 3/9] bgpd: style modifications in bgp_route.c Modify style to comply with style guidlines in bgp_route.c Signed-off-by: Pierre Neltner --- bgpd/bgp_route.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/bgpd/bgp_route.c b/bgpd/bgp_route.c index 81ae7a0d9f01..acd8b26a426c 100644 --- a/bgpd/bgp_route.c +++ b/bgpd/bgp_route.c @@ -2689,12 +2689,12 @@ bool subgroup_announce_check(struct bgp_dest *dest, struct bgp_path_info *pi, /* IBGP reflection check. */ if (ibgp_to_ibgp && !samepeer_safe) { /* A route from a Client peer. */ - if (CHECK_FLAG(from->af_flags[afi][safi], - PEER_FLAG_REFLECTOR_CLIENT)) { + if (CHECK_FLAG(from->af_flags[afi][safi], PEER_FLAG_REFLECTOR_CLIENT)) { /* Reflect to all the Non-Client peers and also to the - Client peers other than the originator. Originator - check - is already done. So there is noting to do. */ + * Client peers other than the originator. Originator + * check + * is already done. So there is nothing to do. + */ /* no bgp client-to-client reflection check. */ if (CHECK_FLAG(bgp->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT)) @@ -2703,9 +2703,9 @@ bool subgroup_announce_check(struct bgp_dest *dest, struct bgp_path_info *pi, return false; } else { /* A route from a Non-client peer. Reflect to all other - clients. */ - if (!CHECK_FLAG(peer->af_flags[afi][safi], - PEER_FLAG_REFLECTOR_CLIENT)) + * clients. + */ + if (!CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_REFLECTOR_CLIENT)) return false; } } From 7bb5b9e51652b4f57ce17b421190c6e5cfedb43d Mon Sep 17 00:00:00 2001 From: Pierre Neltner Date: Thu, 11 Jun 2026 10:59:59 +0200 Subject: [PATCH 4/9] bgpd: refactor the route-reflector reflection-check in bgp_route.c Refactor the route-reflector reflection-check in bgp_route.c in order to prepare for next commit. Signed-off-by: Pierre Neltner --- bgpd/bgp_route.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/bgpd/bgp_route.c b/bgpd/bgp_route.c index acd8b26a426c..b2e47f1b1304 100644 --- a/bgpd/bgp_route.c +++ b/bgpd/bgp_route.c @@ -2395,6 +2395,15 @@ void subgroup_announce_reset_nhop(uint8_t family, struct attr *attr) memset(&attr->mp_nexthop_global_in, 0, BGP_ATTR_NHLEN_IPV4); } +static bool ibgp_reflection_check(struct bgp *bgp, struct peer *from, struct peer *peer, afi_t afi, + safi_t safi) +{ + if (CHECK_FLAG(bgp->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT)) { + return false; + } + return true; +} + bool subgroup_announce_check(struct bgp_dest *dest, struct bgp_path_info *pi, struct update_subgroup *subgrp, const struct prefix *p, struct attr *attr, @@ -2695,12 +2704,12 @@ bool subgroup_announce_check(struct bgp_dest *dest, struct bgp_path_info *pi, * check * is already done. So there is nothing to do. */ - /* no bgp client-to-client reflection check. */ - if (CHECK_FLAG(bgp->flags, - BGP_FLAG_NO_CLIENT_TO_CLIENT)) - if (CHECK_FLAG(peer->af_flags[afi][safi], - PEER_FLAG_REFLECTOR_CLIENT)) + if (CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_REFLECTOR_CLIENT)) { + /* checks client-to-client reflection policies*/ + ret = ibgp_reflection_check(bgp, from, peer, afi, safi); + if (!ret) return false; + } } else { /* A route from a Non-client peer. Reflect to all other * clients. From e90b5d61bbf759d23d65613175a1d16a28f1efea Mon Sep 17 00:00:00 2001 From: Pierre Neltner Date: Wed, 6 May 2026 13:18:24 +0200 Subject: [PATCH 5/9] bgpd: Add cmd to configure client-to-client reflection in clusters Add 2 new flags: CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER and CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER_CONFIGURED. These flags represent the three states that are possible for a cluster configuration and apply to the flags attribute of the cluster object: the cluster-to-cluster behavior is not configured, the cluster-to-cluster behavior is configured and enabled, the cluster-to-cluster behavior is configured and disabled Add 2 new flags: BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER and BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER_CONFIGURED. These flags have same role as the ones described before but for the global cluster, they apply to bgp->flags. The behavior of client-to-client reflection is now the following: reflection inside of a cluster is always allowed if it is configured as allowed for that cluster, it is always forbidden if it is configured as forbidden for that cluster. If nothing is configured for that cluster then the legacy client-to-client reflection configuration applies. Between two different clusters the legacy client-to-client reflection configuration applies. This allows any pattern of reflection to be easily set up. This configuration is achieved through the new command: ```[no] bgp cluster \ client-to-client-reflection ``` That command configures client-to-client reflection on any per-neighbor cluster or the global cluster. In its negative form it unconfigures client-to-client reflection. If it is configured on a per-neighbor cluster then adds one to the reference of that cluster and if it doesn't exist creates it in a similar fashion to the neighbor X.X.X.X cluster-id command. These configurations are shown in the running config Signed-off-by: Pierre Neltner --- bgpd/bgp_route.c | 114 +++++++++++++++++++++++++++++++++++++++++++++-- bgpd/bgp_vty.c | 86 ++++++++++++++++++++++++++++++++++- bgpd/bgpd.c | 109 +++++++++++++++++++++++++++++++++++++++++++- bgpd/bgpd.h | 13 +++++- doc/user/bgp.rst | 13 ++++++ 5 files changed, 328 insertions(+), 7 deletions(-) diff --git a/bgpd/bgp_route.c b/bgpd/bgp_route.c index b2e47f1b1304..f9ef05456a72 100644 --- a/bgpd/bgp_route.c +++ b/bgpd/bgp_route.c @@ -2395,13 +2395,121 @@ void subgroup_announce_reset_nhop(uint8_t family, struct attr *attr) memset(&attr->mp_nexthop_global_in, 0, BGP_ATTR_NHLEN_IPV4); } + +static bool ibgp_reflection_check_from_per_neighbor(struct bgp *bgp, struct peer *from, + struct peer *peer, afi_t afi, safi_t safi) +{ + if (!CHECK_FLAG(from->cluster_flags[afi][safi], + CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER_CONFIGURED)) { + if (CHECK_FLAG(bgp->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT)) + /* both peers are clients, from is part of a per-neighbor cluster but + * client-to-client reflection isn't configured in that cluster, + * client-to-client reflection is disabled at the bgp instance level + */ + return false; + + } else if (CHECK_FLAG(from->cluster_flags[afi][safi], + CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER)) { + if (CHECK_FLAG(bgp->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT)) { + if (!CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) + /* both peers are clients, from is part of a per-neighbor cluster and + * client-to-client reflection is enabled in that cluster but + * client-to-client reflection is disabled at the bgp instance level + * and peer is part of the global cluster + */ + return false; + + if (!IPV4_ADDR_SAME(&from->cluster[afi][safi], &peer->cluster[afi][safi])) + /* both peers are clients, from is part of a per-neighbor cluster and + * client-to-client reflection is enabled in that cluster but + * client-to-client reflection is disabled at the bgp instance level + * and peer is part of another per-neighbor cluster + */ + return false; + } + } else { + if (CHECK_FLAG(bgp->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT)) + /* both peers are clients, from is part of a per-neighbor cluster and + * client-to-client reflection is disabled both in that cluster and + * at the bgp instance level + */ + return false; + + if (CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) { + if (IPV4_ADDR_SAME(&from->cluster[afi][safi], &peer->cluster[afi][safi])) + /* both peers are clients, from is part of a per-neighbor cluster and + * client-to-client reflection is disabled in that cluster and both + * peers are part of that cluster + */ + return false; + } + } + return true; +} + +static bool ibgp_reflection_check_from_global(struct bgp *bgp, struct peer *from, + struct peer *peer, afi_t afi, safi_t safi) +{ + if (!CHECK_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER_CONFIGURED)) { + if (CHECK_FLAG(bgp->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT)) + /* both peers are clients, from is part of the global cluster and + * client-to-client reflection is not configured in that cluster but + * client-to-client reflection is disabled at the bgp instance level + */ + return false; + } else if (CHECK_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER)) { + if (CHECK_FLAG(bgp->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT) && + CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) { + /* both peers are clients, from is part of the global cluster and + * client-to-client reflection is enabled in that cluster but + * client-to-client reflection is disabled at the bgp instance level + * and peer is not part of the default cluster + */ + + if (!CHECK_FLAG(peer->cluster_flags[afi][safi], CLUSTER_FLAG_GLOBAL)) + return false; + } + } else { + if (CHECK_FLAG(bgp->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT)) + /* both peers are clients, from is part of the global cluster and + * client-to-client reflection is disabled in that cluster and + * client-to-client reflection is disabled at the bgp instance level + */ + return false; + + if (!CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) + /* both peers are clients, both peer and from are + * part of the global cluster and client-to-client reflection + * is disabled in that cluster + */ + return false; + + if (CHECK_FLAG(peer->cluster_flags[afi][safi], CLUSTER_FLAG_GLOBAL)) + /* both peers are clients, both peer and from are + * part of the global cluster and client-to-client reflection + * is disabled in that cluster + */ + return false; + } + return true; +} + static bool ibgp_reflection_check(struct bgp *bgp, struct peer *from, struct peer *peer, afi_t afi, safi_t safi) { - if (CHECK_FLAG(bgp->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT)) { - return false; + /* reflection logic is as follows : + * if both peers are in the same cluster, be it global or per-neighbor + * the client-to-client configuration of the said cluster decides if the + * route is reflected or not. + * if there is no client-to-client configuration in that cluster or + * if both peers are from different cluster the bgp level configuration + * applies. + */ + if (CHECK_FLAG(from->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) { + if (!CHECK_FLAG(from->cluster_flags[afi][safi], CLUSTER_FLAG_GLOBAL)) + return ibgp_reflection_check_from_per_neighbor(bgp, from, peer, afi, safi); } - return true; + return ibgp_reflection_check_from_global(bgp, from, peer, afi, safi); } bool subgroup_announce_check(struct bgp_dest *dest, struct bgp_path_info *pi, diff --git a/bgpd/bgp_vty.c b/bgpd/bgp_vty.c index 6436eb671261..0865c141c23d 100644 --- a/bgpd/bgp_vty.c +++ b/bgpd/bgp_vty.c @@ -142,6 +142,10 @@ FRR_CFG_DEFAULT_BOOL(BGP_IPV6_NEXTHOP_PREFER_GLOBAL, { .val_bool = false }, ); +FRR_CFG_DEFAULT_BOOL(BGP_CLIENT_TO_CLIENT, + { .val_bool = true }, +); + DEFINE_HOOK(bgp_inst_config_write, (struct bgp *bgp, struct vty *vty), (bgp, vty)); @@ -738,6 +742,8 @@ int bgp_get_vty(struct bgp **bgp, as_t *as, const char *name, SET_FLAG((*bgp)->flags, BGP_FLAG_RR_ALLOW_OUTBOUND_POLICY); if (DFLT_BGP_COMPARE_AIGP) SET_FLAG((*bgp)->flags, BGP_FLAG_COMPARE_AIGP); + if (!DFLT_BGP_CLIENT_TO_CLIENT) + SET_FLAG((*bgp)->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT); ret = BGP_SUCCESS; } @@ -5528,6 +5534,53 @@ DEFPY (neighbor_cluster_id, return CMD_SUCCESS; } +DEFPY (bgp_cluster_id_client_to_client, + bgp_cluster_id_client_to_client_cmd, + "[no] bgp cluster-id [$id] client-to-client-reflection []", + NO_STR + BGP_STR + "Configure Route-Reflector Cluster-id\n" + "Configure a per-neighbor cluster\n" + "Configure the global cluster\n" + "Route-Reflector Cluster-id in IP address format\n" + "Route-Reflector Cluster-id as 32 bit quantity\n" + "Configure client-to-client route reflection intra-cluster\n" + "Reflection of routes always allowed inside this cluster\n" + "Reflection of routes forbidden inside this cluster\n" + ) +{ + VTY_DECLVAR_CONTEXT(bgp, bgp); + int ret; + struct in_addr cluster = { 0 }; + + if (per_neighbor) { + if (id) { + ret = inet_aton(id, &cluster); + if (!ret) { + vty_out(vty, "%% Malformed bgp cluster identifier\n"); + return CMD_WARNING_CONFIG_FAILED; + } + } else { + vty_out(vty, "%% specify a per-neighbor cluster\n"); + return CMD_WARNING_CONFIG_FAILED; + } + } + + if (no) { + bgp_cluster_client_to_client_unset(bgp, per_neighbor, &cluster); + } else { + /* always|never argument is optional in order for the negative + *form of the command to be clean + */ + if (!conf_true && !conf_false) { + vty_out(vty, "%% specify a value for client-to-client reflection\n"); + return CMD_WARNING_CONFIG_FAILED; + } + bgp_cluster_client_to_client_set(bgp, per_neighbor, &cluster, conf_true); + } + return CMD_SUCCESS; +} + DEFPY (bgp_allow_martian, bgp_allow_martian_cmd, "[no]$no bgp allow-martian-nexthop", @@ -22450,14 +22503,43 @@ int bgp_config_write(struct vty *vty) bgp->default_subgroup_pkt_queue_max); /* BGP client-to-client reflection. */ - if (CHECK_FLAG(bgp->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT)) - vty_out(vty, " no bgp client-to-client reflection\n"); + if (!!CHECK_FLAG(bgp->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT) == + SAVE_BGP_CLIENT_TO_CLIENT) + vty_out(vty, " %sbgp client-to-client reflection\n", + CHECK_FLAG(bgp->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT) ? "no " : ""); /* BGP cluster ID. */ if (CHECK_FLAG(bgp->config, BGP_CONFIG_CLUSTER_ID)) vty_out(vty, " bgp cluster-id %pI4\n", &bgp->cluster_id); + /*BGP client-to-client reflection in the global cluster*/ + if (CHECK_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER_CONFIGURED)) { + if (CHECK_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER)) + vty_out(vty, + " bgp cluster-id global client-to-client-reflection always\n"); + else + vty_out(vty, + " bgp cluster-id global client-to-client-reflection never\n"); + } + + /*BGP per-neighbor clusters*/ + frr_each (per_neighbor_cluster_list, &bgp->per_neighbor_clusters, cluster) { + if (CHECK_FLAG(cluster->flags, + CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER_CONFIGURED)) { + if (CHECK_FLAG(cluster->flags, + CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER)) + vty_out(vty, + " bgp cluster-id per-neighbor %pI4 client-to-client-reflection always\n", + &cluster->cluster_id); + else { + vty_out(vty, + " bgp cluster-id per-neighbor %pI4 client-to-client-reflection never\n", + &cluster->cluster_id); + } + } + } + /* Disable ebgp connected nexthop check */ if (CHECK_FLAG(bgp->flags, BGP_FLAG_DISABLE_NH_CONNECTED_CHK)) vty_out(vty, diff --git a/bgpd/bgpd.c b/bgpd/bgpd.c index 9ae30f3a7b5d..5f58561a605f 100644 --- a/bgpd/bgpd.c +++ b/bgpd/bgpd.c @@ -7006,7 +7006,7 @@ void bgp_neighbor_cluster_id_unset(struct bgp *bgp, struct peer *peer, afi_t afi if (peer_group_active(peer)) { peer_af_flag_inherit(peer, afi, safi, PEER_FLAG_CLUSTER_ID); PEER_ATTR_INHERIT(peer, peer->group, cluster[afi][safi]); - PEER_ATTR_INHERIT(peer, peer->group, cluster_ptr[afi][safi]); + PEER_ATTR_INHERIT(peer, peer->group, cluster_flags[afi][safi]); /*ensures refcnt is incremented when the cluster is inherited from the group*/ if (CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) @@ -7128,6 +7128,113 @@ void bgp_per_neighbor_cluster_id_delete(struct bgp *bgp, struct in_addr *cluster } } +/* Clear all cluster members and give them their new flags*/ +static void clear_and_update_flags_for_cluster_members(struct bgp *bgp, const char *per_neighbor, + struct in_addr *cluster_id, + struct cluster *cluster) +{ + struct listnode *node, *nnode; + struct peer *peer; + struct in_addr *global_cluster_id; + afi_t afi; + safi_t safi; + + if (CHECK_FLAG(bgp->config, BGP_CONFIG_CLUSTER_ID)) + global_cluster_id = &bgp->cluster_id; + else + global_cluster_id = &bgp->router_id; + + for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) { + if (peer->sort != BGP_PEER_IBGP) + continue; + FOREACH_AFI_SAFI (afi, safi) { + /* Filter all peers that are not in the modified cluster */ + if (!peer->afc[afi][safi]) + continue; + if (per_neighbor) { + if (!CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID) || + !IPV4_ADDR_SAME(cluster_id, &peer->cluster[afi][safi])) + continue; + } else { + if (CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID) && + !IPV4_ADDR_SAME(global_cluster_id, &peer->cluster[afi][safi])) + continue; + } + /*if the neighbor has a cluster-id configured it should inherit from the + *new cluster flags + */ + if (CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) { + if (cluster && !CHECK_FLAG(cluster->flags, CLUSTER_FLAG_GLOBAL)) + peer->cluster_flags[afi][safi] = cluster->flags; + else + peer->cluster_flags[afi][safi] = + copy_global_cluster_flags(bgp); + } + if (CHECK_FLAG(peer->sflags, PEER_STATUS_GROUP)) + continue; + peer_clear_soft(peer, afi, safi, BGP_CLEAR_SOFT_BOTH); + } + } +} + +void bgp_cluster_client_to_client_set(struct bgp *bgp, const char *per_neighbor, + struct in_addr *cluster_id, const char *configuration) +{ + struct cluster *cluster = NULL; + + /* if the target cluster is a per-neighbor cluster*/ + if (per_neighbor) { + cluster = per_neighbor_cluster_lookup(bgp, cluster_id); + /*if the cluster doesn't exist or if it didn't have any client-to-client policy: add a references to that cluster*/ + if (!cluster || + !CHECK_FLAG(cluster->flags, + CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER_CONFIGURED)) { + cluster = bgp_per_neighbor_cluster_id_add(bgp, cluster_id); + } + + /* set the client-to-client-reflection to configured state and configure it with the desired value*/ + SET_FLAG(cluster->flags, CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER_CONFIGURED); + if (configuration) + SET_FLAG(cluster->flags, CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER); + else + UNSET_FLAG(cluster->flags, CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER); + } + + /* if the target cluster is the global cluster*/ + else { + /* set the client-to-client-reflection to configured state and configure it with the desired value*/ + SET_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER_CONFIGURED); + if (configuration) + SET_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER); + else + UNSET_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER); + } + clear_and_update_flags_for_cluster_members(bgp, per_neighbor, cluster_id, cluster); +} + +void bgp_cluster_client_to_client_unset(struct bgp *bgp, const char *per_neighbor, + struct in_addr *cluster_id) +{ + struct cluster *cluster = NULL; + + /* if the target cluster is a per-neighbor cluster*/ + if (per_neighbor) { + cluster = per_neighbor_cluster_lookup(bgp, cluster_id); + if (!cluster || !CHECK_FLAG(cluster->flags, + CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER_CONFIGURED)) + return; + + UNSET_FLAG(cluster->flags, CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER_CONFIGURED); + } + /* if the target cluster is the global cluster*/ + else + UNSET_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER_CONFIGURED); + + clear_and_update_flags_for_cluster_members(bgp, per_neighbor, cluster_id, cluster); + if (per_neighbor) + bgp_per_neighbor_cluster_id_delete(bgp, cluster_id); +} + /* neighbor weight. */ int peer_weight_set(struct peer *peer, afi_t afi, safi_t safi, uint16_t weight) { diff --git a/bgpd/bgpd.h b/bgpd/bgpd.h index a8899e4e61e8..78abde433dc4 100644 --- a/bgpd/bgpd.h +++ b/bgpd/bgpd.h @@ -764,6 +764,8 @@ struct bgp { #define BGP_FLAG_VRF_MAY_LISTEN (1ULL << 44) #define BGP_FLAG_SOFT_VERSION_CAPABILITY_NEW (1ULL << 45) #define BGP_FLAG_USE_RECURSIVE_WEIGHT (1ULL << 46) +#define BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER (1ULL << 47) +#define BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER_CONFIGURED (1ULL << 48) /* Use current (imported) path's attributes instead of source path's attributes * for bestpath comparison of imported paths. @@ -2364,8 +2366,13 @@ struct cluster { /*cluster-id*/ struct in_addr cluster_id; - /*flag for client-to-client reflection*/ uint8_t flags; + /*both CLUSTER_FLAG_CLIENT_TO_CLIENT act as one ternary flag + *with states: + *always, never and not configured + */ +#define CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER_CONFIGURED (1 << 0) +#define CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER (1 << 1) #define CLUSTER_FLAG_GLOBAL (1 << 2) /*count the number of times the cluster is referenced during @@ -2785,6 +2792,10 @@ extern void bgp_neighbor_cluster_id_unset(struct bgp *bgp, struct peer *peer, af safi_t safi); extern void bgp_neighbor_cluster_id_set(struct bgp *bgp, struct in_addr *cluster_id, struct peer *peer, afi_t afi, safi_t safi); +extern void bgp_cluster_client_to_client_unset(struct bgp *bgp, const char *per_neighbor, + struct in_addr *cluster_id); +extern void bgp_cluster_client_to_client_set(struct bgp *bgp, const char *per_neighbor, + struct in_addr *cluster_id, const char *configuration); extern void bgp_confederation_id_set(struct bgp *bgp, as_t as, diff --git a/doc/user/bgp.rst b/doc/user/bgp.rst index 997f7ec68ef6..cc1280f53b21 100644 --- a/doc/user/bgp.rst +++ b/doc/user/bgp.rst @@ -5780,6 +5780,19 @@ the same id as the unique cluster of the route-reflector that we call global, the settings of the global cluster will override the settings of that per-neighbor cluster. +.. clicmd:: bgp cluster-id client-to-client-reflection + +This command configures a client-to-client reflection policy for the said +cluster be it a per-neighbor cluster or the global (default unique) cluster +of the router. Reflection inside of a cluster is always allowed if it is configured +as always for that cluster, it is always forbidden if it is configured +as never for that cluster. If nothing is configured for that cluster +then the bgp client-to-client reflection configuration applies. +Between two different clusters the bgp client-to-client reflection +configuration applies. + +Once configured for a cluster it can be deconfigured. + .. _bgp-suppress-fib: Suppressing routes not installed in FIB From 211c2277849801df535742f9473750b67d51e06d Mon Sep 17 00:00:00 2001 From: Pierre Neltner Date: Wed, 13 May 2026 12:01:08 +0200 Subject: [PATCH 6/9] bgpd: Add command show bgp clusters [json] add the command : ```Add command show bgp clusters [json]``` that shows information about all configured or referenced clusters including the global cluster and per-neighbor clusters this includes the cluster id, client-to-client reflection policies inside of each cluster and number of times referenced in the configuration as well as whether it is overwritten by the global cluster. the json format is the following: { "global":{ "cluster-id":"CLUSTER_ID", "client-to-client-reflection":<"always"|"never"|"not-configured"> }, "per-neighbor":[ "cluster-id":"CLUSTER_ID", "global":bool, "client-to-client-reflection":<"always"|"never"|"not-configured">, "refcnt":NUMBER_OF_REFERENCES ], } Signed-off-by: Pierre Neltner --- bgpd/bgp_vty.c | 113 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 113 insertions(+) diff --git a/bgpd/bgp_vty.c b/bgpd/bgp_vty.c index 0865c141c23d..c04b4b595c72 100644 --- a/bgpd/bgp_vty.c +++ b/bgpd/bgp_vty.c @@ -12422,6 +12422,116 @@ DEFUN (show_bgp_views, return CMD_SUCCESS; } +DEFPY (show_bgp_clusters, + show_bgp_clusters_cmd, + "show [ip] bgp [ VIEWVRFNAME$name] clusters [json$json]", + SHOW_STR + IP_STR + BGP_STR + BGP_INSTANCE_HELP_STR + "Show referenced clusters\n" + JSON_STR) +{ + struct bgp *bgp; + struct cluster *cluster; + json_object *json_all_clusters; + json_object *json_one_cluster; + json_object *json_per_neighbor_clusters; + + /* [ VIEWVRFNAME] */ + if (name && (!strmatch(name, VRF_DEFAULT_NAME) || view)) + bgp = bgp_lookup_by_name(name); + else + bgp = bgp_get_default(); + + if (!json) { + if (!bgp) { + vty_out(vty, "%% BGP instance not found\n"); + return CMD_WARNING; + } + vty_out(vty, "BGP global cluster:\n"); + if (CHECK_FLAG(bgp->config, BGP_CONFIG_CLUSTER_ID)) + vty_out(vty, "\t cluster-id %pI4\n", &bgp->cluster_id); + else + vty_out(vty, "\t cluster-id %pI4\n", &bgp->router_id); + if (CHECK_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER_CONFIGURED)) { + if (CHECK_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER)) + vty_out(vty, "\t client to client reflection: always\n"); + else + vty_out(vty, "\t client to client reflection: never\n"); + } else + vty_out(vty, "\t client to client reflection: not configured\n\n"); + + vty_out(vty, "Referenced BGP per-neighbor clusters:\n"); + + frr_each (per_neighbor_cluster_list, &bgp->per_neighbor_clusters, cluster) { + vty_out(vty, "\t cluster %pI4\n", &cluster->cluster_id); + if (CHECK_FLAG(cluster->flags, CLUSTER_FLAG_GLOBAL)) + vty_out(vty, "\t\t override by global\n"); + if (CHECK_FLAG(cluster->flags, + CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER_CONFIGURED)) { + if (CHECK_FLAG(cluster->flags, + CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER)) + vty_out(vty, "\t\t client to client reflection: always\n"); + else + vty_out(vty, "\t\t client to client reflection: never\n"); + } else + vty_out(vty, "\t\t client to client reflection: not configured\n"); + vty_out(vty, "\t\t number of refcnt: %d\n", cluster->refcnt); + } + return CMD_SUCCESS; + } + + /*json format*/ + json_all_clusters = json_object_new_object(); + if (!bgp) { + vty_json(vty, json_all_clusters); + return CMD_WARNING; + } + json_one_cluster = json_object_new_object(); + if (CHECK_FLAG(bgp->config, BGP_CONFIG_CLUSTER_ID)) + json_object_string_addf(json_one_cluster, "cluster-id", "%pI4", &bgp->cluster_id); + else + json_object_string_addf(json_one_cluster, "cluster-id", "%pI4", &bgp->router_id); + if (CHECK_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER_CONFIGURED)) { + if (CHECK_FLAG(bgp->flags, BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER)) + json_object_string_add(json_one_cluster, "client-to-client-reflection", + "always"); + else + json_object_string_add(json_one_cluster, "client-to-client-reflection", + "never"); + } else + json_object_string_add(json_one_cluster, "client-to-client-reflection", + "not configured"); + json_object_object_add(json_all_clusters, "global", json_one_cluster); + + json_per_neighbor_clusters = json_object_new_array(); + frr_each (per_neighbor_cluster_list, &bgp->per_neighbor_clusters, cluster) { + json_one_cluster = json_object_new_object(); + + json_object_string_addf(json_one_cluster, "cluster-id", "%pI4", + &cluster->cluster_id); + json_object_boolean_add(json_one_cluster, "global", + CHECK_FLAG(cluster->flags, CLUSTER_FLAG_GLOBAL)); + if (CHECK_FLAG(cluster->flags, + CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER_CONFIGURED)) { + if (CHECK_FLAG(cluster->flags, CLUSTER_FLAG_CLIENT_TO_CLIENT_INTRA_CLUSTER)) + json_object_string_add(json_one_cluster, + "client-to-client-reflection", "always"); + else + json_object_string_add(json_one_cluster, + "client-to-client-reflection", "never"); + } else + json_object_string_add(json_one_cluster, "client-to-client-reflection", + "not configured"); + json_object_int_add(json_one_cluster, "refcnt", cluster->refcnt); + json_object_array_add(json_per_neighbor_clusters, json_one_cluster); + } + json_object_object_add(json_all_clusters, "per-neighbor", json_per_neighbor_clusters); + vty_json(vty, json_all_clusters); + return CMD_SUCCESS; +} + static inline void calc_peers_cfgd_estbd(struct bgp *bgp, int *peers_cfgd, int *peers_estbd) { @@ -24951,6 +25061,9 @@ void bgp_vty_init(void) /* "show [ip] bgp vrfs" commands. */ install_element(VIEW_NODE, &show_bgp_vrfs_cmd); + /* "show [ip] bgp clusters" commands. */ + install_element(VIEW_NODE, &show_bgp_clusters_cmd); + /* Some overall BGP information */ install_element(VIEW_NODE, &show_bgp_router_cmd); From dea1e09fc903ea9676a6593884cf833d5f6b5cb1 Mon Sep 17 00:00:00 2001 From: Pierre Neltner Date: Tue, 19 May 2026 15:01:31 +0200 Subject: [PATCH 7/9] bgpd: Add cmd to change cluster-id behavior when using multiple clusters Add the command : bgp cluster-id non-client-to-client prefer-global-cluster This command configures the new flag BGP_FLAG_PREFER_GLOBAL_CLUSTER When this flag is configured, the global cluster-id is added to the cluster list of prefixes reflected from non-clients toward client-peers, default behavior is to add the destination peer's cluster-id instead. When configured the behavior is similar to the cisco one and further from a classic cluster-id behavior. This feature modifies the attributes of outgoing routes and so modifies the ```bgp_packet_attribute``` function from bgp_attribute.c This flag can be seen in the running config. Signed-off-by: Pierre Neltner --- bgpd/bgp_attr.c | 8 ++++++-- bgpd/bgp_vty.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++ bgpd/bgpd.h | 1 + doc/user/bgp.rst | 10 ++++++++++ 4 files changed, 69 insertions(+), 2 deletions(-) diff --git a/bgpd/bgp_attr.c b/bgpd/bgp_attr.c index 1b59de8d24ce..6fd48a2f47d2 100644 --- a/bgpd/bgp_attr.c +++ b/bgpd/bgp_attr.c @@ -5760,8 +5760,12 @@ bgp_size_t bgp_packet_attribute(struct bgp *bgp, struct peer *peer, struct strea CHECK_FLAG(from->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) stream_put_in_addr(s, &from->cluster[afi][safi]); - /*if the originator of the prefix is a client of the default cluster*/ - else if (CHECK_FLAG(from->af_flags[afi][safi], PEER_FLAG_REFLECTOR_CLIENT)) { + /* if the originator of the prefix is a client of the default cluster + * or if from is non-client and prefer global cluster-id is configured + */ + else if (CHECK_FLAG(from->af_flags[afi][safi], PEER_FLAG_REFLECTOR_CLIENT) || + (!CHECK_FLAG(from->af_flags[afi][safi], PEER_FLAG_REFLECTOR_CLIENT) && + CHECK_FLAG(bgp->flags, BGP_FLAG_PREFER_GLOBAL_CLUSTER))) { if (CHECK_FLAG(bgp->config, BGP_CONFIG_CLUSTER_ID)) stream_put_in_addr(s, &bgp->cluster_id); else diff --git a/bgpd/bgp_vty.c b/bgpd/bgp_vty.c index c04b4b595c72..5a8b7db6287c 100644 --- a/bgpd/bgp_vty.c +++ b/bgpd/bgp_vty.c @@ -146,6 +146,10 @@ FRR_CFG_DEFAULT_BOOL(BGP_CLIENT_TO_CLIENT, { .val_bool = true }, ); +FRR_CFG_DEFAULT_BOOL(BGP_PREFER_GLOBAL_CLUSTER, + { .val_bool = false }, +); + DEFINE_HOOK(bgp_inst_config_write, (struct bgp *bgp, struct vty *vty), (bgp, vty)); @@ -744,6 +748,8 @@ int bgp_get_vty(struct bgp **bgp, as_t *as, const char *name, SET_FLAG((*bgp)->flags, BGP_FLAG_COMPARE_AIGP); if (!DFLT_BGP_CLIENT_TO_CLIENT) SET_FLAG((*bgp)->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT); + if (DFLT_BGP_PREFER_GLOBAL_CLUSTER) + SET_FLAG((*bgp)->flags, BGP_FLAG_PREFER_GLOBAL_CLUSTER); ret = BGP_SUCCESS; } @@ -5581,6 +5587,43 @@ DEFPY (bgp_cluster_id_client_to_client, return CMD_SUCCESS; } +DEFPY (bgp_cluster_id_prefer_global, + bgp_cluster_id_prefer_global_cmd, + "[no] bgp cluster-id non-client-to-client prefer-global-cluster-id", + NO_STR + BGP_STR + "Configure Route-Reflector Cluster-ids\n" + "Configure the behavior of non-client to client reflections\n" + "Add the global cluster-id to the cluster-list of prefixes that are reflected from non-client peers\n") +{ + struct peer *peer; + struct listnode *node, *nnode; + afi_t afi; + safi_t safi; + + VTY_DECLVAR_CONTEXT(bgp, bgp); + + if (no) + UNSET_FLAG(bgp->flags, BGP_FLAG_PREFER_GLOBAL_CLUSTER); + else + SET_FLAG(bgp->flags, BGP_FLAG_PREFER_GLOBAL_CLUSTER); + + /* Clear all IBGP peer. */ + for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) { + if (peer->sort != BGP_PEER_IBGP) + continue; + if (CHECK_FLAG(peer->sflags, PEER_STATUS_GROUP)) + continue; + + FOREACH_AFI_SAFI (afi, safi) { + if (!peer->afc[afi][safi]) + continue; + peer_clear_soft(peer, afi, safi, BGP_CLEAR_SOFT_BOTH); + } + } + return CMD_SUCCESS; +} + DEFPY (bgp_allow_martian, bgp_allow_martian_cmd, "[no]$no bgp allow-martian-nexthop", @@ -22633,6 +22676,14 @@ int bgp_config_write(struct vty *vty) " bgp cluster-id global client-to-client-reflection never\n"); } + /* BGP non-client-to-client prefer-global-cluster-id */ + if (!!CHECK_FLAG(bgp->flags, BGP_FLAG_PREFER_GLOBAL_CLUSTER) != + SAVE_BGP_PREFER_GLOBAL_CLUSTER) + vty_out(vty, + " %sbgp cluster-id non-client-to-client prefer-global-cluster-id\n", + CHECK_FLAG(bgp->flags, BGP_FLAG_PREFER_GLOBAL_CLUSTER) ? "" + : "no "); + /*BGP per-neighbor clusters*/ frr_each (per_neighbor_cluster_list, &bgp->per_neighbor_clusters, cluster) { if (CHECK_FLAG(cluster->flags, @@ -23531,6 +23582,7 @@ void bgp_vty_init(void) install_element(BGP_NODE, &bgp_cluster_id_cmd); install_element(BGP_NODE, &no_bgp_cluster_id_cmd); install_element(BGP_NODE, &bgp_cluster_id_client_to_client_cmd); + install_element(BGP_NODE, &bgp_cluster_id_prefer_global_cmd); /* "bgp no-rib" commands. */ install_element(CONFIG_NODE, &bgp_norib_cmd); diff --git a/bgpd/bgpd.h b/bgpd/bgpd.h index 78abde433dc4..fe20f699b006 100644 --- a/bgpd/bgpd.h +++ b/bgpd/bgpd.h @@ -766,6 +766,7 @@ struct bgp { #define BGP_FLAG_USE_RECURSIVE_WEIGHT (1ULL << 46) #define BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER (1ULL << 47) #define BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER_CONFIGURED (1ULL << 48) +#define BGP_FLAG_PREFER_GLOBAL_CLUSTER (1ULL << 49) /* Use current (imported) path's attributes instead of source path's attributes * for bestpath comparison of imported paths. diff --git a/doc/user/bgp.rst b/doc/user/bgp.rst index cc1280f53b21..79242ac42f22 100644 --- a/doc/user/bgp.rst +++ b/doc/user/bgp.rst @@ -5793,6 +5793,16 @@ configuration applies. Once configured for a cluster it can be deconfigured. +.. clicmd:: bgp cluster-id non-client-to-client prefer-global-cluster-id + +Default behavior for the route-reflector when reflecting a route from a non-client +peer to a client peer is to add the cluster-id of the destination peer in the +cluster-list of the prefix. +This command changes this behavior, the global cluster-id of the route-reflector is added +instead. + + + .. _bgp-suppress-fib: Suppressing routes not installed in FIB From c70cb860db41a64b7315d73f8eceeffce0a4fc6d Mon Sep 17 00:00:00 2001 From: Pierre Neltner Date: Wed, 20 May 2026 17:38:28 +0200 Subject: [PATCH 8/9] bgpd: Add cmd to modify inbound filtering when using multiple clusters add the command: bgp cluster-id loose-cluster-list-check This command configures the new bgp flag: BGP_FLAG_LOOSE_CLUSTER_LIST_CHECK When this is configured, prefixes that contain one or more of the cluster-ids of the per neighbor clusters of the router are no longer dropped when received, they are just not reflected to members of these clusters. This implies that the ```bgp_cluster_filter``` function from bgp_route.c is modified to no longer reject these and that ```subgroup_announce_check``` is modified to reject these prefixes only when advertising toward a member of that cluster. When configured this option shows a behavior closer to juniper behavior while still respecting rfc4456. This configuration can be seen in the running-config. Signed-off-by: Pierre Neltner --- bgpd/bgp_route.c | 25 ++++++++++++++++---- bgpd/bgp_vty.c | 52 +++++++++++++++++++++++++++++++++++++++++ bgpd/bgpd.h | 1 + doc/user/bgp.rst | 6 +++++ tools/frr_babeltrace.py | 1 + 5 files changed, 80 insertions(+), 5 deletions(-) diff --git a/bgpd/bgp_route.c b/bgpd/bgp_route.c index f9ef05456a72..ceaee3a78e74 100644 --- a/bgpd/bgp_route.c +++ b/bgpd/bgp_route.c @@ -2069,11 +2069,12 @@ static bool bgp_cluster_filter(struct peer *peer, struct attr *attr) if (cluster_loop_check(cluster, cluster_id)) return true; - - frr_each (per_neighbor_cluster_list, &peer->bgp->per_neighbor_clusters, - per_neighbor_cluster) { - if (cluster_loop_check(cluster, per_neighbor_cluster->cluster_id)) - return true; + if (!CHECK_FLAG(peer->bgp->flags, BGP_FLAG_LOOSE_CLUSTER_LIST_CHECK)) { + frr_each (per_neighbor_cluster_list, &peer->bgp->per_neighbor_clusters, + per_neighbor_cluster) { + if (cluster_loop_check(cluster, per_neighbor_cluster->cluster_id)) + return true; + } } } return false; @@ -2719,6 +2720,20 @@ bool subgroup_announce_check(struct bgp_dest *dest, struct bgp_path_info *pi, return false; } + /* If the attribute has cluster-list and one of the cluster-ids in it is the same as remote + * peer's cluster-id. Only when the loose check cluster-id check is enabled and the peer + * is part of a per-neighbor cluster, otherwise it is filtered when received by the + * route-reflector + */ + if (CHECK_FLAG(bgp->flags, BGP_FLAG_LOOSE_CLUSTER_LIST_CHECK) && + CHECK_FLAG(peer->af_flags[afi][safi], PEER_FLAG_CLUSTER_ID)) { + if (bgp_attr_exists(piattr, BGP_ATTR_CLUSTER_LIST) && + cluster_loop_check(bgp_attr_get_cluster(piattr), peer->cluster[afi][safi])) { + frrtrace(3, frr_bgp, upd_prefix_filtered_due_to, 4, peer->host, pfxprint); + return false; + } + } + /* ORF prefix-list filter check */ if (CHECK_FLAG(peer->af_cap[afi][safi], PEER_CAP_ORF_PREFIX_RM_ADV) && CHECK_FLAG(peer->af_cap[afi][safi], PEER_CAP_ORF_PREFIX_SM_RCV)) diff --git a/bgpd/bgp_vty.c b/bgpd/bgp_vty.c index 5a8b7db6287c..ca041989fb52 100644 --- a/bgpd/bgp_vty.c +++ b/bgpd/bgp_vty.c @@ -150,6 +150,10 @@ FRR_CFG_DEFAULT_BOOL(BGP_PREFER_GLOBAL_CLUSTER, { .val_bool = false }, ); +FRR_CFG_DEFAULT_BOOL(BGP_LOOSE_CLUSTER_LIST_CHECK, + { .val_bool = false }, +); + DEFINE_HOOK(bgp_inst_config_write, (struct bgp *bgp, struct vty *vty), (bgp, vty)); @@ -750,6 +754,8 @@ int bgp_get_vty(struct bgp **bgp, as_t *as, const char *name, SET_FLAG((*bgp)->flags, BGP_FLAG_NO_CLIENT_TO_CLIENT); if (DFLT_BGP_PREFER_GLOBAL_CLUSTER) SET_FLAG((*bgp)->flags, BGP_FLAG_PREFER_GLOBAL_CLUSTER); + if (DFLT_BGP_LOOSE_CLUSTER_LIST_CHECK) + SET_FLAG((*bgp)->flags, BGP_FLAG_LOOSE_CLUSTER_LIST_CHECK); ret = BGP_SUCCESS; } @@ -5624,6 +5630,43 @@ DEFPY (bgp_cluster_id_prefer_global, return CMD_SUCCESS; } +DEFPY (bgp_cluster_id_loose_cluster_check, + bgp_cluster_id_loose_cluster_check_cmd, + "[no] bgp cluster-id loose-cluster-list-check", + NO_STR + BGP_STR + "Configure Route-Reflector Cluster-ids\n" + "Stop rejecting prefixes with known cluster-id, but do not advertise them to members of that cluster\n") +{ + struct peer *peer; + struct listnode *node, *nnode; + afi_t afi; + safi_t safi; + + VTY_DECLVAR_CONTEXT(bgp, bgp); + + if (no) + UNSET_FLAG(bgp->flags, BGP_FLAG_LOOSE_CLUSTER_LIST_CHECK); + else + SET_FLAG(bgp->flags, BGP_FLAG_LOOSE_CLUSTER_LIST_CHECK); + + /* Clear all IBGP peer. */ + for (ALL_LIST_ELEMENTS(bgp->peer, node, nnode, peer)) { + if (peer->sort != BGP_PEER_IBGP) + continue; + if (CHECK_FLAG(peer->sflags, PEER_STATUS_GROUP)) + continue; + + FOREACH_AFI_SAFI (afi, safi) { + if (!peer->afc[afi][safi]) + continue; + peer_clear_soft(peer, afi, safi, BGP_CLEAR_SOFT_BOTH); + } + } + return CMD_SUCCESS; +} + + DEFPY (bgp_allow_martian, bgp_allow_martian_cmd, "[no]$no bgp allow-martian-nexthop", @@ -22684,6 +22727,14 @@ int bgp_config_write(struct vty *vty) CHECK_FLAG(bgp->flags, BGP_FLAG_PREFER_GLOBAL_CLUSTER) ? "" : "no "); + /* BGP loose-cluster-list-check */ + if (!!CHECK_FLAG(bgp->flags, BGP_FLAG_LOOSE_CLUSTER_LIST_CHECK) != + SAVE_BGP_LOOSE_CLUSTER_LIST_CHECK) + vty_out(vty, " %sbgp cluster-id loose-cluster-list-check\n", + CHECK_FLAG(bgp->flags, BGP_FLAG_LOOSE_CLUSTER_LIST_CHECK) ? "" + : "no "); + + /*BGP per-neighbor clusters*/ frr_each (per_neighbor_cluster_list, &bgp->per_neighbor_clusters, cluster) { if (CHECK_FLAG(cluster->flags, @@ -23583,6 +23634,7 @@ void bgp_vty_init(void) install_element(BGP_NODE, &no_bgp_cluster_id_cmd); install_element(BGP_NODE, &bgp_cluster_id_client_to_client_cmd); install_element(BGP_NODE, &bgp_cluster_id_prefer_global_cmd); + install_element(BGP_NODE, &bgp_cluster_id_loose_cluster_check_cmd); /* "bgp no-rib" commands. */ install_element(CONFIG_NODE, &bgp_norib_cmd); diff --git a/bgpd/bgpd.h b/bgpd/bgpd.h index fe20f699b006..54ab66a42c02 100644 --- a/bgpd/bgpd.h +++ b/bgpd/bgpd.h @@ -767,6 +767,7 @@ struct bgp { #define BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER (1ULL << 47) #define BGP_FLAG_CLIENT_TO_CLIENT_GLOBAL_CLUSTER_CONFIGURED (1ULL << 48) #define BGP_FLAG_PREFER_GLOBAL_CLUSTER (1ULL << 49) +#define BGP_FLAG_LOOSE_CLUSTER_LIST_CHECK (1ULL << 50) /* Use current (imported) path's attributes instead of source path's attributes * for bestpath comparison of imported paths. diff --git a/doc/user/bgp.rst b/doc/user/bgp.rst index 79242ac42f22..67db6a9c46f7 100644 --- a/doc/user/bgp.rst +++ b/doc/user/bgp.rst @@ -5801,7 +5801,13 @@ cluster-list of the prefix. This command changes this behavior, the global cluster-id of the route-reflector is added instead. +.. clicmd:: bgp cluster-id loose-cluster-list-check +In the spirit of rfc4456 the default behavior of a route-reflector when receiving +a route with one of its per-neighbor clusters in the cluster-list is to drop that +route in order to prevent loops. +This command changes this behavior, now the route is not dropped, it is just not +advertised to members of that cluster. .. _bgp-suppress-fib: diff --git a/tools/frr_babeltrace.py b/tools/frr_babeltrace.py index 1d5cf0d7bd20..60e0286d3613 100755 --- a/tools/frr_babeltrace.py +++ b/tools/frr_babeltrace.py @@ -1443,6 +1443,7 @@ def parse_frr_update_prefix_filter(event): 1: "Originator-id same as remote router id", 2: "Filtered via ORF", 3: "Output Filter", + 4: "Cluster list contains remote cluster id", }.get(x, f"Unknown prefix filter reason {x}") } parse_event(event, field_parsers) From ef4d7fb54a6855eb0671866f5cd5b454e969cf37 Mon Sep 17 00:00:00 2001 From: Pierre Neltner Date: Thu, 21 May 2026 15:50:59 +0200 Subject: [PATCH 9/9] doc: Add documentation to bgp client-to-client reflection Signed-off-by: Pierre Neltner --- doc/user/bgp.rst | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/doc/user/bgp.rst b/doc/user/bgp.rst index 67db6a9c46f7..49c42d2898a2 100644 --- a/doc/user/bgp.rst +++ b/doc/user/bgp.rst @@ -5770,6 +5770,11 @@ setting. This command allows the BGP daemon to control, at a global level, the DSCP value used in outgoing packets for each BGP connection. +.. clicmd:: bgp client-to-client reflection + +Routes received from client peers are also reflected to other client peers, otherwise +they are only reflected to non-client peers. This is the default configuration. + .. clicmd:: neighbor PEER cluster-id A.B.C.D This command sets the peer PEER as part of a per-neighbor cluster.