rfc9871v2.txt   rfc9871.txt 
skipping to change at line 27 skipping to change at line 27
This document describes the routing framework and BGP extensions to This document describes the routing framework and BGP extensions to
enable intent-aware routing using the BGP CAR solution. The solution enable intent-aware routing using the BGP CAR solution. The solution
defines two new BGP SAFIs (BGP CAR SAFI and BGP VPN CAR SAFI) for defines two new BGP SAFIs (BGP CAR SAFI and BGP VPN CAR SAFI) for
IPv4 and IPv6. It also defines an extensible Network Layer IPv4 and IPv6. It also defines an extensible Network Layer
Reachability Information (NLRI) model for both SAFIs that allows Reachability Information (NLRI) model for both SAFIs that allows
multiple NLRI types to be defined for different use cases. Each type multiple NLRI types to be defined for different use cases. Each type
of NLRI contains key and TLV-based non-key fields for efficient of NLRI contains key and TLV-based non-key fields for efficient
encoding of different per-prefix information. This specification encoding of different per-prefix information. This specification
defines two NLRI types: Color-Aware Route NLRI and IP Prefix NLRI. defines two NLRI types: Color-Aware Route NLRI and IP Prefix NLRI.
It defines non-key TLV types for the MPLS label stack, SR-MPLS Label It defines non-key TLV types for the MPLS label stack, SR-MPLS label
Index and Segment Routing over IPv6 (SRv6) Segment Identifiers index, and Segment Routing over IPv6 (SRv6) Segment Identifiers
(SIDs). This solution also defines a new Local Color Mapping (LCM) (SIDs). This solution also defines a new Local Color Mapping (LCM)
Extended Community. Extended Community.
Status of This Memo Status of This Memo
This document is not an Internet Standards Track specification; it is This document is not an Internet Standards Track specification; it is
published for examination, experimental implementation, and published for examination, experimental implementation, and
evaluation. evaluation.
This document defines an Experimental Protocol for the Internet This document defines an Experimental Protocol for the Internet
skipping to change at line 305 skipping to change at line 305
(Section 8 of [RFC9256]), in this document (service route) (Section 8 of [RFC9256]), in this document (service route)
steering is used to describe the mapping of the traffic for a steering is used to describe the mapping of the traffic for a
service route onto a BGP CAR path. In contrast, the term service route onto a BGP CAR path. In contrast, the term
resolution is preserved for the mapping of an inter-domain BGP CAR resolution is preserved for the mapping of an inter-domain BGP CAR
route on an intra-domain color-aware path. route on an intra-domain color-aware path.
Service steering: Service steering:
Service route maps traffic to a BGP CAR path (or other color- Service route maps traffic to a BGP CAR path (or other color-
aware path, e.g., SR Policy). If a color-aware path is not aware path, e.g., SR Policy). If a color-aware path is not
available, local policy may map to a color-unaware routing/TE available, local policy may map to a color-unaware routing/TE
path (e.g., BGP LU, RSVP-TE, IGP/LDP). The service steering path (e.g., BGP-LU, RSVP-TE, IGP/LDP). The service steering
concept is agnostic to the transport technology used. concept is agnostic to the transport technology used.
Section 3 describes the specific service steering mechanisms Section 3 describes the specific service steering mechanisms
leveraged for MPLS, SR-MPLS, and SRv6. leveraged for MPLS, SR-MPLS, and SRv6.
Intra-domain resolution: Intra-domain resolution:
BGP CAR route maps to an intra-domain color-aware path (e.g., BGP CAR route maps to an intra-domain color-aware path (e.g.,
SR Policy, IGP Flexible Algorithm, BGP CAR) or a color-unaware SR Policy, IGP Flexible Algorithm, BGP CAR) or a color-unaware
routing/TE path (e.g., RSVP-TE, IGP/LDP, BGP-LU). routing/TE path (e.g., RSVP-TE, IGP/LDP, BGP-LU).
Transport network: Transport network:
skipping to change at line 454 skipping to change at line 454
- W/w is steered on a color-aware path provided by SR Policy - W/w is steered on a color-aware path provided by SR Policy
* Seamless interworking of BGP CAR and SR Policy * Seamless interworking of BGP CAR and SR Policy
- V/v is steered on a BGP CAR path that is itself resolved within - V/v is steered on a BGP CAR path that is itself resolved within
domain 2 onto an SR Policy bound to the color of V/v domain 2 onto an SR Policy bound to the color of V/v
Other properties: Other properties:
* MPLS data-plane: with 300k PEs and 5 colors, the BGP CAR solution * MPLS data plane: with 300k PEs and 5 colors, the BGP CAR solution
ensures that no single node needs to support a data-plane scaling ensures that no single node needs to support a data plane scaling
in the order of Remote PE * C (Section 5). This would otherwise in the order of Remote PE * C (Section 5). This would otherwise
exceed the MPLS data-plane. exceed the MPLS data plane.
* Control-plane: a node should not install a (E, C) path if it's not * Control plane: a node should not install a (E, C) path if it's not
participating in that color-aware path. participating in that color-aware path.
* Incongruent color-intent mapping: the solution supports the * Incongruent color-intent mapping: the solution supports the
signaling of a BGP CAR route across different color domains signaling of a BGP CAR route across different color domains
(Section 2.8). (Section 2.8).
The key benefits of this model are: The key benefits of this model are:
* Leverage of the BGP Color-EC [RFC9012] to color service routes * Leverage of the BGP Color-EC [RFC9012] to color service routes
* The definition of the automated service steering: a C-colored * The definition of the automated service steering: a C-colored
service route V/v from E2 is steered onto a color-aware path (E2, service route V/v from E2 is steered onto a color-aware path (E2,
C) C)
* The definition of the data model of a BGP CAR path: (E, C) * The definition of the data model of a BGP CAR path: (E, C)
- Natural extension of BGP IP/LU data model (E) - Natural extension of BGP-IP/BGP-LU data model (E)
- Consistent with SR Policy data model - Consistent with SR Policy data model
* The definition of the recursive resolution of a BGP CAR route: a * The definition of the recursive resolution of a BGP CAR route: a
BGP CAR (E2, C) route via N is resolved onto the color-aware path BGP CAR (E2, C) route via N is resolved onto the color-aware path
(N, C), which may itself be provided by BGP CAR or via another (N, C), which may itself be provided by BGP CAR or via another
color-aware routing solution (e.g., SR Policy, IGP Flexible color-aware routing solution (e.g., SR Policy, IGP Flexible
Algorithm) Algorithm)
* Explicit definitions for multiple transport encapsulations (e.g., * Explicit definitions for multiple transport encapsulations (e.g.,
skipping to change at line 578 skipping to change at line 578
2.3. BGP CAR Route Origination 2.3. BGP CAR Route Origination
A BGP CAR route may be originated locally (e.g., loopback) or through A BGP CAR route may be originated locally (e.g., loopback) or through
redistribution of an (E, C) color-aware path provided by another redistribution of an (E, C) color-aware path provided by another
routing solution (e.g., SR Policy, IGP Flexible Algorithm, RSVP-TE, routing solution (e.g., SR Policy, IGP Flexible Algorithm, RSVP-TE,
BGP-LU [RFC8277]). BGP-LU [RFC8277]).
2.4. BGP CAR Route Validation 2.4. BGP CAR Route Validation
A BGP CAR path (E, C) via next hop N with encapsulation T is valid if A BGP CAR path (E, C) via next hop N with encapsulation T is valid if
color-aware path (N, C) exists with encapsulation T available in color-aware path (N, C) exists with encapsulation T available in data
data-plane. plane.
A local policy may customize the validation process: A local policy may customize the validation process:
* The color constraint in the first check may be relaxed. If N is * The color constraint in the first check may be relaxed. If N is
reachable via alternate color(s) or in the default routing table, reachable via alternate color(s) or in the default routing table,
the route may be considered valid. the route may be considered valid.
* The data-plane availability constraint of T may be relaxed to use * The data plane availability constraint of T may be relaxed to use
an alternate encapsulation. an alternate encapsulation.
* A performance-measurement verification may be added to ensure that * A performance-measurement verification may be added to ensure that
the intent associated with C is met (e.g., delay < bound). the intent associated with C is met (e.g., delay < bound).
A path that is not valid MUST NOT be considered for BGP best path A path that is not valid MUST NOT be considered for BGP best path
selection. selection.
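
The validation rule in Section 2.4 can be sketched as follows. This is a minimal illustration, not an implementation of any BGP stack: the CarPath type, the is_valid function, and the relax_color and default_reachable parameters are hypothetical names introduced here, and only the color relaxation of the local-policy options is modeled.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Tuple


@dataclass(frozen=True)
class CarPath:
    """Hypothetical model of a BGP CAR path (E, C) via N with encapsulation T."""
    endpoint: str   # E
    color: int      # C
    next_hop: str   # N
    encap: str      # T, e.g. "sr-mpls"


def is_valid(path: CarPath,
             color_aware_paths: Set[Tuple[str, int, str]],
             relax_color: bool = False,
             default_reachable: FrozenSet[str] = frozenset()) -> bool:
    """Return True if (N, C) exists with encapsulation T in the data plane.

    color_aware_paths holds (next_hop, color, encap) tuples that are
    available locally. With relax_color set (a local-policy choice), N
    reachable via another color or the default table is also accepted.
    """
    if (path.next_hop, path.color, path.encap) in color_aware_paths:
        return True
    if relax_color:
        via_other_color = any(n == path.next_hop and t == path.encap
                              for (n, _c, t) in color_aware_paths)
        return via_other_color or path.next_hop in default_reachable
    return False


def best_path_candidates(paths, color_aware_paths):
    """Only valid paths are handed to best path selection."""
    return [p for p in paths if is_valid(p, color_aware_paths)]
```

A path that fails is_valid is simply absent from the candidate list, which mirrors the MUST NOT in the text above.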
2.5. BGP CAR Route Resolution 2.5. BGP CAR Route Resolution
skipping to change at line 631 skipping to change at line 631
domain, an egress node selects and advertises an SRv6 SID from its domain, an egress node selects and advertises an SRv6 SID from its
locator for intent C1, with a BGP CAR route. In such a case, the locator for intent C1, with a BGP CAR route. In such a case, the
ingress node resolves the received SRv6 SID over an IPv6 route for ingress node resolves the received SRv6 SID over an IPv6 route for
the intent-aware locator of the egress node for C1 or a summary the intent-aware locator of the egress node for C1 or a summary
route that covers the locator. This summary route may be provided route that covers the locator. This summary route may be provided
by SRv6 Flexible Algorithm or BGP CAR IP Prefix route itself by SRv6 Flexible Algorithm or BGP CAR IP Prefix route itself
(e.g., Appendix C.2). (e.g., Appendix C.2).
* Local policy may map the CAR route to mechanisms that are unaware * Local policy may map the CAR route to mechanisms that are unaware
of color or that provide best-effort, such as RSVP-TE, IGP/LDP, of color or that provide best-effort, such as RSVP-TE, IGP/LDP,
BGP LU/IP (e.g., Appendix A.3.2) for brownfield scenarios. BGP-LU/BGP-IP (e.g., Appendix A.3.2) for brownfield scenarios.
Route resolution via a different color C2 can be automated by Route resolution via a different color C2 can be automated by
attaching BGP Color-EC C2 to CAR route (E2, C1), leveraging automated attaching BGP Color-EC C2 to CAR route (E2, C1), leveraging automated
steering as described in Section 8.4 of "Segment Routing Policy steering as described in Section 8.4 of "Segment Routing Policy
Architecture" [RFC9256] for BGP CAR routes. This mechanism is Architecture" [RFC9256] for BGP CAR routes. This mechanism is
illustrated in Appendix B.2. This mechanism SHOULD be supported. illustrated in Appendix B.2. This mechanism SHOULD be supported.
For CAR route resolution, if Color-EC color is present with the For CAR route resolution, if Color-EC color is present with the
route, it takes precedence over the route's intent color. The route, it takes precedence over the route's intent color. The
route’s intent color is the LCM-EC color if present (see route’s intent color is the LCM-EC color if present (see
skipping to change at line 704 skipping to change at line 704
AIGP updates. AIGP updates.
Additional AIGP extensions may be defined to signal state for Additional AIGP extensions may be defined to signal state for
specific use cases such as Maximum SID Depth (MSD) along the BGP CAR specific use cases such as Maximum SID Depth (MSD) along the BGP CAR
route advertisement and minimum MTU along the BGP CAR route route advertisement and minimum MTU along the BGP CAR route
advertisement. This is out of scope for this document. advertisement. This is out of scope for this document.
2.7. Inherent Multipath Capability 2.7. Inherent Multipath Capability
The (E, C) route definition inherently provides availability of The (E, C) route definition inherently provides availability of
redundant paths at every BGP hop identical to BGP-LU or BGP IP. For redundant paths at every BGP hop identical to BGP-LU or BGP-IP. For
instance, BGP CAR routes originated by two or more egress ABRs in a instance, BGP CAR routes originated by two or more egress ABRs in a
domain are advertised as multiple paths to ingress ABRs in the domain are advertised as multiple paths to ingress ABRs in the
domain, where they become equal-cost or primary-backup paths. A domain, where they become equal-cost or primary-backup paths. A
failure of an egress ABR is detected and handled by ingress ABRs failure of an egress ABR is detected and handled by ingress ABRs
locally within the domain for faster convergence, without any locally within the domain for faster convergence, without any
necessity to propagate the event to upstream nodes for traffic necessity to propagate the event to upstream nodes for traffic
restoration. restoration.
BGP ADD-PATH [RFC7911] SHOULD be enabled for BGP CAR to signal BGP ADD-PATH [RFC7911] SHOULD be enabled for BGP CAR to signal
multiple next hops through a transport RR. multiple next hops through a TRR.
2.8. BGP CAR Signaling Through Different Color Domains 2.8. BGP CAR Signaling Through Different Color Domains
[Color Domain 1 A]-----[B Color Domain 2 E2] [Color Domain 1 A]-----[B Color Domain 2 E2]
[C1=low delay ] [C2=low delay ] [C1=low delay ] [C2=low delay ]
Let us assume a BGP CAR route (E2, C2) is signaled from B to A, two Let us assume a BGP CAR route (E2, C2) is signaled from B to A, two
border routers of Domain 2 and Domain 1, respectively. Let us assume BRs of Domain 2 and Domain 1, respectively. Let us assume that these
that these two domains do not share the same color-to-intent mapping two domains do not share the same color-to-intent mapping (i.e., they
(i.e., they belong to different color domains). Low delay in Domain belong to different color domains). Low delay in Domain 2 is color
2 is color C2, while it is C1 in Domain 1 (C1 <> C2). C2, while it is C1 in Domain 1 (C1 <> C2).
It is not expected to be a typical scenario to have an underlay It is not expected to be a typical scenario to have an underlay
transport path (e.g., an MPLS LSP) extend across different color transport path (e.g., an MPLS LSP) extend across different color
domains. However, the BGP CAR solution seamlessly supports this rare domains. However, the BGP CAR solution seamlessly supports this rare
scenario while maintaining the separation and independence of the scenario while maintaining the separation and independence of the
administrative authority in different color domains. administrative authority in different color domains.
The solution works as described below: The solution works as described below:
* Within Domain 2, the BGP CAR route is (E2, C2) via E2. * Within Domain 2, the BGP CAR route is (E2, C2) via E2.
skipping to change at line 786 skipping to change at line 786
resolution and steering. resolution and steering.
* In the rare case of color incongruence, the local color encoded in * In the rare case of color incongruence, the local color encoded in
LCM-EC takes precedence. LCM-EC takes precedence.
Operational considerations are in Section 11. Further illustrations Operational considerations are in Section 11. Further illustrations
are provided in Appendix B. are provided in Appendix B.
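
The color precedence described in this section and in Section 2.5 might be sketched as below. The function names are hypothetical, and the fallback to the color carried in the NLRI when no LCM-EC is present is an assumption drawn from the (E, C) data model rather than from the text shown in this diff.

```python
from typing import Optional


def intent_color(nlri_color: Optional[int],
                 lcm_ec_color: Optional[int] = None) -> Optional[int]:
    """The local color encoded in LCM-EC takes precedence when present.

    Falling back to the NLRI color is an assumption of this sketch.
    """
    return lcm_ec_color if lcm_ec_color is not None else nlri_color


def resolution_color(nlri_color: Optional[int],
                     lcm_ec_color: Optional[int] = None,
                     color_ec: Optional[int] = None) -> Optional[int]:
    """For resolution, a Color-EC color overrides the route's intent color."""
    if color_ec is not None:
        return color_ec
    return intent_color(nlri_color, lcm_ec_color)
```

For example, a route (E2, C2) that crosses into Domain 1 carrying LCM-EC C1 would resolve on C1 under this sketch, unless a Color-EC is also attached.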
2.9. Format and Encoding 2.9. Format and Encoding
BGP CAR leverages BGP multi-protocol extensions [RFC4760] and uses BGP CAR leverages BGP multiprotocol extensions [RFC4760] and uses the
the MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route updates MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route updates within
within SAFI value 83 along with AFI 1 for IPv4 prefixes and AFI 2 for SAFI value 83 along with AFI 1 for IPv4 prefixes and AFI 2 for IPv6
IPv6 prefixes. prefixes.
BGP speakers MUST use the BGP Capabilities Advertisement to ensure BGP speakers MUST use the BGP Capabilities Advertisement to ensure
support for processing of BGP CAR updates. This is done as specified support for processing of BGP CAR updates. This is done as specified
in [RFC4760], by using capability code 1 (multi-protocol BGP), with in [RFC4760], by using capability code 1 (multiprotocol BGP), with
AFI 1 and 2 (as required) and SAFI 83. AFI 1 and 2 (as required) and SAFI 83.
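
As an illustration of the capability described above, the following sketch encodes the multiprotocol capability for BGP CAR. The layout (capability code, length, 2-octet AFI, reserved octet, 1-octet SAFI) follows [RFC4760]; the function and constant names are hypothetical.

```python
import struct

MP_BGP_CAPABILITY_CODE = 1   # multiprotocol BGP capability
BGP_CAR_SAFI = 83            # SAFI value used by BGP CAR
AFI_IPV4 = 1
AFI_IPV6 = 2


def mp_capability(afi: int, safi: int = BGP_CAR_SAFI) -> bytes:
    """Encode one multiprotocol capability: code, length 4, AFI, reserved, SAFI."""
    return struct.pack("!BBHBB", MP_BGP_CAPABILITY_CODE, 4, afi, 0, safi)


def car_capabilities(ipv4: bool = True, ipv6: bool = True) -> bytes:
    """Concatenate the capabilities for the address families in use."""
    out = b""
    if ipv4:
        out += mp_capability(AFI_IPV4)
    if ipv6:
        out += mp_capability(AFI_IPV6)
    return out
```

A speaker that needs only IPv6 CAR routes would call car_capabilities(ipv4=False) in this sketch.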
The Next Hop network address field in the MP_REACH_NLRI may either be The Next Hop network address field in the MP_REACH_NLRI may either be
an IPv4 address or an IPv6 address, independent of AFI. If the next an IPv4 address or an IPv6 address, independent of AFI. If the next
hop length is 4, then the next hop is an IPv4 address. The next hop hop length is 4, then the next hop is an IPv4 address. The next hop
length may be 16 or 32 for an IPv6 next hop address, set as per length may be 16 or 32 for an IPv6 next hop address, set as per
Section 3 of [RFC2545]. Processing of the Next Hop field is governed Section 3 of [RFC2545]. Processing of the Next Hop field is governed
by standard BGP procedures as described in Section 3 of [RFC4760]. by standard BGP procedures as described in Section 3 of [RFC4760].
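
The next hop length rule above can be sketched as a small decoder. The function name is hypothetical; the interpretation of a 32-octet next hop as a global address followed by a link-local address is taken from Section 3 of [RFC2545].

```python
import ipaddress
from typing import List, Union

Address = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def parse_next_hop(raw: bytes) -> List[Address]:
    """Decode the Next Hop field purely from its length, independent of AFI."""
    if len(raw) == 4:
        return [ipaddress.IPv4Address(raw)]
    if len(raw) == 16:
        return [ipaddress.IPv6Address(raw)]
    if len(raw) == 32:
        # Global IPv6 address followed by a link-local IPv6 address.
        return [ipaddress.IPv6Address(raw[:16]),
                ipaddress.IPv6Address(raw[16:])]
    raise ValueError(f"unsupported next hop length: {len(raw)}")
```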
The sub-sections below specify the generic encoding of the BGP CAR The sub-sections below specify the generic encoding of the BGP CAR
skipping to change at line 1334 skipping to change at line 1334
relied upon to extract the key and perform 'treat-as-withdraw' for relied upon to extract the key and perform 'treat-as-withdraw' for
malformed information. malformed information.
A sender MUST ensure that the NLRI and key lengths are the number of A sender MUST ensure that the NLRI and key lengths are the number of
actual bytes encoded in the NLRI and key fields, respectively, actual bytes encoded in the NLRI and key fields, respectively,
regardless of content being encoded. regardless of content being encoded.
Given the NLRI length and Key length MUST be valid, failures in the Given the NLRI length and Key length MUST be valid, failures in the
following checks result in 'AFI/SAFI disable' or 'session reset': following checks result in 'AFI/SAFI disable' or 'session reset':
* The minimum NLRI length MUST be at least 2, as key length and NLRI * The minimum NLRI Length MUST be at least 2, as Key Length and NLRI
type are required fields. Type are required fields.
* The Key Length MUST be at least 2 less than NLRI Length. * The Key Length MUST be at least 2 less than NLRI Length.
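
The two length checks above can be expressed as a short sketch. The function name is hypothetical, and the choice between 'AFI/SAFI disable' and 'session reset' on failure is left to the caller.

```python
def lengths_are_consistent(nlri_length: int, key_length: int) -> bool:
    """Apply the two structural checks on NLRI Length and Key Length.

    A False result corresponds to the failures that lead to
    'AFI/SAFI disable' or 'session reset' in the text above.
    """
    if nlri_length < 2:
        # Key Length and NLRI Type are required fields.
        return False
    if key_length > nlri_length - 2:
        # Key Length must be at least 2 less than NLRI Length.
        return False
    return True
```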
NLRI type-specific error handling: NLRI type-specific error handling:
* By default, a speaker SHOULD discard an unrecognized or * By default, a speaker SHOULD discard an unrecognized or
unsupported NLRI type and move to the next NLRI. unsupported NLRI type and move to the next NLRI.
* Key length and key errors of a known NLRI type SHOULD result in * Key length and key errors of a known NLRI type SHOULD result in
the discard of NLRI similar to an unrecognized NLRI type. (This the discard of NLRI similar to an unrecognized NLRI type. (This
skipping to change at line 1453 skipping to change at line 1453
routes from upstream routers or Route Reflectors (RRs) to limit the routes from upstream routers or Route Reflectors (RRs) to limit the
routes that it needs to learn. On-demand subscription and automated routes that it needs to learn. On-demand subscription and automated
filtering procedures for individual CAR routes are outside the scope filtering procedures for individual CAR routes are outside the scope
of this document. of this document.
5. Scaling 5. Scaling
This section analyzes the key scale requirement of [INTENT-AWARE], This section analyzes the key scale requirement of [INTENT-AWARE],
specifically: specifically:
* No intermediate node data-plane should need to scale to (Colors * * No intermediate node data plane should need to scale to (Colors *
PEs). PEs).
* No node should learn and install a BGP CAR route to (E, C) if it * No node should learn and install a BGP CAR route to (E, C) if it
does not install a colored service route to E. does not install a colored service route to E.
While the requirements and design principles generally apply to any While the requirements and design principles generally apply to any
transport, the logical analysis based on the network design in this transport, the logical analysis based on the network design in this
section focuses on MPLS/SR-MPLS transport since the scaling section focuses on MPLS/SR-MPLS transport since the scaling
constraints are specifically relevant to these technologies. BGP CAR constraints are specifically relevant to these technologies. BGP CAR
SAFI is used here, but the considerations can apply to [RFC8277] or SAFI is used here, but the considerations can apply to [RFC8277] or
skipping to change at line 1522 skipping to change at line 1522
* Each domain has Flex-Algo 128. Prefix-SID for a node is Segment * Each domain has Flex-Algo 128. Prefix-SID for a node is Segment
Routing Global Block (SRGB) 168000 plus node number. Routing Global Block (SRGB) 168000 plus node number.
* A BGP CAR route (E2, C1) is advertised by egress BRM node 451. * A BGP CAR route (E2, C1) is advertised by egress BRM node 451.
The route is sourced locally from redistribution from IGP Flex- The route is sourced locally from redistribution from IGP Flex-
Algo 128. Algo 128.
* Not shown for simplicity, node 452 will also advertise (E2, C1). * Not shown for simplicity, node 452 will also advertise (E2, C1).
* When a transport RR is used within the domain or across domains, * When a TRR is used within the domain or across domains, ADD-PATH
ADD-PATH is enabled to advertise paths from both egress BRs to its is enabled to advertise paths from both egress BRs to its clients.
clients.
* Egress PE E2 advertises a VPN route RD:V/v with BGP Color-EC C1 * Egress PE E2 advertises a VPN route RD:V/v with BGP Color-EC C1
that propagates via service RRs to ingress PE E1. that propagates via service RRs to ingress PE E1.
* E1 steers V/v prefix via color-aware path (E2, C1) and VPN label * E1 steers V/v prefix via color-aware path (E2, C1) and VPN label
30030. 30030.
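
The label values used in this example follow from the Prefix-SID rule above (SRGB 168000 plus node number). The sketch below reproduces them; the function name is hypothetical, and treating E2 as node number 2 is an assumption made here to match the label 168002 used later in this section.

```python
SRGB_BASE = 168000  # Flex-Algo 128 SRGB base used in this example


def prefix_sid_label(node_number: int, srgb_base: int = SRGB_BASE) -> int:
    """Label for a node's Prefix-SID: SRGB base plus node number."""
    return srgb_base + node_number


# Labels for the nodes referenced in the deployment models.
labels = {node: prefix_sid_label(node) for node in (451, 231, 2)}
```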
5.2. Deployment Model 5.2. Deployment Model
5.2.1. Flat 5.2.1. Flat
skipping to change at line 1636 skipping to change at line 1635
* Each BGP hop allocates local label and programs swap entry in * Each BGP hop allocates local label and programs swap entry in
forwarding for (451, C1). forwarding for (451, C1).
* 121 resolves received BGP CAR route (451, C1) via 231 (label * 121 resolves received BGP CAR route (451, C1) via 231 (label
168451) on color-aware path (231, C1). 168451) on color-aware path (231, C1).
- Color-aware path (231, C1) is Flex-Algo 128 path to 231 (label - Color-aware path (231, C1) is Flex-Algo 128 path to 231 (label
168231). 168231).
* 451 advertises BGP CAR route (E2, C1) via 451 to transport RR * 451 advertises BGP CAR route (E2, C1) via 451 to TRR T-RR2, which
T-RR2, which reflects it to transport RR T-RR1, which reflects it reflects it to TRR T-RR1, which reflects it to 121.
to 121.
* 121 receives BGP CAR route (E2, C1) via 451 with label 168002. * 121 receives BGP CAR route (E2, C1) via 451 with label 168002.
- Let's assume 121 selects that path. - Let's assume 121 selects that path.
* 121 resolves BGP CAR route (E2, C1) via 451 on color-aware path * 121 resolves BGP CAR route (E2, C1) via 451 on color-aware path
(451, C1). (451, C1).
- Color-aware path (451, C1) is BGP CAR path to 451 (label - Color-aware path (451, C1) is BGP CAR path to 451 (label
168451). 168451).
skipping to change at line 1720 skipping to change at line 1718
Figure 5: Hierarchical BGP Transport CAR, Next-Hop-Unchanged Figure 5: Hierarchical BGP Transport CAR, Next-Hop-Unchanged
(NHU) at iBR (NHU) at iBR
* Nodes 341, 231, and 121 receive and resolve BGP CAR route (451, * Nodes 341, 231, and 121 receive and resolve BGP CAR route (451,
C1) the same as in the previous model. C1) the same as in the previous model.
* Node 121 allocates local label and programs swap entry in * Node 121 allocates local label and programs swap entry in
forwarding for (451, C1). forwarding for (451, C1).
* 451 advertises BGP CAR route (E2, C1) to transport RR T-RR2, which * 451 advertises BGP CAR route (E2, C1) to TRR T-RR2, which reflects
reflects it to transport RR T-RR1, which reflects it to 121. it to TRR T-RR1, which reflects it to 121.
* Node 121 advertises (E2, C1) to E1 with next hop as 451 (i.e., * Node 121 advertises (E2, C1) to E1 with next hop as 451 (i.e.,
next-hop-unchanged). next-hop-unchanged).
* 121 also advertises (451, C1) to E1 with next-hop-self (121) and * 121 also advertises (451, C1) to E1 with next-hop-self (121) and
label 168451. label 168451.
* E1 resolves BGP CAR route (451, C1) via 121 on color-aware path * E1 resolves BGP CAR route (451, C1) via 121 on color-aware path
(121, C1). (121, C1).
skipping to change at line 1764 skipping to change at line 1762
* Nodes 121, 231, and 341 perform swap operation on 168451 bound to * Nodes 121, 231, and 341 perform swap operation on 168451 bound to
(451, C1). (451, C1).
* 451 performs swap operation on 168002 bound to color-aware path * 451 performs swap operation on 168002 bound to color-aware path
(E2, C1). (E2, C1).
5.3. Scale Analysis 5.3. Scale Analysis
The following two tables summarize the logically analyzed scaling of The following two tables summarize the logically analyzed scaling of
the control-plane and data-plane for the previous three models: the control plane and data plane for the previous three models:
+=======+=====================+=====================+=============+ +=======+=====================+=====================+=============+
| | E1 | 121 | 231 | | | E1 | 121 | 231 |
+=======+=====================+=====================+=============+ +=======+=====================+=====================+=============+
| FLAT | (E2,C) via (121,C) | (E2,C) via (231,C) | (E2,C) via | | FLAT | (E2,C) via (121,C) | (E2,C) via (231,C) | (E2,C) via |
| | | | (341,C) | | | | | (341,C) |
+=======+---------------------+---------------------+-------------+ +=======+---------------------+---------------------+-------------+
| H.NHS | (E2,C) via (121,C) | (E2,C) via (451,C) | (451,C) via | | H.NHS | (E2,C) via (121,C) | (E2,C) via (451,C) | (451,C) via |
| | | (451,C) via (231,C) | (341,C) | | | | (451,C) via (231,C) | (341,C) |
+=======+---------------------+---------------------+-------------+ +=======+---------------------+---------------------+-------------+
skipping to change at line 1806 skipping to change at line 1804
+=======+------------+------------------+------------------+ +=======+------------+------------------+------------------+
Table 2 Table 2
* The flat model is the simplest design, with a single BGP transport * The flat model is the simplest design, with a single BGP transport
level. It results in the minimum label/SID stack at each BGP hop. level. It results in the minimum label/SID stack at each BGP hop.
However, it significantly increases the scale impact on the core However, it significantly increases the scale impact on the core
BRs (e.g., 341), whose FIB capacity and even MPLS label space may BRs (e.g., 341), whose FIB capacity and even MPLS label space may
be exceeded. be exceeded.
- 341's data-plane scales with (E2, C) where there may be 300k Es - 341's data plane scales with (E2, C) where there may be 300k Es
and 5 Cs, hence 1.5M entries > 1M MPLS data-plane. and 5 Cs, hence 1.5M entries > 1M MPLS data plane.
* The hierarchical models avoid the need for core BRs to learn * The hierarchical models avoid the need for core BRs to learn
routes and install label forwarding entries for (E, C) routes. routes and install label forwarding entries for (E, C) routes.
- Whether next hop is set to self or left unchanged at 121, 341's - Whether next hop is set to self or left unchanged at 121, 341's
data-plane scales with (451, C) where there may be thousands of data plane scales with (451, C) where there may be thousands of
451s and 5 Cs. Therefore, this scaling is well under the 1 451s and 5 Cs. Therefore, this scaling is well under the 1
million MPLS labels data-plane limit. million MPLS labels data plane limit.
- They also aid faster convergence by allowing the PE routes to - They also aid faster convergence by allowing the PE routes to
be distributed via out-of-band RRs that can be scaled be distributed via out-of-band RRs that can be scaled
independent of the transport BRs. independent of the transport BRs.
* The next-hop-self option at ingress BRM (e.g., 121) hides the * The next-hop-self option at ingress BRM (e.g., 121) hides the
hierarchical design from the ingress PE, keeping its outgoing hierarchical design from the ingress PE, keeping its outgoing
label programming as simple as the flat model. However, the label programming as simple as the flat model. However, the
ingress BRM requires an additional BGP transport level recursion, ingress BRM requires an additional BGP transport level recursion,
which coupled with load-balancing adds data-plane complexity. It which coupled with load-balancing adds data plane complexity. It
needs to support a swap and push operation. It also needs to needs to support a swap and push operation. It also needs to
install label forwarding entries for the egress PEs that are of install label forwarding entries for the egress PEs that are of
interest to its local ingress PEs. interest to its local ingress PEs.
* With the next-hop-unchanged option at ingress BRM (e.g., 121), * With the next-hop-unchanged option at ingress BRM (e.g., 121),
only an ingress PE needs to learn and install output label entries only an ingress PE needs to learn and install output label entries
for egress (E, C) routes. The ingress BRM only installs label for egress (E, C) routes. The ingress BRM only installs label
forwarding entries for the egress ABR (e.g., 451). However, the forwarding entries for the egress ABR (e.g., 451). However, the
ingress PE needs an additional BGP transport level recursion and ingress PE needs an additional BGP transport level recursion and
pushes a BGP VPN label and two BGP transport labels. It may also pushes a BGP VPN label and two BGP transport labels. It may also
need to handle load-balancing for the egress ABRs. This is the need to handle load-balancing for the egress ABRs. This is the
most complex data-plane option for the ingress PE. most complex data plane option for the ingress PE.
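
The scale comparison above reduces to simple arithmetic, sketched below. The 300k PEs and 5 colors come from the text; the count of 5,000 BRs stands in for "thousands of 451s" and is a hypothetical value, and the label space is taken as the 2^20 values of the 20-bit MPLS label field.

```python
MPLS_LABEL_SPACE = 2 ** 20  # 20-bit MPLS label field, roughly 1M values


def forwarding_entries(endpoints: int, colors: int) -> int:
    """Entries a node's data plane needs for (endpoint, color) routes."""
    return endpoints * colors


# Flat model: core BR 341 installs an entry per (E, C).
flat_entries = forwarding_entries(300_000, 5)

# Hierarchical models: 341 installs an entry per (BR, C) only.
# 5_000 BRs is an assumed figure for illustration.
hierarchical_entries = forwarding_entries(5_000, 5)

flat_fits = flat_entries <= MPLS_LABEL_SPACE
hierarchical_fits = hierarchical_entries <= MPLS_LABEL_SPACE
```

Under these assumptions the flat model needs 1.5M entries and exceeds the label space, while the hierarchical models need 25,000 entries at 341.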
5.4. Anycast SID 5.4. Anycast SID
This section describes how Anycast SID complements and improves the This section describes how Anycast SID complements and improves the
scaling designs above. scaling designs above.
5.4.1. Anycast SID for Transit Inter-Domain Nodes 5.4.1. Anycast SID for Transit Inter-Domain Nodes
* Redundant BRs (e.g., two egress BRMs, 451 and 452) advertise BGP * Redundant BRs (e.g., two egress BRMs, 451 and 452) advertise BGP
CAR routes for a local PE (e.g., E2) with the same SID (based on CAR routes for a local PE (e.g., E2) with the same SID (based on
skipping to change at line 2194 skipping to change at line 2192
with existing operational usage, the CAR IP Prefix route is allowed with existing operational usage, the CAR IP Prefix route is allowed
to be without color for best-effort. In this case, the routes will to be without color for best-effort. In this case, the routes will
not carry an LCM-EC. Resolution is described in Section 2.5. not carry an LCM-EC. Resolution is described in Section 2.5.
As described in Section 7.3, infrastructure prefixes are intended to As described in Section 7.3, infrastructure prefixes are intended to
be carried in CAR SAFI instead of SAFIs that also carry service be carried in CAR SAFI instead of SAFIs that also carry service
routes such as BGP-IP (SAFI 1, [RFC4271]) and BGP-LU (SAFI 4, routes such as BGP-IP (SAFI 1, [RFC4271]) and BGP-LU (SAFI 4,
[RFC4798]). However, if such infrastructure routes are also [RFC4798]). However, if such infrastructure routes are also
distributed in these SAFIs, a router may receive both BGP CAR SAFI distributed in these SAFIs, a router may receive both BGP CAR SAFI
paths and IP/LU SAFI paths. By default, the CAR SAFI transport path paths and IP/LU SAFI paths. By default, the CAR SAFI transport path
is preferred over the BGP IP or BGP-LU SAFI path. is preferred over the BGP-IP or BGP-LU SAFI path.
A BGP transport CAR speaker that supports packet forwarding lookup A BGP transport CAR speaker that supports packet forwarding lookup
based on the IPv6 prefix route (such as a BR) will set itself as next based on the IPv6 prefix route (such as a BR) will set itself as next
hop while advertising the route to peers. It will also install the hop while advertising the route to peers. It will also install the
IPv6 route into forwarding with the received next hop and/or IPv6 route into forwarding with the received next hop and/or
encapsulation. If such a transit router does not support this route encapsulation. If such a transit router does not support this route
type, it will not install this route and will not set itself as next type, it will not install this route and will not set itself as next
hop; hence, it will not propagate the route any further. hop; hence, it will not propagate the route any further.
9. VPN CAR 9. VPN CAR
skipping to change at line 2289 skipping to change at line 2287
CAR routes distributed in VPN CAR SAFI are infrastructure routes CAR routes distributed in VPN CAR SAFI are infrastructure routes
advertised by CEs in different customer VRFs on a PE. Example use advertised by CEs in different customer VRFs on a PE. Example use
cases are intent-aware L3VPN Carriers' Carriers (Section 9 of cases are intent-aware L3VPN Carriers' Carriers (Section 9 of
[RFC4364]) and SRv6 over a provider network. The VPN RD [RFC4364]) and SRv6 over a provider network. The VPN RD
distinguishes CAR routes of different customers being advertised by distinguishes CAR routes of different customers being advertised by
the PE. the PE.
9.1. Format and Encoding 9.1. Format and Encoding
BGP VPN CAR SAFI leverages BGP multi-protocol extensions [RFC4760] BGP VPN CAR SAFI leverages BGP multiprotocol extensions [RFC4760] and
and uses the MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route uses the MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route
updates within SAFI value 84 along with AFI 1 for IPv4 VPN CAR updates within SAFI value 84 along with AFI 1 for IPv4 VPN CAR
prefixes and AFI 2 for IPv6 VPN CAR prefixes. prefixes and AFI 2 for IPv6 VPN CAR prefixes.
BGP speakers MUST use the BGP Capabilities Advertisement to ensure BGP speakers MUST use the BGP Capabilities Advertisement to ensure
support for processing of BGP VPN CAR updates. This is done as support for processing of BGP VPN CAR updates. This is done as
specified in [RFC4760], by using capability code 1 (multi-protocol specified in [RFC4760], by using capability code 1 (multiprotocol
BGP), with AFI 1 and 2 (as required) and SAFI 84. BGP), with AFI 1 and 2 (as required) and SAFI 84.
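The capability advertisement described above can be sketched as follows. This is a minimal illustration, assuming the Multiprotocol Extensions capability layout from [RFC4760] (code, length, 2-octet AFI, 1 reserved octet, 1-octet SAFI); the function and constant names are hypothetical and do not come from this document.

```python
import struct

MP_EXT_CAPABILITY_CODE = 1   # capability code 1, as stated in the text
SAFI_VPN_CAR = 84            # SAFI value 84, as stated in the text
AFI_IPV4, AFI_IPV6 = 1, 2

def mp_capability(afi: int, safi: int) -> bytes:
    """Encode one Multiprotocol Extensions capability.

    Layout: code (1 octet), length (1 octet, always 4),
    AFI (2 octets), reserved (1 octet, zero), SAFI (1 octet).
    """
    return struct.pack("!BBHBB", MP_EXT_CAPABILITY_CODE, 4, afi, 0, safi)

# A speaker that supports both IPv4 and IPv6 VPN CAR prefixes would
# include one capability per AFI (names here are illustrative only).
vpn_car_capabilities = [mp_capability(afi, SAFI_VPN_CAR)
                        for afi in (AFI_IPV4, AFI_IPV6)]
```

Each entry would be placed in the Capabilities optional parameter of the OPEN message; the session carries VPN CAR updates only for AFI/SAFI pairs that both peers advertised.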
The Next Hop network address field in the MP_REACH_NLRI may contain The Next Hop network address field in the MP_REACH_NLRI may contain
either a VPN-IPv4 or a VPN-IPv6 address with 8-octet RD set to zero, either a VPN-IPv4 or a VPN-IPv6 address with 8-octet RD set to zero,
independent of AFI. If the next hop length is 12, then the next hop independent of AFI. If the next hop length is 12, then the next hop
is a VPN-IPv4 address with an RD of 0 constructed as per [RFC4364]. is a VPN-IPv4 address with an RD of 0 constructed as per [RFC4364].
If the next hop length is 24 or 48, then the next hop is a VPN-IPv6 If the next hop length is 24 or 48, then the next hop is a VPN-IPv6
address constructed as per Section 3.2.1.1 of [RFC4659]. address constructed as per Section 3.2.1.1 of [RFC4659].
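The next hop length rules above can be sketched as a small parser. This is an illustration under stated assumptions, not an implementation from this document: the function name is hypothetical, a length of 48 is assumed to carry two VPN-IPv6 addresses back to back (each with its own 8-octet RD), and rejecting a non-zero RD is a strictness choice made only for this sketch.

```python
import ipaddress

def parse_vpn_car_next_hop(data: bytes) -> list:
    """Return the address(es) carried in the Next Hop field.

    12 octets     -> one VPN-IPv4 address (8-octet RD + 4-octet IPv4)
    24/48 octets  -> one or two VPN-IPv6 addresses (8-octet RD + 16-octet IPv6)
    """
    if len(data) == 12:
        width, cls = 4, ipaddress.IPv4Address
    elif len(data) in (24, 48):
        width, cls = 16, ipaddress.IPv6Address
    else:
        raise ValueError(f"unexpected next hop length {len(data)}")

    step = 8 + width
    addresses = []
    for offset in range(0, len(data), step):
        rd = data[offset:offset + 8]
        address = data[offset + 8:offset + step]
        # The text says the RD is set to zero; rejecting anything else
        # is an assumption of this sketch, not a stated requirement.
        if rd != bytes(8):
            raise ValueError("next hop RD is expected to be zero")
        addresses.append(cls(address))
    return addresses
```

Note that the address family of the next hop is derived from the length alone, which matches the statement that the next hop encoding is independent of AFI.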
9.1.1. VPN CAR (E, C) NLRI Type 9.1.1. VPN CAR (E, C) NLRI Type
skipping to change at line 2819 skipping to change at line 2817
* The following description applies to the reference topology above: * The following description applies to the reference topology above:
- IGP Flex-Algo 128 is running in each domain, and mapped to - IGP Flex-Algo 128 is running in each domain, and mapped to
color C1. color C1.
- Egress PE E2 advertises a VPN route RD:V/v colored with Color- - Egress PE E2 advertises a VPN route RD:V/v colored with Color-
EC C1 to steer traffic to BGP transport CAR (E2, C1). VPN EC C1 to steer traffic to BGP transport CAR (E2, C1). VPN
route propagates via service RRs to ingress PE E1. route propagates via service RRs to ingress PE E1.
- BGP CAR route (E2, C1) with next hop, label index, and label as - BGP CAR route (E2, C1) with next hop, label index, and label as
shown above are advertised through border routers in each shown above are advertised through BRs in each domain. When an
domain. When an RR is used in the domain, ADD-PATH is enabled RR is used in the domain, ADD-PATH is enabled to advertise
to advertise multiple available paths. multiple available paths.
- On each BGP hop, the (E2, C1) route's next hop is resolved over - On each BGP hop, the (E2, C1) route's next hop is resolved over
IGP Flex-Algo 128 of the domain. The AIGP attribute influences IGP Flex-Algo 128 of the domain. The AIGP attribute influences
the BGP CAR route best path decision as per [RFC7311]. The BGP the BGP CAR route best path decision as per [RFC7311]. The BGP
CAR label swap entry is installed that goes over Flex-Algo 128 CAR label swap entry is installed that goes over Flex-Algo 128
LSP to next hop providing intent in each IGP domain. The AIGP LSP to next hop providing intent in each IGP domain. The AIGP
metric should be updated to reflect Flex-Algo 128 metric to metric should be updated to reflect Flex-Algo 128 metric to
next hop. next hop.
- Ingress PE E1 learns CAR route (E2, C1). It steers colored VPN - Ingress PE E1 learns CAR route (E2, C1). It steers colored VPN
skipping to change at line 2905 skipping to change at line 2903
o SR Policy (C1, 231) segments <S2, 231>, and o SR Policy (C1, 231) segments <S2, 231>, and
o SR Policy (C1, E2) segments <S3, E2>. o SR Policy (C1, E2) segments <S3, E2>.
- Egress PE E2 advertises a VPN route RD:V/v colored with Color- - Egress PE E2 advertises a VPN route RD:V/v colored with Color-
EC C1 to steer traffic to BGP transport CAR (E2, C1). VPN EC C1 to steer traffic to BGP transport CAR (E2, C1). VPN
route propagates via service RRs to ingress PE E1. route propagates via service RRs to ingress PE E1.
- BGP CAR route (E2, C1) with next hop, label index, and label as - BGP CAR route (E2, C1) with next hop, label index, and label as
shown above are advertised through border routers in each shown above are advertised through BRs in each domain. When an
domain. When an RR is used in the domain, ADD-PATH is enabled RR is used in the domain, ADD-PATH is enabled to advertise
to advertise multiple available paths. multiple available paths.
- On each BGP hop, the CAR route (E2, C1) next hop is resolved - On each BGP hop, the CAR route (E2, C1) next hop is resolved
over an SR Policy (C1, next hop). The BGP CAR label swap entry over an SR Policy (C1, next hop). The BGP CAR label swap entry
is installed that goes over SR Policy segment list. is installed that goes over SR Policy segment list.
- Ingress PE E1 learns CAR route (E2, C1). It steers colored VPN - Ingress PE E1 learns CAR route (E2, C1). It steers colored VPN
route RD:V/v into (E2, C1). route RD:V/v into (E2, C1).
* Important: * Important:
skipping to change at line 2978 skipping to change at line 2976
* The following description applies to the reference topology above: * The following description applies to the reference topology above:
- IGP Flex-Algo 128 is only enabled in core (e.g., WAN network), - IGP Flex-Algo 128 is only enabled in core (e.g., WAN network),
mapped to C1. Access network domain only has Base Algo 0. mapped to C1. Access network domain only has Base Algo 0.
- Egress PE E2 advertises a VPN route RD:V/v colored with Color- - Egress PE E2 advertises a VPN route RD:V/v colored with Color-
EC C1 to steer traffic via BGP transport CAR (E2, C1). VPN EC C1 to steer traffic via BGP transport CAR (E2, C1). VPN
route propagates via service RRs to ingress PE E1. route propagates via service RRs to ingress PE E1.
- BGP CAR route (E2, C1) with next hop, label index, and label as - BGP CAR route (E2, C1) with next hop, label index, and label as
shown above are advertised through border routers in each shown above are advertised through BRs in each domain. When an
domain. When an RR is used in the domain, ADD-PATH is enabled RR is used in the domain, ADD-PATH is enabled to advertise
to advertise multiple available paths. multiple available paths.
- Local policy on 231 and 232 maps intent C1 to resolve CAR route - Local policy on 231 and 232 maps intent C1 to resolve CAR route
next hop over IGP Base Algo 0 in right access domain. The BGP next hop over IGP Base Algo 0 in right access domain. The BGP
CAR label swap entry is installed that goes over Base Algo 0 CAR label swap entry is installed that goes over Base Algo 0
LSP to next hop. AIGP metric is updated to reflect Base Algo 0 LSP to next hop. AIGP metric is updated to reflect Base Algo 0
metric to next hop with an additional penalty (+1000). metric to next hop with an additional penalty (+1000).
- On 121 and 122, CAR route (E2, C1) next hop learnt from Core - On 121 and 122, CAR route (E2, C1) next hop learnt from Core
domain is resolved over IGP Flex-Algo 128. The BGP CAR label domain is resolved over IGP Flex-Algo 128. The BGP CAR label
swap entry is installed that goes over Flex-Algo 128 LSP to swap entry is installed that goes over Flex-Algo 128 LSP to
skipping to change at line 3058 skipping to change at line 3056
- RSVP-TE MPLS tunnel mesh is configured only in core (e.g., WAN - RSVP-TE MPLS tunnel mesh is configured only in core (e.g., WAN
network). Access only has IS-IS/LDP. (The figure does not network). Access only has IS-IS/LDP. (The figure does not
show all TE tunnels.) show all TE tunnels.)
- Egress PE E2 advertises a VPN route RD:V/v colored with Color- - Egress PE E2 advertises a VPN route RD:V/v colored with Color-
EC C1 to steer traffic via BGP transport CAR (E2, C1). VPN EC C1 to steer traffic via BGP transport CAR (E2, C1). VPN
route propagates via service RRs to ingress PE E1. route propagates via service RRs to ingress PE E1.
- BGP CAR route (E2, C1) with next hops and labels as shown above - BGP CAR route (E2, C1) with next hops and labels as shown above
is advertised through border routers in each domain. When an is advertised through BRs in each domain. When an RR is used
RR is used in the domain, ADD-PATH is enabled to advertise in the domain, ADD-PATH is enabled to advertise multiple
multiple available paths. available paths.
- Local policy on 231 and 232 maps intent C1 to resolve CAR route - Local policy on 231 and 232 maps intent C1 to resolve CAR route
next hop over best-effort LDP LSP in access domain 1. The BGP next hop over best-effort LDP LSP in access domain 1. The BGP
CAR label swap entry is installed that goes over LDP LSP to CAR label swap entry is installed that goes over LDP LSP to
next hop. AIGP metric is updated to reflect best-effort metric next hop. AIGP metric is updated to reflect best-effort metric
to next hop with an additional penalty (+1000). to next hop with an additional penalty (+1000).
- Local policy on 121 and 122 maps intent C1 to resolve CAR route - Local policy on 121 and 122 maps intent C1 to resolve CAR route
next hop in Core domain over RSVP-TE tunnels. The BGP CAR next hop in Core domain over RSVP-TE tunnels. The BGP CAR
label swap entry is installed that goes over a TE tunnel to label swap entry is installed that goes over a TE tunnel to
skipping to change at line 3092 skipping to change at line 3090
- Dynamic BGP CAR label carries intent from PEs, which is - Dynamic BGP CAR label carries intent from PEs, which is
realized in Core domain by resolution via RSVP-TE tunnel. realized in Core domain by resolution via RSVP-TE tunnel.
A.4. Transit Network Domains That Do Not Support CAR A.4. Transit Network Domains That Do Not Support CAR
* In a brownfield deployment, color-aware paths between two PEs may * In a brownfield deployment, color-aware paths between two PEs may
need to go through a transit domain that does not support CAR. need to go through a transit domain that does not support CAR.
Examples of such a brownfield network include an MPLS LDP network Examples of such a brownfield network include an MPLS LDP network
with IGP best-effort, or a multi-domain network based on BGP-LU. with IGP best-effort, or a multi-domain network based on BGP-LU.
An MPLS LDP network with best-effort IGP can adopt the above An MPLS LDP network with best-effort IGP can adopt the above
scheme in Appendix A.3. Below is the example scenario for BGP LU. scheme in Appendix A.3. Below is the example scenario for BGP-LU.
* Reference topology: * Reference topology:
E1 --- BR1 --- BR2 ......... BR3 ---- BR4 --- E2 E1 --- BR1 --- BR2 ......... BR3 ---- BR4 --- E2
Ci <----LU----> Ci Ci <----LU----> Ci
Figure 10: BGP CAR Not Supported in Transit Domain Figure 10: BGP CAR Not Supported in Transit Domain
- Network between BR2 and BR3 comprises multiple BGP-LU hops - Network between BR2 and BR3 comprises multiple BGP-LU hops
(over IGP-LDP domains). (over IGP-LDP domains).
skipping to change at line 3215 skipping to change at line 3213
different domains. different domains.
A.6. Per-Flow Steering over CAR Routes A.6. Per-Flow Steering over CAR Routes
This section provides an example of ingress PE per-flow steering as This section provides an example of ingress PE per-flow steering as
defined in Section 8.6 of [RFC9256] onto BGP CAR routes. defined in Section 8.6 of [RFC9256] onto BGP CAR routes.
The following description applies to the reference topology in The following description applies to the reference topology in
Figure 6: Figure 6:
* Ingress PE E1 learns best-effort BGP LU route E2. * Ingress PE E1 learns best-effort BGP-LU route E2.
* Ingress PE E1 learns CAR route (E2, C1); C1 is mapped to "low * Ingress PE E1 learns CAR route (E2, C1); C1 is mapped to "low
delay". delay".
* Ingress PE E1 learns CAR route (E2, C2); C2 is mapped to "low * Ingress PE E1 learns CAR route (E2, C2); C2 is mapped to "low
delay and avoid resource R". delay and avoid resource R".
* Ingress PE E1 is configured to instantiate an array of paths to E2 * Ingress PE E1 is configured to instantiate an array of paths to E2
where entry 0 is the BGP LU path to next hop, color C1 is the where entry 0 is the BGP-LU path to next hop, color C1 is the
first entry, and color C2 is the second entry. The index into the first entry, and color C2 is the second entry. The index into the
array is called a Forwarding Class (FC). The index can have array is called a Forwarding Class (FC). The index can have
values 0 to 7, especially when derived from the MPLS TC bits values 0 to 7, especially when derived from the MPLS TC bits
[RFC5462]. [RFC5462].
* E1 is configured to match flows in its ingress interfaces (upon * E1 is configured to match flows in its ingress interfaces (upon
any field such as Ethernet destination/source/VLAN/TOS or IP any field such as Ethernet destination/source/VLAN/TOS or IP
destination/source/DSCP or transport ports, etc.) and color them destination/source/DSCP or transport ports, etc.) and color them
with an internal per-packet FC variable (0, 1, or 2 in this with an internal per-packet FC variable (0, 1, or 2 in this
example). example).
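The per-flow steering steps listed so far can be sketched as a lookup into an array of paths indexed by FC. This is a minimal sketch only: the DSCP-to-FC mapping and the function names are hypothetical, chosen to illustrate the classification step, and the path descriptions are plain strings standing in for forwarding entries.

```python
# Array of paths to E2 on ingress PE E1, indexed by Forwarding Class (FC).
paths_to_e2 = {
    0: "BGP-LU route E2 (best-effort)",
    1: "CAR route (E2, C1): low delay",
    2: "CAR route (E2, C2): low delay and avoid resource R",
}

# Hypothetical classifier configuration: DSCP value -> FC.
# The DSCP values are illustrative and do not come from the document.
DSCP_TO_FC = {46: 1, 34: 2}

def classify(dscp: int) -> int:
    """Color a packet with an internal FC value (0, 1, or 2 here)."""
    return DSCP_TO_FC.get(dscp, 0)

def steer(dscp: int) -> str:
    """Pick the path to E2 for a packet based on its FC."""
    return paths_to_e2[classify(dscp)]
```

For example, `steer(46)` selects the (E2, C1) entry, while a packet that matches no classifier rule keeps FC 0 and follows the best-effort BGP-LU path.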
skipping to change at line 3640 skipping to change at line 3638
- Similarly, Prefix B:C12::/32 summarizes Flex-Algo 128 block in - Similarly, Prefix B:C12::/32 summarizes Flex-Algo 128 block in
AS2. AS2.
- Per Flex-Algo external subnets for eBGP next hops IP1 and IP2 - Per Flex-Algo external subnets for eBGP next hops IP1 and IP2
are distributed in IS-IS within AS2. are distributed in IS-IS within AS2.
* BGP CAR prefix route B:C11::/32 with LCM C1 is originated by AS1 * BGP CAR prefix route B:C11::/32 with LCM C1 is originated by AS1
BRs 231 and 232 on eBGP sessions to AS2 BRs 121 and 122. BRs 231 and 232 on eBGP sessions to AS2 BRs 121 and 122.
* ASBR 121 and 122 propagate the route in AS2 to all the P, ABRs, * ASBR 121 and 122 propagate the route in AS2 to all the P, ABRs,
and PEs through transport RR. and PEs through TRR.
* Every router in AS2 resolves BGP CAR prefix B:C11::/32 next hops * Every router in AS2 resolves BGP CAR prefix B:C11::/32 next hops
IP1 and IP2 in IS-ISv6 Flex-Algo 128 and programs B:C11::/32 IP1 and IP2 in IS-ISv6 Flex-Algo 128 and programs B:C11::/32
prefix in global IPv6 forwarding table. prefix in global IPv6 forwarding table.
* AIGP attribute influences BGP CAR route best path decision. * AIGP attribute influences BGP CAR route best path decision.
* Egress PE E2 advertises a VPN route RD:V/v with SRv6 Service SID * Egress PE E2 advertises a VPN route RD:V/v with SRv6 Service SID
B:C11:2:DT4::. Service SID is allocated by E2 from its locator of B:C11:2:DT4::. Service SID is allocated by E2 from its locator of
color C1 intent. color C1 intent.
skipping to change at line 3746 skipping to change at line 3744
domain for the given intent. Node locators in the egress domain for the given intent. Node locators in the egress
domain are sub-allocated from the block. domain are sub-allocated from the block.
- Prefix B:C12::/32 summarizes Flex-Algo 128 block in transit - Prefix B:C12::/32 summarizes Flex-Algo 128 block in transit
domain. domain.
- Prefix B:C13::/32 summarizes Flex-Algo 128 block in ingress - Prefix B:C13::/32 summarizes Flex-Algo 128 block in ingress
domain. domain.
* BGP CAR route B:C11::/32 is originated by ABRs 231 and 232 with * BGP CAR route B:C11::/32 is originated by ABRs 231 and 232 with
LCM C1. Along the propagation path, border routers set next-hop- LCM C1. Along the propagation path, BRs set next-hop-self and
self and appropriately update the intra-domain encapsulation appropriately update the intra-domain encapsulation information
information for the C1 intent. For example, 231 and 121 signal for the C1 intent. For example, 231 and 121 signal SRv6 SID of
SRv6 SID of End behavior [RFC8986] allocated from their respective End behavior [RFC8986] allocated from their respective locators
locators for the C1 intent. (Note: IGP Flexible Algorithm is for the C1 intent. (Note: IGP Flexible Algorithm is shown for
shown for intra-domain path, but SR Policy may also provide the intra-domain path, but SR Policy may also provide the path as
path as shown in Appendix C.3.) shown in Appendix C.3.)
* AIGP attribute influences BGP CAR route best path decision. * AIGP attribute influences BGP CAR route best path decision.
* Egress PE E2 advertises a VPN route RD:V/v with SRv6 Service SID * Egress PE E2 advertises a VPN route RD:V/v with SRv6 Service SID
B:C11:2:DT4::. Service SID is allocated by E2 from its locator of B:C11:2:DT4::. Service SID is allocated by E2 from its locator of
color C1 intent. color C1 intent.
* Ingress PE E1 learns CAR route B:C11::/32 and VPN route RD:V/v * Ingress PE E1 learns CAR route B:C11::/32 and VPN route RD:V/v
with SRv6 SID B:C11:2:DT4::. with SRv6 SID B:C11:2:DT4::.
skipping to change at line 3774 skipping to change at line 3772
steered along IPv6 routed path provided by BGP CAR IP Prefix route steered along IPv6 routed path provided by BGP CAR IP Prefix route
to locator B:C11::/32. to locator B:C11::/32.
Important: Important:
* Uses longest prefix match of SRv6 Service SID to BGP CAR prefix. * Uses longest prefix match of SRv6 Service SID to BGP CAR prefix.
There is no mapping labels/SIDs; there is simple IP-based There is no mapping labels/SIDs; there is simple IP-based
forwarding instead. forwarding instead.
* Originating domain PE locators of the given intent can be * Originating domain PE locators of the given intent can be
summarized on transit BGP hops eliminating per PE state on border summarized on transit BGP hops eliminating per PE state on BRs.
routers.
Packet forwarding: Packet forwarding:
@E1: IPv4 VRF V/v => H.Encaps.red <B:C13:121:END::, B:C11:2:DT4::> @E1: IPv4 VRF V/v => H.Encaps.red <B:C13:121:END::, B:C11:2:DT4::>
@121: My SID table: B:C13:121:END:: => Update DA with B:C11:2:DT4:: @121: My SID table: B:C13:121:END:: => Update DA with B:C11:2:DT4::
@121: IPv6 Table: B:C11::/32 => H.Encaps.red <B:C12:231:END::> @121: IPv6 Table: B:C11::/32 => H.Encaps.red <B:C12:231:END::>
@231: My SID table: B:C12:231:END:: => Remove IPv6 header; @231: My SID table: B:C12:231:END:: => Remove IPv6 header;
Inner DA B:C11:2:DT4:: Inner DA B:C11:2:DT4::
@231: IPv6 Table B:C11:2::/48 => Forward via IS-ISv6 Flex-Algo @231: IPv6 Table B:C11:2::/48 => Forward via IS-ISv6 Flex-Algo
path to E2 path to E2
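The longest-prefix-match behavior used in the packet forwarding steps above can be sketched as follows. This is an illustration under stated assumptions: the symbolic SID B:C11:2:DT4:: is not a literal IPv6 address, so the sketch substitutes the hex stand-in b:c11:2:d4::, and the ::/0 entries are hypothetical default routes added only so that the match has competing candidates. The table contents are plain strings, not forwarding entries.

```python
import ipaddress

def longest_prefix_match(table: dict, destination: str):
    """Return (prefix, action) for the most specific matching prefix."""
    dst = ipaddress.IPv6Address(destination)
    best = None
    for prefix, action in table.items():
        net = ipaddress.IPv6Network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, action)
    return best

# Hex stand-in for the symbolic Service SID B:C11:2:DT4:: (assumption).
SERVICE_SID = "b:c11:2:d4::"

# Simplified IPv6 tables on 121 and 231; "::/0" entries are hypothetical.
table_121 = {
    "::/0": "hypothetical default route",
    "b:c11::/32": "H.Encaps.red <B:C12:231:END::>",
}
table_231 = {
    "::/0": "hypothetical default route",
    "b:c11:2::/48": "forward via IS-ISv6 Flex-Algo path to E2",
}
```

On 121 the Service SID matches only the summarized CAR prefix B:C11::/32, so no per-PE state is needed there; on 231 the same destination matches the more specific /48 locator of E2.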
skipping to change at line 3834 skipping to change at line 3831
route-based design (Section 7.1.2). The example is iBGP, but the route-based design (Section 7.1.2). The example is iBGP, but the
design also applies to eBGP (multi-AS). design also applies to eBGP (multi-AS).
* SR Policy (E2, C2) provides given intent in egress domain. * SR Policy (E2, C2) provides given intent in egress domain.
- SR Policy (E2, C2) with segments <B:01:z:END::, B:01:2:END::>, - SR Policy (E2, C2) with segments <B:01:z:END::, B:01:2:END::>,
where z is the node id in egress domain. where z is the node id in egress domain.
* Egress ABRs 231 and 232 redistribute SR Policy into BGP CAR Type-1 * Egress ABRs 231 and 232 redistribute SR Policy into BGP CAR Type-1
NLRI (E2, C2) to other domains, with SRv6 SID of End.B6 behavior. NLRI (E2, C2) to other domains, with SRv6 SID of End.B6 behavior.
This route is propagated to ingress PEs through Transport RR (TRR) This route is propagated to ingress PEs through TRR or inline with
or inline with next-hop-unchanged. next-hop-unchanged.
* The ABRs also advertise BGP CAR prefix route (B:C21::/32) * The ABRs also advertise BGP CAR prefix route (B:C21::/32)
summarizing locator part of SRv6 SIDs for SR policies of given summarizing locator part of SRv6 SIDs for SR policies of given
intent to different PEs in egress domain. BGP CAR prefix route intent to different PEs in egress domain. BGP CAR prefix route
propagates through border routers. At each BGP hop, BGP CAR propagates through BRs. At each BGP hop, BGP CAR prefix next-hop
prefix next-hop resolution triggers intra-domain transit SR Policy resolution triggers intra-domain transit SR Policy (C2, CAR next
(C2, CAR next hop). For example: hop). For example:
- SR Policy (231, C2) with segments <B:02:y:END::, - SR Policy (231, C2) with segments <B:02:y:END::,
B:02:231:END::>, and B:02:231:END::>, and
- SR Policy (121, C2) with segments <B:03:x:END::, - SR Policy (121, C2) with segments <B:03:x:END::,
B:03:121:END::>, B:03:121:END::>,
- where x and y are node ids within the respective domains. - where x and y are node ids within the respective domains.
* Egress PE E2 advertises a VPN route RD:V/v with Color-EC C2. * Egress PE E2 advertises a VPN route RD:V/v with Color-EC C2.
skipping to change at line 3985 skipping to change at line 3982
endpoints). endpoints).
CASE A: BGP data exchanged for MPLS (non-SR): CASE A: BGP data exchanged for MPLS (non-SR):
Consider 200 bytes of shared attributes Consider 200 bytes of shared attributes
CAR SAFI signals label in non-key TLV part of NLRI CAR SAFI signals label in non-key TLV part of NLRI
Each NLRI size for AFI 1 = 12(key) + 5(label) = 17 bytes Each NLRI size for AFI 1 = 12(key) + 5(label) = 17 bytes
Ideal packing: Ideal packing:
Number of NLRIs in 4k update size = 223 (4k-200/17) Number of NLRIs in 4k update size = 223 (4k-200/17)
Number of update messages of 4k size = 1.5 million/223 = 6726 Number of update messages of 4k size = 1.5 million/223 = 6726
Total BGP data on wire = 6726 * 4k = ~27.5MB Total BGP data on wire = 6726 * 4k = ~27.5 MB
Practical packing (5 routes in update message): Practical packing (5 routes in update message):
Size of update message = (17 * 5) + 200 = 285 Size of update message = (17 * 5) + 200 = 285
Total BGP data on wire = 285 * 300k = ~86MB Total BGP data on wire = 285 * 300k = ~86 MB
No-packing case (1 route per update message): No-packing case (1 route per update message):
Size of update message = 17 + 200 = 217 Size of update message = 17 + 200 = 217
Total BGP data on wire = 217 * 1.5 million = ~325MB Total BGP data on wire = 217 * 1.5 million = ~325 MB
SAFI 128 using encoding specified in RFC 8277 with label in NLRI SAFI 128 using encoding specified in RFC 8277 with label in NLRI
Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes
Ideal packing: Ideal packing:
Number of NLRIs in 4k update size = 237 (4k-200/16) Number of NLRIs in 4k update size = 237 (4k-200/16)
Number of update messages of 4k size = 1.5 million/237 = ~6330 Number of update messages of 4k size = 1.5 million/237 = ~6330
Total BGP data on wire = 6330 * 4k = ~25.9MB Total BGP data on wire = 6330 * 4k = ~25.9 MB
Practical packing (5 routes in update message): Practical packing (5 routes in update message):
Size of update message = (16 * 5) + 200 = 280 Size of update message = (16 * 5) + 200 = 280
Total BGP data on wire = 280 * 300k = ~84MB Total BGP data on wire = 280 * 300k = ~84 MB
No-packing case (1 route per update message): No-packing case (1 route per update message):
Size of update message = 16 + 200 = 216 Size of update message = 16 + 200 = 216
Total BGP data on wire = 216 * 1.5 million = ~324MB Total BGP data on wire = 216 * 1.5 million = ~324 MB
CASE B: BGP data exchanged for SR-MPLS label index: CASE B: BGP data exchanged for SR-MPLS label index:
Consider 200 bytes of shared attributes Consider 200 bytes of shared attributes
CAR SAFI signals label index in non-key TLV part of NLRI CAR SAFI signals label index in non-key TLV part of NLRI
Each NLRI size for AFI 1 Each NLRI size for AFI 1
= 12(key) + 5(label) + 9(Index) = 26 bytes = 12(key) + 5(label) + 9(Index) = 26 bytes
Ideal packing: Ideal packing:
Number of NLRIs in 4k update size = 146 (4k-200/26) Number of NLRIs in 4k update size = 146 (4k-200/26)
Number of update messages of 4k size = 1.5 million/146 = 10274 Number of update messages of 4k size = 1.5 million/146 = 10274
Total BGP data on wire = 10274 * 4k = ~42MB Total BGP data on wire = 10274 * 4k = ~42 MB
Practical packing (5 routes in update message) Practical packing (5 routes in update message)
Size of update message = (26 * 5) + 200 = 330 Size of update message = (26 * 5) + 200 = 330
Total BGP data on wire = 330 * 300k = ~99MB Total BGP data on wire = 330 * 300k = ~99 MB
No-packing case (1 route per update message) No-packing case (1 route per update message)
Size of update message = 26 + 200 = 226 Size of update message = 26 + 200 = 226
Total BGP data on wire = 226 * 1.5 million = ~339MB Total BGP data on wire = 226 * 1.5 million = ~339 MB
SAFI 128 using encoding specified in RFC 8277 with label in NLRI SAFI 128 using encoding specified in RFC 8277 with label in NLRI
Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes
Ideal packing: Ideal packing:
Not supported as label index is encoded in Prefix-SID Not supported as label index is encoded in Prefix-SID
attribute attribute
Practical packing (5 routes in update message): Practical packing (5 routes in update message):
Not supported as label index is encoded in Prefix-SID Not supported as label index is encoded in Prefix-SID
attribute attribute
No-packing case (1 route per update message): No-packing case (1 route per update message):
Size of update message = 16 + 210 = 226 Size of update message = 16 + 210 = 226
Total BGP data on wire = 226 * 1.5 million = ~339MB Total BGP data on wire = 226 * 1.5 million = ~339 MB
CASE C: BGP data exchanged with single 128-bit SRv6 SID: CASE C: BGP data exchanged with single 128-bit SRv6 SID:
Consider 200 bytes of shared attributes Consider 200 bytes of shared attributes
CAR SAFI signals SRv6 SID in non-key TLV part of NLRI CAR SAFI signals SRv6 SID in non-key TLV part of NLRI
Each NLRI size for AFI 1 = 12(key) + 18(SRv6 SID) = 30 bytes Each NLRI size for AFI 1 = 12(key) + 18(SRv6 SID) = 30 bytes
Ideal packing: Ideal packing:
Number of NLRIs in 4k update size = 126 (4k-200/30) Number of NLRIs in 4k update size = 126 (4k-200/30)
Number of update messages of 4k size = 1.5 million/126 = ~12k Number of update messages of 4k size = 1.5 million/126 = ~12k
Total BGP data on wire = 12k * 4k = ~49MB Total BGP data on wire = 12k * 4k = ~49 MB
Practical packing (5 routes in update message): Practical packing (5 routes in update message):
Size of update message Size of update message
= (30 * 5) + 236 (including Prefix-SID) = 386 = (30 * 5) + 236 (including Prefix-SID) = 386
Total BGP data on wire = 386 * 300k = ~115MB Total BGP data on wire = 386 * 300k = ~115 MB
No-packing case (1 route per update message): No-packing case (1 route per update message):
Size of update message = 12 + 236 (SID in Prefix-SID) = 252 Size of update message = 12 + 236 (SID in Prefix-SID) = 252
Total BGP data on wire = 252 * 1.5 million = ~378MB Total BGP data on wire = 252 * 1.5 million = ~378 MB
SAFI 128 using encoding specified in RFC 8277 with label in NLRI SAFI 128 using encoding specified in RFC 8277 with label in NLRI
(No transposition) (No transposition)
Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes
Ideal packing: Ideal packing:
Not supported as SRv6 SID is encoded in Prefix-SID Not supported as SRv6 SID is encoded in Prefix-SID
attribute attribute
Practical packing (5 routes in update message): Practical packing (5 routes in update message):
Not supported as SRv6 SID is encoded in Prefix-SID Not supported as SRv6 SID is encoded in Prefix-SID
attribute attribute
No-packing case (1 route per update message): No-packing case (1 route per update message):
Size of update message = 16 + 236 = 252 Size of update message = 16 + 236 = 252
Total BGP data on wire = 252 * 1.5 million = ~378MB Total BGP data on wire = 252 * 1.5 million = ~378 MB
BGP data exchanged with transposition of 4 bytes from SRv6 SID into BGP data exchanged with transposition of 4 bytes from SRv6 SID into
SRv6 SID TLV: SRv6 SID TLV:
Consider 200 bytes of shared attributes Consider 200 bytes of shared attributes
CAR SAFI signals SRv6 SID in non-key TLV part of NLRI CAR SAFI signals SRv6 SID in non-key TLV part of NLRI
Each NLRI size for AFI 1 = 12(key) + 6(SRv6 SID) = 18 bytes Each NLRI size for AFI 1 = 12(key) + 6(SRv6 SID) = 18 bytes
Ideal packing: Ideal packing:
Number of NLRIs in 4k update size = 211 (4k-200/18) Number of NLRIs in 4k update size = 211 (4k-200/18)
Number of update messages of 4k size = 1.5 million/211 = ~7110 Number of update messages of 4k size = 1.5 million/211 = ~7110
Total BGP data on wire = 7110 * 4k = ~29MB Total BGP data on wire = 7110 * 4k = ~29 MB
Practical packing (5 routes in update message): Practical packing (5 routes in update message):
Size of update message Size of update message
= (18 * 5) + 236 (including Prefix-SID) = 326 = (18 * 5) + 236 (including Prefix-SID) = 326
Total BGP data on wire = 326 * 300k = ~98MB Total BGP data on wire = 326 * 300k = ~98 MB
No-packing case (1 route per update message): No-packing case (1 route per update message):
Size of update message Size of update message
= 12 + 236 (SID in Prefix-SID attribute) = 252 = 12 + 236 (SID in Prefix-SID attribute) = 252
Total BGP data on wire = 252 * 1.5 million = ~378MB Total BGP data on wire = 252 * 1.5 million = ~378 MB
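The arithmetic in the cases above can be reproduced with a short sketch. The function names are hypothetical. One assumption is made explicit: the NLRI counts per update in the text appear consistent with "4k" taken as 4000 bytes, while the totals on the wire appear consistent with 4096-byte updates, so the sketch uses both values.

```python
ROUTES = 1_500_000       # 1.5 million routes, as in the text
PACKING_BUDGET = 4000    # "4k" when counting NLRIs per update (assumption)
UPDATE_SIZE = 4096       # "4k" when totalling bytes on the wire (assumption)

def ideal_packing(nlri_size: int, shared: int = 200):
    """Return (NLRIs per update, number of updates, total MB on the wire)."""
    per_update = (PACKING_BUDGET - shared) // nlri_size
    updates = ROUTES / per_update
    return per_update, updates, updates * UPDATE_SIZE / 1e6

def fixed_packing(nlri_size: int, routes_per_update: int, shared: int = 200):
    """Return (update size in bytes, total MB on the wire)."""
    size = nlri_size * routes_per_update + shared
    updates = ROUTES / routes_per_update
    return size, size * updates / 1e6
```

For example, `ideal_packing(17)` gives 223 NLRIs per update for the CAR SAFI MPLS case, and `fixed_packing(17, 5)` gives a 285-byte update for the practical packing case.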
Acknowledgements Acknowledgements
The authors would like to acknowledge the invaluable contributions of The authors would like to acknowledge the invaluable contributions of
many collaborators towards the BGP CAR solution and this document in many collaborators towards the BGP CAR solution and this document in
providing input about use cases, participating in brainstorming and providing input about use cases, participating in brainstorming and
mailing list discussions and in reviews of the solution and draft mailing list discussions and in reviews of the solution and draft
revisions. In addition to the contributors listed in the revisions. In addition to the contributors listed in the
Contributors section, the authors would like to thank Robert Raszuk, Contributors section, the authors would like to thank Robert Raszuk,
Bin Wen, Chaitanya Yadlapalli, Satoru Matsushima, Moses Nagarajah, Bin Wen, Chaitanya Yadlapalli, Satoru Matsushima, Moses Nagarajah,
 End of changes. 57 change blocks. 
92 lines changed or deleted 89 lines changed or added
