rfc9871v2.txt | rfc9871.txt | |||
---|---|---|---|---|
skipping to change at line 27 | skipping to change at line 27 | |||
This document describes the routing framework and BGP extensions to | This document describes the routing framework and BGP extensions to | |||
enable intent-aware routing using the BGP CAR solution. The solution | enable intent-aware routing using the BGP CAR solution. The solution | |||
defines two new BGP SAFIs (BGP CAR SAFI and BGP VPN CAR SAFI) for | defines two new BGP SAFIs (BGP CAR SAFI and BGP VPN CAR SAFI) for | |||
IPv4 and IPv6. It also defines an extensible Network Layer | IPv4 and IPv6. It also defines an extensible Network Layer | |||
Reachability Information (NLRI) model for both SAFIs that allows | Reachability Information (NLRI) model for both SAFIs that allows | |||
multiple NLRI types to be defined for different use cases. Each type | multiple NLRI types to be defined for different use cases. Each type | |||
of NLRI contains key and TLV-based non-key fields for efficient | of NLRI contains key and TLV-based non-key fields for efficient | |||
encoding of different per-prefix information. This specification | encoding of different per-prefix information. This specification | |||
defines two NLRI types: Color-Aware Route NLRI and IP Prefix NLRI. | defines two NLRI types: Color-Aware Route NLRI and IP Prefix NLRI. | |||
It defines non-key TLV types for the MPLS label stack, SR-MPLS Label | It defines non-key TLV types for the MPLS label stack, SR-MPLS label | |||
Index and Segment Routing over IPv6 (SRv6) Segment Identifiers | index, and Segment Routing over IPv6 (SRv6) Segment Identifiers | |||
(SIDs). This solution also defines a new Local Color Mapping (LCM) | (SIDs). This solution also defines a new Local Color Mapping (LCM) | |||
Extended Community. | Extended Community. | |||
Status of This Memo | Status of This Memo | |||
This document is not an Internet Standards Track specification; it is | This document is not an Internet Standards Track specification; it is | |||
published for examination, experimental implementation, and | published for examination, experimental implementation, and | |||
evaluation. | evaluation. | |||
This document defines an Experimental Protocol for the Internet | This document defines an Experimental Protocol for the Internet | |||
skipping to change at line 305 | skipping to change at line 305 | |||
(Section 8 of [RFC9256]), in this document (service route) | (Section 8 of [RFC9256]), in this document (service route) | |||
steering is used to describe the mapping of the traffic for a | steering is used to describe the mapping of the traffic for a | |||
service route onto a BGP CAR path. In contrast, the term | service route onto a BGP CAR path. In contrast, the term | |||
resolution is preserved for the mapping of an inter-domain BGP CAR | resolution is preserved for the mapping of an inter-domain BGP CAR | |||
route on an intra-domain color-aware path. | route on an intra-domain color-aware path. | |||
Service steering: | Service steering: | |||
Service route maps traffic to a BGP CAR path (or other color- | Service route maps traffic to a BGP CAR path (or other color- | |||
aware path, e.g., SR Policy). If a color-aware path is not | aware path, e.g., SR Policy). If a color-aware path is not | |||
available, local policy may map to a color-unaware routing/TE | available, local policy may map to a color-unaware routing/TE | |||
path (e.g., BGP LU, RSVP-TE, IGP/LDP). The service steering | path (e.g., BGP-LU, RSVP-TE, IGP/LDP). The service steering | |||
concept is agnostic to the transport technology used. | concept is agnostic to the transport technology used. | |||
Section 3 describes the specific service steering mechanisms | Section 3 describes the specific service steering mechanisms | |||
leveraged for MPLS, SR-MPLS, and SRv6. | leveraged for MPLS, SR-MPLS, and SRv6. | |||
Intra-domain resolution: | Intra-domain resolution: | |||
BGP CAR route maps to an intra-domain color-aware path (e.g., | BGP CAR route maps to an intra-domain color-aware path (e.g., | |||
SR Policy, IGP Flexible Algorithm, BGP CAR) or a color-unaware | SR Policy, IGP Flexible Algorithm, BGP CAR) or a color-unaware | |||
routing/TE path (e.g., RSVP-TE, IGP/LDP, BGP-LU). | routing/TE path (e.g., RSVP-TE, IGP/LDP, BGP-LU). | |||
Transport network: | Transport network: | |||
skipping to change at line 454 | skipping to change at line 454 | |||
- W/w is steered on a color-aware path provided by SR Policy | - W/w is steered on a color-aware path provided by SR Policy | |||
* Seamless interworking of BGP CAR and SR Policy | * Seamless interworking of BGP CAR and SR Policy | |||
- V/v is steered on a BGP CAR path that is itself resolved within | - V/v is steered on a BGP CAR path that is itself resolved within | |||
domain 2 onto an SR Policy bound to the color of V/v | domain 2 onto an SR Policy bound to the color of V/v | |||
Other properties: | Other properties: | |||
* MPLS data-plane: with 300k PEs and 5 colors, the BGP CAR solution | * MPLS data plane: with 300k PEs and 5 colors, the BGP CAR solution | |||
ensures that no single node needs to support a data-plane scaling | ensures that no single node needs to support a data plane scaling | |||
in the order of Remote PE * C (Section 5). This would otherwise | in the order of Remote PE * C (Section 5). This would otherwise | |||
exceed the MPLS data-plane. | exceed the MPLS data plane. | |||
* Control-plane: a node should not install a (E, C) path if it's not | * Control plane: a node should not install a (E, C) path if it's not | |||
participating in that color-aware path. | participating in that color-aware path. | |||
* Incongruent color-intent mapping: the solution supports the | * Incongruent color-intent mapping: the solution supports the | |||
signaling of a BGP CAR route across different color domains | signaling of a BGP CAR route across different color domains | |||
(Section 2.8). | (Section 2.8). | |||
The key benefits of this model are: | The key benefits of this model are: | |||
* Leverage of the BGP Color-EC [RFC9012] to color service routes | * Leverage of the BGP Color-EC [RFC9012] to color service routes | |||
* The definition of the automated service steering: a C-colored | * The definition of the automated service steering: a C-colored | |||
service route V/v from E2 is steered onto a color-aware path (E2, | service route V/v from E2 is steered onto a color-aware path (E2, | |||
C) | C) | |||
* The definition of the data model of a BGP CAR path: (E, C) | * The definition of the data model of a BGP CAR path: (E, C) | |||
- Natural extension of BGP IP/LU data model (E) | - Natural extension of BGP-IP/BGP-LU data model (E) | |||
- Consistent with SR Policy data model | - Consistent with SR Policy data model | |||
* The definition of the recursive resolution of a BGP CAR route: a | * The definition of the recursive resolution of a BGP CAR route: a | |||
BGP CAR (E2, C) route via N is resolved onto the color-aware path | BGP CAR (E2, C) route via N is resolved onto the color-aware path | |||
(N, C), which may itself be provided by BGP CAR or via another | (N, C), which may itself be provided by BGP CAR or via another | |||
color-aware routing solution (e.g., SR Policy, IGP Flexible | color-aware routing solution (e.g., SR Policy, IGP Flexible | |||
Algorithm) | Algorithm) | |||
* Explicit definitions for multiple transport encapsulations (e.g., | * Explicit definitions for multiple transport encapsulations (e.g., | |||
skipping to change at line 578 | skipping to change at line 578 | |||
2.3. BGP CAR Route Origination | 2.3. BGP CAR Route Origination | |||
A BGP CAR route may be originated locally (e.g., loopback) or through | A BGP CAR route may be originated locally (e.g., loopback) or through | |||
redistribution of an (E, C) color-aware path provided by another | redistribution of an (E, C) color-aware path provided by another | |||
routing solution (e.g., SR Policy, IGP Flexible Algorithm, RSVP-TE, | routing solution (e.g., SR Policy, IGP Flexible Algorithm, RSVP-TE, | |||
BGP-LU [RFC8277]). | BGP-LU [RFC8277]). | |||
2.4. BGP CAR Route Validation | 2.4. BGP CAR Route Validation | |||
A BGP CAR path (E, C) via next hop N with encapsulation T is valid if | A BGP CAR path (E, C) via next hop N with encapsulation T is valid if | |||
color-aware path (N, C) exists with encapsulation T available in | color-aware path (N, C) exists with encapsulation T available in data | |||
data-plane. | plane. | |||
A local policy may customize the validation process: | A local policy may customize the validation process: | |||
* The color constraint in the first check may be relaxed. If N is | * The color constraint in the first check may be relaxed. If N is | |||
reachable via alternate color(s) or in the default routing table, | reachable via alternate color(s) or in the default routing table, | |||
the route may be considered valid. | the route may be considered valid. | |||
* The data-plane availability constraint of T may be relaxed to use | * The data plane availability constraint of T may be relaxed to use | |||
an alternate encapsulation. | an alternate encapsulation. | |||
* A performance-measurement verification may be added to ensure that | * A performance-measurement verification may be added to ensure that | |||
the intent associated with C is met (e.g., delay < bound). | the intent associated with C is met (e.g., delay < bound). | |||
A path that is not valid MUST NOT be considered for BGP best path | A path that is not valid MUST NOT be considered for BGP best path | |||
selection. | selection. | |||
2.5. BGP CAR Route Resolution | 2.5. BGP CAR Route Resolution | |||
skipping to change at line 631 | skipping to change at line 631 | |||
domain, an egress node selects and advertises an SRv6 SID from its | domain, an egress node selects and advertises an SRv6 SID from its | |||
locator for intent C1, with a BGP CAR route. In such a case, the | locator for intent C1, with a BGP CAR route. In such a case, the | |||
ingress node resolves the received SRv6 SID over an IPv6 route for | ingress node resolves the received SRv6 SID over an IPv6 route for | |||
the intent-aware locator of the egress node for C1 or a summary | the intent-aware locator of the egress node for C1 or a summary | |||
route that covers the locator. This summary route may be provided | route that covers the locator. This summary route may be provided | |||
by SRv6 Flexible Algorithm or BGP CAR IP Prefix route itself | by SRv6 Flexible Algorithm or BGP CAR IP Prefix route itself | |||
(e.g., Appendix C.2). | (e.g., Appendix C.2). | |||
* Local policy may map the CAR route to mechanisms that are unaware | * Local policy may map the CAR route to mechanisms that are unaware | |||
of color or that provide best-effort, such as RSVP-TE, IGP/LDP, | of color or that provide best-effort, such as RSVP-TE, IGP/LDP, | |||
BGP LU/IP (e.g., Appendix A.3.2) for brownfield scenarios. | BGP-LU/BGP-IP (e.g., Appendix A.3.2) for brownfield scenarios. | |||
Route resolution via a different color C2 can be automated by | Route resolution via a different color C2 can be automated by | |||
attaching BGP Color-EC C2 to CAR route (E2, C1), leveraging automated | attaching BGP Color-EC C2 to CAR route (E2, C1), leveraging automated | |||
steering as described in Section 8.4 of "Segment Routing Policy | steering as described in Section 8.4 of "Segment Routing Policy | |||
Architecture" [RFC9256] for BGP CAR routes. This mechanism is | Architecture" [RFC9256] for BGP CAR routes. This mechanism is | |||
illustrated in Appendix B.2. This mechanism SHOULD be supported. | illustrated in Appendix B.2. This mechanism SHOULD be supported. | |||
For CAR route resolution, if Color-EC color is present with the | For CAR route resolution, if Color-EC color is present with the | |||
route, it takes precedence over the route's intent color. The | route, it takes precedence over the route's intent color. The | |||
route’s intent color is the LCM-EC color if present (see | route’s intent color is the LCM-EC color if present (see | |||
skipping to change at line 704 | skipping to change at line 704 | |||
AIGP updates. | AIGP updates. | |||
Additional AIGP extensions may be defined to signal state for | Additional AIGP extensions may be defined to signal state for | |||
specific use cases such as Maximum SID Depth (MSD) along the BGP CAR | specific use cases such as Maximum SID Depth (MSD) along the BGP CAR | |||
route advertisement and minimum MTU along the BGP CAR route | route advertisement and minimum MTU along the BGP CAR route | |||
advertisement. This is out of scope for this document. | advertisement. This is out of scope for this document. | |||
2.7. Inherent Multipath Capability | 2.7. Inherent Multipath Capability | |||
The (E, C) route definition inherently provides availability of | The (E, C) route definition inherently provides availability of | |||
redundant paths at every BGP hop identical to BGP-LU or BGP IP. For | redundant paths at every BGP hop identical to BGP-LU or BGP-IP. For | |||
instance, BGP CAR routes originated by two or more egress ABRs in a | instance, BGP CAR routes originated by two or more egress ABRs in a | |||
domain are advertised as multiple paths to ingress ABRs in the | domain are advertised as multiple paths to ingress ABRs in the | |||
domain, where they become equal-cost or primary-backup paths. A | domain, where they become equal-cost or primary-backup paths. A | |||
failure of an egress ABR is detected and handled by ingress ABRs | failure of an egress ABR is detected and handled by ingress ABRs | |||
locally within the domain for faster convergence, without any | locally within the domain for faster convergence, without any | |||
necessity to propagate the event to upstream nodes for traffic | necessity to propagate the event to upstream nodes for traffic | |||
restoration. | restoration. | |||
BGP ADD-PATH [RFC7911] SHOULD be enabled for BGP CAR to signal | BGP ADD-PATH [RFC7911] SHOULD be enabled for BGP CAR to signal | |||
multiple next hops through a transport RR. | multiple next hops through a TRR. | |||
2.8. BGP CAR Signaling Through Different Color Domains | 2.8. BGP CAR Signaling Through Different Color Domains | |||
[Color Domain 1 A]-----[B Color Domain 2 E2] | [Color Domain 1 A]-----[B Color Domain 2 E2] | |||
[C1=low delay ] [C2=low delay ] | [C1=low delay ] [C2=low delay ] | |||
Let us assume a BGP CAR route (E2, C2) is signaled from B to A, two | Let us assume a BGP CAR route (E2, C2) is signaled from B to A, two | |||
border routers of Domain 2 and Domain 1, respectively. Let us assume | BRs of Domain 2 and Domain 1, respectively. Let us assume that these | |||
that these two domains do not share the same color-to-intent mapping | two domains do not share the same color-to-intent mapping (i.e., they | |||
(i.e., they belong to different color domains). Low delay in Domain | belong to different color domains). Low delay in Domain 2 is color | |||
2 is color C2, while it is C1 in Domain 1 (C1 <> C2). | C2, while it is C1 in Domain 1 (C1 <> C2). | |||
It is not expected to be a typical scenario to have an underlay | It is not expected to be a typical scenario to have an underlay | |||
transport path (e.g., an MPLS LSP) extend across different color | transport path (e.g., an MPLS LSP) extend across different color | |||
domains. However, the BGP CAR solution seamlessly supports this rare | domains. However, the BGP CAR solution seamlessly supports this rare | |||
scenario while maintaining the separation and independence of the | scenario while maintaining the separation and independence of the | |||
administrative authority in different color domains. | administrative authority in different color domains. | |||
The solution works as described below: | The solution works as described below: | |||
* Within Domain 2, the BGP CAR route is (E2, C2) via E2. | * Within Domain 2, the BGP CAR route is (E2, C2) via E2. | |||
skipping to change at line 786 | skipping to change at line 786 | |||
resolution and steering. | resolution and steering. | |||
* In the rare case of color incongruence, the local color encoded in | * In the rare case of color incongruence, the local color encoded in | |||
LCM-EC takes precedence. | LCM-EC takes precedence. | |||
Operational considerations are in Section 11. Further illustrations | Operational considerations are in Section 11. Further illustrations | |||
are provided in Appendix B. | are provided in Appendix B. | |||
2.9. Format and Encoding | 2.9. Format and Encoding | |||
BGP CAR leverages BGP multi-protocol extensions [RFC4760] and uses | BGP CAR leverages BGP multiprotocol extensions [RFC4760] and uses the | |||
the MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route updates | MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route updates within | |||
within SAFI value 83 along with AFI 1 for IPv4 prefixes and AFI 2 for | SAFI value 83 along with AFI 1 for IPv4 prefixes and AFI 2 for IPv6 | |||
IPv6 prefixes. | prefixes. | |||
BGP speakers MUST use the BGP Capabilities Advertisement to ensure | BGP speakers MUST use the BGP Capabilities Advertisement to ensure | |||
support for processing of BGP CAR updates. This is done as specified | support for processing of BGP CAR updates. This is done as specified | |||
in [RFC4760], by using capability code 1 (multi-protocol BGP), with | in [RFC4760], by using capability code 1 (multiprotocol BGP), with | |||
AFI 1 and 2 (as required) and SAFI 83. | AFI 1 and 2 (as required) and SAFI 83. | |||
The Next Hop network address field in the MP_REACH_NLRI may either be | The Next Hop network address field in the MP_REACH_NLRI may either be | |||
an IPv4 address or an IPv6 address, independent of AFI. If the next | an IPv4 address or an IPv6 address, independent of AFI. If the next | |||
hop length is 4, then the next hop is an IPv4 address. The next hop | hop length is 4, then the next hop is an IPv4 address. The next hop | |||
length may be 16 or 32 for an IPv6 next hop address, set as per | length may be 16 or 32 for an IPv6 next hop address, set as per | |||
Section 3 of [RFC2545]. Processing of the Next Hop field is governed | Section 3 of [RFC2545]. Processing of the Next Hop field is governed | |||
by standard BGP procedures as described in Section 3 of [RFC4760]. | by standard BGP procedures as described in Section 3 of [RFC4760]. | |||
The sub-sections below specify the generic encoding of the BGP CAR | The sub-sections below specify the generic encoding of the BGP CAR | |||
skipping to change at line 1334 | skipping to change at line 1334 | |||
relied upon to extract the key and perform 'treat-as-withdraw' for | relied upon to extract the key and perform 'treat-as-withdraw' for | |||
malformed information. | malformed information. | |||
A sender MUST ensure that the NLRI and key lengths are the number of | A sender MUST ensure that the NLRI and key lengths are the number of | |||
actual bytes encoded in the NLRI and key fields, respectively, | actual bytes encoded in the NLRI and key fields, respectively, | |||
regardless of content being encoded. | regardless of content being encoded. | |||
Given the NLRI length and Key length MUST be valid, failures in the | Given the NLRI length and Key length MUST be valid, failures in the | |||
following checks result in 'AFI/SAFI disable' or 'session reset': | following checks result in 'AFI/SAFI disable' or 'session reset': | |||
* The minimum NLRI length MUST be at least 2, as key length and NLRI | * The minimum NLRI Length MUST be at least 2, as Key Length and NLRI | |||
type are required fields. | Type are required fields. | |||
* The Key Length MUST be at least 2 less than NLRI Length. | * The Key Length MUST be at least 2 less than NLRI Length. | |||
NLRI type-specific error handling: | NLRI type-specific error handling: | |||
* By default, a speaker SHOULD discard an unrecognized or | * By default, a speaker SHOULD discard an unrecognized or | |||
unsupported NLRI type and move to the next NLRI. | unsupported NLRI type and move to the next NLRI. | |||
* Key length and key errors of a known NLRI type SHOULD result in | * Key length and key errors of a known NLRI type SHOULD result in | |||
the discard of NLRI similar to an unrecognized NLRI type. (This | the discard of NLRI similar to an unrecognized NLRI type. (This | |||
skipping to change at line 1453 | skipping to change at line 1453 | |||
routes from upstream routers or Route Reflectors (RRs) to limit the | routes from upstream routers or Route Reflectors (RRs) to limit the | |||
routes that it needs to learn. On-demand subscription and automated | routes that it needs to learn. On-demand subscription and automated | |||
filtering procedures for individual CAR routes are outside the scope | filtering procedures for individual CAR routes are outside the scope | |||
of this document. | of this document. | |||
5. Scaling | 5. Scaling | |||
This section analyzes the key scale requirement of [INTENT-AWARE], | This section analyzes the key scale requirement of [INTENT-AWARE], | |||
specifically: | specifically: | |||
* No intermediate node data-plane should need to scale to (Colors * | * No intermediate node data plane should need to scale to (Colors * | |||
PEs). | PEs). | |||
* No node should learn and install a BGP CAR route to (E, C) if it | * No node should learn and install a BGP CAR route to (E, C) if it | |||
does not install a colored service route to E. | does not install a colored service route to E. | |||
While the requirements and design principles generally apply to any | While the requirements and design principles generally apply to any | |||
transport, the logical analysis based on the network design in this | transport, the logical analysis based on the network design in this | |||
section focuses on MPLS/SR-MPLS transport since the scaling | section focuses on MPLS/SR-MPLS transport since the scaling | |||
constraints are specifically relevant to these technologies. BGP CAR | constraints are specifically relevant to these technologies. BGP CAR | |||
SAFI is used here, but the considerations can apply to [RFC8277] or | SAFI is used here, but the considerations can apply to [RFC8277] or | |||
skipping to change at line 1522 | skipping to change at line 1522 | |||
* Each domain has Flex-Algo 128. Prefix-SID for a node is Segment | * Each domain has Flex-Algo 128. Prefix-SID for a node is Segment | |||
Routing Global Block (SRGB) 168000 plus node number. | Routing Global Block (SRGB) 168000 plus node number. | |||
* A BGP CAR route (E2, C1) is advertised by egress BRM node 451. | * A BGP CAR route (E2, C1) is advertised by egress BRM node 451. | |||
The route is sourced locally from redistribution from IGP Flex- | The route is sourced locally from redistribution from IGP Flex- | |||
Algo 128. | Algo 128. | |||
* Not shown for simplicity, node 452 will also advertise (E2, C1). | * Not shown for simplicity, node 452 will also advertise (E2, C1). | |||
* When a transport RR is used within the domain or across domains, | * When a TRR is used within the domain or across domains, ADD-PATH | |||
ADD-PATH is enabled to advertise paths from both egress BRs to its | is enabled to advertise paths from both egress BRs to its clients. | |||
clients. | ||||
* Egress PE E2 advertises a VPN route RD:V/v with BGP Color-EC C1 | * Egress PE E2 advertises a VPN route RD:V/v with BGP Color-EC C1 | |||
that propagates via service RRs to ingress PE E1. | that propagates via service RRs to ingress PE E1. | |||
* E1 steers V/v prefix via color-aware path (E2, C1) and VPN label | * E1 steers V/v prefix via color-aware path (E2, C1) and VPN label | |||
30030. | 30030. | |||
5.2. Deployment Model | 5.2. Deployment Model | |||
5.2.1. Flat | 5.2.1. Flat | |||
skipping to change at line 1636 | skipping to change at line 1635 | |||
* Each BGP hop allocates local label and programs swap entry in | * Each BGP hop allocates local label and programs swap entry in | |||
forwarding for (451, C1). | forwarding for (451, C1). | |||
* 121 resolves received BGP CAR route (451, C1) via 231 (label | * 121 resolves received BGP CAR route (451, C1) via 231 (label | |||
168451) on color-aware path (231, C1). | 168451) on color-aware path (231, C1). | |||
- Color-aware path (231, C1) is Flex-Algo 128 path to 231 (label | - Color-aware path (231, C1) is Flex-Algo 128 path to 231 (label | |||
168231). | 168231). | |||
* 451 advertises BGP CAR route (E2, C1) via 451 to transport RR | * 451 advertises BGP CAR route (E2, C1) via 451 to TRR T-RR2, which | |||
T-RR2, which reflects it to transport RR T-RR1, which reflects it | reflects it to TRR T-RR1, which reflects it to 121. | |||
to 121. | ||||
* 121 receives BGP CAR route (E2, C1) via 451 with label 168002. | * 121 receives BGP CAR route (E2, C1) via 451 with label 168002. | |||
- Let's assume 121 selects that path. | - Let's assume 121 selects that path. | |||
* 121 resolves BGP CAR route (E2, C1) via 451 on color-aware path | * 121 resolves BGP CAR route (E2, C1) via 451 on color-aware path | |||
(451, C1). | (451, C1). | |||
- Color-aware path (451, C1) is BGP CAR path to 451 (label | - Color-aware path (451, C1) is BGP CAR path to 451 (label | |||
168451). | 168451). | |||
skipping to change at line 1720 | skipping to change at line 1718 | |||
Figure 5: Hierarchical BGP Transport CAR, Next-Hop-Unchanged | Figure 5: Hierarchical BGP Transport CAR, Next-Hop-Unchanged | |||
(NHU) at iBR | (NHU) at iBR | |||
* Nodes 341, 231, and 121 receive and resolve BGP CAR route (451, | * Nodes 341, 231, and 121 receive and resolve BGP CAR route (451, | |||
C1) the same as in the previous model. | C1) the same as in the previous model. | |||
* Node 121 allocates local label and programs swap entry in | * Node 121 allocates local label and programs swap entry in | |||
forwarding for (451, C1). | forwarding for (451, C1). | |||
* 451 advertises BGP CAR route (E2, C1) to transport RR T-RR2, which | * 451 advertises BGP CAR route (E2, C1) to TRR T-RR2, which reflects | |||
reflects it to transport RR T-RR1, which reflects it to 121. | it to TRR T-RR1, which reflects it to 121. | |||
* Node 121 advertises (E2, C1) to E1 with next hop as 451 (i.e., | * Node 121 advertises (E2, C1) to E1 with next hop as 451 (i.e., | |||
next-hop-unchanged). | next-hop-unchanged). | |||
* 121 also advertises (451, C1) to E1 with next-hop-self (121) and | * 121 also advertises (451, C1) to E1 with next-hop-self (121) and | |||
label 168451. | label 168451. | |||
* E1 resolves BGP CAR route (451, C1) via 121 on color-aware path | * E1 resolves BGP CAR route (451, C1) via 121 on color-aware path | |||
(121, C1). | (121, C1). | |||
skipping to change at line 1764 | skipping to change at line 1762 | |||
* Nodes 121, 231, and 341 perform swap operation on 168451 bound to | * Nodes 121, 231, and 341 perform swap operation on 168451 bound to | |||
(451, C1). | (451, C1). | |||
* 451 performs swap operation on 168002 bound to color-aware path | * 451 performs swap operation on 168002 bound to color-aware path | |||
(E2, C1). | (E2, C1). | |||
5.3. Scale Analysis | 5.3. Scale Analysis | |||
The following two tables summarize the logically analyzed scaling of | The following two tables summarize the logically analyzed scaling of | |||
the control-plane and data-plane for the previous three models: | the control plane and data plane for the previous three models: | |||
+=======+=====================+=====================+=============+ | +=======+=====================+=====================+=============+ | |||
| | E1 | 121 | 231 | | | | E1 | 121 | 231 | | |||
+=======+=====================+=====================+=============+ | +=======+=====================+=====================+=============+ | |||
| FLAT | (E2,C) via (121,C) | (E2,C) via (231,C) | (E2,C) via | | | FLAT | (E2,C) via (121,C) | (E2,C) via (231,C) | (E2,C) via | | |||
| | | | (341,C) | | | | | | (341,C) | | |||
+=======+---------------------+---------------------+-------------+ | +=======+---------------------+---------------------+-------------+ | |||
| H.NHS | (E2,C) via (121,C) | (E2,C) via (451,C) | (451,C) via | | | H.NHS | (E2,C) via (121,C) | (E2,C) via (451,C) | (451,C) via | | |||
| | | (451,C) via (231,C) | (341,C) | | | | | (451,C) via (231,C) | (341,C) | | |||
+=======+---------------------+---------------------+-------------+ | +=======+---------------------+---------------------+-------------+ | |||
skipping to change at line 1806 | skipping to change at line 1804 | |||
+=======+------------+------------------+------------------+ | +=======+------------+------------------+------------------+ | |||
Table 2 | Table 2 | |||
* The flat model is the simplest design, with a single BGP transport | * The flat model is the simplest design, with a single BGP transport | |||
level. It results in the minimum label/SID stack at each BGP hop. | level. It results in the minimum label/SID stack at each BGP hop. | |||
However, it significantly increases the scale impact on the core | However, it significantly increases the scale impact on the core | |||
BRs (e.g., 341), whose FIB capacity and even MPLS label space may | BRs (e.g., 341), whose FIB capacity and even MPLS label space may | |||
be exceeded. | be exceeded. | |||
- 341's data-plane scales with (E2, C) where there may be 300k Es | - 341's data plane scales with (E2, C) where there may be 300k Es | |||
and 5 Cs, hence 1.5M entries > 1M MPLS data-plane. | and 5 Cs, hence 1.5M entries > 1M MPLS data plane. | |||
* The hierarchical models avoid the need for core BRs to learn | * The hierarchical models avoid the need for core BRs to learn | |||
routes and install label forwarding entries for (E, C) routes. | routes and install label forwarding entries for (E, C) routes. | |||
- Whether next hop is set to self or left unchanged at 121, 341's | - Whether next hop is set to self or left unchanged at 121, 341's | |||
data-plane scales with (451, C) where there may be thousands of | data plane scales with (451, C) where there may be thousands of | |||
451s and 5 Cs. Therefore, this scaling is well under the 1 | 451s and 5 Cs. Therefore, this scaling is well under the 1 | |||
million MPLS labels data-plane limit. | million MPLS labels data plane limit. | |||
- They also aid faster convergence by allowing the PE routes to | - They also aid faster convergence by allowing the PE routes to | |||
be distributed via out-of-band RRs that can be scaled | be distributed via out-of-band RRs that can be scaled | |||
independent of the transport BRs. | independent of the transport BRs. | |||
* The next-hop-self option at ingress BRM (e.g., 121) hides the | * The next-hop-self option at ingress BRM (e.g., 121) hides the | |||
hierarchical design from the ingress PE, keeping its outgoing | hierarchical design from the ingress PE, keeping its outgoing | |||
label programming as simple as the flat model. However, the | label programming as simple as the flat model. However, the | |||
ingress BRM requires an additional BGP transport level recursion, | ingress BRM requires an additional BGP transport level recursion, | |||
which coupled with load-balancing adds data-plane complexity. It | which coupled with load-balancing adds data plane complexity. It | |||
needs to support a swap and push operation. It also needs to | needs to support a swap and push operation. It also needs to | |||
install label forwarding entries for the egress PEs that are of | install label forwarding entries for the egress PEs that are of | |||
interest to its local ingress PEs. | interest to its local ingress PEs. | |||
* With the next-hop-unchanged option at ingress BRM (e.g., 121), | * With the next-hop-unchanged option at ingress BRM (e.g., 121), | |||
only an ingress PE needs to learn and install output label entries | only an ingress PE needs to learn and install output label entries | |||
for egress (E, C) routes. The ingress BRM only installs label | for egress (E, C) routes. The ingress BRM only installs label | |||
forwarding entries for the egress ABR (e.g., 451). However, the | forwarding entries for the egress ABR (e.g., 451). However, the | |||
ingress PE needs an additional BGP transport level recursion and | ingress PE needs an additional BGP transport level recursion and | |||
pushes a BGP VPN label and two BGP transport labels. It may also | pushes a BGP VPN label and two BGP transport labels. It may also | |||
need to handle load-balancing for the egress ABRs. This is the | need to handle load-balancing for the egress ABRs. This is the | |||
most complex data-plane option for the ingress PE. | most complex data plane option for the ingress PE. | |||
5.4. Anycast SID | 5.4. Anycast SID | |||
This section describes how Anycast SID complements and improves the | This section describes how Anycast SID complements and improves the | |||
scaling designs above. | scaling designs above. | |||
5.4.1. Anycast SID for Transit Inter-Domain Nodes | 5.4.1. Anycast SID for Transit Inter-Domain Nodes | |||
* Redundant BRs (e.g., two egress BRMs, 451 and 452) advertise BGP | * Redundant BRs (e.g., two egress BRMs, 451 and 452) advertise BGP | |||
CAR routes for a local PE (e.g., E2) with the same SID (based on | CAR routes for a local PE (e.g., E2) with the same SID (based on | |||
skipping to change at line 2194 ¶ | skipping to change at line 2192 ¶ | |||
with existing operational usage, the CAR IP Prefix route is allowed | with existing operational usage, the CAR IP Prefix route is allowed | |||
to be without color for best-effort. In this case, the routes will | to be without color for best-effort. In this case, the routes will | |||
not carry an LCM-EC. Resolution is described in Section 2.5. | not carry an LCM-EC. Resolution is described in Section 2.5. | |||
As described in Section 7.3, infrastructure prefixes are intended to | As described in Section 7.3, infrastructure prefixes are intended to | |||
be carried in CAR SAFI instead of SAFIs that also carry service | be carried in CAR SAFI instead of SAFIs that also carry service | |||
routes such as BGP-IP (SAFI 1, [RFC4271]) and BGP-LU (SAFI 4, | routes such as BGP-IP (SAFI 1, [RFC4271]) and BGP-LU (SAFI 4, | |||
[RFC4798]). However, if such infrastructure routes are also | [RFC4798]). However, if such infrastructure routes are also | |||
distributed in these SAFIs, a router may receive both BGP CAR SAFI | distributed in these SAFIs, a router may receive both BGP CAR SAFI | |||
paths and IP/LU SAFI paths. By default, the CAR SAFI transport path | paths and IP/LU SAFI paths. By default, the CAR SAFI transport path | |||
is preferred over the BGP IP or BGP-LU SAFI path. | is preferred over the BGP-IP or BGP-LU SAFI path. | |||
A BGP transport CAR speaker that supports packet forwarding lookup | A BGP transport CAR speaker that supports packet forwarding lookup | |||
based on the IPv6 prefix route (such as a BR) will set itself as next | based on the IPv6 prefix route (such as a BR) will set itself as next | |||
hop while advertising the route to peers. It will also install the | hop while advertising the route to peers. It will also install the | |||
IPv6 route into forwarding with the received next hop and/or | IPv6 route into forwarding with the received next hop and/or | |||
encapsulation. If such a transit router does not support this route | encapsulation. If such a transit router does not support this route | |||
type, it will not install this route and will not set itself as next | type, it will not install this route and will not set itself as next | |||
hop; hence, it will not propagate the route any further. | hop; hence, it will not propagate the route any further. | |||
9. VPN CAR | 9. VPN CAR | |||
skipping to change at line 2289 ¶ | skipping to change at line 2287 ¶ | |||
CAR routes distributed in VPN CAR SAFI are infrastructure routes | CAR routes distributed in VPN CAR SAFI are infrastructure routes | |||
advertised by CEs in different customer VRFs on a PE. Example use | advertised by CEs in different customer VRFs on a PE. Example use | |||
cases are intent-aware L3VPN Carriers' Carriers (Section 9 of | cases are intent-aware L3VPN Carriers' Carriers (Section 9 of | |||
[RFC4364]) and SRv6 over a provider network. The VPN RD | [RFC4364]) and SRv6 over a provider network. The VPN RD | |||
distinguishes CAR routes of different customers being advertised by | distinguishes CAR routes of different customers being advertised by | |||
the PE. | the PE. | |||
9.1. Format and Encoding | 9.1. Format and Encoding | |||
BGP VPN CAR SAFI leverages BGP multi-protocol extensions [RFC4760] | BGP VPN CAR SAFI leverages BGP multiprotocol extensions [RFC4760] and | |||
and uses the MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route | uses the MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route | |||
updates within SAFI value 84 along with AFI 1 for IPv4 VPN CAR | updates within SAFI value 84 along with AFI 1 for IPv4 VPN CAR | |||
prefixes and AFI 2 for IPv6 VPN CAR prefixes. | prefixes and AFI 2 for IPv6 VPN CAR prefixes. | |||
BGP speakers MUST use the BGP Capabilities Advertisement to ensure | BGP speakers MUST use the BGP Capabilities Advertisement to ensure | |||
support for processing of BGP VPN CAR updates. This is done as | support for processing of BGP VPN CAR updates. This is done as | |||
specified in [RFC4760], by using capability code 1 (multi-protocol | specified in [RFC4760], by using capability code 1 (multiprotocol | |||
BGP), with AFI 1 and 2 (as required) and SAFI 84. | BGP), with AFI 1 and 2 (as required) and SAFI 84. | |||
The Next Hop network address field in the MP_REACH_NLRI may contain | The Next Hop network address field in the MP_REACH_NLRI may contain | |||
either a VPN-IPv4 or a VPN-IPv6 address with 8-octet RD set to zero, | either a VPN-IPv4 or a VPN-IPv6 address with 8-octet RD set to zero, | |||
independent of AFI. If the next hop length is 12, then the next hop | independent of AFI. If the next hop length is 12, then the next hop | |||
is a VPN-IPv4 address with an RD of 0 constructed as per [RFC4364]. | is a VPN-IPv4 address with an RD of 0 constructed as per [RFC4364]. | |||
If the next hop length is 24 or 48, then the next hop is a VPN-IPv6 | If the next hop length is 24 or 48, then the next hop is a VPN-IPv6 | |||
address constructed as per Section 3.2.1.1 of [RFC4659]. | address constructed as per Section 3.2.1.1 of [RFC4659]. | |||
9.1.1. VPN CAR (E, C) NLRI Type | 9.1.1. VPN CAR (E, C) NLRI Type | |||
skipping to change at line 2819 ¶ | skipping to change at line 2817 ¶ | |||
* The following description applies to the reference topology above: | * The following description applies to the reference topology above: | |||
- IGP Flex-Algo 128 is running in each domain, and mapped to | - IGP Flex-Algo 128 is running in each domain, and mapped to | |||
color C1. | color C1. | |||
- Egress PE E2 advertises a VPN route RD:V/v colored with Color- | - Egress PE E2 advertises a VPN route RD:V/v colored with Color- | |||
EC C1 to steer traffic to BGP transport CAR (E2, C1). VPN | EC C1 to steer traffic to BGP transport CAR (E2, C1). VPN | |||
route propagates via service RRs to ingress PE E1. | route propagates via service RRs to ingress PE E1. | |||
- BGP CAR route (E2, C1) with next hop, label index, and label as | - BGP CAR route (E2, C1) with next hop, label index, and label as | |||
shown above is advertised through border routers in each | shown above is advertised through BRs in each domain. When an | |||
domain. When an RR is used in the domain, ADD-PATH is enabled | RR is used in the domain, ADD-PATH is enabled to advertise | |||
to advertise multiple available paths. | multiple available paths. | |||
- On each BGP hop, the (E2, C1) route's next hop is resolved over | - On each BGP hop, the (E2, C1) route's next hop is resolved over | |||
IGP Flex-Algo 128 of the domain. The AIGP attribute influences | IGP Flex-Algo 128 of the domain. The AIGP attribute influences | |||
the BGP CAR route best path decision as per [RFC7311]. The BGP | the BGP CAR route best path decision as per [RFC7311]. The BGP | |||
CAR label swap entry is installed that goes over Flex-Algo 128 | CAR label swap entry is installed that goes over Flex-Algo 128 | |||
LSP to next hop providing intent in each IGP domain. The AIGP | LSP to next hop providing intent in each IGP domain. The AIGP | |||
metric should be updated to reflect Flex-Algo 128 metric to | metric should be updated to reflect Flex-Algo 128 metric to | |||
next hop. | next hop. | |||
- Ingress PE E1 learns CAR route (E2, C1). It steers colored VPN | - Ingress PE E1 learns CAR route (E2, C1). It steers colored VPN | |||
skipping to change at line 2905 ¶ | skipping to change at line 2903 ¶ | |||
o SR Policy (C1, 231) segments <S2, 231>, and | o SR Policy (C1, 231) segments <S2, 231>, and | |||
o SR Policy (C1, E2) segments <S3, E2>. | o SR Policy (C1, E2) segments <S3, E2>. | |||
- Egress PE E2 advertises a VPN route RD:V/v colored with Color- | - Egress PE E2 advertises a VPN route RD:V/v colored with Color- | |||
EC C1 to steer traffic to BGP transport CAR (E2, C1). VPN | EC C1 to steer traffic to BGP transport CAR (E2, C1). VPN | |||
route propagates via service RRs to ingress PE E1. | route propagates via service RRs to ingress PE E1. | |||
- BGP CAR route (E2, C1) with next hop, label index, and label as | - BGP CAR route (E2, C1) with next hop, label index, and label as | |||
shown above is advertised through border routers in each | shown above is advertised through BRs in each domain. When an | |||
domain. When an RR is used in the domain, ADD-PATH is enabled | RR is used in the domain, ADD-PATH is enabled to advertise | |||
to advertise multiple available paths. | multiple available paths. | |||
- On each BGP hop, the CAR route (E2, C1) next hop is resolved | - On each BGP hop, the CAR route (E2, C1) next hop is resolved | |||
over an SR Policy (C1, next hop). The BGP CAR label swap entry | over an SR Policy (C1, next hop). The BGP CAR label swap entry | |||
is installed that goes over SR Policy segment list. | is installed that goes over SR Policy segment list. | |||
- Ingress PE E1 learns CAR route (E2, C1). It steers colored VPN | - Ingress PE E1 learns CAR route (E2, C1). It steers colored VPN | |||
route RD:V/v into (E2, C1). | route RD:V/v into (E2, C1). | |||
* Important: | * Important: | |||
skipping to change at line 2978 ¶ | skipping to change at line 2976 ¶ | |||
* The following description applies to the reference topology above: | * The following description applies to the reference topology above: | |||
- IGP Flex-Algo 128 is only enabled in core (e.g., WAN network), | - IGP Flex-Algo 128 is only enabled in core (e.g., WAN network), | |||
mapped to C1. Access network domain only has Base Algo 0. | mapped to C1. Access network domain only has Base Algo 0. | |||
- Egress PE E2 advertises a VPN route RD:V/v colored with Color- | - Egress PE E2 advertises a VPN route RD:V/v colored with Color- | |||
EC C1 to steer traffic via BGP transport CAR (E2, C1). VPN | EC C1 to steer traffic via BGP transport CAR (E2, C1). VPN | |||
route propagates via service RRs to ingress PE E1. | route propagates via service RRs to ingress PE E1. | |||
- BGP CAR route (E2, C1) with next hop, label index, and label as | - BGP CAR route (E2, C1) with next hop, label index, and label as | |||
shown above is advertised through border routers in each | shown above is advertised through BRs in each domain. When an | |||
domain. When an RR is used in the domain, ADD-PATH is enabled | RR is used in the domain, ADD-PATH is enabled to advertise | |||
to advertise multiple available paths. | multiple available paths. | |||
- Local policy on 231 and 232 maps intent C1 to resolve CAR route | - Local policy on 231 and 232 maps intent C1 to resolve CAR route | |||
next hop over IGP Base Algo 0 in right access domain. The BGP | next hop over IGP Base Algo 0 in right access domain. The BGP | |||
CAR label swap entry is installed that goes over Base Algo 0 | CAR label swap entry is installed that goes over Base Algo 0 | |||
LSP to next hop. AIGP metric is updated to reflect Base Algo 0 | LSP to next hop. AIGP metric is updated to reflect Base Algo 0 | |||
metric to next hop with an additional penalty (+1000). | metric to next hop with an additional penalty (+1000). | |||
- On 121 and 122, CAR route (E2, C1) next hop learnt from Core | - On 121 and 122, CAR route (E2, C1) next hop learnt from Core | |||
domain is resolved over IGP Flex-Algo 128. The BGP CAR label | domain is resolved over IGP Flex-Algo 128. The BGP CAR label | |||
swap entry is installed that goes over Flex-Algo 128 LSP to | swap entry is installed that goes over Flex-Algo 128 LSP to | |||
skipping to change at line 3058 ¶ | skipping to change at line 3056 ¶ | |||
- RSVP-TE MPLS tunnel mesh is configured only in core (e.g., WAN | - RSVP-TE MPLS tunnel mesh is configured only in core (e.g., WAN | |||
network). Access only has IS-IS/LDP. (The figure does not | network). Access only has IS-IS/LDP. (The figure does not | |||
show all TE tunnels.) | show all TE tunnels.) | |||
- Egress PE E2 advertises a VPN route RD:V/v colored with Color- | - Egress PE E2 advertises a VPN route RD:V/v colored with Color- | |||
EC C1 to steer traffic via BGP transport CAR (E2, C1). VPN | EC C1 to steer traffic via BGP transport CAR (E2, C1). VPN | |||
route propagates via service RRs to ingress PE E1. | route propagates via service RRs to ingress PE E1. | |||
- BGP CAR route (E2, C1) with next hops and labels as shown above | - BGP CAR route (E2, C1) with next hops and labels as shown above | |||
is advertised through border routers in each domain. When an | is advertised through BRs in each domain. When an RR is used | |||
RR is used in the domain, ADD-PATH is enabled to advertise | in the domain, ADD-PATH is enabled to advertise multiple | |||
multiple available paths. | available paths. | |||
- Local policy on 231 and 232 maps intent C1 to resolve CAR route | - Local policy on 231 and 232 maps intent C1 to resolve CAR route | |||
next hop over best-effort LDP LSP in access domain 1. The BGP | next hop over best-effort LDP LSP in access domain 1. The BGP | |||
CAR label swap entry is installed that goes over LDP LSP to | CAR label swap entry is installed that goes over LDP LSP to | |||
next hop. AIGP metric is updated to reflect best-effort metric | next hop. AIGP metric is updated to reflect best-effort metric | |||
to next hop with an additional penalty (+1000). | to next hop with an additional penalty (+1000). | |||
- Local policy on 121 and 122 maps intent C1 to resolve CAR route | - Local policy on 121 and 122 maps intent C1 to resolve CAR route | |||
next hop in Core domain over RSVP-TE tunnels. The BGP CAR | next hop in Core domain over RSVP-TE tunnels. The BGP CAR | |||
label swap entry is installed that goes over a TE tunnel to | label swap entry is installed that goes over a TE tunnel to | |||
skipping to change at line 3092 ¶ | skipping to change at line 3090 ¶ | |||
- Dynamic BGP CAR label carries intent from PEs, which is | - Dynamic BGP CAR label carries intent from PEs, which is | |||
realized in Core domain by resolution via RSVP-TE tunnel. | realized in Core domain by resolution via RSVP-TE tunnel. | |||
A.4. Transit Network Domains That Do Not Support CAR | A.4. Transit Network Domains That Do Not Support CAR | |||
* In a brownfield deployment, color-aware paths between two PEs may | * In a brownfield deployment, color-aware paths between two PEs may | |||
need to go through a transit domain that does not support CAR. | need to go through a transit domain that does not support CAR. | |||
Examples of such a brownfield network include an MPLS LDP network | Examples of such a brownfield network include an MPLS LDP network | |||
with IGP best-effort, or a multi-domain network based on BGP-LU. | with IGP best-effort, or a multi-domain network based on BGP-LU. | |||
An MPLS LDP network with best-effort IGP can adopt the above | An MPLS LDP network with best-effort IGP can adopt the above | |||
scheme in Appendix A.3. Below is the example scenario for BGP LU. | scheme in Appendix A.3. Below is the example scenario for BGP-LU. | |||
* Reference topology: | * Reference topology: | |||
E1 --- BR1 --- BR2 ......... BR3 ---- BR4 --- E2 | E1 --- BR1 --- BR2 ......... BR3 ---- BR4 --- E2 | |||
Ci <----LU----> Ci | Ci <----LU----> Ci | |||
Figure 10: BGP CAR Not Supported in Transit Domain | Figure 10: BGP CAR Not Supported in Transit Domain | |||
- Network between BR2 and BR3 comprises multiple BGP-LU hops | - Network between BR2 and BR3 comprises multiple BGP-LU hops | |||
(over IGP-LDP domains). | (over IGP-LDP domains). | |||
skipping to change at line 3215 ¶ | skipping to change at line 3213 ¶ | |||
different domains. | different domains. | |||
A.6. Per-Flow Steering over CAR Routes | A.6. Per-Flow Steering over CAR Routes | |||
This section provides an example of ingress PE per-flow steering as | This section provides an example of ingress PE per-flow steering as | |||
defined in Section 8.6 of [RFC9256] onto BGP CAR routes. | defined in Section 8.6 of [RFC9256] onto BGP CAR routes. | |||
The following description applies to the reference topology in | The following description applies to the reference topology in | |||
Figure 6: | Figure 6: | |||
* Ingress PE E1 learns best-effort BGP LU route E2. | * Ingress PE E1 learns best-effort BGP-LU route E2. | |||
* Ingress PE E1 learns CAR route (E2, C1); C1 is mapped to "low | * Ingress PE E1 learns CAR route (E2, C1); C1 is mapped to "low | |||
delay". | delay". | |||
* Ingress PE E1 learns CAR route (E2, C2); C2 is mapped to "low | * Ingress PE E1 learns CAR route (E2, C2); C2 is mapped to "low | |||
delay and avoid resource R". | delay and avoid resource R". | |||
* Ingress PE E1 is configured to instantiate an array of paths to E2 | * Ingress PE E1 is configured to instantiate an array of paths to E2 | |||
where entry 0 is the BGP LU path to next hop, color C1 is the | where entry 0 is the BGP-LU path to next hop, color C1 is the | |||
first entry, and color C2 is the second entry. The index into the | first entry, and color C2 is the second entry. The index into the | |||
array is called a Forwarding Class (FC). The index can have | array is called a Forwarding Class (FC). The index can have | |||
values 0 to 7, especially when derived from the MPLS TC bits | values 0 to 7, especially when derived from the MPLS TC bits | |||
[RFC5462]. | [RFC5462]. | |||
* E1 is configured to match flows in its ingress interfaces (upon | * E1 is configured to match flows in its ingress interfaces (upon | |||
any field such as Ethernet destination/source/VLAN/TOS or IP | any field such as Ethernet destination/source/VLAN/TOS or IP | |||
destination/source/DSCP or transport ports, etc.) and color them | destination/source/DSCP or transport ports, etc.) and color them | |||
with an internal per-packet FC variable (0, 1, or 2 in this | with an internal per-packet FC variable (0, 1, or 2 in this | |||
example). | example). | |||
skipping to change at line 3640 ¶ | skipping to change at line 3638 ¶ | |||
- Similarly, Prefix B:C12::/32 summarizes Flex-Algo 128 block in | - Similarly, Prefix B:C12::/32 summarizes Flex-Algo 128 block in | |||
AS2. | AS2. | |||
- Per Flex-Algo external subnets for eBGP next hops IP1 and IP2 | - Per Flex-Algo external subnets for eBGP next hops IP1 and IP2 | |||
are distributed in IS-IS within AS2. | are distributed in IS-IS within AS2. | |||
* BGP CAR prefix route B:C11::/32 with LCM C1 is originated by AS1 | * BGP CAR prefix route B:C11::/32 with LCM C1 is originated by AS1 | |||
BRs 231 and 232 on eBGP sessions to AS2 BRs 121 and 122. | BRs 231 and 232 on eBGP sessions to AS2 BRs 121 and 122. | |||
* ASBRs 121 and 122 propagate the route in AS2 to all the P, ABRs, | * ASBRs 121 and 122 propagate the route in AS2 to all the P, ABRs, | |||
and PEs through transport RR. | and PEs through TRR. | |||
* Every router in AS2 resolves BGP CAR prefix B:C11::/32 next hops | * Every router in AS2 resolves BGP CAR prefix B:C11::/32 next hops | |||
IP1 and IP2 in IS-ISv6 Flex-Algo 128 and programs B:C11::/32 | IP1 and IP2 in IS-ISv6 Flex-Algo 128 and programs B:C11::/32 | |||
prefix in global IPv6 forwarding table. | prefix in global IPv6 forwarding table. | |||
* AIGP attribute influences BGP CAR route best path decision. | * AIGP attribute influences BGP CAR route best path decision. | |||
* Egress PE E2 advertises a VPN route RD:V/v with SRv6 Service SID | * Egress PE E2 advertises a VPN route RD:V/v with SRv6 Service SID | |||
B:C11:2:DT4::. Service SID is allocated by E2 from its locator of | B:C11:2:DT4::. Service SID is allocated by E2 from its locator of | |||
color C1 intent. | color C1 intent. | |||
skipping to change at line 3746 ¶ | skipping to change at line 3744 ¶ | |||
domain for the given intent. Node locators in the egress | domain for the given intent. Node locators in the egress | |||
domain are sub-allocated from the block. | domain are sub-allocated from the block. | |||
- Prefix B:C12::/32 summarizes Flex-Algo 128 block in transit | - Prefix B:C12::/32 summarizes Flex-Algo 128 block in transit | |||
domain. | domain. | |||
- Prefix B:C13::/32 summarizes Flex-Algo 128 block in ingress | - Prefix B:C13::/32 summarizes Flex-Algo 128 block in ingress | |||
domain. | domain. | |||
* BGP CAR route B:C11::/32 is originated by ABRs 231 and 232 with | * BGP CAR route B:C11::/32 is originated by ABRs 231 and 232 with | |||
LCM C1. Along the propagation path, border routers set next-hop- | LCM C1. Along the propagation path, BRs set next-hop-self and | |||
self and appropriately update the intra-domain encapsulation | appropriately update the intra-domain encapsulation information | |||
information for the C1 intent. For example, 231 and 121 signal | for the C1 intent. For example, 231 and 121 signal SRv6 SID of | |||
SRv6 SID of End behavior [RFC8986] allocated from their respective | End behavior [RFC8986] allocated from their respective locators | |||
locators for the C1 intent. (Note: IGP Flexible Algorithm is | for the C1 intent. (Note: IGP Flexible Algorithm is shown for | |||
shown for intra-domain path, but SR Policy may also provide the | intra-domain path, but SR Policy may also provide the path as | |||
path as shown in Appendix C.3.) | shown in Appendix C.3.) | |||
* AIGP attribute influences BGP CAR route best path decision. | * AIGP attribute influences BGP CAR route best path decision. | |||
* Egress PE E2 advertises a VPN route RD:V/v with SRv6 Service SID | * Egress PE E2 advertises a VPN route RD:V/v with SRv6 Service SID | |||
B:C11:2:DT4::. Service SID is allocated by E2 from its locator of | B:C11:2:DT4::. Service SID is allocated by E2 from its locator of | |||
color C1 intent. | color C1 intent. | |||
* Ingress PE E1 learns CAR route B:C11::/32 and VPN route RD:V/v | * Ingress PE E1 learns CAR route B:C11::/32 and VPN route RD:V/v | |||
with SRv6 SID B:C11:2:DT4::. | with SRv6 SID B:C11:2:DT4::. | |||
skipping to change at line 3774 ¶ | skipping to change at line 3772 ¶ | |||
steered along IPv6 routed path provided by BGP CAR IP Prefix route | steered along IPv6 routed path provided by BGP CAR IP Prefix route | |||
to locator B:C11::/32. | to locator B:C11::/32. | |||
Important: | Important: | |||
* Uses longest prefix match of SRv6 Service SID to BGP CAR prefix. | * Uses longest prefix match of SRv6 Service SID to BGP CAR prefix. | |||
There is no mapping labels/SIDs; there is simple IP-based | There is no mapping labels/SIDs; there is simple IP-based | |||
forwarding instead. | forwarding instead. | |||
* Originating domain PE locators of the given intent can be | * Originating domain PE locators of the given intent can be | |||
summarized on transit BGP hops eliminating per PE state on border | summarized on transit BGP hops eliminating per PE state on BRs. | |||
routers. | ||||
Packet forwarding: | Packet forwarding: | |||
@E1: IPv4 VRF V/v => H.Encaps.red <B:C13:121:END::, B:C11:2:DT4::> | @E1: IPv4 VRF V/v => H.Encaps.red <B:C13:121:END::, B:C11:2:DT4::> | |||
@121: My SID table: B:C13:121:END:: => Update DA with B:C11:2:DT4:: | @121: My SID table: B:C13:121:END:: => Update DA with B:C11:2:DT4:: | |||
@121: IPv6 Table: B:C11::/32 => H.Encaps.red <B:C12:231:END::> | @121: IPv6 Table: B:C11::/32 => H.Encaps.red <B:C12:231:END::> | |||
@231: My SID table: B:C12:231:END:: => Remove IPv6 header; | @231: My SID table: B:C12:231:END:: => Remove IPv6 header; | |||
Inner DA B:C11:2:DT4:: | Inner DA B:C11:2:DT4:: | |||
@231: IPv6 Table B:C11:2::/48 => Forward via IS-ISv6 Flex-Algo | @231: IPv6 Table B:C11:2::/48 => Forward via IS-ISv6 Flex-Algo | |||
path to E2 | path to E2 | |||
skipping to change at line 3834 ¶ | skipping to change at line 3831 ¶ | |||
route-based design (Section 7.1.2). The example is iBGP, but the | route-based design (Section 7.1.2). The example is iBGP, but the | |||
design also applies to eBGP (multi-AS). | design also applies to eBGP (multi-AS). | |||
* SR Policy (E2, C2) provides given intent in egress domain. | * SR Policy (E2, C2) provides given intent in egress domain. | |||
- SR Policy (E2, C2) with segments <B:01:z:END::, B:01:2:END::>, | - SR Policy (E2, C2) with segments <B:01:z:END::, B:01:2:END::>, | |||
where z is the node id in egress domain. | where z is the node id in egress domain. | |||
* Egress ABRs 231 and 232 redistribute SR Policy into BGP CAR Type-1 | * Egress ABRs 231 and 232 redistribute SR Policy into BGP CAR Type-1 | |||
NLRI (E2, C2) to other domains, with SRv6 SID of End.B6 behavior. | NLRI (E2, C2) to other domains, with SRv6 SID of End.B6 behavior. | |||
This route is propagated to ingress PEs through Transport RR (TRR) | This route is propagated to ingress PEs through TRR or inline with | |||
or inline with next-hop-unchanged. | next-hop-unchanged. | |||
* The ABRs also advertise BGP CAR prefix route (B:C21::/32) | * The ABRs also advertise BGP CAR prefix route (B:C21::/32) | |||
summarizing locator part of SRv6 SIDs for SR policies of given | summarizing locator part of SRv6 SIDs for SR policies of given | |||
intent to different PEs in egress domain. BGP CAR prefix route | intent to different PEs in egress domain. BGP CAR prefix route | |||
propagates through border routers. At each BGP hop, BGP CAR | propagates through BRs. At each BGP hop, BGP CAR prefix next-hop | |||
prefix next-hop resolution triggers intra-domain transit SR Policy | resolution triggers intra-domain transit SR Policy (C2, CAR next | |||
(C2, CAR next hop). For example: | hop). For example: | |||
- SR Policy (231, C2) with segments <B:02:y:END::, | - SR Policy (231, C2) with segments <B:02:y:END::, | |||
B:02:231:END::>, and | B:02:231:END::>, and | |||
- SR Policy (121, C2) with segments <B:03:x:END::, | - SR Policy (121, C2) with segments <B:03:x:END::, | |||
B:03:121:END::>, | B:03:121:END::>, | |||
- where x and y are node ids within the respective domains. | - where x and y are node ids within the respective domains. | |||
* Egress PE E2 advertises a VPN route RD:V/v with Color-EC C2. | * Egress PE E2 advertises a VPN route RD:V/v with Color-EC C2. | |||
skipping to change at line 3985 ¶ | skipping to change at line 3982 ¶ | |||
endpoints). | endpoints). | |||
CASE A: BGP data exchanged for MPLS (non-SR): | CASE A: BGP data exchanged for MPLS (non-SR): | |||
Consider 200 bytes of shared attributes | Consider 200 bytes of shared attributes | |||
CAR SAFI signals label in non-key TLV part of NLRI | CAR SAFI signals label in non-key TLV part of NLRI | |||
Each NLRI size for AFI 1 = 12(key) + 5(label) = 17 bytes | Each NLRI size for AFI 1 = 12(key) + 5(label) = 17 bytes | |||
Ideal packing: | Ideal packing: | |||
Number of NLRIs in 4k update size = 223 ((4k-200)/17) | Number of NLRIs in 4k update size = 223 ((4k-200)/17) | |||
Number of update messages of 4k size = 1.5 million/223 = 6726 | Number of update messages of 4k size = 1.5 million/223 = 6726 | |||
Total BGP data on wire = 6726 * 4k = ~27.5MB | Total BGP data on wire = 6726 * 4k = ~27.5 MB | |||
Practical packing (5 routes in update message): | Practical packing (5 routes in update message): | |||
Size of update message = (17 * 5) + 200 = 285 | Size of update message = (17 * 5) + 200 = 285 | |||
Total BGP data on wire = 285 * 300k = ~86MB | Total BGP data on wire = 285 * 300k = ~86 MB | |||
No-packing case (1 route per update message): | No-packing case (1 route per update message): | |||
Size of update message = 17 + 200 = 217 | Size of update message = 17 + 200 = 217 | |||
Total BGP data on wire = 217 * 1.5 million = ~325MB | Total BGP data on wire = 217 * 1.5 million = ~325 MB | |||
SAFI 128 using encoding specified in RFC 8277 with label in NLRI | SAFI 128 using encoding specified in RFC 8277 with label in NLRI | |||
Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes | Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes | |||
Ideal packing: | Ideal packing: | |||
Number of NLRIs in 4k update size = 237 ((4k-200)/16) | Number of NLRIs in 4k update size = 237 ((4k-200)/16) | |||
Number of update messages of 4k size = 1.5 million/237 = ~6330 | Number of update messages of 4k size = 1.5 million/237 = ~6330 | |||
Total BGP data on wire = 6330 * 4k = ~25.9MB | Total BGP data on wire = 6330 * 4k = ~25.9 MB | |||
Practical packing (5 routes in update message): | Practical packing (5 routes in update message): | |||
Size of update message = (16 * 5) + 200 = 280 | Size of update message = (16 * 5) + 200 = 280 | |||
Total BGP data on wire = 280 * 300k = ~84MB | Total BGP data on wire = 280 * 300k = ~84 MB | |||
No-packing case (1 route per update message): | No-packing case (1 route per update message): | |||
Size of update message = 16 + 200 = 216 | Size of update message = 16 + 200 = 216 | |||
Total BGP data on wire = 216 * 1.5 million = ~324MB | Total BGP data on wire = 216 * 1.5 million = ~324 MB | |||
CASE B: BGP data exchanged for SR-MPLS label index: | CASE B: BGP data exchanged for SR-MPLS label index: | |||
Consider 200 bytes of shared attributes | Consider 200 bytes of shared attributes | |||
CAR SAFI signals label index in non-key TLV part of NLRI | CAR SAFI signals label index in non-key TLV part of NLRI | |||
Each NLRI size for AFI 1 | Each NLRI size for AFI 1 | |||
= 12(key) + 5(label) + 9(Index) = 26 bytes | = 12(key) + 5(label) + 9(Index) = 26 bytes | |||
Ideal packing: | Ideal packing: | |||
Number of NLRIs in 4k update size = 146 ((4k-200)/26) | Number of NLRIs in 4k update size = 146 ((4k-200)/26) | |||
Number of update messages of 4k size = 1.5 million/146 = 10274 | Number of update messages of 4k size = 1.5 million/146 = 10274 | |||
Total BGP data on wire = 10274 * 4k = ~42MB | Total BGP data on wire = 10274 * 4k = ~42 MB | |||
Practical packing (5 routes in update message) | Practical packing (5 routes in update message) | |||
Size of update message = (26 * 5) + 200 = 330 | Size of update message = (26 * 5) + 200 = 330 | |||
Total BGP data on wire = 330 * 300k = ~99MB | Total BGP data on wire = 330 * 300k = ~99 MB | |||
No-packing case (1 route per update message) | No-packing case (1 route per update message) | |||
Size of update message = 26 + 200 = 226 | Size of update message = 26 + 200 = 226 | |||
Total BGP data on wire = 226 * 1.5 million = ~339MB | Total BGP data on wire = 226 * 1.5 million = ~339 MB | |||
SAFI 128 using encoding specified in RFC 8277 with label in NLRI | SAFI 128 using encoding specified in RFC 8277 with label in NLRI | |||
Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes | Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes | |||
Ideal packing: | Ideal packing: | |||
Not supported as label index is encoded in Prefix-SID | Not supported as label index is encoded in Prefix-SID | |||
attribute | attribute | |||
Practical packing (5 routes in update message): | Practical packing (5 routes in update message): | |||
Not supported as label index is encoded in Prefix-SID | Not supported as label index is encoded in Prefix-SID | |||
attribute | attribute | |||
No-packing case (1 route per update message): | No-packing case (1 route per update message): | |||
Size of update message = 16 + 210 = 226 | Size of update message = 16 + 210 = 226 | |||
Total BGP data on wire = 226 * 1.5 million = ~339MB | Total BGP data on wire = 226 * 1.5 million = ~339 MB | |||
CASE C: BGP data exchanged with 128-bit single SRv6 SID: | CASE C: BGP data exchanged with 128-bit single SRv6 SID: | |||
Consider 200 bytes of shared attributes | Consider 200 bytes of shared attributes | |||
CAR SAFI signals SRv6 SID in non-key TLV part of NLRI | CAR SAFI signals SRv6 SID in non-key TLV part of NLRI | |||
Each NLRI size for AFI 1 = 12(key) + 18(SRv6 SID) = 30 bytes | Each NLRI size for AFI 1 = 12(key) + 18(SRv6 SID) = 30 bytes | |||
Ideal packing: | Ideal packing: | |||
Number of NLRIs in 4k update size = 126 ((4k-200)/30) | Number of NLRIs in 4k update size = 126 ((4k-200)/30) | |||
Number of update messages of 4k size = 1.5 million/126 = ~12k | Number of update messages of 4k size = 1.5 million/126 = ~12k | |||
Total BGP data on wire = 12k * 4k = ~49MB | Total BGP data on wire = 12k * 4k = ~49 MB | |||
Practical packing (5 routes in update message): | Practical packing (5 routes in update message): | |||
Size of update message | Size of update message | |||
= (30 * 5) + 236 (including Prefix-SID) = 386 | = (30 * 5) + 236 (including Prefix-SID) = 386 | |||
Total BGP data on wire = 386 * 300k = ~115MB | Total BGP data on wire = 386 * 300k = ~115 MB | |||
No-packing case (1 route per update message): | No-packing case (1 route per update message): | |||
Size of update message = 12 + 236 (SID in Prefix-SID) = 252 | Size of update message = 12 + 236 (SID in Prefix-SID) = 252 | |||
Total BGP data on wire = 252 * 1.5 million = ~378MB | Total BGP data on wire = 252 * 1.5 million = ~378 MB | |||
SAFI 128 using encoding specified in RFC 8277 with label in NLRI | SAFI 128 using encoding specified in RFC 8277 with label in NLRI | |||
(No transposition) | (No transposition) | |||
Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes | Each NLRI size for AFI 1 = 13(key) + 3(label) = 16 bytes | |||
Ideal packing: | Ideal packing: | |||
Not supported as SRv6 SID is encoded in Prefix-SID | Not supported as SRv6 SID is encoded in Prefix-SID | |||
attribute | attribute | |||
Practical packing (5 routes in update message): | Practical packing (5 routes in update message): | |||
Not supported as SRv6 SID is encoded in Prefix-SID | Not supported as SRv6 SID is encoded in Prefix-SID | |||
attribute | attribute | |||
No-packing case (1 route per update message): | No-packing case (1 route per update message): | |||
Size of update message = 16 + 236 = 252 | Size of update message = 16 + 236 = 252 | |||
Total BGP data on wire = 252 * 1.5 million = ~378MB | Total BGP data on wire = 252 * 1.5 million = ~378 MB | |||
BGP data exchanged with transposition of 4 bytes from SRv6 SID into | BGP data exchanged with transposition of 4 bytes from SRv6 SID into | |||
SRv6 SID TLV: | SRv6 SID TLV: | |||
Consider 200 bytes of shared attributes | Consider 200 bytes of shared attributes | |||
CAR SAFI signals SRv6 SID in non-key TLV part of NLRI | CAR SAFI signals SRv6 SID in non-key TLV part of NLRI | |||
Each NLRI size for AFI 1 = 12(key) + 6(SRv6 SID) = 18 bytes | Each NLRI size for AFI 1 = 12(key) + 6(SRv6 SID) = 18 bytes | |||
Ideal packing: | Ideal packing: | |||
Number of NLRIs in 4k update size = 211 ((4k-200)/18) | Number of NLRIs in 4k update size = 211 ((4k-200)/18) | |||
Number of update messages of 4k size = 1.5 million/211 = ~7110 | Number of update messages of 4k size = 1.5 million/211 = ~7110 | |||
Total BGP data on wire = 7110 * 4k = ~29MB | Total BGP data on wire = 7110 * 4k = ~29 MB | |||
Practical packing (5 routes in update message): | Practical packing (5 routes in update message): | |||
Size of update message | Size of update message | |||
= (18 * 5) + 236 (including Prefix-SID) = 326 | = (18 * 5) + 236 (including Prefix-SID) = 326 | |||
Total BGP data on wire = 326 * 300k = ~98MB | Total BGP data on wire = 326 * 300k = ~98 MB | |||
No-packing case (1 route per update message): | No-packing case (1 route per update message): | |||
Size of update message | Size of update message | |||
= 12 + 236 (SID in Prefix-SID attribute) = 252 | = 12 + 236 (SID in Prefix-SID attribute) = 252 | |||
Total BGP data on wire = 252 * 1.5 million = ~378MB | Total BGP data on wire = 252 * 1.5 million = ~378 MB | |||
Acknowledgements | Acknowledgements | |||
The authors would like to acknowledge the invaluable contributions of | The authors would like to acknowledge the invaluable contributions of | |||
many collaborators towards the BGP CAR solution and this document in | many collaborators towards the BGP CAR solution and this document in | |||
providing input about use cases, participating in brainstorming and | providing input about use cases, participating in brainstorming and | |||
mailing list discussions and in reviews of the solution and draft | mailing list discussions and in reviews of the solution and draft | |||
revisions. In addition to the contributors listed in the | revisions. In addition to the contributors listed in the | |||
Contributors section, the authors would like to thank Robert Raszuk, | Contributors section, the authors would like to thank Robert Raszuk, | |||
Bin Wen, Chaitanya Yadlapalli, Satoru Matsushima, Moses Nagarajah, | Bin Wen, Chaitanya Yadlapalli, Satoru Matsushima, Moses Nagarajah, | |||
End of changes. 57 change blocks. | ||||
92 lines changed or deleted | 89 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. |
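
The packing arithmetic in CASE B and CASE C of the scaling comparison can be checked with a short script. The following is a minimal sketch and not part of the RFC or of the diff: the function names are hypothetical, and it assumes that "4k" means 4000 bytes when counting NLRIs per update and 4096 bytes when totalling bytes on the wire, which is the reading under which the quoted figures (146, 126, 211 NLRIs and ~42, ~49, ~29 MB) come out.

```python
import math

ROUTES = 1_500_000  # route count used throughout the comparison


def ideal_packing(nlri_size, shared_attrs=200, update_size=4000, wire_size=4096):
    """Return (NLRIs per update, number of updates, total bytes on wire).

    Assumption: update_size=4000 for the NLRI count and wire_size=4096 for
    the total, which is the reading that reproduces the quoted figures.
    """
    per_update = (update_size - shared_attrs) // nlri_size
    updates = math.ceil(ROUTES / per_update)
    return per_update, updates, updates * wire_size


def fixed_packing(nlri_size, routes_per_update, attrs=200):
    """Return (update size, number of updates, total bytes on wire)."""
    size = nlri_size * routes_per_update + attrs
    updates = math.ceil(ROUTES / routes_per_update)
    return size, updates, size * updates


if __name__ == "__main__":
    # CASE B, CAR SAFI with label index: 26-byte NLRI
    print(ideal_packing(26))          # (146, 10274, 42082304)
    print(fixed_packing(26, 5))       # (330, 300000, 99000000)
    print(fixed_packing(26, 1))       # (226, 1500000, 339000000)
    # CASE C, CAR SAFI with full SRv6 SID: 30-byte NLRI, 236 bytes of attributes
    print(fixed_packing(30, 5, 236))  # (386, 300000, 115800000)
```

Dividing the last element of each tuple by one million gives the approximate MB figures listed in the comparison.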