| rfc9957v1.txt | rfc9957.txt | |||
|---|---|---|---|---|
| skipping to change at line 15 ¶ | skipping to change at line 15 ¶ | |||
| ISSN: 2070-1721 CableLabs | ISSN: 2070-1721 CableLabs | |||
| April 2026 | April 2026 | |||
| The DOCSISĀ® Queue Protection Algorithm to Preserve Low Latency | The DOCSISĀ® Queue Protection Algorithm to Preserve Low Latency | |||
| Abstract | Abstract | |||
| This Informational RFC explains the specification of the queue | This Informational RFC explains the specification of the queue | |||
| protection algorithm used in Data-Over-Cable Service Interface | protection algorithm used in Data-Over-Cable Service Interface | |||
| Specification (DOCSIS) technology since version 3.1. A shared low- | Specification (DOCSIS) technology since version 3.1. A shared low- | |||
| latency queue relies on the non-queue-building behavior of every | latency queue relies on the non-queue-building behaviour of every | |||
| traffic flow using it. However, some flows might not take such care, | traffic flow using it. However, some flows might not take such care, | |||
| either accidentally or maliciously. If a queue is about to exceed a | either accidentally or maliciously. If a queue is about to exceed a | |||
| threshold level of delay, the queue protection algorithm can rapidly | threshold level of delay, the Queue Protection algorithm can rapidly | |||
| detect the flows most likely to be responsible. It can then prevent | detect the flows most likely to be responsible. It can then prevent | |||
| harm to other traffic in the low-latency queue by ejecting selected | harm to other traffic in the low-latency queue by ejecting selected | |||
| packets (or all packets) of these flows. This document is designed | packets (or all packets) of these flows. This document is designed | |||
| for four audiences: a) congestion control designers who need to | for four audiences: a) congestion control designers who need to | |||
| understand how to keep on the "good" side of the algorithm; b) | understand how to keep on the "good" side of the algorithm; b) | |||
| implementers of the algorithm who want to understand it in more | implementers of the algorithm who want to understand it in more | |||
| depth; c) designers of algorithms with similar goals, perhaps for | depth; c) designers of algorithms with similar goals, perhaps for | |||
| non-DOCSIS scenarios; and d) researchers interested in evaluating the | non-DOCSIS scenarios; and d) researchers interested in evaluating the | |||
| algorithm. | algorithm. | |||
| skipping to change at line 62 ¶ | skipping to change at line 62 ¶ | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. | to this document. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction | 1. Introduction | |||
| 1.1. Document Roadmap | 1.1. Document Roadmap | |||
| 1.2. Terminology | 1.2. Terminology | |||
| 1.3. Copyright Material | ||||
| 2. Approach (In Brief) | 2. Approach (In Brief) | |||
| 2.1. Mechanism | 2.1. Mechanism | |||
| 2.2. Policy | 2.2. Policy | |||
| 2.2.1. Policy Conditions | 2.2.1. Policy Conditions | |||
| 2.2.2. Policy Action | 2.2.2. Policy Action | |||
| 3. Necessary Flow Behavior | 3. Necessary Flow Behaviour | |||
| 4. Pseudocode Walk-Through | 4. Pseudocode Walk-Through | |||
| 4.1. Input Parameters, Constants, and Variables | 4.1. Input Parameters, Constants, and Variables | |||
| 4.2. Queue Protection Data Path | 4.2. Queue Protection Data Path | |||
| 4.2.1. The qprotect() Function | 4.2.1. The qprotect() Function | |||
| 4.2.2. The pick_bucket() Function | 4.2.2. The pick_bucket() Function | |||
| 4.2.3. The fill_bucket() Function | 4.2.3. The fill_bucket() Function | |||
| 4.2.4. The calcProbNative() Function | 4.2.4. The calcProbNative() Function | |||
| 5. Rationale | 5. Rationale | |||
| 5.1. Rationale: Blame for Queuing, Not for Rate in Itself | 5.1. Rationale: Blame for Queuing, Not for Rate in Itself | |||
| 5.2. Rationale for Constant Aging of the Queuing Score | 5.2. Rationale: Constant Aging of the Queuing Score | |||
| 5.3. Rationale for Transformed Queuing Score | 5.3. Rationale: Transformed Queuing Score | |||
| 5.4. Rationale for Policy Conditions | 5.4. Rationale: Policy Conditions | |||
| 5.5. Rationale for Reclassification as the Policy Action | 5.5. Rationale: Reclassification as the Policy Action | |||
| 6. Limitations | 6. Limitations | |||
| 7. IANA Considerations | 7. IANA Considerations | |||
| 8. Implementation Status | 8. Security Considerations | |||
| 9. Security Considerations | 8.1. Resource Exhaustion Attacks | |||
| 9.1. Resource Exhaustion Attacks | 8.1.1. Exhausting Flow State Storage | |||
| 9.1.1. Exhausting Flow-State Storage | 8.1.2. Exhausting Processing Resources | |||
| 9.1.2. Exhausting Processing Resources | 9. Comments Solicited | |||
| 10. Comments Solicited | 10. References | |||
| 11. References | 10.1. Normative References | |||
| 11.1. Normative References | 10.2. Informative References | |||
| 11.2. Informative References | ||||
| Acknowledgements | Acknowledgements | |||
| Authors' Addresses | Authors' Addresses | |||
| 1. Introduction | 1. Introduction | |||
| This Informational RFC explains the specification of the queue | This Informational RFC explains the specification of the queue | |||
| protection (QProt) algorithm used in DOCSIS technology since version | protection (QProt) algorithm used in DOCSIS technology since version | |||
| 3.1 [DOCSIS]. | 3.1 [DOCSIS]. | |||
| Although the algorithm is defined in Annex P of [DOCSIS], it relies | Although the algorithm is defined in Annex P of [DOCSIS], it relies | |||
| on cross references to other parts of the set of specifications. | on cross references to other parts of the set of specifications. | |||
| This document pulls all the strands together into one self-contained | This document pulls all the strands together into one self-contained | |||
| document. The core of the document is a similar pseudocode walk- | document. The core of the document is a similar pseudocode walk- | |||
| through to that in the DOCSIS specification, but it also includes | through to that in the DOCSIS specification, but it also includes | |||
| additional material: | additional material: | |||
| i. a brief overview, | i. a brief overview, | |||
| ii. a definition of how a data sender needs to behave to avoid | ii. a definition of how a data sender needs to behave to avoid | |||
| triggering queue protection, and | triggering Queue Protection, and | |||
| iii. a section giving the rationale for the design choices. | iii. a section giving the rationale for the design choices. | |||
| Low queuing delay depends on hosts sending their data smoothly, | Low queuing delay depends on hosts sending their data smoothly, | |||
| either at a low rate or responding to Explicit Congestion | either at a low rate or responding to Explicit Congestion | |||
| Notifications (ECNs) (see [RFC8311] and [RFC9331]). So, low-latency | Notification (ECN) (see [RFC8311] and [RFC9331]). Therefore, low- | |||
| queuing is something hosts create themselves, not something the | latency queuing is something hosts create themselves, not something | |||
| network gives them. This tends to ensure that self-interest alone | the network gives them. This tends to ensure that self-interest | |||
| does not drive flows to mis-mark their packets for the low-latency | alone does not drive flows to mismark their packets for the low- | |||
| queue. However, traffic from an application that does not behave in | latency queue. However, traffic from an application that does not | |||
| a non-queue-building way might erroneously be classified into a low- | behave in a non-queue-building way might erroneously be classified | |||
| latency queue, whether accidentally or maliciously. QProt protects | into a low-latency queue, whether accidentally or maliciously. QProt | |||
| other traffic in the low-latency queue from the harm due to excess | protects other traffic in the low-latency queue from the harm due to | |||
| queuing that would otherwise be caused by such anomalous behavior. | excess queuing that would otherwise be caused by such anomalous | |||
| behaviour. | ||||
| In normal scenarios without misclassified traffic, QProt is not | In normal scenarios without misclassified traffic, QProt is not | |||
| expected to intervene at all in the classification or forwarding of | expected to intervene at all in the classification or forwarding of | |||
| packets. | packets. | |||
| An overview of how low-latency support has been added to DOCSIS | An overview of how low-latency support has been added to DOCSIS | |||
| technology is given in [LLD]. In each direction of a DOCSIS link | technology is given in [LLD]. In each direction of a DOCSIS link | |||
| (upstream and downstream), there are two queues: one for Low-Latency | (upstream and downstream), there are two queues: one for Low-Latency | |||
| (LL) and one for Classic traffic, in an arrangement similar to the | (LL) and one for Classic traffic, in an arrangement similar to the | |||
| IETF's Coupled DualQ Active Queue Management (AQM) [RFC9332]. The | IETF's Dual-Queue Coupled Active Queue Management (AQM) [RFC9332]. | |||
| two queues enable a transition from "Classic" to "Scalable" | The two queues enable a transition from "Classic" to "Scalable" | |||
| congestion control so that low latency can become the norm for any | congestion control so that low latency can become the norm for any | |||
| application, including ones seeking both full throughput and low | application, including ones seeking both full throughput and low | |||
| latency, not just low-rate applications that have been more | latency, not just low-rate applications that have been more | |||
| traditionally associated with a low-latency service. The Classic | traditionally associated with a low-latency service. The Classic | |||
| queue is only necessary for traffic such as traditional (Reno/Cubic) | queue is only necessary for traffic such as traditional (Reno | |||
| TCP that needs about a round trip of buffering to fully utilize the | [RFC5681] / Cubic [RFC9438]) TCP that needs about a round trip of | |||
| link, and therefore has no incentive to mismark itself as low | buffering to fully utilize the link; therefore, this traffic has no | |||
| latency. The QProt function is located at the ingress to the Low- | incentive to mismark itself as low latency. The QProt function is | |||
| Latency queue. Therefore, in the upstream, QProt is located on the | located at the ingress to the Low-Latency queue. Therefore, in the | |||
| cable modem (CM); in the downstream, it is located on the CM | upstream, QProt is located on the cable modem (CM); in the | |||
| Termination System (CMTS). If an arriving packet triggers queue | downstream, it is located on the CM Termination System (CMTS). If an | |||
| protection, the QProt algorithm ejects the packet from the Low- | arriving packet triggers queue protection, the QProt algorithm ejects | |||
| Latency queue and reclassifies it into the Classic queue. | the packet from the Low-Latency queue and reclassifies it into the | |||
| Classic queue. | ||||
| If QProt is used in settings other than DOCSIS links, it would be a | If QProt is used in settings other than DOCSIS links, it would be a | |||
| simple matter to detect queue-building flows by using slightly | simple matter to detect queue-building flows by using slightly | |||
| different conditions and/or to trigger a different action as a | different conditions and/or to trigger a different action as a | |||
| consequence, as appropriate for the scenario, e.g., dropping instead | consequence, as appropriate for the scenario, e.g., dropping instead | |||
| of reclassifying packets or perhaps accumulating a second per-flow | of reclassifying packets or perhaps accumulating a second per-flow | |||
| score to decide whether to redirect a whole flow rather than just | score to decide whether to redirect a whole flow rather than just | |||
| certain packets. Such work is for future study and out of scope of | certain packets. Such work is for future study and out of scope of | |||
| the present document. | the present document. | |||
| The QProt algorithm is based on a rigorous approach to quantifying | The QProt algorithm is based on a rigorous approach to quantifying | |||
| how much each flow contributes to congestion, which is used in | how much each flow contributes to congestion, which is used in | |||
| economics to allocate responsibility for the cost of one party's | economics to allocate responsibility for the cost of one party's | |||
| behavior on others (the economic externality). Another important | behaviour on others (the economic externality). Another important | |||
| feature of the approach is that the metric used for the queuing score | feature of the approach is that the metric used for the queuing score | |||
| is based on the same variable that determines the level of ECN | is based on the same variable that determines the level of ECN | |||
| signalling seen by the sender (see [RFC8311] and [RFC9331]). This | signalling seen by the sender (see [RFC8311] and [RFC9331]). This | |||
| makes the internal queuing score visible externally as ECN markings. | makes the internal queuing score visible externally as ECN markings. | |||
| This transparency is necessary to be able to objectively state (in | This transparency is necessary to be able to objectively state (in | |||
| Section 3) how a flow can keep on the "good" side of the algorithm. | Section 3) how a flow can keep on the "good" side of the algorithm. | |||
| 1.1. Document Roadmap | 1.1. Document Roadmap | |||
| The core of the document is the walk-through of the DOCSIS QProt | The core of the document is the walk-through of the DOCSIS QProt | |||
| algorithm's pseudocode in Section 4. | algorithm's pseudocode in Section 4. | |||
| Prior to that, Section 2 summarizes the approach used in the | Prior to that, Section 2 summarizes the approach used in the | |||
| algorithm. Then, Section 3 considers QProt from the perspective of | algorithm. Then, Section 3 considers QProt from the perspective of | |||
| the end-system by defining the behavior that a flow needs to comply | the end-system by defining the behaviour that a flow needs to comply | |||
| with to avoid the QProt algorithm ejecting its packets from the low- | with to avoid the QProt algorithm ejecting its packets from the low- | |||
| latency queue. | latency queue. | |||
| Section 5 gives deeper insight into the principles and rationale | Section 5 gives deeper insight into the principles and rationale | |||
| behind the algorithm. Then, Section 6 explains the limitations of | behind the algorithm. Then, Section 6 explains the limitations of | |||
| the approach. The usual closing sections follow. | the approach. The usual closing sections follow. | |||
| 1.2. Terminology | 1.2. Terminology | |||
| The normative language for the DOCSIS QProt algorithm is in the | The normative language for the DOCSIS QProt algorithm is in the | |||
| DOCSIS specifications [DOCSIS], [DOCSIS-CM-OSS], and | DOCSIS specifications [DOCSIS], [DOCSIS-CM-OSS], and | |||
| [DOCSIS-CCAP-OSS]: not in this Informational RFC. If there is any | [DOCSIS-CCAP-OSS]: not in this Informational RFC. If there is any | |||
| inconsistency, the DOCSIS specifications take precedence. | inconsistency, the DOCSIS specifications take precedence. | |||
| The following terms and abbreviations are used: | The following terms and abbreviations are used: | |||
| CM: Cable Modem | CM: Cable Modem | |||
| CMTS: CM Termination System | CMTS: CM Termination System | |||
| Congestion-rate: The transmission rate of bits or bytes contained | Congestion-rate: The transmission rate counting only bits or bytes | |||
| within packets of a flow that have the CE codepoint set in the IP- | contained within packets of a flow that have the Congestion | |||
| ECN field [RFC3168] (including IP headers unless specified | Experienced (CE) codepoint set in the IP-ECN field [RFC3168] | |||
| otherwise). Congestion-bit-rate and congestion-volume were | (including IP headers unless specified otherwise). Congestion- | |||
| introduced in [RFC7713] and [RFC6789]. | bit-rate and congestion-volume were introduced in [RFC7713] and | |||
| [RFC6789]. | ||||
| DOCSIS: Data-Over-Cable System Interface Specification. "DOCSIS" is | DOCSIS: Data-Over-Cable System Interface Specification. "DOCSIS" is | |||
| a registered trademark of Cable Television Laboratories, Inc. | a registered trademark of Cable Television Laboratories, Inc. | |||
| ("CableLabs"). | ("CableLabs"). | |||
| Non-queue-building: A flow that tends not to build a queue. | non-queue-building: A flow that tends not to build a queue. | |||
| Queue-building: A flow that builds a queue. If it is classified | Note the use of lowercase "non-queue-building", which describes | |||
| into the Low-Latency queue, it is therefore a candidate for the | the behaviour of a flow, in contrast to uppercase Non-Queue- | |||
| queue protection algorithm to detect and sanction. | Building (NQB), which refers to a Diffserv marking [RFC9956]. A | |||
| flow identifying itself as uppercase Non-Queue-Building may not | ||||
| behave in a non-queue-building way, which is what the QProt | ||||
| algorithm is intended to detect. | ||||
| ECN: Explicit Congestion Notification | queue-building: A flow that builds a queue. If it is classified | |||
| into the Low-Latency queue, it is therefore a candidate for the | ||||
| Queue Protection algorithm to detect and sanction. | ||||
| QProt: The Queue Protection function | ECN: Explicit Congestion Notification [RFC3168] | |||
| 1.3. Copyright Material | ECN marking or ECN signalling: Setting the IP-ECN field of an | |||
| increasing proportion of packets to the Congestion Experienced | ||||
| (CE) codepoint [RFC3168] as queuing worsens. | ||||
| Parts of this document are reproduced from [DOCSIS] with kind | QProt: The Queue Protection function | |||
| permission of the copyright holder, Cable Television Laboratories, | ||||
| Inc. ("CableLabs"). | ||||
| 2. Approach (In Brief) | 2. Approach (In Brief) | |||
| The algorithm is divided into mechanism and policy. There is only a | The QProt algorithm is divided into mechanism and policy. There is | |||
| tiny amount of policy code, but policy might need to be changed in | only a tiny amount of policy code, but policy might need to be | |||
| the future. So, where hardware implementation is being considered, | changed in the future. So, where hardware implementation is being | |||
| it would be advisable to implement the policy aspects in firmware or | considered, it would be advisable to implement the policy aspects in | |||
| software: | firmware or software: | |||
| * The mechanism aspects identify flows, maintain flow-state, and | * The mechanism aspects identify flows, maintain flow state, and | |||
| accumulate per-flow queuing scores; | accumulate per-flow queuing scores; | |||
| * The policy aspects can be divided into conditions and actions: | * The policy aspects can be divided into conditions and actions: | |||
| - The conditions are the logic that determines when action should | - The conditions are the logic that determines when action should | |||
| be taken to avert the risk of queuing delay becoming excessive; | be taken to avert the risk of queuing delay becoming excessive; | |||
| - The actions determine how this risk is averted, e.g., by | - The actions determine how this risk is averted, e.g., by | |||
| redirecting packets from a flow into another queue or by | redirecting packets from a flow into another queue or by | |||
| reclassifying a whole flow that seems to be misclassified. | reclassifying a whole flow that seems to be misclassified. | |||
| 2.1. Mechanism | 2.1. Mechanism | |||
| The algorithm maintains per-flow-state, where "flow" usually means an | The algorithm maintains per-flow state, where "flow" usually means an | |||
| end-to-end (Layer 4) 5-tuple. The flow-state consists of a queuing | end-to-end (Layer 4) 5-tuple. The flow state consists of a queuing | |||
| score that decays over time. Indeed, it is transformed into time | score that decays over time. Indeed, it is transformed into time | |||
| units so that it represents the flow-state's own expiry time | units so that it represents the flow state's own expiry time | |||
| (explained in Section 5.3). A higher queuing score pushes out the | (explained in Section 5.3). A higher queuing score pushes out the | |||
| expiry time further. | expiry time further. | |||
| Non-queue-building flows tend to release their flow-state rapidly: it | Non-queue-building flows tend to release their flow state rapidly: it | |||
| usually expires reasonably early in the gap between the packets of a | usually expires reasonably early in the gap between the packets of a | |||
| normal flow. Then, the memory can be recycled for packets from other | normal flow. Then, the memory can be recycled for packets from other | |||
| flows that arrive in-between. So, only queue-building flows hold | flows that arrive in-between. Thus, only queue-building flows hold | |||
| flow state persistently. | flow state persistently. | |||
| The simplicity and effectiveness of the algorithm is due to the | The simplicity and effectiveness of the algorithm is due to the | |||
| definition of the queuing score. The queueing score represents the | definition of the queuing score. The queueing score represents the | |||
| share of blame for queuing that each flow bears. The scoring | share of blame for queuing that each flow bears. The scoring | |||
| algorithm uses the same internal variable, probNative, that the AQM | algorithm uses the same internal variable, probNative, that the AQM | |||
| for the low-latency queue uses to ECN-mark packets. (The other two | for the low-latency queue uses to ECN-mark packets. (The other two | |||
| forms of marking, Classic and coupled, are driven by Classic traffic; | forms of marking, Classic and coupled, are driven by Classic traffic; | |||
| therefore, they are not relevant to protection of the LL queue). In | therefore, they are not relevant to protection of the LL queue). In | |||
| this way, the queuing score accumulates the size of each arriving | this way, the queuing score accumulates the size of each arriving | |||
| packet of a flow but scaled by the value of probNative (in the range | packet of a flow but scaled by the value of probNative (in the range | |||
| 0 to 1) at the instant the packet arrives. So, a flow's score | 0 to 1) at the instant the packet arrives. Therefore, a flow's score | |||
| accumulates faster: | accumulates faster: | |||
| * the higher the degree of queuing and | * the higher the degree of queuing and | |||
| * the faster that the flow's packets arrive when there is queuing. | * the faster the flow's packets arrive when there is queuing. | |||
| Section 5.1 explains further why this score represents blame for | Section 5.1 explains further why this score represents blame for | |||
| queuing. | queuing. | |||
| The algorithm, as described so far, would accumulate a number that | The algorithm, as described so far, would accumulate a number that | |||
| would rise at the so-called congestion-rate of the flow (see | would rise at the so-called Congestion-rate of the flow (see | |||
| Section 1.2), i.e., the rate at which the flow is contributing to | Section 1.2), i.e., the rate at which the flow is contributing to | |||
| congestion or the rate at which the AQM is forwarding bytes of the | congestion or the rate at which the AQM is forwarding bytes of the | |||
| flow that are ECN-marked. However, rather than growing continually, | flow that are ECN-marked. However, rather than growing continually, | |||
| the queuing score is also reduced (or "aged") at a constant rate. | the queuing score is also reduced (or "aged") at a constant rate. | |||
| This is because it is unavoidable for capacity-seeking flows to | This is because it is unavoidable for capacity-seeking flows to | |||
| induce a continuous low level of congestion as they track available | induce a continuous low level of congestion as they track available | |||
| capacity. Section 5.2 explains why this allowance can be set to the | capacity. Section 5.2 explains why this allowance can be set to the | |||
| same constant for any scalable flow, whatever its bit rate. | same constant for any scalable flow, whatever its bit rate. | |||
| For implementation efficiency, the queuing score is transformed into | For implementation efficiency, the queuing score is transformed into | |||
| time units; this is so that it represents the expiry time of the flow | time units. It then represents the expiry time of the flow state (as | |||
| state (as already discussed above). Then, it does not need to be | already discussed above). Then, it does not need to be explicitly | |||
| explicitly aged because the natural passage of time implicitly "ages" | aged because the natural passage of time implicitly "ages" an expiry | |||
| an expiry time. The transformation into time units simply involves | time. The transformation into time units simply involves dividing | |||
| dividing the queuing score of each packet by the constant aging rate | the queuing score of each packet by the constant aging rate (this is | |||
| (this is explained further in Section 5.3). | explained further in Section 5.3). | |||
| 2.2. Policy | 2.2. Policy | |||
| 2.2.1. Policy Conditions | 2.2.1. Policy Conditions | |||
| The algorithm uses the queuing score to determine whether to eject | The algorithm uses the queuing score to determine whether to eject | |||
| each packet only at the time it first arrives. This limits the | each packet only at the time it first arrives. This limits the | |||
| policies available. For instance, when queueing delay exceeds a | policies available. For instance, when queueing delay exceeds a | |||
| threshold, it is not possible to eject a packet from the flow with | threshold, it is not feasible to eject a packet from the flow with | |||
| the highest queuing scoring because that would involve searching the | the highest queuing scoring because that would involve searching the | |||
| queue for such a packet (if, indeed, one were still in the queue). | queue for such a packet (if, indeed, one were still in the queue). | |||
| Nonetheless, it is still possible to develop a policy that protects | Nonetheless, it is still possible to develop a policy that protects | |||
| the low latency of the queue by making the queuing score threshold | the low latency of the queue by making the queuing score threshold | |||
| stricter the greater the excess of queuing delay relative to the | stricter the greater the excess of queuing delay relative to the | |||
| threshold (this is explained in Section 5.4). | threshold (this is explained in Section 5.4). | |||
| 2.2.2. Policy Action | 2.2.2. Policy Action | |||
| At the time of writing, the DOCSIS QProt specification states that | At the time of writing, the DOCSIS QProt specification states that, | |||
| when the policy conditions are met, the action taken to protect the | when the policy conditions are met, the action taken to protect the | |||
| low-latency queue is to reclassify a packet into the Classic queue | low-latency queue is to reclassify a packet into the Classic queue | |||
| (this is justified in Section 5.5). | (this is justified in Section 5.5). | |||
| 3. Necessary Flow Behavior | 3. Necessary Flow Behaviour | |||
| The QProt algorithm described here can be used for responsive and/or | The QProt algorithm described here can be used for responsive and/or | |||
| unresponsive flows. | unresponsive flows. | |||
| * It is possible to objectively describe the least responsive way | * It is possible to objectively describe the least responsive way | |||
| that a flow will need to respond to congestion signals in order to | that a flow will need to respond to congestion signals in order to | |||
| avoid triggering queue protection, no matter the link capacity and | avoid triggering Queue Protection, no matter the link capacity and | |||
| no matter how much other traffic there is. | no matter how much other traffic there is. | |||
| * It is not possible to describe how fast or smooth an unresponsive | * It is not possible to describe how fast or smooth an unresponsive | |||
| flow should be to avoid queue protection because this depends on | flow should be to avoid Queue Protection because this depends on | |||
| how much other traffic there is and the capacity of the link, | how much other traffic there is and the capacity of the link, | |||
| which an application is unable to know. However, the more | which an application is unable to know. However, the more | |||
| smoothly an unresponsive flow paces its packets and the lower its | smoothly an unresponsive flow paces its packets and the lower its | |||
| rate relative to typical broadband link capacities, the less | rate relative to typical broadband link capacities, the less | |||
| likelihood that it will risk causing enough queueing to trigger | likelihood that it will risk causing enough queueing to trigger | |||
| queue protection. | Queue Protection. | |||
| Responsive low-latency flows can use a Low Latency, Low Loss, and | Responsive low-latency flows can use a Low Latency, Low Loss, and | |||
| Scalable throughput (L4S) ECN codepoint [RFC9331] to get classified | Scalable throughput (L4S) ECN codepoint [RFC9331] to get classified | |||
| into the low-latency queue. | into the low-latency queue. | |||
| A sender can arrange for flows that are smooth but do not respond to | A sender can arrange for flows that are smooth but do not respond to | |||
| ECN marking to be classified into the low-latency queue by using the | ECN marking to be classified into the low-latency queue by using the | |||
| Non-Queue-Building (NQB) Diffserv codepoint [RFC9956], which the | Non-Queue-Building (NQB) Diffserv codepoint [RFC9956], which the | |||
| DOCSIS specifications support, or an operator can use various other | DOCSIS specifications support, or an operator can use various other | |||
| local classifiers. | local classifiers. | |||
| As already explained in Section 2.1, the QProt algorithm is driven | As already explained in Section 2.1, the QProt algorithm is driven | |||
| from the same variable that drives the ECN-marking probability in the | from the same variable that drives the ECN-marking probability in the | |||
| low-latency or "LL" queue (the "Native" AQM of the LL queue is | Low-Latency or "LL" queue (the "Native" AQM of the LL queue is | |||
| defined in the Immediate Active Queue Management Annex of [DOCSIS]). | defined in the Immediate Active Queue Management Annex of [DOCSIS]). | |||
| The algorithm that calculates this internal variable is run on the | The algorithm that calculates this internal variable is run on the | |||
| arrival of every packet, whether or not it is ECN-capable, so that it | arrival of every LL packet, whether or not it is ECN-capable, so that | |||
| can be used by the QProt algorithm. But the variable is only used to | it can be used by the QProt algorithm. But the variable is only used | |||
| ECN-mark packets that are ECN-capable. | to ECN-mark packets that are ECN-capable. | |||
| Not only does this dual use of the variable improve processing | Not only does this dual use of the variable improve processing | |||
| efficiency, but it also makes the basis of the QProt algorithm | efficiency, but it also makes the basis of the QProt algorithm | |||
| visible and transparent, at least for responsive ECN-capable flows. | visible and transparent, at least for responsive ECN-capable flows. | |||
| Then, it is possible to state objectively that a flow can avoid | Then, it is possible to state objectively that a flow can avoid | |||
| triggering queue protection by keeping the bit rate of ECN-marked | triggering queue protection by keeping the bit rate of ECN-marked | |||
| packets (the congestion-rate) below AGING, which is a configured | packets (the Congestion-rate) below AGING, which is a configured | |||
| constant of the algorithm (default 2^19 B/s ~= 4 Mb/s). Note that it | constant of the algorithm (default 2^19 B/s ā 4 Mb/s). Note that it | |||
| is in a congestion controller's own interest to keep its average | is in a congestion controller's own interest to keep its average | |||
| congestion-rate well below this level (e.g., ~1 Mb/s) to ensure that | Congestion-rate well below this level (e.g., ~1 Mb/s) to ensure that | |||
| it does not trigger queue protection during transient dynamics. | it does not trigger queue protection during transient dynamics. | |||
| If the QProt algorithm is used in other settings, it would still need | If the QProt algorithm is used in other settings, it would still need | |||
| to be based on the visible level of congestion signalling, in a | to be based on the visible level of congestion signalling, in a | |||
| similar way to the DOCSIS approach. Without transparency of the | similar way to the DOCSIS approach. Without transparency of the | |||
| basis of the algorithm's decisions, end-systems would not be able to | basis of the algorithm's decisions, end-systems would not be able to | |||
| avoid triggering queue protection on an objective basis. | avoid triggering Queue Protection on an objective basis. | |||
| 4. Pseudocode Walk-Through | 4. Pseudocode Walk-Through | |||
| 4.1. Input Parameters, Constants, and Variables | 4.1. Input Parameters, Constants, and Variables | |||
| The operator input parameters that set the parameters in the first | The operator input parameters that set the parameters in the first | |||
| two blocks of pseudocode below are defined for cable modems (CMs) in | two blocks of pseudocode below are defined for cable modems (CMs) in | |||
| [DOCSIS-CM-OSS] and for CMTSs in [DOCSIS-CCAP-OSS]. Then, further | [DOCSIS-CM-OSS] and for CMTSs in [DOCSIS-CCAP-OSS]. Then, further | |||
| constants are either derived from the input parameters or hard-coded. | constants are either derived from the input parameters or hard-coded. | |||
| Defaults and units are shown in square brackets. Defaults (or indeed | Defaults and units are shown in square brackets. Defaults (or indeed | |||
| any aspect of the algorithm) are subject to change, so the latest | any aspect of the QProt algorithm) are subject to change, so the | |||
| DOCSIS specifications are the definitive references. Also, any | latest DOCSIS specifications are the definitive references. Also, | |||
| operator might set certain parameters to non-default values. | any operator might set certain parameters to non-default values. | |||
| <CODE BEGINS> | <CODE BEGINS> | |||
| // Input Parameters | // Input Parameters | |||
| MAX_RATE; // Configured maximum sustained rate [b/s] | MAX_RATE; // Configured maximum sustained rate [b/s] | |||
| QPROTECT_ON; // Queue Protection is enabled [Default: TRUE] | QPROTECT_ON; // Queue Protection is enabled [Default: TRUE] | |||
| CRITICALqL_us; // LL queue threshold delay [us] Default: MAXTH_us | CRITICALqL_us; // LL queue threshold delay [µs] Default: MAXTH_us | |||
| CRITICALqLSCORE_us;// The threshold queuing score [Default: 4000 µs] | CRITICALqLSCORE_us;// The threshold queuing score [Default: 4000 µs] | |||
| LG_AGING; // The aging rate of the q'ing score [Default: 19] | LG_AGING; // The aging rate of the q'ing score [Default: 19] | |||
| // as log base 2 of the congestion-rate [lg(B/s)] | // as log base 2 of the Congestion-rate [lg(B/s)] | |||
| // Input Parameters for the calcProbNative() algorithm: | // Input Parameters for the calcProbNative() algorithm: | |||
| MAXTH_us; // Max LL AQM marking threshold [Default: 1000 µs] | MAXTH_us; // Max LL AQM marking threshold [Default: 1000 µs] | |||
| LG_RANGE; // Log base 2 of the range of ramp [lg(ns)] | LG_RANGE; // Log base 2 of the range of ramp [lg(ns)] | |||
| // Default: 2^19 = 524288 ns (roughly 525 µs) | // Default: 2^19 = 524288 ns (roughly 525 µs) | |||
| <CODE ENDS> | <CODE ENDS> | |||
| <CODE BEGINS> | <CODE BEGINS> | |||
| // Constants, either derived from input parameters or hard-coded | // Constants, either derived from input parameters or hard-coded | |||
| T_RES; // Resolution of t_exp [ns] | T_RES; // Resolution of t_exp [ns] | |||
| // Convert units (approx) | // Convert units (approx) | |||
| AGING = pow(2, (LG_AGING-30) ) * T_RES; // lg([B/s]) to [B/T_RES] | AGING = pow(2, (LG_AGING-30) ) * T_RES; // lg([B/s]) to [B/T_RES] | |||
| CRITICALqL = CRITICALqL_us * 1000; // [us] to [ns] | CRITICALqL = CRITICALqL_us * 1000; // [µs] to [ns] | |||
| CRITICALqLSCORE = CRITICALqLSCORE_us * 1000/T_RES; // [us] to [T_RES] | CRITICALqLSCORE = CRITICALqLSCORE_us * 1000/T_RES; // [µs] to [T_RES] | |||
| // Threshold for the q'ing score condition | // Threshold for the q'ing score condition | |||
| CRITICALqLPRODUCT = CRITICALqL * CRITICALqLSCORE; | CRITICALqLPRODUCT = CRITICALqL * CRITICALqLSCORE; | |||
| qLSCORE_MAX = 5E9 / T_RES; // Max queuing score = 5 s | qLSCORE_MAX = 5E9 / T_RES; // Max queuing score = 5 s | |||
| ATTEMPTS = 2; // Max attempts to pick a bucket (vendor-specific) | ATTEMPTS = 2; // Max attempts to pick a bucket (vendor-specific) | |||
| BI_SIZE = 5; // Bit-width of index number for non-default buckets | BI_SIZE = 5; // Bit-width of index number for non-default buckets | |||
| NBUCKETS = pow(2, BI_SIZE); // No. of non-default buckets | NBUCKETS = pow(2, BI_SIZE); // No. of non-default buckets | |||
| MASK = NBUCKETS-1; // convenient constant, with BI_SIZE LSBs set | MASK = NBUCKETS-1; // convenient constant, with BI_SIZE LSBs set | |||
| // Queue Protection exit states | // Queue Protection exit states | |||
| skipping to change at line 440 ¶ | skipping to change at line 446 ¶ | |||
| FLOOR = 2 * 8 * MAX_FRAME_SIZE * 10^9 / MAX_RATE; | FLOOR = 2 * 8 * MAX_FRAME_SIZE * 10^9 / MAX_RATE; | |||
| RANGE = (1 << LG_RANGE); // Range of ramp [ns] | RANGE = (1 << LG_RANGE); // Range of ramp [ns] | |||
| MINTH = max ( MAXTH - RANGE, FLOOR); | MINTH = max ( MAXTH - RANGE, FLOOR); | |||
| MAXTH = MINTH + RANGE; // Max marking threshold [ns] | MAXTH = MINTH + RANGE; // Max marking threshold [ns] | |||
| <CODE ENDS> | <CODE ENDS> | |||
| Throughout the pseudocode, most variables are integers. The only | Throughout the pseudocode, most variables are integers. The only | |||
| exceptions are floating-point variables representing probabilities | exceptions are floating-point variables representing probabilities | |||
| (MAX_PROB and probNative) and the AGING parameter. The actual DOCSIS | (MAX_PROB and probNative) and the AGING parameter. The actual DOCSIS | |||
| QProt algorithm is defined using integer arithmetic, but in the | QProt algorithm is defined using integer arithmetic, but in the | |||
| floating-point arithmetic used in this document, (0 <= probNative <= | floating-point arithmetic used in this document, 0 <= probNative <= | |||
| 1). Also, the pseudocode omits overflow checking, and it would need | 1. Also, the pseudocode omits overflow checking and it would need to | |||
| to be made robust to non-default input parameters. | be made robust to non-default input parameters. | |||
| The resolution for expressing time, T_RES, needs to be chosen to | The resolution for expressing time, T_RES, needs to be chosen to | |||
| ensure that expiry times for buckets can represent times that are a | ensure that expiry times for buckets can represent times that are a | |||
| fraction (e.g., 1/10) of the expected packet interarrival time for | fraction (e.g., 1/10) of the expected packet interarrival time for | |||
| the system. | the system. | |||
| The following definitions explain the purpose of important variables | The following definitions explain the purpose of important variables | |||
| and functions. | and functions. | |||
| <CODE BEGINS> | <CODE BEGINS> | |||
| skipping to change at line 514 ¶ | skipping to change at line 520 ¶ | |||
| <CODE ENDS> | <CODE ENDS> | |||
| 4.2. Queue Protection Data Path | 4.2. Queue Protection Data Path | |||
| All the functions of Queue Protection operate on the data path, | All the functions of Queue Protection operate on the data path, | |||
| driven by packet arrivals. | driven by packet arrivals. | |||
| The following functions that maintain per-flow queuing scores and | The following functions that maintain per-flow queuing scores and | |||
| manage per-flow state are considered primarily as mechanism: | manage per-flow state are considered primarily as mechanism: | |||
| * pick_bucket(uflow_id); // Returns bucket identifier | pick_bucket(uflow_id); // Returns bucket identifier | |||
| fill_bucket(bucket_id, pkt_size, probNative); /* Returns queuing | ||||
| * fill_bucket(bucket_id, pkt_size, probNative); // Returns queuing | * score */ | |||
| score | calcProbNative(qdelay); /* Returns ECN-marking probability of the | |||
| * Native LL AQM */ | ||||
| * calcProbNative(qdelay) // Returns ECN-marking probability of the | ||||
| native LL AQM | ||||
| The following function is primarily concerned with policy: | The following function is primarily concerned with policy: | |||
| * qprotect(packet, ...); // Returns exit status to either forward or | qprotect(packet, ...); /* Returns exit status to either forward | |||
| redirect the packet | * or redirect the packet */ | |||
| ('...' suppresses distracting detail.) | ('...' suppresses distracting detail.) | |||
| Future modifications to policy aspects are more likely than | Future modifications to policy aspects are more likely than | |||
| modifications to mechanisms. Therefore, policy aspects would be less | modifications to mechanisms. Therefore, policy aspects would be less | |||
| appropriate candidates for any hardware acceleration. | appropriate candidates for any hardware acceleration. | |||
| The entry point to these functions is qprotect(), which is called | The entry point to these functions is qprotect(), which is called | |||
| from packet classification before each packet is enqueued into the | from packet classification before each packet is enqueued into the | |||
| appropriate queue, queue_id, as follows: | appropriate queue, queue_id, as follows: | |||
| skipping to change at line 556 ¶ | skipping to change at line 560 ¶ | |||
| // redirect packet to Classic Service Flow | // redirect packet to Classic Service Flow | |||
| queue_id = CQ; | queue_id = CQ; | |||
| } | } | |||
| } | } | |||
| return queue_id; | return queue_id; | |||
| } | } | |||
| <CODE ENDS> | <CODE ENDS> | |||
| 4.2.1. The qprotect() Function | 4.2.1. The qprotect() Function | |||
| On each packet arrival at the LL queue, qprotect() measures the | On each packet arrival at the Low-Latency (LL) queue, qprotect() | |||
| current delay of the LL queue and derives the native LL marking | measures the current delay of the LL queue and derives the Native LL | |||
| probability from it. Then, it uses pick_bucket to find the bucket | marking probability from it. Then, it uses pick_bucket to find the | |||
| already holding the flow's state or to allocate a new bucket if the | bucket already holding the flow's state or to allocate a new bucket | |||
| flow is new or its state has expired (the most likely case). Then, | if the flow is new or its state has expired (the most likely case). | |||
| the queuing score is updated by the fill_bucket() function. That | Then, the queuing score is updated by the fill_bucket() function. | |||
| completes the mechanism aspects. | That completes the mechanism aspects. | |||
| The comments against the subsequent policy conditions and actions | The comments against the subsequent policy conditions and actions | |||
| should be self-explanatory at a superficial level. The deeper | should be self-explanatory at a superficial level. The deeper | |||
| rationale for these conditions is given in Section 5.4. | rationale for these conditions is given in Section 5.4. | |||
| <CODE BEGINS> | <CODE BEGINS> | |||
| // Per-packet queue protection | // Per-packet Queue Protection | |||
| qprotect(packet, ...) { | qprotect(packet, ...) { | |||
| bckt_id; // bucket index | bckt_id; // bucket index | |||
| qLscore; // queuing score of pkt's flow in units of T_RES | qLscore; // queuing score of pkt's flow in units of T_RES | |||
| qdelay = qL.qdelay(...); | qdelay = qL.qdelay(...); | |||
| probNative = calcProbNative(qdelay); | probNative = calcProbNative(qdelay); | |||
| bckt_id = pick_bucket(packet.uflow); | bckt_id = pick_bucket(packet.uflow); | |||
| qLscore = fill_bucket(buckets[bckt_id], packet.size, probNative); | qLscore = fill_bucket(buckets[bckt_id], packet.size, probNative); | |||
| skipping to change at line 598 ¶ | skipping to change at line 602 ¶ | |||
| return EXIT_SANCTION; | return EXIT_SANCTION; | |||
| else | else | |||
| return EXIT_SUCCESS; | return EXIT_SUCCESS; | |||
| } | } | |||
| <CODE ENDS> | <CODE ENDS> | |||
| 4.2.2. The pick_bucket() Function | 4.2.2. The pick_bucket() Function | |||
| The pick_bucket() function is optimized for flow-state that will | The pick_bucket() function is optimized for flow state that will | |||
| normally have expired from packet to packet of the same flow. It is | normally have expired from packet to packet of the same flow. It is | |||
| just one way of finding the bucket associated with the flow ID of | just one way of finding the bucket associated with the flow ID of | |||
| each packet: it might be possible to develop more efficient | each packet: it might be possible to develop more efficient | |||
| alternatives. | alternatives. | |||
| The algorithm is arranged so that the bucket holding any live (non- | The algorithm is arranged so that the bucket holding any live (non- | |||
| expired) flow-state associated with a packet will always be found | expired) flow state associated with a packet will always be found | |||
| before a new bucket is allocated. The constant ATTEMPTS, defined | before a new bucket is allocated. The constant ATTEMPTS, defined | |||
| earlier, determines how many hashes are used to find a bucket for | earlier, determines how many hashes are used to find a bucket for | |||
| each flow. (Actually, only one hash is generated; then, by default, | each flow. (Actually, only one hash is generated; then, by default, | |||
| 5 bits of it at a time are used as the hash value because, by | 5 bits of it at a time are used as the hash value because, by | |||
| default, there are 2^5 = 32 buckets). | default, there are 2^5 = 32 buckets). | |||
| The algorithm stores the flow's own ID in its flow-state. So, when a | The algorithm stores the flow's own ID in its flow state. So, when a | |||
| packet of a flow arrives, the algorithm tries up to ATTEMPTS times to | packet of a flow arrives, the algorithm tries up to ATTEMPTS times to | |||
| hash to a bucket, looking for the flow's own ID. If found, it uses | hash to a bucket, looking for the flow's own ID. If found, it uses | |||
| that bucket, first resetting the expiry time to "now" if it has | that bucket, first resetting the expiry time to "now" if it has | |||
| expired. | expired. | |||
| If it does not find the flow's ID, and the expiry time is still | If it does not find the flow's ID, and the expiry time is still | |||
| current, the algorithm can tell that another flow is using that | current, the algorithm can tell that another flow is using that | |||
| bucket, and it continues to look for a bucket for the flow. Even if | bucket, and it continues to look for a bucket for the flow. Even if | |||
| it finds another flow's bucket where the expiry time has passed, it | it finds another flow's bucket where the expiry time has passed, it | |||
| doesn't immediately use it. It merely remembers it as the potential | doesn't immediately use it. It merely remembers it as the potential | |||
| skipping to change at line 635 ¶ | skipping to change at line 639 ¶ | |||
| not already associated with the packet's flow, the algorithm should | not already associated with the packet's flow, the algorithm should | |||
| have already set aside an existing bucket with a score that has aged | have already set aside an existing bucket with a score that has aged | |||
| out. Given this bucket is no longer necessary to hold state for its | out. Given this bucket is no longer necessary to hold state for its | |||
| previous flow, it can be recycled for use by the present packet's | previous flow, it can be recycled for use by the present packet's | |||
| flow. | flow. | |||
| If all else fails, there is one additional bucket (called the dregs) | If all else fails, there is one additional bucket (called the dregs) | |||
| that can be used. If the dregs is still in live use by another flow, | that can be used. If the dregs is still in live use by another flow, | |||
| subsequent flows that cannot find a bucket of their own all share it, | subsequent flows that cannot find a bucket of their own all share it, | |||
| adding their score to the one in the dregs. A flow might get away | adding their score to the one in the dregs. A flow might get away | |||
| with using the dregs on its own, but when there are many mis-marked | with using the dregs on its own, but when there are many mismarked | |||
| flows, multiple flows are more likely to collide in the dregs, | flows, multiple flows are more likely to collide in the dregs, | |||
| including innocent flows. The choice of number of buckets and number | including innocent flows. The choice of number of buckets and number | |||
| of hash attempts determines how likely it will be that this | of hash attempts determines how likely it will be that this | |||
| undesirable scenario will occur. | undesirable scenario will occur. | |||
| <CODE BEGINS> | <CODE BEGINS> | |||
| // Pick the bucket associated with flow uflw | // Pick the bucket associated with flow uflw | |||
| pick_bucket(uflw) { | pick_bucket(uflw) { | |||
| now; // current time | now; // current time | |||
| skipping to change at line 815 ¶ | skipping to change at line 819 ¶ | |||
| |------Capacity-|b|----------,-.----------|b|----------|b\----- | |------Capacity-|b|----------,-.----------|b|----------|b\----- | |||
| | __|_|_______ |b| /``\| _...-._-': | ,.-- | | __|_|_______ |b| /``\| _...-._-': | ,.-- | |||
| | ,-. __/ \__|_|_ _/ |/ \|/ | | ,-. __/ \__|_|_ _/ |/ \|/ | |||
| | |b| ___/ \___/ __ r | | |b| ___/ \___/ __ r | |||
| | |_|/ v \__/ \_______ _/\____/ | | |_|/ v \__/ \_______ _/\____/ | |||
| | _/ \__/ | | _/ \__/ | |||
| | | | | |||
| +----------------------------------------------------------------> | +----------------------------------------------------------------> | |||
| time | time | |||
| Figure 2: Responsibility for Queuing: More-Complex Scenario | Figure 2: Responsibility for Queuing: A More Complex Scenario | |||
| Figure 2 gives a more-complex illustration of the way the queuing | Figure 2 gives a more complex illustration of the way the queuing | |||
| score assigns responsibility for queuing (limited to the precision | score assigns responsibility for queuing (limited to the precision | |||
| that ASCII art can illustrate). The figure shows the bit rates of | that ASCII art can illustrate). The figure shows the bit rates of | |||
| three flows represented as stacked areas labelled b, v, and r. The | three flows represented as stacked areas labelled b, v, and r. The | |||
| unresponsive bursts (b) are the same as in the previous example, but | unresponsive bursts (b) are the same as in the previous example, but | |||
| a variable-rate video (v) replaces flow c. Its rate varies as the | a variable-rate video (v) replaces flow c. Its rate varies as the | |||
| complexity of the video scene varies. Also, on a slower timescale, | complexity of the video scene varies. Also, on a slower timescale, | |||
| in response to the level of congestion, the video adapts its quality. | in response to the level of congestion, the video adapts its quality. | |||
| However, on a short timescale it appears to be unresponsive to small | However, on a short timescale it appears to be unresponsive to small | |||
| amounts of queuing. Also, partway through, a low-latency responsive | amounts of queuing. Also, partway through, a low-latency responsive | |||
| flow (r) joins in, aiming to fill the balance of capacity left by the | flow (r) joins in, aiming to fill the balance of capacity left by the | |||
| skipping to change at line 852 ¶ | skipping to change at line 856 ¶ | |||
| growth of the queue. After a round trip, the responsive flow rapidly | growth of the queue. After a round trip, the responsive flow rapidly | |||
| backs off, and the adaptive video also backs off more rapidly than it | backs off, and the adaptive video also backs off more rapidly than it | |||
| would normally because of the very high congestion level. The rapid | would normally because of the very high congestion level. The rapid | |||
| response to congestion of flow r reduces the queuing score that all | response to congestion of flow r reduces the queuing score that all | |||
| three flows accumulate, but they each still bear the cost in | three flows accumulate, but they each still bear the cost in | |||
| proportion to the product of the rates at which their packets arrive | proportion to the product of the rates at which their packets arrive | |||
| at the queue and the value of probNative when they do so. Thus, | at the queue and the value of probNative when they do so. Thus, | |||
| during the fifth burst, they all accumulate a lower score than the | during the fifth burst, they all accumulate a lower score than the | |||
| fourth because the queuing delay is not as excessive. | fourth because the queuing delay is not as excessive. | |||
| 5.2. Rationale for Constant Aging of the Queuing Score | 5.2. Rationale: Constant Aging of the Queuing Score | |||
| Even well-behaved flows will not always be able to respond fast | Even well-behaved flows will not always be able to respond fast | |||
| enough to dynamic events. Also, well-behaved flows, e.g., Data | enough to dynamic events. Also, well-behaved flows, e.g., Data | |||
| Center TCP (DCTCP) [RFC8257], TCP Prague [PRAGUE-CC], Bottleneck | Center TCP (DCTCP) [RFC8257], TCP Prague [PRAGUE-CC], Bottleneck | |||
| Bandwidth and Round-trip propagation time version 3 (BBRv3) [BBRv3], | Bandwidth and Round-trip propagation time version 3 (BBRv3) [BBRv3], | |||
| or the L4S variant of SCReAM [SCReAM] for real-time media [RFC8298], | or the L4S variant of SCReAM [SCReAM] for real-time media [RFC8298], | |||
| can maintain a very shallow queue by continual careful probing for | can maintain a very shallow queue by continual careful probing for | |||
| more while also continually subtracting a little from their rate (or | more while also continually subtracting a little from their rate (or | |||
| congestion window) in response to low levels of ECN signalling. | congestion window) in response to low levels of ECN signalling. | |||
| Therefore, the QProt algorithm needs to continually offer a degree of | Therefore, the QProt algorithm needs to continually offer a degree of | |||
| skipping to change at line 888 ¶ | skipping to change at line 892 ¶ | |||
| This suggests that the QProt algorithm will not sanction a well- | This suggests that the QProt algorithm will not sanction a well- | |||
| behaved scalable flow if it ages out the queuing score at a | behaved scalable flow if it ages out the queuing score at a | |||
| sufficient constant rate. The constant will need to be somewhat | sufficient constant rate. The constant will need to be somewhat | |||
| above the average of a well-behaved scalable flow to allow for normal | above the average of a well-behaved scalable flow to allow for normal | |||
| dynamics. | dynamics. | |||
| Relating QProt's aging constant to a scalable flow does not mean that | Relating QProt's aging constant to a scalable flow does not mean that | |||
| a flow has to behave like a scalable flow: it can be less aggressive | a flow has to behave like a scalable flow: it can be less aggressive | |||
| but not more aggressive. For instance, a longer RTT flow can run at | but not more aggressive. For instance, a longer RTT flow can run at | |||
| a lower congestion-rate than the aging rate, but it can also increase | a lower Congestion-rate than the aging rate, but it can also increase | |||
| its aggressiveness to equal the rate of short RTT scalable flows | its aggressiveness to equal the rate of short RTT scalable flows | |||
| [ScalingCC]. The constant aging of QProt also means that a long- | [ScalingCC]. The constant aging of QProt also means that a long- | |||
| running unresponsive flow will be prone to trigger QProt if it runs | running unresponsive flow will be prone to trigger QProt if it runs | |||
| faster than a competing responsive scalable flow would. And, of | faster than a competing responsive scalable flow would. And, of | |||
| course, if a flow causes excessive queuing in the short term, its | course, if a flow causes excessive queuing in the short term, its | |||
| queuing score will still rise faster than the constant aging process | queuing score will still rise faster than the constant aging process | |||
| will decrease it. Then, QProt will still eject the flow's packets | will decrease it. Then, QProt will still eject the flow's packets | |||
| before they harm the low latency of the shared queue. | before they harm the low latency of the shared queue. | |||
| 5.3. Rationale for Transformed Queuing Score | 5.3. Rationale: Transformed Queuing Score | |||
| The QProt algorithm holds a flow's queuing score state in a structure | The QProt algorithm holds a flow's queuing score state in a structure | |||
| called a "bucket". This is because of its similarity to a classic | called a "bucket". This is because of its similarity to a classic | |||
| leaky bucket (except the contents of the bucket do not represent | leaky bucket (except the contents of the bucket do not represent | |||
| bytes). | bytes). | |||
| probNative * pkt_sz probNative * pkt_sz / AGING | probNative * pkt_sz probNative * pkt_sz / AGING | |||
| | | | | | | |||
| | V | | V | | | V | | V | | |||
| | : | ___ | : | | | : | ___ | : | | |||
| skipping to change at line 929 ¶ | skipping to change at line 933 ¶ | |||
| of Figure 3 in token bucket form. Dt is the difference between the | of Figure 3 in token bucket form. Dt is the difference between the | |||
| times when the scores of the current and previous packets were | times when the scores of the current and previous packets were | |||
| processed. | processed. | |||
| A transformed equivalent of this token bucket is shown on the right | A transformed equivalent of this token bucket is shown on the right | |||
| of Figure 3, dividing both the input and output by the constant AGING | of Figure 3, dividing both the input and output by the constant AGING | |||
| rate. The result is a bucket-depth that represents time and it | rate. The result is a bucket-depth that represents time and it | |||
| drains at the rate that time passes. | drains at the rate that time passes. | |||
| As a further optimization, the time the bucket was last updated is | As a further optimization, the time the bucket was last updated is | |||
| not stored with the flow-state. Instead, when the bucket is | not stored with the flow state. Instead, when the bucket is | |||
| initialized, the queuing score is added to the system time "now" and | initialized, the queuing score is added to the system time "now" and | |||
| the resulting expiry time is written into the bucket. Subsequently, | the resulting expiry time is written into the bucket. Subsequently, | |||
| if the bucket has not expired, the incremental queuing score is added | if the bucket has not expired, the incremental queuing score is added | |||
| to the time already held in the bucket. Then, the queuing score | to the time already held in the bucket. Then, the queuing score | |||
| always represents the expiry time of the flow-state itself. This | always represents the expiry time of the flow state itself. This | |||
| means that the queuing score does not need to be aged explicitly | means that the queuing score does not need to be aged explicitly | |||
| because it ages itself implicitly. | because it ages itself implicitly. | |||
| 5.4. Rationale for Policy Conditions | 5.4. Rationale: Policy Conditions | |||
| Pseudocode for the QProt policy conditions is given in Section 4.1 | Pseudocode for the QProt policy conditions is given in Section 4.1 | |||
| within the second half of the qprotect() function. When each packet | within the second half of the qprotect() function. When each packet | |||
| arrives, after finding its flow state and updating the queuing score | arrives, after finding its flow state and updating the queuing score | |||
| of the packet's flow, the algorithm checks whether the shared queue | of the packet's flow, the algorithm checks whether the shared queue | |||
| delay exceeds a constant threshold CRITICALqL (e.g., 2 ms), as | delay exceeds a constant threshold CRITICALqL (e.g., 2 ms), as | |||
| repeated below for convenience: | repeated below for convenience: | |||
| <CODE BEGINS> | <CODE BEGINS> | |||
| if ( ( qdelay > CRITICALqL ) // Test if qdelay over threshold... | if ( ( qdelay > CRITICALqL ) // Test if qdelay over threshold... | |||
| skipping to change at line 985 ¶ | skipping to change at line 989 ¶ | |||
| packet. This is why the queuing score is scaled up by the current | packet. This is why the queuing score is scaled up by the current | |||
| queue delay. Then, the more the queue has grown without ejecting | queue delay. Then, the more the queue has grown without ejecting | |||
| a packet, the more the algorithm "raises the bar" to further | a packet, the more the algorithm "raises the bar" to further | |||
| packets. | packets. | |||
| The above approach is preferred over the extra per-packet processing | The above approach is preferred over the extra per-packet processing | |||
| cost of searching the buckets for the flow with the highest queuing | cost of searching the buckets for the flow with the highest queuing | |||
| score and searching the queue for one of its packets to eject (if one | score and searching the queue for one of its packets to eject (if one | |||
| is still in the queue). | is still in the queue). | |||
| Note that, by default, CRITICALqL_us is set to the maximum threshold | ||||
| of the ramp marking algorithm MAXTH_us. However, there is some | ||||
| debate as to whether setting it to the minimum threshold instead | ||||
| would improve QProt performance. This would roughly double the ratio | ||||
| of qdelay to CRITICALqL, which is compared against the | ||||
| CRITICALqLSCORE threshold. So, the threshold would have to be | ||||
| roughly doubled accordingly. | ||||
| Figure 4 explains this approach graphically. On the horizontal axis, | Figure 4 explains this approach graphically. On the horizontal axis, | |||
| it shows actual harm, meaning the queuing delay in the shared queue. | it shows actual harm, meaning the queuing delay in the shared queue. | |||
| On the vertical axis, it shows the behavior record of the flow | On the vertical axis, it shows the behaviour record of the flow | |||
| associated with the currently arriving packet, represented in the | associated with the currently arriving packet, represented in the | |||
| algorithm by the flow's queuing score. The shaded region represents | algorithm by the flow's queuing score. The shaded region represents | |||
| the combination of actual harm and behavior record that will lead to | the combination of actual harm and behaviour record that will lead to | |||
| the packet being ejected. | the packet being ejected. | |||
| Behavior Record: | | Note that, by default, CRITICALqL_us is set to the maximum | |||
| | threshold of the ramp marking algorithm MAXTH_us. However, | ||||
| | there is some debate as to whether setting it to the minimum | ||||
| | threshold instead would improve QProt performance. This would | ||||
| | roughly double the ratio of qdelay to CRITICALqL, which is | ||||
| | compared against the CRITICALqLSCORE threshold. Therefore, the | ||||
| | threshold would have to be roughly doubled accordingly. | ||||
| Behaviour Record: | ||||
| Queueing Score of | Queueing Score of | |||
| Arriving Packet's Flow | Arriving Packet's Flow | |||
| ^ | ^ | |||
| | + |/ / / / / / / / / / / / / / / / / / / | | + |/ / / / / / / / / / / / / / / / / / / | |||
| | + N | / / / / / / / / / / / / / / / / / / / | | + N | / / / / / / / / / / / / / / / / / / / | |||
| | + |/ / / / / / / / / / | | + |/ / / / / / / / / / | |||
| | + | / / / / E (Eject packet) / / / / / | | + | / / / / E (Eject packet) / / / / / | |||
| | + |/ / / / / / / / / / | | + |/ / / / / / / / / / | |||
| | + | / / / / / / / / / / / / / / / / / / / | | + | / / / / / / / / / / / / / / / / / / / | |||
| | + |/ / / / / / / / / / / / / / / / / / / | | + |/ / / / / / / / / / / / / / / / / / / | |||
| skipping to change at line 1036 ¶ | skipping to change at line 1040 ¶ | |||
| threshold, CRITICALqL. | threshold, CRITICALqL. | |||
| The region labelled "E" represents cases where there is actual harm | The region labelled "E" represents cases where there is actual harm | |||
| (queue delay exceeds CRITICALqL) and the queuing score associated | (queue delay exceeds CRITICALqL) and the queuing score associated | |||
| with the arriving packet is high enough to be able to eject it with | with the arriving packet is high enough to be able to eject it with | |||
| certainty. | certainty. | |||
| The region labelled "P" represents cases where there is actual harm, | The region labelled "P" represents cases where there is actual harm, | |||
| but the queuing score of the arriving packet is insufficient to eject | but the queuing score of the arriving packet is insufficient to eject | |||
| it, so it has to be passed over. This adds to queuing delay, but the | it, so it has to be passed over. This adds to queuing delay, but the | |||
| alternative would be to sanction an innocent flow. It can be seen | alternative would be to risk sanctioning an innocent flow. It can be | |||
| that, as actual harm increases, the judgment of innocence becomes | seen that, as actual harm increases, the judgement of innocence | |||
| increasingly stringent; the behavior record of the next packet's flow | becomes increasingly stringent; the behaviour record of the next | |||
| does not have to be as bad to eject it. | packet's flow does not have to be as bad to eject it. | |||
| Conditioning ejection on actual harm helps prevent VPN packets being | Conditioning ejection on actual harm helps prevent VPN packets being | |||
| ejected unnecessarily. VPNs consisting of multiple flows can tend to | ejected unnecessarily. VPNs consisting of multiple flows can tend to | |||
| accumulate queuing score faster than it is aged out because the aging | accumulate queuing score faster than it is aged out because the aging | |||
| rate is intended for a single flow. However, whether or not some | rate is intended for a single flow. However, whether or not some | |||
| traffic is in a VPN, the queue delay threshold (CRITICALqL) will be | traffic is in a VPN, the queue delay threshold (CRITICALqL) will be | |||
| no more likely to be exceeded. So, conditioning ejection on actual | no more likely to be exceeded. Therefore, conditioning ejection on | |||
| harm helps reduce the chance that VPN traffic will be ejected by the | actual harm helps reduce the chance that VPN traffic will be ejected | |||
| QProt function. | by the QProt function. | |||
| 5.5. Rationale for Reclassification as the Policy Action | 5.5. Rationale: Reclassification as the Policy Action | |||
| When the DOCSIS QProt algorithm deems that it is necessary to eject a | When the DOCSIS QProt algorithm deems that it is necessary to eject a | |||
| packet to protect the Low-Latency queue, it redirects the packet to | packet to protect the Low-Latency queue, it redirects the packet to | |||
| the Classic queue. In the Low-Latency DOCSIS architecture (as in | the Classic queue. In the Low-Latency DOCSIS architecture (as in | |||
| Coupled DualQ AQMs generally), the Classic queue is expected to | Dual-Queue Coupled AQMs generally), the Classic queue is expected to | |||
| frequently have a larger backlog of packets, which is caused by | frequently have a larger backlog of packets, which is caused by | |||
| classic congestion controllers interacting with a classic AQM (which | classic congestion controllers interacting with a classic AQM (which | |||
| has a latency target of 10 ms) as well as other bursty traffic. | has a latency target of 10 ms) as well as other bursty traffic. | |||
| Therefore, typically, an ejected packet will experience higher | Therefore, typically, an ejected packet will experience higher | |||
| queuing delay than it would otherwise, and it could be re-ordered | queuing delay than it would otherwise, and it could be re-ordered | |||
| within its flow (assuming QProt does not eject all packets of an | within its flow (assuming QProt does not eject all packets of an | |||
| anomalous flow). The mild harm caused to the performance of the | anomalous flow). The mild harm caused to the performance of the | |||
| ejected packet's flow is deliberate. It gives senders a slight | ejected packet's flow is deliberate. It gives senders a slight | |||
| incentive to identify their packets correctly. | incentive to identify their packets correctly. | |||
| If there were no such harm, there would be nothing to prevent all | If there were no such harm, there would be nothing to prevent all | |||
| flows from identifying themselves as suitable for classification into | flows from identifying themselves as suitable for classification into | |||
| the low-latency queue and just letting QProt sort the resulting | the low-latency queue and just letting QProt sort the resulting | |||
| aggregate into queue-building and non-queue-building flows. This | aggregate into queue-building and non-queue-building flows. This | |||
| might seem like a useful alternative to requiring senders to | might seem like a useful alternative to requiring senders to | |||
| correctly identify their flows. However, handling of mis-classified | correctly identify their flows. However, handling of misclassified | |||
| flows is not without a cost. The more packets that have to be | flows is not without a cost. The more packets that have to be | |||
| reclassified, the more often the delay of the low-latency queue would | reclassified, the more often the delay of the low-latency queue would | |||
| exceed the threshold. Also, more memory would be required to hold | exceed the threshold. Also, more memory would be required to hold | |||
| the extra flow state. | the extra flow state. | |||
| When a packet is redirected into the Classic queue, an operator might | When a packet is redirected into the Classic queue, an operator might | |||
| want to alter the identifier(s) that originally caused it to be | want to alter the identifier(s) that originally caused it to be | |||
| classified into the Low-Latency queue so that the packet will not be | classified into the Low-Latency queue so that the packet will not be | |||
| classified into another low-latency queue further downstream. | classified into another low-latency queue further downstream. | |||
| However, redirection of occasional packets can be due to unusually | However, redirection of occasional packets can be due to unusually | |||
| high transient load just at the specific bottleneck, not necessarily | high transient load just at the specific bottleneck, not necessarily | |||
| at any other bottleneck and not necessarily due to bad flow behavior. | at any other bottleneck and not necessarily due to bad flow | |||
| Therefore, Section 5.4.1.2 of [RFC9331] precludes a network node from | behaviour. Therefore, Section 5.4.1.2 of [RFC9331] precludes a | |||
| altering the end-to-end ECN field to exclude traffic from L4S | network node from altering the end-to-end ECN field to exclude | |||
| treatment. Instead a local-use identifier ought to be used (e.g., | traffic from L4S treatment. Instead a local-use identifier ought to | |||
| Diffserv Codepoint or VLAN tag) so that each operator can apply its | be used (e.g., Diffserv codepoint or VLAN tag) so that each operator | |||
| own policy, without prejudging what other operators ought to do. | can apply its own policy, without prejudging what other operators | |||
| ought to do. | ||||
| Although not supported in the DOCSIS specifications, QProt could be | Although not supported in the DOCSIS specifications, QProt could be | |||
| extended to recognize that large numbers of redirected packets belong | extended to recognize that large numbers of redirected packets belong | |||
| to the same flow. This might be detected when the bucket expiry time | to the same flow. This might be detected when the bucket expiry time | |||
| t_exp exceeds a threshold. Depending on policy and implementation | t_exp exceeds a threshold. Depending on policy and implementation | |||
| capabilities, QProt could then install a classifier to redirect a | capabilities, QProt could then install a classifier to redirect a | |||
| whole flow into the Classic queue, with an idle timeout to remove | whole flow into the Classic queue, with an idle timeout to remove | |||
| stale classifiers. In these "persistent offender" cases, QProt might | stale classifiers. In these "persistent offender" cases, QProt might | |||
| also overwrite each redirected packet's DSCP or clear its ECN field | also overwrite each redirected packet's DSCP or clear its ECN field | |||
| to Not-ECT, in order to protect other potential L4S queues | to Not-ECT, in order to protect other potential L4S queues | |||
| skipping to change at line 1132 ¶ | skipping to change at line 1137 ¶ | |||
| The use of a queuing score that excludes those aspects of flow rate | The use of a queuing score that excludes those aspects of flow rate | |||
| that do not contribute to queuing (Section 5.1) goes some way to | that do not contribute to queuing (Section 5.1) goes some way to | |||
| mitigating this limitation because the algorithm does not judge | mitigating this limitation because the algorithm does not judge | |||
| responsibility for queuing delay primarily on the combined rate of a | responsibility for queuing delay primarily on the combined rate of a | |||
| set of flows grouped under one flow ID. | set of flows grouped under one flow ID. | |||
| 7. IANA Considerations | 7. IANA Considerations | |||
| This document has no IANA actions. | This document has no IANA actions. | |||
| 8. Implementation Status | 8. Security Considerations | |||
| +================+================================================+ | ||||
| | Implementation | DOCSIS models for ns-3 | | ||||
| | name: | | | ||||
| +================+================================================+ | ||||
| | Organization | CableLabs | | ||||
| +----------------+------------------------------------------------+ | ||||
| | Web page | https://apps.nsnam.org/app/docsis-ns3/ | | ||||
| +----------------+------------------------------------------------+ | ||||
| | Description | ns-3 simulation models developed and used in | | ||||
| | | support of the Low-Latency DOCSIS development, | | ||||
| | | including models of Dual Queue Coupled AQM, | | ||||
| | | Queue Protection, and the DOCSIS MAC | | ||||
| +----------------+------------------------------------------------+ | ||||
| | Maturity | Simulation models that can also be used in | | ||||
| | | emulation mode in a testbed context | | ||||
| +----------------+------------------------------------------------+ | ||||
| | Coverage | Complete implementation of Annex P of DOCSIS | | ||||
| | | 3.1 | | ||||
| +----------------+------------------------------------------------+ | ||||
| | Version | DOCSIS 3.1, version I21; | | ||||
| | | https://www.cablelabs.com/specifications/CM- | | ||||
| | | SP-MULPIv3.1?v=I21 | | ||||
| +----------------+------------------------------------------------+ | ||||
| | Licence | GPLv2 | | ||||
| +----------------+------------------------------------------------+ | ||||
| | Contact | via web page | | ||||
| +----------------+------------------------------------------------+ | ||||
| | Last | Mar 2022 | | ||||
| | Implementation | | | ||||
| | Update | | | ||||
| +----------------+------------------------------------------------+ | ||||
| | Information | 7 Mar 2022 | | ||||
| | valid at | | | ||||
| +----------------+------------------------------------------------+ | ||||
| Table 1 | ||||
| There are also a number of closed source implementations, including | ||||
| two cable modem implementations written by different chipset | ||||
| manufacturers and several CMTS implementations by other | ||||
| manufacturers. These, as well as the ns-3 implementation, have | ||||
| passed the full suite of compliance tests developed by CableLabs. | ||||
| 9. Security Considerations | ||||
| The whole of this document concerns traffic security. It considers | The whole of this document concerns traffic security. It considers | |||
| the security question of how to identify and eject traffic that does | the security question of how to identify and eject traffic that does | |||
| not comply with the non-queue-building behavior required to use a | not comply with the non-queue-building behaviour required to use a | |||
| shared low-latency queue, whether accidentally or maliciously. | shared low-latency queue, whether accidentally or maliciously. | |||
| Section 8.2 of [RFC9330] of the L4S architecture [RFC9330] introduces | Section 8.2 of [RFC9330] of the L4S architecture [RFC9330] introduces | |||
| the problem of maintaining low latency by either self-restraint or | the problem of maintaining low latency by either self-restraint or | |||
| enforcement and places DOCSIS queue protection in context within a | enforcement and places DOCSIS Queue Protection (QProt) in context | |||
| wider set of approaches to the problem. | within a wider set of approaches to the problem. | |||
| 9.1. Resource Exhaustion Attacks | 8.1. Resource Exhaustion Attacks | |||
| The algorithm has been designed to fail gracefully in the face of | The QProt algorithm has been designed to fail gracefully in the face | |||
| traffic crafted to overrun the resources used for the algorithm's own | of traffic crafted to overrun the resources used for the algorithm's | |||
| processing and flow state. This means that non-queue-building flows | own processing and flow state. This means that non-queue-building | |||
| will always be less likely to be sanctioned than queue-building | flows will always be less likely to be sanctioned than queue-building | |||
| flows. But an attack could be contrived to deplete resources in such | flows. But an attack could be contrived to deplete resources in such | |||
| a way that the proportion of innocent (non-queue-building) flows that | a way that the proportion of innocent (non-queue-building) flows that | |||
| are incorrectly sanctioned could increase. | are incorrectly sanctioned could increase. | |||
| Incorrect sanctioning is intended not to be catastrophic; it results | Incorrect sanctioning is intended not to be catastrophic; it results | |||
| in more packets from well-behaved flows being redirected into the | in more packets from well-behaved flows being redirected into the | |||
| Classic queue, which introduces more reordering into innocent flows. | Classic queue, which introduces more reordering into innocent flows. | |||
| 9.1.1. Exhausting Flow-State Storage | 8.1.1. Exhausting Flow State Storage | |||
| To exhaust the number of buckets, the most efficient attack is to | To exhaust the number of buckets, the most efficient attack is to | |||
| send enough long-running attack flows to increase the chance that an | send enough long-running attack flows to increase the chance that an | |||
| arriving flow will not find an available bucket and will, therefore, | arriving flow will not find an available bucket and will, therefore, | |||
| have to share the "dregs" bucket. For instance, if ATTEMPTS=2 and | have to share the "dregs" bucket. For instance, if ATTEMPTS=2 and | |||
| NBUCKETS=32, it requires about 94 attack flows, each using different | NBUCKETS=32, it requires about 94 attack flows, each using different | |||
| port numbers, to increase the probability to 99% that an arriving | port numbers, to increase the probability to 99% that an arriving | |||
| flow will have to share the dregs, where it will share a high degree | flow will have to share the dregs, where it will share a high degree | |||
| of redirection into the C queue with the remainder of the attack | of redirection into the Classic queue with the remainder of the | |||
| flows. | attack flows. | |||
| For an attacker to keep buckets busy, it is more efficient to hold | For an attacker to keep buckets busy, it is more efficient to hold | |||
| onto them by cycling regularly through a set of port numbers (94 in | onto them by cycling regularly through a set of port numbers (94 in | |||
| the above example) rather than to keep occupying and releasing | the above example) rather than to keep occupying and releasing | |||
| buckets with single packet flows across a much larger number of | buckets with single packet flows across a much larger number of | |||
| ports. | ports. | |||
| During such an attack, the coupled marking probability will have | During such an attack, the coupled marking probability will have | |||
| saturated at 100%. So, to hold a bucket, the rate of an attack flow | saturated at 100%. Therefore, to hold a bucket, the rate of an attack | |||
| needs to be no less than the AGING rate of each bucket: 4 Mb/s by | flow needs to be no less than the AGING rate of each bucket: 4 Mb/s | |||
| default. However, for an attack flow to be sure to hold on to its | by default. However, for an attack flow to be sure to hold on to its | |||
| bucket, it would need to send somewhat faster. Thus, an attack with | bucket, it would need to send somewhat faster. Thus, an attack with | |||
| 100 flows would need a total force of more than 100 * 4 Mb/s = 400 | 100 flows would need a total force of more than 100 * 4 Mb/s = 400 | |||
| Mb/s. | Mb/s. | |||
| This attack can be mitigated (but not prevented) by increasing the | This attack can be mitigated (but not prevented) by increasing the | |||
| number of buckets. The required attack force scales linearly with | number of buckets. The required attack force scales linearly with | |||
| the number of buckets, NBUCKETS. So, if NBUCKETS were doubled to 64, | the number of buckets, NBUCKETS. Therefore, if NBUCKETS were doubled | |||
| twice as many 4 Mb/s flows would be needed to maintain the same | to 64, twice as many 4 Mb/s flows would be needed to maintain the | |||
| impact on innocent flows. | same impact on innocent flows. | |||
| Probably the most effective mitigation would be to implement | Probably the most effective mitigation would be to implement | |||
| redirection of whole-flows once enough of the individual packets of a | redirection of whole-flows once enough of the individual packets of a | |||
| certain offending flow had been redirected. This would free up the | certain offending flow had been redirected. This would free up the | |||
| buckets used to maintain the per-packet queuing score of persistent | buckets used to maintain the per-packet queuing score of persistent | |||
| offenders. Whole-flow redirection is outside the scope of the | offenders. Whole-flow redirection is outside the scope of the | |||
| current version of the QProt algorithm specified here, but it is | current version of the QProt algorithm specified here, but it is | |||
| briefly discussed at the end of Section 5.5. | briefly discussed at the end of Section 5.5. | |||
| It might be considered that all the packets of persistently offending | It might be considered that all the packets of persistently offending | |||
| flows ought to be discarded rather than redirected. However, this is | flows ought to be discarded rather than redirected. However, this is | |||
| not recommended because attack flows might be able to pervert whole- | not recommended because attack flows might be able to pervert whole- | |||
| flow discard, turning it onto at least some innocent flows, thus | flow discard, turning it onto at least some innocent flows, thus | |||
| amplifying an attack that causes reordering into total deletion of | amplifying an attack that causes reordering into total deletion of | |||
| some innocent flows. | some innocent flows. | |||
| 9.1.2. Exhausting Processing Resources | 8.1.2. Exhausting Processing Resources | |||
| The processing time needed to apply the QProt algorithm to each LL | The processing time needed to apply the QProt algorithm to each Low- | |||
| packet is small and intended not to take all the time available | Latency (LL) packet is small and intended not to take all the time | |||
| between each of a run of fairly small packets. However, an attack | available between each of a run of fairly small packets. However, an | |||
| could use minimum sized packets launched from multiple input | attack could use minimum sized packets launched from multiple input | |||
| interfaces into a lower capacity output interface. Whether the QProt | interfaces into a lower capacity output interface. Whether the QProt | |||
| algorithm is vulnerable to processor exhaustion will depend on the | algorithm is vulnerable to processor exhaustion will depend on the | |||
| specific implementation. | specific implementation. | |||
| Addition of a capability to redirect persistently offending flows | Addition of a capability to redirect persistently offending flows | |||
| from LL to C would be the most effective way to reduce the per-packet | from LL to Classic would be the most effective way to reduce the per- | |||
| processing cost of the QProt algorithm when under attack. As already | packet processing cost of the QProt algorithm when under attack. As | |||
| mentioned in Section 9.1.1, this would also be an effective way to | already mentioned in Section 8.1.1, this would also be an effective | |||
| mitigate flow-state exhaustion attacks. Further discussion of whole- | way to mitigate flow-state exhaustion attacks. Further discussion of | |||
| flow redirection is outside the scope of the present document but is | whole-flow redirection is outside the scope of the present document | |||
| briefly discussed at the end of Section 5.5. | but is briefly discussed at the end of Section 5.5. | |||
| 10. Comments Solicited | 9. Comments Solicited | |||
| Evaluation and assessment of the algorithm by researchers is | Evaluation and assessment of the algorithm by researchers is | |||
| solicited. Comments and questions are also encouraged and welcome. | solicited. Comments and questions are also encouraged and welcome. | |||
| They can be addressed to the authors. | They can be addressed to the authors. | |||
| 11. References | 10. References | |||
| 11.1. Normative References | 10.1. Normative References | |||
| [DOCSIS] CableLabs, "MAC and Upper Layer Protocols Interface | [DOCSIS] CableLabs, "MAC and Upper Layer Protocols Interface | |||
| (MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable | (MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable | |||
| Service Interface Specifications DOCSIS(r) 3.1 Version I17 | Service Interface Specifications DOCSIS(r) 3.1 Version I17 | |||
| or later, 21 January 2019, | or later, 21 January 2019, | |||
| <https://www.cablelabs.com/specifications/CM-SP- | <https://www.cablelabs.com/specifications/CM-SP- | |||
| MULPIv3.1>. | MULPIv3.1>. | |||
| [DOCSIS-CCAP-OSS] | [DOCSIS-CCAP-OSS] | |||
| CableLabs, "CCAP Operations Support System Interface | CableLabs, "CCAP Operations Support System Interface | |||
| skipping to change at line 1321 ¶ | skipping to change at line 1281 ¶ | |||
| Congestion Notification (ECN) Protocol for Low Latency, | Congestion Notification (ECN) Protocol for Low Latency, | |||
| Low Loss, and Scalable Throughput (L4S)", RFC 9331, | Low Loss, and Scalable Throughput (L4S)", RFC 9331, | |||
| DOI 10.17487/RFC9331, January 2023, | DOI 10.17487/RFC9331, January 2023, | |||
| <https://www.rfc-editor.org/info/rfc9331>. | <https://www.rfc-editor.org/info/rfc9331>. | |||
| [RFC9956] White, G., Fossati, T., and R. Geib, "A Non-Queue-Building | [RFC9956] White, G., Fossati, T., and R. Geib, "A Non-Queue-Building | |||
| Per-Hop Behavior (NQB PHB) for Differentiated Services", | Per-Hop Behavior (NQB PHB) for Differentiated Services", | |||
| RFC 9956, DOI 10.17487/RFC9956, April 2026, | RFC 9956, DOI 10.17487/RFC9956, April 2026, | |||
| <https://www.rfc-editor.org/info/rfc9956>. | <https://www.rfc-editor.org/info/rfc9956>. | |||
| 11.2. Informative References | 10.2. Informative References | |||
| [BBRv3] "TCP BBR v3 Release", commit 90210de, 18 March 2025, | [BBRv3] "TCP BBR v3 Release", commit 90210de, 18 March 2025, | |||
| <https://github.com/google/bbr/blob/v3/README.md>. | <https://github.com/google/bbr/blob/v3/README.md>. | |||
| [LLD] White, G., Sundaresan, K., and B. Briscoe, "Low Latency | [LLD] White, G., Sundaresan, K., and B. Briscoe, "Low Latency | |||
| DOCSIS: Technology Overview", CableLabs White Paper, | DOCSIS: Technology Overview", CableLabs White Paper, | |||
| February 2019, <https://cablela.bs/low-latency-docsis- | February 2019, <https://cablela.bs/low-latency-docsis- | |||
| technology-overview-february-2019>. | technology-overview-february-2019>. | |||
| [PRAGUE-CC] | [PRAGUE-CC] | |||
| De Schepper, K., Tilmans, O., Briscoe, B., and V. Goel, | De Schepper, K., Tilmans, O., Briscoe, B., and V. Goel, | |||
| "Prague Congestion Control", Work in Progress, Internet- | "Prague Congestion Control", Work in Progress, Internet- | |||
| Draft, draft-briscoe-iccrg-prague-congestion-control-04, | Draft, draft-briscoe-iccrg-prague-congestion-control-04, | |||
| 24 July 2024, <https://datatracker.ietf.org/doc/html/ | 24 July 2024, <https://datatracker.ietf.org/doc/html/ | |||
| draft-briscoe-iccrg-prague-congestion-control-04>. | draft-briscoe-iccrg-prague-congestion-control-04>. | |||
| [RFC4303] Kent, S., "IP Encapsulating Security Payload (ESP)", | [RFC4303] Kent, S., "IP Encapsulating Security Payload (ESP)", | |||
| RFC 4303, DOI 10.17487/RFC4303, December 2005, | RFC 4303, DOI 10.17487/RFC4303, December 2005, | |||
| <https://www.rfc-editor.org/info/rfc4303>. | <https://www.rfc-editor.org/info/rfc4303>. | |||
| [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion | ||||
| Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, | ||||
| <https://www.rfc-editor.org/info/rfc5681>. | ||||
| [RFC6789] Briscoe, B., Ed., Woundy, R., Ed., and A. Cooper, Ed., | [RFC6789] Briscoe, B., Ed., Woundy, R., Ed., and A. Cooper, Ed., | |||
| "Congestion Exposure (ConEx) Concepts and Use Cases", | "Congestion Exposure (ConEx) Concepts and Use Cases", | |||
| RFC 6789, DOI 10.17487/RFC6789, December 2012, | RFC 6789, DOI 10.17487/RFC6789, December 2012, | |||
| <https://www.rfc-editor.org/info/rfc6789>. | <https://www.rfc-editor.org/info/rfc6789>. | |||
| [RFC7713] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx) | [RFC7713] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx) | |||
| Concepts, Abstract Mechanism, and Requirements", RFC 7713, | Concepts, Abstract Mechanism, and Requirements", RFC 7713, | |||
| DOI 10.17487/RFC7713, December 2015, | DOI 10.17487/RFC7713, December 2015, | |||
| <https://www.rfc-editor.org/info/rfc7713>. | <https://www.rfc-editor.org/info/rfc7713>. | |||
| skipping to change at line 1373 ¶ | skipping to change at line 1337 ¶ | |||
| (L4S) Internet Service: Architecture", RFC 9330, | (L4S) Internet Service: Architecture", RFC 9330, | |||
| DOI 10.17487/RFC9330, January 2023, | DOI 10.17487/RFC9330, January 2023, | |||
| <https://www.rfc-editor.org/info/rfc9330>. | <https://www.rfc-editor.org/info/rfc9330>. | |||
| [RFC9332] De Schepper, K., Briscoe, B., Ed., and G. White, "Dual- | [RFC9332] De Schepper, K., Briscoe, B., Ed., and G. White, "Dual- | |||
| Queue Coupled Active Queue Management (AQM) for Low | Queue Coupled Active Queue Management (AQM) for Low | |||
| Latency, Low Loss, and Scalable Throughput (L4S)", | Latency, Low Loss, and Scalable Throughput (L4S)", | |||
| RFC 9332, DOI 10.17487/RFC9332, January 2023, | RFC 9332, DOI 10.17487/RFC9332, January 2023, | |||
| <https://www.rfc-editor.org/info/rfc9332>. | <https://www.rfc-editor.org/info/rfc9332>. | |||
| [RFC9438] Xu, L., Ha, S., Rhee, I., Goel, V., and L. Eggert, Ed., | ||||
| "CUBIC for Fast and Long-Distance Networks", RFC 9438, | ||||
| DOI 10.17487/RFC9438, August 2023, | ||||
| <https://www.rfc-editor.org/info/rfc9438>. | ||||
| [ScalingCC] | [ScalingCC] | |||
| Briscoe, B. and K. De Schepper, "Resolving Tensions | Briscoe, B. and K. De Schepper, "Resolving Tensions | |||
| between Congestion Control Scaling Requirements", | between Congestion Control Scaling Requirements", | |||
| arXiv:1904.07605, Simula Technical Report TR-CS-2016-001, | arXiv:1904.07605, Simula Technical Report TR-CS-2016-001, | |||
| DOI 10.48550/arXiv.1904.07605, July 2017, | DOI 10.48550/arXiv.1904.07605, July 2017, | |||
| <https://arxiv.org/abs/1904.07605>. | <https://arxiv.org/abs/1904.07605>. | |||
| [SCReAM] "SCReAM", commit 0208f59, 10 November 2025, | [SCReAM] "SCReAM", commit 0208f59, 10 November 2025, | |||
| <https://github.com/EricssonResearch/scream/blob/master/ | <https://github.com/EricssonResearch/scream/blob/master/ | |||
| README.md>. | README.md>. | |||
| End of changes. 89 change blocks. | ||||
| 233 lines changed or deleted | 202 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. | ||||