rfc9957.txt
skipping to change at line 70
1.1. Document Roadmap
1.2. Terminology
2. Approach (In Brief)
2.1. Mechanism
2.2. Policy
2.2.1. Policy Conditions
2.2.2. Policy Action
3. Necessary Flow Behaviour
4. Pseudocode Walk-Through
4.1. Input Parameters, Constants, and Variables
4.2. QProt Data Path
4.2.1. The qprotect() Function
4.2.2. The pick_bucket() Function
4.2.3. The fill_bucket() Function
4.2.4. The calcProbNative() Function
5. Rationale
5.1. Rationale: Blame for Queuing, Not for Rate in Itself
5.2. Rationale: Constant Aging of the Queuing Score
5.3. Rationale: Transformed Queuing Score
5.4. Rationale: Policy Conditions
5.5. Rationale: Reclassification as the Policy Action
skipping to change at line 110
Although the algorithm is defined in Annex P of [DOCSIS], it relies
on cross references to other parts of the set of specifications.
This document pulls all the strands together into one self-contained
document. The core of the document is a similar pseudocode walk-
through to that in the DOCSIS specification, but it also includes
additional material:
i. a brief overview,
ii. a definition of how a data sender needs to behave to avoid
triggering queue protection, and
iii. a section giving the rationale for the design choices.
Low queuing delay depends on hosts sending their data smoothly,
either at a low rate or responding to Explicit Congestion
Notification (ECN) (see [RFC8311] and [RFC9331]). Therefore, low-
latency queuing is something hosts create themselves, not something
the network gives them. This tends to ensure that self-interest
alone does not drive flows to mismark their packets for the low-
latency queue. However, traffic from an application that does not
skipping to change at line 211
contained within packets of a flow that have the Congestion
Experienced (CE) codepoint set in the IP-ECN field [RFC3168]
(including IP headers unless specified otherwise). Congestion-
bit-rate and congestion-volume were introduced in [RFC7713] and
[RFC6789].
DOCSIS: Data-Over-Cable System Interface Specification. "DOCSIS" is
a registered trademark of Cable Television Laboratories, Inc.
("CableLabs").
QProt: The DOCSIS Queue Protection function
non-queue-building: A flow that tends not to build a queue.
Note the use of lowercase "non-queue-building", which describes
the behaviour of a flow, in contrast to uppercase Non-Queue-
Building (NQB), which refers to a Diffserv marking [RFC9956]. A
flow identifying itself as uppercase NQB may not behave in a non-
queue-building way, which is what the QProt algorithm is intended
to detect.
queue-building: A flow that builds a queue. If it is classified
into the Low-Latency queue, it is therefore a candidate for the
QProt algorithm to detect and sanction.
ECN: Explicit Congestion Notification [RFC3168]
ECN marking or ECN signalling: Setting the IP-ECN field of an
increasing proportion of packets to the Congestion Experienced
(CE) codepoint [RFC3168] as queuing worsens.
2. Approach (In Brief)
The QProt algorithm is divided into mechanism and policy. There is
only a tiny amount of policy code, but policy might need to be
changed in the future. So, where hardware implementation is being
considered, it would be advisable to implement the policy aspects in
firmware or software:
* The mechanism aspects identify flows, maintain flow state, and
accumulate per-flow queuing scores;
skipping to change at line 287
accumulates faster:
* the higher the degree of queuing and
* the faster the flow's packets arrive when there is queuing.
Section 5.1 explains further why this score represents blame for
queuing.
The algorithm, as described so far, would accumulate a number that
would rise at the so-called congestion-rate of the flow (see
Section 1.2), i.e., the rate at which the flow is contributing to
congestion or the rate at which the AQM is forwarding bytes of the
flow that are ECN-marked. However, rather than growing continually,
the queuing score is also reduced (or "aged") at a constant rate.
This is because it is unavoidable for capacity-seeking flows to
induce a continuous low level of congestion as they track available
capacity. Section 5.2 explains why this allowance can be set to the
same constant for any scalable flow, whatever its bit rate.
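The accumulate-then-age behaviour described above can be sketched as
follows. This is an illustrative Python fragment, not the normative
pseudocode: the function name and the per-packet update shape are
assumptions based on this description alone.

```python
AGING = 2 ** 19  # queuing-score aging rate [B/s]; the spec's default

def update_score(score, pkt_size, prob_native, dt):
    """Per-packet queuing-score update: age the score at a constant
    rate, then accumulate blame in proportion to the packet's size and
    the current degree of queuing (the LL AQM's marking probability).
    """
    score = max(score - AGING * dt, 0.0)  # constant aging, floored at 0
    score += pkt_size * prob_native       # blame for queuing
    return score
```

A flow whose congestion-rate (bytes times marking probability, per
second) stays below AGING sees its score settle at a small bounded
value; above AGING, the score grows without limit.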
For implementation efficiency, the queuing score is transformed into
skipping to change at line 334
low-latency queue is to reclassify a packet into the Classic queue
(this is justified in Section 5.5).
3. Necessary Flow Behaviour
The QProt algorithm described here can be used for responsive and/or
unresponsive flows.
* It is possible to objectively describe the least responsive way
that a flow will need to respond to congestion signals in order to
avoid triggering queue protection, no matter the link capacity and
no matter how much other traffic there is.
* It is not possible to describe how fast or smooth an unresponsive
flow should be to avoid queue protection because this depends on
how much other traffic there is and the capacity of the link,
which an application is unable to know. However, the more
smoothly an unresponsive flow paces its packets and the lower its
rate relative to typical broadband link capacities, the less
likelihood that it will risk causing enough queueing to trigger
queue protection.
Responsive low-latency flows can use a Low Latency, Low Loss, and
Scalable throughput (L4S) ECN codepoint [RFC9331] to get classified
into the low-latency queue.
A sender can arrange for flows that are smooth but do not respond to
ECN marking to be classified into the low-latency queue by using the
Non-Queue-Building (NQB) Diffserv codepoint [RFC9956], which the
DOCSIS specifications support, or an operator can use various other
local classifiers.
skipping to change at line 370
The algorithm that calculates this internal variable is run on the
arrival of every LL packet, whether or not it is ECN-capable, so that
it can be used by the QProt algorithm. But the variable is only used
to ECN-mark packets that are ECN-capable.
Not only does this dual use of the variable improve processing
efficiency, but it also makes the basis of the QProt algorithm
visible and transparent, at least for responsive ECN-capable flows.
Then, it is possible to state objectively that a flow can avoid
triggering queue protection by keeping the bit rate of ECN-marked
packets (the congestion-rate) below AGING, which is a configured
constant of the algorithm (default 2^19 B/s ≈ 4 Mb/s). Note that it
is in a congestion controller's own interest to keep its average
congestion-rate well below this level (e.g., ~1 Mb/s) to ensure that
it does not trigger queue protection during transient dynamics.
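That objective statement reduces to simple arithmetic, sketched below
in illustrative Python (the function name is not from the spec):

```python
AGING = 2 ** 19  # default aging constant [B/s]; 2**19 * 8 b/s ≈ 4.2 Mb/s

def avoids_qprot_longterm(marked_bytes, interval_s, aging=AGING):
    """True if the flow's average congestion-rate -- the bytes of its
    packets that were ECN-marked, per second -- stays below the aging
    rate, so its queuing score cannot accumulate over the long term."""
    return marked_bytes / interval_s < aging
```

For example, an average congestion-rate of 2^17 B/s (~1 Mb/s) leaves
the recommended headroom below the 2^19 B/s default for transients.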
If the QProt algorithm is used in other settings, it would still need
to be based on the visible level of congestion signalling, in a
similar way to the DOCSIS approach. Without transparency of the
basis of the algorithm's decisions, end-systems would not be able to
avoid triggering queue protection on an objective basis.
4. Pseudocode Walk-Through
4.1. Input Parameters, Constants, and Variables
The operator input parameters that set the parameters in the first
two blocks of pseudocode below are defined for cable modems (CMs) in
[DOCSIS-CM-OSS] and for CMTSs in [DOCSIS-CCAP-OSS]. Then, further
constants are either derived from the input parameters or hard-coded.
skipping to change at line 403
latest DOCSIS specifications are the definitive references. Also,
any operator might set certain parameters to non-default values.
<CODE BEGINS>
// Input Parameters
MAX_RATE; // Configured maximum sustained rate [b/s]
QPROTECT_ON; // Queue Protection is enabled [Default: TRUE]
CRITICALqL_us; // LL queue threshold delay [µs] Default: MAXTH_us
CRITICALqLSCORE_us;// The threshold queuing score [Default: 4000 µs]
LG_AGING; // The aging rate of the q'ing score [Default: 19]
// as log base 2 of the congestion-rate [lg(B/s)]
// Input Parameters for the calcProbNative() algorithm:
MAXTH_us; // Max LL AQM marking threshold [Default: 1000 µs]
LG_RANGE; // Log base 2 of the range of ramp [lg(ns)]
// Default: 2^19 = 524288 ns (roughly 525 µs)
<CODE ENDS>
<CODE BEGINS>
// Constants, either derived from input parameters or hard-coded
T_RES; // Resolution of t_exp [ns]
skipping to change at line 500
Payload (ESP) [RFC4303].
The Microflow Classification section of the Queue Protection Annex of
the DOCSIS specification [DOCSIS] defines various strategies to find
these headers by skipping extension headers or encapsulations. If
they cannot be found, the specification defines various less-specific
3-tuples that would be used. The DOCSIS specification should be
referred to for all these strategies, which will not be repeated
here.
The array of bucket structures defined below is used by all the QProt
functions:
<CODE BEGINS>
struct bucket { // The leaky bucket structure to hold per-flow state
id; // identifier (e.g., 5-tuple) of flow using bucket
t_exp; // expiry time in units of T_RES
// (t_exp - now) = flow's transformed q'ing score
};
struct bucket buckets[NBUCKETS+1];
<CODE ENDS>
4.2. QProt Data Path
All the functions of QProt operate on the data path, driven by packet
arrivals.
The following functions that maintain per-flow queuing scores and
manage per-flow state are considered primarily as mechanism:
pick_bucket(uflow_id); // Returns bucket identifier
fill_bucket(bucket_id, pkt_size, probNative); /* Returns queuing
* score */
calcProbNative(qdelay); /* Returns ECN-marking probability of the
* Native LL AQM */
skipping to change at line 892
This suggests that the QProt algorithm will not sanction a well-
behaved scalable flow if it ages out the queuing score at a
sufficient constant rate. The constant will need to be somewhat
above the average of a well-behaved scalable flow to allow for normal
dynamics.
Relating QProt's aging constant to a scalable flow does not mean that
a flow has to behave like a scalable flow: it can be less aggressive
but not more aggressive. For instance, a longer RTT flow can run at
a lower congestion-rate than the aging rate, but it can also increase
its aggressiveness to equal the rate of short RTT scalable flows
[ScalingCC]. The constant aging of QProt also means that a long-
running unresponsive flow will be prone to trigger QProt if it runs
faster than a competing responsive scalable flow would. And, of
course, if a flow causes excessive queuing in the short term, its
queuing score will still rise faster than the constant aging process
will decrease it. Then, QProt will still eject the flow's packets
before they harm the low latency of the shared queue.
5.3. Rationale: Transformed Queuing Score
skipping to change at line 928
Figure 3: Transformation of Queuing Score
The accumulation and aging of the queuing score is shown on the left
of Figure 3 in token bucket form. Dt is the difference between the
times when the scores of the current and previous packets were
processed.
A transformed equivalent of this token bucket is shown on the right
of Figure 3, dividing both the input and output by the constant rate,
AGING. The result is a bucket-depth that represents time and it
drains at the rate that time passes.
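This transformation, together with the expiry-time optimization
described next, can be sketched in Python. The names and signatures
below are illustrative only; the spec's fill_bucket() pseudocode
(which works in units of T_RES) is definitive.

```python
def fill_bucket_sketch(t_exp, now, pkt_size, prob_native, aging=2**19):
    """Hold the transformed queuing score as an absolute expiry time.

    Dividing the byte increment pkt_size * prob_native [B] by the
    aging rate [B/s] turns it into seconds, so the bucket drains
    simply because wall-clock time passes -- no explicit aging step.
    """
    base = max(t_exp, now)  # an expired bucket restarts from "now"
    return base + pkt_size * prob_native / aging

def queuing_score(t_exp, now):
    """The flow's transformed queuing score: (t_exp - now), floored."""
    return max(t_exp - now, 0.0)
```

Storing only t_exp means expired flow state identifies itself: if
t_exp is in the past, the score is zero and the bucket is reusable.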
As a further optimization, the time the bucket was last updated is
not stored with the flow state. Instead, when the bucket is
initialized, the queuing score is added to the system time "now" and
the resulting expiry time is written into the bucket. Subsequently,
if the bucket has not expired, the incremental queuing score is added
to the time already held in the bucket. Then, the queuing score
always represents the expiry time of the flow state itself. This
means that the queuing score does not need to be aged explicitly
skipping to change at line 1183
attack flows.
For an attacker to keep buckets busy, it is more efficient to hold
onto them by cycling regularly through a set of port numbers (94 in
the above example) rather than to keep occupying and releasing
buckets with single packet flows across a much larger number of
ports.
During such an attack, the coupled marking probability will have
saturated at 100%. Therefore, to hold a bucket, the rate of an attack
flow needs to be no less than the aging rate of each bucket: 4 Mb/s
by default. However, for an attack flow to be sure to hold on to its
bucket, it would need to send somewhat faster. Thus, an attack with
100 flows would need a total force of more than 100 * 4 Mb/s = 400
Mb/s.
This attack can be mitigated (but not prevented) by increasing the
number of buckets. The required attack force scales linearly with
the number of buckets, NBUCKETS. Therefore, if NBUCKETS were doubled
to 64, twice as many 4 Mb/s flows would be needed to maintain the
same impact on innocent flows.
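The linear scaling of the required attack force is one line of
arithmetic (illustrative Python; the function name is not from the
spec):

```python
def min_attack_force_bps(nbuckets, aging_Bps=2**19):
    """Lower bound on the aggregate sending rate [b/s] needed to keep
    every bucket's score topped up at least as fast as it ages, once
    the coupled marking probability has saturated at 100%."""
    return nbuckets * aging_Bps * 8  # one >= ~4 Mb/s flow per bucket
```

Keeping 100 buckets busy needs more than 100 * 2^19 B/s * 8, i.e.
just over 419 Mb/s, and doubling NBUCKETS doubles that requirement.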