TCP Encapsulation of Internet Key Exchange Protocol (IKE) and IPsec PacketsApple Inc.1 Infinite LoopCupertinoCalifornia95014United States of Americatpauly@apple.comELVIS-PLUSPO Box 81Moscow (Zelenograd)124460Russian Federation+7 495 276 0211svan@elvis.ru
sec
ipsecme This document describes a method to transport Internet Key Exchange
Protocol (IKE) and IPsec packets over a TCP connection for traversing
network middleboxes that may block IKE negotiation over UDP. This
method, referred to as "TCP encapsulation", involves sending both IKE
packets for Security Association (SA) establishment and Encapsulating
Security Payload (ESP) packets over a TCP connection. This method is
intended to be used as a fallback option when IKE cannot be
negotiated over UDP.
TCP encapsulation for IKE and IPsec was defined in RFC 8229.
This document clarifies the specification for TCP encapsulation by including
additional clarifications obtained during implementation and deployment
of this method. This documents obsoletes RFC 8229.
Status of This Memo
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by
the Internet Engineering Steering Group (IESG). Further
information on Internet Standards is available in Section 2 of
RFC 7841.
Information about the current status of this document, any
errata, and how to provide feedback on it may be obtained at
.
Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
() in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Revised BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Revised BSD License.
Table of Contents
. Introduction
. Prior Work and Motivation
. Terminology and Notation
. Configuration
. TCP-Encapsulated Data Formats
. TCP-Encapsulated IKE Message Format
. TCP-Encapsulated ESP Packet Format
. TCP-Encapsulated Stream Prefix
. Applicability
. Recommended Fallback from UDP
. Using TCP Encapsulation
. Connection Establishment and Teardown
. Retransmissions
. Cookies and Puzzles
. Statelessness versus Delay of SA Establishment
. Error Handling in IKE_SA_INIT
. NAT-Detection Payloads
. NAT-Keepalive Packets
. Dead Peer Detection and Transport Keepalives
. Implications of TCP Encapsulation on IPsec SA Processing
. Interaction with IKEv2 Extensions
. MOBIKE Protocol
. IKE Redirect
. IKEv2 Session Resumption
. IKEv2 Protocol Support for High Availability
. IKEv2 Fragmentation
. Middlebox Considerations
. Performance Considerations
. TCP-in-TCP
. Added Reliability for Unreliable Protocols
. Quality-of-Service Markings
. Maximum Segment Size
. Tunneling ECN in TCP
. Security Considerations
. IANA Considerations
. References
. Normative References
. Informative References
. Using TCP Encapsulation with TLS
. Example Exchanges of TCP Encapsulation with TLS 1.3
. Establishing an IKE Session
. Deleting an IKE Session
. Re-establishing an IKE Session
. Using MOBIKE between UDP and TCP Encapsulation
Acknowledgments
Authors' Addresses
Introduction The Internet Key Exchange Protocol version 2 (IKEv2) is a
protocol for establishing IPsec Security Associations (SAs) using
IKE messages over UDP for control traffic and using Encapsulating
Security Payload (ESP) messages for encrypted data traffic.
Many network middleboxes that filter traffic on public hotspots block
all UDP traffic, including IKE and IPsec, but allow TCP connections
through because they appear to be web traffic. Devices on these
networks that need to use IPsec (to access private enterprise
networks, to route Voice over IP calls to carrier networks
because of security policies, etc.) are unable to establish IPsec SAs.
This document defines a method for encapsulating IKE control messages
as well as ESP data messages within a TCP connection. Note that Authentication Header (AH) is not supported by this specification.
Using TCP as a transport for IPsec packets adds the third option (below) to the
list of traditional IPsec transports:
Direct. Usually, IKE negotiations begin over UDP port 500. If
no Network Address Translation (NAT) device is detected between
the Initiator and the Responder, then subsequent IKE packets are
sent over UDP port 500 and IPsec data packets are sent
using ESP.
UDP Encapsulation. Described in . If a NAT is detected between the
Initiator and the Responder, then subsequent IKE packets are sent
over UDP port 4500 with 4 bytes of zero at the start of the
UDP payload, and ESP packets are sent out over UDP port 4500.
Some implementations default to using UDP encapsulation even when no NAT is
detected on the path, as some middleboxes do not support IP
protocols other than TCP and UDP.
TCP Encapsulation. Described in this document. If the other two methods are not available or
appropriate, IKE negotiation packets as well as ESP packets can
be sent over a single TCP connection to the peer.
Direct use of ESP or UDP encapsulation should be preferred by
IKE implementations due to performance concerns when using
TCP encapsulation (). Most implementations should use
TCP encapsulation only on networks where negotiation over UDP has
been attempted without receiving responses from the peer or if a
network is known to not support UDP.Prior Work and Motivation Encapsulating IKE connections within TCP streams is a common approach
to solve the problem of UDP packets being blocked by network
middleboxes. The specific goals of this document are as follows:
To promote interoperability by defining a standard method of
framing IKE and ESP messages within TCP streams.
To be compatible with the current IKEv2 standard without requiring
modifications or extensions.
To use IKE over UDP by default to avoid the overhead of other
alternatives that always rely on TCP or Transport Layer Security
(TLS) .
Some previous alternatives include:
Cellular Network Access:
Interworking Wireless LAN (IWLAN) uses IKEv2 to create secure
connections to cellular carrier networks for making voice calls
and accessing other network services over Wi-Fi networks. 3GPP has
recommended that IKEv2 and ESP packets be sent within a TLS
connection to be able to establish connections on restrictive
networks.
ISAKMP over TCP:
Various non-standard extensions to the Internet Security
Association and Key Management Protocol (ISAKMP) have been
deployed that send IPsec traffic over TCP or TCP-like packets.
Secure Sockets Layer (SSL) VPNs:
Many proprietary VPN solutions use a combination of TLS and IPsec
in order to provide reliability. These often run on TCP port 443.
IKEv2 over TCP:
IKEv2 over TCP as described in is used to avoid UDP
fragmentation.
TCP encapsulation for IKE and IPsec was defined in .
This document updates the specification for TCP encapsulation by including
additional clarifications obtained during implementation and deployment
of this method.
In particular:
The interpretation of the Length field preceding every message is clarified ().
The use of the NAT_DETECTION_*_IP notifications is clarified
(Sections , ,
and ).
Retransmission behavior is clarified ().
The use of cookies and puzzles is described in more detail ().
Error handling is clarified ().
Implications of TCP encapsulation on IPsec SA processing are expanded ().
describing interactions with other IKEv2 extensions is added.
The interaction of TCP encapsulation with IKEv2 Mobility and Multihoming (MOBIKE) is clarified ().
The recommendation for TLS encapsulation () now includes TLS 1.3.
Examples of TLS encapsulation are provided using TLS 1.3 ().
More security considerations are added.
Terminology and Notation This document distinguishes between the IKE peer that initiates TCP
connections to be used for TCP encapsulation and the roles of
Initiator and Responder for particular IKE messages. During the
course of IKE exchanges, the role of IKE Initiator and Responder may
swap for a given SA (as with IKE SA rekeys), while the Initiator of
the TCP connection is still responsible for tearing down the TCP
connection and re-establishing it if necessary. For this reason,
this document will use the term "TCP Originator" to indicate the IKE
peer that initiates TCP connections. The peer that receives TCP
connections will be referred to as the "TCP Responder". If an IKE SA
is rekeyed one or more times, the TCP Originator MUST remain the peer
that originally initiated the first IKE SA.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14
when, and only when, they appear in all capitals, as shown here.
ConfigurationOne of the main reasons to use TCP encapsulation is that UDP traffic
may be entirely blocked on a network. Because of this, support for
TCP encapsulation is not specifically negotiated in the IKE exchange.
Instead, support for TCP encapsulation must be preconfigured on both
the TCP Originator and the TCP Responder.Compliant implementations MUST support TCP encapsulation on TCP port 4500,
which is reserved for IPsec NAT traversal.Beyond a flag indicating support for TCP encapsulation, the
configuration for each peer can include the following optional
parameters:
Alternate TCP ports on which the specific TCP Responder listens
for incoming connections. Note that the TCP Originator may
initiate TCP connections to the TCP Responder from any local port.
An extra framing protocol to use on top of TCP to further
encapsulate the stream of IKE and IPsec packets. See
for a detailed discussion.
Since TCP encapsulation of IKE and IPsec packets adds overhead and
has potential performance trade-offs compared to direct or
UDP-encapsulated SAs (as described in ), implementations
SHOULD prefer ESP direct or UDP-encapsulated SAs over
TCP-encapsulated SAs when possible.
TCP-Encapsulated Data FormatsLike UDP encapsulation, TCP encapsulation uses the first 4 bytes
of a message to differentiate IKE and ESP messages. TCP
encapsulation also adds a 16-bit Length field that precedes every message
to define the boundaries of messages within a stream.
The value in this field is equal to the length of the original message
plus the length of the field itself, in octets. If the first 32 bits
of the message are zeros (a non-ESP marker), then the contents comprise an
IKE message. Otherwise, the contents comprise an ESP message.
AH messages are not supported for TCP
encapsulation.
Although a TCP stream may be able to send very long messages,
implementations SHOULD limit message lengths to match the lengths
used for UDP encapsulation of ESP messages.
The maximum message length is used as the effective MTU
for connections that are being encrypted using ESP,
so the maximum message length will influence characteristics of these
connections, such as the TCP Maximum Segment Size (MSS).
Due to the fact that the Length field is 16 bits and includes both the message length and the
length of the field itself, it is impossible to encapsulate messages greater than 65533
octets in length. In most cases, this is not a problem. Note that a similar
limitation exists for encapsulation ESP in UDP .
The minimum size of an encapsulated message is 1 octet
(for NAT-keepalive packets, see ). Empty messages
(where the Length field equals 2) MUST be silently ignored by receiver.
Note that this method of encapsulation will also work for placing IKE
and ESP messages within any protocol that presents a stream
abstraction, beyond TCP.
TCP-Encapsulated IKE Message Format The IKE message is preceded by a 16-bit Length field in network byte
order that specifies the length of the IKE message (including the
non-ESP marker) within the TCP stream. As with IKE over UDP
port 4500, a zeroed 32-bit non-ESP marker is inserted before the
start of the IKE header in order to differentiate the traffic from
ESP traffic between the same addresses and ports.
Length (2 octets, unsigned integer):
Length of the IKE message,
including the Length field and non-ESP marker. The value in the Length
field MUST NOT be 0 or 1. The receiver MUST treat these values as
fatal errors and MUST close the TCP connection.
Non-ESP Marker (4 octets):
Four zero-valued bytes.
TCP-Encapsulated ESP Packet FormatThe ESP packet is preceded by a 16-bit Length field in network byte
order that specifies the length of the ESP packet within the TCP
stream.The Security Parameter Index (SPI) field in the ESP header
MUST NOT be a zero value.
Length (2 octets, unsigned integer):
Length of the ESP
packet, including the Length field. The value in the Length field
MUST NOT be 0 or 1. The receiver MUST
treat these values as fatal errors and MUST close TCP
connection.
TCP-Encapsulated Stream PrefixEach stream of bytes used for IKE and IPsec encapsulation MUST begin
with a fixed sequence of 6 bytes as a magic value, containing the
characters "IKETCP" as ASCII values.
This value is intended to
identify and validate that the TCP connection is being used for TCP
encapsulation as defined in this document, to avoid conflicts with
the prevalence of previous non-standard protocols that used TCP
port 4500. This value is only sent once, by the TCP Originator only,
at the beginning of the TCP stream of IKE and ESP messages.
Initiator Responder
---------------------------------------------------------------------
<new TCP connection is established by Initiator>
Stream Prefix|Length|non-ESP marker|IKE message -->
<-- Length|non-ESP marker|IKE message
Length|non-ESP marker|IKE message -->
<-- Length|non-ESP marker|IKE message
[...]
Length|ESP packet ->
<- Length|ESP packetIf other framing protocols are used within TCP to further encapsulate
or encrypt the stream of IKE and ESP messages, the stream prefix must
be at the start of the TCP Originator's IKE and ESP message stream
within the added protocol layer (). Although some framing
protocols do support negotiating inner protocols, the stream prefix
should always be used in order for implementations to be as generic
as possible and not rely on other framing protocols on top of TCP.
ApplicabilityTCP encapsulation is applicable only when it has been configured to
be used with specific IKE peers. If a Responder is configured to accept and is allowed to use
TCP encapsulation, it MUST listen on the configured port(s) in case
any peers will initiate new IKE sessions. Initiators MAY use TCP
encapsulation for any IKE session to a peer that is configured to
support TCP encapsulation, although it is recommended that Initiators
only use TCP encapsulation when traffic over UDP is blocked.
Since the support of TCP encapsulation is a configured property, not
a negotiated one, it is recommended that if there are multiple IKE
endpoints representing a single peer (such as multiple machines with
different IP addresses when connecting by Fully Qualified Domain
Name (FQDN), or endpoints used with IKE redirection), all of the endpoints
equally support TCP encapsulation.
If TCP encapsulation is being used for a specific IKE SA, all
IKE messages for that IKE SA and ESP packets for its Child SAs MUST be sent over a TCP
connection until the SA is deleted or IKEv2 Mobility and Multihoming
(MOBIKE) is used to change the SA endpoints and/or the encapsulation
protocol. See for more details on using MOBIKE to
transition between encapsulation modes.
Recommended Fallback from UDPSince UDP is the preferred method of transport for IKE messages,
implementations that use TCP encapsulation should have an algorithm
for deciding when to use TCP after determining that UDP is unusable.
If an Initiator implementation has no prior knowledge about the
network it is on and the status of UDP on that network, it SHOULD
always attempt to negotiate IKE over UDP first. IKEv2 defines how to
use retransmission timers with IKE messages and, specifically,
IKE_SA_INIT messages . Generally, this means that the
implementation will define a frequency of retransmission and the
maximum number of retransmissions allowed before marking the IKE SA
as failed. An implementation can attempt negotiation over TCP once
it has hit the maximum retransmissions over UDP, or slightly before
to reduce connection setup delays. It is recommended that the
initial message over UDP be retransmitted at least once before
falling back to TCP, unless the Initiator knows beforehand that the
network is likely to block UDP.
When switching from UDP to TCP, a new IKE_SA_INIT exchange MUST be
initiated with the Initiator's new SPI and with recalculated content of
NAT_DETECTION_*_IP notifications.
Using TCP EncapsulationConnection Establishment and TeardownWhen the IKE Initiator uses TCP encapsulation, it will initiate a TCP
connection to the Responder using the Responder's preconfigured TCP port. The first
bytes sent on the TCP stream MUST be the stream prefix value ().
After this prefix, encapsulated IKE messages will negotiate the IKE
SA and initial Child SA . After this point, both
encapsulated IKE () and ESP ()
messages will be sent over the TCP connection. The TCP Responder MUST wait for the entire
stream prefix to be received on the stream before trying to parse out
any IKE or ESP messages. The stream prefix is sent only once, and
only by the TCP Originator.
In order to close an IKE session, either the Initiator or Responder
SHOULD gracefully tear down IKE SAs with DELETE payloads. Once the
SA has been deleted, the TCP Originator SHOULD close the TCP
connection if it does not intend to use the connection for another
IKE session to the TCP Responder. If the TCP connection is no longer
associated with any active IKE SA, the TCP Responder MAY close the connection
to clean up IKE resources if the TCP Originator didn't close it within some reasonable period of time (e.g., a few seconds).
An unexpected FIN or a TCP Reset on the TCP connection may indicate a
loss of connectivity, an attack, or some other error. If a DELETE
payload has not been sent, both sides SHOULD maintain the state for
their SAs for the standard lifetime or timeout period. The TCP
Originator is responsible for re-establishing the TCP connection if
it is torn down for any unexpected reason. Since new TCP connections
may use different IP addresses and/or ports due to NAT mappings or local address or port allocations
changing, the TCP Responder MUST allow packets for existing SAs to be
received from new source IP addresses and ports. Note that the
IPv6 Flow-ID header MUST remain constant when a new TCP connection is created to avoid ECMP load balancing.
A peer MUST discard a partially received message due to a broken
connection.
Whenever the TCP Originator opens a new TCP connection to be used for
an existing IKE SA, it MUST send the stream prefix first, before any
IKE or ESP messages. This follows the same behavior as the initial
TCP connection.
Multiple IKE SAs MUST NOT share a single TCP connection, unless one
is a rekey of an existing IKE SA, in which case there will
temporarily be two IKE SAs on the same TCP connection.
If a TCP connection is being used to continue an existing IKE/ESP
session, the TCP Responder can recognize the session using either the
IKE SPI from an encapsulated IKE message or the ESP SPI from an
encapsulated ESP packet. If the session had been fully established
previously, it is suggested that the TCP Originator send an
UPDATE_SA_ADDRESSES message if MOBIKE is supported and an
empty informational message if it is not.
The TCP Responder MUST NOT accept any messages for the existing IKE
session on a new incoming connection, unless that connection begins
with the stream prefix. If either the TCP Originator or TCP
Responder detects corruption on a connection that was started with a
valid stream prefix, it SHOULD close the TCP connection. The
connection can be corrupted if there are too many
subsequent messages that cannot be parsed as valid IKE messages or
ESP messages with known SPIs, or if the authentication check for an
IKE message or ESP message with a known SPI fails. Implementations SHOULD NOT
tear down a connection if only a few consecutive ESP packets have unknown
SPIs since the SPI databases may be momentarily out of sync. If
there is instead a syntax issue within an IKE message, an
implementation MUST send the INVALID_SYNTAX notify payload and
tear down the IKE SA as usual, rather than tearing down the TCP
connection directly.
A TCP Originator SHOULD only open one TCP connection per IKE SA, over
which it sends all of the corresponding IKE and ESP messages. This
helps ensure that any firewall or NAT mappings allocated for the TCP
connection apply to all of the traffic associated with the IKE SA
equally.
As with TCP Originators, a TCP Responder SHOULD send packets for an
IKE SA and its Child SAs over only one TCP connection at any given
time. It SHOULD choose the TCP connection on which it last received
a valid and decryptable IKE or ESP message. In order to be
considered valid for choosing a TCP connection, an IKE message must
be successfully decrypted and authenticated, not be a retransmission
of a previously received message, and be within the expected window
for IKE message IDs. Similarly, an ESP message must be successfully
decrypted and authenticated, and must not be a replay of a previous
message.
Since a connection may be broken and a new connection re-established
by the TCP Originator without the TCP Responder being aware, a TCP
Responder SHOULD accept receiving IKE and ESP messages on both old
and new connections until the old connection is closed by the TCP
Originator. A TCP Responder MAY close a TCP connection that it
perceives as idle and extraneous (one previously used for IKE and ESP
messages that has been replaced by a new connection).
Retransmissions describes how IKEv2 deals with the unreliability
of the UDP protocol.
In brief, the exchange Initiator is responsible for retransmissions
and must retransmit request messages until a response message is
received. If no reply is received after several
retransmissions, the SA is deleted. The Responder never initiates retransmission,
but it must send a response message again in case it receives a retransmitted request.
When IKEv2 uses a reliable transport protocol, like TCP, the retransmission rules are as follows:
The exchange Initiator SHOULD NOT retransmit request message (*); if
no response is received within some reasonable period of time, the
IKE SA is deleted.
If a new TCP connection for the IKE SA is established while the exchange
Initiator is waiting for a response, the Initiator MUST
retransmit its request over this connection and continue to wait for a response.
The exchange Responder does not change its behavior, but acts as
described in .
(*) This is an optimization; implementations may continue to use the retransmission logic from
for simplicity.
Cookies and PuzzlesIKEv2 provides a DoS attack protection mechanism through Cookies, which
is described in . extends this
mechanism for protection against DDoS attacks by means of Client
Puzzles. Both mechanisms allow the Responder to avoid keeping state until
the Initiator proves its IP address is legitimate (and after solving a puzzle if required).
The connection-oriented nature of TCP transport brings additional
considerations for using these mechanisms.
In general, Cookies provide less value in the case of TCP encapsulation; by the time a Responder receives the IKE_SA_INIT request, the TCP
session has already been established and the Initiator's IP address
has been verified. Moreover, a TCP/IP stack creates state once
a TCP SYN packet is received (unless SYN Cookies described in
are employed), which contradicts the statelessness of IKEv2 Cookies.
In particular, with TCP, an attacker is able to mount a SYN flooding DoS attack
that an IKEv2 Responder cannot prevent using stateless IKEv2 Cookies.
Thus, when using TCP encapsulation, it makes little sense to send Cookie requests without
Puzzles unless the Responder is concerned with a possibility of TCP
sequence number attacks (see and for details). Puzzles, on
the other hand, still remain useful (and their use requires using Cookies).
The following considerations are applicable for using Cookie and
Puzzle mechanisms in the case of TCP encapsulation:
The exchange Responder SHOULD NOT send an IKEv2 Cookie request without an accompanied Puzzle;
implementations might choose to have exceptions to this for cases like mitigating TCP sequence number attacks.
If the Responder chooses to send a Cookie request (possibly along
with Puzzle request), then the TCP connection that the IKE_SA_INIT
request message was received over SHOULD be closed after the Responder sends its reply
and no repeated requests are received within some short period of time
to keep the Responder stateless (see ). Note that the Responder MUST NOT
include the Initiator's TCP port into the Cookie
calculation (*) since the Cookie can be returned over a new
TCP connection with a different port.
The exchange Initiator acts as described in and , i.e., using TCP encapsulation doesn't change the Initiator's behavior.
(*) Examples of Cookie calculation methods are given in and in , and they don't
include transport protocol ports. However, these examples are given
for illustrative purposes since the Cookie generation algorithm is a
local matter and some implementations might include port numbers
that won't work with TCP encapsulation. Note also that these
examples include the Initiator's IP address in Cookie calculation.
In general, this address may change between two initial requests
(with and without Cookies). This may happen due to NATs, which
have more freedom to change source IP addresses for new TCP
connections than for UDP. In such cases, cookie verification might
fail.
Statelessness versus Delay of SA Establishment
There is a trade-off in choosing the period of time after which
the TCP connection is closed. If it is too short, then the proper Initiator
that repeats its request would need to re-establish the TCP connection,
introducing additional delay. On the other hand, if it is too long, then
the Responder's resources would be wasted in case the Initiator never comes back.
This document doesn't mandate the duration of time because it doesn't affect interoperability,
but it is believed that 5-10 seconds is a good compromise. Also, note that if the Responder requests that
the Initiator solve a puzzle, then the Responder can estimate how long it would take the Initiator
to find a solution and adjust the time interval accordingly.
Error Handling in IKE_SA_INIT
describes how error notifications are handled in the IKE_SA_INIT
exchange. In particular, it is advised that the Initiator should not
act immediately after receiving an error notification; instead, it should
wait some time for a valid response since the IKE_SA_INIT
messages are completely unauthenticated. This advice does not apply
equally in the case of TCP encapsulation. If the Initiator receives a
response message over TCP, then either this message is genuine and was
sent by the peer or the TCP session was hijacked and the message is
forged. In the latter case, no genuine messages from the Responder
will be received.
Thus, in the case of TCP encapsulation, an Initiator SHOULD NOT wait for
additional messages in case it receives an error notification from the
Responder in the IKE_SA_INIT exchange.
In the IKE_SA_INIT exchange, if the Responder returns an error notification that implies
a recovery action from the Initiator (such as INVALID_KE_PAYLOAD
or INVALID_MAJOR_VERSION, see ),
then the Responder SHOULD NOT close the TCP connection immediately in anticipation of the fact
that the Initiator will repeat the request with corrected parameters.
See also .
NAT-Detection PayloadsWhen negotiating over UDP, IKE_SA_INIT packets include
NAT_DETECTION_SOURCE_IP and NAT_DETECTION_DESTINATION_IP payloads to
determine if UDP encapsulation of IPsec packets should be used.
These payloads contain SHA-1 digests of the SPIs, IP addresses, and
ports as defined in . IKE_SA_INIT packets sent on a TCP
connection SHOULD include these payloads with the same content as
when sending over UDP and SHOULD use the applicable TCP ports when
creating and checking the SHA-1 digests.
If a NAT is detected due to the SHA-1 digests not matching the
expected values, no change should be made for encapsulation of
subsequent IKE or ESP packets since TCP encapsulation inherently
supports NAT traversal. However, for the transport mode IPsec SAs, implementations
need to handle TCP and UDP packet checksum fixup during decapsulation,
as defined for UDP encapsulation in .
Implementations MAY use the information that
a NAT is present to influence keepalive timer values.
NAT-Keepalive PacketsEncapsulating IKE and IPsec inside of a TCP connection can impact the
strategy that implementations use to
maintain middlebox port mappings.
In general, TCP port mappings are maintained by NATs longer than UDP
port mappings, so IPsec ESP NAT-keepalive packets SHOULD NOT be
sent when using TCP encapsulation. Any implementation using TCP
encapsulation MUST silently drop incoming NAT-keepalive packets
and not treat them as errors. NAT-keepalive packets over a
TCP-encapsulated IPsec connection will be sent as a 1-octet-long payload
with the value 0xFF, preceded by the 2-octet Length specifying a length
of 3 (since it includes the length of the Length field).
Dead Peer Detection and Transport KeepalivesPeer liveness should be checked
using IKE informational packets .
Note that, depending on the configuration of TCP and TLS on the
connection, TCP keep-alives and TLS keep-alives MAY be used. These MUST NOT be used as indications of IKE peer
liveness, for which purpose the standard IKEv2 mechanism of exchanging (usually empty) INFORMATIONAL messages is used
(see ).
Implications of TCP Encapsulation on IPsec SA ProcessingUsing TCP encapsulation affects some aspects of IPsec SA processing.
requires all tunnel mode IPsec SAs to
be able to copy the Don't Fragment (DF) bit from inner IPv4 header to
the outer (tunnel) one. With TCP encapsulation, this is generally
not possible because the TCP/IP stack manages the DF bit in the outer IPv4
header, and usually the stack ensures that the DF bit is set for TCP
packets to avoid IP fragmentation. Note, that this behavior is
compliant with generic tunneling considerations since the outer TCP header acts
as a link-layer protocol and its fragmentation and reassembly have no correlation with
the inner payload.
The other feature that is less applicable with TCP encapsulation is an
ability to split traffic of different QoS classes into different
IPsec SAs, created by a single IKE SA. In this case, the
Differentiated Services Code Point (DSCP) field is usually copied
from the inner IP header to the outer (tunnel) one, ensuring that
IPsec traffic of each SA receives the corresponding level of service.
With TCP encapsulation, all IPsec SAs created by a single IKE SA will
share a single TCP connection; thus, they will receive the same level of
service (see ). If this functionality is needed,
implementations should create several IKE SAs each over separate TCP connections
and assign a corresponding DSCP value to each of them.
TCP encapsulation of IPsec packets may have implications
on performance of the encapsulated traffic. Performance considerations
are discussed in .
Interaction with IKEv2 ExtensionsMOBIKE ProtocolThe MOBIKE protocol, which allows SAs to migrate between IP
addresses, is defined in ; further clarifies
the details of the protocol. When an IKE session that has negotiated MOBIKE is
transitioning between networks, the Initiator of the transition may
switch between using TCP encapsulation, UDP encapsulation, or no
encapsulation. Implementations that implement both MOBIKE and TCP
encapsulation within the same connection configuration
MUST support dynamically enabling and disabling TCP
encapsulation as interfaces change.
When a MOBIKE-enabled Initiator changes networks, the
INFORMATIONAL exchange with the UPDATE_SA_ADDRESSES notification SHOULD be initiated
first over UDP before attempting over TCP. If there is a response to the
request sent over UDP, then the ESP packets should be sent directly over IP or over UDP port 4500
(depending on if a NAT was detected), regardless of if a connection on a previous
network was using TCP encapsulation. If no response is received within a certain period of time after
several retransmissions, the Initiator ought to change its transport for this exchange from
UDP to TCP and resend the request message. A new INFORMATIONAL exchange MUST NOT be started in this situation. If the Responder only responds to the request sent over TCP, then
the ESP packets should be sent over the TCP connection, regardless of
if a connection on a previous network did not use TCP encapsulation.
The value of the timeout and the specific number of retransmissions before switching to
TCP can vary depending on the Initiator's configuration. Implementations ought to provide
reasonable defaults to ensure that UDP attempts have a chance to succeed, but can shorten
the timeout based on historical data or metrics.
If the TCP transport was used for the previous network connection, the old TCP
connection SHOULD be closed by the Initiator once MOBIKE finishes migration
to a new connection (either TCP or UDP).
Since switching from UDP to TCP can happen during a single
INFORMATIONAL message exchange, the content of the NAT_DETECTION_*_IP
notifications will in most cases be incorrect (since UDP and TCP ports
will most likely be different), and the peer may incorrectly detect
the presence of a NAT. states that
a new INFORMATIONAL exchange with the UPDATE_SA_ADDRESSES notify is initiated
in case the address (or transport) is changed while waiting for a response.
also states that
once an IKE SA is switched to a new IP address, all outstanding requests in this SA
are immediately retransmitted using this address. See also .
The MOBIKE protocol defines the NO_NATS_ALLOWED notification that can be
used to detect the presence of NAT between peer and to refuse to
communicate in this situation. In the case of TCP, the NO_NATS_ALLOWED
notification SHOULD be ignored because TCP generally has no problems
with NAT boxes.
describes an additional optional step in the
process of changing IP addresses called "Return Routability Check". It
is performed by Responders in order to be sure that the new
Initiator's address is, in fact, routable.
In the case of TCP encapsulation, this check has little value since a
TCP handshake proves the routability of the TCP Originator's address;
thus, the Return Routability Check SHOULD NOT be performed.
IKE RedirectA redirect mechanism for IKEv2 is defined in . This mechanism
allows security gateways to redirect clients to another gateway
either during IKE SA establishment or after session setup. If a
client is connecting to a security gateway using TCP and
then is redirected to another security gateway, the client
needs to reset its transport selection.
In other words, with the next security gateway, the client MUST first try UDP and then fall
back to TCP while establishing a new IKE SA, regardless of the transport of
the SA the redirect notification was received over (unless the client's
configuration instructs it to instantly use TCP for the gateway it is
redirected to).
IKEv2 Session ResumptionSession resumption for IKEv2 is defined in . Once an IKE SA is
established, the server creates a resumption ticket where information
about this SA is stored and transfers this ticket to the client.
The ticket may be later used to resume the IKE SA after it is deleted.
In the event of resumption, the client presents the ticket in a new
exchange, called IKE_SESSION_RESUME. Some parameters in the new SA
are retrieved from the ticket and others are renegotiated (more details
are given in ).
Since network conditions may change while the client is inactive,
the fact that TCP encapsulation was used in an old SA SHOULD NOT affect which transport
is used during session resumption. In other words, the transport should be
selected as if the IKE SA is being created from scratch.
IKEv2 Protocol Support for High Availability defines a support for High Availability in IKEv2.
In case of cluster failover, a new active node
must immediately initiate a special INFORMATION exchange containing the
IKEV2_MESSAGE_ID_SYNC notification, which instructs the client to
skip some number of Message IDs that might not be synchronized yet
between nodes at the time of failover.
Synchronizing states when using TCP encapsulation is much harder than
when using UDP; doing so requires access to TCP/IP stack internals, which is
not always available from an IKE/IPsec implementation. If a cluster
implementation doesn't synchronize TCP states between nodes, then
after failover event the new active node will not have any TCP
connection with the client, so the node cannot initiate the
INFORMATIONAL exchange as required by . Since the cluster
usually acts as TCP Responder, the new active node cannot re- establish TCP connection because only the TCP Originator can do it.
For the client, the cluster failover event may remain
undetected for long time if it has no IKE or ESP traffic to send. Once
the client sends an ESP or IKEv2 packet, the cluster node will reply
with TCP RST and the client (as TCP Originator) will reestablish the TCP
connection so that the node will be able to initiate the
INFORMATIONAL exchange informing the client about the cluster
failover.
This document makes the following recommendation: if support for High
Availability in IKEv2 is negotiated and TCP transport is used,
a client that is a TCP Originator SHOULD periodically send
IKEv2 messages (e.g., by initiating liveness check exchange) whenever
there is no IKEv2 or ESP traffic. This differs from the
recommendations given in in the following:
the liveness check should be periodically performed even if the
client has nothing to send over ESP. The frequency of sending such
messages should be high enough to allow quick detection and restoration
of broken TCP connections.
IKEv2 FragmentationIKE message fragmentation is not required when using TCP
encapsulation since a TCP stream already handles the fragmentation
of its contents across packets. Since fragmentation is redundant in
this case, implementations might choose to not negotiate IKE
fragmentation. Even if fragmentation is negotiated, an
implementation SHOULD NOT send fragments when going over a TCP
connection, although it MUST support receiving fragments.
If an implementation supports both MOBIKE and IKE fragmentation, it
SHOULD negotiate IKE fragmentation over a TCP-encapsulated session in
case the session switches to UDP encapsulation on another network.
Middlebox ConsiderationsMany security networking devices, such as firewalls or intrusion
prevention systems, network optimization/acceleration devices, and
NAT devices, keep the state of sessions that traverse through them.
These devices commonly track the transport-layer and/or application-
layer data to drop traffic that is anomalous or malicious in nature.
While many of these devices will be more likely to pass
TCP-encapsulated traffic as opposed to UDP-encapsulated traffic, some
may still block or interfere with TCP-encapsulated IKE and IPsec
traffic.
A network device that monitors the transport layer will track the
state of TCP sessions, such as TCP sequence numbers. If the IKE implementation
has its own minimal implementation of TCP,
it SHOULD still use common TCP behaviors to avoid being dropped by
middleboxes.
Operators that intentionally block IPsec because of security implications
might want to also block TCP port 4500 or use other methods to reject TCP encapsulated IPsec traffic
(e.g., filter out TCP connections that begin with the "IKETCP" stream prefix).
Performance ConsiderationsSeveral aspects of TCP encapsulation for IKE and IPsec packets may
negatively impact the performance of connections within a tunnel-mode
IPsec SA. Implementations should be aware of these performance
impacts and take these into consideration when determining when to
use TCP encapsulation. Implementations MUST favor using direct ESP
or UDP encapsulation over TCP encapsulation whenever possible.
TCP-in-TCPIf the outer connection between IKE peers is over TCP, inner TCP
connections may suffer negative effects from using TCP within TCP.
Running TCP within TCP is discouraged since the TCP algorithms
generally assume that they are running over an unreliable datagram
layer.
If the outer (tunnel) TCP connection experiences packet loss, this
loss will be hidden from any inner TCP connections since the outer
connection will retransmit to account for the losses. Since the
outer TCP connection will deliver the inner messages in order, any
messages after a lost packet may have to wait until the loss is
recovered. This means that loss on the outer connection will be
interpreted only as delay by inner connections. The burstiness of
inner traffic can increase since a large number of inner packets may
be delivered across the tunnel at once. The inner TCP connection may
interpret a long period of delay as a transmission problem,
triggering a retransmission timeout, which will cause spurious
retransmissions. The sending rate of the inner connection may be
unnecessarily reduced if the retransmissions are not detected as
spurious in time.
The inner TCP connection's round-trip-time estimation will be
affected by the burstiness of the outer TCP connection if there are
long delays when packets are retransmitted by the outer TCP
connection. This will make the congestion control loop of the inner
TCP traffic less reactive, potentially permanently leading to a lower
sending rate than the outer TCP would allow for.
TCP-in-TCP can also lead to "TCP meltdown", where stacked instances
of TCP can result in significant impacts to performance
. This can occur when losses in the lower TCP (closer to the link)
increase delays seen by the higher TCP (closer to the application) that create
timeouts, which, in turn, cause retransmissions that can then cause losses in
the lower TCP by overrunning its buffer. The very mechanism intended to avoid loss
(retransmission) interacts between the two layers to increase loss. To limit this effect,
the timeouts of the two TCP layers need to be carefully managed, e.g., such that
the higher layer has a much longer timeout than the lower layer.
Note that any negative effects will be shared among all flows going
through the outer TCP connection. This is of particular concern for
any latency-sensitive or real-time applications using the tunnel. If
such traffic is using a TCP-encapsulated IPsec connection, it is
recommended that the number of inner connections sharing the tunnel
be limited as much as possible.
Added Reliability for Unreliable ProtocolsSince ESP is an unreliable protocol, transmitting ESP packets over a
TCP connection will change the fundamental behavior of the packets.
Some application-level protocols that prefer packet loss to delay
(such as Voice over IP or other real-time protocols) may be
negatively impacted if their packets are retransmitted by the TCP
connection due to packet loss.
Quality-of-Service MarkingsQuality-of-Service (QoS) markings, such as the Differentiated
Services Code Point (DSCP) and Traffic Class, should be used with
care on TCP connections used for encapsulation. Individual packets
SHOULD NOT use different markings than the rest of the connection
since packets with different priorities may be routed differently and
cause unnecessary delays in the connection.
Maximum Segment SizeA TCP connection used for IKE encapsulation SHOULD negotiate its MSS
in order to avoid unnecessary fragmentation of packets.
Tunneling ECN in TCPSince there is not a one-to-one relationship between outer IP packets
and inner ESP/IP messages when using TCP encapsulation, the markings
for Explicit Congestion Notification (ECN) cannot easily be
mapped. However, any ECN Congestion Experienced (CE) marking on
inner headers should be preserved through the tunnel.
Implementations SHOULD follow the ECN compatibility mode for tunnel
ingress as described in . In compatibility mode, the outer
tunnel TCP connection marks its packet headers as not ECN-capable.
Upon egress, if the arriving outer header is marked with CE, the
implementation will drop the inner packet since there is not a
distinct inner packet header onto which to translate the ECN
markings.
Security ConsiderationsIKE Responders that support TCP encapsulation may become vulnerable
to new Denial-of-Service (DoS) attacks that are specific to TCP, such
as SYN-flooding attacks. TCP Responders should be aware of this additional attack surface.
TCP connections are also susceptible to RST and other spoofing attacks .
This specification makes IPsec tolerant of sudden TCP connection drops, but if an attacker
is able to tear down TCP connections, IPsec connection's performance can suffer,
effectively making this a DoS attack.
TCP data injection attacks have no effect on application data since IPsec provides data integrity.
However, they can have some effect, mostly by creating DoS attacks:
If an attacker alters the content of the Length field that separates packets,
then the Receiver will incorrectly identify the boundaries of the following packets and
will drop all of them or even tear down the TCP connection if the content of the
Length field happens to be 0 or 1 (see ).
If the content of an IKE message is altered, then it will be dropped by the receiver;
if the dropped message is the IKE request message, then the Initiator will tear
down the IKE SA after some timeout since, in most cases, the request message will not be retransmitted
(as advised in ); thus, the response will never be received.
If an attacker alters the non-ESP marker, then IKE packets will be dispatched to ESP
(and sometimes visa versa) and those packets will be dropped.
If an attacker modifies TCP-Encapsulated stream prefix or unencrypted IKE messages before IKE SA is established,
then in most cases this will result in failure to establish IKE SA, often with false "authentication failed" diagnostics.
discusses how TCP injection attacks can be mitigated.
Note that data injection attacks are also possible on IP level (e.g., when IP fragmentation is used),
resulting in DoS attacks even if TCP encapsulation is not used. On the other hand, TCP injection attacks are easier to mount
than the IP fragmentation injection attacks because TCP keeps a long receive window open that's a sitting target for such attacks.
If an attacker successfully mounts an injection attack on a TCP connection used for encapsulating IPsec traffic
and modifies a Length field, the receiver might not be able to correctly identify the boundaries of the following packets in the stream
since it will try to parse arbitrary data as an ESP or IKE header.
After such a parsing failure, all following packets will be dropped. Communication will eventually recover, but this might
take several minutes and can result in IKE SA deletion and re-creation.
To speed up the recovery from such attacks, implementations are advised to follow recommendations in and close
the TCP connection if incoming packets contain SPIs that don't match any known SAs.
Once the TCP connection is closed, it will be re-created by the TCP Originator as described in .
To avoid performance degradation caused by closing and re-creating TCP connections,
implementations MAY alternatively try to resync after they receive unknown SPIs by searching the TCP stream
for a 64-bit binary vector consisting of a known SPI in the first 32 bits and a valid Sequence Number for this SPI in the
second 32 bits. Then, they can validate the Integrity Check Value (ICV) of this packet candidate by taking the preceding 16 bits as the Length field.
They can also search for 4 bytes of zero (non-ESP marker) followed by
128 bits of IKE SPIs of the IKE SA(s) associated with this TCP connection and
then validate the ICV of this IKE message candidate by taking the 16 bits
preceding the non-ESP marker as the Length field.
Implementations SHOULD limit the attempts to resync, because if the
injection attack is ongoing, then there is a high probability that
the resync process will not succeed or will quickly come under attack
again.
An attacker capable of blocking UDP traffic can force peers to use TCP encapsulation,
thus, degrading the performance and making the connection more vulnerable to DoS attacks.
Note that an attacker that is able to modify packets on the wire or to block them can
prevent peers from communicating regardless of the transport being used.
TCP Responders should be careful to ensure that the stream prefix
"IKETCP" uniquely identifies incoming streams as streams that use the
TCP encapsulation protocol.
Attackers may be able to disrupt the TCP connection by sending
spurious TCP Reset packets. Therefore, implementations SHOULD make
sure that IKE session state persists even if the underlying TCP
connection is torn down.
If MOBIKE is being used, all of the security considerations outlined
for MOBIKE apply .
Similar to MOBIKE, TCP encapsulation requires a TCP Responder to
handle changes to source address and port due to network or
connection disruption. The successful delivery of valid new IKE or ESP
messages over a new TCP connection is used by the TCP Responder to
determine where to send subsequent responses. If an attacker is able
to send packets on a new TCP connection that pass the validation
checks of the TCP Responder, it can influence which path future
packets will take. For this reason, the validation of messages on
the TCP Responder must include decryption, authentication, and replay
checks.
IANA ConsiderationsTCP port 4500 is already allocated to IPsec for NAT traversal in the "Service Name and Transport Protocol Port Number Registry". This
port SHOULD be used for TCP-encapsulated IKE and ESP as described in
this document.
This document updates the reference for TCP port 4500 from RFC 8229 to itself:
Service Name:
ipsec-nat-t
Port Number / Transport Protocol:
4500/tcp
Description:
IPsec NAT-Traversal
Reference:
RFC 9329
ReferencesNormative ReferencesKey words for use in RFCs to Indicate Requirement LevelsIn many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.UDP Encapsulation of IPsec ESP PacketsThis protocol specification defines methods to encapsulate and decapsulate IP Encapsulating Security Payload (ESP) packets inside UDP packets for traversing Network Address Translators. ESP encapsulation, as defined in this document, can be used in both IPv4 and IPv6 scenarios. Whenever negotiated, encapsulation is used with Internet Key Exchange (IKE). [STANDARDS-TRACK]Security Architecture for the Internet ProtocolThis document describes an updated version of the "Security Architecture for IP", which is designed to provide security services for traffic at the IP layer. This document obsoletes RFC 2401 (November 1998). [STANDARDS-TRACK]IP Encapsulating Security Payload (ESP)This document describes an updated version of the Encapsulating Security Payload (ESP) protocol, which is designed to provide a mix of security services in IPv4 and IPv6. ESP is used to provide confidentiality, data origin authentication, connectionless integrity, an anti-replay service (a form of partial sequence integrity), and limited traffic flow confidentiality. This document obsoletes RFC 2406 (November 1998). [STANDARDS-TRACK]Tunnelling of Explicit Congestion NotificationThis document redefines how the explicit congestion notification (ECN) field of the IP header should be constructed on entry to and exit from any IP-in-IP tunnel. On encapsulation, it updates RFC 3168 to bring all IP-in-IP tunnels (v4 or v6) into line with RFC 4301 IPsec ECN processing. On decapsulation, it updates both RFC 3168 and RFC 4301 to add new behaviours for previously unused combinations of inner and outer headers. The new rules ensure the ECN field is correctly propagated across a tunnel whether it is used to signal one or two severity levels of congestion; whereas before, only one severity level was supported. Tunnel endpoints can be updated in any order without affecting pre-existing uses of the ECN field, thus ensuring backward compatibility. Nonetheless, operators wanting to support two severity levels (e.g., for pre-congestion notification -- PCN) can require compliance with this new specification. A thorough analysis of the reasoning for these changes and the implications is included. In the unlikely event that the new rules do not meet a specific need, RFC 4774 gives guidance on designing alternate ECN semantics, and this document extends that to include tunnelling issues. [STANDARDS-TRACK]Internet Key Exchange Protocol Version 2 (IKEv2)This document describes version 2 of the Internet Key Exchange (IKE) protocol. IKE is a component of IPsec used for performing mutual authentication and establishing and maintaining Security Associations (SAs). This document obsoletes RFC 5996, and includes all of the errata for it. It advances IKEv2 to be an Internet Standard.Protecting Internet Key Exchange Protocol Version 2 (IKEv2) Implementations from Distributed Denial-of-Service AttacksThis document recommends implementation and configuration best practices for Internet Key Exchange Protocol version 2 (IKEv2) Responders, to allow them to resist Denial-of-Service and Distributed Denial-of-Service attacks. Additionally, the document introduces a new mechanism called "Client Puzzles" that helps accomplish this task.Ambiguity of Uppercase vs Lowercase in RFC 2119 Key WordsRFC 2119 specifies common key words that may be used in protocol specifications. This document aims to reduce the ambiguity by clarifying that only UPPERCASE usage of the key words have the defined special meanings.Informative ReferencesA TCP transport for the Internet Key ExchangeCheck Point This document describes using TCP for IKE messages. This facilitates
the transport of large messages over paths where fragments are either
dropped, or where packet loss makes the use of large UDP packets
unreliable.
Work in ProgressRequirements for Internet Hosts - Communication LayersThis RFC is an official specification for the Internet community. It incorporates by reference, amends, corrects, and supplements the primary protocol standards documents relating to hosts. [STANDARDS-TRACK]Upgrading to TLS Within HTTP/1.1This memo explains how to use the Upgrade mechanism in HTTP/1.1 to initiate Transport Layer Security (TLS) over an existing TCP connection. [STANDARDS-TRACK]The Addition of Explicit Congestion Notification (ECN) to IPThis memo specifies the incorporation of ECN (Explicit Congestion Notification) to TCP and IP, including ECN's use of two bits in the IP header. [STANDARDS-TRACK]IKEv2 Mobility and Multihoming Protocol (MOBIKE)This document describes the MOBIKE protocol, a mobility and multihoming extension to Internet Key Exchange (IKEv2). MOBIKE allows the IP addresses associated with IKEv2 and tunnel mode IPsec Security Associations to change. A mobile Virtual Private Network (VPN) client could use MOBIKE to keep the connection with the VPN gateway active while moving from one address to another. Similarly, a multihomed host could use MOBIKE to move the traffic to a different interface if, for instance, the one currently being used stops working. [STANDARDS-TRACK]Design of the IKEv2 Mobility and Multihoming (MOBIKE) ProtocolThe IKEv2 Mobility and Multihoming (MOBIKE) protocol is an extension of the Internet Key Exchange Protocol version 2 (IKEv2). These extensions should enable an efficient management of IKE and IPsec Security Associations when a host possesses multiple IP addresses and/or where IP addresses of an IPsec host change over time (for example, due to mobility).This document discusses the involved network entities and the relationship between IKEv2 signaling and information provided by other protocols. Design decisions for the MOBIKE protocol, background information, and discussions within the working group are recorded. This memo provides information for the Internet community.Defending TCP Against Spoofing AttacksRecent analysis of potential attacks on core Internet infrastructure indicates an increased vulnerability of TCP connections to spurious resets (RSTs), sent with forged IP source addresses (spoofing). TCP has always been susceptible to such RST spoofing attacks, which were indirectly protected by checking that the RST sequence number was inside the current receive window, as well as via the obfuscation of TCP endpoint and port numbers. For pairs of well-known endpoints often over predictable port pairs, such as BGP or between web servers and well-known large-scale caches, increases in the path bandwidth-delay product of a connection have sufficiently increased the receive window space that off-path third parties can brute-force generate a viable RST sequence number. The susceptibility to attack increases with the square of the bandwidth, and thus presents a significant vulnerability for recent high-speed networks. This document addresses this vulnerability, discussing proposed solutions at the transport level and their inherent challenges, as well as existing network level solutions and the feasibility of their deployment. This document focuses on vulnerabilities due to spoofed TCP segments, and includes a discussion of related ICMP spoofing attacks on TCP connections. This memo provides information for the Internet community.TCP SYN Flooding Attacks and Common MitigationsThis document describes TCP SYN flooding attacks, which have been well-known to the community for several years. Various countermeasures against these attacks, and the trade-offs of each, are described. This document archives explanations of the attack and common defense techniques for the benefit of TCP implementers and administrators of TCP servers or networks, but does not make any standards-level recommendations. This memo provides information for the Internet community.The Transport Layer Security (TLS) Protocol Version 1.2This document specifies Version 1.2 of the Transport Layer Security (TLS) protocol. The TLS protocol provides communications security over the Internet. The protocol allows client/server applications to communicate in a way that is designed to prevent eavesdropping, tampering, or message forgery. [STANDARDS-TRACK]Redirect Mechanism for the Internet Key Exchange Protocol Version 2 (IKEv2)The Internet Key Exchange Protocol version 2 (IKEv2) is a protocol for setting up Virtual Private Network (VPN) tunnels from a remote location to a gateway so that the VPN client can access services in the network behind the gateway. This document defines an IKEv2 extension that allows an overloaded VPN gateway or a VPN gateway that is being shut down for maintenance to redirect the VPN client to attach to another gateway. The proposed mechanism can also be used in Mobile IPv6 to enable the home agent to redirect the mobile node to another home agent. [STANDARDS-TRACK]Internet Key Exchange Protocol Version 2 (IKEv2) Session ResumptionThe Internet Key Exchange version 2 (IKEv2) protocol has a certain computational and communication overhead with respect to the number of round trips required and the cryptographic operations involved. In remote access situations, the Extensible Authentication Protocol (EAP) is used for authentication, which adds several more round trips and consequently latency.To re-establish security associations (SAs) upon a failure recovery condition is time consuming especially when an IPsec peer (such as a VPN gateway) needs to re-establish a large number of SAs with various endpoints. A high number of concurrent sessions might cause additional problems for an IPsec peer during SA re-establishment.In order to avoid the need to re-run the key exchange protocol from scratch, it would be useful to provide an efficient way to resume an IKE/IPsec session. This document proposes an extension to IKEv2 that allows a client to re-establish an IKE SA with a gateway in a highly efficient manner, utilizing a previously established IKE SA.A client can reconnect to a gateway from which it was disconnected. The proposed approach encodes partial IKE state into an opaque ticket, which can be stored on the client or in a centralized store, and is later made available to the IKEv2 responder for re-authentication. We use the term ticket to refer to the opaque data that is created by the IKEv2 responder. This document does not specify the format of the ticket but examples are provided. [STANDARDS-TRACK]Improving TCP's Robustness to Blind In-Window AttacksTCP has historically been considered to be protected against spoofed off-path packet injection attacks by relying on the fact that it is difficult to guess the 4-tuple (the source and destination IP addresses and the source and destination ports) in combination with the 32-bit sequence number(s). A combination of increasing window sizes and applications using longer-term connections (e.g., H-323 or Border Gateway Protocol (BGP) [STANDARDS-TRACK]Protocol Support for High Availability of IKEv2/IPsecThe IPsec protocol suite is widely used for business-critical network traffic. In order to make IPsec deployments highly available, more scalable, and failure-resistant, they are often implemented as IPsec High Availability (HA) clusters. However, there are many issues in IPsec HA clustering, and in particular in Internet Key Exchange Protocol version 2 (IKEv2) clustering. An earlier document, "IPsec Cluster Problem Statement", enumerates the issues encountered in the IKEv2/IPsec HA cluster environment. This document resolves these issues with the least possible change to the protocol.This document defines an extension to the IKEv2 protocol to solve the main issues of "IPsec Cluster Problem Statement" in the commonly deployed hot standby cluster, and provides implementation advice for other issues. The main issues solved are the synchronization of IKEv2 Message ID counters, and of IPsec replay counters. [STANDARDS-TRACK]Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS) Heartbeat ExtensionThis document describes the Heartbeat Extension for the Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS) protocols.The Heartbeat Extension provides a new protocol for TLS/DTLS allowing the usage of keep-alive functionality without performing a renegotiation and a basis for path MTU (PMTU) discovery for DTLS. [STANDARDS-TRACK]Defending against Sequence Number AttacksThis document specifies an algorithm for the generation of TCP Initial Sequence Numbers (ISNs), such that the chances of an off-path attacker guessing the sequence numbers in use by a target connection are reduced. This document revises (and formally obsoletes) RFC 1948, and takes the ISN generation algorithm originally proposed in that document to Standards Track, formally updating RFC 793. [STANDARDS-TRACK]Internet Key Exchange Protocol Version 2 (IKEv2) Message FragmentationThis document describes a way to avoid IP fragmentation of large Internet Key Exchange Protocol version 2 (IKEv2) messages. This allows IKEv2 messages to traverse network devices that do not allow IP fragments to pass through.TCP Encapsulation of IKE and IPsec PacketsThis document describes a method to transport Internet Key Exchange Protocol (IKE) and IPsec packets over a TCP connection for traversing network middleboxes that may block IKE negotiation over UDP. This method, referred to as "TCP encapsulation", involves sending both IKE packets for Security Association establishment and Encapsulating Security Payload (ESP) packets over a TCP connection. This method is intended to be used as a fallback option when IKE cannot be negotiated over UDP.The Transport Layer Security (TLS) Protocol Version 1.3This document specifies version 1.3 of the Transport Layer Security (TLS) protocol. TLS allows client/server applications to communicate over the Internet in a way that is designed to prevent eavesdropping, tampering, and message forgery.This document updates RFCs 5705 and 6066, and obsoletes RFCs 5077, 5246, and 6961. This document also specifies new requirements for TLS 1.2 implementations.Transmission Control Protocol (TCP)This document specifies the Transmission Control Protocol (TCP). TCP is an important transport-layer protocol in the Internet protocol stack, and it has continuously evolved over decades of use and growth of the Internet. Over this time, a number of changes have been made to TCP as it was specified in RFC 793, though these have only been documented in a piecemeal fashion. This document collects and brings those changes together with the protocol specification from RFC 793. This document obsoletes RFC 793, as well as RFCs 879, 2873, 6093, 6429, 6528, and 6691 that updated parts of RFC 793. It updates RFCs 1011 and 1122, and it should be considered as a replacement for the portions of those documents dealing with TCP requirements. It also updates RFC 5961 by adding a small clarification in reset handling while in the SYN-RECEIVED state. The TCP header control bits from RFC 793 have also been updated based on RFC 3168.Recommendations for Secure Use of Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS)IntuitindependentarmUnderstanding TCP over TCP: effects of TCP tunneling on end-to-end throughput and latencyUsing TCP Encapsulation with TLSThis section provides recommendations on how to use TLS in addition
to TCP encapsulation.
When using TCP encapsulation, implementations may choose to use TLS 1.2
or TLS 1.3 on the TCP connection
to be able to traverse middleboxes, which may otherwise block the traffic.
If a web proxy is applied to the ports used for the TCP connection
and TLS is being used, the TCP Originator can send an HTTP CONNECT
message to establish an SA through the proxy .
The use of TLS should be configurable on the peers and may be used
as the default when using TCP encapsulation or may be used as a
fallback when basic TCP encapsulation fails. The TCP Responder may
expect to read encapsulated IKE and ESP packets directly from the TCP
connection, or it may expect to read them from a stream of TLS data
packets. The TCP Originator should be preconfigured regarding whether or not to use TLS
when communicating with a given port on the TCP Responder.
When new TCP connections are re-established due to a broken
connection, TLS must be renegotiated. TLS session resumption is
recommended to improve efficiency in this case.
The security of the IKE session is entirely derived from the IKE
negotiation and key establishment and not from the TLS session (which,
in this context, is only used for encapsulation purposes); therefore,
when TLS is used on the TCP connection, both the TCP Originator and
the TCP Responder SHOULD allow the NULL cipher to be selected for
performance reasons. Note that TLS 1.3 only supports AEAD algorithms
and at the time of writing this document there was no recommended cipher suite
for TLS 1.3 with the NULL cipher. It is RECOMMENDED to follow
when selecting parameters for TLS.
Implementations should be aware that the use of TLS introduces
another layer of overhead requiring more bytes to transmit a given
IKE and IPsec packet. For this reason, direct ESP, UDP
encapsulation, or TCP encapsulation without TLS should be preferred
in situations in which TLS is not required in order to traverse
middleboxes.
Example Exchanges of TCP Encapsulation with TLS 1.3This appendix contains examples of data flows in cases where TCP encapsulation of IKE and IPsec packets
is used with TLS 1.3. The examples below are provided for illustrative purpose only;
readers should refer to the main body of the document for details.Establishing an IKE Session
Client Server
---------- ----------
1) -------------------- TCP Connection -------------------
(IP_I:Port_I -> IP_R:Port_R)
TcpSyn ------->
<------- TcpSyn,Ack
TcpAck ------->
2) --------------------- TLS Session ---------------------
ClientHello ------->
ServerHello
{EncryptedExtensions}
{Certificate*}
{CertificateVerify*}
<------- {Finished}
{Finished} ------->
3) ---------------------- Stream Prefix --------------------
"IKETCP" ------->
4) ----------------------- IKE Session ---------------------
Length + Non-ESP Marker ------->
IKE_SA_INIT
HDR, SAi1, KEi, Ni,
[N(NAT_DETECTION_SOURCE_IP)],
[N(NAT_DETECTION_DESTINATION_IP)]
<------- Length + Non-ESP Marker
IKE_SA_INIT
HDR, SAr1, KEr, Nr,
[N(NAT_DETECTION_SOURCE_IP)],
[N(NAT_DETECTION_DESTINATION_IP)]
Length + Non-ESP Marker ------->
first IKE_AUTH
HDR, SK {IDi, [CERTREQ]
CP(CFG_REQUEST), IDr,
SAi2, TSi, TSr, ...}
<------- Length + Non-ESP Marker
first IKE_AUTH
HDR, SK {IDr, [CERT], AUTH,
EAP, SAr2, TSi, TSr}
Length + Non-ESP Marker ------->
IKE_AUTH (repeat 1..N times)
HDR, SK {EAP}
<------- Length + Non-ESP Marker
IKE_AUTH (repeat 1..N times)
HDR SK {EAP}
Length + Non-ESP Marker ------->
final IKE_AUTH
HDR, SK {AUTH}
<------- Length + Non-ESP Marker
final IKE_AUTH
HDR, SK {AUTH, CP(CFG_REPLY),
SA, TSi, TSr, ...}
-------------- IKE and IPsec SAs Established ------------
Length + ESP Frame ------->
The client establishes a TCP connection with the server on
port 4500 or on an alternate preconfigured port that the server
is listening on.
If configured to use TLS, the client initiates a TLS handshake.
During the TLS handshake, the server SHOULD NOT request the
client's certificate since authentication is handled as part of
IKE negotiation.
The client sends the stream prefix for TCP-encapsulated IKE
() traffic to signal the beginning of IKE negotiation.
The client and server establish an IKE connection. This example
shows EAP-based authentication, although any authentication type
may be used.
Deleting an IKE Session
Client Server
---------- ----------
1) ----------------------- IKE Session ---------------------
Length + Non-ESP Marker ------->
INFORMATIONAL
HDR, SK {[N,] [D,]
[CP,] ...}
<------- Length + Non-ESP Marker
INFORMATIONAL
HDR, SK {[N,] [D,]
[CP], ...}
2) --------------------- TLS Session ---------------------
close_notify ------->
<------- close_notify
3) -------------------- TCP Connection -------------------
TcpFin ------->
<------- Ack
<------- TcpFin
Ack ------->
-------------------- IKE SA Deleted -------------------
The client and server exchange informational messages to notify
IKE SA deletion.
The client and server negotiate TLS session deletion using TLS
CLOSE_NOTIFY.
The TCP connection is torn down.
The deletion of the IKE SA should lead to the disposal of the
underlying TLS and TCP state.Re-establishing an IKE Session
Client Server
---------- ----------
1) -------------------- TCP Connection -------------------
(IP_I:Port_I -> IP_R:Port_R)
TcpSyn ------->
<------- TcpSyn,Ack
TcpAck ------->
2) --------------------- TLS Session ---------------------
ClientHello ------->
ServerHello
{EncryptedExtensions}
<------- {Finished}
{Finished} ------->
3) ---------------------- Stream Prefix --------------------
"IKETCP" ------->
4) <---------------------> IKE/ESP Flow <------------------>
If a previous TCP connection was broken (for example, due to a
TCP Reset), the client is responsible for re-initiating the TCP
connection. The TCP Originator's address and port (IP_I and
Port_I) may be different from the previous connection's address
and port.
The client SHOULD attempt TLS session resumption if it
has previously established a session with the server.
After TCP and TLS are complete, the client sends the stream
prefix for TCP-encapsulated IKE traffic ().
The IKE and ESP packet flow can resume. If MOBIKE is being used,
the Initiator SHOULD send an UPDATE_SA_ADDRESSES message.
During the IKE_AUTH exchange, the client and server exchange
MOBIKE_SUPPORTED notify payloads to indicate support for MOBIKE.
The client changes its point of attachment to the network and
receives a new IP address. The client attempts to re-establish
the IKE session using the UPDATE_SA_ADDRESSES notify payload, but
the server does not respond because the network blocks UDP
traffic.
The client brings up a TCP connection to the server in order to
use TCP encapsulation.
The client initiates a TLS handshake with the server.
The client sends the stream prefix for TCP-encapsulated IKE
traffic ().
The client sends the UPDATE_SA_ADDRESSES notify payload in the
INFORMATIONAL exchange on the
TCP-encapsulated connection. Note that this IKE message is the
same as the one sent over UDP in step 2; it should have the same
message ID and contents.
Once the client receives a response on the
TCP-encapsulated connection, it immediately starts a new INFORMATIONAL
exchange with an UPDATE_SA_ADDRESSES notify payload and recalculated
NAT_DETECTION_*_IP notify payloads in order to get correct information about the presence
of NATs.
The IKE and ESP packet flow can resume.
AcknowledgmentsThanks to the authors of RFC 8229 (, , and ). Since this document clarifies and obsoletes RFC 8229, most of
its text was borrowed from the original document.
The following people provided valuable feedback and advice while
preparing RFC 8229: , , , , ,
, , , , , , , and . Special thanks to for his implementation work.
The authors would like to thank ,
, ,
and for their valuable comments
while preparing this document.
Authors' AddressesApple Inc.1 Infinite LoopCupertinoCalifornia95014United States of Americatpauly@apple.comELVIS-PLUSPO Box 81Moscow (Zelenograd)124460Russian Federation+7 495 276 0211svan@elvis.ru