NAT as a Topology Shield

NAT provides a security function by segregating private hosts from the publicly routed Internet. Depending upon your addressing requirements, NAT can isolate, to some extent, your VoIP network IP space from the balance of your internal network IP space. The large number of private RFC1918 IP addresses allows system architects to intelligently address hosts and other network elements based upon location, function, or other criteria during the design phase of the VoIP network.
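As a minimal sketch of address planning by function, the snippet below checks whether an address falls in RFC 1918 space and assigns it to a role. The `PLAN` subnets and the function names are hypothetical and chosen only for illustration.

```python
import ipaddress

# The three private address blocks defined by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Hypothetical addressing plan: one /16 per function inside 10.0.0.0/8.
PLAN = {
    "voip": ipaddress.ip_network("10.20.0.0/16"),
    "data": ipaddress.ip_network("10.30.0.0/16"),
}


def is_rfc1918(addr: str) -> bool:
    """Return True if addr lies in one of the RFC 1918 private blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in block for block in RFC1918_BLOCKS)


def classify(addr: str) -> str:
    """Return the name of the planned subnet that contains addr, if any."""
    ip = ipaddress.ip_address(addr)
    for name, net in PLAN.items():
        if ip in net:
            return name
    return "unassigned"
```

With a plan like this, a firewall or monitoring rule can treat every address in the `voip` subnet as voice infrastructure without listing hosts one by one.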
External hosts cannot directly access a particular internal host if a NAT intervenes since the external host has no way of targeting its payload to a chosen IP address. Of course, when addresses are assigned dynamically, it becomes even more problematic for an attacker to point to a specific host within the NAT domain. This may help protect internal hosts from external malicious content. At worst, NAT is an additional layer of security controls that you implement as part of your overall security architecture.
The IPsec model is instructive in that it illustrates a complex interaction between encryption and NAT. However, IPsec is not the only functional or proposed security mechanism for VoIP environments. SSL/TLS, S/MIME, HTTP 1.1 digest, and ZRTP have also been proposed as security instruments. Nor are all environments as simple as the symmetric examples we have seen where one or more devices reside on opposite sides of a NAT device. Asymmetric or hairpin call routing (a call from one phone behind a NAT to another phone behind the same NAT), in an environment where basic NAT and encryption issues have been resolved, can cause communications to fail. The point here is to introduce some of the concepts that you will come across as you design and troubleshoot in this area. We’ll see in the next section how encryption, NAT, and VoIP protocols work (or don’t work) together.


NAT and Encryption

As IPsec VPNs became popular, NAT became an impediment to their widespread implementation. I’ll use the IPsec model to describe the interactions between NAT and encryption, since it is one of the more popular Internet encryption systems and has potential value in VoIP networks. The IP security (IPsec) protocol suite was defined by the Internet Engineering Task Force (IETF) to provide the following security services for IP networks: data integrity, authentication, confidentiality, and application-transparent security. IPsec secures both packet flows and key transmission. Since we are interested in NAT and encryption, we’ll ignore most of the protocol suite, including key exchange (IKE) and the various hash and encryption algorithms, and focus instead on the protocols that are used to secure packet flows.
The Authentication Header (AH) and the Encapsulating Security Payload (ESP) are the two main network protocols used by IPsec. Both can operate in two modes. Transport Mode can be visualized simply as a secure connection between two concurring hosts. In Tunnel Mode, which is more of a “VPN-like” mode, IPsec completely encapsulates the original IP datagram, including the original IP header, within a second IP datagram. ESP and AH normally are implemented independently, though it’s possible (but uncommon) to use them together.
The AH provides data origin authentication, message integrity, and protection against replay attacks, but has no provision for privacy: data is not encrypted. The key to the AH authentication process is the inclusion in the AH header of an Integrity Check Value (ICV), a hash based upon a secret key that is calculated over a subset of the original IP header fields, including the source and destination IP addresses. AH guarantees (if implemented correctly) that the data received is identical to the data sent, and asserts the identity of the true sender. AH provides authentication for as much of the IP header as possible, as well as for upper-level protocol data. However, some IP header fields (TTL, header checksum, TOS, flags, fragment offset, and some options) legitimately change in transit, so their values are not protected by AH. The source and destination addresses are not among these mutable fields; they are covered by the ICV. In transport mode, AH is inserted after the IP header and before the upper-layer protocol (TCP, UDP, ICMP, etc.) header. In tunnel mode, the AH header precedes the encapsulated IP header. Figure 1 shows the AH transport and tunnel modes.

Figure 1: Authentication Header: Transport and Tunnel Modes
In Figure 1, sections A and B show the location of the AH header in transport mode. Sections C and D show the location of the AH header in tunnel mode. The data field in all packets is not to scale (indicated by the double slanted lines). You can see from this figure that tunnel mode AH adds a further 20 bytes to the length of each packet for the new outer IP header. None of the fields in this figure are encrypted.
The key to the incompatibility of NAT and IPsec AH is the presence of the ICV, whose value depends in part on the source and destination IP addresses and on the upper-layer data, which includes the TCP or UDP checksum. The ICV calculation allows for the mutable header fields that change as the packet moves from hop to hop through the network, but NAT alters fields that the ICV covers. Because intermediate devices do not share the secret key, a NAT device cannot recalculate the correct ICV after it has altered those fields, and the receiver rejects the packet.
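The effect can be illustrated with a simplified sketch. It is not the real AH packet layout (an actual AH ICV also covers the AH header itself), and the choice of HMAC-SHA1 truncated to 96 bits, the key, and the addresses are assumptions made for the example. It shows only that a TTL change leaves the ICV valid while a NAT address rewrite does not.

```python
import hashlib
import hmac
import socket
import struct


def ipv4_header(src: str, dst: str, ttl: int, payload_len: int) -> bytes:
    """Build a 20-byte IPv4 header (checksum left at zero for simplicity)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                # version 4, header length of 5 words
        0,                   # TOS
        20 + payload_len,    # total length
        0,                   # identification
        0,                   # flags and fragment offset
        ttl,
        51,                  # protocol 51 = AH
        0,                   # header checksum
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )


def zero_mutable(header: bytes) -> bytes:
    """Zero the fields AH treats as mutable before computing the ICV."""
    h = bytearray(header)
    h[1] = 0                  # TOS
    h[6:8] = b"\x00\x00"      # flags and fragment offset
    h[8] = 0                  # TTL
    h[10:12] = b"\x00\x00"    # header checksum
    return bytes(h)


def icv(key: bytes, header: bytes, payload: bytes) -> bytes:
    """Simplified ICV: keyed hash over the immutable header fields and payload."""
    digest = hmac.new(key, zero_mutable(header) + payload, hashlib.sha1).digest()
    return digest[:12]


key = b"shared-secret"        # hypothetical key known only to the two endpoints
payload = b"upper-layer data"

sent = ipv4_header("10.20.1.5", "203.0.113.9", ttl=64, payload_len=len(payload))
sent_icv = icv(key, sent, payload)

# A router decrements the TTL: the field is mutable, so the ICV still matches.
routed = ipv4_header("10.20.1.5", "203.0.113.9", ttl=63, payload_len=len(payload))

# A NAT device rewrites the source address: the ICV no longer matches.
natted = ipv4_header("198.51.100.7", "203.0.113.9", ttl=63, payload_len=len(payload))

ttl_change_ok = hmac.compare_digest(icv(key, routed, payload), sent_icv)
nat_change_ok = hmac.compare_digest(icv(key, natted, payload), sent_icv)
```

Here `ttl_change_ok` is true and `nat_change_ok` is false. Since the NAT device does not hold `key`, it has no way to produce a replacement ICV for the rewritten header.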
ESP, on the other hand, was used initially only for encryption; authentication functionality was subsequently added. The ESP header is inserted after the IP header and before the upper layer protocol header (transport mode) or before an encapsulated IP header (tunnel mode).
Figure 2 shows the location of the ESP header in both transport mode (sections A and B) and tunnel mode (sections C and D) for TCP (sections A and C) and UDP (sections B and D). In transport mode, the original IP header is followed by the ESP header. The rightmost field contains the ESP trailer and, optionally, the ESP authentication field. Only the upper-layer protocol header, the data, and the ESP trailer are encrypted. The IP header, the ESP header, and the ESP authentication field are not encrypted.

Figure 2: ESP Header: Transport Mode and Tunnel Mode
In tunnel mode, ESP encrypts the entire original packet. This means that the entire original IP datagram, including the original IP and protocol headers, is encrypted. In this mode, when IP traffic moves between gateways, the outer, unencrypted IP header contains the IP addresses of the source and destination gateways, and the inner, encrypted IP header contains the IP source and destination addresses of the true endpoints. Even though ESP encrypts most of the IP datagram in either transport or tunnel mode, ESP is relatively compatible with NAT, since it does not incorporate the IP source and destination addresses in its keyed message integrity check. Still, in transport mode ESP depends on TCP and UDP checksum integrity, because those checksums are calculated over a pseudo-header that includes the IP source and destination addresses. As a result, the checksums are invalidated by passage through a NAT device (except in some cases where the UDP checksum is set to zero).
NAT traversal using ESP leads to a catch-22. NAT must recalculate the TCP header checksums used to verify packet integrity because, as was shown earlier, NAT modifies those headers. If NAT updates the header checksum, ESP authentication will fail. If NAT does not update the checksum, TCP verification will fail. One way around this, if the transport endpoint is under your control, is to turn off checksum verification, but I’m not aware of anyone who has done this in production environments. A second, more common approach is to perform NAT before IPsec rather than IPsec before NAT, which can be accomplished by locating the NAT device logically behind the IPsec device. The most common form of NAT traversal used today relies on encapsulating IPsec packets in UDP in order to pass through NAT devices. The IPsec packet is wrapped in an outer UDP packet, and the wrapper is stripped off after the packet has passed through the NAT device. This enables NAT and IPsec to function together, but none of these is an elegant solution.
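A minimal sketch of the UDP wrapping step follows. It assumes the layout described in RFC 3948 (an 8-byte UDP header on port 4500 with a zero checksum, followed directly by the ESP packet). The function names and the dummy ESP bytes are hypothetical, and IKE negotiation and keepalives are left out.

```python
import struct

NAT_T_PORT = 4500  # UDP port used for IPsec NAT traversal (RFC 3948)


def udp_encapsulate(esp_packet: bytes, sport: int = NAT_T_PORT,
                    dport: int = NAT_T_PORT) -> bytes:
    """Prefix an ESP packet with a UDP header so a NAT can translate it."""
    length = 8 + len(esp_packet)
    # The UDP checksum is sent as zero, so NAT has nothing to recompute here.
    return struct.pack("!HHHH", sport, dport, length, 0) + esp_packet


def udp_decapsulate(datagram: bytes) -> bytes:
    """Strip the UDP header and return the ESP packet unchanged."""
    _sport, _dport, length, _checksum = struct.unpack("!HHHH", datagram[:8])
    return datagram[8:length]


def nat_rewrite_source_port(datagram: bytes, new_port: int) -> bytes:
    """Simulate a NAPT device replacing the UDP source port."""
    return struct.pack("!H", new_port) + datagram[2:]


# Dummy ESP packet: SPI, sequence number, then opaque (encrypted) bytes.
esp = struct.pack("!II", 0x1000, 1) + b"\x8f\x02\x11\xa4 opaque ciphertext"

wrapped = udp_encapsulate(esp)
after_nat = nat_rewrite_source_port(wrapped, 40001)
recovered = udp_decapsulate(after_nat)
```

The NAT device touches only the outer UDP header, so `recovered` is byte-for-byte identical to `esp` and ESP authentication at the receiver is unaffected.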


NAT Has Three Common Modes of Operation

Depending upon networking and topology requirements, NAT operates in one of three related modes. Static NAT refers to a one-to-one mapping or correspondence between internal and external IP addresses. In this case, the number of internal IP addresses equals the number of external addresses (see Figure 1).
Figure 1: Static NAT
The NAT device maintains a lookup table of internal and external addresses in order to manage translations in a stateless manner. Static NAT has utility in mapping the private internal IP addresses of critical infrastructure servers and network appliances to a unique globally available IP address.
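A static NAT table can be sketched as a fixed pair of dictionaries. The addresses and function names below are hypothetical.

```python
# Fixed one-to-one mapping of internal to external addresses.
STATIC_MAP = {
    "10.20.0.10": "203.0.113.10",   # e.g. a call server
    "10.20.0.11": "203.0.113.11",   # e.g. a media gateway
}
REVERSE_MAP = {public: private for private, public in STATIC_MAP.items()}


def translate_outbound(src: str) -> str:
    """Replace an internal source address with its fixed external address."""
    return STATIC_MAP[src]


def translate_inbound(dst: str) -> str:
    """Replace an external destination address with its internal address."""
    return REVERSE_MAP[dst]
```

Because the table never changes, no per-connection state is needed, and external hosts can initiate connections to the mapped servers.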
Dynamic NAT in its original form consisted of an outside pool of public IP addresses that were handed out on a first-come, first-served basis (see Figure 2). Any internal address could be mapped to any free member of the outside pool to communicate with external Internet hosts. Consequently, the size of the outside pool limited the number of inside users that could connect externally at the same time. A built-in timeout mechanism allowed pool members to be reused.
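The pool and timeout behavior can be sketched as follows. The class name, the 300-second timeout, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are assumptions for illustration.

```python
class DynamicNat:
    """First-come, first-served pool of outside addresses with idle timeout."""

    def __init__(self, pool, timeout=300.0):
        self.free = list(pool)      # outside addresses not currently bound
        self.timeout = timeout      # seconds of inactivity before reuse
        self.bindings = {}          # inside address -> (outside address, last used)

    def _expire(self, now):
        """Return idle outside addresses to the pool."""
        for inside, (outside, last_used) in list(self.bindings.items()):
            if now - last_used >= self.timeout:
                del self.bindings[inside]
                self.free.append(outside)

    def translate(self, inside, now):
        """Return the outside address for an inside host, or None if the pool is empty."""
        self._expire(now)
        if inside in self.bindings:
            outside, _ = self.bindings[inside]
        elif self.free:
            outside = self.free.pop(0)
        else:
            return None
        self.bindings[inside] = (outside, now)   # create or refresh the binding
        return outside


nat = DynamicNat(["203.0.113.1", "203.0.113.2"])
first = nat.translate("10.0.0.1", now=0)      # takes 203.0.113.1
second = nat.translate("10.0.0.2", now=0)     # takes 203.0.113.2
blocked = nat.translate("10.0.0.3", now=10)   # pool exhausted
refresh = nat.translate("10.0.0.1", now=200)  # keeps its binding alive
reused = nat.translate("10.0.0.3", now=310)   # 10.0.0.2 timed out, address reused
```

With a pool of two addresses, the third inside host is refused until an idle binding expires and frees its outside address.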

Figure 2: Dynamic NAT
The third and probably most common style of NAT is derived functionally from Dynamic NAT, since it reuses a smaller pool or a single external IP address to proxy for all the internal IP addresses. This NAT is known by a number of names, including Network Address Port Translation (NAPT), Port Address Translation (PAT), Full Cone NAT (from the STUN RFC 3489), hiding NAT, and masquerading NAT. This type of NAT (we’ll call it NAPT to keep things organized) preserves state by maintaining a lookup table of source IP, destination IP, source port, and destination port. This 4-tuple is almost always unique within a given conversation stream. You’ll find NAPT operating in almost all home broadband networks and in most large enterprise networks. Figure 3 shows an example of NAPT.
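A NAPT binding table keyed on the 4-tuple might look like the sketch below. The class name, the starting port, and the strict matching of inbound traffic against the remote address and port are assumptions; real devices differ in how strictly they filter, and port exhaustion and timeouts are ignored here.

```python
class Napt:
    """Many inside hosts share one public address, distinguished by port."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}    # (src_ip, src_port, dst_ip, dst_port) -> public port
        self.back = {}   # (public port, dst_ip, dst_port) -> (src_ip, src_port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Translate an outgoing packet, creating a binding if none exists."""
        key = (src_ip, src_port, dst_ip, dst_port)
        if key not in self.out:
            port = self.next_port
            self.next_port += 1
            self.out[key] = port
            self.back[(port, dst_ip, dst_port)] = (src_ip, src_port)
        return self.public_ip, self.out[key]

    def inbound(self, remote_ip, remote_port, public_port):
        """Map a reply back to the inside host, or None if no binding exists."""
        return self.back.get((public_port, remote_ip, remote_port))


napt = Napt("198.51.100.7")
a = napt.outbound("10.0.0.1", 5060, "203.0.113.9", 5060)
b = napt.outbound("10.0.0.2", 5060, "203.0.113.9", 5060)
reply = napt.inbound("203.0.113.9", 5060, a[1])
unsolicited = napt.inbound("192.0.2.50", 5060, 40005)
```

Two inside hosts using the same source port receive different public ports, replies are mapped back through the table, and inbound traffic without a matching binding has nowhere to go.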

Figure 3: Network Address Port Translation
Figure 4 shows a normal scenario for TCP traffic moving between two domains, each of which runs NAT at its edge.
Figure 4: Normal NAT Process with TCP
In addition to these three NAT modes, STUN (we’ll see this later) has defined three types of NAT that map more or less to these three modes. These are cone NAT, restricted NAT, and symmetric NAT. We’ll talk more about these in the section on STUN and TURN.
Section A of Figure 4 shows the TCP/IP packet header prior to NAT. After the packet passes through the first NAT edge device (section B), four header fields are modified: three IP header fields (source address, destination address, and checksum) and the TCP header checksum. After the packet passes through the second NAT edge device, the original header fields are regenerated (section C). The same is true for UDP in this situation, except that if the UDP checksum is zero, it will not be altered.
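The four modified fields can be traced with a small sketch that builds a TCP/IP packet, rewrites its addresses as the first NAT device would, and then restores them as the second would. It assumes a 20-byte IP header without options. The addresses, ports, and function names are hypothetical, and real devices usually update checksums incrementally rather than recomputing them from scratch.

```python
import socket
import struct


def checksum16(data: bytes) -> int:
    """Internet checksum: complement of the 16-bit one's-complement sum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def set_checksums(ip: bytearray, tcp: bytearray) -> bytes:
    """Recompute the IP header checksum and the TCP checksum in place."""
    ip[10:12] = b"\x00\x00"
    ip[10:12] = struct.pack("!H", checksum16(bytes(ip)))
    tcp[16:18] = b"\x00\x00"
    # The TCP checksum covers a pseudo-header that contains both IP addresses.
    pseudo = bytes(ip[12:20]) + struct.pack("!BBH", 0, 6, len(tcp))
    tcp[16:18] = struct.pack("!H", checksum16(pseudo + bytes(tcp)))
    return bytes(ip + tcp)


def build_packet(src, dst, sport, dport, payload: bytes) -> bytes:
    """Build a minimal IPv4 + TCP packet with valid checksums."""
    tcp = bytearray(struct.pack("!HHIIBBHHH", sport, dport, 0, 0,
                                0x50, 0x18, 8192, 0, 0) + payload)
    ip = bytearray(struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(tcp), 0, 0,
                               64, 6, 0,
                               socket.inet_aton(src), socket.inet_aton(dst)))
    return set_checksums(ip, tcp)


def nat_rewrite(packet: bytes, new_src: str, new_dst: str) -> bytes:
    """Rewrite both addresses and fix the two checksums that depend on them."""
    ip, tcp = bytearray(packet[:20]), bytearray(packet[20:])
    ip[12:16] = socket.inet_aton(new_src)
    ip[16:20] = socket.inet_aton(new_dst)
    return set_checksums(ip, tcp)


def nat_fields(packet: bytes):
    """Return (source, destination, IP checksum, TCP checksum)."""
    return (socket.inet_ntoa(packet[12:16]), socket.inet_ntoa(packet[16:20]),
            packet[10:12].hex(), packet[36:38].hex())


section_a = build_packet("10.20.1.5", "10.30.2.9", 49170, 5060, b"hello")
section_b = nat_rewrite(section_a, "198.51.100.7", "203.0.113.9")
section_c = nat_rewrite(section_b, "10.20.1.5", "10.30.2.9")
```

Comparing `nat_fields(section_a)` with `nat_fields(section_b)` shows all four values changing, while `section_c` is identical to `section_a`.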
You may naturally ask by now, why is NAT such an issue for VoIP? Protocols such as H.323 and SIP separate the signaling and media channels and, to make things even more interesting, embed IP addresses in the signaling channel. When we combine NAT with these protocols, it becomes important to understand how, when, and where NAT manipulates header fields. Additionally, note that NAT stores its address mapping information in binding tables, and that these bindings are only initiated by outbound traffic. As a result, NAT breaks the choreography of SIP session flow. When we add encryption into the mix, these systems become more complex still.
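The embedded-address problem can be sketched with an SDP body of the kind carried in a SIP INVITE. The packet is modeled as a plain dictionary, and the addresses, the SDP values, and the function names are hypothetical. The NAT in this sketch rewrites only the IP header, as a device without a SIP-aware helper would.

```python
import re

# SDP body of the kind carried in a SIP INVITE; it names the address and
# port where the caller expects to receive RTP media.
SDP_BODY = (
    "v=0\r\n"
    "o=alice 2890844526 2890844526 IN IP4 10.20.1.5\r\n"
    "s=-\r\n"
    "c=IN IP4 10.20.1.5\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
)


def media_address(sdp: str):
    """Return the connection address from the SDP c= line, if present."""
    match = re.search(r"^c=IN IP4 (\S+)", sdp, re.MULTILINE)
    return match.group(1) if match else None


def nat_translate(packet: dict, public_ip: str) -> dict:
    """Rewrite the IP header source only; the payload is left untouched."""
    return {**packet, "src": public_ip}


invite = {"src": "10.20.1.5", "dst": "203.0.113.9", "payload": SDP_BODY}
seen_by_callee = nat_translate(invite, "198.51.100.7")

header_source = seen_by_callee["src"]
advertised_media = media_address(seen_by_callee["payload"])
```

The callee sees the signaling arrive from `198.51.100.7` but is told to send media to `10.20.1.5`, a private address it cannot route to. Even if it sent media to the public address instead, no binding for the media port exists until the inside phone sends outbound traffic.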