The Linux Blog UNIX, LINUX, BSD, OSX

22 Apr 2009

Understanding Network Address Translation, NAT

Network Address Translation (NAT) is one of the basic functions of a circuit level gateway. The simple purpose of NAT is to hide the IP addresses of a private network from the outside world.

Normally, when a router forwards a packet from one segment to another, the packet is unchanged. With NAT, as a packet crosses from a trusted segment of a circuit level gateway to an untrusted segment, the packet is rewritten so that the packet’s source address as it appears on the private segment is replaced by a translated source address. The translated source address is what the outside world sees. Thus, the private address remains hidden from the outside world.
[Image: nat1]

When a host on a public network transmits a packet to a host on the private network, the source host addresses the packet to the private host’s publicly translated address. The sender on the public side does not know the destination host’s true address. As the packet crosses the circuit level gateway, the gateway rewrites the packet so that the destination address is translated to the destination host’s private address.

[Image: nat2]

This image illustrates the changes in source and destination addresses as packets cross a circuit level gateway performing network address translation.

[Image: nat3]

One to One Translation
One form of NAT establishes a one to one translation between an equal number of private and public host addresses. For example, each host address on a /24 (Class C sized) network on the private side of a circuit level gateway is uniquely mapped to the corresponding host address on a /24 network on the public side of the gateway. If 10.1.1.0/24 is the private network address and 172.19.19.0/24 is the public network address, then outbound packets with a source address of 10.1.1.5 are always rewritten with a translated source address of 172.19.19.5, and inbound packets with a destination address of 172.19.19.5 are rewritten with a translated destination address of 10.1.1.5. The mapping is persistent and bi-directional. Therefore, connections may be initiated from either side of the circuit level gateway unless a default deny policy is applied.
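The fixed mapping described above can be sketched in a few lines of Python. This is only an illustration of the address arithmetic, not how a gateway implements it; the function name `translate` and the two constants are made up for the example.

```python
import ipaddress

# Example networks taken from the paragraph above.
PRIVATE_NET = ipaddress.ip_network("10.1.1.0/24")
PUBLIC_NET = ipaddress.ip_network("172.19.19.0/24")


def translate(addr, from_net, to_net):
    """Map addr to the host with the same offset in to_net (hypothetical helper)."""
    ip = ipaddress.ip_address(addr)
    if ip not in from_net:
        raise ValueError(f"{ip} is not in {from_net}")
    if from_net.num_addresses != to_net.num_addresses:
        raise ValueError("one to one NAT needs two blocks of equal size")
    offset = int(ip) - int(from_net.network_address)
    return str(to_net.network_address + offset)


# Outbound: the source address is rewritten.
print(translate("10.1.1.5", PRIVATE_NET, PUBLIC_NET))     # 172.19.19.5
# Inbound: the destination address is rewritten back.
print(translate("172.19.19.5", PUBLIC_NET, PRIVATE_NET))  # 10.1.1.5
```

Because the mapping is pure arithmetic, no translation table is needed, which is why it is persistent and works in both directions.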

Pool of Translated Addresses
One form of NAT maps a large block of addresses from the private network to a small pool of addresses on the public segment. For example, an entire Class A network may be mapped to part of a Class C network block. If 10.0.0.0/8 is the private segment’s network address and 172.19.19.0/28 is the public pool of addresses, then an outbound packet with a source address of 10.1.1.5 may be rewritten to have a translated source address of any host address in the pool of 172.19.19.0/28. The NAT gateway will then create a temporary entry in its internal translation table to track the mapping. An inbound packet’s destination address cannot be translated unless a corresponding entry exists in the NAT table. If a current translation exists in the NAT table, the inbound packet’s destination address will be rewritten in accordance with the NAT table entry. The mapping is not persistent and is only temporarily bi-directional. An inbound connection may be accepted only until the NAT table entry expires.
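A minimal sketch of such a translation table follows, in Python. The class name `PoolNat`, the 300 second default timeout and the "first free address" allocation policy are assumptions made for the example; a real gateway may choose addresses and timeouts differently. Time is passed in explicitly so the behaviour is easy to follow.

```python
import ipaddress


class PoolNat:
    """Toy model of NAT to a pool of public addresses (hypothetical)."""

    def __init__(self, pool_cidr, timeout=300.0):
        # Usable host addresses of the pool, in order.
        self.free = [str(h) for h in ipaddress.ip_network(pool_cidr).hosts()]
        self.timeout = timeout  # assumed lifetime of an idle entry, in seconds
        self.table = {}         # private address -> (public address, expiry)

    def _expire(self, now):
        # Drop expired entries and return their addresses to the pool.
        for private, (public, expiry) in list(self.table.items()):
            if expiry <= now:
                del self.table[private]
                self.free.append(public)

    def outbound(self, private_src, now):
        """Return the translated source address, or None if the pool is empty."""
        self._expire(now)
        if private_src in self.table:
            public = self.table[private_src][0]
        elif self.free:
            public = self.free.pop(0)
        else:
            return None
        self.table[private_src] = (public, now + self.timeout)
        return public

    def inbound(self, public_dst, now):
        """Return the private destination, or None if no entry exists."""
        self._expire(now)
        for private, (public, _) in self.table.items():
            if public == public_dst:
                return private
        return None


nat = PoolNat("172.19.19.0/28", timeout=60)
print(nat.outbound("10.1.1.5", now=0))      # 172.19.19.1
print(nat.inbound("172.19.19.1", now=30))   # 10.1.1.5
print(nat.inbound("172.19.19.1", now=61))   # None (entry has expired)
```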

Single Translated Addresses
The form of NAT commonly (but not exclusively) used in commercial circuit level gateways maps any number of addresses from the private network to a single address on the public segment. Given a private segment with the network address 10.0.0.0/8 and a NAT policy that sets 172.19.19.130 as the public address, all outbound packets from the private network will be rewritten to have a translated source address of 172.19.19.130. To correctly map replies to the private host that initiated the connection, the source port number of the outbound packet must also be translated. The NAT gateway will then create a temporary entry in its internal translation table to track the translated source address and port number. An inbound packet’s destination address and port number cannot be translated unless a corresponding entry exists in the NAT table. If a current translation exists in the NAT table, the inbound packet’s destination address and port number will be rewritten in accordance with the NAT table entry. The mapping is not persistent and is only temporarily bi-directional. An inbound connection may be accepted only until the NAT table entry expires.

This image illustrates the changes in IP addresses and port numbers as packets cross a circuit level gateway performing network address and port translation.
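The address and port translation described in this section can be modelled as two lookup tables. The sketch below is an illustration only: the class name `PortNat` and the sequential port allocation starting at 1024 are assumptions, and entry expiry is left out to keep it short.

```python
class PortNat:
    """Toy model of NAT to a single public address with port translation."""

    def __init__(self, public_ip, first_port=1024):
        self.public_ip = public_ip
        self.next_port = first_port  # assumed allocation policy: count upwards
        self.out = {}   # (private ip, private port) -> public port
        self.back = {}  # public port -> (private ip, private port)

    def outbound(self, src_ip, src_port):
        """Return the translated (source address, source port)."""
        key = (src_ip, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def inbound(self, dst_ip, dst_port):
        """Return the private (address, port), or None without a table entry."""
        if dst_ip != self.public_ip:
            return None
        return self.back.get(dst_port)


nat = PortNat("172.19.19.130")
# Two private hosts use the same source port; the gateway keeps them apart.
print(nat.outbound("10.1.1.5", 40000))       # ('172.19.19.130', 1024)
print(nat.outbound("10.2.2.9", 40000))       # ('172.19.19.130', 1025)
print(nat.inbound("172.19.19.130", 1025))    # ('10.2.2.9', 40000)
print(nat.inbound("172.19.19.130", 5555))    # None (no entry, not translated)
```

The second outbound call shows why the port has to be rewritten: without it, replies to the two private hosts would be indistinguishable on the public side.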

nat Chains
netfilter implements network address translation in the nat table. This pre-defined table consists of three built-in chains, the PREROUTING, OUTPUT and POSTROUTING chains. Rules in the PREROUTING chain apply to inbound packets (packets arriving at the gateway from any direction). Rules in the OUTPUT chain apply to locally generated packets (packets that are generated on the gateway itself). Rules in the POSTROUTING chain apply to outbound packets (packets leaving the gateway in any direction).

nat Targets
The nat table includes the built-in targets MASQUERADE, SNAT, DNAT, NETMAP and REDIRECT.

The MASQUERADE target is available in the POSTROUTING chain. MASQUERADE is intended to be used where a firewall’s public side IP address is dynamically assigned, such as where an ISP assigns IP addresses by DHCP. MASQUERADE translates all private network addresses to the single address of the external interface as illustrated, performing port translation as needed and rewriting the destination address and port of replies as needed. When the firewall’s external IP address is released or changed, all translations are dropped.

The SNAT target is available in the POSTROUTING chain. SNAT may be used on a firewall with statically assigned IP addresses. SNAT provides outbound (more trusted to less trusted) network address translation to a pool of public side addresses such that the source address of each outbound packet is translated to an address from the pool, with port translation being performed as needed and the destination address and port of replies being rewritten as needed.

SNAT can use a single public side address as an alternative to a pool of addresses, making SNAT comparable to MASQUERADE. However, SNAT should not be used with dynamically assigned public addresses.

In contrast to SNAT, the DNAT target is available in the PREROUTING and OUTPUT chains and provides inbound (less trusted to more trusted) network address translation. When a connection is initiated from a less trusted network, the destination address is the address of the firewall interface that faces the originating network. DNAT translates the destination address to the address of a host on a more trusted segment. Optionally, the destination port may also be translated. The source address and port of replies from the more trusted segment will be rewritten as needed.

DNAT can use a pool of destination addresses and ports, providing a simple circuit level method of performing load balancing across a number of hosts such as a farm of web servers.

The NETMAP target provides static one to one translation between two network blocks of equal size.

The REDIRECT target is available in the PREROUTING and OUTPUT chains. REDIRECT translates the destination IP address of each packet arriving on any interface to the IP address of the interface on which the packet arrived. For example, REDIRECT will translate the destination address of any packet arriving at eth2 to the IP address of eth2. Optionally, the destination port may also be translated. Among other uses, REDIRECT facilitates use of transparent proxies whereby client software such as web browsers may be automatically redirected through the firewall to a proxy server without reconfiguration on the client side.
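As a quick summary of which target goes in which chain, the statements above can be written down as a small lookup table. This Python sketch only restates the text; the names `TARGET_CHAINS` and `check_rule` are invented for the example, and NETMAP is left out because the text does not name its chains.

```python
# Chains in which each target is available, as stated in the text above.
TARGET_CHAINS = {
    "MASQUERADE": {"POSTROUTING"},
    "SNAT": {"POSTROUTING"},
    "DNAT": {"PREROUTING", "OUTPUT"},
    "REDIRECT": {"PREROUTING", "OUTPUT"},
}


def check_rule(chain, target):
    """Return True if the text above allows this target in this chain."""
    allowed = TARGET_CHAINS.get(target)
    if allowed is None:
        raise ValueError(f"no chain information for target {target!r}")
    return chain in allowed


print(check_rule("POSTROUTING", "MASQUERADE"))  # True
print(check_rule("POSTROUTING", "DNAT"))        # False
```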

10 Mar 2009

Traffic classification using BGP (a quagga+realms approach)

Realms patch - Quagga 0.98.6

Stable: quagga-0.98.6-realms.diff
Development: quagga-0.99.5-realms.diff
Updated versions (>0.99.5) - http://linux.mantech.ro/quagga+realm_en.html

This patch enables Linux route realms support in quagga 0.98.6.
It is based on Arcady Stepanov’s patch for zebra 0.93b (http://win.mol.ru/penguin/zebra-hacks/), adapted to the quagga 0.98.4 interface, with some useful features added.
The following commands are supported:

* Route-map

bgpd(config-route-map)# set realm
<1-255>    Realm id for Linux FIB routes
WORD       Realm name for Linux FIB routes
origin-as  Use route origin AS as realm id
peer-as    Use route peer AS as realm id

bgpd(config-route-map)# no set realm
<0-255>    Realm value
WORD       Realm name
origin-as  Origin AS - realm
peer-as    Peer AS - realm
<cr>

* Neighbor

bgpd(config-router)# neighbor x.x.x.x realm
<0-255>    default realm id
WORD       default realm name
origin-as  Set default realm to received route origin AS
peer-as    Set default realm to peer AS

bgpd(config-router)# no neighbor x.x.x.x realm
<0-255>    default realm id
WORD       default realm name
origin-as  Set default realm to received route origin AS
peer-as    Set default realm to peer AS
<cr>

Note:

'set realm origin-as' was added with inter-AS traffic accounting in mind. For now, this works only with the iptables realm match, which can match on the full 16-bit realm value. The current realm accounting code in the kernel (rtacct, /proc/net/rt_acct) supports only 256 realm values and displays incorrect statistics.

Bugs/suggestions should go to: vcalinusATgemenii.ro
Brief usage guide…

0. kernel support (if you want to classify traffic into htb classes using tc)

CONFIG_NET_CLS_ROUTE=y

1. /etc/iproute2/rt_realms

Assign meaningful names to realm numbers...

user@router:/# cat /etc/iproute2/rt_realms

10 localnet
20 metro-isp
22 metro-other
30 international
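
If you script against this file, the "number name" lines are easy to read into a lookup table. A minimal Python sketch, assuming one decimal realm number and one name per line and `#` for comments; the function name `parse_rt_realms` is invented for the example:

```python
def parse_rt_realms(text):
    """Return a {name: number} dict from rt_realms-style text."""
    realms = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        number, name = line.split()[:2]
        realms[name] = int(number)
    return realms


SAMPLE = """
10 localnet
20 metro-isp
22 metro-other
30 international
"""

print(parse_rt_realms(SAMPLE)["metro-isp"])  # 20
```

On a router you would pass in the contents of /etc/iproute2/rt_realms instead of the sample string.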

2. compile/install quagga

Stable Quagga 0.98.6
quagga 0.98.6 - official release
+
quagga 0.98.6 realms patch
Big thanks to Alin Nastac for updating the patch to 0.98.6!

Patch for development Quagga 0.99.5
quagga-0.99.5-realms.diff
Older patches

quagga-0.98.5-realms.diff
quagga-0.98.4-realms.diff
quagga-0.98.3-realms.diff

Remember to use  ./configure --enable-realms

3. bgp configuration
A possible bgp setup is shown below.
(If you hold the full routing table, replace the defgw match with a match on the desired community. An AS-regexp match is also possible.)

neighbor xxx.xxx.xxx.xxx remote-as XXXXX
neighbor xxx.xxx.xxx.xxx soft-reconfiguration inbound
neighbor xxx.xxx.xxx.xxx route-map isp_in in

ip prefix-list defgw seq 5 permit 0.0.0.0/0

ip community-list standard metro-isp permit XXXXX:comm1
ip community-list standard metro-other permit XXXXX:comm2

route-map isp_in permit 10
match ip address prefix-list defgw
set realm 30
!
route-map isp_in permit 20
match community metro-isp
set realm 20
!
route-map isp_in permit 30
match community metro-other
set realm 22
!
route-map isp_in permit 40

3.1 'ip route sh' will show kernel routes - they should have the realms specified in the route-map

something like....

62.217.192.0/18 via 193.19.192.65 dev eth1  proto zebra equalize realm 20
82.137.0.0/18 via 172.16.100.1 dev eth2  proto zebra equalize realm 22
84.243.64.0/18 via 172.16.100.1 dev eth2  proto zebra equalize realm 20
82.208.128.0/18 via 193.19.192.65 dev eth1  proto zebra equalize realm 22

4. iptables

Can be used in FORWARD or POSTROUTING (remember that realms are valid only after the forwarding decision)

Download: match default route, community 1, and community 2 sets

-A FORWARD -i eth3 -m realm --realm 0x1e0000/0xffff0000 -j sometarget...
-A FORWARD -i eth3 -m realm --realm 0x140000/0xffff0000 -j sometarget...
-A FORWARD -i eth3 -m realm --realm 0x160000/0xffff0000 -j sometarget...

Upload: match default route, community 1, and community 2 sets

-A FORWARD -o eth3 -m realm --realm 0x1e/0xffff -j sometarget...
-A FORWARD -o eth3 -m realm --realm 0x14/0xffff -j sometarget...
-A FORWARD -o eth3 -m realm --realm 0x16/0xffff -j sometarget...

(realms 30, 20 and 22 are specified in hexadecimal: 0x1e, 0x14 and 0x16)
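
The value/mask pairs in the rules above follow from the realm layout: the source realm sits in the upper 16 bits and the destination realm in the lower 16 bits. A small Python helper can generate the argument for `--realm`; the function name `realm_match` and its `side` parameter are invented for this sketch.

```python
def realm_match(realm, side):
    """Build a value/mask string for the iptables realm match.

    side is "src" to match the source realm (upper 16 bits)
    or "dst" to match the destination realm (lower 16 bits).
    """
    if not 0 <= realm <= 0xFFFF:
        raise ValueError("realm must fit in 16 bits")
    if side == "src":
        return f"{realm << 16:#x}/{0xFFFF0000:#x}"
    if side == "dst":
        return f"{realm:#x}/{0xFFFF:#x}"
    raise ValueError("side must be 'src' or 'dst'")


print(realm_match(30, "src"))  # 0x1e0000/0xffff0000
print(realm_match(22, "dst"))  # 0x16/0xffff
```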

5. tc

Excerpt from LARTC

# ip route add 192.168.2.0/24 dev eth2 realm 2
# tc filter add dev eth1 parent 1:0 protocol ip prio 100 \
  route from 2 classid 1:2

Here the filter specifies that packets from the subnetwork 192.168.2.0 (realm 2) will match class id 1:2.

You can also find useful QoS stuff at: http://kernel.umbrella.ro/net/

6. what are realms after all?

Realms are 16-bit integer values used to group routes into sets according to some defined policy. Each route in a set has the same realm.

Each routed packet carries a 32-bit integer value specifying a source realm and a destination realm (either may be 0, meaning unknown).
The upper (leftmost) 16 bits hold the source realm and the lower (rightmost) 16 bits hold the destination realm.
More info: http://www.policyrouting.org/iproute2.doc.html#ss9.9
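
To make the layout concrete, here is a short Python sketch that packs a source and a destination realm into one 32-bit value and splits it again. The function names `join_realms` and `split_realms` are made up for the illustration; the kernel does this internally.

```python
def join_realms(src, dst):
    """Pack source realm (upper 16 bits) and destination realm (lower 16 bits)."""
    return ((src & 0xFFFF) << 16) | (dst & 0xFFFF)


def split_realms(value):
    """Return (source realm, destination realm) from a packed 32-bit value."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


packed = join_realms(30, 20)   # from "international" to "metro-isp"
print(hex(packed))             # 0x1e0014
print(split_realms(packed))    # (30, 20)
print(split_realms(0x16))      # (0, 22): source realm unknown
```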