Wednesday, February 17, 2016

Docker Container Networks: All Things Considered

Docker container networks have to achieve two seemingly conflicting goals:
  1. Provide complete isolation for containers on a host
  2. Make the service running inside a container available not only to other co-located containers, but also to remote hosts
In this article, we will review how Docker container networks achieve these goals.

Docker Container Networks


To provide a service that is running inside a container in a secure manner, it is important to have control over the networks your applications run on. To see how container networks achieve that, we will examine them from the following perspectives:
  • Network modes
    • Default networks vs user-defined networks
  • Packet Forwarding and Filtering (Netfilter)
    • Port mappings
  • Bridge (veth Interface)
  • DNS Configuration
To enable a service consumer to communicate with the service providing containers, Docker needs to configure the following entities:
  • IP address
    • Providing ways to configure any of the container's network interfaces to support services on different containers
  • Port
    • Providing ways to expose and publish a port on the container (also, mapping it to a port on the host)
  • Rules
    • Controlling access to a container's service via rules associated with the host's netfilter framework, in both the NAT and filter tables (see diagram 123).
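To make the IP address and port entities above concrete, here is a minimal Python sketch that parses the value given to docker run -p (for example 8080:80). The PortMapping type and the parse_publish_spec function are names invented for this illustration, not part of Docker, and the sketch assumes IPv4 addresses and only the tcp and udp protocols.

```python
from typing import NamedTuple, Optional


class PortMapping(NamedTuple):
    host_ip: Optional[str]    # None means "all host addresses"
    host_port: Optional[int]  # None means "let Docker pick a host port"
    container_port: int
    protocol: str             # "tcp" or "udp" in this sketch


def parse_publish_spec(spec: str) -> PortMapping:
    """Parse a publish value such as '8080:80' or '127.0.0.1:8080:80/udp'.

    Assumption: IPv4 only. IPv6 literals contain colons and are not handled.
    """
    ports, _, proto = spec.partition("/")
    protocol = proto or "tcp"
    if protocol not in ("tcp", "udp"):
        raise ValueError(f"unsupported protocol in this sketch: {protocol!r}")

    parts = ports.split(":")
    if len(parts) == 1:
        # containerPort
        host_ip, host_port, container_port = None, None, parts[0]
    elif len(parts) == 2:
        # hostPort:containerPort
        host_ip = None
        host_port, container_port = parts
    elif len(parts) == 3:
        # ip:hostPort:containerPort or ip::containerPort
        host_ip, host_port, container_port = parts
    else:
        raise ValueError(f"cannot parse publish spec: {spec!r}")

    return PortMapping(
        host_ip=host_ip or None,
        host_port=int(host_port) if host_port else None,
        container_port=int(container_port),
        protocol=protocol,
    )
```

For instance, parse_publish_spec("8080:80") describes host port 8080 on all host addresses mapped to container port 80 over TCP.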

    Network Modes


    When you install Docker, it creates three networks automatically:[34,37]
    1. bridge
      • Represents the docker0 (a virtual ethernet bridge) network present in all Docker installations.
        • Each container's network interface is attached to the bridge, and network address translation (NAT) is used when containers need to make themselves visible to the Docker host and beyond.
      • Unless you specify otherwise with the docker run --net=<NETWORK> option, the Docker daemon connects containers to this network by default.
      • Docker does not support automatic service discovery on the default bridge network.
      • Supports the use of port mapping and docker run --link to allow communications between containers in the docker0 network.
    2. host
      • Adds a container on the host's network stack. You’ll find the network configuration inside the container is identical to the host's.
        • Because containers deployed in host mode share the host's network stack, you can’t bind the same IP address and port for the same service in different containers on the same host.
        • In this mode, you don't get port mapping anymore.
    3. none
      • Tells Docker to put the container in its own network stack but not to configure any of the container's network interfaces.
      • This allows you to create a custom network configuration.
    All these network modes are applied at the container level, so you can certainly have a mix of different network modes on the same Docker host.

    Default Networks vs User-Defined Networks

    Besides the default networks, you can create your own user-defined networks that better isolate containers. Docker provides some default network drivers for creating these networks. The easiest user-defined network to create is a bridge network. This network is similar to the historical, default docker0 network. After you create the network, you can launch containers on it using the docker run --net=<NETWORK> option. Within a user-defined bridge network, linking is not supported. You can expose and publish container ports on containers in this network. This is useful if you want to make a portion of the bridge network available to an outside network.
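As a rough illustration, the Python sketch below only builds the command lines for creating a user-defined bridge network and starting a container on it; it does not execute anything. The network name my-net and the image name nginx are placeholders, and the helper functions are hypothetical. Rejecting published ports in host mode is a choice made for this sketch, reflecting the point that host mode gives you no port mapping.

```python
import shlex

# The three networks Docker creates automatically.
DEFAULT_NETWORKS = ("bridge", "host", "none")


def network_create_command(name, driver="bridge"):
    """Build the command that creates a user-defined network."""
    return ["docker", "network", "create", "--driver", driver, name]


def run_command(image, network="bridge", publish=()):
    """Build a 'docker run' command that attaches a container to a network.

    'publish' holds -p values such as '8080:80'.
    """
    if network == "host" and publish:
        # Sketch-level policy: host mode has no port mapping to configure.
        raise ValueError("port mappings do not apply in host mode")
    cmd = ["docker", "run", "-d", f"--net={network}"]
    for spec in publish:
        cmd += ["-p", spec]
    cmd.append(image)
    return cmd


def as_shell(cmd):
    """Render an argument list as a single shell-safe string."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    print(as_shell(network_create_command("my-net")))
    print(as_shell(run_command("nginx", network="my-net", publish=["8080:80"])))
```

Running the script prints the two commands so they can be reviewed before being run by hand on a Docker host.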

    You can read [34, 37] for more details.

    Packet Forwarding and Filtering


    Whether a container can talk to the world is governed by two factors.
    1. Whether the host machine is forwarding its IP packets
      • In order for a remote host to consume a container's service, the Docker host must act like a router, forwarding traffic to the network associated with the ethernet bridge.
      • IP packet forwarding is governed by the ip_forward kernel parameter on the Docker host
        • Many Docker users will want ip_forward to be on, to at least make communication possible between containers and the wider world.[39]
    2. Whether the host's iptables rules allow this particular connection[45]
      • Docker will never make changes to your host's iptables rules if you set --iptables=false when the daemon starts. Otherwise the Docker server will append forwarding rules to the DOCKER filter chain.
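The first factor can be checked from a script. The sketch below assumes a Linux host where the ip_forward setting is exposed at /proc/sys/net/ipv4/ip_forward; the function names are made up for this example.

```python
from pathlib import Path

# Assumed location of the kernel setting on a Linux host.
IP_FORWARD_PATH = "/proc/sys/net/ipv4/ip_forward"


def parse_ip_forward(text):
    """Interpret the file content: '1' means forwarding is on."""
    return text.strip() == "1"


def ip_forward_enabled(path=IP_FORWARD_PATH):
    """Return True or False for the setting, or None if it cannot be read."""
    try:
        return parse_ip_forward(Path(path).read_text())
    except OSError:
        # Not a Linux host, or the file is not readable from here.
        return None


if __name__ == "__main__":
    state = ip_forward_enabled()
    if state is None:
        print("ip_forward could not be read on this machine")
    else:
        print("ip_forward is", "on" if state else "off")
```

Returning None for an unreadable file keeps "forwarding is off" distinct from "could not tell", which matters when the check runs somewhere other than the Docker host.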
    Access to a container's service is controlled by rules associated with the host's netfilter framework, in both the NAT and filter tables. A Docker host makes significant use of netfilter rules to aid NAT, and to control access to the containers it hosts.[44]

    Netfilter offers various functions and operations for packet filtering, network address translation, and port translation. These provide the functionality required for directing packets through a network, as well as the ability to prohibit packets from reaching sensitive locations within a computer network.
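To give a feel for what such rules look like, the Python sketch below builds (but does not run) one NAT rule and one filter rule for a published port. The rule shapes are modeled on what is commonly seen on Docker hosts and should be treated as an approximation: the exact rules differ between Docker versions. The container address 172.17.0.2 and the port numbers are assumptions, and the helper names are hypothetical.

```python
def dnat_rule(host_port, container_ip, container_port,
              protocol="tcp", bridge="docker0"):
    """NAT table: rewrite traffic arriving on host_port so that it is
    delivered to the container. Traffic entering from the bridge itself
    is excluded with '! -i'."""
    return [
        "iptables", "-t", "nat", "-A", "DOCKER",
        "!", "-i", bridge,
        "-p", protocol, "--dport", str(host_port),
        "-j", "DNAT", "--to-destination", f"{container_ip}:{container_port}",
    ]


def accept_rule(container_ip, container_port,
                protocol="tcp", bridge="docker0"):
    """Filter table: allow the forwarded traffic to reach the container
    port through the bridge."""
    return [
        "iptables", "-t", "filter", "-A", "DOCKER",
        "-d", container_ip,
        "!", "-i", bridge, "-o", bridge,
        "-p", protocol, "--dport", str(container_port),
        "-j", "ACCEPT",
    ]


if __name__ == "__main__":
    # Assumed example: host port 8080 published to 172.17.0.2:80.
    print(" ".join(dnat_rule(8080, "172.17.0.2", 80)))
    print(" ".join(accept_rule("172.17.0.2", 80)))
```

Comparing the printed lines with the output of iptables-save on a real Docker host is a quick way to see how the daemon's own rules differ.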

    Bridge (veth Interface)


    The default network mode in Docker is bridge. To create a virtual subnet shared between the host machine and every container in bridge mode, Docker binds every veth* interface to the docker0 bridge.

    To show information on the bridge and its attached ports (or interfaces), run:

    # brctl show
    bridge name bridge id         STP enabled interfaces
    docker0     8000.56847afe9799 no          veth33957e0
                                              veth6cee79b
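If you want to work with this listing in a script, a small parser is enough. The following Python sketch turns brctl show output into a mapping from bridge name to attached interfaces. The function name is invented here, and the parser assumes the column layout shown above (name, id, STP flag, then interfaces, with extra interfaces on indented continuation lines).

```python
SAMPLE = """\
bridge name bridge id         STP enabled interfaces
docker0     8000.56847afe9799 no          veth33957e0
                                          veth6cee79b
"""


def parse_brctl_show(output):
    """Map each bridge name to the list of interfaces attached to it."""
    bridges = {}
    current = None
    for line in output.splitlines():
        fields = line.split()
        if not fields or fields[:2] == ["bridge", "name"]:
            continue  # skip blank lines and the header row
        if line[0].isspace():
            # Indented continuation line: one more interface on the
            # bridge seen most recently.
            if current is not None:
                bridges[current].append(fields[0])
        else:
            # New bridge row: name, id, STP flag, optional first interface.
            current = fields[0]
            bridges[current] = fields[3:]
    return bridges


if __name__ == "__main__":
    print(parse_brctl_show(SAMPLE))
```

On a live host the input would come from the command itself, for example via subprocess.run(["brctl", "show"], capture_output=True, text=True).stdout.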


    To show veth interfaces on a host, run:

    # ip link list
    3: docker0: mtu 9000 qdisc noqueue state UP link/ether 56:84:7a:fe:97:99 brd ff:ff:ff:ff:ff:ff
    11: veth33957e0: mtu 9000 qdisc noqueue master docker0 state UP link/ether 3e:01:d1:0f:24:b8 brd ff:ff:ff:ff:ff:ff
    13: veth6cee79b: mtu 9000 qdisc noqueue master docker0 state UP link/ether fa:aa:84:15:82:5a brd ff:ff:ff:ff:ff:ff


    Note that there are two containers on the host, hence two veth interfaces are shown. Those virtual interfaces work in pairs:
    • eth0 in the container 
      • Will have an IPv4 address 
      • For all purposes, it looks like a normal interface. 
    • veth interface in the host 
      • Won't have an IPv4 address
    Those two interfaces are connected together: any packet sent on an interface will appear as being received by the other. You can imagine that they are connected by a cross-over cable, if that helps.
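The cross-over cable picture can be modeled in a few lines of Python. This is a toy model for intuition only, not how the kernel implements veth devices; the class and function names are invented for the example.

```python
from collections import deque


class VethEnd:
    """One end of a simulated veth pair."""

    def __init__(self, name):
        self.name = name
        self.peer = None          # set by make_veth_pair()
        self.rx_queue = deque()   # packets waiting to be received

    def send(self, packet):
        """Anything sent on this end shows up as received on the peer."""
        if self.peer is None:
            raise RuntimeError(f"{self.name} has no peer")
        self.peer.rx_queue.append(packet)

    def receive(self):
        """Return the oldest pending packet, or None if there is none."""
        return self.rx_queue.popleft() if self.rx_queue else None


def make_veth_pair(name_a, name_b):
    """Create two ends and wire them to each other."""
    a, b = VethEnd(name_a), VethEnd(name_b)
    a.peer, b.peer = b, a
    return a, b


if __name__ == "__main__":
    container_eth0, host_veth = make_veth_pair("eth0", "veth33957e0")
    container_eth0.send(b"request from container")
    print(host_veth.receive())
```

In the real setup the host end is additionally attached to docker0, so a packet leaving the container's eth0 arrives on the bridge.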

    DNS Configuration


    How can Docker supply each container with a hostname and DNS configuration, without having to build a custom image with the hostname written inside? Its trick is to overlay three crucial /etc files inside the container with virtual files where it can write fresh information. You can see this by running mount inside a container:[29]

    # mount
    /dev/mapper/vg--docker-dockerVolume on /etc/resolv.conf type btrfs ...
    /dev/mapper/vg--docker-dockerVolume on /etc/hostname type btrfs ...
    /dev/mapper/vg--docker-dockerVolume on /etc/hosts type btrfs ...

    This arrangement allows Docker to do clever things like keep resolv.conf up to date across all containers when the host machine receives new configuration over DHCP later.
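To inspect what Docker has written into a container's resolv.conf, you can parse the file with a short script. The Python sketch below handles the nameserver, search, and options keywords; the function name and the sample content (including the addresses) are made up for illustration.

```python
SAMPLE = """\
# hypothetical content written by Docker
search example.com
nameserver 8.8.8.8
nameserver 8.8.4.4
options ndots:1
"""


def parse_resolv_conf(text):
    """Extract nameservers, search domains, and options from the file text."""
    config = {"nameserver": [], "search": [], "options": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            config["nameserver"].append(values[0])
        elif keyword == "search":
            config["search"] = values  # the last search line wins
        elif keyword == "options":
            config["options"].extend(values)
    return config


if __name__ == "__main__":
    try:
        with open("/etc/resolv.conf") as handle:
            print(parse_resolv_conf(handle.read()))
    except OSError:
        print(parse_resolv_conf(SAMPLE))
```

Run inside a container, the script reads the overlaid /etc/resolv.conf; elsewhere it falls back to the sample text.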

    With DHCP, computers request IP addresses and networking parameters automatically from a DHCP server, reducing the need for a network administrator or a user to configure these settings manually. On resource-constrained routers and firewalls, dnsmasq is often used for its small footprint. Dnsmasq provides network infrastructure for small networks: DNS, DHCP, router advertisement, and network boot.

    References

    1. The TCP Maximum Segment Size and Related Topics
    2. Jumbo/Giant Frame Support on Catalyst Switches Configuration Example
    3. Ethernet Jumbo Frames
    4. IP Fragmentation: How to Avoid It? (Xml and More)
    5. The Great Jumbo Frames Debate
    6. Resolve IP Fragmentation, MTU, MSS, and PMTUD Issues with GRE and IPSEC
    7. Sites with Broken/Working PMTUD
    8. Path MTU Discovery
    9. TCP headers
    10. bad TCP checksums
    11. MSS performance consideration
    12. Understanding Routing Table
    13. route (Linux man page)
    14. Docker should set host-side veth MTU #4378
    15. Add MTU to lxc conf to make host and container MTU match
    16. Xen Networking
    17. TCP parameter settings (/proc/sys/net/ipv4)
    18. Change the MTU of a network interface
      • tcp_base_mss, tcp_mtu_probing, etc
    19. MTU manipulation
    20. Jumbo Frames, the gotcha's you need to know! (good)
    21. Understand container communication (Docker)
    22. calicoctl should allow configuration of veth MTU #488 - GitHub
    23. Linux MTU Change Size
    24. Changing the MTU size in Windows Vista, 7 or 8
    25. Linux Configure Jumbo Frames to Boost Network Performance
    26. Path MTU discovery in practice
    27. 10 iptables rules to help secure your Linux box
    28. An Updated Performance Comparison of Virtual Machines and Linux Containers
    29. Network Configuration (Docker)
    30. Storage Concepts in Docker: Persistent Storage
    31. Xen org
    32. Cloud Architectures, Networks, Services, and Management
    33. Cloud Networking
    34. Docker Networking 101 – Host mode
    35. Configuring DNS (Docker)
    36. Configuring dnsmasq to serve my own domain name zone
    37. Understand Docker container networks (Docker)
    38. dnsmasq - A lightweight DHCP and caching DNS server.
    39. Understand Container Communication
    40. Linux: Check if in Same Network
    41. Packet flow in Netfilter and General Networking (diagram)
    42. How to Enable IP Forwarding in Linux
    43. Exposing a port on a live docker container
    44. The docker-proxy
    45. Linux Firewall Tutorial: IPTables Tables, Chains, Rules Fundamentals
    46. iptables (ipset.netfilter.org)
    47. How to find out capacity for network interfaces?
    48. Security Considerations: Enabling/Disabling Ping /Traceroute for Your Network (Xml and More)
    49. How to Read a Traceroute (good)
