When setting up an experiment, numerous issues can prevent VMs from being able to connect to one another. This article describes some troubleshooting steps that may help to resolve the issue.
We will use the following simple test environment to concretely describe where potential connectivity issues may arise:
# Create three VM environment: # A --- router --- B # Create a router between two subnets clear vm config vm config kernel $images/minirouter.kernel vm config kernel $images/minirouter.initrd vm config net A B vm launch kvm router # Configure router, enable dhcp for VMs and OSPF between interfaces router router interface 0 10.0.0.1/24 router router dhcp 10.0.0.1 range 10.0.0.2 10.0.0.2 router router route ospf 0 0 router router interface 1 184.108.40.206/24 router router dhcp 220.127.116.11 range 18.104.22.168 22.214.171.124 router router route ospf 0 1 router router vm start router # Create two clients, one in each subnet clear vm config vm config kernel $images/miniccc.kernel vm config initrd $images/miniccc.initrd vm config net A vm launch kvm A vm config net B vm launch kvm B vm start all
In this environment, we assume that we wish to connect from VM A to VM B and vice versa.
Ping is one of the most basic connectivity tests. It can be used to ping another IP address on the same subnet or on another subnet if there is a route between them. For example, a working ping from A to the router interface on the same subnet:
$ ping -c 3 10.0.0.1 PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data. 64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.079 ms 64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.085 ms 64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.085 ms --- 10.0.0.1 ping statistics --- 3 packets transmitted, 3 received, 0% packet loss, time 2029ms rtt min/avg/max/mdev = 0.079/0.083/0.085/0.003 ms
If there ping cannot reach the destination, it would look something like:
$ ping -c 3 126.96.36.199 PING 188.8.131.52 (184.108.40.206) 56(84) bytes of data. From 10.0.0.1 icmp_seq=1 Destination Host Unreachable From 10.0.0.1 icmp_seq=2 Destination Host Unreachable From 10.0.0.1 icmp_seq=3 Destination Host Unreachable --- 220.127.116.11 ping statistics --- 3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2043ms
Ping can be used to differentiate between a subnet issue and a routing issue. If VM A can ping VM B, we are in great shape, we will likely be able to connect from VM A to B. If we cannot, we should test whether VM A can ping the router interface on the same subnet (i.e. 10.0.0.1). If it can, we should then test whether VM A can ping the router interface on the B subnet (i.e. 18.104.22.168). If we cannot, we most likely have a routing issue. If VM A cannot ping the router at all, we most likely have a subnet issue.
There are a number of reasons that VM A might not be able to ping the router interface on the same subnet. We enumerate a few below:
- Mistyped VLANS: when configuring the VLANs, one or both of the VLANs were mistyped. VMs must be on the same VLAN to ensure layer 2 connectivity. - Duplicate IPs: the IP addresses of the VMs are identical. VMs must have distinct IP addresses on the same subnet to ensure layer 3 connectivity. - Subnets misconfigured: the IP address of the VMs are in different subnets. The broadcast address must be the same to ensure layer 3 connectivity.
A helpful tool to enumerate the IPs in a subnet is available on the Go Playground. Simply change the CIDR on line 10 and then hit run and the output will be all the IPs in the subnet. If both IPs are not in the list, then you will not have layer 3 connectivity. The last IP in the list is the broadcast IP and is typically not used by an individual VM.
ip neighbor can be used to list the layer 2 neighbors for an interface:
10.0.0.1 dev eth0 lladdr 00:19:7e:7d:2b:d2 REACHABLE
If the table is empty or the entry says
FAILED, there is likely a layer 2
tcpdump can be used to sniff traffic on an interface. This can be helpful to
see if packets are making it to the VM but are being dropped by the kernel. For
$ tcpdump -i eth0
If you can see pings but are not getting a response, there is likely a subnet misconfiguration or a firewall blocking pings.
Routing issues can be difficult to diagnose, especially in large networks. Here, we list some basic troubleshooting ideas but it is far from exhaustive.
- Double check all IPs and masks - [[http://bird.network.cz/?get_doc&v=20&f=bird-4.html][Check bird status]] (if using minirouter)