The peerDbDelete was passing the wrong field to the underlay
Delete operation causing the mac entry to not being deleted
from the bridge on the overlay. This caused connectivity issue
when a container that before was remote was now scheduled
on the local node. The entry was such:
bridge fdb show | grep -i 02:42:0a:01:00:02
02:42:0a:01:00:02 dev vxlan0 master br0
02:42:0a:01:00:02 dev vxlan0 dst 172.31.14.63 link-netnsid 0 self permanent
That was still pointing to a remove node
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Refreshed the PR: https://github.com/docker/libnetwork/pull/1585
Addressed comments suggesting to remove the IPAlias logic not anymore used
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Move the sandbox init logic into the go routine that handles
peer operations.
This is to avoid deadlocks in the use of the pMap.Lock for the
network
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Remove the need for the wait group and avoid new
locks
Added utility to print the method name and the caller name
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
This patch improves debugging for the resolver;
- prefix debug messages with `[resolver]` for easier finding in the daemon logs
- use `A` / `AAAA` for query-types in the logs instead of their numeric code
- add debug messages if the external DNS did not return a result
- print sucessful results (t.b.d.)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Before when a node was failing, all the nodes would bump the lamport time of all their
entries. This means that if a node flap, there will be a storm of update of all the entries.
This commit on the base of the previous logic guarantees that only the node that joins back
will readvertise its own entries, the other nodes won't need to advertise again.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
join/leave fixes:
- when a node leaves the network will deletes all the other nodes entries but will keep track of its
to make sure that other nodes if they are tcp syncing will be aware of them being deleted. (a node that
did not yet receive the network leave will potentially tcp/sync)
add network reapTime, was not being set locally
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Remove the need for the wait group and avoid new
locks
Added utility to print the method name and the caller name
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Deletion of the dynamic mac is expected to work only if there was active
traffic with that endpoint and a dynamic entry exists. It can also age
out. Hence the mac removal failing is not error. Removing it to make the
debugging easier when parsing the logs.
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
- Diagnose framework that exposes REST API for db interaction
- Dockerfile to build the test image
- Periodic print of stats regarding queue size
- Client and server side for integration with testkit
- Added write-delete-leave-join
- Added test write-delete-wait-leave-join
- Added write-wait-leave-join
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
- Introduce the possibility to specify the max buffer length
in network DB. This will allow to use the whole MTU limit of
the interface
- Add queue stats per network, it can be handy to identify the
node's throughput per network and identify unbalance between
nodes that can point to an MTU missconfiguration
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
A rapid (within networkReapTime 30min) leave/join network
can corrupt the list of nodes per network with multiple copies
of the same nodes.
The fix makes sure that each node is present only once
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>