53 Commits

Author SHA1 Message Date
Derek McGowan 43f00b74d7 Update logrus to v1.0.1
Fix case sensitivity issue
Update docker and runc vendors

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
2017-08-07 11:20:47 -07:00
Flavio Crisciani 37502aca3c
PeerInit for the sandbox init
Move the sandbox init logic into the go routine that handles
peer operations.
This is to avoid deadlocks in the use of the pMap.Lock for the
network

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-08-05 12:07:31 -07:00
Flavio Crisciani ceb8146a90
NetworkDB allow setting PacketSize
- Introduce the possibility to specify the max buffer length
  in network DB. This allows using the whole MTU limit of
  the interface

- Add queue stats per network. They can be handy to identify the
  node's throughput per network and to spot an imbalance between
  nodes that can point to an MTU misconfiguration

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-07-26 13:44:33 -07:00
Flavio Crisciani c6917f91f5
Add gosimple check
Add the gosimple tool check in the Makefile
Fix all the issues identified

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-07-06 09:42:38 -07:00
Flavio Crisciani 6426d1e66f Service discovery race on serviceBindings delete. Bug on IP reuse (#1808)
* Correct SetMatrix documentation

The SetMatrix is a generic data structure, so the description
should not be tied to any specific use

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>

* Service Discovery reuse name and serviceBindings deletion

- Added logic to handle name reuse from different services
- Moved the deletion from the serviceBindings map at the end
  of the rmServiceBindings body to avoid race with new services

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>

* Avoid race on network cleanup

Use the locker to avoid the race between the network
deletion and new endpoints being created

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>

* CleanupServiceBindings to clean the SD records

Allow the cleanupServicebindings to take care of the service discovery
cleanup. Also avoid triggering the cleanup for each endpoint from an SD
point of view
LB and SD will be separated in the future

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>

* Addressed comments

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>

* NetworkDB deleteEntry has to happen

If there is an error locally, guarantee that the delete entry
on network DB is still honored

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-06-18 05:25:58 -07:00
Flavio Crisciani b34bc70afb
Fix handleEPTable log
There was an extra parameter not in the formatters

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-06-13 15:47:31 -07:00
Flavio Crisciani 4994c597ce
Fixed code issues
Fixed issues highlighted by the new checks

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-06-12 11:31:35 -07:00
Flavio Crisciani d64e71e4f7
Service discovery logic rework
Changed the ipMap to SetMatrix to allow transient states
Compacted addSvc and deleteSvc into a single method
Updated the data structure for backends to store all the information needed
to clean up properly during cleanupServiceBindings
Removed the enable/disable Service logic that was racing with the sbLeave/sbJoin logic
Added some debug logs to track further race conditions

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-06-11 20:49:29 -07:00
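The idea behind replacing a plain map with a set-valued one (commit d64e71e4f7 above) can be sketched as follows. This is a minimal illustration, not the libnetwork SetMatrix implementation: the type `setMatrix` and its `insert` and `remove` methods are hypothetical names, and the real type's API is not shown in this log. The point is that one key may temporarily hold several values, for instance while an old and a new backend overlap.

```go
package main

import "fmt"

// setMatrix is a hypothetical, simplified stand-in: each key maps to a set
// of values instead of a single value.
type setMatrix map[string]map[string]struct{}

// insert adds value to the set stored under key and returns the set size.
func (m setMatrix) insert(key, value string) int {
	if m[key] == nil {
		m[key] = make(map[string]struct{})
	}
	m[key][value] = struct{}{}
	return len(m[key])
}

// remove deletes value from the set under key, drops the key once its set
// is empty, and returns the remaining set size.
func (m setMatrix) remove(key, value string) int {
	set := m[key]
	delete(set, value) // deleting from a nil map is a no-op
	if len(set) == 0 {
		delete(m, key)
		return 0
	}
	return len(set)
}

func main() {
	m := setMatrix{}
	fmt.Println(m.insert("web", "10.0.0.2")) // one address for the name
	fmt.Println(m.insert("web", "10.0.0.3")) // transient state: two addresses
	fmt.Println(m.remove("web", "10.0.0.2")) // back to a single address
}
```

With a one-value-per-key map, the second insert would overwrite the first entry and the later delete of the old address would have nothing to match against.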
Flavio Crisciani 1045cfeda2
Fix leak of handleTableEvents
The channel ch.C is never closed.
Added a listener on ch.Done() to guarantee
that the goroutine exits once the event channel
is closed

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-05-31 11:04:19 -07:00
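The leak fix described in commit 1045cfeda2 above follows a common Go pattern: select on both the data channel and a done channel, so the goroutine has a way out even though the data channel is never closed. A minimal sketch, assuming a hypothetical `watch` type that stands in for the `ch` object named in the commit message (only `C` and `Done()` come from the message; everything else is illustrative):

```go
package main

import "fmt"

// watch is a hypothetical stand-in for the event channel wrapper:
// C carries events, Done() is closed when the watch is torn down.
type watch struct {
	C    chan string
	done chan struct{}
}

func newWatch() *watch {
	return &watch{C: make(chan string), done: make(chan struct{})}
}

func (w *watch) Done() <-chan struct{} { return w.done }
func (w *watch) Close()                { close(w.done) }

// handleEvents processes events until Done() fires, then returns.
// Without the Done() case the goroutine would block on w.C forever,
// because w.C itself is never closed.
func handleEvents(w *watch, handle func(string), exited chan<- struct{}) {
	defer close(exited)
	for {
		select {
		case ev := <-w.C:
			handle(ev)
		case <-w.Done():
			return
		}
	}
}

func main() {
	w := newWatch()
	exited := make(chan struct{})
	var seen []string
	go handleEvents(w, func(ev string) { seen = append(seen, ev) }, exited)

	w.C <- "endpoint-added"
	w.Close()
	<-exited // the goroutine has returned, so reading seen is safe
	fmt.Println(seen)
}
```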
Flavio Crisciani 90f43a798b
Revert "Move Cluster provider back to Moby"
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-05-25 10:47:02 -07:00
Flavio Crisciani 9cc5cd9b53
Moved the cluster provider to Moby
Moved the cluster provider interface definition from
libnetwork to moby

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-05-24 11:28:23 -07:00
Alessandro Boch 596122e05e Add ConnectivityScope capability for network drivers along with scope network option
- It specifies whether the network driver can
  provide containers connectivity across hosts.
- As of now, the data scope of the driver was
  being overloaded with this notion.
- The driver scope information is still valid
  and it defines whether the data allocation
  of the network resources can be done globally
  or only locally.
- With the scope network option, user can now
  force a network as swarm scoped
  regardless of the driver data scope.
- In case the network is configured as swarm scoped,
  and the network driver is multihost capable,
  a network DB instance will be launched for it.

Signed-off-by: Alessandro Boch <aboch@docker.com>
2017-05-12 17:16:34 -07:00
Flavio Crisciani 5008b0c26d
Fix for swarm/libnetwork init race condition
This change cleans up the SetClusterProvider method.
Swarm calls the SetClusterProvider to pass to libnetwork the pointer
of the provider from which libnetwork can fetch all the information to
initialize the internal agent.

The method can be, and is, called multiple times with the same value.
The previous logic erroneously spawned multiple goroutines, which
made a race possible between an agentInit and an agentClose.

The new logic aims to disallow it by checking for the provider passed and
ensuring that if the provider is already present there is nothing to do because
there is already an active go routine that is ready to process cluster events.
Moreover, a patch on the moby side cleans up the Cluster Events
dispatching by using only 1 channel to handle all the event types.
This also guarantees in-order event handling because all the events are now
piped into one single channel.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-05-04 15:35:28 -07:00
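The guard described in commit 5008b0c26d above can be sketched as an idempotent setter: if the same provider is passed again, nothing new is started. This is a simplified illustration under assumptions, not the libnetwork code. The `provider` and `controller` types, the `workers` counter, and the lowercase `setClusterProvider` are hypothetical, and the event loop only drains a channel.

```go
package main

import (
	"fmt"
	"sync"
)

// provider is a hypothetical cluster provider exposing one event channel.
type provider struct{ events chan string }

type controller struct {
	mu       sync.Mutex
	provider *provider
	workers  int // number of event goroutines started, for illustration
	wg       sync.WaitGroup
}

// setClusterProvider starts an event goroutine only when the provider
// actually changes; repeated calls with the same pointer are no-ops.
func (c *controller) setClusterProvider(p *provider) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.provider == p {
		return // already handled by an active goroutine
	}
	c.provider = p
	if p == nil {
		return
	}
	c.workers++
	c.wg.Add(1)
	go func() {
		defer c.wg.Done()
		for range p.events { // single channel, so events are seen in order
		}
	}()
}

func main() {
	c := &controller{}
	p := &provider{events: make(chan string)}
	for i := 0; i < 3; i++ {
		c.setClusterProvider(p)
	}
	fmt.Println(c.workers) // one goroutine despite three calls

	close(p.events)
	c.wg.Wait()
}
```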
Flavio Crisciani c9912b19e4
Fix for remote addr parsing
Fix initialization of starting vector

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-04-28 09:10:29 -07:00
Flavio Crisciani c517188a56
Change GetRemoteAddr to return all managers
Change in the provider interface to let the provider
return the whole list of managers.
This will allow the network db to have multiple choices
to establish the first adjacencies

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-04-27 16:58:42 -07:00
Alessandro Boch 5f62c01f9b Merge pull request #1719 from fcrisciani/data_path
Add the datapath-addr in libnetwork
2017-04-24 13:55:24 -07:00
Alessandro Boch 849e92b2e9 agentSetup to first check if clusterProvider is nil
- concurrent swarm join and daemon stop seen in
  integration tests may cause agentSetup to access
  a nil clusterProvider, resulting in a panic

Signed-off-by: Alessandro Boch <aboch@docker.com>
2017-04-18 11:34:05 -07:00
Flavio Crisciani 0c175ff67c
Add the data-path-addr
During configuration in swarm mode it is now possible to pass an additional
parameter --data-path-addr <ip|interface>.
The information is used to configure which interface
is going to be used for the data path for global scope drivers.
Up to now the only driver really using this extra parameter is the
overlay driver.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-04-14 16:52:40 -07:00
Alessandro Boch eda2dbf808 Add AgentStopWait method
- to signal when the networking cluster agent is stopped

Signed-off-by: Alessandro Boch <aboch@docker.com>
2017-04-05 11:13:56 -07:00
Santhosh Manohar 4693eab00d swarm mode network inspect should provide cluster-wide task details
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2017-03-10 19:12:00 -08:00
Alessandro Boch f2307265c7 Add anonymous container alias to service record on attachable network
- Currently when a non-named container with network aliases
  is connected to a swarm attachable network, its aliases are
  not added to the service records.
  This is not in line with what we do when connecting to
  a local scope network, or to a kv-store based overlay network.

Signed-off-by: Alessandro Boch <aboch@docker.com>
2017-03-02 12:28:39 -08:00
Madhu Venugopal 4131b4d9f4 Generating node discovery events to the drivers from networkdb
With the introduction of networkdb, the node discovery events were not
sent to the drivers. This commit generates the node discovery events and
sends them to the drivers interested in them.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
2017-02-01 17:54:51 -08:00
Alessandro Boch 8dcf9960aa Add missing locks in agent and service code
Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-11-29 13:58:06 -08:00
Santhosh Manohar 19e42ae0e7 Separate service LB & SD from network plumbing
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-11-17 13:09:14 -08:00
Alessandro Boch 856ea84cde Allow concurrent calls to agentClose
- This fixes a panic in memberlist.Leave() because called
  after memberlist.shutdown = false
  It happens because of two interlocking calls to NetworkDB.clusterLeave()
  It is easily reproducible with two back-to-back calls
  to docker swarm init && docker swarm leave --force
  While the first clusterLeave() is waiting for sendNodeEvent(NodeEventTypeLeave)
  to timeout (5 sec) a second clusterLeave() is called. The second clusterLeave()
  will end up invoking memberlist.Leave() after the previous call already did
  the same, therefore after memberlist.shutdown was set false.
- The fix is to have agentClose() acquire the agent instance and reset the
  agent pointer right away under lock. Then execute the closing/leave functions
  on the agent instance.

Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-11-01 14:51:08 -07:00
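The fix described in commit 856ea84cde above, taking the agent instance and resetting the shared pointer under the lock before running the teardown, can be sketched like this. The `agent` and `controller` types and the `leave` method are hypothetical simplifications; the real teardown involves NetworkDB and memberlist calls that are not reproduced here.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// agent is a hypothetical stand-in; leave counts teardown invocations.
type agent struct{ leaves *int32 }

func (a *agent) leave() { atomic.AddInt32(a.leaves, 1) }

type controller struct {
	mu    sync.Mutex
	agent *agent
}

// agentClose takes ownership of the agent under the lock and clears the
// shared pointer right away, so a concurrent caller sees nil and returns.
func (c *controller) agentClose() {
	c.mu.Lock()
	a := c.agent
	c.agent = nil
	c.mu.Unlock()

	if a == nil {
		return
	}
	a.leave() // slow teardown runs outside the lock, exactly once
}

func main() {
	var leaves int32
	c := &controller{agent: &agent{leaves: &leaves}}

	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.agentClose()
		}()
	}
	wg.Wait()
	fmt.Println(atomic.LoadInt32(&leaves)) // teardown ran once
}
```

Holding the lock only for the pointer swap keeps a slow leave from blocking other users of the controller.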
Jana Radhakrishnan 23a782bb92 Avoid returning early on agent join failures
When a gossip join failure happens, do not return early in the call chain
because a join failure is most likely transient and the retry logic
built into networkdb is going to retry and succeed. Returning early
prevents the initialization of the ingress network/sandbox, which
causes a problem even after the gossip join succeeds on retry.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-27 08:36:10 -07:00
Jana Radhakrishnan 8b04ffb31a Honor user provided listen address for gossip
If user provided a non-zero listen address, honor that and bind only to
that address. Right now it is not honored and we always bind to all ip
addresses in the host.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-22 11:41:57 -07:00
Alessandro Boch 5bba3aac65 Lock agent access in addDriverWatches
Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-09-20 14:18:49 -07:00
Santhosh Manohar 82fba3c357 Make nodenames unique in Gossip cluster
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-09-19 09:57:23 -07:00
Jana Radhakrishnan 37c0b6e517 Avoid double close of agentInitDone
Avoid the double close by reinitializing the channel immediately after
closing it, within a lock. Also change the wait code to cache the channel
on the stack by retrieving it from the controller, and wait on the stack
copy of the channel.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-08-24 14:00:36 -07:00
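The two parts of commit 37c0b6e517 above, replacing the channel right after closing it and waiting on a cached copy, can be sketched as below. Only the field name `agentInitDone` comes from the commit message; the `controller` type and the methods `signalInitDone` and `initDoneChan` are hypothetical names for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

type controller struct {
	mu            sync.Mutex
	agentInitDone chan struct{}
}

func newController() *controller {
	return &controller{agentInitDone: make(chan struct{})}
}

// signalInitDone closes the current channel and installs a fresh one while
// holding the lock, so a later call never closes an already closed channel.
func (c *controller) signalInitDone() {
	c.mu.Lock()
	defer c.mu.Unlock()
	close(c.agentInitDone)
	c.agentInitDone = make(chan struct{})
}

// initDoneChan returns the current channel; callers keep the returned copy
// and wait on it, so a later replacement does not affect them.
func (c *controller) initDoneChan() <-chan struct{} {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.agentInitDone
}

func main() {
	c := newController()
	ch := c.initDoneChan() // stack copy taken before the signal

	c.signalInitDone()
	c.signalInitDone() // closing the same channel twice would panic here

	<-ch // the cached copy is the channel that was closed first
	fmt.Println("init done observed")
}
```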
Jana Radhakrishnan c98780f0d1 Notify agentInitDone after joining the cluster
Currently the initDone notification is provided immediately after
initializing the cluster. This may be fine for the first manager. But
for all subsequent nodes which join the cluster we need to wait until
the node completes joining the gossip cluster in order to
synchronize the gossip network clock with other nodes. If we don't have
an up-to-date clock, the updates that this node provides to the cluster may be
discarded by the other nodes if they have entries which are yet to be
reaped but have a better clock.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-08-19 17:57:58 -07:00
Santhosh Manohar acd4edd197 Reset the encryption keys on swarm leave
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-08-16 17:37:33 -07:00
Madhu Venugopal f77a0c9f54 Merge pull request #1382 from mrjana/overlay
Fix spurious overlay errors
2016-08-11 11:38:57 +05:30
Jana Radhakrishnan 1fb56bcb55 Fix spurious overlay errors
Fixed certain spurious overlay errors which were not errors at all but
showed up every time service tasks were started in the engine.

Also added a check to make sure a delete is valid by comparing the
incoming endpoint id with the one in peerdb, to make sure the
delete from gossip is not stale.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-08-08 11:55:06 -07:00
Santhosh Manohar ab713a5fbd Remove unused key handling functions
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-08-05 04:46:01 -07:00
Jana Radhakrishnan f7b5ffe9f4 Merge pull request #1372 from sanimej/gossip
Add container short-id as an alias for swarm mode tasks
2016-08-03 17:27:49 -07:00
Santhosh Manohar ee145e5b17 Add container short-id as an alias for swarm mode tasks
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-08-02 20:28:33 -07:00
Aaron Lehmann 92337b412e Check size of keys slice
If not enough keys are provided to SetKeys, this may cause a panic. This
should not cause problems with the current integration in Docker 1.12.0,
but the panic might happen loading data created by an earlier version,
or data that is corrupted somehow. Add a length check to be defensive.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2016-08-02 19:07:43 -07:00
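The defensive check described in commit 92337b412e above amounts to validating the slice length before indexing into it. A minimal sketch under assumptions: the `encryptionKey` type, the `pickKeys` helper, and the minimum of two keys are hypothetical and chosen only for illustration; the actual SetKeys signature and required key count are not shown in this log.

```go
package main

import (
	"errors"
	"fmt"
)

// encryptionKey is a hypothetical, simplified key record.
type encryptionKey struct {
	Subsystem string
	Key       []byte
}

// minKeys is an assumed minimum for this sketch, not a libnetwork constant.
const minKeys = 2

var errTooFewKeys = errors.New("not enough encryption keys")

// pickKeys returns the first two keys, or an error instead of panicking
// with an index-out-of-range when the slice is too short.
func pickKeys(keys []*encryptionKey) (primary, secondary *encryptionKey, err error) {
	if len(keys) < minKeys {
		return nil, nil, fmt.Errorf("%w: got %d, need at least %d",
			errTooFewKeys, len(keys), minKeys)
	}
	return keys[0], keys[1], nil
}

func main() {
	// Only one key is supplied, as might happen with stale or corrupted data.
	_, _, err := pickKeys([]*encryptionKey{{Subsystem: "gossip"}})
	fmt.Println(err)
}
```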
Madhu Venugopal f6d896889d Adding Advertise-addr support
With this change, all the auto-detection of the addresses is removed
from libnetwork and the caller takes the responsibility to have a proper
advertise-addr in various scenarios (including an externally facing public
advertise-addr with an internally facing private listen-addr)

Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-07-21 02:44:25 -07:00
Alessandro Boch 48870117a6 On agent init, re-join on existing cluster networks
Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-07-12 17:35:32 -07:00
Santhosh Manohar d45a81df54 Switch overlay encryption to use IPSec subsystem keys
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-06-15 04:10:23 -07:00
Santhosh Manohar e83d68b7d1 Update key handling logic to process keyring with 3 keys
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-06-11 04:50:25 -07:00
Jana Radhakrishnan c53e26dc0f Add service alias support
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-06-14 16:40:54 -07:00
Madhu Venugopal a4453d7ce1 Resolve host-name before trying the interface-name in agent bind
Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-06-12 10:08:26 -07:00
Alessandro Boch 84418f4194 Overlay driver to support network layer encryption
Signed-off-by: Alessandro Boch <aboch@docker.com>
2016-06-08 23:38:55 -07:00
Santhosh Manohar 5f9be5a6cb Use controller methods for handling the encryption keys from agent
instead of the Provider interface methods.

Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-06-05 00:47:30 -07:00
Santhosh Manohar 9a0ad6492f Add support for encrypting gossip traffic
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-06-04 03:55:14 -07:00
Madhu Venugopal 600ba1ed0a Provide a way for libnetwork to make use of Agent mode functionalities
Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-06-05 18:41:21 -07:00
Jana Radhakrishnan 0c9db265d5 Add ingress load balancer
The ingress load balancer is implemented via a service sandbox which acts as
the proxy that translates incoming node port requests and maps them to a
service entry. Once the right service is identified, the same internal
loadbalancer implementation is used to load balance to the right backend
instance.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-06-04 20:38:32 -07:00
Jana Radhakrishnan f3ede06779 Add loadbalancer support
This PR adds support for loadbalancing across a group of endpoints that
share the same service configuration as passed in by
`OptionService`. The loadbalancer is implemented using ipvs with just
round robin scheduling supported for now.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-05-26 13:05:58 -07:00
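Commit f3ede06779 above delegates scheduling to ipvs. Purely to illustrate what round robin scheduling means for a group of backends, here is a userspace sketch. It does not touch ipvs, and the `roundRobin` type and `pick` method are hypothetical names.

```go
package main

import "fmt"

// roundRobin hands out backends in a fixed rotating order.
type roundRobin struct {
	backends []string
	next     int
}

// pick returns the next backend in rotation, or false if there are none.
func (r *roundRobin) pick() (string, bool) {
	if len(r.backends) == 0 {
		return "", false
	}
	b := r.backends[r.next]
	r.next = (r.next + 1) % len(r.backends)
	return b, true
}

func main() {
	rr := &roundRobin{backends: []string{"10.0.0.2", "10.0.0.3", "10.0.0.4"}}
	for i := 0; i < 4; i++ {
		b, _ := rr.pick()
		fmt.Println(b) // the fourth pick wraps around to the first backend
	}
}
```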