Troubleshooting the Docker Network

Introduction


This page explains how to troubleshoot and fix a missing or misconfigured Docker Swarm network. SOAJS requires this network to execute maintenance operations on any SOAJS microservice or daemon deployed inside containers.

When you connect your Docker Swarm provider and provide the Docker Token, the SOAJS driver automatically creates a network called soajsnet. It uses this network at a later step to allow you to execute maintenance operations on all deployed SOAJS microservices and daemons.

If you accidentally remove or rename the network after you connect the provider, SOAJS can still run Docker operations such as create, list, redeploy, scale, and show logs. However, maintenance operations on a SOAJS microservice or daemon, such as reload registry, heartbeat, or awareness check, will not work. These maintenance operations instruct SOAJS to execute a command inside the container over a WebSocket, and this socket relies on the soajsnet network to execute that command.




Scenario


  1. Concept
  2. Inspecting the Network
  3. Removing a Misconfigured Network
  4. Creating a New Network
  5. Rerunning the Docker Script




1- Concept


The Docker Swarm network requires a specific configuration to work with the SOAJS component that is responsible for maintenance operations: it must be attachable (any container can join it) and must span the whole cluster. As mentioned in the introduction, the SOAJS driver automatically creates the soajsnet network if it is not found. However, if a network with that name already exists in the cluster, the driver does not modify or delete it. This behavior is intentional: it ensures that the SOAJS driver does not interfere with any previously deployed non-SOAJS components. In this case, a misconfigured network might prevent maintenance operations from being executed.


Fixing this issue requires manual intervention from the user. Although the fix is straightforward, it takes a few steps to ensure that the cluster remains functional and reachable by the SOAJS components. Using this guide, you will be able to troubleshoot and fix a misconfigured Docker Swarm network for your cluster.


If you are using any of the SOAJS IaC templates to create and provision clusters and you did not change any configuration at the cluster level, you should not be facing this issue.




2- Inspecting the Network


Start by running this command in the terminal of one of the manager nodes in your cluster:

Inspect the Current Networks
docker network inspect soajsnet

If the output of this command is a JSON object with details about the network, the network exists in the cluster. In this case, proceed to step 3 to remove the misconfigured network.

However, if the output of the command is an error message similar to the one below, the network is missing and needs to be recreated.

[]
Error: No such network: soajsnet

In this case, skip step 3 and head directly to step 4.
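
The decision above can also be scripted. The following is a minimal sketch, assuming that docker network inspect exits with a non-zero status when the network does not exist; the function name soajsnet_next_step is hypothetical and not part of SOAJS.

```shell
# Suggest which step of this guide to follow next.
# Assumption: "docker network inspect" exits non-zero when the network is missing.
# The function name is illustrative only.
soajsnet_next_step() {
    if docker network inspect soajsnet >/dev/null 2>&1; then
        echo "network found: continue with step 3"
    else
        echo "network missing: skip to step 4"
    fi
}
```

Run soajsnet_next_step on a manager node after defining the function in your shell session.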




3- Removing a Misconfigured Network


Removing a docker network is not allowed if the network is being used by one or more services or containers. In order to be able to delete a network, you need to make sure that it is not being used by any docker component.
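
One way to check whether the network is still in use is to list the containers attached to it. The sketch below uses the network filter of docker ps; the function name list_network_containers is hypothetical. Note that docker ps only reports containers on the node where it runs, so in a multi-node cluster the check may need to be repeated on other nodes.

```shell
# List the containers on the current node that are attached to a network.
# The network name defaults to soajsnet; the function name is illustrative only.
list_network_containers() {
    docker ps --all --filter "network=${1:-soajsnet}" --format '{{.Names}}'
}
```

An empty result from list_network_containers suggests that no container on this node is attached to soajsnet.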


It is not possible to update an existing network in Docker. Therefore, if a network is misconfigured, the only way to fix it is to remove and recreate it. To do so, run this command in the terminal of one of the manager nodes in your cluster:

docker network rm soajsnet

When done, run the following command to verify that the network has been successfully removed:

docker network ls

At this point the soajsnet network should no longer be available in the list of networks. Proceed to step 4 to recreate it.
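
Instead of reading the list by eye, the check can be reduced to a yes/no answer. This is a minimal sketch; the function name network_exists is hypothetical, and the exact-line match avoids false positives from networks whose names merely contain "soajsnet".

```shell
# Succeed (exit status 0) only if a network with exactly this name is listed.
# The network name defaults to soajsnet; the function name is illustrative only.
network_exists() {
    docker network ls --format '{{.Name}}' | grep -qxF "${1:-soajsnet}"
}
```

For example: if network_exists; then echo "soajsnet is still present"; else echo "soajsnet was removed"; fi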




4- Creating a New Network


To manually create the SOAJS network on your cluster, run this command in the terminal of one of the manager nodes:

docker network create --driver overlay --attachable soajsnet

Now inspect the new network to verify that it has been created successfully:

# inspect the network
docker network inspect soajsnet

# output
[
    {
        "Name": "soajsnet",

        ...

        "Scope": "swarm",
        "Driver": "overlay",

        ...

        "Attachable": true,

        ...
    }
]

Make sure that the output indicates that the soajsnet network has an overlay driver, a swarm scope, and is attachable. If you are able to verify this step, this means that the network is now available and properly configured.
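
The three properties can also be checked in one pass with the --format option of docker network inspect. The following is a sketch under the assumption that the Driver, Scope, and Attachable fields appear at the top level of the inspect output, as in the excerpt above; the function name verify_soajsnet is hypothetical.

```shell
# Check that soajsnet uses the overlay driver, has swarm scope, and is attachable.
# The function name is illustrative only and not part of SOAJS.
verify_soajsnet() {
    actual=$(docker network inspect \
        --format '{{.Driver}} {{.Scope}} {{.Attachable}}' soajsnet 2>/dev/null) || {
        echo "soajsnet not found"
        return 1
    }
    if [ "$actual" = "overlay swarm true" ]; then
        echo "soajsnet is correctly configured"
    else
        echo "soajsnet is misconfigured: $actual"
        return 1
    fi
}
```

The function returns a non-zero status when the network is missing or misconfigured, so it can be used in a condition before moving on to step 5.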




5- Rerunning the Docker Script


After fixing the network configuration, and to prevent any further errors, it is recommended to go to the How to Get Docker Token page and recreate the Docker token for your cluster.