
Notes on using docker-machine

for-github


Docker-machine

docker-machine uses the appropriate virtualization driver (VirtualBox, Hyper-V, ...) to create a lightweight VM and downloads boot2docker, the official minimal VM image that already ships with the Docker daemon — in other words, a simple, ready-made Docker host.

  1. [amy@amy-Heizi ~]$ dm create --driver virtualbox --engine-insecure-registry 202.117.16.167:5000 --engine-registry-mirror http://c71b9b35.m.daocloud.io manager1

--engine-insecure-registry points the engine at a private registry, and --engine-registry-mirror sets the DaoCloud mirror used to speed up pulls. Alternatively, after the machine is created you can run curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://c71b9b35.m.daocloud.io to configure the mirror. (dm is used as a shell alias for docker-machine throughout these notes.)
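
A quick sanity check that those engine flags were actually applied: they show up on the dockerd command line inside the boot2docker VM, as a later ps -ef listing in these notes also shows.

```bash
# The [d]ockerd trick keeps grep from matching itself; the output should
# contain --insecure-registry 202.117.16.167:5000 and --registry-mirror ...
docker-machine ssh manager1 "ps -ef | grep [d]ockerd"
```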

The locally installed docker client is then pointed at the newly created machine and used to drive it.

About eval "$(docker-machine env host_name)"

Running docker-machine env host_name prints:

  1. export DOCKER_TLS_VERIFY="1"
  2. export DOCKER_HOST="tcp://192.168.99.101:2376"
  3. export DOCKER_CERT_PATH="/home/amy/.docker/machine/machines/manager1"
  4. export DOCKER_MACHINE_NAME="manager1"
  5. export DOCKER_API_VERSION="1.26"
  6. # Run this command to configure your shell:
  7. # eval $(docker-machine env manager1)

This is a set of environment variables describing the new machine. The shell's eval first runs "$(docker-machine env host_name)" and then applies the resulting export statements to the current shell, so the docker client in that shell connects to the new machine's Docker daemon and all subsequent commands run against it.
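
A minimal sketch of switching the local client to the new machine, verifying it, and switching back (docker-machine active and the -u/--unset flag of docker-machine env are standard commands):

```bash
# Point the local docker client at manager1
eval "$(docker-machine env manager1)"
docker-machine active    # prints: manager1
docker info              # now describes the daemon inside the manager1 VM

# When done, unset the variables so the client talks to the local daemon again
eval "$(docker-machine env -u)"
```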

  1. [amy@amy-Heizi ~]$ dm ip manager1
  2. 192.168.99.101
  3. [amy@amy-Heizi ~]$ dm ssh manager1
  4. (boot2docker ASCII-art whale banner)
  5. WARNING: this is a build from test.docker.com, not a stable release.
  6. Boot2Docker version 17.03.0-ce-rc1, build HEAD : f08ca37 - Tue Feb 21 05:59:38 UTC 2017
  7. Docker version 17.03.0-ce-rc1, build ce07fb6
  1. docker@manager1:~$ docker swarm init --advertise-addr 192.168.99.101
  2. Swarm initialized: current node (auqsxvb40r23q5rd62e806bw1) is now a manager.
  3. To add a worker to this swarm, run the following command:
  4. docker swarm join \
  5. --token SWMTKN-1-3iqq804240ys14ld5cbpl39gfeq80e8687ct429k6rmnecqev4-dyvoa3wiuxpq7efjpxrz8etl3 \
  6. 192.168.99.101:2377
  7. To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

manager1 advertises 192.168.99.101 to the swarm; any node that wants to join must be able to reach that address.

  1. [amy@amy-Heizi ~]$ dm create --driver virtualbox --engine-insecure-registry 202.117.16.167:5000 --engine-registry-mirror http://c71b9b35.m.daocloud.io worker1
  2. Running pre-create checks...
  3. Creating machine...
  4. (worker1) Copying /home/amy/.docker/machine/cache/boot2docker.iso to /home/amy/.docker/machine/machines/worker1/boot2docker.iso...
  5. (worker1) Creating VirtualBox VM...
  6. (worker1) Creating SSH key...
  7. (worker1) Starting the VM...
  8. (worker1) Check network to re-create if needed...
  9. (worker1) Waiting for an IP...
  10. Waiting for machine to be running, this may take a few minutes...
  11. Detecting operating system of created instance...
  12. Waiting for SSH to be available...
  13. Detecting the provisioner...
  14. Provisioning with boot2docker...
  15. Copying certs to the local machine directory...
  16. Copying certs to the remote machine...
  17. Setting Docker configuration on the remote daemon...
  18. Checking connection to Docker...
  19. Docker is up and running!
  20. To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env worker1
  21. [amy@amy-Heizi ~]$ dm ssh worker1
  22. (boot2docker ASCII-art whale banner)
  23. WARNING: this is a build from test.docker.com, not a stable release.
  24. Boot2Docker version 17.03.0-ce-rc1, build HEAD : f08ca37 - Tue Feb 21 05:59:38 UTC 2017
  25. Docker version 17.03.0-ce-rc1, build ce07fb6
  26. docker@worker1:~$ ifconfig

The machines are created with the self-hosted docker registry and the DaoCloud mirror configured from the start.

Check the current state:

  1. docker@manager1:~$ docker node ls
  2. ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
  3. auqsxvb40r23q5rd62e806bw1 * manager1 Ready Active Leader
  4. gko28cd7wa81bfc577owyv7ro worker2 Ready Active
  5. y4adhllxrumqx967qdudl4009 worker1 Ready Active
  6. docker@manager1:~$

On the manager1 node:

  1. docker@manager1:~$ docker service create --replicas 1 --name helloword alpine ping www.baidu.
  2. com
  3. rns7n1uqo6hftulsdu7e8knzo
  4. docker@manager1:~$ docker service ls
  5. ID NAME MODE REPLICAS IMAGE
  6. rns7n1uqo6hf helloword replicated 1/1 alpine:latest
  7. docker@manager1:~$ docker service ps helloword
  8. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  9. jc2lq658h51e helloword.1 alpine:latest manager1 Running Running 56 seconds ago
  10. docker@manager1:~$ docker service scale helloword=3
  11. helloword scaled to 3
  12. docker@manager1:~$ docker service ls
  13. ID NAME MODE REPLICAS IMAGE
  14. rns7n1uqo6hf helloword replicated 1/3 alpine:latest
  15. docker@manager1:~$ docker service ps helloword
  16. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  17. jc2lq658h51e helloword.1 alpine:latest manager1 Running Running 2 minutes ago
  18. hwm8p89y0tyr helloword.2 alpine:latest worker2 Running Running 42 seconds ago
  19. 8qizgs4se6i6 helloword.3 alpine:latest worker1 Running Preparing 51 seconds ago
  20. docker@manager1:~$ docker ps -a
  21. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  22. d43d4a334ab8 alpine@sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8 "ping www.baidu.com" 3 minutes ago Up 3 minutes helloword.1.jc2lq658h51e85m03v6gs6u6t

On the worker2 node:

  1. docker@worker2:~$ docker ps -a
  2. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  3. 3dc94bec5c66 alpine@sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8 "ping www.baidu.com" About a minute ago Up About a minute helloword.2.hwm8p89y0tyr1p8xsv8p2ds87

Inspect the service details and its update history:

  1. docker@manager1:~$ docker service inspect helloword
  2. [
  3. {
  4. "ID": "rns7n1uqo6hftulsdu7e8knzo",
  5. "Version": {
  6. "Index": 35
  7. },
  8. "CreatedAt": "2017-02-24T09:10:10.898515737Z",
  9. "UpdatedAt": "2017-02-24T09:12:24.641090641Z",
  10. "Spec": {
  11. "Name": "helloword",
  12. "TaskTemplate": {
  13. "ContainerSpec": {
  14. "Image": "alpine:latest@sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8",
  15. "Args": [
  16. "ping",
  17. "www.baidu.com"
  18. ],
  19. 。。。。。。
  20. "Mode": {
  21. "Replicated": {
  22. "Replicas": 3
  23. }
  24. },
  25. 。。。。。。
  26. },
  27. "PreviousSpec": {
  28. "Name": "helloword",
  29. "TaskTemplate": {
  30. "ContainerSpec": {
  31. "Image": "alpine:latest@sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8",
  32. "Args": [
  33. "ping",
  34. "www.baidu.com"
  35. ],
  36. 。。。。。。
  37. "Mode": {
  38. "Replicated": {
  39. "Replicas": 1
  40. }
  41. },
  42. 。。。。。。
  43. "UpdateStatus": {
  44. "StartedAt": "0001-01-01T00:00:00Z",
  45. "CompletedAt": "0001-01-01T00:00:00Z"
  46. }
  47. }
  48. ]

The condensed version:

  1. docker@manager1:~$ docker service inspect --pretty helloword
  2. ID: rns7n1uqo6hftulsdu7e8knzo
  3. Name: helloword
  4. Service Mode: Replicated
  5. Replicas: 3
  6. Placement:
  7. UpdateConfig:
  8. Parallelism: 1
  9. On failure: pause
  10. Max failure ratio: 0
  11. ContainerSpec:
  12. Image: alpine:latest@sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8
  13. Args: ping www.baidu.com
  14. Resources:
  15. Endpoint Mode: vip

Scaling out, and how the tasks are distributed:

  1. docker@manager1:~$ docker service scale helloword=5
  2. helloword scaled to 5
  3. docker@manager1:~$ docker service ps helloword
  4. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  5. jc2lq658h51e helloword.1 alpine:latest manager1 Running Running 13 minutes ago
  6. hwm8p89y0tyr helloword.2 alpine:latest worker2 Running Running 11 minutes ago
  7. 8qizgs4se6i6 helloword.3 alpine:latest worker1 Running Running 9 minutes ago
  8. yw4w0l4u823t helloword.4 alpine:latest worker1 Running Running 11 seconds ago
  9. 3jd5fk9r5rlr helloword.5 alpine:latest manager1 Running Running 11 seconds ago
  1. docker@manager1:~$ docker service rm helloword
  2. helloword
  3. docker@manager1:~$ docker service ls
  4. ID NAME MODE REPLICAS IMAGE
  5. docker@worker2:~$ docker ps -a
  6. CONTAINER IMAGE COMMAND CREATED STATUS PORTS NAMES

There is no need to go to each node and stop and remove the service's containers one by one; docker service rm <service_name> takes care of all of it.

  1. docker@manager1:~$ docker service create --replicas 4 --name redis --update-delay 10s redis:3
  2. .0.6
  3. wohnpe9yqiculurpmt7ulid6e
  4. docker@manager1:~$ docker service ps redis
  5. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  6. cr7lug0ard1o redis.1 redis:3.0.6 worker2 Running Preparing 23 seconds ago
  7. oeqscrvryxuy redis.2 redis:3.0.6 manager1 Running Preparing 23 seconds ago
  8. o4wqgrgsj1l4 redis.3 redis:3.0.6 worker1 Running Preparing 23 seconds ago
  9. nm4vmi0fydlj redis.4 redis:3.0.6 worker1 Running Preparing 23 seconds ago
  10. docker@manager1:~$

The DESIRED STATE column is the state the service is supposed to be in.
CURRENT STATE is the actual state: once the service is created, the selected Docker hosts pull the image and start the containers, and because pulling takes time you see entries like Preparing 23 seconds ago.

Swarm mode has two types of services, replicated and global. For replicated services, you specify the number of replica tasks for the swarm manager to schedule onto available nodes. For global services using the --mode global, the scheduler places one task on each available node. Every time a new node becomes available, the scheduler places a task for the global service on the new node.
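
A minimal illustration of the two modes (the service names redis-replicated and node-ping are made up for this sketch):

```bash
# Replicated: the scheduler places exactly 3 tasks somewhere in the swarm
docker service create --replicas 3 --name redis-replicated redis:3.0.6

# Global: one task on every node; nodes that join later automatically get one too
docker service create --mode global --name node-ping alpine ping www.baidu.com
```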

  1. docker@manager1:~$ docker service inspect --pretty redis
  2. ID: wohnpe9yqiculurpmt7ulid6e
  3. Name: redis
  4. Service Mode: Replicated
  5. Replicas: 4
  6. Placement:
  7. UpdateConfig:
  8. Parallelism: 1
  9. Delay: 10s
  10. On failure: pause
  11. Max failure ratio: 0
  12. ContainerSpec:
  13. Image: redis:3.0.6@sha256:6a692a76c2081888b589e26e6ec835743119fe453d67ecf03df7de5b73d69842
  14. Resources:
  15. Endpoint Mode: vip
  16. docker@manager1:~$ docker service update --image redis:3.0.7 redis
  17. redis
  18. docker@manager1:~$ docker service ps redis
  19. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  20. x6sqdshvc5nu redis.1 redis:3.0.7 worker2 Running Preparing 7 minutes ago
  21. cr7lug0ard1o \_ redis.1 redis:3.0.6 worker2 Shutdown Shutdown 7 minutes ago
  22. oeqscrvryxuy redis.2 redis:3.0.6 manager1 Running Running 2 hours ago
  23. o4wqgrgsj1l4 redis.3 redis:3.0.6 worker1 Running Running 2 hours ago
  24. nm4vmi0fydlj redis.4 redis:3.0.6 worker1 Running Running 2 hours ago

Service updates are rolling updates by default: the first task is updated, and if it comes back Running the scheduler waits the interval given by --update-delay before updating the next task. If any task update returns FAILED, the update as a whole pauses; --update-failure-action controls the behaviour on failure.
If an update did pause because of a failure, rerun docker service update <service_name>, ideally passing the relevant flags so that it does not simply pause again on a repeated failure.
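
A hedged example of the update flags mentioned above (the concrete values are arbitrary):

```bash
# Update two tasks at a time, wait 10s between batches, and keep going
# instead of pausing when a task fails to update
docker service update \
  --update-parallelism 2 \
  --update-delay 10s \
  --update-failure-action continue \
  --image redis:3.0.7 \
  redis
```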

Draining a node takes it from the active state out of scheduling: it stops accepting new tasks, and the manager shuts down its existing tasks and reschedules them onto other active nodes.

  1. docker@manager1:~$ docker node update --availability drain worker1
  2. worker1
  3. docker@manager1:~$ docker node ls
  4. ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
  5. auqsxvb40r23q5rd62e806bw1 * manager1 Ready Active Leader
  6. y4adhllxrumqx967qdudl4009 worker1 Ready Drain
  7. zfn93z0w5le92dpncn33vigqe worker2 Ready Active
  8. docker@manager1:~$ docker node inspect --pretty worker1
  9. ID: y4adhllxrumqx967qdudl4009
  10. Hostname: worker1
  11. Joined at: 2017-02-24 08:34:49.908882971 +0000 utc
  12. Status:
  13. State: Ready
  14. Availability: Drain
  15. Address: 192.168.99.104
  16. Platform:
  17. Operating System: linux
  18. Architecture: x86_64
  19. Resources:
  20. CPUs: 1
  21. Memory: 995.8 MiB
  22. Plugins:
  23. Network: bridge, host, macvlan, null, overlay
  24. Volume: local
  25. Engine Version: 17.03.0-ce-rc1
  26. Engine Labels:
  27. - provider = virtualbox

worker1's availability has been changed from Active to Drain, so it no longer accepts new tasks, and the tasks that were running on it are rescheduled by the manager onto the remaining active nodes.

  1. docker@manager1:~$ docker service ps redis
  2. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  3. qui4c3hdyysj redis.1 redis:3.0.7 manager1 Running Running 4 minutes ago
  4. 4uxjo5aukl9i \_ redis.1 redis:3.0.7 worker1 Shutdown Shutdown 4 minutes ago
  5. kmfwhpdovgnr redis.2 redis:3.0.7 manager1 Running Running 7 minutes ago
  6. yp9a2f8v38dk redis.3 redis:3.0.7 worker2 Running Running 7 minutes ago

Bring the node back from drain to active:

  1. docker@manager1:~$ docker node update --availability active worker1
  2. worker1
  3. docker@manager1:~$ docker node inspect --pretty worker1
  4. ID: y4adhllxrumqx967qdudl4009
  5. Hostname: worker1
  6. Joined at: 2017-02-24 08:34:49.908882971 +0000 utc
  7. Status:
  8. State: Ready
  9. Availability: Active
  10. Address: 192.168.99.104
  11. Platform:
  12. Operating System: linux
  13. Architecture: x86_64
  14. Resources:
  15. CPUs: 1
  16. Memory: 995.8 MiB
  17. Plugins:
  18. Network: bridge, host, macvlan, null, overlay
  19. Volume: local
  20. Engine Version: 17.03.0-ce-rc1
  21. Engine Labels:
  22. - provider = virtualbox

A quick test: start a service whose image is pulled from the designated private docker registry.

  1. docker@manager1:~$ docker node ls
  2. ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
  3. q42ke002z89pqhzcuzdsa5gw8 * manager1 Ready Active Leader
  4. y4adhllxrumqx967qdudl4009 worker1 Ready Active
  5. zfn93z0w5le92dpncn33vigqe worker2 Ready Active
  6. docker@manager1:~$ docker service ls
  7. ID NAME MODE REPLICAS IMAGE
  8. docker@manager1:~$ docker service create --name my-web --publish 8080:80 --replicas 2 202.117.16.16
  9. 7:5000/library/nginx
  10. mfdyfx0ql4oht5dtnxa93pkxe
  11. docker@manager1:~$ docker service ps my-web
  12. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  13. ismaqxafsl4b my-web.1 202.117.16.167:5000/library/nginx:latest manager1 Running Preparing 10 seconds ago
  14. 1heo9bv3ui3t my-web.2 202.117.16.167:5000/library/nginx:latest worker1 Running Preparing 10 seconds ago

Now just wait for it to converge — I'm exhausted, off for a walk...

  1. docker@worker1:~$ docker ps -a
  2. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  3. 66a422852059 202.117.16.167:5000/library/nginx@sha256:4296639ebdf92f035abf95fee1330449e65990223c899838283c9844b1aaac4c "nginx -g 'daemon ..." 18 hours ago Up 18 hours 80/tcp, 443/tcp my-web.2.1heo9bv3ui3tz2siiyk5evx69

The result is encouraging. What does it show? As long as the nodes can reach the registry host over the network, you can freely use your own registry anywhere in the swarm cluster!

swarm mode routing mesh

To make the web server accessible from outside the swarm, you need to publish the port where the swarm listens for web requests.

For a published service, the swarm listens on the configured port on every node and routes external requests for that service to the nodes running its tasks, where they are answered on the task containers' service port.

An illustration:

Now the interesting part: if the swarm's built-in load balancer isn't enough, you can put a proxy of your own choosing in front.
The official docs recommend HAProxy ("the Reliable, High Performance TCP/HTTP Load Balancer"), but there are also setups based on Nginx and NGINX Plus that add TLS termination to the load balancing.

Here is a diagram of the setup after HAProxy is deployed:

(HAProxy diagram)

swarm orchestrator and scheduler

The built-in swarm orchestrator and scheduler deploy your application to nodes in your swarm to achieve and maintain the desired state.

swarm service

Use the --reserve-memory or --reserve-cpu flags to reserve resources for a service's containers. If you reserve, say, 4 CPUs per container and no node can satisfy that, the service remains in a pending state until a node is available to run its tasks.
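
A small sketch of the reservation flags (the service name and the numbers are arbitrary):

```bash
# Each task reserves half a CPU and 256 MB of memory; if no node has that
# much free, the task stays Pending until capacity becomes available
docker service create --name reserved-redis \
  --reserve-cpu 0.5 \
  --reserve-memory 256mb \
  redis:3.0.6
```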

Swarm mode lets you network services in a couple of ways:

- publish ports externally to the swarm using ingress networking or directly on each swarm node
- connect services and tasks within the swarm using overlay networks
  1. Publish a service’s ports using the routing mesh

In one sentence: for a service created with docker service create --publish <TARGET-PORT>:<SERVICE-PORT> IMAGE:tag, any host outside the swarm that can reach any swarm node at IP:TARGET-PORT can reach the service — even if the node it happens to connect to is not running one of the service's tasks!

To publish a service’s ports externally to the swarm, use the --publish <TARGET-PORT>:<SERVICE-PORT> flag. The swarm makes the service accessible at the target port on every swarm node. If an external host connects to that port on any swarm node, the routing mesh routes it to a task. The external host does not need to know the IP addresses or internally-used ports of the service tasks to interact with the service. When a user or process connects to a service, any worker node running a service task may respond.

Example: Run a three-task Nginx service on 10-node swarm

Imagine that you have a 10-node swarm, and you deploy an Nginx service running three tasks on a 10-node swarm:

  1. $ docker service create --name my_web \
  2. --replicas 3 \
  3. --publish 8080:80 \
  4. nginx

Three tasks will run on up to three nodes. You don’t need to know which nodes are running the tasks; connecting to port 8080 on any of the 10 nodes will connect you to one of the three nginx tasks. You can test this using curl (the HTML output is truncated):

  1. $ curl localhost:8080
  2. <!DOCTYPE html>
  3. <html>
  4. <head>
  5. <title>Welcome to nginx!</title>
  6. ...truncated...
  7. </html>
  1. Publish a service’s ports directly on the swarm node

The swarm architecture

(swarm architecture diagram)

Testing docker compose on the swarm

Current setup:

  1. docker@manager1:~$ docker node ls
  2. ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
  3. h59f83m47ls7potrpb9ua2ywd amy-Heizi Ready Active (192.168.99.1)
  4. q42ke002z89pqhzcuzdsa5gw8 * manager1 Ready Active Leader (192.168.99.109)
  5. y4adhllxrumqx967qdudl4009 worker1 Ready Active (192.168.99.111)
  6. zfn93z0w5le92dpncn33vigqe worker2 Ready Active (192.168.99.110)

Apart from amy-Heizi, all of the nodes are VMs created with docker-machine. Now install docker-compose on every Docker host, switching to root first with sudo su.

  1. docker@worker1:~$ sudo su
  2. root@worker1:/home/docker# curl -L https://github.com/docker/compose/releases/download/1.11.2
  3. /docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
  4. % Total % Received % Xferd Average Speed Time Time Time Current
  5. Dload Upload Total Spent Left Speed
  6. 100 600 0 600 0 0 541 0 --:--:-- 0:00:01 --:--:-- 541
  7. 100 8066k 100 8066k 0 0 707k 0 0:00:11 0:00:11 --:--:-- 1624k
  8. root@worker1:/home/docker# chmod +x /usr/local/bin/docker-compose
  9. root@worker1:~# su - docker
  10. Boot2Docker version 17.03.0-ce-rc1, build HEAD : f08ca37 - Tue Feb 21 05:59:38 UTC 2017
  11. Docker version 17.03.0-ce-rc1, build ce07fb6
  12. docker@worker1:~$ docker-compose version
  13. docker-compose version 1.11.2, build dfed245
  14. docker-py version: 2.1.0
  15. CPython version: 2.7.13
  16. OpenSSL version: OpenSSL 1.0.1t 3 May 2016
  17. snip....
  18. docker@manager1:~$ docker-compose version
  19. docker-compose version 1.11.2, build dfed245
  20. docker-py version: 2.1.0
  21. CPython version: 2.7.13
  22. OpenSSL version: OpenSSL 1.0.1t 3 May 2016
  23. snip....
  24. docker@worker2:~$ docker-compose version
  25. docker-compose version 1.11.2, build dfed245
  26. docker-py version: 2.1.0
  27. CPython version: 2.7.13
  28. OpenSSL version: OpenSSL 1.0.1t 3 May 2016

With that, the environment is ready.

A first single-node run of docker compose

This uses the version 2 file format; see the documentation.

  1. docker@worker1:~$ mkdir composetest && cd composetest
  2. docker@worker1:~/composetest$ vi app.py
  3. docker@worker1:~/composetest$ vi requirements.txt
  4. docker@worker1:~/composetest$ vi Dockerfile
  5. docker@worker1:~/composetest$ vi docker-compose.yml
  6. docker@worker1:~/composetest$ docker-compose up
  7. WARNING: The Docker Engine you're using is running in swarm mode.
  8. Compose does not use swarm mode to deploy services to multiple nodes in a swarm. All containers will be scheduled on the current node.
  9. To deploy your application across the swarm, use `docker stack deploy`.
  10. Creating network "composetest_default" with the default driver
  11. Building web
  12. Step 1/5 : FROM python:3.4-alpine
  13. 3.4-alpine: Pulling from library/python
  14. snip..............
  15. Successfully installed Jinja2-2.9.5 MarkupSafe-0.23 Werkzeug-0.11.15 click-6.7 flask-0.12 itsdangerous-0.24 redis-2.10.5
  16. ............
  17. Successfully built c6548e252618
  18. ..................
  19. Pulling redis (redis:alpine)...
  20. alpine: Pulling from library/redis
  21. ................
  22. Creating composetest_redis_1
  23. Creating composetest_web_1
  24. Attaching to composetest_redis_1, composetest_web_1
  25. redis_1 | 1:C 27 Feb 11:40:39.539 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf

Open 192.168.99.111:5000 in a browser and you see something like Hello World! I have been seen 19 times. (Because this is a single-node deployment, the app is only reachable via worker1's IP.)
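
The four files created above are not shown in the transcript; a minimal sketch that matches the build output (the standard Compose getting-started Flask + Redis hit counter, assuming python:3.4-alpine as in the log) would be:

```bash
mkdir -p composetest && cd composetest

cat > app.py <<'EOF'
from flask import Flask
from redis import Redis

app = Flask(__name__)
redis = Redis(host='redis', port=6379)

@app.route('/')
def hello():
    count = redis.incr('hits')
    return 'Hello World! I have been seen {} times.\n'.format(count)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=True)
EOF

cat > requirements.txt <<'EOF'
flask
redis
EOF

cat > Dockerfile <<'EOF'
FROM python:3.4-alpine
ADD . /code
WORKDIR /code
RUN pip install -r requirements.txt
CMD ["python", "app.py"]
EOF

cat > docker-compose.yml <<'EOF'
version: '2'
services:
  web:
    build: .
    ports:
      - "5000:5000"
  redis:
    image: redis:alpine
EOF

docker-compose up
```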

Compose file version compatibility

This table shows which Compose file versions support specific Docker releases.

| Compose file format | Docker Engine release |
| --- | --- |
| 3.0, 3.1 | 1.13.0+ |
| 2.1 | 1.12.0+ |
| 2.0 | 1.10.0+ |

Versioning
There are currently three versions of the Compose file format:

  1. Version 1, the legacy format. This is specified by omitting a version key at the root of the YAML.

  2. Version 2.x. This is specified with a version: '2' or version: '2.1' entry at the root of the YAML.

  3. Version 3.x, the latest and recommended version, designed to be cross-compatible between Compose and the Docker Engine’s swarm mode. This is specified with a version: '3' or version: '3.1', etc., entry at the root of the YAML.

Building services with version 3 files and swarm mode

Following the official docs (Sample app overview, Deploy the application) to learn how compose file version 3 and docker stack deploy are used to combine compose with swarm mode.
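
A hedged sketch of what that looks like, reusing the swarm and the private registry from above (the stack name and the file are made up for this example):

```bash
cat > docker-stack.yml <<'EOF'
version: '3'
services:
  web:
    image: 202.117.16.167:5000/library/nginx:latest
    ports:
      - "8080:80"
    deploy:
      replicas: 2
      update_config:
        parallelism: 1
        delay: 10s
EOF

# Must be run on a manager node; "webstack" is an arbitrary stack name
docker stack deploy -c docker-stack.yml webstack
docker stack services webstack
```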

Docker Swarm, Docker Engine swarm mode, SwarmKit

The Docker ecosystem is rather confusing these days, with so many overlapping projects...

Swarm, SwarmKit and swarm mode are easy to mix up.

Building that super-awesome cluster management setup

Docker DNS & Service Discovery with Consul and Registrator
[Reference] Docker study notes: a Shipyard + Swarm + Consul + Service Discovery setup tutorial
That super-awesome system — note that the reference has a few mistakes in places.

Environment

Every machine was created with dm create --driver virtualbox --engine-insecure-registry 202.117.16.167:5000 --engine-registry-mirror http://c71b9b35.m.daocloud.io swarm-01 (and likewise for swarm-02 and swarm-03).

consul

  1. [amy@amy-Heizi ~ [swarm-01]]$ docker run -d -p 8300:8300 -p 8301:8301 -p 8301:8301/udp -p 8302:8302 -p 8302:8302/udp -p 8400:8400 -p 8500:8500 -p 8600:53 -p 8600:53/udp -v /opt/test/data/consul:/data -h $HOSTNAME --restart=always --name=consul progrium/consul:latest -server -bootstrap -ui-dir=/ui -advertise 192.168.99.112 -client 0.0.0.0
  2. Unable to find image 'progrium/consul:latest' locally
  3. latest: Pulling from progrium/consul
  4. ..........
  5. Digest: sha256:8cc8023462905929df9a79ff67ee435a36848ce7a10f18d6d0faba9306b97274
  6. Status: Downloaded newer image for progrium/consul:latest
  7. 54ad17ce03182f381462bda0adf9a23ab78d072932d4a556ef23085be42e0005

-p 8400:8400 maps consul's RPC port 8400.
-p 8500:8500 maps the web UI port 8500.
-p 8600:53/udp binds the container's UDP port 53 (the default DNS port) onto the host.
-v /opt/test/data/consul:/data mounts consul's data directory on the host, so the data survives a container restart.
-advertise 192.168.99.112 is the IP the service advertises externally; without it the service would show up with the container's internal IP, which is unreachable.
-client 0.0.0.0 is the address consul listens on for client connections.

agent1:192.168.99.113

  1. docker@swarm-02:~$ docker run -d -p 8300:8300 -p 8301:8301 -p 8301:8301/udp -p 8302:8302 -p 8302:8302/udp -p 8400:8400 -p 8500:8500 -v /opt/test/data/consul:/data -h $HOSTNAME --restart=always --name=consul progrium/consul:latest -server -join 192.168.99.112 -advertise 192.168.99.113 -client 0.0.0.0
  2. 0c3dd09e8b31550b59084d20b0cf8c6ffe8de6c299345362a1eadc964f28665f
  3. docker@swarm-02:~$ docker ps -a
  4. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  5. 0c3dd09e8b31 progrium/consul:latest "/bin/start -serve..." 4 hours ago Up 4 hours 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 53/tcp, 53/udp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp consul

agent2: 192.168.99.114

  1. docker@swarm-03:~$ docker run -d -p 8300:8300 -p 8301:8301 -p 8301:8301/udp -p 8302:8302 -p 8302:8302/udp -p 8400:8400 -p 8500:8500 -v /opt/test/data/consul:/data -h $HOSTNAME --restart=always --name=consul progrium/consul:latest -server -join 192.168.99.112 -advertise 192.168.99.114 -client 0.0.0.0
  2. 54284534d563c450a8bffed65a38626b17f7d839943395215d91bf1645326548
  3. docker@swarm-03:~$ docker ps -a
  4. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  5. 54284534d563 progrium/consul:latest "/bin/start -serve..." 3 seconds ago Up 2 seconds 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 53/tcp, 53/udp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp consul

Seeing leader = true and server = true means the cluster is in a healthy state. For the details of how this works, see https://www.consul.io/intro/getting-started/install.html

  1. docker@swarm-01:~$ docker exec -ti consul consul info
  2. WARNING: It is highly recommended to set GOMAXPROCS higher than 1
  3. agent:
  4. check_monitors = 0
  5. check_ttls = 0
  6. checks = 0
  7. services = 1
  8. build:
  9. prerelease =
  10. revision = 9a9cc934
  11. version = 0.5.2
  12. consul:
  13. bootstrap = true
  14. known_datacenters = 1
  15. leader = true
  16. server = true
  17. raft:
  18. ....
  19. runtime:
  20. ......
  21. serf_lan:
  22. .....
  23. serf_wan:
  24. .....

Check the current consul cluster membership:

  1. docker@swarm-01:~$ docker exec -ti consul consul members
  2. Node Address Status Type Build Protocol DC
  3. amy-Heizi 192.168.99.112:8301 alive server 0.5.2 2 dc1
  4. swarm-02 192.168.99.113:8301 alive server 0.5.2 2 dc1
  5. swarm-03 192.168.99.114:8301 alive server 0.5.2 2 dc1

shipyard + swarm visualization

For an introduction to Shipyard and its installation, see the official site. The official instructions use etcd as the default key-value store for service discovery, so the Discovery and proxy steps are skipped here and the swarm manager, swarm agent, and shipyard controller are installed directly.

  1. docker@swarm-01:~$ docker run -d --restart=always --name shipyard-rethinkdb rethinkdb
  2. a6613327aaab3e3108105bea081eaf0106eba5481de9080ab315ccfc81d5435d
  3. docker@swarm-01:~$ docker ps -a
  4. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  5. a6613327aaab rethinkdb "rethinkdb --bind all" 2 minutes ago Up 2 minutes 8080/tcp, 28015/tcp, 29015/tcp shipyard-rethinkdb
  6. dc4e8cf5ebae progrium/consul:latest "/bin/start -serve..." 4 hours ago Up 4 hours 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp, 0.0.0.0:8600->53/tcp, 0.0.0.0:8600->53/udp consul

In consul://192.168.99.112:8500, replace 192.168.99.112 with the IP of the host your consul server runs on.

  1. docker@swarm-01:~$ docker run -d -p 3375:3375 --restart=always --name shipyard-swarm-manager swarm:latest manage --host tcp://0.0.0.0:3375 consul://192.168.99.112:8500
  2. 87392296efa4264f40cfda96f5e4d6669a23e09700b5c85debf97dcd6b8ca074

Replace --addr 192.168.99.113 with the IP of the host running the swarm agent, and consul://192.168.99.112:8500 with the host running the consul server.

swarm-02:

  1. docker@swarm-02:~$ docker run -d --restart=always --name shipyard-swarm-agent swarm:latest join --addr 192.168.99.113:2375 consul://192.168.99.112:8500
  2. 0b52bb3c9c9a517462b4736756044c1c14b8d34f8e39e918894fadf255690b1c
  3. docker@swarm-02:~$ docker ps -a
  4. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  5. 0b52bb3c9c9a swarm:latest "/swarm join --add..." 13 minutes ago Up 12 minutes 2375/tcp shipyard-swarm-agent
  6. 0c3dd09e8b31 progrium/consul:latest "/bin/start -serve..." 4 hours ago Up 4 hours 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 53/tcp, 53/udp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp consul

swarm-03:

  1. docker@swarm-03:~$ docker run -d --restart=always --name shipyard-swarm-agent swarm:latest join --addr 192.168.99.114:2375 consul://192.168.99.112:8500
  2. 8ab64ffc1d397e4b0bddf99e1b084e04c3e7dfbe9591875e3342867b41dd2e65
  3. docker@swarm-03:~$ docker ps -a
  4. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  5. 8ab64ffc1d39 swarm:latest "/swarm join --add..." 10 minutes ago Up 10 minutes 2375/tcp shipyard-swarm-agent
  6. 54284534d563 progrium/consul:latest "/bin/start -serve..." 4 hours ago Up 4 hours 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 53/tcp, 53/udp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp consul
  1. docker@swarm-01:~$ docker run -d --restart=always --name shipyard-controller --link shipyard-rethinkdb:rethinkdb --link shipyard-swarm-manager:swarm -p 8080:8080 shipyard/shipyard:latest server -d tcp://swarm:3375
  2. e89ba8237b982605c306c4fd8acd531a7ec4223b64db1609b561b7d4464af748

Check which containers swarm-01 is running now:

  1. docker@swarm-01:~$ docker ps -a
  2. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  3. e89ba8237b98 shipyard/shipyard:latest "/bin/controller s..." 13 minutes ago Up 13 minutes 0.0.0.0:8080->8080/tcp shipyard-controller
  4. 87392296efa4 swarm:latest "/swarm manage --h..." 19 minutes ago Up 19 minutes 2375/tcp, 0.0.0.0:3375->3375/tcp shipyard-swarm-manager
  5. a6613327aaab rethinkdb "rethinkdb --bind all" 25 minutes ago Up 25 minutes 8080/tcp, 28015/tcp, 29015/tcp shipyard-rethinkdb
  6. dc4e8cf5ebae progrium/consul:latest "/bin/start -serve..." 4 hours ago Up 4 hours 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp, 0.0.0.0:8600->53/tcp, 0.0.0.0:8600->53/udp consul

Browse to the IP of the host running shipyard-swarm-manager, 192.168.99.112:8080, and log in with account admin, password shipyard to reach the shipyard management page.

Service discovery with registrator

registrator is a third-party service-discovery component that works off Docker's socket file; in my experience it is very simple to use. Run the following on each of the .112–.114 machines to install registrator:

docker run -d --restart=always --name=registrator --net=host -v /var/run/docker.sock:/tmp/docker.sock gliderlabs/registrator -ip <ip-of-host> consul://localhost:8500

Flag notes

Note: registrator names the services it discovers as "image name + port", so for production use you need a sensible image-naming scheme and must make sure there are no duplicates. (A quick way to check what actually got registered is shown after the three install commands below.)

Install registrator on swarm-01

  1. docker@swarm-01:~$ docker run -d --restart=always --name=registrator --net=host -v /var/
  2. run/docker.sock:/tmp/docker.sock gliderlabs/registrator -ip 192.168.99.112 consul://loca
  3. lhost:8500
  4. 26ed7d6bebadbd269a3442c380f9272e9a25fe118e37571e644129682b72ba68

Install registrator on swarm-02

  1. docker@swarm-02:~$ docker run -d --restart=always --name=registrator --net=host -v /var/run/docker
  2. .sock:/tmp/docker.sock gliderlabs/registrator -ip 192.168.99.113 consul://localhost:8500
  3. cdc5a98d6c0f7633ee41e15c7b5eb45a3243505a7f3c0c08d92f7e85361c5045

Install registrator on swarm-03

  1. docker@swarm-03:~$ docker run -d --restart=always --name=registrator --net=host -v /var/run/d
  2. ocker.sock:/tmp/docker.sock gliderlabs/registrator -ip 192.168.99.114 consul://localhost:8500
  3. bc1f5fe318cf1c184843cc09e85e8b4ac1f997163d020f5a6525465e6fbc3f79
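
Once registrator is running on all three machines, consul's catalog API gives a quick view of what has been registered (the service name nginx-80 below is only an example of the image-name-plus-port convention):

```bash
# List every service the cluster knows about
curl -s http://localhost:8500/v1/catalog/services

# Show which nodes a particular service is running on
curl -s http://localhost:8500/v1/catalog/service/nginx-80
```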

Where the super-awesome setup went wrong

In theory the shipyard dashboard should now be reachable. In practice, 192.168.99.112:8080 does show the shipyard UI, but with no actual data: there are no containers, there are no images...

And the swarm manager cannot reach its own agents:

  1. docker@swarm-01:~$ docker -H 127.0.0.1:3375 info
  2. Containers: 0
  3. Running: 0
  4. Paused: 0
  5. Stopped: 0
  6. Images: 0
  7. Server Version: swarm/1.2.6
  8. Role: primary
  9. Strategy: spread
  10. Filters: health, port, containerslots, dependency, affinity, constraint, whitelist
  11. Nodes: 3
  12. (unknown): 192.168.99.113:2375
  13. ID:
  14. Status: Pending
  15. Containers: 0
  16. Reserved CPUs: 0 / 0
  17. Reserved Memory: 0 B / 0 B
  18. Labels:
  19. Error: Cannot connect to the Docker daemon at tcp://192.168.99.113:2375. Is the docker daemon running?
  20. UpdatedAt: 2017-03-03T02:11:47Z
  21. ServerVersion:
  22. (unknown): 192.168.99.114:2375
  23. ID:
  24. Status: Pending
  25. Containers: 0
  26. Reserved CPUs: 0 / 0
  27. Reserved Memory: 0 B / 0 B
  28. Labels:
  29. Error: Cannot connect to the Docker daemon at tcp://192.168.99.114:2375. Is the docker daemon running?
  30. UpdatedAt: 2017-03-03T02:11:47Z
  31. ServerVersion:
  32. .....
  33. ......

Debugging

First, in the reference setup, whether on CentOS or Ubuntu, the VM's Docker daemon is switched from listening only on -H unix:///var/run/docker.sock to DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock", followed by a restart with /etc/init.d/docker restart, so that the swarm manager can reach its agents on port 2375. But in our environment the daemon listens on 2376. Let's try changing it ourselves:

  1. root@swarm-03:/home/docker# vi /etc/default/docker
  2. root@swarm-03:/home/docker# /etc/init.d/docker start
  3. Need TLS certs for swarm-03,127.0.0.1,10.0.2.15,192.168.99.114
  4. -------------------
  5. root@swarm-03:/home/docker# docker ps -a
  6. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  7. bc1f5fe318cf gliderlabs/registrator "/bin/registrator ..." 11 hours ago Up 2 seconds registrator
  8. 8ab64ffc1d39 swarm:latest "/swarm join --add..." 11 hours ago Up 2 seconds 2375/tcp shipyard-swarm-agent
  9. 54284534d563 progrium/consul:latest "/bin/start -serve..." 15 hours ago Up 2 seconds 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 53/tcp, 53/udp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp consul
  10. root@swarm-03:/home/docker# ps -ef | grep docker
  11. root 10352 1 1 01:14 pts/0 00:00:00 /usr/local/bin/dockerd -D -g /var/lib/docker -H unix:// -H tcp://0.0.0.0:2376 --label provider=virtualbox --insecure-registry 202.117.16.167:5000 --registry-mirror http://c71b9b35.m.daocloud.io --tlsverify --tlscacert=/var/lib/boot2docker/ca.pem --tlscert=/var/lib/boot2docker/server.pem --tlskey=/var/lib/boot2docker/server-key.pem -s aufs
  12. .......

No effect: the daemon still listens on 2376 while our swarm agent still advertises 2375, so the two keep missing each other and the service can never come up. The swarm manager waits expectantly in room 2375, while the landlord only answers on 2376 and never passes the message along...

My reading of it (a surface-level analysis, I don't fully understand the internals): for a VM created by docker-machine, boot2docker.iso already exposes -H 0.0.0.0:2376 as the port the host's docker client uses to reach the daemon inside the VM, and docker-machine set up TLS for port 2376 when the VM was created. Forcing the remote port to 2375 afterwards therefore fails.

For reference, here is docker-machine's VM-creation output again:

  1. Running pre-create checks...
  2. Creating machine...
  3. (test) Copying /home/amy/.docker/machine/cache/boot2docker.iso to /home/amy/.docker/machine/machines/test/boot2docker.iso...
  4. (test) Creating VirtualBox VM...
  5. (test) Creating SSH key...
  6. (test) Starting the VM...
  7. (test) Check network to re-create if needed...
  8. (test) Waiting for an IP...
  9. Waiting for machine to be running, this may take a few minutes...
  10. Detecting operating system of created instance...
  11. Waiting for SSH to be available...
  12. Detecting the provisioner...
  13. Provisioning with boot2docker...
  14. Copying certs to the local machine directory...
  15. Copying certs to the remote machine...
  16. Setting Docker configuration on the remote daemon...
  17. Checking connection to Docker...
  18. Docker is up and running!
  19. To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env test

I also tried binding host port 2376 to 2375 on the swarm manager and agents, which ends in docker: Error response from daemon: driver failed programming external connectivity on endpoint shipyard-swarm-manager : Bind for 0.0.0.0:2376 failed: port is already allocated.

The fix

So don't build the swarm cluster from the swarm image bundled with shipyard. Instead, once the kv store is up, create the nodes directly with docker-machine create --swarm ..., which both avoids the daemon port-binding problem and sets up TLS between the swarm nodes for you.
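
A sketch of what that could look like, assuming the consul kv store from above at 192.168.99.112:8500 (the machine names are arbitrary):

```bash
# Swarm master, registered against the existing consul kv store
# (cluster-advertise on eth1 assumes the default VirtualBox host-only interface)
docker-machine create -d virtualbox \
  --swarm --swarm-master \
  --swarm-discovery consul://192.168.99.112:8500 \
  --engine-opt cluster-store=consul://192.168.99.112:8500 \
  --engine-opt cluster-advertise=eth1:2376 \
  swarm-master

# A swarm node joining the same discovery backend
docker-machine create -d virtualbox \
  --swarm \
  --swarm-discovery consul://192.168.99.112:8500 \
  --engine-opt cluster-store=consul://192.168.99.112:8500 \
  --engine-opt cluster-advertise=eth1:2376 \
  swarm-node-01
```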

Moving the super-awesome setup to physical machines

And that is indeed what it came to: the setup had to move to a physical environment.

Host .164: consul server, RethinkDB, swarm-shipyard-manager, registrator
Host .167: consul agent, swarm-shipyard-agent, registrator
Host .168: consul agent, swarm-shipyard-agent, registrator

With this environment in place, just repeat the steps described earlier.

Browsing to 202.117.16.164:8080 shows something like the following (my own screenshot would not upload, so here is one found online):
(screenshot placeholder)

Once registrator is installed, consul automatically discovers the container services started on every machine.

Through this platform you can see every container running on the three hosts and restart, pause, deploy, and scale them through the UI. It is perfectly adequate for development and test environments; for production you would need to think about the following:
1. High availability of the shipyard management platform itself.
2. Resource isolation between services. For example, with 10 slave hosts in the cluster, I may want the 10 containers of a particular service to land on only 3 of those hosts; this cluster cannot do that.
3. When upgrading containers in the cluster, there is no way to choose an upgrade strategy.

Note: to improve rethinkdb's availability you can build it into a cluster if you have the machines for it; see https://github.com/dockerfile/rethinkdb

Installing nginx + consul-template on the host

Because consul-template dynamically rewrites the nginx configuration file, consul-template and nginx must be installed on the same machine.

Copy the default nginx.conf to ~/nginx_web.ctmpl and edit the upstream block:

  1. # vim /root/nginx_web.ctmpl
  2. worker_processes 1;
  3. events {
  4.     worker_connections 1024;
  5. }
  6. http {
  7.     include mime.types;
  8.     default_type application/octet-stream;
  9.     sendfile on;
  10.     keepalive_timeout 65;
  11.     upstream app {
  12.         {{range $key, $pairs := tree "hello/" | byKey}}{{range $serverid, $pair := $pairs}}
  13.         server {{.Value}} weight=1;{{end}}{{end}}
  14.     }
  15.     server {
  16.         listen 80;
  17.         server_name localhost;
  18.         location / {
  19.             proxy_pass http://app;
  20.         }
  21.     }
  22. }

The binary can be downloaded from https://releases.hashicorp.com/consul-template/; version 0.12.0 is used as the example here.

Download consul-template_0.12.0_linux_amd64.zip, unzip it, and copy the binary into the path:
cp consul-template /usr/local/bin/

Start the service either from a consul-template configuration file or directly from the command line.

If you wrote a config file check_nginx.conf, start it with ./consul-template --config check_nginx.conf.

  1. check_nginx.conf:
  2. consul = "10.2.0.80:8500"
  3. log_level = "warn"
  4. # Only needed if consul is configured with an ACL token; without it no data is returned
  5. token = "f37ab43b-4d2de-aa283-6effsdf507a9eb71d1b"
  6. template {
  7.   source = "~/nginx_web.ctmpl"
  8.   destination = "/usr/local/nginx/conf/nginx.conf"
  9.   command = "/usr/local/nginx/sbin/nginx -t && /usr/local/nginx/sbin/nginx -s reload"
  10. }

Or start it directly with consul-template -consul 192.168.0.149:8500 -template ~/nginx_web.ctmpl:/usr/local/nginx/conf/nginx.conf:"/usr/local/nginx/sbin/nginx -s reload".
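
A hypothetical way to test the template: write a couple of backend addresses under the hello/ prefix that the template watches, and consul-template should re-render nginx.conf and reload nginx (the addresses below are made up):

```bash
curl -X PUT -d '192.168.0.150:8080' http://192.168.0.149:8500/v1/kv/hello/web1
curl -X PUT -d '192.168.0.151:8080' http://192.168.0.149:8500/v1/kv/hello/web2

# The rendered upstream block can then be inspected with:
grep -A4 'upstream app' /usr/local/nginx/conf/nginx.conf
```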

Starting influxdb [on .168]

Three components (cadvisor, influxdb, grafana) are used to build real-time monitoring of the containers and their hosts. One important requirement: containers can be started and destroyed at any moment, so the monitoring has to keep up — a container should be picked up automatically when it starts and dropped automatically when it is destroyed, without much manual intervention.
A quick word on each component: cadvisor is the tool Google uses to monitor its own infrastructure; it captures real-time information not only about the Docker containers but also about the host the cadvisor container runs on. However, cadvisor only exposes real-time data and does not store it, so influxdb is used to persist the metrics, and grafana turns what influxdb stores into clear, well-presented dashboards.

  1. docker pull tutum/influxdb:0.10
  2. [168] docker run -d -p 8083:8083 -p 8086:8086 --expose 8090 --expose 8099 -v /opt/test/data/influxdb:/data --name influxsrv tutum/influxdb:0.10
  1. cloud@xiaohei10t:~$ docker exec -ti influxsrv /bin/bash
  2. root@d141d8cf7b47:/# influx
  3. Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
  4. Connected to http://localhost:8086 version 0.10.3
  5. InfluxDB shell 0.10.3
  6. > CREATE DATABASE cadvisor
  7. > SHOW DATABASES
  8. name: databases
  9. ---------------
  10. name
  11. _internal
  12. cadvisor
  13. > use cadvisor
  14. Using database cadvisor
  15. > CREATE USER "root" WITH PASSWORD 'root' WITH ALL PRIVILEGES
  16. > show users
  17. user admin
  18. root true
  19. > exit

Check 202.117.16.168:8083.

cadvisor

  1. cloud@xiaohei10t:~$ docker run -d --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro --publish=8088:8080 -h $HOSTNAME --detach=true --name=cadvisor google/cadvisor:latest -docker_only -storage_driver=influxdb -storage_driver_db=cadvisor -storage_driver_host=202.117.16.168:8086
  2. 1a243d164568a170a6f1506be413c05d300a01130dec8b2dc1423f814cba296f

The container port is mapped to 8088 because port 8080 on the host is already taken, avoiding the conflict.
Check it at 202.117.16.168:8088.

Deploying grafana [on .168]

  1. docker run -d \
  2. -p 4000:3000 \
  3. -e INFLUXDB_HOST=202.117.16.168 \
  4. -e INFLUXDB_PORT=8086 \
  5. -e INFLUXDB_NAME=cadvisor \
  6. -e INFLUXDB_USER=root \
  7. -e INFLUXDB_PASS=root \
  8. --link influxsrv:influxsrv \
  9. --name grafana grafana/grafana

Port 3000 was already taken on my .168 host; here is how to find the conflict:

  1. cloud@xiaohei10t:~$ lsof -i :3000
  2. COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
  3. gogs 4093 cloud 9u IPv6 924324 0t0 TCP *:3000 (LISTEN)
  4. gogs 4093 cloud 12u IPv6 4458921 0t0 TCP 16h168.xjtu.edu.cn:3000->16h19.xjtu.edu.cn:12558 (ESTABLISHED)
  5. gogs 4093 cloud 17u IPv6 4457008 0t0 TCP 16h168.xjtu.edu.cn:3000->16h19.xjtu.edu.cn:12567 (ESTABLISHED)
  6. cloud@xiaohei10t:~$ ps -aux | grep 4093
  7. cloud 4093 0.0 0.4 434924 34344 pts/4 Sl+ 2016 26:37 ./gogs web
  8. cloud 30590 0.0 0.0 14684 1012 pts/0 S+ 09:46 0:00 grep --color=auto 4093

Opening 202.117.16.168:3000 shows it is the gogs source-control dashboard, so grafana is moved to port 4000 instead. The default account and password are both admin; follow the usual guide to set up the dashboards.

Log collection with graylog

  1. version: '2'
  2. services:
  3.   mongo:
  4.     image: "mongo:3"
  5.     volumes:
  6.       - /opt/graylog/data/mongo:/data/db
  7.   elasticsearch:
  8.     image: "elasticsearch:2.3"
  9.     command: "elasticsearch -Des.cluster.name='graylog'"
  10.     volumes:
  11.       - /opt/graylog/data/elasticsearch:/usr/share/elasticsearch/data
  12.   graylog:
  13.     image: graylog2/server:2.0.3-2
  14.     volumes:
  15.       - /opt/graylog/data/journal:/usr/share/graylog/data/journal
  16.       #- /opt/graylog/config:/usr/share/graylog/data/config
  17.     environment:
  18.       GRAYLOG_PASSWORD_SECRET: somepasswordpepper
  19.       GRAYLOG_ROOT_PASSWORD_SHA2: 4bbdd5a829dba09d7a7ff4c1367be7d36a017b4267d728d31bd264f63debeaa6
  20.       GRAYLOG_REST_TRANSPORT_URI: http://202.117.16.168:12900
  21.     depends_on:
  22.       - mongo
  23.       - elasticsearch
  24.     ports:
  25.       - "9000:9000"
  26.       - "12900:12900"
  27.       - "12201:12201/udp"
  28.       - "12202:12202/udp"

Start it with docker-compose -f docker-compose-5.yml up, browse to 202.117.16.168:9000, and log in with user admin, password graylog.

Assorted swarm notes

  1. Docker 1.12 and later integrate swarmkit and related orchestration into the engine; at the time this note was written the current version was 1.12-rc2.
  2. Swarm is a fairly simple tool Docker released in early December 2014 for managing Docker clusters; it turns a group of Docker hosts into a single virtual host. Swarm exposes the standard Docker API as its front end, so any kind of Docker client (the Go docker client, docker_py, docker, ...) can talk to it directly. Swarm is written almost entirely in Go. Swarm 0.2 was released on Friday, April 17th; compared to 0.1 it added a new scheduling strategy that spreads containers across the available nodes, and support for more Docker commands and cluster drivers.

The Swarm daemon is only a scheduler plus a router; Swarm does not run containers itself. It accepts requests from Docker clients and schedules suitable nodes to run the containers. This means that even if Swarm goes down for some reason, the nodes in the cluster keep running as before, and when Swarm comes back it rebuilds the cluster state.
