
Notes on using docker-machine

for-github


Docker-machine

docker-machine uses the appropriate virtualization driver (VirtualBox, Hyper-V, ...) to create a lightweight VM and downloads boot2docker, the official minimal VM image that already ships with the Docker daemon — in other words, a simple, ready-made Docker host.

  1. [amy@amy-Heizi ~]$ dm create --driver virtualbox --engine-insecure-registry 202.117.16.167:5000 --engine-registry-mirror http://c71b9b35.m.daocloud.io manager1

--engine-insecure-registry points the engine at a private registry, and --engine-registry-mirror sets the DaoCloud mirror used to speed up pulls. Alternatively, after the machine is created you can run curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://c71b9b35.m.daocloud.io to configure the mirror. (dm is used as a shell alias for docker-machine throughout these notes.)
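
A quick sanity check that those engine flags were actually applied: they show up on the dockerd command line inside the boot2docker VM, as a later ps -ef listing in these notes also shows.

```bash
# The [d]ockerd trick keeps grep from matching itself; the output should
# contain --insecure-registry 202.117.16.167:5000 and --registry-mirror ...
docker-machine ssh manager1 "ps -ef | grep [d]ockerd"
```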

The locally installed docker client is then pointed at the newly created machine and used to drive it.

About eval "$(docker-machine env host_name)"

Running docker-machine env host_name prints:

  1. export DOCKER_TLS_VERIFY="1"
  2. export DOCKER_HOST="tcp://192.168.99.101:2376"
  3. export DOCKER_CERT_PATH="/home/amy/.docker/machine/machines/manager1"
  4. export DOCKER_MACHINE_NAME="manager1"
  5. export DOCKER_API_VERSION="1.26"
  6. # Run this command to configure your shell:
  7. # eval $(docker-machine env manager1)

This is a set of environment variables describing the new machine. The shell's eval first runs "$(docker-machine env host_name)" and then applies the resulting export statements to the current shell, so the docker client in that shell connects to the new machine's Docker daemon and all subsequent commands run against it.
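
A minimal sketch of switching the local client to the new machine, verifying it, and switching back (docker-machine active and the -u/--unset flag of docker-machine env are standard commands):

```bash
# Point the local docker client at manager1
eval "$(docker-machine env manager1)"
docker-machine active    # prints: manager1
docker info              # now describes the daemon inside the manager1 VM

# When done, unset the variables so the client talks to the local daemon again
eval "$(docker-machine env -u)"
```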

  1. [amy@amy-Heizi ~]$ dm ip manager1
  2. 192.168.99.101
  3. [amy@amy-Heizi ~]$ dm ssh manager1
  4. (boot2docker ASCII-art whale banner)
  5. WARNING: this is a build from test.docker.com, not a stable release.
  6. Boot2Docker version 17.03.0-ce-rc1, build HEAD : f08ca37 - Tue Feb 21 05:59:38 UTC 2017
  7. Docker version 17.03.0-ce-rc1, build ce07fb6
  1. docker@manager1:~$ docker swarm init --advertise-addr 192.168.99.101
  2. Swarm initialized: current node (auqsxvb40r23q5rd62e806bw1) is now a manager.
  3. To add a worker to this swarm, run the following command:
  4. docker swarm join \
  5. --token SWMTKN-1-3iqq804240ys14ld5cbpl39gfeq80e8687ct429k6rmnecqev4-dyvoa3wiuxpq7efjpxrz8etl3 \
  6. 192.168.99.101:2377
  7. To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

manager1 advertises 192.168.99.101 to the swarm; any node that wants to join must be able to reach that address.

  1. [amy@amy-Heizi ~]$ dm create --driver virtualbox --engine-insecure-registry 202.117.16.167:5000 --engine-registry-mirror http://c71b9b35.m.daocloud.io worker1
  2. Running pre-create checks...
  3. Creating machine...
  4. (worker1) Copying /home/amy/.docker/machine/cache/boot2docker.iso to /home/amy/.docker/machine/machines/worker1/boot2docker.iso...
  5. (worker1) Creating VirtualBox VM...
  6. (worker1) Creating SSH key...
  7. (worker1) Starting the VM...
  8. (worker1) Check network to re-create if needed...
  9. (worker1) Waiting for an IP...
  10. Waiting for machine to be running, this may take a few minutes...
  11. Detecting operating system of created instance...
  12. Waiting for SSH to be available...
  13. Detecting the provisioner...
  14. Provisioning with boot2docker...
  15. Copying certs to the local machine directory...
  16. Copying certs to the remote machine...
  17. Setting Docker configuration on the remote daemon...
  18. Checking connection to Docker...
  19. Docker is up and running!
  20. To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env worker1
  21. [amy@amy-Heizi ~]$ dm ssh worker1
  22. (boot2docker ASCII-art whale banner)
  23. WARNING: this is a build from test.docker.com, not a stable release.
  24. Boot2Docker version 17.03.0-ce-rc1, build HEAD : f08ca37 - Tue Feb 21 05:59:38 UTC 2017
  25. Docker version 17.03.0-ce-rc1, build ce07fb6
  26. docker@worker1:~$ ifconfig

The machines are created with the self-hosted docker registry and the DaoCloud mirror configured from the start.

Check the current state:

  1. docker@manager1:~$ docker node ls
  2. ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
  3. auqsxvb40r23q5rd62e806bw1 * manager1 Ready Active Leader
  4. gko28cd7wa81bfc577owyv7ro worker2 Ready Active
  5. y4adhllxrumqx967qdudl4009 worker1 Ready Active
  6. docker@manager1:~$

On the manager1 node:

  1. docker@manager1:~$ docker service create --replicas 1 --name helloword alpine ping www.baidu.
  2. com
  3. rns7n1uqo6hftulsdu7e8knzo
  4. docker@manager1:~$ docker service ls
  5. ID NAME MODE REPLICAS IMAGE
  6. rns7n1uqo6hf helloword replicated 1/1 alpine:latest
  7. docker@manager1:~$ docker service ps helloword
  8. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  9. jc2lq658h51e helloword.1 alpine:latest manager1 Running Running 56 seconds ago
  10. docker@manager1:~$ docker service scale helloword=3
  11. helloword scaled to 3
  12. docker@manager1:~$ docker service ls
  13. ID NAME MODE REPLICAS IMAGE
  14. rns7n1uqo6hf helloword replicated 1/3 alpine:latest
  15. docker@manager1:~$ docker service ps helloword
  16. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  17. jc2lq658h51e helloword.1 alpine:latest manager1 Running Running 2 minutes ago
  18. hwm8p89y0tyr helloword.2 alpine:latest worker2 Running Running 42 seconds ago
  19. 8qizgs4se6i6 helloword.3 alpine:latest worker1 Running Preparing 51 seconds ago
  20. docker@manager1:~$ docker ps -a
  21. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  22. d43d4a334ab8 alpine@sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8 "ping www.baidu.com" 3 minutes ago Up 3 minutes helloword.1.jc2lq658h51e85m03v6gs6u6t

On the worker2 node:

  1. docker@worker2:~$ docker ps -a
  2. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  3. 3dc94bec5c66 alpine@sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8 "ping www.baidu.com" About a minute ago Up About a minute helloword.2.hwm8p89y0tyr1p8xsv8p2ds87

Inspect the service details and its update history:

  1. docker@manager1:~$ docker service inspect helloword
  2. [
  3. {
  4. "ID": "rns7n1uqo6hftulsdu7e8knzo",
  5. "Version": {
  6. "Index": 35
  7. },
  8. "CreatedAt": "2017-02-24T09:10:10.898515737Z",
  9. "UpdatedAt": "2017-02-24T09:12:24.641090641Z",
  10. "Spec": {
  11. "Name": "helloword",
  12. "TaskTemplate": {
  13. "ContainerSpec": {
  14. "Image": "alpine:latest@sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8",
  15. "Args": [
  16. "ping",
  17. "www.baidu.com"
  18. ],
  19. 。。。。。。
  20. "Mode": {
  21. "Replicated": {
  22. "Replicas": 3
  23. }
  24. },
  25. 。。。。。。
  26. },
  27. "PreviousSpec": {
  28. "Name": "helloword",
  29. "TaskTemplate": {
  30. "ContainerSpec": {
  31. "Image": "alpine:latest@sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8",
  32. "Args": [
  33. "ping",
  34. "www.baidu.com"
  35. ],
  36. 。。。。。。
  37. "Mode": {
  38. "Replicated": {
  39. "Replicas": 1
  40. }
  41. },
  42. 。。。。。。
  43. "UpdateStatus": {
  44. "StartedAt": "0001-01-01T00:00:00Z",
  45. "CompletedAt": "0001-01-01T00:00:00Z"
  46. }
  47. }
  48. ]

The condensed version:

  1. docker@manager1:~$ docker service inspect --pretty helloword
  2. ID: rns7n1uqo6hftulsdu7e8knzo
  3. Name: helloword
  4. Service Mode: Replicated
  5. Replicas: 3
  6. Placement:
  7. UpdateConfig:
  8. Parallelism: 1
  9. On failure: pause
  10. Max failure ratio: 0
  11. ContainerSpec:
  12. Image: alpine:latest@sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8
  13. Args: ping www.baidu.com
  14. Resources:
  15. Endpoint Mode: vip

Scaling out, and how the tasks are distributed:

  1. docker@manager1:~$ docker service scale helloword=5
  2. helloword scaled to 5
  3. docker@manager1:~$ docker service ps helloword
  4. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  5. jc2lq658h51e helloword.1 alpine:latest manager1 Running Running 13 minutes ago
  6. hwm8p89y0tyr helloword.2 alpine:latest worker2 Running Running 11 minutes ago
  7. 8qizgs4se6i6 helloword.3 alpine:latest worker1 Running Running 9 minutes ago
  8. yw4w0l4u823t helloword.4 alpine:latest worker1 Running Running 11 seconds ago
  9. 3jd5fk9r5rlr helloword.5 alpine:latest manager1 Running Running 11 seconds ago
  1. docker@manager1:~$ docker service rm helloword
  2. helloword
  3. docker@manager1:~$ docker service ls
  4. ID NAME MODE REPLICAS IMAGE
  5. docker@worker2:~$ docker ps -a
  6. CONTAINER IMAGE COMMAND CREATED STATUS PORTS NAMES

There is no need to go to each node and stop and remove the service's containers one by one; docker service rm <service_name> takes care of all of it.

  1. docker@manager1:~$ docker service create --replicas 4 --name redis --update-delay 10s redis:3
  2. .0.6
  3. wohnpe9yqiculurpmt7ulid6e
  4. docker@manager1:~$ docker service ps redis
  5. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  6. cr7lug0ard1o redis.1 redis:3.0.6 worker2 Running Preparing 23 seconds ago
  7. oeqscrvryxuy redis.2 redis:3.0.6 manager1 Running Preparing 23 seconds ago
  8. o4wqgrgsj1l4 redis.3 redis:3.0.6 worker1 Running Preparing 23 seconds ago
  9. nm4vmi0fydlj redis.4 redis:3.0.6 worker1 Running Preparing 23 seconds ago
  10. docker@manager1:~$

The DESIRED STATE column is the state the service is supposed to be in.
CURRENT STATE is the actual state: once the service is created, the selected Docker hosts pull the image and start the containers, and because pulling takes time you see entries like Preparing 23 seconds ago.

Swarm mode has two types of services, replicated and global. For replicated services, you specify the number of replica tasks for the swarm manager to schedule onto available nodes. For global services using the --mode global, the scheduler places one task on each available node. Every time a new node becomes available, the scheduler places a task for the global service on the new node.
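
A minimal illustration of the two modes (the service names redis-replicated and node-ping are made up for this sketch):

```bash
# Replicated: the scheduler places exactly 3 tasks somewhere in the swarm
docker service create --replicas 3 --name redis-replicated redis:3.0.6

# Global: one task on every node; nodes that join later automatically get one too
docker service create --mode global --name node-ping alpine ping www.baidu.com
```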

  1. docker@manager1:~$ docker service inspect --pretty redis
  2. ID: wohnpe9yqiculurpmt7ulid6e
  3. Name: redis
  4. Service Mode: Replicated
  5. Replicas: 4
  6. Placement:
  7. UpdateConfig:
  8. Parallelism: 1
  9. Delay: 10s
  10. On failure: pause
  11. Max failure ratio: 0
  12. ContainerSpec:
  13. Image: redis:3.0.6@sha256:6a692a76c2081888b589e26e6ec835743119fe453d67ecf03df7de5b73d69842
  14. Resources:
  15. Endpoint Mode: vip
  16. docker@manager1:~$ docker service update --image redis:3.0.7 redis
  17. redis
  18. docker@manager1:~$ docker service ps redis
  19. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  20. x6sqdshvc5nu redis.1 redis:3.0.7 worker2 Running Preparing 7 minutes ago
  21. cr7lug0ard1o \_ redis.1 redis:3.0.6 worker2 Shutdown Shutdown 7 minutes ago
  22. oeqscrvryxuy redis.2 redis:3.0.6 manager1 Running Running 2 hours ago
  23. o4wqgrgsj1l4 redis.3 redis:3.0.6 worker1 Running Running 2 hours ago
  24. nm4vmi0fydlj redis.4 redis:3.0.6 worker1 Running Running 2 hours ago

Service updates are rolling updates by default: the first task is updated, and if it comes back Running the scheduler waits the interval given by --update-delay before updating the next task. If any task update returns FAILED, the update as a whole pauses; --update-failure-action controls the behaviour on failure.
If an update did pause because of a failure, rerun docker service update <service_name>, ideally passing the relevant flags so that it does not simply pause again on a repeated failure.
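
A hedged example of the update flags mentioned above (the concrete values are arbitrary):

```bash
# Update two tasks at a time, wait 10s between batches, and keep going
# instead of pausing when a task fails to update
docker service update \
  --update-parallelism 2 \
  --update-delay 10s \
  --update-failure-action continue \
  --image redis:3.0.7 \
  redis
```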

Draining a node takes it from the active state out of scheduling: it stops accepting new tasks, and the manager shuts down its existing tasks and reschedules them onto other active nodes.

  1. docker@manager1:~$ docker node update --availability drain worker1
  2. worker1
  3. docker@manager1:~$ docker node ls
  4. ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
  5. auqsxvb40r23q5rd62e806bw1 * manager1 Ready Active Leader
  6. y4adhllxrumqx967qdudl4009 worker1 Ready Drain
  7. zfn93z0w5le92dpncn33vigqe worker2 Ready Active
  8. docker@manager1:~$ docker node inspect --pretty worker1
  9. ID: y4adhllxrumqx967qdudl4009
  10. Hostname: worker1
  11. Joined at: 2017-02-24 08:34:49.908882971 +0000 utc
  12. Status:
  13. State: Ready
  14. Availability: Drain
  15. Address: 192.168.99.104
  16. Platform:
  17. Operating System: linux
  18. Architecture: x86_64
  19. Resources:
  20. CPUs: 1
  21. Memory: 995.8 MiB
  22. Plugins:
  23. Network: bridge, host, macvlan, null, overlay
  24. Volume: local
  25. Engine Version: 17.03.0-ce-rc1
  26. Engine Labels:
  27. - provider = virtualbox

worker1's availability has been changed from Active to Drain, so it no longer accepts new tasks, and the tasks that were running on it are rescheduled by the manager onto the remaining active nodes.

  1. docker@manager1:~$ docker service ps redis
  2. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  3. qui4c3hdyysj redis.1 redis:3.0.7 manager1 Running Running 4 minutes ago
  4. 4uxjo5aukl9i \_ redis.1 redis:3.0.7 worker1 Shutdown Shutdown 4 minutes ago
  5. kmfwhpdovgnr redis.2 redis:3.0.7 manager1 Running Running 7 minutes ago
  6. yp9a2f8v38dk redis.3 redis:3.0.7 worker2 Running Running 7 minutes ago

Bring the node back from drain to active:

  1. docker@manager1:~$ docker node update --availability active worker1
  2. worker1
  3. docker@manager1:~$ docker node inspect --pretty worker1
  4. ID: y4adhllxrumqx967qdudl4009
  5. Hostname: worker1
  6. Joined at: 2017-02-24 08:34:49.908882971 +0000 utc
  7. Status:
  8. State: Ready
  9. Availability: Active
  10. Address: 192.168.99.104
  11. Platform:
  12. Operating System: linux
  13. Architecture: x86_64
  14. Resources:
  15. CPUs: 1
  16. Memory: 995.8 MiB
  17. Plugins:
  18. Network: bridge, host, macvlan, null, overlay
  19. Volume: local
  20. Engine Version: 17.03.0-ce-rc1
  21. Engine Labels:
  22. - provider = virtualbox

A quick test: start a service whose image is pulled from the designated private docker registry.

  1. docker@manager1:~$ docker node ls
  2. ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
  3. q42ke002z89pqhzcuzdsa5gw8 * manager1 Ready Active Leader
  4. y4adhllxrumqx967qdudl4009 worker1 Ready Active
  5. zfn93z0w5le92dpncn33vigqe worker2 Ready Active
  6. docker@manager1:~$ docker service ls
  7. ID NAME MODE REPLICAS IMAGE
  8. docker@manager1:~$ docker service create --name my-web --publish 8080:80 --replicas 2 202.117.16.16
  9. 7:5000/library/nginx
  10. mfdyfx0ql4oht5dtnxa93pkxe
  11. docker@manager1:~$ docker service ps my-web
  12. ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
  13. ismaqxafsl4b my-web.1 202.117.16.167:5000/library/nginx:latest manager1 Running Preparing 10 seconds ago
  14. 1heo9bv3ui3t my-web.2 202.117.16.167:5000/library/nginx:latest worker1 Running Preparing 10 seconds ago

Now just wait for it to converge — I'm exhausted, off for a walk...

  1. docker@worker1:~$ docker ps -a
  2. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  3. 66a422852059 202.117.16.167:5000/library/nginx@sha256:4296639ebdf92f035abf95fee1330449e65990223c899838283c9844b1aaac4c "nginx -g 'daemon ..." 18 hours ago Up 18 hours 80/tcp, 443/tcp my-web.2.1heo9bv3ui3tz2siiyk5evx69

The result is encouraging. What does it show? As long as the nodes can reach the registry host over the network, you can freely use your own registry anywhere in the swarm cluster!

swarm mode routing mesh

To make the web server accessible from outside the swarm, you need to publish the port where the swarm listens for web requests.

For a published service, the swarm listens on the configured port on every node and routes external requests for that service to the nodes running its tasks, where they are answered on the task containers' service port.

An illustration:

Now the interesting part: if the swarm's built-in load balancer isn't enough, you can put a proxy of your own choosing in front.
The official docs recommend HAProxy ("the Reliable, High Performance TCP/HTTP Load Balancer"), but there are also setups based on Nginx and NGINX Plus that add TLS termination to the load balancing.

Here is a diagram of the setup after HAProxy is deployed:

(HAProxy diagram)

swarm orchestrator and scheduler

The built-in swarm orchestrator and scheduler deploy your application to nodes in your swarm to achieve and maintain the desired state.

swarm service

Use the --reserve-memory or --reserve-cpu flags to reserve resources for a service's containers. If you reserve, say, 4 CPUs per container and no node can satisfy that, the service remains in a pending state until a node is available to run its tasks.
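
A small sketch of the reservation flags (the service name and the numbers are arbitrary):

```bash
# Each task reserves half a CPU and 256 MB of memory; if no node has that
# much free, the task stays Pending until capacity becomes available
docker service create --name reserved-redis \
  --reserve-cpu 0.5 \
  --reserve-memory 256mb \
  redis:3.0.6
```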

Swarm mode lets you network services in a couple of ways:

- publish ports externally to the swarm using ingress networking or directly on each swarm node
- connect services and tasks within the swarm using overlay networks
  1. Publish a service’s ports using the routing mesh

In one sentence: for a service created with docker service create --publish <TARGET-PORT>:<SERVICE-PORT> IMAGE:tag, any host outside the swarm that can reach any swarm node at IP:TARGET-PORT can reach the service — even if the node it happens to connect to is not running one of the service's tasks!

To publish a service’s ports externally to the swarm, use the --publish <TARGET-PORT>:<SERVICE-PORT> flag. The swarm makes the service accessible at the target port on every swarm node. If an external host connects to that port on any swarm node, the routing mesh routes it to a task. The external host does not need to know the IP addresses or internally-used ports of the service tasks to interact with the service. When a user or process connects to a service, any worker node running a service task may respond.

Example: Run a three-task Nginx service on 10-node swarm

Imagine that you have a 10-node swarm, and you deploy an Nginx service running three tasks on a 10-node swarm:

  1. $ docker service create --name my_web \
  2. --replicas 3 \
  3. --publish 8080:80 \
  4. nginx

Three tasks will run on up to three nodes. You don’t need to know which nodes are running the tasks; connecting to port 8080 on any of the 10 nodes will connect you to one of the three nginx tasks. You can test this using curl (the HTML output is truncated):

  1. $ curl localhost:8080
  2. <!DOCTYPE html>
  3. <html>
  4. <head>
  5. <title>Welcome to nginx!</title>
  6. ...truncated...
  7. </html>
  1. Publish a service’s ports directly on the swarm node

The swarm architecture

(swarm architecture diagram)

Testing docker compose on the swarm

Current setup:

  1. docker@manager1:~$ docker node ls
  2. ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
  3. h59f83m47ls7potrpb9ua2ywd amy-Heizi Ready Active (192.168.99.1)
  4. q42ke002z89pqhzcuzdsa5gw8 * manager1 Ready Active Leader (192.168.99.109)
  5. y4adhllxrumqx967qdudl4009 worker1 Ready Active (192.168.99.111)
  6. zfn93z0w5le92dpncn33vigqe worker2 Ready Active (192.168.99.110)

Apart from amy-Heizi, all of the nodes are VMs created with docker-machine. Now install docker-compose on every Docker host, switching to root first with sudo su.

  1. docker@worker1:~$ sudo su
  2. root@worker1:/home/docker# curl -L https://github.com/docker/compose/releases/download/1.11.2
  3. /docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
  4. % Total % Received % Xferd Average Speed Time Time Time Current
  5. Dload Upload Total Spent Left Speed
  6. 100 600 0 600 0 0 541 0 --:--:-- 0:00:01 --:--:-- 541
  7. 100 8066k 100 8066k 0 0 707k 0 0:00:11 0:00:11 --:--:-- 1624k
  8. root@worker1:/home/docker# chmod +x /usr/local/bin/docker-compose
  9. root@worker1:~# su - docker
  10. Boot2Docker version 17.03.0-ce-rc1, build HEAD : f08ca37 - Tue Feb 21 05:59:38 UTC 2017
  11. Docker version 17.03.0-ce-rc1, build ce07fb6
  12. docker@worker1:~$ docker-compose version
  13. docker-compose version 1.11.2, build dfed245
  14. docker-py version: 2.1.0
  15. CPython version: 2.7.13
  16. OpenSSL version: OpenSSL 1.0.1t 3 May 2016
  17. snip....
  18. docker@manager1:~$ docker-compose version
  19. docker-compose version 1.11.2, build dfed245
  20. docker-py version: 2.1.0
  21. CPython version: 2.7.13
  22. OpenSSL version: OpenSSL 1.0.1t 3 May 2016
  23. snip....
  24. docker@worker2:~$ docker-compose version
  25. docker-compose version 1.11.2, build dfed245
  26. docker-py version: 2.1.0
  27. CPython version: 2.7.13
  28. OpenSSL version: OpenSSL 1.0.1t 3 May 2016

With that, the environment is ready.

A first single-node run of docker compose

This uses the version 2 file format; see the documentation.

  1. docker@worker1:~$ mkdir composetest && cd composetest
  2. docker@worker1:~/composetest$ vi app.py
  3. docker@worker1:~/composetest$ vi requirements.txt
  4. docker@worker1:~/composetest$ vi Dockerfile
  5. docker@worker1:~/composetest$ vi docker-compose.yml
  6. docker@worker1:~/composetest$ docker-compose up
  7. WARNING: The Docker Engine you're using is running in swarm mode.
  8. Compose does not use swarm mode to deploy services to multiple nodes in a swarm. All containers will be scheduled on the current node.
  9. To deploy your application across the swarm, use `docker stack deploy`.
  10. Creating network "composetest_default" with the default driver
  11. Building web
  12. Step 1/5 : FROM python:3.4-alpine
  13. 3.4-alpine: Pulling from library/python
  14. snip..............
  15. Successfully installed Jinja2-2.9.5 MarkupSafe-0.23 Werkzeug-0.11.15 click-6.7 flask-0.12 itsdangerous-0.24 redis-2.10.5
  16. ............
  17. Successfully built c6548e252618
  18. ..................
  19. Pulling redis (redis:alpine)...
  20. alpine: Pulling from library/redis
  21. ................
  22. Creating composetest_redis_1
  23. Creating composetest_web_1
  24. Attaching to composetest_redis_1, composetest_web_1
  25. redis_1 | 1:C 27 Feb 11:40:39.539 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf

Open 192.168.99.111:5000 in a browser and you see something like Hello World! I have been seen 19 times. (Because this is a single-node deployment, the app is only reachable via worker1's IP.)
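
The four files created above are not shown in the transcript; a minimal sketch that matches the build output (the standard Compose getting-started Flask + Redis hit counter, assuming python:3.4-alpine as in the log) would be:

```bash
mkdir -p composetest && cd composetest

cat > app.py <<'EOF'
from flask import Flask
from redis import Redis

app = Flask(__name__)
redis = Redis(host='redis', port=6379)

@app.route('/')
def hello():
    count = redis.incr('hits')
    return 'Hello World! I have been seen {} times.\n'.format(count)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=True)
EOF

cat > requirements.txt <<'EOF'
flask
redis
EOF

cat > Dockerfile <<'EOF'
FROM python:3.4-alpine
ADD . /code
WORKDIR /code
RUN pip install -r requirements.txt
CMD ["python", "app.py"]
EOF

cat > docker-compose.yml <<'EOF'
version: '2'
services:
  web:
    build: .
    ports:
      - "5000:5000"
  redis:
    image: redis:alpine
EOF

docker-compose up
```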

Compose file version compatibility

This table shows which Compose file versions support specific Docker releases.

| Compose file format | Docker Engine release |
| --- | --- |
| 3.0, 3.1 | 1.13.0+ |
| 2.1 | 1.12.0+ |
| 2.0 | 1.10.0+ |

Versioning
There are currently three versions of the Compose file format:

  1. Version 1, the legacy format. This is specified by omitting a version key at the root of the YAML.

  2. Version 2.x. This is specified with a version: '2' or version: '2.1' entry at the root of the YAML.

  3. Version 3.x, the latest and recommended version, designed to be cross-compatible between Compose and the Docker Engine’s swarm mode. This is specified with a version: '3' or version: '3.1', etc., entry at the root of the YAML.

Building services with version 3 files and swarm mode

Following the official docs (Sample app overview, Deploy the application) to learn how compose file version 3 and docker stack deploy are used to combine compose with swarm mode.
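
A hedged sketch of what that looks like, reusing the swarm and the private registry from above (the stack name and the file are made up for this example):

```bash
cat > docker-stack.yml <<'EOF'
version: '3'
services:
  web:
    image: 202.117.16.167:5000/library/nginx:latest
    ports:
      - "8080:80"
    deploy:
      replicas: 2
      update_config:
        parallelism: 1
        delay: 10s
EOF

# Must be run on a manager node; "webstack" is an arbitrary stack name
docker stack deploy -c docker-stack.yml webstack
docker stack services webstack
```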

Docker Swarm, Docker Engine swarm mode, SwarmKit

The Docker ecosystem is rather confusing these days, with so many overlapping projects...

Swarm, SwarmKit and swarm mode are easy to mix up.

Building that super-awesome cluster management setup

Docker DNS & Service Discovery with Consul and Registrator
[Reference] Docker study notes: a Shipyard + Swarm + Consul + Service Discovery setup tutorial
That super-awesome system — note that the reference has a few mistakes in places.

Environment

Every machine was created with dm create --driver virtualbox --engine-insecure-registry 202.117.16.167:5000 --engine-registry-mirror http://c71b9b35.m.daocloud.io swarm-01 (and likewise for swarm-02 and swarm-03).

consul

  1. [amy@amy-Heizi ~ [swarm-01]]$ docker run -d -p 8300:8300 -p 8301:8301 -p 8301:8301/udp -p 8302:8302 -p 8302:8302/udp -p 8400:8400 -p 8500:8500 -p 8600:53 -p 8600:53/udp -v /opt/test/data/consul:/data -h $HOSTNAME --restart=always --name=consul progrium/consul:latest -server -bootstrap -ui-dir=/ui -advertise 192.168.99.112 -client 0.0.0.0
  2. Unable to find image 'progrium/consul:latest' locally
  3. latest: Pulling from progrium/consul
  4. ..........
  5. Digest: sha256:8cc8023462905929df9a79ff67ee435a36848ce7a10f18d6d0faba9306b97274
  6. Status: Downloaded newer image for progrium/consul:latest
  7. 54ad17ce03182f381462bda0adf9a23ab78d072932d4a556ef23085be42e0005

-p 8400:8400 maps consul's RPC port 8400.
-p 8500:8500 maps the web UI port 8500.
-p 8600:53/udp binds the container's UDP port 53 (the default DNS port) onto the host.
-v /opt/test/data/consul:/data mounts consul's data directory on the host, so the data survives a container restart.
-advertise 192.168.99.112 is the IP the service advertises externally; without it the service would show up with the container's internal IP, which is unreachable.
-client 0.0.0.0 is the address consul listens on for client connections.

agent1:192.168.99.113

  1. docker@swarm-02:~$ docker run -d -p 8300:8300 -p 8301:8301 -p 8301:8301/udp -p 8302:8302 -p 8302:8302/udp -p 8400:8400 -p 8500:8500 -v /opt/test/data/consul:/data -h $HOSTNAME --restart=always --name=consul progrium/consul:latest -server -join 192.168.99.112 -advertise 192.168.99.113 -client 0.0.0.0
  2. 0c3dd09e8b31550b59084d20b0cf8c6ffe8de6c299345362a1eadc964f28665f
  3. docker@swarm-02:~$ docker ps -a
  4. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  5. 0c3dd09e8b31 progrium/consul:latest "/bin/start -serve..." 4 hours ago Up 4 hours 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 53/tcp, 53/udp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp consul

agent2: 192.168.99.114

  1. docker@swarm-03:~$ docker run -d -p 8300:8300 -p 8301:8301 -p 8301:8301/udp -p 8302:8302 -p 8302:8302/udp -p 8400:8400 -p 8500:8500 -v /opt/test/data/consul:/data -h $HOSTNAME --restart=always --name=consul progrium/consul:latest -server -join 192.168.99.112 -advertise 192.168.99.114 -client 0.0.0.0
  2. 54284534d563c450a8bffed65a38626b17f7d839943395215d91bf1645326548
  3. docker@swarm-03:~$ docker ps -a
  4. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  5. 54284534d563 progrium/consul:latest "/bin/start -serve..." 3 seconds ago Up 2 seconds 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 53/tcp, 53/udp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp consul

Seeing leader = true and server = true means the cluster is in a healthy state. For the details of how this works, see https://www.consul.io/intro/getting-started/install.html

  1. docker@swarm-01:~$ docker exec -ti consul consul info
  2. WARNING: It is highly recommended to set GOMAXPROCS higher than 1
  3. agent:
  4. check_monitors = 0
  5. check_ttls = 0
  6. checks = 0
  7. services = 1
  8. build:
  9. prerelease =
  10. revision = 9a9cc934
  11. version = 0.5.2
  12. consul:
  13. bootstrap = true
  14. known_datacenters = 1
  15. leader = true
  16. server = true
  17. raft:
  18. ....
  19. runtime:
  20. ......
  21. serf_lan:
  22. .....
  23. serf_wan:
  24. .....

Check the current consul cluster membership:

  1. docker@swarm-01:~$ docker exec -ti consul consul members
  2. Node Address Status Type Build Protocol DC
  3. amy-Heizi 192.168.99.112:8301 alive server 0.5.2 2 dc1
  4. swarm-02 192.168.99.113:8301 alive server 0.5.2 2 dc1
  5. swarm-03 192.168.99.114:8301 alive server 0.5.2 2 dc1

shipyard + swarm visualization

For an introduction to Shipyard and its installation, see the official site. The official instructions use etcd as the default key-value store for service discovery, so the Discovery and proxy steps are skipped here and the swarm manager, swarm agent, and shipyard controller are installed directly.

  1. docker@swarm-01:~$ docker run -d --restart=always --name shipyard-rethinkdb rethinkdb
  2. a6613327aaab3e3108105bea081eaf0106eba5481de9080ab315ccfc81d5435d
  3. docker@swarm-01:~$ docker ps -a
  4. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  5. a6613327aaab rethinkdb "rethinkdb --bind all" 2 minutes ago Up 2 minutes 8080/tcp, 28015/tcp, 29015/tcp shipyard-rethinkdb
  6. dc4e8cf5ebae progrium/consul:latest "/bin/start -serve..." 4 hours ago Up 4 hours 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp, 0.0.0.0:8600->53/tcp, 0.0.0.0:8600->53/udp consul

In consul://192.168.99.112:8500, replace 192.168.99.112 with the IP of the host your consul server runs on.

  1. docker@swarm-01:~$ docker run -d -p 3375:3375 --restart=always --name shipyard-swarm-manager swarm:latest manage --host tcp://0.0.0.0:3375 consul://192.168.99.112:8500
  2. 87392296efa4264f40cfda96f5e4d6669a23e09700b5c85debf97dcd6b8ca074

Replace --addr 192.168.99.113 with the IP of the host running the swarm agent, and consul://192.168.99.112:8500 with the host running the consul server.

swarm-02:

  1. docker@swarm-02:~$ docker run -d --restart=always --name shipyard-swarm-agent swarm:latest join --addr 192.168.99.113:2375 consul://192.168.99.112:8500
  2. 0b52bb3c9c9a517462b4736756044c1c14b8d34f8e39e918894fadf255690b1c
  3. docker@swarm-02:~$ docker ps -a
  4. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  5. 0b52bb3c9c9a swarm:latest "/swarm join --add..." 13 minutes ago Up 12 minutes 2375/tcp shipyard-swarm-agent
  6. 0c3dd09e8b31 progrium/consul:latest "/bin/start -serve..." 4 hours ago Up 4 hours 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 53/tcp, 53/udp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp consul

swarm-03:

  1. docker@swarm-03:~$ docker run -d --restart=always --name shipyard-swarm-agent swarm:latest join --addr 192.168.99.114:2375 consul://192.168.99.112:8500
  2. 8ab64ffc1d397e4b0bddf99e1b084e04c3e7dfbe9591875e3342867b41dd2e65
  3. docker@swarm-03:~$ docker ps -a
  4. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  5. 8ab64ffc1d39 swarm:latest "/swarm join --add..." 10 minutes ago Up 10 minutes 2375/tcp shipyard-swarm-agent
  6. 54284534d563 progrium/consul:latest "/bin/start -serve..." 4 hours ago Up 4 hours 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 53/tcp, 53/udp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp consul
  1. docker@swarm-01:~$ docker run -d --restart=always --name shipyard-controller --link shipyard-rethinkdb:rethinkdb --link shipyard-swarm-manager:swarm -p 8080:8080 shipyard/shipyard:latest server -d tcp://swarm:3375
  2. e89ba8237b982605c306c4fd8acd531a7ec4223b64db1609b561b7d4464af748

Check which containers swarm-01 is running now:

  1. docker@swarm-01:~$ docker ps -a
  2. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  3. e89ba8237b98 shipyard/shipyard:latest "/bin/controller s..." 13 minutes ago Up 13 minutes 0.0.0.0:8080->8080/tcp shipyard-controller
  4. 87392296efa4 swarm:latest "/swarm manage --h..." 19 minutes ago Up 19 minutes 2375/tcp, 0.0.0.0:3375->3375/tcp shipyard-swarm-manager
  5. a6613327aaab rethinkdb "rethinkdb --bind all" 25 minutes ago Up 25 minutes 8080/tcp, 28015/tcp, 29015/tcp shipyard-rethinkdb
  6. dc4e8cf5ebae progrium/consul:latest "/bin/start -serve..." 4 hours ago Up 4 hours 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp, 0.0.0.0:8600->53/tcp, 0.0.0.0:8600->53/udp consul

Browse to the IP of the host running shipyard-swarm-manager, 192.168.99.112:8080, and log in with account admin, password shipyard to reach the shipyard management page.

Service discovery with registrator

registrator is a third-party service-discovery component that works off Docker's socket file; in my experience it is very simple to use. Run the following on each of the .112–.114 machines to install registrator:

docker run -d --restart=always --name=registrator --net=host -v /var/run/docker.sock:/tmp/docker.sock gliderlabs/registrator -ip <ip-of-host> consul://localhost:8500

Flag notes

Note: registrator names the services it discovers as "image name + port", so for production use you need a sensible image-naming scheme and must make sure there are no duplicates. (A quick way to check what actually got registered is shown after the three install commands below.)

Install registrator on swarm-01

  1. docker@swarm-01:~$ docker run -d --restart=always --name=registrator --net=host -v /var/
  2. run/docker.sock:/tmp/docker.sock gliderlabs/registrator -ip 192.168.99.112 consul://loca
  3. lhost:8500
  4. 26ed7d6bebadbd269a3442c380f9272e9a25fe118e37571e644129682b72ba68

Install registrator on swarm-02

  1. docker@swarm-02:~$ docker run -d --restart=always --name=registrator --net=host -v /var/run/docker
  2. .sock:/tmp/docker.sock gliderlabs/registrator -ip 192.168.99.113 consul://localhost:8500
  3. cdc5a98d6c0f7633ee41e15c7b5eb45a3243505a7f3c0c08d92f7e85361c5045

Install registrator on swarm-03

  1. docker@swarm-03:~$ docker run -d --restart=always --name=registrator --net=host -v /var/run/d
  2. ocker.sock:/tmp/docker.sock gliderlabs/registrator -ip 192.168.99.114 consul://localhost:8500
  3. bc1f5fe318cf1c184843cc09e85e8b4ac1f997163d020f5a6525465e6fbc3f79
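
Once registrator is running on all three machines, consul's catalog API gives a quick view of what has been registered (the service name nginx-80 below is only an example of the image-name-plus-port convention):

```bash
# List every service the cluster knows about
curl -s http://localhost:8500/v1/catalog/services

# Show which nodes a particular service is running on
curl -s http://localhost:8500/v1/catalog/service/nginx-80
```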

Where the super-awesome setup went wrong

In theory the shipyard dashboard should now be reachable. In practice, 192.168.99.112:8080 does show the shipyard UI, but with no actual data: there are no containers, there are no images...

And the swarm manager cannot reach its own agents:

  1. docker@swarm-01:~$ docker -H 127.0.0.1:3375 info
  2. Containers: 0
  3. Running: 0
  4. Paused: 0
  5. Stopped: 0
  6. Images: 0
  7. Server Version: swarm/1.2.6
  8. Role: primary
  9. Strategy: spread
  10. Filters: health, port, containerslots, dependency, affinity, constraint, whitelist
  11. Nodes: 3
  12. (unknown): 192.168.99.113:2375
  13. ID:
  14. Status: Pending
  15. Containers: 0
  16. Reserved CPUs: 0 / 0
  17. Reserved Memory: 0 B / 0 B
  18. Labels:
  19. Error: Cannot connect to the Docker daemon at tcp://192.168.99.113:2375. Is the docker daemon running?
  20. UpdatedAt: 2017-03-03T02:11:47Z
  21. ServerVersion:
  22. (unknown): 192.168.99.114:2375
  23. ID:
  24. Status: Pending
  25. Containers: 0
  26. Reserved CPUs: 0 / 0
  27. Reserved Memory: 0 B / 0 B
  28. Labels:
  29. Error: Cannot connect to the Docker daemon at tcp://192.168.99.114:2375. Is the docker daemon running?
  30. UpdatedAt: 2017-03-03T02:11:47Z
  31. ServerVersion:
  32. .....
  33. ......

Debugging

First, in the reference setup, whether on CentOS or Ubuntu, the VM's Docker daemon is switched from listening only on -H unix:///var/run/docker.sock to DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock", followed by a restart with /etc/init.d/docker restart, so that the swarm manager can reach its agents on port 2375. But in our environment the daemon listens on 2376. Let's try changing it ourselves:

  1. root@swarm-03:/home/docker# vi /etc/default/docker
  2. root@swarm-03:/home/docker# /etc/init.d/docker start
  3. Need TLS certs for swarm-03,127.0.0.1,10.0.2.15,192.168.99.114
  4. -------------------
  5. root@swarm-03:/home/docker# docker ps -a
  6. CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
  7. bc1f5fe318cf gliderlabs/registrator "/bin/registrator ..." 11 hours ago Up 2 seconds registrator
  8. 8ab64ffc1d39 swarm:latest "/swarm join --add..." 11 hours ago Up 2 seconds 2375/tcp shipyard-swarm-agent
  9. 54284534d563 progrium/consul:latest "/bin/start -serve..." 15 hours ago Up 2 seconds 0.0.0.0:8300-8302->8300-8302/tcp, 0.0.0.0:8400->8400/tcp, 53/tcp, 53/udp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8301-8302->8301-8302/udp consul
  10. root@swarm-03:/home/docker# ps -ef | grep docker
  11. root 10352 1 1 01:14 pts/0 00:00:00 /usr/local/bin/dockerd -D -g /var/lib/docker -H unix:// -H tcp://0.0.0.0:2376 --label provider=virtualbox --insecure-registry 202.117.16.167:5000 --registry-mirror http://c71b9b35.m.daocloud.io --tlsverify --tlscacert=/var/lib/boot2docker/ca.pem --tlscert=/var/lib/boot2docker/server.pem --tlskey=/var/lib/boot2docker/server-key.pem -s aufs
  12. .......

No effect: the daemon still listens on 2376 while our swarm agent still advertises 2375, so the two keep missing each other and the service can never come up. The swarm manager waits expectantly in room 2375, while the landlord only answers on 2376 and never passes the message along...

My reading of it (a surface-level analysis, I don't fully understand the internals): for a VM created by docker-machine, boot2docker.iso already exposes -H 0.0.0.0:2376 as the port the host's docker client uses to reach the daemon inside the VM, and docker-machine set up TLS for port 2376 when the VM was created. Forcing the remote port to 2375 afterwards therefore fails.

For reference, here is docker-machine's VM-creation output again:

  1. Running pre-create checks...
  2. Creating machine...
  3. (test) Copying /home/amy/.docker/machine/cache/boot2docker.iso to /home/amy/.docker/machine/machines/test/boot2docker.iso...
  4. (test) Creating VirtualBox VM...
  5. (test) Creating SSH key...
  6. (test) Starting the VM...
  7. (test) Check network to re-create if needed...
  8. (test) Waiting for an IP...
  9. Waiting for machine to be running, this may take a few minutes...
  10. Detecting operating system of created instance...
  11. Waiting for SSH to be available...
  12. Detecting the provisioner...
  13. Provisioning with boot2docker...
  14. Copying certs to the local machine directory...
  15. Copying certs to the remote machine...
  16. Setting Docker configuration on the remote daemon...
  17. Checking connection to Docker...
  18. Docker is up and running!
  19. To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env test

I also tried binding host port 2376 to 2375 on the swarm manager and agents, which ends in docker: Error response from daemon: driver failed programming external connectivity on endpoint shipyard-swarm-manager : Bind for 0.0.0.0:2376 failed: port is already allocated.

The fix

So don't build the swarm cluster from the swarm image bundled with shipyard. Instead, once the kv store is up, create the nodes directly with docker-machine create --swarm ..., which both avoids the daemon port-binding problem and sets up TLS between the swarm nodes for you.
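
A sketch of what that could look like, assuming the consul kv store from above at 192.168.99.112:8500 (the machine names are arbitrary):

```bash
# Swarm master, registered against the existing consul kv store
# (cluster-advertise on eth1 assumes the default VirtualBox host-only interface)
docker-machine create -d virtualbox \
  --swarm --swarm-master \
  --swarm-discovery consul://192.168.99.112:8500 \
  --engine-opt cluster-store=consul://192.168.99.112:8500 \
  --engine-opt cluster-advertise=eth1:2376 \
  swarm-master

# A swarm node joining the same discovery backend
docker-machine create -d virtualbox \
  --swarm \
  --swarm-discovery consul://192.168.99.112:8500 \
  --engine-opt cluster-store=consul://192.168.99.112:8500 \
  --engine-opt cluster-advertise=eth1:2376 \
  swarm-node-01
```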

Moving the super-awesome setup to physical machines

And that is indeed what it came to: the setup had to move to a physical environment.

Host .164: consul server, RethinkDB, swarm-shipyard-manager, registrator
Host .167: consul agent, swarm-shipyard-agent, registrator
Host .168: consul agent, swarm-shipyard-agent, registrator

With this environment in place, just repeat the steps described earlier.

Browsing to 202.117.16.164:8080 shows something like the following (my own screenshot would not upload, so here is one found online):
(screenshot placeholder)

Once registrator is installed, consul automatically discovers the container services started on every machine.

Through this platform you can see every container running on the three hosts and restart, pause, deploy, and scale them through the UI. It is perfectly adequate for development and test environments; for production you would need to think about the following:
1. High availability of the shipyard management platform itself.
2. Resource isolation between services. For example, with 10 slave hosts in the cluster, I may want the 10 containers of a particular service to land on only 3 of those hosts; this cluster cannot do that.
3. When upgrading containers in the cluster, there is no way to choose an upgrade strategy.

Note: to improve rethinkdb's availability you can build it into a cluster if you have the machines for it; see https://github.com/dockerfile/rethinkdb

Installing nginx + consul-template on the host

Because consul-template dynamically rewrites the nginx configuration file, consul-template and nginx must be installed on the same machine.

Copy the default nginx.conf to ~/nginx_web.ctmpl and edit the upstream block:

  1. # vim /root/nginx_web.ctmpl
  2. worker_processes 1;
  3. events {
  4.     worker_connections 1024;
  5. }
  6. http {
  7.     include mime.types;
  8.     default_type application/octet-stream;
  9.     sendfile on;
  10.     keepalive_timeout 65;
  11.     upstream app {
  12.         {{range $key, $pairs := tree "hello/" | byKey}}{{range $serverid, $pair := $pairs}}
  13.         server {{.Value}} weight=1;{{end}}{{end}}
  14.     }
  15.     server {
  16.         listen 80;
  17.         server_name localhost;
  18.         location / {
  19.             proxy_pass http://app;
  20.         }
  21.     }
  22. }

The binary can be downloaded from https://releases.hashicorp.com/consul-template/; version 0.12.0 is used as the example here.

Download consul-template_0.12.0_linux_amd64.zip, unzip it, and copy the binary into the path:
cp consul-template /usr/local/bin/

Start the service either from a consul-template configuration file or directly from the command line.

If you wrote a config file check_nginx.conf, start it with ./consul-template --config check_nginx.conf.

  1. check_nginx.conf:
  2. consul = "10.2.0.80:8500"
  3. log_level = "warn"
  4. # Only needed if consul is configured with an ACL token; without it no data is returned
  5. token = "f37ab43b-4d2de-aa283-6effsdf507a9eb71d1b"
  6. template {
  7.   source = "~/nginx_web.ctmpl"
  8.   destination = "/usr/local/nginx/conf/nginx.conf"
  9.   command = "/usr/local/nginx/sbin/nginx -t && /usr/local/nginx/sbin/nginx -s reload"
  10. }

Or start it directly with consul-template -consul 192.168.0.149:8500 -template ~/nginx_web.ctmpl:/usr/local/nginx/conf/nginx.conf:"/usr/local/nginx/sbin/nginx -s reload".
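
A hypothetical way to test the template: write a couple of backend addresses under the hello/ prefix that the template watches, and consul-template should re-render nginx.conf and reload nginx (the addresses below are made up):

```bash
curl -X PUT -d '192.168.0.150:8080' http://192.168.0.149:8500/v1/kv/hello/web1
curl -X PUT -d '192.168.0.151:8080' http://192.168.0.149:8500/v1/kv/hello/web2

# The rendered upstream block can then be inspected with:
grep -A4 'upstream app' /usr/local/nginx/conf/nginx.conf
```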

Starting influxdb [on .168]

Three components (cadvisor, influxdb, grafana) are used to build real-time monitoring of the containers and their hosts. One important requirement: containers can be started and destroyed at any moment, so the monitoring has to keep up — a container should be picked up automatically when it starts and dropped automatically when it is destroyed, without much manual intervention.
A quick word on each component: cadvisor is the tool Google uses to monitor its own infrastructure; it captures real-time information not only about the Docker containers but also about the host the cadvisor container runs on. However, cadvisor only exposes real-time data and does not store it, so influxdb is used to persist the metrics, and grafana turns what influxdb stores into clear, well-presented dashboards.

  1. docker pull tutum/influxdb:0.10
  2. [168] docker run -d -p 8083:8083 -p 8086:8086 --expose 8090 --expose 8099 -v /opt/test/data/influxdb:/data --name influxsrv tutum/influxdb:0.10
  1. cloud@xiaohei10t:~$ docker exec -ti influxsrv /bin/bash
  2. root@d141d8cf7b47:/# influx
  3. Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
  4. Connected to http://localhost:8086 version 0.10.3
  5. InfluxDB shell 0.10.3
  6. > CREATE DATABASE cadvisor
  7. > SHOW DATABASES
  8. name: databases
  9. ---------------
  10. name
  11. _internal
  12. cadvisor
  13. > use cadvisor
  14. Using database cadvisor
  15. > CREATE USER "root" WITH PASSWORD 'root' WITH ALL PRIVILEGES
  16. > show users
  17. user admin
  18. root true
  19. > exit

Check 202.117.16.168:8083.

cadvisor

  1. cloud@xiaohei10t:~$ docker run -d --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro --publish=8088:8080 -h $HOSTNAME --detach=true --name=cadvisor google/cadvisor:latest -docker_only -storage_driver=influxdb -storage_driver_db=cadvisor -storage_driver_host=202.117.16.168:8086
  2. 1a243d164568a170a6f1506be413c05d300a01130dec8b2dc1423f814cba296f

The container port is mapped to 8088 because port 8080 on the host is already taken, avoiding the conflict.
Check it at 202.117.16.168:8088.

Deploying grafana [on .168]

  1. docker run -d \
  2. -p 4000:3000 \
  3. -e INFLUXDB_HOST=202.117.16.168 \
  4. -e INFLUXDB_PORT=8086 \
  5. -e INFLUXDB_NAME=cadvisor \
  6. -e INFLUXDB_USER=root \
  7. -e INFLUXDB_PASS=root \
  8. --link influxsrv:influxsrv \
  9. --name grafana grafana/grafana

Port 3000 was already taken on my .168 host; here is how to find the conflict:

  1. cloud@xiaohei10t:~$ lsof -i :3000
  2. COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
  3. gogs 4093 cloud 9u IPv6 924324 0t0 TCP *:3000 (LISTEN)
  4. gogs 4093 cloud 12u IPv6 4458921 0t0 TCP 16h168.xjtu.edu.cn:3000->16h19.xjtu.edu.cn:12558 (ESTABLISHED)
  5. gogs 4093 cloud 17u IPv6 4457008 0t0 TCP 16h168.xjtu.edu.cn:3000->16h19.xjtu.edu.cn:12567 (ESTABLISHED)
  6. cloud@xiaohei10t:~$ ps -aux | grep 4093
  7. cloud 4093 0.0 0.4 434924 34344 pts/4 Sl+ 2016 26:37 ./gogs web
  8. cloud 30590 0.0 0.0 14684 1012 pts/0 S+ 09:46 0:00 grep --color=auto 4093

Opening 202.117.16.168:3000 shows it is the gogs source-control dashboard, so grafana is moved to port 4000 instead. The default account and password are both admin; follow the usual guide to set up the dashboards.

Log collection with graylog

  1. version: '2'
  2. services:
  3.   mongo:
  4.     image: "mongo:3"
  5.     volumes:
  6.       - /opt/graylog/data/mongo:/data/db
  7.   elasticsearch:
  8.     image: "elasticsearch:2.3"
  9.     command: "elasticsearch -Des.cluster.name='graylog'"
  10.     volumes:
  11.       - /opt/graylog/data/elasticsearch:/usr/share/elasticsearch/data
  12.   graylog:
  13.     image: graylog2/server:2.0.3-2
  14.     volumes:
  15.       - /opt/graylog/data/journal:/usr/share/graylog/data/journal
  16.       #- /opt/graylog/config:/usr/share/graylog/data/config
  17.     environment:
  18.       GRAYLOG_PASSWORD_SECRET: somepasswordpepper
  19.       GRAYLOG_ROOT_PASSWORD_SHA2: 4bbdd5a829dba09d7a7ff4c1367be7d36a017b4267d728d31bd264f63debeaa6
  20.       GRAYLOG_REST_TRANSPORT_URI: http://202.117.16.168:12900
  21.     depends_on:
  22.       - mongo
  23.       - elasticsearch
  24.     ports:
  25.       - "9000:9000"
  26.       - "12900:12900"
  27.       - "12201:12201/udp"
  28.       - "12202:12202/udp"

Start it with docker-compose -f docker-compose-5.yml up, browse to 202.117.16.168:9000, and log in with user admin, password graylog.

Assorted swarm notes

  1. Docker 1.12 and later integrate swarmkit and related orchestration into the engine; at the time this note was written the current version was 1.12-rc2.
  2. Swarm is a fairly simple tool Docker released in early December 2014 for managing Docker clusters; it turns a group of Docker hosts into a single virtual host. Swarm exposes the standard Docker API as its front end, so any kind of Docker client (the Go docker client, docker_py, docker, ...) can talk to it directly. Swarm is written almost entirely in Go. Swarm 0.2 was released on Friday, April 17th; compared to 0.1 it added a new scheduling strategy that spreads containers across the available nodes, and support for more Docker commands and cluster drivers.

The Swarm daemon is only a scheduler plus a router; Swarm does not run containers itself. It accepts requests from Docker clients and schedules suitable nodes to run the containers. This means that even if Swarm goes down for some reason, the nodes in the cluster keep running as before, and when Swarm comes back it rebuilds the cluster state.
