@xmruibi 2015-04-13T21:42:37.000000Z

Digital Ocean VMs

cloud_computing


Currently, we have three VMs (listed below).

We need a Hadoop cluster (done) and a MongoDB sharding cluster.

Since the number of VMs is limited, each VM serves both as a Hadoop node and as a MongoDB shard server. I believe this does not affect the Hadoop/MongoDB integration.

MongoDB is already installed on all three VMs.

Next, we have to configure the MongoDB shard cluster. References:
#1. http://www.lanceyan.com/category/tech/mongodb
#2. http://blog.itpub.net/22664653/viewspace-710195

Droplet Name: newcldvm
IP Address: 45.55.188.234
Username: root
Password: CloudComputing3Bosses
Shard1_Main,Shard2_Arbiter,Shard3_Replica

The following two nodes are working right now
Droplet Name: newcldvm7
IP Address: 45.55.186.238
Username: root
Password: CloudComputing3Bosses
Shard2_Main,Shard3_Arbiter,Shard1_Replica

Droplet Name: newcldvm8
IP Address: 104.131.106.22
Username: root
Password: CloudComputing3Bosses
Shard3_Main,Shard1_Arbiter,Shard2_Replica

This project uses distributed, sharded MongoDB as the data storage layer and Hadoop MapReduce as the computing framework.

The system was set up with the following procedures:

Sharded MongoDB Configuration (this part cannot be shown in code, so I document it here.)

The following diagram shows the architecture: three VMs with different ports simulate a real sharding deployment (which would actually need 15 machines).

The following procedure describes how I configured MongoDB on the three remote VMs.

1. Set up the data path, config-file path, and log-file paths on each node, with directories named mongos, config, shard1, shard2, and shard3:

  mkdir -p /data/mongos/log
  sudo chmod -R 777 /data/mongos/log
  mkdir -p /data/config/data
  sudo chmod -R 777 /data/config/data
  mkdir -p /data/config/log
  sudo chmod -R 777 /data/config/log
  mkdir -p /data/shard1/data
  sudo chmod -R 777 /data/shard1/data
  mkdir -p /data/shard1/log
  sudo chmod -R 777 /data/shard1/log
  mkdir -p /data/shard2/data
  sudo chmod -R 777 /data/shard2/data
  mkdir -p /data/shard2/log
  sudo chmod -R 777 /data/shard2/log
  mkdir -p /data/shard3/data
  sudo chmod -R 777 /data/shard3/data
  mkdir -p /data/shard3/log
  sudo chmod -R 777 /data/shard3/log
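The repetitive commands above can be collapsed into a short loop. A sketch: `make_dirs` is a helper name I introduce here, taking the root prefix (`/data` in the steps above) as an argument.

```shell
#!/bin/sh
# Sketch: create the mongos/config/shard directory tree in one loop.
make_dirs() {
  root="$1"
  for d in mongos/log config/data config/log \
           shard1/data shard1/log shard2/data shard2/log \
           shard3/data shard3/log; do
    mkdir -p "$root/$d"
    chmod -R 777 "$root/$d"   # world-writable, as in the manual steps above
  done
}
# On each VM (as root): make_dirs /data
```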

2. Plan the port numbers and adjust the relevant parameters in mongod.conf

  mongod --configsvr --dbpath /data/config/data --port 27019 --logpath /data/config/log/config.log --fork
  mongos --configdb 45.55.188.234:27019,45.55.186.238:27019,104.131.106.22:27019 --port 27017 --logpath /data/mongos/log/mongos.log --fork
  ## Alternatively, these commands migrate an existing config-server database:
  rsync -az /data/configdb mongo-config1.example.net:/data/configdb
  rsync -az /data/configdb mongo-config2.example.net:/data/configdb
  nano /etc/mongod.conf
  ### This is important!
  ### Set bind_ip = 0.0.0.0 to allow remote logins,
  ### and set the default mongod port to 27019.
  ## Otherwise your mongos port (27017) will be blocked
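The mongod.conf change referred to above would look roughly like this (a sketch in the old-style MongoDB 2.x config syntax; your file may differ):

```
# /etc/mongod.conf (excerpt)
bind_ip = 0.0.0.0   # listen on all interfaces so the other VMs can connect
port = 27019        # move mongod off 27017 so mongos can use that port
```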

3. Configure the shard replica sets on each VM

  ### set up shard ports, dbpath, and log path
  mongod --shardsvr --replSet shard1 --port 22001 --dbpath /data/shard1/data --logpath /data/shard1/log/shard1.log --fork --journal --oplogSize 10
  mongod --shardsvr --replSet shard2 --port 22002 --dbpath /data/shard2/data --logpath /data/shard2/log/shard2.log --fork --journal --oplogSize 10
  mongod --shardsvr --replSet shard3 --port 22003 --dbpath /data/shard3/data --logpath /data/shard3/log/shard3.log --fork --journal --oplogSize 10
  # Shard_1 on node 45.55.188.234
  mongo 127.0.0.1:22001
  use admin
  config = { _id:"shard1", members:[
    {_id:0, host:"45.55.188.234:22001"},
    {_id:1, host:"45.55.186.238:22001"},
    {_id:2, host:"104.131.106.22:22001", arbiterOnly:true}
  ]}
  rs.initiate(config);
  # Shard_2 on node 45.55.186.238
  mongo 127.0.0.1:22002
  use admin
  config = { _id:"shard2", members:[
    {_id:0, host:"45.55.186.238:22002"},
    {_id:1, host:"104.131.106.22:22002"},
    {_id:2, host:"45.55.188.234:22002", arbiterOnly:true}
  ]}
  rs.initiate(config);
  # Shard_3 on node 104.131.106.22
  mongo 127.0.0.1:22003
  use admin
  config = { _id:"shard3", members:[
    {_id:0, host:"104.131.106.22:22003"},
    {_id:1, host:"45.55.188.234:22003"},
    {_id:2, host:"45.55.186.238:22003", arbiterOnly:true}
  ]}
  rs.initiate(config);
  ## To change the configuration later, use rs.reconfig(your_para)
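The three configs above are symmetric: each shard's primary, replica, and arbiter rotate across the three hosts. A sketch that regenerates the config documents from the host list (`gen_configs` is a throwaway helper name; hosts and ports as above, purely illustrative):

```shell
#!/bin/sh
# Print the rs.initiate() config document for each shard, rotating the
# primary/replica/arbiter roles across the three hosts as in the setup above.
gen_configs() {
  hosts="45.55.188.234 45.55.186.238 104.131.106.22"
  set -- $hosts
  i=1
  for primary in $hosts; do
    port=$((22000 + i))
    # pick the other two hosts cyclically as replica and arbiter
    case $i in
      1) replica=$2; arbiter=$3 ;;
      2) replica=$3; arbiter=$1 ;;
      3) replica=$1; arbiter=$2 ;;
    esac
    echo "config = { _id:\"shard$i\", members:["
    echo "  {_id:0, host:\"$primary:$port\"},"
    echo "  {_id:1, host:\"$replica:$port\"},"
    echo "  {_id:2, host:\"$arbiter:$port\", arbiterOnly:true}"
    echo "]}"
    i=$((i + 1))
  done
}
gen_configs
```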

4. Add the shards via mongos, on just one of the VMs

There is no master-node concept among mongos instances in MongoDB, so just pick any one of them and register the sharding info there. I also spread the shard primaries across different machines (e.g. the primary of Shard1 on server one, the primary of Shard2 on server two).

  ## no specific address/port needed
  mongo
  use admin
  db.runCommand({addshard : "shard1/45.55.188.234:22001,45.55.186.238:22001,104.131.106.22:22001"});
  db.runCommand({addshard : "shard2/45.55.186.238:22002,104.131.106.22:22002,45.55.188.234:22002"});
  db.runCommand({addshard : "shard3/104.131.106.22:22003,45.55.188.234:22003,45.55.186.238:22003"});
  ## if you need to undo a previous setting:
  db.runCommand( { removeShard: "shard1" } )
  # Test:
  db.runCommand( { enablesharding : "stock" });
  db.runCommand( { shardcollection : "stock.quotes", key : {"_id": 1} })
  for (var i = 1; i <= 100000; i++) db.table1.save({id:i, "test1":"testval1"});
  use admin
  db.addUser('test','test')
  db.auth('test','test')
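Each addshard string above follows the pattern replSetName/member1,member2,member3, listing the replica-set members registered in step 3. A throwaway sketch that assembles these strings (`build_shard_str` is a name I introduce; hosts and ports as above):

```shell
#!/bin/sh
# Build the "shardN/host:port,host:port,host:port" connection string
# for a given shard number, members ordered primary, replica, arbiter.
build_shard_str() {
  n=$1
  port=$((22000 + n))
  case $n in
    1) order="45.55.188.234 45.55.186.238 104.131.106.22" ;;
    2) order="45.55.186.238 104.131.106.22 45.55.188.234" ;;
    3) order="104.131.106.22 45.55.188.234 45.55.186.238" ;;
  esac
  out="shard$n/"
  sep=""
  for h in $order; do
    out="$out$sep$h:$port"
    sep=","
  done
  echo "$out"
}
build_shard_str 1
```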

5. Some commands to check database status

  db.stats();
  show databases
  db.dropDatabase()
  db.printShardingStatus()

6. Misc.

  ## Sometimes exporting the jar fails; strip the offending signature file:
  zip -d stockCrawler.jar META-INF/LICENSE
  jar tvf stockCrawler.jar | grep -i license
  ## HDFS manipulation
  hadoop fs -ls
  hadoop fs -mkdir /user/${adminName}
  hadoop fs -touchz test
  hdfs dfs -copyFromLocal ${fileName}
  hdfs dfs -cat ${fileName}
  hadoop fs -rmr output
  hadoop jar stockCrawler.jar
  ## A query example:
  db.quotes.find({'historical_quotes.date':'2015-04-10'})