@xmruibi
2015-04-13T21:42:37.000000Z
cloud_computing
Currently we have three VMs (listed below).
We need a Hadoop cluster (done) and a MongoDB sharded cluster.
Because the number of VMs is limited, each VM serves as both a Hadoop node and a MongoDB shard server. I think this should not matter for the Hadoop and MongoDB integration?
MongoDB is already installed on all three VMs.
Next we have to configure the MongoDB sharded cluster:
#1. http://www.lanceyan.com/category/tech/mongodb
#2. http://blog.itpub.net/22664653/viewspace-710195
Droplet Name: newcldvm
IP Address: 45.55.188.234
Username: root
Password: CloudComputing3Bosses
Shard1_Main,Shard2_Arbiter,Shard3_Replica
The following two nodes are working right now
Droplet Name: newcldvm7
IP Address: 45.55.186.238
Username: root
Password: CloudComputing3Bosses
Shard2_Main,Shard3_Arbiter,Shard1_Replica
Droplet Name: newcldvm8
IP Address: 104.131.106.22
Username: root
Password: CloudComputing3Bosses
Shard3_Main,Shard1_Arbiter,Shard2_Replica
The following graph shows the architecture: the three VMs use different ports to simulate a real sharding layout (which would actually need 15 machines).
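The same layout can be written down as a small script. This is only a sketch: the rotation rule (shard N has its main member on host N, its replica on the next host and its arbiter on the one after) is inferred from the replica set definitions later in this note, and `host_at` and `shard_layout` are names made up for illustration.

```shell
# Hosts in the order used in this note.
HOST1=45.55.188.234
HOST2=45.55.186.238
HOST3=104.131.106.22

# host_at N: print the host for a 0-based index, wrapping around after the third.
host_at() {
  case $(( $1 % 3 )) in
    0) echo "$HOST1" ;;
    1) echo "$HOST2" ;;
    2) echo "$HOST3" ;;
  esac
}

# shard_layout S: print the main, replica and arbiter address of shard S (1..3).
# Shard S listens on port 2200S on every host.
shard_layout() {
  port=$(( 22000 + $1 ))
  echo "shard$1 main $(host_at $(( $1 - 1 ))):$port"
  echo "shard$1 replica $(host_at "$1"):$port"
  echo "shard$1 arbiter $(host_at $(( $1 + 1 ))):$port"
}

for s in 1 2 3; do shard_layout "$s"; done
```

For shard 3 this prints 104.131.106.22 as main, 45.55.188.234 as replica and 45.55.186.238 as arbiter, all on port 22003.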
The following procedure is how I configured MongoDB on the three remote VMs.
mkdir -p /data/mongos/log
sudo chmod -R 777 /data/mongos/log
mkdir -p /data/config/data
sudo chmod -R 777 /data/config/data
mkdir -p /data/config/log
sudo chmod -R 777 /data/config/log
mkdir -p /data/shard1/data
sudo chmod -R 777 /data/shard1/data
mkdir -p /data/shard1/log
sudo chmod -R 777 /data/shard1/log
mkdir -p /data/shard2/data
sudo chmod -R 777 /data/shard2/data
mkdir -p /data/shard2/log
sudo chmod -R 777 /data/shard2/log
mkdir -p /data/shard3/data
sudo chmod -R 777 /data/shard3/data
mkdir -p /data/shard3/log
sudo chmod -R 777 /data/shard3/log
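Since the same directories are needed on every VM, the commands above could be wrapped in a loop. A minimal sketch, assuming it is run by a user who may write to the base directory (root in this note); `make_mongo_dirs` is a hypothetical helper name, not part of MongoDB.

```shell
# make_mongo_dirs BASE: create the mongos, config and shard directories under BASE
# and open up their permissions, as the individual commands above do.
make_mongo_dirs() {
  base=$1
  for dir in mongos/log config/data config/log \
             shard1/data shard1/log \
             shard2/data shard2/log \
             shard3/data shard3/log; do
    mkdir -p "$base/$dir"
    chmod -R 777 "$base/$dir"
  done
}
```

Calling `make_mongo_dirs /data` on each VM gives the layout used in the rest of this note.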
mongod --configsvr --dbpath /data/config/data --port 27019 --logpath /data/config/log/config.log --fork
mongos --configdb 45.55.188.234:27019,45.55.186.238:27019,104.131.106.22:27019 --port 27017 --logpath /data/mongos/log/mongos.log --fork
## or: these commands can migrate the config server data to the other hosts
rsync -az /data/configdb mongo-config1.example.net:/data/configdb
rsync -az /data/configdb mongo-config2.example.net:/data/configdb
nano /etc/mongod.conf
### This is important!!
## set bind_ip = 0.0.0.0 to allow remote login
## set the default mongod port to 27019
## ! Otherwise your mongos port (27017) will be blocked
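The two settings mentioned above (bind address and port) can also be changed without opening an editor. The sketch below assumes the older `key = value` style of mongod.conf; newer MongoDB packages ship a YAML file, for which this will not work. `set_conf` is a hypothetical helper written for this note.

```shell
# set_conf KEY VALUE FILE: replace an existing (possibly commented-out)
# "KEY = ..." line in FILE, or append the setting if no such line exists.
# sed keeps a backup of the original file as FILE.bak.
set_conf() {
  key=$1
  value=$2
  file=$3
  if grep -q "^[# ]*$key *=" "$file"; then
    sed -i.bak "s|^[# ]*$key *=.*|$key = $value|" "$file"
  else
    printf '%s = %s\n' "$key" "$value" >> "$file"
  fi
}
```

With that helper, `set_conf bind_ip 0.0.0.0 /etc/mongod.conf` followed by `set_conf port 27019 /etc/mongod.conf` would apply both changes; the default mongod service still has to be restarted afterwards.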
### set up shards ports and dbpath and log path
mongod --shardsvr --replSet shard1 --port 22001 --dbpath /data/shard1/data --logpath /data/shard1/log/shard1.log --fork --journal --oplogSize 10
mongod --shardsvr --replSet shard2 --port 22002 --dbpath /data/shard2/data --logpath /data/shard2/log/shard2.log --fork --journal --oplogSize 10
mongod --shardsvr --replSet shard3 --port 22003 --dbpath /data/shard3/data --logpath /data/shard3/log/shard3.log --fork --journal --oplogSize 10
#Shard_1 in Node(45.55.188.234)
mongo 127.0.0.1:22001
use admin
config = { _id:"shard1", members:[{_id:0,host:"45.55.188.234:22001"},{_id:1,host:"45.55.186.238:22001"},{_id:2,host:"104.131.106.22:22001",arbiterOnly:true}]}
rs.initiate(config);
#Shard_2 in Node(45.55.186.238)
mongo 127.0.0.1:22002
use admin
config = { _id:"shard2", members:[{_id:0,host:"45.55.186.238:22002"},{_id:1,host:"104.131.106.22:22002"},{_id:2,host:"45.55.188.234:22002",arbiterOnly:true}]}
rs.initiate(config);
#Shard_3 in Node(104.131.106.22)
mongo 127.0.0.1:22003
use admin
config = { _id:"shard3", members:[{_id:0,host:"104.131.106.22:22003"},{_id:1,host:"45.55.188.234:22003"},{_id:2,host:"45.55.186.238:22003",arbiterOnly:true}]}
rs.initiate(config);
## if you need to reconfig, please use Cmd( rs.reconfig(your_para) )
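The three replica set definitions above differ only in the set name, the port and the order of the hosts, so the `rs.initiate` call can be generated instead of typed three times. A sketch under that assumption; `rs_config` is a made-up helper that only prints the JavaScript and does not contact MongoDB.

```shell
# rs_config NAME PORT MAIN REPLICA ARBITER
# Print an rs.initiate(...) call for one shard: two data-bearing members
# (MAIN, REPLICA) and one arbiter, all listening on PORT.
rs_config() {
  printf 'rs.initiate({_id:"%s",members:[{_id:0,host:"%s:%s"},{_id:1,host:"%s:%s"},{_id:2,host:"%s:%s",arbiterOnly:true}]})\n' \
    "$1" "$3" "$2" "$4" "$2" "$5" "$2"
}

# Print the call for shard1 as configured in this note.
rs_config shard1 22001 45.55.188.234 45.55.186.238 104.131.106.22
```

The printed line could then be pasted into the mongo shell on the shard's main node, or passed along the lines of `mongo 127.0.0.1:22001/admin --eval "$(rs_config shard1 22001 ...)"`, which I have not run against this cluster.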
There seems to be no master-node concept in MongoDB, so just pick any one of the nodes. Configure the sharding information in mongos. I also placed the primaries of the shards on different machines (e.g. the Shard1 primary on server one, the Shard2 primary on server two).
## no specific addr/port needed: this connects to the local mongos on 27017
mongo
use admin
db.runCommand({addshard : "shard1/45.55.188.234:22001,45.55.186.238:22001,104.131.106.22:22001"});
db.runCommand({addshard : "shard2/45.55.186.238:22002,104.131.106.22:22002,45.55.188.234:22002"});
db.runCommand({addshard : "shard3/104.131.106.22:22003,45.55.188.234:22003,45.55.186.238:22003"});
## if you need to reset your previous setting
##db.runCommand( { removeShard: "shard1" } )
## Test:
db.runCommand( { enablesharding : "stock" });
db.runCommand( { shardcollection : "stock.quotes", key : {"_id": 1} })
for (var i = 1; i <= 100000; i++) db.table1.save({id:i,"test1":"testval1"});
use admin
db.addUser('test','test')
db.auth('test','test')
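The argument to `addshard` has the form `setName/host:port,host:port,...`. As a sketch of how such a string is put together, the hypothetical helper `shard_uri` below builds it from the replica set name, the port and the member hosts; it only prints the string and runs nothing against mongos.

```shell
# shard_uri NAME PORT HOST...
# Print "NAME/host:port,host:port,..." for use as the addshard argument.
shard_uri() {
  name=$1
  port=$2
  shift 2
  members=""
  for host in "$@"; do
    # Prefix a comma only when members already holds an entry.
    members="${members:+$members,}$host:$port"
  done
  echo "$name/$members"
}

shard_uri shard1 22001 45.55.188.234 45.55.186.238 104.131.106.22
```

For shard1 this reproduces the string used above: `shard1/45.55.188.234:22001,45.55.186.238:22001,104.131.106.22:22001`.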
db.stats();
show databases
db.dropDatabase()
db.printShardingStatus()
## Sometimes the exported jar fails to run; remove the LICENSE entry and check what is left
zip -d stockCrawler.jar META-INF/LICENSE
jar tvf stockCrawler.jar | grep -i license
## HDFS manipulation
hadoop fs -ls
hadoop fs -mkdir /user/${adminName}
hadoop fs -touchz test
hdfs dfs -copyFromLocal ${fileName}
hdfs dfs -cat ${fileName}
hadoop fs -rmr output
hadoop jar stockCrawler.jar
## Some query example:
db.quotes.find({'historical_quotes.date':'2015-04-10'})