@Sarah
2015-12-22T19:51:04.000000Z
NOSQL
Not Only SQL
1 Cassandra
2 Neo4j
3 data locality: at the heart of every data store
Only normalized tables are accepted
High normalization leads to low performance
Limited scalability
Slow
Needs to be consistent
On Facebook, adding a # (hashtag) links to a specific topic: built with NoSQL
Storing movies
social media network
agile development
SaaS [NoSQL] - DynamoDB (by Amazon)
# NoSQL characteristics
- schema free
- no joins
- map-reduce style programming
- generally prefers availability over consistency
c consistency
a availability
p partition tolerance (cannot be skipped)
In theoretical computer science, the CAP theorem (also known as Brewer's theorem) states that a distributed computing system cannot simultaneously provide all three of the following guarantees:
- Consistency: every node sees the same, most recent copy of the data
- Availability: data updates remain highly available
- Partition tolerance: in practice, a partition amounts to a time limit on communication; if the system cannot reach consistency within that limit, a partition has occurred and the current operation must choose between C and A
According to the theorem, a distributed system can satisfy at most two of the three properties. The simplest way to understand CAP is to imagine two nodes on opposite sides of a partition. Allowing at least one node to update its state makes the data inconsistent, losing C. Making the node on one side unavailable to preserve consistency loses A. Only if the two nodes can communicate can both C and A be preserved, which gives up P.
RDBMS: C + P, no A
NoSQL: A + P, no C
B basically available: the data stays available even when it is not consistent
S soft state: the state is not guaranteed to be consistent at all times
E eventual consistency
timestamps
A timestamp is a sequence of characters or encoded information identifying when a certain event occurred, usually giving date and time of day, sometimes accurate to a small fraction of a second. The term derives from rubber stamps used in offices to stamp the current date, and sometimes time, in ink on paper documents, to record when the document was received. Common examples of this type of timestamp are a postmark on a letter or the "in" and "out" times on a time card.
optimistic locking
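A minimal sketch of optimistic locking in the mongo shell, assuming a hypothetical orders collection that carries a version field:
// read the document and remember the version we saw
var doc = db.orders.findOne({_id: 1});
// write back only if nobody bumped the version in the meantime
var res = db.orders.update(
  {_id: 1, version: doc.version},
  {$set: {status: "shipped"}, $inc: {version: 1}}
);
// res.nModified == 0 means another writer won the race, so re-read and retry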
Document - JSON - BSON (JSON in binary format)
Document Database
A row with name = sue and age = 26 stored as a document:
{
  name: "sue",
  age: 26,
  status: "A",
  groups: ["news", "sports"]
}
key-value
JSON examples:
{
  "subjects": ["Maths", "English"]
}
{
  "address": {
    "staddress": "Sec.12A",
    "city": "Dallas",
    "state": "Texas",
    "zipcode": "13409"
  }
}
{"employees":[
{"firstName":"John", "lastName":"Doe"},
{"firstName":"Anna", "lastName":"Smith"},
{"firstName":"Peter", "lastName":"Jones"}
]}
--storageEngine = inMemory | mmapv1
MongoDB vs. SQL terms:
- a database contains collections (≈ tables)
- a document ≈ a row
- a field (key-value pair) ≈ a column
show dbs
use test
db.restaurants.find()
use test
db.createCollection("restaurants")
db.restaurants.insert(
  {
    "address": {
      "street": "2 Avenue",
      "zipcode": "10075",
      "building": "1480",
      "coord": [ -73.9557413, 40.7720266 ]
    },
    "borough": "Manhattan",
    "cuisine": "Italian",
    "grades": [
      { "date": ISODate("2014-10-01T00:00:00Z"), "grade": "A", "score": 11 },
      { "date": ISODate("2014-01-16T00:00:00Z"), "grade": "B", "score": 17 }
    ],
    "name": "Vella",
    "restaurant_id": "41704620"
  }
)
https://raw.githubusercontent.com/mongodb/docs-assets/primer-dataset/dataset.json
ayush.json: the dataset file, in JSON format
run the import from the server/3.2/bin folder
--file dataset.json
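A sketch of the import command, assuming the dataset was saved as dataset.json next to the binaries in server/3.2/bin:
mongoimport --db test --collection restaurants --drop --file dataset.json
The --drop flag empties the restaurants collection before loading the file.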
cursors
cursor batches: cursor.next()
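For example, iterating a cursor in the shell (the users query is just an illustration); the shell fetches documents from the server in batches behind the scenes:
var cursor = db.users.find({age: {$gt: 18}});
while (cursor.hasNext()) {
  printjson(cursor.next());   // pulls the next document, fetching a new batch when needed
}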
db.users.find({age:{$gt:18}}).sort({age:1})
// $gt: greater than
collection + query criteria + modifiers (projection, sort)
db.restaurants.findOne({"borough":"Manhattan"},{name:1,_id:0});
db.restaurants.find({"grade.grade":"A"})(name:1,_id)
ObjectId: a unique 12-byte value. Advantage: no separate primary key is needed.
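The first 4 bytes of an ObjectId are a creation timestamp, which is also why _id can be used to sort documents roughly by insertion time:
var id = ObjectId();                     // e.g. ObjectId("567a1f0c...")
id.getTimestamp();                       // returns the embedded creation time as an ISODate
db.restaurants.find().sort({_id: 1});    // oldest documents first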
data modelling: either approach works
- embedded
- reference
db.users.update(
  {age: {$gt: 18}},
  {$set: {status: "A"}},
  {multi: true}
)
db.users.update(
{"age":{$gt:25}},{$set:{status:"C"}},{multi:true})
AWS S3
shift + } : go back to the previous step
ayush.gaur@utdallas.edu
can be used to sort the data
http://blog.csdn.net/youmoo/article/details/8898662
NoSQL database types:
- document: MongoDB
- key-value: Redis
- column: Cassandra, HBase
- graph: Neo4j
SQL limitation: horizontal scaling
MongoDB storage engines:
1 WiredTiger
2 In-Memory
3 Encrypted
4 MMAPv1
_id: the primary key in MongoDB
Limitation of embedding: as data is added, the single document keeps growing, because it is stored as one unit and cannot be split into parts.
Which structure to use depends on the relationship type (see the sketch below):
1:M: use embedded documents
M:M: use references (relational style). Why not embedding? Because each related document would be embedded again and again, duplicating data;
with references you can store something like a foreign key.
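A small sketch of the two styles, with made-up user and group documents:
// embedded (1:M): the addresses live inside the user document
{ _id: "sue", name: "sue",
  addresses: [ {city: "Dallas"}, {city: "Austin"} ] }
// referenced (M:M): each group stores the users' _id values, like foreign keys
{ _id: "sue", name: "sue" }
{ _id: "news", members: ["sue", "john"] }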
db.createCollection()
db.collection_name.find({$or: [{key1: "v1"}, {key2: "v2"}]})
db.restaurants.find({$or: [{"b": "M"}, {"c": "A"}]})
secondary indexes
e.g. an index on an embedded field such as name.FName
range-based query operations
B+ tree: the efficient index used by SQL databases; supports range-based queries
Hash based: faster index lookups
db.restaurants.find({"cusine";"Bakery"},{_id:0,cuisine:1})
db.restaurants.createIndex({"address.zipcode": 1})
not the _id (a secondary index is on a field other than _id)
db.factory.createIndex({ttt:1})
{unique:true}
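For example, a unique index passes that option as the second argument of createIndex (the email field is hypothetical):
db.users.createIndex({email: 1}, {unique: true});   // inserts with a duplicate email are rejected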
db.inventory.find({type: "food", item: /^c/})
Overcoming server failure (replication):
primary: the master node
secondary: the slave nodes
arbiter: helps decide which secondary becomes the primary when the primary fails
when the primary fails, the secondaries vote on which one becomes the new primary
all the write requests are only sent to the primary node
read requests also go to the primary by default, which makes the reads consistent
the data is synchronized across the different servers
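A minimal replica-set sketch in the shell; the set name rs0 and the host names are assumptions:
rs.initiate({
  _id: "rs0",
  members: [
    {_id: 0, host: "mongo1:27017"},                      // primary candidate
    {_id: 1, host: "mongo2:27017"},                      // secondary
    {_id: 2, host: "mongo3:27017", arbiterOnly: true}    // arbiter: votes but holds no data
  ]
});
rs.status();   // shows which member is currently the primary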
sharding: each piece of data is local to a particular machine, which increases data locality
shard key: can be a single field or a compound key, but it must be an indexed key
- range-based partitioning
- hash-based partitioning: the shard-key values are passed through a hash function that returns integers, and documents are placed by those hash values
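A sketch of turning on hash-based sharding for a collection (database and collection names are illustrative):
sh.enableSharding("test");                           // allow collections in the test database to be sharded
sh.shardCollection("test.users", {_id: "hashed"});   // hash the shard key so documents spread evenly across shards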
Hadoop:
- HDFS: storage
- MR (MapReduce): processing
mapper: converts the data
the code flows to the data (move the computation to where the data is)
reducer
input to the map phase, e.g. the record (0, "this is mongo class"): the key is the offset in the file, the value is the line of text
the mapper emits every word in the value as a key with the count 1 (not 0)
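A word-count sketch using MongoDB's built-in mapReduce, assuming a hypothetical lines collection whose documents have a text field:
db.lines.mapReduce(
  function () {                                       // mapper: emit (word, 1) for every word in the line
    this.text.split(" ").forEach(function (w) { emit(w, 1); });
  },
  function (key, values) {                            // reducer: add up the 1s for one word
    return Array.sum(values);
  },
  {out: "word_counts"}
);
db.word_counts.find();   // e.g. { _id: "mongo", value: 1 }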