@ChuckIsReady
2018-11-16T03:43:42.000000Z
Big Volume
The amount of data is enormous
Big Velocity
Data are generated very fast, often faster than the ability to process them
Big Variety
Multimedia data constitute an important data variety
Big Veracity
Uncertainty about the quality and accuracy of the data
Outcome of intelligence phase: A Formal Problem Statement
Benefit: time saving
bounded rationality
- Simulation - most common descriptive modeling method
Includes the search, evaluation, and recommendation of an appropriate solution to the model
Data Management Subsystem
Model Management Subsystem
Knowledgebase Management Subsystem
User Interface Subsystem
1) Intelligence
2) Design
3) Choice
4) Implementation
5) Monitoring
A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process
Characteristics of DW
- Subject oriented
- Integrated
- Time-variant (time series)
- Nonvolatile
- Summarized
- Not normalized
- Metadata
- Web based, relational/multi-dimensional
- Client/server
- Real-time and/or right-time (active)
- Operational data stores (ODS)
A type of database often used as an interim area for a data warehouse
- Oper marts
An operational data mart.
- Enterprise data warehouse (EDW)
A data warehouse for the enterprise.
- Metadata
Data about data. In a data warehouse, metadata describe the contents of a data warehouse and the manner of its acquisition and use
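The "time-variant" and "nonvolatile" characteristics can be illustrated with a small sketch. It assumes a toy in-memory model: the `OperationalStore` and `Warehouse` classes and their method names are hypothetical, made up for this illustration only. The operational store overwrites a value in place, while the warehouse only appends dated snapshots and never updates them.

```python
from datetime import date


class OperationalStore:
    """Hypothetical operational system: keeps only the current value."""

    def __init__(self):
        self._balances = {}

    def set_balance(self, customer, balance):
        # An update overwrites the previous value (volatile).
        self._balances[customer] = balance

    def current(self, customer):
        return self._balances[customer]


class Warehouse:
    """Hypothetical warehouse: append-only, every row carries a date."""

    def __init__(self):
        self._rows = []

    def load_snapshot(self, as_of, customer, balance):
        # Rows are only appended, never changed (nonvolatile).
        self._rows.append((as_of, customer, balance))

    def history(self, customer):
        # The time dimension makes the data time-variant (a time series).
        return sorted((d, b) for d, c, b in self._rows if c == customer)


def demo():
    ops = OperationalStore()
    dw = Warehouse()
    for as_of, balance in [(date(2018, 1, 31), 100), (date(2018, 2, 28), 80)]:
        ops.set_balance("alice", balance)
        dw.load_snapshot(as_of, "alice", balance)
    return ops.current("alice"), dw.history("alice")
```

After the two loads, the operational store only knows the latest balance, whereas the warehouse can still answer what the balance was at the end of January.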
Warehouse (Eager)
OLTP: On Line Transaction Processing
– Describes processing at operational sites
OLAP: On Line Analytical Processing
– Describes processing at warehouse
OLTP
- Mostly updates
- Many small transactions
- Mb-Tb of data
- Raw data
- Clerical users
- Up-to-date data
- Consistency, recoverability critical
OLAP
- Mostly reads
- Queries long, complex
- Gb-Tb of data
- Summarized, consolidated data
- Decision-makers, analysts as users
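The contrast between the two workloads can be sketched with Python's built-in `sqlite3` module. This is only an illustration under simplifying assumptions: the `orders` table, its columns, and the sample rows are hypothetical, and a real deployment would keep the operational database and the warehouse as separate systems rather than one in-memory database.

```python
import sqlite3


def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, region TEXT, amount REAL, status TEXT)"
    )

    # OLTP style: many small transactions, mostly writes on raw data.
    rows = [("north", 10.0), ("south", 20.0), ("north", 30.0), ("south", 40.0)]
    for region, amount in rows:
        with conn:  # each insert commits as its own short transaction
            conn.execute(
                "INSERT INTO orders (region, amount, status) VALUES (?, ?, 'new')",
                (region, amount),
            )
    with conn:
        conn.execute("UPDATE orders SET status = 'shipped' WHERE id = ?", (1,))

    # OLAP style: a read-only query that summarizes many rows at once.
    totals = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    ).fetchall()
    status = conn.execute("SELECT status FROM orders WHERE id = 1").fetchone()[0]
    conn.close()
    return totals, status
```

The writes touch one row each and must stay consistent and recoverable; the final query reads every row but returns only a consolidated summary per region.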
Relational OLAP (ROLAP)
– Use relational or extended-relational DBMS to store and manage warehouse data and OLAP middleware to support missing pieces
– Include optimization of DBMS backend, implementation of aggregation navigation logic, and additional tools and services
– greater scalability
Multidimensional OLAP (MOLAP)
– Array-based multidimensional storage engine (sparse matrix techniques)
– fast indexing to pre-computed summarized data
Hybrid OLAP (HOLAP)
– User flexibility, e.g., low level: relational, high-level: array
Specialized SQL servers
– specialized support for SQL queries over star/snowflake schemas
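A rough way to see the ROLAP/MOLAP difference is to answer the same question from a list of fact rows and from an array indexed by dimension position. The sketch below is a toy under stated assumptions: the dimensions, the sample facts, and all function names are hypothetical, and the cube is a dense nested list for readability, whereas a real MOLAP engine would use sparse matrix techniques.

```python
PRODUCTS = ["pen", "ink"]
REGIONS = ["north", "south"]
QUARTERS = ["Q1", "Q2"]

# Fact rows as a relational table would hold them: (product, region, quarter, units).
FACTS = [
    ("pen", "north", "Q1", 5),
    ("pen", "south", "Q1", 3),
    ("ink", "north", "Q2", 7),
    ("pen", "north", "Q2", 2),
]


def build_cube(facts):
    """MOLAP-style storage: a 3-D array addressed by dimension position."""
    cube = [[[0 for _ in QUARTERS] for _ in REGIONS] for _ in PRODUCTS]
    for product, region, quarter, units in facts:
        p = PRODUCTS.index(product)
        r = REGIONS.index(region)
        q = QUARTERS.index(quarter)
        cube[p][r][q] += units
    return cube


def precompute_by_product(cube):
    """Pre-computed summary: total units per product, built once up front."""
    return {
        name: sum(sum(row) for row in cube[i]) for i, name in enumerate(PRODUCTS)
    }


def rolap_total(facts, product):
    """ROLAP-style answer: scan and aggregate the fact rows at query time."""
    return sum(units for p, _, _, units in facts if p == product)
```

With the summary built in advance, a query for one product is a single lookup; the row-scanning version does the aggregation on every query but needs no extra storage, which is one reason relational storage tends to scale to larger data.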
- Extraction -- reading data from a database
- Transformation -- converting the extracted data from its previous form into the form in which it needs to be so that it can be placed into a data warehouse or simply another database
- Load -- putting the data into the data warehouse
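The three steps above can be sketched as three small functions using Python's built-in `sqlite3` module. This is a minimal sketch, not a production pipeline: the `sales` source table, the `sales_fact` warehouse table, the sample rows, and the cleaning rules (trimming names, converting cents to currency units) are all assumptions chosen for illustration.

```python
import sqlite3


def extract(source):
    """Extraction: read the raw rows from the source database."""
    return source.execute("SELECT name, amount_cents FROM sales").fetchall()


def transform(rows):
    """Transformation: clean names and convert cents to currency units."""
    return [(name.strip().title(), cents / 100.0) for name, cents in rows]


def load(warehouse, rows):
    """Load: write the converted rows into the warehouse table."""
    with warehouse:  # commit all inserted rows together
        warehouse.executemany(
            "INSERT INTO sales_fact (customer, amount) VALUES (?, ?)", rows
        )


def run_etl():
    # Hypothetical operational source with inconsistently formatted data.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE sales (name TEXT, amount_cents INTEGER)")
    source.executemany(
        "INSERT INTO sales VALUES (?, ?)", [(" alice ", 1250), ("BOB", 300)]
    )

    # Hypothetical warehouse target.
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute("CREATE TABLE sales_fact (customer TEXT, amount REAL)")

    load(warehouse, transform(extract(source)))
    result = warehouse.execute(
        "SELECT customer, amount FROM sales_fact ORDER BY customer"
    ).fetchall()
    source.close()
    warehouse.close()
    return result
```

Keeping the three steps as separate functions mirrors the list above: each stage can be tested or replaced on its own, for example by pointing `extract` at a different source.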