@Sarah
2015-12-05T23:45:08.000000Z
字数 5559
阅读 899
SQL
What is big data? Velocity, variety and volume
Business intelligence is a term used to describe a comprehensive, cohesive and integrated set of tools and processes used to support business decision making.
Data > information > knowledge > wisdom
Collect & store operational data > aggregate operational data into decision support data > analyze decision support data to generate information >…
Traditional info system focus on operational automation and reporting
BI focus on strategic and tactical use of information
Business intelligence architecture is composed of data, people, processes, technology and the management of such components. Main goal: improved decision making
Operational data: mostly stored in relational db, optimized to support transactions representing daily operations
Operational data vs analytical data
- Extract operational data for BI processing.
- BI effectiveness depends on quality of data gathered at operational level
- Operational data seldom well-suited for decision support tasks
- Need reformat data in order to be useful for business intelligence
Explanatory analytics & Predictive analytics
Online analytical processing OLAP
Multidimensional analysis of business data, provides the ability for complex calculations, trend analysis, and sophisticated data modeling.
Three main characteristics:
- Multidimensional data analysis techniques
Data are processed and viewed as part of a multidimensional structure
Augmented by the following functions:
1) Advanced data presentation functions
2) Advanced data aggregation, consolidation, and classification functions
3) Advanced computational functions
4) Advanced data modeling functions- Advanced database support
- Easy-to-use end-user interfaces
OLAP Architecture
Three main architectural components:
1. Graphical user interface (GUI)
2. Analytical processing logic
3. Data-processing logic
Designed to use both operational and data warehouse data
In most implementations, data warehouse and OLAP are interrelated and complementary
OLAP systems merge data warehouse and data mart approaches
Relational OLAP
- Multidimensional data schema support within the RDBMS
- Data access language and query performance optimized for multidimensional data
- Support for very large databases (VLDBs)
SQL Extensions for OLAP
ROLLUP
- Used with GROUP BY clause to generate aggregates by different dimensions
- GROUP BY generates only one aggregate for each new value combination of attributes
- ROLLUP extension enables subtotal for each column listed except for the last one
- Last column gets grand total
- Order of column list important
SELECT column1, colun2, …
FROM table1, table 2…
WHERE condition
GROUP BY ROLLUP(column1, column2…)
HAVING condition
ORDER BY column1, column2…
SELECT column1, colun2, …
FROM table1, table 2…
WHERE condition
GROUP BY CUBE(column1, column2…)
HAVING condition
ORDER BY column1, column2…
RANKING (rank, percent_rank, row_number)
WINDOWING (cumulative aggregate functions, moving averages, lag, lead)
STATISTICAL
Materialized Views: A dynamic table that contains SQL query command to generate rows(Also contains the actual rows)
- Created the first time query is run and summary rows are stored in table
- Automatically updated when base tables are updated