Data Warehouse
A data warehouse is an architecture for organizing information systems. Building one is a process for creating decision support systems and a knowledge management environment that support both day-to-day tactical decision making and long-term business strategy. In Bill Inmon's words, a data warehouse is a "subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision making process."
Data Mart
A data mart is a collection of subject areas organized for decision support based on the needs of a given department. Typically, the database design for a data mart is a star schema that is optimized for the needs of the department's users. There are two kinds of data marts: dependent and independent. A dependent data mart is one whose source is a data warehouse. An independent data mart is one whose source is legacy applications or an OLTP environment.
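As a rough illustration of the star schema mentioned above, here is a minimal sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_date, dim_product), columns and sample rows are all hypothetical, chosen only to show one fact table surrounded by dimension tables.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
-- The fact table sits at the center of the star and points at each dimension.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);
INSERT INTO dim_date VALUES (1, '2024-01'), (2, '2024-02');
INSERT INTO dim_product VALUES (10, 'Books'), (20, 'Games');
INSERT INTO fact_sales VALUES (1, 10, 50.0), (1, 20, 30.0), (2, 10, 20.0);
""")

# A typical departmental question: total sales per product category.
rows = conn.execute("""
SELECT p.category, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_key = f.product_key
GROUP BY p.category
ORDER BY p.category
""").fetchall()

print(rows)  # [('Books', 70.0), ('Games', 30.0)]
```

Every query follows the same shape: join the fact table to the dimensions you care about, then group by dimension attributes.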
Operational Data Store (ODS)
An operational data store is an integrated, subject-oriented, volatile (subject to updates and deletions), current-valued structure designed to serve operational users as they do high-performance integrated processing.
OLTP (Online Transaction Processing)
OLTP is a class of programs that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transactions. OLTP systems are optimized for data entry operations, e.g. order entry, banking, CRM and ERP applications.
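To make "transaction-oriented" concrete, here is a small sketch of a banking-style transfer using Python's built-in sqlite3 module. The account table, the transfer function and the balances are hypothetical. The point is only that both updates succeed together or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

ok = transfer(conn, 1, 2, 30)        # succeeds
rejected = transfer(conn, 1, 2, 500) # would overdraw account 1, so it is rolled back
print(ok, rejected, balances(conn))  # True False {1: 70, 2: 80}
```

The credit to account 2 in the failed transfer is undone along with the debit, which is the behavior order entry and banking applications rely on.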
Data Warehouse vs Operational
Data Warehouse | Operational/Transactional
Subject oriented | Application oriented
Summarized, refined & detailed | Detailed
Represents value over time | Accurate as of moment
Supports managerial needs | Supports day-to-day needs
Read only data | Can be updated
Batch processing | Real time transactions
Completely different life cycle | Software Development Life Cycle
Analysis driven | Transaction driven
Dimensional model | Entity Relational Diagram
Large amount of data | Small amount of data
Relaxed availability | High availability
Flexible structure | Static structure

Approach | Top-Down | Bottom-Up | Hybrid | Federated
Practitioner | Bill Inmon | Ralph Kimball | Many practitioners | Doug Hackney
Emphasis | Data warehouse | Data marts | DW and data marts | Integrate heterogeneous BI environments
Design | Enterprise-based normalized model; marts use a subject-oriented dimensional model | Dimensional model of data marts, consisting of star schemas | Start with enterprise and local models; one or more star schemas | An architecture of architectures; share dimensions, facts, rules and definitions across organizations
Architecture | Multi-tier, comprised of a staging area and dependent data marts | Staging area and data marts | High-level normalized enterprise model; initial marts | Reality of change in organizations and systems
Data set | DW holds atomic-level data; marts hold summary data | Contains both atomic and summary data | Populates marts with atomic and summary data via a non-persistent staging area | Use of whatever means possible to integrate business needs
Agile Development
Agile methodology emphasizes close collaboration between the technical team and business experts, face-to-face communication, self-organizing teams, and frequent delivery of releases that carry business value.
- A project's overall scope, objectives, constraints, clients, risks, etc. should be briefly documented.
- Lean, iterative, feature-driven, time-boxed development cycles.
- Constant feedback. Exploratory processes require constant feedback to stay on track.
- Customer involvement. Focusing on business value requires constant interaction between customers and developers.
- Technical excellence. Creating, refactoring and maintaining a technically excellent product.
3D Lifecycle
Dimensional Data Warehouse Development Lifecycle (3DLC) - Our approach is Agile data warehouse development that integrates iterative and data-driven components. We suggest a dimensional enterprise data warehouse data model with conformed subject areas. The goal of the 3D methodology is to define strategies that enable data warehouse practitioners to work effectively on development and deliverables. This is not a "one size fits all" methodology. Instead, consider the 3D life cycle a collection of philosophies that enable technical and business experts to work together effectively to maximize ROI. 3DLC is an adaptable process framework, intended to be tailored by project teams, who select the elements of the process that are appropriate for their needs.
- Collaboration across technical and subject matter expertise teams.
- Iterative and incremental approach.
- Monthly releases, each a fully functional set of building blocks.
- Small team size, up to 10 people.
- Phase (project) plan of 4-6 months.
- Commitment to the team and active participation.
- Build consensus and ownership, create win/win solution.
- Focus on quality, testing & communication.
Business Intelligence (BI)
Business Intelligence is a set of business processes for collecting and analyzing business information. BI functions include trend analysis, aggregation of data, drilling down to finer levels of detail, slicing and dicing, and data rotation (pivoting) for comparative viewing.
- OLAP (On-Line Analytical Processing): querying and presenting data from a data warehouse as multiple dimensions.
- ROLAP (Relational OLAP): applications and user interfaces that retrieve data from an RDBMS and present it as a dimensional model.
- MOLAP (Multidimensional OLAP): applications, user interfaces and database technologies that store data in a dimensional model.
- DOLAP (Desktop OLAP): designed for low-end, single-user use. Data is stored on or downloaded to the desktop.
- HOLAP (Hybrid OLAP): a combination of the above OLAP approaches, typically ROLAP and MOLAP.
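The BI functions named above (aggregation, drill-down, slice and dice) can be sketched on a tiny in-memory cube. This is a toy illustration in plain Python, not how any OLAP engine is implemented. The cube contents and the helper names slice_cube, dice and roll_up are assumptions made for the example.

```python
from collections import defaultdict

DIMS = ("year", "region", "product")

# Each cell is keyed by (year, region, product) and holds units sold.
cube = {
    (2023, "East", "Books"): 10,
    (2023, "West", "Books"): 5,
    (2023, "East", "Games"): 7,
    (2024, "East", "Books"): 12,
    (2024, "West", "Games"): 4,
}

def slice_cube(cube, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return {k: v for k, v in cube.items() if k[i] == value}

def dice(cube, **allowed):
    """Dice: keep cells whose coordinates fall in the allowed sets."""
    return {
        k: v
        for k, v in cube.items()
        if all(k[DIMS.index(d)] in vals for d, vals in allowed.items())
    }

def roll_up(cube, *keep):
    """Aggregate away every dimension not listed in keep."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for k, v in cube.items():
        out[tuple(k[i] for i in idx)] += v
    return dict(out)

by_year = roll_up(cube, "year")                    # aggregated view
by_year_region = roll_up(cube, "year", "region")   # drill down one level
print(by_year)  # {(2023,): 22, (2024,): 16}
```

Drilling down is just rolling up to a finer grain: going from by_year to by_year_region adds the region dimension back. Rotation for comparative viewing amounts to changing the order of the kept dimensions, for example roll_up(cube, "region", "year").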