Data Warehousing

 

Data Warehouse: A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process. [William H. (Bill) Inmon]. A data warehouse is a copy of transaction data specifically structured for query and analysis. Data warehousing is the process of constructing and using data warehouses. A data warehouse can also be defined as a collection of data marts.
Data warehousing: Data warehousing is a collection of decision support technologies aimed at enabling the knowledge worker to make better decisions. Requirements of a data warehouse system: efficient cube computation, better access methods, and efficient query processing.
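
To make the "cube computation" requirement concrete, here is a minimal sketch in Python of what a data cube contains: one aggregate for every subset of the dimensions. The function name `compute_cube`, the sample rows, and the field names are assumptions made for illustration. This is the naive approach that rescans the data once per group-by; efficient cube computation methods aim to avoid exactly this repeated work.

```python
from collections import defaultdict
from itertools import combinations


def compute_cube(rows, dimensions, measure):
    """Sum `measure` for every subset of `dimensions` (naive full cube).

    Returns {group_by_dims: {dimension_values: total}}.
    """
    cube = {}
    for k in range(len(dimensions) + 1):
        for dims in combinations(dimensions, k):
            totals = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[dims] = dict(totals)
    return cube


# Hypothetical fact rows for the "sales" subject.
sales = [
    {"region": "north", "year": 2022, "amount": 100.0},
    {"region": "south", "year": 2022, "amount": 50.0},
    {"region": "north", "year": 2023, "amount": 120.0},
]

cube = compute_cube(sales, ("region", "year"), "amount")
```

With two dimensions the cube holds four group-bys: the grand total `cube[()]`, totals per region, totals per year, and totals per (region, year) pair. The number of group-bys doubles with every added dimension, which is why efficiency matters here.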

Characteristics of data warehouse:

  1. Subject-oriented: The data gives information about a particular subject rather than about the organization's ongoing operations. A data warehouse can be used to analyze a particular subject area. Ex: "Sales" can be a particular subject. Employee, department, sales, weather data, and stock market are subjects.
  2. Integrated: Data is gathered into the data warehouse from a variety of sources and merged into a coherent whole. Data cleaning and data integration techniques are applied to ensure consistency in encoding structures, naming conventions, attribute measures, etc.
  3. Time-variant: All data in the data warehouse is identified with a particular time period, and historical data is kept. An operational or transactional database stores current-value data, whereas a data warehouse stores historical data, for example the past 10 to 15 years of data.
  4. Non-volatile: Data is stable in a data warehouse. More data is added, but data is never removed, and once data is in the data warehouse it does not change. Only two operations are performed: the initial loading of data and the access of data. Updates are not allowed.
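
The four characteristics above can be sketched together in a few lines of Python. This is only an illustration under stated assumptions, not how a real warehouse is built: the class names `SalesFact` and `SalesWarehouse`, the field names, and the `REGION_CODES` mapping are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SalesFact:
    """One immutable warehouse row (field names are hypothetical)."""
    load_date: date  # time-variant: every row carries its time period
    region: str      # integrated: one canonical encoding for region
    amount: float    # integrated: one numeric type for the measure


# Hypothetical mapping that reconciles inconsistent source encodings.
REGION_CODES = {"N": "north", "north": "north", "S": "south", "south": "south"}


class SalesWarehouse:
    """Subject-oriented store for the 'sales' subject: load and read only."""

    def __init__(self):
        self._rows = []

    def load(self, load_date, source_rows):
        """Append cleaned rows; existing rows are never updated or deleted."""
        for region, amount in source_rows:
            self._rows.append(
                SalesFact(load_date, REGION_CODES[region], float(amount))
            )

    def total_by_year(self):
        """Read-only analysis over the accumulated history."""
        totals = {}
        for row in self._rows:
            year = row.load_date.year
            totals[year] = totals.get(year, 0.0) + row.amount
        return totals


warehouse = SalesWarehouse()
# Two sources that encode region and amount differently.
warehouse.load(date(2022, 12, 31), [("N", 100), ("south", 50)])
warehouse.load(date(2023, 12, 31), [("north", 120), ("S", "80")])
```

The class deliberately exposes no update or delete method, and `frozen=True` makes each stored row reject attribute assignment, which mirrors the non-volatile rule. Each load adds a new time period next to the old ones instead of overwriting them.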




References

Jiawei Han, Micheline Kamber and Jian Pei, Data Mining: Concepts and Techniques, Elsevier, 3rd edition, 2013.
