Monday, September 23, 2013

Chapter 8: Accessing Organizational Information – Data Warehouse

HISTORY OF DATA WAREHOUSING
          Data warehouses extend the transformation of data into information
          In the 1990s, executives became less concerned with day-to-day business operations and more concerned with overall business functions
          The data warehouse provided the ability to support decision making without disrupting the day-to-day operations

DATA WAREHOUSE FUNDAMENTALS
          Data warehouse – a logical collection of information – gathered from many different operational databases – that supports business analysis activities and decision-making tasks
          The primary purpose of a data warehouse is to aggregate information throughout an organization into a single repository for decision-making purposes
          The primary difference between a database and a data warehouse is that a database stores information for a single application, whereas a data warehouse stores information from multiple databases, or multiple applications, and external information such as industry information           
          This enables cross-functional analysis, industry analysis, market analysis, etc., all from a single repository
          Data warehouses support only analytical processing (OLAP)
          Extraction, transformation, and loading (ETL) – a process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse
          The ETL process gathers data from the internal and external databases and passes it to the data warehouse
          The ETL process also gathers data from the data warehouse and passes it to the data marts

          Data mart – contains a subset of data warehouse information



          The data warehouse modeled in the above figure compiles information from internal databases or transactional/operational databases and external databases through ETL
          It then sends subsets of information to the data marts through the ETL process
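
The two ETL passes described above (sources to warehouse, then warehouse to data mart) can be sketched in a few lines of Python. This is only an illustration: the source names, field names, and values are invented, and a real ETL tool would do far more validation.

```python
# Minimal ETL sketch. All source names, field names, and values are hypothetical.

def extract(sources):
    """Pull raw rows from every source database, remembering where each came from."""
    return [(name, row) for name, rows in sources.items() for row in rows]

def transform(raw_rows):
    """Map differently named fields onto one common set of definitions."""
    common = []
    for name, row in raw_rows:
        common.append({
            "source": name,
            "customer_id": row.get("cust") or row.get("customer_id"),
            "region": (row.get("region") or "").strip().upper(),
            "amount": float(row.get("amt", row.get("amount", 0))),
        })
    return common

def load(rows, target):
    """Append the transformed rows to the target repository."""
    target.extend(rows)
    return target

# Two operational databases that describe the same kind of fact differently.
sources = {
    "orders_db": [{"cust": "C1", "region": "east ", "amt": "19.99"}],
    "web_db": [{"customer_id": "C2", "region": "West", "amount": 5}],
}

# First pass: operational databases -> data warehouse.
warehouse = load(transform(extract(sources)), [])

# Second pass: data warehouse -> data mart holding only a subset.
east_mart = load([r for r in warehouse if r["region"] == "EAST"], [])
```

Here `warehouse` ends up with both rows in one common layout, while `east_mart` holds only the row for the EAST region.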

MULTIDIMENSIONAL ANALYSIS AND DATA MINING
          Databases contain information in a series of two-dimensional tables
          In a data warehouse and data mart, information is multidimensional: it contains layers of columns and rows
      Dimension – a particular attribute of information
          Each layer in a data warehouse or data mart represents information according to an additional dimension
          Dimensions could include such things as:
Products
Promotions
Stores
Category
Region
Stock price
Date
Time
Weather

          Why is the ability to look at information based on different dimensions critical to a business's success?
      Ans:  The ability to look at information from different dimensions can add tremendous business insight
      By slicing-and-dicing the information a business can uncover great unexpected insights
          Cube – common term for the representation of multidimensional information




          Users can slice and dice the cube to drill down into the information
          Cube A represents store information (the layers), product information (the rows), and promotion information (the columns)
          Cube B represents a slice of information displaying promotion II for all products at all stores
          Cube C represents a slice of information displaying promotion III for product B at store 2
          Data mining – the process of analyzing data to extract information not offered by the raw data alone
          Data mining can begin at a summary information level (coarse granularity) and progress through increasing levels of detail (drilling down), or the reverse (drilling up)
          To perform data mining users need data-mining tools
Data-mining tool – uses a variety of techniques to find patterns and relationships in large volumes of information and infers rules that predict future behavior and guide decision making
Data-mining tools include query tools, reporting tools, multidimensional analysis tools, statistical tools, and intelligent agents
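
As a rough illustration of slicing a cube and drilling up, the sketch below models a cube with stores as layers, products as rows, and promotions as columns, as in Cube A. The unit counts are made-up numbers generated by a formula, and the function names are hypothetical.

```python
# Sketch of a three-dimensional cube: cube[store][product][promotion] -> units sold.
# The numbers are invented: (store no.) * 100 + (product no.) * 10 + (promotion no.).

stores = ["store 1", "store 2", "store 3"]
products = ["A", "B", "C"]
promotions = ["I", "II", "III"]

cube = {
    store: {
        product: {
            promo: (si + 1) * 100 + (pi + 1) * 10 + (ki + 1)
            for ki, promo in enumerate(promotions)
        }
        for pi, product in enumerate(products)
    }
    for si, store in enumerate(stores)
}

def slice_by_promotion(cube, promo):
    """Like Cube B: one promotion for all products at all stores."""
    return {
        store: {product: cells[promo] for product, cells in rows.items()}
        for store, rows in cube.items()
    }

def cell(cube, store, product, promo):
    """Like Cube C: one promotion for one product at one store."""
    return cube[store][product][promo]

def roll_up_by_store(cube):
    """Drill up: summarize every cell into a single total per store."""
    return {
        store: sum(sum(cells.values()) for cells in rows.values())
        for store, rows in cube.items()
    }

promo_two = slice_by_promotion(cube, "II")
one_cell = cell(cube, "store 2", "B", "III")
totals = roll_up_by_store(cube)
```

Drilling down is the reverse of `roll_up_by_store`: start from the per-store totals and move back to the individual product and promotion cells.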

INFORMATION CLEANSING OR SCRUBBING
          An organization must maintain high-quality data in the data warehouse
          Information cleansing or scrubbing – a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information
          Contact information in an operational system


Taking a look at customer information highlights why information cleansing and scrubbing is necessary
Customer information exists in several operational systems
In each system all details of this customer information could change, from the customer ID to contact information
Determining which contact information is accurate and correct for this customer depends on the business process that is being executed
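
A minimal sketch of what cleansing might look like for customer records: fix inconsistent formatting, discard records that cannot be repaired, and drop duplicates. The field names and the 10-digit phone rule are assumptions made for this example, not rules from the chapter.

```python
import re

def clean_customer(record):
    """Return a standardized copy of the record, or None if it cannot be repaired."""
    # Fix inconsistent spacing and capitalization in the name.
    name = " ".join((record.get("name") or "").split()).title()
    # Keep only the digits of the phone number.
    phone = re.sub(r"\D", "", record.get("phone") or "")
    # Assumed rule: a usable record needs a name and a 10-digit phone number.
    if not name or len(phone) != 10:
        return None
    return {"name": name, "phone": phone}

def cleanse(records):
    """Standardize every record, discard unusable ones, and drop duplicates."""
    cleaned, seen = [], set()
    for rec in records:
        fixed = clean_customer(rec)
        if fixed is None:
            continue
        key = (fixed["name"], fixed["phone"])
        if key not in seen:
            seen.add(key)
            cleaned.append(fixed)
    return cleaned

raw = [
    {"name": "  ada  LOPEZ", "phone": "(312) 555-0100"},  # inconsistent format
    {"name": "Ada Lopez", "phone": "312-555-0100"},       # duplicate once fixed
    {"name": "", "phone": "3125550101"},                  # incomplete: no name
    {"name": "Bo Chen", "phone": "555-0100"},             # incorrect: too short
]
customers = cleanse(raw)
```

Of the four raw records, only one standardized record for Ada Lopez survives. Which version of a conflicting record to trust still depends on the business process, so a real cleansing step would need rules for that as well.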


          Standardizing Customer name from Operational Systems



          Information cleansing activities


          Accurate and complete information


          Why do you think most businesses cannot achieve 100% accurate and complete information?
          If they had to choose a percentage for acceptable information what would it be and why?
               Some companies are willing to go as low as 20% complete just to find business intelligence
               Few organizations will go below 50% accurate – the information is useless if it is not accurate
          Achieving perfect information is almost impossible
               The more complete and accurate an organization wants its information to be, the more it costs
               The tradeoff in pursuing perfect information lies in accuracy versus completeness
               Accurate information means it is correct, while complete information means there are no blanks
               Most organizations determine a percentage high enough to make good decisions at a reasonable cost, such as 85% accurate and 65% complete
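
One possible way to put numbers on these two measures is sketched below. It treats completeness as the share of fields that are not blank, and accuracy as the share of filled-in fields that match a trusted reference. Both definitions, and all the sample data, are assumptions made for illustration.

```python
def completeness(records, fields):
    """Percentage of fields that are filled in (no blanks)."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(1 for r in records for f in fields if r.get(f) not in (None, ""))
    return 100.0 * filled / total

def accuracy(records, verified, fields, key="id"):
    """Percentage of filled-in fields that match the verified reference values."""
    checked = correct = 0
    for r in records:
        truth = verified[r[key]]
        for f in fields:
            value = r.get(f)
            if value in (None, ""):
                continue  # blanks count against completeness, not accuracy
            checked += 1
            if value == truth[f]:
                correct += 1
    return 100.0 * correct / checked if checked else 0.0

fields = ["email", "city"]
records = [
    {"id": 1, "email": "a@x.com", "city": "Chicago"},
    {"id": 2, "email": "", "city": "Bostn"},  # one blank, one misspelling
]
verified = {
    1: {"email": "a@x.com", "city": "Chicago"},
    2: {"email": "b@x.com", "city": "Boston"},
}

pct_complete = completeness(records, fields)
pct_accurate = accuracy(records, verified, fields)
```

With this sample, three of the four fields are filled in (75% complete) and two of the three filled-in fields are correct (about 67% accurate). An organization could compare figures like these against its own targets.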

BUSINESS INTELLIGENCE
BI is information that people use to support their decision-making efforts
Principal BI enablers include:
          Technology
          Even the smallest company with BI software can do sophisticated analyses today that were unavailable to the largest organizations a generation ago. The largest companies today can create enterprisewide BI systems that compute and monitor metrics on virtually every variable important for managing the company. How is this possible? The answer is technology—the most significant enabler of business intelligence.
          People
          Understanding the role of people in BI allows organizations to systematically create insight and turn these insights into actions. Organizations can improve their decision making by having the right people making the decisions. This usually means a manager who is in the field and close to the customer rather than an analyst rich in data but poor in experience. In recent years “business intelligence for the masses” has been an important trend, and many organizations have made great strides in providing sophisticated yet simple analytical tools and information to a much larger user population than previously possible.
          Culture
          A key responsibility of executives is to shape and manage corporate culture. The extent to which the BI attitude flourishes in an organization depends in large part on the organization’s culture. Perhaps the most important step an organization can take to encourage BI is to measure the performance of the organization against a set of key indicators. The actions of publishing what the organization thinks are the most important indicators, measuring these indicators, and analyzing the results to guide improvement display a strong commitment to BI throughout the organization.

