Chapter 2: Data Warehousing
Business Intelligence: A Managerial Perspective on Analytics (3rd Edition)

Learning Objectives
- Understand the basic definitions and concepts of data warehouses
- Learn different types of data warehousing architectures; their comparative advantages and disadvantages
- Describe the processes used in developing and managing data warehouses
- Explain data warehousing operations

Learning Objectives (Continued)
- Explain the role of data warehouses in decision support
- Explain data integration and the extraction, transformation, and load (ETL) processes
- Describe real-time (a.k.a. right-time and/or active) data warehousing
- Understand data warehouse administration and security issues

Opening Vignette
Isle of Capri Casinos Is Winning with Enterprise Data Warehouse
- Company background
- Problem description
- Proposed solution
- Results
- Answer & discuss the case questions.

Questions for the Opening Vignette
- Why is it important for Isle to have an EDW?
- What were the business challenges or opportunities that Isle was facing?
- What was the process Isle followed to realize EDW? Comment on the potential challenges Isle might have had going through the process of EDW development.
- What were the benefits of implementing an EDW at Isle? Can you think of other potential benefits that were not listed in the case?
- Why do you think large enterprises like Isle in the gaming industry can succeed without having a capable data warehouse/business intelligence infrastructure?

Main Data Warehousing Topics
- DW definition
- Characteristics of DW
- Data marts, ODS, EDW, metadata
- DW framework
- DW architecture & ETL process
- DW development
- DW issues

What is a Data Warehouse?
- A physical repository where relational data are specially organized to provide enterprise-wide, cleansed data in a standardized format
- “The data warehouse is a collection of integrated, subject-oriented databases designed to support DSS functions, where each unit of data is non-volatile and relevant to some moment in time”

A Historical Perspective to Data Warehousing

Characteristics of DWs
- Subject oriented
- Integrated
- Time-variant (time series)
- Nonvolatile
- Summarized
- Not normalized
- Metadata
- Web based, relational/multi-dimensional
- Client/server, real-time/right-time/active

Data Mart
A departmental small-scale “DW” that stores only limited/relevant data
- Dependent data mart
  A subset that is created directly from a data warehouse
- Independent data mart
  A small data warehouse designed for a strategic business unit or a department

Other DW Components
- Operational data stores (ODS)
  A type of database often used as an interim area for a data warehouse
  Oper marts: operational data marts
- Enterprise data warehouse (EDW)
  A data warehouse for the enterprise.
- Metadata
  Data about data. In a data warehouse, metadata describe the contents of a data warehouse and the manner of its acquisition and use

Application Case 2.1
A Better Data Plan: Well-Established TELCOs Leverage Data Warehousing and Analytics to Stay on Top in a Competitive Industry
Questions for Discussion
- What are the main challenges for TELCOs?
- How can data warehousing and data analytics help TELCOs in overcoming their challenges?
- Why do you think TELCOs are well suited to take full advantage of data analytics?

A Generic DW Framework

Application Case 2.2
Data Warehousing Helps MultiCare Save More Lives
Questions for Discussion
- What do you think is the role of data warehousing in healthcare systems?
- How did MultiCare use data warehousing to improve health outcomes?

DW Architecture
- Three-tier architecture
  - Data acquisition software (back-end)
  - The data warehouse that contains the data & software
  - Client (front-end) software that allows users to access and analyze data from the warehouse
- Two-tier architecture
  - The first two tiers of the three-tier architecture are combined into one
- Sometimes there is only one tier?

DW Architectures
- 3-tier architecture
- 2-tier architecture
- 1-tier architecture?

Data Warehousing Architectures
Issues to consider when deciding which architecture to use:
- Which database management system (DBMS) should be used?
- Will parallel processing and/or partitioning be used?
- Will data migration tools be used to load the data warehouse?
- What tools will be used to support data retrieval and analysis?

A Web-Based DW Architecture

Alternative DW Architectures

Alternative DW Architectures (Continued)
- Each architecture has advantages and disadvantages!
- Which architecture is the best?

Ten factors that potentially affect the architecture selection decision
1. Information interdependence between organizational units
2. Upper management's information needs
3. Urgency of need for a data warehouse
4. Nature of end-user tasks
5. Constraints on resources
6. Strategic view of the data warehouse prior to implementation
7. Compatibility with existing systems
8. Perceived ability of the in-house IT staff
9. Technical issues
10. Social/political factors

Teradata Corp. DW Architecture

Data Integration and the Extraction, Transformation, and Load (ETL) Process
- ETL = Extract Transform Load
- Data integration
  Integration that comprises three major processes: data access, data federation, and change capture.
- Enterprise application integration (EAI)
  A technology that provides a vehicle for pushing data from source systems into a data warehouse
- Enterprise information integration (EII)
  An evolving tool space that promises real-time data integration from a variety of sources, such as relational or multidimensional databases, Web services, etc.

Data Integration and the Extraction, Transformation, and Load (ETL) Process (Continued)

ETL (Extract, Transform, Load)
- Issues affecting the purchase of an ETL tool
  - Data transformation tools are expensive
  - Data transformation tools may have a long learning curve
- Important criteria in selecting an ETL tool
  - Ability to read from and write to an unlimited number of data sources/architectures
  - Automatic capturing and delivery of metadata
  - A history of conforming to open standards
  - An easy-to-use interface for the developer and the functional user

Data Warehouse Development
- Data warehouse development approaches
  - Inmon Model: EDW approach (top-down)
  - Kimball Model: Data mart approach (bottom-up)
- Which model is best?
  - Table 2.3 provides a comparative analysis between the EDW and data mart approaches
- One alternative is the hosted warehouse

Application Case 2.5
Starwood Hotels & Resorts Manages Hotel Profitability with Data Warehousing
Questions for Discussion
- How big and complex are the business operations of Starwood Hotels & Resorts?
- How did Starwood Hotels & Resorts use data warehousing for better profitability?
- What were the challenges, the proposed solution, and the obtained results?

Additional Data Warehouse Considerations: Hosted Data Warehouses
Benefits:
- Requires minimal investment in infrastructure
- Frees up capacity on in-house systems
- Frees up cash flow
- Makes powerful solutions affordable
- Enables solutions that provide for growth
- Offers better quality equipment and software
- Provides faster connections
- (more in the book)

Representation of Data in DW
- Dimensional modeling
  A retrieval-based system that supports high-volume query access
- Star schema
  The most commonly used and the simplest style of dimensional modeling
  Contains a fact table surrounded by and connected to several dimension tables
- Snowflake schema
  An extension of star schema where the diagram resembles a snowflake in shape

Multidimensionality
The ability to organize, present, and analyze data by several dimensions, such as sales by region, by product, by salesperson, and by time (four dimensions)
- Multidimensional presentation
  - Dimensions: products, salespeople, market segments, business units, geographical locations, distribution channels, country, or industry
  - Measures: money, sales volume, head count, inventory profit, actual versus forecast
  - Time: daily, weekly, monthly, quarterly, or yearly

Star versus Snowflake Schema

Analysis of Data in DW
OLTP vs. OLAP
- OLTP (online transaction processing)
  - Capturing and storing data from ERP, CRM, POS, etc.
  - The main focus is on efficiency of routine tasks
- OLAP (online analytical processing)
  - Converting data into information for decision support
  - Data cubes, drill-down / rollup, slice & dice, etc.
  - Requesting ad hoc reports
  - Conducting statistical and other analyses
  - Developing multimedia-based applications
  - (more in the book)

OLAP vs. OLTP

OLAP Operations
- Slice: a subset of a multidimensional array
- Dice: a slice on more than two dimensions
- Drill Down/Up: navigating among levels of data ranging from the most summarized (up) to the most detailed (down)
- Roll Up: computing all of the data relationships for one or more dimensions
- Pivot: used to change the dimensional orientation of a report or an ad hoc query-page display

OLAP

Slicing Operations on a Simple Three-Dimensional Data Cube

Variations of OLAP
- Multidimensional OLAP (MOLAP)
  OLAP implemented via a specialized multidimensional database (or data store) that summarizes transactions into multidimensional views ahead of time
- Relational OLAP (ROLAP)
  The implementation of an OLAP database on top of an existing relational database
- Database OLAP and Web OLAP (DOLAP and WOLAP); Desktop OLAP

Technology Insights 2.2
Hands-On DW with MicroStrategy
- A wealth of teaching and learning resources can be found at TUN
- The available resources include scripted demonstrations, assignments, white papers, etc.

DW Implementation Issues
- Identification of data sources and governance
- Data quality planning, data model design
- ETL tool selection
- Establishment of service-level agreements
- Data transport, data conversion
- Reconciliation process
- End-user support
- Political issues
- (more in the book)

Successful DW Implementation: Things to Avoid
- Starting with the wrong sponsorship chain
- Setting expectations that you cannot meet
- Engaging in politically naive behavior
- Loading the data warehouse with information just because it is available
- Believing that data warehousing database design is the same as transactional database design
- Choosing a data warehouse manager who is technology oriented rather than user oriented
- (more in the book)

Failure Factors in DW Projects
- Lack of executive sponsorship
- Unclear business objectives
- Cultural issues being ignored
  - Change management
- Unrealistic expectations
- Inappropriate architecture
- Low data quality / missing information
- Loading data just because it is available

Massive DW and Scalability
- Scalability
  - The main issues pertaining to scalability:
    - The amount of data in the warehouse
    - How quickly the warehouse is expected to grow
    - The number of concurrent users
    - The complexity of user queries
  - Good scalability means that the cost of queries and other data-access functions grows no faster than linearly with the size of the warehouse

Real-Time/Active DW/BI
- Enabling real-time data updates for real-time analysis and real-time decision making is growing rapidly
  - Push vs. Pull (of data)
- Concerns about real-time BI
  - Not all data should be updated continuously
  - Mismatch of reports generated minutes apart
  - May be cost prohibitive
  - May also be infeasible

Enterprise Decision Evolution and Data Warehousing

Real-Time/Active DW at Teradata

Traditional versus Active DW

DW Administration and Security
- Data warehouse administrator (DWA)
  - The DWA should
    - have knowledge of high-performance software, hardware, and networking technologies
    - possess solid business knowledge and insight
    - be familiar with the decision-making processes so as to suitably design/maintain the data warehouse structure
    - possess excellent communication skills
- Security and privacy are pressing issues in DW
  - Safeguarding the most valuable assets
  - Government regulations (HIPAA, etc.)
  - Must be explicitly planned and executed

The Future of DW
- Sourcing
  - Web, social media, and Big Data
  - Open source software
  - SaaS (software as a service)
  - Cloud computing
- Infrastructure
  - Columnar
  - Real-time DW
  - Data warehouse appliances
  - Data management practices/technologies
  - In-database & in-memory processing
  - New DBMS
  - Advanced analytics

Free of Charge DW Portal for Teaching & Learning
- www.TeradataUniversityN
- Password to signup:

End of the Chapter
- Questions, comments

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America.
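
Supplementary sketch: star schema and OLAP operations
The star schema and the OLAP operations covered in this chapter (roll up, drill down, slice, dice) can be made concrete with a small relational example, in the spirit of ROLAP. This is only an illustrative sketch and not material from the book: the table names (fact_sales, dim_product, dim_region, dim_period), the sample rows, and the helper query() are all hypothetical, and Python's built-in sqlite3 module stands in for a real warehouse DBMS. Pivot is not shown.

```python
import sqlite3

# Hypothetical star schema: one fact table (fact_sales) connected to three
# dimension tables. All names and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_period  (period_id  INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    region_id  INTEGER REFERENCES dim_region(region_id),
    period_id  INTEGER REFERENCES dim_period(period_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'Laptop'), (2, 'Phone');
INSERT INTO dim_region  VALUES (1, 'East'), (2, 'West');
INSERT INTO dim_period  VALUES (1, 2023, 'Q1'), (2, 2023, 'Q2');
INSERT INTO fact_sales VALUES
    (1, 1, 1, 100.0), (1, 2, 1, 150.0),
    (2, 1, 1, 200.0), (2, 2, 2, 250.0),
    (1, 1, 2, 300.0);
""")

# The fact table joined to every dimension (the "star").
JOINED = """
FROM fact_sales f
JOIN dim_product p ON p.product_id = f.product_id
JOIN dim_region  r ON r.region_id  = f.region_id
JOIN dim_period  t ON t.period_id  = f.period_id
"""


def query(sql, params=()):
    """Run a query against the in-memory warehouse and return all rows."""
    return conn.execute(sql, params).fetchall()


# Roll up: summarize the sales measure to the year level.
rollup = query(f"SELECT t.year, SUM(f.amount) {JOINED} GROUP BY t.year")

# Drill down: the same measure one level more detailed (year -> quarter).
drilldown = query(
    f"SELECT t.year, t.quarter, SUM(f.amount) {JOINED} "
    "GROUP BY t.year, t.quarter ORDER BY t.quarter"
)

# Slice: fix one dimension (region = East) and look at the remaining ones.
slice_east = query(
    f"SELECT p.name, t.quarter, f.amount {JOINED} "
    "WHERE r.name = ? ORDER BY p.name, t.quarter",
    ("East",),
)

# Dice: restrict several dimensions at once.
dice = query(
    f"SELECT SUM(f.amount) {JOINED} "
    "WHERE r.name = ? AND p.name = ? AND t.quarter = ?",
    ("East", "Laptop", "Q2"),
)

print(rollup)      # [(2023, 1000.0)]
print(drilldown)   # [(2023, 'Q1', 450.0), (2023, 'Q2', 550.0)]
print(slice_east)  # [('Laptop', 'Q1', 100.0), ('Laptop', 'Q2', 300.0), ('Phone', 'Q1', 200.0)]
print(dice)        # [(300.0,)]
```

Here every query is computed on demand from relational tables, which is the ROLAP style. A MOLAP store would instead precompute such summaries into multidimensional views ahead of time.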