Data Mining: Concepts and Techniques
Chapter 11
Additional Theme: RFID Data Warehousing and Mining and High-Performance Computing

Jiawei Han and Micheline Kamber
Department of Computer Science
University of Illinois at Urbana-Champaign
www.cs.uiuc.edu/hanj
© 2006 Jiawei Han and Micheline Kamber. All rights reserved.
Acknowledgements: Hector Gonzalez and Shengnan Cong


Outline

- Introduction to RFID Technology
- Motivation: Why RFID-Warehousing?
- RFID-Warehouse Architecture
- Performance Study
- Linking RFID Data Analysis with HPC
- Conclusions


What is RFID?

- Radio Frequency Identification (RFID)
  - Technology that allows a sensor (reader) to read, from a distance and without line of sight, a unique electronic product code (EPC) associated with a tag

[Figure: tag and reader]


RFID System

[Figure: RFID system. Source: ]


Applications

- Supply chain management: real-time inventory tracking
- Retail: active shelves monitor product availability
- Access control: toll collection, credit cards, building access
- Airline luggage management: (British Airways) implemented to reduce lost/misplaced luggage (20 million bags a year)
- Medical: implant patients with a tag that contains their medical history
- Pet identification: implant RFID tag with pet owner information (www.pet-)


RFID Warehouse Architecture


Challenges of RFID Data Sets

- Data generated by RFID systems is enormous due to redundancy and low level of abstraction
  - Walmart is expected to generate 7 terabytes of RFID data per day
- Solution requirements
  - Highly compact summary of the data
  - OLAP operations on a multi-dimensional view of the data
  - Summary should preserve the path structure of RFID data
  - It should be possible to efficiently drill down to individual tags when an interesting pattern is discovered


Why RFID-Warehousing? (1)

- Lossless compression
  - Significantly reduce the size of the RFID data set by removing redundancy and grouping objects that move and stay together
- Data cleaning: reasoning based on more complete info
  - Multi-reading, miss-reading, error-reading, bulky movement, ...
- Multi-dimensional summary: product, location, time, ...
  - Store manager: check item movements from the backroom to different shelves in his store
  - Region manager: collapse intra-store movements and look at distribution centers, warehouses, and stores
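
Note on the compression bullet: the idea of removing redundancy and grouping objects that move and stay together can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the slides: the record layouts `Reading` and `Stay`, the function names `collapse_readings` and `group_stays`, and the sample data are all hypothetical. Repeated readings of a tag at one location are collapsed into a single stay interval, and tags sharing the same location and interval are then stored as one group. Each EPC is kept, so drill-down to individual tags remains possible, but the individual read timestamps inside an interval are not kept in this sketch.

```python
from collections import defaultdict
from typing import NamedTuple


class Reading(NamedTuple):
    """Hypothetical raw reader output: one tag seen at one place and time."""
    epc: str
    location: str
    time: int


class Stay(NamedTuple):
    """Hypothetical summary record: one tag's uninterrupted stay at a place."""
    epc: str
    location: str
    time_in: int
    time_out: int


def collapse_readings(readings):
    """Merge consecutive readings of a tag at the same location into one stay."""
    stays = []
    # Sorting by (epc, time) puts each tag's readings together in time order.
    for r in sorted(readings, key=lambda x: (x.epc, x.time)):
        last = stays[-1] if stays else None
        if last and last.epc == r.epc and last.location == r.location:
            # Same tag, same place: extend the open stay instead of adding a row.
            stays[-1] = last._replace(time_out=r.time)
        else:
            stays.append(Stay(r.epc, r.location, r.time, r.time))
    return stays


def group_stays(stays):
    """Group tags that share the same location and stay interval."""
    groups = defaultdict(list)
    for s in stays:
        groups[(s.location, s.time_in, s.time_out)].append(s.epc)
    return {key: sorted(epcs) for key, epcs in groups.items()}


# Made-up sample data: e1 and e2 move together, e3 leaves the dock early.
sample = [
    Reading("e1", "dock", 1), Reading("e1", "dock", 2), Reading("e1", "dock", 3),
    Reading("e1", "shelf", 5), Reading("e1", "shelf", 6),
    Reading("e2", "dock", 1), Reading("e2", "dock", 3),
    Reading("e2", "shelf", 5), Reading("e2", "shelf", 6),
    Reading("e3", "dock", 1), Reading("e3", "dock", 2),
]
sample_groups = group_stays(collapse_readings(sample))
```

In the sample, eleven raw readings reduce to five stay records and then to three groups, which shows where the size reduction comes from when many items travel together.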