Data Mining English-Language Courseware.ppt

Uploaded by: 99****p  Document ID: 1420405  Upload date: 2019-02-25  Format: PPT  Pages: 68  Size: 4.32 MB

Data Mining: Data
Lecture Notes for Chapter 2, Introduction to Data Mining, by Tan, Steinbach, Kumar
(Tan, Steinbach, Kumar, Introduction to Data Mining, 4/18/2004)

What is Data?
- Collection of data objects and their attributes
- An attribute is a property or characteristic of an object
  - Examples: eye color of a person, temperature, etc.
  - An attribute is also known as a variable, field, characteristic, or feature
- A collection of attributes describes an object
  - An object is also known as a record, point, case, sample, entity, or instance
[Figure: data table labeled with Attributes and Objects]

Attribute Values
- Attribute values are numbers or symbols assigned to an attribute
- Distinction between attributes and attribute values
  - The same attribute can be mapped to different attribute values
    - Example: height can be measured in feet or meters
  - Different attributes can be mapped to the same set of values
    - Example: attribute values for ID and age are integers
    - But the properties of the attribute values can differ: ID has no limit, but age has a maximum and minimum value

Measurement of Length
- The way you measure an attribute may not match the attribute's properties.

Types of Attributes
- There are different types of attributes
  - Nominal
    - Examples: ID numbers, eye color, zip codes
  - Ordinal
    - Examples: rankings (e.g., taste of potato chips on a scale from 1-10), grades, height in {tall, medium, short}
  - Interval
    - Examples: calendar dates, temperatures in Celsius or Fahrenheit
  - Ratio
    - Examples: temperature in Kelvin, length, time, counts

Properties of Attribute Values
- The type of an attribute depends on which of the following properties it possesses:
  - Distinctness: = ≠
  - Order: < >
  - Addition: + -
  - Multiplication: * /
- Nominal attribute: distinctness
- Ordinal attribute: distinctness and order
- Interval attribute: distinctness, order, and addition
- Ratio attribute: all 4 properties

Attribute Type: Nominal
  Description: The values of a nominal attribute are just different names, i.e., nominal attributes provide only enough information to distinguish one object from another. (=, ≠)
  Examples: zip codes, employee ID numbers, eye color, sex: {male, female}
  Operations: mode, entropy, contingency correlation, χ² test

Attribute Type: Ordinal
  Description: The values of an ordinal attribute provide enough information to order objects. (<, >)
  Examples: hardness of minerals, {good, better, best}, grades, street numbers
  Operations: median, percentiles, rank correlation, run tests, sign tests

Attribute Type: Interval
  Description: For interval attributes, the differences between values are meaningful, i.e., a unit of measurement exists. (+, -)
  Examples: calendar dates, temperature in Celsius or Fahrenheit
  Operations: mean, standard deviation, Pearson's correlation, t and F tests

Attribute Type: Ratio
  Description: For ratio variables, both differences and ratios are meaningful. (*, /)
  Examples: temperature in Kelvin, monetary quantities, counts, age, mass, length, electrical current
  Operations: geometric mean, harmonic mean, percent variation

Attribute Level: Nominal
  Transformation: Any permutation of values
  Comments: If all employee ID numbers were reassigned, would it make any difference?

Attribute Level: Ordinal
  Transformation: An order-preserving change of values, i.e., new_value = f(old_value), where f is a monotonic function.
  Comments: An attribute encompassing the notion of good, better, best can be represented equally well by the values {1, 2, 3} or by {0.5, 1, 10}.

Attribute Level: Interval
  Transformation: new_value = a * old_value + b, where a and b are constants
  Comments: Thus, the Fahrenheit and Celsius temperature scales differ in terms of where their zero value is and the size of a unit (degree).

Attribute Level: Ratio
  Transformation: new_value = a * old_value
  Comments: Length can be measured in meters or feet.

Discrete and Continuous Attributes
- Discrete Attribute
  - Has only a finite or countably infinite set of values
  - Examples: zip codes, counts, or the set of words in a collection of documents
  - Often represented as integer variables
  - Note: binary attributes are a special case of discrete attributes
- Continuous Attribute
  - Has real numbers as attribute values
  - Examples: temperature, height, or weight
  - Practically, real values can only be measured and represented using a finite number of digits
  - Continuous attributes are typically represented as floating-point variables

Types of data sets
- Record
  - Data Matrix
  - Document Data
  - Transaction Data
- Graph
  - World Wide Web
  - Molecular Structures
- Ordered
  - Spatial Data
  - Temporal Data
  - Sequential Data
  - Genetic Sequence Data
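The four attribute types and their permitted transformations, described in the slides above, can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not code from the lecture: the names AttributeType, ADDED_OPERATIONS, meaningful_operations, celsius_to_fahrenheit, meters_to_feet, and remap_ordinal are hypothetical, and "!=" stands in for the "not equal" operator. It shows that each level inherits the operations of the weaker levels, that an interval transformation (a * old_value + b) preserves ratios of differences but not ratios of values, that a ratio transformation (a * old_value) preserves ratios of values, and that a monotonic relabeling preserves order.

```python
import math
from enum import IntEnum


class AttributeType(IntEnum):
    """Measurement levels from the slides, ordered weakest to strongest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Operations that each level adds on top of the weaker levels.
ADDED_OPERATIONS = {
    AttributeType.NOMINAL: ["=", "!="],   # distinctness
    AttributeType.ORDINAL: ["<", ">"],    # order
    AttributeType.INTERVAL: ["+", "-"],   # addition
    AttributeType.RATIO: ["*", "/"],      # multiplication
}


def meaningful_operations(attr_type):
    """Return every operation that is meaningful for attr_type."""
    ops = []
    for level in AttributeType:
        if level <= attr_type:
            ops.extend(ADDED_OPERATIONS[level])
    return ops


def celsius_to_fahrenheit(c):
    """Interval transformation: new_value = a * old_value + b."""
    return 1.8 * c + 32.0


def meters_to_feet(m):
    """Ratio transformation: new_value = a * old_value."""
    return m / 0.3048  # one foot is exactly 0.3048 m


def remap_ordinal(rank):
    """Ordinal transformation: a monotonic relabeling of {1, 2, 3}."""
    return {1: 0.5, 2: 1.0, 3: 10.0}[rank]


if __name__ == "__main__":
    print(meaningful_operations(AttributeType.INTERVAL))

    # Interval: ratios of differences survive the change of scale ...
    c = [10.0, 20.0, 30.0]
    f = [celsius_to_fahrenheit(x) for x in c]
    print((c[2] - c[0]) / (c[1] - c[0]), (f[2] - f[0]) / (f[1] - f[0]))
    # ... but ratios of the values themselves do not.
    print(c[1] / c[0], f[1] / f[0])

    # Ratio: ratios of values survive a pure rescaling.
    print(2.0 / 1.0, meters_to_feet(2.0) / meters_to_feet(1.0))

    # Ordinal: sorting by old or new labels gives the same order.
    ranks = [3, 1, 2]
    print(sorted(ranks) == sorted(ranks, key=remap_ordinal))
```

Using an ordered enumeration is one possible design choice: it lets "all operations of weaker levels" be expressed as a simple comparison, which mirrors the cumulative list of properties on the slide.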
