Data Mining: Anomaly Detection (数据挖掘之异常检测.pptx)

Uploaded by: 99****p | Document ID: 1420427 | Uploaded: 2019-02-25 | Format: PPTX | Pages: 44 | Size: 5.33 MB

Anomaly Detection: An Introduction

Source of slides:
- Tutorial at the American Statistical Association (ASA 2008)
- Jiawei Han, Data Mining: Concepts and Techniques
- Tutorial at the European Conference on Principles and Practice of Knowledge Discovery in Databases
Speaker: Wentao Li

Outline
- Definition
- Application
- Methods
Time is limited, so this talk only sketches the overall picture of anomaly detection; for more detail, please refer to the paper.

What are Anomalies?
- An anomaly is a pattern in the data that does not conform to the expected behavior.
- An anomaly is a data object that deviates significantly from the normal objects, as if it were generated by a different mechanism.
- Also referred to as outliers, exceptions, peculiarities, surprises, etc.
- Anomalies translate to significant (often critical) real-life entities:
  - Cyber intrusions
  - Credit card fraud
  - Faults in mechanical systems

Related problems
- Outliers are different from noise.
  - Noise is random error or variance in a measured variable.
  - Noise should be removed before outlier detection.
- Outliers are interesting: they violate the mechanism that generates the normal data.
- Outlier detection vs. novelty detection: at an early stage a novel pattern is treated as an outlier, but it is later merged into the model.

Key Challenges
- Defining a representative normal region is challenging.
- The boundary between normal and outlying behavior is often not precise.
- Labeled data for training and validation is not always available.
- The exact notion of an outlier differs across application domains.
- Data might contain noise.
- Normal behavior keeps evolving.
- Relevant features must be selected appropriately.

Map
- Related areas (theory)
- Application (practice)
- Problem formulation
- Detection effect

Aspects of Anomaly Detection Problem
- Nature of input data: what are the characteristics of the input data?
- Availability of supervision: number of labels
- Type of anomaly: point, contextual, structural
- Output of anomaly detection: score vs. label
- Evaluation of anomaly detection techniques: what kind of detection is good?

Input Data
- The most common form of data handled by anomaly detection techniques is record data:
  - Univariate
  - Multivariate

Input Data: Nature of Attributes
- Binary
- Categorical
- Continuous
- Hybrid
(Example table columns on the slide: categorical, continuous, continuous, categorical, binary)
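To make the "score vs. label" distinction concrete for univariate record data, here is a minimal sketch. The slides do not prescribe a method at this point; the absolute z-score is used here only as an assumed, illustrative scoring rule. The function names (`zscore_scores`, `scores_to_labels`), the threshold of 2.0, and the sample readings are all hypothetical.

```python
from statistics import mean, stdev


def zscore_scores(values):
    """Return one anomaly score per value: its absolute z-score.

    Assumption: a simple z-score stands in for the detector; the slides
    do not name a specific scoring technique here.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        # All values identical: nothing deviates, so every score is 0.
        return [0.0 for _ in values]
    return [abs(v - mu) / sigma for v in values]


def scores_to_labels(scores, threshold=2.0):
    """Turn continuous scores into binary labels (True = anomaly).

    The threshold is an illustrative choice, not a recommended value.
    """
    return [s > threshold for s in scores]


# Hypothetical univariate record data: one continuous attribute per record.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0]

scores = zscore_scores(readings)   # score output: supports ranking records
labels = scores_to_labels(scores)  # label output: hard normal/anomaly decision

for value, score, label in zip(readings, scores, labels):
    print(f"{value:6.1f}  score={score:.2f}  anomaly={label}")
```

A score output lets an analyst rank records and choose a cut-off afterwards, while a label output fixes that cut-off in advance. In this sketch only the last reading exceeds the threshold.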
