Cloud Computing and Cloud Data Management
Lu Jiaheng, Renmin University of China
Frontier Workshop on Advanced Data Management

Outline
- Overview of cloud computing
- Google cloud computing techniques: GFS, Bigtable, and MapReduce
- Yahoo cloud computing techniques and Hadoop
- Challenges of cloud data management

New course at Renmin University: Distributed Systems and Cloud Computing
- Overview of distributed systems
- Survey of distributed cloud computing technologies
- Distributed cloud computing platforms
- Distributed cloud computing program development

Part I: Overview of Distributed Systems
- Chapter 1: Introduction to distributed systems
- Chapter 2: Client-server architecture
- Chapter 3: Distributed objects
- Chapter 4: Common Object Request Broker Architecture (CORBA)

Part II: Survey of Cloud Computing
- Chapter 5: Introduction to cloud computing
- Chapter 6: Cloud services
- Chapter 7: Comparison of cloud-related technologies
  - 7.1 Grid computing and cloud computing
  - 7.2 Utility computing and cloud computing
  - 7.3 Parallel and distributed computing and cloud computing
  - 7.4 Cluster computing and cloud computing

Part III: Cloud Computing Platforms
- Chapter 8: The three core technologies of the Google cloud platform
- Chapter 9: Technologies of the Yahoo cloud platform
- Chapter 10: Technologies of the Aneka cloud platform
- Chapter 11: Technologies of the Greenplum cloud platform
- Chapter 12: Technologies of the Amazon Dynamo cloud platform

Part IV: Development on Cloud Computing Platforms
- Chapter 13: Development on Hadoop
- Chapter 14: Development on HBase
- Chapter 15: Development on Google Apps
- Chapter 16: Development on MS Azure
- Chapter 17: Development on Amazon EC2

Cloud computing

Why we use cloud computing?
- Case 1: You write a file and save it. If the computer goes down, the file is lost. Files stored in the cloud are never lost.
- Case 2: To use IE, QQ, or C++, you download, install, and then use each one. Instead, get the service from the cloud.

What is cloud and cloud computing?
- Cloud: on-demand resources or services over the Internet, with the scale and reliability of a data center.
- Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet. Users need not have knowledge of, expertise in, or control over the technology infrastructure in the cloud that supports them.

Characteristics of cloud computing
- Virtual: software, databases, Web servers, operating systems, storage, and networking are provided as virtual servers.
- On demand: add and subtract processors, memory, network bandwidth, and storage.

Types of cloud service
- IaaS: Infrastructure as a Service
- PaaS: Platform as a Service
- SaaS: Software as a Service

SaaS
Software delivery model:
- No hardware or software to manage
- Service delivered through a browser
- Customers use the service on demand
- Instant scalability
Examples:
- Your current CRM package is not managing the load, or you simply don't want to host it in-house. Use a SaaS provider such as S
- Your email is hosted on an Exchange server in your office and it is very slow. Outsource this using Hosted Exchange.

PaaS
Platform delivery model:
- Platforms are built upon infrastructure, which is expensive
- Estimating demand is not a science!
- Platform management is not fun!
Examples:
- You need to host a large file (5Mb) on your website and make it available to 35,000 users for only two months. Use CloudFront from Amazon.
- You want to start storage services on your network for a large number of files and you do not have the storage capacity. Use Amazon S3.

IaaS
Computer infrastructure delivery model:
- A platform virtualization environment
- Computing resources, such as storage and processing capacity
- Virtualization taken a step further
Examples:
- You want to run a batch job but you don't have the infrastructure necessary to run it in a timely manner. Use Amazon EC2.
- You want to host a website, but only for a few days. Use Flexiscale.

Cloud computing and other computing techniques

The 21st Century Vision Of Computing
- Leonard Kleinrock, one of the chief scientists of the original Advanced Research Projects Agency Network (ARPANET) project, which seeded the Internet, said: “As of now, computer networks are still in their infancy, but as they grow up and become sophisticated, we will probably see the spread of computer utilities which, like present electric and telephone utilities, will service individual homes and offices across the country.”
- Sun Microsystems co-founder Bill Joy also indicated: “It would take time until these markets to mature to generate this kind of value. Predicting now which companies will capture the value is impossible. Many of them have not even been created yet.”

Definitions
- Utility computing is the packaging of computing resources, such as computation and storage, as a metered service similar to a traditional public utility.
- A computer cluster is a group of linked computers, working together closely so that in many respects they form a single computer.
- Grid computing is the application of several computers to a single problem at the same time, usually a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data.
- Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.

Grid Computing & Cloud Computing
- They share a lot in common: intention, architecture, and technology.
- They differ in programming model, business model, compute model, applications, and virtualization.
- The problems are mostly the same:
  - manage large facilities;
  - define methods by which consumers discover, request, and use resources provided by the central facilities;
  - implement the often highly parallel computations that execute on those resources.
- Virtualization:
  - Grid: does not rely on virtualization as much as clouds do; each individual organization maintains full control of its resources.
  - Cloud: virtualization is an indispensable ingredient for almost every cloud.

2019/7/2
Any questions or comments?

Google Cloud computing techniques

The Google File System (GFS)
- A scalable distributed file system for large, distributed, data-intensive applications.
- Multiple GFS clusters are currently deployed. The largest ones have:
  - 1000+ storage nodes
  - 300+ terabytes of disk storage
  - heavy access by hundreds of clients on distinct machines

Introduction
- Shares many of the same goals as previous distributed file
  systems: performance, scalability, reliability, etc.
- The GFS design has been driven by four key observations of Google's application workloads and technological environment.

Intro: Observations
1. Component failures are the norm.
   Constant monitoring, error detection, fault tolerance, and automatic recovery are integral to the system.
2. Files are huge by traditional standards.
   Multi-GB files are common, so I/O operations and block sizes must be revisited.
3. Most files are mutated by appending new data.
   Appending is therefore the focus of performance optimization and atomicity guarantees.
4. Co-designing the applications and the file system API benefits the overall system by increasing flexibility.

The Design
A cluster consists of a single master and multiple chunkservers and is accessed by multiple clients.

The Master
- Maintains all file system metadata: namespace, access control information, file-to-chunk mappings, chunk (including replica) locations, etc.
- Periodically communicates with chunkservers in HeartBeat messages to give instructions and check state.
- Makes sophisticated chunk placement and replication decisions using global knowledge.
- For reading and writing, a client contacts the master to get chunk locations, then
  deals directly with the chunkservers, so the master is not a bottleneck for reads and writes.

Chunkservers
- Files are broken into chunks. Each chunk has an immutable, globally unique 64-bit chunk handle, assigned by the master at chunk creation.
- Chunk size is 64 MB.
- Each chunk is replicated on 3 servers by default.
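
The read path described above (a client asks the master where a chunk lives, then talks to the chunkservers directly) can be sketched in a few lines of Python. This is a toy in-memory model, not the actual GFS interface: `ToyMaster`, `locate`, `chunk_index_and_offset`, the file name, and the server names are all hypothetical, and the round-robin placement is a deliberate simplification. Only the 64 MB chunk size, the 64-bit handle, and the default of 3 replicas come from the slides.

```python
import itertools

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunk size, as stated on the slide
REPLICAS = 3                   # default number of replicas per chunk


class ToyMaster:
    """Hypothetical in-memory stand-in for the GFS master (metadata only)."""

    def __init__(self, chunkservers):
        self.chunkservers = list(chunkservers)
        self.chunks = {}  # (filename, chunk index) -> (handle, replica list)
        self._next_handle = itertools.count(1)

    def locate(self, filename, chunk_index):
        """Return (handle, replica list), creating the chunk on first use."""
        key = (filename, chunk_index)
        if key not in self.chunks:
            # Mask the counter so the handle always fits in 64 bits.
            handle = next(self._next_handle) & 0xFFFFFFFFFFFFFFFF
            # Naive round-robin placement (assumption); the real master
            # uses global knowledge to place replicas.
            n = len(self.chunkservers)
            start = (handle * REPLICAS) % n
            replicas = [self.chunkservers[(start + i) % n] for i in range(REPLICAS)]
            self.chunks[key] = (handle, replicas)
        return self.chunks[key]


def chunk_index_and_offset(byte_offset):
    """Translate a byte offset in a file into (chunk index, offset inside the chunk)."""
    return divmod(byte_offset, CHUNK_SIZE)


master = ToyMaster(["cs1", "cs2", "cs3", "cs4", "cs5"])
index, inner = chunk_index_and_offset(200 * 1024 * 1024)  # byte at 200 MB
handle, replicas = master.locate("/logs/web.log", index)
print(index, inner, handle, replicas)
```

The client would cache the returned handle and replica list, then request the byte range from one of the listed chunkservers without involving the master again, which is why the master handles only metadata traffic.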

Clients
- Linked to applications through the file system API.
- Communicate with the master and chunkservers for reading and writing: master interactions only for metadata, chunkserver interactions for data.
- Cache only metadata; file data is too large to cache.

Chunk Locations
- The master does not keep a persistent record
  of the locations of chunks and replicas.
- It polls chunkservers for this information at startup and whenever chunkservers join or leave the cluster.
- It stays up to date by controlling the placement of new chunks and by monitoring chunkservers through HeartBeat messages.

Operation Log
- A record of all critical metadata changes.
- Stored on the master and replicated on other machines.
- Defines the order of concurrent operations.
- Also used to recover the file system state.

System Interactions: Leases and Mutation Order
- Leases maintain a consistent mutation order across all chunk replicas.
- The master grants a lease to one replica, called the primary.
- The primary chooses the serial mutation order, and all replicas follow this order.
- This minimizes management overhead for the master.

Atomic Record Append
- The client specifies only the data to write; GFS chooses the offset, appends the data to each replica at least once, and returns the offset to the client.
- Heavily used by Google's distributed applications.
- No distributed lock manager is needed, because GFS chooses the offset, not the client.

Atomic Record Append: How?
- Follows a control flow similar to that of other mutations.
- The primary tells the secondary replicas to append at the same offset as the primary.
- If the append fails at any replica, the client retries it. As a result, replicas of the same chunk may contain different data, including duplicates, whole or in part, of the same record.
- GFS does not guarantee that all replicas are bitwise identical; it guarantees only that the data is written at least once as an atomic unit.
- The data must be written at the same offset on all chunk replicas for success to be reported.

Detecting Stale Replicas
- The master keeps a chunk version number to distinguish up-to-date replicas from stale ones.
- The version is increased whenever the master grants a lease.
- If a replica is unavailable, its version is not increased.
- The master detects stale replicas when chunkservers report their chunks and versions.
- Stale replicas are removed during garbage collection.

Garbage collection
- When a client deletes a file, the master logs the deletion like other changes and renames the file to a hidden name.
- While scanning the file system namespace, the master removes files that have been hidden for longer than 3 days; their metadata is also erased.
- In HeartBeat messages, each chunkserver reports a subset of its chunks to the master, and the master replies with those that no longer have metadata.
- The chunkserver then removes these chunks on its own.

Fault Tolerance: High Availability
- Fast recovery: the master and chunkservers can restart in seconds.
- Chunk replication.
- Master replication: “shadow” masters provide read-only access when the primary master is down; mutations are not considered done until recorded on all master replicas.

Fault Tolerance: Data Integrity
- Chunkservers use checksums to detect corrupt data.
- Since replicas are not bitwise identical, chunkservers maintain their
  own checksums.
- For reads, the chunkserver verifies the checksum before sending the chunk.
- Checksums are updated during writes.

Introduction to MapReduce

MapReduce: Insight
- “Consider the problem of counting the number of occurrences of each word in a large collection of documents.”
- How would you do it in parallel?

MapReduce Programming Model
- Inspired by the map and reduce operations commonly used in functional programming languages such as Lisp.
- Users implement an interface of two primary methods:
  1. Map: (key1, val1) -> (key2, val2)
  2. Reduce: (key2, list of val2) -> val3

Map operation
- Map, a pure function written by the user, takes an input key/value pair, e.g. (docid, doc-content), and produces a set of intermediate key/value pairs.
- By analogy with SQL, map can be seen as the group-by clause of an aggregate query.

Reduce operation
- On completion of the map phase, all the intermediate values for a given output key are combined together
  into a list and given to a reducer.
- By the same analogy, reduce can be seen as the aggregate function (e.g., average) that is computed over all the rows with the same group-by attribute.

Pseudo-code

    map(String input_key, String input_value):
        // input_key: document name
        // input_value: document contents
        for each word w in input_value:
            EmitIntermediate(w, "1");

    reduce(String output_key, Iterator intermediate_values):
        // output_key: a word
        // intermediate_values: a list of counts
        int result = 0;
        for each v in intermediate_values:
            result += ParseInt(v);
        Emit(AsString(result));

Figure slides: MapReduce: Execution overview; MapReduce: Example; MapReduce in Parallel: Example

MapReduce: Fault Tolerance
- Handled via re-execution of tasks; task completion is committed through the master.
- What happens if a mapper fails? Re-execute its completed and in-progress map tasks.
- What happens if a reducer fails? Re-execute its in-progress reduce tasks.
- What happens if the master fails? Potential trouble!

MapReduce: Walk through of One more Application

MapReduce: PageRank
- PageRank models the behavior of a “random surfer”.
- C(t) is the out-degree of t, and (1-d) is a damping factor (random jump).
- The “random surfer” keeps clicking on successive links at random, not taking content into consideration.
- Each page distributes its PageRank equally among all the pages it links to.
- The damping factor models the surfer “getting bored” and typing an arbitrary URL.

PageRank: Key Insights
- The effects at each iteration are local: iteration i+1 depends only on iteration i.
- At iteration i, the PageRank for individual nodes
  can be computed independently.

PageRank using MapReduce
- Use a sparse matrix representation (M).
- Map each row of M to a list of PageRank “credit” to assign to its out-link neighbours.
- These prestige scores are reduced to a single PageRank value for a page by aggregating over them.
- Source of image: Lin 2008

Phase 1: Process HTML
- The map task takes (URL, page-content) pairs and maps them to (URL, (PRinit, list-of-urls)).
  - PRinit is the “seed” PageRank for URL.
  - list-of-urls contains all pages pointed to by URL.
- The reduce task is just the identity function.

Phase 2: PageRank Distribution
- The reduce task gets (URL, url_list) and many (URL, val) values.
  - Sum the vals and fix up with d to get the new PR.
  - Emit (URL, (new_rank, url_list)).
- Check for convergence using a non-parallel component.

MapReduce: Some More Apps
- Distributed grep
- Count of URL access frequency
- Clustering (K-means)
- Graph algorithms
- Indexing systems

MapReduce Programs In Google Source Tree

MapReduce: Extensions and similar apps
- PIG (Yahoo)
- Hadoop (Apache)
- DryadLinq (Microsoft)

Large Scale Systems Architecture using MapReduce

BigTable: A Distributed Storage System for Structured Data

Introduction
- BigTable is a distributed storage system for managing
  structured data.
- Designed to scale to a very large size: petabytes of data across thousands of servers.
- Used for many Google projects: Web indexing, Personalized Search, Google Earth, Google Analytics, Google Finance, ...
- A flexible, high-performance solution for all of Google's products.

Motivation
- Lots of (semi-)structured data at Google:
  - URLs: contents, crawl metadata, links, anchors, pagerank, ...
  - Per-user data: user preference settings, recent queries/search results, ...
  - Geographic locations: physical entities (shops, restaurants, etc.), roads, satellite image data, user annotations, ...
- Scale is large:
  - Billions of URLs, many versions/page (20K/version)
  - Hundreds of millions of users, thousands of queries/sec
  - 100TB+ of satellite image data

Why not just use commercial DB?
- Scale is too large for most commercial databases.
- Even if it weren't, the cost would be very high.
  - Building internally means the system can be applied across many projects for low incremental cost.
- Low-level storage optimizations help performance significantly.
  - These are much harder to do when running on top of a database layer.
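
To get a feel for the "Scale is large" figures in the Motivation slide, a back-of-envelope estimate helps. The sketch below is only an illustration: the 20 KB per version comes from the slide, while the URL count and the number of versions per page are assumed values chosen for the example, and the helper name `crawl_storage_bytes` is hypothetical.

```python
KB = 1024
TB = 1024 ** 4


def crawl_storage_bytes(num_urls, versions_per_page, bytes_per_version=20 * KB):
    """Raw size of stored page versions, ignoring compression, replication, and indexes."""
    return num_urls * versions_per_page * bytes_per_version


# Assumed inputs (not from the slides): 10 billion URLs, 3 stored versions each.
size = crawl_storage_bytes(num_urls=10 * 10**9, versions_per_page=3)
print(f"{size / TB:.0f} TB before replication")
```

Even under these modest assumptions the page contents alone reach hundreds of terabytes before replication, which is consistent with the slide's point that the scale exceeds what most commercial databases were built for.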