第九章网络与分布式操作系统-Piazza.ppt

资源描述

1、Centralized vs Distributed Systems,Centralized System: System in which major functions are performed by a single physical computerOriginally, everything on single computerLater: client/server modelDistributed System: physically separate computers working together on some taskEarly model: multiple se

2、rvers working togetherProbably in the same room or buildingOften called a “cluster”Later models: peer-to-peer/wide-spread collaboration,Distributed Systems: Motivation/Issues,Why do we want distributed systems?Cheaper and easier to build lots of simple computersEasier to add power incrementallyUsers

3、 can have complete control over some componentsCollaboration: Much easier for users to collaborate through network resources (such as network file systems)The promise of distributed systems:Higher availability: one machine goes down, use anotherBetter durability: store data in multiple locationsMore

4、 security: each piece easier to make secure Reality has been disappointingWorse availability: depend on every machine being upLamport: “a distributed system is one where I cant do work because some machine Ive never heard of isnt working!”Worse reliability: can lose data if any machine crashesWorse

5、security: anyone in world can break into systemCoordination is more difficultMust coordinate multiple copies of shared state information (using only a network)What would be easy in a centralized system becomes a lot more difficult,Distributed Systems: Goals/Requirements,Transparency: the ability of

6、the system to mask its complexity behind a simple interfacePossible transparencies:Location: Cant tell where resources are locatedMigration: Resources may move without the user knowingReplication: Cant tell how many copies of resource existConcurrency: Cant tell how many users there areParallelism:

7、System may speed up large jobs by spliting them into smaller piecesFault Tolerance: System may hide varoius things that go wrong in the systemTransparency and collaboration require some way for different processors to communicate with one another,Networking Definitions,Network: physical connection t

8、hat allows two computers to communicatePacket: unit of transfer, sequence of bits carried over the networkNetwork carries packets from one CPU to anotherDestination gets interrupt when packet arrivesProtocol: agreement between two parties as to how information is to be transmitted,网络与分布式操作系统,由于网络操作系

9、统与分布式操作系统所采用的技术大多是相通的，将它们放在一起介绍。计算机网络一、网络的概念：计算机网络是利用通信设备和通信线路，将地理上分散而且有相对独立功能的多个计算机系统，按照某种原则相互连接在一起构成的计算机体系，它是计算机技术和通信技术相结合的产物。二、网络组成： 1、组成：独立计算机、通信处理机、通信线路。 2.结点:网络中的主机及所附带的外部设备,也叫站点。,三、网络分类：（一）按网络覆盖的地理范围，可将网络分为局域网和广域网、城域网。（二）按照入网计算机的统一性分为：同构网络和异构网络 1、同构网络：在分布式操作系统中常采用同构网络因为进程的动态迁移要求迁出站点与迁

10、入站点具有相同或兼容的硬件环境。 2、异构网络：由不同类型的机器所构成的计算机网络。在大型网络操作系统中，常采用异构网络，因为它对入网机器的类型没有任何限制。,四、网络的拓扑：网络系统中的各个站点在物理上的联结方式。每种拓扑结构各有优点、缺点，对拓扑结构的评估常用以下标准： 1、基本成本：将系统中各站点联结起来所花费的代价。 2、通信成本：把一个信息由站点A传送到站点B的距离 3、可靠性：如果一个通信链或站点失效，是否影响基余站点之间的通信。（一）全联通拓朴结构：每个站点都直接与其它站点相连，这种结构的代价是昂贵的，因系统中任两个站点之间必须有直接的通信链。 *基本成本高，按站点数成平方

11、地增长。 *传送速度快，因任两站点间的信息传送仅涉及一条通信链 *可靠性高，因仅当所有通信链都失效时，系统割裂。（二）部分互联结构： 1、仅在一部分站点之间存在通信链，因而基本成本较低。 2、通信速度慢，由消息的传递可能要经过几个中间站点。 3、可靠性较差。,（三）层次结构：除根站点外，每个站点均有唯一的父节点和若干个子节点 1、基本成本低 2、通信时往往要涉及几个节点 3、除叶站点外，任何一个站点的失效将导致不连通（四）星型结构： 1、基本成本与站点数成线性比例关系 2、通信成本较低因一个站点与另一个站点之间的通信至多仅需两步。 3、可靠性差:一方面中心站点可能成为系统的瓶颈，另一方面，

12、中心站点一旦失效，网络瘫痪。（五）环形结构（六）总线结构,通信与协议OSI中，各层协议主要功能：（1）物理层：负责两个站点之间字位流的传输。（2）链路层，负责提供传输错误的恢复功能（3）网络层：负责将消息分解为传输单位，并选择路径。（4）传输层：负责站点之间的消息传送。（5）会话层：负责进程间通信。（6）表示层：负责数据转换。（7）应用层：负责提供用户界面。,Routing,Routing: the process of forwarding packets hop-by-hop through routers to reach their destinationNeed

13、more than just a destination address! Need a pathPost Office Analogy:Destination address on each letter is not sufficient to get it to the destinationTo get a letter from here to Florida, must route to local post office, sorted and sent on plane to somewhere in Florida, be routed to post office, sor

14、ted and sent with carrier who knows where street and house isInternet routing mechanism: routing tablesEach router does table lookup to decide which link to use to get packet closer to destinationDont need 4 billion entries in table: routing is by subnetCould packets be sent in a loop? Yes, if table

15、s incorrectRouting table contains:Destination address range output link closer to destinationDefault entry (for subnets without explicit entries),Naming in the Internet,How to map human-readable names to IP addresses?E.g. www.berkeley.edu 128.32.139.48E.g. different addresses depending on location,

16、 and loadWhy is this necessary?IP addresses are hard to rememberIP addresses change:Say, Server 1 crashes gets replaced by Server 2Or handled by different serversMechanism: Domain Naming System (DNS),Setting up Routing Tables,How do you set up routing tables?Internet has no centralized state!No sin

17、gle machine knows entire topologyTopology constantly changing (faults, reconfiguration, etc)Need dynamic algorithm that acquires routing tablesIdeally, have one entry per subnet or portion of addressCould have “default” routes that send packets for unknown subnets to a different router that has more

18、 informationPossible algorithm for acquiring routing tableRouting table has “cost” for each entryIncludes number of hops to destination, congestion, etc.Entries for unknown subnets have infinite costNeighbors periodically exchange routing tablesIf neighbor knows cheaper route to a subnet, replace yo

19、ur entry with neighbors entry (+1 for hop to neighbor)In reality:Internet has networks of many different scalesDifferent algorithms run at different scales,Domain Name System,DNS is a hierarchical mechanism for naming Name divided in domains, right to left: www.eecs.berkeley.eduEach domain owned by

20、a particular organizationTop level handled by ICANN (Internet Corporation for Assigned Numbers and Names)Subsequent levels owned by organizationsResolution: series of queries to successive serversCaching: queries take time, so results cached for period of time,Top-level,com,edu,Mit.edu,169.229.131.8

21、1,128.32.61.103,128.32.139.48,berkeley.edu,www,calmail,eecs,berkeley,MIT,How Important is Correct Resolution?,If attacker manages to give incorrect mapping:Can get someone to route to server, thinking that they are routing to a different serverGet them to log into “bank” give up username and passwor

22、dIs DNS Secure?Definitely a weak linkWhat if “response” returned from different server than original query?Get person to use incorrect IP address!Attempt to avoid substitution attacks:Query includes random number which must be returned July 2008 hole in DNS security located!Dan Kaminsky (security re

23、searcher) discovered an attack that broke DNS globallyOne person in an ISP convinced to load particular web page, then all users of that ISP end up pointing at wrong addressHigh profile, highly advertised need for patching DNS Big press release, lots of mysterySecurity researchers told no speculatio

24、n until patches applied,计算机模型一、数据迁移：当处于站点A的用户想要存取驻留于站点B的数据（文件）时，系统有两种方式：（1）将整个文件都传送给站点A,此后对文件的访问便成为局部的了，当用户A不再需要该文件时，它便被送回到站点B。（2）仅将文件的一部分传送到A，如果以后还需要，再传送另一部分，当站点A用户不再需要该文件时，将其修改部分传送回站点B。二、计算迁移在某些环境中，迁移计算比迁移数据效果更好例如：设站点A处的进程P要使用站点B处的文件，它不是将B处的文件取过来，而是执行一个远程过程调用，以调用一个在B点已定义好的过程，该过程可对P所需的文件进行适当计

25、算，然后将结果发送给进程P。另一种方法是：进程P发一个消息到站点B，然后由站点B处的操作系统创建一个代理进程Q，其功能是执行P所指定的任务，当Q完成使命后，通过消息将结果送给P，此法允许P、Q在不同站点并行。,三、作业迁移当一个作业到达时，它可以全部或部分地在不同站点处执行，其优点是： 1、负载平衡：作业或作业步可以在网上分布以均衡工作负载， 2、计算加速：如果一个作业可以划分为若干子作业，这些子作业可以在不同站点处并行执行，则整个作业的处理时间能被缩短。 3、硬件优选：有些作业可能只适合于在专用处理机上运行，例如矩际求逆。 4、软件优选：有的作业可能需要某些站点处的特别软件，而该软件不

26、适合迁移，或迁移开销比作业开销大。四、进程迁移：进程迁移是将正运行于某站点处理的进程迁移到另一站点，由于在迁移时刻进程已经在原站点运行了一段时间，迁移时不仅需要迁移其代码和数据，还应迁移与进程有关的数据结构，即进程控制块，进程迁移的目的是实现负载平衡。,互斥为了解决网络和分布式系统中互斥问题，必须提供类似信号灯的同步机构。为简单起见，这里只讨论二值信号灯的实现（相当于锁），由于网络和分布式系统中的互斥所涉及的进程可能位于不同站点，它们之间没有公共内存，因此比较复杂，这里假设共有n个处理机，所有处理机依次编号为1N，每个处理机中仅有一个进程，且进程与处理机具有相同编号。一、集中方式： 1

27、、基本思想：系统中有一个进程负责协调对于临界区的进入。每一个要求进入临界区的进程都必须发送一个请求给协调者进程，仅当收到协调者进程的回答信号后，它才能进入自己的临界区；当一个进程退出临界区时，也需发送一个释放信号给协调者进程，然后继续执行。当收到一个请求消息时，协调者进程需考查是否有某些进程正在其临界区内，若无，协调者进程发送一个回答消息给请求进程，否则请求进程需排队等待，若协调者进程收到一个释放消息，则它给等待队列中的某一进程发送回答信号允许它进入其临界区。 2、特点：（1）无死锁（2）如协调者进程是公平的，如FCS,则不会发生“饥饿”现象。（3）每次进入临界区需三个消息：请求、回答

28、、释放.,二、分布方式： 1、方法：当一个进程P要进入其临界区时。它产生一个新的时间邮戳TS，并发送一个Request(P,TS)给所有其它进程，当某个进程接收到此消息时它可能立即回答（如果它当前不在其临界区内）；也可能延迟回答（如果它当前正在其临界区内）。一个收到系统中所有进程回答信号的进程可以进入它的临界区，当一个进程退出其临界区后，它需要给所有向它发来请求消息的进程发送回答消息。 2、进程作出立即回答或延迟回答的决定因素：（1）如果进程正在它的临界区内，延迟回答（2）如果一个进程不想进入特的临界区，立即回答; (2)如果一个进程想进入但尚未进入它的临界区，该进程考查所有保存的请求表，

29、此表用于保存该进程已收到但尚未回答的消息，并将当前收到的REQUEST(P,TS)消息中的TS与该表中所有消息的TS作比较，如果这个TS比表中所有消息的TS都小,则立即回答进程P，否则REQUEST被加到等待表中 3、上述算法特性：（1）可实现互斥（2）确保无死锁（3）无“饿死“情况（4）每次进入临界区需2（N-1）个消息。,DISTRIBUTED COORDINATION,FULLY DISTRIBUTED APPROACHApproach due to Lamport. These are the general properties for the method:The gen

30、eral mechanism is for a process Pi to send a request ( with ID and time stamp ) to all other processes.When a process Pj receives such a request, it may reply immediately or it may defer sending a reply back.When responses are received from all processes, then Pi can enter its Critical Section.When

31、Pi exits its critical section, the process sends reply messages to all its deferred requests.,Mutual Exclusion/Synchronization,DISTRIBUTED COORDINATION,FULLY DISTRIBUTED APPROACHThe general rules for reply for processes receiving a request:If Pj receives a request, and Pj process is in its critical

32、section, defer (hold off) the response to Pi.If Pj receives a request, and not in critical section, and doesnt want to get in, then reply immediately to Pi.If Pj wants to enter its critical section but has not yet entered it, then it compares its own timestamp TSj with the timestamp TSi from Ti.If T

33、Sj TSi, then it sends a reply immediately to Pi. Pi asked first.Otherwise the reply is deferred until after Pj finishes its critical section.,Mutual Exclusion/Synchronization,DISTRIBUTED COORDINATION,The Fully Distributed Approach assures:Mutual exclusionFreedom from deadlockFreedom from starvation,

34、 since entry to the critical section is scheduled according to the timestamp ordering. The timestamp ordering ensures that processes are served in a first-come, first-served order.2 X ( n - 1 ) messages needed for each entry. This is the minimum number of required messages per critical-section entry

35、 when processes act independently and concurrently.Problems with the method include:Need to know identity of everyone in system.Fails if anyone dies - must continually monitor the state of all processes.Processes are always coming and going so its hard to maintain current data.,Mutual Exclusion/Sync

36、hronization,三、令牌传递方式： 1、方法：该法仅适用于逻辑拓朴结构为环形的系统，为实现互斥，系统中有一个标志。它作为特殊类型的消息在系统中环行。当一个进程接收到这个标志后，它就可以进入其临界区，并扣留这个标志；当它退出临界区之后，标志才被释放，并沿环路继续绕行，如果一个接收到标志的进程并不想进入其临界区，只需放行此标志。 2、此法特点：（1）可实现互斥。因系统中只有一个标志，则最多只有一个进程在其临界区内。（2）无“饿死”情况（在单向环形系统中）（3）两种失效情况：一是如果标志丢失，则应能发现并选择一个进程产生新标志。二是如果一个进程夭折了，则逻辑环断裂，

37、此时系统应能重构一个新的逻辑环。,死锁处理传统系统中所用的死锁预防、避免，以及检测等算法的思想一般也适合于网络和分存式系统，只需做某些适当的修改。例如：只要在系统事件之间定义一个全序，资源分配预防死锁技术就可用于网络和分布式环境中，也即，系统内所有资源被赋予一个唯一的编号，一个进程可以要求一个编号为I的资源，当且仅当它未占有编号比I更小的资源。只要确定系统中某个进程为银行家，由它保持执行银行家算法所必需的信息，并负责系统中资源的分配，则银行家算法也同样适合于网络和分布式系统,一、死锁预防： (一)死锁产生的四个必要条件：互斥条件，请求和保持条件、不剥夺条件、环路等待条件 (二)网络与分布式

38、系统中预防死锁的方法:通过剥夺资源以破坏循环等待条件。方法：赋给每个进程一个唯一的优先数，这个优先数被用于决定一个进程Pi 是否等待另外一个进程 Pi。例如：如果Pi具有更高的优先数，可以令Pi等待Pj，否则Pi回退，即死掉。缺陷：可能出现饥饿现象，因为有些低优先级的进程可能总是被回退，为此有人提出用时间邮戳的设想，系统中的每个进程，在其产生时被赋予一个唯一的时间邮戳，下面介绍两个方案：,方案一：死等：资源申请者回退：此方案基于非剥夺技术，当一个进程Pi 要求另外一个进程Pj保持的资源时，Pi被允许等待, 仅当它具有比Pj更小的邮戳时间，即Pi是比Pj更老的，否则Pi回退。方案二：剥夺式

39、等待：资源占有者回退：此法基于剥夺技术，是死等方案的改进，当进程Pi要求进程Pj当前占有的资源时，则Pi获准等待的条件是它具有比Pj更大的邮戳时间，即Pi比Pj更年轻, 否则Pj回退。,二、死锁检测：死锁预防技术可能剥夺一些资源或回退一些进程，即使死锁不会出现。为了防止不必要的剥夺和回退，可以允许死锁发生，并在发生后将死锁检测出来，为此需要构造资源分配状态图，若此图出现环路，则死锁发生。 (一)资源等待图的保存方法：要求每个站点保持一个局部等待图，图中站点对应占有和申请局部于该站点资源的那些进程，这些进程可能是局部于本站点的, 也可能是属于其它站点的。 (二)判断：如果注

40、意一个局部等待图中出现了环路，则死锁已经发生，但若每个站点的局部等待图均无环路, 并不意味着没有死锁，为了确定没有思锁，必须证明所有局部等待图之“并”没有环路，两个或多个局部图之“并”已经出现了一个环路。,The problem here is how to get agreement with an unreliable mechanism. In order to do an election, as we just discussed, it would be necessary to work around the following problems.UNRELIABLE CO

41、MMUNICATIONSCan have faulty links - can use a timeout to detect this.FAULTY PROCESSESCan have faulty processes generating bad messages.Cannot guarantee agreement.,DISTRIBUTED COORDINATION,Reaching Agreement Between Processes,分布式文件系统,Caching vs. Remote Access局部性原理：Many remote accesses can be handled

42、by a local cache. Theres a great deal of locality of reference in file accesses. Servers can be accessed only occasionally rather than for each access.网络效率：Caching causes data to be moved in a few big chunks rather than in many smaller pieces; this leads to considerable efficiency for the network.一致

43、性：Cache consistency is the major problem with caching. When there are infrequent writes, caching is a win. In environments with many writes, the work required to maintain consistency overwhelms caching advantages.复杂性：Caching requires a whole separate mechanism to support acquiring and storage of lar

44、ge amounts of data. Remote service merely does whats required for each call. As such, caching introduces an extra layer and mechanism and is more complicated than remote service.,How to perform Authorization for Distributed Systems?,Issues: Are all user names in world unique?No! They only have small

45、 number of characterskubimit.edu kubitronlcs.mit.edu kubitroncs.berkeley.eduHowever, someone thought their friend was kubimit.edu and I got very private email intended for someone elseNeed something better, more unique to identify personSuppose want to connect with any server at any time?Need an account on every machine! (possibly with different user name for each account)OR: Need to use something more universal as identityPublic Keys! (Called “Principles”)People are their public keys,

展开阅读全文