Introduction to Virtualization
Kang Chen
HPC, DCST, Tsinghua

Outlines

Virtualization is HOT!

History

What is virtualization?

See multiple OSes?

Terminology

Virtualization: What is it?

Motivations

Virtual Machine Monitor
- Software layer between the hardware and the OS; virtualizes and manages hardware resources

Virtualization Levels
[Figure: the software stack of applications, user-level libraries, kernel, and hardware, linked by API calls, system calls, and instructions, and divided into user space and kernel space]

Levels of Virtualization

ISA Level Virtualization

ISA Level Virtualization: Examples

Stanford DISCO

Emulator, Binary Translation
- CPU emulator
  - Simulates a different ISA while running on a machine with one ISA
  - E.g. an Intel IA32 emulator running on a PowerPC-based Mac
- Binary translation
  - Translates code from one ISA to another
  - E.g. translating application code from IA32 to IPF

Other Emulators
- Virtutech Simics: models dozens of processor types and associated peripheral devices, including systems based on Alpha, PowerPC, SPARC, IA32 (x86), and x86-64 “Hammer” CPUs, as well as IA-64 (Itanium), ARM, and MIPS
- Microsoft Virtual PC for Mac
- Sun Shade: emulates the ISA of SPARC and MIPS I

Binary Translation
- FX!32: runs x86 Windows NT programs on top of Windows for Alpha
- IA32 EL (the IA32 Execution Layer): runs x86 programs on top of Windows for IPF
- Dynamo System: improves performance by using binary translation, with both source and target on the HP PA8000
- Aries System: runs programs written for PA-RISC on top of IPF

HAL Level Virtualization

Stand Alone vs. Hosted
[Figure: three stacks built from hardware, host OS, VMM, guest OSes, and apps. Example labels: JVM, CLR; VMware Workstation, Microsoft Virtual Server; VMware ESX Server, Xen, MS Viridian]

VMware Workstation Architecture
[Figure labels: Guest OS Applications, Guest Operating System, Host OS Apps, Host OS, VMware App, Virtual Machine, VMware Driver, Virtual Machine Monitor, PC Hardware (disks, memory, CPU, NIC)]
- World switch: saves and restores all the hardware state (the host OS and the VMM share the same highest privilege level)

VMware ESX Server
[Figure labels: x86 SMP hardware (memory, NICs, disk), VMkernel, Console OS, and four guest OSes, each on its own VMM]

VMware I/O Virtualization
- The VMM does not have access to I/O
- I/O is performed in the “host world”
- Low-level I/O instructions (issued by the guest OS) are merged into high-level I/O system calls
- The VM Application executes the I/O system calls
- The VM Driver works as the communication link between the VMM and the VM Application
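
The hosted I/O path described above can be illustrated with a toy model. The Python sketch below is an illustration under stated assumptions, not VMware's implementation: the port numbers, the `VMM`, `VMDriver`, and `VMApp` classes, and the one-byte-per-port-write disk protocol are all hypothetical. It shows a VMM that traps low-level port writes from the guest, merges them into one request, and hands that request through the driver to a host-world application, which services it with ordinary host system calls.

```python
import os
import tempfile
from collections import deque

# Hypothetical port numbers for a toy virtual disk controller.
PORT_OFFSET, PORT_DATA, PORT_COMMIT = 0x1F0, 0x1F1, 0x1F2


class VMApp:
    """Host-world user process: performs I/O through ordinary host system calls."""

    def __init__(self, disk_path):
        self.fd = os.open(disk_path, os.O_RDWR | os.O_CREAT, 0o600)
        self.requests = 0  # number of high-level write requests serviced

    def write(self, offset, data):
        os.lseek(self.fd, offset, os.SEEK_SET)
        os.write(self.fd, data)
        self.requests += 1

    def close(self):
        os.close(self.fd)


class VMDriver:
    """Communication link between the VMM and the VM application."""

    def __init__(self, app):
        self.app = app
        self.pending = deque()

    def submit(self, offset, data):
        self.pending.append((offset, data))

    def switch_to_host_world(self):
        # Stands in for the world switch: drain requests in the host world.
        while self.pending:
            self.app.write(*self.pending.popleft())


class VMM:
    """Traps guest port writes; it never touches the real device itself."""

    def __init__(self, driver):
        self.driver = driver
        self.offset = 0
        self.buffer = bytearray()

    def trap_out(self, port, value):
        if port == PORT_OFFSET:
            self.offset = value
        elif port == PORT_DATA:
            self.buffer.append(value & 0xFF)  # accumulate, no host I/O yet
        elif port == PORT_COMMIT:
            # Merge all buffered byte writes into one high-level request.
            self.driver.submit(self.offset, bytes(self.buffer))
            self.buffer.clear()
            self.driver.switch_to_host_world()


def guest_write(vmm, offset, payload):
    """Guest driver code: one low-level port write per byte."""
    vmm.trap_out(PORT_OFFSET, offset)
    for byte in payload:
        vmm.trap_out(PORT_DATA, byte)
    vmm.trap_out(PORT_COMMIT, 0)


def demo():
    """Write b'hello' through the toy path; return (disk contents, request count)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "virtual_disk.img")
        app = VMApp(path)
        vmm = VMM(VMDriver(app))
        guest_write(vmm, 0, b"hello")
        app.close()
        with open(path, "rb") as f:
            return f.read(), app.requests
```

In this sketch `demo()` traps seven guest port writes (one offset, five data bytes, one commit) but services only one high-level write request in the host world, which is the merging step the slide refers to.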

Virtual PC
- “Everything is about Microsoft”

Paravirtualization
- VMware / Virtual PC
  - The guest OS lives in a complete virtual world with no knowledge of the real machine
  - Supports legacy OSes
  - Difficult to scale to high numbers of VMs: interrupt handling, memory management, world switching
- Solution? Paravirtualization
  - Provides a much simpler architecture interface for customized guest OSes: virtual I/O and CPU instructions, registers
  - Trades portability for performance and scalability
  - Denali / Xen

Xen
- Exposes some “real” hardware, e.g. the clock and physical memory addresses
- Maintains the same application binary interface (ABI)
- The effort of porting OSes is minimal
- Examples:
  - Page table: the guest OS has direct access to the hardware page tables, but updates are batched and validated by Xen
  - Timer interface: the guest OS is aware of both “real” and “virtual” time

Xen Architecture
[Figure: Xen runs on the hardware; Domain 0 handles control and I/O; guest domains run XenoLinux and XenoWindows, each hosting applications]

Denali
- Designed to support thousands of VM instances running network services
- Each VM hosts a single-application, single-user, unprotected guest OS
- Does not support ABI compatibility
- Does not support virtual memory

Pre-Virtualization

Others
- User-mode Linux: runs Linux on top of Linux
  - Virtual device: ports the Linux kernel to the Linux system call interface rather than to a hardware interface
  - Uses the ptrace facility to track system calls and trap them into the user-space kernel
- Cooperative Linux: runs Linux as an unprivileged VM in kernel mode on top of another OS, e.g. Windows
  - The host OS needs to support loading drivers
  - Each kernel has its own complete CPU context and address space, and decides when to give control back to its partner

ExoKernel
- Traditional centralized resource management cannot be specialized, extended, or replaced
- Privileged software must be used by all applications
- Fixed high-level abstractions are too costly for good efficiency
- Provides a low-level interface for library operating systems (libOSes) to use in claiming, using, and releasing machine resources
- Separates protection from management using secure bindings, visible revocation, and an abort protocol

Hardware Support for HAL Virtualization
- Intel Virtualization Technology: VT-x, VT-i, VT-d
  - CPU virtualization
  - Memory virtualization
  - I/O virtualization
- AMD: Secure Virtual Machine

Software Challenge: Running VMM Code
[Figure: VM0 and VM1, each a guest OS with its apps, on top of a VM monitor on the physical host hardware]
- The OS and apps in a VM don't know that the VMM exists and will hog the CPU
- The VMM should run “protected” from guest software

SW Solution: Guest OS Ring Deprivileging
[Figure: VM0 and VM1, each a guest OS with its apps, on top of a VM monitor on the physical host hardware]
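
Ring deprivileging, as named in the slide title above, is commonly described as trap-and-emulate: the guest kernel runs outside ring 0, so its privileged instructions fault, and the VMM handles each fault by emulating the instruction against per-VM virtual state. The Python sketch below is a toy model under assumptions that are not in the slides: the instruction names (`cli`, `sti`, `add`), the `PrivilegeFault` exception, and the `VirtualCPU` fields are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical set of instructions that require ring 0 on the toy CPU.
PRIVILEGED = {"cli", "sti"}


class PrivilegeFault(Exception):
    """Raised by the toy CPU when deprivileged code runs a privileged instruction."""

    def __init__(self, instr):
        super().__init__(instr)
        self.instr = instr


@dataclass
class VirtualCPU:
    """Per-VM state maintained by the VMM instead of the real hardware."""
    interrupts_enabled: bool = True
    acc: int = 0      # a guest-visible general-purpose register
    faults: int = 0   # how many times the guest trapped into the VMM


def execute_deprivileged(instr, operand, vcpu):
    """Toy 'hardware': run one guest instruction outside ring 0."""
    if instr in PRIVILEGED:
        raise PrivilegeFault(instr)      # privileged instruction -> fault
    if instr == "add":
        vcpu.acc += operand              # unprivileged work runs directly
    else:
        raise ValueError(f"unknown instruction: {instr}")


def vmm_fault_handler(fault, vcpu):
    """The VMM as a collection of fault handlers: emulate on virtual state."""
    vcpu.faults += 1
    if fault.instr == "cli":
        vcpu.interrupts_enabled = False  # only the virtual flag changes
    elif fault.instr == "sti":
        vcpu.interrupts_enabled = True


def run_guest(program, vcpu):
    """Run a guest instruction stream, trapping into the VMM on each fault."""
    for instr, operand in program:
        try:
            execute_deprivileged(instr, operand, vcpu)
        except PrivilegeFault as fault:
            vmm_fault_handler(fault, vcpu)
    return vcpu


guest_kernel = [("cli", None), ("add", 2), ("add", 3), ("sti", None), ("cli", None)]
vcpu = run_guest(guest_kernel, VirtualCPU())
```

Every privileged instruction in this model costs one trip into the VMM, which is where the "excessive faulting" problem comes from. An instruction that silently behaves differently outside ring 0 instead of faulting would never reach `vmm_fault_handler`, which is the "non-trapping instructions" problem.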
- Run the guest OS above ring 0 so that privileged instructions generate faults
- Run the VMM in ring 0 as a collection of fault handlers
- Non-trivial problems:
  - Ring compression
  - Non-trapping instructions
  - Excessive faulting
  - Address space compression
- Guest OS de-privileging requires complex, unorthodox methods
- Non-trivial solutions:
  - Source-level guest OS modifications (paravirtualization): legacy OSes not supported
  - Binary guest OS modifications (dynamic patching / binary translation): OS service pack, VMM service pack

Intel Virtualization Technology
[Figure: VM0 and VM1, each a guest OS with its apps, on top of a VM monitor on the physical host hardware]
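- OSes and apps run in their intended rings
- The VMM runs in a new operation mode
- The VMM preempts guest execution via new programmatic transitions
- VT hardware support for processor virtualization
  - New CPU execution mode
  - Hardware-based mode transitions
  - Memory protection in hardware
- By design, VT eliminates both virtualization holes and the need for unorthodox software methods

VT-x Technology Overview
[Figure: virtual machines (apps in ring 3, OS in ring 0) run in VMX non-root operation and the VM monitor (VMM) runs in VMX root operation; VM exit and VM entry transitions move between the two, and each VM has a VMCS]

OS Level Virtualization
- Containers (operating environments) on top of the OS
  - Processes, file system, network resources (IP address), environment variables, system call interface
- Technologies
  - chroot(): file system virtualization on Unix; changes the root directory of a process and its children
  - Name spaces: each container is tagged, and new entities generated from a container (e.g. by fork()) remain inside it
- Usages
  - Sandboxing
  - Fine-grained access control (root in the container)

chroot

Examples
- Jail
  - FreeBSD-based virtualization using “chroot()”
  - Scope is limited to the jail
  - Curtailed access to resources and operations
  - A file-system sub-tree, one IP address, one “root”
- Ensim's “Virtual Private Server”
- Linux “Virtual Environment” (VE), Linux V-Server
- Solaris Zones

Usage: PlanetLab
[Figure: PlanetLab Central connected over the Internet to sites A, B, and C; a node runs the PlanetLab Virtual Machine Monitor (VMM) hosting NodeMgr, OwnerVM, and VMs for GLOBUS, Sun GE, Intel, MPIx, and MPIy]

Library Level Virtualization
- Goal: ABI/API compatibility
- Technologies
  - API interception through DLL hooking
  - Partial or complete implementation of APIs
  - Emulation of low-level kernel implementations in user space
    - Useful when the host OS does not provide the required support (e.g. Win32 threads vs. pthreads)
- Examples
  - WINE: Win32 API implementation on Unix/X
  - LxRun: Linux API implementation on SCO UnixWare and Solaris
  - WABI: Sun's implementation, similar to WINE

WINE: Wine Is Not an Emulator
- Closely follows NT
- Re-implements all the “core” DLLs (ntdll, user32, kernel32)
- The Wine server provides the NT backbone
  - Message passing
  - Synchronization
  - Object handles
- Native DLL support for non-core libraries
- Hardware access through Unix device drivers

Application Level Virtualization
- Java Virtual Machine (JVM)
  - Executes Java byte code (virtual instructions)
  - Provides the implementation of the instruction set interpreter (or JIT compiler)
  - Provides code verification, SEH, and garbage collection
  - Hardware access through the underlying OS
- JVM architecture
  - Stack-based architecture
  - Virtual hardware: PC, register set, heap, method (code) areas
  - Rich instruction set: direct object manipulation, type conversion, exception throws
  - Provides a runtime environment through the JRE
- Other examples: .NET CLI, Parrot (Perl 6), Mono
- UCSD p-System: The UCSD p-System was a highly portable operating system that ran programs whose object code was pseudocode for an idealized 16-bit processor. The p-System contained an interpreter for this virtual machine. Programs were thus object-code-compatible across different hardware platforms.

HLL Virtualization

Web Based OS Virtualization
- www.eyeos.org

Virtual Machine Applications
- Server consolidation
- Security
- Migration
- Case by case

Server Consolidation
Research on high-performance virtual computing environments based on server consolidation, and on techniques for managing them efficiently, covers dynamic creation of virtual machines on large-scale clusters, resource scheduling, load balancing, and efficient migration, as well as fast fault detection and fault tolerance. The main idea is that individual machines in many existing large-scale clusters are poorly utilized, so virtual machine technology can be used to build a larger number of virtual machine clusters on top of a physical cluster and thereby raise resource utilization. Unlike a physical cluster, the nodes of a virtual machine cluster can be created or shut down dynamically according to load and task demand. This makes resources far more dynamic and opens the way to deeper research on high-performance virtual computing environments and their efficient management.

VMware VCenter

Parallax (HotOS 2005)
- A manager for multiple virtual machines, designed at the University of Cambridge
- Can manage large numbers of virtual machines
- Eliminates write sharing
- Strengthens client-side caching
- Uses template images to build entire systems
- Snapshot and copy-on-write mechanisms provide block-level sharing
- Uses replicas to ensure availability

Virtual Machine Migration
Virtual machine migration resembles classic process migration, but because an entire virtual machine environment is moved, it has distinct characteristics that are worth studying. Moving the whole operating system environment reduces the dependency problems of process migration. The two examples below use similar techniques and can migrate a virtual machine from one physical node to another within one second (from 60 milliseconds to 1 second, depending on the size of the virtual machine), without affecting the programs running inside it. Three main problems must be solved: preserving network connections, migrating memory, and migrating storage. Both approaches use NAS to sidestep storage migration and use pre-copy to migrate memory.
- Live Migration of Virtual Machines (Cambridge, NSDI 2005): based on the Xen virtual machine; uses the ARP protocol to keep connections alive; applies only to migration within a cluster.
- Fast Transparent Migration for Virtual Machines (VMware, USENIX 2005): VMotion, based on VMware ESX Server; uses a virtual NIC to keep a unique MAC address; likewise applies only within a cluster.

Performance
The following three papers discuss performance issues in the Xen virtual machine:
- A. Menon, et al., Diagnosing Performance Overheads in the Xen Virtual Machine Environment, VEE 2005. Developed the Xenoprof system for studying the performance of Xen virtual machines.
- L. Cherkasova, R. Gardner, Measuring CPU Overhead for I/O Processing in the Xen Virtual Machine Monitor, USENIX 2005. Implemented a lightweight monitoring system that measures the CPU and I/O utilization of each virtual machine in order to locate performance bottlenecks.
- A. Menon, et al., Optimizing Network Virtualization in Xen, USENIX 2006. 1) Uses an offloading engine; 2) optimizes data transfer to avoid remapping; 3) lets the guest make full use of advanced virtual memory mechanisms.

Virtual Devices
This work studies how to use efficient virtual devices on top of the VMM to improve overall system performance.
- Operating System Support for Virtual Machines (umich): moves the VMM from user level into kernel level and updates the memory management algorithms; with hardware acceleration it achieves better performance.
- High Performance VMM-Bypass I/O in Virtual Machines (IBM, Ohio State): exploits InfiniBand's high speed and RDMA capabilities and modifies the VMM code so that user programs can access the hardware directly, achieving high performance.
- Intel Virtualization Technology for Directed I/O: uses hardware support to let virtual machines access I/O devices directly, improving the performance of the operating systems running in the virtual machines.

Storage Virtualization

Logical Volume

In/Out Band Storage Virtualization

Storage Virtualization
Traditional disk virtualization presents multiple disks or disk systems as a single disk (or as a storage service with a standard interface) and offers a disk-like service to the upper layers, reducing the management overhead of the storage system.
- MINERVA (HP, TOCS): according to workload requirements, selects suitable arrays from a storage pool, which amounts to virtualizing a disk service that satisfies the upper layers. The original problem is NP-hard; automated selection and repeated optimization passes reduce the complexity and ease the burden of configuring storage services.
- Façade (HP, FAST 03): provides virtual storage devices that meet a QoS target specified in advance.
- Stonehenge (IBM, Sunysb, SIGMETRICS/Performance 04): traditional virtualization focuses on disk capacity, whereas multiple dimensional storage virtualization virtualizes capacity, bandwidth, latency, and so on, carving multiple completely different virtual disks out of the same storage cluster.
- Ventana (Stanford, NSDI 06): a centralized storage server that serves virtualization and supports versioning, isolation, and mobility of virtual services. Virtual machine files are stored in this file system, and virtual storage is provided to virtual machines by creating views. A single file system can thus be used to install operating systems and to maintain applications and user files, and different combinations yield different virtual machine images.
- Design Tradeoffs in Applying Content Addressable Storage to Enterprise-scale Systems Based on Virtual Machines (CMU, USENIX 06): in the ISR system, examines how data should be stored and organized so that as little data as possible is kept on the server. It uses CAS: data blocks are hashed, and the hash value is used to detect duplicates.

Software Environment Applications
Virtualization lets software be bundled with its runtime environment (or part of it) and then distributed, run, and migrated as an independent unit. This may change existing software release and maintenance models, turning tedious software installation, upgrade, and maintenance into the distribution and updating of data. Research directions include migratable software application models based on virtualization, new software licensing models suited to virtual machines, and VM-based software interoperability.
- Internet Suspend/Resume (http://isr.cmu.edu/)
- PlanetLab: an open, global testbed for developing next-generation Internet technologies. It builds a worldwide virtual network laboratory; within each node it uses VServer virtualization to provide Linux environments, and between nodes VNET can be used to build virtual overlay networks for various experiments.
- Collective (Stanford, NSDI 2005): a VM-based system similar to ISR. Administrators install the operating system and applications for users, while users keep their own data. Users therefore get the latest operating system and applications, because the administrator has already installed the updates.

Internet Suspend/Resume

Stanford Collective
Each user container consists of the following four parts:
1) System Disk: contains the operating system and applications; the administrator pre-installs a system for the user
2) User Disk: contains the user's private data, such as user files and settings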
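3) Ephemeral Disk: holds temporary data, such as browser caches and temporary files
4) Memory image: the state of the container when it is suspended

Security and Reliability
Virtualization provides a better isolation mechanism between pieces of software and between software and the underlying system, which helps in processing information securely. Research directions include VM-based intrusion detection (such as honeypots), VM-based isolation of malicious code, sandboxing in grid computing, and VM-based trusted computing.
- Collapsar (Purdue, Security): A VM-Based Architecture for Network Attack Detention Center. Uses virtual machines to build a honeypot network that responds to network intrusions and records intrusion behavior in detail.
- Virtual-machine based security services (umich)
- Terra: A Virtual Machine-Based Platform for Trusted Computing (Stanford), http://suif.stanford.edu/collective/
- Livewire (Stanford): uses virtual machines for intrusion detection
- Open Trusted Computing

Honeypots
- Honeypots are machines deliberately left for attackers to compromise
- Used to gather information about attackers
- Lure attackers into revealing their presence, to protect real machines

Collapsar

Virtual-machine based security services (umich)
A series of studies based on UML (User Mode Linux):
- Operating System Support for Virtual Machines: moves the VMM from user level into kernel level and updates the memory management algorithms; with hardware acceleration it achieves better performance.
- ReVirt: replays the execution of the operating system from recorded logs, at the instruction level, so that non-deterministic behavior such as that involved in intrusion detection can be reproduced.
- BackTracker: building on the technique above, works backward from a suspicious file to identify the steps of an intrusion and reconstruct how it unfolded.
- Time Travel Virtual Machine: adds gdb support and checkpoints so that operating system execution can be replayed at the debugger level, which makes operating systems easier to debug.
- Operating system vulnerabilities and intrusion detection: when an operating system vulnerability is disclosed, it usually takes some time before a patch is developed. This paper discusses how to detect whether anyone exploited the vulnerability to break into the system during that window.
- SubVirt: virtual machines are a double-edged sword; attackers can also use virtual machine technology to hijack an existing operating system and so compromise it. This paper discusses possible attack methods and countermeasures.

Different Trustful Levels
- Information assurance requirement
  - Data cannot flow between networks of different classifications
- Conventional solution
  - Military “airgap”
  - Dedicate a distinct computer for access to each network
- VM solution
  - Have one computer running different VMs, each with its own security classification
  - A virtual network monitors (or in some cases prevents) communication between VMs depending on their classification level

Terra: a Virtual Machine Based Platform for Trusted Computing
- Starting from the hardware, verification proceeds level by level to establish a trusted computing platform
- The hardware contains a key and is signed by the vendor
- The hardware signs the firmware, the firmware signs the bootloader, the bootloader signs the TVMM (trusted virtual machine monitor), and finally the TVMM signs the virtual machines
- Virtual machines with different trust levels can run on the trusted platform: it can remain compatible with existing operating systems and also host trusted operating systems, without the two affecting each other

Terra Architecture

Livewire: Intrusion Detection Using Virtual Machines
- An intrusion detection module placed inside the host can observe the activity of the whole system, but it is itself exposed to the intrusion
- A module placed in the network can detect intrusions and is better protected against attack, but it has no view of the whole system and cannot counter evasion attacks (attacks designed to escape detection by the IDS)
- Integrating the intrusion detection system at the virtual machine level isolates it from the host system while preserving a complete view of the whole system

Open Trust Computing
http:/

[...] about China?
- TH-MSNS
- ChinaGrid, CNGrid
- Gridows
- LUCOS

That's All
Thank you very much!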