08级毕业设计译文和原文---程序不变量.doc

资源描述

1、本科毕业论文（设计）译文和原文题目程序不变式检测工具的分析与应用作者学院计算机科学与技术专业计算机科学与技术学号指导教师二一二年二月八日1第一章前言程序不变量是在特定的一个或多个程序点上永保真值的属性，如可存在某一断言语句，形式化规格说明，或不变式的描述。例如 Y=4*X+3；数组 a 不包含复制；n=n；子节点：父亲节点( 给所有节点 n)；size(key)=size(contents)；以及有向无环图。不变量表明了程序的数据结构和变量运算规则，并且有利于程序的维护。例如，当修改代码时他们定义的程序属性必须保存。尽管有这些优势，不变量还是经常在编程中丢失。此外程序员应能从程序中自动

2、推断出疑似不变量来充分注释代码。动态技术的研究专注于动态技术从运行的轨迹中发现不变量。一个程序不变量的动态检测是检查变量在一系列测试用例的值并记录在这些值中的属性和关系。本章节讨论了如何获得不变量(1.1 章节)，介绍动态不变量检测(1.2 章节)，讨论关于两个与动态技术的运用有关的观点(测试集在 1.3 章节，属性的使用在 1.4 章节)，列举不变量的使用(1.5 章节)，列举关于论文的贡献者(1.6 章节)，给出其余文件的路线图(1.7 章节 )。1.1 获取不变量的方法程序员在写程序或者用其他方式操作程序时，应该有意识的加入程序不变量，因为这样可以很好的把握系统的运行情况，可以清楚的描述

3、数据结构以及表达变量之间的关系等。但是比较可惜的是，这一主张在实际编码过程中几乎没有得到应用，因此大多数程序完全缺乏正式的或者非正式的不变量分析。另一方面，程序员应在用不变量注释代码时能自动的推测出不变量。不变量检测包含在设计空间的隐藏部分：程序员脑海里所想的不变量。不变量检测是动态或静态的。静态分析检测程序测试并找出可能的执行和运行时间状态。最普遍的静态分析是数据流分析，用抽象作为其理论解释依据。传统，健全的分析结果能保证所有可能的执行为真；由于编译器和其它一些系统不是面向用户，它在一些对正确性要求严格时非常合适。静态分析有局限性。它不能真实的报告除了不可判定的属性或程序上下文的属性。在静态

4、分析程序中运用了语言的特征，例由于堆栈近似值的精度缺失而难以表示并会产生较弱的结果，指针会余留在技术状态那边。动态分析在这些执行中运行程序，检查执行并报告属性，且不受这些缺点影响同时互补静态分析。1.2 动态不变量检测2本研究主要专注动态发现不变量：这个技术是通过执行一个关于程序集的输入以及从获得的不变量轨迹中推测出的不变量。图 1.1 显示的是以一亚洲人命名的不变量检测工具 Daikon 的高层体系结构。动态不变量检测从程序执行中发现疑似不变量，通过指令目标程序去产生一些确定值，通过测试套件运行机器程序，推论编译的不变量并获得未在程序中显露不变量。所有的步骤都是自动的（除了选择测试套件）。

5、现在存在的工具是资源到资源翻译的 C，Java ，Lisp ；我们使用的是工具和前后交互模式。推论步骤测试可能的不变量对于那些从指令值获得的值。满足所有值，并能满足其它测试，例如数据合理性，不被相关联值，不被其它报告的不变量所包含的属性才称为可能不变量。1.3 运行程序动态分析需要执行的目标程序。一个好的测试套件的好处大于它的花费，即使是没有动态不变量的检测下，它还能使用于其它动态技术例如（回归）测试。的确，一个程序缺少测试套件（不能运行或没有）可能会出现很多问题，并应在不变量检测前用标准技术来改正之。与其它的动态测试或研究方法相比，推断不变量的准确性取决于测试用例的质量及理解性。额外的测试用

6、例可能会提供一些新数据，以推断出更准确的不变量。推断的不变量可在执行时通过揭示属性为真来验证测试套件。这样，用户才知道这个测试套件是否足够，如果不是，则直接告知怎样改进它。事实上，标准的测试套件是很合适的，且能从特别的相对不敏感套件中检测不变量，只要它足够大（参见第 6 章）。然而，我们还不知道确切的属性从而使测试套件能良好检测错误。这些不尽相同使测试套件能够检测缺陷。测试套件执行的效率如果是通用最小次数来执行每条语句，那么它对不变量检测不好，它需要重复的执行作为归纳基础，并有统计数据作为推论的支持。在任何动态分析中，动态不变量检测都不能保证其结果的完整性或正确性。它不完整因为这可能报告出很

7、多不确定的潜在不变量。然而，它又是完整的，因为它检查的不变量集；Daikon 的语法在 3.2 节给出并在后面的论文中扩展，大多数在 4.3节。它不准确因为测试套件不能列出所有的执行：存在前 10000 次执行后的属性不一定存在下一个里。然而，技术在这些训练数据（所有呈现的数据）又是准确的。虽然动态不变量检测不够完整和准确，它还是很有用的，也是相当重要的财产。（这包括适合其它工具，例如测试，PurifyHJ92，ESCDLNS98,PREfixPRE99，等等。）此外，它能与其它工具互补；能弥补静态分析的缺陷，而静态分析也能覆盖3它的缺陷。1.4 功能不变量和使用性能动态不变量检测的结果取决

8、于特别的测试套件，不是所有报告的不变量能在每一可能执行的程序为真。Daikon 输出分为功能不变量和使用内容。功能不变量只取决于特定数据结构或功能的代码，不变量普遍适用于此种实体。另一方面，使用属性是由数据结构或功能的特别功能所产生的；它们取决于使用背景和测试套件。因为它依赖于操作的轨迹。（第 4 章和第 4.5 章的统计明显表明能区分它们。也能区分测试套件。）人们通常不能分辨两者。功能不变量和使用属性的区别是在一系统的入口和出口Java 或 C 语言主程序的开始或结束。在其他程序点它们的区别更清楚。比如，关于一个程序的论证是有效的这么一个前提陈述，可以作为程序功能不变量或访问者的使用属性

9、，并注意不要提供提供不合法的值。（在正式详细文献上，如果前提条件，如果符合，并保证了后置条件在出口处为真，则退出。所有由 Daikon 所报告的前提条件和后置条件要符合此准则，即使 Daikon 的前提条件不会成为最弱的前提条件。）使用内容在解释文档操作或一个数据结构使用的功能的界限文档时很有用。因为功能不变量和使用属性间的区别与程序员的工作状态有关，我们会从此把检测作为首要目标。1.5 不变量的使用不变量对人类和工具，程序的所有方面，包括设计，编程，测试，优化以及维护都很有用。这一章节列了不变量的一些应用，去作为程序员们为什么关注它们并从程序中跟踪它们并是很值得的目标的动力。编写更好的程

10、序。一些作者已经注明了好的程序对代码的轨迹有影响当不变量运用在他们的设计时Gri81,LG86 ，不变量准确的正式化代码的合约，能区分它有意的操作并在完成时指示。更正式的思考代码能影响更多有规律的设计和执行，即使是非正规的使用不变量也对程序员有帮助HHJ+87b，HHJ+87a.其他作者建议把不变量作为注释的一个必要部分，在程序中定义个说明CM88,BG93，FM97。虽然这些使用是有用的，论文关注不变量在已构建程序中的使用。文档代码。不变量注重程序执行某些方面和提供有用的程序操作文档，算法和数据结构。例如，他们认为程序的理解在某些程度上是每个程序运算的先提条件。文档是从至今为止的程序中自动提

11、取出来的，不像人们所写的的信息，当代码变了时则需被更新。定义文档。自动关联不变量对已注注释文档，断言陈述，或规格说明的代码很4有用。先关联的不变量能检查和定义编程提供不变量；程序自检通常过时，无效或不正确LCKS90。此外，人们相互检查是比较弱的因为不同的人倾向于犯同样的错KL86。检查假定。不变量能插入程序中作为断言陈述为以后的测试或确认检测不变量不是违反后来的代码演变。程序类型是另一种这样的假设，能在编译，运行，或两者都是时被检查。避免缺陷。不变量能阻止程序员因疏忽而违反程序正确行为的假设而做改变。当改变时，现有的程序中明确的不变量的缺失使得程序员更易产生错误。在一点上确定的不变量可能依

12、赖于其它地方，但如果原始的不变量不是被文档化，少依赖，那么程序员很容易会违反它，在程序的某个远部分产生错误。程序维护产生错误OC89，GKMS00，以及更多，大多数是因为违反不变量。帮助程序员避免产生错误是最初动态不变量检测的动力。这个活动和检测缺陷一样有价值（即使数量困难少量），因为阻止一个问题比后来再修改更简单也更便宜。不管是静态还是动态区分不变量都作为以上所述使用（除了第一个，用于程序设计）。其它的运用，动态检测不变量比静态不变量检测更有用。形成范围。一个程序范围AFMS96，RBDL97 ，HRWY98是程序任何可度量的属性或它的执行。例子包括程序执行的行集（或路径），输出的大小

13、，运行时间，或静态属性，例如资源复杂McC76。通常，一个范围可以当做一个摘要或哈希代码；程序范围的不同能指出或描绘程序，输入或执行的不同。动态检测不变量也能形成程序范围，能指出变化的程序属性或输入并能作为另一范围使用。找出不同的条件。不寻常或异常情况能指示错误或特别的情况，进而引起程序员的注意。一个近而真的不变量能指示环境需要特别注意或一个异常输入。有效的测试套件。动态检测不变量能揭露程序本身测试套件，因为属性反映了程序在运行测试套件。一个不变量可能会揭露程序只能操作小值，或一些值经常存在相互间特别关系。这些属性可能会指示程序值（和状态）覆盖不足和程序行为锻炼不足，即使套件覆盖程序的每行或路

14、径。这些不变量能帮助测试用例以一个或两个不同的方式产生。新的测试能明显违反不变量TCMM98 ，通过扩大值的覆盖来提升套件（类似扩大而不是操作范围CR99，它需要两个值必须有不同的值）。相互的，新的测试能观察在程序运行的不变量，测试套件能描程序或部件的实际，正确的使用。优化公共用例。简介直接编译使用信息聚集在先前的运行来优化。如果一个特别的值或情况是普遍，便宜去测试，以用允许有用的专门化当它运行时，那么一个编译（在其它技术中）能插入检查和扩展成一个专门化代码版本，并假设制约。举个例子，不变量命名别名是在运行时的点同等测试，专业化的程序版本中已别名和5未别名的用例比普通用例更有效。低级执行信息

15、用于预置文件编辑（通常是最常用的单变量值），它能扩张高级不变量，更好的优化普通用例。不变量是数据结构，而不是内存位置或寄存器的不变量，能使整个数据结构作业，避免或优化。自举证明。原理证明，流量分析，模型检查，以用其它自动或半自动机制，能通过观察，确认安全属性例如在执行时边界溢出或空指针引用，建立或终止响应，以及其它方法来在执行时增加，分辨程序正确性。然而，它们是单调而易出错的，人们很难去区分将证明的属性，现在的系统难以假设它们；某些研究者认为任务比演示证明更难Weg74,BLS96 。动态检测程序不变量可以送入到一个自动化系统，减少人们全手动注释程序的必要，去支持不变量的属性一项很少程序员会

16、擅长或享受的任务。6Chapter 1IntroductionA program invariant is a property that is true at a particular program point or points, such as might be found in an assert statement, a formal specification, or a representa- tion invariant. Examples include y = 4 x + 3; x abs(y); array a contains no duplicates; n = n

17、.child.parent (for all nodes n); size(keys) = size(contents); and graph g is acyclic.Invariants explicate data structures and algorithms and are helpful for programming tasks from design to maintenance. As one example, they identify program properties that must be preserved when modifying code. Desp

18、ite their advantages, invariants are usually missing from programs. An alternative to expecting programmers to fully annotate code with invariants is to automatically infer likely invariants from the program itself. This research focuses on dynamic techniques for discovering invariants from executio

19、n traces. A dynamic detector of program invariants examines variable values captured during execution over a test suite and reports properties and relationships that hold over those values.This chapter discusses how to obtain invariants (Section 1.1), introduces dynamic in- variant detection (Sectio

20、n 1.2), discusses two issues having to do with the use of a dynamic technique (test suites in Section 1.3 and usage properties in Section 1.4), lists some uses for invariants (Section 1.5), lists the contributions of the dissertation (Section 1.6), and gives a roadmap to the remainder of the documen

21、t (Section 1.7).1.1 Ways to obtain invariantsI contend that most programmers have invariants in mind,consciously or unconsciously, when they write or otherwise manipulate programs: they have an idea of how the system works or is intended to work, how the data structures are laid out and related to o

22、ne another, and the like. Regrettably, these notions are rarely written down, and so most programs are completely lacking in formal or informal invariants and other documentation.An alternative to expecting programmers to annotate code with invariants is to auto- matically infer invariants. Invarian

23、t detection recovers a hidden part of the design space: the invariants that the programmer had in mind. This can be done either statically or dynamically.7Static analysis examines the program text and reasons over the possible executions and runtime states. The most common static analysis is dataflo

24、w analysis, with abstract interpretation as its theoretical underpinning. The results of a conservative, sound analysis are guaranteed to be true for all possible executions; itis most appropriate when correctness is crucial, as for compilers and some other systems whose consumer is not a human.Stat

25、ic analysis has a number of limitations. It cannot report true but undecidable properties or properties of the program context. Static analysis of programs using language features such as pointers remains beyond the state of the art because the difficulty of representing the heap forces precision-lo

26、sing approximations and produces weak results. Dynamic analysis, which runs the program, examines the executions, and reports properties over those executions, does not suffer these drawbacks and so complements static analysis.1.2 Dynamic invariant detectionThis research focuses on the dynamic disco

27、very of invariants: the technique is to execute a program on a collection of inputs and infer invariants from captured variable traces. Figure 1.1 shows the high-level architecture of the Daikon invariant detector, which is named after an Asian radish.Dynamic invariant detection discovers likely inv

28、ariants from program executions by in- strumenting the target program to trace certain variables, running the instrumented pro- gram over a test suite, and inferring invariants over both the instrumented variables and derived variables not manifest in the program.All of the steps are fully automatic

29、 (except selecting a test suite). The currently-existing instrumenters are source-to-source translators for C, Java, and Lisp; we use the terms “instrumenter” and “fro nt end” interchangeably.The inference step tests possible invariants against the values captured from the instru- mented variables.

30、Properties that are satisfied over all the values, and that also satisfy other tests, such as being statistically justified, not being over unrelated variables, and not being implied by other reported invariants, are reported as likely invariants.1.3 Running the programDynamic analysis requires exec

31、uting the target program. A test suites benefits outweigh its costs even in the absence of dynamic invariant detection, for it enables other dynamic techniques such as (regression) testing. Indeed, a program lacking a test suite (or that cannot be run or never has) is likely to have many problems th

32、at should be addressed using standard techniques before invariants are detected over it.As with other dynamic approaches such as testing and profiling,the accuracy of the inferred invariants depends in part on the quality and comprehensiveness of the test cases. Additional test cases might provide n

33、ew data from which more accurate invariants 8can be inferred. The inferred invariants can also validate the test suite by revealing properties true when executing it. Thus, users know whether a test suite is adequate and, if not,are directly informed how to improve it.In practice, standard test suit

34、es have performed adequately, and detected invariants are relatively insensitive to the particular test suite, so long as it is large enough (see Chapter 6). However, we are not yet sure of the precise properties that make a test suite good for invariant detection. These are not necessarily the same

35、 as make a test suite good for detecting bugs. A test suite that strives for efficiency by executing each statement a minimal number of times would be bad for invariant detection, which requires multiple executions as a basis for generalization so that there is statistical support for the inferences

36、.Like any dynamic analysis, dynamic invariant detection cannot guarantee the complete- ness or soundness of its results. It is not complete because there are infinitely many potential invariants that could be reported. However, it is complete for the set of invariants checks; Daikons grammar is give

37、n in Section 3.2 and extended later in the dissertation, most no- tably in Section 4.3. It is not sound because the test suite may not fully characterize all executions: a property that held for the first 10,000 executions does not necessarily hold on the next one. However, the technique is sound ov

38、er the training data (all the data presented to it).Although dynamic invariant detection is not complete or sound, it is useful, which is a considerably more important property. (This characterization fits other tools, such as testing, Purify HJ92, ESC DLNS98, PREfix PRE99, and many more.) Additiona

39、ll y, it is complementary to other techniques; it can shore up the weaknesses of static analysis while static analysis covers for its deficiencies.1.4 Functional invariants and usage propertiesBecause the results of dynamic invariant detection depend on the particular test suite, not all reported in

40、variants will be true for every possible execution of the program. The Daikon output can be generally classified into functional invariants and usage properties.A functional invariant depends only on the code for a particular data structure or func- tion, and the invariant is universally true for an

41、y use of that entity. Usage properties, on the other hand, result from specific usage of a data structure or function; they depend on the context of use and the test suite. Because it operates on traces, Daikon cannot distinguish between these classes, which are intermingled in its output. (The heur

42、istics of Chapter 4, and notably the statistical checks of Section 4.5, can help to separate them, as can varying the test suite.)People cannot necessarily discriminate between the two, either. The distinction between functional invariants and usage properties is clear only at a systems entry and ex

43、it for a standalone Java or C program, at the beginning and end of the main routine. At other points in the program, the differences are less clear. For instance, a precondition stating that a procedures arguments are valid can be viewed as a functional invariant of 9the procedure or as a usage prop

44、erty of the callers, which take care not to supply illegal values. (In the formal specification literature, a precondition, if met, guarantees that the postcondition will be true upon exit. All preconditions and postconditions reported by Daikon satisfy this criterion, though Daikons preconditions w

45、ill not in general be weakest preconditions.) Usage properties are useful in explicating program operation or the limited contexts in which a function is called or a data structure is used. Because the distinction between functional invariants and usage properties is irrelevant to working programmer

46、s, we will henceforth set aside the goal of detecting only the former.1.5 Uses for invariantsInvariants are useful, to humans and to tools, in all aspects of programming, including design, coding, testing, optimization, and maintenance. This section lists some uses of invariants, to motivate why pro

47、grammers care about them and why extracting them from programs is a worthwhile goal.Write better programs. A number of authors have noted that better programs result when invariants are used in their design Gri81, LG86. Invariants precisely formal- ize the contract of a piece of code, clarifying its

48、 intended operation and indicating when it is complete.Thinking about code formally can result in more disciplined design and implementation, but even informal use of invariants can help program- mers HHJ+ 87b, HHJ+ 87a. Other authors suggest making invariants an essential part of implementation, re

49、fining a specification into a program CM88, BG93, FM97. Although these uses are valuable, this dissertation focuses on uses of invariants in already-constructed programs.Document code. Invariants characterize certain aspects of program execution and provide valuable documentation of a programs operation, algorithms, and data structures. As such, they assist program understanding, some level of which is a prerequisite to every program manipulation. Documentation that is automat

展开阅读全文