An Analysis of Content Validity of Reading Papers in TEM-8 [Graduation Thesis].doc


Graduation Thesis

Undergraduate Graduation Thesis (Design)

Title: An Analysis of Content Validity of Reading Papers in TEM-8
School:
Student name:
Major: English
Class:
Supervisor:
Start and end dates:

Contents

Abstract
Introduction
1. An Introduction of Language Testing
1.1 Historical Background of Language Testing
1.2 Main Criteria for Evaluating a Test
1.2.1 Backwash
1.2.2 Reliability
1.2.3 Validity
2. Content Validity
2.1 Whether Test Content is Representative
2.2 Whether Test Content is Consistent with the Stated Goal
2.3 Whether Test Content is Suitable for Test-takers
3. Content Validity of Reading Papers in TEM-8
3.1 An Overall Introduction of TEM-8
3.2 Analysis of Content Validity of Reading Papers in TEM-8
3.2.1 Representativeness of Reading Papers in TEM-8
3.2.2 Consistency with the Stated Goal of Reading Papers in TEM-8
3.2.3 Feasibility of Reading Papers in TEM-8
4. Findings and Suggestions
4.1 Major Findings
4.1.1 From the Perspective of Representativeness
4.1.2 From the Perspective of Consistency
4.1.3 From the Perspective of Feasibility
4.2 Suggestions
4.2.1 For TEM-8 Designers
4.2.2 For Teachers and Students
Conclusion
References

Abstract (Chinese version)

The Test for English Majors Grade 8 (TEM-8) is the only large-scale standardized test in China designed specifically for advanced-stage students majoring in English language and literature at institutions of higher education. For English learners, reading is one of the most important abilities, so reading comprehension is also one of the most important parts of the test. Analyzing the content validity of the TEM-8 reading papers is therefore of real practical significance. Drawing mainly on the language testing theory proposed by Bachman and Palmer, this thesis analyzes the content validity of the TEM-8 reading items of the most recent three years (2008-2010) from three perspectives.

The study falls into four parts. Part One describes the four stages in the development of language testing and the main criteria for evaluating the quality of a test paper, with particular emphasis on the importance of validity. Part Two introduces content validity and the three aspects from which the content validity of a test is analyzed. Part Three analyzes the content validity of the TEM-8 reading papers from the perspectives of representativeness, consistency and suitability. Part Four summarizes the study and offers suggestions for further improving the content validity of the TEM-8 reading section.

The study shows that the TEM-8 reading papers of 2008-2010 have relatively high content validity on the whole, but some problems remain: 1) In terms of topics, although coverage is broad, social-life topics take up a considerable share, which restricts topic diversity. 2) In terms of genres, the advertisements, instruction manuals and charts specified in the test syllabus did not appear in the papers of these three years. 3) In terms of consistency with the test objectives, the actual reading speed is lower than the syllabus requires, and the assessment of reading skills lacks comprehensiveness. The author hopes that this study can help improve the content validity of the TEM-8 reading items so that the test better serves English teaching.

Key Words: content validity; reading papers; Test for English Majors Grade 8

Abstract

The Test for English Majors Grade 8 (TEM-8) is a large-scale, nationwide, criterion-referenced test for English majors in China. Reading comprehension is regarded as one of the most important parts of English language tests. Hence, it is of considerable significance to analyze the content validity of the reading comprehension parts of TEM-8 papers. This thesis explores the content validity of the TEM-8 reading papers over the past three years (2008-2010) in terms of language testing theories from Bachman and Palmer's framework.

The thesis falls into four parts. Part One serves as an introduction to the four stages of language testing and the criteria for evaluating a test, among which the importance of validity is particularly emphasized. Part Two reviews the theoretical rationale on which this thesis is based, that is, “what content validity is” and “how to analyze content validity”. Part Three focuses on the analysis of the content validity of the reading papers in TEM-8, conducted from the perspectives of representativeness, consistency and feasibility. The last part summarizes the major findings and presents suggestions for TEM-8 designers, teachers and students.

The study shows that the TEM-8 reading papers (2008-2010) generally have high content validity, but some problems remain: 1) In terms of topics, society and daily life rank at the top, which limits the diversity of material selection. 2) As for genres, advertisements, instruction books, and charts and graphs are not included in the three years' reading papers. 3) In terms of consistency, the reading speed demonstrated in the reading papers is not in accordance with the stated goal, and the coverage of reading skills is far from adequate. It is hoped that this thesis can contribute to improving the content validity of the reading papers in TEM-8 and help the test have a more beneficial impact on English language teaching and learning.

Key Words: content validity; reading papers; TEM-8

An Analysis of Content Validity of Reading Papers in TEM-8

Introduction

The Test for English Majors Grade 8 (TEM-8) is a large-scale, nationwide, criterion-referenced test for English majors in China. It covers a wide range of English skills, such as listening, reading, writing and translation. Among the skills incorporated in TEM-8, reading comprehension (20%), an indispensable component of language tests, takes up a relatively large proportion of the test, which highlights its prominence.

A validation study can put a test in a more favorable position. Content validity is often referred to when the validity of a test is discussed. It concerns whether a test can “adequately measure what it is supposed to measure” (Henning, 2001: 10). A common way to study content validity is to analyze the content of a test in terms of representativeness, consistency and feasibility, and this approach is adopted in this thesis. Since TEM-8 is a syllabus-based test, the content validity analysis can verify the implementation of The 2000 Teaching Syllabus for TEM-8 and The 2004 Test Syllabus for TEM-8, which can generate a positive backwash on both English teaching and learning. Thus, the aim of the present thesis is to analyze the content validity of the reading comprehension part of TEM-8 from 2008 to 2010.

1. An Introduction of Language Testing

Where there is language teaching, there is language testing. Language testing is crucial for assessing test-takers' linguistic competence, so it is necessary to have a good knowledge of it. Knowing how to evaluate a test is of great importance as well. Accordingly, the criteria for evaluating a test are discussed below.

1.1 Historical Background of Language Testing

“Language testing is a measurement instrument designed to elicit a specific sample of an individual's behavior from which one can make inferences about certain characteristics of the individual” (Wang, 2009:13). Moreover, a language test usually seeks to find out what candidates can do with the target language. Language testing has emerged alongside the development of language teaching, and the two are closely interrelated. The theoretical development of foreign language testing has, on the whole, gone through four stages: the pre-scientific period, the psychometric-structuralist period, the psycholinguistic-sociolinguistic period, and the communicative testing period (Shen […] second, start to study the process of test-takers' response instead of test results. It can be noted that language testing in China is in a state of benign development.

1.2 Main Criteria for Evaluating a Test

The quality of a test can usually be judged from six aspects: validity, reliability, power (or difficulty), discrimination, practicality and backwash effect (Liu […] in reverse, if it does harm to teaching and learning, it generates negative or harmful backwash. It is expected that testing can always have a beneficial backwash on teaching. In this sense, testing is expected to bring at least three beneficial backwash effects. First, it encourages the development of the abilities that are supposed to be tested. Second, testing can diagnose weaknesses and difficulties in teaching, which helps teachers understand where students are having trouble and improve their efficiency by adjusting their teaching so that particular groups of students benefit more. In other words, testing can instructively affect teaching content. Third, it can help students locate and make up for deficiencies in their study and drive them to train themselves to acquire the required abilities and skills. In summary, testing should be “supportive of good teaching and, where necessary, exert a corrective influence on bad teaching” (Hughes, 2000:2). In other words, positive backwash is always the primary concern of test designers.

1.2.2 Reliability

“Reliability is a necessary characteristic of any good test: for it to be valid at all, a test must first be reliable as a measuring instrument” (Heaton, 2000:162). This is because “a test without reliability is of no use value” (Wu, 2003:20). Reliability refers to the credibility and stability of test results and has to do with accuracy of measurement. This kind of accuracy is reflected in obtaining similar results when measurement is repeated on different occasions, with different instruments or by different persons. This characteristic of reliability is sometimes termed consistency. For instance, if the results of a test given to a group of test-takers two or more times correspond closely, the test has relatively high reliability.

Reliability can be expressed by the correlation coefficient of two sets of test results, that is, the coefficient of reliability. “It is possible to quantify the reliability of a test in the form of a reliability coefficient” (Hughes, 2000:38). Reliability coefficients allow people to compare the reliability of different tests. It is widely accepted that the ideal reliability coefficient is 1. “A test with a reliability coefficient of 1 is one which would give precisely the same results for a particular set of candidates regardless of when it happened to be administered” (Hughes, 2000:39), whereas a test with a reliability coefficient of zero is not reliable at all. “It is between the two extremes of 1 and zero that genuine test reliability coefficients are to be found” (Hughes, 2000:39). Since reliability and validity are often considered “complementary aspects of a common concern in measurement” (Bachman, 1999:160), it is necessary to discuss the validity of a test as well.

1.2.3 Validity

“Validity in general refers to the appropriateness of a given test or any of its component parts as a measure of what it is purported to measure” (Henning, 2001:89). In other words, “the validity of a test is the extent to which it measures what it is supposed to measure” (Heaton, 2000:159). Many testing specialists regard validity as the most important issue in language testing. Indeed, if a test is not valid for the purpose for which it was designed, its scores can hardly reflect what they are believed to mean.

Validity can be divided into four types: construct validity, content validity, concurrent validity and predictive validity (Hughes, 2000:26). Construct validity refers to the underlying linguistic principles and language learning principles of a test, i.e. its theoretical basis, while content validity means that “the test should adequately measure what it is supposed to measure” (Henning, 2001: 10). Meanwhile, “concurrent validity is established when the test and criterion are administered at about the same time”, while predictive validity “concerns the degree to which a test can predict candidates' future performance” (Hughes, 2000:27). A test without validity cannot realize its intended goal, which highlights the importance of validation studies. Content validity, as one type of validity, is therefore also of great importance. Since this thesis studies the content validity of the reading papers in TEM-8, it is necessary to review content validity in detail.

2. Content Validity

As the noted linguist Grant Henning (2001:94) points out in A Guide to Language Testing: Development, Evaluation and Research, “content validity, as the name implies, is concerned with whether or not the content of the test is sufficiently representative and comprehensive for the test to be a valid measure of what it is supposed to measure.” Content validity is of great importance. Firstly, “the greater a test's content validity, the more likely it is to be an accurate measure of what it is supposed to measure”. Secondly, “areas that are not tested are likely to become areas ignored in teaching and learning” (Hughes, 2000:27), a point that bears directly on the improvement of language testing and teaching. To judge whether or not a test has content validity, we can start from three aspects: whether test content is representative, whether it is consistent with the stated goal, and whether it is suitable for test-takers.

2.1 Whether Test Content is Representative

Whether test content is representative relates to the representativeness of test items. Test items should constitute a representative sample of all the skills, structures, etc. that the test is supposed to measure; that is, a test should have enough items to represent everything it is meant to cover. When testing a particular skill, it is impossible to include all the related knowledge in the test. Instead, test designers extract part of the relevant knowledge as a sample to test. Therefore, “the representativeness of this sample will directly affect the test's content validity […]
