Glossary of Corpus Linguistics Terms (语料库语言学术语汇编), V2.0
Last updated 2015-07-31 by 许家金

Aboutness  所言之事
Absolute frequency  绝对频数
Alignment (of parallel texts)  (平行或对应)语料的对齐
Alphanumeric  字母数字构成的
Annotate  标注(动词)、赋码
Annotated text/corpus  标注文本/语料库、赋码文本/语料库
Annotation  标注(名词)、赋码
Annotation scheme  标注方案、赋码方案
ANSI/American National Standards Institute  美国国家标准学会
ASCII/American Standard Code for Information Interchange  美国信息交换标准码
Associates (of keywords)  (主题词的)联想词
AWL/academic word list  学术词表
Balanced corpus  平衡语料库
Base list/baselist  底表、基础词表
Bigram  二元组、二元序列、二元结构
Bi-text/bitext  双语合璧文本、双语合并文本、双语分行对齐文本(一句源语一句目标语对齐后的文本)
Bi-hapax  两次词、二次词
Bilingual corpus  双语语料库
Bootcamp debate/discourse/discussion  (新手)训练营大辩论/话语/大探讨
CA/Contrastive Analysis  对比分析
Case-sensitive/case sensitivity  大小写敏感、区分大小写
Category-based approach  基于类(范畴)的方法
Chi-square test/χ²  卡方检验
Chunk  词块
CIA/Contrastive Interlanguage Analysis  中介语对比分析
CLAWS/Constituent Likelihood Automatic Word-tagging System  CLAWS 词性赋码系统
Clean text policy  干净文本原则
Cluster  词丛、词簇
Colligation  类联接、类连接、类联结
Collocate n./v.  搭配词;搭配
Collocability  搭配强度、搭配力
Collocation  搭配、词语搭配
Collocational strength  搭配强度
Collocational framework/frame  搭配框架
Collocational profile  搭配概貌
Collocational network  搭配网络
Comparable corpora  类比语料库、可比语料库
Computational Linguistics  计算语言学
ConcGram/concgram  框合结构、同现词列
Concord  索引(行)(简略形式)
Concordance (line)  索引(行)
Concordance plot  (索引)词图
Concordancer  索引工具
Concordancing  索引分析
Context  语境、上下文
Context word  语境词
Contextual prosody  语境韵律
Contingency table  连列表、联列表、列连表、列联表
Co-occurrence/Co-occurring  共现、同现
Corpus Linguistics  语料库语言学
Corpus, pl. corpora  语料库
Corpus-based  基于语料库的
Corpus-based translation studies  语料库译学、基于语料库的翻译研究、语料库翻译学、基于语料库的译学研究
Corpus-driven  语料库驱动的
Corpus-informed  语料库指导下的、参考了语料库的
Corpus size  库容
Corpus stylistics  语料库文体学
Co-select/co-selection/co-selectiveness  共选(机制)
Co-text  共文
Data mining  数据挖掘
DDL/Data Driven Learning  数据驱动学习
Dependency  (句法)依存关系
Dice coefficient  Dice 系数
Disambiguation  消歧
Diachronic corpus  历时语料库
Discourse  话语、语篇
Discourse prosody  话语韵律
Documentation  说明文档、文检报告
EAGLES/Expert Advisory Groups on Language Engineering Standards  EAGLES 文本规格
Effect size  效应量
Empirical linguistics  实证语言学
Empiricism  经验主义
Encoding/Text encoding/Character encoding  字符编码
Error-tagging  错误标注、错误赋码
Explicitation  显化
Extended unit of meaning  扩展意义单位
File-based search/concordancing  批量检索
Firthian (linguistics)  弗斯(语言学)、弗斯学派的(语言学)
Fisher's exact test  费舍尔精确检验
Formulaic sequence  程式化序列、套语
Frequency  频数、频率
Frequency list  词频表
General (purpose) corpus  通用语料库
Genre  语体、体裁
Grammatical patterning  语法型式
Granularity  颗粒度
Hapax legomenon/hapax (pl. hapax legomena/hapaxes)  一次词
Header/Text head  文本头、头文件
Hidden Markov model (HMM)  隐马尔科夫模型、隐马模型
Historical corpus  历时语料库
HowNet  知网
ICTCLAS  中科院汉语分词系统
Idiom principle  习语原则、成语原则
Idiomaticity  习语性、地道程度
Implicitation  隐化
Index/indexing  (建)索引
In-line annotation  文内标注、行内标注
Interlanguage  中介语
Inter-coder agreement/reliability  标注者间一致性/信度
Introspection/introspective  内省(式)(的)
Intuition  直觉
Key keywords  关键主题词
Keyness  主题性、关键性
Keywords  主题词
KWIC/Key Word in Context  语境共现(方式)、语境中的关键词
KWIC sort  语境共现排序、索引行排序
Learner corpus  学习者语料库
Lemma, pl. lemmata/lemmas  原形词、词目
Lemmatization  词形还原
Lemmatizer  词形还原工具
Lexical bundle  词束
Lexical density  词汇密度
Lexical frequency profile  词频概貌
Lexical grammar  词汇语法
Lexical item  词项、词语项目
Lexical patterning  词语型式、词汇型式
Lexical priming  词汇触发理论、词汇启动理论
Lexical profile  词汇分布概貌
Lexical richness  词汇丰富度
Lexico-grammar  词汇语法
Lexis  词语、词项
Log-likelihood ratio  对数似然比、对数似然率
Longitudinal/developmental corpus  跟踪语料库、发展语料库、历时语料库
Machine-readable  (可)机读的
Machine translation  机器翻译
Manual annotation  手工标注、人工标注
Markup/mark-up  标记
MDA (Multi-dimensional analysis/approach)  多维分析、多维度分析法
Meaning by collocation  搭配辨义
Metadata  元信息
MF/MD approach/multi-feature/multi-dimensional analysis  多特征/多维度分析法
Misuse  误用
Monitor corpus  (动态)监察语料库
Monolingual corpus  单语语料库
Multilingual corpus  多语语料库
Multimodal corpus  多模态语料库
MWU/multiword unit  多词单位
MWE/multiword expression  多词表达
MI/mutual information  互信息、互现信息
N-gram  N 元组、N 元序列、N 元结构、N 元词、多词序列
Neo-Firth (school)  新弗斯学派
Neo-Firthian  新弗斯学派的
NLP/Natural Language Processing  自然语言处理
Node (word)  节点(词)
Normalization  标准化、(翻译)规范化
Normalized frequency  标准化频率、归一频率
Observed corpus  观察语料库
Ontology  知识本体、本体
Open-choice principle  开放选择原则
Orthographic  正字层面的、字面的
Orthography  正字法
Overuse  过多使用、超用、使用过度、过度使用
Paradigmatic  纵聚合(关系)的
Parallel corpus  平行语料库、对应语料库
Parole linguistics  言语语言学
Parsed corpus  句法标注的语料库、树库
Parser  句法分析器
Parsing  句法标注、句法分析
Pattern/patterning  型式、模式
Pattern grammar  型式语法
Pattern matching  模式匹配
Pedagogic corpus  教学语料库
Phraseology  短语学、短语
Phraseological unit/sequence  短语单位/序列
Phraseological profile  短语概貌
Plain text  纯文本
POSgram  赋码序列、码串
POS sequence  赋码序列、码串
POS tagging/Part-of-Speech tagging  词性赋码、词性标注、词性附码
POS tagger  词性赋码器、词性赋码工具
Prefab  预制语块
Probabilistic  (基于)概率的、概率性的、盖然的
Probabilistic grammar  概率语法、概率性语法、盖然语法
Probability  概率
Query  查询、检索
Range  分布(范围)、跨度
Rationalism  理性主义
Raw frequency  原始频数、生频数
Raw text/corpus  生文本/生语料
Reference corpus  参照语料库
Regex/RE/RegExp/regular expressions  正则表达式、正则式
Register  语域
Register variation  语域变异
Relative frequency  相对频率
Representative/representativeness  代表性(的)
Representivity  代表性
Rule-based  基于规则的
S-universals  源语型翻译共性(特征)
Sample n./v.  样本;取样、采样、抽样
Sampling  取样、采样、抽样
Sampling frame  取样框架、取样方案
Sampling strategy  取样方案
Sanitization  净化
Search term  检索项
Search word  检索词
Segmentation  切分、分词
Semantic association  语义联想
Semantic preference  语义倾向、语义趋向
Semantic prosody  语义韵
Sentence alignment  句对齐、句级对齐
SGML/Standard Generalized Markup Language  标准通用标记语言
Simplification  简化
Sketch Engine  文擎
Skipgram  跨词序列、跨词结构
Span  跨距
Specialized corpus  专用语料库、专门用途语料库、专题语料库
Standardized type/token ratio  标准化型次比、标准化类符/形符比、标准化类/形比
Standardized TTR/STTR  标准化型次比、标准化类符/形符比、标准化类/形比
Stand-off annotation  分离式标注
Stochastic  随机的
Stop list  停用词表、过滤词表
Stop word  停用词、过滤词
Synchronic corpus  共时语料库
Syntagmatic  横组合(关系)的
T score  T 值
T-universals  目标语型翻译共性(特征)
Tag  赋码、标记
Tagger  赋码器、赋码工具、标注工具
Tagging  赋码、标注
Tag sequence  赋码序列、码串
Tagset  赋码集、码集
Tertium comparationis  对比基础、对比中立项、对比中间项
Text  文本
Text type  文体、文类
Text category  文体、文类
Text mining  文本挖掘
TEI/Text Encoding Initiative  TEI 文本编码计划
The Lexical Approach  词汇中心教学法
The Lexical Syllabus  词汇大纲
Token  形符、词次
Token definition/word definition  形符界定、单词界定
Tokenization  分词
Tokenizer  分词工具
Transcription  转写
Translation memory  翻译记忆(库)
Translation norms  翻译规范
Translation universals/Universal features of translation  翻译共性、翻译普遍特征
Translational corpus  翻译(文本)语料库、译语语料库
Translationese  翻译腔、翻译体
Treebank  树库
Trigram  三元组、三元序列、三元结构
T-score  T 值
Type  类符、词种、词型
TTR  型次比、类符/形符比、类/形比
Type/token ratio  型次比、类符/形符比、类/形比
Underuse  少用、使用不足
Unicode  通用码
Unicodify  按通用码编码、转换为通用码
Unit of meaning  意义单位
WaC/Web as Corpus  网络语料库
Wildcard  通配符
Word alignment  词级对齐、词对齐
Word form  词形
Word family  词族
Word list  词表
WordNet  词网
Word sketch  词语素描
Word type  词种、词型
WSD/Word-sense disambiguation  词义消歧
XML/Extensible Markup Language  可扩展标记语言
Zipf's Law/Zipfian Law  齐夫定律、齐普夫定律
Z score  Z 值

Commonly Used Corpora (常用语料库)

ACE  Australian Corpus of English
ANC  American National Corpus
ARCHER  A Representative Corpus of Historical English Registers
BASE  British Academic Spoken English Corpus
BAWE  British Academic Written English Corpus
BNC  British National Corpus
BoE  Bank of English
Brown  Brown Corpus
CANCODE  Cambridge and Nottingham Corpus of Discourse in English
CEC  China English Corpus
CEM  Corpus for English Majors
CHILDES  Child Language Data Exchange System
CIC  Cambridge International Corpus
CLEC  Chinese Learner English Corpus
CLOB  2009 Brown family corpus of British English
COBUILD  Collins Birmingham University International Language Database
COCA  The Corpus of Contemporary American English
COHA  Corpus of Historical American English
COLSEC  College Learners' Spoken English Corpus
COLT  Bergen Corpus of London Teenage Language
Crown  2009 Brown family corpus of American English
FLOB  Freiburg-LOB Corpus of British English
FROWN  Freiburg-Brown Corpus of American English
GloWbE  Global Web-Based English
Helsinki Diachronic corpus  Diachronic part of the Helsinki Corpus of English Texts
HKCSE  Hong Kong Corpus of Spoken English
ICCI  International Corpus of Crosslinguistic Interlanguage
ICE  International Corpus of English
ICE-GB  International Corpus of English: Great Britain
ICLE  International Corpus of Learner English
ICNALE  The International Corpus Network of Asian Learners of English
JEFLL  Japanese EFL Learner Corpus
LCMC  Lancaster Corpus of Mandarin Chinese
LINDSEI  Louvain International Database of Spoken English Interlanguage
LIVAC  Linguistic Variations in Chinese Speech Communities
LLC  London-Lund Corpus
LOB  Lancaster-Oslo/Bergen Corpus
LOCNESS  Louvain Corpus of Native English Essays
LONGDALE  LONGitudinal DAtabase of Learner English
MICASE  Michigan Corpus of Academic Spoken English
MICUSP  Michigan Corpus of Upper-level Student Papers
NESSIE  Native English Speakers Similarly and Identically-prompted Essays
PACCEL  Parallel Corpus of Chinese EFL Learners
SBCSAE  Santa Barbara Corpus of Spoken American English
SCCSD  The Spoken Chinese Corpus of Situated Discourse
SCORE  Singapore Corpus of Research in Education
SEC  Spoken English Corpus
SECCL  Spoken English Corpus of Chinese Learners
SECOPETS  Spoken English Corpus of Public English Test System
SEU  Survey of English Usage
SWECCL  Spoken and Written English Corpus of Chinese Learners
ToRCH2009  Texts of Recent CHinese 2009
WECCL  Written English Corpus of Chinese Learners