Bioinformatics Course: Databases and Web Services
Zhou Du (杜舟), Bioinformatics, class of 2007, Zhen Su's lab (a second-year PhD student, an old hand by now)

Concepts
Bioinformatics
Computational Biology
Database
Web server
Web service
(Many who draw a distinction between bioinformatics and computational biology portray the former as a tool kit and the latter as science.)

Nucleic Acids Research Database and Web Server issues
Database
Web Server
http://www.oxfordjournals.org/nar/database/c/
Database
http://bioinformatics.ca/links_directory/
Web server
Google!

Major bioinformatics journals
Specialist journals (mainly computational papers): Bioinformatics, PLoS Computational Biology, BMC Bioinformatics, Journal of Computational Biology, BMC Genomics, BMC Systems Biology,
Molecular Biology and Evolution.
Semi-specialist journals (nearly every issue carries a certain proportion of computational papers): Genome Biology, Nucleic Acids Research, Genome Research, Molecular Systems Biology, American Journal of Human Genetics, ...
General journals: Nature, Science, PNAS, PLoS ONE, ...
Others (computational papers appear occasionally): Nature Biotechnology, Nature Genetics, Nature Methods, Cell, Trends in Genetics, PLoS Genetics, ...

Part I: Overview of bioinformatics databases and web servers
Part II: Introduction to bioinformatics web services created in Zhen Su's lab
Part III: Construction of databases and web services

Three major public DNA databases
In 1988, the three databases (GenBank, EMBL, and DDBJ) formed the International Nucleotide Sequence Database Collaboration (INSDC), which stipulates:
1. Data exchange and sharing (performed every 24 hours).
2. A unified record format for processing submitted data, so that corresponding records are consistent in content across the databases.
3. Data maintenance and updating. Each database updates only the records that were directly submitted to it.

What is an accession number?
An accession number is a label used to identify a record.
Examples (all for retinol-binding protein, RBP4):
X02775      GenBank genomic DNA sequence (accession format 1+5 or 2+6, i.e. one letter plus five digits or two letters plus six digits)
NT_030059   Genomic contig in RefSeq
rs7079946   dbSNP (single nucleotide polymorphism)
N91759.1    An expressed sequence tag (1 of 170)
NM_006744   RefSeq DNA sequence (from a transcript)
NP_007635   RefSeq protein
AAC02945    GenBank protein
Q28369      SwissProt protein
1KT7        Protein Data Bank structure record
Molecule types: protein, DNA, RNA

Accession number series in RefSeq
Experimentally determined
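sequences:
NT_123456   Genomic contigs (DNA)
NM_123456   mRNA
NP_123456   Proteins
Sequences derived through genome annotation efforts:
XM_123456   Model mRNAs
XP_123456   Model proteins

Introduction to NCBI
NCBI (National Center for Biotechnology Information), established in 1988
Main tasks: develop databases; conduct computational biology research; develop tools for analyzing genome data; disseminate biomedical information; etc.
Databases managed: GenBank, UniGene, RefSeq, dbSNP, dbEST, OMIM
Services provided: Entrez database retrieval; BLAST sequence search and alignment against databases; etc.

Using NCBI to obtain all full-length maize cDNAs
1. Search with the keyword FLI-CDNA
2. Select Nucleotide
3. Select the species: maize
4. Select the display format (optional)
5. Select the download method; the FASTA file can be downloaded directly

Pfam
http://pfam.janelia.org/

Genome Browser
Browse genome information: raw sequence, gene structure, EST support, transcription factors, sequence conservation, SNPs, and a range of other annotations.
Drawback: suited only to manual browsing, not to large-scale processing.
JBrowse

UCSC Introduction
University of California Santa Cruz (UCSC) Genome Browser Database
URL: http://genome.ucsc.edu/
Data content: genome data; alignments between genomes; reference sequences (mRNA, EST); gene annotation information (ENCODE project)
UCSC home page
Genome Browser
Customized UCSC Browser

Introduction to the databases and web services of Zhen Su's lab
Plant mRNA database
Zhenhai Zhang, Jingyin Yu, Daofeng Li, Zuyong Zhang, Fengxia Liu, Xin Zhou, Tao Wang, Yi Ling, and Zhen Su. Nucleic Acids Research, 2010, Vol. 38, Database issue, D806-D813.
Soybean functional database
Medicago database
Li D, Su Z, Dong J, Wang T. An expression database for roots of the model legume Medicago truncatula under salt stress. BMC Genomics. 2009 Nov 11;10(1):517.
Plant secreted protein database
Zhou Du, Xin Zhou, Li Li, Zhen Su. plantsUPS: a database of plants Ubiquitin Proteasome System. BMC Genomics, 2009, 10:227.
Plant ubiquitination system database
Maize signal transduction database
BMC Genomics, 2010
EasyGO: a GO enrichment analysis platform
Xin Zhou, Zhen Su. EasyGO: Gene Ontology-based annotation and functional enrichment analysis tool for agronomical species. BMC Genomics 2007, 8:246.
agriGO: a GO enrichment analysis platform for agricultural species
Zhou Du, Xin Zhou, Yi Ling, Zhenhai Zhang and Zhen Su. Nucleic Acids Research, 2010. Faculty of 1000 Biology: "Recommended".

Techniques that may be needed to build a database or web service
Database
Biological meaning
Computer technique
Linux, Apache, MySQL, PHP/Python/Perl (LAMP) + HTML (CSS) + JavaScript
Literature mining

Thank you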
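
Supplementary sketch: the RefSeq accession series described in Part I (NT_, NM_, NP_ for experimentally determined sequences; XM_, XP_ for sequences derived through genome annotation) can be recognized from the prefix alone. The Python snippet below is a minimal illustration of that idea, not an official NCBI parser. The function name `classify_accession`, the label strings, and the regular expressions are assumptions made for this example. It handles only the RefSeq prefixes listed in the slides plus the dbSNP "rs" form; GenBank, SwissProt, and PDB identifiers are deliberately left as "unrecognized".

```python
import re

# RefSeq prefixes as listed in the slides; the label text is illustrative.
REFSEQ_PREFIXES = {
    "NT": "RefSeq genomic contig (DNA)",
    "NM": "RefSeq mRNA",
    "NP": "RefSeq protein",
    "XM": "RefSeq model mRNA (from genome annotation)",
    "XP": "RefSeq model protein (from genome annotation)",
}

# Assumed RefSeq shape: two letters, underscore, digits, optional ".version".
_REFSEQ_RE = re.compile(r"^([A-Z]{2})_(\d+)(?:\.(\d+))?$")
# Assumed dbSNP shape: "rs" followed by digits (case-insensitive).
_DBSNP_RE = re.compile(r"^rs\d+$", re.IGNORECASE)


def classify_accession(accession: str) -> str:
    """Return a human-readable label for an accession, or 'unrecognized'."""
    acc = accession.strip()
    match = _REFSEQ_RE.match(acc.upper())
    if match and match.group(1) in REFSEQ_PREFIXES:
        return REFSEQ_PREFIXES[match.group(1)]
    if _DBSNP_RE.match(acc):
        return "dbSNP record"
    # Other identifier families (GenBank, SwissProt, PDB) are out of scope here.
    return "unrecognized"


if __name__ == "__main__":
    # Identifiers taken from the RBP4 examples and the RefSeq series in the slides.
    for acc in ["NM_006744", "NT_030059", "XP_123456", "rs7079946", "X02775"]:
        print(acc, "->", classify_accession(acc))
```

A prefix lookup like this is useful as a first filter when a web service accepts mixed identifier lists from users; a production service would more likely validate identifiers against the source database itself.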