Raychaudhuri S.: Computational text analysis for functional genomics and bioinformatics
Title: Computational text analysis for functional genomics and bioinformatics
Author: Raychaudhuri S.
Abstract: This book brings together the two disparate worlds of computational text analysis and biology and presents some of the latest methods and applications to proteomics, sequence analysis and gene expression data. Modern genomics generates large and comprehensive data sets, but their interpretation requires an understanding of a vast number of genes, their complex functions, and interactions. Keeping up with the literature on a single gene is a challenge in itself; for thousands of genes it is simply impossible.
Here, Soumya Raychaudhuri presents the techniques and algorithms needed to access and utilize the vast scientific text, i.e. methods that automatically "read" the literature on all the genes. Including background chapters on the necessary biology, statistics and genomics, in addition to practical examples of interpreting many different types of modern experiments, this book is ideal for students and researchers in computational biology, bioinformatics, genomics, statistics and computer science.
Language:
Category: Biology
Subject index status: index with page numbers is complete
Year of publication: 2006
Number of pages: 312
Added to catalog: 11.12.2007
Subject index
Munich Information Center for Protein Sequences (MIPS) 7 152 196
Murzin, A.G., Brenner, S.E. et al. 117
Mus musculus see Mouse
Mycobacterium tuberculosis, genome size 18
N-gram classifier 241
Naive Bayes text classification scheme 203 204—5 216
Naive Bayes text classification scheme, accuracy 221 222
Naive Bayes text classification scheme, use in gene name recognition 235—6 242—3
Naive Bayes text classification scheme, use in protein-protein interaction identification 262
Name recognition see Gene name recognition
National Library of Medicine, assignment of MeSH headings 224
Natural language processing 2
Natural language processing algorithms 83
Nearest neighbor classification 75
Nearest neighbor classification, accuracy 217 218
Nearest neighbor classification, application to text classification 203—4 216
Needleman, S.B. and Wunsch, C.D. 44
Negations 86
Neighbor divergence (ND) 163—4
Neighbor divergence (ND), precision-recall plot 160
Neighbor divergence per gene (NDPG) 152 162 164—6
Neighbor divergence per gene (NDPG) scores, nodes 179
Neighbor divergence per gene (NDPG) scores, random and functional groups 166
Neighbor divergence per gene (NDPG), computational approach 153—5
Neighbor divergence per gene (NDPG), corruption studies 166—7 168
Neighbor divergence per gene (NDPG), data types required 155
Neighbor divergence per gene (NDPG), evaluation across different organisms 184—9 191
Neighbor divergence per gene (NDPG), evaluation of method 155—7
Neighbor divergence per gene (NDPG), precision-recall plot 160
Neighbor divergence per gene (NDPG), scores for functional groups 167
Neighbor divergence per gene (NDPG), screening of gene expression clusters 168—9 173—8 pl
Neighbor divergence per gene (NDPG), sensitivity, relationship to annotation quality 179
Neighbor expression information (NEI) scoring 124 130—2
Neighbor expression information (NEI) scoring, application to phosphate metabolism data set 132—6 140
Neighbor expression information (NEI) scoring, application to SAGE and yeast-2-hybrid assays 141
Neighbor expression information (NEI) scoring, low induction genes 138—9
Neighbor expression information (NEI) scoring, scores of individual experiments 136—8 137
Neighborhood words 48
Nelson, D.L., Lehninger, A.L. et al. 17
Networks, genetic 245 246—7
Neural networks 75
Ng, S.K. and Wong, M. 7
Nigam, K., Lafferty, J. et al. 205—6 218
Node selection, dendrograms 179—81
Nodes, pruning 178—81 pl
Nodes, states 180
Noise 123 124—6
Noise, management in phosphate metabolism study 129—41
Normal distribution 32
Normal distribution, z-score 35
Novak, J.P., Sladek, R. et al. 124
Nucleosome GO functional group 176
Nucleotide bases 18 19
Nucleotide bases, in RNA 21
Nucleotide bases, pairing 19 20
Nucleotide bases, phosphodiester bond 20
Number, prediction of likelihood of interactions 256 257 258 259
Nylon gene arrays 63
Ohta, Y., Yamamoto, Y. et al. 7
Oligonucleotide arrays 62
Online journals 3
Ono, T., Hishigaki, H. et al. 7 262
Oryza sativa, GO annotated genes 13
Overlap coefficient 87
Overlap, clusters and functional groups 176—7 pl
p53 5
Pairwise sequence alignment 44—8 96
PAM250 matrix 44
Parameter weights, maximum entropy classification 209
Parsing sentences 262
Part-of-speech tagging 233—5 241
Part-of-speech, use in identification of protein-protein interactions 262
PATHWAYS database 201
Pearson, W.R. 43 48
Pearson, W.R. and Lipman, D.J. 48
Peer-reviewed literature 2
Peer-reviewed literature, value in genomic data set analysis 8
Peptide bond 24
Performance measures 35—7
Petukhova, G. et al. 149 151
PHO11 gene 128 129—30
PHO11 gene, expression ratio distribution 131
Phenylalanine 25
Phillips, B., Billin, A.N. et al. 192
Phosphate metabolism study 126—7
Phosphate metabolism study, distribution of NEI scores 133
Phosphate metabolism study, expression log ratios 127
Phosphate metabolism study, keywords 144—5
Phosphate metabolism study, literature-based scoring system 129—30
Phosphate metabolism study, NEI scores 136
Phosphate metabolism study, NEI scores of individual experiments 136—8 137
Phosphate metabolism study, neighbor expression information (NEI) scoring 132—6
Phosphate metabolism study, top fifteen genes 127—9 128
Phosphodiester bond 20
Plasmodium falciparum, genome size 18
Poisson distribution 32 33 163 164
Pollack, J.R., Perou, C.M. et al. 63
Poly-A tail, RNA 21 22
Polyadenylation signal 23
Polymerase proteins 24. see also DNA polymerase; RNA polymerase
Poorly referenced areas 108 117 140 184
Poorly referenced areas, functions 188—9
Poorly referenced areas, transference of references 189—92
Poorly referenced areas, use of sequence similarity 111
Poorly referenced areas, worm 187
Population statistics 34—5
Porter, M.F. (Porter’s algorithm) 90
Position specific iterative BLAST (PSI-BLAST) 53—4 115
Position specific iterative BLAST (PSI-BLAST), evaluation 117—20 118 119
Position specific iterative BLAST (PSI-BLAST), modification to include text 116—17
Precision 37 212
Precision, PSI-BLAST 118—19
Precision-recall performance, GO codes 222—4 223
Precision-recall plot, functional coherence scoring methods 160
Predefined sets of words 90
Prediction results 36
Predictive algorithms, measures of 35—7
Prey proteins 248
Primary structure, proteins 25
Primary transcript 22
Principal component analysis (PCA) 73—4 92
Probability 27—8
Probability density function, multivariate normal distribution 76
Probability distribution functions (pdfs) 31—3
Probability distribution functions (pdfs), statistical parameters 35
Probability, Bayes' theorem 30
Probability, conditional 28—9
Probability, independence of events 29—30
Probability, information theory 33—4
Profile drift 116
Profiles 50 65
Progressive alignment 49
Proline 25
Promoter sites, DNA 21 22 23
Protein binding 141
Protein interaction networks 245
Protein name recognition, use of word appearance 233 234
Protein sequence probabilities, use of Bayes’ theorem 30
Protein-gene interactions 247
Protein-protein interactions 245 247
Protein-protein interactions, affinity precipitation 248
Protein-protein interactions, gene name co-occurrence 250—59
Protein-protein interactions, information extraction strategies 259—2
Protein-protein interactions, statistical textual classifiers 262—68
Protein-protein interactions, yeast-2-hybrid method 247—48
Proteins 24—6
Proteins, Edman degradation 39—40
Proteins, function assignment, role of text analysis 108
Proteins, function assignment, utilization of text and sequence information 120—21
Proteins, functions 18 26 27
Proteins, SCOP database 117—18
Proteins, synthesis 18 21—2
Proteins, tertiary structure pl 2.4
Proteomics methods, introduction 1
Proux, D., Rechenmann, F. et al. 7 233
Pruitt, K.D. and Maglott, D.R. 4 11
Pruning dendrograms 178—81 pl
Pruning dendrograms, application to yeast data set 181—4
Pseudo-counts of words, use in naive Bayes classification 204—5
Pseudo-reference assignation 110
PU conditions, phosphate metabolism study 126
Public Library of Science (PLOS) 3 9
PubMed abstracts 2 3 4 9 11 pl
PubMed abstracts, use for NDPG 155
PubMed Central 3 9
Purine nucleotide bases 18 19
Pustejovsky, J., Castano, J. et al. 238
Pyrimidine nucleotide bases 18 19
Quality, genomics literature 4
Rain, J.C., Selig, L. et al. 248
Rare words 88 89 91
Ratnaparkhi, A. 205 209
Rattus norvegicus, GO annotated genes 13
Raychaudhuri, S. and Altman, R. B. 184
Raychaudhuri, S., Chang, J.T. et al. 7 8 179 188
Raychaudhuri, S., Schütze, H. et al. 152 157
Raychaudhuri, S., Stuart, M. et al. 63 72
Raychaudhuri, S., Sutphin, P.D. et al. 62
RDH54 gene, representation in literature 150 151
Real-valued vectors, comparison metrics 87
Recall 37 212
Recall, PSI-BLAST 118
Reference indices 95 152 185 188
Reference indices, genome databases 9—11
Reference matrix (R) 142
References, in SWISS-PROT 109
Relevance, literature sources 4—5
Replicates, value in recognition of false positives 138
Reporter genes 248
Restriction enzymes 64
Ribonucleic acid see RNA
Ribonucleotides 21
Ribose 19
Ribosomal RNAs (rRNA) 21
Riley, M. 196
Rindflesch, T.C., Tanabe, L. et al. 236
Ripley, B.D. 67 75
RNA 18 20—2
RNA polymerase 18 21
RNA, binding by proteins 25 26
RNA, nucleotide bases 19
RNA, yeast transfer RNA structure pl 2.2
Roberts, R.J. 3
Roots of gene names 237 241—2
Rosenfeld, R. 83 197 218
Ross, D.T., Scherf, U. et al. 67
Saccharomyces cerevisiae see Yeast
Saccharomyces Genome Database (SGD) 9 11 127 174 180 184 212 221 251
SAGE (Serial Analysis of Gene Expression) 62 64—5 pl
SAGE (Serial Analysis of Gene Expression), use with NEI scores 141
Saldanha, A.J., Brauer, M. et al. 126
Sample preparation, sources of variability 125
Sanger dideoxy method 39 pl
Sanger dideoxy sequencing method 39 pl
Schena, M., Shalon, D. et al. 63
Schug, J., Diskin, S. et al. 188
Scope of functionally coherent gene groups 150
Score matrix, dynamic programming 45
Score step, gene name finding algorithm 241
Scoring functions in multiple alignment 48—9
Scoring functions in pairwise alignment 42
Scoring of functional coherence 153—4 157
Secondary structure prediction, hidden Markov models 56 57
Sekimizu, T., Park, H.S. et al. 7 262
Selected state of nodes 180
Self-hybridization, mRNA 21
Self-organizing maps 69—70 173 pl 7.1
Self-organizing maps, yeast gene expression data 70 174
Semantic neighbors 153
Semantic neighbors, number, relationship to performance of NDPG 165
Sensitivity 36 37
Sensitivity of NDPG 187
Sentence co-occurrences 251 252 253 254
Sequence alignment 42—4
Sequence alignment, BLAST 48
Sequence alignment, dynamic programming 44—7
Sequence alignment, multiple 48—61
Sequence analysis, use of text 107—9
Sequence comparison 40—2
Sequence contamination 54
Sequence hits, description by keywords 112—14
Sequence hits, organization by textual profiles 114
Sequence hits, sequence information, combination with textual information 120—21
Sequence similarity, relationship to word vector similarity (breathless) 99
Sequence similarity, use to extend literature references 111—12
Sequences, comparison to profiles 50—3
Sequencing 8 14 38
Sequencing, Edman degradation of proteins 39—40
Serine 25
Sharff, A. and Jhoti, H. 1
Shatkay, H. and Feldman, R. 2 7
Shatkay, H., Edwards, S. et al. 8 95 112
Sherlock, G. 67
Shinohara, M. et al. 151
Shor, E. et al. 151
Shotgun assembly strategy 39
Signon, L. et al. 151
Single expression series, keyword assignment 141—5
Single expression series, lack of context 123
Single expression series, noise 124—6
Single expression series, phosphate metabolism study 126—30 132—40
Single nucleotide polymorphism, detection, introduction 1
Single nucleotide polymorphism, identification 63
Smarr, J. and Manning, C. 241
Sources of noise 125
Specificity 36
Spellman, P.T., Sherlock, G. et al. 63 78
Splicing, primary transcript 22
Spotted DNA microarrays 62 63
Standard deviation 34 35
Standardized gene names 228—29
Stanford Microarray Database (SMD) 126
Stapley, B.J., Kelley, L.A. et al. 120
Statistical machine learning 262—68
Statistical parameters 34—5
Statistical parameters, gene reference indices 174
Stein, L., Sternberg, P. et al. 9 184 229
Stemming 90
Stephens, M., Palakal, M. et al. 7
Stop lists 89 90
Stopwords 216
String matching strategy 40—1
Structural Classification of Proteins Database (SCOP) 117—18
Structural proteins 24
Stryer, L. 17 39
Study areas, bias 5
Subsequence alignment 45
Substitution matrices 43 44
Substitution of amino acids 41 42—3
Sum of pairs scoring system 49
Sung, P. et al. 151
Supervised machine learning algorithms 66 74—9 202
Support vector machine classifiers 242 243 263
SWISS-PROT database 3 11 108 109—11 115 118 189 pl
Symington, L.S. 151
Synonym lists 229
Synonyms for genes 230—1 232
Syntax, use in recognition of gene names 228 233—5 241
Tag sequences 64
Tagging enzyme 64