Raychaudhuri S.: Computational text analysis for functional genomics and bioinformatics
Title: Computational text analysis for functional genomics and bioinformatics
Author: Raychaudhuri S.
Abstract: This book brings together the two disparate worlds of computational text analysis and biology and presents some of the latest methods and applications to proteomics, sequence analysis and gene expression data. Modern genomics generates large and comprehensive data sets, but their interpretation requires an understanding of a vast number of genes, their complex functions, and interactions. Keeping up with the literature on a single gene is a challenge in itself; for thousands of genes it is simply impossible.
Here, Soumya Raychaudhuri presents the techniques and algorithms needed to access and utilize the vast scientific text, i.e. methods that automatically "read" the literature on all the genes. Including background chapters on the necessary biology, statistics and genomics, in addition to practical examples of interpreting many different types of modern experiments, this book is ideal for students and researchers in computational biology, bioinformatics, genomics, statistics and computer science.
Language:
Category: Biology
Subject index status: Index with page numbers is ready
Year of publication: 2006
Number of pages: 312
Added to catalog: 11.12.2007
Subject index
Frequency-inverse document frequency weighting 91
Fukuda, K., Tamura, A. et al. 7 233 235 240
Fukuda, T. et al. 151
Function of genes and proteins 26—7
Functional assignment 120
Functional assignment, effectiveness of text classification algorithms 212—21
Functional assignment, value of keywords 123
Functional coherence 147 148—52 171
Functional coherence, assessment see also neighbor divergence per gene (NDPG)
Functional coherence, assessment, best article score 160—2
Functional coherence, assessment, computational approach 152—5
Functional coherence, assessment, evaluation of algorithms 155—7
Functional coherence, assessment, neighbor divergence (ND) 163—4
Functional coherence, assessment, screening gene expression clusters 167—9
Functional coherence, assessment, word distribution divergence (WDD) 157—60
Functional coherence, corruption studies 166—7
Functional coherence, relationship to NDPG score 181
Functional coherence, scoring 153—4 157
Functional coherence, scoring, precision-recall plot 160
Functional determination, gene groups 170
Functional gene groups 156
Functional information, use in sequence analysis 107
Functional neighbors, neighbor expression information (NEI) scoring 129—32
Functional vocabularies 90 196—7
Functional vocabularies, Enzyme Commission (EC) 200 201
Functional vocabularies, Kyoto Encyclopedia of Genes and Genomes (KEGG) 200—1
Functions, poorly referenced 188—9
Funk, M.E. and Reid, C.A. 218
G distribution of words 158—9
Gap penalties in multiple alignment 49
Gap penalties in pairwise alignment 42 43—4 45 46
Gavin, A.C., Bosche, M. et al. 248
Gelbart, W.M., Crosby, M. et al. 184 229
GenBank database, growth 37
Gene annotation 104
Gene annotation by maximum entropy, classifier 221—4
Gene deletion identification 63
Gene dictionaries 228—2
Gene duplication identification 63
Gene expression analysis 1 8 14 61—2 65—6 83 171—2 202 pl
Gene expression analysis, arrays 26—7 63 pl
Gene expression analysis, assignment of keywords 140—5 173
Gene expression analysis, gene groups 172—3
Gene expression analysis, hierarchical clustering 178 183
Gene expression analysis, hierarchical clustering, application to yeast data set 181—3
Gene expression analysis, hierarchical clustering, pruning dendrograms 178—81
Gene expression analysis, SAGE 64—5 pl
Gene expression analysis, screening clusters 173—8
Gene expression analysis, sources of noise 125
Gene expression clusters, functional coherence assessment 167—9
Gene expression data, advantage of text-based approach 12 13
Gene expression data, clustering algorithms 66—72
Gene expression data, dimensional reduction 72—4
Gene expression data, matrix organization 65
Gene expression regulation 23—4
Gene expression similarity, relationship to word vector similarity 99 100
Gene expression, relationship to NEI scores 133—5 134
Gene function annotation 195—6
Gene function vocabularies see Functional vocabularies
Gene groups 147. see also neighbor divergence per gene (NDPG)
Gene groups, best article score (BAS) 160—2
Gene groups, corruption studies 166—7
Gene groups, determination of function 170
Gene groups, evaluation of assessment algorithms 155—7
Gene groups, functional coherence 148—52
Gene groups, in gene expression analysis 172—3
Gene groups, keyword assignment 100
Gene groups, neighbor divergence (ND) 163—4
Gene groups, theoretical distribution of article scores 163—4
Gene groups, word distribution divergence (WDD) 157—60
Gene interactions databases 7
Gene interactions, textual cooccurrences 250—59
Gene name recognition 227—28
Gene name recognition, dictionaries 228—2
Gene name recognition, unified algorithm 240—3
Gene name recognition, use of abbreviations 237—40 238 239
Gene name recognition, use of context 235—7
Gene name recognition, use of morphology 237
Gene name recognition, use of syntax 233—5
Gene name recognition, word structure and appearance 232—3
Gene names, synonyms 228—29 230 231
Gene networks 245 246—7
Gene networks, roles of scientific text 249
Gene networks, roles of scientific text, co-occurring genes 249—50
Gene Ontology 7 11—12 13 90 152 184 196 197—198
Gene Ontology, evidence codes 198 199—200
Gene Ontology, functional groups, yeast 175—6
Gene Ontology, functional groups, yeast, correlation with NDPG score of nodes 181 182
Gene Ontology, precision-recall performance of codes 222—4 223
Gene Ontology, quality of annotations 188
Gene references, skewed distribution 174
Gene-protein interactions 247
General Repository for Interaction Datasets (GRID) 250 251 261 266 267
Generalized iterative scaling (GIS) 209—10
Genes 22—4
GENES database 201
Genes, defining textual profiles 94—6
Genes, functional assignment 120
Genes, functions 26—7
Genes, homology 40
Genes, querying for biological function 101—4
Genes, structure 22
Genetic code 22 23
Genome databases 9—11
Genome sequence information 125—6
Genome sizes 18
Genomic data analysis 7—8 pl
Genomics era 1
Genomics literature 2—4
Genomics literature, diversity 5
Genomics literature, quality 4
Genomics literature, relevance 4—5
Giot, L., Bader, J.S. et al. 248
Glenisson, P., Coessens, B. et al. 90 95 99
Glutamic acid 25
Glutamine 25
Glycine 25
Glycogenin 192
Glycolysis genes 150
Gold standards 116—17 184 197 222
Golub, T.R., Slonim, D.K. et al. 78
Gotoh, O. 47
Groups of genes see Gene groups
Guanine 18 19
Guzder, S.N. et al. 149
Haber, J.E. 151
Hairpin loops, RNA 21
Halushka, M.K., Fan, J.B. et al. 63
Heartless gene 97 98
Heartless gene, synonyms 229 230
Heat shock protein GO functional group 176
Hermeking, H. 64
Hersh, W. 86
Hersh, W., Bhupatiraju, R.T. et al. 195
Heyer, L.J., Kruglyak, S. et al. 67
Hidden Markov models (HMM) 54—61 57
Hidden Markov models (HMM), use in gene name recognition 237
Hierarchical clustering 70—2 86 pl
Hierarchical clustering, fly gene expression data set 191—4
Hierarchical clustering, gene expression analysis 178—84
Hierarchical organization, Gene Ontology 12 197 198
High entropy models 206
HighWire Press 3 9
Hill, D.P., Davis, A.P. et al. 187—88
Histidine 25
Ho, Y., Gruhler, A. et al. 248
Homayouni, R., Heinrich, K. et al. 92
Homologous genes, recognition 108 114—15 190
Homologous sequences 111
Homology 40—2 117
Homology, remote 114—15
Hughes, T.R., Marton, M.J. et al. 63
Human genes, bias in areas studied 5 6
Human genome project 1
Human genome size 18
Humphreys, K., Demetriou, G. et al. 7
Hutchinson, D. 3
Hvidsten, T.R., Komorowski, J. et al. 187—88
Hydrogen bonding, nucleotide bases 19 20
Hydrogen bonding, proteins 25 26 pl
IDA (inferred from direct assay) 188
Incoherent gene groups, article scores 163 164
Inconsistencies in classification of documents 218
Independence assumption 29—30
Independence assumption, naive Bayes classification 204 205 218
Independence of events 29—30
Inferred from Electronic Annotation (IEA) evidence code 189 198 200
Inferred from Reviewed Computational Analysis (RCA), evidence code 198 200
Inferred from Sequence Similarity (ISS), evidence code 189 198 199
Information extraction 259—2
Information retrieval 86
Information retrieval, latent semantic indexing 92 104
Information theory 33—4
Inter-gene difference calculation 181
Interaction verbs 260—1
Interactions 245
Introns 21 22 23
Inverse document frequency weighted word vectors 91 161
Isoleucine 25
Iterative sequence similarity searches, modification to include text 115—17; see also Position specific iterative BLAST (PSI-BLAST)
Ito, T., Chiba, T. et al. 248
Jaccard coefficient 87
Jenssen, T.K., Laegreid, A. et al. 8 152 250
Journals relevant to genomics 3
Journals, online 3
K-means clustering 68 173 177 pl
Kanehisa, M., Goto, S. et al. 200
KEGG Orthology (KO) numbers 201
Kellis, M. et al. 151
Kerr, M.K. and Churchill, G.A. 67
Key articles, recognition 153
Keyword queries 101—4 102 103
Keywords field (KW), SWISS-PROT 109
Keywords, assignment 100 141—5 173
Keywords, assistance in functional assignment 123
Keywords, breathless and heartless genes 96 97
Keywords, definition for proteins 107
Keywords, expression values pl 5.1
Keywords, in identification of protein-protein interactions 260
Keywords, MeSH headings 9
Keywords, phosphate metabolism study 144—5
Keywords, use in recognition of gene names 233 235 236
Keywords, use to summarize sequence hits 112—14
Klein, H.L. 151
Krauthammer, M., Rzhetsky, A. et al. 232
Krogh, A., Brown, M. et al. 54
Kullback-Leibler (KL) distance 34 131—2
Kullback-Leibler (KL) distance in ND 163
Kullback-Leibler (KL) distance in NDPG 154
Kullback-Leibler (KL) distance in WDD 159
Kwok, P.Y. and Chen, X. 1
Kyoto Encyclopedia of Genes and Genomes (KEGG) 200—1
Latent dimension, relationship to variance 94
Latent semantic indexing (LSI) 92—4 93 104 140
Lee, M.L., Kuo, F.C. et al. 124
Lee, S.E. et al. 151
Lesk, A.M. 1
Leucine 25
Linear discriminant analysis (LDA) 75—9 76 pl
Linear discriminant analysis (LDA), applications 78
Linear time algorithms 48
Literature 2—4
Literature index 185 186
Literature index, comparison between organisms 185—6
Literature similarity constraint, modified PSI-BLAST 117 120
Literature, diversity 5
Literature, quality 4
Literature, relevance 4—5
LocusLink 3—4 5 6 11
Logistic regression 75
Logistic regression classification 239—40
Low entropy models 206
Low induction false positives, recognition 138
Low induction genes, NEI scores 139
Lu, Z., Szafron, D. et al. 120
Lymphoma, gene expression profiles 62
Lysine 25
MacCallum, R.M., Kelley, L.A. et al. 8 107 115
Machine learning algorithms, combination of sequence and textual information 120—21
Machine learning algorithms, supervised 66 74—9
Machine learning algorithms, unsupervised see Clustering algorithms
Manning, C.D. and Schütze, H. 2 84 86 89 202 204
Mantovani, R. 50
Marcotte, E.M., Xenarios, I. et al. 7 262
Mass spectroscopy 248
Masys, D.R., Welsh, J.B. et al. 112
Matching coefficient 87
Matrices, reference matrix (R) 142
Matrices, text matrix (T) 143
Matrices, weighted word-document matrix (W) 91
Matrices, word covariance matrix 92—4
Matrices, word-document matrix (A) 85
Matrix organization, gene expression data 65
Maximum entropy modeling 195—6 203 205—10 207
Maximum entropy modeling, accuracy 217 218 219 220
Maximum entropy modeling, annotation of genes 221—4
Maximum entropy modeling, in identification of protein-protein interactions 263—68 264—5 266 267
Maximum entropy modeling, use in gene name recognition 236 242 243
McCallum, J. and Ganesh, S. 107 114
Mean 34—5
Meaning of text 86
Median 34 35
Medline database, abbreviations 240
Medline database, format 9 10
Merging articles, disadvantages 157
MeSH headings 9 213
MeSH headings, assignment by National Library of Medicine 224
MeSH headings, consistency of assignment 218
Messenger RNA (mRNA) 18 21
Messenger RNA (mRNA), measurement in gene expression arrays 63
Metabolism genes 150
Methionine 25
Mewes, H.W., Frishman, D. et al. 7 78 152 169 196
Michaels, G.S., Carr, D.B. et al. 67
Microarrays see Arrays
Mitchell, A.P. 78
Miyagawa, K. et al. 151
Molecular biology, biological function 26—7
Molecular biology, central dogma 18 pl
Molecular biology, deoxyribonucleic acid (DNA) 18—20
Molecular biology, genes 22—4
Molecular biology, proteins 24—6
Molecular biology, ribonucleic acid (RNA) 20—2
Molecular function terms, Gene Ontology 11—12 197—198
Morgan, A.A., Hirschman, L. et al. 232 237 240
Morphological variants 242
Morphology, use in gene name recognition 228
Mouse genes, ank root 237
Mouse Genome Database (MGD) 9 11 184
Mouse Genome Database (MGD), synonym lists 229
Mouse, assembly of functional groups 185—9
Mouse, GO annotated genes 13
Mouse, literature index 185 186
Mouse, sensitivity of NDPG 187
Mouse, tricarboxylic acid cycle (TCA) functional group 189
Multiple functions of genes 150
Multiple sequence alignment 48—9 83
Multiple sequence alignment, hidden Markov models 54—61 57
Multiple sequence alignment, position specific iterative BLAST (PSI-BLAST) 53—4
Multivariate normal distribution 76