Raychaudhuri S. - Computational text analysis for functional genomics and bioinformatics

Title: Computational text analysis for functional genomics and bioinformatics

Author: Raychaudhuri S.


This book brings together the two disparate worlds of computational text analysis and biology and presents some of the latest methods and applications to proteomics, sequence analysis and gene expression data. Modern genomics generates large and comprehensive data sets, but their interpretation requires an understanding of a vast number of genes, their complex functions, and interactions. Keeping up with the literature on a single gene is a challenge in itself; for thousands of genes it is simply impossible.
Here, Soumya Raychaudhuri presents the techniques and algorithms needed to access and utilize the vast scientific text, i.e. methods that automatically "read" the literature on all the genes. Including background chapters on the necessary biology, statistics and genomics, in addition to practical examples of interpreting many different types of modern experiments, this book is ideal for students and researchers in computational biology, bioinformatics, genomics, statistics and computer science.

Language: en

Category: Biology

Subject index status: index with page numbers is complete

Year of publication: 2006

Number of pages: 312

Added to catalog: 11.12.2007

Subject index
3' and 5' untranslated regions      23
Aach, J. and Church, G.M.      67
Abbreviations, use in gene name recognition      228 237-42 238 239 243
Abstract co-occurrences      254-59 256
Abstract co-occurrences, number, prediction of likelihood of interaction      254-59 256 257 258
Accession number (AC), SWISS-PROT      109
Accuracy      36 212
Adenosine      18 19
Affine gap penalty function      43
Affinity precipitation      248
Agglomerative hierarchical clustering      71-2
Alanine      25
Alberts, B., Bray, D. et al.      17
Algorithms, measurement of performance      35-7
Aligned sequences      42
Alignment algorithms      42-4
Alignment, dynamic programming      44-7
Alizadeh, A.A., Eisen, M.B. et al.      62 63 68 78
Alpha helices      25
Alpha helices, hydrogen bonding      pl 2.3
Altman, R.B. and Raychaudhuri, S.      67 86
Altschul, S.F., Gish, W. et al.      48
Altschul, S.F., Madden, T.L. et al.      115
Ambiguity of gene names      229 232
Amino acid sequences, probabilities      30
Amino acids      24-5 25
Amino acids, emission probabilities      59
Amino acids, genetic code      23
Amino acids, secondary structure prediction      56-7
Amino acids, structure      24
Amino acids, substitutions      41 42-3
Amino acids, synthesis      21-2
Amino acids, transition probabilities      59-60
Anchoring enzymes      64
Andrade, M.A. and Valencia, A.      112 113 173
Ank root      237
Annotated genes      see also Functional vocabularies
Annotated genes, use of maximum entropy classifier      221-4
Annotated genes, uses      196-7
Annotation quality, GO      187
Annotation quality, relationship to NDPG sensitivity      179
Appearance of words, use in name recognition      228 232-3 234 241-3 242
Arabidopsis thaliana, GO annotated genes      13
Arbeitman, M.N., Furlong, E.E. et al.      98 193
Arginine      25
Arrays, gene expression profiling      26-7 63 pl
Arrays, noise sources      125
Article indices      227
Ashburner, M., Ball, C.A. et al.      7 90 148 152 196
Asparagine      25
Aspartic acid      25
Average linkage clustering      181 191
Bachrach, C.A. and Charen, T.      213 224
Backward algorithm      60
Bailly, V. et al.      149
Bait proteins      248
Ball, C.A., Awad, I.A. et al.      126
Ball, C.A., Dolinski, K. et al.      212
Base pairing      19 20
Base pairing in RNA      21
Baum-Welch algorithm      60-1
Bayes' theorem      30 255
Behr, M.A., Wilson, M.A. et al.      63
Ben-Hur, A., Elisseeff, A. et al.      67
Best article score (BAS)      160-2
Best article score (BAS), precision-recall plot      160
Beta sheets      25 26
Beta-GAL      248
Bias in study areas      5 6
Binary vectors, comparison metrics      87
Binding proteins      141
Binomial distribution      31 32 33
Bioinformatics      1 2
Biological function      26-7
Biological function codes      195
Biological function databases      7
Biological function querying      101-4
Biological process terms, Gene Ontology      12 198
Biological similarity, relationship to textual similarity      97-9
BioMed Central      3 9
Biomolecular Interaction Network Database (BIND)      263
Blake, J.A., Richardson, J.E. et al.      9 184 229
Blaschke, C., Andrade, M.A. et al.      7 260-1 265
BLAST (Basic Local Alignment Search Tool)      39 48 83 107
BLAST, comparison of breathless protein with other proteins      97 98
Boeckmann, B., Bairoch, A. et al.      3 109
Breathless      228
Breathless, abbreviations      237-38
Breathless, gene literature study      96-9
Breathless, SWISS-PROT record      109 pl
Breathless, synonyms      229 231
Breitkreutz, B.J., Stark, C. et al.      250
Brill, E.      234
Brown, P.O. and Botstein, D.      1
Caenorhabditis elegans, assembly of functional groups      185-9
Caenorhabditis elegans, GO annotated genes      13
Caenorhabditis elegans, literature index      185 186
Caenorhabditis elegans, sensitivity of NDPG      187
Calculation of mean      35
Candida albicans, GO annotated genes      13
Candidate gene identification      8
Carbohydrate metabolism genes      150
Catlett, M.G. and Forsburg, S.L.      151
CCAAT promoter      50
Cellular compartment terms, Gene Ontology      198
Central dogma of molecular biology      18 pl
Centred correlation metric      181
Chang, J.T., Raychaudhuri, S. et al.      8 107 117 118
Chang, J.T., Schutze, H. et al.      233 235 238-40
Chang, J.T., Schutze, H. et al., unified gene name finding algorithm      240-3
Chaperones      24
Chaussabel, D. and Sher, A.      95
Chee, M., Yang, R. et al.      63
Chen, J.J., Wu, R. et al.      63
Cherry, J.M., Adler, C. et al.      9 155 174 181 184 212 229
Chi-square testing, feature selection      210-12 211 216 218
Chips, sources of noise      125
Cho, R.J., Campbell, M.J. et al.      63
Chu, S. and Herskowitz, I.      78
Chu, S., DeRisi, J.L. et al.      78
Classification methods      66 74-9
Classification of documents, inconsistencies      218
ClustalW algorithm      48 49
Cluster boundary optimization      178-84 192-3
Cluster identification      192-3
Cluster software      86 181
Clustering algorithms      66-72 172
Clustering algorithms, k-means clustering      pl 2.8
Clustering, hierarchical      178-84
Clustering, NDPG scoring      173-8
Clustering, use in organizing sequence hits      114
Co-occurring gene names      249-50
Co-occurring gene names, assessment of efficacy      250-4
Co-occurring gene names, interaction verbs      260-1
Co-occurring gene names, number, prediction of likelihood of interaction      254-59
Coded messages, information theory      33-4
Codons      21-2
Codons, genetic code      23
Coherence of gene groups      147. See also Functional coherence of gene groups
Coin tossing, hidden Markov models      55-6
Coin tossing, probabilities      28 29
Collection frequency      85-6
Comments field (CC), SWISS-PROT      109
Comprehensive Yeast Genome Database (CYGD)      169
Concordance      see Overlap clusters
Conditional probability      28-9
Conditional probability, Bayes' theorem      30
Conditions, in expression analysis      65
Confidence scores of maximum entropy classifier      220-21
Consensus sequences      50
Conserved substitutions      41
Context, use in recognition of gene names      228 235-7 242
Continuous probability distribution functions      31 32 33
Core terms, in name finding algorithm      233 234
Correlation coefficient      67
Corruption studies, gene groups      166-7
Cosine metric      87
Cosine metric, comparison of breathless with other genes      96-7
Cosine metric, comparison of gene expression profiles      98
Cosine metric, neighborhood expression information scoring      130-1 203
Covariance matrices, linear discriminant analysis      77 pl
Covariance matrices, principal component analysis      73
Craven, M. and Kumlien, J.      7
Credibility, genomics literature      4
Cross-referencing, assessment of functional coherence of gene groups      152
Cysteine      25
Cytochrome P450 genes, appearance      232-3
Cytosine      18 19
Danio rerio, GO annotated genes      13
Data analysis      65-6
Data analysis, clustering algorithms      66-72
Data analysis, dimensional reduction      72-4
Data interpretation      66 68 74 77 pl pl
Data interpretation problems      1-2
Data, statistical parameters      34Ч5
Database building      5 7
Database of Interacting Proteins (DIP)      7 262
Databases      3-4 7 9-11
Databases, Biomolecular Interaction Network Database (BIND)      263
Databases, Comprehensive Yeast Genome Database (CYGD)      169
Databases, electronic text      9
Databases, GenBank database, growth      37
Databases, GENES database      201
Databases, PATHWAYS database      201
Databases, SCOP database      117-18
Databases, Stanford Microarray Database (SMD)      126
Dendrograms, hierarchical clustering      71 178
Deoxyribonucleic acid      see DNA
Deoxyribonucleotides      18 19
Deoxyribose      18 19
DeRisi, J.L., Iyer, V.R. et al.      66 78
Dice coefficient      87
Dictionary strategy, gene name identification      228-2 240 251
Dictyostelium discoideum, GO annotated genes      13
Dimensional reduction      66 67 72-4
Dimensional reduction, feature selection      88-90
Dimensional reduction, latent semantic indexing      92-4
Dimensional reduction, weighting words      90-1
Dirichlet priors      159
Discrete probability distribution functions      31 32 33
Discriminant line, linear discriminant analysis      76
Distance metrics, clustering algorithms      67
Distribution functions      see Probability distribution functions (pdfs)
Distributions of words, WDD      157-60
Divergence value, WDD      15
Diversity, genomics literature      5 141 150 195
DNA (deoxyribonucleic acid)      18-20
DNA (deoxyribonucleic acid), binding by proteins      25 26
DNA (deoxyribonucleic acid), Sanger dideoxy sequencing method      39 pl
DNA (deoxyribonucleic acid), transcription      21 22 245 247
DNA polymerase      18
DNA polymerase, use in Sanger dideoxy sequencing method      39
DNA-dependent ATPase genes, yeast      148-50 149
Document classification      see Text classification
Document frequency      85 88 89 91
Document gene indices      95. See also Databases
Document similarity assessment      83-4
Document similarity assessment, comparison metrics      86-7
Document similarity assessment, word values      88
Document vectors      84-6 85
Document vectors, latent semantic indexing      92-3
Document vectors, vocabulary building      88-90
Document vectors, weighting words      90-1
Donaldson, I., Martin, J. et al.      7 263
Dossenbach, C., Roch, S. et al.      96
Dot plots      41-2
Drosophila melanogaster, assembly of functional groups      185-9
Drosophila melanogaster, breathless gene literature search      96-9
Drosophila melanogaster, breathless gene literature search, BLAST hits      pl 5.1
Drosophila melanogaster, breathless gene literature search, BLAST hits, keywords      112 113
Drosophila melanogaster, gene name detection      232
Drosophila melanogaster, genome size      18
Drosophila melanogaster, GO annotated genes      13
Drosophila melanogaster, keyword queries      101-4 103 104
Drosophila melanogaster, latent semantic indexing      94
Drosophila melanogaster, literature      183
Drosophila melanogaster, literature index      185 186
Drosophila melanogaster, literature, document frequencies of words      88 89
Drosophila melanogaster, sensitivity of NDPG      187
Durbin, R., Eddy, S. et al.      40
Dwight, S.S., Harris, M.A. et al.      187
Dynamic programming      44-7 83
Dynamic programming score matrix      45
Dynamic programming, forward algorithm      59
Dynamic programming, multiple alignment      49
Dynamic programming, tracing back      47
Dynamic programming, use in gene name recognition      238-40
Dynamic programming, Viterbi algorithm      57-9 58
Edman degradation of proteins      39-40
Eisen, M.B., Spellman, P.T. et al.      67 70 78 86 168-9 172 174 180
Electronic publishers      2-3
Electronic text resources      9
Emission probabilities, amino acids      59
Empirical distribution, article scores      164
Enhancers      23-4
Entrez Gene      11
Entropy models      206. See also Maximum entropy modeling
Entropy of a distribution      34
Enzyme Commission (EC) classification scheme      200 201
Enzymes      24
Epstein Barr virus, genome size      18
Error sources, gene expression analysis      125
Escherichia coli, genome size      18
Eskin, E. and Agichtein, E.      107 120 121
Euclidean metric      67 87
Events, conditional probability      28-9
Events, independence      29-30
Events, probability      27-8
Evidence codes      188 189 198 199-200
Exons      21 22
Exponential distributions      32
Exponential distributions, expression value of words      142-3 pl
Exponential distributions, maximum entropy probability distribution      208
Extend step, gene name recognition algorithm      243
Faculty of 1000      4
False negatives      36
False positives      36
False positives in single gene expression series      124
False positives in single gene expression series, recognition      135 137-8
Fbgn0023184      192
Fbgn0029196      192
Fbgn0034603 (glycogenin)      192
Feature selection      88-90
Feature selection, text classification algorithms      210-12
Feature terms in name finding algorithm      233 234
Features, in expression analysis      65
Features, in maximum entropy classification      206
Feng, Z.P.      120
Fields, S. and Song, O.      141
Filtering, gene name detection      232 241
Fly functional clusters      193 pl
Fly gene expression data set, hierarchical pruning      189-92
FlyBase      9 11 88 95 109 184 190
FlyBase, lists of synonyms      229 230
FlyBase, standardized names      228
Fractional reference (fr) parameter, WDD      158
Fractional references for documents, best article score system      160-1
Frequency of words      see Document frequency
