This book will review work from a number of researchers who have produced open source software addressing the need for data management, integration, analysis, and visualization in cancer research. With the advent of high-throughput technologies in biomedicine, the need for data management and appropriate data analysis tools in genomics has increased dramatically, and genomic data has joined clinical trials data as a major driver of informatics at cancer research centers.
Gathering this data requires careful encoding of metadata, usually through controlled vocabularies or ontologies, as well as linking of data from model organisms at both a physiological level (e.g., anatomy) and a molecular level (e.g., orthology). The data is then used within computational and statistical models, which require data pipelines and analysis systems, as well as algorithms, visualization methods, and computational modeling systems. We will introduce open source tools available for each of these aspects of the problem.
The editors plan to divide the book into five sections, beginning with a section of high-level overviews of the field and its key issues. This section will include an introductory review of informatics in cancer research, followed by five overviews addressing authentication and authorization, data management, data pipelines and annotations, algorithms and models, and the NCI caBIG initiative. The remaining four sections will be dedicated to data systems, data pipelines, algorithms for analysis and visualization, and modeling systems. Each of these areas has seen publication of open source tools, ranging from the widely known R/Bioconductor project to little-known but powerful systems such as SImmune for biochemical modeling. The area of laboratory information management systems (LIMS) has seen development of a number of unpublished but powerful systems, which we will also include. Three groups have agreed to provide chapters in this area (USC/Norris CAFE extensible clinical trials system, St Jude Unified LIMS, Fox Chase/British Columbia flow cytometry LIMS).
While a great deal of development has gone into informatics tools that can be applied to problems in cancer research, details on these tools have not been adequately disseminated to the community. As a result, adoption remains low for all but a few tools. This book aims to increase overall adoption by providing cancer center leaders and researchers with a single volume detailing both the issues that must be addressed and the tools that are ready for use.