A Computer Scientist’s Guide to Cell Biology

Author: William W. Cohen

Publisher: Springer

Genres: Biology

Publish Date: June 25, 2007

ISBN-10: 038748275X

Pages: 114

File Type: PDF

Language: English

Book Preface

For the past few months, I have been spending most of my time learning about biology. This is a major departure for me, as for the previous 25 years, I’ve spent most of my time learning about programming, computer science, text processing, artificial intelligence, and machine learning. Surprisingly, many of my long-time colleagues are doing something similar (albeit usually less intensively than I am). This document is written mainly for them—the many folks that are coming into biology from the perspective of computer science, especially from the areas of information retrieval and/or machine learning—and secondarily for me, so that I can organize and retain more of what I’ve learned.

I find it helpful to think of “biology” in three parts. One part of biology is information about biological systems (for instance, how yeast cells metabolize sugar). This is the focus of most introductory biology textbooks and overviews, and is the essence of what biologists actually study: what they are trying to determine from their experiments. However, it is not always what biologists spend most of their time talking about. If you pick up a typical biology paper, the conclusions are usually quite compact: often all the new information about biological systems in a paper appears in the title, and almost always it can be squeezed into the abstract. The bulk of the paper is about experimental methods and how they were used; this I consider to be the second part of “biology.” The third part of “biology” is the language and nomenclature used, which is rich, detailed, and highly impenetrable to mere laymen. To read and understand current literature in biology, it is necessary to have some background in each of these three parts: core biology, experimental procedures, and the vocabulary.

I like to think of the last few months as something like a field trip to a new and exotic land. The inhabitants speak a strange and often incomprehensible language (the nomenclature of biology) and have equally strange and new customs and practices (the experimental methods used to explore biology). To further confuse things, the land is filled with many tribes, each with its own dialect, leaders, and scientific meetings. But all the tribes share a single religion, with a single dogma, and all their customs, terms, and rituals are organized around this religion. The highest goal of their religion is to discover truth about living things: as much truth as possible, in as much detail as possible. This truth is “core” biology, that is, information about living things. Knowing this “truth” is important, of course, but merely knowing the “truth” is not enough to understand a community of biologists, just as reading the Torah is not enough to understand a community of Jews.

In this document, I will provide a short introduction to “core” cell biology, mainly to introduce the most common terms and ideas. In doing so, I will occasionally oversimplify. This is deliberate. Computer scientists are used to analyzing complex systems by analyzing successively more complex abstractions, many of which are “real” (to the extent that anything computational is “real”): for instance, a push-down automaton is a generalization of a finite state machine, and both are useful for many real-world problems. One would like to operate in the same way in understanding biology, for instance, by first analyzing “finite-state” organisms, and then progressing to more complex ones. In biology, however, it is hardly ever the case that a clean and comprehensible abstract model perfectly models a real-life organism, so (almost) every simple general statement about how organisms function needs to be qualified, a tedious process in a document of this sort. I will also, by necessity, omit many interesting details, again deliberately. For a more comprehensive background on biology, there are many excellent textbooks, written by people far more qualified, some of which are mentioned in the final section of this paper.

After discussing “core” cell biology, I will then move on to discuss the most widely used experimental procedures in biology. I will focus on what I perceive to be the high-level principles behind experimental procedures and mechanisms, and relate them to concepts well-understood in computer science whenever possible. Comments on nomenclature and background points will be made in side boxes.