Epigenomics

So I’ve been admonished in the past for posting ponderings and opinions on my blog–I guess the problem is that my comments are not a priori peer-reviewed, and it seems a lot like I’m just pontificating to an audience on my personal peeves. I think, however, writing to the blog helps me organize my internal thoughts, and I enjoy the a posteriori commentary on my posts, which can be more embarrassing and candid than any private peer-review. Well, either way, if you don’t like reading about my opinions or don’t want to be influenced by them, skip this post and the one after it.

I was reading Nature again (a lot of my pondering posts seem to start there!) and being a hacker as well as an armchair quarterback in molecular biology and genomics, I’m gently amused by the surprise that the genetics community is registering about the results from the Human Genome Project. Simply put, there was a prevailing notion that once we had the entire genetic sequence written out, we would crack the code on all sorts of diseases and be able to trace out the function of a cell–and perhaps the human body–from the ground truth of the genetic code.

However, for the past year I have read numerous articles that contain a phrase similar to this: researchers were surprised to find that having the source code told them nothing about how the network was configured. Or better yet, having the source code wasn’t useful because the code is self-modifying. In other words, the Human Genome Project is like having the source code to your OS, but humans are complex networks of cellular machines; many diseases and problems arise from a failure of the network or a failure of the configuration of the OS, which is not apparent from the source code alone.

I guess, to some extent, it’s not surprising that biologists are peeling the onion instead of cutting through it. I remember back in college, I took a couple of molecular biology courses. It was interesting to see the approach of the typical pre-med/biology student toward biology: lots of rote memorization, with no attention at all to system design. It’s like trying to study computer architecture by memorizing the configuration of all the transistors in a standard cell library, without understanding why you’d use one element over another.

My personal experience is that there is a significant amount of architecture in biology. When people found out I had none of the organic chemistry or genetics prerequisites for the molecular biology class, they looked at me like I was crazy. However, I survived the class with relatively little studying, the difference being that I looked at molecular biology from a system standpoint. I tried to look for high-level patterns, and totally skipped memorizing the basic patterns–because for the tests, we were allowed to bring in an 8.5×11 sheet with notes. I wrote the basic organic chemistry operations on there, as well as the basic formulae and chemical reaction sequences I would need, so I didn’t have to memorize them. The class also focused a lot of its attention on the design of an experiment–how do you analyze a complex system and determine its features given a limited set of techniques? I remember we had a number of difficult questions about using radioactive carbon labeling to try and determine the metabolic path of a molecule. The techniques you use to design these experiments are very similar to those you use when reverse engineering a hardware system.

Epigenomics is a field that I think is very interesting and exciting, and is closing in on the idea of a “biological architect”. Epigenomics is the study of the tertiary and quaternary genetic code, to borrow terms from protein folding (okay, for you real biologists out there, I am really pushing it). It turns out that DNA is indeed self-modifying and carries information beyond the genetic code. For example, methyl (CH3) groups get attached to the bases of your DNA (typically cytosines), which modifies the rate of protein expression from that segment of DNA. Also, DNA has a very complex 3-D structure. Those Hollywood views we have of DNA being this beautiful, perfect double-helix are eminently misleading. DNA is twisted upon itself, tied in knots, and bound up by histones (protein complexes that act like DNA katamari). Given that chemical machinery is essentially a mechanical computer, the 3-D morphology of a molecule is as much part of the programming as is its composition. So Epigenomics in my view should be the study of all the factors that aren’t coded in the genome–sort of like a study of all the different configurations of an OS and how they affect the race conditions, callbacks and stability of that OS. Stepping beyond that, we have the network context and ultimately the user behavior. A human cell is many orders of magnitude more complex than the internet, and a single cell is a far cry from a human being. We are a long way off from understanding the human genome and what it really means in the context of the human network, which means there will be a lot of interesting and exciting work for years to come.

And so I ponder on this beautiful, mellow Saturday afternoon in San Diego as I procrastinate on my long list of things to do…

4 Responses to “Epigenomics”

  1. bearcat says:

    I keep seeing people call the genome ‘source code’. It’s not the source code; it’s a binary of part of the system.

    I seem to remember a paper similar to http://citeseer.ist.psu.edu/thompson96silicon.html that mentioned that removing some apparently unconnected gates broke the system (perhaps it’s the same paper; I didn’t see that note just now when I skimmed it). Evolved systems take advantage of any sort of marginal hackery, so they’re not guaranteed to be simple to reverse-engineer.

  2. bunnie says:

    I see your point about the genome vs. source code. However, I could see an argument for why a genome could be called source code, or at least, assembly language. DNA itself isn’t very useful; it needs to be transcribed into mRNA, which is then translated into proteins, which are then folded into their final active form. Thus, one could say that the translated proteins are the binary and the genome is the source code, or at least the assembly-level instructions. The fact that you can reverse-engineer a protein into its constituent amino acids and then translate those back into a DNA sequence, I think, reinforces the idea of DNA being source code. Then again, DNA is heavily self-modifying code that gets wrapped up in protein nodules, so the whole boundary of source versus binary gets fuzzy.
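
    To make the source-versus-binary analogy a bit more concrete, here is a minimal Python sketch of the transcribe-then-translate pipeline. Everything in it is an illustrative assumption: the function names are made up, the codon table is deliberately partial (the real one has 64 entries), and it ignores splicing, folding, and everything epigenetic.

```python
# Partial codon table: an assumption for illustration only.
# It covers Met, Phe, Gly and the three stop codons ("*").
CODON_TABLE = {
    "AUG": "M",
    "UUU": "F", "UUC": "F",
    "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def transcribe(coding_dna):
    """DNA coding strand -> mRNA: same sequence with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """mRNA -> amino acid string, reading codons until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def count_encodings(protein):
    """How many distinct codon sequences in the table encode this protein."""
    count = 1
    for amino_acid in protein:
        count *= sum(1 for aa in CODON_TABLE.values() if aa == amino_acid)
    return count


mrna = transcribe("ATGTTTGGCTAA")
protein = translate(mrna)
print(mrna, protein, count_encodings(protein))
```

    Note what the last function shows: because several codons map to the same amino acid, going backwards from protein to DNA yields a set of candidate sequences rather than the original one. That is a lot like decompiling a binary, where you recover something functionally equivalent but not the source as written.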

    RNA is even more interesting, because as a molecule it is much more flexible and can form structures that act on itself. I think there was an argument recently that came about which hypothesizes that life started with single-stranded RNA that folded on itself to form active structures, and that DNA was an innovation of the viruses…because early life had no need for an archival format like DNA, but viruses did need a more stable and informationally redundant double-helix form, even if it meant paying the price for extra proteins that split and transcribe the DNA into RNA.

    You’re absolutely right that evolved systems are compilations of marginal hacks on marginal hacks, but somehow over time some robust themes do seem to arise. It’s those marginal hacks that are getting us, though…there was a disastrous clinical trial of TGN1412 in Europe where the monoclonal antibody was administered to hack the immune system and bypass certain checks to putatively cure inflammatory diseases like arthritis. However, despite being safe in other primates, the TGN1412 triggered something like a cytokine storm in humans and the six participants’ bodies shut themselves down (I think they didn’t die but they were in critical condition). System crashes in humans bring a very somber note to the term blue screen of death. One theory is that a small modification to the CD28 receptor in humans vs. other primates subtly changed the binding of the antibody. This modification was in an area far away from the active site, so it wasn’t studied extensively, but like you said, small hacks on small hacks lead to non-robust behavior…so any measure designed to work around the checks and balances of a human system can’t be assumed to just “work” like it might in an engineered (as opposed to an evolved) system. Or perhaps, we are just not sophisticated enough in our understanding of the system. Perhaps any hack we do should also include a self-regulatory system, so if something were to go awry, it would become benign. Hm, but that smacks of hubris to think that we could ever engineer something so robust. The human system itself is so diverse and unpredictable, how could we be sure that one person’s safety mechanism isn’t another person’s poison?

  3. stevelu says:

    I really enjoyed this post. And the comments above are well taken too. Marginal hackery indeed.

    I too have been surprised by anyone who thought the ‘raw’ genome would yield immediate insights. An old teacher of mine, a feminist scholar of science named Evelyn Fox Keller, called the basics of the DNA theory the ‘central dogma’ of biology, a ‘master molecule theory’ that gained strength as much from its resonance with the hierarchical organization that scientists were familiar with as it did from observation. Her point was not that it was false, but that it was treated with too much reverence and oversimplified, glossing over a lot of the complexities such as you and bearcat mention.

    She called attention, among other things, to the cytoplasmic material that is passed along to offspring (from the egg cell mostly) along with the genes, and how little attention had been paid to what role it played in the development of the subsequent organism.

    Your thoughts on the critical role of structure remind me also of the folks who are pursuing complexity theory, with their emphasis on how architecture, structure, and pure mathematics act as context, undergirding how living things play out their DNA instructions. For example, lots of things don’t have to be coded explicitly if the processes that *are* coded can take advantage of some pattern or tendency that is already there (much like sand forming ripples when swept by a current “just happens”): strange attractors and the like.

    [BTW, thanks for your workshop at the Maker Faire. I stopped by and since then have acquired 3 breadboards to play with…]

    Cheers.

  4. Bee Alive says:

    I’ve recently started a blog, the information you provide on this site has helped me tremendously. Thank you for all of your time & work.