How To Code A Life
Synthetic biology — the science fiction-like branch of genetic engineering — hopes to automate programs used to engineer organisms that could produce better drugs and cleaner fuels. But can open-source science really succeed?
Synthetic biologists write code. But when their code is compiled, it doesn't become an app. It becomes, or at least changes, life.
"It's quite literally the same thing [as lines of code], once we get to the point where it's all electronic," J. Christopher Anderson, a synthetic biologist at the University of California at Berkeley, tells me. "It's a code that is A-T-C-Gs instead of 0s and 1s."
Synthetic biology, the newer, cooler branch of genetic engineering, has gained a lot of attention in recent years for its innovative take on biology and for its similarities to the hugely successful software industry: programs automate the design and assembly of the DNA used to write new genetic code. But in roughly a decade of existence, the field hasn't achieved much of what it promises. Engineered microbes that produce sustainable fuels or turn carbon dioxide into plastic, bacteria that make blood or antimalarial drugs, and organisms designed to attack cancer cells are just a handful of the potential applications of this biological software.
But synthetic biology still struggles in one key area where the software industry excels: open access to information. Synthetic biology could easily be buried beneath patents protecting proprietary information, much like the pharmaceutical and biotech industries today. And while computer science and synthetic biology aren't identical (there will likely be a lot less on the consumer-facing end from engineered DNA), a more open-source model within synthetic biology could expedite the experimentation process, allowing researchers to focus on the engineering aspects and not time-consuming DNA synthesis — ultimately bringing some of these ungodly sounding new life-forms out from labs and into the commercial world.
In the last few years, most of the hype surrounding synthetic biology has been about the counterculture of "biopunks" and "DIY bio-ers" who are shaking up the routine, methodical arena that is science: people tinkering with yogurt cells in homemade labs. It was cool to talk about them as nerdier Mark Zuckerbergs, the "generally young and in college, who work not in gleaming, glistening, bleeding-edge university or corporate laboratories, but in attics, basements, garages," as a UCLA Magazine feature from two years ago reads. They're the kind of people who were just "hacking up DNA," said Wired. Yet the real promise of synthetic biology is not in labs (garage, university, or otherwise) but in open-source software programs used to engineer life.
Still, it's been almost a decade since a bunch of engineering dudes at MIT joined forces with computer science guru Tom Knight, now known as The Godfather of Synthetic Biology, and decided that instead of simply moving genes from one organism to another — the more traditional field of genetic engineering — they'd mix genes or make DNA sequences from scratch, writing brand-new genetic code. They'd make things that could never be produced naturally.
Fero thinks advances in synthetic biology will depend on the availability of an advanced toolkit, but a lot of that hinges on maintaining open standards and accessible algorithms. For now synthetic biology remains, like most scientific research, locked in labs and within tight-knit academic circles. But if synthetic biology could demonstrate that a more open-source model, one in which proprietary work is shared with the public, is possible in certain fields of science, it could change the way patent-obsessed, government-funded research has always been done.
Open-source science is not a new idea, and there have been small pockets of success in drug research (mostly for drugs that don't make any money), but some argue that none of these are truly open-source models. "In the computer science business, open source actually results in new code and effort," Stephen Maurer, a public policy professor at the University of California at Berkeley, told me. "If you are not generating new value, or creating incentives to get people to donate money or labor, then [open source] is a bumper sticker."
Since the guys at MIT started trying to engineer cells almost a decade ago, they quickly realized that their experiments were limited by the available DNA sequences. Each time they wanted to tinker with a different gene, they had to rebuild the piece of DNA they needed, prompting the idea of a standard library of DNA "parts," which Knight called BioBricks — where researchers could share information about a piece of DNA, a specific gene, and its observed function. From this evolved The Registry of Standard Biological Parts, a collection of thousands of genetic "parts" and "tools" that anyone could use to engineer new genetic machines.
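A shared parts library of this kind is, at bottom, a lookup table: each entry pairs a piece of DNA with a note on what it has been observed to do. The Python sketch below is only an illustration of that idea, not the registry's actual software or data format. The class names, part names, and sequences are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """One entry in a hypothetical shared parts library."""
    name: str      # made-up identifier, not a real registry ID
    sequence: str  # placeholder DNA letters, not a real gene
    function: str  # plain-language note on observed behavior

    def __post_init__(self):
        # Reject anything not written in the four DNA letters.
        if not self.sequence or set(self.sequence) - set("ATCG"):
            raise ValueError(f"{self.name}: sequence must use only A, T, C, G")


class PartsLibrary:
    """A minimal in-memory catalog that anyone could add to or search."""

    def __init__(self):
        self._parts = {}

    def add(self, part):
        # Refuse duplicates so one name always means one piece of DNA.
        if part.name in self._parts:
            raise KeyError(f"part already registered: {part.name}")
        self._parts[part.name] = part

    def get(self, name):
        return self._parts[name]

    def search(self, keyword):
        # Case-insensitive match against the recorded function notes.
        keyword = keyword.lower()
        return [p for p in self._parts.values() if keyword in p.function.lower()]


library = PartsLibrary()
library.add(Part("demo_reporter", "ATGGCTAGC", "glows green when expressed"))
library.add(Part("demo_sensor", "ATGCCGTTA", "switches on when arsenic is present"))

for part in library.search("green"):
    print(part.name, part.sequence)
```

The point of the design is that a researcher searches by what a part does and reuses the recorded sequence, rather than rebuilding the DNA for every experiment.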
But for the most part, the registry is only used by undergraduate students who compete in MIT's annual International Genetically Engineered Machine (iGEM) competition, from which several cool projects have been born — like biosensors that glow green when arsenic is detected. Synthetic biology still hasn't seen much commercial application. The registry wiki says, "It's *always* a work in progress!" But it doesn't look like it's been updated in a decade.
"It's not as accessible of a space like Github," says Anderson, referring to the free, open-source code site created by Tom Preston-Werner. The complexity and cost of DNA is not analogous to using free code on Github — a single gene can cost as much as $400, so even if the information about its function is available for free, experimenting with a gene is not. Biology is also hard! Those using the registry are typically trained undergraduates with a mentor in synthetic biology, not like the amateur developers or hackers who are drawn to open-source code.
"There is considerable skepticism within the established biotech community that amateurs could carry out substantive beneficial and sustainable biohacks, mainly because biology is messy and complex, but there are others who would counter that complexity hasn't stopped the hacker community in the past," Andrew D. Maynard, the chair of the department of environmental health sciences at the University of Michigan School of Public Health, told me in an e-mail.
Solaris, the operating system developed by Sun Microsystems in the early '90s, was released in an open-source version, OpenSolaris, in 2005, inviting developers and programmers to improve the existing system. "It was a much more capable version than anything that academics built," said Maurer, adding that this is the approach needed in synthetic biology, where industry works with researchers to make the best synthetic biology "tools" available.
Right now, the open-source model in synthetic biology looks more like a vertical divide, with the academics doing the DNA synthesizing and analysis on one side and huge biotech companies on the other, with little overlap between. Drawing parallels between the software industry and synthetic biology, Maurer argues that what needs to happen is to make more of the good "parts" of DNA — the ones currently locked up in huge biotech corporations — available to academia and to the registry, so valuable information is accessible to more people. "One way to break that is share data about what does and doesn't work across the industry, and make high-quality parts available to scientists and academia too," says Maurer. Of course not every piece of information would be open — that's not what open source means in the software industry either — but Maurer says right now there's no incentive for people to go out and make new parts, to add valuable information to the registry that might result in new genetic code.
New companies, like the Stanford-born TeselaGen and MIT's Ginkgo Bioworks, are trying to bridge this gap by providing software that automates the DNA assembly process, making it easier for researchers to focus on the creative, more experimental aspects of engineering organisms. TeselaGen adopted a drag-and-drop interface where users can choose the particular DNA sequence combinations they need for an experiment. The design is then sent to a server that calculates the best way to produce the physical DNA to go inside a cell. "Few in academia are doing this back-end work; it's too much work," says Fero. "But the academic community could benefit from certain tools we're building on the interface side: how to pull their information together and build new biomolecules." In turn, TeselaGen (like Sun with OpenSolaris) would benefit from opening up its design code for others to help improve. "It's good to have that algorithm be published and out there in the open, so that anyone can implement it."
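To see why automating this step matters, consider how quickly "sequence combinations" multiply. The Python sketch below is a toy illustration, not TeselaGen's algorithm: it simply lists every design that takes one part from each slot. The slot layout, part names, and sequences are placeholders invented for the example, and the real problem of working out the best way to build the physical DNA is not attempted here.

```python
from itertools import product


def enumerate_designs(slots):
    """Yield (part names, joined sequence) for every one-part-per-slot design."""
    for combo in product(*slots):
        names = tuple(name for name, _ in combo)
        sequence = "".join(seq for _, seq in combo)
        yield names, sequence


# Hypothetical slots: each holds interchangeable (name, placeholder sequence) parts.
slots = [
    [("promoter_a", "AAAC"), ("promoter_b", "AAAG"), ("promoter_c", "AAAT")],
    [("gene_x", "CCCA"), ("gene_y", "CCCG"), ("gene_z", "CCCT")],
    [("terminator_a", "GGGA"), ("terminator_b", "GGGC")],
]

designs = list(enumerate_designs(slots))
distinct_parts = sum(len(slot) for slot in slots)

print(len(designs), "designs from", distinct_parts, "distinct parts")
```

With these made-up inputs, eight parts yield 18 candidate designs, which is the kind of bookkeeping a server can do faster and more reliably than a researcher working by hand.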
Not everyone is excited by the prospect of software that will allow us to quickly and easily engineer DNA. As with anything that involves tampering with nature, people worry about what synthetic organisms could mean for public health, environmental contamination, or even bioterror, maybe even against the president. These issues will have to be addressed as they come, but avoiding an open-source model for fear of bioterror isn't the right way to approach this. Not everything is openly available in all of computing, and the same model should apply to science: it won't take over the full ecosystem, but in the growing field of synthetic biology there could be real benefits from a more open-source approach.
In genetic engineering, researchers learn everything about how a particular snippet of DNA works. In synthetic biology, they need to be able to use that same snippet over and over with dozens of other parts to quickly learn everything that it can and cannot do. If the hope is to engineer organisms compiled from only the best parts (jellyfish genes that glow green inside arsenic-detecting bacteria, organisms that can turn electricity and carbon dioxide into fuel), then the information and technology have to move beyond expensive labs to anyone who's ever wanted to find a cytomegalovirus vaccine.