DNA computing is the performing of computations using biological molecules rather than traditional silicon chips. The idea that individual molecules (or even atoms) could be used for computation dates to 1959, when American physicist Richard Feynman presented his ideas on nanotechnology. However, DNA computing was not physically realized until 1994, when American computer scientist Leonard Adleman showed how molecules could be used to solve a computational problem.

Solving problems with DNA molecules

A computation may be thought of as the execution of an algorithm, which itself may be defined as a step-by-step list of well-defined instructions that takes some input, processes it, and produces a result. In DNA computing, information is represented using the four-character genetic alphabet (A [adenine], G [guanine], C [cytosine], and T [thymine]), rather than the binary alphabet (1 and 0) used by traditional computers. This is achievable because short DNA molecules of any arbitrary sequence may be synthesized to order. An algorithm’s input is therefore represented (in the simplest case) by DNA molecules with specific sequences, the instructions are carried out by laboratory operations on the molecules (such as sorting them according to length or chopping strands containing a certain subsequence), and the result is defined as some property of the final set of molecules (such as the presence or absence of a specific sequence).
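The mapping from input to strands, and from laboratory operations to instructions, can be loosely sketched in software. The example below is only an illustration under stated assumptions: the two-bits-per-base mapping and the function names (encode, decode, keep_length, discard_containing, detect) are hypothetical, and a plain list of strings stands in for the pot of molecules.

```python
# Hypothetical mapping of bit pairs to bases; any one-to-one assignment would do.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(bits: str) -> str:
    """Represent a binary string as a DNA sequence, two bits per base."""
    if len(bits) % 2:
        raise ValueError("expected an even number of bits")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> str:
    """Recover the binary string from a DNA sequence."""
    return "".join(BASE_TO_BITS[base] for base in strand)


def keep_length(pot: list[str], length: int) -> list[str]:
    """Stand-in for sorting strands by length and keeping one size."""
    return [strand for strand in pot if len(strand) == length]


def discard_containing(pot: list[str], subsequence: str) -> list[str]:
    """Stand-in for removing strands that contain a given subsequence."""
    return [strand for strand in pot if subsequence not in strand]


def detect(pot: list[str], sequence: str) -> bool:
    """The result: is a specific sequence present in the final pot?"""
    return any(sequence in strand for strand in pot)


pot = [encode("0110"), encode("1100"), encode("011011")]
pot = keep_length(pot, 2)            # drops the three-base strand
pot = discard_containing(pot, "TA")  # drops the strand encoding 1100
print(pot, detect(pot, "CG"))
```

Each function returns a new list rather than changing the pot in place, so a sequence of operations reads like the sequence of laboratory steps it imitates.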

Adleman’s experiment involved finding a route through a network of “towns” (labeled “1” to “7”) connected by one-way “roads.” The problem specifies that the route must start and end at specific towns and visit each town only once. (This is known to mathematicians as the Hamiltonian path problem, a cousin of the better-known traveling salesman problem.) Adleman took advantage of the Watson-Crick complementarity property of DNA: A and T stick together in pairwise fashion, as do G and C (so the sequence AGCT would stick perfectly to TCGA, with the two strands running in opposite directions). He designed short strands of DNA to represent towns and roads such that the road strands stuck the town strands together, forming sequences of towns that represented routes (such as the actual solution, which happened to be “1234567”). Most such sequences represented incorrect answers to the problem (“12324” visits a town more than once, and “1234” fails to visit every town), but Adleman used enough DNA to be reasonably sure that the correct answer would be represented in his initial pot of strands. The problem was then to extract this unique solution. He achieved this by first greatly amplifying (using a method known as polymerase chain reaction [PCR]) only those sequences that started and ended at the right towns. He then sorted the set of strands by length (using a technique called gel electrophoresis) to ensure that he retained only strands of the correct length. Finally, he repeatedly used a molecular “fishing rod” (affinity purification) to ensure that each town in turn was represented in the candidate sequences. The strands Adleman was left with were then sequenced to reveal the solution to the problem.
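The generate-and-filter logic of the experiment can be imitated in software. The sketch below is a simplification under stated assumptions: the road network is hypothetical (it is not Adleman's actual graph), routes are modeled as strings of single-character town labels rather than as DNA sequences, and the initial pot is produced by exhaustive enumeration rather than by random self-assembly.

```python
# Hypothetical one-way road network, chosen so that "1234567" is the only
# route from town 1 to town 7 that visits every town exactly once.
ROADS = {
    "1": ["2", "4"],
    "2": ["3", "4"],
    "3": ["2", "4"],
    "4": ["5"],
    "5": ["6", "7"],
    "6": ["7"],
    "7": [],
}


def all_routes(roads: dict[str, list[str]], max_towns: int) -> list[str]:
    """Stand-in for the initial pot: every walk of at most max_towns towns."""
    routes = list(roads)    # walks consisting of a single town
    frontier = list(roads)
    for _ in range(max_towns - 1):
        # Extend every walk in the frontier along each outgoing road.
        frontier = [r + nxt for r in frontier for nxt in roads[r[-1]]]
        routes.extend(frontier)
    return routes


def solve(roads: dict[str, list[str]], start: str, end: str) -> list[str]:
    towns = list(roads)
    pot = all_routes(roads, len(towns))
    # Step 1 (PCR in the laboratory): keep routes with the right endpoints.
    pot = [r for r in pot if r[0] == start and r[-1] == end]
    # Step 2 (gel electrophoresis): keep routes of the correct length.
    pot = [r for r in pot if len(r) == len(towns)]
    # Step 3 (affinity purification): require each town in turn.
    for town in towns:
        pot = [r for r in pot if town in r]
    return pot


print(solve(ROADS, "1", "7"))  # prints ['1234567']
```

As in the laboratory procedure, incorrect routes such as “12324” and “1234” are present in the initial pot and are removed only by the later filtering steps.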

Although Adleman sought only to establish the feasibility of computing with molecules, soon after its publication his experiment was presented by some as the start of a competition between DNA-based computers and their silicon counterparts. Some people believed that molecular computers could one day solve problems that would cause existing machines to struggle, due to the inherent massive parallelism of biology. Because a small drop of water can contain trillions of DNA strands and because biological operations act on all of them effectively in parallel (as opposed to one at a time), it was argued that one day DNA computers could represent (and solve) difficult problems that were beyond the scope of “normal” computers.

However, in most difficult problems the number of possible solutions grows exponentially with the size of the problem (for example, the number of solutions might double for every town added). This means that even relatively small problems would require unmanageable volumes of DNA (on the order of large bathtubs) in order to represent all possible answers. Adleman’s experiment was significant because it performed small-scale computations with biological molecules. More importantly, however, it opened up the possibility of directly programmed biochemical reactions.
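The arithmetic behind this objection can be made concrete with a short calculation. The sketch below rests on loudly stated assumptions: the number of candidate solutions is taken to double with every town added, one strand is needed per candidate, and a drop is taken to hold one trillion strands (a round reading of “trillions”). None of these figures is a measured value.

```python
# Assumption: "trillions" of strands per drop is read as exactly one trillion.
STRANDS_PER_DROP = 10**12


def candidates(towns: int) -> int:
    """Candidate solutions, assuming the count doubles per town added."""
    return 2 ** towns


def drops_needed(towns: int) -> int:
    """Drops required to hold one strand per candidate (ceiling division)."""
    return -(-candidates(towns) // STRANDS_PER_DROP)


for towns in (20, 40, 60, 80):
    print(towns, candidates(towns), drops_needed(towns))
```

Under these assumptions a 20-town problem fits in a single drop, a 40-town problem needs two, and a 60-town problem already needs more than a million, which is the sense in which the required volume quickly becomes unmanageable.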

Biochemistry-based information technology

Programmable information chemistry will allow the building of new types of biochemical systems that can sense their own surroundings, act on decisions, and perhaps even communicate with other similar forms. Although chemical reactions occur at the nanoscale, so-called biochemistry-based information technology (bio/chem IT) is distinct from nanotechnology, due to the reliance of the former on relatively large-scale molecular systems.

Although contemporary bio/chem IT uses many different types of (bio)chemical systems, early work on programmable molecular systems was largely based on DNA. American biochemist Nadrian Seeman was an early pioneer of DNA-based nanotechnology, which originally used this particular molecule purely as a nanoscale “scaffold” for the manipulation and control of other molecules. American computer scientist Erik Winfree worked with Seeman to show how two-dimensional “sheets” of DNA-based “tiles” (effectively rectangles made up of interwoven DNA strands) could self-assemble into larger structures. Winfree, together with his student Paul Rothemund, then showed how these tiles could be designed such that the process of self-assembly could implement a specific computation. Rothemund later extended this work with his study of “DNA origami,” in which a single strand of DNA is folded multiple times into a two-dimensional shape, aided by shorter strands that act as “staples.”

Other experiments have shown that basic computations may be executed using a number of different building blocks (for example, simple molecular “machines” that use a combination of DNA and protein-based enzymes). By harnessing the power of molecules, new forms of information-processing technology are possible that are evolvable, self-replicating, self-repairing, and responsive. The possible applications of this emerging technology will have an impact on many areas, including intelligent medical diagnostics and drug delivery, tissue engineering, energy, and the environment.