One ring to rule them all: The cohesin complex

By Johannes Buheitel, PhD

In my blog post about mitosis, I explained some of the challenges a human cell faces when it tries to disentangle its previously replicated chromosomes and segregate them in a highly ordered fashion into the newly forming daughter cells. I also mentioned a protein complex that is integral to this chromosomal ballet: the cohesin complex. To recap, cohesin is a multimeric ring complex that holds the two chromatids of a chromosome together from the time the second sister chromatid is generated in S phase until their separation in M phase. This decreases complexity, and thereby increases the fidelity of chromosome segregation and thus of mitosis/cell division. And while this feat should already be enough to warrant devoting a whole blog post to cohesin, you will shortly realize that this complex also performs a myriad of other functions during the cell cycle, which really makes it “one ring to rule them all”.

Figure 1: The cohesin complex. The core complex consists of three subunits: Scc1/Rad21, Smc1, and Smc3. They interact to form a ring structure, which embraces ("coheses") sister chromatids.

But let’s back up a little first. Cohesin’s integral ring structure is composed of three proteins: Smc1, Smc3 (Structural maintenance of chromosomes), and Scc1/Rad21 (Sister chromatid cohesion/radiation sensitive). These three proteins attach to each other in a more or less end-to-end manner, thereby forming a circular structure (see Figure 1; ONLY for the nerds: Smc1 and -3 form long intramolecular coiled-coils by folding back onto themselves, bringing together their N- and C-termini at the same end. This means that these two proteins actually interact with their middle parts, forming the so-called “hinge”, as opposed to really “end-to-end”). Cohesin obviously gets its name from the fact that it causes “cohesion” between sister chromatids, which was first described 20 years ago in budding yeast. The theory that the protein complex does so by embracing DNA inside the ring’s lumen was properly formulated in 2002 by the Nasmyth group, and much evidence supporting this “ring embrace model” has been brought forth over the last decades, making it widely (but not absolutely) accepted in the field. According to our current understanding, cohesin is already loaded onto DNA (along the entire length of the decondensed one-chromatid chromosome) in telophase, i.e. only minutes after chromosome segregation, by opening/closing its Smc1-Smc3 interaction site (or “entry gate”). When the second sister chromatid is synthesized in S phase, cohesin establishes sister chromatid cohesion in a co-replicative manner (only once you have the second sister chromatid can you actually start talking about “cohesion”). Early in the following mitosis, in prophase to be exact, the bulk of cohesin is removed from chromosome arms in a non-proteolytic manner by opening up the Smc3-Scc1/Rad21 interface (or “exit gate”; this mechanism is also called the “prophase pathway”).
However, a small but very important fraction of cohesin molecules, which is located at the chromosomes’ centromere regions, remains protected from this removal mechanism in prophase. This not only ensures that sister chromatids remain cohesed until the metaphase-to-anaphase transition, but also provides us with the stereotypical image of an X-shaped chromosome. The last stage in the life of a cohesin ring is its removal from centromeres, a tightly regulated process, which involves proteolytic cleavage of cohesin’s Scc1/Rad21 subunit (see Figure 2).

Figure 2: The cohesin cycle. Cohesin is topologically loaded onto DNA in telophase by opening up the Smc1-Smc3 interface ("entry gate"). Sister chromatid cohesion is established during S phase, coinciding with the synthesis of the second sister. In prophase of early mitosis, the bulk of cohesin molecules are removed from chromosome arms (also called "prophase pathway") by opening up the interface between Scc1/Rad21 and Smc3 ("exit gate"). Centromeric cohesin is ultimately proteolytically removed at the metaphase-to-anaphase transition.

As you can see, during the 24 hours of a typical mammalian cell cycle, cohesin is pretty much always directly associated with the entire genome (the exceptions being chromosome arms during most of mitosis, i.e. 20-40 minutes, and entire chromatids during anaphase, i.e. ~10 minutes). This means that cohesin has at least the potential to influence a whole bunch of other chromosomal events, like DNA replication, gene expression and DNA topology. And you know what? Turns out it does!

Soon after cohesin was described as this guardian of sister chromatid cohesion, it also became clear that there is just more to it. Take DNA replication for example. There is good evidence that initial cohesin loading is already topological (meaning, the ring closes around the single chromatid). That poses an obvious problem during S phase: While DNA replication machineries (“replisomes”) zip along the chromosomes trying to faithfully duplicate the entire genome in a matter of just a couple of hours, they encounter – on average – multiple cohesin rings that are already wrapped around DNA. Simultaneously, cohesin’s job is to take those newly generated sister chromatids and hold them tightly to the old one. Currently, we don’t really know how this works, whether the replisome can pass through closed cohesin rings, or whether cohesin gets knocked off and reloaded after synthesis. What we do know, however, is that cohesion establishment and DNA replication are strongly interdependent, with defects in cohesion metabolism causing replication phenotypes and vice versa.

Cohesin has also been shown to have functions in transcriptional regulation. It was observed quite early that cohesin can act as an insulation factor, blocking long-range promoter-enhancer association. Today we have good evidence showing that cohesin binds to chromosomal insulator elements that are usually associated with the CTCF (CCCTC-binding factor) transcriptional regulator. Here, the ring complex is thought to help CTCF’s agenda by creating internal loops, i.e. inside the same sister chromatid!

Studying cohesin has, of course, not only academic value. Because of its pleiotropic functions, defects in human cohesin biology can cause a number of clinically relevant issues. Since actual cohesion defects will cause mitotic failure (which most surely results in cell death), most cohesin-associated diseases are believed to be caused by misregulation of the complex’s non-canonical functions in replication/transcription. These so-called cohesinopathies (e.g. Roberts syndrome and Cornelia de Lange syndrome) are congenital disorders with widely ranging symptoms, which usually include craniofacial/upper limb deformities as well as intellectual disability.

It is important to mention that cohesin also has a unique role in meiosis, where it coheses not only sister chromatids but also chromosomal homologs (the two maternal/paternal versions of a chromosome, each consisting of two sisters, which themselves are cohesed). As a reminder, the lifetime supply of oocytes of a human female is produced before birth. These oocytes are arrested in prophase I (prophase of the first meiotic division) with fully cohesed homologs and sisters, and resume meiosis one by one each menstrual cycle. This means that some oocytes might need to keep up their cohesion (between sisters AND homologs) over decades, which, considering the half-life of your average protein, can be challenging. This has important medical relevance, as cohesion failure is believed to be the main cause behind missegregation of homologs, and thus of age-related aneuploidies such as trisomy 21.

After twenty years of research, the cohesin complex still manages to surprise us regularly, as new functions in new areas of cell cycle regulation come to light. Currently, extensive research is conducted to better understand the role of certain cohesin mutations in cancers such as glioblastoma, or Ewing’s sarcoma. And while we’re still far away from completely understanding this complex complex, we already know enough to say that cohesin really is “one ring to rule them all”.


Repair Gone Wrong: Targeting The DNA Damage Response To Treat Cancer

By Gesa Junge, PhD


Our cells are subject to damage every minute of every day, be it from endogenous factors such as reactive oxygen species generated as part of normal cell respiration, or exogenous factors such as UV radiation from the sun. Together, these factors can lead to as many as 60 000 damaged DNA bases per cell per day. Most of these are changes to the DNA bases or single strand breaks (SSBs), which only affect one strand of the double helix, and can usually be repaired before the DNA is replicated and the cell divides. However, about 1% of SSBs escape repair and become double strand breaks (DSBs) upon DNA replication. DSBs are highly toxic, and a single DSB can be lethal to a cell if not repaired.

Usually, cells are well-equipped to deal with DNA damage and have several pathways that can remove damaged DNA bases and restore the DNA sequence. Nucleotide excision repair (NER, e.g. for UV damage) and base excision repair (BER, for oxidative damage) are the main SSB repair pathways, and homologous recombination (HR) and non-homologous end joining (NHEJ) repair most DSBs. HR is the more accurate pathway for DSB repair, as it relies on a homologous DNA sequence on the sister chromatid to restore the damaged bases, whereas NHEJ simply religates the ends of the break, potentially losing genetic information. However, NHEJ can function at any time in the cell cycle, whereas HR requires a template and is only active once the DNA is being or has been replicated (i.e. in S and G2 phase).

Depending on the severity of the damage, cells can either stop the cell cycle to allow for repair to take place or, if the damage is too severe, undergo apoptosis and die, which in a multicellular organism is generally preferable to surviving with damaged DNA. If cells are allowed to replicate with unrepaired DNA damage, they pass this damage on to their daughter cells in mitosis, and mutations in the DNA accumulate. While mutations are essential to evolution, they can also be problematic. Genomic instability, and mutations in genes such as those that control the cell cycle and the DNA damage response, can increase the risk of developing cancer. For example, germline mutations in ATM, a key protein in the HR pathway of DSB repair, lead to Ataxia Telangiectasia (AT), a neurodegenerative disorder. AT sufferers are hypersensitive to DSB-inducing agents such as x-rays, and have a high risk of developing cancer. Deficiencies in NER proteins lead to conditions such as Xeroderma Pigmentosum or Cockayne Syndrome, which are characterised by hypersensitivity to UV radiation and an increased risk of skin cancer, and mutations in BRCA2, another key HR protein, increase a woman’s risk of developing breast cancer to 60-80% (compared to 13% on average).

Even though deficiencies in DNA repair can predispose to cancer, DNA repair is also emerging as a viable target for cancer therapy. For example, DNA repair inhibitors can be used to sensitise cancer cells to chemotherapy- or radiation-induced damage, making it possible to achieve more tumour cell kill with the same dose of radiation or chemotherapy. However, this approach is not yet used clinically and a major complication is that it often increases both the efficacy as well as the toxicity of treatment.

Another approach is the idea of “synthetic lethality”, which relies on a cancer cell being dependent on a specific DNA repair pathway because it is defective in another, such that deficiency of either one of two pathways is sustainable, but loss of both leads to cell death. This concept was first described by Calvin Bridges in 1922 in a study of fruit flies and is now used in the treatment of breast cancer in the form of an inhibitor of Poly-ADP ribose polymerase (PARP), a key enzyme in the repair of SSBs. Loss of PARP function leads to increased DSBs after DNA replication due to unrepaired SSBs; in normal tissue these are removed by the DSB repair system. However, BRCA2-deficient tumours are defective in HR and cannot repair the very toxic DSBs, leading to cell death. Therefore, BRCA2-deficient tumours are hypersensitive to PARP inhibitors, which are now an approved therapy for advanced BRCA2-deficient breast and ovarian cancer.
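The logic of synthetic lethality can be written down as a tiny truth table. The sketch below is only a toy model under a strong simplifying assumption (a cell dies if, and only if, both SSB repair via PARP and HR via BRCA2 are lost); the function and variable names are made up for illustration and real cells are of course far less binary.

```python
from itertools import product


def cell_survives(ssb_repair_active: bool, hr_active: bool) -> bool:
    """Toy rule: losing one repair route is tolerated, losing both is lethal."""
    return ssb_repair_active or hr_active


# Enumerate all four combinations of the two repair pathways.
outcomes = {
    (ssb, hr): cell_survives(ssb, hr)
    for ssb, hr in product([True, False], repeat=2)
}

for (ssb, hr), alive in outcomes.items():
    print(f"PARP/SSB repair active={ssb!s:5}  HR (BRCA2) active={hr!s:5}  survives={alive}")
```

In this picture, a PARP inhibitor sets the first switch to False in every cell, but only the tumour cells, which already have the second switch off, end up in the lethal corner of the table.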

PARP inhibitors are a good example of a so-called “targeted therapy” for cancer, which is the concept of targeting the molecular characteristics that distinguish the tumour cell from healthy cells (in this case, BRCA2 deficiency), as opposed to most older, cytotoxic chemotherapies, which generally target rapidly dividing cells by inducing DNA damage, and can actually lead to second cancers. With an improved understanding of the molecular differences between normal and tumour cells, cancer therapy is slowly moving away from non-specific cytotoxic drugs towards more tolerable and effective treatments.

From String to Strand


By Jordana Lovett


Ask a molecular biologist what image DNA conjures up in the mind: a convoluted ladder of nitrogenous bases, twisting and coiling dynamically. Pose the very same question to a theoretical physicist, and chances are that DNA takes on a completely different meaning. As it turns out, DNA is in the eye of the beholder. Science is about perspective. Moreover, it relies on the convergence of distinct, yet interrelated angles to tackle scientific questions wholly.


When I met Dr. Vijay Kumar at a Cancer Immunotherapy meeting, I was immediately intrigued by his unique background and path to biology.  Vijay largely credits his family for strongly instilling in him core values of education and assiduousness. He was raised to strive for the best, and was driven to satisfy the goals of his parents, who encouraged him to pursue a degree in electrical engineering. While slightly resentful at the time, he now realizes that this broad degree would afford him multiple career options as well as the opportunity to branch into other fields of physics in the future. As early as his teenage years, Vijay had already begun thinking about the interesting unknowns of the natural universe. With his blinders on, he sought to explore them using physics and math, both theoretically and practically. As he advanced to university in pursuance of a degree in electrical engineering, he strategized and planned what would be his future transition into theoretical physics. He dabbled in various summer research projects and sought mentorship to help guide his career. Vijay ultimately applied and was accepted to a PhD program at MIT, where he studied string theory in a 6-dimensional model universe. He describes string theory as a broad framework rather than a theory that can be related to the world through ‘thought experiments’ and mathematical consistency.  Kumar continued his work in string theory during a post-doc in Santa Barbara, California, where he found himself surrounded by a more diverse group of physicists. Theoretical physicists, astrophysicists, and biophysicists were able to intermingle and share their science.


This diversity spurred new perspectives and reconsideration of what he had originally thought would be a clear road to professorship and a career in academia. As one would imagine, the broader impacts of string theory are limited; the ideas are part of a specialized pool of knowledge available to an elite handful. Even among the few, competition was fierce: at the time, there were only two available openings for professors in string theory in the United States. Additionally, seeing the need for and presence of ‘quantitative people’ in other fields, such as biology, made him increasingly curious about alternatives to the automated choices he had been making until this point. With the support of his (now) wife, and inspiration from his brother (who had just completed a degree in statistics/informatics and started a PhD in biology), he networked with other post-docs and set up meetings with principal investigators (PIs) to discuss how he, as a theoretical physicist, could play a role in a biological setting. He spent time during his post-doc in Santa Barbara, and throughout his second post-doc at Stony Brook, reflecting, taking courses and shifting into a different mindset. Vijay interviewed and gave talks at a number of institutions, and eventually landed in a lab at Cold Spring Harbor, where he is now involved in addressing some of the shortcomings in DNA sequencing technology.


Starting in a different lab within the confines of a field means readjusting to brand new settings, getting acquainted with new lab mates and shifting from one narrowly focused project to another. Launching not only into a new lab, but into a foreign field adds an extra unsettling and daunting layer to the scenario. Vijay, however, viewed this as yet another opportunity to uncover mysteries in nature through a new perspective. He recognized an interplay between string theory, wherein the vibration of strings allows you to make predictions about the universe, and biology, where the raw sequence of DNA can inform the makeup of an organism and its interactions with the world. It is with this viewpoint that Vijay understands DNA. He sees it as an abstraction, as a sequence of letters that allows you to draw inferences and predict biological outcomes. A change or deletion in just one letter can have enormous, tangible effects. It is this tangibility that speaks to Vijay. He is drawn to the application and broader consequences of the work he is doing, and excited that he can use his expertise to contribute to this knowledge.


While approaching a radically different field can impose obstacles, Kumar sees common challenges in both physics and biology and simply avoids getting lost in scientific translation. Just as theory has a language, so too biology has its own jargon. Once past this barrier, addressing gaps in knowledge becomes part of the common scientific core. Biology enables a question to be answered through various assays and allows observable results to guide future experiments; expertise in various subjects is therefore not only encouraged, but necessary. Collaborations between different labs across various disciplines enable painting a complete picture. “I’m a small piece of a larger puzzle, and that’s ok”, says Vijay. His insight into how scientists ought to work is admirable. Sharing and communicating data in a way that is comprehensible across the scientific playing field will allow scientific progress to be made more quickly and efficiently.


If I’ve learned one thing from Vijay’s story, it is to understand that science has room for multiple perspectives. In fact, it demands questions to be addressed in an interdisciplinary fashion. You might question yourself along the way. You might shift gears, change directions. But these unique paths mold the mind to perceive, ask, challenge, and contribute in a manner that no one else can.

How Low Can You Go? Designing a Minimal Genome

By Elizabeth Ohneck, PhD

How many genes are necessary for life? We humans have 19,000 – 20,000 genes, while the water flea Daphnia pulex has over 30,000 and the microbe Mycoplasma genitalium has only 525. But how many of these genes are absolutely required for life? Is there a minimum number of genes needed for a cell to survive independently? What are the functions of these essential genes? Researchers from the J. Craig Venter Institute and Synthetic Genomics, Inc., set out to explore these questions by designing the smallest cellular genome that can maintain an independently replicating cell. Their findings were published in the March 25th issue of Science.

The researchers started with a modified version of the Mycoplasma mycoides genome, which contains over 900 genes. Mycoplasmas are the simplest cells capable of autonomous growth, and their small genome size provides a good starting point for building minimal cells. To identify genes unnecessary for cell growth, the team used Tn5 transposon mutagenesis, in which a piece of mobile DNA is introduced to the cells and randomly “jumps” into the bacterial chromosome, thereby disrupting gene function. If many cells were found to have the transposon inserted into the same gene at any position in the gene sequence, and these cells were able to grow normally, the gene was considered non-essential, since its function was not required for growth; such genes were candidates for deletion in a minimal genome. In some genes, the transposon was only found to insert at the ends of the genes, and cells with these insertions grew slowly; such genes were considered quasi-essential, since they were needed for robust growth but were not necessary for cell survival. Genes which were never found to contain the transposon in any cells were considered essential, since cells that had transposon insertions in these genes did not survive; these essential genes were required in the minimal genome.
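To make the three categories concrete, here is a minimal sketch of how such a classification rule could look in code. It is a simplification for illustration, not the researchers' actual pipeline: it looks only at where insertions fall within a gene and ignores growth rate, and the function name and the `end_fraction` cutoff for what counts as "the ends" are hypothetical.

```python
def classify_gene(insertion_positions, gene_length, end_fraction=0.1):
    """Classify a gene from transposon insertion positions.

    insertion_positions: 0-based offsets into the gene where insertions
        were recovered from surviving cells.
    end_fraction: hypothetical cutoff defining "near the ends" of the gene.
    """
    if not insertion_positions:
        # No surviving cell carries an insertion here: the gene is needed.
        return "essential"
    margin = gene_length * end_fraction
    only_at_ends = all(
        pos < margin or pos >= gene_length - margin
        for pos in insertion_positions
    )
    # Insertions tolerated only at the ends suggest the gene still matters.
    return "quasi-essential" if only_at_ends else "non-essential"


print(classify_gene([], 1000))
print(classify_gene([20, 980], 1000))
print(classify_gene([50, 400, 700], 1000))
```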

The researchers then constructed genomes with various combinations of non-essential and quasi-essential gene deletions using in vitro DNA synthesis and yeast cells. The synthetic chromosomes were transplanted into Mycoplasma capricolum, replacing its normal chromosome with the minimized genome. If the M. capricolum survived and grew in culture, the genome was considered viable. Some viable genomes, however, caused the cells to grow too slowly to be practical for further experiments. The team therefore had to find a compromise between small genome size and workable growth rate.

The final bacterial strain containing the optimized minimal genome, JCVI-syn3.0, had 473 genes, a genome smaller than that of any autonomously replicating cell found in nature. Its doubling time was 3 hours, which, while slower than the 1 hour doubling time of the M. mycoides parent strain, was not prohibitive for further experiments.

What genes were indispensable for an independently replicating cell? The 473 genes in the minimal genome could be categorized into 5 functional groups: cytosolic metabolism (17%), cell membrane structure and function (18%), preservation of genomic information (7%), expression of genomic information (41%), and unassigned or unknown function (17%). Because the cells were grown in rich medium, with almost all necessary nutrients provided, many metabolic genes were dispensable, aside from those necessary to effectively use the provided nutrients (cytosolic metabolism) or transport nutrients into the cell (cell membrane function). In contrast, a large proportion of genes involved in reading, expressing, replicating, and repairing DNA were maintained (after all, the presence of genes is of little use if there is no way to accurately read and maintain them). As the cell membrane is critical for a defined, intact cell, it’s unsurprising that the minimal genome also required many genes for cell membrane structure.
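As a quick sanity check, the percentages above can be converted back into approximate gene counts. The snippet below does only that arithmetic; because the published percentages are rounded, the counts it produces are estimates and may differ by a gene or so from the exact numbers.

```python
TOTAL_GENES = 473  # size of the JCVI-syn3.0 genome, from the text

# Percentage of genes per functional group, as quoted above.
shares = {
    "cytosolic metabolism": 17,
    "cell membrane structure and function": 18,
    "preservation of genomic information": 7,
    "expression of genomic information": 41,
    "unassigned or unknown function": 17,
}

# The five groups should account for the whole genome.
assert sum(shares.values()) == 100

approx_counts = {
    name: round(TOTAL_GENES * pct / 100) for name, pct in shares.items()
}

for name, count in approx_counts.items():
    print(f"{name}: ~{count} genes")
```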

Of the 79 genes that could not be assigned to a functional category, 19 were essential and 36 were quasi-essential (necessary for rapid growth). Thirteen of the essential genes had completely unknown functions. Some were similar to genes of unknown function in other bacteria or even eukaryotes, suggesting these genes may encode proteins of novel but universal function. Those essential genes that were not similar to genes in any other organisms might encode novel, unique proteins or unusual sequences of genes with known function. Studying and identifying these genes could provide important insight into the core molecular functions of life.

One of the major advancements resulting from this study was the optimization of a semi-automated method for rapidly generating large, error-free DNA constructs. The technique used to generate the genome of JCVI-syn3.0 allows any small genome to be designed and built in yeast and then tested for viability under standard laboratory conditions in a process that takes about 3 weeks. This technique could be used in research to study the function of single genes or gene sets in a well-defined background. Additionally, genomes could be built to include pathways for the production of drugs or chemicals, or to enable cells to carry out industrially or environmentally important processes. The small, well-defined genome of a minimal cell that can be easily grown in laboratory culture would allow accurate modeling of the consequences of adding genes to the genome and lead to greater efficiency in the development of bacteria useful for research and industry.

Lethal Weapon: How Many Lethal Mutations Do We Carry?


By John McLaughlin

Many human genetic disorders, such as cystic fibrosis and sickle cell anemia, are caused by recessive mutations with a predictable pattern of inheritance. Tracking hereditary disorders such as these is an important part of genetic counseling, for example when planning a family. In fact, there exists an online database dedicated to medical genetics, Mendelian Inheritance in Man, which contains information on most human genetic disorders and their associated phenotypes.


The authors of a new paper in Genetics set out to estimate the number of recessive lethal mutations carried in the average human’s genome. The researchers’ rationale for specifically focusing on recessive mutations is their higher potential impact on human health; because deleterious mutations that are recessive are less likely to be purged by selection, they can be maintained in heterozygotes with little impact on fitness, and therefore occur in greater frequency. For the purposes of their analysis, recessive lethal disorders (i.e. caused by a recessive lethal mutation) were defined by two main criteria: first, when homozygous for its causative mutation, the disease leads to the death or effective sterility of its carrier before reproductive age, and second, mutant heterozygotes do not display any disease symptoms.


For this study, the researchers had access to an excellent sample population, a religious community known as the Hutterian Brethren. This South Dakotan community of ~1600 individuals is one of three closely related groups that migrated from Europe to North America in the 19th century. Importantly, the community has maintained a detailed genealogical record tracing back to the original 64 founders, which also contains information on individuals affected by genetic disorders since 1950. An additional bonus is that the Hutterites practice a communal lifestyle in which there is no private property; this helps to reduce the impact of confounding socioeconomic factors on the analysis.


Four recessive lethal genetic disorders have been identified in the Hutterite pedigree since their more detailed records began: cystic fibrosis, nonsyndromic mental retardation, restrictive dermopathy, and myopathy. To estimate the number of recessive lethal mutations carried by the original founders, the team used both the Hutterite pedigree and a type of computational simulation known as “gene dropping”. In a typical gene dropping simulation, alleles are assigned to a founder population, the Mendelian segregation and inheritance of these alleles across generations is simulated, and the output is compared with the known pedigree. One simplifying assumption made during the analysis is that no de novo lethal mutations had arisen in the population since its founding; therefore, any disorders arising in the pedigree are attributed to mutations carried by the original founder population.
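For readers who like to see the idea in code, here is a minimal sketch of the forward-simulation part of gene dropping. Everything in it is a toy assumption: the nine-person pedigree, the individual labels and the single carrier founder are invented for illustration and have nothing to do with the real Hutterite data, and the comparison against an observed pedigree is left out.

```python
import random

# Hypothetical toy pedigree: child -> (father, mother), parents before children.
PEDIGREE = {
    "C": ("A", "B"),
    "D": ("A", "B"),
    "G": ("C", "E"),
    "H": ("F", "D"),
    "I": ("G", "H"),  # G and H are first cousins
}

# Founder genotypes: "m" is a recessive lethal allele, "+" is wild type.
FOUNDERS = {
    "A": ("m", "+"),  # the only carrier among the founders
    "B": ("+", "+"),
    "E": ("+", "+"),
    "F": ("+", "+"),
}


def gene_drop(pedigree, founder_genotypes, rng):
    """Drop founder alleles through the pedigree by Mendelian segregation."""
    genotypes = dict(founder_genotypes)
    for child, (father, mother) in pedigree.items():
        # Each parent passes on one of its two alleles at random.
        genotypes[child] = (
            rng.choice(genotypes[father]),
            rng.choice(genotypes[mother]),
        )
    return genotypes


def affected_fraction(individual, n_replicates=20_000, seed=0):
    """Fraction of simulated replicates in which `individual` is m/m."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_replicates):
        if gene_drop(PEDIGREE, FOUNDERS, rng)[individual] == ("m", "m"):
            hits += 1
    return hits / n_replicates


fraction = affected_fraction("I")
print(f"Simulated fraction of replicates with I affected: {fraction:.4f}")
```

In this toy pedigree the lethal allele has to pass through three meioses on each side to reach individual I, so the simulated fraction should hover around (1/2)^3 × (1/2)^3 = 1/64.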


After combining the results from many thousands of such simulations with the Hutterite pedigree, the authors make a final estimate of roughly one or two recessive lethal mutations carried per human genome (the exact figure is ~0.58). What are the implications of this estimate for human health? Although mating between more closely related individuals has long been known to increase the probability of recessive mutations becoming homozygous in offspring, a more precise risk estimate was generated from this study’s mutation estimate. In the discussion section it is noted that mating between first cousins, although fairly rare today in the United States, is expected to increase the chance of a recessive lethal disorder in offspring by ~1.8%.
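The ~1.8% figure can be reproduced with a back-of-the-envelope calculation, under two assumptions that are mine rather than spelled out above: that ~0.58 refers to a diploid genome (so about 0.29 per haploid set), and that the added risk is approximately the inbreeding coefficient of the offspring times the number of lethal alleles per haploid set. For children of first cousins, the standard inbreeding coefficient is 1/16.

```python
# Figure quoted in the text, read here as a per-diploid-genome value (assumption).
LETHALS_PER_DIPLOID_GENOME = 0.58
lethals_per_haploid_set = LETHALS_PER_DIPLOID_GENOME / 2

# Standard inbreeding coefficient for offspring of first cousins.
F_FIRST_COUSINS = 1 / 16

# Approximate added chance that the child is homozygous for a lethal allele.
added_risk = F_FIRST_COUSINS * lethals_per_haploid_set

print(f"Added risk for offspring of first cousins: {added_risk:.1%}")
```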


Perhaps the most interesting finding from this paper was the consistency of the predicted lethal mutation load across the genomes of different animal species. The authors compared their estimates for human recessive lethal mutation number to those from previous studies examining this same question in fruit fly and zebrafish genomes, and observed a similar value of one or two mutations per genome. Of course, the many simplifying assumptions made during their analyses should be kept in mind; the estimates are considered tentative and will most likely be followed up with similar future work in other human populations. It will certainly be interesting to see how large-scale studies such as this one will impact human medical genetics in the future.


Are Existing Policies Regulating Recombinant DNA Technology Adapted for Synthetic Biology?


By Florence Chaverneff, PhD


Background on Synthetic Biology

Synthetic biology is gaining increasing interest as one of the most promising new technologies of the 21st century. Its revolutionary nature, wide-ranging applications across several scientific disciplines, and the fact that it may help solve some of the world’s most pressing issues, all contribute to the justified enthusiasm for the field. As the boundaries, prospects and even nature of synthetic biology still need to be clearly outlined, the definition advanced by a high-level expert group of the European Commission, encompasses it well: “Synthetic Biology is the engineering of biology: the synthesis of complex, biologically based (or inspired) systems which display functions that do not exist in nature. This engineering perspective may be applied at all levels of the hierarchy of biological structures-from individual molecules to whole cells, tissues and organisms.”


In the same manner that recombinant DNA technology revolutionized biology in the 1970s, synthetic biology is breaking new ground. However, because it relies more heavily on DNA synthesis than recombinant DNA technology does, synthetic biology brings life sciences closer to engineering. It aims to make biology easy to engineer. And that is the revolutionary part. Its multi-disciplinary nature at the nexus of biology, engineering, genetics, computational biosciences and chemistry implies that synthetic biology be practiced in a global and networked fashion, posing it as the ultimate collaborative venue for scientific research.


Applications of Synthetic Biology

The array of what synthetic biology allows one to design and produce, from biomolecules to cells, pathways, and ultimately living organisms, in and of itself gives an idea of the power of the technology. Synthetic biology, with its new category of tools that allow advanced DNA synthesis, conceptualization of biologically complex systems, and standardization for mass production, is more approachable to a less skilled workforce, and more efficient and manageable, than what is currently practiced in biotechnology companies.


Applications of synthetic biology are wide-ranging, from global health (e.g. vaccine and antibody production, regenerative medicine, development of therapies for cancer, approaches for cell therapy) to generation of biofuels and food production. And as the field grows, technologies are bound to evolve, giving rise to an even wider array of applications. One of the most notable and highly publicized successes of synthetic biology was published in Nature Biotechnology in 2003. The article describes a novel way of producing the anti-malarial drug artemisinin, using the bacterium E. coli as a host in which the enzymes and metabolic pathway for artemisinin production were expressed. Artemisinin synthesized in this manner can be produced at much higher yield and much lower cost than by plant extraction. These considerations are of great importance for an anti-malarial drug destined for large populations in low-income countries. Another powerful example of the promise held by synthetic biology lies in a study published last year in Science, reporting the assembly of a synthetic yeast chromosome, heavily edited from its natural counterpart, yet functional when expressed in its organism.


Crafting Policies for Synthetic Biology

Despite being over a decade old, synthetic biology is still in its infancy and its full potential has yet to be realized. A regulatory framework, indispensable to any new technology that can be applied to the life sciences, will have to match the field’s evolution. Some policies for synthetic biology may be adapted from existing ones that were designed to regulate recombinant DNA technology and genetic engineering. However, it is critical that new regulations, tailored to synthetic biology, which is tantamount to engineering artificial life, be established. Considerable changes in regulations should be avoided, as they might hold up development of the fast-evolving field.


Perhaps one of the most important policy aspects to consider for synthetic biology is linked to its very nature. Synthetic biology permits manufacturing of whole living organisms, which, if released into the environment, could greatly affect it by interacting with ecosystems. It is therefore imperative that preventive measures be taken and that ethical oversight be put in place to avoid misuse of the technology. Another policy aspect particular to synthetic biology is related to its multi-disciplinary nature: all its practitioners, not just biologists, should be educated in biosafety. Additionally, policies should allow for training of scientists, researchers and other professionals to meet the demands of the field. Several top institutions in the US have already launched graduate programs in synthetic biology, but more educational programs are required.


Synthetic biology research and frameworks for funding are also vital to support evolution of the field by strengthening research and development capabilities, and supporting innovation. Synthetic biology should be practiced in academic institutions and private ventures alike. In both instances, policies should be adapted so that results from research meet the demands of the modern economy, by taking measures to industrialize innovation in commercially successful ways through facilitation of technology transfer and intellectual property management.


Finally, because synthetic biology is heavily reliant on openness and sharing and holds great potential for becoming the poster child of international scientific cooperation, national policies formulated in the US and elsewhere could serve as a template for transnational policies.


Sensationalized Science


By Elizabeth Ohneck, PhD


In a recent Letter to Nature, researchers from the Scripps Research Institute announced that they had successfully engineered a bacterium that could recognize and replicate DNA containing an unnatural base pair (UBP). Their publication, entitled “A semi-synthetic organism with an expanded genetic alphabet”, demonstrated that E. coli could recognize, take up, and utilize man-made nucleotides to reproduce a plasmid containing a base pair of the synthetic nucleotides, faithfully replicating the UBP for over 20 generations (read more about it in our post about the paper).


The findings presented by the authors are incredibly exciting and have huge implications for future research in genetics, microbiology, and medicine. The presentation, however, is concerning. The authors refer to their strain of E. coli as “semi-synthetic.” Such a term could, for non-scientists (or even the scientist with a highly active imagination), conjure up images of some half bacterium-half robot, a sort of Frankenstein’s monster bacterium manufactured by man in the lab. What they actually have is a strain of E. coli carrying two plasmids, one that expresses an algal transporter able to import the synthetic nucleotides, and one containing the UBP. The introduction of plasmids into bacteria is a staple of biological research, and non-native proteins are regularly expressed in microorganisms from E. coli to yeast for countless research and industrial purposes. Are these microorganisms, then, also considered semi-synthetic? Referring to this E. coli strain as such actually does the findings a disservice, as part of what makes this report so exciting is that a common organism could recognize and utilize synthetic nucleotides with its own DNA replication machinery. The idea of an “expanded genetic alphabet” is also somewhat of a stretch, as the second plasmid contained a single UBP but was otherwise composed of canonical, naturally occurring A-T/G-C base pairs. This single UBP wasn’t utilized in any biological or genetic function; it was merely maintained during plasmid replication. Can we consider this UBP a true expansion of the genetic alphabet if it is not interpreted for inclusion in a bacterial function? Do the lofty terms used in the title sensationalize the story in an effort to attract an audience?


For trained scientists, this issue may seem minor; after all, would anyone outside of the research sector truly read or pay attention to this paper? If the research results become a news story, they might. In fact, the bigger problem is the communication of this research to the general public by the media, which further sensationalized the story. CNN even published an article entitled “New life engineered with artificial DNA.” One merely needs to glance through the comments section of the online article to understand the backlash against such a claim. Is this organism really “new life”? Is “artificial DNA” perhaps an overstatement?


The current climate of public attitude toward health science and genetic research is bitterly divided. Consider, for example, the well-publicized, acrimonious debates over vaccination, pharmaceuticals, and GMOs. Articles that imply scientists are “playing God” by “creating new life” only increase suspicion and inflame anti-science sentiment among groups already wary or contemptuous of health and science research. While it’s important to draw readers and sell stories, sensationalizing the science inhibits fair dialogue over the subject and detracts from the value of the scientific discovery.


The advancement of science needs public support – financially, politically, and even in terms of morale – which we can only gain through transparency and the communication of accurate information in the interest of educating the public. As research scientists, good communication starts with us. We have the responsibility to ensure our findings are clearly and truthfully conveyed to any audience, including among the research community. In turn, it is up to science writers and journalists to ensure the appropriate communication of scientific research to the public, in a manner intended to do more than sell stories. Science, itself, is sensational. Let’s not allow fabricated drama to take away from the excitement and wonder of scientific discovery.

DNA is made of A, C, T, G…X, and Y?


By Elaine To


In biology classes, everybody is taught that deoxyribonucleic acid (DNA, AKA the genetic information of a cell) has four and only four nucleotide bases. Adenine (A) and thymine (T) base pair together and cytosine (C) and guanine (G) base pair together. For the first time ever, researchers have expanded the genetic alphabet to include two additional bases: dNaM (X) and d5SICS (Y).

The researchers had previously shown that DNA polymerases, the enzymes responsible for replicating DNA, successfully replicate DNA containing the dNaM-d5SICS base pair. However, these reactions were not carried out within living cells. The researchers decided to try this in the bacterium Escherichia coli due to the simplicity of the cells. Multiple factors had to be optimized in preparation for carrying out these reactions inside cells.

Firstly, the unnatural bases must be present inside the bacteria for DNA polymerase to use them as raw materials. Cells normally obtain A, C, T, and G from breaking down food or recycling previously used nucleotides. Neither of these pathways was an option for X and Y, so the researchers first tried passive diffusion across the cell membrane. The idea was that once X and Y diffused into the cell, they could be phosphorylated by naturally occurring enzymes to their triphosphate form, which is the form that DNA polymerases recognize and use. The phosphorylation was unsuccessful.

The researchers then explored the idea of transporting the triphosphate forms (XTP and YTP) directly into the cells. Uptake of XTP and YTP by nucleotide triphosphate transporters from multiple other species was screened. The PtNTT2 transporter from the diatom Phaeodactylum tricornutum was most efficient at bringing XTP and YTP into the cells.

The next issue was the instability of XTP and YTP in the culture medium, especially when the E. coli were actively growing. Tests were first carried out on the natural triphosphate ATP. It was determined that addition of potassium phosphate (KPi) to the culture medium increased ATP stability significantly and that KPi had the same effect on XTP and YTP.

And with that, the researchers were ready to generate their E. coli organism containing X and Y. They prepared two circular pieces of DNA, known as plasmids, which are easy to transport into bacteria. One plasmid contained the gene for the PtNTT2 transporter and the other contained a gene with an A-T base pair replaced by X and an analog of Y. Since YTP is the provided substrate, any newly produced plasmid will contain X and Y. This distinguishes it from the original template plasmid containing X and the Y analog.

After inserting both plasmids into the bacteria and growing them in medium containing KPi, XTP, and YTP, the researchers extracted the plasmids from the cells. Analyzing the total nucleotide content with mass spectrometry showed that Y was clearly present. X was not detected, but it is known to fragment poorly and thus to be difficult to detect with mass spectrometry.

To check the incorporation of XTP and YTP into the extracted plasmid, it was replicated in a PCR reaction using the natural nucleotides, YTP, and biotinylated XTP as substrates. The new product should contain biotin and thus react with streptavidin, which binds very strongly to biotin. As expected, streptavidin bound to the PCR product, confirming that the X-Y base pair is in the plasmid.

Sequencing of the plasmid showed that the nucleotide sequence is correct up until the expected location of the X-Y base pair. The sequencing reaction terminates at this location because neither X nor Y is provided in the sequencing reagents. This shows that X-Y is present in the right location in the plasmid.
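The logic of this readout can be illustrated with a toy model. The sketch below is not the actual sequencing chemistry or the authors' analysis; it simply assumes that a read proceeds base by base and stops at the first position for which no matching nucleotide is supplied. The template sequence and the function name are hypothetical, chosen only for illustration.

```python
def read_until_unnatural(template: str, unnatural: str = "XY") -> str:
    """Return the part of the template that can be read with natural bases only.

    Toy assumption: the read stops at the first unnatural base (X or Y),
    because the sequencing reagents contain only A, C, G and T.
    """
    for position, base in enumerate(template):
        if base in unnatural:
            # No matching nucleotide is available, so the read ends here.
            return template[:position]
    # No unnatural base was encountered: the whole template is read.
    return template


# Hypothetical template with a single unnatural base at index 8.
template = "ATGCCGTAXGGCTA"
read = read_until_unnatural(template)
print(f"read: {read} (stopped after {len(read)} bases)")
```

In this picture, the length of the truncated read marks where the unnatural base pair sits, which is the reasoning used to place X-Y in the plasmid.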

In a series of landmark experiments, the researchers have shown replication of DNA containing an unnatural base pair inside living cells. The next step to be undertaken is the transcription of this DNA to mRNA and then hopefully translation into a functional protein. It is conceivable that the incorporation of X-Y into mRNA will soon transpire due to the similarity of DNA and RNA. Subsequently, work that has already been done in incorporating unnatural amino acids could be leveraged to facilitate the use of X-Y in codons that result in proteins.

From 1 to 100,000 human genomes: the challenges faced



Sophia David

In 2003, the Human Genome Project was completed and the final version of the first human genome sequence became available as a free scientific resource. Having taken 13 years to complete, the total cost of the project was US$2.7 billion. Since then, rapid technological advancements have allowed genome sequencing costs to plummet and, just ten years later, a human genome can now be fully sequenced for less than $10,000. Moreover, the process takes just days, or even hours. Costs are expected to fall even further to just $1,000 per genome within the next few years.
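To put these figures in perspective, the fold reduction in cost can be worked out directly. The short sketch below uses only the three dollar amounts quoted above; it treats "less than $10,000" as exactly $10,000, so the first result is a lower bound, and the variable and function names are illustrative.

```python
# Figures quoted in the text (US dollars).
HGP_COST = 2_700_000_000   # total cost of the Human Genome Project
CURRENT_COST = 10_000      # upper bound quoted for one genome today
PROJECTED_COST = 1_000     # expected cost per genome in a few years


def fold_reduction(old_cost: float, new_cost: float) -> float:
    """Return how many times cheaper new_cost is than old_cost."""
    return old_cost / new_cost


print(f"Now:       {fold_reduction(HGP_COST, CURRENT_COST):,.0f}-fold cheaper")
print(f"Projected: {fold_reduction(HGP_COST, PROJECTED_COST):,.0f}-fold cheaper")
```

On these numbers, a genome today costs at least 270,000 times less than the first one, and the projected $1,000 genome would be 2.7 million times cheaper.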


Of course, as the costs have fallen, scientists have sequenced an increasing number of genomes.  The 1000 Genomes Project, launched in 2008, has allowed scientists to characterise the variation within human genome sequences at a high resolution.


Remarkably, just ten years after the first human genome was sequenced, several projects are now set up with ambitions of sequencing 100,000 human genomes. One such project is the Personal Genome Project, set up back in 2005 by George Church, a Professor of Genetics at Harvard Medical School. Although initially a US-based project, it is expanding and now also has branches in the UK and Canada. Combined, they hope to sequence 100,000 volunteers over the next ten years. The data will be freely available for all scientists to use.


Meanwhile, the UK government has also pledged £100 million to sequence the genomes of 100,000 patients of the NHS (National Health Service) by the end of 2017. The data obtained from this project will be primarily for clinical use, making the UK one of the first countries to move genomics into the clinic on a large scale. The UK government has set up a company called Genomics England that is responsible for undertaking this ambitious task.


While the translation of genomic data into useful clinical information has been slower than once expected, genome sequencing is undoubtedly impacting on medical diagnoses and treatments. For example, by examining the mutational changes within a patient’s tumour cells, clinicians are able to better characterise some cancers and consequently provide a more appropriate treatment.


However, the developments in genome sequencing have not come without their challenges. Currently, a major debate revolves around how open or private genomic data should be. On the one hand, sharing data is hugely important for enabling scientific discoveries. Thousands of studies have been published as a result of data made freely available by the Human Genome Project and 1000 Genomes Project. On the other hand, a whole genome sequence can reveal a large amount of information about the respective individual, as well as their family members. While such data may be published anonymously with no personal information attached, a study by scientists at MIT showed earlier this year that it is still possible to link genomic data back to the individual using Y-chromosome data and genealogy websites. There are fears that genomic information linked to a specific individual could be used in malicious ways.


Scientists at the Personal Genome Project are only too aware of this dilemma and have devised their own solution. They state on their website that, “We feel the most ethical and practical solution to this dilemma is to turn the privacy problem on its head and collaborate with individuals who are willing to share their data publicly with the understanding that re-identification is possible.”


Thus, data produced by the Personal Genome Project will be freely available to everyone. However, in order to take part in the study, participants must first pass tests to prove that they fully understand the risks of having their genomic information shared with the world.


Meanwhile, genome sequence data from NHS patients will not be publicly available and will instead be stored inside the NHS firewall. It will be linked to patient records for clinical use, and anonymised data will also be available in a restricted place for scientists to use for research purposes.


Another potential problem that arises from the analysis of genomic data concerns how incidental or secondary findings are managed. These could occur when a genome sequence is used to answer a particular question about a patient’s health but something unrelated and unexpected arises. While incidental findings are nothing new in medicine, the risks of such occurrences in genomics are probably higher than in most areas of medicine. Earlier this year, the American College of Medical Genetics and Genomics released their recommendations on incidental findings that occur through genome sequencing. They suggest that all labs performing clinical sequencing should test for well-studied mutations in 57 genes that have a strong association with disease. These include BRCA1 and BRCA2 mutations that are linked to hereditary breast and ovarian cancers. They believe that people should not be able to opt out of knowing these results unless they refuse clinical sequencing.


There is also the risk that genomic information could actually cause harm to patients through misdiagnoses, or that clinicians or scientists could fail to identify clinically useful genetic variants. However, it is likely that these risks will diminish as more genomes are sequenced and clinicians and scientists gain experience in the application of genomic data to medicine.


Finally, one of the great challenges may lie in managing expectations. While the last decade has seen remarkable progress in genomics, the application of genomics to medicine will be a much longer road. Politicians must understand that they will not see a quick return on their investment and patients offered genome sequencing should not always expect a straightforward cure. And lastly, clinicians and scientists should not expect to see medicine transformed overnight by genomics.

Division Doppelgangers

Alisa Moskaleva


Cyclin A is a confounded nuisance for cell biologists. Noticed serendipitously in 1982 in sea urchins and clams in an experiment that earned a share of the 2001 Nobel Prize in Physiology or Medicine, cyclin A and its doppelganger protein, cyclin B, help cells of all animals grow and divide properly. Cells stockpile both proteins before dividing, use them to control division, and then degrade them after they have served their purpose. If cells are deprived of cyclin A or cyclin B, they can’t divide. If cells have too much of these proteins they start dividing early and get stuck, unable to separate into two new cells. But whereas cyclin B sticks around until the step before the two new cells separate, when the two copies of the cell genome are all set to separate, cyclin A disappears several minutes earlier when those two copies of the genome are nowhere near ready to split. Why does a responsible regulator like cyclin A leave its post so scandalously early? And why does a cell need cyclin A to regulate division when it has cyclin B there willing and present?

Lilian Kabeche and Duane A. Compton begin to answer both of these questions in their October 3 Nature paper. They took a close, microscope-assisted look at what goes on during cell division. The general process of cell division has been known for over a hundred years. Before starting to divide, the cell replicates its contents, including its DNA, so it can pass on a copy to both cells of the new generation. Then, during the prometaphase stage, the cell packs up its DNA really tightly and simultaneously builds up lots of microtubules, which are long fibers of protein that act as miniature ropes and sprout from two opposite sides of the dividing cell. The microtubules attempt to lasso the DNA, so that half of the DNA is attached to microtubules from one end of the cell and the other half is attached to microtubules from the other end of the cell. At this time cyclin A disappears. Then, at a stage called metaphase when the DNA is all lined up in the middle of the cell and properly attached to microtubules, cyclin B disappears. What follows is separation of the two copies of DNA to the two sides of the cell, pulled by microtubules; this is called anaphase. Finally, in telophase the two cells pinch off from each other and resume growing.

Kabeche and Compton focused on how cyclin A may be regulating the way microtubules attach to DNA. The big blob of DNA inside a cell is quite easy to see under a microscope, but it’s much harder to see the thin individual microtubules. Thus, Kabeche and Compton labeled microtubules with a photoactivatable fluorescent protein, a protein that can be made to glow by shining a certain wavelength of light on it. Then they looked for microtubules that approached DNA, shone light on them to make them glow, and assessed whether the glowing microtubules would stay in place or wander off. They observed that in prometaphase microtubules were much more likely to wander off than in metaphase. This makes sense. In metaphase, the DNA is organized and aligned, so it should be easy for microtubules to grab it. In prometaphase, by contrast, the DNA is still unorganized and in the process of aligning, so mistakes in attaching microtubules are likely. Microtubules from both sides of the cell may grab the same copy of DNA. Or microtubules from only one side of the cell may grab both DNA copies. These attachment mistakes, if not corrected, would distribute DNA unevenly or even tear it up, leading to deleterious mutations. So, it’s good that microtubules in prometaphase do not attach stably. When Kabeche and Compton gave cells extra cyclin A, they saw that microtubules would wander much longer than normal, even in cells that were in metaphase and had their DNA aligned properly. And when Kabeche and Compton deprived cells of cyclin A, they noticed that the DNA separated unevenly, suggesting that microtubules attached at the wrong place.

All of these observations suggest that cyclin A somehow makes microtubules restless, whereas cyclin B, still present when microtubules make stable attachments, does not. The cell uses cyclin A to control the attachment of microtubules to DNA, and then disposes of it, while relying on cyclin B to control the separation of DNA copies. Given its distinct function, cyclin A disappears not early, but at precisely the right time. If it were to stick around, microtubules would never attach to DNA and division would never proceed. On the other hand, if it were not present at all, microtubules would attach too early and in all the wrong places, leading to mistakes in partitioning the genome to the new generation. Of course, there are many vexing questions that remain to be answered, the most obvious of which is how does cyclin A cause microtubules to no longer attach to DNA? It looks like cyclin A has many more mysteries to reveal.
