Monday, November 9, 2015

The Ocean holds secrets about genome editing




It is said that the ocean holds so many secrets, of lovers and murders, of treasures and untouched ships, that it makes me wonder what other secrets it could possibly hold. We biochemists have the ocean to thank for such special gifts as our beloved GFP (Green Fluorescent Protein), but what other cool proteins can come to us from the ocean?


Everyone has a favorite protein, what’s yours?

I was once asked by a biochemistry teacher: “Everybody has a favorite protein, what’s yours?” I think I was way too young to have an answer back then. It’s been over a decade since I graduated and right now I don’t really think I have one favorite protein, but I do have a group of favorites, and among them are the TALEs. That’s why I think they deserve a post dedicated just to them.


Is it possible to edit the genome?

According to the great Wikipedia, editing is the process of selecting and preparing written, visual and audible media used to convey information. But editing can be applied to other aspects of life as well, even at the gene level. Genome editing is a type of genetic engineering in which DNA is inserted, replaced or removed from a genome using artificial nucleases, or “molecular scissors”. In other words, it is about rewriting the DNA code.

Genetic engineering emerged in the lab of Paul Berg in 1972 in the form of recombinant DNA technology, when scientists combined DNA from the SV40 virus with genes from a bacteriophage and from E. coli. Since then, this science has achieved tremendous success: many methods to manipulate DNA have been developed, as well as vector systems and methods for delivering them to the cell.

From 1990 to 2003, the genomes of human, E. coli, mouse and other organisms were sequenced, but this yielded information about the nucleotide sequence only. In 2003, the U.S. National Human Genome Research Institute launched a new international project, ENCODE (Encyclopedia Of DNA Elements), whose aim was to obtain a complete list of the functional elements of the human genome. Unfortunately, the methods used back then were not only expensive but also quite labor-intensive, and they did not allow one to introduce precise changes at a strictly defined genome locus. Today, researchers have several tools that allow precise genome editing in plants, animals and humans.


A tale about TALEs

Bacteria of the genus Xanthomonas are pathogens of crop plants such as rice, pepper and tomato, and they cause significant economic damage to agriculture. The cool thing about these bacteria is that they inject proteins into the host plant, where the proteins mimic eukaryotic transcription factors and hijack the host’s transcriptional machinery to control gene expression. These proteins are known as TALEs, for Transcription Activator-Like Effectors. But why would TALEs do that? Well, to increase the plant’s susceptibility to the pathogen, and they do it pretty much like a virus.

One of the most interesting things about TALEs is what I call their “outfit”. It turns out that TALEs bind to DNA in a sequence-specific manner, and this specificity comes from their DNA-binding domain “outfit”, which consists of tandem arrays of amino acid repeats, with each repeat binding a single DNA base. Moreover, the residues at positions 12 and 13 of each repeat are highly variable. They are called the BSRs (base-specifying residues) and are responsible for recognizing a specific nucleotide. This TALE code is degenerate, in the sense that some BSRs can bind several nucleotides with different efficiencies.
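To make the idea of a "code" concrete, here is a minimal Python sketch of reading a repeat array as a DNA sequence. The table holds only the widely reported canonical BSR-to-base pairs (NI, HD, NG, NN); it is not taken from the paper discussed in this post, and the helper names predict_target and is_degenerate are made up for illustration.

```python
# Canonical BSR -> base preferences commonly reported for TALEs (assumption:
# only four well-known pairs are included; real arrays use more BSRs).
# Bases are listed with the usual best match first; NN illustrates the
# degeneracy mentioned above, since it accepts G and also A.
TALE_CODE = {
    "NI": ["A"],
    "HD": ["C"],
    "NG": ["T"],
    "NN": ["G", "A"],
}

def predict_target(bsrs):
    """Return the best-match DNA sequence for a list of BSRs (hypothetical helper)."""
    return "".join(TALE_CODE[bsr][0] for bsr in bsrs)

def is_degenerate(bsr):
    """True if the BSR tolerates more than one base."""
    return len(TALE_CODE[bsr]) > 1

print(predict_target(["NI", "HD", "NG", "NN"]))
```

Each repeat contributes exactly one letter, which is why the predicted binding site is as long as the repeat array.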


After TALEs were described, two other plant-disease-associated bacteria were found to encode TALE-like proteins as well. These proteins are the RipTALs from Ralstonia solanacearum and the Bats from the endofungal bacterium Burkholderia rhizoxinica. TALE-likes seem to be united only by their DNA-binding repeats with a conserved code; in fact, Bats lack the domains necessary to function as eukaryotic transcription factors.


What can TALEs be used for?

So let’s imagine: we have a cool protein, wearing a nice tandem repeat outfit, which holds a special code that allows it to bind DNA in a specific manner. So, what if we knew the code? Well, then one could predict the DNA-binding element for any given TALE and design TALEs to match any DNA sequence of interest. Now THAT is interesting. But it gets even better! Genetic engineering is not called that for nothing, so the idea is to make these proteins even cooler. Now, what could be done? Specific DNA binding is one thing, but function is another. So a great idea is to couple TALEs to a functional domain of choice, and those chimeras are invaluable tools for precise manipulation of the genome.
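Going in the design direction is just the reverse lookup. Here is a minimal sketch, assuming one commonly used BSR per base (NI for A, HD for C, NN for G, NG for T); the function name design_repeats is hypothetical, and real designs have to respect further constraints that are ignored here.

```python
# One commonly used BSR per base (assumption: canonical TALE pairs only).
BSR_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}

def design_repeats(target):
    """Return one BSR per base of the target sequence (hypothetical helper)."""
    target = target.upper()
    unknown = set(target) - set(BSR_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [BSR_FOR_BASE[base] for base in target]

print(design_repeats("GATTACA"))
```

Because NN also tolerates A, a designer who needs strict G recognition might pick a different BSR for that position; the sketch keeps a single choice per base for simplicity.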

But we can go beyond: non-BSR polymorphisms might also be useful to tune DNA-binding properties and further expand the diversity of TALE-DNA interactions. One could then create libraries of designed TALEs with a range of binding strengths for the same DNA element, useful for the regulation of synthetic genetic circuits.


The birth of a new pair of genetic scissors: TALENs

Once the code of DNA recognition by TALE proteins was deciphered, it attracted the attention of researchers across the world thanks to its simplicity (one monomer, one nucleotide). The first chimeras of TALEs and nucleases were then created: the sequence encoding the DNA-binding domain of a TALE was inserted into a plasmid vector previously used for creating Zinc Finger Nucleases. The resulting genetic constructs express artificial chimeric nucleases (TALENs), and this was so amazing that in 2011 Nature Methods named the methods of precise genome editing, including the TALEN system, Method of the Year.


How to ‘pimp’ your TALE

One approach to TALE repeat engineering is random mutagenesis and screening. Alternatively, mutations could be introduced in a more targeted fashion, but this requires information on the impact of different types of polymorphisms at different positions in the TALE repeat. So, where do we get information on which sites would be good targets? In Mother Nature. Natural variation provides useful information on which residues can or cannot be tolerated at which positions, and with what effect. Unfortunately, TALE sequence diversity is very low, and the residues clustered around the BSRs are largely invariant across all currently known TALEs, RipTALs and Bats. The good news is that RipTALs and Bats are only about 40% identical to TALEs, which makes them useful for repeat engineering.


TALEs from the Ocean

The ocean is quite an interesting place to do research. We often look for answers in terrestrial model organisms, but the ocean holds many secrets.
In the Gulf of Mexico/Yucatan Channel, during the Global Ocean Sampling (GOS) expedition, biological samples were taken and used for DNA sequencing. One particular sample was analyzed, likely of bacterial origin judging by the size filtering applied to the biological material. During sequence analysis two predicted proteins came to light: MOrTL1 and MOrTL2 (Marine Organism TALE-likes, names that reflect the limited information scientists had regarding their provenance).

These sequences had previously been suggested to encode modular DNA-binding repeats, but no functional analysis had been reported. And guess what, both proteins are tandem repeat arrays… does that ring a bell?
The MOrTL repeats differ at more than 60% of positions from each other and from all other TALE-likes. Although the MOrTL sequences are incomplete and likely to be fragments of larger, incompletely sequenced genes, they were synthesized, and MOrTL1 was further expressed in and purified from E. coli. When analyzed by electrophoretic mobility shift assay (EMSA), MOrTL1 showed only weak DNA binding, inconsistent with what is seen for TALE-likes. This likely reflects the incomplete sequence of MOrTL1, which yields an incompletely functional protein.

So how do you study a predicted protein, based on an incomplete sequence, if it is not fully functional even when expressed? Well… CHIMERAS!
MOrTL repeats were embedded within the repeat domain of a Bat named Bat1. Now they were ready to be studied.


Can Bat1-MOrTL chimeras bind to DNA?

Yes they can! This was confirmed by mixing the Bat1-MOrTL chimeras with their cognate DNA-binding element, as predicted by the TALE code, and analyzing the mixture by two approaches: a classical, qualitative EMSA and the quantitative MicroScale Thermophoresis (MST), which can measure the affinity of an intermolecular interaction and determine its binding constant (KD) with high precision. In fact, the Bat1-MOrTL chimeras bind with a strength similar to that of the wild-type Bat1 protein.
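For readers who like to see what a KD means in practice, here is a small sketch of a simple one-site binding model with the protein in excess over the DNA. This model is my assumption for illustration, I am not claiming it is the exact fit used in the paper, and the numbers are invented.

```python
def fraction_bound(protein_conc, kd):
    """Fraction of DNA bound in a simple 1:1 model with protein in excess.

    Both arguments must use the same concentration unit.
    """
    if protein_conc < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return protein_conc / (kd + protein_conc)

# Illustrative numbers only, not measurements from the paper.
kd = 100.0  # nM
for conc in (10.0, 100.0, 1000.0):
    print(conc, round(fraction_bound(conc, kd), 2))
```

The handy rule of thumb: when the protein concentration equals the KD, half of the DNA is bound, so a lower KD means a tighter interaction.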


Is this binding specific?

When you prove that a TALE-like protein binds to its predicted on-target sequence, does it prove adherence to the TALE code? NOT NECESSARILY!
How to prove it then? Well, specificity needs to be tested, and by this I mean probing the binding to the worst-matching DNA sequence predicted by the TALE code. How can this experiment be done? By competition experiments. When the probe-protein interactions (DNA-binding element and MOrTL chimera) were challenged with on-target and off-target competitors, it was confirmed that both MOrTL1 and MOrTL2 show a base preference consistent with the TALE code. Additionally, when the protein-DNA interactions with the off-target probe were quantified by MST, a very interesting result came to light: the Bat1-MOrTL1 chimera bound the off-target probe with an affinity 19 times lower than the on-target one, while for Bat1-MOrTL2 the affinity was only two times lower. The two thus differ in their discriminating power. The higher discriminatory power of the MOrTL1 repeats makes them the better choice for integration into TALE-like repeat arrays for biotechnological applications. A very nice piece of information.
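The "discriminating power" here boils down to a ratio of binding constants. A quick sketch of the arithmetic follows; the KD values are placeholders I invented so that the ratios come out at the 19-fold and 2-fold differences quoted above. They are not the measured values, and the function name is hypothetical.

```python
def fold_discrimination(kd_on, kd_off):
    """How many times weaker the off-target binding is (KD_off / KD_on)."""
    if kd_on <= 0 or kd_off <= 0:
        raise ValueError("KD values must be positive")
    return kd_off / kd_on

# Placeholder KD values in nM, chosen only to reproduce the quoted ratios.
# They are NOT the KDs measured in the paper.
examples = {
    "Bat1-MOrTL1": (100.0, 1900.0),
    "Bat1-MOrTL2": (100.0, 200.0),
}
for name, (kd_on, kd_off) in examples.items():
    print(name, fold_discrimination(kd_on, kd_off))
```

Note that a 19 times lower affinity corresponds to a 19 times higher KD, which is why the off-target KD sits in the numerator.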


Could protein stability account for the differences observed?

Functional differences among proteins could be due to stability issues. Protein stability can be analyzed by inducing thermal denaturation while monitoring the fluorescence of SYPRO Orange, a dye that binds non-specifically to hydrophobic surfaces. When the protein unfolds, the exposed hydrophobic surfaces bind the dye, resulting in an increase in fluorescence. This method is called DSF (Differential Scanning Fluorimetry), also known as Thermofluor. But as the band Outkast asks, “what’s cooler than being cool?” Well, the answer is not “ice cold”, it is “nanoDSF”. This technology does the same as DSF but takes advantage of the intrinsic fluorescence of tryptophan, so no dye is needed. Using this pretty cool technology, it was shown that the functional differences between the MOrTL1 and MOrTL2 chimeras are not due to differences in protein stability.
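A melt curve from this kind of experiment is usually summarized by a melting temperature (Tm). One common way to estimate it is to find where the signal rises most steeply. Below is a minimal sketch on a synthetic curve; the data are made up, the function name is hypothetical, and real instruments apply smoothing and, for nanoDSF, typically work on a fluorescence ratio rather than a raw signal.

```python
import math

def melting_temperature(temps, fluorescence):
    """Estimate Tm as the midpoint of the steepest rise in the melt curve."""
    if len(temps) != len(fluorescence) or len(temps) < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    best_slope, best_tm = float("-inf"), None
    for i in range(len(temps) - 1):
        # Finite-difference slope between neighbouring measurements.
        slope = (fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            best_tm = (temps[i] + temps[i + 1]) / 2
    return best_tm

# Synthetic sigmoidal unfolding curve centred on 55 degrees C (made-up data).
temps = [25 + 0.5 * i for i in range(121)]  # 25.0 to 85.0 in 0.5 degree steps
signal = [1 / (1 + math.exp(-(t - 55.0) / 2.0)) for t in temps]
print(melting_temperature(temps, signal))
```

Two proteins with similar Tm values unfold at about the same temperature, which is the kind of comparison used to rule out stability as the explanation for a functional difference.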


By demonstrating that MOrTL repeats mediate DNA-binding behavior analogous to that of other TALE-like repeats, scientists have gained insights into the nature of the whole TALE-like family… hopefully these amazing secrets discovered in the ocean will enable further research into the distribution and functions of these fascinating DNA-binding proteins.

11. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats. Nucl. Acids Res. first published online October 19, 2015 doi:10.1093/nar/gkv1053