Beyond the Eppi-tube: EMSA

It is said that the ocean holds so many secrets, of lovers and murders, of treasures and untouched ships, that it makes me wonder what other secrets it could possibly hold. We biochemists have the ocean to thank for such special gifts as our beloved GFP (Green Fluorescent Protein), but what other cool proteins can come to us from the ocean?

Everyone has a favorite protein, what’s yours?

I was once asked by a biochemistry teacher: “everybody has a favorite protein, what’s yours?” I think I was way too young to have an answer back then. It’s been over a decade since I graduated and right now I don’t really think I have one favorite protein, but I have a group of favorites and among them are the TALEs. That’s why I think they deserve a post dedicated just to them.

Is it possible to edit the genome?

According to the great Wikipedia, editing is the process of selecting and preparing written, visual and audible media used to convey information. But editing can be applied in other aspects of life as well, even at the gene level. Genome editing is a type of genetic engineering in which DNA is inserted into, replaced in or removed from a genome using artificial nucleases, or “molecular scissors”. In other words, it is about rewriting the DNA code.

Genetic engineering emerged in the lab of Paul Berg in 1972 in the form of recombinant DNA technology, when scientists combined DNA from the SV40 virus with genes from a bacteriophage and E. coli. Since then, this science has achieved tremendous success: many methods to manipulate DNA have been developed, as well as vector systems and methods for delivering them into the cell.

From 1990 to 2003, the genomes of human, E. coli, mouse and other organisms were sequenced, which provided information about the nucleotide sequence only. In 2003, the U.S. National Human Genome Research Institute launched a new international project, ENCODE (Encyclopedia Of DNA Elements), whose aim was to obtain a complete list of the functional elements of the human genome. Unfortunately, the methods used back then were not only expensive but also quite labor-intensive, and they did not allow one to introduce precise changes into a strictly defined genome locus. Today, researchers have several tools for the precise editing of plant, animal and human genomes.

A tale about TALEs

Bacteria of the genus Xanthomonas are pathogens of crop plants such as rice, pepper and tomato, and they cause significant economic damage to agriculture. The cool thing about these bacteria is that they inject proteins into their host plants, where the proteins mimic eukaryotic transcription factors and thus hijack the host’s transcriptional machinery to control gene expression. These proteins are known as TALEs, for Transcription Activator-Like Effectors. But why would TALEs do that? Well, to increase the plant's susceptibility to the pathogen, and they do it pretty much like a virus.

One of the most interesting things about TALEs is what I call their “outfit”. It turns out that TALEs bind to DNA in a specific manner, and this specificity is given by their DNA-binding domain “outfit”, which consists of tandem arrays of amino acid repeats, with each repeat binding a single DNA base. Moreover, the residues located at positions 12 and 13 of each repeat are highly variable; they are called the BSRs (base-specifying residues) and are responsible for the recognition of a specific nucleotide. This TALE code is degenerate, in that some BSRs can bind several nucleotides with different efficiencies.
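To make the "one repeat, one base" idea concrete, here is a minimal sketch of how a target sequence could be predicted from the BSRs of a repeat array. The lookup table lists only a handful of commonly cited BSRs and their preferred bases, the function name and the example array are hypothetical, and the table ignores the differing binding efficiencies mentioned above.

```python
# Simplified TALE code: BSR (residues 12 and 13 of a repeat) -> preferred base(s).
# Illustrative subset only; real BSRs bind their bases with varying efficiency.
TALE_CODE = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "GA",    # degenerate: accepts G or A
    "NK": "G",
    "NS": "ACGT",  # little base preference
}

# IUPAC ambiguity letters for the base sets used in the table above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "AG": "R", "ACGT": "N"}


def predict_binding_element(bsrs):
    """Return the predicted target as an IUPAC string, one letter per repeat."""
    letters = []
    for bsr in bsrs:
        bases = "".join(sorted(TALE_CODE[bsr]))
        letters.append(IUPAC[bases])
    return "".join(letters)


# Hypothetical six-repeat array, written as the BSR of each repeat in order.
example_array = ["NI", "HD", "NG", "NN", "NK", "NS"]
print(predict_binding_element(example_array))
```

Reading the table in the other direction (from a desired DNA sequence to a list of BSRs) is what makes custom-designed TALEs possible.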

After TALEs were described, two other plant disease-associated bacteria were found to encode TALE-like proteins as well. These proteins are the RipTALs from Ralstonia solanacearum and the Bats from the endofungal bacterium Burkholderia rhizoxinica. TALE-likes seem to be united only by the possession of DNA binding repeats with a conserved code. In fact, Bats lack the domains necessary to function as eukaryotic transcription factors.

What can TALEs be used for?

So let’s imagine: we have a cool protein, wearing a nice tandem repeat outfit, which holds a special code that allows it to bind to DNA in a specific manner. So, what if we knew the code? Well, then one could predict the DNA binding element for any given TALE and design TALEs to match any DNA sequence of interest. Now THAT is interesting. But it gets even better! Genetic engineering is not called that for nothing, so the idea is to make these proteins even cooler. Now, what could be done? Specific DNA binding is one thing, but function is another. So a great idea is to couple TALEs to a functional domain of choice, and those chimeras are invaluable tools for precise manipulation of the genome.

But we can go beyond: non-BSR polymorphisms might also be useful to tune DNA-binding properties and further expand the diversity of TALE-DNA interactions. One could then create libraries of designed TALEs with a range of binding strengths for the same DNA element, useful for the regulation of synthetic genetic circuits.

The birth of a new pair of genetic scissors: TALENs

The code of DNA recognition by TALE proteins attracted the attention of researchers across the world because of its simplicity (one monomer – one nucleotide). Once it had been deciphered, the first chimeras of TALEs and nucleases were created: the sequence encoding the DNA-binding domain of a TALE was inserted into a plasmid vector previously used for creating Zinc Finger Nucleases. The resulting genetic constructs expressed artificial chimeric nucleases (TALENs), and this was so amazing that in 2011 Nature Methods named the methods of precise genome editing, including the TALEN system, Method of the Year.

How to ‘pimp’ your TALE

One approach to TALE repeat engineering is random mutagenesis and screening. Alternatively, mutations could be introduced in a more targeted fashion, but this requires information on the impact of different types of polymorphisms at different positions in the TALE repeat. So, where do we get information on which sites should be good targets? In Mother Nature. Natural variation would provide useful information on which residues can or cannot be tolerated at which positions, and with what effect. Unfortunately, TALE sequence diversity is very low, and the residues clustered around the BSRs are largely invariant across all currently known TALEs, RipTALs and Bats. The good news is that RipTALs and Bats are only about 40% identical to TALEs, making them useful for repeat engineering.

TALEs from the Ocean

The ocean is quite an interesting place to do research. We often look for answers in terrestrial model organisms, but the ocean holds many secrets.

In the Gulf of Mexico/Yucatan Channel, during the Global Ocean Sampling (GOS) expedition, biological samples were taken and used for DNA sequencing. One sample in particular was analyzed, likely of bacterial origin based on the size filtering of the biological material that was used. During sequence analysis two predicted proteins came to light: MOrTL1 and MOrTL2 (Marine Organism TALE-likes, names that reflect the limited information scientists had regarding their provenance).

These sequences had previously been suggested to encode modular DNA binding repeats, but no functional analysis had been reported. And guess what, both proteins are tandem repeat arrays … does that ring a bell?

The MOrTL repeats differ at more than 60% of positions from each other and from all other TALE-likes. Although the MOrTL sequences are incomplete and likely to be fragments of larger, incompletely sequenced genes, they were synthesized, and MOrTL1 was further expressed in and purified from E. coli. As analyzed by electrophoretic mobility shift assay (EMSA), MOrTL1 showed only weak DNA binding, unlike other TALE-likes. This likely reflects the incomplete sequence of MOrTL1, which yields a protein that is not fully functional.

So how do you study a predicted protein, based on an incomplete sequence, if it is not fully functional even when expressed? Well… CHIMERAS!

The MOrTL repeats were embedded within the repeat domain of a Bat named Bat1. Now they were ready to be studied.

Can Bat1-MOrTL chimeras bind to DNA?

Yes they can! This was confirmed by mixing the Bat1-MOrTL chimeras with their cognate DNA binding elements, as predicted by the TALE code, and analyzing the mixtures by two approaches: a classical, qualitative EMSA and the quantitative MicroScale Thermophoresis (MST), which measures the affinity of an interaction and yields binding constants (K_D) with high precision. In fact, the Bat1-MOrTL chimeras bind DNA with a strength similar to that of the wild-type Bat1 protein.
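For readers who like to see what a K_D means in practice, here is a minimal sketch of the standard 1:1 binding model that is commonly used to describe titration data of this kind. It is not the analysis from the study: the function name, the concentrations and the K_D value are placeholders chosen for illustration, and the model assumes one binding site and a labelled partner held at a fixed concentration.

```python
import math


def fraction_bound(titrant_total, labelled_total, kd):
    """Fraction of the labelled partner in complex, for simple 1:1 binding.

    All three arguments must use the same concentration unit (e.g. nM).
    Solves the quadratic that follows from K_D = [A][B] / [AB].
    """
    s = titrant_total + labelled_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * titrant_total * labelled_total)) / 2.0
    return complex_conc / labelled_total


# Placeholder numbers, not measured values: 1 nM labelled partner, K_D of 100 nM.
for titrant in (1.0, 10.0, 100.0, 1000.0, 10000.0):
    print(titrant, round(fraction_bound(titrant, 1.0, 100.0), 3))
```

When the labelled partner is much more dilute than the K_D, half of it is bound at a titrant concentration roughly equal to the K_D, which is why a lower K_D means tighter binding.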

Is this binding specific?

When you prove that a TALE-like protein binds to its predicted on-target sequence, does it prove adherence to the TALE code? NOT NECESSARILY!

How to prove it then? Well, specificity needs to be tested, and by this I mean testing binding to the worst-matching DNA sequence predicted by the TALE code. How can this experiment be done? By competition experiments. When probe-protein interactions (DNA binding element-MOrTL interactions) were analyzed with on- and off-target competitors, it was confirmed that both MOrTL1 and MOrTL2 show base preferences consistent with the TALE code. Additionally, quantifying the protein-DNA interactions with the off-target probe by MST brought a very interesting result to light: the Bat1-MOrTL1 chimera bound the off-target probe with an affinity 19 times lower than that of the on-target interaction, while for Bat1-MOrTL2 the affinity was only two times lower, so the two differ in discriminating power. The higher discriminatory power of MOrTL1 repeats thus makes them better suited for integration into TALE-like repeat arrays for biotechnological applications. A very nice piece of information.
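One way to put the 19-fold and 2-fold differences on a common scale is to convert each ratio into a free-energy difference, ΔΔG = RT ln(ratio). The sketch below does only that arithmetic; the temperature of 25 °C and the function name are assumptions for illustration and do not come from the study.

```python
import math

GAS_CONSTANT = 8.314462618  # J / (mol K)


def discrimination_energy_kj(fold_difference, temperature_kelvin=298.15):
    """Free-energy gap (kJ/mol) between off-target and on-target binding.

    fold_difference is K_D(off-target) / K_D(on-target); the default
    temperature of 25 degrees C is an assumption, not a value from the study.
    """
    return GAS_CONSTANT * temperature_kelvin * math.log(fold_difference) / 1000.0


for name, fold in (("Bat1-MOrTL1", 19.0), ("Bat1-MOrTL2", 2.0)):
    print(name, round(discrimination_energy_kj(fold), 2), "kJ/mol")
```

Under these assumptions the gap comes out at roughly 7 kJ/mol for the 19-fold case and under 2 kJ/mol for the 2-fold case.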

Could protein stability account for the differences observed?

Functional differences among proteins could be due to stability issues. Protein stability can be analyzed by inducing thermal denaturation while monitoring the fluorescence of SYPRO Orange, a dye that binds non-specifically to hydrophobic surfaces. When the protein unfolds, the newly exposed hydrophobic surfaces bind the dye, resulting in an increase in fluorescence. This method is called DSF (Differential Scanning Fluorimetry), also known as Thermofluor. But as the band ‘Outkast’ asks, “what’s cooler than being cool?” Well, here the answer is not “Ice Cold”, it is "nanoDSF". This technology does the same as DSF but takes advantage of the intrinsic fluorescence of tryptophan, so no dye is needed. Using this pretty cool technology, it was shown that the functional differences between the MOrTL1 and MOrTL2 chimeras are not due to differences in protein stability.
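A melting curve like the ones described above is usually summarized by a single number, the melting temperature (Tm). A common way to estimate it is to find where the fluorescence rises most steeply. The sketch below does this on a synthetic curve; the data, the function name and the simple two-state shape are placeholders for illustration, not measurements from the study.

```python
import math


def melting_temperature(temps, fluorescence):
    """Estimate Tm as the midpoint of the interval with the steepest signal increase."""
    best_slope = float("-inf")
    tm = None
    for i in range(len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            tm = (temps[i] + temps[i + 1]) / 2.0
    return tm


# Synthetic two-state unfolding curve (placeholder data), sampled every 0.5 degrees.
SIMULATED_TM = 55.25
temps = [25.0 + 0.5 * i for i in range(141)]  # 25 to 95 degrees C
signal = [1.0 / (1.0 + math.exp((SIMULATED_TM - t) / 2.0)) for t in temps]

print(melting_temperature(temps, signal))
```

Two proteins with similar Tm values estimated this way would be considered similarly stable, which is the kind of comparison made between the two chimeras.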

By demonstrating that MOrTL repeats mediate DNA binding behavior analogous to that of other TALE-like repeats, scientists have gained insights into the nature of the whole TALE-like family… hopefully these amazing secrets discovered in the ocean will enable further research into the distribution and functions of these fascinating DNA binding proteins.

11. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats. Nucl. Acids Res. first published online October 19, 2015 doi:10.1093/nar/gkv1053

“Men adjust their walking speed to match their romantic (female) partner's pace — a phenomenon not seen when guys walk with female friends”¹.

What could this possibly have to do with electrophoretic mobility? Well, think about it. That same effect is pretty much what you see in a classic EMSA (Electrophoretic Mobility Shift Assay), where a rather physical, non-loving kind of interaction between nucleic acids and proteins is observed.

Well known is the relationship between electrophoretic mobility and size: bigger molecules have slower mobility than small ones. In the case of men, much of what determines walking speed is height: the longer your legs are, the faster you're likely to walk — a fact that means men, on average, have a higher optimal speed than women do.

But the interesting thing is that researchers discovered that when a lovey-dovey couple walked together, the man slowed his pace to match his partner’s optimal speed. In the same way, when nucleic acids interact with proteins, they slow down in an electrophoretic run compared to unbound nucleic acids. So you can say that we ladies are to proteins as men are to nucleic acids!

In 1981, while the world’s eyes were on Lady Di marrying Prince Charles and Olivia Newton-John’s hit ‘Physical’ was everywhere, a great technique to measure DNA-protein interactions, named EMSA, was published by two independent groups.

Research on protein-DNA interactions began in the early 1960s, with studies of the binding of the Lac and phage λ repressors to DNA. Back then, these complexes could be analyzed by a technique that arose from the discovery that certain membrane filters retain DNA-protein complexes, but not free DNA². So, by quantifying the retention of radiolabeled DNA fragments mixed with varying amounts of a protein of interest, it became possible to determine the stoichiometry and binding affinity of a protein for a given sequence. However, filter binding remained impractical for the characterization of less stable complexes and of complexes other than DNA-protein ones.

At the very beginning of the 1980s, Arnold Revzin and Mark Garner, at Michigan State University, knew of a study that showed that the ternary transcription elongation complex—DNA bound to RNA polymerase with a nascent RNA chain—was sufficiently stable for visualization by gel electrophoresis³. Combining purified protein with DNA restriction fragments containing appropriate binding sites and then running the mixture on a polyacrylamide gel, Revzin and Garner observed an amazing result: protein-DNA complexes forming distinctly ‘shifted’ higher molecular weight bands on the gels. Thus was born the electrophoretic mobility shift assay (EMSA)⁴.

But they were not the only ones working on it. Michael Fried and Donald Crothers at Yale had also developed their own version of EMSA. Initially, Fried had speculated that only free DNA would be amenable to electrophoresis, and that DNA-protein binding could be quantified by determining how much DNA did not enter the gel, a very interesting thought by the way. But what they saw instead was a variety of shifted bands that appeared to correlate with the number of repressor molecules bound to each DNA fragment. (That’s when Crothers said to Fried “forget what you’re doing, follow this up!”). Their paper also offered some important extensions of Garner and Revzin’s assay, using radioactive labeling rather than ethidium bromide staining to detect shifted bands, and demonstrating the capabilities of EMSA as a means of measuring the relative binding constants and stoichiometry of protein-DNA interactions⁵.

Despite its popularity and breadth of application, EMSA is typically limited to semiquantitative interaction analysis. Nowadays, MicroScale Thermophoresis (MST) has emerged as a solution-based method with high sensitivity that provides reliable quantitative information on molecular interactions, such as those between proteins and nucleic acids. It is based on a simple protocol, which makes measurements fast and efficient with low sample consumption. The technique relies on binding-induced changes in thermophoretic mobility, which depends on several molecular properties, including not only size but also charge and solvation entropy⁶.

Science and lab techniques evolve, and we can go from electrophoresis to MicroScale Thermophoresis, but the parallels between human behaviour and molecules will continue to impress me.

1. Wagnild J. and Wall-Scheffler CM (2013). PLoS One 8(10): e76576.

2. Jones, G.W. and Berg, P. (1966). J. Mol. Biol. 22, 199–209.

3. Chelm, B.K. and Geiduschek, E.P. (1979). Nucleic Acids Res. 7, 1851–1867.

4. Garner, M.M. and Revzin, A. (1981). Nucleic Acids Res. 9, 3047–3060.

5. Fried, M. and Crothers, D.M. (1981). Nucleic Acids Res. 9, 6505–6525.

6. Seidel SAI, Dijkman PM, Lea WA, et al. (2013). Methods. 59(3): 301-315.

Monday, November 9, 2015

The Ocean holds secrets about genome editing

Thursday, August 27, 2015

Parallels between men in love and electrophoretic mobility
