It’s been said that the ocean holds many secrets, of lovers and murders, of
treasures and untouched ships, and it makes me wonder what other secrets it
could possibly hold. We biochemists have the ocean to thank for such special
gifts as our beloved GFP (Green Fluorescent Protein), but what other cool
proteins can come to us from the ocean?
Everyone has a favorite protein, what’s yours?
I was once asked by a biochemistry teacher:
“everybody has a favorite protein, what’s yours?” I think I was way too young
to have an answer back then. It’s been over a decade since I graduated and
right now I don’t really think I have one favorite protein, but I have a group
of favorites and among them are the TALEs.
That’s why I think they deserve a post dedicated just to them.
Is it possible to edit the genome?
According to the great Wikipedia, editing is the process of selecting and
preparing written, visual and audible media used to convey information. But
editing can be applied in other aspects of life as well, even at the gene
level. Genome editing is a type of genetic engineering in which DNA is
inserted, replaced or removed from a genome using artificial nucleases, or
“molecular scissors”. In other words, it is about rewriting the DNA code.
Genetic engineering emerged in the lab of Paul Berg in 1972 in the form of
recombinant DNA technology, when scientists combined the E. coli genome with
the genes of a bacteriophage and the SV40 virus. Since then, the field has
achieved tremendous success: many methods to manipulate DNA have been
developed, as well as vector systems and methods for delivering them to the
cell.
From 1990 to 2003, the genomes of human, E. coli, mouse and other organisms
were sequenced, which yielded information about nucleotide sequence only. In
2003, the U.S. National Human Genome Research Institute launched a new
international project, ENCODE (Encyclopedia Of DNA Elements), whose aim was to
obtain a complete list of the functional elements of the human genome.
Unfortunately, the methods used back then were not only expensive but also
quite labor-intensive, and they did not allow one to introduce precise changes
into a strictly defined genome locus. Today, researchers have several tools
that allow precise editing of plant, animal and human genomes.
A tale about TALEs
Bacteria of the genus Xanthomonas are pathogens of crop plants such as rice,
pepper and tomato, and they cause significant economic damage to agriculture.
The cool thing about these bacteria is that they inject proteins into the host
plant, where the proteins mimic eukaryotic transcription factors and so hijack
the host’s transcriptional machinery to control gene expression. These
proteins are known as TALEs, for Transcription Activator-Like Effectors. But
why would TALEs do that? Well, to increase the plant’s susceptibility to the
pathogen, and they do it pretty much like a virus.
One of the most interesting things about TALEs is what I call their “outfit”.
It turns out that TALEs bind to DNA in a specific manner, and this specificity
comes from the DNA-binding domain “outfit”, which consists of tandem arrays of
amino acid repeats, with each repeat binding a single DNA base. Moreover, the
residues at positions 12 and 13 of each repeat are highly variable; they are
called the BSRs (base specifying residues) and are responsible for recognizing
a specific nucleotide. This TALE code is degenerate, in the sense that some
BSRs can bind several nucleotides with different efficiencies.
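To make the idea of the code a bit more concrete, here is a minimal Python sketch that reads the BSRs out of a repeat array and looks up the base each one prefers. Keep in mind that the BSR-to-base table is the commonly cited TALE code rather than something spelled out in this post, the repeat scaffold is only an illustrative 34-residue sequence, and the function names are made up for the example.

```python
# Commonly cited BSR -> base preferences (an assumption, not from this post).
# A degenerate BSR lists more than one base.
TALE_CODE = {
    "HD": "C",
    "NI": "A",
    "NG": "T",
    "NN": "GA",  # degenerate: recognizes G and also A
}


def bsr_of(repeat: str) -> str:
    """Return the residues at positions 12 and 13 (1-based) of one repeat."""
    if len(repeat) < 13:
        raise ValueError("repeat is too short to contain positions 12 and 13")
    return repeat[11:13]


def predict_bases(repeats: list[str]) -> list[str]:
    """Look up the preferred base(s) for each repeat, in array order."""
    return [TALE_CODE[bsr_of(repeat)] for repeat in repeats]


def make_repeat(bsr: str) -> str:
    """Build an illustrative repeat around a BSR (hypothetical scaffold)."""
    return "LTPEQVVAIAS" + bsr + "GGKQALETVQRLLPVLCQAHG"


repeats = [make_repeat(bsr) for bsr in ["HD", "NI", "NG", "NN"]]
print(predict_bases(repeats))
```

For this four-repeat toy array the lookup gives C, A, T and then the degenerate G/A entry, which is exactly the kind of ambiguity the word “degenerate” refers to.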
After TALEs were described, two other plant-disease-associated bacteria were
found to encode TALE-like proteins as well: RipTALs from Ralstonia
solanacearum and Bats from the endofungal bacterium Burkholderia rhizoxinica.
TALE-likes seem to be united only by the possession of DNA-binding repeats
with a conserved code; in fact, Bats lack the domains necessary to function as
eukaryotic transcription factors.
What can TALEs be used for?
So let’s imagine: we have a cool protein, wearing a nice tandem repeat outfit,
which holds a special code that allows it to bind DNA in a specific manner.
So, what if we knew the code? Well, then one could predict the DNA binding
element for any given TALE and design TALEs to match any DNA sequence of
interest. Now THAT is interesting. But it gets even better! Genetic
engineering is not called that for nothing, so the idea is to make these
proteins even cooler. Now, what could be done? Specific DNA binding is one
thing, but function is another. So a great idea is to couple TALEs to a
functional domain of choice; those chimeras are invaluable tools for precise
manipulation of the genome.
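Designing a TALE for a sequence of interest is, at its simplest, the code run in reverse: pick one BSR per target base. The sketch below assumes one commonly used BSR per base (my choice for illustration, not something stated in this post) and a hypothetical function name. It says nothing about the functional domain or the rest of the protein.

```python
# One commonly used BSR per base (an assumption, not from this post).
PREFERRED_BSR = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def design_bsrs(target: str) -> list[str]:
    """Return one BSR per base of the target DNA sequence, in order."""
    bsrs = []
    for position, base in enumerate(target.upper(), start=1):
        if base not in PREFERRED_BSR:
            raise ValueError(f"unsupported base {base!r} at position {position}")
        bsrs.append(PREFERRED_BSR[base])
    return bsrs


print(design_bsrs("TCGA"))
```

Because the code is degenerate, a real design would probably also weigh how well each BSR discriminates against the other bases, which this one-to-one lookup ignores.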
But we can go further: non-BSR polymorphisms might also be useful for tuning
DNA-binding properties and further expanding the diversity of TALE-DNA
interactions. One could then create libraries of designed TALEs with a range of
binding strengths for the same DNA element, useful for regulating synthetic
genetic circuits.
The birth of a new pair of genetic scissors: TALENs
After the code of DNA recognition by TALE proteins was deciphered, it
attracted the attention of researchers across the world because of its
simplicity (one monomer per nucleotide), and the first chimeras of TALEs and
nucleases were created: the sequence encoding the DNA-binding domain of a TALE
was inserted into a plasmid vector previously used for creating Zinc Finger
Nucleases. This resulted in genetic constructs expressing artificial chimeric
nucleases (TALENs). This was so amazing that in 2011 Nature Methods named the
methods of precise genome editing, including the TALEN system, Method of the
Year.
How to ‘pimp’ your TALE
One approach to TALE repeat engineering is random mutagenesis and screening.
Alternatively, mutations could be introduced in a more targeted fashion, but
this requires information on the impact of different types of polymorphisms at
different positions in the TALE repeat. So, where do we get information on
which sites would be good targets? In Mother Nature. Natural variation would
provide useful information on which residues can or cannot be tolerated at
which positions, and with what effect. Unfortunately, TALE sequence diversity
is very low, and residues clustered around the BSRs are largely invariant
across all currently known TALEs, RipTALs and Bats. The good news is that
RipTALs and Bats are only about 40% identical to TALEs, making them useful for
repeat engineering.
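A figure like “40% identical” is simply the share of aligned positions that carry the same residue. Here is a minimal Python sketch of that arithmetic, assuming the two sequences are already aligned to the same length; the function name and the two short fragments are made up for illustration and are not real repeat sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions carrying the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Two made-up, pre-aligned fragments (hypothetical, for illustration only).
print(percent_identity("ACDEFGHIKL", "ACDEYWHVRM"))
```

The two toy fragments agree at five of ten positions, so the function reports 50.0. Real comparisons would start from a proper alignment and would need a rule for handling gaps.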
TALEs from the Ocean
The ocean is quite an interesting place to do research. We often look for
answers in terrestrial organisms, or use them as models, but the ocean holds
many secrets.
In the Gulf of Mexico/Yucatan Channel, during the Global Ocean Sampling (GOS)
expedition, biological samples were taken and used for DNA sequencing. One
particular sample was analyzed; judging by the size filtering applied to the
biological material, it was likely of bacterial origin. During sequence
analysis, two predicted proteins came to light: MOrTL1 and MOrTL2 (Marine
Organism TALE-likes, names that reflect the limited information scientists had
regarding their provenance).
These sequences had previously been suggested to encode modular DNA-binding
repeats, but no functional analysis had been reported. And guess what, both
proteins are tandem repeat arrays… does that ring a bell?
The MOrTL repeats differ at more than 60% of positions from each other and
from all other TALE-likes. Although the MOrTL sequences are incomplete and
likely to be fragments of larger, incompletely sequenced genes, they were
synthesized, and MOrTL1 was further expressed in and purified from E. coli. As
analyzed by electrophoretic mobility shift assay (EMSA), MOrTL1 showed weak
DNA binding, unlike other TALE-likes. This likely reflects the incomplete
sequence of MOrTL1, which yields a protein that is not fully functional.
Then, how do you study a predicted protein, based on an incomplete sequence,
if it is not fully functional even when expressed?
Well… CHIMERAS!
MOrTL repeats were embedded within the repeat domain of a Bat named Bat1. Now
they were ready to be studied.
Can Bat1-MOrTL chimeras bind to DNA?
Yes they can! This was confirmed by mixing Bat1-MOrTL chimeras with their
cognate DNA binding elements, as predicted by the TALE code, and analyzing the
mixtures by two approaches: a classical, qualitative EMSA and quantitative
MicroScale Thermophoresis (MST), which measures the affinity of an
intermolecular interaction and yields its binding constant (KD) with high
precision. In fact, Bat1-MOrTL chimeras bind with similar strength to the
wild-type Bat1 protein.
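For a feel of what a KD tells you, here is a small Python sketch of the simplest possible binding model: one protein per DNA molecule, with the protein in large excess over the labeled DNA, so that the fraction of DNA bound is [P] / (KD + [P]). This simplification is my assumption and not necessarily the fit used in the paper, and the KD value and function name are made up for the example.

```python
def fraction_bound(protein_nM: float, kd_nM: float) -> float:
    """Fraction of DNA bound in a simple 1:1 model with protein in excess."""
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return protein_nM / (kd_nM + protein_nM)


# Hypothetical KD of 100 nM, chosen only to illustrate the shape of the curve.
for protein_nM in (10, 100, 1000):
    print(protein_nM, round(fraction_bound(protein_nM, 100), 3))
```

The handy thing to remember is that when the protein concentration equals the KD, half of the DNA is bound; a lower KD therefore means tighter binding.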
Is this binding specific?
When you prove that a TALE-like protein binds
to its predicted on-target sequence, does it prove adherence to the TALE code?
NOT NECESSARILY!
How to prove it then? Well, specificity needs to be tested, and by this I mean
measuring binding to the DNA sequence that the TALE code predicts to be the
worst match. How can this experiment be done? With competition experiments.
When probe-protein interactions (DNA binding element-MOrTL interactions) were
analyzed with on- and off-target competitors, it was confirmed that both
MOrTL1 and MOrTL2 show base preferences consistent with the TALE code.
Additionally, when the protein-DNA interactions with the off-target probe were
quantified by MST, a very interesting result came to light: the Bat1-MOrTL1
chimera bound the off-target probe with an affinity 19 times lower than the
on-target one, while for Bat1-MOrTL2 the affinity was only two times lower.
The two thus differ in discriminating power.
The higher discriminatory power of MOrTL1 repeats makes them better suited for
integration into TALE-like repeat arrays for biotechnological applications. A
very nice piece of information.
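The fold differences can also be turned into an energy, which is another way of seeing how far apart the two chimeras are. The sketch below assumes that “an affinity N times lower” means an N times higher KD, uses the standard relation ΔΔG = RT ln(N), and picks 25 °C as the temperature; both the temperature and the function name are my assumptions.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_kcal_per_mol(fold_weaker: float, temperature_k: float = 298.15) -> float:
    """Free energy penalty for binding that is fold_weaker times weaker."""
    if fold_weaker <= 0:
        raise ValueError("fold difference must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold_weaker)


# Fold differences taken from the text: 19x for MOrTL1, 2x for MOrTL2.
for name, fold in (("Bat1-MOrTL1", 19), ("Bat1-MOrTL2", 2)):
    print(name, round(ddg_kcal_per_mol(fold), 2), "kcal/mol")
```

Under these assumptions the off-target penalty comes out at roughly 1.7 kcal/mol for Bat1-MOrTL1 against roughly 0.4 kcal/mol for Bat1-MOrTL2.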
Could protein stability account for the differences observed?
Functional differences among proteins could be due to stability issues.
Protein stability can be analyzed by inducing thermal denaturation and
monitoring the fluorescence of SYPRO Orange, which binds non-specifically to
hydrophobic surfaces. When the protein unfolds, the exposed hydrophobic
surfaces bind the dye, resulting in an increase in fluorescence. This method
is called DSF (Differential Scanning Fluorimetry), also known as Thermofluor.
But as the band Outkast asks, “what’s cooler than being cool?” Well, the
answer is not “ice cold”, it is “nanoDSF”. This technology does the same as
DSF but takes advantage of the intrinsic fluorescence of tryptophan instead of
a dye. Using this pretty cool technology, it was shown that the functional
differences between the MOrTL1 and MOrTL2 chimeras are not due to differences
in protein stability.
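Stability in this kind of experiment is usually summarized as a melting temperature (Tm), and one common way to estimate it is to find the temperature at which the fluorescence rises fastest. The Python sketch below does that on a synthetic, idealized curve. The sigmoid shape, the 60 °C value and the function names are all assumptions for illustration; this is not the instrument software or the analysis used in the paper.

```python
import math


def simulate_melt_curve(tm: float, temps: list[float], slope: float = 1.5) -> list[float]:
    """Idealized two-state unfolding signal (synthetic data, not measured)."""
    return [1.0 / (1.0 + math.exp((tm - t) / slope)) for t in temps]


def estimate_tm(temps: list[float], signal: list[float]) -> float:
    """Return the temperature where the central-difference slope is largest."""
    best_index, best_slope = 1, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_index, best_slope = i, slope
    return temps[best_index]


temps = [25.0 + 0.5 * i for i in range(141)]  # 25 to 95 degrees C in 0.5 steps
curve = simulate_melt_curve(60.0, temps)      # hypothetical Tm of 60 degrees C
print(estimate_tm(temps, curve))
```

Two proteins with similar Tm values estimated this way would be considered similarly stable, which is the kind of comparison behind the conclusion above. Real traces are noisy, so smoothing or curve fitting is usually needed before taking the derivative.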
By demonstrating that MOrTL repeats mediate DNA binding behavior analogous to
that of other TALE-like repeats, scientists have gained insights into the
nature of the whole TALE-like family… hopefully these amazing secrets
discovered in the ocean will enable further research into the distribution and
functions of these fascinating DNA-binding proteins.
1. DNA-binding proteins from marine bacteria expand the known sequence
diversity of TALE-like repeats. Nucl. Acids Res., first published online
October 19, 2015. doi:10.1093/nar/gkv1053