Figure 1. TATA box structural elements. The TATA box consensus sequence is TATAWAW, where W is either A or T.
In molecular biology, the TATA box (also called the Goldberg–Hogness box) is a sequence of DNA found in the core promoter region of genes in archaea and eukaryotes. The bacterial homolog of the TATA box is called the Pribnow box, which has a shorter consensus sequence.
The TATA box is considered a non-coding DNA sequence (also known as a cis-regulatory element). It was termed the "TATA box" as it contains a consensus sequence characterized by repeating T and A base pairs. When consensus nucleotides and alternative ones were compared, homologous regions were "boxed" by the researchers.
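As a rough illustration of what matching the TATAWAW consensus means, the following minimal sketch scans a DNA string for that pattern. The function name `find_tata_boxes` and the example input are assumptions made for illustration; real promoter analysis usually relies on position weight matrices rather than exact pattern matching.

```python
import re

# IUPAC code W stands for A or T, so TATAWAW becomes TATA[AT]A[AT].
# The lookahead lets overlapping matches be reported as well.
TATA_CONSENSUS = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_boxes(sequence):
    """Return (0-based start, matched text) for every consensus match.

    Hypothetical helper for illustration; it only checks the literal
    TATAWAW pattern on the given strand.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in TATA_CONSENSUS.finditer(seq)]


# Illustrative input: a short adenovirus promoter fragment.
example = "CGCTATAAAAGGGC"
print(find_tata_boxes(example))
```

A sequence without the pattern yields an empty list, so the result can be used directly as a presence test.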
History
Discovery
The TATA box was the first eukaryotic core promoter motif to be identified, in 1978, by American biochemist David Hogness. The TATA sequence was first discovered during analysis of 5' DNA promoter sequences in Drosophila, mammalian, and viral genes. In Drosophila, fewer than 40% of 205 core promoters contain a TATA box.
Features
Location
Promoter sequences vary between bacteria and eukaryotes. In eukaryotes, the TATA box is located 25 base pairs upstream of the start site that Rpb4/Rpb7 use to initiate transcription. In mammals, the TATA box is located 30 base pairs upstream of the transcription start site. When promoters use the SAGA/TATA box complex to recruit RNA polymerase II, they are more highly regulated and display higher expression levels than promoters using the TFIID/TBP mode of recruitment.
In yeast, one study found that various Saccharomyces genomes had the consensus sequence 5'-TATA(A/T)A(A/T)(A/G)-3', yet only about 20% of yeast genes contained the TATA sequence. Similarly, in humans only 24% of genes have promoter regions containing the TATA box. Genes containing the TATA box tend to be involved in stress responses and certain types of metabolism, and are more highly regulated than TATA-less genes. Because of their highly regulated nature, TATA-containing genes are generally not involved in essential cellular functions such as cell growth, DNA replication, transcription, and translation.
TBP binds the minor groove of the TATA box via a region of antiparallel β sheets in the protein.
Four hydrogen bonds form between polar side chains on TBP amino acids (Asn27, Asn117, Thr82, Thr173) and bases in the minor groove. These secondary interactions induce bending of the DNA and helical unwinding. The degree of DNA bending is species and sequence dependent. For example, one study used the adenovirus TATA promoter sequence (5'-CGCTATAAAAGGGC-3') as a model binding sequence and found that human TBP binding to the TATA box induced a 97° bend toward the major groove, while yeast TBP induced only an 82° bend. X-ray crystallography studies of TBP/TATA-box complexes generally agree that the DNA goes through an ~80° bend during TBP binding.
TFIIB then binds to the TFIID-TFIIA-DNA complex through interactions both upstream and downstream of the TATA box. RNA polymerase II is then recruited to this multi-protein complex with the help of TFIIF. Interaction of TATA boxes with a variety of activators or repressors can influence the transcription of genes in many ways. Enhancers are long-range regulatory elements that increase promoter activity, while silencers repress promoter activity.
Mutations
Figure 3. Effects on TBP binding to the TATA box from mutations. Wildtype shows transcription done normally. An insertion or deletion shifts the TATA box recognition site which results in a shifted transcription site. Point mutations risk the TBP being unable to bind for initiation.
Mutations to the TATA box can range from a deletion or insertion to a point mutation with varying effects based on the gene that has been mutated. The mutations change the binding of the TATA-binding protein (TBP) for transcription initiation. Thus, there is a resulting change in phenotype based on the gene that is not being expressed (Figure 3).
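The difference between these mutation classes can be sketched at the sequence level. The example below is a simplified illustration, not a model of TBP binding: it assumes the TATAWAW consensus, treats an exact pattern match as a stand-in for a recognizable TATA box, and uses hypothetical helper names (`tata_position`, `classify_mutation`) and a short illustrative sequence. In reality, TBP affinity changes gradually rather than switching on and off.

```python
import re

# TATAWAW consensus, where W is A or T.
CONSENSUS = re.compile(r"TATA[AT]A[AT]")


def tata_position(seq):
    """Return the 0-based start of the first consensus match, or None."""
    match = CONSENSUS.search(seq.upper())
    return match.start() if match else None


def classify_mutation(wild_type, mutant):
    """Compare the consensus match position in wild-type and mutant DNA.

    Hypothetical helper: an exact pattern match stands in for a
    recognizable TATA box.
    """
    wt_pos = tata_position(wild_type)
    mut_pos = tata_position(mutant)
    if wt_pos is None:
        return "no TATA box in wild type"
    if mut_pos is None:
        return "consensus lost"
    if mut_pos != wt_pos:
        return f"shifted by {mut_pos - wt_pos:+d} bp"
    return "unchanged"


wild_type = "CGCTATAAAAGGGC"          # illustrative promoter fragment
variants = {
    "insertion": "GG" + wild_type,    # two bases inserted upstream
    "deletion": wild_type[1:],        # one base deleted upstream
    "point": wild_type.replace("TATAAAA", "TATACAA"),
}

for name, seq in variants.items():
    print(name, "->", classify_mutation(wild_type, seq))
```

In this sketch the insertion and deletion leave the motif intact but move it relative to the rest of the sequence, while the point substitution removes the consensus match altogether.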
Insertions or deletions
One of the first studies of TATA box mutations looked at a sequence of DNA from Agrobacterium tumefaciens for the octopine-type cytokinin gene. A duplication of the TATA box leads to a significant decrease in enzymatic activity in the scutellum and roots, leaving pollen enzymatic levels unaffected. A deletion of the TATA box leads to a small decrease in enzymatic activity in the scutellum and roots, but a large decrease in enzymatic levels in pollen. In HeLa cells, a change from TATAAAA to TATACAA leads to a 20-fold decrease in transcription. Diseases that can result from such insufficient transcription of specific genes include thalassemia, lung cancer, chronic hemolytic anemia, immunosuppression, hemophilia B Leyden, thrombophlebitis, and myocardial infarction.
Savinkova et al. have written a simulation to predict the K<sub>D</sub> value for a selected TATA box sequence and TBP. This can be used to directly predict the phenotypic traits resulting from a selected mutation based on how tightly TBP binds to the TATA box.
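To show why a dissociation constant is informative about binding, the sketch below applies the standard equilibrium relation for simple one-to-one binding, in which the fraction of sites occupied equals [TBP] / ([TBP] + K_D). This is not the Savinkova et al. simulation; the function name `fraction_bound` and all concentrations are hypothetical values chosen only for illustration.

```python
def fraction_bound(tbp_conc, k_d):
    """Equilibrium fraction of TATA sites occupied by TBP.

    Assumes simple 1:1 binding and that tbp_conc is the free TBP
    concentration, in the same units as k_d.
    """
    if tbp_conc < 0 or k_d <= 0:
        raise ValueError("tbp_conc must be >= 0 and k_d must be > 0")
    return tbp_conc / (tbp_conc + k_d)


free_tbp = 10.0  # hypothetical free TBP concentration (nM)

# Hypothetical K_D values (nM): a larger K_D means weaker binding.
for label, k_d in [("reference site", 10.0), ("weaker site", 100.0)]:
    occupancy = fraction_bound(free_tbp, k_d)
    print(f"{label}: K_D = {k_d} nM, fraction bound = {occupancy:.2f}")
```

Under these assumptions, a tenfold increase in K_D at a fixed TBP concentration lowers the predicted occupancy, which is the kind of relationship a K<sub>D</sub> prediction makes it possible to reason about.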
Diseases
Mutations in the TATA box region affect the binding of the TATA-binding protein (TBP) for transcription initiation, which may cause carriers to have a disease phenotype.
Gastric cancer is correlated with TATA box polymorphism. The TATA box has a binding site for the transcription factor of the PG2 gene. This gene produces PG2 serum, which is used as a biomarker for tumours in gastric cancer. Longer TATA box sequences correlate with higher levels of PG2 serum, indicating gastric cancer conditions. Carriers with shorter TATA box sequences may produce lower levels of PG2 serum.
Several neurodegenerative disorders are associated with TATA box mutations. Two disorders have been highlighted: spinocerebellar ataxia and Huntington's disease. In spinocerebellar ataxia, the disease phenotype is caused by expansion of the polyglutamine repeat in the TATA-binding protein (TBP). Polyglutamine-expanded TBP accumulates, as shown by protein aggregates in brain sections of patients, resulting in a loss of neuronal cells.
Blindness can be caused by excessive cataract formation when the TATA box is targeted by microRNAs to increase the level of oxidative stress genes. MicroRNAs can target the 3'-untranslated region and bind to the TATA box to activate the transcription of oxidative stress related genes.
SNPs in TATA boxes are associated with β-thalassemia, immunosuppression, and other neurological disorders. SNPs destabilize the TBP/TATA complex, which significantly decreases the rate at which TATA-binding proteins (TBP) bind to the TATA box. This leads to lower levels of transcription, affecting the severity of the disease. So far, studies have shown the interaction only in vitro, but the results may be comparable in vivo.
Gilbert's syndrome is correlated with UGT1A1 TATA box polymorphism. This poses a risk for developing jaundice in newborns.
MicroRNAs also play a role in replicating viruses such as HIV-1. Novel HIV-1-encoded microRNAs have been found to enhance the production of the virus as well as to activate HIV-1 latency by targeting the TATA box region.
Clinical significance
Technology
Many of the studies so far have been performed in vitro, providing only a prediction of what may happen rather than a real-time representation of what is happening in cells. Studies in 2016 demonstrated TATA-binding activity in vivo. Core promoter-specific mechanisms for transcription initiation by the canonical TBP/TFIID-dependent basal transcription machinery have been documented in vivo, showing activation by the SRF-dependent upstream activating sequence (UAS) of the human ACTB gene involved in TATA-binding.
Cancer therapy
Pharmaceutical companies have long designed cancer therapy drugs that target DNA by traditional methods, and these drugs have proven successful. However, their toxicity has pushed scientists to explore other DNA-related processes that could be targeted instead. In recent years, a collective effort has been made to find cancer-specific molecular targets, such as protein-DNA complexes, which include the TATA binding motif. Compounds that trap the protein-DNA intermediate could render it toxic to the cell once it encounters a DNA processing event. Examples of drugs that contain such compounds include topotecan and SN-38 (topoisomerase I), and doxorubicin and mitoxantrone (topoisomerase II).
Compared to other members of the same species, Malus baccata var. xiaojinensis has a TATA box inserted in the promoter of the iron-regulated transporter 1 (IRT1) gene. As a result, promoter activity is enhanced, increasing TFIID activity and subsequently transcription initiation, resulting in a more iron-efficient phenotype. With genetic engineering, a similar modification can be made to other plants, such as the model species tobacco and Arabidopsis thaliana.
