Total Information Awareness

Figure: Diagram of the Total Information Awareness system from the official (decommissioned) Information Awareness Office website

Figure: Presentation slide produced by DARPA describing TIA

Total Information Awareness (TIA) was a mass detection program by the United States Information Awareness Office. It operated under this title from February to May 2003 before being renamed Terrorism Information Awareness.

Based on the concept of predictive policing, TIA was meant to correlate detailed information about people in order to anticipate and prevent terrorist incidents before execution. The program modeled specific information sets in the hunt for terrorists around the globe. Admiral John Poindexter called it a "Manhattan Project for counter-terrorism". According to Senator Ron Wyden, TIA was the "biggest surveillance program in the history of the United States".

Congress defunded the Information Awareness Office in late 2003 after media reports criticized the government for attempting to establish "Total Information Awareness" over all citizens.

Although the program was formally suspended, other government agencies later adopted some of its software with only superficial changes. TIA's core architecture continued development under the code name "Basketball". According to a 2012 New York Times article, TIA's legacy was "quietly thriving" at the National Security Agency (NSA).

Program synopsis

TIA was intended to be a five-year research project by the Defense Advanced Research Projects Agency (DARPA). The goal was to integrate components from previous and new government intelligence and surveillance programs, including Genoa, Genoa II, Genisys, SSNA, EELD, WAE, TIDES, Communicator, HumanID and Bio-Surveillance, with data mining knowledge gleaned from the private sector to create a resource for the intelligence, counterintelligence, and law enforcement communities. These components consisted of information analysis, collaboration, decision-support tools, language translation, data-searching, pattern recognition, and privacy-protection technologies.

TIA research included or planned to include the participation of nine government entities: INSCOM, NSA, DIA, CIA, CIFA, STRATCOM, SOCOM, JFCOM, and JWAC.

Companies contracted to work on TIA included the Science Applications International Corporation.

TIA was meant to produce a counter-terrorism information system that:

Increased information coverage by an order of magnitude and afforded easy scaling
Provided focused warnings within an hour after a triggering event occurred or an evidence threshold was passed
Automatically queued analysts based on partial pattern matches and had patterns that covered 90% of all previously known foreign terrorist attacks
Supported collaboration, analytical reasoning and information sharing, so that analysts could hypothesize, test and propose theories and mitigating strategies, and so that decision-makers could effectively evaluate the impact of policies and courses of action
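The idea of cueing an analyst on a partial pattern match once an evidence threshold is passed can be illustrated with a minimal sketch. Everything in it is hypothetical: the pattern, the indicator weights, the threshold value and the function names are assumptions chosen for illustration, not a description of TIA's actual design.

```python
from dataclasses import dataclass


@dataclass
class Pattern:
    """A hypothetical pattern: named indicators with illustrative weights."""
    name: str
    weights: dict  # indicator name -> weight


def match_score(pattern: Pattern, observed: set) -> float:
    """Fraction of the pattern's total weight covered by observed indicators."""
    total = sum(pattern.weights.values())
    hit = sum(w for ind, w in pattern.weights.items() if ind in observed)
    return hit / total if total else 0.0


def should_alert(pattern: Pattern, observed: set, threshold: float = 0.6) -> bool:
    """Cue an analyst when a partial match passes the evidence threshold."""
    return match_score(pattern, observed) >= threshold


pattern = Pattern("example", {"a": 2.0, "b": 1.0, "c": 1.0})
print(match_score(pattern, {"a", "b"}))  # 0.75
print(should_alert(pattern, {"c"}))      # False
```

A weighted score is only one way to express "partial" matching; the threshold sets how much evidence is needed before an analyst is cued, which trades missed matches against false alarms.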

Components

Genoa

Unlike the other program components, Genoa predated TIA and provided a basis for it. Genoa's primary function was intelligence analysis to assist human analysts.

Genisys

Figure: Graphic describing the goals of the Genisys project

Genisys aimed to develop technologies that would enable "ultra-large, all-source information repositories". Vast amounts of information were to be collected and analyzed, and the database technology available at the time was insufficient for storing and organizing such enormous quantities of data. Researchers therefore developed techniques for virtual data aggregation to support effective analysis across heterogeneous databases, as well as unstructured public data sources such as the World Wide Web. "Effective analysis across heterogeneous databases" means the ability to draw on databases designed to store different types of data, such as a database of criminal records, a phone call database and a foreign intelligence database. The Web is considered an "unstructured public data source" because it is publicly accessible and contains many different types of data (blogs, emails, records of visits to websites, and so on), all of which need to be analyzed and stored efficiently.
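Virtual aggregation across heterogeneous sources can be sketched as a set of adapters that answer the same question at query time, so that data stays in its original store. The example below is a minimal illustration under stated assumptions: the `calls` table, the record layout, and the names `sql_source`, `text_source` and `federated_query` are all hypothetical and are not taken from Genisys.

```python
import sqlite3
from typing import Callable, Dict, Iterator, List

Record = Dict[str, str]
Source = Callable[[str], Iterator[Record]]


def sql_source(conn: sqlite3.Connection) -> Source:
    """Adapter for a relational source (hypothetical 'calls' table)."""
    def query(person: str) -> Iterator[Record]:
        rows = conn.execute(
            "SELECT caller, callee FROM calls WHERE caller = ?", (person,)
        )
        for caller, callee in rows:
            yield {"source": "calls", "person": caller, "detail": f"called {callee}"}
    return query


def text_source(documents: List[str]) -> Source:
    """Adapter for unstructured text: a naive case-insensitive substring search."""
    def query(person: str) -> Iterator[Record]:
        for doc in documents:
            if person.lower() in doc.lower():
                yield {"source": "web", "person": person, "detail": doc}
    return query


def federated_query(sources: List[Source], person: str) -> List[Record]:
    """Ask every source at query time; nothing is copied into a central store."""
    return [record for source in sources for record in source(person)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (caller TEXT, callee TEXT)")
conn.execute("INSERT INTO calls VALUES ('alice', 'bob')")
docs = ["Alice posted a travel blog.", "Unrelated page."]

results = federated_query([sql_source(conn), text_source(docs)], "alice")
for r in results:
    print(r)
```

Each adapter hides the storage format behind a common record shape, which is what lets a single query span a structured database and free text.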

Evidence extraction and link discovery

Figure: Graphic displaying a simulated application of the evidence extraction and link discovery (EELD) project

Evidence extraction and link discovery (EELD) developed technologies and tools for automated discovery, extraction and linking of sparse evidence contained in large amounts of classified and unclassified data sources (such as phone call records from the NSA call database, internet histories, or bank records).

EELD was intended to produce systems able to extract data from multiple sources (e.g., text messages, social networking sites, financial records, and web pages). It was also to develop the ability to detect patterns comprising multiple types of links between data items or communications (e.g., financial transactions, communications, and travel).
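Detecting patterns that combine several types of links can be illustrated by grouping records by entity pair and keeping the pairs connected through more than one link type. This is a minimal sketch: the record format, the entity labels and the `min_types` parameter are assumptions for illustration, not EELD's method.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: (link type, entities involved in the record).
records = [
    ("financial", ("p1", "p2")),
    ("communication", ("p1", "p2")),
    ("travel", ("p1", "p2", "p3")),
    ("communication", ("p3", "p4")),
]


def link_types_by_pair(records):
    """Map each unordered entity pair to the set of link types connecting it."""
    links = defaultdict(set)
    for link_type, entities in records:
        for a, b in combinations(sorted(set(entities)), 2):
            links[(a, b)].add(link_type)
    return links


def multi_link_pairs(records, min_types=2):
    """Return pairs connected by at least `min_types` distinct link types."""
    links = link_types_by_pair(records)
    return {pair: types for pair, types in links.items() if len(types) >= min_types}


found = multi_link_pairs(records)
print(found)
```

Here only the pair `("p1", "p2")` is returned, because it is the only pair linked by more than one type of record.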

Translingual information detection, extraction and summarization

Translingual information detection, extraction and summarization (TIDES) developed advanced language processing technology to enable English speakers to find and interpret critical information in multiple languages without requiring knowledge of those languages.

Outside groups (such as universities, corporations, etc.) were invited to participate in the annual information retrieval, topic detection and tracking, automatic content extraction, and machine translation evaluations run by NIST.

Communicator

Communicator's dialogue interaction software was to interpret the context of a dialogue to improve performance, and to adapt automatically to new topics so that conversation could be natural and efficient. Communicator emphasized task knowledge to compensate for natural language effects and noisy environments. Unlike automated translation of natural language speech, which is much more complex because of an essentially unlimited vocabulary and grammar, Communicator addressed task-specific problems with constrained vocabularies (the system only needed to understand language related to war). Research was also started on foreign-language computer interaction for use in coalition operations.

HumanID

As part of HumanID, one participating university planned to develop a system that recovered static body and stride parameters of subjects as they walked, while also investigating time-normalized joint angle trajectories in the walking plane as a way of recognizing gait. The university also worked on finding and tracking faces by expressions and speech.
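One way to read "time-normalized joint angle trajectories" is that each gait cycle is resampled to a fixed number of points, so that cycles recorded at different walking speeds can be compared sample by sample. The sketch below assumes that reading. The linear interpolation, the root-mean-square comparison and the synthetic sine-shaped "knee angle" data are all illustrative assumptions, not the researchers' method.

```python
import math


def time_normalize(trajectory, n_samples=50):
    """Resample one gait cycle to n_samples points by linear interpolation."""
    if len(trajectory) < 2 or n_samples < 2:
        raise ValueError("need at least two input and output samples")
    last = len(trajectory) - 1
    out = []
    for i in range(n_samples):
        pos = i * last / (n_samples - 1)  # fractional index into the input
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(trajectory[lo] * (1 - frac) + trajectory[hi] * frac)
    return out


def trajectory_distance(a, b, n_samples=50):
    """Root-mean-square difference between two time-normalized trajectories."""
    na, nb = time_normalize(a, n_samples), time_normalize(b, n_samples)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(na, nb)) / n_samples)


# The same synthetic angle cycle captured in 60 frames and in 30 frames,
# plus a cycle with a different amplitude.
slow = [30 * math.sin(math.pi * t / 59) for t in range(60)]
fast = [30 * math.sin(math.pi * t / 29) for t in range(30)]
other = [10 * math.sin(math.pi * t / 29) for t in range(30)]

print(trajectory_distance(slow, fast) < trajectory_distance(slow, other))  # True
```

After normalization the two recordings of the same cycle are nearly identical despite their different lengths, while the cycle with a different amplitude remains clearly separated.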

Carnegie Mellon University's Robotics Institute (part of the School of Computer Science) worked on dynamic face recognition. The research focused primarily on the extraction of body biometric features from video and identifying subjects from those features. To conduct its studies, the university created databases of synchronized multi-camera video sequences of body motion, human faces under a wide range of imaging conditions, AU-coded expression videos, and hyperspectral and polarimetric images of faces. The video sequences of body motion data consisted of six separate viewpoints of 25 subjects walking on a treadmill. Four separate 11-second gaits were tested for each: slow walk, fast walk, inclined, and carrying a ball. Tests included filming 38 male and 6 female subjects of different ethnicities and physical features walking along a T-shaped path from various angles.

The University of Southampton's Department of Electronics and Computer Science was developing an "automatic gait recognition" system and was in charge of compiling a database to test it. The University of Texas at Dallas was compiling a database to test facial recognition systems. The data included a set of nine static pictures taken from different viewpoints, a video of each subject looking around a room, a video of the subject speaking, and one or more videos of the subject showing facial expressions. Colorado State University developed multiple systems for identification via facial recognition. Columbia University participated in implementing HumanID in poor weather.

The scope of surveillance included credit card purchases, magazine subscriptions, web browsing histories, phone records, academic grades, bank deposits, gambling histories, passport applications, airline and railway tickets, driver's licenses, gun licenses, toll records, judicial records, and divorce records. It also extended to fingerprints, gait, face and iris data.

Privacy

TIA's Genisys component, in addition to integrating and organizing separate databases, was to run an internal "privacy protection program". This was intended to restrict analysts' access to irrelevant information on private U.S. citizens, enforce privacy laws and policies, and report misuses of data. There were also plans for TIA to have an application that could "anonymize" data, so that information could be linked to an individual only by court order (especially for medical records gathered by the bio-surveillance project). A set of audit logs was to be kept, which would track whether innocent Americans' communications were getting caught up in relevant data.
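The combination of "anonymized" data and audit logs can be sketched with a keyed hash and an append-only access log. This is an illustration under stated assumptions only: the key, the record fields and the function names are hypothetical, and a keyed hash is one common pseudonymization technique rather than the mechanism TIA planned to use.

```python
import hashlib
import hmac
from datetime import datetime, timezone

# Hypothetical key held by a trusted party rather than by analysts; re-linking
# a pseudonym to a person would require the key holder's cooperation.
SECRET_KEY = b"held-by-a-trusted-party"
audit_log = []


def pseudonymize(identity: str) -> str:
    """Replace an identity with a keyed SHA-256 hash (HMAC)."""
    return hmac.new(SECRET_KEY, identity.encode("utf-8"), hashlib.sha256).hexdigest()


def access_record(analyst: str, record: dict, reason: str) -> dict:
    """Return a pseudonymized copy of the record and log the access."""
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "analyst": analyst,
        "subject": pseudonymize(record["name"]),
        "reason": reason,
    })
    masked = dict(record)
    masked["name"] = pseudonymize(record["name"])
    return masked


view = access_record("analyst-1", {"name": "Jane Doe", "rx": "example"}, "case 42")
print(view["name"][:12], len(audit_log))
```

Because the same identity always maps to the same pseudonym, records about one person can still be linked to each other for analysis, while every access leaves a log entry that can later be reviewed.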

Early developments

TIA was proposed as a program by Rear Admiral John Poindexter shortly after the September 11 attacks in 2001. A former national security adviser to President Ronald Reagan and a key player in the Iran–Contra affair, he was working with Syntek Technologies, a company often contracted by the government for work on defense projects. TIA was officially commissioned during the 2002 fiscal year. In January 2002 Poindexter was appointed Director of the newly created Information Awareness Office division of DARPA, which managed TIA's development. The office temporarily operated out of the fourth floor of DARPA's headquarters while Poindexter looked for a place to permanently house TIA's researchers.

Late that year, the Information Awareness Office awarded the Science Applications International Corporation (SAIC) a $19 million contract to develop the "Information Awareness Prototype System", the core architecture to integrate all of TIA's information extraction, analysis, and dissemination tools. The work was done through SAIC's consulting arm, Hicks & Associates, which employed many former Defense Department and military officials.

Congressional restrictions and termination

On January 24, 2003, the United States Senate voted to limit TIA by restricting its ability to gather information from emails and the commercial databases of health, financial and travel companies. Under the Consolidated Appropriations Resolution, 2003, Pub. L. No. 108-7, Division M, § 111(b), passed in February, the Defense Department was given 90 days to compile a report laying out a schedule of TIA's development and the intended use of allotted funds, or face a cutoff of support.

The report arrived on May 20. It disclosed that the program's computer tools were still in their preliminary testing phase. Concerning the pattern recognition of transaction information, only synthetic data created by researchers was being processed. The report also conceded that a full prototype of TIA would not be ready until the 2007 fiscal year.

At some point in early 2003, the National Security Agency began installing access nodes on TIA's classified network.

On September 30, 2003, Congress officially cut off funding for TIA and the Information Awareness Office, with the Senate voting unanimously against the program, because of its unpopularity with the general public and the media. Senators Ron Wyden and Byron Dorgan led the effort.

After 2003

Reports began to emerge in February 2006 that TIA's components had been transferred to the authority of the NSA. A classified annex to the Department of Defense appropriations bill for the 2004 fiscal year provided the funding. It stipulated that the technologies were limited to military or foreign intelligence purposes against non-U.S. citizens. Most of the original project goals and research findings were preserved, but the privacy protection mechanisms were abandoned.

Criticism

Critics said that the program could be abused by government authorities as part of their practice of mass surveillance in the United States. In an op-ed for The New York Times, William Safire called it "the supersnoop's dream: a Total Information Awareness about every U.S. citizen".

Hans Mark, a former director of defense research and engineering then at the University of Texas, called it a "dishonest misuse of DARPA".

The American Civil Liberties Union launched a campaign to terminate TIA's implementation, saying that it would "kill privacy in America" because "every aspect of our lives would be catalogued". The San Francisco Chronicle criticized the program for "Fighting terror by terrifying U.S. citizens".

In 2013, then Director of National Intelligence James Clapper lied about the mass collection of data on U.S. citizens and others. Edward Snowden said that Clapper's lie made him lose hope of changing things through formal channels.