A multinational security firm has secretly developed software capable of tracking people's movements and predicting future behaviour by mining data from social networking websites.
A video obtained by the Guardian reveals how an "extreme-scale analytics" system created by Raytheon, the world's fifth largest defence contractor, can gather vast amounts of information about people from websites including Facebook, Twitter and Foursquare.
Raytheon says it has not sold the software (named Riot, or Rapid Information Overlay Technology) to any clients.
But the Massachusetts-based company has acknowledged the technology was shared with US government and industry as part of a joint research and development effort, in 2010, to help build a national security system capable of analysing "trillions of entities" from cyberspace.
Using Riot it is possible to gain an entire snapshot of a person's life (their friends, the places they visit charted on a map) in little more than a few clicks of a button.
Riot can display on a spider diagram the associations and relationships between individuals online by looking at who they have communicated with over Twitter. It can also mine data from Facebook and sift GPS location information from Foursquare, a mobile phone app used by more than 25 million people to alert friends of their whereabouts. The Foursquare data can be used to display, in graph form, the top 10 places visited by tracked individuals and the times at which they visited them.
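The "top 10 places" view described above amounts to a frequency count over check-in records. The following is a minimal sketch of that idea only, not Raytheon's implementation: the function name `top_places`, the `(place, timestamp)` record format and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime


def top_places(checkins, n=10):
    """Rank places by visit count and tally the hour of day of each visit.

    checkins: iterable of (place_name, iso_timestamp) pairs.
    This record format is an assumption made for illustration.
    """
    visits = Counter()
    hours = defaultdict(Counter)
    for place, stamp in checkins:
        visits[place] += 1
        hours[place][datetime.fromisoformat(stamp).hour] += 1
    # most_common(n) returns places ordered from most to least visited
    return [(place, count, dict(hours[place]))
            for place, count in visits.most_common(n)]


# Hypothetical check-ins for one tracked individual
sample = [
    ("gym", "2013-01-07T06:05:00"),
    ("gym", "2013-01-08T06:10:00"),
    ("cafe", "2013-01-08T09:30:00"),
    ("gym", "2013-01-09T18:00:00"),
]
result = top_places(sample)
```

Even on four records the output shows a routine: three gym visits, two of them at 6am. That is why a ranked list of places and times is enough to suggest where a person is likely to be.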
Mining from public websites for law enforcement is considered legal in most countries.
Ginger McCall, an attorney at the Washington-based Electronic Privacy Information Centre, said the Raytheon technology raised concerns about how troves of user data could be covertly collected without oversight or regulation.
"Social networking sites are often not transparent about what information is shared and how it is shared," McCall said. "Users may be posting information that they believe will be viewed only by their friends, but instead, it is being viewed by government officials or pulled in by data collection services like the Riot search."
Jared Adams, a spokesman for Raytheon's intelligence and information systems department, said in an email: "Riot is a big data analytics system design we are working on with industry, national labs and commercial partners to help turn massive amounts of data into useable information to help meet our nation's rapidly changing security needs.
"Its innovative privacy features are the most robust that we're aware of, enabling the sharing and analysis of data without personally identifiable information [such as social security numbers, bank or other financial account information] being disclosed."
In December, Riot was featured in a newly published patent Raytheon is pursuing for a system designed to gather data on people from social networks, blogs and other sources to identify whether they should be judged a security risk.
Graphing software has been successfully used to track/map the online bad guys with a remarkable degree of both depth & accuracy. Adapting it to the general public that is less concerned over their footprint should return very detailed/accurate data.
It's bound to be simultaneously a useful and a dangerously misleading tool for investigators. Scatter enough data-pairings and chainings on a piece of paper or a computer screen and one can infer all manner of subsidiary connection patterns amongst them... some real and some not-so-real, some relevant and some irrelevant to whatever paradigm one hypothesizes at the outset. While these techniques may uncover some genuine patterns, they are bound to create many 'apparent' patterns of no meaningful significance. In the end, as with any other tool, it will be up to the analyst and his superiors to draw the final conclusions... and those conclusions will depend greatly on both the skill and the preloaded biases of all the players involved.
It's called data fog. Collecting extreme amounts of data on extreme numbers of entities results in "extreme-squared" pieces of data and apparent connections between them... and a point will rapidly be reached where almost anything one wants to find can be found (real or not) amidst the vast fog of data particles and pairings. That A apparently is connected to B does not explain "why" A happens to be associated with B in that instance. The very real danger is that analysts can supply (by presumption) all manner of "whys" that make an A-B connection seem far more ominous than it may really be... and that in turn weights how they look at all the connections from B to X, Y, and Z and so on. Just as one can make some dim shadow viewed erratically through fog to be anything their imagination prompts, so voluminous data itself can create a fog that obscures reality rather than accurately describing it.
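To put rough numbers on the "extreme-squared" point: the number of possible pairings grows with the square of the number of entities, so even a tiny rate of coincidental matches produces a large absolute number of apparent connections. The sketch below assumes every pair of entities is a candidate link and uses a made-up false-positive rate of 0.1%. Both function names and that rate are illustrative assumptions, not measurements of any real system.

```python
def possible_pairs(n_entities: int) -> int:
    """Number of distinct unordered pairs among n entities: n * (n - 1) / 2."""
    return n_entities * (n_entities - 1) // 2


def expected_spurious_links(n_entities: int, false_positive_rate: float) -> float:
    """Expected count of chance 'connections' if each unrelated pair
    looks linked with the given probability (an assumed, illustrative rate)."""
    return possible_pairs(n_entities) * false_positive_rate


for n in (100, 10_000, 1_000_000):
    pairs = possible_pairs(n)
    spurious = expected_spurious_links(n, 0.001)  # 0.1% is a hypothetical rate
    print(f"{n:>9} entities: {pairs:>15} pairs, ~{spurious:,.0f} chance links")
```

Under these assumptions, 100 entities yield about five chance links, while a million entities yield roughly half a billion. That is the fog: the genuine patterns are still in there, but so is a vast number of meaningless ones.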
Makes me wonder how well such image-following software can handle Photoshop. With a green section of fabric, my camera and open source software I can composite myself anywhere. I could have a picture of myself standing on the surface of Mars waving to the Curiosity cameras. And right next to that a real photo from the top of the Empire State Building. And with those two a photo from the bridge of the Starship Enterprise.
I could have a picture of myself standing on the surface of Mars waving to the Curiosity cameras...
That's the beauty of a graph. The data that carries any credibility isn't evaluated as a single item, because as the data grows there are supporting connectors. For example, the photo of yourself on Mars would be connected to web searches for "Mars nightlife", connected to a flight itinerary from Interplanetary Flights, connected to a credit card purchase at Mars Bar & Oxygen. As far as using misdirection goes, it's not as effective as it might seem at a glance, because supporting connectors will either materialize or they won't. Besides, maintaining an alias of sorts works only as well as the ability to keep it all straight, which becomes exponentially more difficult with each piece of misdirection introduced into the mix. Strict graphing by those who have the discipline is scary stuff from a privacy view. It's only a matter of time before it goes mainstream under the guise of better serving you and me.
I should have patented my system years ago (oh ya, I don't believe in software patents). The key words here are 'social networking websites', which generally implies 'not private', otherwise it's not much of a social site, is it? All of these systems have private capabilities, but it is up to the user of the system to learn how to use these features and keep up with changes to them. If you blast a tweet out, unless you have correctly configured it to be private, it's public, kids, which means even Big Brother has a right to read it and act upon it.
The software functions are not that new. Similar police systems already pool data other than social media, and new ways to incorporate social media are being found. This system, if upgraded, may actually be able to "predict behavior" and, if done well, be accurate to a percentage. I don't think the video shows any ability to predict, only correlation analysis. For a trained analyst this would be quite useful as part of an investigation, as well as for profile analysts, as is. Further development on the prediction side of things may just produce fog, as Blackbird puts it. Sometimes I think that allowing private enterprise to develop "targeting ads" has been GOV's way of allowing industry to develop GOV tracking work for them. Obama's speech today indicated that comp security and privacy are now more of a priority... Perhaps they already have enough prediction algorithms and research to move in other directions?
... Sometimes I think that allowing private enterprise to develop "targeting ads" has been GOV's way of allowing industry to develop GOV tracking work for them. ...
You are far closer to the truth than you may realize. Many things for which the government (rightfully) would catch flak can be done quite effectively by private enterprise in a free society, and the government need only (quietly) use the results and encourage further development. It's spook-world 101.
It's bound to be simultaneously a useful and a dangerously misleading tool for investigators. Scatter enough data-pairings and chainings on a piece of paper or a computer screen and one can infer all manner of subsidiary connection patterns amongst them... some real and some not-so-real, some relevant and some irrelevant to whatever paradigm one hypothesizes at the outset.
If you target a specific person, I agree. If you use this to figure out the best place to open a coffee shop, where to put police officers to guard against vandalism, or what ad to put on a billboard, this works remarkably well. Social media becomes one more data point to mine.
The sophisticated technology demonstrates how the same social networks that helped propel the Arab Spring revolutions can be transformed into a "Google for spies" and tapped as a means of monitoring and control.
With this Riot technology, have they predicted any riot yet? And whom is this product designed to serve in the first place?
Jared Adams, a spokesman for Raytheon's intelligence and information systems department, said in an email: "Its innovative privacy features are the most robust that we're aware of, enabling the sharing and analysis of data without personally identifiable information [such as social security numbers, bank or other financial account information] being disclosed."
They always say "enabling the sharing and analysis of data without personally identifiable information". It's a common lie, covering development of a product, created to track and spy on a specific person: