Deanonymization requires luck but nonetheless shows limits of Tor privacy.

Computer scientists have devised an attack on the Tor privacy network that in certain cases allows them to deanonymize hidden service websites with 88 percent accuracy.

Such hidden services allow people to host websites without end users or anyone else knowing the true IP address of the service. The deanonymization requires the adversary to control the Tor entry point for the computer hosting the hidden service. It also requires the attacker to have previously collected unique network characteristics that can serve as a fingerprint for that particular service. Tor officials say the requirements reduce the effectiveness of the attack. Still, the new research underscores the limits to anonymity on Tor, which journalists, activists, and criminals alike rely on to evade online surveillance and monitoring.

"Our goal is to show that it is possible for a local passive adversary to deanonymize users with hidden service activities without the need to perform end-to-end traffic analysis," the researchers from the Massachusetts Institute of Technology and Qatar Computing Research Institute wrote in a research paper. "We assume that the attacker is able to monitor the traffic between the user and the Tor network. The attacker’s goal is to identify that a user is either operating or connected to a hidden service. In addition, the attacker then aims to identify the hidden service associated with the user."

The attack works by gathering the network data of a pre-determined list of hidden services in advance. By analyzing patterns in the number of packets passing between the hidden service and the entry guard it uses to access Tor, the researchers were able to obtain a unique fingerprint of each service. They were later able to use the fingerprint to identify the service even though they were unable to decrypt the traffic it was sending. In a press release, the researchers elaborated:

The researchers’ attack requires that the adversary’s computer serve as the guard on a Tor circuit. Since guards are selected at random, if an adversary connects enough computers to the Tor network, the odds are high that, at least on some occasions, one or another of them would be well-positioned to snoop.

During the establishment of a circuit, computers on the Tor network have to pass a lot of data back and forth. The researchers showed that simply by looking for patterns in the number of packets passing in each direction through a guard, machine-learning algorithms could, with 99 percent accuracy, determine whether the circuit was an ordinary Web-browsing circuit, an introduction-point circuit, or a rendezvous-point circuit. Breaking Tor’s encryption wasn’t necessary.

Furthermore, by using a Tor-enabled computer to connect to a range of different hidden services, they showed that a similar analysis of traffic patterns could identify those services with 88 percent accuracy. That means that an adversary who lucked into the position of guard for a computer hosting a hidden service, could, with 88 percent certainty, identify it as the service’s host.

Similarly, a spy who lucked into the position of guard for a user could, with 88 percent accuracy, tell which sites the user was accessing.

The research is sure to interest governments around the world, including the US. On at least two occasions over the past few years, FBI agents have exploited software vulnerabilities, once Adobe Flash and once in Mozilla Firefox, to identify criminal suspects. Recently unsealed court documents also show the FBI seizing a Tor-hidden child porn site and allowing it to run for weeks so agents could gather evidence on visitors.
In an e-mail, Tor project leader Roger Dingledine said the requirements of the attack greatly limited its effectiveness in real-world settings. First, he said, the adversary must control one of the entry guards a hidden service is using. Such entry guards in theory are assigned randomly, so attackers would have to operate a large number of Tor nodes to have a reasonable expectation of seeing traffic of a given hidden service. Additionally, he cited research from last year arguing that researchers routinely exaggerate the risk of website fingerprinting on anonymity.

He went on to question the "classifier" algorithm that allowed the researchers to identify certain traffic as belonging to a Tor hidden service. It wouldn't be hard to thwart it, he said, by adding random padding to the data being sent.

"It's not surprising that their classifier basically stops working in the face of more padding," he wrote. "Classifiers are notoriously brittle when you change the situation on them. So the next research step is to find out if it's easy or hard to design a classifier that isn't fooled by padding.

The full text of Dingledine's e-mail is below:

This is a well-written paper. I enjoyed reading it, and I'm glad the researchers are continuing to work in this space.

First, for background, run (don't walk) to Mike Perry's blog post explaining why website fingerprinting papers have historically overestimated the risks for users:
and then check out Marc Juarez et al's followup paper from last year's ACM CCS that backs up many of Mike's concerns:

To recap, this new paper describes three phases. In the first phase, they hope to get lucky and end up operating the entry guard for the Tor user they're trying to target. In the second phase, the target user loads some web page using Tor, and they use a classifier to guess whether the web page was in onion-space or not. Lastly, if the first classifier said "yes it was", they use a separate classifier to guess which onion site it was.

The first big question comes in phase three: is their website fingerprinting classifier actually accurate in practice? They consider a world of 1000 front pages, but and other onion-space crawlers have found millions of pages by looking beyond front pages. Their 2.9% false positive rate becomes enormous in the face of this many pages—and the result is that the vast majority of the classification guesses will be mistakes.

For example, if the user loads ten pages, and the classifier outputs a guess for each web page she loads, will it output a stream of "She went to Facebook!" "She went to Riseup!" "She went to Wildleaks!" while actually she was just reading posts in a Bitcoin forum the whole time? Maybe they can design a classifier that works well when faced with many more web pages, but the paper doesn't show one, and Marc Juarez's paper argues convincingly that it's hard to do.

The second big question is whether adding a few padding cells would fool their "is this a connection to an onion service" classifier. We haven't tried to hide that in the current Tor protocol, and the paper presents what looks like a great classifier. It's not surprising that their classifier basically stops working in the face of more padding though: classifiers are notoriously brittle when you change the situation on them. So the next research step is to find out if it's easy or hard to design a classifier that isn't fooled by padding.

I look forward to continued attention by the research community to work toward answers to these two questions. I think it would be especially fruitful to look also at true positive rates and false positives of both classifiers together, which might show more clearly (or not) that a small change in the first classifier has a big impact on foiling the second classifier. That is, if we can make it even a little bit more likely that the "is it an onion site" classifier guesses wrong, we could make the job of the website fingerprinting classifier much harder because it has to consider the billions of pages on the rest of the web too.