@laanwj
Member

@laanwj laanwj commented Mar 16, 2015

According to Tor's extensions to the SOCKS protocol (https://gitweb.torproject.org/torspec.git/tree/socks-extensions.txt) it is possible to perform stream isolation by providing authentication to the proxy. Each set of credentials will create a new circuit, which makes it harder to correlate connections.

This patch adds an option, -proxyrandomize (on by default) that randomizes credentials for every outgoing connection, thus creating a new circuit for every peer connection.

2015-03-16 15:29:59 SOCKS5 Sending proxy authentication 3842137544:3256031132
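For illustration, a minimal sketch of what such a randomized authentication message could look like on the wire. The framing follows the username/password sub-negotiation of RFC 1929. The helper names, and the choice of two random 32-bit values printed as decimal strings (guessed from the shape of the log line above), are assumptions and are not taken from the patch.

```python
import random


def random_credentials(rng):
    """Return a fresh (username, password) pair of random decimal strings."""
    # Assumption: two independent 32-bit values rendered as decimal text.
    return str(rng.getrandbits(32)), str(rng.getrandbits(32))


def build_auth_request(username, password):
    """Encode an RFC 1929 request: VER=0x01, ULEN, UNAME, PLEN, PASSWD."""
    uname = username.encode("ascii")
    passwd = password.encode("ascii")
    if not (1 <= len(uname) <= 255 and 1 <= len(passwd) <= 255):
        raise ValueError("username and password must each be 1..255 bytes")
    return bytes([0x01, len(uname)]) + uname + bytes([len(passwd)]) + passwd


if __name__ == "__main__":
    # A new pair per outgoing connection means a new isolation group each time.
    user, password = random_credentials(random.SystemRandom())
    print(build_auth_request(user, password).hex())
```

Because Tor accepts any username/password pair, the values carry no secret; they only need to differ between connections.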

Aside: we really need tests for the proxy functionality (as mentioned in other proxy-related pulls #5298, #4871, #4587), and I plan on adding a few, but I've put this up so it can already get some review and manual testing.

@laanwj laanwj added the P2P label Mar 16, 2015
@laanwj laanwj force-pushed the 2015_03_tor_circuit_randomization branch from 3095787 to 34c2014 Compare March 16, 2015 15:54
@laanwj laanwj added this to the 0.11.0 milestone Mar 16, 2015
@laanwj
Member Author

laanwj commented Mar 19, 2015

This change has another advantage: every connection will (potentially) go through a different exit node. That significantly reduces the chance to get unlucky and pick a single exit node that is either malicious with regard to bitcoin, or widely banned on the P2P network.

You can test this by looking at via in the peers list or addrlocal in getpeerinfo. This is the address that peers report that you connect from.
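A rough sketch of that check, assuming the `getpeerinfo` result has already been parsed into a list of dictionaries. The helper names and the sample records are hypothetical (the addresses are documentation-range IPs, not real exits); only the `addrlocal` field name comes from the comment above.

```python
def host_of(addr):
    """Strip the trailing ':port' from an 'ip:port' or '[ipv6]:port' string."""
    host, sep, _port = addr.rpartition(":")
    return host.strip("[]") if sep else addr


def distinct_local_addresses(peers):
    """Collect the distinct addresses that peers report we connect from."""
    return {host_of(p["addrlocal"]) for p in peers if p.get("addrlocal")}


if __name__ == "__main__":
    # Hypothetical getpeerinfo excerpt.
    sample = [
        {"addr": "198.51.100.7:8333", "addrlocal": "192.0.2.10:51234"},
        {"addr": "198.51.100.8:8333", "addrlocal": "192.0.2.44:40112"},
        {"addr": "198.51.100.9:8333"},  # this peer reported no address
    ]
    print(sorted(distinct_local_addresses(sample)))
```

With randomization on, one would expect several distinct addresses across peers; with it off, connections opened close together would tend to share one.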

@jgarzik
Contributor

jgarzik commented Mar 19, 2015

Vaguely in favor -- needs a lot of thinking-through. Consult with some Tor experts for their opinion?

@laanwj
Member Author

laanwj commented Mar 19, 2015

> needs a lot of thinking-through

Can you explain what problem you see with this?

@jgarzik
Contributor

jgarzik commented Mar 19, 2015

@laanwj Nothing specific - just pattern matching. Sometimes stream cycling makes activities more visible rather than less, e.g. for the reasons that Tor bridges are used. This is the type of change that can have subtle ramifications not immediately obvious.

@laanwj
Member Author

laanwj commented Mar 19, 2015

In that case it may be prudent to have it disabled by default, but I would like to have the option for this. Note that Tor already switches the circuit (for new connections) every 10 minutes by default, so this will mostly have an effect on the initial connections which are made in a short timespan.

@gmaxwell was talking about defaulting this to enabled, maybe he can comment.

@gmaxwell
Contributor

I previously got an opinion from a Tor developer that sounded vaguely like "you're not doing that already??" -- without it we potentially lose all our sybil resistance, and a single bad exit can partition us from the network, as we make all our connections at the same time, at startup.

Hopping around the globe can make things more visible if you're a web browser and you're hoping not to be noticed by the two people left in the world who don't know how to do an RBL lookup of Tor exits, but I don't think that applies in our case. We're not futilely hoping that the traffic exiting the Tor network won't be known to come from Tor.

Also, without doing this there are non-trivial odds of having the user's web traffic correlated with their node connections.

We could do something more mild, like have each of our outbound connection indexes share an identifier with no loss... but I don't currently see a reason to. I'll ask for more of an opinion.

@laanwj
Member Author

laanwj commented Apr 1, 2015

> We could do something more mild, like have each of our outbound connection indexes share an identifier with no loss...

This would have the effect of reusing the same circuit when reconnecting quickly. I can imagine that would happen when a node boots you immediately, or for some reason you're cycling faster than every 10 minutes - if it takes longer a new circuit would be used automatically by Tor.

When going through a list of questionable nodes, this may reduce the number of circuit building operations. I'm not sure that is an issue, at most a performance one. Potentially it could make it easier to correlate subsequent connections for an exit node.

@gmaxwell
Contributor

gmaxwell commented Apr 1, 2015

@laanwj one downside that occurs to me is that an exit can make you ban that peer, then you'd continue using that same exit again and again. So at a minimum a misbehaving disconnect should change the identity for that slot.
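To make the "slot" idea concrete, here is a hypothetical sketch of per-slot credentials that are kept across ordinary reconnects and replaced after a misbehaving disconnect. The class and method names are invented for illustration. This is the alternative under discussion, not what the patch does: the patch randomizes credentials on every outgoing connection.

```python
import random


class SlotCredentials:
    """Hypothetical per-slot SOCKS5 credentials for outbound connections."""

    def __init__(self, num_slots, rng=None):
        self._rng = rng if rng is not None else random.SystemRandom()
        self._creds = [self._fresh() for _ in range(num_slots)]

    def _fresh(self):
        # Two random 32-bit decimal strings; the format is an assumption.
        return str(self._rng.getrandbits(32)), str(self._rng.getrandbits(32))

    def get(self, slot):
        """Credentials to present to the proxy for this slot."""
        return self._creds[slot]

    def on_disconnect(self, slot, misbehaved):
        """Keep the identity on a normal disconnect; rotate it after misbehavior."""
        if misbehaved:
            self._creds[slot] = self._fresh()
```

Rotating on misbehavior means an exit that tampers with traffic to provoke a ban does not get reused for the replacement peer in the same slot.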

@laanwj
Member Author

laanwj commented Apr 2, 2015

Agreed. Bleh, I'd like to avoid that kind of complexity and 'slot management'.

In practice I've encountered no performance problems in using a different circuit every time. And even if it introduced some extra latency for new connections, we're not terribly sensitive to that.

@laanwj
Member Author

laanwj commented Apr 13, 2015

I'd like to move this forward. Anything that still needs to be done here?

If we are in doubt about enabling the randomization by default, I'd like to merge this with the option defaulting to false.

@gmaxwell
Contributor

This seems to work fine for me at least. I'm not concerned about the extra streams. It seems to make multiple connections a bit slower (for the obvious reasons), but it also seems to make the connectivity more reliable (previously all the IPv4 peers would sometimes go down at once).

@laanwj laanwj force-pushed the 2015_03_tor_circuit_randomization branch 2 times, most recently from fef4d19 to 01b328f Compare April 17, 2015 17:29
@laanwj
Member Author

laanwj commented Apr 17, 2015

Added an RPC test for the -proxy, -onion and -proxyrandomize functionality, as promised in the OP.

@laanwj laanwj force-pushed the 2015_03_tor_circuit_randomization branch from 01b328f to ab8791b Compare April 17, 2015 17:30
Add test for -proxy, -onion and -proxyrandomize.
@laanwj laanwj force-pushed the 2015_03_tor_circuit_randomization branch from ab8791b to 6be3562 Compare April 20, 2015 13:04
@laanwj laanwj merged commit 6be3562 into bitcoin:master Apr 20, 2015
laanwj added a commit that referenced this pull request Apr 20, 2015
6be3562 rpc-tests: Add proxy test (Wladimir J. van der Laan)
67a7949 privacy: Stream isolation for Tor (Wladimir J. van der Laan)
@sipa
Member

sipa commented Apr 20, 2015 via email

@isislovecruft

@sipa: What happens when you use this socks5 authentication with a non-Tor socks5 proxy?

Overwhelmingly likely: it won't accept the username and password because there isn't any user $RANDOM with password $RANDOM, and so it will reject the connection. (See RFC1929 §2 for what the byte values are in an auth rejection reply.)

Rather unlikely: if the non-Tor SOCKS5 proxy behaves like Tor, in that it accepts any username/password pair, then you'll authenticate to it and it'll do… something, maybe, depending on what type of proxy it is. Off the top of my head, the only SOCKS5 proxies I can think of (other than Tor) which accept random username/password pairs, are Tor Pluggable Transports, which use the SOCKS5 username/password fields as a hack to pass keyword-value pairs from the outgoing client-side Tor process to the incoming client-side Pluggable Transport process (see §2.1.0.3 of pt-spec.txt).
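As a small sketch of the rejection case, this is how a client might interpret the two-byte reply described in RFC 1929 §2. The function name is hypothetical. The reply is a version byte (0x01) followed by a status byte, where 0x00 means success and any other value means failure.

```python
def auth_succeeded(reply):
    """Interpret an RFC 1929 reply: two bytes, VER (0x01) then STATUS."""
    if len(reply) != 2:
        raise ValueError("auth reply must be exactly 2 bytes")
    version, status = reply
    if version != 0x01:
        raise ValueError("unexpected sub-negotiation version")
    # STATUS 0x00 means success; any other value is a failure, after which
    # the server is required to close the connection.
    return status == 0x00
```

A proxy with real accounts would answer random credentials with a non-zero status, while Tor answers with 0x00 for any pair.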

@isislovecruft

@gmaxwell: We could do something more mild, like have each of our outbound connection indexes share an identifier with no loss...

@laanwj: This would have the effect of reusing the same circuit when reconnecting quickly. I can imagine that would happen when a node boots you immediately, or for some reason you're cycling faster than every 10 minutes - if it takes longer a new circuit would be used automatically by Tor.

Yes, it would reuse the same circuit. However, if the exit node you were using suddenly won't allow you to reach your peer, Tor will automatically detach that "stream" (a.k.a. connection) from that bad circuit, and reattach it to a new circuit. Subsequently, if you try to use some other, previously-opened, outbound connection to a peer (which has the same index, and thus the same SOCKS5 username/password) which is still attached to that bad circuit, it'll get automatically reattached to the same new circuit as the previous streams with identical SOCKS5 username/passwords. Therefore, you won't keep connecting to the same suddenly-offline/censoring/misbehaving/whatever exit node, but Tor will automatically find a new working circuit for you, and still keep all your connections grouped by SOCKS5 username/password.

So, yes, you should group connections, when possible. It'll save on the number of total circuits needed, cutting the overhead by some O(M-N) where M is the total number of peers and N is the total number of peer "groups".
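The saving can be made concrete with a toy count, under the simplifying assumption that the SOCKS5 credential pair is the only thing separating streams onto different circuits. The function name and the sample grouping are hypothetical.

```python
def circuits_needed(peer_credentials):
    """One circuit per distinct credential pair (simplified isolation rule)."""
    return len(set(peer_credentials.values()))


if __name__ == "__main__":
    # Hypothetical: M = 8 peers spread over N = 3 credential groups.
    grouped = {f"peer{i}": ("group", str(i % 3)) for i in range(8)}
    randomized = {f"peer{i}": ("random", str(i)) for i in range(8)}
    saved = circuits_needed(randomized) - circuits_needed(grouped)
    print(circuits_needed(randomized), circuits_needed(grouped), saved)
```

Here full randomization needs M circuits and grouping needs N, so grouping saves M - N of them.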

If that's difficult or annoying to do, or there's no potential use cases for grouping peer connections to the same circuit, then completely randomised SOCKS authentication is still much better than no SOCKS authentication at all.

Although I've little authority to comment on the mergability of bitcoin patches, I did a quick read of @laanwj's patch 67a7949 and it looks pretty good to me.

@laanwj
Member Author

laanwj commented May 7, 2015

@sipa

> What happens when you use this socks5 authentication with a non-Tor socks5 proxy?

  • If the SOCKS5 proxy doesn't support authentication (for example, ssh -D), it makes no difference. No token is sent.
  • If the proxy requires authentication then submitting random passwords will make it deny your connections just like submitting no password did before. I've never heard any requests for supporting authenticated proxies. However, such an option could be added trivially after this.
  • If the proxy supports authentication, but doesn't require it, then this may cause it to reject connections as a user/password is sent where it wasn't before. I think this is exceedingly rare apart from Tor. In this case -proxyrandomize=0 will have to be passed to disable the behavior.
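The three cases above come down to which method the proxy selects during the SOCKS5 greeting defined in RFC 1928. A minimal sketch, assuming the client offers both "no authentication" and "username/password" when randomization is on; the function names and the returned step labels are hypothetical.

```python
NO_AUTH = 0x00
USER_PASS = 0x02
NO_ACCEPTABLE = 0xFF


def build_greeting(offer_auth):
    """RFC 1928 greeting: VER=0x05, NMETHODS, METHODS."""
    methods = [NO_AUTH, USER_PASS] if offer_auth else [NO_AUTH]
    return bytes([0x05, len(methods)] + methods)


def next_step(method_reply):
    """Decide what to do after the proxy's two-byte method selection."""
    if len(method_reply) != 2 or method_reply[0] != 0x05:
        raise ValueError("malformed method selection")
    method = method_reply[1]
    if method == NO_AUTH:
        return "connect"           # e.g. ssh -D: no credentials are sent
    if method == USER_PASS:
        return "send_credentials"  # the random pair goes out here
    if method == NO_ACCEPTABLE:
        return "abort"
    raise ValueError("proxy selected a method that was not offered")
```

A proxy that only speaks "no authentication" picks 0x00 and never sees the random pair, which is why the first bullet says it makes no difference.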

@isislovecruft Thanks a lot for looking over the patch and commenting here!

> So, yes, you should group connections, when possible. It'll save on the number of total circuits needed, cutting the overhead by some O(M-N) where M is the total number of peers and N is the total number of peer "groups".

Right, makes sense. However in our case, outgoing connections are quite rare. We make only up to 8. So as I understand, it'll never create a huge number of circuits. This makes it less of an issue, I suppose?

@isislovecruft

@laanwj No problem; happy to help!

> Right, makes sense. However in our case, outgoing connections are quite rare. We make only up to 8. So as I understand, it'll never create a huge number of circuits. This makes it less of an issue, I suppose?

Oh, if it's only 8 at a time, then you're correct that it really doesn't matter… the overhead is minuscule.
