TorChat Protocol Dissection

This page is also reachable within the Tor network: http://vb75uj2ap3hyyava.onion/research/torchat_protocol/

DISCLAIMER: This is NOT a specification but an external analysis of the TorChat protocol, based on studying its Java and Python source code implementations, which can be found here and here. I am not affiliated with the Tor or TorChat projects. I simply try to cover the protocol as correctly as possible, but don't be surprised if it contains errors or is incomplete. I am using this as the basis for a C#-based implementation for use in the Smuxi Messenger.

(Assumed) Design Goals

p2p (decentralized)
encrypted
anonymous (to everyone except the peers themselves)
- hides who communicates with whom
- hides physical location
registration-free (automatic ID generation)

Transport

TorChat uses TCP connections to hidden services listening on port 11009.

Peers send and receive messages on that TCP socket.

Connections

Hidden services behave like regular server sockets, except that the server has no idea who the client is (in the sense of its source IP address) because the client connects through Tor. As TorChat is p2p, it needs to make out-bound connections to send messages and accept in-bound connections to receive messages from other peers. Both in-bound and out-bound connections always use port 11009.

Out-Bound Connections

Connections to TorChat peers (the hidden service on port 11009) are out-bound and are authenticated by definition, as only the owner of the hidden service key is able to respond to the connection attempt.

In-Bound Connections

Connections from other TorChat peers are always unauthenticated unless the peers can prove in some way that they are who they claim to be. TorChat uses a session token for each peer to authenticate its connection, and only then can we believe the claimed origin of the messages we receive on that in-bound connection. For more details on how this authentication procedure works, refer to the Authentication section below.

Message Format

type: byte array
message separator: 0x0a (LF)
decode as string:
- replace '\r\n' with '\n', then replace the escaped '\n' (presumably a backslash followed by 'n') with "\n" (a real LF)
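
The framing described above can be sketched in a few lines of Python. This is only an illustration of splitting a receive buffer on the LF separator, not code taken from TorChat; the function name split_messages is my own, and the string decoding step is left out on purpose.

```python
from typing import List, Tuple


def split_messages(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Split a receive buffer into complete messages and the unfinished rest.

    Messages are separated by a single LF (0x0a). Whatever follows the last
    LF is an incomplete message and must be kept until more data arrives.
    """
    parts = buffer.split(b"\n")
    rest = parts.pop()  # bytes after the last LF (may be empty)
    return parts, rest


if __name__ == "__main__":
    messages, rest = split_messages(b"ping a b\nstatus available\nmess")
    print(messages, rest)
```

The caller would prepend the returned rest to the next chunk read from the socket.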

Command Format

command: a-z or _
separator: 0x20 (SP)
payload: byte array

Example:

ping <payload>
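
As a rough sketch of how a single message could be split into command and payload according to the format above: the function name parse_command is hypothetical, and rejecting command names outside a-z and _ is my own choice, since I don't know how the original clients treat them.

```python
import re
from typing import Tuple

# Command names consist of a-z and underscore only (see the format above).
COMMAND_RE = re.compile(rb"[a-z_]+")


def parse_command(message: bytes) -> Tuple[str, bytes]:
    """Split one message (without the trailing LF) into (command, payload)."""
    # Only the first space separates command and payload; the payload
    # itself may contain further spaces.
    command, _, payload = message.partition(b" ")
    if not COMMAND_RE.fullmatch(command):
        raise ValueError(f"invalid command name: {command!r}")
    return command.decode("ascii"), payload


if __name__ == "__main__":
    print(parse_command(b"ping mb4bc4jk4cj2fky4 12345"))
```

The payload stays a byte array here, as in the format description; decoding it is left to the individual command handlers.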

Message Commands

ping

command: ping
separator: 0x20 (SP)
payload: <origin_hidden_service_id><separator><authentication_token>

<origin_hidden_service_id> is the hash of the public key used in the onion network (also known as the onion address). This is the address the peer needs to contact to return the authentication_token. This way the origin knows which in-bound connection belongs to the peer, as the authentication_token was only sent to a single hidden service.

<authentication_token> is a string of no specific length, but should be 7-bit-only to avoid charset conversion issues.

WARNING: this authentication token has to be unique, cryptographically random and kept secret! If this token leaks, anyone can impersonate the identity of that TorChat peer for as long as the TorChat application that generated the token keeps running.

Example:

ping mb4bc4jk4cj2fky4 31754944747097474078662100165902771331350515775810664422385852963171834014133
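
A minimal sketch of building and parsing a ping message in Python. The function names are made up for illustration. The check that the hidden service id is 16 base32 characters is an assumption derived from the example above, not something I found enforced in the source.

```python
import re
from typing import Tuple

# Assumption: hidden service ids look like the example, 16 base32 characters.
ONION_ID_RE = re.compile(r"[a-z2-7]{16}")


def build_ping(origin_id: str, token: str) -> bytes:
    """Build a complete ping message, including the trailing LF."""
    if not ONION_ID_RE.fullmatch(origin_id):
        raise ValueError(f"unexpected hidden service id: {origin_id!r}")
    if not token or " " in token or "\n" in token:
        raise ValueError("token must be non-empty and free of SP and LF")
    # encode("ascii") enforces the 7-bit-only recommendation for the token.
    return f"ping {origin_id} {token}".encode("ascii") + b"\n"


def parse_ping_payload(payload: bytes) -> Tuple[str, str]:
    """Split a ping payload into (origin_hidden_service_id, token)."""
    origin_id, sep, token = payload.decode("ascii").partition(" ")
    if not sep or not token or not ONION_ID_RE.fullmatch(origin_id):
        raise ValueError("malformed ping payload")
    return origin_id, token


if __name__ == "__main__":
    print(build_ping("mb4bc4jk4cj2fky4", "12345"))
```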

pong

client

version

add_me

message

status

Format:

status <status>

<status> can be one of:

away
available
xa

Example:

status available
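
The status command is simple enough to sketch completely. The names below are hypothetical, and rejecting unknown status values is my own choice; I have not checked how the original clients react to them.

```python
# The three status values listed above.
VALID_STATUSES = frozenset({"available", "away", "xa"})


def build_status(status: str) -> bytes:
    """Build a complete status message, including the trailing LF."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    return f"status {status}\n".encode("ascii")


def parse_status_payload(payload: bytes) -> str:
    """Validate the payload of a received status message."""
    status = payload.decode("ascii", errors="replace")
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    return status


if __name__ == "__main__":
    print(build_status("available"))
```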

filename

filedata

filedata_ok

Authentication / Handshake

TBD

Potential Security Issues / Weaknesses

Hidden Services Keys

Tor's hidden services fully rely on 1024-bit RSA keys. I don't yet know enough about how these keys are used to conclude whether this is a real weakness or not.

Hidden Services Guessing

Hidden services are stored on a DHT and can be iterated (?) to find existing hidden services. As TorChat uses a static port, this could be used to find TorChat users.

Authentication Tokens

The TorChat client that wants to authenticate a hidden service hash key (which is a chat buddy / peer in TorChat) has to generate an authentication token that the chat peer needs to return. If the client selects a weak token, say a pretty short one or one without real random data, then the token could be guessed, and an attacker could send spoofed TorChat messages that look like they are coming from the authenticated peer.
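
To illustrate what a non-guessable token could look like, here is a small Python sketch using the standard secrets module. The 256-bit length and the decimal encoding are assumptions on my side (the decimal form just mirrors the ping example), and the constant-time comparison is an extra precaution, not something I found in the TorChat source.

```python
import hmac
import secrets


def generate_token(bits: int = 256) -> str:
    """Return a fresh random token as a decimal string (7-bit clean)."""
    # secrets draws from the operating system's CSPRNG, unlike random.
    return str(secrets.randbits(bits))


def token_matches(expected: str, received: str) -> bool:
    """Compare two tokens without leaking the mismatch position via timing."""
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


if __name__ == "__main__":
    token = generate_token()
    print(token, token_matches(token, token))
```

A fresh token should be generated for each peer, so that a leaked token only affects a single connection.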

Message Size

The protocol doesn't seem to specify or enforce a maximum message size. This could easily be abused to exhaust the client's memory.
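
An implementation can defend itself by capping how many bytes it buffers while waiting for the LF separator. The following Python sketch shows one way to do that. The class names and the 64 KiB limit are hypothetical; the protocol itself defines no such value.

```python
from typing import List

MAX_MESSAGE_SIZE = 64 * 1024  # hypothetical limit, not taken from TorChat


class MessageTooLarge(Exception):
    """Raised when a message exceeds the configured size limit."""


class LineBuffer:
    """Collects received bytes and hands out complete LF-terminated messages."""

    def __init__(self, max_size: int = MAX_MESSAGE_SIZE) -> None:
        self._max_size = max_size
        self._pending = b""

    def feed(self, data: bytes) -> List[bytes]:
        """Add received bytes and return all messages completed by them."""
        parts = (self._pending + data).split(b"\n")
        self._pending = parts.pop()  # unfinished tail, kept for the next call
        too_large = len(self._pending) > self._max_size or any(
            len(part) > self._max_size for part in parts
        )
        if too_large:
            self._pending = b""  # drop the oversized data
            raise MessageTooLarge(f"message exceeds {self._max_size} bytes")
        return parts


if __name__ == "__main__":
    buf = LineBuffer()
    print(buf.feed(b"status avail"), buf.feed(b"able\n"))
```

On MessageTooLarge the caller would most likely just close the connection, as a well-behaved peer has no reason to send messages of that size.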