The code can be downloaded from this git branch (compare changes, my commits).

Synopsis

Apache Wave is a software framework for real-time online collaborative editing. Like Google Docs and Etherpad, it uses Operational Transformation to manage concurrent edits from several users.

During this Google Summer of Code we have added end-to-end encryption to Wave documents. This means that only the people who know a particular key can access a document, edit it and retrieve its contents, which protects the privacy of Wave users.

We have based our work on this awesome paper, in which some researchers explain how they encrypted Google Docs’ Operational Transformations. We took their ideas and adapted them to Apache Wave’s architecture.

Produced work

To summarize the work we have produced, we have recorded this video:

To encrypt the messages we have used the AES-GCM algorithm from the WebCrypto API, which we call from our Java classes through JsInterop bindings.
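To give an idea of what AES-GCM encryption of a message looks like, here is a minimal sketch. It is not the code from our branch: it uses the JVM's `javax.crypto` API as a stand-in for the WebCrypto calls we make through JsInterop, so it can run outside the browser. The class name `AesGcmSketch`, the 128-bit key size and the `IV || ciphertext || tag` layout are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper, not part of Apache Wave.
final class AesGcmSketch {
    private static final int IV_BYTES = 12;   // 96-bit nonce, the usual size for GCM
    private static final int TAG_BITS = 128;  // length of the authentication tag

    /** Generates a random AES key (128 bits is an assumption of this sketch). */
    static SecretKey newKey() throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    /** Encrypts the text under a fresh random IV and returns IV || ciphertext || tag. */
    static byte[] encrypt(SecretKey key, String plaintext) throws Exception {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] sealed = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
        byte[] out = new byte[iv.length + sealed.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(sealed, 0, out, iv.length, sealed.length);
        return out;
    }

    /** Reads the IV from the front of the message, then decrypts and verifies the rest. */
    static String decrypt(SecretKey key, byte[] message) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, message, 0, IV_BYTES));
        // doFinal throws AEADBadTagException if the message was modified.
        byte[] plain = cipher.doFinal(message, IV_BYTES, message.length - IV_BYTES);
        return new String(plain, StandardCharsets.UTF_8);
    }
}
```

Because GCM is an authenticated mode, a message that has been altered in transit fails to decrypt instead of yielding garbage text.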

Messages are properly encrypted and decrypted when they are sent and received by the clients. The text of a document is also properly recovered from the server’s snapshot. Everything seems to run smoothly, except for some annoying bugs that appear sporadically and a serious user interface bug that prevents users who did not create the wave from decrypting its snapshot. My mentor and I think that we can fix them quickly, shortly after the program has ended.

How to use it

Building our modified version of Wave does not require any additional configuration. Just download the code from our git branch and use the Gradle commands as usual, as described in Wave’s README file. To compile the code and run the server use:

$ ./gradlew run

Then, open the URL http://localhost:9898/ in any browser. Once registered and logged in, use the “New Encrypted Wave” button to create a new encrypted wave.

Encrypted Wave button

In its URL you can see that the new wave’s identifier starts with “ew+” instead of “w+”, which is the usual prefix for regular waves. A symmetric cryptographic key is also attached after the wave identifier, separated from it by an exclamation mark (!).

Encrypted Wave URL

The user must keep that URL (or at least the key part) in order to open the wave again in the future.
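As an illustration of that layout, the following sketch splits such a token into the wave identifier and the key. It only relies on the two facts stated above (the “ew+” prefix and the “!” separator). The class name `EncryptedWaveRefSketch`, the sample identifier and the sample key are made up, and the real client may parse the URL differently.

```java
// Hypothetical helper, not part of Apache Wave.
final class EncryptedWaveRefSketch {
    final String waveId; // e.g. "ew+abc123" (made-up identifier)
    final String key;    // text after '!', or null when there is none

    private EncryptedWaveRefSketch(String waveId, String key) {
        this.waveId = waveId;
        this.key = key;
    }

    /** Splits "identifier!key" at the first exclamation mark. */
    static EncryptedWaveRefSketch parse(String token) {
        int bang = token.indexOf('!');
        String id = bang < 0 ? token : token.substring(0, bang);
        String key = bang < 0 ? null : token.substring(bang + 1);
        EncryptedWaveRefSketch ref = new EncryptedWaveRefSketch(id, key);
        if (ref.isEncrypted() && (key == null || key.isEmpty())) {
            // Without the key the wave cannot be decrypted, so fail early.
            throw new IllegalArgumentException("encrypted wave without a key: " + id);
        }
        return ref;
    }

    /** Encrypted waves use the "ew+" prefix instead of "w+". */
    boolean isEncrypted() {
        return waveId.startsWith("ew+");
    }
}
```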

Future work

AES-GCM ensures both confidentiality and integrity for the messages written by legitimate users, but an attacker who controls the server can still do a lot of harm:

  • Only the text of a document is encrypted, but not other parts such as the targets of its hyperlinks. We should extend the encryption beyond the inserted characters.
  • Authentication could also be extended to all components, not only the text ones. In addition, as the paper states, the history of a document should also be authenticated (see appendix A.2).
  • We are unlikely to be able to hide the structure and format of the document from the server, but we may be able to hide some more information, such as users’ typing traits.

On the other hand, it is not convenient for users to handle symmetric keys by themselves. Keys should be encrypted and stored on the server as user data. To do so, we should derive a key from the user’s password using PBKDF2 (available in the WebCrypto API) and use it to encrypt all the keys a user generates or registers for her waves.
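A rough sketch of that idea follows. Again it uses the JVM's `javax.crypto` API rather than WebCrypto, and nothing like it exists in our branch yet. The class name `WaveKeyVaultSketch`, the iteration count, the key size and the `IV || ciphertext || tag` blob layout are all assumptions chosen for the example.

```java
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, not part of Apache Wave.
final class WaveKeyVaultSketch {
    private static final int ITERATIONS = 100_000; // assumed work factor
    private static final int KEY_BITS = 128;       // assumed key size
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;

    /** Derives an AES key from the user's password with PBKDF2 (HMAC-SHA-256). */
    static SecretKey deriveKey(char[] password, byte[] salt) throws Exception {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
        byte[] raw = factory.generateSecret(spec).getEncoded();
        return new SecretKeySpec(raw, "AES");
    }

    /** Encrypts a wave key under the password-derived key: IV || ciphertext || tag. */
    static byte[] wrap(SecretKey passwordKey, byte[] waveKey) throws Exception {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, passwordKey, new GCMParameterSpec(TAG_BITS, iv));
        byte[] sealed = cipher.doFinal(waveKey);
        byte[] out = Arrays.copyOf(iv, iv.length + sealed.length);
        System.arraycopy(sealed, 0, out, iv.length, sealed.length);
        return out;
    }

    /** Recovers the wave key; fails if the password-derived key is wrong. */
    static byte[] unwrap(SecretKey passwordKey, byte[] blob) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, passwordKey,
                new GCMParameterSpec(TAG_BITS, blob, 0, IV_BYTES));
        return cipher.doFinal(blob, IV_BYTES, blob.length - IV_BYTES);
    }
}
```

The salt is not secret, so it could be stored on the server next to the wrapped keys. The derivation would have to happen in the client, so that the server only ever sees the wrapped blobs.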

Users could use public key cryptography to invite each other to edit a wave document. This feature was part of the original work plan for this summer, but we have not had enough time to develop it.
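One possible shape for such an invitation, shown only as a sketch of work we have not done: the inviting user encrypts the wave’s symmetric key with the invitee’s public key, and only the invitee’s private key can recover it. The choice of RSA-OAEP with SHA-256, the 2048-bit modulus and the class name `WaveInviteSketch` are assumptions. The code uses the JVM's `javax.crypto` API, not WebCrypto.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.spec.MGF1ParameterSpec;
import javax.crypto.Cipher;
import javax.crypto.spec.OAEPParameterSpec;
import javax.crypto.spec.PSource;

// Hypothetical helper, not part of Apache Wave.
final class WaveInviteSketch {
    // OAEP with SHA-256 for both the label hash and MGF1, stated explicitly
    // so that the parameters do not depend on provider defaults.
    private static final OAEPParameterSpec OAEP = new OAEPParameterSpec(
            "SHA-256", "MGF1", MGF1ParameterSpec.SHA256, PSource.PSpecified.DEFAULT);

    /** Creates a key pair for a user (2048-bit RSA is an assumption). */
    static KeyPair newUserKeyPair() throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    /** Encrypts the wave's symmetric key so that only the invitee can read it. */
    static byte[] sealForInvitee(PublicKey inviteeKey, byte[] waveKey) throws Exception {
        Cipher cipher = Cipher.getInstance("RSA/ECB/OAEPPadding");
        cipher.init(Cipher.ENCRYPT_MODE, inviteeKey, OAEP);
        return cipher.doFinal(waveKey);
    }

    /** Recovers the wave key with the invitee's private key. */
    static byte[] openInvitation(PrivateKey ownKey, byte[] sealed) throws Exception {
        Cipher cipher = Cipher.getInstance("RSA/ECB/OAEPPadding");
        cipher.init(Cipher.DECRYPT_MODE, ownKey, OAEP);
        return cipher.doFinal(sealed);
    }
}
```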