Peergos - your private online space

You can think of Peergos as a cross between Dropbox, email, Facebook, YouTube and Twitter, but fully end-to-end encrypted and decentralised to keep your data and social graph private.

Peergos is a peer-to-peer encrypted filesystem with secure file sharing, designed to be resistant to surveillance of data content or friendship graphs. It will have a secure email replacement, with some interoperability with email. There will also be a totally private and secure social network, where users are in control of who sees what (enforced cryptographically).

The name Peergos comes from the Greek word Πύργος (Pyrgos), meaning stronghold or tower, but spelt phonetically to highlight the connection to being peer-to-peer. It is pronounced peer-goss, as in gossip.

WARNING: Peergos has been audited by Cure53, but is still under active development. Some of the features in this documentation are yet to be implemented; in particular, Tor is not used yet.

Aims

  • Securely and privately store files in a peer-to-peer network which has no central node and is generally difficult to disrupt or surveil
  • Secure sharing of such files with other users of the network without visible meta-data (who shares with whom)
  • Beautiful user interface that any computer or mobile user can understand
  • Secure messaging, with optional interop with actual email
  • Independent of the central SSL CA trust architecture, and the domain name system
  • Self hostable - A user should be able to easily run Peergos on a machine in their home and get their own Peergos storage space, and social communication platform from it
  • Secure web interface as well as desktop clients, and native folder sync
  • Enable users to collaborate, editing a document in place concurrently
  • Secure real time chat, and eventually realtime video chat
  • Plausibly deniable dual login to an account, à la TrueCrypt
  • Optional use of U2F for securing login

Features

  • Self hosting
  • Peer-to-peer
  • Multi-device login
  • Web interface
  • Social
  • Sharing
  • Large files
  • Streaming
  • File viewers
  • Secret links
  • Migration
  • Folder Sync
  • Open source

Self hosting

Peergos is fully self hostable. You can run Peergos from your own home or server to obtain as much storage and bandwidth as you need, whilst still transparently interacting with anyone using any other server. Because the server only ever sees encrypted data, you can also tell it to store your data directly in a standard cloud storage provider like Backblaze or Amazon without any loss of privacy.

Self host at home or on your own server

Peer-to-peer

Peergos is built with a peer-to-peer architecture to protect against censorship and surveillance and to improve resiliency. There is no central point through which an attacker could surveil all file transfers. There is also no central DNS name or TLS certificate authority that could be used to attack the network.

Peergos has a peer-to-peer architecture

Multi-device login

Peergos is naturally multi-device. You can log in to your account from any device, and through any Peergos server. It is not tied to any other data like your phone number or email address. All you need is your username and your password. Any modern browser will suffice, including mobile.

Login through any device with a web browser

Web interface

In keeping with our aim to be as convenient to use as existing centralised services, Peergos has a web interface which can be used instead of a native application. This interface does not require any special knowledge, especially not of cryptography or keys, but should nonetheless encourage/enforce safe practices. The web interface does not load any code from third-party servers and is entirely self hosted. You can even load it directly from ipfs and log in!

The web interface can be accessed from a public server over https or from your machine if you run Peergos locally.

Social

Peergos users can send follow requests to each other. If accepted, the other user can then share files with you or send you messages. Following is a one-way mechanism: if Agata follows Bartek then Bartek can share files with Agata. If Bartek also follows Agata, then she can also share files with Bartek. Following can be revoked by either user at any point.

Your friend list is kept encrypted in your own Peergos space, hidden from other users and the server.

There is also a secret link mechanism for sharing files with people who do not have Peergos accounts.

Sharing

A file or folder can be shared with any user who is following you. This access can be read-only or writable. Access can be revoked at any time whilst maintaining access for anyone else the item is shared with. This is all achieved cryptographically with capabilities and lazy re-encryption.

Large files

There is no file size limit in Peergos apart from what will fit in your storage quota. Despite doing client-side encryption and decryption, we can still upload or download arbitrarily large files, streaming directly from/to the filesystem.

Streaming

Peergos is naturally streaming: despite having to decrypt files in the client, we can still stream large files directly with low RAM usage. This allows us to stream large videos in the browser directly into an HTML5 video element.

File viewers

There are several built-in file viewers in Peergos. We have viewers for the following file types:

  • images
  • videos
  • audio
  • pdf
  • binary (hex viewer)

We have editors for the following formats:

  • text
  • markdown
  • code

We support the following languages in the code editor:

  • C
  • C++
  • Clojure
  • CSS
  • diff
  • Go
  • HTML
  • Java
  • JavaScript
  • Kotlin
  • Python
  • Ruby
  • Rust
  • Scala
  • shell
  • TeX
  • XML
  • YAML

Secret links

A secret link can be generated to point to any file or folder. Anyone with a (JavaScript-enabled) web browser can view such a link. This is a capability-based link which includes the necessary key in the hash fragment of the URL. A secret link doesn't expose the file to the network, or indeed to anyone who doesn't have the link itself, because the key material isn't sent to the server.

An example of a secret link to a folder is:

https://demo.peergos.net/#6MDZhRRPT4ugkJuUfcPPhf1US9u7FvRALmj42mJ6e3yDibnLtqfhchE6Frm6Lf/6MDZhRRPT4ugkJuUfcZdxu6JLKyrLBE36Kasxb4jix7An4dbeiekpDF6h2fDBM/HUja6zmXVs24zcRf15s1MWB7kfvyTCp2X9NF4EZqcw7/5Pf7SvCKyBYfP1vm5LfTSw8TMHtLWvJDLv1P4QtCXV8P2Zv8FwR
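
Since browsers never send the fragment (everything after the #) to the server, the capability stays on the client. As a tiny illustration using only the Java standard library (the truncated link below is just a placeholder):

import java.net.URI;

// The key material lives entirely in the URI fragment, which the browser never
// sends to the server; the client-side app reads and interprets it locally.
URI link = URI.create("https://demo.peergos.net/#6MDZhRRPT4ugkJuUfcPPhf1US9u7FvRALmj42mJ6e3yDibnLtqfhchE6Frm6Lf");
String capability = link.getFragment(); // never included in any HTTP request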

Migration

Your identity in Peergos is not tied to any particular server. In other federated social networks, moving server typically means losing your social network and meta-data, if not your data too; Peergos instead allows you to transparently migrate between servers and storage providers without any action required from your friends and without any data loss.

This means, for example, you could start out by creating an account on our demo server which gives you a small amount of storage, then effortlessly migrate to a paid server, or to your own server when you realise how awesome Peergos is.

Folder Sync

Peergos can do standard directory syncing and transparently mount a folder into your host operating system. This is achieved with a FUSE binding (or its equivalent for Windows and macOS).

Open source

Peergos is fully open source, both client and server (including the web interface). The main interface is a web UI, but Peergos can also be accessed using a Java client on the command line, or with a FUSE mount of your Peergos filesystem.

No part of our infrastructure, apart from TLS and Peergos private keys, is secret. We have reproducible builds (we don't use npm or browserify etc.) and we vendor all dependencies, so any historic git commit should be buildable without any external data.

Eventually we want to self host our git repos in Peergos itself.

Peergos architecture

Logical

The logical architecture of Peergos consists of the following:

  1. Content addressed storage: a data store with a mapping from the hash of a block of data to the data itself
  2. Mutable pointers: a mapping from a public key to a hash
  3. PKI: a global append only log for the username <==> {identity public key, storage public key} mappings
  4. Social: each user designates a server to which follow requests for them are sent (the server can't see the source user). This is the same as the storage server for that user and is identified and contacted via its public key.

Logical Architecture

Physical

Each user must have at least one Peergos server (which includes an instance of IPFS). This server stores their data, their mutable pointers and any pending follow requests for them. There is also the global append only log for the PKI which is mirrored on every node. Communication between IPFS instances is done over encrypted secio streams. secio is like TLS but with a different handshake.

The physical architecture

Immutable data

The immutable data store is provided by IPFS and allows anyone to retrieve any cipher text from its hash through any Peergos node. Note that IPFS is used in a fully trustless manner: every single hash and signature is checked client side during reads and writes. The underlying storage can be provided by the local hard disk or any cloud provider without loss of privacy.

The interface for this storage is called ContentAddressedStorage, with the following methods:

/**
 *
 * @return The identity (hash of the public key) of the storage node we are talking to
 */
CompletableFuture<Multihash> id();

/**
 *
 * @param owner
 * @return A new transaction id that can be used to group writes together and protect them from being garbage
 * collected before they have been pinned.
 */
CompletableFuture<TransactionId> startTransaction(PublicKeyHash owner);

/**
 * Release all associated objects from this transaction to allow them to be garbage collected if they haven't been
 * pinned.
 * @param owner
 * @param tid The transaction to close
 * @return True when successfully completed
 */
CompletableFuture<Boolean> closeTransaction(PublicKeyHash owner, TransactionId tid);

/**
 *
 * @param owner The owner of these blocks of data
 * @param writer The public signing key authorizing these writes, which must be owned by the owner key
 * @param signatures The signatures of each block being written (by the writer)
 * @param blocks The blocks to write
 * @param tid The transaction to group these writes under
 * @return The hashes of the blocks written
 */
CompletableFuture<List<Multihash>> put(PublicKeyHash owner, PublicKeyHash writer, List<byte[]> signatures, List<byte[]> blocks, TransactionId tid);

/**
 *
 * @param hash
 * @return The data with the requested hash, deserialized into cbor, or Optional.empty() if no object can be found
 */
CompletableFuture<Optional<CborObject>> get(Multihash hash);

/**
 * Write a block of data that is just raw bytes, not ipld structured cbor
 * @param owner
 * @param writer
 * @param signatures
 * @param blocks
 * @param tid
 * @return The hashes of the blocks written
 */
CompletableFuture<List<Multihash>> putRaw(PublicKeyHash owner, PublicKeyHash writer, List<byte[]> signatures, List<byte[]> blocks, TransactionId tid);

/**
 * Get a block of data that is not in ipld cbor format, just raw bytes
 * @param hash
 * @return
 */
CompletableFuture<Optional<byte[]>> getRaw(Multihash hash);

/**
 * Update an existing pin with a new root. This is useful when modifying a tree of ipld objects where only a small
 * number of components are changed
 * @param owner The owner of the data
 * @param existing The present root hash
 * @param updated The new root hash
 * @return The addresses of the nodes pinning the updated root
 */
CompletableFuture<List<MultiAddress>> pinUpdate(PublicKeyHash owner, Multihash existing, Multihash updated);

/**
 * Recursively pin all the objects referenced via ipld merkle links from a root object
 * @param owner The owner of the data
 * @param hash The root hash of the merkle-tree
 * @return A list of the multihashes pinned
 */
CompletableFuture<List<Multihash>> recursivePin(PublicKeyHash owner, Multihash hash);

/**
 * Recursively unpin a merkle tree of objects. This releases the objects to be collected by garbage collection
 * @param owner The owner of the data
 * @param hash The root hash of the merkle-tree
 * @return A list of the multihashes unpinned
 */
CompletableFuture<List<Multihash>> recursiveUnpin(PublicKeyHash owner, Multihash hash);

/**
 * Get all the merkle-links referenced directly from this object
 * @param root The hash of the object whose links we want
 * @return A list of the multihashes referenced with ipld links in this object
 */
CompletableFuture<List<Multihash>> getLinks(Multihash root);

/**
 * Get the size in bytes of the object with the requested hash
 * @param block The hash of the object
 * @return The size in bytes, or Optional.empty() if it cannot be found.
 */
CompletableFuture<Optional<Integer>> getSize(Multihash block);
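
For illustration, a client-side write against this interface might look like the following sketch; the storage, owner, writer, previousRoot and sign names are assumptions, and error handling is omitted:

// Sketch: write two blocks under one transaction, then move the pin to the new root.
TransactionId tid = storage.startTransaction(owner).join();
List<byte[]> blocks = Arrays.asList(blockA, blockB);
List<byte[]> signatures = new ArrayList<>();
for (byte[] block : blocks)
    signatures.add(sign(writerSecretKey, block)); // each block is signed by the writer key
List<Multihash> hashes = storage.put(owner, writer, signatures, blocks, tid).join();
storage.pinUpdate(owner, previousRoot, hashes.get(0)).join(); // only the delta is re-pinned
storage.closeTransaction(owner, tid).join();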

Mutable

Mutable pointers in Peergos are just a mapping from a public key to a root hash. Clearly, being mutable, they need some kind of synchronization or concurrent data structure. Each user lists an ipfs node id (the hash of its public key) which is responsible for synchronising their writes and publishing the latest root hashes. This means the global filesystem is sharded by username and each user can use an ipfs instance (or cluster) with sufficient capability for their bandwidth requirements.

Initially each user's file system is under a single public key. Additional keys are generated when granting write access.

The interface for MutablePointers has the following methods:

/** Update the hash that a public key maps to (doing a cas with the existing value)
 *
 * @param owner The owner of this signing key
 * @param writer The public signing key
 * @param writerSignedBtreeRootHash the signed serialization of the HashCasPair
 * @return True when successfully completed
 */
CompletableFuture<Boolean> setPointer(PublicKeyHash owner, PublicKeyHash writer, byte[] writerSignedBtreeRootHash);

/** Get the current hash a public key maps to
 *
 * @param writer The public signing key
 * @return The signed cas of the pointer from its previous value to its current value
 */
CompletableFuture<Optional<byte[]>> getPointer(PublicKeyHash owner, PublicKeyHash writer);
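
For illustration, advancing a writer's root hash with this interface might look like the sketch below; the HashCasPair serialization and the parseCurrentRoot and sign helpers are assumptions:

// Sketch: compare-and-swap the writer's root from its current value to newRoot.
Optional<byte[]> current = pointers.getPointer(owner, writer).join();
Multihash currentRoot = parseCurrentRoot(current); // unwrap the signed HashCasPair (helper assumed)
byte[] cas = new HashCasPair(currentRoot, newRoot).serialize();
boolean swapped = pointers.setPointer(owner, writer, sign(writerSecretKey, cas)).join();
if (!swapped) {
    // another device won the race: re-read the pointer, rebase our update and retry
}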

Writing subspaces

Each user has a randomly generated writing key pair which controls writes to their filesystem. They can create new writing key pairs for any subtree, for example when granting write access to a file or folder. If desired, a given writing key can be quota controlled, to prevent users you've granted write access to a file from filling your data store.

Every signing keypair, including your identity keypair and your root writing keypair, maps to a data structure called WriterData. A WriterData can contain merkle links to the roots of merkle champs and various public keys. The full data structure is listed below. Any properties that are empty do not contribute to the size of the serialized WriterData.

// the public signing key controlling this subspace
PublicKeyHash controller;

// publicly readable and present on owner keys
Optional<SecretGenerationAlgorithm> generationAlgorithm;

// This is the root of a champ containing publicly shared files and folders (a lookup from path to capability)
Optional<Multihash> publicData;

// The public boxing key to encrypt follow requests to
Optional<PublicKeyHash> followRequestReceiver;

// Any keys directly owned by the controller, that aren't named
Set<PublicKeyHash> ownedKeys;

// Any keys directly owned by the controller that have specific labels
Map<String, PublicKeyHash> namedOwnedKeys;

// Encrypted entry points to our and our friends file systems (present on owner keys)
Optional<UserStaticData> staticData;

// This is the root of a champ containing the controller's filesystem (present on writer keys)
Optional<Multihash> tree;

Merkle-CHAMP

The main network-visible data structure in Peergos is a merkle compressed hash array mapped trie, or merkle-champ. This data structure is explained in the next section. All the data under a given writing keypair has its own merkle-champ. This is just a mapping from random 32-byte labels to cipher-text blobs. These blobs are cryptree nodes containing the cryptree data structure and, in the case of a file section, merkle links to encrypted file fragments. A merkle-link is just a hash that references another ipfs object. Each 5 MiB section of a file is stored under a different random label in the champ, and similarly with large directories.

the network visible merkle-champ
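
Generating a fresh label for a new file section needs nothing more than a secure random source, as in this sketch:

import java.security.SecureRandom;

// Sketch: each 5 MiB file section (and each large-directory node) gets its own
// uniformly random 32-byte champ label, so labels leak nothing about the contents.
byte[] label = new byte[32];
new SecureRandom().nextBytes(label);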

Usernames

The public keys and usernames are stored in a global append only data structure, with names taken on a first come first served basis. This needs consensus to ensure uniqueness of usernames. This is also where the ipfs node id of the server(s) responsible for synchronising the user's writes is stored. The public key infrastructure (pki) server is called the Corenode, and its interface is the following.

/**
 *
 * @param username
 * @return the key chain proving the claim of the requested username and the ipfs node id of their storage
 */
CompletableFuture<List<UserPublicKeyLink>> getChain(String username);

/** Claim a username, or change the public key owning a username
 *
 * @param username
 * @param chain The changed links of the chain
 * @return True if successfully updated
 */
CompletableFuture<Boolean> updateChain(String username, List<UserPublicKeyLink> chain);

/**
 *
 * @param key the hash of the public identity key of a user
 * @return the username claimed by a given public key
 */
CompletableFuture<String> getUsername(PublicKeyHash key);

/**
 *
 * @param prefix
 * @return All usernames starting with prefix
 */
CompletableFuture<List<String>> getUsernames(String prefix);
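
A client might resolve a username like this (a sketch; the corenode client name is an assumption, and chain verification is elided):

// Sketch: look up a user's key chain, and reverse-lookup a key's username.
List<UserPublicKeyLink> chain = corenode.getChain("agata").join();
// verify every link's signature and expiry client side, then read the current
// identity key hash and storage node id from the final link (details omitted)
String username = corenode.getUsername(identityKeyHash).join();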

Follow requests

A user's storage server stores their pending follow requests until they are retrieved and deleted. These are not actually stored in ipfs itself, and reading them is guarded by a challenge protocol to mitigate someone logging them all now and decrypting them with a large quantum computer once one is built.

Follow requests contain no unencrypted data visible to the network, or server, apart from the target user. Only the target user can decrypt the follow request to see the sender.

The interface for sending, receiving and removing follow requests is called SocialNetwork and has the following methods:

/** Send a follow request to the target public key
 *
 * @param target The public identity key hash of the target user
 * @param encryptedPermission The encrypted follow request
 * @return True if successful
 */
CompletableFuture<Boolean> sendFollowRequest(PublicKeyHash target, byte[] encryptedPermission);

/**
 *
 * @param owner The public identity key hash of the user whose pending follow requests are being retrieved
 * @param signedTime The current time signed by the owner
 * @return all the pending follow requests for the given user
 */
CompletableFuture<byte[]> getFollowRequests(PublicKeyHash owner, byte[] signedTime);

/** Delete a follow request for a given public key
 *
 * @param owner The public identity key hash of the user whose follow request is being deleted
 * @param data The original follow request data to delete, signed by the owner
 * @return True if successful
 */
CompletableFuture<Boolean> removeFollowRequest(PublicKeyHash owner, byte[] data);
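
Tying this to the cryptography described elsewhere, sending a request might look like the sketch below. It assumes the tweetnacl-java library (TweetNaclFast), a serialized read capability in capabilityBytes, and a concat helper; the exact wire format is not specified here:

import com.iwebpp.crypto.TweetNaclFast;
import java.security.SecureRandom;

// Sketch: encrypt a follow request to the target's public boxing key from a
// single-use key pair, so only the target can learn who sent it.
TweetNaclFast.KeyPair ephemeral = TweetNaclFast.Box.keyPair(); // used once, then discarded
byte[] nonce = new byte[24];
new SecureRandom().nextBytes(nonce);
TweetNaclFast.Box box = new TweetNaclFast.Box(targetFollowRequestPublicKey, ephemeral.getSecretKey());
byte[] cipherText = box.box(capabilityBytes, nonce);
// ship the ephemeral public key and nonce alongside the cipher text (format assumed)
byte[] encryptedPermission = concat(ephemeral.getPublicKey(), nonce, cipherText);
boolean ok = social.sendFollowRequest(targetIdentityKeyHash, encryptedPermission).join();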

Security

Peergos' primary focus is security.

Threat models

Peergos supports several threat models depending on the user and their situation.

Casual user:

  • Trusts the SSL certificate hierarchy and the domain name system
  • Is happy to run Javascript in their browser
  • Trusts TLS and their browser (and OS and CPU ;-) )

Such a user can interact with Peergos purely through a public web server that they trust over TLS.

Slightly paranoid user:

  • Doesn't trust DNS or SSL certificates
  • Is happy to run Javascript served from localhost in their browser

This class of user can download and run the Peergos application and access the web interface through their browser over localhost.

More paranoid user:

  • Doesn't trust the SSL certificate system
  • Doesn't trust DNS
  • Doesn't trust javascript

This class of user can download the Peergos application (or otherwise obtain a signed copy), or build it from source. They can then run Peergos locally and use the native user interface, either the command line or a FUSE mount. Once they have obtained or built a copy they trust, they need only trust the integrity of the TweetNaCl cryptography (or our post-quantum upgrade) and the Tor architecture.

Login

Decentralised login is achieved using a capability-based system. Your identity key pairs and root encryption key are derived from your password, salted with your username and passed through the scrypt hashing function (with parameters 17, 8, 1, 96). By virtue of being decentralised, we cannot rate limit attempts to crack your password, so choosing a good password is imperative. We recommend at least 14 random alphanumeric characters.

Login key derivation
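
As a concrete sketch, the derivation could be written with the lambdaworks scrypt library as below; the 32/32/32 split of the 96 output bytes into identity signing seed, following (boxing) seed and symmetric root key is an assumption:

import com.lambdaworks.crypto.SCrypt;

import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

public class LoginKeyDerivation {
    // scrypt parameters from the docs: log2(N) = 17, r = 8, p = 1, 96 output bytes
    private static final int N = 1 << 17, R = 8, P = 1, KEY_BYTES = 96;

    /** Derive all client key material from username + password; nothing else is needed. */
    public static byte[][] deriveKeyMaterial(String username, String password)
            throws GeneralSecurityException {
        byte[] salt = username.getBytes(StandardCharsets.UTF_8);
        byte[] derived = SCrypt.scrypt(password.getBytes(StandardCharsets.UTF_8), salt, N, R, P, KEY_BYTES);
        return new byte[][] {
                Arrays.copyOfRange(derived, 0, 32),  // identity (signing) seed - assumed layout
                Arrays.copyOfRange(derived, 32, 64), // following (boxing) seed - assumed layout
                Arrays.copyOfRange(derived, 64, 96)  // symmetric root key - assumed layout
        };
    }
}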

Encryption

All your files are encrypted symmetrically with a random 256-bit key using salsa20+poly1305 (from TweetNaCl). These keys are not derived from the contents of the file (as some services do) because this leaks to the network which files you are storing. Files are split into chunks of up to 5 MiB and each chunk is independently encrypted, and optionally erasure coded. The link from one chunk of a file to the next is also encrypted so that the network cannot deduce how big an individual file is from the data at rest.
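
A minimal sketch of the chunking and encryption, assuming the tweetnacl-java SecretBox (salsa20+poly1305); how the random keys and nonces are stored (in cryptree nodes) is handled elsewhere:

import com.iwebpp.crypto.TweetNaclFast;

import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkEncryptor {
    private static final int CHUNK_SIZE = 5 * 1024 * 1024; // 5 MiB per chunk
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Encrypt a file's bytes chunk by chunk, each with its own random key and nonce. */
    public static List<byte[]> encryptChunks(byte[] fileBytes) {
        List<byte[]> cipherChunks = new ArrayList<>();
        for (int offset = 0; offset < fileBytes.length; offset += CHUNK_SIZE) {
            byte[] chunk = Arrays.copyOfRange(fileBytes, offset,
                    Math.min(offset + CHUNK_SIZE, fileBytes.length));
            byte[] key = new byte[32];   // random 256-bit key, kept in the cryptree
            byte[] nonce = new byte[24]; // xsalsa20 nonce
            RANDOM.nextBytes(key);
            RANDOM.nextBytes(nonce);
            // salsa20 stream cipher + poly1305 MAC, as provided by TweetNaCl
            cipherChunks.add(new TweetNaclFast.SecretBox(key).box(chunk, nonce));
        }
        return cipherChunks;
    }
}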

Access control

Read access to your files is controlled by a data structure called cryptree, which is essentially a tree of symmetric keys, where the holder of one key can decrypt all the descendant keys. The result is extremely fine-grained access control. You can grant someone access to a file and that user won't be able to see any of the sibling files in the same folder (or even their names, or even their labels in the champ). Granting read access to a folder implies granting read access to all the contents of the folder recursively.

Read access capability tree
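
To make the key-tree idea concrete, here is a deliberately simplified model (not Peergos' actual classes): each node encrypts its children's keys under its own key, so holding one key unlocks exactly that subtree and nothing else.

import com.iwebpp.crypto.TweetNaclFast;

import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;

public class CryptreeNodeSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    final byte[] key = randomKey();                    // this node's symmetric key
    final List<byte[]> encryptedChildKeys = new ArrayList<>();

    /** Link a child: its key is stored encrypted under this node's key. */
    void addChild(CryptreeNodeSketch child) {
        byte[] nonce = new byte[24];
        RANDOM.nextBytes(nonce);
        byte[] sealed = new TweetNaclFast.SecretBox(key).box(child.key, nonce);
        encryptedChildKeys.add(concat(nonce, sealed)); // holders of `key` can recurse; siblings stay opaque
    }

    private static byte[] randomKey() {
        byte[] k = new byte[32];
        RANDOM.nextBytes(k);
        return k;
    }

    private static byte[] concat(byte[] a, byte[] b) {
        byte[] out = new byte[a.length + b.length];
        System.arraycopy(a, 0, out, 0, a.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }
}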

Write access is independently controlled by a similar, but simpler cryptree. All updates to a given subtree are signed by a corresponding writing key pair. When you grant write access to a file or folder then that item is moved to a new writing key pair, to keep the fine grained access control applicable to write access too. This operates independently of the read access control cryptree.

Write access capability tree

Metadata

All of the metadata for a given file is encrypted, with a different symmetric key from the file itself. This includes the name for directories, and for files also the file size, modification time, any thumbnail and the mime type. The size of files is further hidden by splitting them into 5 MiB chunks and storing each chunk under a random label (alongside those for all other files owned by the same user and controlled by the same writing key pair).

The metadata around access patterns will be hidden by hosting files behind a tor hidden service once Tor is integrated. This will ensure that when one user reads a file shared with them by a friend this access does not leak to the network the fact that they are friends.

Quantum resistance

Peergos aims to be a long term secure file storage system, and hence we have architected it with an awareness of quantum computer based attacks (many of us are ex-physicists).

Files that you store but don't share with anyone are already resistant to quantum computer based attacks. This is because the process from logging in to decrypting them only involves hashing and symmetric encryption, neither of which are significantly weakened by a quantum computer.

Files that have been shared are currently vulnerable to a quantum computer attack because they use asymmetric elliptic curve cryptography (Curve25519) to share the decryption capability. However, we plan to upgrade to a suitable post-quantum algorithm soon.

Social graph

Following a user is implemented by them sharing read access to a directory in their filesystem. The read capability is sent, encrypted from a random single-use keypair, to the target user's public key. These requests will be sent over Tor to that user's hidden service to hide the metadata from the network. Once retrieved, the receiving user stores the capability in their own storage, symmetrically encrypted, and deletes the follow request from their server.

TOFU

All users have a public identity key, and these are stored in an append only content addressed data structure (or blockchain if you will). This structure is mirrored by all nodes. This allows users to do public key lookups without leaking to the network who they are looking up. Users also store the keys of their friends in their own filesystem in a TOFU setup. This means that ordinary usage doesn't involve looking up keys from the public blockchain.

How does it work?

This section goes into technical detail about how different operations work in Peergos.

Signing up

The steps involved in signing up are:

  1. Register the username

    • Hash the password and username through scrypt to get the identity key pair, following key pair and symmetric root key.
    • Generate a signed username claim including an expiry, and the ipfs node id of the storage server (the server we are signing up through). This is just identity.sign(username, expiry, [storage id])
    • Send this claim to the pki node for confirmation
  2. Set up your identity

    • Write the public identity key to ipfs
    • Write the public following key to ipfs
    • Create a WriterData for the identity key pair with the two resulting public key hashes
    • Generate a random key pair to control writes to the user's filesystem. Add this key pair as an owned key to the identity WriterData.
    • Commit the identity WriterData (write it to ipfs and set the mutable pointer for the identity key pair to the resulting hash).
  3. Set up your filesystem

    • Create a DirAccess cryptree node for the user's root directory, and add this to the champ of the filesystem key pair.
    • Add a write capability (encrypted) to the static data section of the identity key pair's WriterData
    • Create the /username/shared directory which is used when sending follow requests
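
Condensing the three phases against the interfaces above, sign-up might look like this sketch; the corenode, pointers and storage clients, and the deriveKeyMaterial, buildChain and signing helpers, are assumptions:

// Sketch of the sign-up flow; serialization and error handling omitted.
byte[][] seeds = deriveKeyMaterial(username, password);        // identity, following and root key material
List<UserPublicKeyLink> claim = buildChain(username, identityKeyPair, expiry, storageNodeId);
corenode.updateChain(username, claim).join();                  // 1. register the username

TransactionId tid = storage.startTransaction(identityHash).join();
storage.put(identityHash, identityHash, sigs, Arrays.asList(identityPub, followingPub), tid).join();
// build a WriterData holding both public key hashes plus the random filesystem
// writer key, write it to ipfs, then point the identity's mutable pointer at it:
pointers.setPointer(identityHash, identityHash, signedCasToWriterDataHash).join();
storage.closeTransaction(identityHash, tid).join();            // 2. identity committed
// 3. create the root DirAccess and /username/shared in the writer's champ (omitted)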

Uploading a file

A file upload proceeds in the following steps

  1. Check the filename is valid and free

  2. Create a transaction file with a plan for the upload

  3. For every section of the file which is up to 5 MiB:

    • Encrypt the 5 MiB file section with a random symmetric key
    • Split the cipher text into 128 KiB fragments
    • Create a FileAccess cryptree node with merkle links to all the resulting fragments and an encrypted link to the next section (even if there isn't a next section)
    • Add the FileAccess to the champ of the writing key pair under a random 32 byte label
  4. Add a cryptree link from the parent directory to the file

  5. Delete the transaction file
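
Sketching the per-section loop (steps 3 and 4), with hypothetical fiveMiBSections, encryptWithRandomKey, split and champ helpers standing in for the real cryptree machinery:

// Sketch: encrypt and store one 5 MiB section per iteration.
for (byte[] section : fiveMiBSections(file)) {
    byte[] label = new byte[32];                             // random champ label for this section
    random.nextBytes(label);
    byte[] cipherText = encryptWithRandomKey(section);       // salsa20+poly1305; key kept in the cryptree
    List<byte[]> fragments = split(cipherText, 128 * 1024);  // 128 KiB fragments
    List<Multihash> fragmentHashes =
            storage.putRaw(owner, writer, sign(fragments), fragments, tid).join();
    // the FileAccess cryptree node links the fragments and (encrypted) the next section's label
    champ.put(label, new FileAccess(fragmentHashes, encryptedNextLink));
}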

A modification, such as uploading a file, can be done through any Peergos server as the writes are proxied through an ipfs p2p stream to the owner's storage ipfs node.

Sending a follow request

Sending a follow request proceeds in the following steps:

  1. Look up the target friend's public following key

  2. Create a directory /our-name/shared/friend-name

  3. Encrypt a read capability for that directory, using a random key pair, to the target's following key. Using a random keypair ensures that no one but the target friend can see who sent the request.

  4. Send the follow request to the storage server of the target friend

The target can then either allow and reciprocate (full bi-directional friendship), allow (you are following them), reciprocate (they are following you) or deny. If they have reciprocated then you can grant read or write access to any file or folder by adding a read or write capability in their directory in your space.

When you receive a follow request and either allow or reciprocate it then you add the capability in the request to your static data on your identity WriterData, so you can find it again later, before deleting the follow request from your server.

Proxying requests

Any modifying request needs to be proxied to the correct destination server. This could be signing up, uploading a file, or sending a follow request. This is achieved using an ipfs p2p stream. In particular, because all these requests are http requests, we use the http p2p proxy exposed locally on the ipfs gateway. It means we can send any request to

http://localhost:8080/p2p/$target_node_id/http/$path

and it will go through an end to end encrypted stream through the ipfs network to the destination node, which then sends it to the local Peergos server at:

http://localhost:8000/$path

This is illustrated below: Proxying a request through ipfs
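
From Java, hitting the p2p proxy is plain HTTP; a sketch with a placeholder node id and path:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class P2pProxyExample {
    public static void main(String[] args) throws Exception {
        String targetNodeId = "QmTargetNodeId"; // hypothetical destination ipfs node id
        String path = "some/peergos/api/path";  // hypothetical Peergos API path
        // the local ipfs gateway tunnels this over an encrypted stream to the
        // target node, which forwards it to its local Peergos server on port 8000
        URI proxied = URI.create("http://localhost:8080/p2p/" + targetNodeId + "/http/" + path);
        HttpResponse<byte[]> response = HttpClient.newHttpClient()
                .send(HttpRequest.newBuilder(proxied).GET().build(),
                      HttpResponse.BodyHandlers.ofByteArray());
        System.out.println("status: " + response.statusCode());
    }
}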