Frequently Asked Questions¶

General questions¶

1. Why is the software named `brig`?¶

It is named after the ship with the same name. When we named it, we thought it’s a good name for the following reason:

A brig is a very lightweight and fast ship.
It was commonly used to transport small amount of goods.
A ship operates on streams (sorry 😛)
The name is short and somewhat similar to git.
It gives you a few nautical metaphors and a logo for free.

Truth be told, only half of the two name givers thought it’s a good name, but I still kinda like it.

2. Who develops it?¶

Although this documentation sometimes speaks of »we«, the only developer is currently Chris Pahl. He writes it entirely in his free time, mostly during commuting with the train.

Technical questions¶

1. How is the encryption working?¶

A stream is chunked into equal sized blocks that are encrypted in GCM mode using AES-256. Additionally ChaCha20 (with Poly1305) is currently supported but it might be removed soon. The overall file format is somewhat similar to NaCL secretboxes, but it is more tailored to supporting efficient seeking.

The current default is ChaCha20, although machines with the aes-ni instruction set might yield significant higher throughput. The source of the encryption layer can be found here. Here’s a basic overview over the format:

The key of each file is currently being derived from the content hash of the file (See also Convergent Encryption). If the content changes later, the key does not change since the key is only generated once during the first staging of the file.

Please refer to the implementation for all implementation details for now. No security audits of the implementation have been done yet, therefore I’d appreciate every pair of eyes. Especially while everything is still in flux and won’t harm any users.

2. Is there compression implemented?¶

Yes. The compression is being done before encryption and is only enabled if the file looks compression-worthy. The »worthiness« is determined by looking at its header to guess a mime-type. Depending on the mime-type either snappy or lz4 is selected or no compression is added at all.

The source of the compression layer can be found here. Here’s a basic overview over the format:

3. What hash algorithms are used?¶

Two algorithms are used:

SHA256 is used by IPFS for every backend hash.
SHA3-256 is used as general purpose hash for everything brig internal (Content and Tree hash).

Each hash is encoded as multihash. For output purposes this representation is encoded additionally in base58. Therefore, all hashes that start with W1 are sha3-256 hashes while the ones starting with Qm are sha256 hashes. Keep in mind that base58 is case-sensitive.

4. What kind of deduplication is currently used?¶

It is currently only possible to deduplicate between individual versions of a file. And there also only the portion before the modification.

IPFS implements deduplication, but it is circumvented by encrypting blocks before giving them over to the backend. Implementing a more proper and informed deduplication is one of the long term goals, which require more thorough interaction with IPFS. It is also possible to do some basic deduplication purely on brig side since we have more info on the file than IPFS has.

5. How fast is the I/O when using `brig`?¶

Here are some rather outdated graphs where you can get a rough feeling how fast it can be. There are a few rules of thumb with mostly obvious content:

It it goes over the network, it’s the network speed plus a smaller constant overhead.
If it comes over FUSE, it is quite a bit slower than over brig cat.
If you do not use compression, writing and reading will be faster.

The graphs below only measure in-memory performance compared to a dd like speed (see the »baseline« line).

Your mileage may vary and you better do your own benchmarks for now.

Todo

Explain/Update those graphs.