Frequently Asked Questions
General questions
1. Why is the software named brig?
It is named after the ship of the same name. When we named it, we thought it was a good name for the following reasons:
- A brig is a very lightweight and fast ship.
- It was commonly used to transport small amounts of goods.
- A ship operates on streams (sorry 😛)
- The name is short and somewhat similar to git.
- It gives you a few nautical metaphors and a logo for free.
Truth be told, only half of the two name givers thought it was a good name, but I still kinda like it.
2. Who develops it?
Although this documentation sometimes speaks of »we«, the only developer is currently Chris Pahl. He writes it entirely in his free time, mostly while commuting by train.
Technical questions
1. How does the encryption work?
A stream is chunked into equal-sized blocks that are encrypted in GCM mode using AES-256. Additionally, ChaCha20 (with Poly1305) is currently supported, but it might be removed soon. The overall file format is somewhat similar to NaCl secretboxes, but it is more tailored to supporting efficient seeking.
The current default is ChaCha20, although machines with the aes-ni instruction set might achieve significantly higher throughput with AES. The source of the encryption layer can be found here.
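To illustrate why equal-sized blocks make seeking cheap, here is a minimal sketch in Python. The block size, the per-block overhead and the header size are placeholder values picked for this example and are not brig's actual format parameters; the implementation remains the reference for those.

```python
# Hypothetical layout parameters, chosen only for illustration.
BLOCK_SIZE = 64 * 1024      # plaintext bytes per block (assumed)
BLOCK_OVERHEAD = 12 + 16    # assumed per-block nonce + authentication tag
HEADER_SIZE = 32            # assumed fixed-size file header


def locate(plain_offset: int) -> tuple[int, int, int]:
    """Map a plaintext offset to (block index, offset of that block in the
    encrypted stream, offset inside the decrypted block)."""
    if plain_offset < 0:
        raise ValueError("offset must be non-negative")
    index, inner = divmod(plain_offset, BLOCK_SIZE)
    encrypted_offset = HEADER_SIZE + index * (BLOCK_SIZE + BLOCK_OVERHEAD)
    return index, encrypted_offset, inner
```

Under these assumptions, a seek only needs to read and decrypt the one block that contains the target offset, instead of everything before it.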
Here’s a basic overview of the format:
The key of each file is currently derived from the content hash of the file (see also Convergent Encryption). If the content changes later, the key does not change, since the key is only generated once, during the first staging of the file.
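The following sketch shows the idea of a content-derived key that is fixed at first staging. Using a bare SHA3-256 digest as the key, as well as the names derive_file_key and StagedFile, are assumptions made for this example; the real derivation may involve further steps.

```python
import hashlib


def derive_file_key(content: bytes) -> bytes:
    """Derive a 32-byte key from the content hash (convergent encryption).

    Assumption: the key is simply the SHA3-256 digest of the content.
    """
    return hashlib.sha3_256(content).digest()


class StagedFile:
    """Hypothetical record that remembers the key from the first staging."""

    def __init__(self, content: bytes):
        self.key = derive_file_key(content)  # generated exactly once
        self.content = content

    def update(self, new_content: bytes) -> None:
        # The content changes, but the key deliberately stays the same.
        self.content = new_content
```

Two files with identical initial content end up with the same key, while later modifications of one of them leave its key untouched.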
For now, please refer to the implementation for all details. No security audit of the implementation has been done yet, so I’d appreciate every pair of eyes, especially while everything is still in flux and changes won’t harm any users.
2. Is compression implemented?
Yes. Compression is done before encryption and is only enabled if the file looks compression-worthy. The »worthiness« is determined by looking at the file's header to guess a mime-type. Depending on the mime-type, either snappy or lz4 is selected, or no compression is applied at all.
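As a rough sketch of such a decision, the following Python snippet guesses a mime-type from the first bytes of a file and picks an algorithm name. The magic-number table, the text heuristic and above all the mapping from mime-type to snappy or lz4 are invented for this example; the actual rules live in the compression layer's source.

```python
# Hypothetical magic-number table: (prefix, guessed mime-type).
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"PK\x03\x04", "application/zip"),
    (b"\x1f\x8b", "application/gzip"),
]

# Formats that are already compressed and not worth compressing again.
ALREADY_COMPRESSED = {mime for _, mime in MAGIC}


def guess_mime(header: bytes) -> str:
    """Guess a mime-type from the first bytes of a file."""
    for magic, mime in MAGIC:
        if header.startswith(magic):
            return mime
    # Crude heuristic: non-empty data without NUL bytes is treated as text.
    if header and b"\x00" not in header:
        return "text/plain"
    return "application/octet-stream"


def choose_algorithm(header: bytes):
    """Return an algorithm name, or None for "store uncompressed"."""
    mime = guess_mime(header)
    if mime in ALREADY_COMPRESSED:
        return None
    if mime.startswith("text/"):
        return "lz4"     # assumed choice, for illustration only
    return "snappy"      # assumed choice, for illustration only
```

For instance, choose_algorithm(b"hello world") yields "lz4" under this made-up policy, while a PNG header yields None.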
The source of the compression layer can be found here. Here’s a basic overview of the format:
3. What hash algorithms are used?
Two algorithms are used:
- SHA256 is used by IPFS for every backend hash.
- SHA3-256 is used as the general-purpose hash for everything internal to brig (content and tree hashes).
Each hash is encoded as a multihash. For output purposes this representation is additionally encoded in base58. Therefore, all hashes that start with W1 are sha3-256 hashes, while the ones starting with Qm are sha256 hashes. Keep in mind that base58 is case-sensitive.
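The prefixes follow directly from the encoding, which the sketch below reproduces with the Python standard library only. It wraps a digest in a multihash (function code, digest length, digest) and encodes the result in base58. The helper names are made up for this example, and real IPFS hashes are computed over IPFS's own data structures rather than over raw file bytes, so only the encoding is shown here.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

# Function codes from the multihash table.
SHA2_256 = 0x12
SHA3_256 = 0x16


def base58(data: bytes) -> str:
    """Encode bytes with the Bitcoin-style base58 alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading '1'.
    padding = len(data) - len(data.lstrip(b"\x00"))
    return "1" * padding + encoded


def multihash(code: int, digest: bytes) -> bytes:
    """Prefix a digest with its function code and length (both < 128 here)."""
    return bytes([code, len(digest)]) + digest


data = b"hello brig"
print(base58(multihash(SHA2_256, hashlib.sha256(data).digest())))    # starts with "Qm"
print(base58(multihash(SHA3_256, hashlib.sha3_256(data).digest())))  # starts with "W1"
```

Because the first two bytes are fixed per algorithm and the digest length is always 32 bytes, the leading base58 characters are the same for every input.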
4. What kind of deduplication is currently used?
It is currently only possible to deduplicate between individual versions of a file, and even then only the portion before the modification.
IPFS implements deduplication, but it is circumvented by encrypting blocks before handing them over to the backend. Implementing a more proper and informed deduplication is one of the long-term goals, which requires more thorough interaction with IPFS. It is also possible to do some basic deduplication purely on the brig side, since we have more information about the file than IPFS has.
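One way to picture »only the portion before the modification« is to count the leading fixed-size blocks that two versions of a file have in common. The function name and the tiny block size are made up for this example; this is not how brig or IPFS actually chunk data.

```python
def shared_prefix_blocks(old: bytes, new: bytes, block_size: int = 4) -> int:
    """Count leading fixed-size blocks that are identical in both versions."""
    count = 0
    for start in range(0, min(len(old), len(new)), block_size):
        if old[start:start + block_size] != new[start:start + block_size]:
            break  # everything from the first differing block on is new data
        count += 1
    return count


old = b"AAAABBBBCCCCDDDD"
new = b"AAAABBBBCxCCDDDD"   # modification inside the third block
print(shared_prefix_blocks(old, new))  # 2
```

In this picture, the two blocks before the change could be shared between the versions, while the changed block and everything after it would be stored again.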
5. How fast is the I/O when using brig?
Here are some rather outdated graphs that give a rough feeling for how fast it can be. There are a few rules of thumb, most of them obvious:
- If it goes over the network, it’s the network speed plus a smaller constant overhead.
- If it comes over FUSE, it is quite a bit slower than over brig cat.
- If you do not use compression, writing and reading will be faster.
The graphs below only measure in-memory performance compared to a dd-like speed (see the »baseline« line).
Your mileage may vary, so you had better run your own benchmarks for now.
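For a quick benchmark of your own, a small helper like the following can be enough. It is only a sketch: the function name and the chunk size are arbitrary choices, and it measures nothing brig-specific by itself.

```python
import io
import time


def throughput_mib_s(stream, chunk_size: int = 1 << 20) -> float:
    """Read a binary stream to the end and return the throughput in MiB/s."""
    total = 0
    start = time.perf_counter()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero duration on very small inputs.
    return total / (1 << 20) / max(elapsed, 1e-9)


# In-memory baseline: read 16 MiB of zero bytes from RAM.
baseline = throughput_mib_s(io.BytesIO(bytes(16 << 20)))
print(f"baseline: {baseline:.0f} MiB/s")
```

The same function accepts any binary file object, for example a file opened from a FUSE mount, so the numbers can be compared against the in-memory baseline on the same machine.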
Todo
Explain/Update those graphs.