# plop database

The plop database is a distributed, single-master, append-only
database suitable for transparency systems like Certificate
Transparency.

Data entries are stored together with three attributes:

- index: integer; the first entry has index 0, the next one 1 and so on
- entry hash: the hash over the entry, used for duplicate detection
- leaf hash: hash over specific parts of the entry, usually together
  with a timestamp, for use as a leaf in a Merkle tree

## Storage

Data is kept in two regular files and three key-value stores.

The two regular files, regardless of database backend:

- treesize: filename is static, file contains one line: the number of
  entries in the part of the database up to and including the last
  entry in the last published tree, like a "current pointer"
- index: filename is static, file contains one line per entry: the
  leaf hash

The three key-value stores, in one of two formats, fsdb or permdb
(described in separate sections of this document):

- entry: key=leaf hash, value=the actual data of the entry
- entryhash: key=entry hash, value=leaf hash
- indexforhash: key=leaf hash, value=index

## fsdb backend

The fsdb backend uses regular files in a file system.

fsdb is implemented as "bucketed" directory trees with one file per
key-value pair. The file name is the key and the file content is the
value.

For a concrete example, here's how catlfish names the three key-value
stores used in plop:

- entry: 'certentries'
- entryhash: 'entryhash'
- indexforhash: 'certindex'

## permdb backend

The permdb backend uses a C implementation of a key-value store
optimised for append-only.

See permdb.md for a description of permdb.

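
As a rough illustration of how the two files and three key-value
stores described under Storage relate to each other, here is a minimal
in-memory sketch in Python. It is not the plop implementation. The
class `PlopModel`, its methods, the use of SHA-256, the caller-supplied
`leaf_input`, and returning the existing index for a duplicate entry
are all assumptions made for the example.

```python
import hashlib


class PlopModel:
    """Hypothetical in-memory stand-in for plop's files and stores."""

    def __init__(self):
        self.entry = {}         # leaf hash -> actual data of the entry
        self.entryhash = {}     # entry hash -> leaf hash
        self.indexforhash = {}  # leaf hash -> index
        self.index = []         # position is the index, value is the leaf hash
        self.treesize = 0       # entries covered by the last published tree

    def add(self, data: bytes, leaf_input: bytes) -> int:
        """Append an entry and return its index.

        leaf_input stands for "specific parts of the entry, usually
        together with a timestamp"; how it is built is left to the caller.
        """
        entry_hash = hashlib.sha256(data).digest()  # assumption: SHA-256
        if entry_hash in self.entryhash:
            # Duplicate detection via the entry hash (assumed behaviour:
            # report the index of the entry that is already stored).
            return self.indexforhash[self.entryhash[entry_hash]]
        leaf_hash = hashlib.sha256(leaf_input).digest()
        idx = len(self.index)
        self.entry[leaf_hash] = data
        self.entryhash[entry_hash] = leaf_hash
        self.indexforhash[leaf_hash] = idx
        self.index.append(leaf_hash)
        return idx

    def publish(self) -> None:
        """Move the "current pointer" to cover every entry stored so far."""
        self.treesize = len(self.index)

    def get_by_index(self, idx: int) -> bytes:
        """Look up the entry data through the index list and the entry store."""
        return self.entry[self.index[idx]]
```

In this sketch, entries added after the last `publish()` are stored but
sit beyond `treesize`, which mirrors the description of treesize as a
pointer to the last entry in the last published tree.
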
## Distributed

TODO: describe distribution

## Erlang code in src/

- db.erl: public interface for adding database entries as well as
  retrieving entries by index, leaf hash or entry hash
- index.erl: file-based storage for ordered append-only lists of
  fixed-sized entries, retrievable by index
- perm.erl: dispatching to configured database backend, fsdb or permdb
- fsdb.erl: file-based database backend
- permdb.erl: interface to C implementation of key-value store
- atomic.erl: atomic file operations
- util.erl: helper functions for lower level file handling
- fsyncport.erl: interface to C implementation for fsync(2) syscall

## C code in c_src/

- net_read_write.c: read and write to/from a file descriptor, using
  fsync(2) to increase probability that data lands on disk
- fsynchelper.c: Erlang port for net_read_write
- erlport.c: Erlang/C glue
- filebuffer.c: buffered files
- permdb.c: permdb implementation
- permdbport.c: Erlang port for permdb
- permdbpy.c: Python bindings for permdb
- permdbtest.c: permdb tests
- pstring.h: Pascal string implementation
- utarray.h, uthash.h: array and hash table implementations
- util.c: helper functions
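
The description of index.erl above (an ordered append-only list of
fixed-sized entries, retrievable by index) suggests that a lookup can
be a single seek to `index * record length`. The Python sketch below
shows that idea for a file with one leaf hash per line. It is not a
translation of the Erlang code. The function names, the hex encoding,
and the 64-character hash length (SHA-256) are assumptions made for the
example.

```python
import os

HASH_HEX_LEN = 64              # assumption: hex-encoded SHA-256 leaf hashes
RECORD_LEN = HASH_HEX_LEN + 1  # one hash plus a trailing newline per line


def append_leaf_hash(path: str, leaf_hash_hex: str) -> None:
    """Append one fixed-size record and ask the OS to flush it to disk."""
    if len(leaf_hash_hex) != HASH_HEX_LEN:
        raise ValueError("leaf hash has the wrong length")
    with open(path, "ab") as f:
        f.write(leaf_hash_hex.encode("ascii") + b"\n")
        f.flush()
        os.fsync(f.fileno())  # raises the odds that the data is on disk


def read_leaf_hash(path: str, index: int) -> str:
    """Return the leaf hash stored at the given zero-based index."""
    if index < 0:
        raise IndexError("index must not be negative")
    with open(path, "rb") as f:
        f.seek(index * RECORD_LEN)  # fixed-size records: no scanning needed
        record = f.read(RECORD_LEN)
    if len(record) != RECORD_LEN or not record.endswith(b"\n"):
        raise IndexError("no complete record at this index")
    return record[:-1].decode("ascii")
```

Because every record has the same length, the cost of a lookup does not
depend on how many entries the file holds, and a partially written last
record is detected as incomplete rather than returned.
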