-*- markdown -*-

The merge process
=================

- merge-fetch fetches new entries from all storage nodes. It maintains a
  single file, 'fetched', referring to a given entry in 'logorder',
  indicating which entries are fetched and sequenced so far.

- merge-backup reads 'fetched' and pushes these entries to secondary
  merge nodes, maintaining one file per secondary, 'verified.<node>',
  indicating which entries have been copied to and verified at node
  <node>.

- merge-sth writes a new 'sth' file by reading the 'verified.<node>'
  files into a list, picking a new tree size by sorting the list (in
  falling order) and indexing it with the 'backup-quorum-size' config
  option. If the new tree size is smaller than what the old 'sth' file
  says, no new STH is created. (A sketch of this selection is given at
  the end of this document.)

- merge-dist distributes 'sth' and missing entries to all frontend
  nodes.

Merge distribution (merge_dist)
-----------------------------------------------------

* get current position from the frontend server (curpos)
* send log
  * sends the log in chunks of 1000 hashes, starting from curpos
* get missing entries
  * the server goes through all hashes from curpos and checks whether
    they are present
  * when the server has collected 100000 non-present entries, it
    returns them
  * the server also keeps a separate (in-memory) counter caching the
    index of the first entry that either has not yet been checked for
    presence, or has been checked and found to be non-present, so that
    it can start from that position next time
* send entries
  * sends these entries one at a time
  * does not ask for more missing entries when it is done
* send sth
  * sends the sth constructed earlier by merge-sth to the server, which
    verifies all entries and adds entry-to-hash and hash-to-index
    mappings
  * the server saves the last verified position continuously, to avoid
    redoing the work if the verification is aborted and restarted

A pseudocode sketch of this flow is given at the end of this document.

Merge backup (merge_backup)
-----------------------------------------------------

* get verifiedsize from the backup server
* send log
  * determines the end of the log by trying to send small chunks of the
    log hashes, starting from verifiedsize, until it fails, then
    restarts with the normal chunk size (1000)
* get missing entries
  * this stage is the same as for merge_dist
* send entries
  * sends these entries in chunks of 100 at a time (the chunk size is
    limited because of memory considerations and web server limits)
  * when it is done, it goes back to the "get missing entries" stage and
    repeats until there are no more missing entries
* verifyroot
  * the server verifies all entries from verifiedsize onwards, then
    calculates and returns the root hash
  * unlike merge distribution, it does not save the last verified
    position, either continuously or when it finishes, which means it
    has to verify all entries again if it is aborted and restarted
    before verifiedsize has been set to the new value
* if merge_backup sees that the root hash is correct, it sets
  verifiedsize on the backup server

A pseudocode sketch of this flow is also given at the end of this
document.

TODO
====

- Improve merge-fetch by parallelising it: use one process per storage
  node, each writing to a "queue info" file (storage-node, hash), and a
  single "queue handling process" reading the queue files and writing
  to the 'fetched' file. (A sketch of this scheme is given at the end
  of this document.)
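
The tree-size selection done by merge-sth, described above, can be
illustrated with a small sketch. This is not the project's actual code;
the function and parameter names are made up for the example.

```python
def pick_tree_size(verified_sizes, backup_quorum_size, old_sth_size):
    """Pick the tree size for a new STH, or return None if no new STH
    should be written.

    verified_sizes: the sizes read from the 'verified.<node>' files.
    backup_quorum_size: the 'backup-quorum-size' config option.
    old_sth_size: the tree size recorded in the current 'sth' file.
    """
    # Sort in falling order and index with the quorum size; every size
    # at position 0..backup_quorum_size of the sorted list is at least
    # the chosen size.
    new_size = sorted(verified_sizes, reverse=True)[backup_quorum_size]
    # Never publish an STH for a smaller tree than the current one.
    if new_size < old_sth_size:
        return None
    return new_size
```

For example, with verified sizes [120, 100, 90], backup-quorum-size 1
and an old STH covering 95 entries, the new tree size is 100.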
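
The merge_dist flow can be condensed into pseudocode as below. The
get_curpos, sendlog, missingentries, sendentry and sendsth calls stand
in for whatever interface the frontend nodes actually expose; only the
overall sequence and the chunk sizes come from the description above.

```python
def merge_dist_one_node(frontend, logorder, entries, sth):
    """Distribute the log, missing entries and the current STH to one
    frontend node.  All frontend.* calls are illustrative."""
    curpos = frontend.get_curpos()

    # Send the log in chunks of 1000 hashes, starting from the node's
    # current position.
    for start in range(curpos, len(logorder), 1000):
        frontend.sendlog(start, logorder[start:start + 1000])

    # The node walks the hashes from curpos, collects at most 100000
    # that it does not have, and returns them.  It remembers (in memory)
    # where it stopped, so a later run can continue from that position.
    missing = frontend.missingentries()

    # Send the missing entries one at a time; merge_dist does not ask
    # for more missing entries after this.
    for entry_hash in missing:
        frontend.sendentry(entry_hash, entries[entry_hash])

    # Finally send the STH built earlier by merge-sth.  The node
    # verifies all entries, adds entry-to-hash and hash-to-index
    # mappings, and saves its last verified position continuously so an
    # aborted verification can be resumed.
    frontend.sendsth(sth)
```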
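
Similarly, a sketch of the merge_backup flow for a single secondary.
The secondary.* calls and the expected_root_hash argument are invented
for the example, and the end-of-log probing described above is reduced
to a comment.

```python
def merge_backup_one_node(secondary, logorder, entries, tree_size,
                          expected_root_hash):
    """Push entries up to tree_size to one secondary merge node and, if
    the secondary's root hash matches, advance its verifiedsize."""
    verifiedsize = secondary.get_verifiedsize()

    # Send log hashes from verifiedsize up to tree_size.  (As described
    # above, the real process first probes with small chunks to find
    # where sending starts failing, then restarts with the normal chunk
    # size.)
    for start in range(verifiedsize, tree_size, 1000):
        secondary.sendlog(start, logorder[start:start + 1000])

    # Alternate between asking for missing entries and sending them in
    # chunks of 100, until nothing is missing.
    while True:
        missing = secondary.missingentries()
        if not missing:
            break
        for i in range(0, len(missing), 100):
            chunk = missing[i:i + 100]
            secondary.sendentries([(h, entries[h]) for h in chunk])

    # The secondary verifies everything from verifiedsize onwards and
    # returns its root hash; it does not save intermediate verification
    # state, so an aborted run starts over.
    root_hash = secondary.verifyroot(tree_size)
    if root_hash == expected_root_hash:
        secondary.setverifiedsize(tree_size)
```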
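
The merge-fetch parallelisation proposed in the TODO could look roughly
like the sketch below: one fetcher process per storage node appending
(storage-node, hash) lines to its own queue file, and a single
queue-handling process folding the queue files into 'logorder' and
'fetched'. Everything here, including the storage-node API and the file
formats, is an assumption made for the example.

```python
import os

def fetcher(node, queuedir):
    """One process per storage node: append (storage-node, hash) records
    to that node's queue file as new entries are fetched."""
    with open(os.path.join(queuedir, node.name), "a") as queuefile:
        for entry_hash in node.fetch_new_entry_hashes():  # hypothetical API
            queuefile.write("%s %s\n" % (node.name, entry_hash))
            queuefile.flush()

def queue_handler(queuedir, logorder_path, fetched_path):
    """Single process: read every queue file, append hashes that are not
    yet in 'logorder', and record the new position in 'fetched'.  A real
    implementation would do this continuously; this sketch is one pass."""
    with open(logorder_path) as f:
        seen = set(f.read().split())
    with open(logorder_path, "a") as logorder:
        for name in os.listdir(queuedir):
            with open(os.path.join(queuedir, name)) as queuefile:
                for line in queuefile:
                    _node, entry_hash = line.split()
                    if entry_hash not in seen:
                        logorder.write(entry_hash + "\n")
                        seen.add(entry_hash)
    with open(fetched_path, "w") as fetched:
        # Simplified: the real 'fetched' file refers to an entry in
        # 'logorder' rather than just a count.
        fetched.write("%d\n" % len(seen))
```

Each fetcher would run as its own OS process (for instance via
multiprocessing.Process), while queue_handler runs alone; keeping all
writes to 'logorder' and 'fetched' in a single process avoids
concurrent writers.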