This file describes the design, layouts, and file formats of a libsvn_fs_fs repository. Design ------ In FSFS, each committed revision is represented as an immutable file containing the new node-revisions, contents, and changed-path information for the revision, plus a second, changeable file containing the revision properties. In contrast to the BDB back end, the contents of recent revision of files are stored as deltas against earlier revisions, instead of the other way around. This is less efficient for common-case checkouts, but brings greater simplicity and robustness, as well as the flexibility to make commits work without write access to existing revisions. Skip-deltas and delta combination mitigate the checkout cost. In-progress transactions are represented with a prototype rev file containing only the new text representations of files (appended to as changed file contents come in), along with a separate file for each node-revision, directory representation, or property representation which has been changed or added in the transaction. During the final stage of the commit, these separate files are marshalled onto the end of the prototype rev file to form the immutable revision file. Layout of the FS directory -------------------------- The layout of the FS directory (the "db" subdirectory of the repository) is: revs/ Subdirectory containing revs <shard>/ Shard directory, if sharding is in use (see below) <revnum> File containing rev <revnum> <shard>.pack Pack directory, if the repo has been packed (see below) pack Pack file, if the repository has been packed (see below) manifest Pack manifest file, if a pack file exists (see below) revprops/ Subdirectory containing rev-props <shard>/ Shard directory, if sharding is in use (see below) <revnum> File containing rev-props for <revnum> transactions/ Subdirectory containing transactions <txnid>.txn/ Directory containing transaction <txnid> txn-protorevs/ Subdirectory containing transaction proto-revision files <txnid>.rev Proto-revision file for transaction <txnid> <txnid>.rev-lock Write lock for proto-rev file txn-current File containing the next transaction key locks/ Subdirectory containing locks <partial-digest>/ Subdirectory named for first 3 letters of an MD5 digest <digest> File containing locks/children for path with <digest> node-origins/ Lazy cache of origin noderevs for nodes <partial-nodeid> File containing noderev ID of origins of nodes current File specifying current revision and next node/copy id fs-type File identifying this filesystem as an FSFS filesystem write-lock Empty file, locked to serialise writers txn-current-lock Empty file, locked to serialise 'txn-current' uuid File containing the UUID of the repository format File containing the format number of this filesystem fsfs.conf Configuration file min-unpacked-rev File containing the oldest revision not in a pack file rep-cache.db SQLite database mapping rep checksums to locations Files in the revprops directory are in the hash dump format used by svn_hash_write. The format of the "current" file is: * Format 3 and above: a single line of the form "<youngest-revision>\n" giving the youngest revision for the repository. * Format 2 and below: a single line of the form "<youngest-revision> <next-node-id> <next-copy-id>\n" giving the youngest revision, the next unique node-ID, and the next unique copy-ID for the repository. The "write-lock" file is an empty file which is locked before the final stage of a commit and unlocked after the new "current" file has been moved into place to indicate that a new revision is present. It is also locked during a revprop propchange while the revprop file is read in, mutated, and written out again. Note that readers are never blocked by any operation - writers must ensure that the filesystem is always in a consistent state. The "txn-current" file is a file with a single line of text that contains only a base-36 number. The current value will be used in the next transaction name, along with the revision number the transaction is based on. This sequence number ensures that transaction names are not reused, even if the transaction is aborted and a new transaction based on the same revision is begun. The only operation that FSFS performs on this file is "get and increment"; the "txn-current-lock" file is locked during this operation. "fsfs.conf" is a configuration file in the standard Subversion/Python config format. It is automatically generated when you create a new repository; read the generated file for details on what it controls. When representation sharing is enabled, the filesystem tracks representation checksum and location mappings using a SQLite database in "rep-cache.db". The database has a single table, which stores the sha1 hash text as the primary key, mapped to the representation revision, offset, size and expanded size. A final field, reuse count, is currently used for distinguishing representations which are shared. This file is not required, and may be removed at an abritrary time, with the subsequent loss of rep-sharing capabilities. Filesystem formats ------------------ The "format" file defines what features are permitted within the filesystem, and indicates changes that are not backward-compatible. It serves the same purpose as the repository file of the same name. The filesystem format file was introduced in Subversion 1.2, and so will not be present if the repository was created with an older version of Subversion. An absent format file should be interpreted as indicating a format 1 filesystem. The format file is a single line of the form "<format number>\n", followed by any number of lines specifying 'format options' - additional information about the filesystem's format. Each format option line is of the form "<option>\n" or "<option> <parameters>\n". Clients should raise an error if they encounter an option not permitted by the format number in use. The formats are: Format 1, understood by Subversion 1.1+ Format 2, understood by Subversion 1.4+ Format 3, understood by Subversion 1.5+ Format 4, understood by Subversion 1.6+ The differences between the formats are: Delta representation in revision files Format 1: svndiff0 only Formats 2-4: svndiff0 or svndiff1 Format options Formats 1-2: none permitted Format 3-4: "layout" option Transaction name reuse Formats 1-2: transaction names may be reused Format 3-4: transaction names generated using txn-current file Location of proto-rev file and its lock Formats 1-2: transactions/<txnid>/rev and transactions/<txnid>/rev-lock. Format 3-4: txn-protorevs/<txnid>.rev and txn-protorevs/<txnid>.rev-lock. Node-ID and copy-ID generation Formats 1-2: Node-IDs and copy-IDs are guaranteed to form a monotonically increasing base36 sequence using the "current" file. Format 3-4: Node-IDs and copy-IDs use the new revision number to ensure uniqueness and the "current" file just contains the youngest revision. Mergeinfo metadata: Format 1-2: minfo-here and minfo-count node-revision fields are not stored. svn_fs_get_mergeinfo returns an error. Format 3-4: minfo-here and minfo-count node-revision fields are maintained. svn_fs_get_mergeinfo works. Revision changed paths list: Format 1-3: Does not contain the node's kind. Format 4: Contains the node's kind. Filesystem format options ------------------------- Currently, the only recognised format option is "layout", which specifies the paths that will be used to store the revision files and revision property files. The "layout" option is followed by the name of the filesystem layout and any required parameters. The default layout, if no "layout" keyword is specified, is the 'linear' layout. The known layouts, and the parameters they require, are as follows: "linear" Revision files and rev-prop files are named after the revision they represent, and are placed directly in the revs/ and revprops/ directories. r1234 will be represented by the revision file revs/1234 and the rev-prop file revprops/1234. "sharded <max-files-per-directory>" Revision files and rev-prop files are named after the revision they represent, and are placed in a subdirectory of the revs/ and revprops/ directories named according to the 'shard' they belong to. Shards are numbered from zero and contain between one and the maximum number of files per directory specified in the layout's parameters. For the "sharded 1000" layout, r1234 will be represented by the revision file revs/1/1234 and rev-prop file revprops/1/1234. The revs/0/ directory will contain revisions 0-999, revs/1/ will contain 1000-1999, and so on. Packing ------- A repository can optionally be "packed" to conserve space on disk. The packing process concatenates all the revision files in each full shard to create pack files. A manifest file is also created for each shard which records the indexes of the corresponding revision files in the pack file. In addition, the original shard is removed, and reads are redirected to the pack file. The manifest file consists of a list of offsets, one for each revision in the pack file. The offsets are stored as ASCII decimal, and separated by a newline character. Node-revision IDs ----------------- In order to support efficient lookup of node-revisions by their IDs and to simplify the allocation of fresh node-IDs during a transaction, we treat the fields of a node-ID in new and interesting ways. Within a revision file, node-revs have a txn-id field of the form "r<rev>/<offset>", to support easy lookup. New node-revision IDs assigned within a transaction have the txn-id field of "t<txnid>". When a new node-id or copy-id is assigned in a transaction, the ID used is a "_" followed by a base36 number unique to the transaction. During the final phase of a commit, node-revision IDs are rewritten to have repository-wide unique node-ID and copy-ID fields, and to have "r<rev>/<offset>" txn-id fields. In Format 3 and above, this uniqueness is done by changing a temporary id of "_<base36>" to "<rev>-<base36>". Note that this means that the originating revision of a line of history or a copy can be determined by looking at the node ID. In Format 2 and below, the "current" file contains global base36 node-ID and copy-ID counters; during the commit, the counter value is added to the transaction-specific base36 ID, and the value in "current" is adjusted. (It is legal for Format 3 repositories to contain Format 2-style IDs; this just prevents I/O-less node-origin-rev lookup for those nodes.) The temporary assignment of node-ID and copy-ID fields has implications for svn_fs_compare_ids and svn_fs_check_related. The IDs _1.0.t1 is not related to the ID _1.0.t2 even though they have the same node-ID, because temporary node-IDs are restricted in scope to the transactions they belong to. There is a lazily created cache mapping from node-IDs to the full node-revision ID where they are created. This is in the node-origins directory; the file name is the node-ID without its last character (or "0" for single-character node IDs) and the contents is a serialized hash mapping from node-ID to node-revision ID. This cache is only used for node-IDs of the pre-Format 3 style. Copy-IDs and copy roots ----------------------- Copy-IDs are assigned in the same manner as they are in the BDB implementation: * A node-rev resulting from a creation operation (with no copy history) receives the copy-ID of its parent directory. * A node-rev resulting from a copy operation receives a fresh copy-ID, as one would expect. * A node-rev resulting from a modification operation receives a copy-ID depending on whether its predecessor derives from a copy operation or whether it derives from a creation operation with no intervening copies: - If the predecessor does not derive from a copy, the new node-rev receives the copy-ID of its parent directory. If the node-rev is being modified through its created-path, this will be the same copy-ID as the predecessor node-rev has; however, if the node-rev is being modified through a copied ancestor directory (i.e. we are performing a "lazy copy"), this will be a different copy-ID. - If the predecessor derives from a copy and the node-rev is being modified through its created-path, the new node-rev receives the copy-ID of the predecessor. - If the predecessor derives from a copy and the node-rev is not being modified through its created path, the new node-rev receives a fresh copy-ID. This is called a "soft copy" operation, as distinct from a "true copy" operation which was actually requested through the svn_fs interface. Soft copies exist to ensure that the same <node-ID,copy-ID> pair is not used twice within a transaction. Unlike the BDB implementation, we do not have a "copies" table. Instead, each node-revision record contains a "copyroot" field identifying the node-rev resulting from the true copy operation most proximal to the node-rev. If the node-rev does not itself derive from a copy operation, then the copyroot field identifies the copy of an ancestor directory; if no ancestor directories derive from a copy operation, then the copyroot field identifies the root directory of rev 0. Revision file format -------------------- A revision file contains a concatenation of various kinds of data: * Text and property representations * Node-revisions * The changed-path data * Two offsets at the very end A representation begins with a line containing either "PLAIN\n" or "DELTA\n" or "DELTA <rev> <offset> <length>\n", where <rev>, <offset>, and <length> give the location of the delta base of the representation and the amount of data it contains (not counting the header or trailer). If no base location is given for a delta, the base is the empty stream. After the initial line comes raw svndiff data, followed by a cosmetic trailer "ENDREP\n". If the a representation is for the text contents of a directory node, the expanded contents are in hash dump format mapping entry names to "<type> <id>" pairs, where <type> is "file" or "dir" and <id> gives the ID of the child node-rev. If a representation is for a property list, the expanded contents are in the form of a dumped hash map mapping property names to property values. The marshalling syntax for node-revs is a series of fields terminated by a blank line. Fields have the syntax "<name>: <value>\n", where <name> is a symbolic field name (each symbolic name is used only once in a given node-rev) and <value> is the value data. Unrecognized fields are ignored, for extensibility. The following fields are defined: id The ID of the node-rev type "file" or "dir" pred The ID of the predecessor node-rev count Count of node-revs since the base of the node text "<rev> <offset> <length> <size> <digest>" for text rep props "<rev> <offset> <length> <size> <digest>" for props rep <rev> and <offset> give location of rep <length> gives length of rep, sans header and trailer <size> gives size of expanded rep <digest> gives hex MD5 digest of expanded rep cpath FS pathname node was created at copyfrom "<rev> <path>" of copyfrom data copyroot "<rev> <created-path>" of the root of this copy minfo-cnt The number of nodes under (and including) this node which have svn:mergeinfo. minfo-here Exists if if this node itself has svn:mergeinfo. The predecessor of a node-rev crosses both soft and true copies; together with the count field, it allows efficient determination of the base for skip-deltas. The first node-rev of a node contains no "pred" field. A node-revision with no properties may omit the "props" field. A node-revision with no contents (a zero-length file or an empty directory) may omit the "text" field. In a node-revision resulting from a true copy operation, the "copyfrom" field gives the copyfrom data. The "copyroot" field identifies the root node-revision of the copy; it may be omitted if the node-rev is its own copy root (as is the case for node-revs with copy history, and for the root node of revision 0). Copy roots are identified by revision and created-path, not by node-rev ID, because a copy root may be a node-rev which exists later on within the same revision file, meaning its offset is not yet known. The changed-path data is represented as a series of changed-path items, each consisting of two lines. The first line has the format "<id> <action> <text-mod> <prop-mod> <path>\n", where <id> is the node-rev ID of the new node-rev, <action> is "add", "delete", "replace", or "modify", <text-mod> and <prop-mod> are "true" or "false" indicating whether the text and/or properties changed, and <path> is the changed pathname. For deletes, <id> is the node-rev ID of the deleted node-rev, and <text-mod> and <prop-mod> are always "false". The second line has the format "<rev> <path>\n" containing the node-rev's copyfrom information if it has any; if it does not, the second line is blank. Starting with FS format 4, <action> may contain the kind ("file" or "dir") of the node, after a hyphen; for example, an added directory may be represented as "add-dir". At the very end of a rev file is a pair of lines containing "\n<root-offset> <cp-offset>\n", where <root-offset> is the offset of the root directory node revision and <cp-offset> is the offset of the changed-path data. All numbers in the rev file format are unsigned and are represented as ASCII decimal. Transaction layout ------------------ A transaction directory has the following layout: props Transaction props next-ids Next temporary node-ID and copy-ID changes Changed-path information so far node.<nid>.<cid> New node-rev data for node node.<nid>.<cid>.props Props for new node-rev, if changed node.<nid>.<cid>.children Directory contents for node-rev In FS formats 1 and 2, it also contains: rev Prototype rev file with new text reps rev-lock Lockfile for writing to the above In newer formats, these files are in the txn-protorevs/ directory. The prototype rev file is used to store the text representations as they are received from the client. To ensure that only one client is writing to the file at a given time, the "rev-lock" file is locked for the duration of each write. The two kinds of props files are all in hash dump format. The "props" file will always be present. The "node.<nid>.<cid>.props" file will only be present if the node-rev properties have been changed. The "next-ids" file contains a single line "<next-temp-node-id> <next-temp-copy-id>\n" giving the next temporary node-ID and copy-ID assignments (without the leading underscores). The next node-ID is also used as a uniquifier for representations which may share the same underlying rep. The "children" file for a node-revision begins with a copy of the hash dump representation of the directory entries from the old node-rev (or a dump of the empty hash for new directories), and then an incremental hash dump entry for each change made to the directory. The "changes" file contains changed-path entries in the same form as the changed-path entries in a rev file, except that <id> and <action> may both be "reset" (in which case <text-mod> and <prop-mod> are both always "false") to indicate that all changes to a path should be considered undone. Reset entries are only used during the final merge phase of a transaction. Actions in the "changes" file always contain a node kind, even if the FS format is older than format 4. The node-rev files have the same format as node-revs in a revision file, except that the "text" and "props" fields are augmented as follows: * The "props" field may have the value "-1" if properties have been changed and are contained in a "props" file within the node-rev subdirectory. * For directory node-revs, the "text" field may have the value "-1" if entries have been changed and are contained in a "contents" file in the node-rev subdirectory. * For the directory node-rev representing the root of the transaction, the "is-fresh-txn-root" field indicates that it has not been made mutable yet (see Issue #2608). * For file node-revs, the "text" field may have the value "-1 <offset> <length> <size> <digest>" if the text representation is within the prototype rev file. * The "copyroot" field may have the value "-1 <created-path>" if the copy root of the node-rev is part of the transaction in process. Locks layout ------------ Locks in FSFS are stored in serialized hash format in files whose names are MD5 digests of the FS path which the lock is associated with. For the purposes of keeping directory inode usage down, these digest files live in subdirectories of the main lock directory whose names are the first 3 characters of the digest filename. Also stored in the digest file for a given FS path are pointers to other digest files which contain information associated with other FS paths that are our path's immediate children. To answer the question, "Does path FOO have a lock associated with it?", one need only generate the MD5 digest of FOO's absolute-in-the-FS path (say, 3b1b011fed614a263986b5c4869604e8), look for a file located like so: /path/to/repos/locks/3b1/3b1b011fed614a263986b5c4869604e8 And then see if that file contains lock information. To inquire about locks on children of the path FOO, you would reference the same path as above, but look for a list of children in that file (instead of lock information). Children are listed as MD5 digests, too, so you would simply iterate over those digests and consult the files they reference, and so on, recursively.