From 00d3e8d7dd9ece6fe89dafff384ff32444754211 Mon Sep 17 00:00:00 2001 From: Ævar Arnfjörð Bjarmason Date: Thu, 4 Aug 2022 18:28:37 +0200 Subject: docs: move index format docs to man section 5 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Continue the move of existing Documentation/technical/* protocol and file-format documentation into our main documentation space by moving the index format documentation. Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Junio C Hamano --- Documentation/Makefile | 2 +- Documentation/gitformat-index.txt | 424 +++++++++++++++++++++++++++++++ Documentation/technical/index-format.txt | 406 ----------------------------- command-list.txt | 1 + 4 files changed, 426 insertions(+), 407 deletions(-) create mode 100644 Documentation/gitformat-index.txt delete mode 100644 Documentation/technical/index-format.txt diff --git a/Documentation/Makefile b/Documentation/Makefile index b53f3c1284..029067a77d 100644 --- a/Documentation/Makefile +++ b/Documentation/Makefile @@ -26,6 +26,7 @@ MAN1_TXT += gitweb.txt MAN5_TXT += gitattributes.txt MAN5_TXT += gitformat-bundle.txt MAN5_TXT += gitformat-commit-graph.txt +MAN5_TXT += gitformat-index.txt MAN5_TXT += githooks.txt MAN5_TXT += gitignore.txt MAN5_TXT += gitmailmap.txt @@ -104,7 +105,6 @@ TECH_DOCS += technical/bitmap-format TECH_DOCS += technical/cruft-packs TECH_DOCS += technical/hash-function-transition TECH_DOCS += technical/http-protocol -TECH_DOCS += technical/index-format TECH_DOCS += technical/long-running-process-protocol TECH_DOCS += technical/multi-pack-index TECH_DOCS += technical/pack-format diff --git a/Documentation/gitformat-index.txt b/Documentation/gitformat-index.txt new file mode 100644 index 0000000000..41491d2a07 --- /dev/null +++ b/Documentation/gitformat-index.txt @@ -0,0 +1,424 @@ +gitformat-index(5) +================== + +NAME +---- +gitformat-index - Git index format + +SYNOPSIS +-------- +[verse] +$GIT_DIR/index + +DESCRIPTION +----------- + +Git index format + +== The Git index file has the following format + + All binary numbers are in network byte order. + In a repository using the traditional SHA-1, checksums and object IDs + (object names) mentioned below are all computed using SHA-1. Similarly, + in SHA-256 repositories, these values are computed using SHA-256. + Version 2 is described here unless stated otherwise. + + - A 12-byte header consisting of + + 4-byte signature: + The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache") + + 4-byte version number: + The current supported versions are 2, 3 and 4. + + 32-bit number of index entries. + + - A number of sorted index entries (see below). + + - Extensions + + Extensions are identified by signature. Optional extensions can + be ignored if Git does not understand them. + + Git currently supports cache tree and resolve undo extensions. + + 4-byte extension signature. If the first byte is 'A'..'Z' the + extension is optional and can be ignored. + + 32-bit size of the extension + + Extension data + + - Hash checksum over the content of the index file before this checksum. + +== Index entry + + Index entries are sorted in ascending order on the name field, + interpreted as a string of unsigned bytes (i.e. memcmp() order, no + localization, no special casing of directory separator '/'). Entries + with the same name are sorted by their stage field. + + An index entry typically represents a file. However, if sparse-checkout + is enabled in cone mode (`core.sparseCheckoutCone` is enabled) and the + `extensions.sparseIndex` extension is enabled, then the index may + contain entries for directories outside of the sparse-checkout definition. + These entries have mode `040000`, include the `SKIP_WORKTREE` bit, and + the path ends in a directory separator. + + 32-bit ctime seconds, the last time a file's metadata changed + this is stat(2) data + + 32-bit ctime nanosecond fractions + this is stat(2) data + + 32-bit mtime seconds, the last time a file's data changed + this is stat(2) data + + 32-bit mtime nanosecond fractions + this is stat(2) data + + 32-bit dev + this is stat(2) data + + 32-bit ino + this is stat(2) data + + 32-bit mode, split into (high to low bits) + + 4-bit object type + valid values in binary are 1000 (regular file), 1010 (symbolic link) + and 1110 (gitlink) + + 3-bit unused + + 9-bit unix permission. Only 0755 and 0644 are valid for regular files. + Symbolic links and gitlinks have value 0 in this field. + + 32-bit uid + this is stat(2) data + + 32-bit gid + this is stat(2) data + + 32-bit file size + This is the on-disk size from stat(2), truncated to 32-bit. + + Object name for the represented object + + A 16-bit 'flags' field split into (high to low bits) + + 1-bit assume-valid flag + + 1-bit extended flag (must be zero in version 2) + + 2-bit stage (during merge) + + 12-bit name length if the length is less than 0xFFF; otherwise 0xFFF + is stored in this field. + + (Version 3 or later) A 16-bit field, only applicable if the + "extended flag" above is 1, split into (high to low bits). + + 1-bit reserved for future + + 1-bit skip-worktree flag (used by sparse checkout) + + 1-bit intent-to-add flag (used by "git add -N") + + 13-bit unused, must be zero + + Entry path name (variable length) relative to top level directory + (without leading slash). '/' is used as path separator. The special + path components ".", ".." and ".git" (without quotes) are disallowed. + Trailing slash is also disallowed. + + The exact encoding is undefined, but the '.' and '/' characters + are encoded in 7-bit ASCII and the encoding cannot contain a NUL + byte (iow, this is a UNIX pathname). + + (Version 4) In version 4, the entry path name is prefix-compressed + relative to the path name for the previous entry (the very first + entry is encoded as if the path name for the previous entry is an + empty string). At the beginning of an entry, an integer N in the + variable width encoding (the same encoding as the offset is encoded + for OFS_DELTA pack entries; see pack-format.txt) is stored, followed + by a NUL-terminated string S. Removing N bytes from the end of the + path name for the previous entry, and replacing it with the string S + yields the path name for this entry. + + 1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes + while keeping the name NUL-terminated. + + (Version 4) In version 4, the padding after the pathname does not + exist. + + Interpretation of index entries in split index mode is completely + different. See below for details. + +== Extensions + +=== Cache tree + + Since the index does not record entries for directories, the cache + entries cannot describe tree objects that already exist in the object + database for regions of the index that are unchanged from an existing + commit. The cache tree extension stores a recursive tree structure that + describes the trees that already exist and completely match sections of + the cache entries. This speeds up tree object generation from the index + for a new commit by only computing the trees that are "new" to that + commit. It also assists when comparing the index to another tree, such + as `HEAD^{tree}`, since sections of the index can be skipped when a tree + comparison demonstrates equality. + + The recursive tree structure uses nodes that store a number of cache + entries, a list of subnodes, and an object ID (OID). The OID references + the existing tree for that node, if it is known to exist. The subnodes + correspond to subdirectories that themselves have cache tree nodes. The + number of cache entries corresponds to the number of cache entries in + the index that describe paths within that tree's directory. + + The extension tracks the full directory structure in the cache tree + extension, but this is generally smaller than the full cache entry list. + + When a path is updated in index, Git invalidates all nodes of the + recursive cache tree corresponding to the parent directories of that + path. We store these tree nodes as being "invalid" by using "-1" as the + number of cache entries. Invalid nodes still store a span of index + entries, allowing Git to focus its efforts when reconstructing a full + cache tree. + + The signature for this extension is { 'T', 'R', 'E', 'E' }. + + A series of entries fill the entire extension; each of which + consists of: + + - NUL-terminated path component (relative to its parent directory); + + - ASCII decimal number of entries in the index that is covered by the + tree this entry represents (entry_count); + + - A space (ASCII 32); + + - ASCII decimal number that represents the number of subtrees this + tree has; + + - A newline (ASCII 10); and + + - Object name for the object that would result from writing this span + of index as a tree. + + An entry can be in an invalidated state and is represented by having + a negative number in the entry_count field. In this case, there is no + object name and the next entry starts immediately after the newline. + When writing an invalid entry, -1 should always be used as entry_count. + + The entries are written out in the top-down, depth-first order. The + first entry represents the root level of the repository, followed by the + first subtree--let's call this A--of the root level (with its name + relative to the root level), followed by the first subtree of A (with + its name relative to A), and so on. The specified number of subtrees + indicates when the current level of the recursive stack is complete. + +=== Resolve undo + + A conflict is represented in the index as a set of higher stage entries. + When a conflict is resolved (e.g. with "git add path"), these higher + stage entries will be removed and a stage-0 entry with proper resolution + is added. + + When these higher stage entries are removed, they are saved in the + resolve undo extension, so that conflicts can be recreated (e.g. with + "git checkout -m"), in case users want to redo a conflict resolution + from scratch. + + The signature for this extension is { 'R', 'E', 'U', 'C' }. + + A series of entries fill the entire extension; each of which + consists of: + + - NUL-terminated pathname the entry describes (relative to the root of + the repository, i.e. full pathname); + + - Three NUL-terminated ASCII octal numbers, entry mode of entries in + stage 1 to 3 (a missing stage is represented by "0" in this field); + and + + - At most three object names of the entry in stages from 1 to 3 + (nothing is written for a missing stage). + +=== Split index + + In split index mode, the majority of index entries could be stored + in a separate file. This extension records the changes to be made on + top of that to produce the final index. + + The signature for this extension is { 'l', 'i', 'n', 'k' }. + + The extension consists of: + + - Hash of the shared index file. The shared index file path + is $GIT_DIR/sharedindex.. If all bits are zero, the + index does not require a shared index file. + + - An ewah-encoded delete bitmap, each bit represents an entry in the + shared index. If a bit is set, its corresponding entry in the + shared index will be removed from the final index. Note, because + a delete operation changes index entry positions, but we do need + original positions in replace phase, it's best to just mark + entries for removal, then do a mass deletion after replacement. + + - An ewah-encoded replace bitmap, each bit represents an entry in + the shared index. If a bit is set, its corresponding entry in the + shared index will be replaced with an entry in this index + file. All replaced entries are stored in sorted order in this + index. The first "1" bit in the replace bitmap corresponds to the + first index entry, the second "1" bit to the second entry and so + on. Replaced entries may have empty path names to save space. + + The remaining index entries after replaced ones will be added to the + final index. These added entries are also sorted by entry name then + stage. + +== Untracked cache + + Untracked cache saves the untracked file list and necessary data to + verify the cache. The signature for this extension is { 'U', 'N', + 'T', 'R' }. + + The extension starts with + + - A sequence of NUL-terminated strings, preceded by the size of the + sequence in variable width encoding. Each string describes the + environment where the cache can be used. + + - Stat data of $GIT_DIR/info/exclude. See "Index entry" section from + ctime field until "file size". + + - Stat data of core.excludesFile + + - 32-bit dir_flags (see struct dir_struct) + + - Hash of $GIT_DIR/info/exclude. A null hash means the file + does not exist. + + - Hash of core.excludesFile. A null hash means the file does + not exist. + + - NUL-terminated string of per-dir exclude file name. This usually + is ".gitignore". + + - The number of following directory blocks, variable width + encoding. If this number is zero, the extension ends here with a + following NUL. + + - A number of directory blocks in depth-first-search order, each + consists of + + - The number of untracked entries, variable width encoding. + + - The number of sub-directory blocks, variable width encoding. + + - The directory name terminated by NUL. + + - A number of untracked file/dir names terminated by NUL. + +The remaining data of each directory block is grouped by type: + + - An ewah bitmap, the n-th bit marks whether the n-th directory has + valid untracked cache entries. + + - An ewah bitmap, the n-th bit records "check-only" bit of + read_directory_recursive() for the n-th directory. + + - An ewah bitmap, the n-th bit indicates whether hash and stat data + is valid for the n-th directory and exists in the next data. + + - An array of stat data. The n-th data corresponds with the n-th + "one" bit in the previous ewah bitmap. + + - An array of hashes. The n-th hash corresponds with the n-th "one" bit + in the previous ewah bitmap. + + - One NUL. + +== File System Monitor cache + + The file system monitor cache tracks files for which the core.fsmonitor + hook has told us about changes. The signature for this extension is + { 'F', 'S', 'M', 'N' }. + + The extension starts with + + - 32-bit version number: the current supported versions are 1 and 2. + + - (Version 1) + 64-bit time: the extension data reflects all changes through the given + time which is stored as the nanoseconds elapsed since midnight, + January 1, 1970. + + - (Version 2) + A null terminated string: an opaque token defined by the file system + monitor application. The extension data reflects all changes relative + to that token. + + - 32-bit bitmap size: the size of the CE_FSMONITOR_VALID bitmap. + + - An ewah bitmap, the n-th bit indicates whether the n-th index entry + is not CE_FSMONITOR_VALID. + +== End of Index Entry + + The End of Index Entry (EOIE) is used to locate the end of the variable + length index entries and the beginning of the extensions. Code can take + advantage of this to quickly locate the index extensions without having + to parse through all of the index entries. + + Because it must be able to be loaded before the variable length cache + entries and other index extensions, this extension must be written last. + The signature for this extension is { 'E', 'O', 'I', 'E' }. + + The extension consists of: + + - 32-bit offset to the end of the index entries + + - Hash over the extension types and their sizes (but not + their contents). E.g. if we have "TREE" extension that is N-bytes + long, "REUC" extension that is M-bytes long, followed by "EOIE", + then the hash would be: + + Hash("TREE" + + + "REUC" + ) + +== Index Entry Offset Table + + The Index Entry Offset Table (IEOT) is used to help address the CPU + cost of loading the index by enabling multi-threading the process of + converting cache entries from the on-disk format to the in-memory format. + The signature for this extension is { 'I', 'E', 'O', 'T' }. + + The extension consists of: + + - 32-bit version (currently 1) + + - A number of index offset entries each consisting of: + + - 32-bit offset from the beginning of the file to the first cache entry + in this block of entries. + + - 32-bit count of cache entries in this block + +== Sparse Directory Entries + + When using sparse-checkout in cone mode, some entire directories within + the index can be summarized by pointing to a tree object instead of the + entire expanded list of paths within that tree. An index containing such + entries is a "sparse index". Index format versions 4 and less were not + implemented with such entries in mind. Thus, for these versions, an + index containing sparse directory entries will include this extension + with signature { 's', 'd', 'i', 'r' }. Like the split-index extension, + tools should avoid interacting with a sparse index unless they understand + this extension. + +GIT +--- +Part of the linkgit:git[1] suite diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt deleted file mode 100644 index 65da0daaa5..0000000000 --- a/Documentation/technical/index-format.txt +++ /dev/null @@ -1,406 +0,0 @@ -Git index format -================ - -== The Git index file has the following format - - All binary numbers are in network byte order. - In a repository using the traditional SHA-1, checksums and object IDs - (object names) mentioned below are all computed using SHA-1. Similarly, - in SHA-256 repositories, these values are computed using SHA-256. - Version 2 is described here unless stated otherwise. - - - A 12-byte header consisting of - - 4-byte signature: - The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache") - - 4-byte version number: - The current supported versions are 2, 3 and 4. - - 32-bit number of index entries. - - - A number of sorted index entries (see below). - - - Extensions - - Extensions are identified by signature. Optional extensions can - be ignored if Git does not understand them. - - Git currently supports cache tree and resolve undo extensions. - - 4-byte extension signature. If the first byte is 'A'..'Z' the - extension is optional and can be ignored. - - 32-bit size of the extension - - Extension data - - - Hash checksum over the content of the index file before this checksum. - -== Index entry - - Index entries are sorted in ascending order on the name field, - interpreted as a string of unsigned bytes (i.e. memcmp() order, no - localization, no special casing of directory separator '/'). Entries - with the same name are sorted by their stage field. - - An index entry typically represents a file. However, if sparse-checkout - is enabled in cone mode (`core.sparseCheckoutCone` is enabled) and the - `extensions.sparseIndex` extension is enabled, then the index may - contain entries for directories outside of the sparse-checkout definition. - These entries have mode `040000`, include the `SKIP_WORKTREE` bit, and - the path ends in a directory separator. - - 32-bit ctime seconds, the last time a file's metadata changed - this is stat(2) data - - 32-bit ctime nanosecond fractions - this is stat(2) data - - 32-bit mtime seconds, the last time a file's data changed - this is stat(2) data - - 32-bit mtime nanosecond fractions - this is stat(2) data - - 32-bit dev - this is stat(2) data - - 32-bit ino - this is stat(2) data - - 32-bit mode, split into (high to low bits) - - 4-bit object type - valid values in binary are 1000 (regular file), 1010 (symbolic link) - and 1110 (gitlink) - - 3-bit unused - - 9-bit unix permission. Only 0755 and 0644 are valid for regular files. - Symbolic links and gitlinks have value 0 in this field. - - 32-bit uid - this is stat(2) data - - 32-bit gid - this is stat(2) data - - 32-bit file size - This is the on-disk size from stat(2), truncated to 32-bit. - - Object name for the represented object - - A 16-bit 'flags' field split into (high to low bits) - - 1-bit assume-valid flag - - 1-bit extended flag (must be zero in version 2) - - 2-bit stage (during merge) - - 12-bit name length if the length is less than 0xFFF; otherwise 0xFFF - is stored in this field. - - (Version 3 or later) A 16-bit field, only applicable if the - "extended flag" above is 1, split into (high to low bits). - - 1-bit reserved for future - - 1-bit skip-worktree flag (used by sparse checkout) - - 1-bit intent-to-add flag (used by "git add -N") - - 13-bit unused, must be zero - - Entry path name (variable length) relative to top level directory - (without leading slash). '/' is used as path separator. The special - path components ".", ".." and ".git" (without quotes) are disallowed. - Trailing slash is also disallowed. - - The exact encoding is undefined, but the '.' and '/' characters - are encoded in 7-bit ASCII and the encoding cannot contain a NUL - byte (iow, this is a UNIX pathname). - - (Version 4) In version 4, the entry path name is prefix-compressed - relative to the path name for the previous entry (the very first - entry is encoded as if the path name for the previous entry is an - empty string). At the beginning of an entry, an integer N in the - variable width encoding (the same encoding as the offset is encoded - for OFS_DELTA pack entries; see pack-format.txt) is stored, followed - by a NUL-terminated string S. Removing N bytes from the end of the - path name for the previous entry, and replacing it with the string S - yields the path name for this entry. - - 1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes - while keeping the name NUL-terminated. - - (Version 4) In version 4, the padding after the pathname does not - exist. - - Interpretation of index entries in split index mode is completely - different. See below for details. - -== Extensions - -=== Cache tree - - Since the index does not record entries for directories, the cache - entries cannot describe tree objects that already exist in the object - database for regions of the index that are unchanged from an existing - commit. The cache tree extension stores a recursive tree structure that - describes the trees that already exist and completely match sections of - the cache entries. This speeds up tree object generation from the index - for a new commit by only computing the trees that are "new" to that - commit. It also assists when comparing the index to another tree, such - as `HEAD^{tree}`, since sections of the index can be skipped when a tree - comparison demonstrates equality. - - The recursive tree structure uses nodes that store a number of cache - entries, a list of subnodes, and an object ID (OID). The OID references - the existing tree for that node, if it is known to exist. The subnodes - correspond to subdirectories that themselves have cache tree nodes. The - number of cache entries corresponds to the number of cache entries in - the index that describe paths within that tree's directory. - - The extension tracks the full directory structure in the cache tree - extension, but this is generally smaller than the full cache entry list. - - When a path is updated in index, Git invalidates all nodes of the - recursive cache tree corresponding to the parent directories of that - path. We store these tree nodes as being "invalid" by using "-1" as the - number of cache entries. Invalid nodes still store a span of index - entries, allowing Git to focus its efforts when reconstructing a full - cache tree. - - The signature for this extension is { 'T', 'R', 'E', 'E' }. - - A series of entries fill the entire extension; each of which - consists of: - - - NUL-terminated path component (relative to its parent directory); - - - ASCII decimal number of entries in the index that is covered by the - tree this entry represents (entry_count); - - - A space (ASCII 32); - - - ASCII decimal number that represents the number of subtrees this - tree has; - - - A newline (ASCII 10); and - - - Object name for the object that would result from writing this span - of index as a tree. - - An entry can be in an invalidated state and is represented by having - a negative number in the entry_count field. In this case, there is no - object name and the next entry starts immediately after the newline. - When writing an invalid entry, -1 should always be used as entry_count. - - The entries are written out in the top-down, depth-first order. The - first entry represents the root level of the repository, followed by the - first subtree--let's call this A--of the root level (with its name - relative to the root level), followed by the first subtree of A (with - its name relative to A), and so on. The specified number of subtrees - indicates when the current level of the recursive stack is complete. - -=== Resolve undo - - A conflict is represented in the index as a set of higher stage entries. - When a conflict is resolved (e.g. with "git add path"), these higher - stage entries will be removed and a stage-0 entry with proper resolution - is added. - - When these higher stage entries are removed, they are saved in the - resolve undo extension, so that conflicts can be recreated (e.g. with - "git checkout -m"), in case users want to redo a conflict resolution - from scratch. - - The signature for this extension is { 'R', 'E', 'U', 'C' }. - - A series of entries fill the entire extension; each of which - consists of: - - - NUL-terminated pathname the entry describes (relative to the root of - the repository, i.e. full pathname); - - - Three NUL-terminated ASCII octal numbers, entry mode of entries in - stage 1 to 3 (a missing stage is represented by "0" in this field); - and - - - At most three object names of the entry in stages from 1 to 3 - (nothing is written for a missing stage). - -=== Split index - - In split index mode, the majority of index entries could be stored - in a separate file. This extension records the changes to be made on - top of that to produce the final index. - - The signature for this extension is { 'l', 'i', 'n', 'k' }. - - The extension consists of: - - - Hash of the shared index file. The shared index file path - is $GIT_DIR/sharedindex.. If all bits are zero, the - index does not require a shared index file. - - - An ewah-encoded delete bitmap, each bit represents an entry in the - shared index. If a bit is set, its corresponding entry in the - shared index will be removed from the final index. Note, because - a delete operation changes index entry positions, but we do need - original positions in replace phase, it's best to just mark - entries for removal, then do a mass deletion after replacement. - - - An ewah-encoded replace bitmap, each bit represents an entry in - the shared index. If a bit is set, its corresponding entry in the - shared index will be replaced with an entry in this index - file. All replaced entries are stored in sorted order in this - index. The first "1" bit in the replace bitmap corresponds to the - first index entry, the second "1" bit to the second entry and so - on. Replaced entries may have empty path names to save space. - - The remaining index entries after replaced ones will be added to the - final index. These added entries are also sorted by entry name then - stage. - -== Untracked cache - - Untracked cache saves the untracked file list and necessary data to - verify the cache. The signature for this extension is { 'U', 'N', - 'T', 'R' }. - - The extension starts with - - - A sequence of NUL-terminated strings, preceded by the size of the - sequence in variable width encoding. Each string describes the - environment where the cache can be used. - - - Stat data of $GIT_DIR/info/exclude. See "Index entry" section from - ctime field until "file size". - - - Stat data of core.excludesFile - - - 32-bit dir_flags (see struct dir_struct) - - - Hash of $GIT_DIR/info/exclude. A null hash means the file - does not exist. - - - Hash of core.excludesFile. A null hash means the file does - not exist. - - - NUL-terminated string of per-dir exclude file name. This usually - is ".gitignore". - - - The number of following directory blocks, variable width - encoding. If this number is zero, the extension ends here with a - following NUL. - - - A number of directory blocks in depth-first-search order, each - consists of - - - The number of untracked entries, variable width encoding. - - - The number of sub-directory blocks, variable width encoding. - - - The directory name terminated by NUL. - - - A number of untracked file/dir names terminated by NUL. - -The remaining data of each directory block is grouped by type: - - - An ewah bitmap, the n-th bit marks whether the n-th directory has - valid untracked cache entries. - - - An ewah bitmap, the n-th bit records "check-only" bit of - read_directory_recursive() for the n-th directory. - - - An ewah bitmap, the n-th bit indicates whether hash and stat data - is valid for the n-th directory and exists in the next data. - - - An array of stat data. The n-th data corresponds with the n-th - "one" bit in the previous ewah bitmap. - - - An array of hashes. The n-th hash corresponds with the n-th "one" bit - in the previous ewah bitmap. - - - One NUL. - -== File System Monitor cache - - The file system monitor cache tracks files for which the core.fsmonitor - hook has told us about changes. The signature for this extension is - { 'F', 'S', 'M', 'N' }. - - The extension starts with - - - 32-bit version number: the current supported versions are 1 and 2. - - - (Version 1) - 64-bit time: the extension data reflects all changes through the given - time which is stored as the nanoseconds elapsed since midnight, - January 1, 1970. - - - (Version 2) - A null terminated string: an opaque token defined by the file system - monitor application. The extension data reflects all changes relative - to that token. - - - 32-bit bitmap size: the size of the CE_FSMONITOR_VALID bitmap. - - - An ewah bitmap, the n-th bit indicates whether the n-th index entry - is not CE_FSMONITOR_VALID. - -== End of Index Entry - - The End of Index Entry (EOIE) is used to locate the end of the variable - length index entries and the beginning of the extensions. Code can take - advantage of this to quickly locate the index extensions without having - to parse through all of the index entries. - - Because it must be able to be loaded before the variable length cache - entries and other index extensions, this extension must be written last. - The signature for this extension is { 'E', 'O', 'I', 'E' }. - - The extension consists of: - - - 32-bit offset to the end of the index entries - - - Hash over the extension types and their sizes (but not - their contents). E.g. if we have "TREE" extension that is N-bytes - long, "REUC" extension that is M-bytes long, followed by "EOIE", - then the hash would be: - - Hash("TREE" + + - "REUC" + ) - -== Index Entry Offset Table - - The Index Entry Offset Table (IEOT) is used to help address the CPU - cost of loading the index by enabling multi-threading the process of - converting cache entries from the on-disk format to the in-memory format. - The signature for this extension is { 'I', 'E', 'O', 'T' }. - - The extension consists of: - - - 32-bit version (currently 1) - - - A number of index offset entries each consisting of: - - - 32-bit offset from the beginning of the file to the first cache entry - in this block of entries. - - - 32-bit count of cache entries in this block - -== Sparse Directory Entries - - When using sparse-checkout in cone mode, some entire directories within - the index can be summarized by pointing to a tree object instead of the - entire expanded list of paths within that tree. An index containing such - entries is a "sparse index". Index format versions 4 and less were not - implemented with such entries in mind. Thus, for these versions, an - index containing sparse directory entries will include this extension - with signature { 's', 'd', 'i', 'r' }. Like the split-index extension, - tools should avoid interacting with a sparse index unless they understand - this extension. diff --git a/command-list.txt b/command-list.txt index ed859fdd79..5e8d838668 100644 --- a/command-list.txt +++ b/command-list.txt @@ -211,6 +211,7 @@ giteveryday guide gitfaq guide gitformat-bundle developerinterfaces gitformat-commit-graph developerinterfaces +gitformat-index developerinterfaces gitglossary guide githooks userinterfaces gitignore userinterfaces -- cgit v1.3