<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/string-list.h, branch main</title>
<subtitle>Fork of git SCM with my patches.</subtitle>
<id>http://git.kilabit.info/git/atom?h=main</id>
<link rel='self' href='http://git.kilabit.info/git/atom?h=main'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/'/>
<updated>2026-03-25T21:00:45Z</updated>
<entry>
<title>hook: move unsorted_string_list_remove() to string-list.[ch]</title>
<updated>2026-03-25T21:00:45Z</updated>
<author>
<name>Adrian Ratiu</name>
<email>adrian.ratiu@collabora.com</email>
</author>
<published>2026-03-25T19:54:52Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=add3564d2f2804ad37b9af773ec6420b497a1725'/>
<id>urn:sha1:add3564d2f2804ad37b9af773ec6420b497a1725</id>
<content type='text'>
Move the convenience wrapper from hook to string-list since
it's a more suitable place. Add a doc comment to the header.

Also add a free_util arg to make the function more generic
and make the API similar to other functions in string-list.h.
Update the existing call-sites.

Suggested-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Adrian Ratiu &lt;adrian.ratiu@collabora.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>string-list: add string_list_sort_u() that mimics "sort -u"</title>
<updated>2026-01-29T17:32:50Z</updated>
<author>
<name>Amisha Chhajed</name>
<email>amishhhaaaa@gmail.com</email>
</author>
<published>2026-01-29T12:12:20Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=2e711acfbdfc0fedf631688d78cde153ba835c93'/>
<id>urn:sha1:2e711acfbdfc0fedf631688d78cde153ba835c93</id>
<content type='text'>
Many callsites of string_list_remove_duplicates() call it
immdediately after calling string_list_sort(), understandably
as the former requires string-list to be sorted, it is clear
that these places are sorting only to remove duplicates and
for no other reason.

Introduce a helper function string_list_sort_u that combines
these two calls that often appear together, to simplify
these callsites. Replace the current calls of those methods with
string_list_sort_u().

Signed-off-by: Amisha Chhajed &lt;amishhhaaaa@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>string-list: change "string_list_find_insert_index" return type to "size_t"</title>
<updated>2025-10-06T16:11:07Z</updated>
<author>
<name>shejialuo</name>
<email>shejialuo@gmail.com</email>
</author>
<published>2025-10-06T06:32:40Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=51c3385e3736aeb5f78cc9ed193779e2cb4a2a29'/>
<id>urn:sha1:51c3385e3736aeb5f78cc9ed193779e2cb4a2a29</id>
<content type='text'>
As "string_list_find_insert_index" is a simple wrapper of
"get_entry_index" and the return type of "get_entry_index" is already
"size_t", we could simply change its return type to "size_t".

Update all callers to use size_t variables for storing the return value.
The tricky fix is the loop condition in "mailmap.c" to properly handle
"size_t" underflow by changing from `0 &lt;= --i` to `i--`.

Remove "DISABLE_SIGN_COMPARE_WARNINGS" from "mailmap.c" as it's no
longer needed with the proper unsigned types.

Signed-off-by: shejialuo &lt;shejialuo@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>string-list: replace negative index encoding with "exact_match" parameter</title>
<updated>2025-10-06T16:11:07Z</updated>
<author>
<name>shejialuo</name>
<email>shejialuo@gmail.com</email>
</author>
<published>2025-10-06T06:32:31Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=e8a32e766fe3f5e40fb8918a4d825e6d0b9aa272'/>
<id>urn:sha1:e8a32e766fe3f5e40fb8918a4d825e6d0b9aa272</id>
<content type='text'>
The "string_list_find_insert_index()" function is used to determine
the correct insertion index for a new string within the string list.
The function also doubles up to convey if the string is already
existing in the list, this is done by returning a negative index
"-1 -index". Users are expected to decode this information. This
approach has several limitations:

1. It requires the callers to look into the detail of the function to
   understand how to decode the negative index encoding.
2. Using int for indices can cause overflow issues when dealing with
   large string lists.

To address these limitations, change the function to return size_t for
the index value and use a separate bool parameter to indicate whether
the index refers to an existing entry or an insertion point.

In some cases, the callers of "string_list_find_insert_index" only need
the index position and don't care whether an exact match is found.
However, "get_entry_index" currently requires a non-NULL "exact_match"
parameter, forcing these callers to declare unnecessary variables.
Let's allow callers to pass NULL for the "exact_match" parameter when
they don't need this information, reducing unnecessary variable
declarations in calling code.

Signed-off-by: shejialuo &lt;shejialuo@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>string-list: use bool instead of int for "exact_match"</title>
<updated>2025-10-06T16:11:07Z</updated>
<author>
<name>shejialuo</name>
<email>shejialuo@gmail.com</email>
</author>
<published>2025-10-06T06:32:23Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=03ef7762ea12f3b034a2281040bd61c74fd36386'/>
<id>urn:sha1:03ef7762ea12f3b034a2281040bd61c74fd36386</id>
<content type='text'>
The "exact_match" parameter in "get_entry_index" is used to indicate
whether a string is found or not, which is fundamentally a true/false
value. As we allow the use of bool, let's use bool instead of int to
make the function more semantically clear.

Signed-off-by: shejialuo &lt;shejialuo@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>string-list: optionally omit empty string pieces in string_list_split*()</title>
<updated>2025-08-03T05:34:45Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-08-01T22:04:22Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=27531efa41cfa882473513dd93e696a16f6eb87b'/>
<id>urn:sha1:27531efa41cfa882473513dd93e696a16f6eb87b</id>
<content type='text'>
Teach the unified split_string() machinery a new flag bit,
STRING_LIST_SPLIT_NONEMPTY, to cause empty split pieces to be
omitted from the resulting string list.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>string-list: optionally trim string pieces split by string_list_split*()</title>
<updated>2025-08-03T05:34:32Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-08-01T22:04:20Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=576454974165d51b7e39c0608cde1c84978f1a8a'/>
<id>urn:sha1:576454974165d51b7e39c0608cde1c84978f1a8a</id>
<content type='text'>
Teach the unified split_string() to take an optional "flags" word,
and define the first flag STRING_LIST_SPLIT_TRIM to cause the split
pieces to be trimmed before they are placed in the string list.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>string-list: align string_list_split() with its _in_place() counterpart</title>
<updated>2025-08-03T05:29:27Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2025-08-01T22:04:18Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=9f6dfe43c8a55b833ae16486bcafe29b543461f9'/>
<id>urn:sha1:9f6dfe43c8a55b833ae16486bcafe29b543461f9</id>
<content type='text'>
The string_list_split_in_place() function was updated by 52acddf3
(string-list: multi-delimiter `string_list_split_in_place()`,
2023-04-24) to take more than one delimiter characters, hoping that
we can later use it to replace our uses of strtok().  We however did
not make a matching change to the string_list_split() function,
which is very similar.

Before giving both functions more features in future commits, allow
string_list_split() to also take more than one delimiter characters
to make them closer to each other.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>string-list: introduce `string_list_setlen()`</title>
<updated>2023-04-24T23:01:28Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2023-04-24T22:20:14Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=492ba81346cc45322d0c26bc927b01a34becf304'/>
<id>urn:sha1:492ba81346cc45322d0c26bc927b01a34becf304</id>
<content type='text'>
It is sometimes useful to reduce the size of a `string_list`'s list of
items without having to re-allocate them. For example, doing the
following:

    struct strbuf buf = STRBUF_INIT;
    struct string_list parts = STRING_LIST_INIT_NO_DUP;
    while (strbuf_getline(&amp;buf, stdin) != EOF) {
      parts.nr = 0;
      string_list_split_in_place(&amp;parts, buf.buf, ":", -1);
      /* ... */
    }
    string_list_clear(&amp;parts, 0);

is preferable over calling `string_list_clear()` on every iteration of
the loop. This is because `string_list_clear()` causes us free our
existing `items` array. This means that every time we call
`string_list_split_in_place()`, the string-list internals re-allocate
the same size array.

Since in the above example we do not care about the individual parts
after processing each line, it is much more efficient to pretend that
there aren't any elements in the `string_list` by setting `list-&gt;nr` to
0 while leaving the list of elements allocated as-is.

This allows `string_list_split_in_place()` to overwrite any existing
entries without needing to free and re-allocate them.

However, setting `list-&gt;nr` manually is not safe in all instances. There
are a couple of cases worth worrying about:

  - If the `string_list` is initialized with `strdup_strings`,
    truncating the list can lead to overwriting strings which are
    allocated elsewhere. If there aren't any other pointers to those
    strings other than the ones inside of the `items` array, they will
    become unreachable and leak.

    (We could ourselves free the truncated items between
    string_list-&gt;items[nr] and `list-&gt;nr`, but no present or future
    callers would benefit from this additional complexity).

  - If the given `nr` is larger than the current value of `list-&gt;nr`,
    we'll trick the `string_list` into a state where it thinks there are
    more items allocated than there actually are, which can lead to
    undefined behavior if we try to read or write those entries.

Guard against both of these by introducing a helper function which
guards assignment of `list-&gt;nr` against each of the above conditions.

Co-authored-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>string-list: multi-delimiter `string_list_split_in_place()`</title>
<updated>2023-04-24T23:01:28Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2023-04-24T22:20:10Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=52acddf36c8cb3778ab2098a0d95cc2e375a4069'/>
<id>urn:sha1:52acddf36c8cb3778ab2098a0d95cc2e375a4069</id>
<content type='text'>
Enhance `string_list_split_in_place()` to accept multiple characters as
delimiters instead of a single character.

Instead of using `strchr(2)` to locate the first occurrence of the given
delimiter character, `string_list_split_in_place_multi()` uses
`strcspn(2)` to move past the initial segment of characters comprised of
any characters in the delimiting set.

When only a single delimiting character is provided, `strpbrk(2)` (which
is implemented with `strcspn(2)`) has equivalent performance to
`strchr(2)`. Modern `strcspn(2)` implementations treat an empty
delimiter or the singleton delimiter as a special case and fall back to
calling strchrnul(). Both glibc[1] and musl[2] implement `strcspn(2)`
this way.

This change is one step to removing `strtok(2)` from the tree. Note that
`string_list_split_in_place()` is not a strict replacement for
`strtok()`, since it will happily turn sequential delimiter characters
into empty entries in the resulting string_list. For example:

    string_list_split_in_place(&amp;xs, "foo:;:bar:;:baz", ":;", -1)

would yield a string list of:

    ["foo", "", "", "bar", "", "", "baz"]

Callers that wish to emulate the behavior of strtok(2) more directly
should call `string_list_remove_empty_items()` after splitting.

To avoid regressions for the new multi-character delimter cases, update
t0063 in this patch as well.

[1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strcspn.c;hb=glibc-2.37#l35
[2]: https://git.musl-libc.org/cgit/musl/tree/src/string/strcspn.c?h=v1.2.3#n11

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
