From b0e3e2d06ba3b71079d7b9b6402a0f410180c09b Mon Sep 17 00:00:00 2001 From: Jeff King Date: Mon, 16 Jan 2023 22:04:44 -0500 Subject: http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION The IOCTLFUNCTION option has been deprecated, and generates a compiler warning in recent versions of curl. We can switch to using SEEKFUNCTION instead. It was added in 2008 via curl 7.18.0; our INSTALL file already indicates we require at least curl 7.19.4. But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL}, and those didn't arrive until 7.19.5. One workaround would be to use a bare 0/1 here (or define our own macros). But let's just bump the minimum required version to 7.19.5. That version is only a minor version bump from our existing requirement, and is only a 2 month time bump for versions that are almost 13 years old. So it's not likely that anybody cares about the distinction. Switching means we have to rewrite the ioctl functions into seek functions. In some ways they are simpler (seeking is the only operation), but in some ways more complex (the ioctl allowed only a full rewind, but now we can seek to arbitrary offsets). Curl will only ever use SEEK_SET (per their documentation), so I didn't bother implementing anything else, since it would naturally be completely untested. This seems unlikely to change, but I added an assertion just in case. Likewise, I doubt curl will ever try to seek outside of the buffer sizes we've told it, but I erred on the defensive side here, rather than do an out-of-bounds read. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano Signed-off-by: Johannes Schindelin --- http.c | 22 ++++++++++------------ 1 file changed, 10 insertions(+), 12 deletions(-) (limited to 'http.c') diff --git a/http.c b/http.c index 8b23a546af..b228137388 100644 --- a/http.c +++ b/http.c @@ -186,22 +186,20 @@ size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_) return size / eltsize; } -#ifndef NO_CURL_IOCTL -curlioerr ioctl_buffer(CURL *handle, int cmd, void *clientp) +#ifndef NO_CURL_SEEK +int seek_buffer(void *clientp, curl_off_t offset, int origin) { struct buffer *buffer = clientp; - switch (cmd) { - case CURLIOCMD_NOP: - return CURLIOE_OK; - - case CURLIOCMD_RESTARTREAD: - buffer->posn = 0; - return CURLIOE_OK; - - default: - return CURLIOE_UNKNOWNCMD; + if (origin != SEEK_SET) + BUG("seek_buffer only handles SEEK_SET"); + if (offset < 0 || offset >= buffer->buf.len) { + error("curl seek would be outside of buffer"); + return CURL_SEEKFUNC_FAIL; } + + buffer->posn = offset; + return CURL_SEEKFUNC_OK; } #endif -- cgit v1.3 From 07f91e5e79810a8f17de745d2d84c384add75f0a Mon Sep 17 00:00:00 2001 From: Jeff King Date: Mon, 16 Jan 2023 22:04:48 -0500 Subject: http: support CURLOPT_PROTOCOLS_STR The CURLOPT_PROTOCOLS (and matching CURLOPT_REDIR_PROTOCOLS) flag was deprecated in curl 7.85.0, and using it generate compiler warnings as of curl 7.87.0. The path forward is to use CURLOPT_PROTOCOLS_STR, but we can't just do so unilaterally, as it was only introduced less than a year ago in 7.85.0. Until that version becomes ubiquitous, we have to either disable the deprecation warning or conditionally use the "STR" variant on newer versions of libcurl. This patch switches to the new variant, which is nice for two reasons: - we don't have to worry that silencing curl's deprecation warnings might cause us to miss other more useful ones - we'd eventually want to move to the new variant anyway, so this gets us set up (albeit with some extra ugly boilerplate for the conditional) There are a lot of ways to split up the two cases. One way would be to abstract the storage type (strbuf versus a long), how to append (strbuf_addstr vs bitwise OR), how to initialize, which CURLOPT to use, and so on. But the resulting code looks pretty magical: GIT_CURL_PROTOCOL_TYPE allowed = GIT_CURL_PROTOCOL_TYPE_INIT; if (...http is allowed...) GIT_CURL_PROTOCOL_APPEND(&allowed, "http", CURLOPT_HTTP); and you end up with more "#define GIT_CURL_PROTOCOL_TYPE" macros than actual code. On the other end of the spectrum, we could just implement two separate functions, one that handles a string list and one that handles bits. But then we end up repeating our list of protocols (http, https, ftp, ftp). This patch takes the middle ground. The run-time code is always there to handle both types, and we just choose which one to feed to curl. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano Signed-off-by: Johannes Schindelin --- http.c | 57 ++++++++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 44 insertions(+), 13 deletions(-) (limited to 'http.c') diff --git a/http.c b/http.c index b228137388..a42de7a751 100644 --- a/http.c +++ b/http.c @@ -808,20 +808,37 @@ void setup_curl_trace(CURL *handle) } #ifdef CURLPROTO_HTTP -static long get_curl_allowed_protocols(int from_user) +static void proto_list_append(struct strbuf *list, const char *proto) { - long allowed_protocols = 0; + if (!list) + return; + if (list->len) + strbuf_addch(list, ','); + strbuf_addstr(list, proto); +} + +static long get_curl_allowed_protocols(int from_user, struct strbuf *list) +{ + long bits = 0; - if (is_transport_allowed("http", from_user)) - allowed_protocols |= CURLPROTO_HTTP; - if (is_transport_allowed("https", from_user)) - allowed_protocols |= CURLPROTO_HTTPS; - if (is_transport_allowed("ftp", from_user)) - allowed_protocols |= CURLPROTO_FTP; - if (is_transport_allowed("ftps", from_user)) - allowed_protocols |= CURLPROTO_FTPS; + if (is_transport_allowed("http", from_user)) { + bits |= CURLPROTO_HTTP; + proto_list_append(list, "http"); + } + if (is_transport_allowed("https", from_user)) { + bits |= CURLPROTO_HTTPS; + proto_list_append(list, "https"); + } + if (is_transport_allowed("ftp", from_user)) { + bits |= CURLPROTO_FTP; + proto_list_append(list, "ftp"); + } + if (is_transport_allowed("ftps", from_user)) { + bits |= CURLPROTO_FTPS; + proto_list_append(list, "ftps"); + } - return allowed_protocols; + return bits; } #endif @@ -979,10 +996,24 @@ static CURL *get_curl_handle(void) curl_easy_setopt(result, CURLOPT_POST301, 1); #endif #ifdef CURLPROTO_HTTP +#if LIBCURL_VERSION_NUM >= 0x075500 + { + struct strbuf buf = STRBUF_INIT; + + get_curl_allowed_protocols(0, &buf); + curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS_STR, buf.buf); + strbuf_reset(&buf); + + get_curl_allowed_protocols(-1, &buf); + curl_easy_setopt(result, CURLOPT_PROTOCOLS_STR, buf.buf); + strbuf_release(&buf); + } +#else curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS, - get_curl_allowed_protocols(0)); + get_curl_allowed_protocols(0, NULL)); curl_easy_setopt(result, CURLOPT_PROTOCOLS, - get_curl_allowed_protocols(-1)); + get_curl_allowed_protocols(-1, NULL)); +#endif #else warning(_("Protocol restrictions not supported with cURL < 7.19.4")); #endif -- cgit v1.3 From 5843080c85ae9d13f77442bec7bbec8e84a18100 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Thu, 2 Mar 2023 08:44:16 -0800 Subject: http.c: clear the 'finished' member once we are done with it In http.c, the run_active_slot() function allows the given "slot" to make progress by calling step_active_slots() in a loop repeatedly, and the loop is not left until the request held in the slot completes. Ages ago, we used to use the slot->in_use member to get out of the loop, which misbehaved when the request in "slot" completes (at which time, the result of the request is copied away from the slot, and the in_use member is cleared, making the slot ready to be reused), and the "slot" gets reused to service a different request (at which time, the "slot" becomes in_use again, even though it is for a different request). The loop terminating condition mistakenly thought that the original request has yet to be completed. Today's code, after baa7b67d (HTTP slot reuse fixes, 2006-03-10) fixed this issue, uses a separate "slot->finished" member that is set in run_active_slot() to point to an on-stack variable, and the code that completes the request in finish_active_slot() clears the on-stack variable via the pointer to signal that the particular request held by the slot has completed. It also clears the in_use member (as before that fix), so that the slot itself can safely be reused for an unrelated request. One thing that is not quite clean in this arrangement is that, unless the slot gets reused, at which point the finished member is reset to NULL, the member keeps the value of &finished, which becomes a dangling pointer into the stack when run_active_slot() returns. Clear the finished member before the control leaves the function, which has a side effect of unconfusing compilers like recent GCC 12 that is over-eager to warn against such an assignment. Signed-off-by: Junio C Hamano --- http.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) (limited to 'http.c') diff --git a/http.c b/http.c index 8b23a546af..f44209881e 100644 --- a/http.c +++ b/http.c @@ -1523,6 +1523,32 @@ void run_active_slot(struct active_request_slot *slot) finish_active_slot(slot); } #endif + + /* + * The value of slot->finished we set before the loop was used + * to set our "finished" variable when our request completed. + * + * 1. The slot may not have been reused for another requst + * yet, in which case it still has &finished. + * + * 2. The slot may already be in-use to serve another request, + * which can further be divided into two cases: + * + * (a) If call run_active_slot() hasn't been called for that + * other request, slot->finished would have been cleared + * by get_active_slot() and has NULL. + * + * (b) If the request did call run_active_slot(), then the + * call would have updated slot->finished at the beginning + * of this function, and with the clearing of the member + * below, we would find that slot->finished is now NULL. + * + * In all cases, slot->finished has no useful information to + * anybody at this point. Some compilers warn us for + * attempting to smuggle a pointer that is about to become + * invalid, i.e. &finished. We clear it here to assure them. + */ + slot->finished = NULL; } static void release_active_slot(struct active_request_slot *slot) -- cgit v1.3