path: root/parsing.c
Age  Commit message  Author  Files  Lines
2024-08-02  git: update to v2.46.0 (origin/master)  Christian Hesse  1  -0/+2
Update to git version v2.46.0, this requires changes for these upstream commits:

* e7da9385708accf518a80a1e17969020fb361048 global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
* 9da95bda74cf10e1475384a71fd20914c3b99784 hash: require hash algorithm in `oidread()` and `oidclr()`
* 30aaff437fddd889ba429b50b96ea4c151c502c5 refs: pass repo when peeling objects
* c8f815c2083c4b340d4148a15d45c55f2fcc7d3f refs: remove functions without ref store

Signed-off-by: Christian Hesse <mail@eworm.de>
2023-06-01  git: update to v2.41.0  Christian Hesse  1  -1/+1
Update to git version v2.41.0, with lots of changes... This requires changes for these upstream commits:

* 60ff56f50372c1498718938ef504e744fe011ffb banned.h: mark `strtok()` and `strtok_r()` as banned
* 52acddf36c8cb3778ab2098a0d95cc2e375a4069 string-list: multi-delimiter `string_list_split_in_place()`
* d850b7a545fcfbd97460a921c7f7c59d933eb0f7 cocci: apply the "cache.h" part of "the_repository.pending"
* cb338c23d6d518947bf6f7240bf30e2ec232bd3b cocci: apply the "commit-reach.h" part of "the_repository.pending"
* ecb5091fd4301ac647db0bd2504112b38f7ee06d cocci: apply the "commit.h" part of "the_repository.pending"
* 085390328f5fe1dfba67039b1fd6cc51546a4e41 cocci: apply the "diff.h" part of "the_repository.pending"
* bc726bd075929aab6b3e09d4dd5c2b0726fd5350 cocci: apply the "object-store.h" part of "the_repository.pending"
* bab821646a74c446370fa8d01ca851f247df5033 cocci: apply the "pretty.h" part of "the_repository.pending"
* afe27c889429438829bc8818ed17e4960bd3ef02 cocci: apply the "packfile.h" part of "the_repository.pending"
* 12cb1c10a64170a5d600dd1c6c8abfeec105fb6b cocci: apply the "refs.h" part of "the_repository.pending"
* 035c7de9e9ea11d26df5f9e4bb117f91ed11a9fd cocci: apply the "revision.h" part of "the_repository.pending"

... and some more I missed to list 😜 - for example the move and cleanup of headers and includes (see changes in `cgit.h`) comes to mind...

Signed-off-by: Christian Hesse <mail@eworm.de>
2020-10-20  global: replace hard coded hash length  Christian Hesse  1  -3/+2
With sha1 we had a guaranteed length of 40 hex chars. This changes now that we have to support sha256 with 64 hex chars... Support both. Signed-off-by: Christian Hesse <mail@eworm.de>
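A minimal sketch of what "support both" can look like when validating a full hex object id. The function name `hex_oid_len` and the two constants are hypothetical and defined only for this illustration; they are not taken from cgit or git.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constants for this sketch only. */
#define SHA1_HEX_LEN   40
#define SHA256_HEX_LEN 64

/*
 * Return the hex length if s looks like a full sha1 or sha256 object id,
 * or 0 if it has any other length or contains non-hex characters.
 */
size_t hex_oid_len(const char *s)
{
	size_t len, i;

	if (!s)
		return 0;
	len = strlen(s);
	if (len != SHA1_HEX_LEN && len != SHA256_HEX_LEN)
		return 0;
	for (i = 0; i < len; i++)
		if (!isxdigit((unsigned char)s[i]))
			return 0;
	return len;
}
```

The point of returning the length rather than a boolean is that callers no longer need a hard coded 40 anywhere; they can size buffers from the result.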
2020-10-20  global: replace references to 'sha1' with 'oid'  Christian Hesse  1  -3/+3
For some time now sha1 is considered broken and upstream is working to replace it with sha256. Replace all references to 'sha1' with 'oid', just as upstream does. Signed-off-by: Christian Hesse <mail@eworm.de>
2019-11-08  git: update to v2.24.0  Christian Hesse  1  -1/+1
Update to git version v2.24.0. Never use get_cached_commit_buffer() directly, use repo_get_commit_buffer() instead. The latter calls the former anyway. This fixes segmentation fault when commit-graph is enabled and get_cached_commit_buffer() does not return the expected result. Signed-off-by: Christian Hesse <mail@eworm.de>
2018-10-12  git: update to v2.19.1  Christian Hesse  1  -1/+1
Update to git version v2.19.1. Required changes follow upstream commits:

* commit: add repository argument to get_cached_commit_buffer (3ce85f7e5a41116145179f0fae2ce6d86558d099)
* commit: add repository argument to lookup_commit_reference (2122f6754c93be8f02bfb5704ed96c88fc9837a8)
* object: add repository argument to parse_object (109cd76dd3467bd05f8d2145b857006649741d5c)
* tag: add repository argument to deref_tag (a74093da5ed601a09fa158e5ba6f6f14c1142a3e)
* tag: add repository argument to lookup_tag (ce71efb713f97f476a2d2ab541a0c73f684a5db3)
* tree: add repository argument to lookup_tree (f86bcc7b2ce6cad68ba1a48a528e380c6126705e)
* archive.c: avoid access to the_index (b612ee202a48f129f81f8f6a5af6cf71d1a9caef)
* for_each_*_object: move declarations to object-store.h (0889aae1cd18c1804ba01c1a4229e516dfb9fe9b)

Signed-off-by: Christian Hesse <mail@eworm.de>
2018-09-11  parsing: ban sprintf()  Christian Hesse  1  -1/+1
Git upstream bans sprintf() with commit:

    banned.h: mark sprintf() as banned
    cc8fdaee1eeaf05d8dd55ff11f111b815f673c58

Signed-off-by: Christian Hesse <mail@eworm.de>
2018-09-11  parsing: ban strncpy()  Christian Hesse  1  -2/+1
Git upstream bans strncpy() with commit:

    banned.h: mark strncpy() as banned
    e488b7aba743d23b830d239dcc33d9ca0745a9ad

Signed-off-by: Christian Hesse <mail@eworm.de>
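For context, the usual objection to strncpy() is that it leaves the destination without a terminating NUL when the source is at least as long as the bound, and sprintf() has no bound at all. The following is a sketch of one common replacement pattern built on snprintf(); the helper name `copy_bounded` is hypothetical and this is not the change made in the commits above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy src into dst, which has room for dstsize bytes.
 * The result is always NUL-terminated when dstsize > 0.
 * Returns 0 on success and -1 if the copy had to be truncated.
 */
int copy_bounded(char *dst, size_t dstsize, const char *src)
{
	int n;

	if (dstsize == 0)
		return -1;
	n = snprintf(dst, dstsize, "%s", src);
	if (n < 0 || (size_t)n >= dstsize)
		return -1;
	return 0;
}
```

Unlike strncpy(), a truncated result is still a valid C string, and the return value lets the caller notice the truncation.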
2018-06-27  git: update to v2.18.0  Christian Hesse  1  -1/+1
Update to git version v2.18.0. Required changes follow upstream commits:

* Convert find_unique_abbrev* to struct object_id (aab9583f7b5ea5463eb3f653a0b4ecac7539dc94)
* sha1_file: convert read_sha1_file to struct object_id (b4f5aca40e6f77cbabcbf4ff003c3cf30a1830c8)
* sha1_file: convert sha1_object_info* to object_id (abef9020e3df87c441c9a3a95f592fce5fa49bb9)
* object-store: move packed_git and packed_git_mru to object store (a80d72db2a73174b3f22142eb2014b33696fd795)
* treewide: rename tree to maybe_tree (891435d55da80ca3654b19834481205be6bdfe33)

The changed data types required some of our own functions to be converted to struct object_id:

    ls_item
    print_dir
    print_dir_entry
    print_object
    single_tree_cb
    walk_tree
    write_tree_link

And finally we use new upstream functions that were added for struct object_id:

    hashcpy     -> oidcpy
    sha1_to_hex -> oid_to_hex

Signed-off-by: Christian Hesse <mail@eworm.de>
Reviewed-by: John Keeping <john@keeping.me.uk>
2017-10-14  parsing: don't clear existing state with empty input  John Keeping  1  -2/+1
Since commit c699866 (parsing: clear query path before starting, 2017-02-19), we clear the "page" variable simply by calling cgit_parse_url() even if the URL is empty. This breaks a URL like:

    .../cgit?p=about

which is generated when using the "root-readme" configuration option. This happens because "page" is set to "about" when parsing the query string before we handle the path (which is empty, but non-null).

It turns out that this is not the only case which is broken, but specifying repository and page via query options has been broken since before the commit mentioned above, for example:

    .../cgit?r=git&p=log

Fix both of these by allowing the previous state to persist if PATH_INFO is empty, falling back to the query parameters if no path has been requested.

Reported-by: Tom Ryder <tom@sanctum.geek.nz>
Signed-off-by: John Keeping <john@keeping.me.uk>
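The behaviour described here (leave state from the query string alone when the path is empty) can be sketched as follows. This is an illustration under stated assumptions, not cgit's code: `struct query_state`, `parse_url_sketch` and the simplistic "repo/page" split are all hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the parts of the request state involved. */
struct query_state {
	char repo[64];
	char page[64];
};

/*
 * Parse a path of the form "repo/page" into q.
 * An empty or missing path is a no-op, so values previously set from
 * query parameters such as ?r=git&p=log survive.
 */
void parse_url_sketch(struct query_state *q, const char *url)
{
	const char *slash;

	if (!url || !*url)
		return;

	/* A real path was requested: it fully replaces earlier state. */
	q->repo[0] = '\0';
	q->page[0] = '\0';

	slash = strchr(url, '/');
	if (slash) {
		snprintf(q->repo, sizeof(q->repo), "%.*s",
			 (int)(slash - url), url);
		snprintf(q->page, sizeof(q->page), "%s", slash + 1);
	} else {
		snprintf(q->repo, sizeof(q->repo), "%s", url);
	}
}
```

The early return is the whole point: clearing happens only once we know there is a path to parse.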
2017-08-10  parsing: clear query path before starting  John Keeping  1  -1/+1
By specifying the "url" query parameter multiple times it is possible to end up with ctx.qry.vpath set while ctx.repo is null, which triggers an invalid code path from cgit_print_pageheader() while printing path crumbs, resulting in a null dereference. The previous patch fixed this segfault, but it makes no sense for us to clear ctx.repo while leaving ctx.qry.path set to the previous value, so let's just clear it here so that the last "url" parameter given takes full effect rather than partially overriding the effect of the previous value. Signed-off-by: John Keeping <john@keeping.me.uk>
2016-02-08  parsing: add timezone to ident structures  John Keeping  1  -4/+6
This will allow us to mimic Git's behaviour of showing times in the originator's timezone when displaying commits and tags. Signed-off-by: John Keeping <john@keeping.me.uk>
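As a rough illustration of what storing a timezone involves: ident lines in commit and tag objects end with an offset such as "+0200". The sketch below converts such a string to minutes east of UTC. The function name `tz_to_minutes` is hypothetical, and whether cgit stores the offset in this form is an assumption.

```c
#include <stdlib.h>

/*
 * Convert a "+HHMM" or "-HHMM" offset string to minutes east of UTC.
 * Assumes well-formed input; no validation is done in this sketch.
 */
int tz_to_minutes(const char *tz)
{
	int raw = atoi(tz);              /* "+0200" -> 200, "-0530" -> -530 */
	int sign = raw < 0 ? -1 : 1;
	int mag = raw < 0 ? -raw : raw;  /* magnitude in HHMM form */

	return sign * ((mag / 100) * 60 + mag % 100);
}
```

Adding the result (times 60) to the UTC timestamp before formatting gives the wall-clock time as the author saw it.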
2016-01-13  git: update to v2.7.0  Christian Hesse  1  -2/+2
Update to git version v2.7.0. * Upstream commit ed1c9977cb1b63e4270ad8bdf967a2d02580aa08 (Remove get_object_hash.) changed API: Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: Christian Hesse <mail@eworm.de>
2015-03-05  Drop return value from parse_user()  Lukas Fleischer  1  -11/+7
In commit 936295c (Simplify commit and tag parsing, 2015-03-03), the commit and tag parsing code was refactored. This broke tag messages in ui-tag since the line after the tagger header was erroneously skipped. Rework parse_user() and skip the line manually outside parse_user(). Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
2015-03-05  Remove leading newline characters from tag messages  Lukas Fleischer  1  -0/+3
Fixes a regression introduced in commit 936295c (Simplify commit and tag parsing, 2015-03-03). Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
2015-03-03  Simplify commit and tag parsing  Lukas Fleischer  1  -72/+42
* Use skip_prefix to avoid magic numbers in the code.
* Use xcalloc() instead of xmalloc(), followed by manual initialization.
* Split out line splitting.

Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
2014-12-24  Use split_ident_line() in parse_user()  Lukas Fleischer  1  -28/+17
Use Git's built-in ident line splitting algorithm instead of reimplementing it. This does not only simplify the code but also makes sure that cgit is consistent with Git when it comes to author parsing. Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
2014-07-28  git: update to v2.0.3  John Keeping  1  -1/+2
This is slightly more involved than just bumping the version number because it pulls in a change to convert the commit buffer to a slab, removing the "buffer" field from "struct commit". All sites that access "commit->buffer" have been changed to use the new functions provided for this purpose. Signed-off-by: John Keeping <john@keeping.me.uk>
2014-07-28  parsing.c: make commit buffer const  John Keeping  1  -4/+4
This will be required in order to incorporate the changes to commit buffer handling in Git 2.0.2. Signed-off-by: John Keeping <john@keeping.me.uk>
2014-06-28  git: update for git 2.0  Christian Hesse  1  -6/+6
prefixcmp() and suffixcmp() have been removed; their functionality is now provided by starts_with() and ends_with(). The return values have changed, so instead of just renaming we have to fix the logic. Everything else looks just fine.
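The logic change comes from the two helpers using opposite conventions: the old one returns 0 on a match, like strcmp(), while the new one returns non-zero on a match. The stand-ins below are written for this illustration only (hence the `_sketch` suffix) and are not git's implementations.

```c
#include <string.h>

/* Old convention: 0 means "str begins with prefix", as with strcmp(). */
int prefixcmp_sketch(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix));
}

/* New convention: non-zero means "str begins with prefix". */
int starts_with_sketch(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}
```

So a call site of the form `if (!prefixcmp(p, "author "))` has to become `if (starts_with(p, "author "))`; a plain rename would silently invert every test.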
2014-04-06  Fix cgit_parse_url when a repo url is contained in another repo url  Julian Maurice  1  -9/+14
For example, if I have two repos (remove-suffix is enabled):

    /foo
    /foo/bar

http://cgit/foo/bar/ is interpreted as "repository 'foo', command 'bar'" instead of "repository 'foo/bar'"
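One way to resolve this kind of ambiguity is to prefer the longest repository url that matches the request path on a component boundary. The sketch below shows that technique; it is not necessarily how the commit fixes cgit_parse_url(), and the function name `find_repo` and its array-based interface are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the longest entry in repos[] that is a prefix of
 * path and ends at a '/' or at the end of path, or -1 if none matches.
 */
int find_repo(const char *repos[], size_t n, const char *path)
{
	size_t i, best_len = 0;
	int best = -1;

	for (i = 0; i < n; i++) {
		size_t len = strlen(repos[i]);

		if (strncmp(path, repos[i], len) != 0)
			continue;
		/* Reject partial components, e.g. "foo" against "foobar". */
		if (path[len] != '\0' && path[len] != '/')
			continue;
		if (best < 0 || len > best_len) {
			best = (int)i;
			best_len = len;
		}
	}
	return best;
}
```

With repos "foo" and "foo/bar", the path "foo/bar/" selects "foo/bar", while "foo/log" still selects "foo" and leaves "log" to be read as the command.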
2014-01-16  parsing.c: Remove leading space from committer  Lukas Fleischer  1  -1/+1
This did not really break anything in the past since spaces are ignored when rendering HTML. Remove the preceding space anyway to prevent from potential future problems. Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
2014-01-10  parsing: fix header typo  Jason A. Donenfeld  1  -1/+1
2014-01-10  Replace most uses of strncmp() with prefixcmp()  Lukas Fleischer  1  -6/+6
This is a preparation for replacing all prefix checks with either strip_prefix() or starts_with() when Git 1.8.6 is released. Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
2014-01-08  Update copyright information  Lukas Fleischer  1  -1/+1
* Name "cgit Development Team" as copyright holder to avoid listing every single developer.
* Update copyright ranges.

Signed-off-by: Lukas Fleischer <cgit@crytocrack.de>
2013-03-05  Mark several functions/variables static  Lukas Fleischer  1  -3/+3
Spotted by parsing the output of `gcc -Wmissing-prototypes [...]`. Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
2013-03-04  White space around control verbs.  Jason A. Donenfeld  1  -2/+2
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2012-10-02  do not write outside heap buffer  Jim Meyering  1  -0/+2
* parsing.c (substr): Handle tail < head.

This started when I noticed some cgit segfaults on savannah.gnu.org. Finding the offending URL/commit and then constructing a stand-alone reproducer were far more time-consuming than writing the actual patch.

The problem arises with a commit like this, in which the user name part of the "Author" field is empty:

    $ git log -1
    commit 6f3f41d73393278f3ede68a2cb1e7a2a23fa3421
    Author: <T at h.or>
    Date: Mon Apr 23 22:29:16 2012 +0200

Here's what happens: (this is due to buf=malloc(0); strncpy (buf, head, -1); where "head" may point to plenty of attacker-specified non-NUL bytes, so we can overwrite a zero-length heap buffer with arbitrary data)

    Invalid write of size 1
       at 0x4A09361: strncpy (mc_replace_strmem.c:463)
       by 0x408977: substr (parsing.c:61)
       by 0x4089EF: parse_user (parsing.c:73)
       by 0x408D10: cgit_parse_commit (parsing.c:153)
       by 0x40A540: cgit_mk_refinfo (shared.c:171)
       by 0x40A581: cgit_refs_cb (shared.c:181)
       by 0x43DEB3: do_for_each_ref (refs.c:690)
       by 0x41075E: cgit_print_branches (ui-refs.c:191)
       by 0x416EF2: cgit_print_summary (ui-summary.c:56)
       by 0x40780A: summary_fn (cmd.c:120)
       by 0x40667A: process_request (cgit.c:544)
       by 0x404078: cache_process (cache.c:322)
     Address 0x4c718d0 is 0 bytes after a block of size 0 alloc'd
       at 0x4A0884D: malloc (vg_replace_malloc.c:263)
       by 0x455C85: xmalloc (wrapper.c:35)
       by 0x40894C: substr (parsing.c:60)
       by 0x4089EF: parse_user (parsing.c:73)
       by 0x408D10: cgit_parse_commit (parsing.c:153)
       by 0x40A540: cgit_mk_refinfo (shared.c:171)
       by 0x40A581: cgit_refs_cb (shared.c:181)
       by 0x43DEB3: do_for_each_ref (refs.c:690)
       by 0x41075E: cgit_print_branches (ui-refs.c:191)
       by 0x416EF2: cgit_print_summary (ui-summary.c:56)
       by 0x40780A: summary_fn (cmd.c:120)
       by 0x40667A: process_request (cgit.c:544)

    Invalid write of size 1
       at 0x4A09400: strncpy (mc_replace_strmem.c:463)
       by 0x408977: substr (parsing.c:61)
       by 0x4089EF: parse_user (parsing.c:73)
       by 0x408D10: cgit_parse_commit (parsing.c:153)
       by 0x40A540: cgit_mk_refinfo (shared.c:171)
       by 0x40A581: cgit_refs_cb (shared.c:181)
       by 0x43DEB3: do_for_each_ref (refs.c:690)
       by 0x41075E: cgit_print_branches (ui-refs.c:191)
       by 0x416EF2: cgit_print_summary (ui-summary.c:56)
       by 0x40780A: summary_fn (cmd.c:120)
       by 0x40667A: process_request (cgit.c:544)
       by 0x404078: cache_process (cache.c:322)
     Address 0x4c7192b is not stack'd, malloc'd or (recently) free'd

    Invalid write of size 1
       at 0x4A0940E: strncpy (mc_replace_strmem.c:463)
       by 0x408977: substr (parsing.c:61)
       by 0x4089EF: parse_user (parsing.c:73)
       by 0x408D10: cgit_parse_commit (parsing.c:153)
       by 0x40A540: cgit_mk_refinfo (shared.c:171)
       by 0x40A581: cgit_refs_cb (shared.c:181)
       by 0x43DEB3: do_for_each_ref (refs.c:690)
       by 0x41075E: cgit_print_branches (ui-refs.c:191)
       by 0x416EF2: cgit_print_summary (ui-summary.c:56)
       by 0x40780A: summary_fn (cmd.c:120)
       by 0x40667A: process_request (cgit.c:544)
       by 0x404078: cache_process (cache.c:322)
     Address 0x4c7192d is not stack'd, malloc'd or (recently) free'd

    Process terminating with default action of signal 11 (SIGSEGV)
     Access not within mapped region at address 0x502F000
       at 0x4A09400: strncpy (mc_replace_strmem.c:463)
       by 0x408977: substr (parsing.c:61)
       by 0x4089EF: parse_user (parsing.c:73)
       by 0x408D10: cgit_parse_commit (parsing.c:153)
       by 0x40A540: cgit_mk_refinfo (shared.c:171)
       by 0x40A581: cgit_refs_cb (shared.c:181)
       by 0x43DEB3: do_for_each_ref (refs.c:690)
       by 0x41075E: cgit_print_branches (ui-refs.c:191)
       by 0x416EF2: cgit_print_summary (ui-summary.c:56)
       by 0x40780A: summary_fn (cmd.c:120)
       by 0x40667A: process_request (cgit.c:544)
       by 0x404078: cache_process (cache.c:322)

This happens when tail - head == -1 here:

    (parsing.c)
    char *substr(const char *head, const char *tail)
    {
        char *buf;

        buf = xmalloc(tail - head + 1);
        strncpy(buf, head, tail - head);
        buf[tail - head] = '\0';
        return buf;
    }

    char *parse_user(char *t, char **name, char **email, unsigned long *date)
    {
        char *p = t;
        int mode = 1;

        while (p && *p) {
            if (mode == 1 && *p == '<') {
                *name = substr(t, p - 1);
                t = p;
                mode++;
            } else if (mode == 1 && *p == '\n') {

The fix is to handle the case of (tail < head) before calling xmalloc, thus avoiding passing an invalid value to xmalloc.

And here's the reproducer: It was tricky to reproduce, because git prohibits use of an empty "name" in a commit ID. To construct the offending commit, I had to resort to using "git hash-object".

    git init -q foo &&
    ( cd foo &&
      echo a > j && git add . && git ci -q --author='au <T at h.or>' -m. . &&
      h=$(git cat-file commit HEAD|sed 's/au //' \
          |git hash-object -t commit -w --stdin) &&
      git co -q -b test $h &&
      git br -q -D master &&
      git br -q -m test master)
    git clone -q --bare foo foo.git

    cat <<EOF > in
    repo.url=foo.git
    repo.path=foo.git
    EOF
    CGIT_CONFIG=in QUERY_STRING=url=foo.git valgrind ./cgit

The valgrind output is what you see above.

AFAICS, this is not exploitable thanks (ironically) to the use of strncpy. Since that -1 translates to SIZE_MAX and this is strncpy, not only does it copy whatever is in "head" (up to first NUL), but it also writes SIZE_MAX - strlen(head) NUL bytes into the destination buffer, and that latter is guaranteed to evoke a segfault. Since cgit is single-threaded, AFAICS, there is no way that the buffer clobbering can be turned into an exploit.
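The guard described in this message can be sketched as a standalone function. Two things here are assumptions made for the illustration: returning an empty string for the tail < head case (the message only says the case is handled before allocating), and using plain malloc() and memcpy() in place of git's xmalloc() wrapper and strncpy(). The name `substr_sketch` is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of the bytes in [head, tail).
 * If tail lies before head, return an empty string instead of
 * computing a negative length. Returns NULL only if allocation fails.
 */
char *substr_sketch(const char *head, const char *tail)
{
	size_t len;
	char *buf;

	/* Check before any size arithmetic so no negative value is cast. */
	len = tail < head ? 0 : (size_t)(tail - head);

	buf = malloc(len + 1);
	if (!buf)
		return NULL;
	memcpy(buf, head, len);
	buf[len] = '\0';
	return buf;
}
```

With an ident line that starts directly at '<', the caller ends up passing a tail one byte before head; the check turns that into a length of 0 rather than SIZE_MAX.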
2011-07-22  Remove dead initialization in cgit_parse_commit()  Lukas Fleischer  1  -1/+1
The value stored to "t" during its initialization gets overwritten in any case, so just leave it uninitialized. Spotted by clang-analyzer. Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de> Signed-off-by: Lars Hjemli <hjemli@gmail.com>
2011-05-23  Avoid null pointer dereference in reencode().  Lukas Fleischer  1  -1/+4
Returning "*txt" if "txt" is a null pointer is a bad thing. Spotted with clang-analyzer. Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de> Signed-off-by: Lars Hjemli <hjemli@gmail.com>
2011-03-26  fix two encoding bugs  Julius Plenz  1  -9/+15
reencode() takes three arguments in the order (txt, from, to), opposed to reencode_string, which will, like iconv, handle the arguments with from and to swapped. Fix that (this makes reencode more intuitive). If src and dst encoding are equivalent, don't do any encoding. If no special encoding parameter is found within the commit, assume UTF-8 and explicitly convert to PAGE_ENCODING. The change to reencode() mentioned above avoids re-encoding a UTF-8 string to UTF-8, for example. Signed-off-by: Julius Plenz <plenz@cis.fu-berlin.de> Signed-off-by: Lars Hjemli <hjemli@gmail.com>
2010-07-13  Reencode author and committer  Rémi Lagacé  1  -0/+4
When a commit has a specific encoding, this encoding also applies to the author and committer name and email. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
2008-12-05  parsing.c: enable builds with NO_ICONV defined  Lars Hjemli  1  -0/+4
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
2008-09-15  parsing.c: be prepared for unexpected content in commit/tag objects  Lars Hjemli  1  -63/+96
When parsing commits and tags cgit made too many assumptions about the formatting of said objects. This patch tries to make the code be more prepared to handle 'malformed' objects. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
2008-04-08  Move cgit_parse_query() from parsing.c to html.c as http_parse_querystring()  Lars Hjemli  1  -49/+0
This is a generic http-function. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
2008-03-28  Move function for configfile parsing into configfile.[ch]  Lars Hjemli  1  -75/+0
This is a generic function which wanted its own little object file. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
2008-03-24  Add command dispatcher  Lars Hjemli  1  -2/+2
This simplifies the code in cgit.c and makes it easier to extend cgit with new pages/commands. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
2008-02-16  Move cgit_repo into cgit_context  Lars Hjemli  1  -8/+8
This removes the global variable which is used to keep track of the currently selected repository, and adds a new variable in the cgit_context structure. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
2008-02-16  Introduce struct cgit_context  Lars Hjemli  1  -4/+4
This struct will hold all the cgit runtime information currently found in a multitude of global variables. The first cleanup removes all querystring-related variables. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
2007-12-02  Merge branch 'stable'  Lars Hjemli  1  -3/+3
* stable:
    Handle missing timestamp in commit/tag objects
    Set commit date on snapshot contents
2007-12-02  Handle missing timestamp in commit/tag objects  Lars Hjemli  1  -3/+3
When a commit or tag lacks author/committer/tagger timestamp, do not skip the next line in the commit/tag object. Also, do not bother to print timestamps with value 0 as it is close to certain to be bogus. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
2007-11-06  Use utf8::reencode_string from git  Lars Hjemli  1  -60/+4
This replaces the iconv-support in cgit with similar functions already existing in git. Signed-off-by: Lars Hjemli <hjemli@gmai.com>
2007-11-06  Convert subject and message with iconv_msg.  Jonathan Bastien-Filiatrault  1  -0/+14
2007-11-06  Add iconv_msg function.  Jonathan Bastien-Filiatrault  1  -0/+58
2007-11-06  Set msg_encoding according to the header.  Jonathan Bastien-Filiatrault  1  -0/+8
2007-11-06  Add commit->msg_encoding, allocate msg dynamicly.  Jonathan Bastien-Filiatrault  1  -0/+1
2007-10-27  cgit_parse_commit(): Add missing call to xstrdup()  Lars Hjemli  1  -2/+2
It's rather silly to point into random memory-locations. Also, remove a call to strdup() used on a literal char *. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
2007-10-27  Skip unknown header fields when parsing tags and commits  Lars Hjemli  1  -0/+6
Both the commit- and tagparser failed to handle unexpected header fields. This adds futureproofing by simply skipping any header we don't know/care about. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
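A sketch of the "skip what we don't know" idea for the header part of a commit or tag object, which ends at the first blank line. The helper names `next_line` and `find_header` are hypothetical and the lookup-by-key interface is a simplification chosen for this example, not the structure of cgit's parser.

```c
#include <string.h>

/* Return a pointer to the start of the next line, or to the final NUL. */
const char *next_line(const char *p)
{
	const char *nl = strchr(p, '\n');

	return nl ? nl + 1 : p + strlen(p);
}

/*
 * Scan header lines until the blank line that ends the header block.
 * Lines that do not start with key are skipped, whatever they contain.
 * Returns a pointer to the text after key, or NULL if it is absent.
 */
const char *find_header(const char *buf, const char *key)
{
	size_t klen = strlen(key);
	const char *p = buf;

	while (*p && *p != '\n') {
		if (strncmp(p, key, klen) == 0)
			return p + klen;
		p = next_line(p);
	}
	return NULL;
}
```

Because unknown lines are stepped over rather than rejected, a header added by a newer git version does not stop the parser from reaching the fields it does understand.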
2007-06-26  Add trim_end() and use it to remove trailing slashes from repo paths  Lars Hjemli  1  -1/+1
The new function removes all trailing instances of an arbitrary character from a copy of the supplied char array. This is then used to remove any trailing slashes from cgit_query_path. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
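A minimal sketch of a function matching that description: it works on a copy and strips every trailing occurrence of one character. The name `trim_end_sketch` and the choice to return an empty string (rather than NULL) when everything is trimmed are assumptions of this example.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of str with all trailing occurrences
 * of c removed. Returns NULL if str is NULL or allocation fails.
 */
char *trim_end_sketch(const char *str, char c)
{
	size_t len;
	char *copy;

	if (!str)
		return NULL;
	len = strlen(str);
	while (len > 0 && str[len - 1] == c)
		len--;
	copy = malloc(len + 1);
	if (!copy)
		return NULL;
	memcpy(copy, str, len);
	copy[len] = '\0';
	return copy;
}
```

Applied to a query path such as "foo/bar///" with '/' as the character, this yields "foo/bar"; the caller owns the returned buffer and must free it.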
2007-05-31  Check for NULL commit buffer in cgit_parse_commit()  Ondrej Jirman  1  -0/+3
This can be NULL, so try not to segfault. Signed-off-by: Lars Hjemli <hjemli@gmail.com>