Skip to content

Take string view in mime methods #12167

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 50 commits into from
May 12, 2025

Conversation

hnakamur
Copy link
Contributor

@hnakamur hnakamur commented Apr 7, 2025

No description provided.

hnakamur added 28 commits April 6, 2025 22:18
@cmcfarlen cmcfarlen self-requested a review April 7, 2025 22:11
@hnakamur
Copy link
Contributor Author

hnakamur commented Apr 8, 2025

Here's the clang-analyzer report concerning a nullptr dereference, in case it's helpful:

https://ci.trafficserver.apache.org/job/Github_Builds/job/clang-analyzer/5704/artifact/output/12167/scan-build-2025-04-07-13-36-30-195012-dqlezm4f/index.html

I suppose this was a false positive from clang-analyzer.

I did 'git bisect' with a script whose content is https://github.com/apache/trafficserver-ci/blob/9a6bb8b40f2a6c830d52f619fa06cda22c60abf4/jenkins/github/clang-analyzer.pipeline#L60-L91
and found the commit 99a15bf was the trigger.

However I did not think the commit was actual cause after reviewing the diff.
I think clang-analyzer was confused that string_get may return nullptr because of if (new_url == nullptr) below it.

new_url = to_free = u->string_get(&s->arena, &new_url_len);
if (new_url == nullptr) {
  new_url = "";
}

I suppose that string_get always returns non nullptr.
So I added assert(to_free != nullptr); at 038258a and clang-analyzer does not report nullptr dereference anymore.

If this assumption is true, the following three lines are not needed.

if (new_url == nullptr) {
  new_url = "";
}

If this assumption is false, we must guard the call to str_free like below:

if (to_free) {
  s->arena.str_free(to_free);
}

Copy link
Contributor

@cmcfarlen cmcfarlen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of this cleanup, but I think if we kept the old functions and just added overrides then there would be fewer incidental conversions to string_view.

int32_t value_get_int(std::string_view name) const;
uint32_t value_get_uint(std::string_view name) const;
int64_t value_get_int64(std::string_view name) const;
time_t value_get_date(std::string_view name) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for this one and a few others we could provide an override that takes string_view rather than replace the existing one. That will avoid some of the conversions to string_view that just get converted back immediately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we could change lower-level functions such as mime_hdr_prepare_for_value_set to take string_view once this pull request is merged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder should I change lower-level functions to take string_view before this pull request?
Or should I revert removing the version taking const char * and int as you suggested?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed lower level functions to take string_view up to the commit 673b9e5.

@@ -1652,7 +1636,7 @@ MIMEHdr::value_append(const char *name, int name_length, const char *value, int
inline time_t
MIMEHdr::get_age() const
{
int64_t age = value_get_int64(MIME_FIELD_AGE, MIME_LEN_AGE);
int64_t age = value_get_int64(std::string_view{MIME_FIELD_AGE, static_cast<std::string_view::size_type>(MIME_LEN_AGE)});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another example where also having the other override is cleaner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we could add string_view version of mime field constants once this pull request is merged.
After that, we could deprecate const char * MIME_FIELD_XXX and int MIME_LEN_XXX.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added c_str_view which is similar to std::string_view but asserts the string is terminate with \0.
It is needed in src/api/InkAPIInternal.cc like:

TS_MIME_FIELD_ACCEPT                    = MIME_FIELD_ACCEPT.c_str();

The type of TS_MIME_FIELD_ACCEPT is const char * and it expected the string is terminated with \0.
The type of MIME_FIELD_ACCEPT was changed to c_str_view which expresses so.

@bryancall bryancall requested a review from cmcfarlen May 5, 2025 22:17
@hnakamur
Copy link
Contributor Author

hnakamur commented May 6, 2025

I merged master to resolve conflicts caused by #12091 being merged.

Copy link
Contributor

@cmcfarlen cmcfarlen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more change request just for commentary. Thanks for doing this!

@@ -617,3 +618,45 @@ struct ats_unique_buf_deleter {
};
using ats_unique_buf = std::unique_ptr<uint8_t[], ats_unique_buf_deleter>;
ats_unique_buf ats_unique_malloc(size_t size);

class c_str_view
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ok, BUT, please add a comment here that the purpose of this class is to transition between older code that uses char * and cleaned up code that uses std::string_view and it will be removed when it is not needed.

We should convert the wks code to be more C++ and then remove this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your comment. I added a comment at 629bdf3.

}
explicit c_str_view(std::string_view sv) : sv_{sv}
{
ink_assert(sv_.data() == nullptr ? sv_.length() == 0 : sv_.data()[sv_.length()] == '\0');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sv_.data()[sv_.length()] == '\0' could be considered UB if the string_view is not null terminated as this might be reading past a buffer, but since this is an assert anyway its fine if it crashes when called with a non-null terminated string_view.

@cmcfarlen cmcfarlen merged commit 3b0f0b7 into apache:master May 12, 2025
15 checks passed
@hnakamur hnakamur deleted the take_string_view_in_mime_methods branch May 15, 2025 08:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants