Skip to content

[GEN][ZH] Simplify repeated calls to removeLastChar with new truncate and trim methods #1257

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jul 14, 2025

Conversation

slurmlord
Copy link

@slurmlord slurmlord commented Jul 10, 2025

Currently the only way to remove characters from the end of a string (AsciiString, UnicodeString, DisplayString) is by calling removeLastChar. If there are multiple characters that needs to be removed, removeLastChar must be called in a loop.
Each call to removeLastChar results in a call to strlen or wcslen, which is quite wasteful when done in a loop.

This change introduces 4 new methods for string classes to alleviate this; truncateBy, truncateTo, trimEnd() and trimEnd(char).
truncateBy will remove a given number of characters from the end of a string.
truncateTo will remove characters from the end of the string until it is of a given length.
Each of these will result in a single call to strlen or wcslen, thus saving an extra call per character beyond the first one.
trimEnd() exposes the end-part from trim as a separate function, where all consecutive white space characters are removed from the string.
trimEnd(char) will remove all consecutive characters of a given value from the end of the string.

The truncation functionality is extracted from #1119 as a separate PR to keep it cleaner.

Replace repeated calls to removeLastChar in various places in the code base with single calls to truncateBy, truncateTo or trimEnd.

Copy link

@xezon xezon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in principle.

Going forward we can also optimize the string classes to store their length.

{
++numNonSep;
}
s.truncateBy(numNonSep);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is very difficult to read :(

Can this be written shorter or be a named function?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, seems like it can be written a lot clearer.
Basically a findLast('\') and a truncate to the diff from the start of the string like in other places in this PR. Must have been a Gehirnfurz.


if (index < len - 1)
{
uniLine.truncateTo(index + 1);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like its trimming trailing characters? Perhaps expose that as a function in AsciiString as well? See AsciiString::trim

Copy link
Author

@slurmlord slurmlord Jul 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion - I will add something like trimEnd(const char) and update the corresponding places to use that instead.
I Think I read too quickly. I'll add trimEnd().


if (index < len - 1)
{
truncateTo(index + 1);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? index + 1?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so; index at this point contains the index of the last character that is not a space.
As an example, if the index is 2 then m_data[2] is the character we want the string to stop after. That translates to a total string length of 3, and truncateTo will then set m_data[3] to '\0'.

Is this perhaps an early warning sign of confusing naming/description of the function and its parameter?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the - + 1 stuff can be simplified by starting looping at index = len and then look at index - 1 to test the whitespace char.

@xezon xezon added Minor Severity: Minor < Major < Critical < Blocker Refactor Edits the code with insignificant behavior changes, is never user facing labels Jul 10, 2025
@slurmlord slurmlord changed the title [ZH] Add string truncation to replace repeated calls to removeLastChar [GEN][ZH] Add string truncation to replace repeated calls to removeLastChar Jul 12, 2025
@@ -215,19 +215,18 @@ std::vector<AsciiString> ReplaySimulation::resolveFilenameWildcards(const std::v
AsciiString dir1 = TheRecorder->getReplayDir();
AsciiString dir2 = *filename;
AsciiString wildcard = *filename;
const char* lastSep = dir2.reverseFind('\\');
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach here assumes that there's not a mix of separator characters in a file name.
As an example, c:\\some\\path/here would set dir2 to c:\\some\\ instead of c:\\some\\path/.

Is this a plausible possibility?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid such problem you can search for both and then take the address that is larger.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this string contains userinput from commandline, so it can contain /. I think we can just use the old code here. This will get replaced anyway as soon as we have a helper function for this stuff.

@xezon
Copy link

xezon commented Jul 13, 2025

Is this meant to be a Draft Review?

@slurmlord slurmlord marked this pull request as ready for review July 13, 2025 17:41
Copy link

@xezon xezon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file just has whitespace changes now.

/**
Truncate the string to a length of maxLength characters, not including null termination,
by removing from the end.
If the string is empty or shorter than maxLength, do nothing.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can move this line right behind above sentence without new line.

@slurmlord
Copy link
Author

Thanks all for the time and effort taken to review and provide feedback on this - the code has improved a lot, and as always many valuable lessons learned.

@xezon xezon changed the title [GEN][ZH] Add string truncation to replace repeated calls to removeLastChar [GEN][ZH] Replace repeated calls to removeLastChar with new truncation methods Jul 14, 2025
@xezon xezon changed the title [GEN][ZH] Replace repeated calls to removeLastChar with new truncation methods [GEN][ZH] Replace repeated calls to removeLastChar with new truncate and trim methods Jul 14, 2025
@xezon xezon changed the title [GEN][ZH] Replace repeated calls to removeLastChar with new truncate and trim methods [GEN][ZH] Simplify repeated calls to removeLastChar with new truncate and trim methods Jul 14, 2025
@xezon xezon merged commit d11ecb9 into TheSuperHackers:main Jul 14, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Minor Severity: Minor < Major < Critical < Blocker Refactor Edits the code with insignificant behavior changes, is never user facing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants