Commit bdade0c

regcomp.c: Need to account for UTF group name
I found this by reading the code. Prior to this commit, the parse pointer was advanced by one byte; it should be advanced by one character. As long as the character was ASCII, things worked. I looked through the regcomp.c source for other misuses of the macro changed by this commit; none were obvious.
1 parent ba00806 commit bdade0c

2 files changed, +3 -2 lines changed

regcomp.c

Lines changed: 2 additions & 2 deletions
@@ -2533,8 +2533,8 @@ S_reg_scan_name(pTHX_ RExC_state_t *pRExC_state, U32 flags)
                 && (advance = isWORDCHAR_utf8_safe( (U8 *) RExC_parse,
                                                     (U8 *) RExC_end)));
     } else {
-        RExC_parse_inc_by(1); /* so the <- from the vFAIL is after the offending
-                                 character */
+        /* so the <- from the vFAIL is after the offending character */
+        RExC_parse_inc_safe();
         vFAIL("Group name must start with a non-digit word character");
     }
     sv_name = newSVpvn_flags(name_start, (int)(RExC_parse - name_start),
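
The byte-versus-character distinction in the commit message can be pictured with a minimal sketch. It does not use Perl's internal macros: `utf8_seq_len`, `advance_one_byte`, and `advance_one_char` are hypothetical names introduced only for illustration, and the sketch assumes the input is well-formed UTF-8 apart from possible truncation at the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Perl's API): number of bytes in the
 * UTF-8 sequence that starts with lead byte b. */
static size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80)           return 1;   /* ASCII */
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1;                           /* unexpected lead byte: step one byte */
}

/* Step exactly one byte, regardless of the character's encoded length.
 * This mirrors the idea of the old code, not its implementation. */
static const char *advance_one_byte(const char *p, const char *end)
{
    assert(p < end);
    return p + 1;
}

/* Step over one whole character, never moving past end.
 * This mirrors the idea of the new code, not its implementation. */
static const char *advance_one_char(const char *p, const char *end)
{
    size_t len;

    assert(p < end);
    len = utf8_seq_len((unsigned char) *p);
    if (len > (size_t) (end - p))
        len = (size_t) (end - p);       /* clamp a truncated sequence */
    return p + len;
}
```

For a name that begins with a three-byte character, such as the bytes `"\xE2\x80\xBF"` followed by `name`, `advance_one_byte` stops one byte in, in the middle of the character, while `advance_one_char` stops three bytes in, just after it. For an ASCII first character both helpers agree, which is consistent with the commit message's remark that ASCII input worked.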

t/re/reg_mesg.t

Lines changed: 1 addition & 0 deletions
@@ -547,6 +547,7 @@ my @death_utf8 = mark_as_utf8(
     '/[\cネ]/' => "Character following \"\\c\" must be printable ASCII {#} m/[\\cネ{#}]/",
     '/\b{ネ}/' => "'ネ' is an unknown bound type {#} m/\\b{ネ{#}}/",
     '/\B{ネ}/' => "'ネ' is an unknown bound type {#} m/\\B{ネ{#}}/",
+    '/ネ(?<‿name>match)ネ/; #no latin1' => 'Group name must start with a non-digit word character {#} m/ネ(?<‿{#}name>match)ネ/',
 );
 push @death, @death_utf8;
