Skip to content

Fix currency UTF-8 test vector to actually test invalid UTF-8 sequences #1264

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

erickcestari
Copy link
Contributor

The previous test vector for "Malformed: invalid currency UTF-8" was incorrectly testing invalid length encoding (02 length with 2804 value) rather than testing invalid UTF-8 byte sequences in the currency field itself.

This commit replaces it with a test vector that:

  • Uses correct length encoding (03 length with proper 3-byte value)
  • Contains an actual invalid UTF-8 sequence in the currency data
  • Properly validates that parsers reject malformed UTF-8 in currency fields

The previous test vector incorrectly tested currency length encoding
rather than an invalid UTF-8 sequence in the currency field. This commit
updates the vector to have a correctly encoded length but an invalid
UTF-8 currency string, properly matching the test description.
@t-bast
Copy link
Collaborator

t-bast commented Jun 2, 2025

@thomash-acinq can you review this and check what eclair does?

@thomash-acinq
Copy link

The current test vector uses the following TLV stream:

06 02 8041  // Type: 6 (offer_currency), Length: 2, Value: two bytes that form an invalid utf8 string
0a 05 414c494345  // Type: 10 (offer_description), Length: 5, Value: "ALICE"
16 21 020202020202020202020202020202020202020202020202020202020202020202  // Type: 22 (offer_issuer_id), Length: 33, Value : some public key

It does what it claims to do (test that the offer_currency is validated as a utf8 string) and I don't see any need to change it.

Your proposed change is

06 03 4d2aa4  // Type: 6 (offer_currency), Length: 3, Value: three bytes that form an invalid utf8 string
08 02 2710  // Type: 8 (offer_amount)
0a 0c 5465737420766563746f7273 // Type: 10 (offer_description)
16 21 02eec7245d6b7d2ccb30380bfbe2a3648cd7a942653f5aa340edcea1f283686619  // Type: 22 (offer_issuer_id)

which does not seem to be an improvement.

@erickcestari
Copy link
Contributor Author

The current test vector uses the following TLV stream:

06 02 8041  // Type: 6 (offer_currency), Length: 2, Value: two bytes that form an invalid utf8 string
0a 05 414c494345  // Type: 10 (offer_description), Length: 5, Value: "ALICE"
16 21 020202020202020202020202020202020202020202020202020202020202020202  // Type: 22 (offer_issuer_id), Length: 33, Value : some public key

It does what it claims to do (test that the offer_currency is validated as a utf8 string) and I don't see any need to change it.

Your proposed change is

06 03 4d2aa4  // Type: 6 (offer_currency), Length: 3, Value: three bytes that form an invalid utf8 string
08 02 2710  // Type: 8 (offer_amount)
0a 0c 5465737420766563746f7273 // Type: 10 (offer_description)
16 21 02eec7245d6b7d2ccb30380bfbe2a3648cd7a942653f5aa340edcea1f283686619  // Type: 22 (offer_issuer_id)

which does not seem to be an improvement.

Some applications (rust-lightning) validate currency codes by first checking for exactly 3 bytes before parsing the UTF-8 content. Since this test vector isn't 3 bytes long, those applications would reject it based on length rather than detecting the invalid UTF-8 encoding. Using a 3-byte invalid UTF-8 sequence would better test the UTF-8 validation logic specifically. I'll also remove the other fields to keep this test vector focused solely on offer_currency.

@thomash-acinq
Copy link

OK, I understand the problem now. But if we really want to check that this field is a valid ISO 4217 code we should also add test vectors with symbols (US$), lowercase letters (usd) or even codes that look good but are not part of the standard (ABC).
Eclair is not affected as it doesn't support currency conversion so it will reject any offer that sets offer_currency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants