Commit c5a6d80

Only test the wrongly encoded string behavior on the C version
Both the pure and Java versions already raise an error in such cases, which confirms that we should deprecate and fix the C version instead. We shouldn't make the pure or Java versions accept these broken strings.
1 parent 210a6e7 commit c5a6d80
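
As a minimal sketch of what "wrongly encoded string" means here, the snippet below uses only core Ruby (no `json` gem) to show that a binary-tagged copy of a UTF-8 string carries the same bytes but cannot be transcoded to UTF-8. The variable names are illustrative and not taken from the commit.

```ruby
# Illustration only: plain Ruby string behavior, no json gem required.
utf8_string   = "€™"            # UTF-8 string literal
binary_string = utf8_string.b   # copy with the same bytes, tagged ASCII-8BIT (BINARY)

same_bytes      = utf8_string.bytes == binary_string.bytes
binary_encoding = binary_string.encoding

# Transcoding from BINARY to UTF-8 fails for bytes >= 0x80, because
# such bytes have no defined Unicode mapping in the BINARY encoding.
transcode_error = begin
  binary_string.encode(Encoding::UTF_8)
  nil
rescue EncodingError => e
  e.class
end

puts same_bytes
puts binary_encoding
puts transcode_error.inspect
```

This is why a generator that simply calls `encode(::Encoding::UTF_8)` on non-UTF-8 input rejects such strings, while one that inspects the raw bytes can accept them.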

2 files changed: 11 additions, 44 deletions

lib/json/pure/generator.rb

Lines changed: 2 additions & 36 deletions

@@ -337,20 +337,9 @@ def generate(obj)
 # Assumes !@ascii_only, !@script_safe
 if Regexp.method_defined?(:match?)
   private def fast_serialize_string(string, buf) # :nodoc:
-    if string.encoding == ::Encoding::UTF_8
-      unless string.valid_encoding?
-        raise GeneratorError, "source sequence is illegal/malformed utf-8"
-      end
-    else
-      utf8_string = string.dup.force_encoding(::Encoding::UTF_8)
-      string = if utf8_string.valid_encoding?
-        utf8_string
-      else
-        string.encode(::Encoding::UTF_8)
-      end
-    end
-
     buf << '"'.freeze
+    string = string.encode(::Encoding::UTF_8) unless string.encoding == ::Encoding::UTF_8
+
     if /["\\\x0-\x1f]/n.match?(string)
       buf << string.gsub(/["\\\x0-\x1f]/n, MAP)
     else
@@ -361,19 +350,6 @@ def generate(obj)
 else
   # Ruby 2.3 compatibility
   private def fast_serialize_string(string, buf) # :nodoc:
-    if string.encoding == ::Encoding::UTF_8
-      unless string.valid_encoding?
-        raise GeneratorError, "source sequence is illegal/malformed utf-8"
-      end
-    else
-      utf8_string = string.dup.force_encoding(::Encoding::UTF_8)
-      string = if utf8_string.valid_encoding?
-        utf8_string
-      else
-        string.encode(::Encoding::UTF_8)
-      end
-    end
-
     buf << string.to_json(self)
   end
 end
@@ -539,16 +515,6 @@ def to_json(state = nil, *args)
   end
   string = self
 else
-  # Since the `json` gem was initially written for Ruby 1.8
-  # before strings had encoding, it used to do its own UTF-8
-  # validation direction on bytes and never really considered
-  # the string declared encoding. So passing a ASCII-8BIT string
-  # worked as long as the bytes were valid UTF-8
-  # We may want to deprecate this, but we should emit warnings first.
-  utf8_string = dup.force_encoding(::Encoding::UTF_8)
-  if utf8_string.valid_encoding?
-    return utf8_string.to_json(state, *args)
-  end
   string = encode(::Encoding::UTF_8)
 end
 if state.ascii_only?
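
The difference between the removed fallback and the remaining code path can be sketched with two standalone helpers. The names `lenient_utf8` and `strict_utf8` are hypothetical and do not exist in the gem; they only mirror the non-UTF-8 branch of the diff and omit the separate validity check applied to strings already tagged UTF-8.

```ruby
# Hypothetical helpers (not part of the json gem) mirroring the diff.

# Removed behavior: reinterpret the bytes as UTF-8 when they happen to be
# valid UTF-8, and only transcode from the declared encoding otherwise.
def lenient_utf8(string)
  return string if string.encoding == Encoding::UTF_8
  reinterpreted = string.dup.force_encoding(Encoding::UTF_8)
  reinterpreted.valid_encoding? ? reinterpreted : string.encode(Encoding::UTF_8)
end

# Remaining behavior: always transcode from the declared encoding.
def strict_utf8(string)
  string.encoding == Encoding::UTF_8 ? string : string.encode(Encoding::UTF_8)
end

binary = "€™".b   # valid UTF-8 bytes, tagged ASCII-8BIT

lenient_result = lenient_utf8(binary)   # accepted via byte reinterpretation
strict_raised = begin
  strict_utf8(binary)
  false
rescue EncodingError
  true                                  # BINARY high bytes cannot be transcoded
end

# A string whose declared encoding is accurate is handled by both helpers.
latin1         = "caf\xE9".b.force_encoding(Encoding::ISO_8859_1)
strict_latin1  = strict_utf8(latin1)
lenient_latin1 = lenient_utf8(latin1)
```

Under these assumptions, only strings whose declared encoding disagrees with their bytes behave differently between the two helpers, which is the case the commit message calls "broken strings".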

tests/json_generator_test.rb

Lines changed: 9 additions & 8 deletions

@@ -445,15 +445,16 @@ def test_invalid_encoding_string
   assert_includes error.message, "source sequence is illegal/malformed utf-8"
 end
 
-def test_valid_utf8_in_different_encoding
-  utf8_string = "€™"
-  wrong_encoding_string = utf8_string.b
-  # This behavior is historical. Not necessary desirable.
-  assert_equal utf8_string.to_json, wrong_encoding_string.to_json
-  assert_equal JSON.dump(utf8_string), JSON.dump(wrong_encoding_string)
-end
-
 if defined?(JSON::Ext::Generator) and RUBY_PLATFORM != "java"
+  def test_valid_utf8_in_different_encoding
+    utf8_string = "€™"
+    wrong_encoding_string = utf8_string.b
+    # This behavior is historical. Not necessary desirable. We should deprecated it.
+    # The pure and java version of the gem already don't behave this way.
+    assert_equal utf8_string.to_json, wrong_encoding_string.to_json
+    assert_equal JSON.dump(utf8_string), JSON.dump(wrong_encoding_string)
+  end
+
   def test_string_ext_included_calls_super
     included = false