JsonGenerationException: Split surrogate on writeRaw() input thrown for input of a certain size #307

@mtnaseef

In short, I am seeing the following exception while processing text that contains valid multi-byte Unicode characters. Adding or removing characters before the "problematic" characters changes whether the exception is thrown.

    $ java -classpath .:../../jackson-core/target/jackson-core-2.8.2-SNAPSHOT.jar BadMsg
    com.fasterxml.jackson.core.JsonGenerationException: Split surrogate on writeRaw() input (last character)
        at com.fasterxml.jackson.core.JsonGenerator._reportError(JsonGenerator.java:1887)
        at com.fasterxml.jackson.core.json.UTF8JsonGenerator._outputRawMultiByteChar(UTF8JsonGenerator.java:1916)
        at com.fasterxml.jackson.core.json.UTF8JsonGenerator._writeSegmentedRaw(UTF8JsonGenerator.java:697)
        at com.fasterxml.jackson.core.json.UTF8JsonGenerator.writeRaw(UTF8JsonGenerator.java:611)
        at com.fasterxml.jackson.core.json.UTF8JsonGenerator.writeRaw(UTF8JsonGenerator.java:560)
        at com.fasterxml.jackson.core.base.GeneratorBase.writeRawValue(GeneratorBase.java:306)
        at BadMsg.main(BadMsg.java:17)

The simplest way to demonstrate this is with code, so I will attach a sample program with a document that triggers the error. Sorry for the ugly redacted text, but you can imagine some real words and other interesting strings in place of all the x's. Note that if I delete or add enough of the 'x' characters (it doesn't matter where in the JSON they appear, as long as it's before the character that causes the exception), the exception is not thrown. I believe the problem lies in how the data is buffered before it is passed to the lower-level functions, but I have not debugged to that level.
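To illustrate the suspected mechanism, here is a minimal stdlib-only sketch. It is not Jackson's actual code: the class `SurrogateSplitSketch`, its helper methods, and the segment size of 8 are all hypothetical and chosen for illustration. It shows how cutting text into fixed-size `char` segments can leave a high surrogate as the last character of a segment, and why shifting the text by a single `x` changes whether that happens.

```java
import java.util.ArrayList;
import java.util.List;

public class SurrogateSplitSketch {

    // Naive segmentation: cut every `size` chars, ignoring surrogate pairs.
    // (Hypothetical helper; an assumption about the kind of buffering involved.)
    static List<String> naiveSegments(String text, int size) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < text.length(); i += size) {
            out.add(text.substring(i, Math.min(text.length(), i + size)));
        }
        return out;
    }

    // Surrogate-aware segmentation: if a cut would land right after a high
    // surrogate, end the segment one char earlier so the pair stays together.
    static List<String> safeSegments(String text, int size) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + size);
            if (end < text.length() && end - i > 1
                    && Character.isHighSurrogate(text.charAt(end - 1))) {
                end--;
            }
            out.add(text.substring(i, end));
            i = end;
        }
        return out;
    }

    // True if the segment ends in the first half of a surrogate pair.
    static boolean endsWithSplitSurrogate(String segment) {
        return !segment.isEmpty()
                && Character.isHighSurrogate(segment.charAt(segment.length() - 1));
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE00";       // U+1F600, encoded as two Java chars
        int size = 8;                        // illustrative segment size
        String aligned = "xxxxxx" + emoji;   // pair occupies indices 6-7
        String shifted = "xxxxxxx" + emoji;  // pair occupies indices 7-8

        System.out.println("aligned, naive split: "
                + endsWithSplitSurrogate(naiveSegments(aligned, size).get(0)));
        System.out.println("shifted, naive split: "
                + endsWithSplitSurrogate(naiveSegments(shifted, size).get(0)));
        System.out.println("shifted, safe split:  "
                + endsWithSplitSurrogate(safeSegments(shifted, size).get(0)));
    }
}
```

With the naive split, only the `shifted` input ends its first segment on a high surrogate, which would match the observation that adding or removing `x` characters before the problematic character makes the exception appear or disappear. Whether this is what actually happens inside `_writeSegmentedRaw` is unconfirmed.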
