Description
In short, I am seeing the following exception while processing text that includes valid multi-byte Unicode characters. Adding or removing characters before the "problematic" characters can affect whether the exception is thrown.
$ java -classpath .:../../jackson-core/target/jackson-core-2.8.2-SNAPSHOT.jar BadMsg
com.fasterxml.jackson.core.JsonGenerationException: Split surrogate on writeRaw() input (last character)
    at com.fasterxml.jackson.core.JsonGenerator._reportError(JsonGenerator.java:1887)
    at com.fasterxml.jackson.core.json.UTF8JsonGenerator._outputRawMultiByteChar(UTF8JsonGenerator.java:1916)
    at com.fasterxml.jackson.core.json.UTF8JsonGenerator._writeSegmentedRaw(UTF8JsonGenerator.java:697)
    at com.fasterxml.jackson.core.json.UTF8JsonGenerator.writeRaw(UTF8JsonGenerator.java:611)
    at com.fasterxml.jackson.core.json.UTF8JsonGenerator.writeRaw(UTF8JsonGenerator.java:560)
    at com.fasterxml.jackson.core.base.GeneratorBase.writeRawValue(GeneratorBase.java:306)
    at BadMsg.main(BadMsg.java:17)
The simplest way to demonstrate this is with code, so I will attach a sample program with a document that causes the error. Sorry for the ugly redacted text, but you can imagine some real words and other interesting strings in place of all the x's. Note that if I delete or add enough of the 'x' characters, the exception is not thrown; it doesn't matter where in the JSON they appear, as long as it is before the character that causes the exception. I believe the problem is in the buffering of the data that is passed to the lower-level functions, but I have not debugged to that level.
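
To illustrate why the amount of preceding text could matter, here is a minimal sketch in plain Java. It is not Jackson's code and does not call Jackson at all: the class `SurrogateSplitSketch`, its method names, and the chunk size of 8 are hypothetical, chosen only to show how cutting a `char` sequence at fixed boundaries can separate the two halves of a surrogate pair, assuming the input is processed in fixed-size segments.

```java
public class SurrogateSplitSketch {

    // Returns true if cutting the text into chunks of chunkSize chars
    // would place a high surrogate at the end of one chunk and its
    // low surrogate at the start of the next.
    public static boolean chunkingSplitsSurrogate(String text, int chunkSize) {
        for (int end = chunkSize; end < text.length(); end += chunkSize) {
            // The chunk's last char is at end - 1; the next chunk starts at end.
            if (Character.isHighSurrogate(text.charAt(end - 1))
                    && Character.isLowSurrogate(text.charAt(end))) {
                return true;
            }
        }
        return false;
    }

    // Builds a string of n 'x' characters (stand-in for the redacted text).
    public static String padding(int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append('x');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE00"; // U+1F600: one code point, two Java chars
        int chunkSize = 8;             // hypothetical segment size, not Jackson's
        for (int n = 5; n <= 9; n++) {
            String text = padding(n) + emoji + "xxxx";
            System.out.println(n + " x's before the emoji: split = "
                    + chunkingSplitsSurrogate(text, chunkSize));
        }
    }
}
```

In this sketch only a padding of exactly 7 characters puts the high surrogate at index 7, the last slot of the first 8-char chunk, so adding or removing a single 'x' before the emoji makes the split disappear. That matches the symptom described above, though it does not confirm where the real boundary is computed.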