Error when parsing a valid XML file #44

@whiver

Hi,
I am trying to parse a sample XML document into a Protobuf message, using the AddressBook schema from the Google examples.

Here is the document:

<AddressBook>
    <people>
        <name>John Doe</name>
        <id>42</id>
        <email>john.doe@example.com</email>
    </people>
    <people>
        <name>Jane Doe</name>
        <id>41</id>
    </people>
</AddressBook>
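
As a sanity check that the document itself is well-formed, it can be run through the JDK's built-in XML parser. This is only a minimal sketch, independent of protobuf-java-format; the class name `XmlWellFormedCheck` and the inlined string are illustrative assumptions, not part of my project.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: checks the sample document with the standard JDK parser only.
public class XmlWellFormedCheck {
    // Same content as the document above, inlined so the check is self-contained.
    static final String XML =
          "<AddressBook>\n"
        + "    <people>\n"
        + "        <name>John Doe</name>\n"
        + "        <id>42</id>\n"
        + "        <email>john.doe@example.com</email>\n"
        + "    </people>\n"
        + "    <people>\n"
        + "        <name>Jane Doe</name>\n"
        + "        <id>41</id>\n"
        + "    </people>\n"
        + "</AddressBook>\n";

    // Parses the text with the JDK DOM parser; throws SAXException if it is not well-formed.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        // Number of <people> elements found in the document.
        System.out.println(doc.getElementsByTagName("people").getLength());
        // Text of the first <email> element, dots included.
        System.out.println(doc.getElementsByTagName("email").item(0).getTextContent());
    }
}
```

If this parses without an exception, the dots in the email value are not a well-formedness problem in the XML itself.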

Here is the code:

// The initialization below is tested and works
InputStream inputData = XMLMapperTest.class.getResourceAsStream("/data/AddressBook_several.xml");
DynamicSchema schema = SchemaParser.parseSchema(XMLMapperTest.class.getResource("/schemas/AddressBook.desc").getPath(), false);
Descriptors.Descriptor descriptor = schema.getMessageDescriptor("AddressBook");

DynamicMessage.Builder builder = DynamicMessage.newBuilder(descriptor);

XmlFormat xmlFormat = new XmlFormat();
// This is the call that throws the exception
xmlFormat.merge(inputData, StandardCharsets.UTF_8, builder);

However, I get the following error:

com.googlecode.protobuf.format.ProtobufFormatter$ParseException: 5:21: Expected ">".

	at com.googlecode.protobuf.format.XmlFormat$Tokenizer.parseException(XmlFormat.java:619)
	at com.googlecode.protobuf.format.XmlFormat$Tokenizer.consume(XmlFormat.java:418)
	at com.googlecode.protobuf.format.XmlFormat.consumeClosingElement(XmlFormat.java:680)
	at com.googlecode.protobuf.format.XmlFormat.mergeField(XmlFormat.java:764)
	at com.googlecode.protobuf.format.XmlFormat.handleObject(XmlFormat.java:882)
	at com.googlecode.protobuf.format.XmlFormat.handleValue(XmlFormat.java:775)
	at com.googlecode.protobuf.format.XmlFormat.mergeField(XmlFormat.java:755)
	at com.googlecode.protobuf.format.XmlFormat.merge(XmlFormat.java:663)
	at com.googlecode.protobuf.format.AbstractCharBasedFormatter.merge(AbstractCharBasedFormatter.java:75)
	at com.googlecode.protobuf.format.AbstractCharBasedFormatter.merge(AbstractCharBasedFormatter.java:53)
	at com.googlecode.protobuf.format.ProtobufFormatter.merge(ProtobufFormatter.java:141)
[...]

I tried both UTF-8 and ISO-8859-1 encodings and got the same error each time. The reported position (5:21) falls inside the email value on line 5 of the document. When I remove the dots from the email address in the XML document, parsing succeeds.

This is the working XML:

<AddressBook>
    <people>
        <name>John Doe</name>
        <id>42</id>
        <email>johndoe@examplecom</email>
    </people>
    <people>
        <name>Jane Doe</name>
        <id>41</id>
    </people>
</AddressBook>

I can also attach the Protobuf schema if you want to reproduce this yourself.
