Space in file names within epub stops conversion #816

@krishnshyam

A space in the name of any of the XML files that make up the EPUB blocks conversion with the following message:
Unexpected error in pxi:fileset-add-entries (Please see detailed log for more info.)

When looking into the detailed log I see this:

Unexpected error in pxi:fileset-add-entries
at {http://www.daisy.org/ns/pipeline/xproc/internal}fileset-add-entries name="add-entries"(fileset-add-entries.xpl:98)
at {http://www.daisy.org/ns/pipeline/xproc}fileset-add-entries(load.xpl:206)
at {http://www.w3.org/ns/xproc}otherwise(load.xpl:143)
at {http://www.w3.org/ns/xproc}choose name="result"(load.xpl:124)
at {http://www.daisy.org/ns/pipeline/xproc}epub-load name="load"(__processed__epub-to-daisy.script.xpl:220)
Caused by: Illegal character in path at index 9: EPUB/Home Science IInd Part-1.xhtml
at java.base/java.net.URI.create(URI.java:906)
at org.daisy.pipeline.fileset.calabash.impl.AddEntriesStep.run(AddEntriesStep.java:126)
at {http://www.daisy.org/ns/pipeline/xproc/internal}fileset-add-entries name="add-entries"(fileset-add-entries.xpl:98)
... 4 more
Caused by: Illegal character in path at index 9: EPUB/Home Science IInd Part-1.xhtml
at java.base/java.net.URI$Parser.fail(URI.java:2976)
at java.base/java.net.URI$Parser.checkChars(URI.java:3147)
at java.base/java.net.URI$Parser.parseHierarchical(URI.java:3229)
at java.base/java.net.URI$Parser.parse(URI.java:3188)
at java.base/java.net.URI.<init>(URI.java:623)
at java.base/java.net.URI.create(URI.java:904)
at org.daisy.pipeline.fileset.calabash.impl.AddEntriesStep.run(AddEntriesStep.java:126)
... 5 more
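
The trace points at `java.net.URI.create` being handed the raw entry name. The following is a minimal sketch, not the Pipeline's actual code: it reproduces the failure with the path from the log and shows one way a relative path could be percent-encoded using the standard four-argument `URI` constructor. The class and method names (`EntryPathEncoding`, `rejectedByUriCreate`, `percentEncodePath`) are hypothetical, and what `AddEntriesStep` does around line 126 is an assumption based only on the stack trace.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical illustration; not taken from the Pipeline source.
public class EntryPathEncoding {

    /** Returns true if URI.create rejects the raw path, as seen in the log. */
    static boolean rejectedByUriCreate(String rawPath) {
        try {
            URI.create(rawPath);
            return false;
        } catch (IllegalArgumentException e) {
            // URI.create wraps the URISyntaxException ("Illegal character in path ...").
            return true;
        }
    }

    /**
     * Percent-encodes a relative path. The multi-argument URI constructors
     * quote illegal characters such as spaces. They also always quote '%',
     * so a path that is already encoded would be encoded twice.
     */
    static String percentEncodePath(String rawPath) {
        try {
            return new URI(null, null, rawPath, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        String entry = "EPUB/Home Science IInd Part-1.xhtml";
        System.out.println(rejectedByUriCreate(entry)); // prints true
        System.out.println(percentEncodePath(entry));   // prints EPUB/Home%20Science%20IInd%20Part-1.xhtml
    }
}
```

Whether encoding at this point is the right fix depends on how the rest of the fileset code treats `href` values, which the log does not show.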

I'm attaching the log file as well.

I confirmed that the space is the cause: after replacing the spaces in the filename with underscores and updating the corresponding references in the toc.xhtml and package.opf files, the error no longer appeared.
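
The manual workaround can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (`RenameWorkaround`, `planRenames`, `rewriteReferences`) are hypothetical, the reference rewriting is plain string replacement rather than real XML processing, and it assumes references appear either with literal spaces or with `%20`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper mirroring the manual workaround; not part of the Pipeline.
public class RenameWorkaround {

    /** Maps every file name containing a space to the same name with underscores. */
    static Map<String, String> planRenames(List<String> fileNames) {
        Map<String, String> renames = new LinkedHashMap<>();
        for (String name : fileNames) {
            if (name.contains(" ")) {
                renames.put(name, name.replace(' ', '_'));
            }
        }
        return renames;
    }

    /**
     * Rewrites references to renamed files in the text of a document such as
     * toc.xhtml or package.opf. Naive string replacement: it assumes the old
     * names do not occur in the text for any other reason.
     */
    static String rewriteReferences(String text, Map<String, String> renames) {
        String result = text;
        for (Map.Entry<String, String> rename : renames.entrySet()) {
            String oldName = rename.getKey();
            String newName = rename.getValue();
            result = result.replace(oldName, newName);
            // References may also be stored percent-encoded.
            result = result.replace(oldName.replace(" ", "%20"), newName);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> renames =
                planRenames(List.of("Home Science IInd Part-1.xhtml", "toc.xhtml"));
        String opfLine = "<item href=\"Home Science IInd Part-1.xhtml\"/>";
        System.out.println(rewriteReferences(opfLine, renames));
        // prints <item href="Home_Science_IInd_Part-1.xhtml"/>
    }
}
```

The files inside the EPUB archive still have to be renamed to match; the sketch only covers the name mapping and the reference text.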

EPUB conventions seem to suggest that these filenames should be URL-compliant, but major EPUB readers such as Thorium can read such documents, so I would like to know whether this can be fixed in Pipeline as well.

acc3d469-a57a-4f4a-95cb-0f692ede592e.log

Status

Done