Complaints about inkscape, Sodipodi, etc attributes in SVG files #1147

Closed
DavidGriffith opened this issue Jun 15, 2020 · 17 comments · Fixed by #1150
Labels: priority: high, spec: EPUB 2.x, status: has PR, type: false-positive

@DavidGriffith

When I run a check on an EPUB I'm building with tex4ebook (https://github.com/michal-h21/tex4ebook/), I get zillions of complaints like this:

ERROR(RSC-005): /home/dave/proj/ebooks/latex/foo/foo.epub/OEBPS/art/card.svg(84,51): Error while parsing file: attribute "inkscape:label" not allowed here; expected attribute "alignment-baseline", "baseline-shift", "class", "clip",
...
ERROR(RSC-005): /home/dave/proj/ebooks/latex/foo/foo.epub/OEBPS/art/card.svg(98,51): Error while parsing file: attribute "inkscape:groupmode" not allowed here; expected attribute "alignment-baseline", "baseline-shift",
...
ERROR(RSC-005): /home/dave/proj/ebooks/latex/foo/foo.epub/OEBPS/art/card.svg(151,24): Error while parsing file: attribute "sodipodi:role" not allowed here; expected attribute "alignment-baseline", "baseline-shift", "class", "clip-path", "clip-rule", "color", "color-interpolation",
...

Why does the presence of these attributes matter?

@mattgarrish
Member

They shouldn't. The use of foreign namespaces should have been fixed with #491.

Are you using an older version of EPUBCheck by any chance? I haven't been able to reproduce the errors with the latest version, but there could be something I'm not trying.

@DavidGriffith
Author

I just now did a pull to make sure I have the latest of the master branch. I'm getting the same complaints.

@rdeltour
Member

@DavidGriffith thanks for confirming. Would you be able to share a sample EPUB to help us reproduce and further investigate?

@DavidGriffith
Author

Of course -- test.epub wrapped in a zip.

test.zip

@rdeltour
Member

I’m not able to reproduce with your file and EPUBCheck v4.2.2 🤔. Here’s what I get:

EPUBCheck log
$ epubcheck -version
EPUBCheck v4.2.2
No file specified in the arguments. Exiting.
EPUBCheck completed
$ epubcheck test.epub 
Validating using EPUB version 2.0.1 rules.
ERROR(RSC-005): test.epub/OEBPS/testch2.html(16,4): Error while parsing file: element "i" not allowed here; expected the element end-tag or element "address", "blockquote", "del", "div", "dl", "h1", "h2", "h3", "h4", "h5", "h6", "hr", "ins", "noscript", "ns:svg", "ol", "p", "pre", "script", "table" or "ul" (with xmlns:ns="http://www.w3.org/2000/svg")
ERROR(RSC-005): test.epub/OEBPS/testch2.html(17,34): Error while parsing file: element "p" not allowed here; expected the element end-tag, text or element "a", "abbr", "acronym", "applet", "b", "bdo", "big", "br", "cite", "code", "del", "dfn", "em", "i", "iframe", "img", "ins", "kbd", "map", "noscript", "ns:svg", "object", "q", "samp", "script", "small", "span", "strong", "sub", "sup", "tt" or "var" (with xmlns:ns="http://www.w3.org/2000/svg")
FATAL(RSC-016): test.epub/OEBPS/testch2.html(18,60): Fatal Error while parsing file: The element type "p" must be terminated by the matching end-tag "</p>".
ERROR(RSC-005): test.epub/OEBPS/testch2.html(-1,-1): Error while parsing file: The element type "p" must be terminated by the matching end-tag "</p>".
ERROR(RSC-005): test.epub/OEBPS/testch3.html(16,4): Error while parsing file: element "i" not allowed here; expected the element end-tag or element "address", "blockquote", "del", "div", "dl", "h1", "h2", "h3", "h4", "h5", "h6", "hr", "ins", "noscript", "ns:svg", "ol", "p", "pre", "script", "table" or "ul" (with xmlns:ns="http://www.w3.org/2000/svg")
ERROR(RSC-005): test.epub/OEBPS/testch3.html(17,34): Error while parsing file: element "p" not allowed here; expected the element end-tag, text or element "a", "abbr", "acronym", "applet", "b", "bdo", "big", "br", "cite", "code", "del", "dfn", "em", "i", "iframe", "img", "ins", "kbd", "map", "noscript", "ns:svg", "object", "q", "samp", "script", "small", "span", "strong", "sub", "sup", "tt" or "var" (with xmlns:ns="http://www.w3.org/2000/svg")
FATAL(RSC-016): test.epub/OEBPS/testch3.html(18,40): Fatal Error while parsing file: The element type "p" must be terminated by the matching end-tag "</p>".
ERROR(RSC-005): test.epub/OEBPS/testch3.html(-1,-1): Error while parsing file: The element type "p" must be terminated by the matching end-tag "</p>".

Check finished with errors
Messages: 2 fatals / 6 errors / 0 warnings / 0 infos

EPUBCheck completed

@DavidGriffith
Author

DavidGriffith commented Jun 18, 2020

Those complaints about elements not being terminated properly or in the wrong place are from a bug in tex4ebook (michal-h21/tex4ebook#68) that I recently reported. I don't expect them to be relevant to this bug report.

This is what I'm using:

$ ./epubcheck.jar --version
EPUBCheck v4.2.3-SNAPSHOT
No file specified in the arguments. Exiting.
EPUBCheck completed

Maybe I'm missing some library? It took me a long time to figure out just which libraries are required. I'm using Debian 10, and running apt-get build-dep epubcheck didn't catch all of them. Ultimately I had to run mvn -e -X -DskipTests clean install to stop the build from aborting like this:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test (default-test) on project epubcheck: There are test failures.

I can't figure out what to install to satisfy this.

@Doktorchen

According to Synaptic, Debian 10 (stable) uses epubcheck 4.1.0.
On my computer it works with 'epubcheck *.epub'.

Using a newer version from GitHub within a specific directory, it works with something like:
java -jar ~/epubcheck/epubcheck-4.2.1/epubcheck.jar *.epub
(I have not tried newer versions yet.)

@mattgarrish
Member

Ah, this is an EPUB 2 issue with SVG. The test file didn't actually contain any SVG images, but when I added one I was able to reproduce the issue.

@mattgarrish
Member

Looks like we'd need an NVDL script similar to the EPUB 3 one, just switched to send the stripped file for validation against the SVG 1.1 DTD.

@DavidGriffith
Author

> According to Synaptic, Debian 10 (stable) uses epubcheck 4.1.0.
> On my computer it works with 'epubcheck *.epub'.
>
> Using a newer version from GitHub within a specific directory, it works with something like:
> java -jar ~/epubcheck/epubcheck-4.2.1/epubcheck.jar *.epub
> (I have not tried newer versions yet.)

Debian includes a wrapper that allows one to simply execute a .jar file as I did.

@DavidGriffith
Author

Looking at the test EPUB, things are working okay now except for the aforementioned tex4ebook errors. But in my project, where this problem first appeared, I still have trouble. Here's a second test EPUB, prepared more carefully.

test2.zip

@mattgarrish
Member

> But in my project, where this problem first appeared, I still have trouble.

See my comment above. The issue is that foreign namespaced elements and attributes aren't filtered out before validating the SVG against the 1.1 DTD. The DTD can't handle them.

We fixed this problem for EPUB 3 by using an NVDL schema to strip foreign namespaces so that they aren't present in the validation step. The same needs to be implemented for EPUB 2 validation.
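
The filtering idea described above can be illustrated with a minimal sketch. This is not EPUBCheck's implementation (which, per the comment above, relies on an NVDL schema); it is a Python approximation using the standard library, and the names `KEEP_NS`, `namespace_of`, and `strip_foreign` are hypothetical. It assumes that only the SVG, XLink, and XML namespaces should survive the stripping step.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Assumption for this sketch: these are the only namespaces worth keeping.
KEEP_NS = {
    SVG_NS,
    "http://www.w3.org/1999/xlink",
    "http://www.w3.org/XML/1998/namespace",
}
# Serialize SVG elements without an "ns0:" prefix.
ET.register_namespace("", SVG_NS)


def namespace_of(name):
    """Return the URI from an ElementTree name like '{uri}local', else None."""
    if isinstance(name, str) and name.startswith("{"):
        return name[1:].split("}", 1)[0]
    return None


def strip_foreign(elem):
    """Remove foreign-namespaced attributes and child elements, in place."""
    for attr in list(elem.attrib):
        ns = namespace_of(attr)
        # Unprefixed attributes (ns is None) belong to the element and are kept.
        if ns is not None and ns not in KEEP_NS:
            del elem.attrib[attr]
    for child in list(elem):
        # Child elements outside KEEP_NS (including un-namespaced ones) are
        # dropped together with their subtree and any trailing text.
        if namespace_of(child.tag) not in KEEP_NS:
            elem.remove(child)
        else:
            strip_foreign(child)


sample = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" '
    'xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd">'
    '<sodipodi:namedview id="base"/>'
    '<g inkscape:label="Layer 1" inkscape:groupmode="layer" id="layer1">'
    '<text id="t1"><tspan sodipodi:role="line">Hi</tspan></text>'
    '</g></svg>'
)
root = ET.fromstring(sample)
strip_foreign(root)
cleaned = ET.tostring(root, encoding="unicode")
```

After stripping, `cleaned` contains only SVG-namespace markup, which is the kind of input a DTD-based validation step can handle.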

@DavidGriffith
Author

> > But in my project, where this problem first appeared, I still have trouble.
>
> See my comment above. The issue is that foreign namespaced elements and attributes aren't filtered out before validating the SVG against the 1.1 DTD. The DTD can't handle them.
>
> We fixed this problem for EPUB 3 by using an NVDL schema to strip foreign namespaces so that they aren't present in the validation step. The same needs to be implemented for EPUB 2 validation.

Oh. I mistakenly thought you said it was fixed for epub2. My apologies.

@rdeltour added the labels priority: high, spec: EPUB 2.x, status: ready for implem, type: false-positive on Jun 19, 2020
@rdeltour added this to the 4.2.3 milestone on Jun 19, 2020

@laudrain
Collaborator

@DavidGriffith Thanks for the alert!
However, let me encourage you to switch to EPUB 3: all development and maintenance efforts, particularly on accessibility, are based on this up-to-date version.
TeX4ebook is a great tool and it has an EPUB 3 output.

@mattgarrish
Member

FYI, pull request #1150 will fix this problem. You'll still get errors about the flowRoot element, though, as EPUB 2 only supports SVG 1.1 (plus flowRoot died with SVG 1.2, so even an EPUB 3 file won't validate with it).
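
To see in advance which files will still trigger the flowRoot errors, a small check along these lines may help. It is only a sketch using Python's standard library; the function name `find_flowroot_ids` and the sample markup are hypothetical, and it matches on the local element name regardless of namespace.

```python
import xml.etree.ElementTree as ET


def find_flowroot_ids(svg_text):
    """Return the id (or None) of every element whose local name is flowRoot."""
    root = ET.fromstring(svg_text)
    found = []
    for elem in root.iter():
        # ElementTree tags look like '{namespace}local'; keep the local part.
        local = elem.tag.rsplit("}", 1)[-1]
        if local == "flowRoot":
            found.append(elem.get("id"))
    return found


sample = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<flowRoot id="flowRoot1">'
    '<flowRegion><rect width="10" height="10"/></flowRegion>'
    '<flowPara>Some text</flowPara>'
    '</flowRoot>'
    '<text id="t1">Plain text</text>'
    '</svg>'
)
flow_ids = find_flowroot_ids(sample)
```

An empty result means the file contains no flowRoot elements; a non-empty one lists the elements that would need converting to regular text in the SVG editor.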

@DavidGriffith
Author

This seems to clear up the trouble. I've stripped my SVG files of flowRoot by way of this: http://inkscape.13.x6.nabble.com/How-to-eliminate-aria-label-flowRoot-and-rdf-from-my-SVGs-td4981839.html and checked it over with Inkscape.

When my document is built into an epub2 file, I get no errors. But when I generate an epub3 file, epubcheck complains like this:

ERROR(RSC-005): /home/dave/proj/ebooks/latex/foo/foo.epub/OEBPS/longmen.xhtml(3,15): Error while parsing file: Element "title" must not be empty.
ERROR(RSC-005): /home/dave/proj/ebooks/latex/foo/foo.epub/OEBPS/longmench13.xhtml(51,2): Error while parsing file: attribute "cellspacing" not allowed here; expected attribute "about", "accesskey", "aria-activedescendant", "aria-atomic", "aria-autocomplete", "aria-busy", "aria-checked", "aria-colcount", "aria-colindex", "ari
...

I think the complaint that Element "title" must not be empty may have something to do with tex4ebook, and I'll check with that project on this. The complaint that attribute "cellspacing" is not allowed here (which happens eight times) seems like it might be an oversight in #1150.

@Doktorchen

In SVG, the title and desc elements provide a text alternative for their parent element; if the parent is the outermost svg element, for example, they describe the complete graphic.
Providing a text alternative for non-decorative graphics is an accessibility matter; if the SVG represents no content, you simply do not use title or desc in it.

The cellspacing attribute is available in XHTML 1.1 (used in EPUB 2), for example, but not in HTML5 (EPUB 3 uses the XML variant of it). There, cellspacing is marked obsolete on the assumption that it is purely decorative, so CSS should be used for the desired effect. Because spacing can make table content much more readable, we need not share this opinion of the recommendation's authors; however, removed is removed, so the attribute is no longer available, with all the consequences.
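
As an illustration of the CSS route mentioned above, the following sketch moves a cellspacing value into an inline border-spacing rule. It is an assumption-laden example, not part of EPUBCheck or tex4ebook: the function name `cellspacing_to_css` is hypothetical, the attribute value is assumed to be a plain pixel number, and border-spacing only has an effect while the table uses the separated borders model (the CSS default).

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML elements without an "ns0:" prefix.
ET.register_namespace("", XHTML_NS)


def cellspacing_to_css(root):
    """Replace each table's cellspacing attribute with an inline CSS rule.

    Returns the number of tables that were changed.
    """
    changed = 0
    for table in root.iter(f"{{{XHTML_NS}}}table"):
        value = table.attrib.pop("cellspacing", None)
        if value is None:
            continue
        # Assumption: the attribute holds a bare number of pixels.
        rule = f"border-spacing: {value}px"
        existing = table.get("style")
        table.set("style", f"{existing}; {rule}" if existing else rule)
        changed += 1
    return changed


doc = ET.fromstring(
    f'<html xmlns="{XHTML_NS}"><body>'
    '<table cellspacing="4"><tr><td>a</td></tr></table>'
    '</body></html>'
)
count = cellspacing_to_css(doc)
result = ET.tostring(doc, encoding="unicode")
```

In a real project the rule would more likely live in a stylesheet than in an inline style attribute; the inline form just keeps the example self-contained.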

@rdeltour added the label status: has PR and removed the label status: ready for implem on Jun 22, 2020