Unexpected transformation from <em></em> tag to <em /> #2291
Replies: 2 comments
-
Hi,

It won't if that's the only instance of the tag. But jsoup will infer the ability for a tag to self-close if it sees an example of such in the document, and so in the XML serialization an empty element may be output as a self-closing tag. See e.g. https://try.jsoup.org/~MPZy4b2lpKjmsefN8xnhk39-meY

This behaviour has not changed recently. Can you check whether your document has a self-closing <em /> somewhere?

In #2285 I plan to make (custom) tag settings configurable, so you could override this inference.

Let me ask though: what is the impact of this for you? XML parsers support self-closing tags, and semantically both forms are equivalent. If you are feeding this eventually to an HTML parser, then why serialize as XML? Use HTML.

(Providing a code sample would be helpful here to show how you've configured jsoup and are using it.)
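To illustrate the point that XML parsers treat an empty element and a self-closing tag the same way, here is a minimal sketch using only the JDK's built-in DOM parser (not jsoup). The class name `EmptyElementEquivalence` and the helper `sameXml` are made up for this example; it parses both spellings and compares the resulting trees with `Node.isEqualNode`.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration; it is not part of jsoup.
public class EmptyElementEquivalence {

    // Parse an XML string into a DOM document with the JDK's default parser.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML: " + xml, e);
        }
    }

    // True when both inputs produce structurally equal DOM trees.
    static boolean sameXml(String a, String b) {
        return parse(a).getDocumentElement()
                .isEqualNode(parse(b).getDocumentElement());
    }

    public static void main(String[] args) {
        // <em></em> and <em /> both parse to an element with no children.
        System.out.println(sameXml("<p>text <em></em></p>", "<p>text <em /></p>")); // true
        // An element with text content is a different tree.
        System.out.println(sameXml("<p>text <em></em></p>", "<p>text <em>x</em></p>")); // false
    }
}
```

If the consumer of the output is a conforming XML parser, the two serializations should therefore be interchangeable; the difference only matters when the output is later read as HTML, where `<em />` is not treated as a closed element.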
-
Good afternoon! Thank you very much. The self-closing of the "em" tag was really not what I expected. It doesn't exactly cause a bug for me, but it is good to be aware of the library's standard behaviour. I checked the code block that imposes self-closing in the parser when it encounters such an occurrence.
-
Does jsoup, when parsing XML, transform a "<em></em>" tag into "<em />"?
Was this introduced recently?
How can I see the output of an XML parser? Is there documentation?
How can I avoid this?