Latest version generates unexpected space in <em> tag #2279
Replies: 2 comments
-
Hi there,

Can you describe what you're ultimately trying to do, and why you're going between the HTML and XML parsers / serializers?

The pretty-printer is designed for HTML. It only knows about HTML tags, and only when it is in the HTML (parse/serialize) mode. We automatically disable it when parsing XML. In XML, we don't know if

The style of the pretty-printer is subject to change and is for HTML, so I don't consider this a bug. Perhaps we can add better heuristics to work out a nice way to emit it; that would be fun to add. I am planning on revisiting the printer in a version down the track to simplify it and improve the output. Also, I have been thinking of adding configuration to the printer to allow custom tags / formatting options, like your reflection code is doing. (But actually part of the API; please don't be upset when that code breaks! :)

For now, my suggestion would probably be not to bounce between the HTML and XML parsers (but as I'm not clear on why you're using that, and without seeing your actual code flow, I'm not sure). Or, add the HTML element formatters into your custom XML injection.
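As a rough illustration of the first suggestion, one option is to keep using the HTML parser and only switch the output syntax to XML, with pretty-printing turned off so the serializer does not add whitespace of its own. This is a minimal sketch, assuming jsoup is on the classpath. The class and method names (`RoundTripSketch`, `htmlToXml`, `xmlToHtml`) are hypothetical, and disabling pretty-printing is an assumption made for the example, not the setup described in the original post.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

public class RoundTripSketch {

    // HTML -> XML: parse as HTML, then emit XML syntax without pretty-printing.
    static String htmlToXml(String html) {
        Document doc = Jsoup.parse(html);
        doc.outputSettings()
           .syntax(Document.OutputSettings.Syntax.xml)
           .prettyPrint(false); // no indentation or line breaks are added
        return doc.body().html();
    }

    // XML -> HTML: parse with the XML parser, then emit HTML syntax.
    static String xmlToHtml(String xml) {
        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
        doc.outputSettings()
           .syntax(Document.OutputSettings.Syntax.html)
           .prettyPrint(false);
        return doc.html();
    }

    public static void main(String[] args) {
        String xml = htmlToXml("<p>dd<em>d</em>dd</p>");
        System.out.println(xml);
        System.out.println(xmlToHtml(xml));
    }
}
```

If indented output is still needed for storage, it may be simpler to apply it only at the final output step, so the intermediate XML keeps the inline content unchanged.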
-
Thanks, I managed to solve it using my custom inline tag guidelines.
-
I use jsoup to do a dynamic two-way conversion in my application:
HTML -> XML
XML -> HTML
I updated jsoup to the newest version available on this date. After the update, the conversion from HTML to XML started generating line breaks in the <em> tag, including a whitespace after the opening tag that does not exist in the HTML but ends up in the XML. Example in HTML and how it is saved in XML:
"<p>dd<em>d</em>dd</p>"
The expectation is that <em> would be treated as an inline tag, without the space that now appears after the opening tag (<em>SPACE B</em>).
I use the following configuration file:
I use prettyprint and indent(4) in all jsoup Document parsers.
This is causing a problem for me because it is generating an unexpected space in the em tag.
Thank you in advance for your help!!