Fix #44 SI-9060 XML 5th edition name characters (#93)
XML parsing used to be shared with the compiler, or cut and pasted from it. The compiler has to know when to enter "xml mode". (That answers your question about why the colon was disallowed.) I don't see why you'd want to make something minimally correct with a minimal change.
Thanks for the feedback, @som-snytt. I have opened a new issue, #94, to look further at names that start with a colon. I added some examples there, but please add more so I can understand how the colon should ideally work with XML names. Thanks again.
@som-snytt I don't understand what you mean...?
@SethTisue I don't understand why my other PR was cannibalized instead of merged. I think my comment meant: why take correctness piecemeal instead of whole hog?
@som-snytt It's been a while, but I recall that the minimal changes pass the unit test you provided. I was trying to improve the likelihood that a fix would get merged by making a more minimal change based on your work. Evidently, that didn't speed anything up.
Needs a rebase because the tests have moved around.
* jvm/src/main/scala/scala/xml/parsing/TokenTests.scala (isNameChar): Add middle dot, #xB7, as specified at [4a] of XML 1.0 5th edition.
* jvm/src/test/scala/scala/xml/XMLTest.scala (t9060): New test.
* jvm/src/test/scala/scala/xml/parsing/ConstructingParserTest.scala (t9060): New test.

* jvm/src/main/scala/scala/xml/parsing/TokenTests.scala (isNameChar): Add middle dot.
* jvm/src/main/scala/scala/xml/parsing/TokenTests.scala (isNameStart): Allow colon to start a name.
* jvm/src/test/scala/scala/xml/UtilityTest.scala (isNameStart): Colon should be able to start a name.
This refactors #44 by @som-snytt down to minimal changes to the tokenization code, namely accepting the "MIDDLE DOT" character, #xB7, as requested in the original ticket.
To follow the standard, a colon should also be allowed to start a name. As far as I can tell from the editions linked below, a colon has always been allowed to start an XML name, in addition to a letter or an underscore. I'm not sure why it was disallowed here, but names that start with a colon are probably rare anyway.
https://www.w3.org/TR/2008/REC-xml-20081126
https://www.w3.org/TR/2006/REC-xml-20060816
https://www.w3.org/TR/2004/REC-xml-20040204
https://www.w3.org/TR/2000/REC-xml-20001006
https://www.w3.org/TR/1998/REC-xml-19980210
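
For reference, here is a rough standalone sketch of the name rules discussed above, transcribed from productions [4], [4a] and [5] of the 5th edition. It is not the code in `TokenTests.scala`: the object `XmlNameSketch` and its `isName` helper are hypothetical names made up for illustration, and the sketch works on `Int` code points rather than on `Char`.

```scala
// Hypothetical illustration only; not the scala-xml implementation.
object XmlNameSketch {
  // Production [4] NameStartChar, as inclusive code point ranges.
  private val startRanges: Seq[(Int, Int)] = Seq(
    0x3A -> 0x3A,       // ":" (the colon this PR allows at the start)
    0x41 -> 0x5A,       // A-Z
    0x5F -> 0x5F,       // "_"
    0x61 -> 0x7A,       // a-z
    0xC0 -> 0xD6, 0xD8 -> 0xF6, 0xF8 -> 0x2FF,
    0x370 -> 0x37D, 0x37F -> 0x1FFF, 0x200C -> 0x200D,
    0x2070 -> 0x218F, 0x2C00 -> 0x2FEF, 0x3001 -> 0xD7FF,
    0xF900 -> 0xFDCF, 0xFDF0 -> 0xFFFD, 0x10000 -> 0xEFFFF
  )

  // Production [4a] NameChar: everything above, plus these ranges.
  private val extraRanges: Seq[(Int, Int)] = Seq(
    0x2D -> 0x2D,       // "-"
    0x2E -> 0x2E,       // "."
    0x30 -> 0x39,       // 0-9
    0xB7 -> 0xB7,       // MIDDLE DOT (the character from the original ticket)
    0x300 -> 0x36F, 0x203F -> 0x2040
  )

  private def inRanges(ranges: Seq[(Int, Int)], cp: Int): Boolean =
    ranges.exists { case (lo, hi) => lo <= cp && cp <= hi }

  def isNameStart(cp: Int): Boolean = inRanges(startRanges, cp)

  def isNameChar(cp: Int): Boolean =
    isNameStart(cp) || inRanges(extraRanges, cp)

  // Production [5] Name ::= NameStartChar (NameChar)*
  def isName(s: String): Boolean = {
    val cps = s.codePoints().toArray
    cps.nonEmpty && isNameStart(cps.head) && cps.tail.forall(isNameChar)
  }
}
```

Under these rules `XmlNameSketch.isName(":a")` and `XmlNameSketch.isName("a\u00B7b")` both hold, while `XmlNameSketch.isName("\u00B7a")` does not, because the middle dot is a name character but not a name start character.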