Skip to content

Add Separator property to XmlTextAttribute and XmlAttributeAttribute for configurable list serialization#126767

Open
Copilot wants to merge 11 commits intomainfrom
copilot/add-xmltextattribute-separator-property
Open

Add Separator property to XmlTextAttribute and XmlAttributeAttribute for configurable list serialization#126767
Copilot wants to merge 11 commits intomainfrom
copilot/add-xmltextattribute-separator-property

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 10, 2026

[XmlText] string[] has always concatenated array items with no separator (val1val2), while [XmlAttribute] string[] uses space separation (val1 val2). This inconsistency is intentional and tested — changing the default would be breaking. This PR adds an opt-in Separator property to both attributes so users can control the separator character.

API Changes

XmlTextAttribute — new char Separator { get; set; } (default '\0' = no separator, preserves existing concatenation behavior):

// opt-in: space-separated text content, round-trips correctly
[XmlText(Separator = ' ')]
public string[] Items { get; set; }

// opt-in: comma-separated
[XmlText(Separator = ',')]
public string[] Tags { get; set; }

XmlAttributeAttribute — same char Separator { get; set; } (default '\0' = use existing space behavior):

// override attribute separator to comma
[XmlAttribute(Separator = ',')]
public string[] Values { get; set; }

'\0' (null char) is the "not set" sentinel — chosen because char? is not valid as a C# attribute argument type (CS0655).

Implementation

  • XmlTextAttribute / XmlAttributeAttribute: add char Separator property
  • Mappings.cs: add char? Separator to TextAccessor and AttributeAccessor (internal; char? is valid here)
  • XmlReflectionImporter: wire attribute → accessor, validate with XmlConvert.VerifyXmlChars() (rejects XML-illegal chars like \x01; '\0' sentinel skips validation)
  • Writers (ReflectionXmlSerializationWriter, XmlSerializationWriter, XmlSerializationWriterILGen): when TextAccessor.Separator.HasValue, emit items with separator between them; for attributes, use Separator ?? ' '
  • Readers (ReflectionXmlSerializationReader, XmlSerializationReader, XmlSerializationReaderILGen): when TextAccessor.Separator.HasValue, split text on separator and populate array; uses string.Split(char) overload (not char[]) for IL-gen compatibility
  • System.Xml.ReaderWriter ref assembly: updated with new public surface

Backward Compatibility

  • [XmlText] string[] with no Separator → unchanged concatenation (val1val2)
  • [XmlAttribute] string[] with no Separator override → unchanged space separation (val1 val2)
  • Existing test XML_TypeWithXmlTextAttributeOnArray continues to assert val1val2
Original prompt

Context

Issue #115837 reports that [XmlText] string[] on element content serializes array items concatenated without any separator (abcd), while [XmlAttribute] string[] serializes them space-separated (a b c d). This inconsistency has existed since .NET Framework and is the documented/tested behavior — changing the default would be a breaking change.

Design Decision (from discussion with area owner)

Rather than adding a simple IsList boolean, the solution is to add a char? Separator property to both XmlTextAttribute and XmlAttributeAttribute, allowing users to opt into list-style serialization with a configurable separator character.

Key design points:

  1. XmlTextAttribute.Separator — defaults to null (meaning no separator, preserving current concatenation behavior of [XmlText] string[]). Setting e.g. Separator = ' ' opts into space-separated list serialization for element text content.

  2. XmlAttributeAttribute.Separator — defaults to ' ' (preserving existing space-separated behavior for [XmlAttribute] string[]). Users can override to a different separator if desired.

  3. Type should be char? (nullable char), where null means "no separator / use default behavior". This eliminates multi-character separator edge cases and is simple to validate and emit.

  4. Validation: Use XmlConvert.VerifyXmlChars() on the separator character at reflection/import time (in XmlReflectionImporter). The XmlWriter layer already handles escaping of characters like <, &, > etc. in both attribute values and text content, so those are safe — they'll be entity-escaped in the output. The only truly dangerous characters are those illegal in XML 1.0 entirely (like \0), which VerifyXmlChars catches.

Implementation Guide

Files that need changes:

Public API surface:

  • src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlTextAttribute.cs — Add public char? Separator { get; set; } property (default: null)
  • src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlAttributes.cs — The XmlAttributeAttribute class needs the same public char? Separator { get; set; } property (default: ' ')

Internal mapping infrastructure:

  • src/libraries/System.Private.Xml/src/System/Xml/Serialization/Mappings.csTextAccessor is currently an empty class (internal sealed class TextAccessor : Accessor { }). Add an IsList property (or Separator property) mirroring what AttributeAccessor already has with its IsList bool. Consider whether to store the separator char itself or just a bool here.

Reflection import (wiring up the attribute to the mapping):

  • src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlReflectionImporter.cs — In ImportAccessorMapping, around line 1630-1640, where [XmlText] on array-like types creates a TextAccessor, wire up the separator from the attribute to the accessor. Validate the separator character here using XmlConvert.VerifyXmlChars(). Also around line 1595, where isList is computed for attributes, incorporate the new Separator property from XmlAttributeAttribute.

Serialization writers (emitting the separator during write):

  • src/libraries/System.Private.Xml/src/System/Xml/Serialization/ReflectionXmlSerializationWriter.cs — In WriteMember, the attribute IsList path already writes " " between values. Add analogous logic for TextAccessor when it has a separator.
  • src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlSerializationWriter.cs — Similar changes for the non-reflection writer path.
  • src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlSerializationWriterILGen.cs — IL-gen'd serializer path needs the same logic.

Deserialization readers (splitting on the separator during read):

  • The reader paths need to split text content on the separator character when deserializing back to an array. Look at how attribute IsList deserialization currently splits on whitespace and apply similar logic for text content.
  • src/libraries/System.Private.Xml/src/System/Xml/Serialization/ReflectionXmlSerializationReader.cs
  • src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlSerializationReader.cs
  • src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlSerializationReaderILGen.cs

What NOT to change:

  • The existing [XmlAttribute] string[] default behavior must remain space-separated (backward compatible)
  • The existing [XmlText] string[] default behavior must remain concatenated with no separator (backward compatible)
  • The existing test XML_TypeWithXmlTextAttributeOnArray asserts val1val2 concatenation — this must continue to pass

Tests to add:

  • [XmlText(Separator = ' ')] string[] round-trips as space-separated text content
  • [XmlText(Separator = ',')] string[] round-trips with comma separation
  • [XmlText] string[] (no separa...

This pull request was created from Copilot chat.

Fixes #115837

…for configurable list serialization

Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/68db653f-3600-4ba3-b399-78995d7c9d4b

Co-authored-by: StephenMolloy <19562826+StephenMolloy@users.noreply.github.com>
StephenMolloy and others added 3 commits May 1, 2026 14:47
… update error messages and add tests for invalid characters
…add tests for no-separator cases

Co-authored-by: Copilot <copilot@github.com>
…ests

- Guard WriteValue(separatorStr) calls in WriteArrayItems with hasSeparator
  so the reflection writer makes no inter-element call when no Separator
  is configured (matches pre-PR behavior; the prior unconditional
  WriteValue("") call was a no-op for the built-in XmlWriter but
  observable to custom subclasses).

- Hoist separatorChar.ToString() outside the WriteMember enumeration
  loop. Char.ToString() allocates a fresh single-char string per call.

- Convert existing separator-set round-trip tests from skipStringCompare
  to explicit XML baselines for byte-level wire-format coverage of
  [XmlText(Separator=' ')], [XmlText(Separator=',')] and
  [XmlAttribute(Separator=',')].

- Add edge-case tests: single-element arrays (no spurious separator
  emitted) and embedded empty strings (e.g. ['a','','c'] with separator
  ',' round-trips through 'a,,c') for both [XmlText] and [XmlAttribute].

- Add wire-format preservation tests asserting that, when no Separator
  is specified, the output is byte-for-byte identical to pre-PR
  behavior for single-element [XmlText] and [XmlAttribute] string
  arrays.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 2, 2026 04:08
@StephenMolloy StephenMolloy added this to the 11.0.0 milestone May 2, 2026
@StephenMolloy StephenMolloy marked this pull request as ready for review May 2, 2026 04:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new Separator option to XmlTextAttribute / XmlAttributeAttribute so XML serializer list-like members can use a caller-chosen delimiter without changing existing defaults.

Changes:

  • Adds new public Separator properties to XmlTextAttribute and XmlAttributeAttribute, plus ref assembly updates.
  • Threads separator metadata through XmlSerializer mappings/importer and updates reflection, generated-code, and IL-gen reader/writer paths.
  • Adds runtime-only serializer tests covering default behavior, custom separators, and separator validation.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/libraries/System.Xml.ReaderWriter/ref/System.Xml.ReaderWriter.cs Adds the new public API surface to the ref assembly.
src/libraries/System.Runtime.Serialization.Xml/tests/SerializationTypes.RuntimeOnly.cs Adds serializer test model types for separator scenarios.
src/libraries/System.Private.Xml/tests/XmlSerializer/XmlSerializerTests.RuntimeOnly.cs Adds runtime-only XmlSerializer tests for custom separators and validation.
src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlTextAttribute.cs Adds XmlTextAttribute.Separator.
src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlSerializationWriterILGen.cs Updates IL-generated writer paths for custom separators.
src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlSerializationWriter.cs Updates source-generated writer paths and adds char literal emission helper.
src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlSerializationReaderILGen.cs Updates IL-generated reader paths to split on custom separators.
src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlSerializationReader.cs Updates source-generated reader paths for custom separator parsing.
src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlSerializationGeneratedCode.cs Exposes the new char-literal helper to generated-code infrastructure.
src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlReflectionImporter.cs Imports separator metadata and validates separator chars.
src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlAttributeAttribute.cs Adds XmlAttributeAttribute.Separator.
src/libraries/System.Private.Xml/src/System/Xml/Serialization/ReflectionXmlSerializationWriter.cs Updates reflection-based writer paths for custom separators.
src/libraries/System.Private.Xml/src/System/Xml/Serialization/ReflectionXmlSerializationReader.cs Updates reflection-based reader paths for custom separator parsing.
src/libraries/System.Private.Xml/src/System/Xml/Serialization/Mappings.cs Adds separator storage to text/attribute accessors.
src/libraries/System.Private.Xml/src/Resources/Strings.resx Adds the separator-validation resource string.

public string DataType { get { throw null; } set { } }
public System.Xml.Schema.XmlSchemaForm Form { get { throw null; } set { } }
public string? Namespace { get { throw null; } set { } }
public char Separator { get { throw null; } set { } }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once code-reviewed within the team, (but before merging of course,) we will go through the API approval process.

Comment thread src/libraries/System.Private.Xml/src/Resources/Strings.resx Outdated
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 2, 2026

🤖 Copilot Code Review — PR #126767

Note

This review was generated by GitHub Copilot.

Holistic Assessment

Motivation: This PR adds a Separator property to XmlTextAttribute and XmlAttributeAttribute, allowing users to configure the delimiter used when serializing/deserializing array-typed members as XML text/attribute values. This addresses a real gap — the existing hardcoded space separator for attributes and concatenation-without-separator for text is inflexible.

Approach: The implementation is thorough, touching all four serialization code paths (reflection reader/writer, IL-gen reader/writer, and the C#-source-gen reader/writer). The API design using a char property with '\0' as sentinel for "default behavior" is reasonable, and the separator validation ensures only XML-safe characters are accepted.

Summary: ❌ Needs Changes. This PR introduces new public API surface (Separator on two public attribute types + ref assembly changes) with no visible linked api-approved issue. Per dotnet/runtime policy, new public APIs require an approved proposal before PR submission. Additionally, there are leftover TODO comments that should be removed.


Detailed Findings

❌ API Approval — No linked api-approved issue found

This PR adds two new public properties to the ref assembly (System.Xml.ReaderWriter.cs):

  • XmlAttributeAttribute.Separator { get; set; }
  • XmlTextAttribute.Separator { get; set; }

Per dotnet/runtime policy, new public APIs require an approved proposal (issue with api-approved label) before PR submission. I could not find a linked issue with this label in the commit messages or PR description. If the approval exists, please link it in the PR description. If not, the API needs to go through review before this can merge, or the properties should be marked internal pending approval.

⚠️ Code Quality — Leftover TODO comments in test files (merge-blocking)

Both test files contain // TODO smolloy - new after here comments:

  • XmlSerializerTests.RuntimeOnly.cs (line 3669)
  • SerializationTypes.RuntimeOnly.cs (line 4638)

These development markers should be removed before merge.

⚠️ Design — Asymmetric default behavior between the two Separator properties (advisory)

The two Separator properties have subtly different semantics when set to '\0' (the default):

  • XmlAttributeAttribute.Separator: defaults to space-separated (existing behavior preserved)
  • XmlTextAttribute.Separator: defaults to no separator / concatenation (existing behavior preserved)

While this preserves backward compatibility (correct), it creates a confusing API surface where the same property name has different default semantics on different attributes. The XML docs partially explain this, but consider whether users could be confused. This is advisory — it may be intentional from the API design.

⚠️ Generated Code Change — if (e != null) now unconditionally braced (advisory)

In XmlSerializationWriter.cs WriteArrayItems, the if (e != null) was changed from brace-less to a braced block unconditionally (not just when hasSeparator is true). This changes the text of generated C# source code even for existing types that don't use the new Separator feature. While functionally equivalent, if any tooling compares generated serialization source, this could trigger unexpected differences. Low risk but worth noting.

💡 Suggestion — WriteQuotedCSharpChar uses \x escape

The \x escape for chars < 32 in WriteQuotedCSharpChar is correct in this context (char literals in single quotes). However, \u00XX would be more robust since \x greedily consumes up to 4 hex digits. Since this is always within a '...' literal it works correctly, but \u would be safer if the method were ever reused in a string context.

✅ Correctness — Separator validation is sound

ValidateSeparatorChar correctly validates that the separator is safe for both XML text content and attribute values using XmlCharType.IsTextChar and XmlCharType.IsAttributeValueChar, properly rejecting <, >, &, ", control characters, etc.

✅ Backward Compatibility — Existing behavior preserved

When Separator is not set (default '\0'), all code paths fall through to the existing behavior. Wire-format preservation tests confirm this for both single-element and multi-element cases.

✅ Test Coverage — Comprehensive

Tests cover: comma separators, space separators, no-separator backward compat, single-element arrays, embedded empty strings, invalid separator rejection, and wire-format preservation for both [XmlText] and [XmlAttribute] paths.

Generated by Code Review for issue #126767 ·

StephenMolloy and others added 3 commits May 4, 2026 00:56
Throw InvalidOperationException when XmlTextAttribute.Separator is set on
a member that also has [XmlElement] or [XmlAnyElement] mappings. The
writer inserts separators between all array items, so on mixed-content
members this would emit separator characters around element nodes and
corrupt the round-trip.

Add tests for both [XmlText]+[XmlElement] and [XmlText]+[XmlAnyElement]
mixed-content cases in both ILGen and ReflectionOnly serializer modes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Track whether the previous array item was written as text in all three
writers (Reflection, ILGen, source-gen) so that separators are only
emitted between consecutive text items in mixed content arrays.

Remove the mixed-content guard from XmlReflectionImporter since the
separator now works correctly with mixed content.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Cover three coverage gaps surfaced during the test review:

- Mixed content WITHOUT a Separator (default path through the new lastWasText/curIsText tracking, verifies no regression for the unconfigured case).

- Null item between two text items in a separator-tracking array (verifies the recently fixed text-branch null guard and that null preserves lastWasText so the next text item still gets its leading separator).

- Null item between an element and a text item (verifies the status-quo return preserves lastWasText=false so the following text item does NOT get a stray leading separator).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.

Keep generated serializers from clearing the text separator state when a choice-identifier item writes nothing, matching the reflection writer behavior. Add coverage for null mixed-content items with XmlChoiceIdentifier.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 17 out of 18 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlReflectionImporter.cs:1626

  • Separator becomes part of the AttributeAccessor state, but ReconcileAccessor still only compares mapping/default when deduplicating qualified attributes. If two different members use the same qualified attribute name/type with different separators, the second import will silently reuse the first accessor and inherit its separator. That makes one of the serializers emit/parse the wrong wire format depending on import order.
                    if (a.XmlAttribute.Separator != '\0')
                    {
                        ValidateSeparatorChar(a.XmlAttribute.Separator, accessorName);
                        attribute.Separator = a.XmlAttribute.Separator;
                    }
                    attribute.Default = GetDefaultValue(model.FieldTypeDesc, model.FieldType, a);
                    attribute.Any = (a.XmlAnyAttribute != null);
                    if (attribute.Form == XmlSchemaForm.Qualified && attribute.Namespace != ns)
                    {
                        _xsdAttributes ??= new NameTable();
                        attribute = (AttributeAccessor)ReconcileAccessor(attribute, _xsdAttributes);

Comment thread src/libraries/System.Private.Xml/src/Resources/Strings.resx Outdated
StephenMolloy and others added 2 commits May 5, 2026 16:42
XmlText with a SpecialMapping target (XmlNode arrays / mixed content with
typeof(XmlNode)) silently breaks round-trip when Separator is set: the
writer emits separator-joined text, but the reader's SpecialMapping path
creates a single XmlNode from the entire text run with no split. Reject
that combination at mapping time and tighten the Separator XML doc to
state it applies to arrays of strings.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…es for XML text and attribute handling

Co-authored-by: Copilot <copilot@github.com>
Copilot AI review requested due to automatic review settings May 6, 2026 04:37
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 17 out of 18 changed files in this pull request and generated 1 comment.

Comment on lines +2278 to +2290
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static void ValidateTextSeparatorChar(char separator, string memberName)
{
if (!XmlCharType.IsTextChar(separator))
throw new InvalidOperationException(SR.Format(SR.XmlInvalidTextSeparatorChar, memberName));
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static void ValidateAttributeSeparatorChar(char separator, string memberName)
{
if (!XmlCharType.IsAttributeValueChar(separator))
throw new InvalidOperationException(SR.Format(SR.XmlInvalidAttributeSeparatorChar, memberName));
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

XML serialization of xs:list elements doesn't respect white-space separation

3 participants