Sanitize Participant #4588

Draft
wants to merge 2 commits into
base: main

@9swampy 9swampy commented Jun 22, 2025

Description

The SequenceDiagrams generated by the ComplexGitflowExample were invalid because some participant names contained characters that PlantUml does not accept. I've sanitized them so that the SequenceDiagrams output by the tests can be copied and pasted into PlantUml and viewed.

Related Issue

#4585

Motivation and Context

So the SequenceDiagrams output by tests are cut & paste viewable.

How Has This Been Tested?

Test coverage and output of ComplexGitflowExample is now cut-paste viewable.

Screenshots (if appropriate):

For example feature/f1 is invalid - participant can't have '/' in it.

image

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@9swampy 9swampy force-pushed the SanitizeParticipant branch 2 times, most recently from cfa4003 to 76fb48d Compare June 22, 2025 18:13
@asbjornu asbjornu force-pushed the SanitizeParticipant branch from 76fb48d to 37c11c0 Compare June 22, 2025 21:38
Member

@asbjornu asbjornu left a comment

Thanks for noticing this and providing a fix! By the test cases, it seems like a simple .RegexReplace("[^a-zA-Z0-9]", "_") would do the trick – am I missing something? :)

public static string RegexReplace(this string input, string pattern, string replace)
{
    var regex = RegexPatterns.Cache.GetOrAdd(pattern);
    return regex.Replace(input, replace);
}
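
As a minimal sketch of that suggestion, the snippet below calls `Regex.Replace` directly instead of the project's cached `RegexReplace` extension. The class and method names are hypothetical and exist only for illustration. Every character outside `[a-zA-Z0-9]` becomes an underscore, so `feature/f1` turns into `feature_f1` and `release/1.0.0` into `release_1_0_0`.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of GitVersion.
public static class ParticipantNameSketch
{
    // Replaces each character that is not an ASCII letter or digit with '_'.
    public static string Sanitize(string name) =>
        Regex.Replace(name, "[^a-zA-Z0-9]", "_");
}
```

Calling `ParticipantNameSketch.Sanitize("kebab-folder/1234-is-id")` would give `kebab_folder_1234_is_id`, since both `-` and `/` fall outside the allowed character class.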

@9swampy
Author

9swampy commented Jun 22, 2025

Don't let perfection be the enemy of good enough; fair. I'm a little OCD when it comes to clean consistency, so my initial proposal is a bit opinionated (and cleaner, IMHO), but the goal is to be able to copy and paste into PlantUml and get an output. There are likely other curiosities I've not encountered, so the simplicity of the regex you propose wouldn't go amiss. If you're happy to take one or the other, or both options, let me know how to tidy up the PR to meet acceptable standards for a merge. Thanks for your consideration.

| Input | PascalCase Output | Underscore Output |
| --- | --- | --- |
| feature/1234-is-id-with-something-kebab | feature_1234_IsIdWithSomethingKebab | feature_1234_is_id_with_something_kebab |
| feature/1234-IsSomethingPascalCase | feature_1234_IsSomethingPascalCase | feature_1234_IsSomethingPascalCase |
| feature/Caps-lower-something-kebab | feature_CapsLowerSomethingKebab | feature_Caps_lower_something_kebab |
| feature/Caps-lower-is-kebab | feature_CapsLowerIsKebab | feature_Caps_lower_is_kebab |
| kebab-folder/1234-is-id-with-something-kebab | KebabFolder_1234_IsIdWithSomethingKebab | kebab_folder_1234_is_id_with_something_kebab |
| kebab-folder/1234-IsSomethingPascalCase | KebabFolder_1234_IsSomethingPascalCase | kebab_folder_1234_IsSomethingPascalCase |
| kebab-folder/Caps-lower-something-kebab | KebabFolder_CapsLowerSomethingKebab | kebab_folder_Caps_lower_something_kebab |
| kebab-folder/Caps-lower-is-kebab | KebabFolder_CapsLowerIsKebab | kebab_folder_Caps_lower_is_kebab |
| PascalCaseFolder/1234-is-id-with-something-kebab | PascalCaseFolder_1234_IsIdWithSomethingKebab | PascalCaseFolder_1234_is_id_with_something_kebab |
| PascalCaseFolder/1234-IsSomethingPascalCase | PascalCaseFolder_1234_IsSomethingPascalCase | PascalCaseFolder_1234_IsSomethingPascalCase |
| PascalCaseFolder/Caps-lower-something-kebab | PascalCaseFolder_CapsLowerSomethingKebab | PascalCaseFolder_Caps_lower_something_kebab |
| PascalCaseFolder/Caps-lower-is-kebab | PascalCaseFolder_CapsLowerIsKebab | PascalCaseFolder_Caps_lower_is_kebab |
| 1234-is-id-with-something-kebab | 1234_IsIdWithSomethingKebab | 1234_is_id_with_something_kebab |
| 1234-IsSomethingPascalCase | 1234_IsSomethingPascalCase | 1234_IsSomethingPascalCase |
| Caps-lower-something-kebab | CapsLowerSomethingKebab | Caps_lower_something_kebab |
| Caps-lower-is-kebab | CapsLowerIsKebab | Caps_lower_is_kebab |
| feature/all-lower-is-kebab | feature_AllLowerIsKebab | feature_all_lower_is_kebab |
| feature/24321-Upperjustoneword | feature_24321_Upperjustoneword | feature_24321_Upperjustoneword |
| feature/justoneword | feature_Justoneword | feature_justoneword |
| feature/PascalCase | feature_PascalCase | feature_PascalCase |
| feature/PascalCase-with-kebab | feature_PascalCaseWithKebab | feature_PascalCase_with_kebab |
| feature/12414 | feature_12414 | feature_12414 |
| feature/12414/12342-FeatureStoryTaskWithShortDescription | feature_12414_12342_FeatureStoryTaskWithShortDescription | feature_12414_12342_FeatureStoryTaskWithShortDescription |
| feature/12414/12342-Short-description | feature_12414_12342_ShortDescription | feature_12414_12342_Short_description |
| feature/12414/12342-short-description | feature_12414_12342_ShortDescription | feature_12414_12342_short_description |
| feature/12414/12342-Short-Description | feature_12414_12342_ShortDescription | feature_12414_12342_Short_Description |
| release/1.0.0 | release_1.0.0 | release_1_0_0 |
| releases | releases | releases |
| feature | feature | feature |
| feature/tfs1-Short-description | feature_tfs1_ShortDescription | feature_tfs1_Short_description |
| feature/f2-Short-description | feature_f2_ShortDescription | feature_f2_Short_description |
| feature/bug1 | feature_bug1 | feature_bug1 |
| f2 | f2 | f2 |
| feature/f2 | feature_f2 | feature_f2 |
| feature/story2 | feature_story2 | feature_story2 |
| master | master | master |
| develop | develop | develop |
| main | main | main |

@9swampy 9swampy force-pushed the SanitizeParticipant branch from 37c11c0 to aaf3c3e Compare June 22, 2025 23:22
@9swampy 9swampy force-pushed the SanitizeParticipant branch from aaf3c3e to ba750dd Compare June 22, 2025 23:33
@9swampy
Author

9swampy commented Jun 23, 2025

So, having circled back to use the output in earnest, I'm seeing that the simple regex really would be sufficient, thanks to the @as mechanism that had already been implemented. The sanitized participant doesn't appear in the UI anyhow: the @as alias takes care of PlantUml while the original name still gets displayed.

feature/f1 maps to feature_f1 so it doesn't break PlantUml but the original feature/f1 is still displayed.
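
To illustrate that mapping, here is a small sketch that builds a PlantUML participant declaration using the standard `participant "Display name" as alias` form. The class and method names are hypothetical, and the exact line that GitVersion's `SequenceDiagram` emits is an assumption on my part. The display name keeps the original branch name, while the alias is the sanitized identifier that PlantUML uses internally.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not GitVersion's actual SequenceDiagram code.
public static class PlantUmlParticipantSketch
{
    // Builds: participant "<original name>" as <sanitized alias>
    public static string Declare(string displayName)
    {
        // The alias may only contain ASCII letters, digits and underscores here.
        var alias = Regex.Replace(displayName, "[^a-zA-Z0-9]", "_");
        return $"participant \"{displayName}\" as {alias}";
    }
}
```

For `feature/f1` this produces `participant "feature/f1" as feature_f1`, so the diagram still shows `feature/f1` while arrows reference `feature_f1`.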

FWIW I see what I was missing now 😁

image

Member

@asbjornu asbjornu left a comment

Yeah, I really appreciate the level of investment, but I think this does a bit much to solve a relatively simple problem. Since, as you write, @as takes care of the human readability aspect of the diagram, I don't think the programmatic name in the diagram needs this level of sophistication. :)


namespace GitVersion.Testing.Helpers;

public static partial class RegexReplacer
Member

Do we need the partial class or could the NonAlphanumericRegex method just be jammed into the (already existing) StringExtensions class?

@@ -39,11 +40,12 @@ public SequenceDiagram()
/// </summary>
public void Participant(string participant, string? @as = null)
{
this.participants.Add(participant, @as ?? participant);
if (@as == null)
var cleanParticipant = ParticipantSanitizer.SanitizeParticipant(@as ?? participant);
Member

Could we not use the RegexReplace() method here directly?

Suggested change
var cleanParticipant = ParticipantSanitizer.SanitizeParticipant(@as ?? participant);
var cleanParticipant = (@as ?? participant).RegexReplace();

@9swampy
Author

9swampy commented Jun 23, 2025

🧵 Comparing Regex Usage Styles in C#

There are two common approaches to working with regular expressions in modern C#:

1. Traditional Runtime Regex (with Optional Caching)

Example:

public static string RegexReplace(this string input, string pattern, string replace) 
{ 
    var regex = RegexPatterns.Cache.GetOrAdd(pattern); 
    return regex.Replace(input, replace); 
}

Used with:

private const string TriviaRegexPattern = "...";
public static Regex TriviaRegex { get; } = new(TriviaRegexPattern, RegexOptions.Singleline | ...);

✅ Pros

  • Supports dynamic and runtime-defined patterns.
  • Allows complex, multi-line regex using RegexOptions.IgnorePatternWhitespace.
  • Can use caching to reduce recompilation overhead.

❌ Cons

  • Patterns are compiled at runtime, which is slower.
  • Errors in patterns aren't caught until runtime.
  • Refactoring tools don't understand regex inside strings.
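
A rough sketch of what such a cache could look like follows. This is not the project's `RegexPatterns.Cache` implementation; the class name, the `ConcurrentDictionary` backing store and the `RegexOptions.Compiled` choice are all assumptions made for illustration.

```csharp
using System.Collections.Concurrent;
using System.Text.RegularExpressions;

// Hypothetical stand-in for a runtime regex cache.
public static class RegexCacheSketch
{
    // One Regex instance is stored per distinct pattern string.
    private static readonly ConcurrentDictionary<string, Regex> Cache = new();

    // Returns the cached Regex, constructing it on first use of the pattern.
    public static Regex GetOrAdd(string pattern) =>
        Cache.GetOrAdd(pattern, p => new Regex(p, RegexOptions.Compiled));

    // Extension method mirroring the shape of the snippet above.
    public static string RegexReplace(this string input, string pattern, string replace) =>
        GetOrAdd(pattern).Replace(input, replace);
}
```

The pattern is still parsed at runtime on first use, so a malformed pattern only fails when the code path is actually executed.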

2. [GeneratedRegex] Attribute (Compile-Time Regex)

Example:

[GeneratedRegex("[^a-zA-Z0-9]")]
public static partial Regex NonAlphanumericRegex();

✅ Pros

  • Patterns are compiled at build time — no runtime parsing.
  • Catches syntax errors at compile time.
  • Fully compatible with AOT and IL trimming.
  • Better performance than runtime regex.

❌ Cons

  • Requires .NET 7 or later.
  • Only supports compile-time constant patterns.
  • Verbose/complex patterns can be harder to maintain in this format.
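
For completeness, a self-contained sketch of how the generated regex might be used for participant sanitizing. The containing class name and the `Sanitize` method are hypothetical; `[GeneratedRegex]` itself requires .NET 7 or later, and the containing type must be declared `partial` so the source generator can supply the method body.

```csharp
using System.Text.RegularExpressions;

// Hypothetical container; the class must be partial for the source generator.
public static partial class GeneratedRegexSketch
{
    // The implementation of this method is emitted at build time.
    [GeneratedRegex("[^a-zA-Z0-9]")]
    private static partial Regex NonAlphanumericRegex();

    // Replaces every non-alphanumeric character with an underscore.
    public static string Sanitize(string name) =>
        NonAlphanumericRegex().Replace(name, "_");
}
```

With this in place, `GeneratedRegexSketch.Sanitize("feature/f1")` would return `feature_f1`, the same result as the runtime variant.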

✅ When to Use Which?

| Use Case | Recommended Approach |
| --- | --- |
| Known, constant patterns | GeneratedRegex |
| Dynamic or user-defined patterns | Runtime + caching (RegexPatterns.Cache) |
| Performance-critical code | GeneratedRegex |
| Multi-line, complex patterns | Traditional regex + IgnorePatternWhitespace |
| Needs AOT compatibility | GeneratedRegex (strongly preferred) |

If you prefer, I can just follow the RegexPatterns convention for consistency. But since you've dropped net6, I'd strongly suggest converting the preexisting patterns too: they're heavily used in the calculation, so wouldn't the compile-time optimization be worthwhile?

Let me know which route to go and I'll rip out the initial one, leave only regex.

@asbjornu
Member

> If you prefer, I can just follow the RegexPatterns convention for consistency. But since you've dropped net6, I'd strongly suggest converting the preexisting patterns too: they're heavily used in the calculation, so wouldn't the compile-time optimization be worthwhile?

As long as all tests pass, I'm all for converting to the new, precompiled variant! 🙂👍🏼

@arturcic
Member

> If you prefer, I can just follow the RegexPatterns convention for consistency. But since you've dropped net6, I'd strongly suggest converting the preexisting patterns too: they're heavily used in the calculation, so wouldn't the compile-time optimization be worthwhile?
>
> Let me know which route to go and I'll rip out the initial one, leave only regex.

There is this class, https://github.com/GitTools/GitVersion/blob/main/src/GitVersion.Core/Core/RegexPatterns.cs, where we keep all our regexes. We will migrate it to the newer GeneratedRegex in one go later on.
