Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Moved Code for Json Files should not show as a difference #62

Open
MiguelElGallo opened this issue Jun 20, 2024 · 11 comments
Open

Moved Code for Json Files should not show as a difference #62

MiguelElGallo opened this issue Jun 20, 2024 · 11 comments
Labels
enhancement New feature or request vscode The issue is related to our VS Code extension

Comments

@MiguelElGallo
Copy link

Is your feature request related to a problem? Please describe.
I deal with reviewing changes to Infra as Code (JSON/ARM) , which are JSON files (2 to 8 MB) in which the sections are always generated in different order. Ask the Azure guys why that happen. I need to compare those JSON files, and validate the changes, but since sections are in different order, they are shown as differences, which in JSON those are not.

Describe the solution you'd like
A flag to tell only show differences, do not mind the order, for JSON files.

Describe alternatives you've considered
Write my own code, but the visualiation part will be a challenge.

Additional context
None

@MiguelElGallo MiguelElGallo added enhancement New feature or request vscode The issue is related to our VS Code extension labels Jun 20, 2024
@mmueller2012
Copy link
Contributor

Thanks for your feature request.

Can you give me an example of an old and new JSON document that contains such a move? A dummy example or a trimmed down version where you replace sensitive data with placeholder values would be fine.

Maybe we can add some logic specific to ARM templates, but I need to understand the use case a bit better for this.

@MiguelElGallo
Copy link
Author

@mmueller2012 Thanks for the quick reply.

This is a simple example, these two files are identical, from a Json ARM, perspective.

But it shows moved without any changes. Which is the case, but on large files it gets, impossible to spot the "real" differences.

Let me know if this good enough as an example.

ver1.json
ver2.json

@mmueller2012
Copy link
Contributor

Thanks for the example.

SemanticDiff already ignores reordering of keys, but that doesn't help in this case because the elements within an array change their order. We obviously cannot add a generic rule to ignore such changes for all JSON files, but we should be able to detect these templates based on the provided $schema value and enable more specific rules.

Based on the schema mentioned in the JSON file, this seems to mostly affect the top-level keys functions and resources, but the schema references 333 subschemas that we may also need to consider. Does the reordering only affect these top-level keys or is it present throughout the document?

@MiguelElGallo
Copy link
Author

It is present throughout the document. For example inside resources::properties could be in different order.

@mmueller2012
Copy link
Contributor

Manually writing the rules for all 333 schemas is not feasible, but it should be possible to write a script that extracts the location of all arrays. This still requires manual annotation of which array is unordered, but it sounds much more doable. We will take a look at it.

@MiguelElGallo
Copy link
Author

@mmueller2012 sorry forgot to tag you...

@MiguelElGallo
Copy link
Author

@mmueller2012 do you think this is doable ?

@mmueller2012
Copy link
Contributor

@MiguelElGallo We took a quick look and found that the schema contains a lot of oneOf entries. They complicate things because SemanticDiff needs to distinguish the object type in order to apply rules conditionally. We haven't decided on way forward yet, as we are currently finalizing the next SemanticDiff release. We will look into this further after the new version is released.

@MiguelElGallo
Copy link
Author

@mmueller2012 Thanks! Let me know if you decide anything! Good luck with the release!

@MiguelElGallo
Copy link
Author

@mmueller2012 Any updates relates yo this?

@mmueller2012
Copy link
Contributor

@MiguelElGallo Sorry for not posting an update. We took a closer look at the idea a few weeks ago.

During our analysis we found two problems:

  1. The schema uses a lot of "oneOf" definitions, which means we cannot simply add rules like "resources[]->foo->bar[]". For each layer, we may need to add type checks by looking for other keys.
  2. The schema definitions don't tell you which arrays are unordered.

The first would require support for more complex pattern matching for rules, but the actual patterns could be generated based on the JSON schema. The second, however, is more difficult. We are not aware of any other place to look for this information, and fear that it would require manual annotation. Adding manual annotations seems unfeasible as there are way too many arrays and it would require deep knowledge of Azure (e.g. does changing the order of vpnAuthenticationTypes affect which one is preferred?)

We concluded that it is not possible to hardcode such rules into SemanticDiff. A possible way forward would be to add loadable rules and allow developers with more knowledge of Azure to work on a community maintained set of rules. This would require some effort (adding loadable rules, providing debugging utilities, publicly documenting the system, ...) and it is not clear if there is enough interest for such a system yet.

We also discussed your original idea of adding an option to not show moved code as a change. This option has the big disadvantage that it might hide potentially relevant changes and that is something we really want to avoid. It could still serve as an intermediate solution for now, so we have added it to the backlog. I am not sure if it will make the cut for the next release though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request vscode The issue is related to our VS Code extension
Projects
None yet
Development

No branches or pull requests

2 participants