update validate-tooling-data for eliminate case insensitive languages #1516

Open
wants to merge 1 commit into
base: main


Vishv0407
Contributor

What kind of change does this PR introduce?
Feature - Adds case-insensitive unique validation for language entries

Issue Number:

Screenshots/videos:
I deliberately introduced a casing mistake in a language name:
image

The validator catches the mistake:
image

If relevant, did you update the documentation?

Summary
This PR introduces case-insensitive unique validation for language entries in the tooling data to solve two existing problems:

  1. Inconsistent language casing across tools (e.g., "JavaScript" vs "javascript" vs "JAVASCRIPT")
  2. Potential confusion for users seeing the same language listed multiple times

My solution:
Implements a custom AJV keyword caseInsensitiveUnique that:

  • Detects and reports case-insensitive duplicates by comparing a Set of the original spellings with a Set of their lowercased forms
  • Provides clear error messages for easy fixes

    ajv.addKeyword({
      keyword: 'caseInsensitiveUnique',
      type: 'array',
      // The function is named so it can attach errors to itself;
      // AJV reads custom errors from the validate function's own .errors.
      validate: function validate(schema, data) {
        if (!Array.isArray(data)) return false;

        // Collect every spelling as written, and every spelling lowercased.
        const languagesSet = new Set();
        const languagesLowercaseSet = new Set();
        data.forEach((tool) => {
          if (tool.languages) {
            tool.languages.forEach((language) => {
              languagesSet.add(language);
              languagesLowercaseSet.add(language.toLowerCase());
            });
          }
        });

        // Fewer lowercased entries means some spellings differ only by case.
        if (languagesSet.size !== languagesLowercaseSet.size) {
          console.error('Duplicate languages found');

          // Count how many distinct spellings share each lowercase form.
          const lowercaseMap = new Map();
          languagesSet.forEach((language) => {
            const key = language.toLowerCase();
            lowercaseMap.set(key, (lowercaseMap.get(key) || 0) + 1);
          });

          lowercaseMap.forEach((value, key) => {
            if (value > 1) {
              console.error('Duplicate found for:', key);
            }
          });

          validate.errors = [{
            keyword: 'caseInsensitiveUnique',
            message: 'array contains case-insensitive duplicates',
            params: { keyword: 'caseInsensitiveUnique' }
          }];
          return false;
        }
        return true;
      }
    });
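
The detection step can also be tried on its own, without AJV. The sketch below is not part of this PR: `findCaseInsensitiveDuplicates` and the sample data are hypothetical names used only for illustration. It groups spellings by their lowercase form and returns the groups that contain more than one spelling, which could feed a more specific error message.

```javascript
// Hypothetical helper (not in the PR): returns one array per language
// that appears under more than one casing across all tools.
function findCaseInsensitiveDuplicates(tools) {
  const variantsByLowercase = new Map();
  for (const tool of tools) {
    // Tools without a languages array are skipped.
    for (const language of tool.languages ?? []) {
      const key = language.toLowerCase();
      if (!variantsByLowercase.has(key)) {
        variantsByLowercase.set(key, new Set());
      }
      variantsByLowercase.get(key).add(language);
    }
  }

  const duplicates = [];
  for (const variants of variantsByLowercase.values()) {
    if (variants.size > 1) duplicates.push([...variants]);
  }
  return duplicates;
}

// Made-up sample data for illustration only.
const sampleTools = [
  { name: 'tool-a', languages: ['JavaScript', 'Go'] },
  { name: 'tool-b', languages: ['javascript'] },
  { name: 'tool-c' },
];
const duplicateGroups = findCaseInsensitiveDuplicates(sampleTools);
console.log(duplicateGroups);
```

Returning the conflicting spellings, rather than only a boolean, would let the validator name the exact entries a contributor needs to fix.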

Does this PR introduce a breaking change?
Yes

Impact:
This PR enforces case-insensitive uniqueness for language entries. Any existing tooling data that includes language names with inconsistent casing—such as "JavaScript" and "javascript"—will now fail validation. This change helps eliminate redundancy and confusion caused by duplicate entries with different letter cases.

Who is affected:
Tool maintainers and contributors who have added language entries with varying casing.

Migration Path:
Update your languages arrays to ensure that each language appears only once in a consistent format, preferably matching the casing defined in the schema enum. For example:

# ❌ Before
languages:
  - "JavaScript"
  - "javascript"
  - "Go"
  - "go"

# ✅ After
languages:
  - "JavaScript"
  - "Go"
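
For larger data files, the cleanup could be scripted. The following is a minimal sketch under stated assumptions, not tooling shipped with this PR: `normalizeLanguages` is a hypothetical helper, and `canonicalNames` stands in for the list of spellings in the schema enum. It maps each entry to its canonical spelling and drops case-insensitive repeats, keeping the first occurrence's position.

```javascript
// Hypothetical helper (not in the PR): rewrites a languages array so each
// language appears once, using the canonical spelling when one is known.
function normalizeLanguages(languages, canonicalNames) {
  const canonicalByLowercase = new Map(
    canonicalNames.map((name) => [name.toLowerCase(), name])
  );
  const seen = new Set();
  const result = [];
  for (const language of languages) {
    const key = language.toLowerCase();
    if (seen.has(key)) continue; // already emitted under some casing
    seen.add(key);
    // Fall back to the first spelling encountered if no canonical form exists.
    result.push(canonicalByLowercase.get(key) ?? language);
  }
  return result;
}

// Assumed stand-in for the spellings defined in the schema enum.
const canonicalNames = ['JavaScript', 'Go'];
const normalized = normalizeLanguages(
  ['JavaScript', 'javascript', 'Go', 'go'],
  canonicalNames
);
console.log(normalized);
```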

Vishv0407 requested a review from a team as a code owner on March 14, 2025, 11:09

github-actions bot commented Mar 14, 2025

built with Refined Cloudflare Pages Action

⚡ Cloudflare Pages Deployment

Name Status Preview Last Commit
website ✅ Ready (View Log) Visit Preview ac51230


codecov bot commented Mar 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (219521e) to head (ac51230).

Additional details and impacted files
@@            Coverage Diff            @@
##              main     #1516   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           10        10           
  Lines          396       396           
  Branches       106       106           
=========================================
  Hits           396       396           


Development

Successfully merging this pull request may close these issues.

🐛 Bug: Two JavaScript labels for filter on Tools page
1 participant