@sak23042006 commented Oct 3, 2025

📌 Problem

Repositories often get cluttered with duplicate issues, making it difficult for maintainers to manage and causing wasted effort for contributors. Manual duplicate detection is slow, inconsistent, and error-prone.

💡 Solution

This PR integrates seroski-dupbot, a bot that automatically detects duplicate issues using embeddings and similarity scoring.

Behavior based on similarity score:

  • < 5 → Unique: No action taken.
  • 5 – 8.5 → Potential duplicate: Bot comments with related issues and flags for review.
  • > 8.5 → Clear duplicate: Bot comments, labels the issue as duplicate, and auto-closes it.
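The three tiers above can be sketched as a small pure function. This is a minimal illustration, not the bot's actual code: `classifySimilarity`, `DEFAULT_THRESHOLDS`, and the action names are hypothetical, and the cut-offs are copied from this description (the review overview further down quotes 0.55 and 0.85 instead, so the thresholds are kept as parameters).

```javascript
// Hypothetical thresholds, taken from the PR description's scale.
const DEFAULT_THRESHOLDS = { potential: 5, duplicate: 8.5 };

// Map a similarity score to a tier and the actions the description lists for it.
function classifySimilarity(score, thresholds = DEFAULT_THRESHOLDS) {
  if (!Number.isFinite(score)) {
    throw new TypeError('score must be a finite number');
  }
  if (score > thresholds.duplicate) {
    // Clear duplicate: comment, label as duplicate, and auto-close.
    return { tier: 'duplicate', actions: ['comment', 'label', 'close'] };
  }
  if (score >= thresholds.potential) {
    // Potential duplicate: comment with related issues and flag for review.
    return { tier: 'potential', actions: ['comment', 'flag'] };
  }
  // Unique: no action taken.
  return { tier: 'unique', actions: [] };
}
```

Passing a second argument such as `{ potential: 0.55, duplicate: 0.85 }` would apply the same logic on a 0 to 1 scale.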

🧩 Benefits

  • Saves maintainers’ time by automating duplicate detection.
  • Reduces noise in issue trackers.
  • Improves contributor experience by providing instant feedback.

Impact: Provides a scalable, automated solution to handle duplicate issues across repositories, ensuring cleaner and more manageable issue tracking.

Closes #140.

Copilot AI review requested due to automatic review settings October 3, 2025 09:01

Copilot AI left a comment

Pull Request Overview

This PR introduces a comprehensive duplicate issue detection system called "seroski-dupbot" that automatically identifies and manages duplicate GitHub issues using machine learning embeddings and similarity scoring. The system provides three-tier behavior based on similarity scores: unique issues (< 0.55), potentially related issues (0.55-0.84), and clear duplicates (≥ 0.85) which are auto-closed.

Key Changes:

  • Automated duplicate detection workflow that triggers on issue creation, editing, and closure
  • Vector database integration using Pinecone for storing and querying issue embeddings
  • Database management utilities for population, cleanup, and validation operations
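For readers unfamiliar with embedding comparison, the sketch below shows one common way to score two embedding vectors. It assumes cosine similarity, which this PR does not confirm; in a Pinecone-backed setup the score would normally come back from the index query using whichever metric the index was created with. The function name `cosineSimilarity` is illustrative only.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Assumption: the bot's "similarity score" behaves like this metric.
function cosineSimilarity(a, b) {
  if (a.length === 0 || a.length !== b.length) {
    throw new RangeError('vectors must be non-empty and of equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Vectors pointing the same way score close to 1 and orthogonal vectors score 0, which is why thresholds such as 0.55 and 0.85 are plausible on this scale.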

Reviewed Changes

Copilot reviewed 12 out of 14 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| package.json | Defines Node.js dependencies for GitHub API, vector database, and embedding services |
| .github/workflows/duplicate-issue.yml | Main workflow for duplicate detection with cleanup jobs for closed issues |
| .github/workflows/database-operations.yml | Administrative workflow for database management operations |
| .github/workflows/api-validation.yml | Validation workflow to test API connectivity before operations |
| .github/scripts/validate-apis.js | API validation script testing connections to all required services |
| .github/scripts/populate-existing-issues.js | Script to populate vector database with existing repository issues |
| .github/scripts/debug-pinecone.js | Debugging utility for inspecting Pinecone database state |
| .github/scripts/clear-all-vectors.js | Destructive operation script to clear all vectors from database |
| .github/scripts/cleanup-specific-issue.js | Utility to remove specific issue vectors from database |
| .github/scripts/cleanup-duplicates.js | Script to clean up duplicate vectors in the database |
| .github/scripts/cleanup-closed-issue.js | Automated cleanup script for removing closed issue vectors |
| .github/scripts/check-duplicates.js | Core duplicate detection logic with three-tier similarity analysis |


steps:
  - name: Checkout repository
    uses: actions/checkout@v3

Copilot AI Oct 3, 2025

Using actions/checkout@v3 is deprecated. Consider upgrading to actions/checkout@v4 for better performance and security updates.

Copilot uses AI. Check for mistakes.

    uses: actions/checkout@v3

  - name: Setup Node.js
    uses: actions/setup-node@v3

Copilot AI Oct 3, 2025

Using actions/setup-node@v3 is deprecated. Consider upgrading to actions/setup-node@v4 for better performance and security updates.

    uses: actions/checkout@v3

  - name: Setup Node.js
    uses: actions/setup-node@v3

Copilot AI Oct 3, 2025

Using actions/setup-node@v3 is deprecated. Consider upgrading to actions/setup-node@v4 for better performance and security updates.

Suggested change
uses: actions/setup-node@v3
uses: actions/setup-node@v4

steps:
  - name: Checkout repository
    uses: actions/checkout@v3

Copilot AI Oct 3, 2025

Using actions/checkout@v3 is deprecated. Consider upgrading to actions/checkout@v4 for better performance and security updates.

Suggested change
uses: actions/checkout@v3
uses: actions/checkout@v4

    uses: actions/checkout@v3

  - name: Setup Node.js
    uses: actions/setup-node@v3

Copilot AI Oct 3, 2025

Using actions/setup-node@v3 is deprecated. Consider upgrading to actions/setup-node@v4 for better performance and security updates.

Suggested change
uses: actions/setup-node@v3
uses: actions/setup-node@v4

steps:
  - name: Checkout repository
    uses: actions/checkout@v3

Copilot AI Oct 3, 2025

Using actions/checkout@v3 is deprecated. Consider upgrading to actions/checkout@v4 for better performance and security updates.

Suggested change
uses: actions/checkout@v3
uses: actions/checkout@v4

    uses: actions/checkout@v3

  - name: Setup Node.js
    uses: actions/setup-node@v3

Copilot AI Oct 3, 2025

Using actions/setup-node@v3 is deprecated. Consider upgrading to actions/setup-node@v4 for better performance and security updates.

Suggested change
uses: actions/setup-node@v3
uses: actions/setup-node@v4

repo: REPO,
issue_number: ISSUE_NUMBER,
state: 'closed',
state_reason: 'duplicate'

Copilot AI Oct 3, 2025

The state_reason 'duplicate' may not be supported by the GitHub API. The valid state_reason values are typically 'completed' or 'not_planned'. Consider using 'not_planned' for duplicate issues.

Suggested change
state_reason: 'duplicate'
state_reason: 'not_planned'
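
To make the suggestion concrete, here is a hedged sketch of how the close call might be structured so the reason is easy to change. It assumes an authenticated `@octokit/rest` client is passed in as `octokit`; `buildCloseParams` and `closeAsDuplicate` are hypothetical helper names, not functions from this PR. The default follows the reviewer's suggestion, and which `state_reason` values the API currently accepts should be checked against the GitHub REST documentation.

```javascript
// Hypothetical helper: build the parameters for the REST "update an issue" call.
function buildCloseParams({ owner, repo, issueNumber, stateReason = 'not_planned' }) {
  return {
    owner,
    repo,
    issue_number: issueNumber,
    state: 'closed',
    state_reason: stateReason,
  };
}

// Label the issue, then close it. `octokit` is assumed to be an
// authenticated @octokit/rest client supplied by the caller.
async function closeAsDuplicate(octokit, target) {
  await octokit.rest.issues.addLabels({
    owner: target.owner,
    repo: target.repo,
    issue_number: target.issueNumber,
    labels: ['duplicate'],
  });
  return octokit.rest.issues.update(buildCloseParams(target));
}
```

Keeping the parameter construction in a pure function lets the reason be unit-tested or overridden without touching the network call.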

@sak23042006
Author

@yep-yogesh, please review this.

Development

Successfully merging this pull request may close these issues.

[FEATURE] Implement AI-powered duplicate issue detection
