Include CVE data from Insights #474

Open
lukehinds opened this issue Dec 31, 2024 · 3 comments

Comments

@lukehinds
Contributor

lukehinds commented Dec 31, 2024

CVEs will be handled in the simplest way for now. We'll improve it later if needed.

A brief explanation is in a comment below.

Old description

Introduce vulnerability / version data from Insights into the Vector Search system to provide contextual augmentation for CVEs and version-specific fixes.

┌───────────────────────────┐
│ 1. CVE / Version Info     │
│    introduced from insight│
└───────────────────────────┘
               │
               ▼
┌───────────────────────────┐
│ 2. Parse code snippet     │
│    and use existing       │
│    Package Extractor      │
│    (Package/Ecosystem)    │
└───────────────────────────┘
               │
               ▼
┌───────────────────────────┐
│ 3. Traverse               │
│    dependency matrix      │
│    for the specific       │
│    version of package     │
└───────────────────────────┘
               │
               ▼
┌───────────────────────────┐
│ 4. Perform similarity     │
│    search (Package &      │
│    Version)               │
└───────────────────────────┘
               │
           ┌───┴───┐
           │       │
     CVE Matched?  │
           │       │
           ▼       ▼
    ┌──────────────────────┐
    │ Yes: Augment prompt  │
    │ to guide LLM toward  │
    │ recommending action  │
    │ & fix                │
    └──────────────────────┘

    ┌──────────────────────┐
    │ No: Continue as      │
    │ normal in pipeline   │
    └──────────────────────┘

Explanation:

  1. Introduce CVE / Version Info: Collect CVE data and package version details within the insight pipeline.
  2. Parse Code Snippet & Extract Package: Leverage the existing “Package Extractor” to identify which packages (and their ecosystems) are used in the snippet.
  3. Traverse Dependency Matrix: Look up the package(s) captured from the code snippet in the dependency matrix to resolve the specific version in use.
  4. Perform Similarity Search: Match the discovered package and version against known CVEs.
  5. CVE Matched?:
    • Yes: Prompt augmentation instructs the LLM to recommend a fix (e.g., upgrade the package, apply a patch, warn the user, or suggest an alternative library).
    • No: Proceed without additional guidance, or continue searching other data sources.
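The last two steps could be sketched roughly as follows. This is only an illustration, not CodeGate code: `PackageRef`, `KNOWN_CVES`, `match_cves`, and `augment_prompt` are hypothetical names, the CVE identifier is a placeholder, and an exact dictionary lookup stands in for the real similarity search. Steps 2 and 3 are assumed to have already produced the list of packages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageRef:
    """A package as it would come out of steps 2-3 (hypothetical shape)."""
    ecosystem: str  # e.g. "pypi", "npm"
    name: str
    version: str


# Placeholder data standing in for CVE info introduced from Insights.
KNOWN_CVES = {
    PackageRef("pypi", "examplepkg", "1.0.0"): ["CVE-0000-0001"],
}


def match_cves(pkg: PackageRef) -> list[str]:
    """Step 4 stand-in: exact lookup instead of a real similarity search."""
    return KNOWN_CVES.get(pkg, [])


def augment_prompt(prompt: str, packages: list[PackageRef]) -> str:
    """Step 5: prepend guidance only when at least one package matched."""
    notes = []
    for pkg in packages:
        cves = match_cves(pkg)
        if cves:
            notes.append(
                f"{pkg.name}=={pkg.version} ({pkg.ecosystem}) is affected by "
                f"{', '.join(cves)}; recommend an upgrade, patch, or alternative."
            )
    if not notes:
        return prompt  # No match: continue as normal in the pipeline.
    return "Security context:\n" + "\n".join(notes) + "\n\n" + prompt
```

With this shape, a snippet that uses no flagged package passes through unchanged, so the "No" branch adds no overhead to the prompt.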

This would be contingent upon #454 landing first.

cc @yrobla

@lukehinds changed the title from "Include Vulnerability data from Insights into Vector Search" to "Include CVE data from Insights into Vector Search" on Dec 31, 2024
@lukehinds
Contributor Author

Likely requires #454 first.

@lukehinds
Contributor Author

Tagging @kevinholmesmobile to map this to the Insights work, which would land first.

@davolokh changed the title from "Include CVE data from Insights into Vector Search" to "Include CVE data from Insights" on Jan 30, 2025
@davolokh
Contributor

davolokh commented Jan 30, 2025

We decided not to store CVE data in the VectorDB. Instead, we will keep a simple record (packageManager + packageName + version) indicating that the specific package version has CVEs. No severity and no CVE details, just the fact that the package is vulnerable.
The data will be provided by the Insight team and consumed in CodeGate in the same way as the malicious / deprecated / archived statuses.
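As a rough sketch of what that record and lookup might look like (the names `PackageKey`, `VULNERABLE`, and `has_cves` are hypothetical, the sample entry is made up, and the actual format delivered by the Insight team may differ):

```python
from typing import NamedTuple


class PackageKey(NamedTuple):
    """The three fields named above; nothing about severity or CVE IDs."""
    package_manager: str
    package_name: str
    version: str


# Made-up sample entry; presence in the set is the only signal.
VULNERABLE = {
    PackageKey("npm", "example-lib", "1.2.3"),
}


def has_cves(package_manager: str, package_name: str, version: str) -> bool:
    """Return True if this exact package version is flagged as vulnerable."""
    return PackageKey(package_manager, package_name, version) in VULNERABLE
```

Because the record is just a membership flag, it can sit next to the existing malicious / deprecated / archived checks without requiring any vector search.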
