Add class that treats Codex as a backup #11
Open
elisno wants to merge 42 commits into main from codex-as-backup
+707
−1
Commits
3acb048 add class to configure a decorator that treats Codex as a backup (elisno)
74755d2 formatting (elisno)
f7f8156 Move is_bad_response helper functions to a validation.py module for c… (elisno)
d20d31c update import (elisno)
a223b04 formatting and typing (wip) (elisno)
f15424c Remove response_validators.py module (elisno)
23eeb58 remove is_bad_response_contains_phrase (elisno)
9e8690c Improve helpfer functions for detecting bad responses (elisno)
d0ad8df formatting and add dependencies (elisno)
b4bff54 formatting (elisno)
038a475 address type checker complaints (elisno)
5fbb48e temporarily skip tests for codex_backup module (elisno)
3892b52 formatting (elisno)
22253e9 address comments (elisno)
e4bdf2c Merge branch 'main' into codex-as-backup (elisno)
e5a6164 formatting & type hints (elisno)
807d7fa comment out to_decorator (elisno)
00def49 Merge branch 'main' into codex-as-backup (elisno)
d8a6e86 enhance CodexBackup (elisno)
2630a2c delete commented-out to_decorator method (elisno)
3286674 Merge branch 'main' into codex-as-backup (elisno)
0ebd4fe formatting (elisno)
4eca7d3 fix tests for CodexBackup (elisno)
c59cec5 formatting and typing (elisno)
6026179 formatting (elisno)
a94ffb5 formatting (elisno)
2510255 fix unused fixture (elisno)
b439113 remove Self imported from typing, doesn't work for Python 3.8 (elisno)
38666de remove unused type ignore (elisno)
a5d655b Add explanation to is_unhelpful_response question (elisno)
e776dfe Remove quotes from type annotation (elisno)
7866f0c remove _TLM protocol (elisno)
26adbf1 formatting (elisno)
36f80e9 threshold -> trustworthiness_threshold (elisno)
febbfd0 update is_bad_response docstring (elisno)
dc1d003 update docstrings for is_unhelpful_response (elisno)
739ffc6 unhelpful_trustworthiness_threshold -> unhelpfulness_confidence_thres… (elisno)
3e4864a update module docstring for validation.py (elisno)
81cc934 rename module validation.py -> response_validation.py (elisno)
9e91e9b move is_bad_response optional parameters to a parameter object (typed… (elisno)
c5843c9 formatting (elisno)
49f9a9d rename test_validation.py -> test_response_validation.py (elisno)

    @@ -1,6 +1,7 @@
     # SPDX-License-Identifier: MIT
     from cleanlab_codex.client import Client
    +from cleanlab_codex.codex_backup import CodexBackup
     from cleanlab_codex.codex_tool import CodexTool
     from cleanlab_codex.project import Project
     
    -__all__ = ["Client", "CodexTool", "Project"]
    +__all__ = ["Client", "CodexTool", "CodexBackup", "Project"]


    @@ -0,0 +1,142 @@
    +from __future__ import annotations
    +
    +from typing import TYPE_CHECKING, Any, Optional, Protocol, cast
    +
    +from cleanlab_codex.response_validation import BadResponseDetectionConfig, is_bad_response
    +
    +if TYPE_CHECKING:
    +    from cleanlab_studio.studio.trustworthy_language_model import TLM  # type: ignore
    +
    +    from cleanlab_codex.project import Project
    +
    +
    +def handle_backup_default(codex_response: str, primary_system: Any) -> None:  # noqa: ARG001
    +    """Default implementation is a no-op."""
    +    return None
    +
    +
    +class BackupHandler(Protocol):
    +    """Protocol defining how to handle backup responses from Codex.
    +
    +    This protocol defines a callable interface for processing Codex responses that are
    +    retrieved when the primary response system (e.g., a RAG system) fails to provide
    +    an adequate answer. Implementations of this protocol can be used to:
    +
    +    - Update the primary system's context or knowledge base
    +    - Log Codex responses for analysis
    +    - Trigger system improvements or retraining
    +    - Perform any other necessary side effects
    +
    +    Args:
    +        codex_response (str): The response received from Codex
    +        primary_system (Any): The instance of the primary RAG system that
    +            generated the inadequate response. This allows the handler to
    +            update or modify the primary system if needed.
    +
    +    Returns:
    +        None: The handler performs side effects but doesn't return a value
    +    """
    +
    +    def __call__(self, codex_response: str, primary_system: Any) -> None: ...
    +
    +
    +class CodexBackup:
    +    """A backup decorator that connects to a Codex project to answer questions that
    +    cannot be adequately answered by the existing agent.
    +
    +    Args:
    +        project: The Codex project to use for backup responses
    +        fallback_answer: The fallback answer to use if the primary system fails to provide an adequate response
    +        backup_handler: A callback function that processes Codex's response and updates the primary RAG system. This handler is called whenever Codex provides a backup response after the primary system fails. By default, the backup handler is a no-op.
    +        primary_system: The existing RAG system that needs to be backed up by Codex
    +        tlm: The client for the Trustworthy Language Model, which evaluates the quality of responses from the primary system
    +        is_bad_response_kwargs: Additional keyword arguments to pass to the is_bad_response function, for detecting inadequate responses from the primary system
    +    """
    +
    +    DEFAULT_FALLBACK_ANSWER = "Based on the available information, I cannot provide a complete answer to this question."
    +
    +    def __init__(
    +        self,
    +        *,
    +        project: Project,
    +        fallback_answer: str = DEFAULT_FALLBACK_ANSWER,
    +        backup_handler: BackupHandler = handle_backup_default,
    +        primary_system: Optional[Any] = None,
    +        tlm: Optional[TLM] = None,
    +        is_bad_response_kwargs: Optional[dict[str, Any]] = None,
    +    ):
    +        self._project = project
    +        self._fallback_answer = fallback_answer
    +        self._backup_handler = backup_handler
    +        self._primary_system: Optional[Any] = primary_system
    +        self._tlm = tlm
    +        self._is_bad_response_kwargs = is_bad_response_kwargs
    +
    +    @classmethod
    +    def from_project(cls, project: Project, **kwargs: Any) -> CodexBackup:
    +        return cls(project=project, **kwargs)
    +
    +    @property
    +    def primary_system(self) -> Any:
    +        if self._primary_system is None:
    +            error_message = "Primary system not set. Please set a primary system using the `add_primary_system` method."
    +            raise ValueError(error_message)
    +        return self._primary_system
    +
    +    @primary_system.setter
    +    def primary_system(self, primary_system: Any) -> None:
    +        """Set the primary RAG system that will be used to generate responses."""
    +        self._primary_system = primary_system
    +
    +    @property
    +    def is_bad_response_kwargs(self) -> dict[str, Any]:
    +        return self._is_bad_response_kwargs or {}
    +
    +    @is_bad_response_kwargs.setter
    +    def is_bad_response_kwargs(self, is_bad_response_kwargs: dict[str, Any]) -> None:
    +        self._is_bad_response_kwargs = is_bad_response_kwargs
    +
    +    def run(
    +        self,
    +        response: str,
    +        query: str,
    +        context: Optional[str] = None,
    +    ) -> str:
    +        """Check if a response is adequate and provide a backup from Codex if needed.
    +
    +        Args:
    +            primary_system: The system that generated the original response
    +            response: The response to evaluate
    +            query: The original query that generated the response
    +            context: Optional context used to generate the response
    +
    +        Returns:
    +            str: Either the original response if adequate, or a backup response from Codex
    +        """
    +
    +        _is_bad_response_kwargs = self.is_bad_response_kwargs
    +        if not is_bad_response(
    +            response,
    +            query=query,
    +            context=context,
    +            config=cast(
    +                BadResponseDetectionConfig,
    +                {
    +                    "tlm": self._tlm,
    +                    "fallback_answer": self._fallback_answer,
    +                    **_is_bad_response_kwargs,
    +                },
    +            ),
    +        ):
    +            return response
    +
    +        cache_result = self._project.query(query, fallback_answer=self._fallback_answer)[0]
    +        if not cache_result:
    +            return response
    +
    +        if self._primary_system is not None:
    +            self._backup_handler(
    +                codex_response=cache_result,
    +                primary_system=self._primary_system,
    +            )
    +        return cache_result

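
To make the control flow of `run` and the `BackupHandler` call signature easier to review, here is a minimal standalone sketch. It does not import `cleanlab_codex`: `StubProject`, `StubRAG`, `remember_answer`, and `run_with_backup` are hypothetical stand-ins, and the badness check is reduced to an exact match against the fallback answer, which is an assumption and not what `is_bad_response` does.

```python
from typing import Any, Dict, List, Optional, Tuple

FALLBACK = "Based on the available information, I cannot provide a complete answer to this question."


class StubProject:
    """Hypothetical stand-in for a Codex project (not the real Project class)."""

    def __init__(self, answers: Dict[str, str]) -> None:
        self._answers = answers

    def query(self, question: str, fallback_answer: Optional[str] = None) -> Tuple[Optional[str], None]:
        # Mirrors only the shape used in the diff: index 0 holds the answer, if any.
        return self._answers.get(question), None


class StubRAG:
    """Hypothetical primary system that records answers it has been given."""

    def __init__(self) -> None:
        self.learned: List[str] = []


def remember_answer(codex_response: str, primary_system: Any) -> None:
    """Handler with the BackupHandler call signature: store the Codex answer."""
    primary_system.learned.append(codex_response)


def run_with_backup(
    response: str,
    query: str,
    project: StubProject,
    handler: Any,
    primary_system: Optional[Any] = None,
    fallback_answer: str = FALLBACK,
) -> str:
    # Assumption: a response is "bad" only if it equals the fallback answer.
    if response.strip() != fallback_answer:
        return response

    # Ask the backup project; keep the original response if it has no answer.
    codex_answer = project.query(query, fallback_answer=fallback_answer)[0]
    if not codex_answer:
        return response

    # The handler runs only when a primary system is attached.
    if primary_system is not None:
        handler(codex_response=codex_answer, primary_system=primary_system)
    return codex_answer
```

As in the diff, the handler is a side-effect hook: it never changes the returned string, and it is skipped when no primary system is attached.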
Delete this?
I'll hold off on removing this until we've finalized the code in "validation.py". The intention was to pass the fallback answer from the backup object to the relevant `is_fallback_response` helper function before deciding to call Codex as Backup.
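
For readers following this thread, the idea of comparing a response against the configured fallback answer could look roughly like the sketch below. It is not the `is_fallback_response` helper from this PR: `looks_like_fallback` is a hypothetical name, and the similarity measure (`difflib.SequenceMatcher` from the standard library) and the `0.7` threshold are assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher

FALLBACK = "Based on the available information, I cannot provide a complete answer to this question."


def looks_like_fallback(response: str, fallback_answer: str = FALLBACK, threshold: float = 0.7) -> bool:
    """Return True when the response is textually close to the fallback answer.

    Hypothetical helper: similarity is the SequenceMatcher ratio on lowercased
    text, and the threshold is an arbitrary illustrative value.
    """
    ratio = SequenceMatcher(None, response.lower(), fallback_answer.lower()).ratio()
    return ratio >= threshold
```

A check like this would run before the Codex query, so that only responses resembling the fallback answer trigger the backup path.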