feat: Add timeout handling to CAPTCHA solver API calls#7
feat: Add timeout handling to CAPTCHA solver API calls#7Cordycepsers wants to merge 1 commit intomainfrom
Conversation
- Add timeout constants to anti-scraping config (30s for submission, 10s for polling) - Add CaptchaSolverError exception for timeout and network error handling - Update all requests.post() and requests.get() calls in captcha_solver.py with timeouts - Improve error propagation in solve_recaptcha() method - Update request_handler.py to properly handle CaptchaSolverError - Add comprehensive unit tests for timeout behavior - Update method docstrings to document timeout behavior This prevents infinite hangs when CAPTCHA services are unresponsive and provides better error handling for timeout scenarios.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Reviewer's GuideThis PR integrates configurable timeouts into all CAPTCHA solver API interactions, centralizes error handling via a custom exception, propagates errors cleanly through the solver and request handler, and adds focused unit tests to verify timeout and network‐failure scenarios. File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Pull Request Overview
This PR adds comprehensive timeout handling to the CAPTCHA solver API calls to prevent infinite hangs when CAPTCHA services are unresponsive. The implementation introduces a custom CaptchaSolverError exception class, configurable timeout values for different types of requests, and proper error propagation throughout the system.
Key Changes:
- Added timeout parameters to all 4 CAPTCHA API calls (2captcha and Anticaptcha submission/polling)
- Created
CaptchaSolverErrorcustom exception for timeout and network error handling - Enhanced error propagation in the request handler to catch and handle CAPTCHA solver timeouts
Reviewed Changes
Copilot reviewed 6 out of 14 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_captcha_solver.py | Comprehensive test suite with 10 test cases covering timeout scenarios, network errors, and successful flows |
| tests/test_analysis_module.py | Replaced complex test file with simplified functional tests (file reorganization) |
| test_analysis_module.py | Removed duplicate test file to consolidate testing |
| src/anti_scraping/request_handler.py | Updated to import and handle CaptchaSolverError with proper exception catching |
| src/anti_scraping/config.py | Added timeout configuration constants for CAPTCHA requests |
| src/anti_scraping/captcha_solver.py | Core implementation with timeout handling, custom exception, and enhanced error propagation |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| while True: | ||
| poll_response = requests.get(poll_url, params=poll_payload) | ||
| poll_result = poll_response.json() | ||
|
|
||
| if poll_result.get('status') == 1: | ||
| return poll_result.get('request') | ||
|
|
||
| try: | ||
| poll_response = requests.get( | ||
| poll_url, | ||
| params=poll_payload, | ||
| timeout=config.CAPTCHA_POLLING_REQUEST_TIMEOUT, | ||
| ) | ||
| poll_result = poll_response.json() | ||
| except requests.Timeout: | ||
| raise CaptchaSolverError("CAPTCHA polling request timed out") | ||
| except requests.RequestException as e: | ||
| raise CaptchaSolverError( | ||
| f"CAPTCHA polling request failed: {str(e)}" | ||
| ) | ||
|
|
||
| if poll_result.get("status") == 1: | ||
| return poll_result.get("request") | ||
|
|
||
| time.sleep(5) # Wait 5 seconds before polling again |
There was a problem hiding this comment.
The infinite polling loop lacks a maximum retry limit or total timeout, which could cause the application to poll indefinitely if the CAPTCHA service never returns a ready status. Consider adding a maximum polling duration or retry count to prevent infinite loops.
| while True: | ||
| poll_response = requests.post(poll_url, json=poll_payload) | ||
| poll_result = poll_response.json() | ||
|
|
||
| if poll_result.get('status') == 'ready': | ||
| return poll_result.get('solution', {}).get('gRecaptchaResponse') | ||
|
|
||
| try: | ||
| poll_response = requests.post( | ||
| poll_url, | ||
| json=poll_payload, | ||
| timeout=config.CAPTCHA_POLLING_REQUEST_TIMEOUT, | ||
| ) | ||
| poll_result = poll_response.json() | ||
| except requests.Timeout: | ||
| raise CaptchaSolverError("CAPTCHA polling request timed out") | ||
| except requests.RequestException as e: | ||
| raise CaptchaSolverError( | ||
| f"CAPTCHA polling request failed: {str(e)}" | ||
| ) | ||
|
|
||
| if poll_result.get("status") == "ready": | ||
| return poll_result.get("solution", {}).get("gRecaptchaResponse") | ||
|
|
||
| time.sleep(5) # Wait 5 seconds before polling again |
There was a problem hiding this comment.
The infinite polling loop lacks a maximum retry limit or total timeout, which could cause the application to poll indefinitely if the CAPTCHA service never returns a ready status. Consider adding a maximum polling duration or retry count to prevent infinite loops.
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
Blocking issues:
- Detected a request using 'http://'. This request will be unencrypted, and attackers could listen into traffic on the network and be able to obtain sensitive information. Use 'https://' instead. (link)
- Detected a request using 'http://'. This request will be unencrypted, and attackers could listen into traffic on the network and be able to obtain sensitive information. Use 'https://' instead. (link)
General comments:
- Introduce a cap or overall timeout in the while-True polling loops in both 2captcha and Anticaptcha methods so you don’t risk infinite polling if the service never responds.
- Replace the ad-hoc print statements in the CAPTCHA request handler and test runner with a proper logging framework to control verbosity and integrate with existing logs.
- Consider refactoring the new monolithic
test_analysis_module.pyinto discrete pytest-style functions with fixtures instead of a custom async main, to improve readability, parallelism, and maintainability.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Introduce a cap or overall timeout in the while-True polling loops in both 2captcha and Anticaptcha methods so you don’t risk infinite polling if the service never responds.
- Replace the ad-hoc print statements in the CAPTCHA request handler and test runner with a proper logging framework to control verbosity and integrate with existing logs.
- Consider refactoring the new monolithic `test_analysis_module.py` into discrete pytest-style functions with fixtures instead of a custom async main, to improve readability, parallelism, and maintainability.
## Individual Comments
### Comment 1
<location> `src/anti_scraping/captcha_solver.py:38-47` </location>
<code_context>
- captcha_id = result.get('request')
-
+
+ try:
+ response = requests.post(
+ url, data=payload, timeout=config.CAPTCHA_INITIAL_REQUEST_TIMEOUT
+ )
+ result = response.json()
+ except requests.Timeout:
+ raise CaptchaSolverError("CAPTCHA submission request timed out")
+ except requests.RequestException as e:
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider limiting polling attempts to avoid infinite loops.
Both polling loops lack a termination condition if the CAPTCHA is never solved. Adding a maximum attempt limit or timeout will prevent hangs and excessive resource use.
</issue_to_address>
### Comment 2
<location> `src/anti_scraping/request_handler.py:81` </location>
<code_context>
+ if captcha_solution:
+ # Retry with CAPTCHA solution
+ return self._retry_with_captcha_solution(url, captcha_solution)
+ except CaptchaSolverError as e:
+ print(f"CAPTCHA solving failed: {str(e)}")
+ # Continue with original response if CAPTCHA solving fails
</code_context>
<issue_to_address>
**question (bug_risk):** Continuing with the original response after CAPTCHA solving failure may not be ideal.
Returning the original response after CAPTCHA failure may cause downstream code to process a CAPTCHA page. Consider raising an exception or returning an explicit error to prevent this.
</issue_to_address>
### Comment 3
<location> `src/anti_scraping/request_handler.py:112-117` </location>
<code_context>
return False
-
+
def _attempt_captcha_solve(self, page_url: str) -> Optional[str]:
"""Attempt to solve CAPTCHA"""
- # This is a simplified version - real implementation would extract
</code_context>
<issue_to_address>
**suggestion:** The placeholder implementation for CAPTCHA solving may lead to confusion.
Consider raising NotImplementedError or clearly marking this method as a stub to avoid accidental use in production environments.
```suggestion
def _attempt_captcha_solve(self, page_url: str) -> Optional[str]:
"""Attempt to solve CAPTCHA
This method is a stub and must be implemented for production use.
"""
raise NotImplementedError("CAPTCHA solving is not implemented. This method is a stub.")
```
</issue_to_address>
### Comment 4
<location> `src/anti_scraping/config.py:32-33` </location>
<code_context>
-
+ CAPTCHA_SOLVER_API_KEY: Optional[str] = os.getenv("CAPTCHA_API_KEY", "")
+ CAPTCHA_SOLVER_PROVIDER: str = "2captcha" # or 'anticaptcha'
+ CAPTCHA_INITIAL_REQUEST_TIMEOUT: int = 30 # seconds for submission requests
+ CAPTCHA_POLLING_REQUEST_TIMEOUT: int = 10 # seconds for polling requests
+
# Randomization settings
</code_context>
<issue_to_address>
**suggestion:** Timeout values for CAPTCHA requests are hardcoded and may not suit all environments.
Recommend making timeout values configurable through environment variables or constructor parameters to accommodate varying deployment needs.
```suggestion
CAPTCHA_INITIAL_REQUEST_TIMEOUT: int = int(os.getenv("CAPTCHA_INITIAL_REQUEST_TIMEOUT", "30")) # seconds for submission requests
CAPTCHA_POLLING_REQUEST_TIMEOUT: int = int(os.getenv("CAPTCHA_POLLING_REQUEST_TIMEOUT", "10")) # seconds for polling requests
```
</issue_to_address>
### Comment 5
<location> `tests/test_captcha_solver.py:45` </location>
<code_context>
+ assert call_args[1]["timeout"] == 30
+
+ @patch("src.anti_scraping.captcha_solver.requests.post")
+ def test_2captcha_submission_network_error(self, mock_post, captcha_solver):
+ """Test that 2captcha submission handles network errors properly"""
+ # Mock network error
</code_context>
<issue_to_address>
**suggestion (testing):** No test for provider value case sensitivity.
Please add a test to confirm provider values are treated case-insensitively, as the implementation lowercases the input. This will help catch configuration issues related to case mismatches.
</issue_to_address>
### Comment 6
<location> `src/anti_scraping/captcha_solver.py:72` </location>
<code_context>
url, data=payload, timeout=config.CAPTCHA_INITIAL_REQUEST_TIMEOUT
</code_context>
<issue_to_address>
**security (python.lang.security.audit.insecure-transport.requests.request-with-http):** Detected a request using 'http://'. This request will be unencrypted, and attackers could listen into traffic on the network and be able to obtain sensitive information. Use 'https://' instead.
*Source: opengrep*
</issue_to_address>
### Comment 7
<location> `src/anti_scraping/captcha_solver.py:95` </location>
<code_context>
poll_url,
</code_context>
<issue_to_address>
**security (python.lang.security.audit.insecure-transport.requests.request-with-http):** Detected a request using 'http://'. This request will be unencrypted, and attackers could listen into traffic on the network and be able to obtain sensitive information. Use 'https://' instead.
*Source: opengrep*
</issue_to_address>
### Comment 8
<location> `tests/test_captcha_solver.py:199-202` </location>
<code_context>
</code_context>
<issue_to_address>
**issue (code-quality):** Avoid loops in tests. ([`no-loop-in-tests`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/no-loop-in-tests))
<details><summary>Explanation</summary>Avoid complex code, like loops, in test functions.
Google's software engineering guidelines says:
"Clear tests are trivially correct upon inspection"
To reach that avoid complex code in tests:
* loops
* conditionals
Some ways to fix this:
* Use parametrized tests to get rid of the loop.
* Move the complex logic into helpers.
* Move the complex part into pytest fixtures.
> Complexity is most often introduced in the form of logic. Logic is defined via the imperative parts of programming languages such as operators, loops, and conditionals. When a piece of code contains logic, you need to do a bit of mental computation to determine its result instead of just reading it off of the screen. It doesn't take much logic to make a test more difficult to reason about.
Software Engineering at Google / [Don't Put Logic in Tests](https://abseil.io/resources/swe-book/html/ch12.html#donapostrophet_put_logic_in_tests)
</details>
</issue_to_address>
### Comment 9
<location> `src/anti_scraping/captcha_solver.py:70` </location>
<code_context>
def _solve_with_2captcha(self, site_key: str, page_url: str) -> Optional[str]:
"""Solve CAPTCHA using 2captcha API
Uses configurable timeouts for submission and polling requests.
Raises:
CaptchaSolverError: If timeout occurs or network errors happen
"""
url = "http://2captcha.com/in.php"
payload = {
"key": self.api_key,
"method": "userrecaptcha",
"googlekey": site_key,
"pageurl": page_url,
"json": 1,
}
try:
response = requests.post(
url, data=payload, timeout=config.CAPTCHA_INITIAL_REQUEST_TIMEOUT
)
result = response.json()
except requests.Timeout:
raise CaptchaSolverError("CAPTCHA submission request timed out")
except requests.RequestException as e:
raise CaptchaSolverError(f"CAPTCHA submission request failed: {str(e)}")
if result.get("status") == 1:
captcha_id = result.get("request")
# Poll for solution
poll_url = "http://2captcha.com/res.php"
poll_payload = {
"key": self.api_key,
"action": "get",
"id": captcha_id,
"json": 1,
}
while True:
try:
poll_response = requests.get(
poll_url,
params=poll_payload,
timeout=config.CAPTCHA_POLLING_REQUEST_TIMEOUT,
)
poll_result = poll_response.json()
except requests.Timeout:
raise CaptchaSolverError("CAPTCHA polling request timed out")
except requests.RequestException as e:
raise CaptchaSolverError(
f"CAPTCHA polling request failed: {str(e)}"
)
if poll_result.get("status") == 1:
return poll_result.get("request")
time.sleep(5) # Wait 5 seconds before polling again
return None
</code_context>
<issue_to_address>
**issue (code-quality):** Explicitly raise from a previous error [×4] ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))
</issue_to_address>
### Comment 10
<location> `src/anti_scraping/captcha_solver.py:132` </location>
<code_context>
def _solve_with_anticaptcha(self, site_key: str, page_url: str) -> Optional[str]:
"""Solve CAPTCHA using Anticaptcha API
Uses configurable timeouts for submission and polling requests.
Raises:
CaptchaSolverError: If timeout occurs or network errors happen
"""
url = "https://api.anti-captcha.com/createTask"
payload = {
"clientKey": self.api_key,
"task": {
"type": "RecaptchaV2TaskProxyless",
"websiteURL": page_url,
"websiteKey": site_key,
},
}
try:
response = requests.post(
url, json=payload, timeout=config.CAPTCHA_INITIAL_REQUEST_TIMEOUT
)
result = response.json()
except requests.Timeout:
raise CaptchaSolverError("CAPTCHA submission request timed out")
except requests.RequestException as e:
raise CaptchaSolverError(f"CAPTCHA submission request failed: {str(e)}")
if result.get("errorId") == 0:
task_id = result.get("taskId")
# Poll for solution
poll_url = "https://api.anti-captcha.com/getTaskResult"
poll_payload = {"clientKey": self.api_key, "taskId": task_id}
while True:
try:
poll_response = requests.post(
poll_url,
json=poll_payload,
timeout=config.CAPTCHA_POLLING_REQUEST_TIMEOUT,
)
poll_result = poll_response.json()
except requests.Timeout:
raise CaptchaSolverError("CAPTCHA polling request timed out")
except requests.RequestException as e:
raise CaptchaSolverError(
f"CAPTCHA polling request failed: {str(e)}"
)
if poll_result.get("status") == "ready":
return poll_result.get("solution", {}).get("gRecaptchaResponse")
time.sleep(5) # Wait 5 seconds before polling again
return None
</code_context>
<issue_to_address>
**issue (code-quality):** Explicitly raise from a previous error [×4] ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))
</issue_to_address>
### Comment 11
<location> `src/anti_scraping/request_handler.py:54` </location>
<code_context>
def make_request(
self,
url: str,
method: str = "GET",
headers: Optional[Dict[str, str]] = None,
**kwargs,
) -> Optional[requests.Response]:
"""
Make HTTP request with anti-scraping measures
"""
print("--- Entering make_request ---")
domain = urlparse(url).netloc
# Wait if rate limit is exceeded
self.rate_limiter.wait_if_needed(domain)
# Get random user-agent
user_agent = self.user_agent_manager.get_random_user_agent()
# Get random proxy
proxy = self.proxy_manager.get_random_proxy()
# Set default headers
default_headers = {
"User-Agent": user_agent,
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Language": "en-US,en;q=0.5",
"Accept-Encoding": "gzip, deflate",
"Connection": "keep-alive",
"Upgrade-Insecure-Requests": "1",
}
if headers:
default_headers.update(headers)
# Add random delay before request
delay = random.uniform(*self.rate_limiter.random_delay_range)
time.sleep(delay)
# Make the request
try:
print("--- About to call requests.request ---")
response = requests.request(
method=method,
url=url,
headers=default_headers,
proxies=proxy,
timeout=30,
**kwargs,
)
print("--- requests.request was called ---")
# Check if CAPTCHA was encountered
if self._is_captcha_encountered(response):
print("CAPTCHA detected, attempting to solve...")
try:
captcha_solution = self._attempt_captcha_solve(url)
if captcha_solution:
# Retry with CAPTCHA solution
return self._retry_with_captcha_solution(url, captcha_solution)
except CaptchaSolverError as e:
print(f"CAPTCHA solving failed: {str(e)}")
# Continue with original response if CAPTCHA solving fails
# Record the request
self.rate_limiter.record_request(domain)
return response
except requests.exceptions.RequestException as e:
print(f"Request failed: {str(e)}")
if proxy:
self.proxy_manager.mark_proxy_as_failed(proxy)
raise
</code_context>
<issue_to_address>
**issue (code-quality):** We've found these issues:
- Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
- Merge dictionary updates via the union operator ([`dict-assign-update-to-union`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/dict-assign-update-to-union/))
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| try: | ||
| if self.provider == '2captcha': | ||
| if self.provider == "2captcha": | ||
| return self._solve_with_2captcha(site_key, page_url) | ||
| elif self.provider == 'anticaptcha': | ||
| elif self.provider == "anticaptcha": | ||
| return self._solve_with_anticaptcha(site_key, page_url) | ||
| else: | ||
| raise ValueError(f"Unsupported CAPTCHA provider: {self.provider}") | ||
| except (CaptchaSolverError, ValueError): | ||
| # Re-raise these exceptions as they indicate configuration or timeout issues | ||
| raise |
There was a problem hiding this comment.
suggestion (bug_risk): Consider limiting polling attempts to avoid infinite loops.
Both polling loops lack a termination condition if the CAPTCHA is never solved. Adding a maximum attempt limit or timeout will prevent hangs and excessive resource use.
| if captcha_solution: | ||
| # Retry with CAPTCHA solution | ||
| return self._retry_with_captcha_solution(url, captcha_solution) | ||
| except CaptchaSolverError as e: |
There was a problem hiding this comment.
question (bug_risk): Continuing with the original response after CAPTCHA solving failure may not be ideal.
Returning the original response after CAPTCHA failure may cause downstream code to process a CAPTCHA page. Consider raising an exception or returning an explicit error to prevent this.
| def _attempt_captcha_solve(self, page_url: str) -> Optional[str]: | ||
| """Attempt to solve CAPTCHA""" | ||
| # This is a simplified version - real implementation would extract | ||
| # This is a simplified version - real implementation would extract | ||
| # site key from the page and pass it to solver | ||
| print("CAPTCHA solving not implemented in this example") | ||
| return None |
There was a problem hiding this comment.
suggestion: The placeholder implementation for CAPTCHA solving may lead to confusion.
Consider raising NotImplementedError or clearly marking this method as a stub to avoid accidental use in production environments.
| def _attempt_captcha_solve(self, page_url: str) -> Optional[str]: | |
| """Attempt to solve CAPTCHA""" | |
| # This is a simplified version - real implementation would extract | |
| # This is a simplified version - real implementation would extract | |
| # site key from the page and pass it to solver | |
| print("CAPTCHA solving not implemented in this example") | |
| return None | |
| def _attempt_captcha_solve(self, page_url: str) -> Optional[str]: | |
| """Attempt to solve CAPTCHA | |
| This method is a stub and must be implemented for production use. | |
| """ | |
| raise NotImplementedError("CAPTCHA solving is not implemented. This method is a stub.") |
| CAPTCHA_INITIAL_REQUEST_TIMEOUT: int = 30 # seconds for submission requests | ||
| CAPTCHA_POLLING_REQUEST_TIMEOUT: int = 10 # seconds for polling requests |
There was a problem hiding this comment.
suggestion: Timeout values for CAPTCHA requests are hardcoded and may not suit all environments.
Recommend making timeout values configurable through environment variables or constructor parameters to accommodate varying deployment needs.
| CAPTCHA_INITIAL_REQUEST_TIMEOUT: int = 30 # seconds for submission requests | |
| CAPTCHA_POLLING_REQUEST_TIMEOUT: int = 10 # seconds for polling requests | |
| CAPTCHA_INITIAL_REQUEST_TIMEOUT: int = int(os.getenv("CAPTCHA_INITIAL_REQUEST_TIMEOUT", "30")) # seconds for submission requests | |
| CAPTCHA_POLLING_REQUEST_TIMEOUT: int = int(os.getenv("CAPTCHA_POLLING_REQUEST_TIMEOUT", "10")) # seconds for polling requests |
| assert call_args[1]["timeout"] == 30 | ||
|
|
||
| @patch("src.anti_scraping.captcha_solver.requests.post") | ||
| def test_2captcha_submission_network_error(self, mock_post, captcha_solver): |
There was a problem hiding this comment.
suggestion (testing): No test for provider value case sensitivity.
Please add a test to confirm provider values are treated case-insensitively, as the implementation lowercases the input. This will help catch configuration issues related to case mismatches.
|
|
||
| try: | ||
| poll_response = requests.get( | ||
| poll_url, |
There was a problem hiding this comment.
security (python.lang.security.audit.insecure-transport.requests.request-with-http): Detected a request using 'http://'. This request will be unencrypted, and attackers could listen into traffic on the network and be able to obtain sensitive information. Use 'https://' instead.
Source: opengrep
| for call_args in mock_post.call_args_list: | ||
| # Submission uses 30s timeout, polling uses 10s timeout | ||
| timeout = call_args[1]["timeout"] | ||
| assert timeout in [30, 10] |
There was a problem hiding this comment.
issue (code-quality): Avoid loops in tests. (no-loop-in-tests)
Explanation
Avoid complex code, like loops, in test functions.Google's software engineering guidelines says:
"Clear tests are trivially correct upon inspection"
To reach that avoid complex code in tests:
- loops
- conditionals
Some ways to fix this:
- Use parametrized tests to get rid of the loop.
- Move the complex logic into helpers.
- Move the complex part into pytest fixtures.
Complexity is most often introduced in the form of logic. Logic is defined via the imperative parts of programming languages such as operators, loops, and conditionals. When a piece of code contains logic, you need to do a bit of mental computation to determine its result instead of just reading it off of the screen. It doesn't take much logic to make a test more difficult to reason about.
Software Engineering at Google / Don't Put Logic in Tests
| captcha_id = result.get('request') | ||
|
|
||
|
|
||
| try: |
There was a problem hiding this comment.
issue (code-quality): Explicitly raise from a previous error [×4] (raise-from-previous-error)
| task_id = result.get('taskId') | ||
|
|
||
|
|
||
| try: |
There was a problem hiding this comment.
issue (code-quality): Explicitly raise from a previous error [×4] (raise-from-previous-error)
|
|
||
| if headers: | ||
| default_headers.update(headers) |
There was a problem hiding this comment.
issue (code-quality): We've found these issues:
- Use named expression to simplify assignment and conditional (
use-named-expression) - Merge dictionary updates via the union operator (
dict-assign-update-to-union)
🎯 Summary
This PR adds comprehensive timeout handling to the CAPTCHA solver API calls to prevent infinite hangs when CAPTCHA services are unresponsive.
🔧 Changes Made
Configuration Updates
CAPTCHA_INITIAL_REQUEST_TIMEOUT = 30seconds for submission requestsCAPTCHA_POLLING_REQUEST_TIMEOUT = 10seconds for polling requestsException Handling
CaptchaSolverErrorcustom exception classError Propagation
solve_recaptcha()method to properly propagate timeout errorsCaptchaSolverErrorTesting
🚨 Impact
📋 Testing
🔍 Code Quality
🎯 Fixes
Resolves the issue where CAPTCHA solver API calls could hang indefinitely without timeout parameters.
Summary by Sourcery
Add robust timeout and error handling to CAPTCHA solver integration by introducing configurable timeouts, a custom CaptchaSolverError, and updating the request handler to handle solver failures. Refactor test suites to include dedicated timeout tests for CAPTCHA solver and streamline analysis module tests.
New Features:
Bug Fixes:
Enhancements:
Documentation:
Tests: