Commit 1adbf6b

Merge pull request #44 from mjishnu/v1.5.5
- Refine Range Header parsing for spec compliance and error suppression (#43)
- Remove clear_terminal option (no longer clears the terminal)
- Update docs and logging format
- Minor bug fixes
2 parents 7193403 + 4ff062d commit 1adbf6b

6 files changed: 57 additions and 109 deletions

.deepsource.toml

Lines changed: 0 additions & 7 deletions
This file was deleted.

README.md

Lines changed: 2 additions & 49 deletions
@@ -63,7 +63,6 @@ dl.start(
     callback: Callable = None,
     block: bool = True,
     display: bool = True,
-    clear_terminal: bool = True,
 )
 ```

@@ -99,7 +98,6 @@ Each option is explained below:
 - `callback`: A callback function to be called when the download is complete. The default value is `None`. The function must accept 2 positional parameters: `status` (bool) indicating if the download was successful, and `result` (FileValidator object if successful, None if failed).
 - `block`: Whether to block until the download is complete. The default value is `True`.
 - `display`: Whether to display download progress and other optional messages. The default value is `True`.
-- `clear_terminal`: Whether to clear the terminal before displaying the download progress. The default value is `True`.

 - Supported Keyword Arguments:
 - `params`: Parameters to be sent in the query string of the new request. The default value is `None`.
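
The `callback` contract described in this hunk (two positional parameters, `status` and `result`) can be illustrated without any network access. This is a minimal sketch, not pypdl code: `on_done`, `FakeResult`, and `outcomes` are hypothetical names, and `FakeResult` only stands in for a real `FileValidator`.

```python
from typing import Any, List, Optional, Tuple

# Collected (status, path) pairs; purely illustrative bookkeeping.
outcomes: List[Tuple[bool, Optional[str]]] = []


class FakeResult:
    """Hypothetical stand-in for a FileValidator; only `path` is modelled."""

    def __init__(self, path: str) -> None:
        self.path = path


def on_done(status: bool, result: Optional[Any]) -> None:
    """Callback taking the two positional parameters the README describes."""
    if status:
        outcomes.append((True, result.path))
    else:
        # On failure the README states that `result` is None.
        outcomes.append((False, None))


# Simulate the two cases a downloader would report.
on_done(True, FakeResult("file.bin"))
on_done(False, None)
```

With pypdl, a function like this would presumably be passed as `callback=on_done` to `dl.start(...)`. After this commit, `clear_terminal` is no longer part of the signature and should not be passed.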
@@ -410,7 +408,6 @@ The `Pypdl` class represents a file downloader that can download a file from a g
     callback = None,
     block = True,
     display = True,
-    clear_terminal = True
 )`: Starts the download process.

 ##### Parameters
@@ -443,7 +440,6 @@ The `Pypdl` class represents a file downloader that can download a file from a g
 - `callback`: A callback function to be called when the download is complete. The default value is `None`. The function must accept 2 positional parameters: `status` (bool) indicating if the download was successful, and `result` (FileValidator object if successful, None if failed).
 - `block`: Whether to block until the download is complete. The default value is `True`.
 - `display`: Whether to display download progress and other optional messages. The default value is `True`.
-- `clear_terminal`: Whether to clear the terminal before displaying the download progress. The default value is `True`.

 - Supported Keyword Arguments:
 - `params`: Parameters to be sent in the query string of the new request. The default value is `None`.
@@ -477,44 +473,12 @@ The `Pypdl` class represents a file downloader that can download a file from a g

 ### Helper Classes

-#### `Basicdown()`
-
-The `Basicdown` class is the base downloader class that provides the basic structure for downloading files.
-
-##### Attributes
-
-- `curr`: The current size of the downloaded file in bytes.
-- `completed`: A flag that indicates if the download is complete.
-- `interrupt`: A flag that indicates if the download was interrupted.
-- `downloaded`: The total amount of data downloaded so far in bytes.
-
-##### Methods
-
-- `download(url, path, mode, session, **kwargs)`: Downloads data in chunks.
-
-#### `Singledown()`
-
-The `Singledown` class extends `Basicdown` and is responsible for downloading a whole file in a single segment.
-
-##### Methods
-
-- `worker(url, file_path, session, **kwargs)`: Downloads a whole file in a single segment.
-
-#### `Multidown()`
-
-The `Multidown` class extends `Basicdown` and is responsible for downloading a specific segment of a file.
-
-##### Methods
-
-- `worker(segment_table, id, session, **kwargs)`: Downloads a part of the file in multiple segments.
-
 #### `FileValidator()`

 The `FileValidator` class is used to validate the integrity of the downloaded file.

-##### Parameters
-
-- `path`: The path of the file to be validated.
+##### Attributes
+- `path`: The path of the file

 ##### Methods

@@ -526,12 +490,6 @@ The `FileValidator` class is used to validate the integrity of the downloaded fi

 The `AutoShutdownFuture` class is a wrapper for concurrent.futures.Future object that shuts down the eventloop and executor when the result is retrieved.

-##### Parameters
-
-- `future`: The Future object to be wrapped.
-- `executor`: The executor to be shut down when the result is retrieved.
-- `loop`: The eventloop to be stopped when result is retrieved.
-
 ##### Methods

 - `result(timeout=None)`: Retrieves the result of the Future object and shuts down the executor. If the download was successful, it returns a `FileValidator` object; otherwise, it returns `None`.
@@ -540,11 +498,6 @@ The `AutoShutdownFuture` class is a wrapper for concurrent.futures.Future object

 The `EFuture` class is a wrapper for a `concurrent.futures.Future` object that integrates with an event loop to handle asynchronous operations.

-##### Parameters
-
-- `future`: The `Future` object to be wrapped.
-- `loop`: The event loop that will be used to manage the `Future`.
-
 ##### Methods

 - `result(timeout=None)`: Retrieves the result of the `Future` object. If the `Future` completes successfully, it returns the result; otherwise, it raises an exception.
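
The README describes `AutoShutdownFuture` as a `Future` wrapper that releases its executor once the result has been read. The general pattern can be sketched as follows. This is not pypdl's implementation: `ShutdownOnResult` is a hypothetical name, the event-loop handling is omitted, and only standard-library calls are used.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Optional


class ShutdownOnResult:
    """Hypothetical wrapper: shuts the executor down once the result is read."""

    def __init__(self, future: Future, executor: ThreadPoolExecutor) -> None:
        self._future = future
        self._executor = executor

    def result(self, timeout: Optional[float] = None) -> Any:
        try:
            return self._future.result(timeout)
        finally:
            # Release worker threads whether the call returned or raised.
            self._executor.shutdown(wait=False)


executor = ThreadPoolExecutor(max_workers=1)
wrapped = ShutdownOnResult(executor.submit(pow, 2, 10), executor)
value = wrapped.result()

# A shut-down executor refuses new work with RuntimeError.
try:
    executor.submit(pow, 2, 2)
    accepted_after_shutdown = True
except RuntimeError:
    accepted_after_shutdown = False
```

Tying cleanup to `result()` means the caller does not need a separate shutdown call, at the cost of making the wrapper single-use.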

pypdl/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
-__version__ = "1.5.4"
+__version__ = "1.5.5"

 from .pypdl import Pypdl

pypdl/producer.py

Lines changed: 7 additions & 7 deletions
@@ -135,23 +135,23 @@ async def _fetch_task_info(self, url, file_path, multisegment, **kwargs):
         if file_size := int(header.get("content-length", 0)):
             self._logger.debug("File size acquired from header")

-        if range_header:
-            start, end = get_range(range_header, file_size)
+        if multisegment and range_header:
+            size = get_range(range_header, file_size)
         else:
-            start = 0
-            end = file_size - 1
+            size = Size(0, file_size - 1)

-        size = Size(start, end)
         etag = header.get("etag", "")
         if etag != "":
             self._logger.debug("ETag acquired from header")
             etag = etag.strip('"')

-        if size.value < 1 or not header.get("accept-ranges"):
+        if size.value == 0 or not header.get("accept-ranges"):
             self._logger.debug("Single segment mode, accept-ranges or size not found")
-            kwargs["headers"] = user_headers
             multisegment = False

+        if not multisegment:
+            kwargs["headers"] = user_headers
+
         return url, file_path, multisegment, etag, size, kwargs

     async def _fetch_header(self, url, **kwargs):
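
The fallback to single-segment mode in this hunk can be sketched as a pure function, for a request that carries no Range header. The names here are assumptions for illustration: `Size` is a hypothetical stand-in whose `value` is taken to be the inclusive byte count (`end - start + 1`), and `choose_mode` does not exist in pypdl.

```python
from typing import Mapping, NamedTuple, Tuple


class Size(NamedTuple):
    """Hypothetical stand-in for pypdl's Size; `value` is assumed to be the byte count."""

    start: int
    end: int

    @property
    def value(self) -> int:
        return self.end - self.start + 1


def choose_mode(header: Mapping[str, str], multisegment: bool) -> Tuple[bool, Size]:
    """Mirror the hunk's checks for a request without a Range header."""
    file_size = int(header.get("content-length", 0))
    size = Size(0, file_size - 1)
    # Unknown size or no range support forces single-segment mode.
    if size.value == 0 or not header.get("accept-ranges"):
        multisegment = False
    return multisegment, size
```

Under this assumption, a missing `content-length` yields `Size(0, -1)`, whose `value` is exactly 0, which would explain why the check could change from `< 1` to `== 0`.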

pypdl/pypdl.py

Lines changed: 5 additions & 7 deletions
@@ -104,7 +104,6 @@ def start(
         callback: Callable = None,
         block: bool = True,
         display: bool = True,
-        clear_terminal: bool = True,
         **kwargs,
     ) -> Union[utils.EFuture, utils.AutoShutdownFuture, List]:
         """
@@ -124,7 +123,6 @@ def start(
         :param callback: Callback function to call after each download.
         :param block: If True, block until downloads finish.
         :param display: If True, display progress.
-        :param clear_terminal: If True, clear terminal before displaying progress bar.
         :param kwargs: Addtional keyword arguments for aiohttp.
         :return: A future-like object if non-blocking, or a result list if blocking.
         :raises RuntimeError: If downloads are already in progress.
@@ -171,7 +169,7 @@ def start(
             task_dict[i] = task
             self.total_task += 1

-        coro = self._download_tasks(task_dict, display, clear_terminal)
+        coro = self._download_tasks(task_dict, display)

         self._future = utils.EFuture(
             asyncio.run_coroutine_threadsafe(coro, self._loop.get()),
@@ -194,7 +192,7 @@ def start(

         return future

-    async def _download_tasks(self, tasks_dict, display, clear_terminal):
+    async def _download_tasks(self, tasks_dict, display):
         self._logger.debug("Starting download tasks")
         start_time = time.time()
         coroutines = []
@@ -224,7 +222,7 @@ async def _download_tasks(self, tasks_dict, display, clear_terminal):
                 self._consumers.append(consumer)

             self._logger.debug("Starting producer and consumer tasks")
-            self._pool.submit(self._progress_monitor, display, clear_terminal)
+            self._pool.submit(self._progress_monitor, display)
             await utils.auto_cancel_gather(*coroutines)
             await asyncio.sleep(0.5)
         except utils.MainThreadException as e:
@@ -272,11 +270,11 @@ def _reset(self):
         self._success.clear()
         self._failed.clear()

-    def _progress_monitor(self, display, clear_terminal):
+    def _progress_monitor(self, display):
         self._logger.debug("Starting progress monitor")
         interval = 0.5
         recent_queue = deque(maxlen=12)
-        with utils.ScreenCleaner(display, clear_terminal):
+        with utils.ScreenCleaner(display):
             while not self.completed and not self._interrupt.is_set():
                 self._calc_values(recent_queue, interval)
                 if display:

pypdl/utils.py

Lines changed: 42 additions & 38 deletions
@@ -7,7 +7,7 @@
 from concurrent.futures import CancelledError, Executor, Future, ThreadPoolExecutor
 from os import path
 from threading import Event, Thread
-from typing import Callable, Dict, List, Optional, Tuple, Union
+from typing import Callable, Dict, List, Optional, Union
 from urllib.parse import unquote, urlparse

 from aiofiles import open as fopen
@@ -278,22 +278,17 @@ def _stop(self) -> None:


 class ScreenCleaner:
-    """A context manager to clear the screen and hide cursor."""
+    """Context manager to hide the terminal cursor and add spacing for cleaner output."""

-    def __init__(self, display: bool, clear_terminal: bool):
+    def __init__(self, display: bool):
         self.display = display
-        self.clear_terminal = clear_terminal
-
-    def clear(self) -> None:
-        sys.stdout.write(2 * "\n")
-        if self.clear_terminal:
-            sys.stdout.write("\033c") # Clear screen
-        sys.stdout.write("\x1b[?25l") # Hide cursor
-        sys.stdout.flush()

     def __enter__(self):
         if self.display:
-            self.clear()
+            sys.stdout.write(2 * "\n")
+            sys.stdout.write("\x1b[?25l") # Hide cursor
+            sys.stdout.flush()
+
         return self

     def __exit__(self, exc_type, exc_val, exc_tb):
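
The behavior after this change (two newlines plus a hidden cursor, with no screen reset) can be checked in isolation. A minimal sketch: `CursorHider` is a hypothetical analogue of `ScreenCleaner`, and its `__exit__` body is an assumption, since the diff does not show what the real one does.

```python
import io
import sys
from contextlib import redirect_stdout


class CursorHider:
    """Hypothetical analogue of ScreenCleaner as it looks after this commit."""

    def __init__(self, display: bool) -> None:
        self.display = display

    def __enter__(self) -> "CursorHider":
        if self.display:
            sys.stdout.write(2 * "\n")
            sys.stdout.write("\x1b[?25l")  # hide cursor
            sys.stdout.flush()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        if self.display:
            # Assumption: show the cursor again; the diff omits this body.
            sys.stdout.write("\x1b[?25h")
            sys.stdout.flush()


# Capture what would be written to the terminal.
buffer = io.StringIO()
with redirect_stdout(buffer):
    with CursorHider(display=True):
        pass
captured = buffer.getvalue()
```

The captured output contains no `\033c` reset sequence, so earlier terminal contents stay in the scrollback.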
@@ -303,7 +298,7 @@ def __exit__(self, exc_type, exc_val, exc_tb):


 def to_mb(size_in_bytes: int) -> float:
-    return size_in_bytes / MEGABYTE
+    return max(0, size_in_bytes) / MEGABYTE


 def seconds_to_hms(sec: float) -> str:
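
The effect of the clamp in `to_mb` is easy to see on its own. In this sketch the value of `MEGABYTE` is an assumption (1024 * 1024), since the diff does not show how pypdl defines it.

```python
MEGABYTE = 1024 * 1024  # assumption: the diff does not show pypdl's definition


def to_mb(size_in_bytes: int) -> float:
    # Negative inputs are clamped to zero before conversion.
    return max(0, size_in_bytes) / MEGABYTE
```

A negative byte count now maps to zero megabytes instead of a negative figure.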
@@ -502,45 +497,54 @@ def default_logger(name: str) -> logging.Logger:
     handler = logging.FileHandler("pypdl.log", mode="a", delay=True)
     handler.setFormatter(
         logging.Formatter(
-            "(%(name)s) %(asctime)s - %(levelname)s: %(message)s",
+            "%(asctime)s - %(levelname)s: %(message)s",
             datefmt="%d-%m-%y %H:%M:%S",
         )
     )
     logger.addHandler(handler)
     return logger


-def get_range(range_header: str, file_size: int) -> Tuple[int, int]:
-    def parse_part(part: str) -> Optional[int]:
-        return int(part) if part else None
+def get_range(range_header: str, file_size: int) -> Size:
+    if not range_header.lower().startswith("bytes="):
+        raise ValueError('Range header must start with "bytes="')

-    range_value = range_header.replace("bytes=", "")
-    parts = range_value.split("-")
-    if len(parts) != 2:
-        raise TypeError("Invalid range format")
+    range_value = range_header.split("=")[1].strip()
+    parts = range_value.split("-", 1)

-    start, end = map(parse_part, parts)
+    try:
+        start = int(parts[0]) if parts[0] else None
+        end = int(parts[1]) if parts[1] else None
+    except (ValueError, IndexError):
+        raise ValueError(f"Invalid range format: {range_value}")

+    # Case 1: "bytes=start-end"
     if start is not None and end is not None:
-        if start > end:
-            raise TypeError("Invalid range, start is greater than end")
+        # Already parsed correctly
+        pass
+
+    # Case 2: "bytes=start-"
+    elif start is not None and end is None:
+        end = file_size - 1
+
+    # Case 3: "bytes=-suffix_length"
+    elif end is not None and start is None:
+        if end == 0:
+            raise ValueError("Invalid range: suffix length cannot be zero")
+        start = max(0, file_size - end)
+        end = file_size - 1
+
+    # Case 4: Invalid format like "bytes=-"
     else:
-        if file_size == 0:
-            raise TypeError("Invalid range, file size is 0")
-
-        if end is not None:
-            if end > file_size - 1:
-                raise TypeError("Invalid range, end is greater than file size")
-            start = file_size - end
-            end = file_size - 1
-        elif start is not None:
-            if start > file_size - 1:
-                raise TypeError("Invalid range, start is greater than file size")
-            end = file_size - 1
+        raise ValueError(f"Invalid range format: {range_value}")
+
+    if start > end:
+        if end == -1: # file_size == 0
+            start = 0
         else:
-            raise TypeError(f"Invalid range: {start}-{end}")
+            raise ValueError(f"Invalid range: start ({start}) > end ({end})")

-    return start, end
+    return Size(start, end)


 def run_callback(
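
The four Range cases handled by the new `get_range` can be exercised with a standalone sketch. This is a simplified re-expression of the logic in the hunk above, not the library's code: `parse_range` is a hypothetical name, and `Size` is a stand-in tuple of inclusive start and end offsets.

```python
from typing import NamedTuple, Optional


class Size(NamedTuple):
    """Hypothetical stand-in for pypdl's Size (inclusive start/end offsets)."""

    start: int
    end: int


def parse_range(range_header: str, file_size: int) -> Size:
    if not range_header.lower().startswith("bytes="):
        raise ValueError('Range header must start with "bytes="')

    range_value = range_header.split("=", 1)[1].strip()
    first, sep, last = range_value.partition("-")
    if not sep:
        raise ValueError(f"Invalid range format: {range_value}")

    # int() raises ValueError on non-numeric parts.
    start: Optional[int] = int(first) if first else None
    end: Optional[int] = int(last) if last else None

    if start is None and end is None:  # "bytes=-"
        raise ValueError(f"Invalid range format: {range_value}")
    if start is None:  # "bytes=-N": the last N bytes
        if end == 0:
            raise ValueError("Invalid range: suffix length cannot be zero")
        start = max(0, file_size - end)
        end = file_size - 1
    elif end is None:  # "bytes=N-": from N to the end of the file
        end = file_size - 1

    if start > end:
        if end == -1:  # file_size == 0, size unknown
            start = 0
        else:
            raise ValueError(f"Invalid range: start ({start}) > end ({end})")
    return Size(start, end)
```

For example, `parse_range("bytes=-200", 1000)` gives `Size(800, 999)`, and a suffix longer than the file is clamped to the whole file rather than rejected, which is how the diff replaces the old out-of-bounds `TypeError` checks.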
