Releases: lablup/backend.ai
22.03.6
Fixes
- Refine `scripts/install-dev.sh`, `./py`, and `./pants-local` scripts to better detect and use an existing CPython available in the host (#438)
- Fix upload failures of the Client SDK wheel packages due to a bogus syntax/rendering error of reST caused by specific backslash patterns (#455)
- Correct missing dependencies due to different package-import names and indirect module references in the webserver (#459)
Documentation Changes
- Mention Git LFS as a prerequisite explicitly and let the install-dev script always run `git lfs pull` (#446)
Full Changelog
Check out the full changelog until this release (22.03.6).
22.06.0.dev4
Fixes
- Fix upload failures of the Client SDK wheel packages due to a bogus syntax/rendering error of reST caused by specific backslash patterns (#455)
Full Changelog
Check out the full changelog until this release (22.06.0.dev4).
22.03.5
22.03.4
22.03.4rc1
Features
- Add missing options (`parents`, `exist_ok`) for the `mkdir` CLI command and functional API in the client SDK (#431)
- Execute the keypair bootstrap script for batch compute sessions as well (previously it was only executed for interactive sessions) (#437)
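The option names `parents` and `exist_ok` match the keyword arguments of `pathlib.Path.mkdir` in the Python standard library. Assuming the client SDK options follow the same semantics (an assumption; the release note does not spell them out), their effect can be illustrated with the standard library alone. The helper name `demo_mkdir_options` is made up for this sketch and is not part of the SDK.

```python
import tempfile
from pathlib import Path


def demo_mkdir_options() -> list:
    """Show what `parents` and `exist_ok` mean, using pathlib as a stand-in."""
    outcomes = []
    with tempfile.TemporaryDirectory() as root:
        nested = Path(root) / "a" / "b" / "c"

        # Without parents=True, missing intermediate directories are an error.
        try:
            nested.mkdir()
        except FileNotFoundError:
            outcomes.append("missing-parent")

        # parents=True creates "a" and "a/b" on the way to "a/b/c".
        nested.mkdir(parents=True)
        outcomes.append("created")

        # Without exist_ok=True, creating an existing directory is an error.
        try:
            nested.mkdir()
        except FileExistsError:
            outcomes.append("already-exists")

        # exist_ok=True turns the repeated call into a no-op.
        nested.mkdir(parents=True, exist_ok=True)
        outcomes.append("no-op")
    return outcomes
```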
Fixes
- Dump kernel registry information to a file upon `KernelStartedEvent` or `KernelTerminatedEvent`. Saving at the container start event did not ensure the existence of the kernel object's `runner` attribute, which may cause `AttributeError` when restarting the Agent server. (#441)
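The failure mode in this fix can be modelled in a few lines. The sketch below is not the Agent's code: the `Kernel` class, `attach_runner`, `snapshot`, and `restore` are hypothetical stand-ins, and the registry is saved as plain dictionaries rather than in the Agent's real file format. It only shows why a snapshot taken before the `runner` attribute exists produces objects that lack it after a restart.

```python
class Kernel:
    """Hypothetical stand-in for an agent-side kernel object."""

    def __init__(self, kernel_id):
        self.kernel_id = kernel_id
        # `runner` is deliberately not set here; in this model it is
        # attached later, once the kernel has actually started.

    def attach_runner(self):
        self.runner = "runner-for-" + self.kernel_id


def snapshot(registry):
    """Copy the attributes of every kernel, as a dump-to-file would."""
    return {kid: dict(vars(kernel)) for kid, kernel in registry.items()}


def restore(saved):
    """Rebuild kernel objects from a snapshot without calling __init__."""
    restored = {}
    for kid, attrs in saved.items():
        kernel = Kernel.__new__(Kernel)
        kernel.__dict__.update(attrs)
        restored[kid] = kernel
    return restored


def demo():
    registry = {"k1": Kernel("k1")}
    early = snapshot(registry)        # taken too early: no runner yet
    registry["k1"].attach_runner()
    late = snapshot(registry)         # taken after the kernel has started
    return (
        hasattr(restore(early)["k1"], "runner"),
        hasattr(restore(late)["k1"], "runner"),
    )
```

In this model, any code that reads `runner` on a kernel restored from the early snapshot raises `AttributeError`, while the later snapshot restores cleanly.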
Documentation Changes
- Add a guide for plugin related workflow with the new mono-style repository structure (#434)
- Merge the documentation of the Client SDK for Python into the unified docs (#435)
Miscellaneous
22.06.0.dev2
- This is another test release to verify automation of marking pre-releases.
22.06.0.dev1
Miscellaneous
- Migrate to a semi-mono repository that contains all first-party server-side components with automated dependency management via Pants (#417)
- Add a Pants plugin `towncrier_tool` to allow running towncrier for changelog generation (#427)
- Update readthedocs.org build configurations (#428)
- Update documentation for daily development workflows using Pants (#429)
- Automate creation of the release in GitHub when we commit tags (#433)
22.03
Key Highlights
General
- WSProxy (AppProxy) v2. It now supports session pooling using wildcard domain and direct connection to containers without going through the manager and webserver, reducing API gateway overheads for large-scale inference workloads.
- Migration of file browser sessions to storage proxy, which also reduces the API gateway overheads when transferring large files via file browser sessions.
- Improve multi-architecture support (including ARM64) by allowing a hybrid setup of agents with different architectures on a single cluster and by supporting multi-architecture container images.
Extensibility
- Support setting custom webhook URLs to subscribe to session state changes.
- Support mounting sub-directories of vfolders. (backported to 21.09)
- Support mounting vfolders (and their sub-directories) to arbitrary absolute paths. (backported to 21.09)
- Implement inter-session dependencies to automatically start sessions one after another when prior ones finish successfully.
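Inter-session dependencies amount to running sessions in a topological order and holding back any session whose prerequisites did not finish successfully. The following is a minimal sketch of that idea using `graphlib` from the Python standard library (3.9+). It is not the manager's scheduler: the function name `sessions_started`, the session names, and the `succeeds` callback are all hypothetical.

```python
from graphlib import TopologicalSorter


def sessions_started(dependencies, succeeds):
    """Return the sessions that get started, in start order.

    `dependencies` maps a session name to the set of sessions that must
    finish successfully first. `succeeds(name)` reports whether a started
    session finished successfully.
    """
    finished_ok = set()
    started = []
    for name in TopologicalSorter(dependencies).static_order():
        # Start a session only when all of its prerequisites succeeded.
        if dependencies.get(name, set()) <= finished_ok:
            started.append(name)
            if succeeds(name):
                finished_ok.add(name)
    return started


pipeline = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
}
```

With `pipeline` above, `sessions_started(pipeline, lambda name: True)` starts all three sessions in chain order, while a failing `train` session prevents `evaluate` from starting at all.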
Administration
- Add new scheduler options to scaling groups: pending timeout of newly created sessions and allowed session types.
Client SDK
- Now supports JSON output formatting for mutation commands.
Enterprise Features
- Backend.AI Dashboard. It provides real-time performance metrics collected from Backend.AI Agents and compute plugins, as well as per-node Prometheus exporters.
- Add support for floating licenses (i.e., counting the active nodes and devices).
Performance and Stability
- etcd is now the default highly available distributed lock backend. The lock backend can also be configured as "pg_advisory" (the previous behavior) or "filelock" (for single-node setups).
- Move several statistics database columns to Redis to avoid excessive transaction locking. (backported to 21.09)
- Apply the explicit "readonly" attribute to more database transactions. (backported to 21.03 and 21.09)
- Deprecate measurement of scratch directory space usage due to excessive I/O overheads with a large number of files. (backported to 21.03 and 21.09)
- Apply `aiotools.PersistentTaskGroup` to a broader range of asyncio task management patterns to ensure smoother graceful shutdown. (backported to 21.09)
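To illustrate why a file-based lock is only suitable for single-node setups, here is a minimal sketch of an exclusive advisory lock built on `fcntl.flock` from the Python standard library (Unix only). It is not the project's "filelock" backend, whose implementation is not shown in these notes; `file_lock`, `try_lock`, and `demo` are hypothetical names. The lock is enforced by the local kernel, so processes on other hosts never see it, which is the gap a distributed backend such as etcd fills.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def file_lock(path):
    """Hold an exclusive advisory lock on `path` for the duration of the block."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def try_lock(path):
    """Return True if the lock could be taken right now without waiting."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return False  # somebody else holds the lock
    else:
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)


def demo():
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "demo.lock")
        before = try_lock(path)
        with file_lock(path):
            during = try_lock(path)  # a second holder is refused
        after = try_lock(path)
    return before, during, after
```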
Changelogs
- https://github.com/lablup/backend.ai-manager/blob/22.03/CHANGELOG.md
- https://github.com/lablup/backend.ai-agent/blob/22.03/CHANGELOG.md
- https://github.com/lablup/backend.ai-common/blob/22.03/CHANGELOG.md
- https://github.com/lablup/backend.ai-client-py/blob/22.03/CHANGELOG.md
- https://github.com/lablup/backend.ai-webserver/blob/22.03/CHANGELOG.md
- https://github.com/lablup/backend.ai-webui/blob/22.03/CHANGELOG.md
- https://github.com/lablup/backend.ai-storage-proxy/blob/22.03/CHANGELOG.md
21.09
Key Highlights
- Hardware platforms
  - Support running on ARM64 platforms (Linux / macOS with Apple Silicon)
  - Improve RDMA support with InfiniBand networks (backported to 21.03)
  - NetApp storage integration
- UI/UX
  - Statistics dashboard integration (Enterprise only)
  - Improve the performance of listing many items by server-side filtering and pagination
  - Display the progress of image pulls while creating a new session when agents do not yet have the image
- Client SDK
  - Revamp the CLI with JSON-formatted outputs for better scriptability and a restructured command hierarchy for consistency
- API
  - Allow manual assignment of agent(s) when creating a session for ease of node diagnosis
  - Global query filter and query ordering expression support in GraphQL paginated list queries (backported to 21.03, 20.09)
- Scheduler
  - Fix the HoL blocking issue in the FIFO scheduler with priority adjustments (backported to 21.03, 20.09)
  - Fix lots of database stability issues by adopting SQLAlchemy v1.4 with asyncio support (backported to 21.03, 20.09)
- Stability
  - Explicitly apply TCP keepalive timeouts to every database and RPC connection to avoid implicit and silent connection drops by network middleboxes (backported to 21.03)
  - Adopt aioredis v2 and rewrite the internal event bus with Redis STREAM APIs (backported to 21.03)
  - Adopt aiohttp v3.8 and drop aiojobs
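Head-of-line (HoL) blocking in a FIFO scheduler, mentioned in the Scheduler item above, can be shown with a toy model. In the sketch below each pending session is a `(name, required_slots)` pair. The second function simply passes over a session that does not fit; this is a simplification chosen for illustration and is not the priority-adjustment mechanism of the actual fix. All names are hypothetical.

```python
from collections import deque


def strict_fifo(pending, free_slots):
    """Schedule strictly in arrival order; stop at the first session that does not fit."""
    queue = deque(pending)
    scheduled = []
    while queue and queue[0][1] <= free_slots:
        name, slots = queue.popleft()
        free_slots -= slots
        scheduled.append(name)
    return scheduled


def fifo_with_skip(pending, free_slots):
    """Keep arrival order, but pass over a session that does not fit."""
    scheduled = []
    for name, slots in pending:
        if slots <= free_slots:
            free_slots -= slots
            scheduled.append(name)
    return scheduled


# One oversized session at the head of the queue, two small ones behind it.
pending = [("big", 8), ("small-1", 1), ("small-2", 2)]
```

With 4 free slots, `strict_fifo(pending, 4)` schedules nothing because `big` blocks the queue, while `fifo_with_skip(pending, 4)` schedules both small sessions.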
21.03.0
Key Highlights
- All server-side components now run on top of Python 3.9.
- (BETA) The native support for Windows 10 (and Server 2019) is coming soon.
- This release has the same set of features and fixes as the latest v20.09 series.
You may treat it as an integrated stability update against the v20.09 series.
Changelogs
- https://github.com/lablup/backend.ai-manager/blob/21.03/CHANGELOG.md
- https://github.com/lablup/backend.ai-agent/blob/21.03/CHANGELOG.md
- https://github.com/lablup/backend.ai-common/blob/21.03/CHANGELOG.md
- https://github.com/lablup/backend.ai-client-py/blob/21.03/CHANGELOG.md
- https://github.com/lablup/backend.ai-webserver/blob/21.03/CHANGELOG.md
- https://github.com/lablup/backend.ai-webui/blob/21.03/CHANGELOG.md
- https://github.com/lablup/backend.ai-storage-proxy/blob/21.03/CHANGELOG.md