
Releases: lablup/backend.ai

22.03.6

10 Jun 02:16

Fixes

  • Refine scripts/install-dev.sh, ./py, and ./pants-local scripts to better detect and use an existing CPython available in the host (#438)
  • Fix upload failures of the Client SDK wheel packages due to a bogus syntax/rendering error of reST caused by specific backslash patterns (#455)
  • Correct missing dependencies due to different package-import names and indirect module references in the webserver (#459)

Documentation Changes

  • Mention Git LFS as a prerequisite explicitly and let the install-dev script run git lfs pull always (#446)

Full Changelog

Check out the full changelog until this release (22.03.6).

22.06.0.dev4

09 Jun 07:13
Pre-release

Fixes

  • Fix upload failures of the Client SDK wheel packages due to a bogus syntax/rendering error of reST caused by specific backslash patterns (#455)

Full Changelog

Check out the full changelog until this release (22.06.0.dev4).

22.03.5

08 Jun 15:41

Features

  • Implement plugin blocklist and utilize it to mutually exclude self-embedded plugins in the manager and agent for when they are executed under a unified virtualenv (#453)
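As a rough illustration of the blocklist idea, the sketch below filters a list of discovered plugins against a set of blocked names. The function `filter_plugins`, the plugin names, and the entry point strings are all hypothetical; this is not the actual Backend.AI plugin loader.

```python
from typing import Iterable, Iterator


def filter_plugins(
    discovered: Iterable[tuple[str, str]],
    blocklist: frozenset[str],
) -> Iterator[tuple[str, str]]:
    """Yield (name, entrypoint) pairs whose name is not in the blocklist."""
    for name, entrypoint in discovered:
        if name in blocklist:
            # Skip plugins that belong to the other component.
            continue
        yield name, entrypoint


# Hypothetical plugin names and entry points, for illustration only.
discovered = [
    ("intrinsic", "pkg.manager.plugin:Intrinsic"),
    ("cuda", "pkg.accelerator.cuda:CUDAPlugin"),
]

# In a unified virtualenv both plugins are visible to every component,
# so each component passes the names it must not load.
manager_plugins = list(filter_plugins(discovered, frozenset({"cuda"})))
```

With one shared virtualenv, each component would pass its own blocklist so that it never loads the plugins embedded in the other.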

Fixes

  • Fix an agent startup error caused by an UnboundLocalError of the now variable when dumping the last registry. (#452)
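For readers unfamiliar with this class of bug, here is a minimal sketch of how an UnboundLocalError can arise when a local variable is assigned on only one code path. The function names and the `have_kernels` flag are hypothetical and do not reproduce the actual agent code.

```python
import time


def dump_registry_buggy(have_kernels: bool) -> float:
    if have_kernels:
        now = time.monotonic()
    # `now` is bound only inside the branch above, so reading it here
    # raises UnboundLocalError whenever have_kernels is False.
    return now


def dump_registry_fixed(have_kernels: bool) -> float:
    # Bind the variable unconditionally before any use.
    now = time.monotonic()
    return now
```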

22.03.4

08 Jun 08:41

Fixes

  • Always dump kernel registry information to a file upon agent termination. (#450)

22.03.4rc1

08 Jun 03:41
Pre-release

Features

  • Add missing options (parents, exist_ok) for the mkdir CLI command and functional API in the client SDK (#431)
  • Execute the keypair bootstrap script for batch compute session as well (previously it was only executed for interactive sessions) (#437)
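The parents and exist_ok option names match those of Python's pathlib.Path.mkdir. Assuming the client SDK options follow the same semantics (an assumption, since the changelog does not spell them out), the standard-library behavior can be sketched as follows. This uses only pathlib on a local temporary directory, not the Backend.AI client API.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "a" / "b" / "c"

    # Without parents=True, missing intermediate directories are an error.
    try:
        target.mkdir()
    except FileNotFoundError:
        missing_parent_failed = True
    else:
        missing_parent_failed = False

    # parents=True creates the intermediate directories as needed.
    target.mkdir(parents=True)

    # Without exist_ok=True, creating an existing directory is an error.
    try:
        target.mkdir(parents=True)
    except FileExistsError:
        duplicate_failed = True
    else:
        duplicate_failed = False

    # exist_ok=True makes the call idempotent.
    target.mkdir(parents=True, exist_ok=True)
    created = target.is_dir()
```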

Fixes

  • Dump kernel registry information to a file upon KernelStartedEvent or KernelTerminatedEvent. Saving at the container start event did not guarantee that the kernel object's runner attribute existed, which could cause an AttributeError when restarting the agent server. (#441)

Documentation Changes

  • Add a guide for plugin related workflow with the new mono-style repository structure (#434)
  • Merge the documentation of the Client SDK for Python into the unified docs (#435)

Miscellaneous

  • Fix install-dev.sh to work with RHEL-like distros by fixing system package names (#372)
  • Improve auto-detection of plugins in development setups so that we no longer need to reinstall them after running ./pants export (#439)

22.06.0.dev2

03 Jun 05:44
Pre-release
  • This is another test release to verify the automation of marking pre-releases.

22.06.0.dev1

03 Jun 05:13
Pre-release

Miscellaneous

  • Migrate to a semi-mono repository that contains all first-party server-side components with automated dependency management via Pants (#417)
  • Add a Pants plugin towncrier_tool to allow running towncrier for changelog generation (#427)
  • Update readthedocs.org build configurations (#428)
  • Update documentation for daily development workflows using Pants (#429)
  • Automate creation of the release in GitHub when we commit tags (#433)

22.03

25 Apr 03:49
22.03.0

Key Highlights

General

  • WSProxy (AppProxy) v2. It now supports session pooling using a wildcard domain and direct connections to containers without going through the manager and webserver, reducing API gateway overhead for large-scale inference workloads.
  • Migration of file browser sessions to storage proxy, which also reduces the API gateway overheads when transferring large files via file browser sessions.
  • Improve multi-architecture support (including ARM64), by allowing hybrid setup of agents with different architectures on a single cluster and supporting multi-architecture container images.

Extensibility

  • Support setting custom webhook URLs to subscribe to session state changes.
  • Support mounting sub-directories of vfolders. (backported to 21.09)
  • Support mounting vfolders (and their sub-directories) to arbitrary absolute paths. (backported to 21.09)
  • Implement inter-session dependencies to automatically start sessions one after another when prior ones finish successfully.
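The inter-session dependency behavior can be pictured as a topological ordering in which a session starts only after every session it depends on has finished successfully. The sketch below uses the standard library's graphlib. The session names, the `run_in_order` helper, and the `run` callback are hypothetical and do not reflect the manager's actual scheduler.

```python
from graphlib import TopologicalSorter
from typing import Callable

# Hypothetical session names mapped to the sessions they depend on.
dependencies = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
}


def run_in_order(
    deps: dict[str, set[str]],
    run: Callable[[str], bool],
) -> list[str]:
    """Run sessions in dependency order; return those that succeeded."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    completed: list[str] = []
    while True:
        ready = sorter.get_ready()
        if not ready:
            # Either everything finished or a failure blocks the rest.
            break
        for name in ready:
            if run(name):
                completed.append(name)
                # Marking the node done unblocks its dependents.
                sorter.done(name)
            # A failed session is never marked done, so the sessions
            # depending on it never become ready.
    return completed
```

For example, `run_in_order(dependencies, lambda name: name != "train")` runs "preprocess", attempts "train", and never starts "evaluate".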

Administration

  • Add new scheduler options to scaling groups: pending timeout of newly created sessions and allowed session types.

Client SDK

  • Now supports JSON output formatting for mutation commands.

Enterprise Features

  • Backend.AI Dashboard. It provides real-time performance metrics collected from Backend.AI Agents and compute plugins, as well as per-node Prometheus exporters.
  • Add support for floating licenses (i.e., counting the active nodes and devices).

Performance and Stability

  • Now we use etcd as the highly available distributed lock backend by default, and the lock backend can be configured as "pg_advisory" (for previous behavior) and "filelock" (for single-node setup) as well.
  • Move several statistics database columns to Redis to avoid excessive transaction locking. (backported to 21.09)
  • Apply the explicit "readonly" attribute to more database transactions. (backported to 21.03 and 21.09)
  • Deprecate measurement of scratch directory space usage due to excessive I/O overhead with a large number of files. (backported to 21.03 and 21.09)
  • Apply aiotools.PersistentTaskGroup to a broader range of asyncio task management patterns to ensure smoother graceful shutdown (backported to 21.09)

Changelogs

21.09

08 Nov 07:23
21.09.0

Key Highlights

  • Hardware platforms
    • Support running on ARM64 platforms (Linux / macOS with Apple Silicon)
    • Improve RDMA support with Infiniband networks (backported to 21.03)
    • NetApp storage integration
  • UI/UX
    • Statistics dashboard integration (Enterprise only)
    • Improve the performance of listing many items with server-side filtering and pagination
    • Display the progress of image pulls while creating a new session when agents do not yet have the image
  • Client SDK
    • Revamp the CLI with JSON-formatted outputs for better scriptability and restructured command hierarchy for consistency
  • API
    • Allow manual assignment of agent(s) when creating a session for ease of node diagnosis
    • Global query filter and query ordering expression support in GraphQL paginated list queries (backported to 21.03, 20.09)
  • Scheduler
    • Fix the HoL blocking issue in the FIFO scheduler with priority adjustments (backported to 21.03, 20.09)
    • Fix lots of database stability issues by adopting SQLAlchemy v1.4 with asyncio support (backported to 21.03, 20.09)
  • Stability
    • Explicitly apply TCP keepalive timeouts on every database and RPC connection to avoid implicit and silent connection drops by network middleboxes (backported to 21.03)
    • Adopt aioredis v2 and rewrite the internal event bus with Redis STREAM APIs (backported to 21.03)
    • Adopt aiohttp v3.8 and drop aiojobs
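To make the TCP keepalive item above more concrete, here is a minimal sketch of enabling keepalive probes on a plain socket with Python's socket module. The helper name `enable_keepalive` and the timeout values are assumptions chosen for illustration; the release notes do not state the values or the code paths Backend.AI uses.

```python
import socket


def enable_keepalive(
    sock: socket.socket,
    idle: int = 60,      # seconds of idleness before the first probe (assumed)
    interval: int = 10,  # seconds between probes (assumed)
    count: int = 3,      # failed probes before the connection is dropped (assumed)
) -> None:
    # Turn on keepalive probing for this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The fine-grained options are platform-specific, so guard each one.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

Setting these explicitly means idle connections exchange periodic probes, so a middlebox that would otherwise silently expire the flow either keeps it open or the failure is detected promptly.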

21.03.0

29 Mar 01:55

Key Highlights

  • All server-side components now run on top of Python 3.9.
  • (BETA) The native support for Windows 10 (and Server 2019) is coming soon.
  • This release has the same set of features and fixes as the latest v20.09 series.
    You may treat it as an integrated stability update against the v20.09 series.

Changelogs