
Releases: lablup/backend.ai

19.09.0

22 Sep 07:08

Key highlights

  • Custom image import API that automatically converts existing Python-based Docker images into runnable Backend.AI kernel images.
  • Batch jobs that execute the given startup command immediately after session creation and are terminated immediately once done, with an explicit record of success or failure depending on the command's exit code.
  • High availability support by running multiple manager instances.
  • Job queueing, which allows submission of session creation requests even when the cluster's resources are fully utilized, and automatically starts the oldest pending requests whenever the required amount of resources becomes available.
  • Event monitoring API using the HTML5 Server-Sent Events protocol, which lets clients receive kernel lifecycle notifications without excessive polling.
  • 3-level user privileges: super-admin, domain-admin, and user
  • Customizable new-user signup process
  • Authentication support for etcd
  • Dedicated SSH keypairs bound to user keypairs, which are auto-installed into their sessions
  • Support for integration with Harbor Docker registries
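The event monitoring API above relies on the standard Server-Sent Events wire format, which clients parse line by line. As a minimal sketch, here is a parser for one SSE event block; the field syntax follows the SSE specification, but the event name in the sample payload is hypothetical, not taken from the actual Backend.AI API:

```python
def parse_sse_event(raw: str) -> dict:
    """Parse a single Server-Sent Events block ("event:" / "data:" lines)
    into a dict. Field handling follows the SSE specification: comments
    start with ":", multiple data lines are joined with newlines."""
    event = {"event": "message", "data": []}
    for line in raw.splitlines():
        if not line or line.startswith(":"):  # skip blanks and comments
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):  # the spec strips one leading space
            value = value[1:]
        if field == "event":
            event["event"] = value
        elif field == "data":
            event["data"].append(value)
    event["data"] = "\n".join(event["data"])
    return event

# Sample payload; the event name "kernel_terminated" is illustrative only.
sample = 'event: kernel_terminated\ndata: {"reason": "task-done"}'
parsed = parse_sse_event(sample)
```

A real client would read such blocks from a long-lived HTTP response stream instead of a string.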

There are many more changes and fixes.
Please refer to the per-component changelogs in their repositories.

19.03.0

22 Sep 07:15

Key highlights

  • This is the first version to support a usable web GUI via the console project.
  • Integration with NGC (NVIDIA GPU Cloud) images
  • Per-keypair resource policies
  • Support for authentication with Redis
  • Resource presets
  • Multiple vfolder hosts to utilize multiple volume mounts
  • Various clean-ups related to resource slot definitions and their operational semantics, including renaming the "gpu" slot to "cuda.shares" and "cuda.device"
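To illustrate the slot renaming mentioned above, a legacy resource map using a bare "gpu" slot maps onto the new device-type-prefixed naming. The helper below is a hypothetical sketch of that translation, not code from the manager:

```python
def migrate_resource_slots(slots: dict) -> dict:
    """Rename the legacy "gpu" slot to the new "cuda.shares" name,
    leaving all other slots untouched. Purely illustrative of the
    renamed slot scheme; not the actual migration code."""
    migrated = {}
    for name, value in slots.items():
        if name == "gpu":
            migrated["cuda.shares"] = value
        else:
            migrated[name] = value
    return migrated

legacy = {"cpu": 4, "mem": "8g", "gpu": 1.5}
new_slots = migrate_resource_slots(legacy)
```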

There are many more changes and fixes.
Please refer to the per-component changelogs in their repositories.

18.12.0

22 Sep 07:11

Key highlights

  • Service ports
  • CORS support in the gateway API
  • TPU plugin support
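CORS support in the gateway API means browser-based clients on other origins can call the API after the standard preflight exchange. A minimal sketch of what a preflight response involves, with header names taken from the CORS specification; the allowed-origin list and method/header choices below are hypothetical, not the gateway's actual configuration:

```python
# Hypothetical allow-list; the real gateway's CORS policy is configurable.
ALLOWED_ORIGINS = {"https://console.example.com"}

def preflight_headers(origin: str) -> dict:
    """Build CORS preflight response headers for an allowed origin,
    or return an empty dict to signal rejection. Header names follow
    the CORS specification."""
    if origin not in ALLOWED_ORIGINS:
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, POST, DELETE, OPTIONS",
        "Access-Control-Allow-Headers": "Authorization, Content-Type",
    }

headers = preflight_headers("https://console.example.com")
```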

There are many more changes and fixes.
Please refer to the per-component changelogs in their repositories.

v1.4.0

02 Oct 14:17

Key highlights: Shared virtual folders and multi-GPU scheduling

Manager

  • Add a new set of virtual folder APIs to invite other users to one's own vfolder and to list/accept invitations from other users. (lablup/backend.ai-manager#80)
  • Improve existing APIs to stream downloads/uploads of virtual folder files and add an explicit option to recursively delete a directory (lablup/backend.ai-manager#89, lablup/backend.ai-manager#70)
  • Add a new kernel API to list files in the session container (lablup/backend.ai-manager#63)
  • All API endpoints are now available without version prefixes (e.g., /v2/) and in the future only this will be supported. (lablup/backend.ai-manager#78)
  • The user_id field of the keypairs database table is now a string instead of an integer. You need to provide a manual user_id_map.txt mapping file to run the database schema upgrade using alembic.
  • Upgrade to aiohttp v3.4 series.

Agent

  • Add support for multi-GPU scheduling, where you can allocate multiples of GPU shares to compute sessions so that they can access multiple GPUs. The agent's decimal-based "share" model supports fractional allocations as well, but currently fractional CUDA GPU sharing is highly experimental and only provided to private testers. (lablup/backend.ai-agent#66)
  • Introduces an initial version of accelerator plugins. Currently there is only one plugin: CUDA accelerator. Now you can easily turn on/off CUDA GPU supports by installing/uninstalling this plugin. (lablup/backend.ai-agent#66)
  • Add support for nvidia-docker v2. (lablup/backend.ai-agent#64)
  • Agent restarts now completely preserve the kernel session states. (lablup/backend.ai-agent#35, lablup/backend.ai-agent#73)
  • You may limit an agent's view of available system resources such as CPU cores and GPU devices using a hexadecimal mask, for benchmarks and multi-GPU debugging. (lablup/backend.ai-agent#65)
  • Stability improvements, including that the agent no longer retries killing already-terminated kernel containers but reports them as "terminated", preventing an infinite loop of kernel creation failures in certain usage scenarios.
  • Improve inner beauty for future support of non-dockerized environments.
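The hexadecimal mask mentioned above restricts which devices the agent considers available. A sketch of how such a bitmask could map to device indices; the exact semantics (bit i set meaning device i is visible) are an assumption for illustration, not the documented agent behavior:

```python
def visible_devices(hex_mask: str, num_devices: int) -> list:
    """Return the indices of devices whose bit is set in the given
    hexadecimal mask. Assumes bit 0 (least significant) corresponds
    to device 0 -- an illustrative convention only."""
    mask = int(hex_mask, 16)
    return [i for i in range(num_devices) if mask & (1 << i)]

# Mask 0x5 = 0b0101: only devices 0 and 2 would be visible to the agent.
devices = visible_devices("0x5", 4)
```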

Client for Python (v1.4)

  • Add support for new vfolder subcommands to invite and accept invitation of shared virtual folders.
  • Add support for listing and downloading vfolder files.
  • Client library users should now wrap API invocations in an explicit session, like aiohttp's client APIs. (example)
  • Upgrade to aiohttp v3.4 series.
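The explicit-session requirement follows the same pattern as aiohttp's ClientSession: resources are acquired when the session opens and reliably released when it closes. The toy Session class below sketches that pattern only; it is not the actual client library API:

```python
class Session:
    """Toy stand-in illustrating the explicit-session pattern:
    underlying resources (e.g., a connection pool) live exactly as
    long as the context manager. Not the real Backend.AI client class."""
    def __init__(self):
        self.closed = True

    def __enter__(self):
        self.closed = False  # e.g., open an HTTP connection pool
        return self

    def __exit__(self, exc_type, exc, tb):
        self.closed = True   # always release connections, even on error
        return False

with Session() as sess:
    active = not sess.closed  # API calls would be issued here
```

After the with-block exits, the session is guaranteed closed, which is the point of making it explicit.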

v1.3.0

14 Mar 08:35

Key highlight: Improve dockerization support and add a plugin architecture for future extension

Manager

Agent

  • Fix repeated docker event polling even when there are connection or client-side aiohttp errors.
  • Upgrade aiohttp to v3.0 release.
  • Improve dockerization. (lablup/backend.ai-agent#55)
  • Improve inner beauty.

Client for Python (v1.2.1)

  • Improve exception handling (use Exception instead of BaseException as the base class for BackendError)
  • Upgrade aiohttp to v3.0 release.
  • Fix silent swallowing of asyncio.CancelledError and asyncio.TimeoutError
  • Allow uploading multiple files to a virtual folder in a single command (backend.ai vfolder upload)

v1.2.0

30 Jan 02:55

Key highlight: Improved logging and batch-mode interactions

NOTICE

  • We now have official documentation with an installation guide!
    Check it out here!

  • From this release, the manager and agent versions move together to indicate
    their compatibility, even when one of them has relatively few changes.

Manager and Agent

  • The gateway server now considers per-agent image availability when scheduling a new
    kernel. (lablup/backend.ai-manager#29)

  • The execute API now returns the exit code value of underlying in-kernel
    subprocesses in the batch mode. (lablup/backend.ai-manager#60)

  • The API gateway server is now fully horizontally scalable across multiple cores and
    multiple servers.

  • Improve logging: it now provides multiprocess-safe file-based rotating logs.
    (lablup/backend.ai-manager#10)
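With the batch-mode change above, the execute API surfaces the exit code of the in-kernel subprocess, so a client can tell success from failure. The sketch below classifies such a response; the field names ("status", "exitCode") are assumptions for illustration, not the documented wire format:

```python
def batch_result(response: dict) -> str:
    """Classify a batch-mode execute response as still running,
    succeeded, or failed, based on the subprocess exit code.
    Field names here are illustrative, not the actual API schema."""
    if response.get("status") != "finished":
        return "running"
    return "success" if response.get("exitCode") == 0 else "failure"

outcome = batch_result({"status": "finished", "exitCode": 2})
```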

Manager

  • Fix the Admin API error when filtering agents by their status due to a missing
    method parameter in Agent.batch_load().

Agent

  • Remove the image name prefix when reporting available images. (lablup/backend.ai-agent#51)

  • Improve debug-kernel mode to mount host-side kernel runner source into the kernel
    containers so that they use the latest, editable source clone of the kernel runner.

Client (v1.1.5 to v1.1.7)

  • Apply authentication to websocket-based API requests.

  • Fix a bug in client-side validation of user-provided session ID token.

  • Add missing ai.backend.client.cli.admin module in the distributed package.