Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for Features functionality and spec. #27

Merged
merged 5 commits into from
May 25, 2022
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
143 changes: 143 additions & 0 deletions proposals/devcontainer-features.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,143 @@
# Dev container features reference

Dev container features provide a smooth path for customizing your container definitions. Features are self contained units of code meant to facilitate creation of custom images for application or development containers.

From a practical point of view, features are folders that contain units of code with different entrypoints for different lifecycle events.

Features can be defined by a `feature.json` file in the root folder of the feature. The file is optional for backwards compatibility but it is required for any new features being authored.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use a distinctive name to make it easy to associate a schema file (and possibly other editing support) to it. devcontainer-feature.json would work well. Similarly, let's use devcontainer-feature-collection.json or shorter devcontainer-features.json.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I liked the shorthand of just feature.json but if we agree on that change, I can make it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The shorthand is nice, but I think tools support will be easier when we use a more distinct name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the changes. cc @joshspicer, @samruddhikhandale


Features are to be executed in sequence as defined in `devcontainer.json`.

## Folder Structure

A feature is a self contained entity in a folder. A feature release would be a tar file that contains all files part of the feature.

+-- feature
| +-- feature.json
| +-- validate.sh (default)
| +-- aquire.sh (default)
| +-- install.sh (default)
| +-- (other files)

In case `feature.json` does not include a reference for the lifecycle scripts the application will look for the default script names and will execute them if available.

In case there is intent to create a set of features that share code, it is possible to create a feature collection in the following way:

collectionFolder
+-- feature-collection.json
+-- common (or similar)
| +-- (library files)
+-- feature1
| +-- feature.json
| +-- install.sh (or others)
| +-- (other files)
+-- feature2
(etc)

## feature.json properties

the `feature.json` file defines information about the feature to be used by any tool that helps a user utilize them, and the way the feature will be executed.

The properties of the file are as follows:

| Property | Type | Description |
| :--- | :--- | :--- |
| id | string | Id of the feature/definition. The id should be unique in the context of the repository/published package where they exist. |
| name | string | Name of the feature/definition |
| description | string | Description of the feature/definition |
| documentationURL | string | Url that targets the documentation of the feature. |
| licenseURL | string | Url that points to the license of the feature. |
Copy link

@bhack bhack Apr 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add an OSI boolean?
I will be faster to filter then regex on license remote files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main issue might be that we would have to get authors to maintain that flag. It might be easier to allow the value to be the name of an OSI approved license OR an URL to a custom license (maybe as an object literal that has an OSI flag for those who want/have to link to their copy of an OSI license).

A license name to show in the UI would also make sense.

Copy link

@bhack bhack May 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scope was related to having something machine readable so that we can quckly filter also in the UI.

How can we quickly filter OSS only elements?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edgonmsft Do I need to open a new ticket to track this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, lets create a new one. I think its a good idea the we should flesh out a little bit more. Thanks!

| version | string | Version of the feature. |
| metadata | any | Freeform data added by users for their own purposes. |
| keywords | array | List of keywords relevant to a user that would search for this definition/feature. |
| install.app | string | App to execute.ll (Optional, defaults to /bin/sh) |
| install.file | string | Parameters/script to execute. (Defaults to validate.sh) |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would that be needed for all script files? Let's omit this for simplicity, the predefined names do not seem to be a limitation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I get what you mean.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can always use install.sh (assuming validate.sh is a typo), there is no need to make this configurable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bein configurable is so that features can be created in other languages, if its not present we do fallback to install.sh

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid adding non-essential properties to the spec for now and this can be covered by having an install.sh forward the call.

| options | object | Possible options to be passed as environment variables to the execution of the scripts |
| containerEnv | object | A set of name value pairs that sets or overrides environment variables. |
| privileged | boolean | If preveleged mode is required by the feature. |
| init | boolean | If its necesarry to run `init`. |
| capAdd | array | Additional capabilities needed by the feature. |
| securityOpt | array | Security options needed by the feature. |
| entrypoint | string | Set if the feature sets requires an entrypoint. |

Options

| Property | Type | Description |
| :--- | :--- | :--- |
| id | string | Id of the option |
| id.type | string | Type of the option |
| id.enum | string array | possible values |
| id.default | string | default value for the option |
| id.description | string | Description for the option |

## feature-collection.json properties.

A feature collection file is a compilation of the `feature.json` files for each individual feature.

If the application finds a `feature-collection.json` file in the root of a downloaded tar file then it uses that file as to find the particular feature that will be executed.

In addition to the list of features included, the file includes the following properties.

| Property | Type | Description |
| :--- | :--- | :--- |
| name | string | Name of the collection |
| reference | string | Reference information of the repository and path where the code is stored. |
| version | string | Version of the code. |

In most cases the `feature-collection.json` file can be generated automatically at the moment of creating a release.

## devcontainer.json properties

Features are referenced in `devcontainer.json` where the `features` tag consists of an array with the ordered list of features to be included in the container image.

The properties are:
| Property | Type | Description |
| :--- | :--- | :--- |
| id | string | reference to the particular feature to be included. |
| options | object | Type of the option |

The `id` is the main reference point for how to find and download a particular feature. `id` can be defined in any of the following ways:

| Type | Description |
| :--- | :--- |
| feature | A simple name references a feature included with the application used to execute `devcontainer.json` |
| organization/repository/{feature or collectionl/feature}/@version | Reference to a particular release in a repository |
| https://<../URI/..>/devcontainer-features.tgz#{optional feature} | Direct reference to a file download. |
| ./{local-path} -or- ../{local-path} | A path relative to devcontainer.json where a feature or feature collection can be found. |

## Authoring

Features can be authored in a number of languages, the most straight forward being bash scripts. If a feature is authored in a different language information about it should be included in the metadata so that users can make an informed choice about it.

Reference information about the application required to execute the feature should be included in `feature.json` in the metadata section.

Applications should default to `/bin/sh` for features that do not include this information.

If the feature is included in a folder as part of the repository that contains `devcontainer.json` no other steps are necessary.

## Release

A release is created when the objective is to have other users use a feature.

A release consists of the following:

1.- Tar file with all the included files for the feature or feature collection.
2.- A copy of the `feature.json` or `feature-collection.json` file that defines the contents of the tar file with additional information added to validate it:

| Property | Type | Description |
| :--- | :--- | :--- |
| version | string | SemVer version of the release. |
| checksum | string | Checksum of the tar file. |

## Execution

The application that implements features should:

- Features are executed in the order defined in devcontainer.json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep this declarative for several reasons:

  • Usability: When there are implicit dependencies the user has no way knowing what they are and we have no way of changing them over time without risking to break existing configurations.
  • Parallelization: We want to preserve the option to parallelize installation in the future, using multi-stage builds or otherwise.
  • Explicit dependencies: We want to preserve the option to add explicit dependencies, these would change the order of installation (potentially change it over time as dependencies change).

One example was the creation of a user: Since the user is specified in the devcontainer.json as a top-level option, it makes more sense to deal with that in general instead of as part of a feature (the common feature?). It might be easiest if we create the user before installing any features if that user does not exist yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I like the idea of having/using the cached features for things like users, although I'm not sure if defaulting to running it before other features is the best idea.
    I feel like there are two ways of dealing with this. One is the array format which makes it clear to the user in what order things are going to run and the user has total control of that. Not sure I like the idea of things changing from under me if there is an update somewhere upstream. Declaring dependencies could actually be done in devcontainer.json instead of as part of the feature itself. If we do that the tool can more easily parallelize, but it complicates the user experience in that now they have to declare the dependencies instead of just using an in order execution.

Thoughts? @Chuxel @jkeech @joshspicer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify: Dependencies would be declared by features themselves (not in the devcontainer.json), the user should not have to know about these. We don't have dependencies at the moment, but since we have already investigated adding them, I use them as one example to show that having "features" in the devcontainer.json determine an execution order will constrain us in the future. (Even more important is the usability argument above.)

On the user example: The user is configured with the top-level "remoteUser" independently of features. That suggests that we should address it also independently of features. Since features then depend on that user, we would create it before we run the feature install scripts. (Having a feature create the "remoteUser" doesn't seem to make sense since that user needs to exist also without any features for the configuration to be valid.)

Another argument for keeping "features" declarative is that it will make adding auto-detection of features in the future simpler. (If order matters, we need logic to arrange that for the detected features.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I get what you mean but not sure if we should add that to this proposal till we have a better idea of how we want it to work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main idea of making this changes is to allow users to control the order of execution, especially since we are not supporting dependencies yet and the changes to the OS applied by features might clash between themselves in unpredictable ways that a user might need to fix in this way.

Thoughts @Chuxel, @bamurtaugh, @jkeech, @joshspicer ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this would work, but is there a way a user could set a property if the order of their features matters? I think the concerns @chrmarti mentions make sense, but I can see the value in what @edgonmsft mentions about controlling the order of execution, so could we default to declarative but allow setting a property that ordering matters, in case users would like to override this declarative default?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order that works best will depend on how the features are implemented and could change over time. The user does not have any insight into this. When we add dependencies we want to be in control of the order.

We should ensure the order of installation is always the same (e.g., sort by id), so the outcome is reproducible. If we have any implicit dependencies today, I suggest we take a closer look and find a way to deal with them short term.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree, I think the user being able to execute things in the order they choose is important for the following reasons:

  • Running things like pip install and npm install g etc that need to run after their respective sdks are installed. This can also be custom user code that needs to explicitly run after certain features.
  • There won't be a feature for everything, users need to be able to add their own code and control where it executes in the order with the minimum of fuss. Likewise, it should be easy to know at a glance how things are working.
  • Dependencies actually are on the underlying changes to the OS and not on features themselves. A feature can have its dependencies satisfied in different ways, like if the base image the user selects already has it installed in a different way. There can also be multiple features that satisfy an underlying dependency in a different way.
  • In order for features to be more reusable it is easier to avoid making monolithic features and allow users to go in and out of the ecosystem as they need, this would also provide a path for users to grow from features to more custom builds without having to get out of using features and needing to replicate what they do.
  • At some points in time there might be issues with existing features when new images are released and users need a way to fix by running code before, in the middle or after some features.
  • I actually think its a bad idea if a change in a feature changes the order of installation completely without the user knowing about it.

We've already encountered the case where to build some complex images we need to add custom code at different points in time, code that only makes sense for that image and not as a reusable feature.

There are probably more reasons than this, thoughts? @chrmarti, @Chuxel, @joshspicer , @jkeech

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My biggest concern with obscuring the installation order of features (which are implemented as  shell scripts with no standardization on how to author them), is that is will significantly hinder the ability of a community member to author their own set of features, and prevent the wider adoption and ecosystem of features.

We have not landed on a simple, complete mechanism for automated versioning/dependency management despite a fair bit of time and effort researching and discussing options. We've investigated building on top of existing package managers like npm. Our biggest detractor for such an idea is that we aren't just installing the feature, but the feature with the provided options. Our "blessed" set of features today already function quite differently given provided options - We would need strict authoring rules for the community, as well as a strictly defined set of metadata and listing of dependencies, to make the declarative features object feasible. We are talking about reimplementing a package manager like 'apt'.

I think that is too complicated given the problem 'features' is trying to solve (An easy-to-use mechanism to get tool X available in your dev environment, that can auto-update on rebuilds if you want it to)

To the three arguments for implicit ordering:

Usability: When there are implicit dependencies the user has no way knowing what they are and we have no way of changing them over time without risking to break existing configurations.

I think the opposite is true. We provide the user with total control over dependencies (we hope there are few with features, but maybe there are 1 or 2). I think hiding the dependencies will cause more breaks for users.

Parallelization: We want to preserve the option to parallelize installation in the future, using multi-stage builds or otherwise.

This argument makes sense, we lose a bit of control over installation order if we aren't able to decide that ourselves.

Explicit dependencies: We want to preserve the option to add explicit dependencies, these would change the order of installation (potentially change it over time as dependencies change).

This is what we are seeking with the array syntax, an option for users to add explicit dependencies. If a dependency changes significantly, it is on the author to create a new major version of the feature and for users to make the conscious decision on how they would like to proceed in their dev containers. I have already implemented a way to pin to a GitHub release or tgz URL, ranges are a next logical step.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, we'll also need to take a dependency on the base image, which could be based on a yet another base image, both installing tools. While I think, in a perfect world, we could write tooling to figure everything out for a user, pragmatically I don't think we want to get into that business :D

- It should be possible to execute a feature multiple times with different parameters.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Should a feature opt-in to supporting multiple runs? Otherwise this will make writing features harder.
  • Is it up to the feature to decide which run will be on the PATH by default?
  • How would a feature provide a version switcher to change which install is on the PATH?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I agree that features can opt-in to supporting multiple runs, perhaps as a propertie of devcontianer-feature.json that we could validate in some way.
  • While making changes to the implementation of the current features I added parameters to control it and with true. That reinforces the idea that features need to declare if they support multiple runs or not.
  • I feel like that is more a feature of a tool that a user would use at runtime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be easier to let a feature add options for installing multiple versions (maybe with some additional support for doing that). That would allow the feature to control installation of all requested versions in a single run instead of having it run multiple times. With that approach IntelliSense in the devcontainer.json would likely include a hint if a feature supports multiple versions (e.g., an "additionalVersions" property).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't want to block features from receiving multiple values and using them to loop, the idea is that we should also not disallow multiple runs.

- Features are used to create an image that can be used to create a container or not.
- Parameters like `privileged`, ``init` are included if just 1 feature requires them.
- Parameters like `capAdd`, `securityOp` are concatenated.
- ContainerEnv is added before the feature is executed and persisted.

TODO