GEP: Add support for cookie in the HTTPRouteMatch API #2891

Open
lianglli opened this issue Mar 22, 2024 · 14 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. kind/gep PRs related to Gateway Enhancement Proposal(GEP) needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@lianglli

lianglli commented Mar 22, 2024

What would you like to be added:

It would be great if HTTPRouteMatch had a field to select an HTTP route by matching HTTP request cookies.

In addition, this proposal adds a new match type, "List", which matches if the value of the cookie named by the name field is present in a list of strings.
The "List" match type could be applied to HTTPHeaderMatch and HTTPQueryParamMatch as well.

Similar features exist in ingress controllers such as ingress-nginx (the nginx.ingress.kubernetes.io/canary-by-cookie annotation, ref: https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#canary), kubernetes-ingress (ref: https://docs.nginx.com/nginx-ingress-controller/configuration/virtualserver-and-virtualserverroute-resources/#condition) and tengine-ingress (ref: https://tengine.taobao.org/document/ingress_routes.html).

// CookieMatchType specifies the semantics of how HTTP cookie values should be
// compared. Valid CookieMatchType values, along with their conformance levels, are:
//
// * "Exact" - Core
// * "List" - Extended
// * "RegularExpression" - Implementation Specific
//
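// * "Exact" matches the cookie value against an exact string
// * "List" matches if the cookie value is present in a list of strings
// * "RegularExpression" matches the cookie value against a regular expression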
//
// Note that values may be added to this enum, implementations
// must ensure that unknown values will not cause a crash.
//
// Unknown values here must result in the implementation setting the
// Accepted Condition for the Route to `status: False`, with a
// Reason of `UnsupportedValue`.
//
// +kubebuilder:validation:Enum=Exact;List;RegularExpression
type CookieMatchType string

// CookieMatchType constants.
const (
    CookieMatchExact CookieMatchType = "Exact"
    CookieMatchList CookieMatchType = "List"
    CookieMatchRegularExpression CookieMatchType = "RegularExpression"
)

// HTTPCookieMatch describes how to select an HTTP route by matching HTTP request
// cookies.
type HTTPCookieMatch struct {
    // Type specifies how to match against the value of the cookie.
    //
    // Support: Core (Exact)
    //
    // Support: Extended (List)
    //
    // Support: Implementation-specific (RegularExpression)
    //
    // Since RegularExpression CookieMatchType has implementation-specific
    // conformance, implementations can support POSIX, PCRE or any other dialects
    // of regular expressions. Please read the implementation's documentation to
    // determine the supported dialect.
    //
    // +optional
    // +kubebuilder:default=Exact
    Type *CookieMatchType `json:"type,omitempty"`

    // Name is the cookie-name of the cookie-pair in the HTTP Cookie header to be matched.
    // (See https://datatracker.ietf.org/doc/html/rfc6265#section-4.2.1)
    // cookie-header = "Cookie:" OWS cookie-string OWS
    // cookie-string = cookie-pair *( ";" SP cookie-pair )
    // (See https://datatracker.ietf.org/doc/html/rfc6265#section-4.1.1)
    // cookie-pair   = cookie-name "=" cookie-value
    // cookie-name   = token
    // token         = <token, defined in [RFC2616], Section 2.2>
    //
    // If multiple entries specify equivalent cookie names, only the first
    // entry with an equivalent name MUST be considered for a match. Subsequent
    // entries with an equivalent cookie name MUST be ignored. Due to the
    // case-insensitivity of cookie names, "foo" and "Foo" are considered
    // equivalent.
    //
    // When a Cookie header is repeated in an HTTP request, it is
    // implementation-specific behavior as to how this is represented.
    // Generally, proxies should follow the guidance from the RFC:
    // https://www.rfc-editor.org/rfc/rfc7230.html#section-3.2.2 regarding
    // processing a repeated header.
    Name HTTPHeaderName `json:"name"`

    // Value is the cookie-value of the cookie-pair in the HTTP Cookie header to be matched.
    // Matches if the cookie named by the Name field is present in the HTTP Cookie header with this value.
    //
    // +kubebuilder:validation:MinLength=1
    // +kubebuilder:validation:MaxLength=1024
    Value string `json:"value"`

    // Values are the cookie-value list of the cookie-pair in the HTTP Cookie header to be matched.
    // Matches if the value of the cookie with name field is present in the list.
    //
    // +optional
    // +listType=set
    // +kubebuilder:validation:MaxItems=16
    Values []string `json:"values,omitempty"`
}

Why this is needed:

Routing on headers and query parameters is already common; it is equally useful to route requests to specific backends by matching a cookie-name and cookie-value.

Cookies are used to maintain state and identify specific users. They are an essential part of an HTTP request.

The Cookie header contains one or more cookie-pairs, each consisting of a cookie-name and a cookie-value.

cookie-header = "Cookie:" OWS cookie-string OWS
cookie-string = cookie-pair *( ";" SP cookie-pair )
cookie-pair       = cookie-name "=" cookie-value
cookie-name       = token
cookie-value      = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
cookie-octet      = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
                       ; US-ASCII characters excluding CTLs,
                       ; whitespace DQUOTE, comma, semicolon,
                       ; and backslash
token             = <token, defined in [[RFC2616], Section 2.2](https://datatracker.ietf.org/doc/html/rfc2616#section-2.2)>
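
As a rough illustration of the grammar above, the following sketch checks a candidate cookie-value against the cookie-octet rule. `isCookieOctet` and `isValidCookieValue` are hypothetical helper names introduced here; a validator like this could, for example, back admission checks on the Value and Values fields, although the proposal does not require that.

```go
package main

import "fmt"

// isCookieOctet reports whether b is permitted by the cookie-octet rule:
// %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E.
func isCookieOctet(b byte) bool {
	switch {
	case b == 0x21,
		b >= 0x23 && b <= 0x2B,
		b >= 0x2D && b <= 0x3A,
		b >= 0x3C && b <= 0x5B,
		b >= 0x5D && b <= 0x7E:
		return true
	}
	return false
}

// isValidCookieValue implements
// cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE ).
func isValidCookieValue(v string) bool {
	// Strip one pair of surrounding double quotes, if present.
	if len(v) >= 2 && v[0] == '"' && v[len(v)-1] == '"' {
		v = v[1 : len(v)-1]
	}
	for i := 0; i < len(v); i++ {
		if !isCookieOctet(v[i]) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isValidCookieValue("2426168118")) // true
	fmt.Println(isValidCookieValue(`"true"`))     // true
	fmt.Println(isValidCookieValue("a;b"))        // false
}
```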

For example, a backend service may want to select specific users in order to measure the effectiveness of an advertising campaign. Because cookies identify specific users, the following HTTP rule routes requests whose cookie "unb" has a value in the given list of strings (i.e., 2426168118, 2208203664638, 2797880990, 70772956, 2215140160618) to the service "http-route-canary-campaign:7001".

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: http-route-cookie
spec:
  hostnames:
  - http.route.cookies.com
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: http-gateway
  rules:
  - backendRefs:
    - kind: Service 
      name: http-route-production
      port: 7001
    matches:
    - path:
        type: PathPrefix
        value: /
  - backendRefs:
    - kind: Service
      name: http-route-canary-campaign
      port: 7001
    matches:
    - cookies:
      - name: unb
        type: List
        values:
        - "2426168118"
        - "2208203664638"
        - "2797880990"
        - "70772956"
        - "2215140160618"

Cookies, headers and query parameters are common techniques used in canary releases.
For example, with the following HTTP route, HTTP requests with the cookie "gray=true" will be routed to the canary service "http-site-canary:80".

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: http-route-cookie
spec:
  hostnames:
  - http.site.cookie.com
  - http.site.cookies.com
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: http-gateway
  rules:
  - backendRefs:
    - kind: Service
      name: http-site-production
      port: 80 
    matches:
    - path:
        type: PathPrefix
        value: /
  - backendRefs:
    - kind: Service
      name: http-site-canary
      port: 80
    matches:
    - cookies:
      - name: gray
        type: Exact
        value: "true"

If this requires a GEP, I would like to start working on it.

@lianglli lianglli added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 22, 2024
@shaneutt shaneutt added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Mar 22, 2024
@lianglli
Author

/assign @lianglli

@youngnick
Contributor

Thanks for this @lianglli, this seems like another GEP, if a small one.

@youngnick
Contributor

Hmm, you could conceivably do #2895 and this request together, maybe. @robscott, @shaneutt, thoughts?

@lianglli
Author

/kind gep

@k8s-ci-robot k8s-ci-robot added the kind/gep PRs related to Gateway Enhancement Proposal(GEP) label Mar 29, 2024
@lianglli lianglli changed the title Add support for cookie in the HTTPRouteMatch API GEP: Add support for cookie in the HTTPRouteMatch API Mar 29, 2024
@lianglli
Author

> Hmm, you could conceivably do #2895 and this request together, maybe. @robscott, @shaneutt, thoughts?

@shaneutt

I would prefer to create GEP-2891 separately.

lianglli added a commit to lianglli/gateway-api that referenced this issue Apr 3, 2024
@costinm

costinm commented May 6, 2024

Do we have the implementation status in the different proxy implementations? I think any proposed API should start with the list of proxies that provide the feature already - we can't define APIs for things that don't exist (in terms of data plane processing), and it is very difficult to evaluate a proposal without having the prior art and status.

Cookie is a particularly tricky one - there are multiple cookies, they have structure, etc. There are other structured headers - Baggage, (new) Authorization - and RFCs around structured fields. It is certainly NOT suitable for prefix/exact or regex matching.

@spacewander
Contributor

> Do we have the implementation status in the different proxy implementations? I think any proposed API should start with the list of proxies that provide the feature already - we can't define APIs for things that don't exist (in terms of data plane processing), and it is very difficult to evaluate a proposal without having the prior art and status.
>
> Cookie is a particularly tricky one - there are multiple cookies, they have structure, etc. There are other structured headers - Baggage, (new) Authorization - and RFCs around structured fields. It is certainly NOT suitable for prefix/exact or regex matching.

AFAIK, Envoy doesn't support cookie match.

Nginx supports cookie matching via its variable: https://nginx.org/en/docs/http/ngx_http_core_module.html#var_cookie_.
However, Nginx variable names do not support "-" and must be lowercase, whereas a cookie name can contain "-" and is case-sensitive. So matching cookies via Nginx variables is of limited use.

APISIX, which is a gateway built upon Nginx, overcomes some of the limitations by extending Nginx's variable system.
https://github.com/apache/apisix/blob/64b81c48ed3bf4ed1bf5b2edbd4bcdea5137123c/apisix/core/ctx.lua#L240

The cookie name in APISIX can be case-sensitive and contain -, so every valid Cookie name can be used.

@lianglli
Author

lianglli commented May 8, 2024

> Do we have the implementation status in the different proxy implementations? I think any proposed API should start with the list of proxies that provide the feature already - we can't define APIs for things that don't exist (in terms of data plane processing), and it is very difficult to evaluate a proposal without having the prior art and status.
>
> Cookie is a particularly tricky one - there are multiple cookies, they have structure, etc. There are other structured headers - Baggage, (new) Authorization - and RFCs around structured fields. It is certainly NOT suitable for prefix/exact or regex matching.

Please see the "## Prior Art" and "## References" sections of PR 2926.

Moreover, HTTPCookieMatch and the List match type are considered extended features.

@lianglli
Author

lianglli commented May 8, 2024

> Do we have the implementation status in the different proxy implementations? I think any proposed API should start with the list of proxies that provide the feature already - we can't define APIs for things that don't exist (in terms of data plane processing), and it is very difficult to evaluate a proposal without having the prior art and status.
>
> Cookie is a particularly tricky one - there are multiple cookies, they have structure, etc. There are other structured headers - Baggage, (new) Authorization - and RFCs around structured fields. It is certainly NOT suitable for prefix/exact or regex matching.

RFC 6265 defines the HTTP Cookie header.
The Cookie header contains one or more cookie-pairs, each consisting of a cookie-name and a cookie-value.

In PR 2926, HTTPCookieMatch describes how to select an HTTP route by matching HTTP request cookies.

The Name of HTTPCookieMatch is the cookie-name of the cookie-pair in the HTTP Cookie header to be matched.
The Value is the cookie-value of the cookie-pair in the HTTP Cookie header to be matched.
The Values are the cookie-value list of the cookie-pair in the HTTP Cookie header to be matched.

@Danielkiss9

This is very much needed!
We really need the ability to redirect requests to services based on cookies.

@robscott
Member

Hey @lianglli, if you're interested in getting this into v1.2, do you mind proposing this in our scoping discussion (#3103)?

@lianglli
Author

lianglli commented May 31, 2024

> Hey @lianglli, if you're interested in getting this into v1.2, do you mind proposing this in our scoping discussion (#3103)?

Yes, I will add a comment about GEP-2891: HTTP Cookie Match in #3103 as soon as possible.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 29, 2024
@kfox1111

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 29, 2024