Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement proposal 716, passing full URL paths #789

Closed
wants to merge 4 commits into from

Conversation

telackey
Copy link
Contributor

@telackey telackey commented Jul 25, 2018

An implementation of proposal #716, passing the full URL through the Gateway and watchdog.

Description

Previous to these changes, only the query string provide by the HTTP client was passed through to the handling code. With these changes, the full URL path is passed as well as the query string.

Motivation and Context

  • I have raised an issue to propose this change

Proposal #716

How Has This Been Tested?

A new automated test case has been added for the Gateway and for fwatchdog as well as manual testing.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

It is unlikely this is a breaking change. Since the path is not currently passed by the Gateway, no existing function code can be relying on it. Or more correctly, for there to be a problem, the code would have to be relying on the absence of the path, which is unlikely though not completely impossible.

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I've read the CONTRIBUTION guide
  • I have signed-off my commits with git commit -s
  • I have added tests to cover my changes.
  • All new and existing tests passed.

This may or may not require a documentation change. The existing documentation, AFAIK, does not address the use of paths one way or the other.

Refs

Trello: https://trello.com/c/1qZMlGXU

url := baseURL

if requestURL != "" {
matcher, _ := regexp.Compile("/?function/([^/?]+)([^?]*)")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, just curious what the regex is for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question! I allowed some of my regex related comments to fall by the wayside when I refactored. I will add them back...

So, when building the upstream URL, when forwarding it for a function, it is helpful to strip off the /function/{functionName} prefix to the Gateway way.

That is, if I request: http://gateway:8080/function/xyz/my/path it is nicer for it to forward to: 'http://xyzcontainer:8080/my/path` than http://xyzcontainer:8080/function/xyz/my/path.

This regex lifts out the URL after /function/{functionName} and appends it to the baseURL.

This should only be done for function forwarding though, as the Gateway also forwards other types of requests and shouldn't mangle those.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should only be done for function forwarding though, as the Gateway also forwards other types of requests and shouldn't mangle those.

Given the conditional nature I think that this should probably be done in another middleware function, one level higher, or by adding some config like stripPrefix: true to the type.

r, _ := regexp.Compile(forward + "([^/?]+).*")
matches := r.FindStringSubmatch(urlValue)
// The matched subgroup will be the function name.
if 2 == len(matches) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the magic number 2 mean? What is the purpose of the regex?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the above on adding back some comments...

So in this case, we want to figure out he function name from the URL. Previously that was done by trimming off the forward length and taking the rest. With a case like: /function/xyz/my/path though, what we really want is the section just following /function/ up to, but not including any later path segments or query string.

We've got one regex subgroup to match, so per the FindStringSubmatch if there is a match, the returned slice will have the text of the leftmost match at matches[0] (in this case, urlValue) followed by the matched subgroups. We've only got one subgroup, so what we want is matches[1]. So what we want to see for a positive match is a matches where the length is exactly 2.

I'll try to come up with a briefer way to say all that for the comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could improve the unit test coverage for this behaviour? Perhaps also extract the specific code into its own method for deeper testing.

@ericstoekl
Copy link
Contributor

ericstoekl commented Jul 31, 2018

@telackey Thanks for your PR. Could you give some steps that I can run to test out the functionality?

I tried building it this version of the gateway, and deployed a function which lists environment variables.

$ curl localhost:8080/function/pytest?hi=1
{... 'Http_Query': 'hi=1', ... 'Http_Path': '/'}

$ curl localhost:8080/function/pytest/hi
{..., 'Http_Path': '/hi'}

In the second call, should hi show up in the Http_Query list? Or would you expect it to just show up in the path?

@telackey
Copy link
Contributor Author

@ericstoekl

Just on the Http_Path. If you try it with the old code, you should find that you get a path of / with both of your examples.

It really comes into its own when combined with something like the new node8 Express template: https://github.com/openfaas-incubator/node8-express-template/, where Express could be used to handle the routing.

@telackey
Copy link
Contributor Author

@ericstoekl

Following up on the mention of the Node8 Express template, if you wanted to do some more extensive checking, you might use something like this in the index.js:

if (isObject(handler)) {
    _.forOwn(handler, function(value, key) {
        if (_.isFunction(value)) {
            if ('root' === key) {
                console.log("Dynamically registered route: /");
                app.get('/', value);
                app.post('/', value);
            } else {
                console.log("Dynamically registered route: /" +  key);
                app.post('/' + key, value);
                app.get('/' + key, value);
            }
        }
    });
} else {
    console.log("Registered catch-all route: /*");
    app.get('/*', handler);
    app.post('/*', handler);
}

This would allow you to do something like this in handler.js:

module.exports = {
     a: ... fnA ... , 
     b:  ... fnB ...,
     root: ... fnX ...
};

Which you could then test by hitting /, /a, /b, etc. Without the patches (ie, this and the related of-watchdog patch), you will only hit the root handler regardless, but after the patches you should be able to hit each according to the path used.

// upstream request. In the following regex, in the case of a match
// the requestURL will be at `0`, the function name at `1` and the
// rest of the path (the part we are interested in) at `2`.
matcher, _ := regexp.Compile("/?function/([^/?]+)([^?]*)")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During review I noticed that this compilation could be made ahead of time and re-used. This should improve performance/throughput.

@alexellis
Copy link
Member

I'm running into issues testing the PR.

I have OF deployed on Swarm on 192.168.0.35, then I did this:

go build && functions_provider_url=http://192.168.0.35:8080/ ./gateway

2018/08/02 12:22:33 Forwarded [GET] to /function/nodeinfo - [200] - 0.021146 seconds

When I hit 127.0.0.1:8080/function/nodeinfo I get returned with the UI.

If I hit 192.168.0.35:8080/function/nodeinfo I get the expected result

Can you advise?

Alex

@alexellis
Copy link
Member

I deployed the image to my swarm cluster and function readiness appears to be broken too.

screen shot 2018-08-02 at 12 34 05

@telackey
Copy link
Contributor Author

telackey commented Aug 7, 2018

I had been pulled away on other tasks. I'll take a look into this.

@derek
Copy link

derek bot commented Aug 7, 2018

Thank you for your contribution. I've just checked and your commit doesn't appear to be signed-off.
That's something we need before your Pull Request can be merged. Please see our contributing guide.

@derek derek bot added the no-dco label Aug 7, 2018
@telackey
Copy link
Contributor Author

telackey commented Aug 7, 2018

@alexellis

The readiness problem was an issue with the regex matching '/system/function/' as well as '/function'. This is my own fault for changing the code up at the 11th hour (previously, all the trimming had been done in the watchdog.)

It should be fixed now. As regards the IP issue... I cannot reproduce that, and it is hard to see how any of the code changes could be related. I'd suggest re-trying on the latest code, and if you still see it, I'll make another attempt.

@telackey
Copy link
Contributor Author

telackey commented Aug 7, 2018

@alexellis

I also addressed your comment about the regex. By my understanding, sharing regexs in golang can be tricky because you can end up with significant contention on the internal locks. My fix here is to pre-compile the expression as you suggested, but use Copy() for the actual usage. I also simplified to using a single expression for both the service name matching an the function path matching (it was pretty much the same expression anyway).

@telackey
Copy link
Contributor Author

telackey commented Aug 7, 2018

That string of merges is just a battle against git and amending with a signoff.

@derek derek bot removed the no-dco label Aug 7, 2018
@alexellis
Copy link
Member

This is high up on my list @telackey

@@ -66,7 +87,7 @@ func buildUpstreamRequest(r *http.Request, url string) *http.Request {

func forwardRequest(w http.ResponseWriter, r *http.Request, proxyClient *http.Client, baseURL string, requestURL string, timeout time.Duration) (int, error) {

upstreamReq := buildUpstreamRequest(r, baseURL+requestURL)
upstreamReq := buildUpstreamRequest(r, baseURL, requestURL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a previous question about whether this belongs in the generic code which forwards a request upstream. I feel like it belongs in a middleware handler which wraps this one, in that way we retain single-responsibility and the transformation is limited to the routes we add the middleware to. This would also allow one piece of code to cover the /function/ and /async-function/ routes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good example might be the CallID middleware?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or the basic auth middleware

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexellis Can you point me at an example so that I am sure we are on the same page?

@@ -430,6 +430,40 @@ func TestHealthHandler_SatusMethoNotAllowed_ForWriteableVerbs(t *testing.T) {
}
}

func TestHandler_HasFullPathAndQueryInFunction_WithCgi_Mode(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is not changing behaviour, just documenting it, I think we could raise a separate PR for it and I'll merge right aware since it's not changing anything.

@alexellis alexellis requested review from ivanayov and removed request for johnmccabe and ericstoekl August 13, 2018 16:53
alexellis added a commit that referenced this pull request Aug 13, 2018
This test takes inspiration from the PR from @telackey with changes
to make it more maintainable. Since the test does not require
changes to the code, I wanted to add it before merging changes.

Ref: #789

Signed-off-by: Alex Ellis (VMware) <[email protected]>
@alexellis
Copy link
Member

@telackey I noticed you didn't respond to my feedback on the unit test, so I've gone ahead and completed that task for you.

Please can you remove the unit test in watchdog/requesthandler_test.go from your changeset and squash down all 6 commits then push back up?

@ivanayov is doing some end to end validation of your changes, but I'd like to be able to test them against master too.

Alex

@telackey
Copy link
Contributor Author

@alexellis Per your earlier request, I changed things up to use a new URLPathTransformer interface instead.

I also added a configuration option: pass_url_path_to_functions. When this is true the path following the '/function/xyz/portion is passed along upstream. When it isfalse, the behavior will be similar to what it is now, passing only /` upstream.

@@ -15,6 +16,9 @@ import (
"github.com/prometheus/client_golang/prometheus"
)

// Parse out the service name (group 1) and rest of path (group 2).
var functionMatcher = regexp.MustCompile("^/?(?:async-)?function/([^/?]+)([^?]*)")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs considering whether GET should be allowed for asynchronous functions. Not sure it makes sense to be supported, but if path parameters work for both function and async-function the user may expect to be able to reference the same path with async as without.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the async route only allows for POST right now, but we could extend that. Please @ivanayov check and raise an issue to update if that's the case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in #829

// rest of the path (the part we are interested in) at `2`.
matcher := functionMatcher.Copy()
parts := matcher.FindStringSubmatch(ret)
if 3 == len(parts) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're close now but I don't like the magic numbers. Can we make this clearer for the code maintainers?

@@ -11,6 +11,7 @@ type Request struct {
Header http.Header
Body []byte
Method string
Path string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 do we have the equivalent change lined-up in nats-queue-worker @telackey @ivanayov ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I am not in a good position to test async functionality.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not in a good position to test async functionality

What am I missing? Testing the async call is as simple as deploying OpenFaaS on a new VM, deploying your gateway and then invoking the function. Check the logs on the queue-worker and you're good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely the question I was responding to about "equivalent change lined-up in nats-queue-worker" must also mean changing, building, and deploying the new nats-queue-worker too.

faasHandlers.DeleteFunction = handlers.MakeForwardingProxyHandler(reverseProxy, forwardingNotifiers, urlResolver)
faasHandlers.UpdateFunction = handlers.MakeForwardingProxyHandler(reverseProxy, forwardingNotifiers, urlResolver)
queryFunction := handlers.MakeForwardingProxyHandler(reverseProxy, forwardingNotifiers, urlResolver)
if config.PassURLPathsToFunctions {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should drop the config and enable this all the time? If it's not breaking

@alexellis
Copy link
Member

FYI we also have a rebase issue showing up with gateway/server.go.

@alexellis
Copy link
Member

I'm keen to move this forward and see it merged.

// MakeForwardingProxyHandler create a handler which forwards HTTP requests
func MakeForwardingProxyHandler(proxy *types.HTTPClientReverseProxy, notifiers []HTTPNotifier, baseURLResolver BaseURLResolver) http.HandlerFunc {
func MakeForwardingProxyHandler(proxy *types.HTTPClientReverseProxy, notifiers []HTTPNotifier, baseURLResolver BaseURLResolver, urlPathTransformer URLPathTransformer) http.HandlerFunc {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's split on two lines, as this becomes hard for reviewing.

path := "/my/deep/path"

reader := bytes.NewReader(srcBytes)
request, _ := http.NewRequest(http.MethodPost, "/function/xyz"+path+"?code=1", reader)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have a test with async-function as well

Copy link
Contributor

@ivanayov ivanayov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was tested with Sinatra using https://github.com/ivanayov/openfaas-sinatra-guestbook and worked well.

Needs documenting.

@telackey will you have the chance to finish the PR or you'd rather delegate this?

@ivanayov
Copy link
Contributor

The branch needs rebase from master.

// In the following regex, in the case of a match the r.URL.Path will be at `0`,
// the function name at `1` and the rest of the path (the part we are interested in)
// at `2`. For this transformer, all we need to do is confirm it is a function.
if 3 == len(parts) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar code is repeated on 3 places returning different result. Would be better to go in a single method to make the easier support in future.

// will be the service name we need, and at `2` the rest of the path.
matcher := functionMatcher.Copy()
matches := matcher.FindStringSubmatch(urlValue)
if 3 == len(matches) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in Alex's comment bellow, let's not use a number for this check, but edit for better readability.
One may be confused why there's a check for length of 3 to take the 2nd element and may edit and break it in future.

@telackey
Copy link
Contributor Author

telackey commented Aug 23, 2018

@ivanayov

will you have the chance to finish the PR or you'd rather delegate this?

If you'd like to finish it up with your notes, I have no objection at all.

@@ -170,7 +181,7 @@ func main() {

metricsHandler := metrics.PrometheusHandler()
r.Handle("/metrics", metricsHandler)
r.HandleFunc("/healthz", handlers.MakeForwardingProxyHandler(reverseProxy, forwardingNotifiers, urlResolver)).Methods(http.MethodGet)
r.HandleFunc("/healthz", handlers.MakeForwardingProxyHandler(reverseProxy, forwardingNotifiers, urlResolver, urlTransformer)).Methods(http.MethodGet)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does healthz take a urlTransformer? Surely it's only needed for the function endpoints and if it is un-used, could we pass nil?

@alexellis
Copy link
Member

Sorry for the misunderstanding but we don't "delegate" finishing/fixings PRs to other contributors.

@telackey if you can rebase your PR ASAP I'll merge it and then sort out the other feedback.

In the meantime I've asked @ivanayov to test the async-function path-passing since you said you're not in a position to test it.

Alex

@alexellis
Copy link
Member

Please could you explain the difference between these two transformers?

	urlTransformer := handlers.FunctionPathTruncatingURLPathTransformer{}
	functionURLTransformer = handlers.FunctionPrefixTrimmingURLPathTransformer{}

I understand that the functionURLTransformer will trim the /function/ prefix from calls upstream, but why is a urlTransformer needed on the calls which route to /system/functions etc and to healthz?

@telackey
Copy link
Contributor Author

The path truncating one strips out the path as did the previous behavior.

…atchdog.

Previously, only the query string of the URL was passed through the Gateway.
With this change, the entire path requested by the client is passed through as well as the query string.

While fwatchdog already supported passing the path through, in practice this would not happen
since the Gateway would have swallowed it before forwarding the request to the watchdog.

With this change, the path portion after the function name is added to the Http_Path environment
variable, provided that cgiHeaders are enabled.  This is similar to the of-watchdog equivalent.

Signed-off-by: Thomas E Lackey <[email protected]>
…tion prefix trimming instead of baking it in. Also add a configuration option, 'pass_url_path_to_functions' to control whether the full path is passed to the functions or not.

Signed-off-by: Thomas E Lackey <[email protected]>
…ync functions as well.

Signed-off-by: Thomas E Lackey <[email protected]>
alexellis added a commit that referenced this pull request Aug 24, 2018
This reviews the code and fixes up suggestions made by team for
the HTTP paths PR #789.

- Removed feature-flag (this is backwards-compatible, so I see
no value in adding the flag)
- There was a URL transform happening for calls proxied to the
back end, I changed this for the nil-transform - i.e. it does not
change anything in the URL
- Introduced variables to describe the regex indicies used in
the URL trimming.

Tested with Docker Swarm with a ruby-microservice, with
system calls and with function calls using the UI.

Signed-off-by: Alex Ellis (VMware) <[email protected]>
@telackey
Copy link
Contributor Author

@alexellis @ivanayov

Rebased. Thank you both for all your help!

@alexellis
Copy link
Member

I've opened a new PR to finish the work.. please can you check everything is still as expected?

@alexellis
Copy link
Member

alexellis commented Aug 24, 2018

Moved over to #832, this PR will be closed automatically when the new PR is merged. Your authorship and commits will still show as from yourself.

alexellis added a commit that referenced this pull request Aug 29, 2018
This reviews the code and fixes up suggestions made by team for
the HTTP paths PR #789.

- Removed feature-flag (this is backwards-compatible, so I see
no value in adding the flag)
- There was a URL transform happening for calls proxied to the
back end, I changed this for the nil-transform - i.e. it does not
change anything in the URL
- Introduced variables to describe the regex indicies used in
the URL trimming.

Tested with Docker Swarm with a ruby-microservice, with
system calls and with function calls using the UI.

Signed-off-by: Alex Ellis (VMware) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants