Commit 050ac1c

--rewrite-host-header flag for reverse proxy (#1492)
* Rewrite Host header during reverse proxy
* bring back `VERIFIED6`
* Lint fixes
* `--rewrite-host-header` flag
* Pass `--rewrite-host-header` for integration tests
* expect `httpbingo.org` as header now due to host rewrite
* Also pass flag during build & test suite
1 parent 9077c16 commit 050ac1c

10 files changed, +133 -39 lines changed

.github/workflows/test-library.yml

+1
@@ -610,6 +610,7 @@ jobs:
610610
--enable-web-server
611611
--enable-reverse-proxy
612612
--plugin proxy.plugin.ReverseProxyPlugin
613+
--rewrite-host-header
613614
&&
614615
./tests/integration/test_integration.sh 8899
615616

README.md

+36-3
@@ -1075,7 +1075,7 @@ following `Nginx` config:
10751075

10761076
```console
10771077
location /get {
1078-
proxy_pass http://httpbin.org/get
1078+
proxy_pass http://httpbin.org/get;
10791079
}
10801080
```
10811081

@@ -1094,6 +1094,36 @@ Verify using `curl -v localhost:8899/get`:
10941094
}
10951095
```
10961096

1097+
#### Rewrite Host Header
1098+
1099+
With the above example, you may sometimes see:
1100+
1101+
```console
1102+
>
1103+
* Empty reply from server
1104+
* Closing connection
1105+
curl: (52) Empty reply from server
1106+
```
1107+
1108+
This is happening because our default reverse proxy plugin `ReverseProxyPlugin` is configured
1109+
with a `http` and a `https` upstream server. And, by default `ReverseProxyPlugin` preserves the
1110+
original host header. While this works with `https` upstreams, this doesn't work reliably with
1111+
`http` upstreams. To work around this problem, use the `--rewrite-host-header` flag.
1112+
1113+
Example:
1114+
1115+
1116+
```console
1117+
proxy --enable-reverse-proxy \
1118+
--plugins proxy.plugin.ReverseProxyPlugin \
1119+
--rewrite-host-header
1120+
```
1121+
1122+
This will ensure that the `Host` header field is set to `httpbin.org` and works with both `http` and
1123+
`https` upstreams.
1124+
1125+
> NOTE: Whether to use `--rewrite-host-header` or not depends upon your use-case.
1126+
10971127
## Plugin Ordering
10981128

10991129
When using multiple plugins, depending upon plugin functionality,
@@ -2613,7 +2643,7 @@ usage: -m [-h] [--tunnel-hostname TUNNEL_HOSTNAME] [--tunnel-port TUNNEL_PORT]
26132643
[--proxy-pool PROXY_POOL] [--enable-web-server]
26142644
[--enable-static-server] [--static-server-dir STATIC_SERVER_DIR]
26152645
[--min-compression-length MIN_COMPRESSION_LENGTH]
2616-
[--enable-reverse-proxy] [--enable-metrics]
2646+
[--enable-reverse-proxy] [--rewrite-host-header] [--enable-metrics]
26172647
[--metrics-path METRICS_PATH] [--pac-file PAC_FILE]
26182648
[--pac-file-url-path PAC_FILE_URL_PATH]
26192649
[--cloudflare-dns-mode CLOUDFLARE_DNS_MODE]
@@ -2622,7 +2652,7 @@ usage: -m [-h] [--tunnel-hostname TUNNEL_HOSTNAME] [--tunnel-port TUNNEL_PORT]
26222652
[--filtered-client-ips FILTERED_CLIENT_IPS]
26232653
[--filtered-url-regex-config FILTERED_URL_REGEX_CONFIG]
26242654

2625-
proxy.py v2.4.6.dev25+g2754b928.d20240812
2655+
proxy.py v2.4.8.dev8+gc703edac.d20241013
26262656

26272657
options:
26282658
-h, --help show this help message and exit
@@ -2791,6 +2821,9 @@ options:
27912821
response that will be compressed (gzipped).
27922822
--enable-reverse-proxy
27932823
Default: False. Whether to enable reverse proxy core.
2824+
--rewrite-host-header
2825+
Default: False. If used, reverse proxy server will
2826+
rewrite Host header field before sending to upstream.
27942827
--enable-metrics Default: False. Enables metrics.
27952828
--metrics-path METRICS_PATH
27962829
Default: /metrics. Web server path to serve proxy.py

proxy/common/constants.py

+1
@@ -165,6 +165,7 @@ def _env_threadless_compliant() -> bool:
165165
if sys.version_info >= (3, 10)
166166
else (ssl.OP_NO_SSLv2 | ssl.OP_NO_SSLv3 | ssl.OP_NO_TLSv1 | ssl.OP_NO_TLSv1_1)
167167
)
168+
DEFAULT_ENABLE_REWRITE_HOST = False
168169

169170
DEFAULT_DEVTOOLS_DOC_URL = 'http://proxy'
170171
DEFAULT_DEVTOOLS_FRAME_ID = secrets.token_hex(8)

proxy/common/utils.py

+4-2
@@ -25,8 +25,6 @@
2525
from types import TracebackType
2626
from typing import Any, Dict, List, Type, Tuple, Callable, Optional
2727

28-
import _ssl # noqa: WPS436
29-
3028
from .types import HostPort
3129
from .constants import (
3230
CRLF, COLON, HTTP_1_1, IS_WINDOWS, WHITESPACE, DEFAULT_TIMEOUT,
@@ -42,6 +40,9 @@
4240

4341
def cert_der_to_dict(der: Optional[bytes]) -> Dict[str, Any]:
4442
"""Parse a DER formatted certificate to a python dict"""
43+
# pylint: disable=import-outside-toplevel
44+
import _ssl # noqa: WPS436
45+
4546
if not der:
4647
return {}
4748
with tempfile.NamedTemporaryFile(delete=False) as cert_file:
@@ -322,6 +323,7 @@ def set_open_file_limit(soft_limit: int) -> None:
322323
if IS_WINDOWS: # pragma: no cover
323324
return
324325

326+
# pylint: disable=possibly-used-before-assignment
325327
curr_soft_limit, curr_hard_limit = resource.getrlimit(
326328
resource.RLIMIT_NOFILE,
327329
)

proxy/core/connection/connection.py

+3-3
@@ -49,7 +49,6 @@ def connection(self) -> TcpOrTlsSocket:
4949

5050
def send(self, data: Union[memoryview, bytes]) -> int:
5151
"""Users must handle BrokenPipeError exceptions"""
52-
# logger.info(data.tobytes())
5352
return self.connection.send(data)
5453

5554
def recv(
@@ -67,7 +66,7 @@ def recv(
6766
return memoryview(data)
6867

6968
def close(self) -> bool:
70-
if not self.closed:
69+
if not self.closed and self.connection:
7170
self.connection.close()
7271
self.closed = True
7372
return self.closed
@@ -97,8 +96,9 @@ def flush(self, max_send_size: Optional[int] = None) -> int:
9796
self._num_buffer -= 1
9897
else:
9998
self.buffer[0] = mv[sent:]
100-
del mv
10199
logger.debug('flushed %d bytes to %s' % (sent, self.tag))
100+
# logger.info(mv[:sent].tobytes())
101+
del mv
102102
return sent
103103

104104
def is_reusable(self) -> bool:

proxy/http/parser/parser.py

+22-6
@@ -283,7 +283,12 @@ def parse(
283283
self.state = httpParserStates.COMPLETE
284284
self.buffer = None if raw == b'' else raw
285285

286-
def build(self, disable_headers: Optional[List[bytes]] = None, for_proxy: bool = False) -> bytes:
286+
def build(
287+
self,
288+
disable_headers: Optional[List[bytes]] = None,
289+
for_proxy: bool = False,
290+
host: Optional[bytes] = None,
291+
) -> bytes:
287292
"""Rebuild the request object."""
288293
assert self.method and self.version and self.type == httpParserTypes.REQUEST_PARSER
289294
if disable_headers is None:
@@ -301,11 +306,22 @@ def build(self, disable_headers: Optional[List[bytes]] = None, for_proxy: bool =
301306
path
302307
) if not self._is_https_tunnel else (self.host + COLON + str(self.port).encode())
303308
return build_http_request(
304-
self.method, path, self.version,
305-
headers={} if not self.headers else {
306-
self.headers[k][0]: self.headers[k][1] for k in self.headers if
307-
k.lower() not in disable_headers
308-
},
309+
self.method,
310+
path,
311+
self.version,
312+
headers=(
313+
{}
314+
if not self.headers
315+
else {
316+
self.headers[k][0]: (
317+
self.headers[k][1]
318+
if host is None or self.headers[k][0].lower() != b'host'
319+
else host
320+
)
321+
for k in self.headers
322+
if k.lower() not in disable_headers
323+
}
324+
),
309325
body=body,
310326
no_ua=True,
311327
)
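
Note on the hunk above: the new `host` argument only changes the value of an existing `Host` header; every other header is passed through unchanged, and headers listed in `disable_headers` are still dropped. The following is a minimal standalone sketch of that dictionary comprehension, not the library's code. It assumes headers are stored as a mapping from lower-cased name to an `(original name, value)` pair, as the diff suggests; `rebuild_headers` and the sample values are hypothetical names used only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Assumed shape of the parsed headers: lower-cased key -> (original name, value).
Headers = Dict[bytes, Tuple[bytes, bytes]]


def rebuild_headers(
    headers: Headers,
    disable_headers: Optional[List[bytes]] = None,
    host: Optional[bytes] = None,
) -> Dict[bytes, bytes]:
    """Return outgoing headers, replacing the Host value when `host` is given."""
    disabled = disable_headers or []
    return {
        # Only the Host header value is swapped; its original casing is kept.
        name: (host if host is not None and name.lower() == b'host' else value)
        for key, (name, value) in headers.items()
        if key.lower() not in disabled
    }


# Hypothetical request headers as a client would send them to the proxy.
incoming: Headers = {
    b'host': (b'Host', b'localhost:8899'),
    b'accept': (b'Accept', b'*/*'),
}

rewritten = rebuild_headers(incoming, host=b'httpbin.org')
untouched = rebuild_headers(incoming)
```

With `host=None` the mapping is rebuilt as before the change, which is why the flag can default to off without altering existing behaviour.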

proxy/http/server/reverse.py

+21-8
@@ -17,10 +17,10 @@
1717
from proxy.core.base import TcpUpstreamConnectionHandler
1818
from proxy.http.parser import HttpParser
1919
from proxy.http.server import HttpWebServerBasePlugin
20-
from proxy.common.utils import text_
20+
from proxy.common.utils import text_, bytes_
2121
from proxy.http.exception import HttpProtocolException
2222
from proxy.common.constants import (
23-
HTTPS_PROTO, DEFAULT_HTTP_PORT, DEFAULT_HTTPS_PORT,
23+
COLON, HTTP_PROTO, HTTPS_PROTO, DEFAULT_HTTP_PORT, DEFAULT_HTTPS_PORT,
2424
DEFAULT_REVERSE_PROXY_ACCESS_LOG_FORMAT,
2525
)
2626
from ...common.types import Readables, Writables, Descriptors
@@ -111,23 +111,36 @@ def handle_request(self, request: HttpParser) -> None:
111111
assert self.choice and self.choice.hostname
112112
port = (
113113
self.choice.port or DEFAULT_HTTP_PORT
114-
if self.choice.scheme == b'http'
115-
else DEFAULT_HTTPS_PORT
114+
if self.choice.scheme == HTTP_PROTO
115+
else self.choice.port or DEFAULT_HTTPS_PORT
116116
)
117117
self.initialize_upstream(text_(self.choice.hostname), port)
118118
assert self.upstream
119119
try:
120120
self.upstream.connect()
121121
if self.choice.scheme == HTTPS_PROTO:
122122
self.upstream.wrap(
123-
text_(
124-
self.choice.hostname,
125-
),
123+
text_(self.choice.hostname),
126124
as_non_blocking=True,
127125
ca_file=self.flags.ca_file,
128126
)
129127
request.path = self.choice.remainder
130-
self.upstream.queue(memoryview(request.build()))
128+
self.upstream.queue(
129+
memoryview(
130+
request.build(
131+
host=(
132+
self.choice.hostname
133+
+ (
134+
COLON + bytes_(self.choice.port)
135+
if self.choice.port is not None
136+
else b''
137+
)
138+
if self.flags.rewrite_host_header
139+
else None
140+
),
141+
),
142+
),
143+
)
131144
except ConnectionRefusedError:
132145
raise HttpProtocolException( # pragma: no cover
133146
'Connection refused by upstream server {0}:{1}'.format(
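
Note on the hunk above: it makes two changes. An explicit port on an `https` upstream is now honoured instead of always falling back to the default HTTPS port, and the rewritten `Host` value is built as `hostname[:port]`, with the port appended only when the upstream URL names one. The sketch below restates both rules in isolation. It is an illustration under assumptions, not the plugin's code: it uses the standard library's `urlsplit` in place of the project's own URL type, and `upstream_port` and `host_header` are hypothetical helper names.

```python
from typing import Optional
from urllib.parse import urlsplit

# Conventional default ports, assumed to match the project's constants.
DEFAULT_HTTP_PORT = 80
DEFAULT_HTTPS_PORT = 443


def upstream_port(scheme: bytes, port: Optional[int]) -> int:
    """Pick the port to connect to: explicit port first, else the scheme default."""
    if port:
        return port
    return DEFAULT_HTTP_PORT if scheme == b'http' else DEFAULT_HTTPS_PORT


def host_header(hostname: bytes, port: Optional[int], rewrite: bool) -> Optional[bytes]:
    """Build the rewritten Host value, or None to keep the client's Host header."""
    if not rewrite:
        return None
    suffix = b':' + str(port).encode() if port is not None else b''
    return hostname + suffix


# Two hypothetical upstreams: one without and one with an explicit port.
plain = urlsplit(b'http://httpbin.org/get')
custom = urlsplit(b'https://localhost:8443/get')

plain_host = host_header(plain.hostname, plain.port, rewrite=True)
custom_host = host_header(custom.hostname, custom.port, rewrite=True)
custom_port = upstream_port(custom.scheme, custom.port)
```

Returning `None` when the flag is off mirrors the `host=None` default of `build()`, so the request is forwarded with the client's original `Host` value.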

proxy/http/server/web.py

+13-2
@@ -29,8 +29,9 @@
2929
from ...common.utils import text_, build_websocket_handshake_response
3030
from ...common.constants import (
3131
DEFAULT_ENABLE_WEB_SERVER, DEFAULT_STATIC_SERVER_DIR,
32-
DEFAULT_ENABLE_REVERSE_PROXY, DEFAULT_ENABLE_STATIC_SERVER,
33-
DEFAULT_WEB_ACCESS_LOG_FORMAT, DEFAULT_MIN_COMPRESSION_LENGTH,
32+
DEFAULT_ENABLE_REWRITE_HOST, DEFAULT_ENABLE_REVERSE_PROXY,
33+
DEFAULT_ENABLE_STATIC_SERVER, DEFAULT_WEB_ACCESS_LOG_FORMAT,
34+
DEFAULT_MIN_COMPRESSION_LENGTH,
3435
)
3536

3637

@@ -78,6 +79,16 @@
7879
help='Default: False. Whether to enable reverse proxy core.',
7980
)
8081

82+
flags.add_argument(
83+
'--rewrite-host-header',
84+
action='store_true',
85+
default=DEFAULT_ENABLE_REWRITE_HOST,
86+
help='Default: '
87+
+ str(DEFAULT_ENABLE_REWRITE_HOST)
88+
+ '. '
89+
+ 'If used, reverse proxy server will rewrite Host header field before sending to upstream.',
90+
)
91+
8192

8293
class HttpWebServerPlugin(HttpProtocolHandlerPlugin):
8394
"""HttpProtocolHandler plugin which handles incoming requests to local web server."""

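Note on the hunk above: the flag is a plain boolean switch, off unless it is passed on the command line. The sketch below shows the same pattern with the standard library's `argparse` directly. It is only an approximation: the project registers the flag through its own `flags` object, and the parser name `proxy-sketch` is made up for this example.

```python
import argparse

# Mirrors the constant added in proxy/common/constants.py.
DEFAULT_ENABLE_REWRITE_HOST = False

# Hypothetical standalone parser; the real flag is registered on the project's flags object.
parser = argparse.ArgumentParser(prog='proxy-sketch')
parser.add_argument(
    '--rewrite-host-header',
    action='store_true',
    default=DEFAULT_ENABLE_REWRITE_HOST,
    help='Default: ' + str(DEFAULT_ENABLE_REWRITE_HOST) + '. '
    + 'If used, reverse proxy server will rewrite Host header field '
    + 'before sending to upstream.',
)

# argparse exposes the flag as the attribute `rewrite_host_header`.
off = parser.parse_args([])
on = parser.parse_args(['--rewrite-host-header'])
```

The attribute name `rewrite_host_header` is what the reverse proxy hunk reads as `self.flags.rewrite_host_header`.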
tests/integration/test_integration.py

+22-10
@@ -181,18 +181,30 @@ def proxy_py_subprocess(request: Any) -> Generator[int, None, None]:
181181
ca_cert_dir = TEMP_DIR / ('certificates-%s' % run_id)
182182
os.makedirs(ca_cert_dir, exist_ok=True)
183183
proxy_cmd = (
184-
sys.executable, '-m', 'proxy',
185-
'--hostname', '127.0.0.1',
186-
'--port', '0',
187-
'--port-file', str(port_file),
184+
sys.executable,
185+
'-m',
186+
'proxy',
187+
'--hostname',
188+
'127.0.0.1',
189+
'--port',
190+
'0',
191+
'--port-file',
192+
str(port_file),
188193
'--enable-web-server',
189-
'--plugin', 'proxy.plugin.WebServerPlugin',
194+
'--plugin',
195+
'proxy.plugin.WebServerPlugin',
190196
'--enable-reverse-proxy',
191-
'--plugin', 'proxy.plugin.ReverseProxyPlugin',
192-
'--num-acceptors', '3',
193-
'--num-workers', '3',
194-
'--ca-cert-dir', str(ca_cert_dir),
195-
'--log-level', 'd',
197+
'--plugin',
198+
'proxy.plugin.ReverseProxyPlugin',
199+
'--rewrite-host-header',
200+
'--num-acceptors',
201+
'3',
202+
'--num-workers',
203+
'3',
204+
'--ca-cert-dir',
205+
str(ca_cert_dir),
206+
'--log-level',
207+
'd',
196208
) + tuple(request.param.split())
197209
proxy_proc = Popen(proxy_cmd, stderr=subprocess.STDOUT)
198210
# Needed because port file might not be available immediately

tests/integration/test_integration.sh

+10-5
@@ -16,6 +16,8 @@
1616
# For github action, we simply bank upon GitHub
1717
# to clean up any background process including
1818
# proxy.py
19+
#
20+
set -x
1921

2022
PROXY_PY_PORT=$1
2123
if [[ -z "$PROXY_PY_PORT" ]]; then
@@ -164,8 +166,14 @@ cat downloaded2.whl | $SHASUM -c downloaded2.hash
164166
VERIFIED5=$?
165167
rm downloaded2.whl downloaded2.hash
166168

169+
# Without --rewrite-host-header we will receive localhost:<port> as host header back in response
170+
# read -r -d '' REVERSE_PROXY_RESPONSE << EOM
171+
# "localhost:$PROXY_PY_PORT"
172+
# EOM
173+
174+
# With --rewrite-host-header we will receive httpbingo.org as host header back in response
167175
read -r -d '' REVERSE_PROXY_RESPONSE << EOM
168-
"localhost:$PROXY_PY_PORT"
176+
"httpbingo.org"
169177
EOM
170178

171179
echo "[Test Reverse Proxy Plugin]"
@@ -174,8 +182,5 @@ RESPONSE=$($CMD 2> /dev/null)
174182
verify_contains "$RESPONSE" "$REVERSE_PROXY_RESPONSE"
175183
VERIFIED6=$?
176184

177-
# FIXME: VERIFIED6 NOT ASSERTED BECAUSE WE STARTED GETTING EMPTY RESPONSE FROM UPSTREAM
178-
# AFTER CHANGE FROM HTTPBIN TO HTTPBINGO. This test works and passes perfectly when
179-
# run from a local system
180-
EXIT_CODE=$(( $VERIFIED1 || $VERIFIED2 || $VERIFIED3 || $VERIFIED4 || $VERIFIED5 ))
185+
EXIT_CODE=$(( $VERIFIED1 || $VERIFIED2 || $VERIFIED3 || $VERIFIED4 || $VERIFIED5 || $VERIFIED6 ))
181186
exit $EXIT_CODE
