Can't login to dashboard & connect new devices with Netbird 0.30/0.31 & Authentik 2024.10 #2847

Open
mradermaxlol opened this issue Nov 6, 2024 · 4 comments

@mradermaxlol

mradermaxlol commented Nov 6, 2024

Describe the problem

I have an installation of Netbird (with PostgreSQL 16.4 as db backend) running with rootless podman(-compose). It's been working fine since 0.28.something paired with Authentik 2024.8. That installation has survived a couple updates just fine. Authentik was configured as Netbird's selfhosting wiki page says.

Recently I've updated and restarted both Netbird and Authentik to 0.30.3 (and then to 0.31.0) and 2024.10, respectively. The dashboard stopped logging users in, displaying an "Unauthorized" message with a logout button. The IDM itself is just fine.
At the same time existing clients can (re)connect to this Netbird instance; initiating new log-ins (netbird up --admin-url $URL --management-url $URL) results in the following error being returned:
2024-11-06T10:10:35+03:00 WARN client/cmd/root.go:244: retrying Login to the Management service in 606.5454ms due to error rpc error: code = Unknown desc = no SSO provider returned from management. Please proceed with setting up this device using setup keys https://docs.netbird.io/how-to/register-machines-using-setup-keys

Rolling Netbird/Authentik back doesn't help at all. No errors to be found in the logs.
I think it might have something to do with the Authentik side of things. 2024.10 introduced a built-in captcha stage and opt-in token encryption. Maybe some inner workings have also been changed, but I haven't seen that in the changelogs.

Considering that rolling versions back doesn't help, could something in the database have gotten messed up? Note that I've only tried rolling Netbird back from 0.30.Y to 0.30.X, before the upgrade to 0.31.0. After the upgrade to 0.31.0 I haven't tried rolling back to 0.30.X, so no database issues should arise from that in particular.

To Reproduce

Steps to reproduce the behavior:

  1. Use latest Netbird (0.31.0; or try to upgrade from 0.30.0 first)
  2. Use latest Authentik (2024.10; or try to upgrade from 2024.08 first)
  3. Try to log into dashboard or connect a new device
  4. Experience the error described above

Expected behavior

Dashboard opens as normal. New logins happen as normal.

Are you using NetBird Cloud?

No, it's a self-hosted instance.

NetBird version

0.31.0

NetBird status -dA output:

None.

Do you face any (non-mobile) client issues?

Yes (new logins fail, existing logins work fine).

Screenshots

image

In the Network debugging section I can see a request to https://netbird.some.fqdn/api/users failing with HTTP code 401:
image
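
A 401 from /api/users usually means the management service rejected the bearer token the dashboard sent. One way to narrow this down is to copy the token from the failing request's Authorization header and inspect its claims. The snippet below is a minimal sketch, not NetBird's own validation logic: it decodes the payload without verifying the signature, and the helper names (decode_jwt_payload, audience_matches) are made up for illustration. It assumes the token is a plain signed JWT with three dot-separated parts; an encrypted token (JWE compact form has five parts) is rejected with an error, which would itself be a useful hint given that Authentik 2024.10 added opt-in token encryption.

```python
import base64
import json


def decode_jwt_payload(token):
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        # Signed JWTs have 3 parts; anything else (e.g. a 5-part JWE) is not handled here.
        raise ValueError("expected a three-part JWT (header.payload.signature)")
    payload = parts[1]
    # base64url segments omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def audience_matches(claims, expected):
    """Return True if `expected` appears in the token's `aud` claim."""
    aud = claims.get("aud", [])
    if isinstance(aud, str):
        aud = [aud]
    return expected in aud
```

With the decoded claims in hand, `aud` can be compared against AUTH_AUDIENCE and `iss` against AUTH_AUTHORITY from netbird-dashboard.env. Whether a mismatch there is the actual cause of this 401 is an assumption to test, not a confirmed diagnosis.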

Additional context
Netbird and Authentik are hosted behind a reverse proxy. Could that be related? It worked fine before the upgrade, though.
Reverse proxy config:

server {
    include upstreams/upstreams_netbird.conf;

    listen 443 ssl;
    server_name netbird.some.fqdn;

    # https://stackoverflow.com/a/67805465
    client_header_timeout 1d;
    client_body_timeout 1d;

    # netbird-dashboard
    location / {
        include misc/headers.conf;
        include misc/proxy.conf;
        proxy_pass http://$upstream_netbird_dashboard;
    }

    # netbird-signal-grpc
    location /signalexchange.SignalExchange/ {
        grpc_read_timeout 1d;
        grpc_send_timeout 1d;
        grpc_socket_keepalive on;
        grpc_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        grpc_pass grpc://$upstream_netbird_signal_grpc;
    }

    # netbird-management
    location /api {
        include misc/headers.conf;
        include misc/proxy.conf;
        proxy_pass http://$upstream_netbird_management_http_grpc;
    }

    # netbird-management-grpc
    location /management.ManagementService/ {
        grpc_read_timeout 1d;
        grpc_send_timeout 1d;
        grpc_socket_keepalive on;
        grpc_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        grpc_pass grpc://$upstream_netbird_management_http_grpc;
    }

    # netbird-relay
    location /relay/ {
        include misc/headers.conf;
        include misc/proxy.conf;
        proxy_pass http://$upstream_netbird_relay/relay;
    }
}
@stevo11811

stevo11811 commented Nov 6, 2024

Commenting so I can track this. I installed 0.31 yesterday, and while the existing clients kept working, new clients would not connect; a client that was disconnected and then reconnected would also fail. Due to the database changes I was forced to restore the entire virtual machine to downgrade.

EDIT: I suspect your downgrade issues are due to the database changes as noted in the github changelog "Because of a database migration where the setup-keys are being hashed, a downgrade is no longer possible without restoring a backup. So, testing and making"

@mradermaxlol
Author

EDIT: I suspect your downgrade issues are due to the database changes as noted in the github changelog "Because of a database migration where the setup-keys are being hashed, a downgrade is no longer possible without restoring a backup. So, testing and making"

I have explicitly noted that I have not tried downgrading from 0.31.0 to 0.30.X due to those very changes.

@mradermaxlol
Author

I tried dropping Netbird's PostgreSQL DB, and dashboard & auth started working again. Diffing the original database dump against the fresh one turned up nothing: things look normal, though the order of some columns has changed, e.g.:
(original DB)
-COPY public.setup_keys (id, account_id, key, name, type, created_at, expires_at, updated_at, revoked, used_times, last_used, auto_groups, usage_limit, ephemeral, key_secret) FROM stdin;

(fresh DB)
+COPY public.setup_keys (id, account_id, key, key_secret, name, type, created_at, expires_at, updated_at, revoked, used_times, last_used, auto_groups, usage_limit, ephemeral) FROM stdin;

Is this not normal?
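
To check whether two such COPY lines differ only in column order, the column lists can be compared as sets. This is a minimal sketch with hypothetical helper names (copy_columns, compare_columns); it only parses the parenthesised column list of a pg_dump COPY line and tolerates a leading diff marker.

```python
import re


def copy_columns(line):
    """Extract the column names from a 'COPY table (col, ...) FROM stdin;' line."""
    match = re.search(r"COPY\s+\S+\s+\(([^)]*)\)", line)
    if match is None:
        raise ValueError("not a COPY ... (columns) statement")
    return [col.strip() for col in match.group(1).split(",")]


def compare_columns(old_line, new_line):
    """Report columns unique to either side and whether the ordering differs."""
    old, new = copy_columns(old_line), copy_columns(new_line)
    return {
        "only_in_old": sorted(set(old) - set(new)),
        "only_in_new": sorted(set(new) - set(old)),
        "order_differs": old != new,
    }
```

For the two lines above, both sets are identical and only the position of key_secret differs. Because COPY with an explicit column list maps values by column name, a different order on its own should not affect a restore. A plausible explanation (an assumption, not confirmed by the changelog) is that the migration appended key_secret to the existing table, while a fresh database creates the table with the column already in place.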

@mradermaxlol
Author

mradermaxlol commented Nov 8, 2024

dashboard & auth started working again

Never mind, only auth with setup keys came back up. Output of netbird up --admin-url https://netbird.some.fqdn --management-url https://netbird.some.fqdn:

WARN client/cmd/root.go:244: retrying Login to the Management service in 1.394506528s due to error rpc error: code = Unknown desc = no SSO provider returned from management. Please proceed with setting up this device using setup keys https://docs.netbird.io/how-to/register-machines-using-setup-keys
[...]
Error: login backoff cycle failed: rpc error: code = Unknown desc = no SSO provider returned from management. Please proceed with setting up this device using setup keys https://docs.netbird.io/how-to/register-machines-using-setup-keys

I can see the following in Netbird service logs while trying to auth a client:

WARN [context: GRPC, requestID: REQUEST_UUID, accountID: UNKNOWN, peerID: PEER_ID] management/server/grpcserver.go:426: failed logging in peer PEER_ID: no peer auth method provided, please use a setup key or interactive SSO login

Apparently, that happens when trying to connect a Linux client. The Android client connects just fine; the app redirects me to the SSO page correctly.
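
Since one client type reaches the SSO page and another does not, one thing worth ruling out is whether the identity provider still advertises every endpoint an interactive login could need. The sketch below reads the standard OIDC discovery document and lists missing entries. It rests on assumptions: that the issuer is the AUTH_AUTHORITY URL from netbird-dashboard.env, that the Linux client depends on either the authorization-code or the device-authorization flow, and the function names are hypothetical. It says nothing about what the management service itself returns to the client.

```python
import json
import urllib.request

# Endpoints defined by OIDC Discovery and RFC 8628 (device authorization) metadata.
FLOW_ENDPOINTS = (
    "authorization_endpoint",
    "token_endpoint",
    "device_authorization_endpoint",
)


def fetch_discovery(issuer):
    """Fetch <issuer>/.well-known/openid-configuration and parse it as JSON."""
    url = issuer.rstrip("/") + "/.well-known/openid-configuration"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def missing_endpoints(discovery):
    """Return the flow endpoints that are absent or empty in the discovery document."""
    return [key for key in FLOW_ENDPOINTS if not discovery.get(key)]
```

For example, `missing_endpoints(fetch_discovery("https://idm.some.fqdn/application/o/netbird/"))` returning an empty list would suggest the IdP side is advertising all three endpoints, pointing the investigation back at the management configuration or the local client.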

My compose.yml for Netbird is:

services:
  netbird-dashboard:
    image: netbirdio/dashboard:v2.7.0
    restart: always
    env_file: ../../env/netbird/netbird-dashboard.env
    logging:
      driver: none
    healthcheck:
      start_period: 5s
      timeout: 3s
      interval: 15s
      retries: 2
      test: "curl -f http://netbird-dashboard:80 || exit 1"
    networks:
      - dmz_reverse_proxy

  netbird-signal:
    image: netbirdio/signal:0.31.0
    restart: always
    # healthcheck:
      # start_period: 5s
      # timeout: 3s
      # interval: 15s
      # retries: 2
      # test: "ps aux | grep -v grep | grep -q netbird-signal || exit 1"
    volumes:
      - netbird-signal_data:/var/lib/netbird:Z,rw
    networks:
      - dmz_reverse_proxy

  netbird-management:
    image: netbirdio/management:0.31.0
    restart: always
    env_file: ../../env/netbird/netbird-management.env
    command: [
      "--port", "33073",
      "--metrics-port", "9090",
      "--log-file", "console",
      "--log-level", "warn",
      "--disable-single-account-mode=false",
      "--disable-anonymous-metrics=true",
      "--disable-geolite-update=true",
      "--single-account-mode-domain=netbird.ldap.domain",
      "--dns-domain=netbird.ldap.domain"
    ]
    depends_on:
      - netbird-dashboard
    healthcheck:
      start_period: 5s
      timeout: 3s
      interval: 15s
      retries: 2
      test: "ps aux | grep -v grep | grep -q netbird-mgmt || exit 1"
    volumes:
      - /mnt/containers/sockets/common:/mnt/sockets/common:z,rw
      - ./netbird_config/management.json:/etc/netbird/management.json:Z,rw
      - netbird-management_data:/var/lib/netbird:Z,rw
    networks:
      - dmz_reverse_proxy

  netbird-relay:
    image: netbirdio/relay:0.31.0
    restart: always
    env_file: ../../env/netbird/netbird-relay.env
    # healthcheck:
      # start_period: 5s
      # timeout: 3s
      # interval: 15s
      # retries: 2
      # test: "ps aux | grep -v grep | grep -q netbird-relay || exit 1"
    networks:
      - dmz_reverse_proxy

volumes:
  netbird-management_data:
  netbird-signal_data:

networks:
  dmz_reverse_proxy:
    external: true

netbird-dashboard.env:

# Endpoints
NETBIRD_MGMT_API_ENDPOINT=https://netbird.some.fqdn
NETBIRD_MGMT_GRPC_API_ENDPOINT=https://netbird.some.fqdn

# OIDC
AUTH_AUDIENCE=[CLIENT_ID]
AUTH_CLIENT_ID=[CLIENT_ID]
AUTH_CLIENT_SECRET=
AUTH_AUTHORITY=https://idm.some.fqdn/application/o/netbird/
USE_AUTH0=false
AUTH_SUPPORTED_SCOPES=openid profile email offline_access api
AUTH_REDIRECT_URI=
AUTH_SILENT_REDIRECT_URI=
NETBIRD_TOKEN_SOURCE=accessToken

# configuration for reverse-proxy-side SSL/TLS
NGINX_SSL_PORT=443
LETSENCRYPT_DOMAIN=
LETSENCRYPT_EMAIL=

netbird-management.env:

NETBIRD_STORE_ENGINE_POSTGRES_DSN=host=/mnt/sockets/common port=5432 user=NETBIRD_DB_USER password=NETBIRD_DB_USER_PASSWORD dbname=NETBIRD_DB

netbird-relay.env:

NB_LOG_LEVEL=warning
NB_LISTEN_ADDRESS=:20000
NB_EXPOSED_ADDRESS=rels://netbird.some.fqdn:443/relay
NB_AUTH_SECRET=[AUTH_SECRET]

EDIT: apparently, it's only my local machine that fails to connect. Deleting config.json and /var/lib/netbird does not help somehow. Will try to investigate a bit.
