
Upgraded Mongo to 7.0 from 5.0 #470

Merged
merged 3 commits from upgrade/mongo into master on May 20, 2024
Conversation


@Alopalao Alopalao commented May 2, 2024

Closes #451

Summary

Updated MongoDB to 7.0.

Local Tests

Updated MongoDB and ran some tests. No noticeable differences between the versions in data generation.

Update steps

Before following the update procedure, be sure to download the latest Mongo package. To inspect the active database, mongosh can be installed (for Debian 12, follow the Ubuntu 22.04 process).
To update, follow these instructions:

  1. Stop kytos
  2. From mongosh, enter db.adminCommand( { setFeatureCompatibilityVersion: "5.0" } ). This change can be checked with db.adminCommand( { getParameter: 1, featureCompatibilityVersion: 1 } )
  3. Stop the containers that kytos creates: mongo-rs-init, mongo1, mongo2 and mongo3. To check their status enter the console command docker ps -a. Example:
CONTAINER ID   IMAGE       COMMAND                  CREATED              STATUS                          PORTS                                                      NAMES
493d530e4104   mongo:6.0   "/scripts/rs-init.sh"    About a minute ago   Exited (0) About a minute ago                                                              mongo-rs-init
ad7a9149a38c   mongo:6.0   "/usr/bin/mongod --b…"   2 minutes ago        Up About a minute               0.0.0.0:27017->27017/tcp, :::27017->27017/tcp              mongo1
fbec48992a5d   mongo:6.0   "/usr/bin/mongod --b…"   2 minutes ago        Up About a minute               27017/tcp, 0.0.0.0:27018->27018/tcp, :::27018->27018/tcp   mongo2
607ee9359347   mongo:6.0   "/usr/bin/mongod --b…"   2 minutes ago        Up About a minute               27017/tcp, 0.0.0.0:27019->27019/tcp, :::27019->27019/tcp   mongo3

Stop each container with docker stop $CONTAINER_ID. E.g. docker stop 493d530e4104.
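The container-stopping step can also be done by name rather than by ID. A minimal sketch, assuming the container names shown in the docker ps output above; the DRY_RUN guard is an illustrative addition that only prints the commands:

```shell
# Stop the Mongo containers that kytos creates, by name.
# Set DRY_RUN=1 to print the commands instead of invoking docker.
stop_kytos_mongo_containers() {
    for name in mongo-rs-init mongo1 mongo2 mongo3; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "docker stop $name"
        else
            docker stop "$name"
        fi
    done
}

DRY_RUN=1 stop_kytos_mongo_containers
```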

  4. Change the kytos/docker-compose.yml mongo images to 6.0 (there are 4).
  5. Run and stop Kytos, or just run docker compose up -d.
  6. In mongosh, enter db.adminCommand( { setFeatureCompatibilityVersion: "6.0" } )
  7. Stop the containers that kytos creates.
  8. Change the kytos/docker-compose.yml mongo images to 7.0.
  9. Run Kytos or docker compose up -d. The result can be verified with the command mongosh mongo1:27017,mongo2:27018,mongo3:27019. Example output:
Current Mongosh Log ID: 6632a69a07964f2a047b2da8
Connecting to:          mongodb://127.0.0.1:27017/mongo1%3A27017%2Cmongo2%3A27018%2Cmongo3%3A27019?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+2.2.2
Using MongoDB:          7.0.8  # <-- HERE
Using Mongosh:          2.2.2
mongosh 2.2.5 is available for download: https://www.mongodb.com/try/download/shell
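Putting the steps above together, the 5.0 → 6.0 → 7.0 sequence can be sketched as a dry-run outline that only prints the commands to run. The sed call is an assumption that docker-compose.yml pins the images as mongo:&lt;version&gt;; adapt it to your file:

```shell
# Print the commands for one FCV/image upgrade hop ($1 -> $2).
upgrade_step() {
    from="$1"; to="$2"
    echo "mongosh --eval 'db.adminCommand({ setFeatureCompatibilityVersion: \"$from\" })'"
    echo "docker stop mongo-rs-init mongo1 mongo2 mongo3"
    echo "sed -i 's/mongo:$from/mongo:$to/' kytos/docker-compose.yml"
    echo "docker compose up -d"
}

# 5.0 -> 6.0, then 6.0 -> 7.0, matching the steps above.
upgrade_step 5.0 6.0
upgrade_step 6.0 7.0
```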

Discrepancies

Kytos does not pass "hello" command on MongoDB

There may be some problems with version compatibility. To check, inspect the mongo1 container: run docker logs $CONTAINER_ID | grep "Invalid feature compatibility version". In the results, check for:

Invalid feature compatibility version value '5.0'; expected '6.0' or '6.3' or '7.0'.

The compatibility version is set to 5.0, which means the mongo images in kytos/docker-compose.yml need to change to either 5.0 or 6.0 (in the case of 6.0, the containers are compatible with both 6.0 and 7.0).
Change docker-compose.yml accordingly, then stop and delete the containers. Finally, run kytos or execute docker compose up -d and continue with the update process.
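The log check can be wrapped in a small helper that extracts the reported compatibility version; a sketch with an illustrative function name, fed here with the sample log line from above:

```shell
# Extract the FCV reported in an "Invalid feature compatibility
# version" error line; reads log text on stdin.
fcv_from_logs() {
    grep "Invalid feature compatibility version" \
        | sed -n "s/.*version value '\([0-9.]*\)'.*/\1/p" \
        | head -n 1
}

echo "Invalid feature compatibility version value '5.0'; expected '6.0' or '6.3' or '7.0'." \
    | fcv_from_logs
```

In practice this would be fed from the container logs, e.g. docker logs $CONTAINER_ID | fcv_from_logs.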

APM not working

Stop and delete the containers created from docker-compose.es.yml, then create them again.

Problems with creating containers after docker compose up -d

The problems could be errors while recreating a container, or a container ID already in use. Stop and delete the containers created from docker-compose.yml.

Port 27017 or 8181 is already in use

  • Sometimes port 27017 or 8181 will be in use when running Kytos; the port needs to be free. On Linux, sudo fuser -k 27017/tcp worked (or 8181/tcp, depending on the error).
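Before killing anything with fuser, it can help to confirm the port is actually taken. A sketch that parses listening-socket text (the sample line mimics ss -ltn output; that format is an assumption, adapt to your tool of choice):

```shell
# Return success if the given TCP port appears as a listening
# socket in `ss -ltn`-style output read from stdin.
port_in_use() {
    grep -q ":$1 "
}

# Example with a canned ss-style line; in practice:
#   ss -ltn | port_in_use 27017 && sudo fuser -k 27017/tcp
printf 'LISTEN 0 128 0.0.0.0:27017 0.0.0.0:*\n' \
    | port_in_use 27017 && echo "port 27017 in use"
```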

Downgrade

Downgrading is possible depending on the compatibility version set on the database:

  • The 5.0 compatibility setting allows containers with mongo 5.0 and 6.0
  • The 6.0 compatibility setting allows containers with mongo 6.0 and 7.0

Each time before starting the containers with a different version, stop the mongo1, mongo2 and mongo3 containers running the previous version.
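The two compatibility bullets can be expressed as a tiny lookup helper (a sketch; the function name is illustrative, and the 7.0 row assumes no newer image is in use):

```shell
# Map a featureCompatibilityVersion to the mongo image versions
# whose containers can run against that database.
allowed_images_for_fcv() {
    case "$1" in
        5.0) echo "5.0 6.0" ;;
        6.0) echo "6.0 7.0" ;;
        7.0) echo "7.0" ;;
        *)   echo "unknown" ;;
    esac
}

allowed_images_for_fcv 5.0
```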

Save and restore database

Command to export:

mongodump -d <database_name> -o <directory_backup>

It will create a folder with the database name; by default, the kytos database name is napps.

Command to restore:

mongorestore -d <database_name> <directory_backup>
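The dump/restore pair can be wrapped with a dated backup directory. A sketch, assuming the default database name napps from above; DRY_RUN=1 prints the commands instead of running them, and the restore path appends the database name because mongodump creates a subfolder per database:

```shell
# Dump/restore wrappers around mongodump/mongorestore.
backup_db() {
    db="${1:-napps}"
    dir="backup-$db-$(date +%Y%m%d)"
    cmd="mongodump -d $db -o $dir"
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$cmd"; else $cmd; fi
}

restore_db() {
    db="${1:-napps}"; dir="$2"
    cmd="mongorestore -d $db $dir/$db"
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$cmd"; else $cmd; fi
}

DRY_RUN=1 backup_db napps
DRY_RUN=1 restore_db napps backup-napps-20240520
```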

End-to-End Tests

The e2e tests depend on this PR

+ python3 -m pytest tests/ --reruns 2 -r fEr
============================= test session starts ==============================
platform linux -- Python 3.11.2, pytest-8.1.1, pluggy-1.5.0
rootdir: /tests
plugins: rerunfailures-13.0, timeout-2.2.0, anyio-4.3.0
collected 263 items

tests/test_e2e_01_kytos_startup.py ..                                    [  0%]
tests/test_e2e_05_topology.py ..................                         [  7%]
tests/test_e2e_06_topology.py ....                                       [  9%]
tests/test_e2e_10_mef_eline.py ..........ss.....x.....x................  [ 24%]
tests/test_e2e_11_mef_eline.py ......                                    [ 26%]
tests/test_e2e_12_mef_eline.py .....Xx.                                  [ 29%]
tests/test_e2e_13_mef_eline.py ....Xs.s.....Xs.s.XXxX.xxxx..X........... [ 45%]
.                                                                        [ 45%]
tests/test_e2e_14_mef_eline.py x                                         [ 46%]
tests/test_e2e_15_mef_eline.py .....                                     [ 47%]
tests/test_e2e_16_mef_eline.py .                                         [ 48%]
tests/test_e2e_20_flow_manager.py .....................                  [ 56%]
tests/test_e2e_21_flow_manager.py ...                                    [ 57%]
tests/test_e2e_22_flow_manager.py ...............                        [ 63%]
tests/test_e2e_23_flow_manager.py ..............                         [ 68%]
tests/test_e2e_30_of_lldp.py ....                                        [ 69%]
tests/test_e2e_31_of_lldp.py ...                                         [ 71%]
tests/test_e2e_32_of_lldp.py ...                                         [ 72%]
tests/test_e2e_40_sdntrace.py ..............                             [ 77%]
tests/test_e2e_41_kytos_auth.py ........                                 [ 80%]
tests/test_e2e_42_sdntrace.py ..                                         [ 81%]
tests/test_e2e_50_maintenance.py ............................            [ 92%]
tests/test_e2e_60_of_multi_table.py .....                                [ 93%]
tests/test_e2e_70_kytos_stats.py ........                                [ 96%]
tests/test_e2e_80_pathfinder.py ss......                                 [100%]

@Alopalao Alopalao requested a review from a team as a code owner May 2, 2024 19:03

@viniarck viniarck left a comment


Fantastic to have MongoDB 7.0, mitigating the upcoming EOL issue.

Overall looks good @Alopalao, and we can rely on this PR here to link as a doc for the upgrade, but there are still points to be refined/documented:

  • Add a General Information section in the changelog stating that MongoDB 7.0 is the recommended, tested, and supported version. Other versions can of course still work, but highlight that 7.0 will be the one actively maintained and tested, and also add a link to this PR URL for more information about how to upgrade.

  • When following the steps to upgrade I hit an issue it didn't recover from: containers kept restarting. I had to use db.adminCommand( { setFeatureCompatibilityVersion: "6.0" } ) before upgrading the images to 7.0, so double check whether steps 7 and 8 need to be reordered. Also, there are steps saying to start and stop kytosd again; there is no need to start kytosd during 6.0, since it is just a transitive version for the upgrade.

  • Let's also run a stress test. If you haven't run any with vegeta yet, let's target one endpoint that hits the DB; no surprises expected, especially since MongoDB 7.0 is an LTS/stable version (I see that the APM agent is working correctly).

  • Document on this PR how to back up the docker volumes first and export the mongodb data, and let's go over that procedure at least once. Let's simulate a worst case in prod, where this guide is followed but then the 7.0 containers don't start correctly: how can network operators revert to 5.0 with the original docker volumes? (At AmLight we know containers are being used, so it's OK to only cover container volumes.) This is very critical to prod; let's not only hope for the best.

  • Excellent to see e2e passing. On GitLab, https://gitlab.ampath.net/kytos/kytos-end-to-end-tester/-/blob/master/.gitlab-ci.yml needs to be updated too; check out the services there on 5.0. We also need to decide on a strategy: if we upgrade there to 7.0, we then upgrade amlight/kytos-end-to-end-tester to 7.0 to be able to run e2e on demand. But then maybe we also create a new repo copy of amlight/kytos-end-to-end-tester still on 5.0, so that if any patch on 2024.1 still needs to land we can also test on 5.0 in the CI, and we maintain the 5.0 setup until prod upgrades to 7.0 when they deploy the future version 2024.2, probably by the end of the year. Let me know if you have any other suggestions. In practice, no issues are expected testing future patches of 2024.1 with 7.0, but for completeness let's be ready for that case too. This also needs to include @italovalcy in the conversation.

@italovalcy

Hi team,

Great work, Aldo! Very nicely done!


Another approach would be having both tests on the end-to-end daily execution (5.0 and 7.0), just to make sure no surprises will be detected.

@viniarck

viniarck commented May 6, 2024


Good idea having two parallel nightly pipelines on the end-to-end project, one on 7.0 and one on 5.0. Let's also consider whether the available GitLab workers can pick up the work in parallel, and then we maintain it until prod gets upgraded to 7.0.

@Alopalao

Alopalao commented May 10, 2024

Test description:

  • Created 100 EVCs.
  • 60 disable/enable cycles were performed, 40 seconds apart.
  • In mef_eline, the interval and timing settings were reduced to 30.
  • The Ringtopo topology was used.

Overall Results

Timing measure for bulk_write: [screenshot: cumulative_times]

Database created and tests performed in mongo 5.0:

  • Overview [screenshot]
  • Trace [screenshot]
  • Request POST flow_manager install flows [screenshot: Transaction_re_install]
  • Request POST flow_manager delete flows [screenshot: Transaction_re_delete]

Database created and tests performed in mongo 7.0:

  • Overview [screenshot]
  • Trace [screenshot]
  • Request POST flow_manager install flows [screenshot: Transaction_request]
  • Request POST flow_manager delete flows [screenshot: Transaction_request_delete]

Database created in mongo 5.0 and tests performed in mongo 7.0:

  • Overview [screenshot]
  • Trace [screenshot]
  • Request POST flow_manager install flows [screenshot: Trans_install]
  • Request POST flow_manager delete flows [screenshot: Trans_delete]

@viniarck viniarck self-requested a review May 10, 2024 12:17

@viniarck viniarck left a comment


Great to see that overall the DB op latencies are similar, as we'd expect; that's a good start. On flow_manager delete, if you could also include in the screenshot how long the DB operation was taking (on 5.0 you included that correctly, but it was truncated in the 7.0 screenshot):

[screenshot: 20240510_091252]

I'm now just waiting for the rest of the points on this comment to be addressed before approving.

@viniarck


@Alopalao if you could also post the vegeta output for one of the endpoint stress tests, for the record, on both Mongo 5 and 7, that'd be great. The vegeta output is very helpful since it includes many stats, including percentiles. Of course it's only from a client point of view, but if the overall latencies are similar, that's already a great sign.

@Alopalao

Alopalao commented May 13, 2024

Vegeta report results from jq -ncM '{method: "POST", url: "http://localhost:8181/api/kytos/flow_manager/v2/flows/00:00:00:00:00:00:00:01", body: { "force": true, "flows": [ { "priority": 10, "match": { "in_port": 1, "dl_vlan": 100 }, "actions": [ { "action_type": "output", "port": 1 } ] } ] } | @base64, header: {"Content-Type": ["application/json"]}}' | vegeta attack -format=json -rate=200/1s -duration=60s -timeout=120s | tee results.bin | vegeta report
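For reference, the JSON target fed to vegeta in the command above can also be assembled without jq; a sketch using only printf and base64, with the same endpoint and body:

```shell
# Build a vegeta JSON target equivalent to the jq one-liner above.
url="http://localhost:8181/api/kytos/flow_manager/v2/flows/00:00:00:00:00:00:00:01"
body='{"force": true, "flows": [{"priority": 10, "match": {"in_port": 1, "dl_vlan": 100}, "actions": [{"action_type": "output", "port": 1}]}]}'
b64=$(printf '%s' "$body" | base64 | tr -d '\n')
printf '{"method": "POST", "url": "%s", "body": "%s", "header": {"Content-Type": ["application/json"]}}\n' "$url" "$b64"
# Pipe the printed line into:
#   vegeta attack -format=json -rate=200/1s -duration=60s -timeout=120s | vegeta report
```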

Mongo 7.0

Requests      [total, rate, throughput]         12000, 200.02, 148.80
Duration      [total, attack, wait]             1m1s, 59.994s, 603.263ms
Latencies     [min, mean, 50, 90, 95, 99, max]  10.2ms, 507.339ms, 519.413ms, 773.648ms, 848.971ms, 996.941ms, 1.239s
Bytes In      [total, mean]                     381289, 31.77
Bytes Out     [total, mean]                     1464000, 122.00
Success       [ratio]                           75.14%
Status Codes  [code:count]                      202:9017  503:2983
Error Set:
503 Service Unavailable

[screenshot: apm]

Mongo 5.0

Requests      [total, rate, throughput]         12000, 200.02, 143.44
Duration      [total, attack, wait]             1m1s, 59.995s, 692.008ms
Latencies     [min, mean, 50, 90, 95, 99, max]  7.965ms, 509.928ms, 483.188ms, 831.703ms, 906.246ms, 1.101s, 1.267s
Bytes In      [total, mean]                     375985, 31.33
Bytes Out     [total, mean]                     1464000, 122.00
Success       [ratio]                           72.54%
Status Codes  [code:count]                      202:8705  503:3295
Error Set:
503 Service Unavailable

[screenshot: Screenshot_20240513_123936]

The errors in both tests come from the concurrency limit:
2024-05-13 12:38:44,102 - WARNING [uvicorn.error] (MainThread) Exceeded concurrency limit.

@viniarck viniarck self-requested a review May 13, 2024 19:12

@viniarck viniarck left a comment


Nicely done @Alopalao, great to have Mongo 7.0, with the upgrade and related procedures well documented here on this PR. I'll leave this pre-approved; it'll get merged when the rest of the GitLab CI gets sorted out.

@viniarck viniarck merged commit f807667 into master May 20, 2024
2 checks passed
@viniarck viniarck deleted the upgrade/mongo branch May 20, 2024 19:21
Development

Successfully merging this pull request may close these issues.

MongoDB 5.0 EOL in Oct 2024