According to this topic https://forum.genieacs.com/t/session-unsuccessfully-terminated/3051, the default value of MAX_CONCURRENT_REQUESTS is 20. What is the best formula or estimation for the MAX_CONCURRENT_REQUESTS value, based on CPU/RAM capacity and the size of the deployment (the number of connected CPEs)?
For example:
This is one of the few areas I disagree with @rudymartin on. We have ~3500 CPEs, all set to inform every hour. This runs on a VM with 4 cores, 16 GB RAM, and 8 GB of swap (unused). The load average on the VM is almost always less than 1. I have GENIEACS_CWMP_MAX_CONCURRENT_REQUESTS set to 30.
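A rough way to sanity-check a number like 30 is Little's law: average concurrent sessions ≈ inform rate × average session duration. The sketch below is just that back-of-envelope calculation; the 5-second average session duration is an assumption you would adjust for your own deployment.

```typescript
// Back-of-envelope estimate of average concurrent CWMP sessions (Little's law).
// All numbers are assumptions to adjust for your own deployment.
const cpes = 3500;               // connected CPEs
const informIntervalSec = 3600;  // periodic inform interval (1 hour)
const avgSessionSec = 5;         // assumed average session duration, incl. script/extension work

const sessionsStartedPerSec = cpes / informIntervalSec;               // ~0.97 per second
const avgConcurrentSessions = sessionsStartedPerSec * avgSessionSec;  // ~4.9 open on average

console.log({ sessionsStartedPerSec, avgConcurrentSessions });
```

With informs spread over the hour, even occasional bursts stay well below a ceiling of 30, which is consistent with the low load average.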
I have our CPEs set to inform every hour so I can detect when the TR-069 daemon on the router has gone wonky and it stops informing. I have a script that runs at midnight and SSHes into routers that have stopped responding to reboot them.
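Not the exact script, but roughly the shape of it. In this sketch the NBI address, the timestamp format, the device-to-IP lookup, and the reboot command are all placeholders to adapt:

```typescript
// Sketch: find CPEs whose last inform is older than a cutoff via the GenieACS NBI,
// then reboot them over SSH. Addresses, credentials and the reboot command are placeholders.
import { execFile } from "node:child_process";

const NBI = "http://localhost:7557";  // assumed NBI address
const cutoff = new Date(Date.now() - 2 * 3600 * 1000).toISOString();
const query = encodeURIComponent(JSON.stringify({ _lastInform: { $lt: cutoff } }));

// Hypothetical helper: resolve a device ID to its management IP from your own records.
function lookupManagementIp(deviceId: string): string | null {
  return null; // stub
}

async function main(): Promise<void> {
  const res = await fetch(`${NBI}/devices/?query=${query}&projection=_id`);
  const devices: Array<{ _id: string }> = await res.json();

  for (const device of devices) {
    const host = lookupManagementIp(device._id);
    if (!host) continue;
    // Placeholder reboot; real routers need vendor-specific commands and credentials.
    execFile("ssh", [`admin@${host}`, "reboot"], (err) => {
      if (err) console.error(`reboot failed for ${device._id}: ${err.message}`);
    });
  }
}

main().catch(console.error);
```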
I do it like this for a few reasons. The biggest is that we do not allow customers access to the web UI on the router. All changes have to be done via our customer portal. This lets me track the desired state of the router and easily restore it should we have to swap out a bad unit.
What you say makes sense. Still, there is something important I never mentioned: one reason why I want periodic inform disabled is that I have debug enabled all the time, which allows me to track down more easily which parameters are causing a problem when a fault happens. I don't trust Huawei routers, so if something happens I want to know exactly what it was. Luckily, it does not happen often.
The size increase in the last two entries is because, at the moment, we are running into network problems.
We do have a customer portal, but we only allow configuring WiFi parameters. Very special cases (usually business customers, though as a customer I also count as one of those) usually end up with a bridge installed instead of a router, so the customer must provide a router or equivalent. We only provide PPPoE credentials for the customer to connect.
Here in Argentina, if a router stops working as it should, 50% of the time the customer will call almost immediately.
This makes sense @rudymartin. For us, very few customers have a bridged router. I’m actually surprised at how few do. I thought the number would be higher than 3%.
I think you are referring to an experiment I did. Go with the default values for all variables.
I forgot to ask an important question: what are you really trying to solve? How many CPEs do you have? Or are you just planning?
I did not change any default values until I hit a specific problem with worker overload.
I did not know why changing MAX_CONCURRENT_REQUESTS from the default 20 to 200 would solve that problem. I guess it could relate to server capacity (RAM, CPU) and the number of connected CPEs. That is the reason why I created this topic.
I have ~10k connected CPEs, informing every hour, on 16 vCPUs with 32 GB of RAM.
Based on @akcoder's information, can I set MAX_CONCURRENT_REQUESTS = (10/3.5) * 30 ≈ 86 for 10k CPEs (just comparing the number of connected CPEs)?
Or MAX_CONCURRENT_REQUESTS = 2 * 30 = 60 (when comparing CPU specs)?
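Written out, the two scalings I have in mind, plus the same steady-state back-of-envelope as above scaled to 10k CPEs (the 5-second session duration is only a guess, and neither scaling is a documented formula):

```typescript
// Two naive scalings of MAX_CONCURRENT_REQUESTS from the ~3500-CPE reference point.
const refCpes = 3500;
const refSetting = 30;     // value that works for the reference deployment
const myCpes = 10000;

const scaledByCpes = Math.ceil((myCpes / refCpes) * refSetting);  // ≈ 86
const scaledByCpu = 2 * refSetting;                               // = 60

// Same Little's-law estimate as earlier in the thread, scaled to 10k CPEs informing hourly
// with an assumed 5-second average session:
const avgConcurrent = (myCpes / 3600) * 5;                        // ~14 sessions on average

console.log({ scaledByCpes, scaledByCpu, avgConcurrent });
```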
Hi @rudymartin and @akcoder,
Thank you so much for your explanation and support.
Yes, I have one extension script that makes an HTTP POST to an external application (when the CPE sends a 0 BOOTSTRAP event).
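A minimal sketch of that kind of extension (simplified, with a placeholder endpoint and names, not my actual script; GenieACS extension modules live under config/ext and are plain Node.js):

```typescript
// config/ext/notify.js — simplified sketch of an extension that POSTs to an external app.
// The hostname, port and path below are placeholders.
"use strict";
const http = require("http");

exports.bootstrapNotify = function (args, callback) {
  const payload = JSON.stringify({ deviceId: args[0] });
  const req = http.request(
    {
      hostname: "portal.example.com",   // placeholder external application
      port: 8080,
      path: "/api/bootstrap",
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Content-Length": Buffer.byteLength(payload),
      },
    },
    (res) => {
      res.resume();                     // drain the response body
      callback(null, res.statusCode);   // return the status code to the provision script
    }
  );
  req.on("error", (err) => callback(err));
  req.end(payload);
};
```

The provision attached to the 0 BOOTSTRAP preset then calls it with something like `ext("notify", "bootstrapNotify", declare("DeviceID.ID", {value: 1}).value[0])`.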
I changed MAX_CONCURRENT_REQUESTS to 300 and the worker overload does not happen anymore.
However, I ran into another issue with MongoDB (exactly the same logs as in the issue below).
Can you share guidance on how to troubleshoot this further?
Thank you~