we are a small ISP provider and we are currently installing optical fiber connections in our city since about 3 months. So far we have about 250 registered devices on GenieACS.
Thing is, while developing a solution to handle PPPoE and WiFi credentials, we established 12 hours as the periodic inform interval. Last weekend we had a several blackouts at sections of the city (I believe they were scheduled) and as result, we had dozens of CPE doing Inform event on the ACS at the same time.
I was really unable to identify other source of the problem than the ACS by itself. The error I got from genieacs-cwmp log say:
Connection dropped
Channel has faulted; channel=“boot” retries=0 faultCode=“session_terminated” faultMessage=“The TR-069 session was unsuccessfully terminated”
below the section of the log. I was wondering if it has anything to do with the VALUE event
Summary
2021-06-25T17:29:48.801Z [INFO] 10.111.1.243 00259E-HG8145V5-485754432D90B2A3: Script: boot : : datos a grabar en CPE: {
“InternetGatewayDevice.ManagementServer.PeriodicInformInterval”: 43200
}
2021-06-25T17:29:48.801Z [INFO] 10.111.1.243 00259E-HG8145V5-485754432D90B2A3: Script: boot : grabando InternetGatewayDevice.ManagementServer.PeriodicInformInterval = 43200
2021-06-25T17:29:48.801Z [INFO] 10.111.1.243 00259E-HG8145V5-485754432D90B2A3: Script: boot : FIN ! ========================================
2021-06-25T17:29:48.804Z [INFO] 10.111.1.243 00259E-HG8145V5-485754432D90B2A3: ACS request; acsRequestId=“17a4437e5e40100” acsRequestName=“SetParameterValues”
2021-06-25T17:29:48.805Z [ERROR] 10.111.1.243 00259E-HG8145V5-485754432D90B2A3: Connection dropped
2021-06-25T17:29:48.805Z [WARN] 10.111.1.243 00259E-HG8145V5-485754432D90B2A3: Channel has faulted; channel=“boot” retries=0 faultCode=“session_terminated” faultMessage=“The TR-069 session was unsuccessfully terminated”
2021-06-25T17:30:01.946Z [INFO] 10.111.1.243 00259E-HG8145V5-485754432D90B2A3: Inform; cpeRequestId=“38” informEvent=“4 VALUE CHANGE” informRetryCount=2
2021-06-25T17:30:16.061Z [INFO] 10.111.1.243 00259E-HG8145V5-485754432D90B2A3: Inform; cpeRequestId=“39” informEvent=“2 PERIODIC” informRetryCount=0
2021-06-25T17:30:16.077Z [INFO] 10.111.1.243 00259E-HG8145V5-485754432D90B2A3: Script: inform : : INICIO ! dev id: 00259E-HG8145V5-485754432D90B2A3
2021-06-25T17:30:16.077Z [INFO] 10.111.1.243 00259E-HG8145V5-485754432D90B2A3: Script: inform : manufacturer: Huawei Technologies Co., Ltd
2021-06-25T17:30:16.360Z [INFO] 10.111.1.243 00259E-HG8145V5-485754432D90B2A3: Script: inform : parametros a pedir: {
“InternetGatewayDevice.ManagementServer.PeriodicInformInterval”: “”,
and I found this on the journalctrl
Error" exceptionMessage=“Lock expired” exceptionStack=“Error: Lock expired\n at cs (/usr/local/lib/node_modules/genieacs/bin/genieacs-cwmp:2:56400)\n at runMicrotasks ()\n at processTicksAndRejections (internal/process/task_queues.js:97:5)\n at async In (/usr/local/lib/node_modules/genieacs/bin/genieacs-cwmp:2:117461)\n at async Dn (/usr/local/lib/node_modules/genieacs/bin/genieacs-cwmp:2:106963)”
to deal with the issue of multiple connections at the same time I decided to switch the periodic inform times based on devid from 10 to 14 hours. So far so good.
As aditional background information, I could add the whole provisioning process takes some time, sometimes around 20 seconds to complete. I had to raise the timeout variable GENIEACS_EXT_TIMEOUT=150000 to archieve this. I am considering making a local copy of the DB API data to reduce the delays. Perhaps that should be my priority.
In case someone wonders, all the computers involved are virtual machines and I don’ t really know a lot about the network setup but will try to answer any question to clarify this.
What I would like to know is how do I address this problem.
thanks in advance. RM.