AWS Thinkbox Discussion Forums

Database service at full CPU on server, error in logs "no SSL certificate"

Hi there,

We’re seeing an issue today where client machines are unable to connect to the repository. The error looks like this:

[Screenshot: deadline_client_error]

On the server, the mongod.exe service is running at close to full CPU and has 53k+ threads. When I stop the service, CPU usage returns to normal levels (~10%), but when the service restarts, it ramps up again.

In the logs, I see the following errors:

2020-02-05T10:18:14.438+0000 E NETWORK  [conn133903] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:14.445+0000 E NETWORK  [conn133904] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:14.700+0000 E NETWORK  [conn133907] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.400+0000 E NETWORK  [conn133910] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.522+0000 E NETWORK  [conn133915] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.549+0000 E NETWORK  [conn133916] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.550+0000 E NETWORK  [conn133918] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.550+0000 E NETWORK  [conn133917] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.558+0000 E NETWORK  [conn133920] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.558+0000 E NETWORK  [conn133919] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.577+0000 E NETWORK  [conn133921] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.712+0000 E NETWORK  [conn133923] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.712+0000 E NETWORK  [conn133922] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.850+0000 E NETWORK  [conn133924] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.874+0000 E NETWORK  [conn133925] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.912+0000 E NETWORK  [conn133926] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.980+0000 E NETWORK  [conn133928] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.981+0000 E NETWORK  [conn133927] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.985+0000 E NETWORK  [conn133929] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:15.996+0000 E NETWORK  [conn133930] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:16.004+0000 E NETWORK  [conn133931] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:16.108+0000 E NETWORK  [conn133932] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:16.152+0000 E NETWORK  [conn133933] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:16.197+0000 E NETWORK  [conn133936] no SSL certificate provided by peer; connection rejected
2020-02-05T10:18:16.285+0000 E NETWORK  [conn133937] no SSL 
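
To get a feel for how quickly these rejections arrive, a small script along the following lines could tally them per second. This is only a minimal sketch: it assumes the log lines have exactly the shape shown above, and the function name `rejections_per_second` is made up for illustration.

```python
import re
from collections import Counter

# Matches complete lines of the form seen in the log excerpt, e.g.
# 2020-02-05T10:18:14.438+0000 E NETWORK  [conn133903] no SSL certificate provided by peer; connection rejected
# Assumption: timestamp, severity "E", component "NETWORK", then [connN].
REJECTED = re.compile(
    r"^(?P<second>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+\S*\s+E\s+NETWORK\s+"
    r"\[conn(?P<conn>\d+)\]\s+"
    r"no SSL certificate provided by peer; connection rejected"
)


def rejections_per_second(lines):
    """Count 'no SSL certificate' rejections, grouped by whole second."""
    counts = Counter()
    for line in lines:
        match = REJECTED.match(line)
        if match:  # truncated or unrelated lines are simply skipped
            counts[match.group("second")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2020-02-05T10:18:14.438+0000 E NETWORK  [conn133903] no SSL certificate provided by peer; connection rejected",
        "2020-02-05T10:18:14.445+0000 E NETWORK  [conn133904] no SSL certificate provided by peer; connection rejected",
        "2020-02-05T10:18:15.400+0000 E NETWORK  [conn133910] no SSL certificate provided by peer; connection rejected",
    ]
    for second, count in sorted(rejections_per_second(sample).items()):
        print(second, count)
```

To run it over a real log, pass an open file object instead of the sample list, since iterating a file yields its lines.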

Our sysadmin isn’t here at the moment and our render manager is out sick, so it’s fallen to me to try and solve this. Can anyone point me towards how to debug the issue?

Deadline version 10.0.16.6

Thanks in advance,
Rob

So we managed to get around this ourselves. We shut down all slaves, launchers and monitors on the connected machines in the farm, restarted the database service, and then restarted all the slave processes. That seems to have worked.

We have upwards of 50 nodes on the farm, so our sysadmin used PDQ Deploy to execute the shutdown across all the render farm nodes.
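
For anyone wanting to script the same sequence, here is a minimal sketch of the ordering only. It is not what was actually run (that went through PDQ Deploy). The function `restart_farm` and its three callback parameters are hypothetical, and the real commands for stopping clients, restarting the database service and starting slaves are deliberately left to the caller, since they depend on the setup.

```python
from typing import Callable, Iterable, List


def restart_farm(
    nodes: Iterable[str],
    stop_clients: Callable[[str], None],
    restart_database: Callable[[], None],
    start_slave: Callable[[str], None],
) -> List[str]:
    """Run the three phases in order and return a log of the steps taken."""
    node_list = list(nodes)
    log: List[str] = []

    # Phase 1: stop slaves, launchers and monitors on every node
    # before touching the database, as described in the post.
    for node in node_list:
        stop_clients(node)
        log.append(f"stopped clients on {node}")

    # Phase 2: restart the database service once all clients are down.
    restart_database()
    log.append("restarted database service")

    # Phase 3: bring the slave processes back up.
    for node in node_list:
        start_slave(node)
        log.append(f"started slave on {node}")

    return log


if __name__ == "__main__":
    # Dry run with placeholder callbacks and made-up node names.
    steps = restart_farm(
        ["node01", "node02"],
        stop_clients=lambda node: None,
        restart_database=lambda: None,
        start_slave=lambda node: None,
    )
    for step in steps:
        print(step)
```

Passing the actions in as callbacks keeps the ordering testable as a dry run before any real command is wired in.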
