URGENT! Deadline launches multiple workers without instance and stalls the workstation

StealthThinkBox · November 1, 2021, 2:26am

Hello, I have a serious problem. Deadline suddenly launches multiple workers without instance and stalls the workstation. It never happened before; it started after we bought a new workstation (HP Z840, Intel(R) Xeon(R) CPU E5-2687W v3 @ 3.10GHz, 2 processors, 40 threads total, 128 GB memory; OS is Windows 10 Pro version 20H2, Windows Feature Experience Pack 120.2212.3920.0).

Has anyone had the same problem, and how did you fix it? I found one person with the same problem, but it seems it was not resolved.

The Deadline information is here:

Deadline Client Version: 10.1.9.2 Release (3d6a64d94)
FranticX Client Version: 2.4.0.0 Release (0b549a42a)

License Mode: Standard

Repository Version: 10.1.9.2 (3d6a64d94)
Integration Version: 10.1.9.2 (3d6a64d94)
3PL Settings Version: 20/07/2020

Thank you in advance

Kei

Justin_B · November 1, 2021, 2:09pm

If you only want a single Worker to run on the computer, look at our multiple Workers on one machine page, in particular the part about removing Workers.

If the removal options through the UI fail, try removing the .ini files from the local worker as those are what define the Workers on the machine.

For urgent issues, you can create a ticket with us directly at awsthinkbox.zendesk.com or call us at 1-866-419-0283 ext 2 from 9am to 5pm Central Standard Time.

StealthThinkBox · November 1, 2021, 3:53pm

Thank you Justin.

I forgot to tell you that I once tried the “MultipleSlavesEnabled=False” setting and it didn’t work for some reason. But I have now tried it again and it seems to be working! I’ll watch for a while to see whether the problem happens again.

Thank you for your support!

Kei

christopheC · November 1, 2021, 3:57pm

Hi StealthThinkbox and Justin_B,

Just an update on our issue that you mentioned: we were still not able to resolve it, and the machine with the issue is still opening several workers even after a full machine reinstall.
We did create a ticket and worked on the issue with the Thinkbox team (and followed the steps mentioned above), but were unable to resolve it.

It seems from our contacts with Thinkbox that it may be a network-related issue. Maybe check your switch/ethernet cable?
We changed the machine's location on our network, as I was hoping it was a bad switch/cable, but that did not resolve it. Next up is to check if we can add a network card to see if the motherboard one is acting up.

If you figure out what the problem is, I’d be happy to hear your solution.

Chris

StealthThinkBox · November 1, 2021, 4:16pm

I’ve watched my workstations and found that some machines are fine while others still have the problem.

One thing I noticed is that the problem workstations lose their remote desktop connection very often. Do you think the two are related?

StealthThinkBox · November 1, 2021, 4:24pm

Hi Christophe,

Sorry to hear you are still having this problem. It’s the same on our side. Since you mention a suspicious network situation: our bad workstations also lose their remote desktop connection very often. I think the two might be related.

In the meantime, I asked one of our IT guys to make a tool that kills all workers except the first one. I hope this method will work.
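
For anyone wanting to try the same workaround, here is a minimal sketch of the selection logic such a tool might use. It is not the tool described above. The executable name deadlineworker.exe and the ProcInfo record are assumptions for illustration; the process list itself would have to come from something like Task Manager data or a process library, and the actual termination step is left out on purpose.

```python
from typing import Iterable, List, NamedTuple


class ProcInfo(NamedTuple):
    """Hypothetical record describing one running process."""
    pid: int
    name: str
    started: float  # process start time, seconds since the epoch


# Assumed executable name; verify it on your own machine before relying on it.
WORKER_NAME = "deadlineworker.exe"


def extra_worker_pids(procs: Iterable[ProcInfo], name: str = WORKER_NAME) -> List[int]:
    """Return the PIDs of every matching process except the oldest one."""
    workers = sorted(
        (p for p in procs if p.name.lower() == name.lower()),
        key=lambda p: (p.started, p.pid),  # oldest first, PID breaks ties
    )
    # Keep the first (oldest) worker, report the rest as candidates to stop.
    return [p.pid for p in workers[1:]]
```

Keeping the oldest process means a Worker that is already rendering is left alone, and only the later duplicates are reported.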

Kei

Justin_B · November 2, 2021, 3:33pm

Would you be able to get us the Deadline application logs from a day when you see this behaviour? The Worker is supposed to grab a socket on the machine, and it uses that to make sure duplicates aren’t launched. Maybe that’s failing, and you’re seeing RDP failures as a side effect.
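
To illustrate the general idea of using a socket as a duplicate guard, here is a small sketch. It shows the common single-instance technique only and is not Deadline's actual implementation; the function name and the choice of a local TCP port are assumptions.

```python
import socket
from typing import Optional


def acquire_instance_lock(port: int, host: str = "127.0.0.1") -> Optional[socket.socket]:
    """Try to claim a local TCP port as a single-instance lock.

    Returns the bound socket on success (keep it open for the life of the
    process), or None if something else already holds the port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
    except OSError:
        # The port is taken, so another instance is presumably running.
        sock.close()
        return None
    return sock
```

A guard like this depends on the network stack being in a healthy state when the second process starts, which is one reason networking problems and duplicate launches could show up together.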

We’ll be able to see the launcher starting Workers and why in the deadlinelauncher log.

Beware that the worker logs will include the task render logs, so you may want to create a ticket at awsthinkbox.zendesk.com if you’d like to keep your logs off the wider internet.

The logs will be on the machine in one of these locations:
Windows: C:\ProgramData\Thinkbox\Deadline10\logs
Linux: /var/log/Thinkbox/Deadline10
Mac OS X: /Library/Logs/Thinkbox/Deadline10
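
If it helps with collecting the logs, here is a short sketch that picks the directory from the list above and lists the launcher logs in it. The directories are the ones given above; the assumption that launcher log file names start with "deadlinelauncher" and both function names are mine, so adjust as needed.

```python
import platform
from pathlib import Path
from typing import List, Optional

# Log directories as listed above, keyed by platform.system() values.
LOG_DIRS = {
    "Windows": r"C:\ProgramData\Thinkbox\Deadline10\logs",
    "Linux": "/var/log/Thinkbox/Deadline10",
    "Darwin": "/Library/Logs/Thinkbox/Deadline10",
}


def deadline_log_dir(system: Optional[str] = None) -> str:
    """Return the Deadline 10 log directory for the given (or current) OS."""
    system = system or platform.system()
    if system not in LOG_DIRS:
        raise ValueError(f"No known Deadline log directory for {system!r}")
    return LOG_DIRS[system]


def launcher_logs(directory: str) -> List[str]:
    """List files whose names start with 'deadlinelauncher' (assumed prefix)."""
    folder = Path(directory)
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.is_file() and p.name.lower().startswith("deadlinelauncher")
    )
```

For example, launcher_logs(deadline_log_dir()) would list the candidate files on the machine it runs on.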

StealthThinkBox · November 3, 2021, 9:20am

Hi Justin,

Before sending the logs, we may have found the reason why Deadline launches multiple workers. On our side it was “sleep mode”. After disabling sleep mode, Deadline finally started working normally. When I bought the new workstation, the default sleep timeout was set to 10 minutes. I noticed that light scenes rendered normally but heavy ones did not. I still don’t know why this causes multiple launches (and sometimes a worker that stalls without any error in the monitor), but I assume that once the workstation goes into sleep mode, Deadline tries to restart or something like that. What do you think?

One more thing: our workstation has an SSD for the OS. When we used a normal HDD, this never happened. Do you think this might also be a reason for the multiple launches? Our assumption is that it is the combination of sleep mode and an SSD for the OS. As I said before, one of the workers sometimes stalls without any error in the monitor. Is it possible that the SSD’s quick shutdown and recovery causes this problem?
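
For reference, a rough sketch of how the sleep timeout could be switched off from a script on Windows rather than through the Settings UI. It relies on the built-in powercfg tool, where a timeout of 0 means "never"; the function names are hypothetical, it only covers the plugged-in (AC) power settings, and it defaults to a dry run so nothing changes unless you ask it to.

```python
import subprocess
from typing import List


def sleep_disable_commands() -> List[List[str]]:
    """powercfg commands that set the AC sleep and hibernate timeouts to never."""
    return [
        ["powercfg", "/change", "standby-timeout-ac", "0"],
        ["powercfg", "/change", "hibernate-timeout-ac", "0"],
    ]


def disable_sleep(dry_run: bool = True) -> List[List[str]]:
    """Return the commands, and run them only when dry_run is False."""
    commands = sleep_disable_commands()
    if not dry_run:
        for cmd in commands:
            # Windows only; may need an elevated prompt depending on policy.
            subprocess.run(cmd, check=True)
    return commands
```

Calling disable_sleep() only returns the commands for review; disable_sleep(dry_run=False) would actually apply them.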

thanks,

Kei

StealthThinkBox · November 3, 2021, 9:21am

Hi, as I told Justin, can you check sleep mode? Can you disable it and then see whether the problem still happens? On our side, Deadline now works normally.

thanks,

Kei

StealthThinkBox · November 3, 2021, 9:29am

And please also check the automatic monitor turn-off setting. This is also suspicious.

Justin_B · November 3, 2021, 5:59pm

That makes sense to me; I wouldn’t want any render nodes shutting themselves off without informing Deadline. If you’d still like to have computers turn themselves off when there’s no work left to do, take a look at Power Management, which can take care of that for you.

StealthThinkBox · November 4, 2021, 5:14am

OK, thank you. I’ll look into it if we still have a problem. Please share our case if other people have the same trouble.

thanks,

Kei