AWS Thinkbox Discussion Forums

Slave doesn't accept tasks

Hi Ryan,

I have a render slave that doesn’t accept tasks. The other slaves render while this one sits idle. I have tried installing a fresh copy of Deadline through both the GUI and the command line, with the same result each time…

Logs:
2011-09-29 15:21:45: BEGIN - AIBLADE-06\root
2011-09-29 15:21:45: Start-up
2011-09-29 15:21:45: 2011-09-29 15:21:44
2011-09-29 15:21:45: Deadline Launcher 5.1 [v5.1.0.45235 R]
2011-09-29 15:21:45: Local python version file: /usr/local/Thinkbox/Deadline/python/2.6.7/Version
2011-09-29 15:21:45: Network python version file: /mnt/DeadlineRepository/python/Linux/2.6.7/Version
2011-09-29 15:21:45: Comparing python version files
2011-09-29 15:21:45: Python upgrade skipped because Version files are the same
2011-09-29 15:21:45: Local version file: /usr/local/Thinkbox/Deadline/bin/Version
2011-09-29 15:21:45: Network version file: /mnt/DeadlineRepository/bin/Linux/Version
2011-09-29 15:21:45: Comparing version files
2011-09-29 15:21:45: Launching Slave: Aiblade-06
2011-09-29 15:21:45: Launcher Thread - Launcher thread initializing…
2011-09-29 15:21:45: Perfoming remote admin check
2011-09-29 15:21:45: Remote Administration is now enabled
2011-09-29 15:21:45: Launcher Thread - Remote administration is enabled
2011-09-29 15:21:45: Launcher Thread - Launcher thread listening on port 5042
2011-09-29 15:22:45: Perfoming remote admin check
2011-09-29 15:24:45: Perfoming remote admin check

2011-09-29 15:21:45: BEGIN - AIBLADE-06\root
2011-09-29 15:21:45: Start-up
2011-09-29 15:21:45: 2011-09-29 15:21:44
2011-09-29 15:21:45: Deadline Slave 5.1 [v5.1.0.45235 R]
2011-09-29 15:21:45: Auto Configuration: A ruleset has been received
2011-09-29 15:21:45: Auto Configuration: Setting License Server to ‘192.168.1.2’
2011-09-29 15:21:45: Auto Configuration: Setting Repository Path to ‘/mnt/DeadlineRepository’
2011-09-29 15:21:45: Auto Configuration: Setting Local Slave Data Folder to ‘/temp’
2011-09-29 15:21:45: slave initialization beginning.
2011-09-29 15:21:46: Info Thread - Created.
2011-09-29 15:21:46: Trying to connect using license server ‘192.168.1.2’
2011-09-29 15:21:46: The license file being used will expire in 32 days.
2011-09-29 15:21:47: pid 3651’s current affinity mask: ffff
2011-09-29 15:21:47: pid 3651’s new affinity mask: 5535
2011-09-29 15:21:47: Checking repository integrity

Can you turn on Verbose Logging for the slave in the repository options? You can find this under the Application Logging section. Then restart the slave and let it run for a few minutes. Then grab the log and send it to us.

Thanks!

  • Ryan

2011-09-30 10:24:56: BEGIN - AIBLADE-06\root
2011-09-30 10:24:56: Start-up
2011-09-30 10:24:56: 2011-09-30 10:24:55
2011-09-30 10:24:56: Deadline Slave 5.1 [v5.1.0.45235 R]
2011-09-30 10:24:56: Auto Configuration: A ruleset has been received
2011-09-30 10:24:56: Auto Configuration: Setting License Server to ‘192.168.1.2’
2011-09-30 10:24:56: Auto Configuration: Setting Repository Path to ‘/mnt/DeadlineRepository’
2011-09-30 10:24:56: Auto Configuration: Setting Local Slave Data Folder to ‘/temp’
2011-09-30 10:24:57: slave initialization beginning.
2011-09-30 10:24:57: Info Thread - Created.
2011-09-30 10:24:57: Trying to connect using license server ‘192.168.1.2’
2011-09-30 10:24:57: The license file being used will expire in 31 days.
2011-09-30 10:24:58: Starting between task wait - seconds: 1
2011-09-30 10:24:58: pid 5310’s current affinity mask: ffff
2011-09-30 10:24:58: pid 5310’s new affinity mask: 5535
2011-09-30 10:24:59: Scheduler Thread - Slave initialization complete.
2011-09-30 10:24:59: Checking repository integrity
2011-09-30 10:24:59: Purging obsolete slaves
2011-09-30 10:24:59: Scheduler Thread - Performing house cleaning…
2011-09-30 10:24:59: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:24:59: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:25:05: Scheduler - No jobs found.
2011-09-30 10:25:05: Starting between task wait - seconds: 30
2011-09-30 10:25:35: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:25:35: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:25:38: Scheduler - No jobs found.
2011-09-30 10:25:38: Starting between task wait - seconds: 30
2011-09-30 10:26:08: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:26:08: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:26:11: Scheduler - No jobs found.
2011-09-30 10:26:11: Starting between task wait - seconds: 30
2011-09-30 10:26:41: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:26:41: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:26:44: Scheduler - No jobs found.
2011-09-30 10:26:44: Starting between task wait - seconds: 30
2011-09-30 10:27:14: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:27:14: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:27:17: Scheduler - No jobs found.
2011-09-30 10:27:18: Starting between task wait - seconds: 30
2011-09-30 10:27:48: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:27:48: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:27:50: Scheduler - No jobs found.
2011-09-30 10:27:51: Starting between task wait - seconds: 30
2011-09-30 10:28:21: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:28:21: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:28:23: Scheduler - No jobs found.
2011-09-30 10:28:24: Starting between task wait - seconds: 30
2011-09-30 10:28:54: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:28:54: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:28:56: Scheduler - No jobs found.
2011-09-30 10:28:57: Starting between task wait - seconds: 30
2011-09-30 10:29:27: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:29:27: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:29:30: Scheduler - No jobs found.
2011-09-30 10:29:30: Starting between task wait - seconds: 30
2011-09-30 10:30:00: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:30:00: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:30:03: Scheduler - No jobs found.
2011-09-30 10:30:03: Starting between task wait - seconds: 30
2011-09-30 10:30:18: pid 5310’s current affinity mask: 5535
2011-09-30 10:30:18: pid 5310’s new affinity mask: 5535
2011-09-30 10:30:33: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:30:33: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:30:36: Scheduler - No jobs found.
2011-09-30 10:30:36: Starting between task wait - seconds: 30
2011-09-30 10:31:06: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:31:06: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:31:09: Scheduler - No jobs found.
2011-09-30 10:31:09: Starting between task wait - seconds: 30
2011-09-30 10:31:40: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:31:40: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:31:42: Scheduler - No jobs found.
2011-09-30 10:31:43: Starting between task wait - seconds: 30
2011-09-30 10:32:13: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:32:13: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:32:15: Scheduler - No jobs found.
2011-09-30 10:32:16: Starting between task wait - seconds: 30
2011-09-30 10:32:46: Scheduler - Contacting Deadline Pulse running on “192.168.1.2”, port 5046.
2011-09-30 10:32:46: Scheduler - Requesting work from Deadline Pulse…
2011-09-30 10:32:48: Scheduler - No jobs found.
2011-09-30 10:32:49: Starting between task wait - seconds: 30


2011-09-30 00:06:48: BEGIN - AIBLADE-06\root
2011-09-30 00:06:48: Perfoming remote admin check
2011-09-30 00:16:48: Perfoming remote admin check
2011-09-30 00:26:48: Perfoming remote admin check
2011-09-30 00:36:48: Perfoming remote admin check
2011-09-30 00:46:48: Perfoming remote admin check
2011-09-30 00:56:48: Perfoming remote admin check
2011-09-30 01:06:48: Perfoming remote admin check
2011-09-30 01:16:48: Perfoming remote admin check
2011-09-30 01:26:48: Perfoming remote admin check
2011-09-30 01:36:48: Perfoming remote admin check
2011-09-30 01:46:49: Perfoming remote admin check
2011-09-30 01:56:49: Perfoming remote admin check
2011-09-30 02:06:49: Perfoming remote admin check
2011-09-30 02:16:49: Perfoming remote admin check
2011-09-30 02:26:49: Perfoming remote admin check
2011-09-30 02:36:49: Perfoming remote admin check
2011-09-30 02:46:49: Perfoming remote admin check
2011-09-30 02:56:49: Perfoming remote admin check
2011-09-30 03:06:49: Perfoming remote admin check
2011-09-30 03:16:49: Perfoming remote admin check
2011-09-30 03:26:49: Perfoming remote admin check
2011-09-30 03:36:49: Perfoming remote admin check
2011-09-30 03:46:49: Perfoming remote admin check
2011-09-30 03:56:49: Perfoming remote admin check
2011-09-30 04:06:50: Perfoming remote admin check
2011-09-30 04:16:50: Perfoming remote admin check
2011-09-30 04:26:50: Perfoming remote admin check
2011-09-30 04:36:50: Perfoming remote admin check
2011-09-30 04:46:50: Perfoming remote admin check
2011-09-30 04:56:50: Perfoming remote admin check
2011-09-30 05:06:50: Perfoming remote admin check
2011-09-30 05:16:50: Perfoming remote admin check
2011-09-30 05:26:51: Perfoming remote admin check
2011-09-30 05:36:51: Perfoming remote admin check
2011-09-30 05:46:51: Perfoming remote admin check
2011-09-30 05:56:51: Perfoming remote admin check
2011-09-30 06:06:51: Perfoming remote admin check
2011-09-30 06:16:51: Perfoming remote admin check
2011-09-30 06:26:51: Perfoming remote admin check
2011-09-30 06:36:51: Perfoming remote admin check
2011-09-30 06:46:51: Perfoming remote admin check
2011-09-30 06:56:51: Perfoming remote admin check
2011-09-30 07:06:51: Perfoming remote admin check
2011-09-30 07:16:51: Perfoming remote admin check
2011-09-30 07:26:51: Perfoming remote admin check
2011-09-30 07:36:51: Perfoming remote admin check
2011-09-30 07:46:51: Perfoming remote admin check
2011-09-30 07:56:51: Perfoming remote admin check
2011-09-30 08:06:51: Perfoming remote admin check
2011-09-30 08:16:51: Perfoming remote admin check
2011-09-30 08:26:51: Perfoming remote admin check
2011-09-30 08:36:51: Perfoming remote admin check
2011-09-30 08:46:51: Perfoming remote admin check
2011-09-30 08:56:52: Perfoming remote admin check
2011-09-30 09:06:52: Perfoming remote admin check
2011-09-30 09:16:52: Perfoming remote admin check
2011-09-30 09:26:52: Perfoming remote admin check
2011-09-30 09:36:52: Perfoming remote admin check
2011-09-30 09:46:52: Perfoming remote admin check
2011-09-30 09:56:52: Perfoming remote admin check
2011-09-30 10:06:52: Perfoming remote admin check
2011-09-30 10:16:52: Perfoming remote admin check
2011-09-30 10:24:55: Enqueing: Launch Slave
2011-09-30 10:24:55: Dequeued: Launch Slave
2011-09-30 10:24:55: Local python version file: /usr/local/Thinkbox/Deadline/python/2.6.7/Version
2011-09-30 10:24:55: Network python version file: /mnt/DeadlineRepository/python/Linux/2.6.7/Version
2011-09-30 10:24:55: Comparing python version files
2011-09-30 10:24:55: Python upgrade skipped because Version files are the same
2011-09-30 10:24:55: Local version file: /usr/local/Thinkbox/Deadline/bin/Version
2011-09-30 10:24:55: Network version file: /mnt/DeadlineRepository/bin/Linux/Version
2011-09-30 10:24:55: Comparing version files
2011-09-30 10:24:55: Launching Slave: Aiblade-06
2011-09-30 10:26:52: Perfoming remote admin check

Thanks! Based on the log, it looks like Pulse isn’t finding any jobs for this machine. Could a combination of pool, group, limit, or blacklist/whitelist settings be preventing this job from rendering on this machine? A quick way to see which slaves your job can render on is to use the Slave Availability Filter in the Deadline Monitor:
thinkboxsoftware.com/deadlin … ulse_Panel (see second paragraph)

If this slave gets filtered out when you click on the job in the Monitor, then we know it’s a job setting that’s preventing it from rendering on this machine.
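
For anyone reading along, here is a rough Python sketch of how the "Pulse isn’t finding any jobs" pattern can be confirmed from a verbose slave log without reading it by eye. It only relies on the line format visible in the log pasted above; the function name `summarize_slave_log` is made up for this example and is not part of Deadline.

```python
import re
from collections import Counter

# Matches lines shaped like the verbose slave log above, e.g.
#   2011-09-30 10:25:05: Scheduler - No jobs found.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): (.*)$")


def summarize_slave_log(lines):
    """Count how often the slave asked Pulse for work and how often it got nothing."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines or anything that is not a log entry
        message = match.group(2)
        if "Requesting work from Deadline Pulse" in message:
            counts["requests"] += 1
        elif "No jobs found" in message:
            counts["no_jobs"] += 1
    return counts


# A few lines copied from the log in this thread.
sample = [
    "2011-09-30 10:24:59: Scheduler - Requesting work from Deadline Pulse…",
    "2011-09-30 10:25:05: Scheduler - No jobs found.",
    "2011-09-30 10:25:05: Starting between task wait - seconds: 30",
    "2011-09-30 10:25:35: Scheduler - Requesting work from Deadline Pulse…",
    "2011-09-30 10:25:38: Scheduler - No jobs found.",
]

summary = summarize_slave_log(sample)
print(summary["requests"], "requests,", summary["no_jobs"], "came back empty")
```

If every request comes back empty while other slaves are rendering the same job, the slave is reaching Pulse fine and the cause is more likely a job or slave setting than a connection problem.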

Cheers,

  • Ryan

Hi Ryan,

The Slave Availability Filter is removing the problem slave from the slave window. Pools and groups are not set for any of the slaves, and the Limits window doesn’t list any of the slaves. The Deadline version is correct, and so is the 3D software version, “Maya 2012 hot-fix 4”.

Is there anything else I have missed?

Cheers

Carl

Hi Carl,

These are the things that can affect whether or not a slave will pick up a job:

  1. Pool or Group. Since these are not set up for any slave (and I’m assuming you’re submitting the job to the ‘none’ pool and group), we can rule this out.

  2. The job’s machine limit. If this is set to 0, that means it’s disabled, and we can rule it out.

  3. The job’s whitelist/blacklist. You can check this from the Machine Limit tab in the Job Properties dialog in the Monitor.

  4. Any Limits the job uses. You can check if the job is using any Limits in the Job Properties dialog in the Monitor.

  5. The slave has added itself to the job’s bad list, although this would be apparent in the slave list with the availability filter enabled.
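
The five checks above can be sketched as a small Python function that lists every reason a given slave would be skipped for a given job. This is only an illustration of the checklist, not how Deadline implements scheduling: the `Job` and `Slave` classes, their field names, and `blocking_reasons` are all hypothetical, and treating ‘none’ as "any slave may take it" is an assumption based on point 1.

```python
from dataclasses import dataclass, field


@dataclass
class Slave:
    # Hypothetical model of a slave; not a Deadline API object.
    name: str
    pools: set = field(default_factory=set)
    groups: set = field(default_factory=set)


@dataclass
class Job:
    # Hypothetical model of the job settings named in the list above.
    pool: str = "none"
    group: str = "none"
    machine_limit: int = 0           # 0 means the machine limit is disabled
    machines_rendering: int = 0      # slaves currently working on the job
    whitelist: set = field(default_factory=set)
    blacklist: set = field(default_factory=set)
    full_limits: set = field(default_factory=set)   # Limits in use with no room left
    bad_slaves: set = field(default_factory=set)    # slaves on the job's bad list


def blocking_reasons(job, slave):
    """Return the checklist items that would stop this slave from taking the job."""
    reasons = []
    if job.pool != "none" and job.pool not in slave.pools:
        reasons.append("pool")
    if job.group != "none" and job.group not in slave.groups:
        reasons.append("group")
    if job.machine_limit > 0 and job.machines_rendering >= job.machine_limit:
        reasons.append("machine limit")
    if job.whitelist and slave.name not in job.whitelist:
        reasons.append("whitelist")
    if slave.name in job.blacklist:
        reasons.append("blacklist")
    if job.full_limits:
        reasons.append("limit")
    if slave.name in job.bad_slaves:
        reasons.append("bad slave list")
    return reasons


slave = Slave(name="Aiblade-06")
job = Job(bad_slaves={"Aiblade-06"})
print(blocking_reasons(job, slave))
```

Returning every reason rather than stopping at the first one makes it easier to spot cases where more than one setting is excluding the machine.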

If this doesn’t help you track down the problem, let us know!

Cheers,

  • Ryan

Great! Glad to hear it!

For future reference, here is the documentation for Failure Detection, which includes info about “bad slaves”:
thinkboxsoftware.com/deadlin … detection/

Cheers,

  • Ryan

Hi Ryan,

I’m having the same issue as before regarding “performing remote admin check”. Slaves are stalling and require a reboot each time. I have checked everything listed earlier and all of it looks normal.

See attached

Cheers

Carl

Also,

I can’t see any of my render machines in Pulse.

Hi Carl,

The “performing remote admin check” output is actually printed out by the Launcher, so it’s completely unrelated to the issue you’re seeing here. Can you check if the Slave application and the render process (I’m guessing it’s Maya) are still running? If so, then the problem could be that the render is what is stalling out. Deadline is waiting for the render process to finish, so if it gets stuck, it can make it look like the slave is stuck.
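
As a rough aid for that check on a Linux slave, the Python sketch below lists matching processes along with how long they have been running. The name fragments `deadlineslave` and `maya` are guesses and should be adjusted to whatever the processes are actually called on your machines; the function names are made up for this example.

```python
import subprocess


def find_processes(ps_output, fragments):
    """Return (pid, elapsed, command) rows whose command mentions any fragment."""
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header row printed by ps
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, elapsed, command = parts
        if any(fragment.lower() in command.lower() for fragment in fragments):
            rows.append((int(pid), elapsed, command))
    return rows


def list_render_processes(fragments=("deadlineslave", "maya")):
    # "pid,etime,args" asks ps for the process ID, elapsed run time, and full command line.
    # The default fragments are assumptions, not confirmed process names.
    result = subprocess.run(
        ["ps", "-eo", "pid,etime,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_processes(result.stdout, fragments)


if __name__ == "__main__":
    for pid, elapsed, command in list_render_processes():
        print(pid, elapsed, command)
```

A render process whose elapsed time keeps growing well past the usual frame time, while the slave itself is still alive, would point to the render stalling rather than the slave.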

Do you mean that no slaves are able to connect to Pulse? If the slaves can’t connect to Pulse, they should be printing an error to their slave log. Do you have Pulse configured using its IP address or host name? Whichever one is being used, try pinging the same from the slave machines to see if they can resolve the Pulse server that way.
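
Ping only shows that the machine answers, so a slightly stronger test is to resolve the configured name and open a TCP connection to the Pulse port from the slave. Here is a minimal Python sketch of that, assuming Pulse listens on port 5046 as shown in the slave log earlier in this thread; `check_pulse_reachable` is a made-up helper name, not a Deadline tool.

```python
import socket


def check_pulse_reachable(host, port=5046, timeout=3.0):
    """Resolve the host, then try a TCP connection. Returns (ok, message)."""
    try:
        address = socket.gethostbyname(host)
    except socket.gaierror as exc:
        return False, f"cannot resolve {host!r}: {exc}"
    try:
        # Opening and immediately closing the connection is enough to show
        # that something is listening and no firewall is dropping the traffic.
        with socket.create_connection((address, port), timeout=timeout):
            return True, f"{host} resolved to {address}; port {port} accepted a connection"
    except OSError as exc:
        return False, f"{host} resolved to {address}, but port {port} failed: {exc}"


if __name__ == "__main__":
    # 192.168.1.2 is the Pulse address that appears in the logs above.
    ok, message = check_pulse_reachable("192.168.1.2")
    print("OK" if ok else "FAILED", "-", message)
```

Run it on a slave with the same IP address or host name that is entered in the Pulse settings, so the test follows the same path the slave uses.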

Cheers,

  • Ryan

Hi Ryan,

I believe you are right about the Maya plugin stalling. I have uploaded the logs to support Ticket #133311 for you to look at, as they are sensitive.

Regarding Pulse, the server is connecting to Pulse (see the logs). The issue lies with the Deadline Monitor: only two out of 14 servers appear in the Pulse tab. Pulse is configured by IP address, and a ping from the server to that IP address succeeds.

Hi Carl,

I just checked that ticket, but I don’t see any logs. Have you tried uploading them yet?

Note that the Pulse tab in the Monitor shows which machine Pulse is running on, as well as a bit of extra info about that machine. This list does not show which machines are connected to Pulse. If you want to see which Slaves are connected to Pulse, you can look at the Connected To Pulse column under the Slave tab.

Cheers,

  • Ryan

Hi Ryan,

I have created a new ticket, Ticket #513069. I have CC’d you on the ticket via email.

I was getting confused about the Pulse tab. I have been working too much, lol.
