AWS Thinkbox Discussion Forums

Pending job scan doesn't trigger

In Deadline 7 (Beta 3), it seems like the pending job scan isn't doing anything. I'm not sure what's going on. I have a frame pending job, and it's not tested by the pending job scan:

2014-09-18 09:14:55:  Repository Repair Thread - Performing repository repair
2014-09-18 09:14:55:  Pending Job Scan Thread - Performing pending job scan
2014-09-18 09:14:55:  Clean Up Thread - Performing house cleaning
2014-09-18 09:14:55:  Update timeout has been set to 1800 seconds
2014-09-18 09:14:55:  Update timeout has been set to 1800 seconds
2014-09-18 09:14:55:  Update timeout has been set to 1800 seconds
2014-09-18 09:14:55:  Stdout Handling Enabled: False
2014-09-18 09:14:55:  Stdout Handling Enabled: False
2014-09-18 09:14:55:  Popup Handling Enabled: False
2014-09-18 09:14:55:  Using Process Tree: True
2014-09-18 09:14:55:  Popup Handling Enabled: False
2014-09-18 09:14:55:  Using Process Tree: True
2014-09-18 09:14:55:  Hiding DOS Window: True
2014-09-18 09:14:55:  Hiding DOS Window: True
2014-09-18 09:14:55:  Stdout Handling Enabled: False
2014-09-18 09:14:55:  Popup Handling Enabled: False
2014-09-18 09:14:55:  Creating New Console: False
2014-09-18 09:14:55:  Creating New Console: False
2014-09-18 09:14:55:  Using Process Tree: True
2014-09-18 09:14:55:  Running as user: root
2014-09-18 09:14:55:  Hiding DOS Window: True
2014-09-18 09:14:55:  Running as user: root
2014-09-18 09:14:55:  Creating New Console: False
2014-09-18 09:14:55:  Running as user: root
2014-09-18 09:14:55:  Executable: "/opt/Thinkbox/Deadline7/bin/deadlinecommand.exe"
2014-09-18 09:14:55:  Executable: "/opt/Thinkbox/Deadline7/bin/deadlinecommand.exe"
2014-09-18 09:14:55:  Executable: "/opt/Thinkbox/Deadline7/bin/deadlinecommand.exe"
2014-09-18 09:14:55:  Argument: -DoPendingJobScan False True
2014-09-18 09:14:55:  Argument: -DoRepositoryRepair False True
2014-09-18 09:14:55:  Startup Directory: "/opt/Thinkbox/Deadline7/bin"
2014-09-18 09:14:55:  Startup Directory: "/opt/Thinkbox/Deadline7/bin"
2014-09-18 09:14:55:  Argument: -DoHouseCleaning False True
2014-09-18 09:14:55:  Startup Directory: "/opt/Thinkbox/Deadline7/bin"
2014-09-18 09:14:55:  Process Priority: BelowNormal
2014-09-18 09:14:55:  Process Priority: BelowNormal
2014-09-18 09:14:55:  Process Priority: BelowNormal
2014-09-18 09:14:55:  Process Affinity: default
2014-09-18 09:14:55:  Process is now running
2014-09-18 09:14:55:  Process Affinity: default
2014-09-18 09:14:55:  Process is now running
2014-09-18 09:14:55:  Process Affinity: default
2014-09-18 09:14:55:  Process is now running
2014-09-18 09:14:56:  Performing Pending Job Scan...
2014-09-18 09:14:56:  Performing house cleaning
2014-09-18 09:14:56:  Performing Job Cleanup Scan...
2014-09-18 09:14:56:      Pending Job Scan - Loading pending and active jobs
2014-09-18 09:14:56:      Job Cleanup Scan - Loading completed jobs
2014-09-18 09:14:56:      Job Cleanup Scan - Loaded 0 completed jobs in 21.243 ms
2014-09-18 09:14:56:      Job Cleanup Scan - Done.
2014-09-18 09:14:56:  Purging Unsubmitted Jobs
2014-09-18 09:14:56:      Unsubmitted Job Scan - Loading unsubmitted jobs
2014-09-18 09:14:56:      Unsubmitted Job Scan - Loaded 0 unsubmitted jobs in 1.515 ms
2014-09-18 09:14:56:      Unsubmitted Job Scan - Done.
2014-09-18 09:14:56:  Purging Deleted Jobs
2014-09-18 09:14:56:      Deleted Job Scan - Loading deleted jobs
2014-09-18 09:14:56:      Deleted Job Scan - Loaded 0 deleted jobs in 2.266 ms
2014-09-18 09:14:56:      Deleted Job Scan - Done.
2014-09-18 09:14:56:  Purging Old Job Auxiliary Files
2014-09-18 09:14:56:      Auxiliary File Scan - Scanning for auxiliary directories
2014-09-18 09:14:56:      Auxiliary File Scan - Found 12 auxiliary directories in 4.042 ms
2014-09-18 09:14:56:      Auxiliary File Scan - Loading job IDs
2014-09-18 09:14:56:      Auxiliary File Scan - Loaded 12 job IDs in 6.132 ms
2014-09-18 09:14:56:      Pending Job Scan - Loaded 4 pending and active jobs in 72.074 ms
2014-09-18 09:14:56:      Pending Job Scan - Scanning pending and active jobs
2014-09-18 09:14:56:      Auxiliary File Scan - Purged 0 auxiliary folders in 650.000 μs
2014-09-18 09:14:56:      Auxiliary File Scan - Done.
2014-09-18 09:14:56:  Purging Old Job Reports
2014-09-18 09:14:56:      Job Report Scan - Loading job report collections
2014-09-18 09:14:56:      Job Report Scan - Found 11 report collections in 5.575 ms
2014-09-18 09:14:56:      Job Report Scan - Loading job IDs
2014-09-18 09:14:56:      Job Report Scan - Loaded 12 job IDs in 1.206 ms
2014-09-18 09:14:56:      Job Report Scan - Purged 0 report collections in 93.000 μs
2014-09-18 09:14:56:      Job Report Scan - Purging old job report files
2014-09-18 09:14:56:      Job Report Scan - Purged 0 report files in 6.478 ms
2014-09-18 09:14:56:      Job Report Scan - Done.
2014-09-18 09:14:56:  Purging Obsolete Slaves
2014-09-18 09:14:56:      Obsolete Slave Scan - Skipping because it is disabled in the Repository Options
2014-09-18 09:14:56:  Purging Old Slave Reports
2014-09-18 09:14:56:      Slave Report Scan - Loading slave report collections
2014-09-18 09:14:56:      Slave Report Scan - Found 17 report collections in 2.351 ms
2014-09-18 09:14:56:      Slave Report Scan - Loading slave IDs
2014-09-18 09:14:56:      Slave Report Scan - Loaded 18 slave IDs in 6.948 ms
2014-09-18 09:14:56:      Slave Report Scan - Purged 0 report collections in 223.000 μs
2014-09-18 09:14:56:      Slave Report Scan - Done.
2014-09-18 09:14:56:  Purging Old Limits
2014-09-18 09:14:56:      Old Limit Scan - Loading machine limits
2014-09-18 09:14:56:      Old Limit Scan - Found 12 machine limits in 3.393 ms
2014-09-18 09:14:56:      Old Limit Scan - Loading job IDs
2014-09-18 09:14:56:      Old Limit Scan - Loaded 12 job IDs in 1.429 ms
2014-09-18 09:14:56:      Old Limit Scan - Purged 0 machine limits in 102.000 μs
2014-09-18 09:14:56:      Old Limit Scan - Done.
2014-09-18 09:14:56:  Purging Temporary Repository Files
2014-09-18 09:14:56:      Temporary File Scan - Scanning for 'connectReadWriteTest' files
2014-09-18 09:14:56:      Temporary File Scan - Deleted 0 temporary files in 2.368 ms
2014-09-18 09:14:56:      Temporary File Scan - Done.
2014-09-18 09:14:56:  Purging Old Statistics
2014-09-18 09:14:56:      Old Statistics - Skipping job statistics because the option to purge them is disabled in the Repository Options
2014-09-18 09:14:56:      Old Statistics - Purging slave statistics that are older than May 21/14  17:21:20
2014-09-18 09:14:56:  Performing repository repair
2014-09-18 09:14:56:  Performing Orphaned Task Scan...
2014-09-18 09:14:56:      Orphaned Task Scan - Loading rendering jobs
2014-09-18 09:14:56:      Orphaned Task Scan - Loaded 3 rendering jobs in 26.242 ms
2014-09-18 09:14:56:      Orphaned Task Scan - Scanning for orphaned tasks
2014-09-18 09:14:56:      Orphaned Task Scan - Separated jobs into 1 lists of 100
2014-09-18 09:14:56:      Orphaned Task Scan - Scanning job list 1 of 1 (3 jobs)
2014-09-18 09:14:56:      Pending Job Scan - Error occurred while scanning job "540130027a3a9e1be02480d5": Object reference not set to an instance of an object (System.NullReferenceException)
2014-09-18 09:14:56:      Pending Job Scan - Released 0 pending jobs and 0 pending tasks in 63.330 ms
2014-09-18 09:14:56:      Pending Job Scan - Done.
2014-09-18 09:14:56:  Processing Pending Job Events
2014-09-18 09:14:56:      Pending Job Events - Checking for pending job events
2014-09-18 09:14:56:      Pending Job Events - Processing 0 job events
2014-09-18 09:14:56:      Old Statistics - Purged old slave statistics in 21.828 ms
2014-09-18 09:14:56:      Old Statistics - Purging repository statistics that are older than May 21/14  17:21:20
2014-09-18 09:14:56:      Old Statistics - Purged old repository statistics in 812.000 μs
2014-09-18 09:14:56:  Purging Deleted Document Stubs From Database
2014-09-18 09:14:56:      Deleted Document Stubs - Deleting stubs that are older than 3 days
2014-09-18 09:14:56:      Deleted Document Stubs - Deleted 0 stubs in 765.000 μs
2014-09-18 09:14:56:  Triggering House Cleaning Events
2014-09-18 09:14:56:      Orphaned Task Scan - Cleaned up 0 orphaned tasks in 36.991 ms
2014-09-18 09:14:56:      Orphaned Task Scan - Done.
2014-09-18 09:14:56:  Performing Orphaned Limit Stub Scan...
2014-09-18 09:14:56:      Orphaned Limit Stub Scan - Loading limits
2014-09-18 09:14:56:      Orphaned Limit Stub Scan - Loaded 12 limits in 3.220 ms
2014-09-18 09:14:56:      Orphaned Limit Stub Scan - Loading slave states
2014-09-18 09:14:56:      Orphaned Limit Stub Scan - Loaded 18 slave states in 11.805 ms
2014-09-18 09:14:56:      Orphaned Limit Stub Scan - Returned 0 orphaned limit stubs in 8.408 ms
2014-09-18 09:14:56:      Orphaned Limit Stub Scan - Done.
2014-09-18 09:14:56:  Checking Available Database Connections
2014-09-18 09:14:56:      Available Database Connections - Skipping because there are no Low Database Connection notification email addresses set in the Repository Options
2014-09-18 09:14:56:  Performing Stalled Slave Scan...
2014-09-18 09:14:56:      Stalled Slave Scan - Loading slave states
2014-09-18 09:14:56:      Stalled Slave Scan - Loaded 18 slave states in 2.188 ms
2014-09-18 09:14:56:      Stalled Slave Scan - Scanning slave states
2014-09-18 09:14:56:      Stalled Slave Scan - Cleaned up 0 stalled slaves in 7.408 ms
2014-09-18 09:14:56:      Stalled Slave Scan - Done.
2014-09-18 09:14:56:  Triggering Repository Repair Events
2014-09-18 09:14:56:  ScanlineEventListener _init__ called, wiping PYTHONPATH from deadlineslave's embedded python
2014-09-18 09:14:56:  ScanlineEventListener _init__ called, wiping PYTHONPATH from deadlineslave's embedded python
2014-09-18 09:14:56:      Pending Job Events - No more job events to process
2014-09-18 09:14:56:      Pending Job Events - Done.
2014-09-18 09:14:56:  ScanlineEventListener _init__ called, wiping PYTHONPATH from deadlineslave's embedded python
2014-09-18 09:14:57:  Process exit code: 0
2014-09-18 09:14:57:  Process exit code: 0
2014-09-18 09:14:57:  Process exit code: 0

It doesn't seem like any tests are done by the pending job scanner, even though:

Thanks for reporting this! I’m seeing this error, which I’m guessing is the reason it’s not working:

We’re looking into it on our end. If possible though, could you export and upload the pending job? We can drop it in our database, and that might help speed up debugging.

Thanks!
Ryan

You don't mean archive, right? Just the db entries?

To be able to archive it, I would have to change its state.

Attached are the Jobs, JobTasks and Limits for the job
json__540130027a3a9e1be02480d5.tar (100 KB)

Archiving it should be fine. We would import it back into our db, and then put it in the pending state and test.

Attached is the archive!
laszlo.sebo__3dsmax__[EXO] RS_190_2060_v0402_lse_pleaseDontExplo_images_render3d_FL-Happy_L_0 __540130027a3a9e1be02480d5.zip (10.5 KB)

Hey Laszlo,

So far I have not been able to reproduce this at all. The job you archived seemed to work perfectly fine. Would you be able to try resubmitting the job, or importing the archived one, to see if it still doesn't work?

Thanks,
Grant

I tried reimporting the job, and I get the same error with the new ID:

2014-09-19 04:58:17: Pending Job Scan - Error occurred while scanning job "54012ff57a3a9e066c3e60dd": Object reference not set to an instance of an object (System.NullReferenceException)

I ran into this same issue a few weeks ago in Beta 2. After a bunch of troubleshooting, I found my problem to be the firewall on the Deadline Repository server. I had previously configured it with the correct ports for an earlier version of Mongo and Deadline. After disabling the firewall completely, everything ran fine. I then did a complete uninstall and reinstall of Deadline 7 and verified the Mongo port. It seems to be working now.

Not sure if this is the same error as yours, I didn’t save the error code. Hopefully this is helpful…
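For anyone who wants to rule out the firewall before reinstalling, a quick TCP connection test from a client machine to the database port can help. The following is only a minimal sketch: the host name and port in the usage comment are placeholders, not values taken from this thread, so substitute whatever your repository is actually configured with.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Usage (hypothetical host and port, replace with your own):
#   port_is_open("my-repository-db", 27017)
```

A False result only says the connection could not be made from that machine. It does not tell you whether the firewall, the database service or the network is at fault.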

Thanks for the tip! We double checked the firewalls and they - in theory - are ok.

The error seems to happen for new submissions too:

2014-09-19 07:21:49:      Pending Job Scan - Error occurred while scanning job "54012ff57a3a9e066c3e60dd": Object reference not set to an instance of an object (System.NullReferenceException)
2014-09-19 07:21:49:      Pending Job Scan - Error occurred while scanning job "541c92233db7593610939189": Object reference not set to an instance of an object (System.NullReferenceException)
2014-09-19 07:21:49:      Pending Job Scan - Error occurred while scanning job "541c922b3db7592fa896667f": Object reference not set to an instance of an object (System.NullReferenceException)
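When the scan runs often, it can be easier to pull the failing job IDs out of the log than to read it by eye. Below is a rough sketch that assumes the error lines keep the shape shown above. The function and variable names are made up for illustration, and the sample lines are copied from the logs in this thread.

```python
import re

# Matches lines of the form:
#   Pending Job Scan - Error occurred while scanning job "<id>": <message> (<ExceptionType>)
# Straight and curly quotes are both accepted, since pasted logs may contain either.
ERROR_RE = re.compile(
    r'Pending Job Scan - Error occurred while scanning job '
    r'["\u201c](?P<job_id>[0-9a-f]+)["\u201d]: '
    r'(?P<message>.*) \((?P<exception>[\w.]+)\)\s*$'
)


def failed_jobs(lines):
    """Return (job_id, exception_type) for every pending job scan error line."""
    hits = []
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            hits.append((match.group("job_id"), match.group("exception")))
    return hits


sample = [
    '2014-09-19 07:21:49:      Pending Job Scan - Error occurred while scanning job "54012ff57a3a9e066c3e60dd": Object reference not set to an instance of an object (System.NullReferenceException)',
    '2014-09-19 07:21:49:      Pending Job Scan - Error occurred while scanning job "541c92233db7593610939189": Object reference not set to an instance of an object (System.NullReferenceException)',
    '2014-09-18 09:14:56:      Pending Job Scan - Released 0 pending jobs and 0 pending tasks in 63.330 ms',
]

for job_id, exception in failed_jobs(sample):
    print(job_id, exception)
```

Lines that are not scan errors, such as the "Released 0 pending jobs" summary, are simply skipped.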

Hey Laszlo,
Might I ask what kind of job you are submitting, so I can try to reproduce it here through a new submission?

Grant

Or if it’s a custom submission script, can you show us the key/values you’re setting in the job info file?

Sure. Note that it's identical to all the submissions we do with Deadline 6.2.
max_job_info_0.zip (5.26 KB)

I tried to run the scan manually from another machine, but I get this:

C:\Program Files\Thinkbox\Deadline7\bin>deadlinecommand -DoPendingJobScan True
Skipping pending job scan because it is not required at this time

If you are trying to get a slave to do the check, note that slaves will only do a pending job scan if enough time has passed since the last one.
If you get the Monitor or Pulse to run the pending job scan, it should skip this check.

Grant
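The behaviour described above is a simple interval check with a bypass. The sketch below only illustrates that idea and is not Deadline's actual code: the function name, its parameters and the 60-second interval in the usage comment are assumptions made for the example.

```python
from datetime import datetime, timedelta


def should_run_scan(last_scan, now, interval_seconds, forced=False):
    """Decide whether a pending job scan should run.

    last_scan is the time of the previous scan (None if there was none),
    and forced=True models a caller that bypasses the interval check.
    """
    if forced or last_scan is None:
        return True
    # Otherwise run only when the configured interval has fully elapsed.
    return now - last_scan >= timedelta(seconds=interval_seconds)


# Usage (illustrative values only):
#   last = datetime(2014, 9, 19, 7, 0, 0)
#   should_run_scan(last, last + timedelta(seconds=30), 60)        # interval not yet elapsed
#   should_run_scan(last, last + timedelta(seconds=30), 60, True)  # bypasses the check
```

Under this model, a manual run that reports "not required at this time" just means another machine completed a scan within the interval.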

We have the pending check interval set to 60 seconds. I shut down Pulse, so that it doesn’t attempt the dependency checks itself.
It seems like the overeager slaves jumped in right away.

I managed to do a manual cycle once all slaves were shut down. Running the dependency check from this machine worked, so it must be somehow related to the Pulse machine we use for Deadline 7.

Actually, I just noticed an error even on this machine:

Dependency script returned 121 tasks that can start: \S2\exchange\software\managed\pythonScripts\site-packages\scl\deadline\scriptDependency.py
Pending Job Scan - Error occurred while scanning job "54012ff57a3a9e066c3e60dd": An entry with the same key already exists. (System.ArgumentException)

Is beta 3 installed on the Pulse machine? If it is, can you try a reinstall of the client on it? It’s weird that this machine would have problems but not the other…

Hey Laszlo,

I have run into that second error (an entry with the same key) myself and have fixed it on our end. What is happening is that the post job task is being added multiple times. The fix will be in the next build. For now you can get around this by having the script remove task -1.

Grant
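As a rough illustration of that workaround, the dependency script could filter its result before returning it. This is only a sketch: the helper name is made up, and whether the task IDs arrive as integers or strings is an assumption, which is why the comparison is done on their string form.

```python
def drop_task(task_ids, excluded_id=-1):
    """Return task_ids without excluded_id, keeping the original order.

    IDs are compared as strings so that both -1 and "-1" are removed.
    """
    excluded = str(excluded_id)
    return [task for task in task_ids if str(task).strip() != excluded]


# Usage inside a dependency script (hypothetical variable name):
#   releasable = drop_task(releasable)
print(drop_task([-1, 0, 1, 2]))
```

Returning a new list rather than mutating the input keeps the helper safe to call on whatever list the script has already built.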

Did a clean reinstall of the client on the machine, still getting this:

[root@deadline03 bin]# ./deadlinecommand -DoPendingJobScan true
Performing Pending Job Scan...
Pending Job Scan - Loading pending and active jobs
Pending Job Scan - Loaded 6 pending and active jobs in 70.111 ms
Pending Job Scan - Scanning pending and active jobs
Pending Job Scan - Error occurred while scanning job "541cd0803db75925bc62acdb": Object reference not set to an instance of an object (System.NullReferenceException)
Pending Job Scan - Error occurred while scanning job "541cd0783db75935406c6716": Object reference not set to an instance of an object (System.NullReferenceException)
Pending Job Scan - Error occurred while scanning job "541c92233db7593610939189": Object reference not set to an instance of an object (System.NullReferenceException)
Pending Job Scan - Error occurred while scanning job "541c922b3db7592fa896667f": Object reference not set to an instance of an object (System.NullReferenceException)
Pending Job Scan - Error occurred while scanning job "54012ff57a3a9e066c3e60dd": Object reference not set to an instance of an object (System.NullReferenceException)
Pending Job Scan - Released 0 pending jobs and 0 pending tasks in 121.330 ms
Pending Job Scan - Done.
Processing Pending Job Events
Pending Job Events - Checking for pending job events
Pending Job Events - Processing 0 job events
ScanlineEventListener _init__ called, wiping PYTHONPATH from deadlineslave's embedded python
Pending Job Events - No more job events to process
Pending Job Events - Done.
