Hi there,
We have an average submission time to deadline of around 5-9 seconds. Sometimes, this causes problems, as we have batch submissions of 1-200 jobs in a single go.
Would using the “SubmitMultipleJobs” command line option help with this? I’m hoping we could just do all the preparations, then submit all of them in the time it currently takes to submit a single job.
cheers
laszlo
That can help, assuming the slowdown is related to the startup time of deadlinecommand, and not related to the actual submission of the job (which could be slow if the database is under duress).
If you’re submitting 200 jobs at once, the command line arguments for that could get unwieldy (and probably won’t even work due to Windows’ limitation on the length of a single command line). The workaround for that is to write your command line to a text file (one argument per line) and pass that file as the only argument to deadlinecommand.exe.
I was going to link to documentation about this, but it turns out we don’t have any, so that’s been added to the todo list.
Cheers,
Ryan
I think the time comes from database access, not the executable startup. So we probably wouldn’t really gain much by doing the submission in one go.
Why does it take so long to publish btw? Is there a way to timestamp what activities go on? It takes a very long time considering its only a couple of db inserts…
Do you guys submit your scene files with the job, or have any event plugins that are enabled? Both could impact submission performance.
These jobs are nuke scripts, so the file size is negligible.
If you guys have a ‘full logging, extreme verbosity’ version of the latest deadlinecommand.exe, i could send you guys some logs on what takes how long. Feels to me as if the db connections might be renegotiated during submission multiple times. But that’s just a wild guess…
I did some metrics using proc monitor. Its from a remote box, so submission time & access to the repository is slower than normal, but might be good as a reference.
Ill redo this test later on a machine local to the db/repo:
Full submission time took 19 seconds.
Biggest offenders:
~0.8 seconds to load python modules
~0.5 seconds to read dbConnect.xml (1st time)
Doing a read write test on the repo (~0.7s)
~0.1 seconds to read dbConnect.xml (2nd time)
~0.8 seconds #2 mongodb communication
~2.5 seconds #5 mongodb communication (tons of TCP Receive(?) operations from primary server only, large total data amount)
~0.4 seconds Finding plugins folder on the repository
~1.8 seconds #7 mongodb communication (TCP mostly sends, some receives)
~0.9 seconds repo Folder prep
~1.7 seconds #1Event handling (loading off of repo folder)
~5 seconds #2 More event parsing (draft/shotgun)
The full log:
deadlinecommand load:
1/14/2015 11:25:53.9504733 AM
...
1/14/2015 11:25:54.0682932 AM
deadlinecommand init (loading local ini files):
1/14/2015 11:25:54.0725256 AM
...
1/14/2015 11:25:54.0889985 AM
init mongo / check version / load .net:
1/14/2015 11:25:54.1055454 AM
...
1/14/2015 11:25:54.1582944 AM
load python (load dlls, init socket, load network modules) ~0.8 seconds:
1/14/2015 11:25:54.1624187 AM
...
1/14/2015 11:25:55.0053434 AM
Additional traffic to registry
1/14/2015 11:25:55.0057352 AM
...
1/14/2015 11:25:55.1139027 AM
reads the job info file:
1/14/2015 11:25:55.1147203 AM
...
1/14/2015 11:25:55.1179406 AM
Further reads from local deadline ini / lock files:
1/14/2015 11:25:55.1203943 AM
...
1/14/2015 11:25:55.1219137 AM
Loading mongo db driver functions
1/14/2015 11:25:55.1226436 AM
...
1/14/2015 11:25:55.1487341 AM
Reading the repository dbConnect.xml file (~0.5 seconds):
1/14/2015 11:25:55.3595884 AM
...
1/14/2015 11:25:55.8483887 AM
Doing a read write test on the repo (~0.7s):
1/14/2015 11:25:55.8825280 AM
...
1/14/2015 11:25:56.5183121 AM
Misc magic (dll load, registry parse):
1/14/2015 11:25:56.5640402 AM
...
1/14/2015 11:25:56.6373449 AM
Loading the dbConnect.xml file again (~0.1 s):
1/14/2015 11:25:56.8350399 AM
...
1/14/2015 11:25:56.9449495 AM
Misc magic (loading more dlls.. system.drawing, system.xaml, accessibility etc. .NET init?):
1/14/2015 11:25:56.9866331 AM
...
1/14/2015 11:25:57.1464386 AM
#1 Actual mongodb communication (tiny TCP operations, 40-50 bytes at a time, to both primary/secondary members of our replica set servers):
1/14/2015 11:25:57.1802958 AM
...
1/14/2015 11:25:57.3352186 AM
Misc magic (more dll, registry parses)
1/14/2015 11:25:57.3403089 AM
...
1/14/2015 11:25:57.3650399 AM
#2 Actual mongodb communication (tiny TCP operations, 50-700 bytes in size, to all 3 members of the replica set (primary, secondary, arbiter)) ~0.8s:
1/14/2015 11:25:57.4181831 AM
...
1/14/2015 11:25:58.2470009 AM
Read plugin info file:
1/14/2015 11:25:58.2651883 AM
...
1/14/2015 11:25:58.2653470 AM
Misc registry entry lookups to HKLM\SOFTWARE\Microsoft\.NETFramework\JitLockWrite
...
#3 Mongodb communication (TCP operations to primary server only):
1/14/2015 11:25:58.3980576 AM
...
1/14/2015 11:25:58.3981494 AM
Getting current time (registry operations, dll loads):
1/14/2015 11:25:58.4008535 AM
...
1/14/2015 11:25:58.4040940 AM
Mystery delay of 0.15 seconds
#4 Mongodb communication (TCP operations to primary server only):
1/14/2015 11:25:58.5463778 AM
...
1/14/2015 11:25:58.5465273 AM
Misc registry entry lookups to HKLM\SOFTWARE\Microsoft\.NETFramework\JitLockWrite
...
#5 More mongodb communication (TCP RECEIVE! operations from primary server only (2 sends.. then hundreds of receives, large total data amount)) ~2.5s:
1/14/2015 11:25:58.7376985 AM
...
1/14/2015 11:26:01.2948320 AM
Finding plugins folder on the repository ~0.4 sec:
1/14/2015 11:26:01.4725208 AM
...
1/14/2015 11:26:01.8344559 AM
#6 Short Mongo communication with primary server:
1/14/2015 11:26:01.9023781 AM
..
1/14/2015 11:26:01.9458681 AM
Misc registry entry lookups and dll loads, seems encryption related:
1/14/2015 11:26:01.9545307 AM
...
1/14/2015 11:26:02.0270375 AM
#7 Mongo communication, TCP mostly sends, some receives ~1.8 sec:
1/14/2015 11:26:02.0490815 AM
...
1/14/2015 11:26:03.8954246 AM
Repo Folder prep ~0.9sec:
1/14/2015 11:26:04.3548347 AM
...
1/14/2015 11:26:05.2442135 AM
#8 Mongo db
1/14/2015 11:26:05.3127979 AM
...
1/14/2015 11:26:05.4469499 AM
#1 Event handling (loading off of repo folder) ~1.7 seconds:
1/14/2015 11:26:05.4867881 AM
...
1/14/2015 11:26:07.1133460 AM
#9 Some db communication again:
1/14/2015 11:26:07.3964307 AM
...
1/14/2015 11:26:07.6159784 AM
#2 More event parsing (draft/shotgun) ~5seconds
1/14/2015 11:26:07.7520144 AM
...
1/14/2015 11:26:12.7356683 AM
Tying things up, some more deadline.ini editing, db communications and unloading of dlls:
1/14/2015 11:26:12.7418447 AM
...
1/14/2015 11:26:12.8439004 AM
Same test run on a local machine. As expected, any repository parsing operation is now much faster:
Total submit time: ~10 seconds
Biggest offenders:
~0.4 seconds: deadline command load (dlls and such)
~2.9 seconds: loading python and its modules
~0.7 seconds: #5 mongodb communication (tons of TCP Receive(?) operations from primary server only, large total data amount)
~2.0 seconds: #7 mongodb communication, TCP mostly sends, some receives
~0.3 seconds: #8 mongodb communication (mixed send/receive)
~0.3 seconds: #10 mongo db communication (mixed send/receive)
In total, there is about 3-4 seconds of mongodb communication, with one large read operation. The rest are mixed send/receives. Out of the 10 seconds, the single largest time hog is the python module log, giving 30% of the total time. Mongodb is around 35%, the rest is … hard to tell from process monitor…
deadlinecommand load ~0.4 seconds:
1/14/2015 12:32:48.6254444 PM
...
1/14/2015 12:32:49.0179723 PM
deadlinecommand init (loading local ini files):
1/14/2015 12:32:49.0249319 PM
...
1/14/2015 12:32:49.0476580 PM
init mongo / check version / load .net:
1/14/2015 12:32:49.0707631 PM
...
1/14/2015 12:32:49.1795583 PM
load python (load dlls, init socket, load network modules) ~ 2.9 seconds:
1/14/2015 12:32:49.1909973 PM
...
1/14/2015 12:32:52.0766502 PM
Additional traffic to registry (tons of HKLM\SOFTWARE\Microsoft\.NETFramework\JitLockWrite)
1/14/2015 12:32:52.0773099 PM
...
1/14/2015 12:32:52.2100691 PM
reads the job info file:
1/14/2015 12:32:52.2166521 PM
...
1/14/2015 12:32:52.2227972 PM
Further reads from local deadline ini / lock files:
1/14/2015 12:32:52.2262535 PM
...
1/14/2015 12:32:52.2278592 PM
Loading mongo db driver functions
1/14/2015 12:32:52.2288422 PM
...
1/14/2015 12:32:52.2855048 PM
Reading the repository dbConnect.xml file
1/14/2015 12:32:52.3138763 PM
...
1/14/2015 12:32:52.3278418 PM
Doing a read write test on the repo:
1/14/2015 12:32:52.3308929 PM
...
1/14/2015 12:32:52.4014785 PM
Misc magic (dll load, registry parse):
1/14/2015 12:32:52.4345658 PM
...
1/14/2015 12:32:52.5012422 PM
Loading the dbConnect.xml file again:
1/14/2015 12:32:52.5112130 PM
...
1/14/2015 12:32:52.5293671 PM
Misc magic (loading more dlls.. system.drawing, system.xaml, accessibility etc. .NET init?):
1/14/2015 12:32:52.5429774 PM
...
1/14/2015 12:32:52.7926323 PM
#1 Actual mongodb communication (tiny TCP operations, 40-50 bytes at a time, to both primary/secondary members of our replica set servers):
1/14/2015 12:32:52.7955779 PM
...
1/14/2015 12:32:52.8826087 PM
Misc magic (more dll, registry parses)
1/14/2015 12:32:52.8889066 PM
...
1/14/2015 12:32:52.9226410 PM
#2 Actual mongodb communication (tiny TCP operations, 50-700 bytes in size, to all 3 members of the replica set (primary, secondary, arbiter)) ~0.8s:
1/14/2015 12:32:52.9515638 PM
...
1/14/2015 12:32:53.1560369 PM
Read plugin info file:
1/14/2015 12:32:53.1803196 PM
...
1/14/2015 12:32:53.1806807 PM
Misc registry entry lookups to HKLM\SOFTWARE\Microsoft\.NETFramework\JitLockWrite
...
#3 Mongodb communication (TCP operations to primary server only):
1/14/2015 12:32:53.2943229 PM
...
1/14/2015 12:32:53.2944328 PM
Getting current time (registry operations, dll loads):
1/14/2015 12:32:53.2975772 PM
...
1/14/2015 12:32:53.3019988 PM
Mystery delay of 0.15 seconds
#4 Mongodb communication (TCP operations to primary server only):
1/14/2015 12:32:53.4282931 PM
...
1/14/2015 12:32:53.4284058 PM
Misc registry entry lookups to HKLM\SOFTWARE\Microsoft\.NETFramework\JitLockWrite
...
#5 More mongodb communication (TCP RECEIVE! operations from primary server only (2 sends.. then hundreds of receives, large total data amount)) ~0.7seconds:
1/14/2015 12:32:53.6240341 PM
...
1/14/2015 12:32:54.3964949 PM
Finding plugins folder on the repository:
1/14/2015 12:32:54.4857064 PM
...
1/14/2015 12:32:54.4965071 PM
#6 Short Mongo communication with primary server:
1/14/2015 12:32:54.5013387 PM
..
1/14/2015 12:32:54.5663936 PM
Misc registry entry lookups and dll loads, seems encryption related:
1/14/2015 12:32:54.5781718 PM
...
1/14/2015 12:32:54.6853739 PM
#7 Mongo communication, TCP mostly sends, some receives ~2 sec:
1/14/2015 12:32:54.7750509 PM
...
1/14/2015 12:32:56.7965668 PM
Repo Folder prep ~0.9sec:
1/14/2015 12:32:56.8968988 PM
...
1/14/2015 12:32:56.9243534 PM
#8 Mongo db ~0.3 seconds
1/14/2015 12:32:56.9307265 PM
...
1/14/2015 12:32:57.2065198 PM
#1 Event handling (loading off of repo folder) ~0.2 seconds:
1/14/2015 12:32:57.2396819 PM
...
1/14/2015 12:32:57.4346931 PM
#9 Some db communication again:
1/14/2015 12:32:57.5999429 PM
...
1/14/2015 12:32:57.7745759 PM
Tying things up, some more deadline.ini editing:
1/14/2015 12:32:57.7871560 PM
...
1/14/2015 12:32:57.7942254 PM
#10 mongo db communication ~0.3seconds:
1/14/2015 12:32:57.8028828 PM
..
1/14/2015 12:32:58.1099884 PM
Closing threads, unloading dlls:
1/14/2015 12:32:58.1104625 PM
...
1/14/2015 12:32:58.1202640 PM
Based on that, you would probably notice an improvement submitting multiple jobs at the same time. The loading of the python modules would only get done once (when deadlinecommand.exe first starts up), and the initial connection to the repository and the database will only be done once as well. Give it a try and let us know if that helps!
Cheers,
Ryan