Synchronizing plugin icons delays boot time

Mike_Rochefort · September 19, 2018, 10:10pm

Hello,

I’m running into this issue where the Deadline Monitor takes a very long time to load when connecting to a network share. I realize this could be a mix of issues, between network speeds, protocol usage, and Deadline configuration.

Monitor spends what appears to be an excessive amount of time (read: most of boot time) synchronizing plugin icons. This is from a Linux (CentOS 7) client to a RHEL 7 server using SAMBA as the share/transfer protocol. Network speed was 10-10 Mbps.

2018-09-19 17:03:16:  Time to initialize: 116.000 ms
2018-09-19 17:03:20:  Auto Configuration: No auto configuration for Repository Path could be detected, using local configuration
2018-09-19 17:03:41:  Auto Configuration: Picking configuration based on: linstat / 192.168.0.117
2018-09-19 17:03:41:  Auto Configuration: No auto configuration could be detected, using local configuration
2018-09-19 17:03:41:  Time to connect to Repository: 21.396 s
2018-09-19 17:03:41:  Time to check user account: 91.000 ms
2018-09-19 17:03:41:  Time to purge old logs and temp files: 5.000 ms
2018-09-19 17:05:37:  Time to synchronize plugin icons: 1.923 m
2018-09-19 17:05:38:  Time to initialize main window: 691.000 ms
2018-09-19 17:05:38:  Main Window shown
2018-09-19 17:05:38:  Time to show main window: 30.000 ms
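
For anyone comparing launch logs like the one above, here is a minimal sketch that converts each "Time to ..." line to seconds and ranks the phases. The function name `phase_durations` and the regular expression are illustrative assumptions, not part of Deadline, and the unit table only covers the units that appear in this log (ms, s, m).

```python
import re

# Matches lines such as:
# "2018-09-19 17:05:37:  Time to synchronize plugin icons: 1.923 m"
PHASE_RE = re.compile(
    r"Time to (?P<phase>.+?):\s+(?P<value>[\d.]+)\s+(?P<unit>ms|s|m)\s*$"
)

# Seconds per unit, limited to the units seen in the log above (assumption).
UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0}


def phase_durations(log_text):
    """Return {phase name: duration in seconds} for every 'Time to ...' line."""
    durations = {}
    for line in log_text.splitlines():
        match = PHASE_RE.search(line)
        if match:
            seconds = float(match.group("value")) * UNIT_SECONDS[match.group("unit")]
            durations[match.group("phase")] = seconds
    return durations


SAMPLE_LOG = """\
2018-09-19 17:03:16:  Time to initialize: 116.000 ms
2018-09-19 17:03:41:  Time to connect to Repository: 21.396 s
2018-09-19 17:03:41:  Time to check user account: 91.000 ms
2018-09-19 17:03:41:  Time to purge old logs and temp files: 5.000 ms
2018-09-19 17:05:37:  Time to synchronize plugin icons: 1.923 m
2018-09-19 17:05:38:  Time to initialize main window: 691.000 ms
2018-09-19 17:05:38:  Time to show main window: 30.000 ms
"""

if __name__ == "__main__":
    ranked = sorted(phase_durations(SAMPLE_LOG).items(), key=lambda kv: kv[1], reverse=True)
    for phase, seconds in ranked:
        print(f"{seconds:9.3f} s  {phase}")
```

Sorting by duration makes it easy to see that the icon synchronization and the Repository connection account for nearly all of the launch time in this log.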

Any help on determining why this happens would be great. I also have the server restricted to only a few submission scripts (Arnold, Maya, RenderMan, Modo, Blender) and that list has a 99% chance of never changing. Would it be possible to disable this sync on startup? Paths for the plugin executables might change as updates are added, but Monitor doesn’t need to be aware of that, does it?

Cheers,
Mike

eamsler · September 20, 2018, 8:28pm

Which Deadline version?

You can do a few things. One is to enable a local cache. This stores a copy of the Repository locally in your home folder and pulls as needed.

We’ve also sped things up with the RCS, but speeding up the icons should have been done before 10.0.0

Mike_Rochefort · September 21, 2018, 3:50am

Hi E,

We’re using 10.0.20.2. But this has been happening pretty consistently since I set this up (10.0.7). My Windows laptop under the same connection loads faster, but that might be due to better performance with SAMBA.

I’ll give the local cache option a try, thanks for the tip!

Cheers,
Mike

Mike_Rochefort · September 22, 2018, 2:54am

Just did a test over here; turning that on in the Launcher menu had a dramatic impact on performance, and not the good kind. Suddenly connecting to the repository would take about 2 minutes and synchronizing icons could take 1.5 - 3 minutes. Reverting that option immediately brought things back down to what I posted originally.

Any thoughts?

Cheers,
Mike

eamsler · September 24, 2018, 5:08pm

Hmm. I expected a time saving from just checking size/modification date. I’ll open an issue for that one.
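
To illustrate the kind of size/modification-date check being described, here is a rough sketch. This is not Deadline's actual caching code; `needs_refresh` and `sync_file` are hypothetical names, and the comparison rule (size plus whole-second mtime) is an assumption made for the example.

```python
import os
import shutil


def needs_refresh(source_path, cached_path):
    """True when the cached copy is missing or differs in size or mtime."""
    try:
        cached = os.stat(cached_path)
    except FileNotFoundError:
        return True
    source = os.stat(source_path)
    # Compare whole seconds, since some file systems store coarser timestamps.
    return (
        source.st_size != cached.st_size
        or int(source.st_mtime) != int(cached.st_mtime)
    )


def sync_file(source_path, cached_path):
    """Copy source to the cache only when the metadata check says it changed.

    Returns True if a copy was made, False if the cached file was kept.
    """
    if not needs_refresh(source_path, cached_path):
        return False
    os.makedirs(os.path.dirname(cached_path) or ".", exist_ok=True)
    # copy2 preserves the modification time, so the next check can match.
    shutil.copy2(source_path, cached_path)
    return True
```

Note that each `os.stat` call against a file on a network mount is still a request to the server, so a check like this avoids transferring file contents but not the per-file round trips.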

Another option would be to run the RCS on a machine that’s close to the Repository files. You may want to use the launcher to start it on login, and optionally the Launcher as a service.

The difference between what you’ve tested and that version is that the metadata checking will skip the SMB protocol and go straight through HTTP. This mode is used a lot more often out in the wild and uses the same local caching that you enabled (and in fact it can’t be turned off in that mode).

Update: While cutting that development issue, it looks like the original slowdown was also ~2 minutes:

2018-09-19 17:03:16:  Time to initialize: 116.000 ms
...
2018-09-19 17:05:38:  Main Window shown

Mike_Rochefort · September 24, 2018, 10:17pm

You’re recommending running the RCS tool? I don’t think I have that configured right now. Is the HTTPS certificate used the same one generated when setting up Deadline in the first place, i.e. Deadline10Client.pfx?

In regards to your update, the original time was about two minutes, but that was for total launch time. The caching mechanism added another 2-3 minutes on top of that from cold launch to displaying the Monitor.

Cheers,
Mike

cmoore · September 24, 2018, 10:52pm

Hey Mike,

The easiest way to set up the RCS would be to re-run the Client Installer and select the RCS option. The installer will generate its own set of certificates for the RCS.

Can you try disabling different event plugins to see how that impacts the monitor speed? Do you have any Python search paths set in Tools > Configure Repository Options > Python Settings? I’ve seen these two things slow down monitor launch speeds before.

Regards,

Charles

Mike_Rochefort · September 24, 2018, 11:45pm

Hi Charles,

For the RCS, do I need everyone using Monitor (and the slave on render nodes, too) to enable it for themselves? Or do I need to activate it on one machine and everything passes through that using the same generated certificate? I’m just trying to get a better feel for how RCS fits into the game. As I understand it, it’s essentially another gateway to the Manager/Repository, but the clients still need to have the Repository mounted on the client/node side machines.

I don’t have any additional Python search paths enabled. By disabling plugins, do you mean unchecking the enabled option in the Configure Plugins menu, not the Configure Script Menus dialog? Just want to be sure of that, as I have most scripts disabled in the menu.

Cheers,
Mike

eamsler · September 25, 2018, 7:03pm

The RCS replaces direct connections so a Monitor/Slave would no longer need direct access to the Repo or DB but will still need one for their assets. The cert in use is not the same as the Database at the moment.

Charles is onto something there though. I had forgotten we’ve seen unusual slowdowns with certain plugins, but I cannot remember the details there.

cmoore · September 25, 2018, 7:16pm

Hey Mike,

The RCS runs on a single machine where its client is connected directly to the Repository. Your workstation and render node client apps (Monitor, Slave, etc.) will connect to that RCS via IP and port with the RCS certificate. Those machines would no longer need the repository share mounted unless you want to do a Direct Connection instead of the Remote Connection.

I was referring to Tools > Configure Events to see if disabling any of these events increased monitor load speed.

Have you tried clearing the icons from the script menu configurations?

Regards,

Charles

Mike_Rochefort · September 26, 2018, 3:06am

Hi Charles and Ed,

Thanks for the clarification, that makes a lot more sense. I went through and disabled all of the tools in Configure [Plugins,Events] that we don’t use and this had no impact. On my Windows laptop on the same home network over VPN (just like the CentOS Desktop) launch time was about 44 seconds, just about evenly split between connecting to the repository and synchronizing plugins/icons. That was Windows 10 1803. For a Windows user on the actual network (no VPN) it took about 27 seconds to launch completely.

This makes me think there’s either something absolutely terrible about my SMB configuration, or there’s a bug in the Deadline Linux build.

I have yet to attempt clearing the script menu icons. In order to reattach them later (if they have no impact) the icons are in the Repository, right? I’ll look into implementing RCS over the weekend when I have a bit more time.

Cheers,
Mike

eamsler · September 26, 2018, 4:56pm

SMB is not efficient over high-latency (100 milliseconds and up) connections, as all requests need to be acknowledged, if I recall correctly. That throttles file transfers and was the driving force behind why we built the Proxy (which has been replaced by the RCS). We wanted to make sure you could run a farm on the other side of the country (and now in the cloud).
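
As a back-of-the-envelope illustration of why acknowledged requests hurt at high latency, here is a small sketch. It assumes every request is strictly serial and costs exactly one round trip, ignoring transfer time and any pipelining, so it is only a rough model and not a measurement of SMB itself. The function names are made up for the example; the 1.923 minute figure comes from the log earlier in the thread and 100 ms is the latency threshold mentioned above.

```python
def serial_request_time(num_requests, round_trip_seconds):
    """Total wall time if each request must be acknowledged before the next."""
    return num_requests * round_trip_seconds


def implied_requests(total_seconds, round_trip_seconds):
    """How many serial round trips would account for an observed duration."""
    return total_seconds / round_trip_seconds


if __name__ == "__main__":
    icon_sync_seconds = 1.923 * 60  # "Time to synchronize plugin icons: 1.923 m"
    latency = 0.100                 # 100 ms round trip (assumed)
    print(f"Implied serial round trips: {implied_requests(icon_sync_seconds, latency):.0f}")
    print(f"1000 requests at 100 ms: {serial_request_time(1000, latency):.0f} s")
```

Under those assumptions, about a thousand small acknowledged requests are enough to explain a two minute sync on a 100 ms link, regardless of bandwidth.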

Mike_Rochefort · October 9, 2018, 3:31am

Hi Ed,

Sorry for the late reply, I never got a notification and have been busy. I still haven’t gotten around to implementing RCS but plan to. And you’re right, SMB is not the most efficient protocol to be using, particularly when packet signing is enabled. Mac users have extreme issues with that, where performance is degraded by orders of magnitude compared to having it off.

Does RCS allow users to ‘browse’ the repository when they are ready to submit/modify jobs? Or does the repository need to be mounted on their end in order to do so?

Cheers,
Mike

eamsler · October 9, 2018, 2:51pm

Unfortunately they’d need to mount it. Normally users don’t need to dig around in the Repository themselves other than for the integrated submission script installer, and you should be able to copy those to a nicer location for them.

What are they using the Repository for these days?

Mike_Rochefort · October 28, 2018, 7:36pm

We don’t use the integrated submission scripts for DCCs; we use the Monitor submission style instead. So users have to mount the repository location in order to access the farm and submit jobs. I’m looking into configuring the RCS system today to see if that will help us out.

As we have a bit of a non-standard setup, users don’t work directly off of the repository mount and only use it when uploading the information they need when they are ready to submit.

Sorry for the long delay on this!

Cheers,
Mike

eamsler · October 29, 2018, 10:03pm

No worries Mike. As long as you’re not waiting for us, it’s all good.

I’m hoping the RCS helps out here. That said, the integrated scripts are quite improved over what’s in the Monitor, but it depends on the workflows.

MikeOwen · October 31, 2018, 8:44am

Never use SMB/SAMBA with Linux. Always use NFS or better custom posix driver variants, if you want any kind of performance on Linux. To be clear, you CAN use Samba with Linux. I just would never do it.

eamsler · October 31, 2018, 3:42pm

@mikeowen any caveats there? Suggestions on what to do with mixed farms? Do you have good strategies to share when NFSv3 unix permissions have to play with NTFS ACLs via SMB?

MikeOwen · October 31, 2018, 5:01pm

Tricky. Depends on use case and versions of Windows in place. Windows NFS client or 3rd party.

Mike_Rochefort · October 31, 2018, 10:08pm

SAMBA wasn’t my desired protocol, but I had to support Windows and macOS, with Linux being just the servers and my own workstation. On the backend internal network I average 80-100MB/s so it’s not that bad. I was originally going to utilize NFS but it wouldn’t have worked out as intended and I can’t force/expect all Windows users to upgrade their machines to Student (unless they’ve already used it for something else) just to connect to a folder. Just finished upgrading the farm to 10.0.21.5, next up is adding RCS support. Wanted to figure out the HTTPS connection first before opening it up.

I’ve been having everyone use the Monitor just so that the experience would be consistent for everyone and it would be easier for me to see what was happening. What are the advantages of using the integrated scripts? For RenderMan and XGen I need to manually add the environment variables for the proper libraries as having them loaded on the system by default through LD_LIBRARY_PATH tanks Arnold’s startup times for some obscure reason. Is this possible to do in the integrated Maya submission script?

Cheers,
Mike