Synchronize Scripts and Plugins not working

BlueLegend78 · August 10, 2018, 8:30pm

I am unable to get “Synchronize Scripts and Plugins” to work properly on Deadline 10.0.17.5.
For reference, see:
deadline.thinkboxsoftware.com/f … nt-plugins

I should be able to re-sync any event plugin file after I edit it by clicking that button. However, it doesn’t push my changes until I restart the slave from the Monitor. What’s also weird is that the console shows it detecting that I modified the event, and it even states that it “Finished synchronizing scripts and plugins.”

2018-08-10 16:28:16:  Detected modifications to event plugin error_db_collect_failed in the Repository, it will now be re-initialized 
2018-08-10 16:28:16:  Synchronizing plugin icons...
2018-08-10 16:28:17:  Finished synchronizing scripts and plugins.

eamsler · August 13, 2018, 7:58pm

Is this on an isolated machine? I’m wondering if it reloads if you disable the Slave’s sandbox under the “Slave Settings” section of the “Configure Repository Options” window. I know this used to work, but it’s been a long time since I’ve dabbled with it.

I tend to write little wrappers in my event plugins that I run through DeadlineCommand but many of the functions (LogInfo, GetConfigEntry*) don’t work in that context. Here’s an example:

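# Testing OnJobFinished() outside the Slave, via "deadlinecommand executescript".
# This wrapper lives in the event plugin file, so GetDeadlineEventListener() is defined there.
from Deadline.Scripting import RepositoryUtils

def __main__():
    # Fetch an existing job by ID; True invalidates the cache so fresh data is loaded.
    job = RepositoryUtils.GetJob("5b607855cbdaad0b1ff18f3c", True)
    event = GetDeadlineEventListener()

    # Call the handler directly instead of waiting for the real event to fire.
    event.OnJobFinished(job)

    print("Done")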

Then I run it like so:

c:\Program Files\Thinkbox\Deadline10\bin>deadlinecommand executescript c:\path\to\script.py

We should figure out why the script re-sync isn’t working though.

BlueLegend78 · August 15, 2018, 10:13pm

I tried disabling slave sandbox mode, but that did not fix the issue. This occurs on all of our machines. So far I have tested this with two events: OnHouseCleaningCallback and OnJobErrorCallback. OnHouseCleaningCallback works perfectly fine with the “Synchronize Scripts and Plugins” button. Any change to the code in my housecleaning event gets synced instantly when I click the button. OnJobErrorCallback, however, doesn’t sync properly with the button.

Things I have noticed:
When I leave it alone for x minutes, some sort of auto sync occurs (housecleaning?) and the scripts sync correctly.
The only workaround to manually force the scripts to sync is to restart the slave.

Is there a cached area where these events reside on the slave? I noticed in %appdata%\Local\Thinkbox\Deadline10\slave\R922\plugins\5b6b37a18bee0c1f5c878c60 you are able to see the plugins tied to the job. Is there such a location for events that get triggered?

Thanks!

eamsler · August 16, 2018, 3:26pm

I can answer the caching question.

If you are using a “remote” connection (which requires the RCS or, in the older days, the Proxy), we do have a local cache. On Windows, that lives at “C:\Users\[user name]\AppData\Local\Thinkbox\Deadline[major version]\cache”. We use that so Python doesn’t have to try and load modules from a remote Repository over HTTP, since that would require a special kind of magic and would tend to be slow if the files were far away.

I don’t think that’s at play here, however. The timeout for reload has been there a while, and is the usual way that Repo updates / changes are loaded. Having thousands of machines checking more often than that would hurt scaling on the Repository.
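To illustrate the general idea of a reload timeout, here is a rough sketch of an interval-gated change check. This is not Deadline's implementation: the class name, the 300 second default, and the use of file modification times are all assumptions made for the example. The point is only that each client rescans at most once per interval, no matter how often it is asked.

```python
import os
import time


class ThrottledReloader:
    """Re-scan a directory for changed files at most once per interval.

    Hypothetical helper for illustration only; not part of Deadline.
    """

    def __init__(self, directory, interval_seconds=300.0, clock=time.monotonic):
        self.directory = directory
        self.interval_seconds = interval_seconds  # illustrative default
        self.clock = clock
        self._last_check = None  # time of the last scan, None before the first
        self._seen = {}          # file path -> modification time at last scan

    def _snapshot(self):
        # Record the modification time of every file under the directory.
        snapshot = {}
        for root, _dirs, files in os.walk(self.directory):
            for name in files:
                path = os.path.join(root, name)
                snapshot[path] = os.stat(path).st_mtime_ns
        return snapshot

    def poll(self):
        """Return new or modified files, or [] if the interval has not elapsed."""
        now = self.clock()
        if self._last_check is not None and now - self._last_check < self.interval_seconds:
            return []  # too soon: skip the scan entirely
        self._last_check = now
        current = self._snapshot()
        changed = sorted(p for p, m in current.items() if self._seen.get(p) != m)
        self._seen = current
        return changed
```

With a gate like this, an edit made just after a scan is not noticed until the next interval passes, which matches the "leave it alone for a few minutes and it syncs" behaviour described earlier in the thread.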

For the “OnJobErrorCallback”, that is raised by the Slave that hit the error, at the point when the task being rendered saves out its error report. How are you re-syncing that event?

BlueLegend78 · August 20, 2018, 9:37pm

We are using the “Synchronize Scripts and Plugins” button to re-sync the event. The button is able to detect that I have modified the file, but it just doesn’t update the slaves when it is clicked. We are currently using the following Deadline versions:

Deadline Client Version: 10.0.16.6 Release
Repository Version: 10.0.17.5
Integration Version: 10.0.17.5
Attached is the event plugin I am testing with:
test_event.7z (1.1 KB)

Thanks!

eamsler · August 21, 2018, 4:49pm

Ah, that explains it then. That menu option only controls the Monitor’s own loaded scripts. There is no remote command to reload the scripts, though you can ask the Slave to “restart after current task completion” to reload itself.

I’d say if you are developing a new event, one of the fastest dev loops I’ve found is to run the Slave on your local machine if you can. I tend to use an easier event to trigger such as “OnJobComplete” and then manually complete the job in the Deadline Monitor. Once that’s working, I shift it to the event I planned on using it for.

I also run things via the other method I outlined earlier in the thread, but you cannot use “LogInfo()” or “GetConfig*” that way.
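One way to live with that limitation (a sketch of an approach, not something Deadline requires) is to keep the event's actual logic in a plain function that takes its logger and config as arguments. The function name `handle_job_finished` and the "Prefix" config key below are made up for illustration.

```python
def handle_job_finished(job_name, log=print, config=None):
    """Event logic kept free of listener-only calls so it can run anywhere.

    Hypothetical example: the name and the "Prefix" key are illustrative.
    """
    config = config or {}
    prefix = config.get("Prefix", "")
    message = "{}Job finished: {}".format(prefix, job_name)
    # `log` is print by default; a real listener can pass its own logger instead.
    log(message)
    return message
```

Inside the real listener you would pass `self.LogInfo` as `log` and build `config` from your `GetConfigEntry*` calls; when run through executescript, the defaults fall back to `print` and an empty config, so the same code path can be exercised either way.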

BlueLegend78 · August 23, 2018, 6:28pm

Brilliant, the workaround is exactly what I needed instead of finding various ways of triggering events for testing.
Thanks!

eamsler · August 23, 2018, 8:35pm

Welcome! I do need to write something up one day so it’s not all in my head. In the meantime, we can use this forum thread as a reference.

eamsler · September 25, 2018, 3:41pm

Hey guys,

In looking at the developer issue, it seems like we should be reloading the events every five minutes. It looks like that works on some boxes, but how long do you have to wait for event scripts to reload on the problem boxes?

Thanks,

Edwin

BlueLegend78 · September 25, 2018, 10:13pm

Hey,

The issue I was having wasn’t that the House Cleaning Interval was not working properly. My issue was trying to update scripts for a remote slave. As far as I know, the House Cleaning Interval is working as intended, as set in the Configure Repository Options. So if I decide to wait for the scripts to update remotely, the wait would be determined by that setting.

eamsler · September 26, 2018, 4:59pm

There’s actually a remote script update interval as well where we don’t recheck until a certain point. Did you ever find machines did not pull the new script changes? From “When i leave it alone for x mins, some sort of auto sync occurs (housecleaning?) and the scripts syncs correctly” back in August, I assume it did work but I wanted to make sure that worked 100% of the time.

kwatts · October 25, 2018, 10:04pm

Hey Edwin,

Does this mean the workflow for releasing a new event requires me to restart all the slaves for them to get it?

Based on the information here, could we create an on-slave-idle event that checks the currently loaded plugins and syncs them across to the local slave, so that they are always up to date and we do not need to constantly restart the slaves when we push changes?
Would this work?

Cheers,
Kym

Dan_Grover · November 7, 2018, 5:11pm

Heya,

I just wanted to mention that this thread has been super useful for me in diagnosing why my Event callbacks weren’t working - the slaves hadn’t refreshed their local cache - but I feel like this could be documented better (at all?) in the help files. I’ve spent a difficult-to-justify amount of time wondering why and troubleshooting why on earth my events were being ignored despite appearing in the Event list on Monitor. Furthermore, I feel that the “Synchronise Scripts and Plugins” button gives the impression it might… well, synchronise the scripts and plugins. Which it doesn’t appear to.

I couldn’t find any reference to this pretty unusual behaviour anywhere in the documentation for Events.

Thanks,
Dan

Dan_Grover · November 8, 2018, 10:42am

Hmm, actually, I’m still not sure this is all working.

I had a script that got assigned to the OnJobStartedCallback that, let’s say, printed “Hello” to the log. I have now changed this behaviour so that, rather than launch once at the start of the job, it’s been moved to the OnSlaveStartingJobCallback so that it runs on each slave that renders a task, before it renders that task. Now it prints “Goodbye” to the log instead.

The change in the script was recognised in the console. However, when I kick off a new job, it still just runs the script once, at the start, under “(no slave)” and it prints “Hello”. On my own machine I delete that cache mentioned above and restart my slave. Now when I kick off a render, it does still run once at the start under “(no slave)”, printing “Hello”, but it also runs at the start of each task when my machine renders a frame, printing “Goodbye”.

I left this overnight, thinking that the other node (I only have one node other than my own machine on our Dev repo) might refresh its cache periodically, but this morning I’m seeing the same behaviour, with my machine working as intended but not the other node, which seems stuck on the old script. Surely I am doing something wrong, as “deleting the cache and restarting the slaves” can’t be the way it’s meant to work?
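For anyone repeating the "delete the cache" step by hand, here is a minimal sketch of how it could be scripted. It assumes the Windows cache location quoted earlier in the thread and that `%LOCALAPPDATA%` points at the `AppData\Local` folder; the function names and the dry-run default are my own, not anything Deadline ships. It does not restart the slave, which still has to be done separately.

```python
import os
import shutil


def deadline_cache_dir(major_version, local_appdata=None):
    """Build <LocalAppData>/Thinkbox/Deadline<major>/cache (assumed layout)."""
    base = local_appdata or os.environ.get("LOCALAPPDATA", "")
    return os.path.join(base, "Thinkbox", "Deadline{}".format(major_version), "cache")


def clear_cache(cache_dir, dry_run=True):
    """List the entries in cache_dir and delete them unless dry_run is set."""
    if not os.path.isdir(cache_dir):
        return []  # nothing cached, or the path assumption is wrong
    entries = sorted(os.listdir(cache_dir))
    if not dry_run:
        for name in entries:
            path = os.path.join(cache_dir, name)
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)
            else:
                os.remove(path)
    return entries
```

Calling `clear_cache(deadline_cache_dir(10))` only reports what would be removed; pass `dry_run=False` once the listed entries look right, and stop the slave first so nothing is holding the files open.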

Edit: I decided to push this change in the script to our “actual” repo - I’m a brave boy - and it seems to have been picked up almost immediately by all the nodes. So it seems the problem must be something on our Dev repo, which isn’t ideal, as it somewhat negates the purpose of having a testing platform. Clearly it’s a setting though - does anyone have any idea what might be causing this behaviour on the dev repo but not the production one?