Modifying job documents directly, redux

I’m wondering what problems I could potentially cause by writing custom data into a job’s document after it is submitted. (This came up briefly back in the heady days of the Deadline 6 beta, but the original thread is now locked.) Ideally I’d be able to inject my own subdocument somewhere (most likely within either Props or Props.PlugInfo), but there are a few behavioral questions I don’t currently have the answers to.

First, when properties are changed, does Deadline make in-place field updates, or just squash the whole document? As a developer, I’d like to think it’s the former, but I obviously don’t know.

Second, I know that in order to get anything into the PlugInfo doc as part of the submission process, the value has to be a string. If I suddenly drop a nested document in there, is some part of the Deadline code going to error? I’d like to think Deadline is always making explicit modifications to fields by name, and that the only code that would ever enumerate the keys in the job’s PlugInfo doc would be something like Job.GetJobPluginInfoKeys() (and hopefully this wouldn’t error, as long as you didn’t try to peek at the value…). In other words, it seems like the main Deadline machinery should leave PlugInfo alone.

So, with the understanding that what I’m proposing is completely unsupported and unadvised… how likely is it to work? :smiley:

Thanks.

i’ll leave the tech details to ryan et al, but i would like to know ‘why’? what are you in need of doing that we could do?

cb

In short, the dream is to have a place to store arbitrary, structured BSON data that shares the same lifetime as the Deadline job in the database, and is easy to look up.

We store assorted internal data with our jobs, which is basically structured as a JSON document. Currently, in order to include it with our submissions, we have to serialize the data to a string, and conversely, we have to read and deserialize the entire string to get any one piece of data, rather than being able to query it on its actual fields. This is particularly unfortunate because of how nicely such data would map to MongoDB, and I have previously suggested the ability to submit such data with a job.
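For illustration, the round-trip we're stuck with today looks roughly like this (the key and field names are made up, not our actual schema):

```python
import json

# Hypothetical structured payload we'd like to store with a job.
payload = {"shot": "sq010_sh020", "review": {"artist": "kp", "iteration": 3}}

# Submission side: PluginInfo values must be strings, so the whole
# structure collapses into one opaque blob.
plugin_info = {"CustomData": json.dumps(payload)}

# Read side: to get at any single field, the entire string has to be
# deserialized first -- there's no way to query "review.iteration" directly.
iteration = json.loads(plugin_info["CustomData"])["review"]["iteration"]
```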

Modifying the PluginInfo to go from a flattened key-value store to a structured document with support for nesting, non-string data types, etc. would open a lot of doors for creative use of job storage without having to worry about reliable cleanup of external data.

So, it looks like we'll either update the entire job, or just Job.Props (we have functions for both; I didn't trace them all the way back). That means that making quick changes and doing a SaveJob() is going to push the entire doc over.

Now, I have to warn you that interesting things are coming down the pipe that will break if you're relying on the underlying MongoDB data access.

Hands down the safest way to deal with your data blob is going to be what you’re doing now. In fact, because the data gets plumbed through .net to Python, even if you put data in the job document, you’d need to do your own queries outside of Deadline to get it (maybe you already do?).

FWIW, I really like the idea of having the PluginInfo be a structured document. It’s just one of those things that doesn’t give much of a win for 99% of clients, so it gets less attention. Plus trying to build a structured doc in melscript and the other weird scripting languages just makes me uncomfortable…

This actually is not completely true. The Job.Props sub-document is definitely saved wholesale when job properties are changed (so don't put any new properties in there, for sure), but almost everything else does single-field updates (status/progress changes, timestamp updates, etc.). There might be some edge cases where we save out the entire Job doc, but the only ones I can think of are on initial submission, and maybe when the Job's frame range is changed – I'd have to check that one. Note that even calling RepositoryUtils.SaveJob on a job will only clobber the Props sub-document, as previously mentioned.
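The practical consequence of that distinction can be sketched with a toy stand-in for MongoDB's `$set` (these are not Deadline's actual queries or field names, just an illustration of why injected fields survive some saves and not others):

```python
def apply_set(doc, update):
    # Minimal simulation of a top-level MongoDB $set (illustration only).
    new_doc = dict(doc)
    new_doc.update(update["$set"])
    return new_doc

# A job with a hypothetical injected sub-document inside Props.
job = {"Props": {"Name": "my_job", "CustomBlob": {"x": 1}}, "Stat": 1}

# Wholesale Props save (e.g. a property change): the entire sub-document
# is replaced, so the injected CustomBlob is clobbered.
after_wholesale = apply_set(job, {"$set": {"Props": {"Name": "my_job"}}})

# Targeted single-field update elsewhere (e.g. a status change):
# everything under Props, injected fields included, is left untouched.
after_targeted = apply_set(job, {"$set": {"Stat": 2}})
```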

I did want to add that MongoDB isn't going away anytime soon, so I don't think this would necessarily be as big a concern going forward as Edwin makes it sound :slight_smile:

I agree with Edwin that the safest way is definitely adding dictionary entries to either the Job's ExtraInfoKeyValuePairs or PluginInfo, since as you pointed out, they are just dictionaries, and we kinda just deal with them as an opaque lump of key/values that we know nothing about.

That said, if you really want to add extra fields to the root Job object, you probably could get away with it. I wouldn’t want to deal with that data randomly disappearing because some edge case somewhere does a full-document save, though, so I can’t say I’d necessarily recommend it.

Cheers,
Jon

Thanks for the information guys. It sounds like this isn’t a particularly tenable idea at this point…

Yeah, I understand that a lot of the things I ask for are extremely niche, and we are definitely different from most of your customers in that what we really want from Deadline (outside of solid user-facing elements) is a tunable, scalable, and performant work scheduler with a robust API to build on, rather than a complete farm-in-a-box. From a development perspective, Deadline is, in many ways, a legacy application layer ported to run on a modern back-end (though obviously with a very nice front end). That is, the plugin design and scripting API haven’t really changed since 5.2 (and maybe earlier).

Hmm, that’s some intriguing vagueness… :wink: Is there any way you would be willing to elaborate at all on what you mean by “interesting things” (via PM or email)? I’d like to keep an ear to the ground, since we are in the process of designing some higher-level stuff that will use Deadline as a back-end, and I’d like to know if major changes are on the horizon before we get too deep (even if we’re talking several versions out).

Yeah, that’s certainly workable, although I’ll probably just end up storing the data elsewhere in Mongo and relating it using the JobID.
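For anyone landing here later, a sketch of that sidecar approach (the collection name, field names, and job ID below are all invented for illustration): the structured data lives in its own collection, keyed by the Deadline job ID, so it can be queried on real fields. With pymongo, against a reachable MongoDB, the calls would look something like `extras.insert_one(make_sidecar_doc(...))` and `extras.find_one(iteration_query(...))`; the documents and query filters themselves are just dicts:

```python
def make_sidecar_doc(job_id, payload):
    # Use the Deadline job ID as _id so lookup (and cleanup when the
    # job is deleted) is a single indexed query.
    doc = {"_id": job_id}
    doc.update(payload)
    return doc

def iteration_query(job_id, min_iteration):
    # Dotted paths query nested fields directly -- no string
    # deserialization step, unlike the stringified-PluginInfo approach.
    return {"_id": job_id, "review.iteration": {"$gte": min_iteration}}

# Hypothetical job ID returned from submission.
doc = make_sidecar_doc("5f2b9c0example", {"shot": "sq010_sh020",
                                          "review": {"iteration": 3}})
query = iteration_query("5f2b9c0example", 3)
```

The one thing this doesn't give you for free is lifetime coupling: something (a cleanup script, or a Deadline event plugin on job deletion) still has to remove the sidecar document when the job goes away.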