AWS Thinkbox Discussion Forums

invalid packet size error

Some of our slaves are throwing this error on Deadline 7:

2014-12-30 19:09:53: 0: INFO: Executing script: C:\Users\scanlinevfx\AppData\Local\Thinkbox\Deadline7\slave\LAPRO0602\plugins\54a368087a3a9e152c36e75a\customize.ms
2014-12-30 19:09:54: 0: An exception occurred: Error: ExecuteScript: Exception caught in 3ds max: simple_socket::receive: Invalid packet size (given 1165518179)
2014-12-30 19:09:54: at Deadline.Plugins.ScriptPlugin.StartJob(Job job, String& outMessage, AbortLevel& abortLevel) (Deadline.Plugins.RenderPluginException)

Some are rendering just fine. They are all the same version.

Hi Laszlo,
Edwin has spent a lot of time playing with this error, and we have a proposed fix inside the Lightning plugin which we intend to ship with the first beta of v7.1. Edwin tells me that in his testing this error is reduced by 95% or more. We believe it’s a simple case of the local socket timing out, and the timeout values just need to be slightly increased to compensate. As this change could have far-reaching consequences for many customers, we propose to ship it with the first beta of v7.1, so customers have a good opportunity to give it a good thrashing. Fingers crossed. Edwin can provide all the gory details if needed. :slight_smile:

Cool, thanks for the update!

When would the first beta for 7.1 come about? I wonder how badly this would affect our farm, as this was already happening with only a single job and 4 slaves.

Our plan is to get the 7.1 beta started basically ASAP, since we already have a bunch of stuff for it. I think the plan was originally to have it going by the end of January, but we’ll definitely keep you posted!

Cheers,
Jon

We find that this happens in Deadline 6 as well. Does that sound right?

Also found a couple of errors like this (~1% of errors on d7):

“2015-01-13 23:18:36: 0: An exception occurred: Error in StartJob: RenderTask: Unexpected exception (Exception caught in 3ds max: -- Runtime error: Error in GetJobInfoEntry: simple_socket: Timed out waiting for header packet to arrive.”

Seems like it’s related.

This issue has been eating away at me for many years…fingers crossed…Edwin’s fix saves the day :slight_smile:
EDIT: Yep, it’s related and fixed by the same tweak.

The gory details amount to this:

The socket communication between Deadline and Lightning handles a socket timeout exception as a perfectly passable event (I agree with that plan), but the way the socket was coded leaves things in an inconsistent state if a message was only half-received. It’s not a huge problem most of the time, because the natural timeout usually happens only after a whole message has been sent. The fix was to increase that timeout to give half-sent messages more time to pass through on extremely heavy renders.
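For anyone curious how a half-received message could produce that exact number, here is a minimal Python sketch. It assumes a 4-byte big-endian length header in front of each message, which is a guess at the wire format and not something confirmed in this thread. The `ExecuteScript` payload and the `frame` and `read_header` helpers are hypothetical too. The only hard fact is the arithmetic: 1165518179 is 0x45786563, which is the ASCII text "Exec".

```python
import struct

def frame(payload: bytes) -> bytes:
    """Prefix a payload with a 4-byte big-endian length (assumed format)."""
    return struct.pack(">I", len(payload)) + payload

def read_header(stream: bytes, offset: int) -> int:
    """Interpret the 4 bytes at `offset` as a big-endian packet size."""
    return struct.unpack_from(">I", stream, offset)[0]

# The size reported in the log, viewed as raw bytes instead of a number.
reported = 1165518179
as_text = struct.pack(">I", reported)

# Hypothetical message on the wire: header followed by the payload.
stream = frame(b"ExecuteScript")

# A reader that is in sync sees the real length at offset 0.
good_size = read_header(stream, 0)

# A reader that consumed the header, timed out before the payload arrived,
# and then started over treats the first payload bytes as a new header.
bogus_size = read_header(stream, 4)

print(as_text, good_size, bogus_size)
```

If the assumption holds, the "invalid packet size" is just the start of a message body being read as a length, which fits the half-received explanation above.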

There’s more possible work to do here, like a proper acknowledgement, but considering how elegantly simple the current system is, we’ll cross that bridge if my fix isn’t good enough.
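As an illustration of what further work could look like, short of a full acknowledgement scheme, here is a rough Python sketch of a receiver that keeps partial bytes across a timeout instead of losing its place. This is not the fix described above (that one only raises the timeout) and it is not Deadline's or Lightning's code. The `FramedReceiver` class, the `frame` helper, and the 4-byte big-endian length header are all assumptions made for the example.

```python
import socket
import struct

def frame(payload: bytes) -> bytes:
    """Prefix a payload with a 4-byte big-endian length (assumed format)."""
    return struct.pack(">I", len(payload)) + payload

class FramedReceiver:
    """Hypothetical receiver that survives a timeout in mid-message."""

    def __init__(self, sock: socket.socket, timeout: float) -> None:
        self.sock = sock
        self.sock.settimeout(timeout)
        self.buffer = bytearray()  # bytes received but not yet consumed

    def receive(self):
        """Return one complete payload, or None if the timeout hit first."""
        while True:
            if len(self.buffer) >= 4:
                size = struct.unpack_from(">I", self.buffer, 0)[0]
                if len(self.buffer) >= 4 + size:
                    payload = bytes(self.buffer[4:4 + size])
                    del self.buffer[:4 + size]
                    return payload
            try:
                chunk = self.sock.recv(4096)
            except socket.timeout:
                # Timeout is a passable event: the partial message stays
                # in the buffer, so the next call resumes where we left off.
                return None
            if not chunk:
                raise ConnectionError("peer closed the connection")
            self.buffer += chunk

# Demo: send a message in two halves with a timeout in between.
sender, receiver_sock = socket.socketpair()
rx = FramedReceiver(receiver_sock, timeout=0.05)
message = frame(b"ExecuteScript")

sender.sendall(message[:6])   # header plus two payload bytes only
first = rx.receive()          # times out, nothing complete yet

sender.sendall(message[6:])   # the rest arrives late
second = rx.receive()         # the full payload, still in sync

sender.close()
receiver_sock.close()
```

The trade-off is a little more state on the receiving side, in exchange for not depending on the timeout being longer than the slowest message.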
