AWS Thinkbox Discussion Forums

Deadline Connection Freezing

Hi,

I’m having some issues with intermittent freezing of the Monitor and, I think, the API. I’m not really sure what’s causing it, although it seems to have started when I turned on the web server to do some API work. Any thoughts on where I should start debugging these kinds of issues? Basically, the Monitor will freeze for a couple of minutes and then jump back into life (or I have to restart it), or the API just seems to hang while trying to connect or submit. I’m assuming it’s something to do with Mongo.

Any help would be great.

Thanks!

Nick

Hmm. This sounds oddly familiar, and I agree that it should be at the database level here. Is the web service running on the same machine as the Proxy/RCS?

You should be able to spot a slow query in the DB logs. If you want to send one in (along with the approximate time of a freeze), I can do a quick text search for query times greater than 300ms. I’d send it in via a ticket if you can.
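For anyone who wants to run that kind of search themselves, here is a minimal sketch in Python. It assumes the slow-operation lines in the mongod log end with the duration in milliseconds (for example `... 20123ms`), which is how the plain-text log format writes them; newer MongoDB versions that log JSON would need a different parser. The function names `slow_lines` and `scan_file` are made up for this example.

```python
import re

# Assumption: a slow-operation line ends with its duration, e.g. "... 20123ms".
DURATION_RE = re.compile(r"(\d+)ms\s*$")


def slow_lines(lines, threshold_ms=300):
    """Yield (duration_ms, line) for lines whose trailing duration exceeds the threshold."""
    for line in lines:
        match = DURATION_RE.search(line)
        if match and int(match.group(1)) > threshold_ms:
            yield int(match.group(1)), line.rstrip("\n")


def scan_file(path, threshold_ms=300):
    """Return the slow lines found in a log file, slowest first."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return sorted(slow_lines(log, threshold_ms), reverse=True)
```

Calling `scan_file` with the path to the database log returns the matching lines sorted by duration, so the worst queries show up first and can be compared with the time of a freeze.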

We don’t use a proxy or RCS, at least not that I’m aware of. The database, Pulse and the Web Service are all on the same machine. I’ll have a look at the logs; I was just looking at the documentation regarding Mongo. We’ve got an old Deadline 9 database on the machine that seems to be running too, so I was going to shut that off. The problem has definitely coincided with me using the API though: I am submitting about 40 jobs in rapid succession through it. There’s also a WmiPrvSE.exe process using 20% CPU on that machine. I’m not sure if that’s linked to Mongo?

Nick
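One way to get the "approximate time" for a hang on the API side is to time each submission call and note when the slow ones started. The sketch below is generic Python and does not use any Deadline API: `timed_call` is a made-up helper, and whatever function actually submits the job is passed in as `func`. The one-second threshold is also just an assumption.

```python
import time
from datetime import datetime


def timed_call(func, *args, slow_after_s=1.0, **kwargs):
    """Call func and return (result, started_at, elapsed_seconds).

    Prints a note when the call takes longer than slow_after_s, so the
    start time can be matched against entries in the database log.
    """
    started_at = datetime.now()
    t0 = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - t0
    if elapsed > slow_after_s:
        print(f"slow call: started {started_at.isoformat()} took {elapsed:.2f}s")
    return result, started_at, elapsed
```

Wrapping each of the 40 submissions this way would show whether every call is slow or only some of them, and give timestamps to look up in the log.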

No, it’s not. For others here: I took a look at the log and there are some incredibly long queries (20 seconds versus 0.1 seconds), so there is something performance-related going on here.

The DB isn’t running on Windows Server 2012, is it? I’ve battled strange performance problems on that platform quite a few times now. Essentially, every metric that should affect performance on the machine seems fine (network, disk, memory and CPU all at low usage), but the queries take forever anyway. I’m not sure if it’s due to that NUMA warning MongoDB gives or not.

Just thinking aloud…

It is on Server 2012… Not sure I’ve got anything else I can run it on atm though.

Well, something I didn’t investigate was actually changing the memory settings on a machine with NUMA (multiple processor sockets, each with its own RAM). Here are the notes from MongoDB’s documentation:

There’s a little more here:
groups.google.com/forum/#!topic … ncT91G3PMg

Does that machine have multiple sockets?

It’s a dual CPU machine but only with one CPU and RAM on one side. We’ll have a look at disabling NUMA (if it’s on) but it’ll require a reboot so might take a while before we can do it.

I can’t imagine it’ll be impacted by this then… It’s worth a try for the negative confirmation, but I’d really like to figure out this weird Windows Server 2012 performance issue.

Some Googling later, I haven’t found much other than confirmation that other people see the problem:
stackoverflow.com/questions/275 … erver-2012

Someone also complained about Windows being less efficient than Linux. Even if that is the case, I’d expect to see some parameter being pegged (CPU, RAM, whatever).

Actually, I might be able to loop in the MongoDB support team on this one. Would you be willing and able to throw a bunch of effort at this problem?

We do run other things on that server. For Deadline, it’s running Pulse and the Web Server. We also run Retrospect for tape backup on it, but that’s not active during the day, so I don’t think it should be causing us an issue.
