Hello,
We have our Deadline repository on a Linux machine, and we run Pulse on the same machine. However, Pulse crashes every day or two under load, causing clients to connect to the repository directly via the Samba share, which brings the machine to its knees.
The distribution is Ubuntu 12.04 with Mono installed (from the Ubuntu repository), and Deadline 5.1.
Any help appreciated.
Can you post a Pulse log from a session where it crashed? You can find the logs folder from the Pulse interface by selecting Help -> Explore Log Folder.
If you are running Pulse in headless mode, the logs folder can be found in the Deadline install folder (ie: /usr/local/Thinkbox/Deadline/logs).
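On a headless machine, a small Python sketch like the one below could list the most recently modified files in the logs folder, so the ones written around a crash are easy to pick out. The helper name `newest_logs` is hypothetical, and the path is just the example location mentioned above; adjust it to match your install.

```python
import os


def newest_logs(log_dir, count=5):
    """Return up to `count` file paths in log_dir, newest first by modification time."""
    entries = []
    for name in os.listdir(log_dir):
        path = os.path.join(log_dir, name)
        if os.path.isfile(path):
            entries.append((os.path.getmtime(path), path))
    # Sort by modification time, most recent first.
    entries.sort(reverse=True)
    return [path for _, path in entries[:count]]


if __name__ == "__main__":
    # Example location from the post above; this is an assumption about your install.
    log_dir = "/usr/local/Thinkbox/Deadline/logs"
    if os.path.isdir(log_dir):
        for path in newest_logs(log_dir):
            print(path)
    else:
        print("Log folder not found:", log_dir)
```

The files it prints are the ones worth attaching after a crash.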
Also, can you let us know which version of Mono you have installed on the machine?
Thanks!
Hello,
Logs are attached. Mono:
Mono JIT compiler version 2.10.8.1 (Debian 2.10.8.1-1ubuntu2.2)
Copyright © 2002-2011 Novell, Inc, Xamarin, Inc and Contributors. mono-project.com
TLS: __thread
SIGSEGV: altstack
Notifications: epoll
Architecture: amd64
Disabled: none
Misc: softdebug
LLVM: supported, not enabled.
GC: Included Boehm (with typed GC and Parallel Mark)
If I were to consider a fresh install, what would my steps be?
After installing the system, which Mono packages and which libraries exactly do I need?
And is this correct: first I install the Deadline repository from the installer, then overwrite it with the old one, then install Pulse as a service?
pulselogs.tar.bz2 (1.69 MB)
Thanks for the logs. I don’t see anything that stands out, unfortunately.
As a test, can you try running Pulse on a different machine than the repository? If the heavy load on the repository machine is bringing it down, it would be interesting to see how it runs when it’s on a dedicated machine. Maybe you could just run it on a render node instead of the slave for a few days to see how it goes, rather than setting up a new machine for the test.
Also, are you currently submitting your scene files with the jobs? This can place additional load on the repository machine, and could add to the problem.
Cheers,
Hi
Excuse me for joining the thread, but:
I have the exact same problem: Ubuntu 11.10, the machine is a Xeon 5504 with 2 GB RAM (but 1 GB is always free, as there is no X server on the machine), and Pulse crashes randomly, most often under load. The logs are also clean; only once did I get an exception in the log:
Caught unhandled exception: Couldn’t create thread (System.ExecutionEngineException) - that’s some kind of Mono error.
That happened once; usually it crashes without any trace in the logs.
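One thing that might be worth checking for a "Couldn’t create thread" error is whether the process is running into a per-user thread limit. This is only a guess at a possible cause, not something the logs confirm. A minimal Python sketch, assuming a Linux /proc filesystem, compares a process’s current thread count with its soft "Max processes" limit (on Linux, threads count against that limit). The function names are made up for illustration.

```python
import sys


def parse_max_processes(limits_text):
    """Return the soft 'Max processes' limit from /proc/<pid>/limits text.

    Returns None if the limit is 'unlimited' or the line is missing.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max processes"):
            soft = line.split()[2]  # columns: Max, processes, soft, hard, units
            return None if soft == "unlimited" else int(soft)
    return None


def parse_thread_count(status_text):
    """Return the 'Threads:' value from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    return None


def read_text(path):
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    # Pass the PID of the process to inspect; defaults to this script itself.
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    try:
        threads = parse_thread_count(read_text("/proc/%s/status" % pid))
        limit = parse_max_processes(read_text("/proc/%s/limits" % pid))
        print("threads in process:", threads)
        print("soft 'Max processes' limit:", limit)
    except (IOError, OSError) as error:
        print("Could not read /proc entries:", error)
```

Running it with the PID of the mono process that hosts Pulse shortly before a crash would show whether the thread count is creeping toward the limit.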
My Mono version:
mono --version
Mono JIT compiler version 2.10.5 (Debian 2.10.5-1ubuntu0.1)
Copyright © 2002-2011 Novell, Inc, Xamarin, Inc and Contributors. www.mono-project.com
TLS: __thread
SIGSEGV: altstack
Notifications: epoll
Architecture: amd64
Disabled: none
Misc: softdebug
LLVM: supported, not enabled.
GC: Included Boehm (with typed GC and Parallel Mark)
From what I see, the problem doesn’t occur only for me. Any advice other than switching machines? We use Pulse to submit jobs and to communicate with our application, so it’s rather critical to have it running smoothly.
Is Pulse running on the same machine as the repository, or is it a different machine? The original poster had it running on the same machine, which is why we were suggesting switching it to a dedicated one for testing.
The same machine, that’s why I said we have the exact same problem.
And by "under load" I meant one or two jobs in the queue, which is not a big deal, but the crash has also occurred once or twice with no load at all.
Having both (Pulse and the repository) on one machine should reduce response time and network load, is that correct?
It can, but they can also impact each other’s performance. Do you run Pulse with the user interface, or do you pass it the -nogui argument from the command line to suppress the user interface? If you’re not already using the -nogui argument, maybe you could try that for a bit to see if it improves things. Then if Pulse crashes again, send us the log and we’ll take a look.
There’s no X server on that machine, so Pulse always runs with -nogui. The error I pasted above is the only one caught in the logs; Pulse usually dies quietly, with normal-looking logs. The next time Pulse crashes, I’ll attach the log here.
Sounds good. I’m just curious to see if Pulse crashes in the same spot, or if it’s more random. So actually, if you could post the Pulse logs from the next 2 or 3 crashes, that would be great!
Also, what version of Deadline are you running?
Latest stable, that would be 5.2.47700.
I’ll provide logs once I’ve gathered them from a few crashes.
64-bit, if that matters.