Hi there,
It seems that purging job reports can cause the cleanup to stall and never finish. We had this a couple of times today: housecleaning would start, it would begin cleaning up job reports, and then nothing would happen for an hour or more after that. I have to kill the processes and restart Pulse.
Attached is the latest log.
At 14:14, the cleanup has been ‘running’ for 2 hours:
root 5858 4.2 1.6 1078848 546404 ? Dl 12:16 4:59 mono --runtime=v4.0 /opt/Thinkbox/Deadline6/bin/deadlinecommand.exe -DoHouseCleaning 10 True
root 12962 0.0 0.0 103248 852 pts/3 S+ 14:14 0:00 grep deadline
root 21437 0.0 0.0 161436 1212 ? S Nov10 0:00 /bin/su - -c "$DEADLINEBIN/deadlinelauncher" -nogui
root 21443 0.2 0.0 2796048 26600 ? Sl Nov10 27:43 mono --runtime=v4.0 /opt/Thinkbox/Deadline6/bin/deadlinelauncher.exe -nogui
root 22296 67.1 9.8 5207608 3248504 ? Sl 10:20 157:13 mono --runtime=v4.0 /opt/Thinkbox/Deadline6/bin/deadlinepulse.exe -nogui
I know about this because artists start reporting issues like this:
cheers,
laszlo
deadlinepulse-deadline-2013-11-18-0003.log (2.49 MB)
Happened again:
2013-11-18 14:52:51: purging job reports for '5286e5f54af9521a74c36ad3' because the job no longer exists
2013-11-18 14:52:53: purging job reports for '5286e620653d5f13580c4425' because the job no longer exists
2013-11-18 14:52:54: purging job reports for '5286e62b653d5f23ec94b08d' because the job no longer exists
2013-11-18 14:52:55: purging job reports for '5286e633653d5f48985f7825' because the job no longer exists
2013-11-18 14:53:06: purging job reports for '528a6ed7229aec1130fd6ad2' because the job no longer exists
2013-11-18 14:53:09: purging job reports for '528a71abb4535414dc140400' because the job no longer exists
2013-11-18 14:53:17: purging job reports for '528a778a2702df16f863d277' because the job no longer exists
2013-11-18 14:53:35: purging job reports for '528a77ba229aec159009b48e' because the job no longer exists
2013-11-18 14:53:36: purging job reports for '528a7ac434e08f1fb8bd7d0b' because the job no longer exists
2013-11-18 14:56:13: Power Management - Thermal Shutdown: Skipping zone "Test" because it is disabled
2013-11-18 14:56:13: Power Management - Thermal Shutdown: Skipping zone "Slaves" because it is disabled
2013-11-18 14:56:13: Power Management - Thermal Shutdown: Skipping zone "Laszlo" because it is disabled
2013-11-18 14:56:13: Power Management - Thermal Shutdown: Skipping zone "AnimatorWorkstations" because it is disabled
2013-11-18 14:56:13: Power Management - Idle Shutdown: Skipping idle shutdown group "Test" because it is disabled
2013-11-18 14:56:13: Power Management - Idle Shutdown: Skipping idle shutdown group "Slaves" because it is disabled
2013-11-18 14:56:13: Power Management - Idle Shutdown: Skipping idle shutdown group "Laszlo" because it is disabled
2013-11-18 14:56:13: Power Management - Idle Shutdown: Skipping idle shutdown group "AnimatorWorkstations" because it is disabled
2013-11-18 14:56:13: Power Management - Machine Startup: There are no slaves that need to be woken up at this time
2013-11-18 14:56:13: Power Management - Machine Restart: Skipping machine group "Test" because it is disabled
2013-11-18 14:56:13: Power Management - Machine Restart: Skipping machine group "Slaves" because it is disabled
2013-11-18 14:56:13: Power Management - Machine Restart: Skipping machine group "Laszlo" because it is disabled
2013-11-18 14:56:13: Power Management - Machine Restart: Skipping machine group "AnimatorWorkstations" because it is disabled
2013-11-18 14:56:13: Power Management - Slave Scheduling: Skipping scheduling group "Test" because it is disabled
2013-11-18 14:56:13: Power Management - Slave Scheduling: Skipping scheduling group "Slaves" because it is disabled
2013-11-18 14:56:13: Power Management - Slave Scheduling: Skipping scheduling group "Laszlo" because it is disabled
2013-11-18 14:56:13: Power Management - Slave Scheduling: Skipping scheduling group "AnimatorWorkstations" because it is disabled
2013-11-18 14:57:39: Server Thread - Auto Config: Received packet on autoconfig port
2013-11-18 14:57:39: Server Thread - Auto Config: Picking configuration based on:
2013-11-18 14:57:39: Server Thread - lapro3027
2013-11-18 14:57:39: Server Thread - ::ffff:172.18.8.40
2013-11-18 14:57:39: Server Thread - Auto Config: No config worth sending
2013-11-18 14:57:41: Server Thread - Auto Config: Received packet on autoconfig port
2013-11-18 14:57:41: Server Thread - Auto Config: Picking configuration based on:
2013-11-18 14:57:41: Server Thread - lapro1287
2013-11-18 14:57:41: Server Thread - ::ffff:172.18.4.77
2013-11-18 14:57:41: Server Thread - Auto Config: No config worth sending
Could the other log messages somehow hang the job report purging?
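In case it helps narrow down where the purge goes quiet, here is a rough sketch that scans a Pulse log for long silences between consecutive timestamped lines. It only assumes the `YYYY-MM-DD HH:MM:SS:` prefix visible in the excerpt above; the function name `find_gaps` and the thresholds are made up for illustration and are not part of Deadline.

```python
from datetime import datetime, timedelta

def find_gaps(lines, min_gap=timedelta(seconds=60)):
    """Return (gap, previous_line, next_line) for consecutive
    timestamped lines that are at least min_gap apart."""
    gaps = []
    prev_time = prev_line = None
    for line in lines:
        line = line.rstrip("\n")
        try:
            # Assumed format: the first 19 characters are the timestamp.
            stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a leading timestamp
        if prev_time is not None and stamp - prev_time >= min_gap:
            gaps.append((stamp - prev_time, prev_line, line))
        prev_time, prev_line = stamp, line
    return gaps

# A few lines copied from the log excerpt above.
sample = [
    "2013-11-18 14:53:35: purging job reports for '528a77ba229aec159009b48e' because the job no longer exists",
    "2013-11-18 14:53:36: purging job reports for '528a7ac434e08f1fb8bd7d0b' because the job no longer exists",
    '2013-11-18 14:56:13: Power Management - Thermal Shutdown: Skipping zone "Test" because it is disabled',
    "2013-11-18 14:57:39: Server Thread - Auto Config: Received packet on autoconfig port",
]

for gap, before, after in find_gaps(sample, min_gap=timedelta(seconds=120)):
    print(f"{gap} of silence after: {before}")
```

To run it against a real log, pass an open file object instead of `sample`. A gap only shows that nothing was logged in that interval; it does not prove the purge itself was blocked.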
It could be due to the sheer number of logs you have in your database. We’re going to try to figure out a way to make this scale better.