Hey guys,
Where does the Deadline Monitor get the time Pulse last ran the pending job scan, repository repair, and house cleaning? We are looking to monitor/graph this information. Is it kept in the mongo database? The closest thing I could see was deadline10db_StatusInfo.PulseInfo
Thanks in advance!
While we don’t support you grabbing stuff from the database (we can make changes there between major versions), that’s where it is. Most of the apps that extend the MachineInfo class have their own bits in the same kind of structure. SlaveInfo, PulseInfo, BalancerInfo, etc.
For reference, I've found it in deadline8db_Config, under:
db.DeadlineSettings.findOne({_id:"lock_entries"})
{
    "_id" : "lock_entries",
    "Flags" : 0,
    "LastRepositoryRepairEntry" : {
        "InProgress" : false,
        "MachineName" : "deadline01.scanlinevfxla.com",
        "Heartbeat" : ISODate("2017-10-12T17:03:39.564Z"),
        "LastCheck" : ISODate("2017-10-12T17:03:39.564Z")
    },
    "LastHouseCleaningEntry" : {
        "InProgress" : true,
        "MachineName" : "deadline01.scanlinevfxla.com",
        "Heartbeat" : ISODate("2017-10-12T17:03:51.738Z"),
        "LastCheck" : ISODate("2017-10-12T17:00:46.115Z")
    },
    "LastPendingJobScanEntry" : {
        "InProgress" : false,
        "MachineName" : "deadline01.scanlinevfxla.com",
        "Heartbeat" : ISODate("2017-10-12T17:03:22.538Z"),
        "LastCheck" : ISODate("2017-10-12T17:03:22.538Z")
    },
    "LastThermalShutdownEntry" : {
        "InProgress" : false,
        "MachineName" : "LAPRO1374",
        "Heartbeat" : ISODate("2017-10-11T14:48:28.069Z"),
        "LastCheck" : ISODate("2017-10-11T14:48:28.069Z")
    }
}
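For graphing, the LastCheck values in that document can be turned into "seconds since last run" numbers. Here is a rough Python sketch of that idea. The function name is made up for illustration, and the commented-out connection details (host, port, use of pymongo) are assumptions rather than anything Deadline provides; only the field names come from the output above. As noted earlier in the thread, this database layout is unsupported and may change between major versions.

```python
from datetime import datetime, timezone

# Field names as they appear in the lock_entries document shown above.
ENTRY_KEYS = (
    "LastPendingJobScanEntry",
    "LastRepositoryRepairEntry",
    "LastHouseCleaningEntry",
)


def seconds_since_last_check(doc, now=None):
    """Return {entry name: seconds since its LastCheck} for the entries present."""
    now = now or datetime.now(timezone.utc)
    ages = {}
    for key in ENTRY_KEYS:
        entry = doc.get(key)
        if not entry or "LastCheck" not in entry:
            continue  # entry missing from this repository, skip it
        last = entry["LastCheck"]
        if last.tzinfo is None:
            # pymongo returns naive datetimes in UTC by default
            last = last.replace(tzinfo=timezone.utc)
        ages[key] = (now - last).total_seconds()
    return ages


# Hypothetical usage; host and port are assumptions for your own setup:
# from pymongo import MongoClient
# db = MongoClient("mongodb://localhost:27017")["deadline8db_Config"]
# doc = db["DeadlineSettings"].find_one({"_id": "lock_entries"})
# print(seconds_since_last_check(doc))
```

Keeping the calculation separate from the query means it can be checked against a hand-written dict without a live database.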
We don't like going to the DB directly either... :\ It would be best if this data were accessible through the REST API.
I’m not at all sure why I didn’t think of this, but there are also the standard API functions for this:
docs.thinkboxsoftware.com/produ … aff035f241
I don’t have a great example of pulling PulseInfo, but it’s about the same as SlaveInfo, so here are a few examples for that one:
[github.com/ThinkboxSoftware/Dea ... aveInfo.py](https://github.com/ThinkboxSoftware/Deadline/blob/23ef6b53f21fd26af65239e7717a246e90f02d3e/Custom/scripts/General/QuerySlaveInfo.py)
[github.com/ThinkboxSoftware/Dea ... aveRunning](https://github.com/ThinkboxSoftware/Deadline/tree/23ef6b53f21fd26af65239e7717a246e90f02d3e/Examples/DeadlineCommand/IsSlaveRunning)
This particular information (when the pending job scan, house cleaning, and repository repair last ran) does not seem to be exposed through this API.
The PulseInfo class contains the following (the docs list only a fraction of these):
CPUUsage
CPUs
CompareTo
CouchRevision
CouchbaseCAS
DiskReads
DiskSpace
DiskSpaceString
DiskWrites
Equals
Finalize
FreeMemory
GetHashCode
GetType
HostName
ID
IPAddress
IsAWSPortalInstance
LastReadRepoTime
LastReadTime
LastWriteTime
MACAddress
MachineArchitecture
MachineCPUUsage
MachineCPUs
MachineDiskSpace
MachineFreeMemory
MachineIPAddress
MachineMACAddress
MachineMemory
MachineOperatingSystem
MachineProcessorSpeed
MachineRealName
MachineUserName
MachineVideoCard
MemberwiseClone
Memory
NetworkReceived
NetworkSent
OSShortName
Overloads
PULSE_NAME
Port
ProcessorArchitecture
ProcessorSpeed
PulseName
PulsePort
PulseRegion
PulseRunningTime
PulseState
PulseStatus
ReferenceEquals
Region
ServerPort
StateDateTime
SwapUsage
ToString
UpTimeSeconds
UpdateDateTime
UserName
Version
VideoCard
ZoneTemperatures
And the PulseSettings class contains:
CouchRevision
CouchbaseCAS
Equals
Finalize
GetHashCode
GetType
HostMachineIPAddressOverride
ID
LastWriteTime
ListeningPort
MacAddressOverride
MemberwiseClone
Overloads
OverrideListeningPort
OverridePort
Port
PrimaryPulse
PulseHostMachineIPAddressOverride
PulseIsPrimary
PulseListeningPort
PulseMacAddressOverride
PulseName
PulseOverrideListeningPort
PulseOverridePort
PulsePort
All of these are per-Pulse settings or info, not the global repository values for those three properties. I tried digging around a bit, but couldn't find anything else that might be related.
We are good for now. Robert has set up a direct MongoDB query and attached a PRTG sensor to it so we can monitor it (sometimes Pulse crashes, and in Deadline 8 the failover to a secondary Pulse doesn’t seem to work). It is essential for us to be able to tell when these processes (pending job scan, etc.) have not run for a while.
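For anyone wanting a similar check, here is a minimal Python sketch of the "has it stopped running for a while" test against the lock_entries document. It is not the actual query behind our sensor. The thresholds and function names are hypothetical, and treating Heartbeat as a sign of life for a run that is still in progress is an assumption based on the sample document, not documented behaviour.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical thresholds; tune them to the intervals configured for your repository.
MAX_AGE = {
    "LastPendingJobScanEntry": timedelta(minutes=10),
    "LastRepositoryRepairEntry": timedelta(minutes=30),
    "LastHouseCleaningEntry": timedelta(minutes=30),
}


def _as_utc(ts):
    # Treat naive datetimes (pymongo's default) as UTC.
    return ts if ts.tzinfo else ts.replace(tzinfo=timezone.utc)


def stalled_entries(doc, now=None, max_age=MAX_AGE):
    """Return sorted names of entries whose newest timestamp exceeds its threshold."""
    now = now or datetime.now(timezone.utc)
    stalled = []
    for key, limit in max_age.items():
        entry = doc.get(key) or {}
        # Assumption: Heartbeat keeps advancing while a run is in progress,
        # so the newer of LastCheck and Heartbeat counts as the last sign of life.
        stamps = [_as_utc(entry[f]) for f in ("LastCheck", "Heartbeat") if entry.get(f)]
        # A missing entry is reported as stalled, since nothing has reported in.
        if not stamps or now - max(stamps) > limit:
            stalled.append(key)
    return sorted(stalled)
```

An empty result means everything reported in recently; a monitoring sensor could alert on any non-empty list.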
Fair enough. I’ll open an issue to get the Doxygen docs synced up (fields need to be properly exposed).
For Pulse crashing, I don’t remember which OS you’re on for those guys. I’ll see if I can find a tracking issue over here.
Edit: I forgot that the last scans should be global to the farm and are probably going to go into RepositoryUtils.