Not sure what it might be (maybe PyQt5?), and since this isn't really possible to measure I don't have exact metrics, but the Deadline 7 Monitor seems more sluggish than Deadline 6's.
Menus appear with a slight delay (1-3 seconds), moving splitters is slower, moving between menu items is slow, opening settings dialogs takes seconds, etc. Is this a known issue?
For example, when clicking on menus, I first see the drop shadow being drawn, and then the items show up. In Deadline 6, it's instant.
We currently only have 30-40 jobs in D7 (albeit a lot of slaves), but even with 200x that many jobs, the Deadline 6 Monitor feels faster.
We’ll look into this ASAP, but I just want to check if you have dynamic sorting and filtering enabled in the Monitor panels in v7, but not v6, as that could explain the difference in performance.
Also, how many slaves do you currently have in your v7 Monitor?
Dynamic sorting is on in both versions. I tried turning it off in D7, but the slow response time didn't really change. Even with all slave panels closed, it was still somewhat slower than D6 with 4+ slave panels open.
I'll try to record a Camtasia video if I get some time today.
Attached is a Camtasia recording. Note that while making this video I noticed a couple of things: first, the responsiveness depends on the machine you are using.
My workstation is much slower than this one (but it didn't have Camtasia installed, so I had to use another box). The delays between the slow periods on my workstation are shorter, maybe 10 seconds or so, and when the menu comes up, you can see it being drawn.
Also, on my workstation the menus have drop shadows. Note that the drop shadows don't appear in this relatively faster Monitor in the Camtasia recording. Weird…
So maybe some of the GUI features work on some machines (making them slower), but not on others? deadline7_monitor_menu_speed.mov (8.06 MB)
The settings are near-identical; some were set back to defaults because we anticipate better performance from D7 splitting the databases (we have severe locking/Mongo queue issues with the default settings in D6).
Green shows update intervals that are actually longer in D7 than in D6.
Red shows update intervals that are shorter in D7 compared to D6 (note that all of the shorter ones should only affect the Mongo DB, not the Monitor, since the Monitor-facing intervals are the same or longer).
Thanks for sharing your settings! I think the problem is a result of the shorter interval for slave info updates (in red). This setting does impact the Monitor: with a Monitor slave update interval of 30s and a slave info interval of 20s, the Monitor will be updating every active slave on every refresh (which is 2000+ updates).
Can you try bumping the slave info updates (in red) to 60s and see if that helps? At least then, on average, the Monitor will only be processing half the number of active slaves on each update.
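As a back-of-envelope check, the effect of that interval change looks roughly like this (a sketch with illustrative numbers; `updates_per_refresh` is a hypothetical helper, not Deadline code):

```python
def updates_per_refresh(num_slaves, info_interval_s, monitor_interval_s):
    """Rough average of slave-info records the Monitor processes per
    refresh, assuming slave updates are spread uniformly over time."""
    # Fraction of slaves whose info has changed since the last refresh.
    changed_fraction = min(1.0, monitor_interval_s / info_interval_s)
    return int(num_slaves * changed_fraction)

# 20s info interval: every slave has fresh info on every 30s refresh.
print(updates_per_refresh(2000, 20, 30))  # -> 2000
# 60s info interval: on average only half the slaves per refresh.
print(updates_per_refresh(2000, 60, 30))  # -> 1000
```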
Would it be worth having different slave update times for different slave states? Say, an idle machine could update much less frequently than a machine that's actively rendering.
In addition, maybe only update the status if there are significant changes.
For example, while memory/CPU usage might change all the time, rendering status changes less frequently; version/pools/limit groups/whitelists are essentially one-off changes that happen very infrequently; and some of the fields almost never change (CPU count, speed, total memory, operating system, graphics card, etc.).
Is all slave data currently updated by the slave and then rerequested by the monitor? Or just information that can potentially change?
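A minimal sketch of the "only send what changed" idea (the field names here are hypothetical, not Deadline's actual slave info schema):

```python
def diff_state(prev, curr):
    """Return only the fields whose values changed since the last update."""
    return {key: val for key, val in curr.items() if prev.get(key) != val}

prev = {"status": "Idle", "cpu_usage": 12, "cpu_count": 16, "total_memory_gb": 64}
curr = {"status": "Rendering", "cpu_usage": 87, "cpu_count": 16, "total_memory_gb": 64}

# Static fields (cpu_count, total_memory_gb) drop out of the payload.
print(diff_state(prev, curr))  # -> {'status': 'Rendering', 'cpu_usage': 87}
```

The trade-off is that the receiver has to merge deltas into the last known state, which adds bookkeeping on both ends.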
We had actually attempted to do this for 7.0 (the initial public release actually did this). However, during our cloud scaling tests, we discovered that it performed quite poorly compared to sending the full slave info state. That's why we put out the 7.0.1 patch release so quickly to address this.
Also, this wouldn’t impact the Monitor’s performance because checks are already made to only update data for a cell that has changed.
We already have a couple ideas on how we can improve the performance of the list controls in the Monitor, and they’re currently on the roadmap for 8.0.
Did some time profiling of the GUI in D7 using cProfile, to try to narrow down why interactivity (moving splitters, etc.) might be slow.
In this session, I opened the Monitor for about 45 seconds, and then dragged the splitter between the job and slave panels up and down (I get only about 2-4 fps doing that) constantly until shutdown:
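For reference, the setup was along these lines (a generic cProfile sketch, not the exact session; the lambda is a stand-in workload for interacting with the Monitor):

```python
import cProfile
import io
import pstats

def profile_report(workload, top=10):
    """Run a callable under cProfile and return the top entries
    sorted by cumulative time as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    workload()
    profiler.disable()

    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(top)
    return stream.getvalue()

# Stand-in workload; in practice this was ~45s of Monitor interaction.
print(profile_report(lambda: sum(i * i for i in range(100000))))
```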
In the static session, after UpdateModel (which runs in the background, so is expected to take a long time), the next entry is {method 'emit' of 'PyQt5.QtCore.pyqtBoundSignal' objects}, which I'm guessing is signals to the GUI to update. This all makes sense.
However, when the splitters are being moved around, around 40% of the time is spent doing the following. I'm guessing not much can be done about this, right? (It's really weird that endInsertRows gets called when nothing is changing, just the size of the window):
Based on my mini research, most of the Monitor GUI speed limitations are due to PyQt.
Do you see any chance of moving to C++/Qt in the future for performance? The sluggishness can be a real downer :-\
In the previous log, I was able to optimize the function call to:
C:\Program Files\Thinkbox\Deadline7\bin\UI\ThinkboxUI\Models\MultiColumnSortFilterProxyModel.py:67(headerData)
by caching the sourceModel in an internal variable (it basically went from needing 5-6 seconds to needing 1 second for about the same number of function calls):
The rest of the functions don't seem to be optimizable any further. I think the overhead of Python is the real killer here: hundreds of thousands of small function calls, etc.
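The change is essentially the standard "cache an expensive lookup" pattern. A toy illustration (class and attribute names are stand-ins, not the actual MultiColumnSortFilterProxyModel code):

```python
class SourceModel:
    def __init__(self, headers):
        self.headers = headers

class CachingProxy:
    """Toy proxy model that caches its source model instead of
    re-fetching it on every headerData() call."""
    def __init__(self, source):
        self._source = source
        self._cached_source = None
        self.source_model_calls = 0  # instrumentation for the sketch

    def sourceModel(self):
        # In PyQt this crosses the Python/C++ boundary, which adds up
        # when called hundreds of thousands of times.
        self.source_model_calls += 1
        return self._source

    def headerData(self, section):
        if self._cached_source is None:
            self._cached_source = self.sourceModel()
        return self._cached_source.headers[section]

proxy = CachingProxy(SourceModel(["Name", "Status", "CPU"]))
for _ in range(100000):
    proxy.headerData(1)
print(proxy.source_model_calls)  # -> 1 instead of 100000
```

Note that in a real proxy model the cached reference would need to be invalidated whenever the source model is swapped out (e.g. in setSourceModel).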
For Deadline 8, we currently plan on looking at moving the list controls and their data models into a separate C++ library, similar to how the node view works. The theory is that this will speed things up because all of the heavy computation will be done in C++ instead of Python. If this works, the rest of the Python code can be left alone, since the vast majority of the computation happens in the list controls.
Thanks Ryan, I think this will make a huge difference for us! It seems like most of the time is spent in a crazy number of calls between the models and the controls. That's all expected, of course, but the added overhead of Python/PyQt may make it slower than it could be with the amount of data we have.