I’ve been writing some custom submission scripts in Python, and I need to import a number of Python modules, such as BeautifulSoup for pulling data from a web page. I have no issues importing BeautifulSoup on its own, but as soon as I include it in a submission script containing the following Deadline imports, I’m unable to import BeautifulSoup:
[code]
from System.Collections.Specialized import *
from System.IO import *
from System.Text import *
from System.IO import File
from Deadline.Scripting import *
from DeadlineUI.Controls.Scripting.DeadlineScriptDialog import DeadlineScriptDialog[/code]
When I try running the submission script, I get the following:
2016-07-11 17:21:38: Traceback (most recent call last):
2016-07-11 17:21:38: File "DeadlineUI/Commands/ScriptCommands.py", line 110, in InnerExecute
2016-07-11 17:21:38: PythonNetException: AttributeError : 'module' object has no attribute 'BeautifulSoup'
2016-07-11 17:21:38: File "\\redacted1\DeadlineRepository8\custom\scripts\Submission\redacted.py", line 75, in __main__
2016-07-11 17:21:38: htmlSoup = bs4.BeautifulSoup(parseText, 'html.parser')
Using Deadline8’s bundled Python, I’m able to run a test script importing and using BeautifulSoup without issue – the only difference is the import of the Deadline libraries.
Hmm, so I’ve downloaded BeautifulSoup and done some tests and seem to have replicated your issue.
I’ve been getting a slightly different error, but it’s entirely possible/likely that it just comes down to differences in the actual script we’ve used.
Here’s the one I’ve been getting, for reference:
Error: String cannot have zero length.
at System.Reflection.RuntimeAssembly.GetType(RuntimeAssembly assembly, String name, Boolean throwOnError, Boolean ignoreCase, ObjectHandleOnStack type)
at System.Reflection.RuntimeAssembly.GetType(String name, Boolean throwOnError, Boolean ignoreCase)
at Python.Runtime.AssemblyManager.LookupType(String qname)
at Python.Runtime.ModuleObject.GetAttribute(String name, Boolean guess)
at Python.Runtime.ImportHook.__import__(IntPtr self, IntPtr args, IntPtr kw)
at Python.Runtime.Runtime.PyObject_Call(IntPtr pointer, IntPtr args, IntPtr kw)
[...]
I’ve seen this error reported before, but was never able to reproduce it. I was able to narrow this down to the specific area in the BeautifulSoup module that causes it, so hopefully we should be able to get a fix in quickly.
I wound up getting the error that you see during some of my testing. Sadly I can’t remember exactly what it was I did that generated it. I do know that I tried a variety of things to resolve this issue, including importing BeautifulSoup as a different name, which didn’t help either:
from bs4 import BeautifulSoup as bSoup
Thanks for the update, hopefully there will be a fix soon.
Alright, I found the issue, and the fix on our end was pretty easy; it should be in the next 8.0 build. The problem came from a couple of relative imports in BeautifulSoup that fail on initial import, which leads Python.NET to try to resolve them as a .NET namespace (which it doesn’t like, because the module name is just a ‘.’).
I found a workaround that you can apply to your BeautifulSoup install if you need to get around this problem before the fix is released. Here are the imports that are troublesome to Deadline, inside the “bs4/builder/__init__.py” script (lines 313-324 in the version I downloaded):
try:
    from . import _html5lib
    register_treebuilders_from(_html5lib)
except ImportError:
    # They don't have html5lib installed.
    pass
try:
    from . import _lxml
    register_treebuilders_from(_lxml)
except ImportError:
    # They don't have lxml installed.
    pass
There are a few options here. You can remove either or both of these imports if your install of BeautifulSoup uses neither html5lib nor lxml. You can install html5lib and lxml so that the imports succeed (these imports are only a problem for Deadline if they fail). Or you could tweak the imports a bit, like so:
try:
    from ..builder import _html5lib
    register_treebuilders_from(_html5lib)
except ImportError:
    # They don't have html5lib installed.
    pass
try:
    from ..builder import _lxml
    register_treebuilders_from(_lxml)
except ImportError:
    # They don't have lxml installed.
    pass
The next 8.0 release should be out soon as well, so you could also just wait for it to be fixed on our end.
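If it helps to decide between those options, here is a minimal sketch for checking whether the optional parsers are importable at all. The idea is to run it with Deadline’s bundled Python outside of a submission script. The function name `missing_optional_parsers` is made up for illustration and is not part of BeautifulSoup or Deadline.

```python
import importlib


def missing_optional_parsers(names=("html5lib", "lxml")):
    """Return the names from `names` that cannot be imported.

    Hypothetical helper: it only attempts a plain absolute import of each
    name and records the ones that raise ImportError.
    """
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # An empty list means both optional parsers are installed, so the
    # relative imports in bs4/builder/__init__.py should succeed.
    print(missing_optional_parsers())
```

If the list comes back non-empty, the corresponding relative import in BeautifulSoup will fail, which is the case that trips up Deadline builds without the fix.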
Yeah, 8.0.6.5 should fix that issue – it should be raising the original Python exception (ImportError) when a relative module fails to import. Let me know if you’re still getting that issue after trying 8.0.6+
Are you using your own copy of the Google API? We’re shipping a (probably outdated) copy of it with Deadline, which does not contain a ‘GoogleCredential’ member in the oauth2client.client module. I’m pretty sure what happened here is that we fixed a bug in the local syncing of the third-party APIs Deadline ships with, and also rejiggered the order in which we internally add paths to Python’s sys.path to give preference to the libraries Deadline ships with (primarily to avoid PyQt conflicts).
You could work around this by prepending to sys.path the path where your own copy of oauth2client is located, which should correctly load yours instead of ours (unless it’s already been loaded, in which case you would need to explicitly reload it after doing your import).
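A rough sketch of that suggestion, under a few assumptions: `prefer_local_copy` is a hypothetical helper name, the directory you pass in is wherever your own copy lives, and instead of calling reload it drops the cached modules from `sys.modules` so the next import picks up your copy (a variation on the reload idea, not the same thing).

```python
import sys


def prefer_local_copy(package_dir, package_names):
    """Put package_dir first on sys.path and forget cached copies.

    Hypothetical helper. package_names lists top-level packages
    (for example "oauth2client") whose cached modules should be dropped
    so that the next import searches sys.path again.
    """
    # Move the directory to the front of the search path.
    while package_dir in sys.path:
        sys.path.remove(package_dir)
    sys.path.insert(0, package_dir)

    # Evict the packages and their submodules from the module cache.
    for loaded in list(sys.modules):
        for name in package_names:
            if loaded == name or loaded.startswith(name + "."):
                del sys.modules[loaded]
                break


# Usage (the path is a placeholder, substitute your own):
# prefer_local_copy(r"<path to your site-packages>", ["oauth2client"])
# import oauth2client.client
```

Note that anything that already holds a reference to the old module objects will keep using them, so this needs to run before the rest of the script imports the library.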
Thanks Jon. What I’ve been doing to date is using a copy of the Google API that I installed using Deadline’s bundled Python and pip. This was installed prior to the upgrade to 8.0.7.3 (when it stopped working). Unless I’m mistaken, that should install it in c:\ProgramFiles\Thinkbox\Deadline8\bin\lib\site-packages.
Per your suggestion, I’ve tried this as a workaround:
After working with support for a bit on this, it turns out that the issue resurfaced due to a new feature in 8.0.7.3. The conflicting imports were due to the .zip archived Python modules stored in C:\Users\<user>\AppData\Local\Thinkbox\Deadline8\pythonAPIs\XXXXXXXXXXXXXX.
The most successful workaround has been to modify sys.path (removing the entry for C:\Users\<user>\AppData\Local\Thinkbox\Deadline8\pythonAPIs\XXXXXXXXXXXXXX) at the very beginning of the custom submission script, prior to any import statements. After submission/closure, the script then adds the previously removed path back to sys.path.
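For anyone who wants to try the same thing, here is a minimal sketch of that remove-then-restore approach. It is not the exact script used above. The helper name `without_sys_path_entries` and the idea of matching on the substring "pythonAPIs" are assumptions for illustration, so adjust the marker to whatever the entry looks like on your machine.

```python
import contextlib
import sys


@contextlib.contextmanager
def without_sys_path_entries(marker):
    """Temporarily remove sys.path entries containing `marker`.

    Hypothetical helper: entries are taken out on entry and put back
    at their original positions on exit, even if an exception occurs.
    """
    removed = [(i, p) for i, p in enumerate(sys.path) if marker in p]
    sys.path[:] = [p for p in sys.path if marker not in p]
    try:
        yield [p for _, p in removed]
    finally:
        # Re-insert in ascending index order to restore the old layout.
        for index, path in removed:
            if path not in sys.path:
                sys.path.insert(min(index, len(sys.path)), path)


# Usage at the very top of a submission script, before other imports:
# with without_sys_path_entries("pythonAPIs"):
#     import oauth2client.client   # resolves to the pip-installed copy
#     ...build and show the submission dialog...
```

A context manager only fits if the dialog runs to completion inside the `with` block. If the script returns while the dialog is still open, the restore step would need to move into a close callback instead.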