I’ve been running Redshift 3.0.67, using Deadline UBL (Repo 10.1.21.4) to hand out licenses, and it’s been working great. I just tested 3.5.04, and UBL through Deadline still works on a Windows render box. However, when I tested 3.5.04 on AWS (Linux), it gave me this error:
2022-06-20 15:53:08: 0: STDOUT: License error: Error communicating with license server (-17)
2022-06-20 15:53:08: 0: STDOUT: License error: (RLM) Communications error with license server (-17)
2022-06-20 15:53:08: 0: STDOUT: Connection refused at server (-111)
Reverting to 3.0.67 in the same session makes everything work, so it’s not a firewall issue with AWS. License forwarder logs have no indication of failure.
2022-06-20 10:36:00: ::ffff:xx.xx.xx.xx has connected
2022-06-20 10:36:00: License Forwarder - Received request to register ip-xx-xx-xx-x/::ffff:xx.xx.xx.xx for feature redshift.
I did notice this line is new with the 3.5.04 version:
We’ve been using Houdini 19.0.589 + Redshift 3.5.03 on the DL 10.1.19.4 Linux AMI, with UBL for H19 and RS 3.5, without any issues. We did not try standalone RS, so I cannot say whether that works on the AMI.
I do see the same RLM License Search path for RS v 3.5.03 – however, I do see a line before that with the redshift_LICENSE env var:
Maybe you’re just missing the port number since it did contact the license forwarder correctly. Did the license forwarder log report that it was listening on port 5054 for redshift or maybe that port was already in use?
Is there a way you can test with RS v3.5.03? I know when we first tried the Linux version of RS v3.5.01, it was missing some libs and just didn’t work, so I’d maybe just see if you can get the licensing to work with another 3.5.x version (well, not 3.5.01 LOL).
I should have clarified, this is a Redshift Standalone job.
I just checked the task reports again and it does include the line STDOUT: [Redshift] redshift_LICENSE=5054@127.0.0.1
I then checked another task report and just noticed it has additional lines saying Read error from network and system call error:
2022-06-20 15:25:29: 0: STDOUT: License error: Error communicating with license server (-17)
2022-06-20 15:25:29: 0: STDOUT: License error: (RLM) Communications error with license server (-17)
2022-06-20 15:25:29: 0: STDOUT: Read error from network (-105)
2022-06-20 15:25:29: 0: STDOUT: select() system call error (comm: -15)Interrupted system call (errno: 4)
Thing is, with the instance still running, I can just install 3.0.67 on top of 3.5.04 and everything works as it should.
Could you test a standalone RS job to see if you run into this issue?
I’d start up an instance and try it right now but it appears there’s no spot capacity at the moment.
Unfortunately, I cannot test a standalone RS job at the moment (I’m on the East Coast and I’m heading out of the office for the day.) I can maybe try tomorrow if I get a chance. However, I do think RS uses the same licensing regardless of standalone or not.
Those errors are all RLM related – I really want to say port related.
RLM_EL_COMM_ERROR (-17) Error communicating with server
This indicates a basic communication error with the license server, either in a network initialization, read, or write call.
RLM_EH_NET_RERR (-105) Error reading from network
RLM_EH_CONN_REFUSED (-111) Connection refused at server
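For anyone sifting through a lot of task reports, here is a minimal Python sketch that pulls the RLM codes out of log lines and labels them. The lookup table only contains the three codes quoted in this thread, and the function name `rlm_codes_in_log` is made up for illustration; it is not part of Deadline, Redshift, or RLM.

```python
import re

# Only the codes quoted in this thread; not a complete RLM error table.
RLM_CODES = {
    -17: "RLM_EL_COMM_ERROR: Error communicating with server",
    -105: "RLM_EH_NET_RERR: Error reading from network",
    -111: "RLM_EH_CONN_REFUSED: Connection refused at server",
}

# Matches a trailing "(-NNN)" such as "... license server (-17)".
CODE_RE = re.compile(r"\((-\d+)\)\s*$")


def rlm_codes_in_log(lines):
    """Return the distinct known RLM codes found at line ends, in first-seen order."""
    seen = []
    for line in lines:
        match = CODE_RE.search(line)
        if not match:
            continue
        code = int(match.group(1))
        if code in RLM_CODES and code not in seen:
            seen.append(code)
    return seen
```

Calling `rlm_codes_in_log(open("task_report.txt"))` (the file name is hypothetical) and looking each result up in `RLM_CODES` shows at a glance whether a report hit the "connection refused" case or the "read error" case.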
Hope you were able to get it working. I didn’t have a chance to test until today. I was able to render RS standalone with UBL using versions 3.5.03 and 3.5.04.
From the license forwarder:
2022-07-22 19:51:18: ::ffff:10.128.66.1 has connected
2022-07-22 19:51:18: License Forwarder - Received request to register ip-10-128-66-1/::ffff:10.128.66.1 for feature redshift.
2022-07-22 19:52:28: License Forwarder Tunneler Thread for redshift ( 5054 ) : 10.128.66.1 : Connection received!
2022-07-22 19:52:28: License Forwarder Tunneler Thread for redshift ( 7054 ) : 10.128.66.1 : Connection received!
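Since the task report prints `redshift_LICENSE=5054@127.0.0.1` and the forwarder log above shows tunneler connections for both 5054 and 7054, a quick sanity check is whether anything accepts TCP connections on those ports from the render node. Below is a rough Python sketch of that check. The helper names are invented for illustration, and whether 7054 is also expected to be reachable on the worker side is an assumption based only on the log lines above.

```python
import socket


def parse_license_var(value):
    """Split an RLM-style 'port@host' string into (host, port)."""
    port, _, host = value.partition("@")
    if not host or not port.isdigit():
        raise ValueError(f"expected 'port@host', got {value!r}")
    return host, int(port)


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

On the render node, while a task is running, something like `host, port = parse_license_var("5054@127.0.0.1")` followed by `can_connect(host, port)` and `can_connect(host, 7054)` would show whether the "Connection refused" is happening at the TCP level or further up in RLM.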
In the end I was unable to rely on AWS. I migrated over to Paperspace and quickly ran into the same issue, even on a Windows machine. Firewall settings, ports, etc. didn’t change anything. Out of desperation I created a new license forwarder on a different local machine, and it started working, so that was a relief. What was frustrating was that the machines were suddenly pulling from the ‘broken’ license forwarder! Creating a new license forwarder changed something in the repo or SOMETHING, because it all started working without an actual “fix”. I removed the new license forwarder and it all kept chugging along like nothing happened. The worst bugs are the ones that resolve themselves without you knowing what was fixed, because you can’t repeat the fix if it happens again. Not sure if this will help anyone in the future, but who knows.
I’m hitting this issue, getting different errors between versions 3.5.19 and 3.6.04. I can see on the license forwarder and the FlexNet web page that the licenses are being checked out.
Might be unrelated, but another Deadline issue I had was solved by deleting the “libredshift-core-cpu.so” file in the bin folder. Not sure if it’s still necessary, but with every update I just obliterate that file.
Details are here: redshift.maxon.net/post/332957