Can't run "execute command..." on half of my slaves.

Hi,

With Deadline 6 I could do this: right-click all my slaves and use “Execute Command…” to run a .bat file with a robocopy command that copies most of my plugins to all servers.

But with Deadline 7.1 I get errors. Half of my slaves don’t execute the command at all and give this error:

But the option is obviously checked in the “Client Setup” (BTW, there is no “Client Settings”).

The other half execute the command correctly but give this error:

Also, I can’t shut down those same machines via remote control.

Here is the log from one of those slaves; it seems it can’t connect to the database:

2015-06-15 23:34:39: Exception.Source: MongoDB.Driver
2015-06-15 23:34:39: Exception.HResult: -2146233088
2015-06-15 23:34:39: Exception.StackTrace:
2015-06-15 23:34:39: at MongoDB.Driver.Internal.DirectMongoServerProxy.Connect(TimeSpan timeout, ReadPreference readPreference)
2015-06-15 23:34:39: at MongoDB.Driver.Internal.DirectMongoServerProxy.ChooseServerInstance(ReadPreference readPreference)
2015-06-15 23:34:39: at MongoDB.Driver.MongoServer.AcquireConnection(ReadPreference readPreference)
2015-06-15 23:34:39: at MongoDB.Driver.MongoCollection.RunCommandAs[TCommandResult](IMongoCommand command, IBsonSerializer resultSerializer, IBsonSerializationOptions resultSerializationOptions)
2015-06-15 23:34:39: at MongoDB.Driver.MongoCollection.RunCommandAs[TCommandResult](IMongoCommand command)
2015-06-15 23:34:39: at MongoDB.Driver.MongoCollection.Count(CountArgs args)
2015-06-15 23:34:39: at Deadline.StorageDB.MongoDB.MongoSlaveStorage.SlavesInState(SlaveStatus targetStatus)
2015-06-15 23:34:39: at Deadline.Slaves.SlaveSchedulerThread.i()
2015-06-15 23:34:39: <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2015-06-15 23:35:03: Slave - An error occurred while updating the slave’s info: An error occurred while trying to connect to the Database (192.168.1.107:27070). It is possible that the Mongo Database server is incorrectly configured, currently offline, blocked by a firewall, or experiencing network issues.
2015-06-15 23:35:03: Full error: Unable to connect to server Newegg-a:27070: No such host is known. (FranticX.Database.DatabaseConnectionException)
2015-06-15 23:35:26: Error occurred while writing report log:
2015-06-15 23:35:26: Exception Details
2015-06-15 23:35:26: IOException – The specified network name is no longer available.
2015-06-15 23:35:26: Exception.Data: ( )
2015-06-15 23:35:26: Exception.TargetSite: Void WinIOError(Int32, System.String)
2015-06-15 23:35:26: Exception.Source: mscorlib
2015-06-15 23:35:26: Exception.HResult: -2147024832
2015-06-15 23:35:26: Exception.StackTrace:
2015-06-15 23:35:26: at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
2015-06-15 23:35:26: at System.IO.File.InternalCopy(String sourceFileName, String destFileName, Boolean overwrite, Boolean checkHost)
2015-06-15 23:35:26: at Deadline.StorageDB.SlaveStorage.WriteSlaveReportFile(Report report, String reportFileName)
2015-06-15 23:35:50: Info Thread - An exception occurred while updating slave’s info: An error occurred while trying to connect to the Database (192.168.1.107:27070). It is possible that the Mongo Database server is incorrectly configured, currently offline, blocked by a firewall, or experiencing network issues.
2015-06-15 23:35:50: Full error: Unable to connect to server Newegg-a:27070: No such host is known. (FranticX.Database.DatabaseConnectionException)
2015-06-15 23:36:13: An error occurred while saving slave report: An error occurred while trying to connect to the Database (192.168.1.107:27070). It is possible that the Mongo Database server is incorrectly configured, currently offline, blocked by a firewall, or experiencing network issues.
2015-06-15 23:36:13: Full error: Unable to connect to server Newegg-a:27070: No such host is known. (FranticX.Database.DatabaseConnectionException)
2015-06-15 23:36:53: Skipping thermal shutdown check because it is not required at this time
2015-06-15 23:36:56: Skipping pending job scan because it is not required at this time
2015-06-15 23:36:56: Skipping repository repair because it is not required at this time
2015-06-15 23:36:56: Skipping house cleaning because it is not required at this time

And again, if I log off and log back in, it all works correctly (I could shut the machine down remotely after logging off and back in).

Based on the log snippet you shared, it looks like the slave is losing its connection to both the repository and the database. Since you also cannot remotely control the slave at that time, I suspect a network connection loss is involved. Next time this happens, can you open the command line and try to ping the DB machine? Thanks.
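
If it is easier than pinging by hand, the same check can be scripted. This is only a minimal sketch, assuming Python is available on the slave. `check_endpoint` is a hypothetical helper name, not part of Deadline. It separates the two failure modes that show up in the log: the host name failing to resolve ("No such host is known") versus the port not answering.

```python
import socket


def check_endpoint(host, port, timeout=5.0):
    """Return (resolved_ip, reachable, detail) for a host/port pair.

    Hypothetical helper for illustration; not part of Deadline.
    """
    # Step 1: name resolution. A failure here matches the
    # "No such host is known" message in the slave log.
    try:
        ip = socket.gethostbyname(host)
    except OSError as exc:
        return None, False, "name resolution failed: %s" % exc

    # Step 2: plain TCP connect to the resolved address.
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True, "TCP connection succeeded"
    except OSError as exc:
        return ip, False, "TCP connection failed: %s" % exc
```

Running `check_endpoint("Newegg-a", 27070)` and `check_endpoint("192.168.1.107", 27070)` on an affected slave (host name, IP and port taken from the log above) should show whether the name lookup or the connection itself is failing. If the IP works but the name does not, that would point at name resolution rather than the database.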

Cheers,

Dwight

There is no problem with the connection itself, since I can log in to the slaves via Windows Remote Desktop and I can access them via my shared folders too. I was also able to control them with Deadline 6. In the end, I think I will just downgrade, since everything was working better for me with Deadline 6.

Am I the only one who has these problems with Deadline 7?

Hello,

I am not aware of others with this issue, but if you’d like, we can arrange a remote TeamViewer session to take a closer look at this. Can you email support@thinkboxsoftware.com so we can arrange that?

OK, thanks. I will have to wait until after an insane project that I have to finish within 3 weeks.

We just got new workstations, installed Windows Server 2012 R2 on them, and tried using Deadline 7.2 to execute a robocopy batch script. It no longer works. We still have two old workstations that we did not upgrade, and they also no longer execute commands run through Deadline. If we log in remotely to the machines and run the batch file manually, it works correctly. Did something change in Deadline 7.2? Was there ever any resolution to this question?

Thanks,
Austin Reed

Here is the error code we are getting: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 10.1.70.30:17070

Here is the simple batch file we are trying to run:

robocopy "\\server\projectname\Maps\Aerials\Gridded" D:\Aerials\projectname\Gridded /mir

Thanks,
Austin Reed
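
One thing that may be worth ruling out separately from the connection timeout: robocopy returns a non-zero exit code even when a copy succeeds, so anything that treats non-zero as failure will report an error. Nothing in this thread confirms that this is what Deadline is reporting, so treat it as an assumption. The sketch below, in Python, decodes the exit code using the bit flags from Microsoft's robocopy documentation. `describe_robocopy_exit` and `run_robocopy` are hypothetical helper names.

```python
import subprocess

# Robocopy exit codes are bit flags; values below 8 mean no copy failed.
ROBOCOPY_FLAGS = {
    1: "files were copied",
    2: "extra files or directories were detected",
    4: "mismatched files or directories were detected",
    8: "some files or directories could not be copied",
    16: "fatal error, no files were copied",
}


def describe_robocopy_exit(code):
    """Return (ok, messages) for a robocopy exit code."""
    if code == 0:
        return True, ["no files were copied; nothing needed to change"]
    messages = [text for bit, text in ROBOCOPY_FLAGS.items() if code & bit]
    return code < 8, messages


def run_robocopy(source, dest):
    """Mirror source to dest with robocopy (Windows only) and decode the result."""
    result = subprocess.run(["robocopy", source, dest, "/mir"])
    return describe_robocopy_exit(result.returncode)
```

For example, an exit code of 3 decodes to "files were copied" plus "extra files or directories were detected", which is a successful mirror even though the code is non-zero.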