Monitor slave list empty, database error

NewJohnny · August 1, 2017, 9:53pm

I’ve buggered up the repository database somehow. I deleted a slave from the Monitor, but it was still listed under “slave scheduling”. Thirty minutes later the slave started up again, so this time I shut it down and renamed the domain machine, and now the database is broken.

I get this error message in the console. Is there a repair function I can run without wiping out MongoDB?

2017-08-01 14:47:18: Error occurred while updating slave cache: QueryFailure flag was BSONObj size: -286331154 (0xEEEEEEEE) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: “john-pc” (response was { “$err” : “BSONObj size: -286331154 (0xEEEEEEEE) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: “john-pc””, “code” : 10334 }). (MongoDB.Driver.MongoQueryException)
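
For anyone puzzling over the numbers in that message, here is a minimal sketch of the arithmetic only. It assumes the size field is read as a signed 32-bit integer, which is what the negative value suggests; the variable names are made up for illustration. It shows that the byte pattern 0xEEEEEEEE becomes -286331154 when read as signed, and that the quoted upper limit is 16 MiB plus 16 KiB.

```python
import struct

# The raw 4-byte pattern reported in the error message.
raw = 0xEEEEEEEE

# Reinterpret the same four bytes as a signed 32-bit integer.
(signed_size,) = struct.unpack("<i", struct.pack("<I", raw))

# The upper bound quoted in the message: 16 MiB plus 16 KiB.
max_size = 16 * 1024 * 1024 + 16 * 1024

print(signed_size)                    # -286331154
print(max_size)                       # 16793600
print(0 <= signed_size <= max_size)   # False, hence the "invalid size" error
```

A size made of one repeated byte is unlikely to be a real document length, which fits the idea that the stored record for “john-pc” is damaged rather than the query being wrong.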

eamsler · August 2, 2017, 6:33pm

This page should help you there:
docs.thinkboxsoftware.com/produ … ase-repair

Just so I can track it, what version of Deadline are you on?

NewJohnny · August 2, 2017, 6:41pm

Deadline version is 9.0.6.1

NewJohnny · August 2, 2017, 6:59pm

I ran the commands but the issue still persists. Here’s the repair log:

2017-08-02T11:54:07.718-0700 I CONTROL [main] Hotfix KB2731284 or later update is installed, no need to zero-out data files
2017-08-02T11:54:07.718-0700 I CONTROL [initandlisten] MongoDB starting : pid=5792 port=27017 dbpath=R:\DeadlineDatabase9\mongo\mongo\data 64-bit host=Canary
2017-08-02T11:54:07.718-0700 I CONTROL [initandlisten] targetMinOS: Windows 7/Windows Server 2008 R2
2017-08-02T11:54:07.718-0700 I CONTROL [initandlisten] db version v3.2.12
2017-08-02T11:54:07.718-0700 I CONTROL [initandlisten] git version: ef3e1bc78e997f0d9f22f45aeb1d8e3b6ac14a14
2017-08-02T11:54:07.718-0700 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.1u-fips 22 Sep 2016
2017-08-02T11:54:07.718-0700 I CONTROL [initandlisten] allocator: tcmalloc
2017-08-02T11:54:07.718-0700 I CONTROL [initandlisten] modules: none
2017-08-02T11:54:07.718-0700 I CONTROL [initandlisten] build environment:
2017-08-02T11:54:07.718-0700 I CONTROL [initandlisten] distmod: 2008plus-ssl
2017-08-02T11:54:07.718-0700 I CONTROL [initandlisten] distarch: x86_64
2017-08-02T11:54:07.718-0700 I CONTROL [initandlisten] target_arch: x86_64
2017-08-02T11:54:07.718-0700 I CONTROL [initandlisten] options: { repair: true, storage: { dbPath: “R:\DeadlineDatabase9\mongo\mongo\data” }, systemLog: { destination: “file”, path: “R:\DeadlineDatabase9\mongo\data\logs\repairlog.txt” } }
2017-08-02T11:54:07.718-0700 E NETWORK [initandlisten] listen(): bind() failed errno:10048 Only one usage of each socket address (protocol/network address/port) is normally permitted. for socket: 0.0.0.0:27017
2017-08-02T11:54:07.718-0700 E STORAGE [initandlisten] Failed to set up sockets during startup.
2017-08-02T11:54:07.718-0700 I CONTROL [initandlisten] dbexit: rc: 48

tomasz · August 3, 2017, 6:27am

The second error message you posted is different from the first one. It looks like some process on your Windows machine is occupying port 27017, which your mongo instance requires. To resolve this, you will have to find the process that is using the port and kill it manually. Check this thread for reference:
https://stackoverflow.com/questions/34709062/failed-to-set-up-sockets-during-startup-dbexit-rc-48-error-in-mongodb
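
As a quick check before re-running the repair, something like the sketch below can tell you whether anything is still listening on the port. It is only an illustration: `port_in_use` is a hypothetical helper name, and it assumes the listener is reachable on 127.0.0.1. It does not identify the owning process; on Windows, `netstat -ano | findstr :27017` shows the PID, which you can then stop with `taskkill /PID <pid> /F` or through the Services panel.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 27017 is the port from the repair log above.
    if port_in_use(27017):
        print("Port 27017 is taken; stop that process before running the repair.")
    else:
        print("Port 27017 looks free.")
```

If the port shows as taken while the database service is running, that service may well be the process holding it, so stopping it first would be the thing to try.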

eamsler · August 3, 2017, 2:23pm

Also, if this is urgent please call us (my signature has the number, ext 2 for everyone).

You may have to restart the service now that the repair was done. The log is showing “{ repair: true, …}” so it seems to have fixed things alright. Without the port definition (which we now store in “config.conf” in the data directory, “C:\DeadlineDatabaseX\mongo\data” on Windows), it’ll try for 27017.

NewJohnny · August 3, 2017, 3:37pm

Actually, it was so urgent I wiped the install and put in a new database. It was only 2 weeks old anyways, so not much of a loss. Thanks for your help on this. Next time, I’ll call support directly.