In another thread, it was mentioned that Deadline 7 will support Mongo sharding. I’m in the process of setting up a sharded cluster for testing, but I was hoping to get a discussion going as well, and to find out more about exactly what this means.
We are (optimistically, perhaps) hoping that a sharded cluster would allow us to unify our two studios onto a single farm, with shards on each side (and possibly with each shard operating as a replica set, with the secondary node in the other studio). However, for this to work in any meaningful way, we would need Deadline to help out by allowing us to specify shard keys for write operations, so we could properly shard collections based on location. The idea is that each studio should end up writing only to its local shard in the majority of cases, which should keep performance from taking a sharp nose-dive. Obviously querying may end up being a different story, but I’d like to think that this could be done “lazily” (at least in the case of the Monitor). I know a lot of work has been done to reduce database traffic and contention in Deadline 7, so that will likely help out as well.
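To make the location-based idea concrete, here is a minimal toy sketch in plain Python. It does not talk to MongoDB or Deadline at all; it only models how a shard key that starts with a location field could route writes and queries. The shard names, the "location" field, and the studio codes are all hypothetical.

```python
# Toy model of location-aware shard routing. This is NOT Deadline's or
# MongoDB's actual code; every name below is a hypothetical placeholder.

# Assumed mapping of studio location -> the shard that lives in that studio.
ZONES = {
    "studio_a": "shard_a",
    "studio_b": "shard_b",
}


def shard_for_write(doc):
    """Route a write by the location prefix of a (location, _id) shard key."""
    return ZONES[doc["location"]]


def shards_for_query(query):
    """A query that names a location can be targeted at one shard;
    a query without it has to be sent to every shard."""
    if "location" in query:
        return [ZONES[query["location"]]]
    return sorted(ZONES.values())


job = {"_id": "job-001", "location": "studio_a", "status": "queued"}
print(shard_for_write(job))                        # write stays on the local shard
print(shards_for_query({"location": "studio_b"}))  # targeted query
print(shards_for_query({"status": "queued"}))      # scatter-gather across shards
```

The point of the sketch is the last line: writes stay local as long as every document carries its location, but any query that omits the location (a Monitor showing all jobs, for example) fans out to both studios.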
Tangentially related, I’m still somewhat skeptical of how reliably Deadline could work in a replica-set scenario where the secondary nodes were actually used as read nodes (instead of just in failover situations). For instance, if the secondary read nodes hadn’t received all of the latest write operations, it seems like you could end up with some nasty side effects like multiple slaves picking up the same task from the same job, etc.
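The failure mode I have in mind can be sketched with a small in-memory simulation. This is only a toy model under my own assumptions (I don't know how Deadline actually dequeues tasks): the class, the "owner" field, and the slave names are hypothetical, and replication lag is modeled as a secondary copy that never catches up during the test.

```python
import copy


class ToyReplicaSet:
    """In-memory stand-in for a primary plus one lagging secondary."""

    def __init__(self, tasks):
        self.primary = copy.deepcopy(tasks)
        # The secondary is a snapshot that only changes when replicate() runs.
        self.secondary = copy.deepcopy(tasks)

    def replicate(self):
        """Bring the secondary up to date with the primary."""
        self.secondary = copy.deepcopy(self.primary)

    def claim_via_secondary_read(self, task_id, slave):
        """Unsafe pattern: decide based on a (possibly stale) secondary read,
        then write to the primary unconditionally."""
        if self.secondary[task_id]["owner"] is None:
            self.primary[task_id]["owner"] = slave
            return True
        return False

    def claim_atomically(self, task_id, slave):
        """Safer pattern: check and update in one step on the primary,
        analogous in spirit to a findAndModify sent to the primary."""
        if self.primary[task_id]["owner"] is None:
            self.primary[task_id]["owner"] = slave
            return True
        return False


# Two slaves race for the same task before the secondary has caught up.
stale_rs = ToyReplicaSet({"task-0": {"owner": None}})
stale_claims = [stale_rs.claim_via_secondary_read("task-0", s)
                for s in ("slave-1", "slave-2")]
print(stale_claims)  # both slaves believe they own the task

atomic_rs = ToyReplicaSet({"task-0": {"owner": None}})
atomic_claims = [atomic_rs.claim_atomically("task-0", s)
                 for s in ("slave-1", "slave-2")]
print(atomic_claims)  # only the first claim succeeds
```

In other words, reads that feed a decision (like dequeuing a task) would presumably have to stay on the primary, while secondaries might only be safe for display-style reads where slightly stale data is acceptable.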
Anyway, I would love to get some more information, and maybe brainstorm some ways in which Deadline could make a location-aware sharding process (and maybe even read-slave replication) work effectively.
Thanks