Questions about sharding support

nrusch · July 25, 2014, 12:17am

In another thread, it was mentioned that Deadline 7 will support Mongo sharding. I’m in the process of setting up a shard cluster for testing, but I was hoping to get a discussion going as well, and to find out some more about exactly what this means.

We are (optimistically, perhaps) hoping that a sharded cluster would allow us to unify our two studios onto a single farm, with shards on each side (and possibly with each shard operating as a replica set, with the secondary node in its corresponding remote studio). However, in order for this to work in any meaningful way, we would need Deadline to help out by allowing us to specify shard keys for write operations, so we could properly shard collections based on location. The idea is that each studio should only end up writing to its local shard in the majority of cases, and prevent performance from taking a sharp nose-dive. Obviously querying may end up being a different story, but I’d like to think that this could be done “lazily” (at least in the case of the Monitor). I know a lot of work has been done to improve the level of database traffic and contention in Deadline 7, so that will likely help out as well.

Tangentially related, I’m still somewhat skeptical of how reliably Deadline could work in a replica-set scenario where the secondary nodes were actually used as read nodes (instead of just in failover situations). For instance, if the secondary read nodes hadn’t received all of the latest write operations, it seems like you could end up with some nasty side effects like multiple slaves picking up the same task from the same job, etc.

Anyway, I would love to get some more information, and maybe brainstorm some ways in which Deadline could make a location-aware sharding process (and maybe even read-slave replication) work effectively.

Thanks

rrussell · July 25, 2014, 5:31pm

We have looked at the possibility of supporting tag-aware sharding (which would allow for the setup you’re describing), but at this time it’s currently not supported because there is no exposure to the shard keys that Deadline is using. Knowing that you guys have a high-latency connection between your two offices, combining a shard in each location into a single cluster might not be a good idea. Even if it would be possible to keep all traffic locally to their respective shards, you might as well just have two separate databases and guarantee that all traffic will remain local. This is the main reason why we didn’t push to have tag-aware sharding support in version 7, since we’re not convinced it has any real-world merit in Deadline’s case. But it’s good to have ongoing discussions about it!

Deadline currently doesn’t support reading from secondary nodes either, since we are also concerned about how reliable it would be (we actually enforce this in the connection policy that Deadline uses). Our main concern is that the secondary nodes will only eventually be consistent with the primary. If the Monitor was a read-only application, it wouldn’t be an issue, but since it also writes, it’s necessary to have information that is up to date as possible. It is possible to configure the system so that writes will only return once it has been written to the replicas, but we expect that to have a significant impact on performance.

Cheers,
Ryan