Description
Today, selectNodeRequests shard packing into node requests is mostly dictated by the order of nodes in nodesIt.
We do not explicitly order the nodes in that iterator when the data structure is constructed.
In general, packing shards into node requests is a bin-packing problem, which is NP-hard.
However, we can improve the existing approach without having to iterate over all possible allocations.
Assuming the following shard allocation:
shard1=[node1,node2]
shard2=[node2,node1]
shard3=[node3,node1]
We could introduce a node selection strategy (such as spreading shards across as many nodes as possible, or packing them onto as few nodes as possible).
Depending on configuration, this would create one of the following sets of requests:
node1->{shard1}, node2->{shard2}, node3->{shard3} (spread)
node1->{shard1,shard2,shard3} (pack)
This could be done using something like select(nodeIds, strategy, pendingRequests), where we pick the next node based on order (if no strategy is supplied) or based on its presence or absence in pendingRequests.
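
A minimal sketch of the select idea, written in Python for brevity rather than as the actual implementation. The Strategy enum, the assign helper, and the greedy first-match rule are assumptions made for illustration; only the select(nodeIds, strategy, pendingRequests) shape comes from the proposal above.

```python
from enum import Enum
from typing import Dict, List, Optional


class Strategy(Enum):
    # Hypothetical names; the proposal only describes "as many" vs "as few" nodes.
    SPREAD = "spread"  # prefer nodes that have no pending request yet
    PACK = "pack"      # prefer nodes that already have a pending request


def select(node_ids: List[str],
           strategy: Optional[Strategy],
           pending_requests: Dict[str, List[str]]) -> str:
    """Pick the node that should serve a shard, given its candidate nodes."""
    if strategy is Strategy.SPREAD:
        candidates = [n for n in node_ids if n not in pending_requests]
    elif strategy is Strategy.PACK:
        candidates = [n for n in node_ids if n in pending_requests]
    else:
        candidates = []
    # With no strategy, or when no node matches, fall back to plain node order.
    return candidates[0] if candidates else node_ids[0]


def assign(allocation: Dict[str, List[str]],
           strategy: Optional[Strategy] = None) -> Dict[str, List[str]]:
    """Greedily build node -> shards requests, one shard at a time."""
    pending: Dict[str, List[str]] = {}
    for shard, node_ids in allocation.items():
        node = select(node_ids, strategy, pending)
        pending.setdefault(node, []).append(shard)
    return pending


allocation = {
    "shard1": ["node1", "node2"],
    "shard2": ["node2", "node1"],
    "shard3": ["node3", "node1"],
}
print(assign(allocation, Strategy.SPREAD))
# {'node1': ['shard1'], 'node2': ['shard2'], 'node3': ['shard3']}
print(assign(allocation, Strategy.PACK))
# {'node1': ['shard1', 'shard2', 'shard3']}
```

This greedy pass is a heuristic: it visits each shard once and never revisits earlier choices, so the result under PACK depends on which node the first shard lands on and is not guaranteed to be the minimal set of nodes.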
A spread strategy could speed up complex queries, while the opposite (packing) might suit CCS by minimizing the number of requests sent to remote clusters.
See