mirror of
https://github.com/mongodb/mongo.git
synced 2024-12-01 09:32:32 +01:00
SERVER-47244 Add the Auto-splitting section to the sharding arch guide
parent 128ea14211
commit 7419262b63
@ -314,6 +314,88 @@ The config server replica set durably stores settings for the maximum chunk size
should be automatically split and balanced.

## Auto-splitting

When the mongos routes an update or insert to a chunk, the chunk may grow beyond the configured
chunk size (specified by the server parameter maxChunkSizeBytes) and trigger an auto-split, which
partitions the oversized chunk into smaller chunks. The shard that houses the chunk is responsible
for:
* determining if the chunk should be auto-split
* selecting the split points
* committing the split points to the config server
* refreshing the routing table cache
* updating in-memory chunk size estimates

### Deciding when to auto-split a chunk
The server schedules an auto-split if:
1. it detects that the chunk exceeds a threshold based on the maximum chunk size
2. there is not already a split in progress for the chunk

Every time an update or insert gets routed to a chunk, the server tracks the bytes written to the
chunk in memory through the collection's ChunkManager. The ChunkManager has a ChunkInfo object for
each of the collection's entries in the local config.chunks. Through the ChunkManager, the server
retrieves the chunk's ChunkInfo and uses its ChunkWritesTracker to increment the estimated chunk
size.

Even if the new size estimate exceeds the maximum chunk size, the server still needs to check that
there is not already a split in progress for the chunk. If the ChunkWritesTracker is locked, there
is already a split in progress on the chunk and trying to start another split is prohibited.
Otherwise, if the chunk is oversized and there is no split for the chunk in progress, the server
submits the chunk to the ChunkSplitter to be auto-split.

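
The decision described above can be modeled in a few lines. This is a minimal Python sketch, not
the server's C++ implementation: the class and method names are hypothetical, the threshold is
assumed to be the full maximum chunk size (the guide only says it is "based on" it), and the
split-in-progress flag is taken on the write path for brevity.

```python
import threading


class ChunkWritesTracker:
    """Toy model of a per-chunk write tracker (names are hypothetical)."""

    def __init__(self):
        self._bytes_written = 0
        # Held for as long as a split of this chunk is in progress.
        self._split_lock = threading.Lock()

    @property
    def bytes_written(self):
        return self._bytes_written

    def add_bytes_written(self, n):
        self._bytes_written += n

    def clear_bytes_written(self):
        self._bytes_written = 0

    def should_split(self, max_chunk_size_bytes):
        # Assumption: the threshold equals the maximum chunk size.
        return self._bytes_written >= max_chunk_size_bytes

    def acquire_split_lock(self):
        # Non-blocking: returns False when a split is already in progress.
        return self._split_lock.acquire(blocking=False)

    def release_split_lock(self):
        self._split_lock.release()


def on_write(tracker, bytes_written, max_chunk_size_bytes, schedule_split):
    """Record a write and schedule an auto-split if both conditions hold."""
    tracker.add_bytes_written(bytes_written)
    if not tracker.should_split(max_chunk_size_bytes):
        return False  # condition 1 failed: chunk is not oversized
    if not tracker.acquire_split_lock():
        return False  # condition 2 failed: a split is already running
    schedule_split(tracker)
    return True
```

A caller would pass something like a task-queue submit function as `schedule_split`; once the
split finishes, the task is expected to call `release_split_lock()` so later writes can trigger
another split.
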
### The auto-split task
The ChunkSplitter is a replica set primary-only service that manages the process of auto-splitting
chunks. The ChunkSplitter runs auto-split tasks asynchronously, so multiple chunks can undergo
an auto-split concurrently, provided they don't overlap.

To prepare for the split point selection process, the ChunkSplitter flags that an auto-split for the
chunk is in progress. There may be incoming writes to the original chunk while the split is in
progress. For this reason, the estimated data size in the ChunkWritesTracker for this chunk is
reset, and the same counter is used to track the number of bytes written to the chunk while the
auto-split is in progress.

splitVector manages the bulk of the split point selection logic. First, the data size and number of
records are retrieved from the storage engine to approximate the number of keys that each chunk
partition should have. This number is calculated such that if each document were uniform in size,
each chunk would be half of maxChunkSizeBytes.

If the actual data size is less than the maximum chunk size, no splits are made at all.
Additionally, if all documents in the chunk have the same shard key, no splits can be made. In this
case, the chunk may be classified as a jumbo chunk.

In the general case, splitVector:
* performs an index scan on the shard key index
* adds every k'th key to the vector of split points, where k is the approximate number of keys each chunk should have
* returns the split points

If no split points are returned, the auto-split task is abandoned.

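
The selection logic above can be approximated as follows. This is a simplified Python sketch under
stated assumptions, not the real splitVector: the function name and parameters are hypothetical,
the shard keys are assumed to arrive already in index order, and skipping a candidate equal to the
previous boundary is one plausible way to model the "all documents share one shard key" case.

```python
def select_split_points(sorted_keys, data_size_bytes, max_chunk_size_bytes):
    """Return split points for a chunk, given its shard keys in index order."""
    num_records = len(sorted_keys)
    if num_records == 0 or data_size_bytes < max_chunk_size_bytes:
        return []  # the chunk is not oversized, so no splits are made

    # Treat every document as uniform in size and aim for half of the
    # maximum chunk size per resulting chunk.
    avg_record_size = data_size_bytes / num_records
    keys_per_chunk = max(1, int((max_chunk_size_bytes / 2) / avg_record_size))

    split_points = []
    last_boundary = sorted_keys[0]  # the chunk's first key is never a split point
    count = 0
    for key in sorted_keys:
        count += 1
        # Take the key that follows every keys_per_chunk keys, unless it
        # equals the previous boundary (duplicate shard key values).
        if count > keys_per_chunk and key != last_boundary:
            split_points.append(key)
            last_boundary = key
            count = 1
    return split_points
```

With ten 100-byte documents keyed 1 through 10 and a 400-byte maximum, each partition targets two
keys, so the sketch yields the split points 3, 5, 7 and 9. A chunk whose documents all share one
key yields no split points, which matches the jumbo chunk case.
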

If split points are successfully generated, the ChunkSplitter executes the final steps of the
auto-split task where the shard:
* commits the split points to config.chunks on the config server by removing the document containing
the original chunk and inserting new documents corresponding to the new chunks indicated by the
split points
* refreshes the routing table cache
* replaces the original oversized chunk's ChunkInfo with a ChunkInfo object for each partition. The
estimated data size for each new chunk is the number of bytes written to the original chunk while the
auto-split was in progress

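
To illustrate the bookkeeping in these final steps, here is a small Python sketch. It collapses
the config.chunks documents and the in-memory ChunkInfo objects into one dictionary keyed by chunk
range, which is a simplification; the function names are hypothetical and chunk ranges are written
as `(min, max)` pairs with an inclusive lower bound and an exclusive upper bound.

```python
def split_chunk_ranges(chunk_min, chunk_max, split_points):
    """Turn one chunk range [chunk_min, chunk_max) into consecutive ranges."""
    bounds = [chunk_min, *split_points, chunk_max]
    if any(lo >= hi for lo, hi in zip(bounds, bounds[1:])):
        raise ValueError(
            "split points must be strictly increasing and inside the chunk")
    return list(zip(bounds, bounds[1:]))


def commit_split(chunks, chunk_min, chunk_max, split_points,
                 bytes_written_during_split):
    """Replace one entry of a {(min, max): estimated_size} table.

    The original chunk's entry is removed and one entry is inserted per new
    chunk. Each new chunk starts with the number of bytes written to the
    original chunk while the split was in progress as its size estimate.
    """
    new_ranges = split_chunk_ranges(chunk_min, chunk_max, split_points)
    del chunks[(chunk_min, chunk_max)]
    for chunk_range in new_ranges:
        chunks[chunk_range] = bytes_written_during_split
    return new_ranges
```

For example, splitting the range `(0, 100)` at 30 and 60 replaces a single entry with entries for
`(0, 30)`, `(30, 60)` and `(60, 100)`.
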
### Top Chunk Optimization
While there are several optimizations in the auto-split process that won't be covered here, it's
worthwhile to note the concept of top chunk optimization. If the chunk being split is the first or
last one in the collection, there is an assumption that the chunk is likely to see more insertions
if the user is inserting in ascending/descending order with respect to the shard key. So, in top
chunk optimization, the first (or last) key in the chunk is set as a split point. Once the split
points get committed to the config server, and the shard refreshes its CatalogCache, the
ChunkSplitter tries to move the top chunk out of the shard to prevent the hot spot from sitting on a
single shard.

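
A rough Python sketch of how the extra split point could be chosen is given below. It is only an
illustration: the function name is hypothetical, infinities stand in for the lowest and highest
possible shard key values, and adding the extreme key to the existing split points (rather than
replacing one of them) is an assumption, since the guide only says the key "is set as a split
point".

```python
import math

# Stand-ins for the lowest and highest possible shard key values.
MIN_KEY, MAX_KEY = -math.inf, math.inf


def add_top_chunk_split_point(chunk_min, chunk_max, sorted_keys, split_points):
    """Add the extreme key as a split point for the first or last chunk."""
    points = list(split_points)
    if not sorted_keys:
        return points
    if chunk_min == MIN_KEY:
        extreme = sorted_keys[0]   # first chunk: split at its first key
    elif chunk_max == MAX_KEY:
        extreme = sorted_keys[-1]  # last chunk: split at its last key
    else:
        return points              # middle chunks are left untouched
    # A split point equal to the chunk's lower bound would create an empty
    # range, so it is skipped.
    if extreme != chunk_min and extreme not in points:
        points.append(extreme)
    return sorted(points)
```

For the last chunk, this leaves a small top chunk starting at the last key, which is the piece
the ChunkSplitter would then try to move to another shard.
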
#### Code references
* [**ChunkInfo**](https://github.com/mongodb/mongo/blob/18f88ce0680ab946760b599437977ffd60c49678/src/mongo/s/chunk.h#L44) class
* [**ChunkManager**](https://github.com/mongodb/mongo/blob/master/src/mongo/s/chunk_manager.h) class
* [**ChunkSplitter**](https://github.com/mongodb/mongo/blob/master/src/mongo/db/s/chunk_splitter.h) class
* [**ChunkWritesTracker**](https://github.com/mongodb/mongo/blob/master/src/mongo/s/chunk_writes_tracker.h) class
* [**splitVector**](https://github.com/mongodb/mongo/blob/18f88ce0680ab946760b599437977ffd60c49678/src/mongo/db/s/split_vector.cpp#L61) method
* [**splitChunk**](https://github.com/mongodb/mongo/blob/18f88ce0680ab946760b599437977ffd60c49678/src/mongo/db/s/split_chunk.cpp#L128) method
* [**commitSplitChunk**](https://github.com/mongodb/mongo/blob/18f88ce0680ab946760b599437977ffd60c49678/src/mongo/db/s/config/sharding_catalog_manager_chunk_operations.cpp#L316) method where chunk splits are committed

## Auto-balancing