web123456

Redis's data structure-hash table

RedisUse a Hash table as its underlying layerData structureto store key-value pairs. EachRedis DatabaseBoth maintain two hash tables:ht[0]andht[1],inht[0]is the main hash table,ht[1]It is a temporary table used when performing rehash operations.

The structure of the hash table

Redis hash table consists of an array and aLink Listcomposition. Each element in the array is a pointer to the linked list formed when a hash conflicts. The specific structure is as follows:

  1. Array: The core part of the hash table is an array of pointers, each element in the array points to a hash bucket, and each bucket is a header pointer to a linked list.

  2. Link List: If multiple key-value pairs are mapped to the same hash bucket (i.e. hash conflict), these key-value pairs are stored in the linked list. Each linked list node stores a key-value pair.

Hash conflict problem

Hash conflict refers to the hashing of different keysfunctionAfter calculation, the same hash value is obtained (i.e. mapped to the same position in the hash table). In Redis, hash conflicts are resolved by the link address method, that is, under the same hash value, the conflicting key-value pairs are stored in a linked list.

Rehash operation and its blocking problems

As the data in the hash table increases, the probability of hash conflict increases, resulting in the linked list becoming longer and the search operation isTime complexityIt gradually degenerates from O(1) to O(n). To solve this problem, Redis needs to expand or shrink the hash table, and this process is calledrehash

The process of Rehash
  1. Create a new hash table: When rehash is needed, Redis will create a new hash tableht[1], its size isht[0]twice or half, depending on the current load factor.

  2. Migrate key-value pairs:Willht[0]All key-value pairs in recalculate the hash value and move toht[1]middle.

  3. Replace old table: Redis will be released when all key-value pairs are migrated.ht[0]and willht[1]Replace with newht[0]

Operation blocking problem

Doing rehash directly can cause blocking operations, because during large-scale data migration, Redis needs to pause other operations until the migration is complete. This kind of blocking cannot be ignored for applications with high real-time requirements.

Progressive rehash

In order to solve the possible blocking problems during rehash, Redis adoptsProgressive rehash(Incremental Rehashing) strategy.

How to implement progressive rehash
  1. Step by step: Redis will not transfer all key-value pairs fromht[0]Migrate toht[1]. Instead, it moves data step by step to the new hash table as it reads and writes the hash table. Every time a write operation is performed, Redis willht[0]Migrate some data toht[1]

  2. Maintain two hash tables at the same time: Redis is used simultaneously during progressive rehashht[0]andht[1]. All new data will be inserted intoht[1]Redis will first check when reading or modifying dataht[1], if not found, go againht[0]Find in.

  3. rehash index tracking: Redis tracks migration progress through a rehash index. Each time a progressive rehash is executed, the index is updated to point to the next bucket that needs to be migrated.

  4. End rehash:whenht[0]All key-value pairs inht[1]When Redis willht[1]Replace withht[0], and clearht[1], at this time the rehash process ends.

Advantages of progressive rehash

Through progressive rehash, Redis distributes rehash operations into multiple steps, avoiding the blocking problem caused by one-time large-scale data migration. This method allows Redis to handle other read and write requests while rehashing, thus ensuring high performance and low latency.

In summary, Redis uses hash tables to store key-value pairs and uses the chain address method to resolve hash conflicts. In order to avoid the blocking problems caused by rehash, Redis uses progressive rehash to perform data migration in step by step, thereby improving the system's responsiveness.