Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize redis cluster slotmap with compact slot range object #1493

Merged

Conversation

messikiller
Copy link
Contributor

Recently, I discovered a noteworthy code snippet located in the Predis\Cluster\SlotMap.php file. Upon reviewing and debugging the code, I realized that this class handles slot mapping for Redis in its default cluster mode. This class stores slot mappings using a simple attribute, Predis\Cluster\SlotMap::slots. It's worth mentioning that this array-type attribute uses the array keys as slots and the values to store the corresponding connection information for those slots. Since Redis clusters have 16384 slots by default, the maximum possible length of this array is 16384. When a large number of slot mappings are stored (normally, this involves synchronizing all slot data from the cluster), the array takes the following format:

[
    0 => c1,
    1 => c1,
    2 => c1,
    ...
    16384 => c1,
]

Evidently, this array is quite large and contains numerous duplicate values (connection information), which I believe consumes a significant amount of memory. To investigate, I wrote a test script to measure memory usage when storing a large number of slot mappings:

<?php

use Predis\Cluster\SlotMap;

require 'autoload.php';

error_reporting(E_ALL & ~E_DEPRECATED);

function convert($size)
{
    $unit = array('b', 'kb', 'mb', 'gb', 'tb', 'pb');
    return @round($size / pow(1024, ($i = floor(log($size, 1024)))), 2) . ' ' . $unit[$i];
}

function benchmark($slots, $slotMap)
{
    $t1 = microtime(true);
    $m1 = memory_get_usage();
    foreach ($slots as $slot) {
        $slotMap->setSlots($slot[0], $slot[1], $slot[2]);
    }
    $m2 = memory_get_usage();
    $t2 = microtime(true);

    printf("Memory before slots set: %s\n", convert($m1));
    printf("Memory after slots set: %s\n", convert($m2));
    printf("Memory usage: %s\n", convert($m2 - $m1));
    printf("Time: %f ms\n", ($t2 - $t1) * 1000);
}

$slots = [
    [0, 4000, 'c1'],
    [4001, 8000, 'c1'],
    [8001, 12000, 'c1'],
    [12001, 16000, 'c1'],
    [16000, 16383, 'c1'],
];

benchmark($slots, new SlotMap());

Upon running this script, I obtained the following results:

Memory before slots set: 408.52 kb
Memory after slots set: 668.58 kb
Memory usage: 260.05 kb
Time: 0.357151 ms

As evident from the results, memory usage increased by 260kb when storing a small number of slot mappings. With a larger number of mappings, the memory footprint would be even greater.

To address this, I experimented with a new approach using the Predis\Cluster\SlotRange object to store slot range information. Instead of maintaining a gigantic array with all slot mappings, this object only records the start and end nodes along with connection information for each slot range. This approach simplifies the process of merging and splitting multiple Predis\Cluster\SlotRange objects during each call to the Predis\Cluster\SlotMap::setSlots() method. For detailed code, please refer to the PR I submitted.

Furthermore, I renamed the original Predis\Cluster\SlotMap to Predis\Cluster\SimpleSlotMap and compared the memory usage of both solutions using a simple test script. The script is as follows:

<?php

use Predis\Cluster\SimpleSlotMap;
use Predis\Cluster\SlotMap;
use Predis\Cluster\SlotRange;

require 'autoload.php';

error_reporting(E_ALL & ~E_DEPRECATED);

function convert($size)
{
    $unit = array('b', 'kb', 'mb', 'gb', 'tb', 'pb');
    return @round($size / pow(1024, ($i = floor(log($size, 1024)))), 2) . ' ' . $unit[$i];
}

function generateRandomSlotRange()
{
    $r1 = mt_rand(0, SlotRange::MAX_SLOTS);
    $r2 = mt_rand(0, SlotRange::MAX_SLOTS);
    return [
        min($r1, $r2),
        max($r1, $r2),
        'connection-' . mt_rand(1, 100),
    ];
}

function benchmark($slots, $slotMap)
{
    $t1 = microtime(true);
    $m1 = memory_get_usage();
    foreach ($slots as $slot) {
        $slotMap->setSlots($slot[0], $slot[1], $slot[2]);
    }
    $m2 = memory_get_usage();
    $t2 = microtime(true);

    printf("Memory before slots set: %s\n", convert($m1));
    printf("Memory after slots set: %s\n", convert($m2));
    printf("Memory usage: %s\n", convert($m2 - $m1));
    printf("Time: %f ms\n", ($t2 - $t1) * 1000);
}

$slots = [];
$max = 10000;
for ($i = 0; $i < $max; ++$i) {
    $slots[] = generateRandomSlotRange();
}

echo "\n==========CompactSlotMap===========\n";
benchmark($slots, new SlotMap());
echo "\n==========SimpleSlotMap===========\n";
benchmark($slots, new SimpleSlotMap());

After running the comparison with 10,000 setSlots method calls, here are the memory usage results:

==========CompactSlotMap===========
Memory before slots set: 3.13 mb
Memory after slots set: 3.13 mb
Memory usage: 5.3 kb
Time: 171.266079 ms

==========SimpleSlotMap===========
Memory before slots set: 3.14 mb
Memory after slots set: 3.77 mb
Memory usage: 640.05 kb
Time: 453.664064 ms

Additionally, I conducted unit tests for the new SlotMap class, reusing the existing SlotMapTest class to ensure functional compatibility between the old and new solutions.

That's all for now. I hope this PR will be accepted and merged into the main branch. I welcome any questions or suggestions and look forward to receiving feedback. Thanks.

@messikiller messikiller changed the title Feature v2.x optimize cluster slotmap optimize redis cluster slotmap with compact slot range object Nov 25, 2024
@messikiller
Copy link
Contributor Author

Great work. Can you resolve these? https://github.com/predis/predis/actions/runs/12008755661/job/33472086379?pr=1493

Thanks, sorry for forgetting phpstan check, the problem has been fixed right now. @tillkruss

@messikiller
Copy link
Contributor Author

Is there anything I can do to get this PR merged? @tillkruss

tillkruss
tillkruss previously approved these changes Dec 9, 2024
@tillkruss
Copy link
Member

@messikiller: No, looks good to me, but to double check this won't cause any breaking changes with an existing data set, right?

@tillkruss tillkruss self-assigned this Dec 9, 2024
@tillkruss tillkruss added redis-cluster Cluster (managed by redis-cluster) feature labels Dec 9, 2024
@messikiller
Copy link
Contributor Author

@messikiller: No, looks good to me, but to double check this won't cause any breaking changes with an existing data set, right?

Thanks for your approvement ! 😄

@messikiller
Copy link
Contributor Author

Is there anything else that can be improved in this modification? Could you please give some suggestions for improvement on this PR when you have time? @vladvildanov @tillkruss

@tillkruss tillkruss requested a review from a team December 19, 2024 18:32
tillkruss
tillkruss previously approved these changes Feb 13, 2025
@coveralls
Copy link

coveralls commented Mar 10, 2025

Coverage Status

coverage: 88.722% (+0.1%) from 88.577%
when pulling 7d5777c on messikiller:feature-v2.x-optimize-cluster-slotmap
into 067b051 on predis:v2.x.

@tillkruss tillkruss requested review from vladvildanov and removed request for vladvildanov March 19, 2025 03:47
@vladvildanov
Copy link
Contributor

@messikiller LGTM! Great job

@tillkruss tillkruss merged commit 7bf1f6e into predis:v2.x Mar 19, 2025
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature redis-cluster Cluster (managed by redis-cluster)
Development

Successfully merging this pull request may close these issues.

4 participants