| One change to make this library more idiomatic would be to make HyperLogLog avoid forcing heap allocation on the user. So from this (simplified code): pub fn HyperLogLog(comptime p: u8) type {
return struct {
dense: []u6,
const Self = @This();
pub fn init(allocator: Allocator) !Self {
return Self{
.dense = try allocator.alloc(u6, 0),
};
}
}
}
To this: pub fn HyperLogLog(comptime p: u8) type {
return struct {
dense: [1<<p]u6;
const Self = @This();
pub fn init() Self {
var s = Self{};
for (s.dense) |*x| x.* = 0;
return s;
}
}
}
The advantage to this API is that the user can choose where to place the HLL (stack, heap, global memory). Also note how the new init can't fail anymore.That said, in the code that I've conveniently omitted there's one big complication: when the HLL has only few items in it, it doesn't use the dense representation and instead uses a `std.AutoHashMap` to keep counts. Unfortunately `std.AutoHashMap` is not a type that can be reasonably disentangled from allocation. I was about to drop my idea of commenting when I realized that a very neat solution would be to also provide the user with the choice (and admittedly extra responsibility) of when to switch from a hash map to a proper HLL. pub fn init(start: std.AutoHashMap) Self {
var s = Self{};
for (s.dense) |*s| s.* = 0;
self.addFromHashMap(start);
return s;
}
There you go, allocation-free HyperLogLog :^)I've also come up with my own funky interface when I implemented Cuckoo Filters in Zig, for those who are interested: https://github.com/kristoff-it/zig-cuckoofilter |
https://github.com/redis/redis/blob/unstable/src/hyperloglog...