David @mleku - 1y
there: a complete GC count operation, which scans events for non-pruned entries (for #layer2 event stores), then scans the access time record of every event and generates a single list of all events with their database key, event data size (it's binary) and the record's last access time. this took about 18 seconds for a 14 GB data store on disk holding, as you can see, 11.6 GB of actual events. that's acceptable for a process that will probably run once an hour or less
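this isn't the actual relay code, just a minimal Go sketch of what that count pass could look like; the Store interface, the gcEntry type and every name in it are hypothetical stand-ins for the real event store:

```go
package eventgc

import "time"

// One entry in the GC candidate list: enough information to decide
// which events to prune and how much space doing so reclaims.
// All names here are illustrative, not the real types.
type gcEntry struct {
	Key        []byte    // database key of the event record
	Size       int64     // size of the binary event data in bytes
	LastAccess time.Time // when the event was last read
}

// Store is a hypothetical minimal view of the event store: an
// iterator over live event records plus an access-time lookup.
type Store interface {
	// Events calls fn for every non-pruned event record.
	Events(fn func(key []byte, size int64) error) error
	// LastAccess returns the recorded last access time for a key.
	LastAccess(key []byte) (time.Time, error)
}

// countPass builds the single list described above: every live
// event with its key, data size and last access time, plus the
// running total of event data held on disk. A full scan like this
// is the ~18 second operation on a 14 GB store.
func countPass(s Store) (list []gcEntry, total int64, err error) {
	err = s.Events(func(key []byte, size int64) error {
		at, err := s.LastAccess(key)
		if err != nil {
			return err
		}
		list = append(list, gcEntry{Key: key, Size: size, LastAccess: at})
		total += size
		return nil
	})
	return
}
```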
i still have to rework the code so it merges the event GC and the index GC (for events that have been pruned out but still have space to store the index) into one pass, but i need a bit of downtime first. super pleased with my progress on this: it will definitely now scale up to terabytes of event data given a suitable set of parameters for cache sizing and GC frequency. in the future a continuous adjustment and monitoring scheme will make it completely dynamic, but for the purposes of the current work a simple high/low water mark and a fixed GC frequency will be the MVP
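and a sketch of what that high/low water MVP could look like, reusing the gcEntry list from the sketch above; the thresholds and the pruner interface are again hypothetical assumptions, and a real version would run this pass on a fixed timer (e.g. a time.Ticker) at the chosen GC frequency:

```go
package eventgc

import "sort"

// Hypothetical thresholds; in practice these would be configuration
// parameters sized against the available disk and cache.
const (
	highWater int64 = 12 << 30 // 12 GB: start pruning above this
	lowWater  int64 = 10 << 30 // 10 GB: stop pruning below this
)

// pruner is the hypothetical operation that evicts an event's data
// while keeping its index (the #layer2 case described above).
type pruner interface {
	Prune(key []byte) error
}

// gcPass evicts least-recently-accessed events until the store
// drops from above highWater back under lowWater.
func gcPass(p pruner, list []gcEntry, total int64) error {
	if total <= highWater {
		return nil // under the high water mark, nothing to do
	}
	// Oldest access time first, so cold events go before hot ones.
	sort.Slice(list, func(i, j int) bool {
		return list[i].LastAccess.Before(list[j].LastAccess)
	})
	for _, e := range list {
		if total <= lowWater {
			break
		}
		if err := p.Prune(e.Key); err != nil {
			return err
		}
		total -= e.Size
	}
	return nil
}
```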