Read up more on the COLA paper, and funnily enough, inspired by LSM-trees, I implemented something akin to "basic COLA" a few months back (the thing I mentioned for size-coding): add to the end of the array, hierarchically re-sort for each low bit of the new size that turns 0, from the bottom up (divisible by 2, sort the last 2; divisible by 4, sort the last 4; and so on), and search with O((log2(n))^2) complexity.
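Roughly, a minimal sketch of the idea (in Rust, with made-up names like `FlatCola`; an illustration of the scheme as described, not the actual code). The re-sorts leave the array as a concatenation of sorted runs whose sizes are the set bits of its length, so a lookup binary-searches each run; each re-sort step is really a merge of two already-sorted halves, which is where the LSM-like behaviour comes from:

```rust
// Sketch of the flat-array "basic COLA"-like structure described above.
// Invariant: the array is a concatenation of sorted runs whose sizes are
// the set bits of data.len(), largest run first.
struct FlatCola {
    data: Vec<u64>,
}

impl FlatCola {
    fn new() -> Self {
        FlatCola { data: Vec::new() }
    }

    fn insert(&mut self, key: u64) {
        self.data.push(key);
        let n = self.data.len();
        let mut run = 2;
        // For each low bit of the new size that is zero, re-sort that tail.
        // (Each step sorts two already-sorted halves, i.e. it is a merge.)
        while run <= n && n % run == 0 {
            let start = n - run;
            self.data[start..].sort_unstable();
            run *= 2;
        }
    }

    fn contains(&self, key: u64) -> bool {
        let n = self.data.len();
        let mut start = 0;
        // Walk the set bits of n from the largest run to the smallest;
        // each run is sorted, so binary-search it: O((log2 n)^2) overall.
        let mut bit = 1usize << (usize::BITS - 1);
        while bit > 0 {
            if n & bit != 0 {
                let run = &self.data[start..start + bit];
                if run.binary_search(&key).is_ok() {
                    return true;
                }
                start += bit;
            }
            bit >>= 1;
        }
        false
    }
}

fn main() {
    let mut c = FlatCola::new();
    for k in [5u64, 3, 9, 1, 7, 2, 8, 4, 6] {
        c.insert(k);
    }
    assert!(c.contains(7));
    assert!(!c.contains(42));
    println!("runs layout: {:?}", c.data);
}
```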
B-tree fill factors are a parameter that can be tuned to avoid extra splits depending on your insertion patterns; an in-memory variant doesn't need to be tuned the same way a disk-based one does.
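For illustration (a hypothetical split helper, not from any particular B-tree implementation): splitting an overflowing leaf at a tunable fill point instead of 50/50 keeps the left node packed under mostly ascending inserts, so growth goes into the fresh right sibling and far fewer splits happen later:

```rust
// Hypothetical example of a tunable fill factor at split time.
// Assumes the leaf has overflowed (keys.len() >= 2).
fn split_leaf(keys: &mut Vec<u64>, fill: f64) -> Vec<u64> {
    // Keep `fill` of the keys (at least one) in the existing node,
    // move the rest into a new right sibling.
    let keep = ((keys.len() as f64 * fill) as usize).clamp(1, keys.len() - 1);
    keys.split_off(keep)
}

fn main() {
    let mut leaf: Vec<u64> = (0..8).collect();
    let right = split_leaf(&mut leaf, 0.9); // append-friendly split
    println!("left: {:?}, right: {:?}", leaf, right); // left keeps 7 keys, right gets 1
}
```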
Also, nothing prevents the "bottom" of an LSM variant from being a dense B-tree-like structure that just gets fewer updates.
The COLA paper also never mentions threading. An in-memory B-tree should scale better with size under multi-threading, while I don't see an easy way to avoid big locks with COLA; maybe that's why the COLA paper dates back to 2007 and we haven't seen much additional development on it? (LSM-trees work beautifully of course, but then you basically have the double-space issue, like with semi-space garbage collectors.)
In the end, what you choose depends on your application and access patterns; COLA is a clever structure that probably shines in some scenarios.
My scenario, for instance, was heavy on consecutive insertions as well as lookups of potentially arbitrary compound keys, a perfect fit for a COLA-like structure since those operations are cheap there.
> while I don't see an easy way to avoid big locks with COLA
Don't modify arrays in place; create new arrays instead. This way you can have multiple readers, since the data is pretty much read-only, no write locks, and, again, cache-oblivious merging (fast).
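A minimal sketch of that approach (my assumptions, illustrative names, single-threaded for brevity): each level is an immutable sorted array behind an `Arc`, and a cascading insert only ever builds new arrays, so readers holding a snapshot of the levels keep seeing consistent data:

```rust
use std::sync::Arc;

// Sketch of "never modify in place": levels[k] is either empty or an
// immutable sorted array of 2^k keys; inserts build fresh arrays only.
#[derive(Clone, Default)]
struct ImmutableCola {
    levels: Vec<Option<Arc<Vec<u64>>>>,
}

impl ImmutableCola {
    fn insert(&mut self, key: u64) {
        // Start with a new single-element run and cascade it upward,
        // merging with each occupied level into a freshly allocated array.
        let mut carry = vec![key];
        let mut k = 0;
        loop {
            if self.levels.len() == k {
                self.levels.push(None);
            }
            match self.levels[k].take() {
                None => {
                    self.levels[k] = Some(Arc::new(carry));
                    return;
                }
                Some(existing) => {
                    // `existing` is never touched; readers holding a clone of
                    // its Arc keep seeing the old, consistent array.
                    carry = merge(&existing, &carry);
                    k += 1;
                }
            }
        }
    }

    fn contains(&self, key: u64) -> bool {
        self.levels
            .iter()
            .flatten()
            .any(|level| level.binary_search(&key).is_ok())
    }
}

// Merge two sorted arrays into a brand-new one (the cache-friendly part).
fn merge(a: &[u64], b: &[u64]) -> Vec<u64> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::with_capacity(a.len() + b.len());
    while i < a.len() && j < b.len() {
        if a[i] <= b[j] { out.push(a[i]); i += 1; } else { out.push(b[j]); j += 1; }
    }
    out.extend_from_slice(&a[i..]);
    out.extend_from_slice(&b[j..]);
    out
}
```

In a threaded setting you'd publish the level list itself through something like an atomic pointer swap, so readers grab a cheap snapshot of the Arcs and never block the merging writer.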