Nobody has ever had this confusion about the access time of hash tables, except maybe in an introductory class. What you're describing is the same reasoning you'd apply to any data structure. Which is correct: physical memory hierarchies are a data structure. Literally.

I’m confused by GP’s confusion.





This is completely false. All regularly cited algorithm complexity classes are based on treating a memory access as an O(1) operation. For example, if you model memory access as O(N^1/3), the worst case of linear search is not O(N), it is O(N^4/3): in the worst case you have to make N memory accesses and N comparisons, and if each memory access takes N^1/3 time, this requires N^4/3 + N time, which is O(N^4/3).
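
Spelling that count out as a worked equation, under the same assumed cost model (c_cmp and c_mem are just illustrative per-operation constants, not something from the comment):

    T(N) = \sum_{i=1}^{N} \left( c_{\mathrm{cmp}} + c_{\mathrm{mem}} N^{1/3} \right)
         = c_{\mathrm{cmp}} N + c_{\mathrm{mem}} N^{4/3}
         = O(N^{4/3})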

> the worst case of linear search is not O(N), it is O(N^4/3)

No, these are different 'N's. The N in the article is the size of the memory pool over which your data is (presumably randomly) distributed. Many factors can influence this. Let's call this size M. Linear search is O(N) where N is the number of elements. It is not O(N^4/3), it is O(N * M^1/3).

There's a good argument to be made that M^(1/3) should be considered a constant, so the algorithm is indeed simply O(N). If you include M^(1/3), why are you not also including your CPU speed? The speed of light? The number of times the OS switches threads during your algorithm? Everyone knows that an O(N) algorithm, run on the same data, will take different amounts of time on different hardware. The point of Big-O is to have some reasonable understanding of how much worse it will get if you need to run this algorithm on 10x or 100x as much data, compared to some baseline that you simply have to benchmark because it relies on too many external factors (memory size being one).
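
A one-line sanity check of the "constant" reading, assuming M stays fixed while N grows by a factor k (T is the modeled running time and c an arbitrary constant, both introduced here for illustration):

    T(N) = c \, N \, M^{1/3}
    \quad\Rightarrow\quad
    \frac{T(kN)}{T(N)} = k \quad \text{for any fixed } M

So under that model the M^(1/3) term scales your baseline but not your growth rate, which is exactly the "treat it as a constant" argument.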

> All regularly cited algorithm complexity classes are based on treating a memory access as an O(1) operation

That's not even true: there are plenty of "memory-aware" algorithms that are designed to maximize the usage of caching. There are abstract memory models that are explicitly considered in modern algorithm design.


You can model things with M as a constant - and that's what people typically do. The point is that this is a bad model, one that breaks down when your data becomes huge. If you're trying to see how an algorithm will scale from a thousand items to a billion items, then sure - you don't really need to model memory access speeds (though even this is very debatable, as it leads to very wrong conclusions, such as thinking that inserting items into the middle of a linked list is faster than inserting them into the middle of an array, for large enough arrays - which is simply wrong on modern hardware).
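
A rough sketch of that linked-list point (my own toy benchmark, not anything from the comment; CPython's per-node object overhead exaggerates the pointer-chasing cost, but the cache-locality argument is the same in compiled languages):

    # Compare inserting into the middle of a contiguous array (Python list)
    # vs. a singly linked list. The list splice itself is O(1), but you still
    # have to chase n/2 pointers to reach the middle; the array does one
    # contiguous shift of n/2 elements.
    import time

    class Node:
        __slots__ = ("value", "next")
        def __init__(self, value, next=None):
            self.value = value
            self.next = next

    def build_linked_list(n):
        head = None
        for i in range(n):
            head = Node(i, head)
        return head

    def insert_middle_linked(head, steps):
        node = head
        for _ in range(steps):           # walk to the middle: the O(n) part
            node = node.next
        node.next = Node(-1, node.next)  # the splice itself is O(1)

    def bench(n):
        arr = list(range(n))
        t0 = time.perf_counter()
        arr.insert(n // 2, -1)           # contiguous shift of n/2 elements
        t_array = time.perf_counter() - t0

        head = build_linked_list(n)
        t0 = time.perf_counter()
        insert_middle_linked(head, n // 2)
        t_linked = time.perf_counter() - t0
        print(f"n={n:>9}: array {t_array*1e3:8.3f} ms, linked list {t_linked*1e3:8.3f} ms")

    for n in (10_000, 100_000, 1_000_000):
        bench(n)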

However, if you want to model how your algorithm scales to petabytes of data, then the model you were using breaks down, as the cost of memory access for an array that fits in RAM is much smaller than the cost of memory access for the kind of network storage that you'll need for this level of data. So, for this problem, modeling memory access as a function of N may give you a better fit for all three cases (1K items, 1G items, and 1P items).

> That's not even true: there are plenty of "memory-aware" algorithms that are designed to maximize the usage of caching.

I know they exist, but I have yet to see any kind of popular resource use them. What are the complexities of Quicksort and Mergesort in a memory-aware model? How often are they mentioned compared to how often you see O(N log N) / O(N²)?
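
For reference (textbook background, not something claimed in the thread): in the external-memory / I/O model, with block size B and fast-memory size Z (renamed from the usual M to avoid clashing with the M above), the standard bound for comparison-based sorting, which external merge sort achieves, is

    \mathrm{Sort}(N) = \Theta\!\left( \frac{N}{B} \log_{Z/B} \frac{N}{B} \right) \ \text{block transfers}

which is an answer to "what is the complexity in a memory-aware model", though it is indeed rarely quoted outside the algorithms and database literature.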


> It is not O(N^4/3), it is O(N * M^1/3). There's a good argument to be made that M^(1/3) should be considered a constant

Math isn't mathing here. If M^1/3 >> N, like in memory-bound algorithms, then why should we consider it a constant?

> The point of Big-O is to have some reasonable understanding of how much worse it will get if you need to run this algorithm on 10x or 100x as much data

And this also isn't true, as a simple counterexample shows: an O(N) linear search over an array is sometimes faster than an O(1) lookup in a hash map.
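
A minimal harness to poke at that claim (sizes, key type, and hit position are arbitrary choices of mine; which side wins depends on the runtime and the keys, which is the point - Big-O alone doesn't decide it):

    # Compare a linear scan over a small list with a dict lookup.
    import timeit

    def bench(n):
        keys = list(range(n))
        table = {k: True for k in keys}
        target = keys[0]  # early hit: the favorable case for the scan

        t_scan = timeit.timeit(lambda: target in keys, number=200_000)
        t_hash = timeit.timeit(lambda: target in table, number=200_000)
        print(f"n={n:>4}: list scan {t_scan:.4f}s, dict lookup {t_hash:.4f}s")

    for n in (4, 16, 64, 256):
        bench(n)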


This is untrue. Algorithms are assigned time complexity classes with respect to specific operations. For example, comparison sorts are typically cited as O(n log n), but this refers to the number of comparisons, irrespective of the complexity of memory accesses. It's possible that comparing two values has a time complexity of O(2^n), in which case sorting such a data structure would be impractical, and yet the sorting algorithm itself would still be said to have a time complexity of O(n log n), because time complexity is stated with respect to some given set of operations.
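
A small sketch of that "count a chosen operation" view (CountedKey is a made-up wrapper of mine, not anything from the comment): it counts the comparisons performed by Python's sorted(), independently of how long each comparison takes.

    # Count comparisons made by a sort, separately from wall-clock time.
    # However slow each individual comparison is, the count stays ~ n log n.
    import random

    comparisons = 0

    class CountedKey:
        __slots__ = ("value",)
        def __init__(self, value):
            self.value = value
        def __lt__(self, other):   # the only operation sorted() needs
            global comparisons
            comparisons += 1
            return self.value < other.value

    n = 10_000
    data = [CountedKey(random.random()) for _ in range(n)]
    sorted(data)
    print(f"n={n}, comparisons={comparisons}")  # n*log2(n) is ~133k for n=10,000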

Another common scenario where this comes up, and actually results in a great deal of confusion and misconceptions, is hash maps, which are said to have a time complexity of O(1), but that does not mean that if you actually benchmark the performance of a hash map with respect to its size, the graph will be flat or asymptotically approach a constant value. Larger hash maps are slower to access than smaller hash maps, because the O(1) isn't intended to be a claim about the overall performance of the hash map as a whole, but rather a claim about the average number of probe operations needed to look up a value.

In fact, in the absolute purest form of time complexity analysis, where the operations involved are literally the transitions of a Turing machine, memory access is not assumed to be O(1) but rather O(n).


The time complexity of an algorithm specifically refers to the time it takes the algorithm to finish. So if you're sorting values where a single comparison of two values takes O(2^n) time, the time complexity of the sort can't be O(n log n).

Now, you very well can measure the "operation complexity" of an algorithm, where you specify how many operations of a certain kind it will perform. And you're right that the complexities typically cited for comparison sorts are not time complexities; they are counts of the comparisons you'll have to make.

> hash maps, which are said to have a time complexity of O(1), but that does not mean that if you actually benchmark the performance of a hash map with respect to its size, the graph will be flat or asymptotically approach a constant value.

This is confused. Hash maps have an idealized O(1) average-case complexity, and O(n) worst-case complexity. The difficulty with pinning down the actual average-case complexity is that you need to trade off memory usage against the chance of collisions, and people are usually quite sensitive to memory usage, so they end up with more and more collisions as n gets larger, while the idealized average-case analysis assumes that the hash function has the same chance of collisions regardless of n. Basically, the claim "the average-case time complexity of hash tables is O(1)" is only true if you maintain a very sparse hash table, which means its memory usage will grow steeply with size. For example, if you want to store thousands of arbitrary strings with a low chance of collisions, you'll probably need a bucket array with a size on the order of 2^32.

Still, if you benchmark the performance of hash table lookups with respect to size, while using a very good hash function and maintaining a very low load factor (so, using a large amount of memory), the graph will indeed be flat.
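
A sketch of that last point (my own toy simulation of open addressing with linear probing; the parameters are arbitrary): holding the load factor constant, the average number of probes per successful lookup stays roughly flat as the table grows, which is the "flat graph" case described above.

    # Simulate linear probing at a fixed load factor and measure the average
    # number of probes per successful lookup. With the load factor held
    # constant, the average should stay roughly flat as the table grows.
    import random

    def avg_probes(num_keys, load_factor=0.5):
        capacity = int(num_keys / load_factor)
        slots = [None] * capacity
        keys = random.sample(range(capacity * 100), num_keys)

        for key in keys:                      # insert with linear probing
            i = hash(key) % capacity
            while slots[i] is not None:
                i = (i + 1) % capacity
            slots[i] = key

        total = 0
        for key in keys:                      # successful lookups
            i = hash(key) % capacity
            probes = 1
            while slots[i] != key:
                i = (i + 1) % capacity
                probes += 1
            total += probes
        return total / num_keys

    for n in (1_000, 10_000, 100_000, 1_000_000):
        print(f"n={n:>9}: average probes per lookup = {avg_probes(n):.2f}")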


> if you model memory access as O(N^1/3), the worst case of linear search is not O(N), it is O(N^4/3)

This would be true if we modeled memory access as Theta(N^{1/3}), but that's not the claim. One can imagine the data organized/prefetched in such a way that a linear scan is O(1) per element while a random access is expected Theta(N^{1/3}). You see the same sort of pattern with well-known data structures like a (balanced) binary tree: random access is O(log n), but a linear scan of all elements is O(n).
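
A sketch of that tree analogy (a toy BST of my own, not anything from the thread): an in-order walk touches each node once, roughly O(1) per element, while independent searches cost O(log n) steps each.

    # Toy unbalanced BST: compare the per-element cost of a full scan
    # (in-order traversal) with the per-element cost of random lookups.
    import random

    class Node:
        __slots__ = ("key", "left", "right")
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None

    def insert(root, key):
        if root is None:
            return Node(key)
        if key < root.key:
            root.left = insert(root.left, key)
        elif key > root.key:
            root.right = insert(root.right, key)
        return root

    def search_steps(root, key):
        steps, node = 1, root
        while node.key != key:            # key is known to be present
            node = node.left if key < node.key else node.right
            steps += 1
        return steps

    def scan_visits(node):
        if node is None:
            return 0
        return 1 + scan_visits(node.left) + scan_visits(node.right)

    n = 100_000
    keys = random.sample(range(10 * n), n)
    root = None
    for k in keys:
        root = insert(root, k)

    print(f"scan: {scan_visits(root) / n:.2f} visits/element")                           # ~1
    print(f"lookups: {sum(search_steps(root, k) for k in keys) / n:.2f} steps/element")  # ~2*ln(n)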



