Yes, it certainly provides a lot more clarity than the handwaving. While momentu...

kingstnap · 2025-10-08T05:36:52 1759901812

I doubt it would ever be period 3.

Not a formal proof, but there is a fun theorem called "period three implies chaos" (Li and Yorke, 1975) that my gut says applies here.

Basically, if you have a continuous map f: [a,b] -> [a,b] and it has a 3-cycle, then it also has a cycle of every other length.

In this case that would kind of say that if you are bouncing between three values on the y-axis (and the bouncing is a continuous function, which admittedly the gradient of a ReLU is not), you are probably in a chaotic system.

That does require assuming that the behaviour of y is largely a function of y alone, but their derivation seems to imply that this is the case.
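
To make the theorem concrete, here is a rough numerical sketch. It does not model the optimizer dynamics discussed above. It uses the logistic map at r = 3.83 purely as an assumed stand-in for "a continuous map of an interval into itself that has a 3-cycle". The function names (`settled_orbit`, `find_min_period_point`) and the grid and tolerance values are illustrative choices, not anything from the article. The script first checks that the long-run orbit repeats every three steps, then searches for points of other minimal periods by looking for sign changes of f^n(x) - x on a grid and refining them with bisection.

```python
# Assumption: the logistic map at r = 3.83 is only a toy stand-in for
# "a continuous map [0,1] -> [0,1] with a 3-cycle"; it is not the system
# from the article.
R = 3.83


def f(x, r=R):
    """One step of the logistic map."""
    return r * x * (1.0 - x)


def iterate(x, n, r=R):
    """Apply f to x a total of n times."""
    for _ in range(n):
        x = f(x, r)
    return x


def settled_orbit(x0=0.5, burn_in=2000, length=6):
    """Discard a transient, then return the next `length` iterates."""
    x = iterate(x0, burn_in)
    orbit = []
    for _ in range(length):
        orbit.append(x)
        x = f(x)
    return orbit


def find_min_period_point(n, grid=20001, tol=1e-6):
    """Return some x with f^n(x) = x and f^k(x) != x for 0 < k < n, or None.

    Scans a uniform grid on [0, 1] for sign changes of g(x) = f^n(x) - x,
    refines each bracket by bisection, and rejects roots whose true period
    is shorter than n.
    """
    def g(x):
        return iterate(x, n) - x

    step = 1.0 / (grid - 1)
    lo, g_lo = 0.0, g(0.0)
    for i in range(1, grid):
        hi = i * step
        g_hi = g(hi)
        if g_lo * g_hi < 0:
            a, b, g_a = lo, hi, g_lo
            for _ in range(60):  # plain bisection on the bracket [a, b]
                m = 0.5 * (a + b)
                g_m = g(m)
                if g_a * g_m <= 0:
                    b = m
                else:
                    a, g_a = m, g_m
            x = 0.5 * (a + b)
            # Keep x only if no shorter iterate returns to it.
            if all(abs(iterate(x, k) - x) > tol for k in range(1, n)):
                return x
        lo, g_lo = hi, g_hi
    return None


print("settled orbit:", settled_orbit())
for n in range(1, 6):
    print("minimal period", n, "point:", find_min_period_point(n))
```

The attracting 3-cycle is the only cycle you would ever see by just iterating. The other cycles are unstable, which is why the sketch has to go looking for them with a root finder instead of simulating. That is also the caveat for the argument above: the theorem guarantees those cycles exist, not that a typical trajectory visibly wanders among them.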