We realise our approach in a new non-blocking data-structure, the TS
(timestamped) stack. In experiments on x86, the TS stack outperforms
and outscales all its competitors -- for example, it outperforms the
elimination-backoff stack by factor of two. In our approach, more
concurrency translates into less ordering, giving less-contended
removal and thus higher performance and scalability. Despite this,
the TS stack is linearizable with respect to stack semantics.
The weak internal ordering in the TS stack presents a challenge when
establishing linearizability: standard techniques such as
linearization points work well when there exists a total internal
order. We have developed a new stack theorem, mechanised in Isabelle,
which characterises the orderings sufficient to establish stack
semantics. By applying our stack theorem, we can show that the TS
stack is indeed correct.