We propose a new model for the approximate near neighbor problem that aims to balance theoretical guarantees with adaptability to the dataset. Our approach stores the input point set in a binary tree structure optimized for
performance on a fixed dataset and query distribution. The core idea is to extract useful information from the point set to enhance the structure, but to halt this extraction once it becomes potentially harmful. At that point, we transition to an existing technique that offers theoretical guarantees. This strategy lets us exploit the efficiency of a data-driven model while avoiding components that could degrade performance; the structure thus remains adaptive without sacrificing its worst-case guarantees. Finally, we conduct experiments demonstrating our algorithm's adaptability to a dataset while preserving its theoretical guarantees. Specifically, we evaluate our model on the MNIST dataset by performing queries on instances built from samples of different sizes, and compare the results with those of linear search.
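For concreteness, the hybrid principle can be sketched roughly as follows. This is an illustrative toy, not the paper's actual construction: the class name, the splitting heuristic, the stopping thresholds, and the choice of linear scan as the guaranteed fallback are all assumptions made for the sake of the sketch.

```python
import math

class HybridTree:
    """Toy sketch of a hybrid structure: adapt to the data with median splits,
    but halt adaptation and fall back to a guaranteed method (here, a plain
    linear scan) when further splitting stops being useful."""

    def __init__(self, points, depth=0, max_depth=10, min_leaf=16):
        self.points = [tuple(p) for p in points]
        dim = len(self.points[0])
        spreads = [max(p[a] for p in self.points) - min(p[a] for p in self.points)
                   for a in range(dim)]
        # Stop extracting structure when it could become harmful: too few
        # points, depth budget exhausted, or no spread left to split on.
        if len(self.points) <= min_leaf or depth >= max_depth or max(spreads) == 0:
            self.leaf = True
            return
        self.leaf = False
        self.axis = spreads.index(max(spreads))    # split the widest axis
        vals = sorted(p[self.axis] for p in self.points)
        self.threshold = vals[len(vals) // 2]      # median split
        left = [p for p in self.points if p[self.axis] <= self.threshold]
        right = [p for p in self.points if p[self.axis] > self.threshold]
        if not left or not right:                  # degenerate split: stop adapting
            self.leaf = True
            return
        self.left = HybridTree(left, depth + 1, max_depth, min_leaf)
        self.right = HybridTree(right, depth + 1, max_depth, min_leaf)

    def nearest(self, q):
        """Greedy descent to one leaf, then an exact linear scan there
        (so the overall answer is approximate)."""
        if self.leaf:
            return min(self.points, key=lambda p: math.dist(p, q))
        child = self.left if q[self.axis] <= self.threshold else self.right
        return child.nearest(q)
```

Setting `min_leaf` larger than the dataset degenerates the structure into the pure fallback (a single linear scan), illustrating one extreme of the trade-off between data adaptivity and guaranteed behavior.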