Nearest neighbor search is the problem of computing the most similar objects to a given query object. Typically objects are represented as vectors in a high-dimensional space. You want to be able to compute similar objects fast even when the database contains millions of objects.

Varieties of the problem

Sparse Data vs. Dense Data
Systems (standalone systems vs. keyword search systems)
Online queries vs. batch
Queries vs. all nearest neighbors problem
Types of similarity

What’s new

2019, Application. In NLP with deep neural networks, one can predict a vector, then search for the corresponding word: “At test time, the model generates a vector and then searches for its nearest neighbor in the target embedding space to generate the corresponding word.” Von Mises-Fisher Loss for Training Sequence to Sequence Models with Continuous Outputs by Kumar and Tsvetkov
2019, Library, Microsoft open sources a new NN library SPTAG: A library for fast approximate nearest neighbor search
2017, Library, Faiss by Facebook is a library for efficient similarity search and clustering of dense vectors
2014, Library, Non-Metric Space Library