This document describes a framework for large-scale semantic visual search. Robust local image features are aggregated into statistical representations and compressed so that they can be matched efficiently. Retrieval proceeds in two stages: an initial nearest-neighbor search produces candidate matches, and geometric verification then re-ranks those candidates to improve the results. The goal is to develop visual representations that capture fine-grained differences well enough to solve challenging ranking and retrieval problems.
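
The aggregate, match, verify and re-rank flow summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the framework's actual implementation: mean pooling stands in for the unspecified statistical aggregation, the compression step is omitted, the nearest-neighbor search is brute force, and the geometric verification score is supplied by the caller. The function names (`mean_pool`, `shortlist`, `rerank`) and the inlier counts in the demo are hypothetical.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def mean_pool(local_features: Sequence[Vector]) -> List[float]:
    """Aggregate local descriptors into one L2-normalised global vector.

    Assumption: mean pooling is a stand-in for the statistical aggregation.
    """
    dim = len(local_features[0])
    sums = [0.0] * dim
    for feat in local_features:
        for i, value in enumerate(feat):
            sums[i] += value
    mean = [s / len(local_features) for s in sums]
    norm = math.sqrt(sum(v * v for v in mean)) or 1.0  # avoid division by zero
    return [v / norm for v in mean]


def shortlist(query: Vector, database: Sequence[Vector], k: int) -> List[int]:
    """Stage 1: brute-force nearest-neighbour search.

    Vectors are unit length, so the dot product equals cosine similarity.
    """
    scores = [sum(q * d for q, d in zip(query, vec)) for vec in database]
    order = sorted(range(len(database)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def rerank(candidates: Sequence[int], verify: Callable[[int], float]) -> List[int]:
    """Stage 2: re-order the shortlist by a verification score.

    `verify` is caller-supplied, e.g. a geometric inlier count per candidate.
    Python's sort is stable, so ties keep their stage-1 order.
    """
    return sorted(candidates, key=verify, reverse=True)


# Toy database: each image is a small set of 2-D local descriptors.
db_local = [
    [[1.0, 0.0], [0.9, 0.1]],  # image 0
    [[0.0, 1.0], [0.1, 0.9]],  # image 1
    [[0.7, 0.3], [0.8, 0.2]],  # image 2
]
database = [mean_pool(feats) for feats in db_local]
query = mean_pool([[1.0, 0.0], [0.8, 0.2]])

top = shortlist(query, database, k=2)

# Made-up inlier counts standing in for a real geometric verification step.
inliers = {0: 5, 1: 0, 2: 12}
final = rerank(top, lambda i: inliers[i])
```

In this toy run the first stage ranks image 0 ahead of image 2, and the second stage swaps them because image 2 has the higher verification score. Keeping the verifier as a plain callable means the expensive check runs only on the `k` shortlisted candidates rather than on the whole database.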