The Internet of Things (IoT) significantly increases the volume of computation and the number of applications running on processors, from mobile devices to servers. Big data computation requires massively parallel processing and acceleration. In parallel processing, associative memories are a promising way to improve energy efficiency by eliminating redundant computations. However, the tradeoff between memory size and search energy consumption limits their applicability. In this paper, we propose a novel low-energy Resistive Multi-stage Associative Memory (ReMAM) architecture, which significantly reduces search energy consumption by employing selective row activation and in-advance precharging techniques. ReMAM splits the search in the Ternary Content Addressable Memory (TCAM) into a number of shorter searches in consecutive stages. It then selectively activates TCAM rows at each stage based on the hits of the previous stages, which saves energy. The proposed in-advance precharging technique mitigates the delay of the sequential TCAM search and limits the number of precharges to two low-cost steps. Our experimental evaluation on AMD Southern Islands GPUs shows that ReMAM reduces energy consumption by 38.2% on average, which is 1.62X larger than the saving obtained with a GPGPU using a conventional single-stage associative memory. We also show that applying voltage overscaling on top of ReMAM further improves the energy saving to 44.8%, with an average relative error below 10%.
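The multi-stage search with selective row activation described above can be illustrated with a small behavioral sketch in Python. It is not the ReMAM circuit or the paper's energy model: the function names, the example patterns, and the use of a cell-comparison count as a rough proxy for search energy are all assumptions made for illustration, and in-advance precharging and the resistive devices are not modeled.

```python
from typing import List, Tuple


def chunk_matches(pattern: str, key: str) -> bool:
    """Ternary match: an 'X' in the stored pattern matches either key bit."""
    return all(p == "X" or p == k for p, k in zip(pattern, key))


def multistage_search(rows: List[str], key: str, stage_bits: int) -> Tuple[List[int], int]:
    """Return (indices of matching rows, number of cell comparisons performed).

    Hypothetical model: the comparison count stands in for search energy.
    """
    active = list(range(len(rows)))  # the first stage activates every row
    compared_cells = 0
    for start in range(0, len(key), stage_bits):
        end = start + stage_bits
        key_chunk = key[start:end]
        # Only currently active rows take part in this stage.
        compared_cells += len(active) * len(key_chunk)
        # Rows that miss in this stage are not activated in later stages.
        active = [i for i in active if chunk_matches(rows[i][start:end], key_chunk)]
        if not active:
            break
    return active, compared_cells


# Illustrative 8-bit ternary patterns (assumed data, not from the paper).
rows = ["1010XX01", "10110001", "0XXX1111", "1010110X"]
key = "10101101"

hits, staged_cost = multistage_search(rows, key, stage_bits=2)
single_hits, single_cost = multistage_search(rows, key, stage_bits=len(key))
print(hits, staged_cost, single_cost)
```

Setting `stage_bits` to the full key width reproduces a single-stage search in this model, so the two costs can be compared directly. Both configurations return the same matching rows, while the staged search touches fewer cells because rows that miss early are never activated again.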