Spiking neural networks (SNNs) are artificial neural network models that offer significant power and energy advantages for deep learning applications. However, the data-intensive nature of machine learning applications poses challenges for neural network implementations in terms of latency, energy efficiency, and memory bottlenecks. We therefore introduce a scalable deep SNN that addresses latency and energy efficiency. We integrate a computing-in-memory (CIM) architecture, built with a fabricated memristor crossbar array, to reduce the memory bandwidth required for vector-matrix multiplication, a key operation in deep learning. By applying an inter-spike interval (ISI) encoding scheme to the input signals, we demonstrate the spatiotemporal information processing capability of the designed architecture. The memristor crossbar array includes an enhanced heat dissipation layer that reduces the resistance variation of the memristors by approximately 30%. We further develop a time-to-first-spike (TTFS) method to classify the outputs. The designed circuits and architecture achieve high classification accuracy on both a digit recognition task and the MNIST dataset. The architecture classifies handwritten digits while consuming only 2.9 mW of power at an inference speed of 2 μs/image. At only 2.51 pJ of energy per synaptic connection, the architecture is well suited to deep learning accelerators.
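The signal path summarized above (ISI encoding, crossbar vector-matrix multiplication, TTFS readout) can be illustrated with a minimal software sketch. It is not the fabricated circuit. The linear intensity-to-interval map, the ideal crossbar without device variation, the constant-current integrate-and-fire readout, and all function names and parameter values (`isi_encode`, `isi_decode`, `crossbar_vmm`, `ttfs_classify`, `t_min`, `t_max`, `threshold`) are assumptions made here for illustration.

```python
import math

def isi_encode(values, t_min=1.0, t_max=10.0):
    """Map intensities in [0, 1] to inter-spike intervals.
    Assumed linear map: a larger intensity gives a shorter interval."""
    intervals = []
    for x in values:
        x = min(max(float(x), 0.0), 1.0)  # clip to [0, 1]
        intervals.append(t_max - x * (t_max - t_min))
    return intervals

def isi_decode(intervals, t_min=1.0, t_max=10.0):
    """Invert the assumed linear map to recover input amplitudes."""
    return [(t_max - t) / (t_max - t_min) for t in intervals]

def crossbar_vmm(voltages, conductances):
    """Ideal crossbar: column current I_j = sum_i V_i * G[i][j]
    (Ohm's law per device, Kirchhoff's current law per column)."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]

def ttfs_classify(currents, threshold=1.0):
    """Assumed integrate-and-fire readout with constant input current:
    a neuron fires at t = threshold / I, and the earliest spike wins.
    Neurons with non-positive current never fire (time = inf)."""
    times = [threshold / i if i > 0 else math.inf for i in currents]
    return times.index(min(times)), times

# Toy example: 3 inputs, 2 output neurons, hypothetical conductances.
intervals = isi_encode([0.0, 0.5, 1.0])
voltages = isi_decode(intervals)
G = [[1.0, 0.0],
     [0.0, 1.0],
     [0.0, 2.0]]
label, spike_times = ttfs_classify(crossbar_vmm(voltages, G))
```

In this toy case the second output column receives the larger current, so it crosses threshold first and `label` is 1. If no neuron receives positive current, every spike time is infinite and the function falls back to index 0, which a real readout would need to handle explicitly.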