While research in designing brain-inspired algorithms has reached a stage where Artificial Intelligence platforms can outperform humans at several cognitive tasks, an often-unnoticed cost is the huge computational expense of running these algorithms in hardware. Bridging this computational efficiency gap necessitates the exploration of devices, circuits, and algorithms that better match the computational primitives of biological processing (neurons and synapses) and that require a significant rethinking of traditional von Neumann computing. Recent experiments in spintronic technologies reveal immense possibilities for implementing a plethora of neural and synaptic functionalities with single spintronic device structures that can be operated at very low terminal voltages. Leveraging insights from such experiments, we present a multi-disciplinary perspective across the entire stack of devices, circuits, and systems to envision the design of an "All-Spin" neuromorphic processor with on-chip learning functionalities that can potentially achieve two to three orders of magnitude energy improvement over state-of-the-art CMOS implementations. We also discuss recent algorithmic innovations in exploring event-driven Spiking Neural Networks for large-scale machine learning tasks. Such event-driven systems can potentially require significantly fewer computational resources than standard deep learning platforms. The outlined end-to-end hardware-software co-design effort can enable drastic improvements in the efficiency of autonomous AI agents. This is an invited presentation-only paper.