Deep learning algorithms have taken learning-based applications by storm because of their accuracy in modeling complex patterns, classification tasks and prediction problems across a wide range of applications such as image processing, autonomous driving and pattern recognition. This has come about due to several factors: improved computational capabilities with GPUs; improved memory technologies (near-data processing, High Bandwidth Memory, Hybrid Memory Cube and non-volatile memory) that have enabled us to break the memory wall; dedicated accelerators; FPGA implementations; and so on. Since these workloads primarily consist of shuttling large datasets between main memory and the processing unit(s), several non-von Neumann approaches have also been implemented through dataflow computing, neuromorphic chips, spiking neural network-based machines, and other brain-inspired computing technologies. While these technological advances have improved the accuracy of deep learning algorithms and made neuromorphic and non-von Neumann implementations possible, substantial challenges remain. From a systems perspective, it is not clear how overall system-level energy efficiency can be modeled, estimated and coordinated across such diverse computing architectures. Furthermore, a large class of deep learning algorithms that benefit from these advances runs predominantly on "edge" devices such as autonomous cars, where overall power consumption and energy efficiency are of the highest importance. This paper makes the following contributions: we present an architectural analysis of deep learning, showing how modeling its algorithms as dataflows, representing those dataflows as "neurons", and implementing them on non-von Neumann architectures has drastically improved the performance and accuracy of these workloads. We then present a landscape of different CNN implementations across von Neumann and non-von Neumann models. Subsequently, we examine system-level energy efficiency considerations (energy models, abstractions, estimation methodologies, standards and benchmarks) for deep learning systems built with such diverse architectures and technologies, and discuss open problems, challenges and opportunities for both industry and research communities.