Energy harvesting (EH) IoT devices have attracted considerable attention in both academia and industry because they can operate sustainably by harvesting energy from the ambient environment. However, because harvested power is weak and transient, EH technology cannot support power-intensive IoT devices such as IoT edge servers. Therefore, hybrid IoT systems, in which EH IoT devices and non-EH IoT devices coexist, are forthcoming. This paper explores the routing problem in such a hybrid distributed IoT system. We first propose a comprehensive multi-hop routing mechanism for this hybrid system. We then propose a distributed multi-agent deep reinforcement learning algorithm, spatial asynchronous advantage actor-critic (SAC). SAC optimizes the system's routing policy and energy allocation to maximize both the total amount of transmitted data and the overall data delivery to the sink node. Experimental results indicate that, on average, SAC achieves at least a 1:5 transmission rate and a 12:9 sink packet delivery rate compared with the baselines.