Preliminary Program

SAC: A Novel Multi-hop Routing Policy in Hybrid Distributed IoT System based on Multi-agent Reinforcement Learning

Wen Zhang¹, Tao Liu², Mimi Xie³, Jun Zhang⁴, Chen Pan¹
¹Texas A&M University-Corpus Christi, ²Lawrence Technological University, ³The University of Texas at San Antonio, ⁴Harvard University

Abstract

Energy harvesting (EH) IoT devices have attracted considerable attention in both academia and industry because they can operate sustainably by harvesting energy from the ambient environment. However, because harvested power is weak and transient, EH technology cannot support power-intensive IoT devices such as IoT edge servers. Hybrid IoT systems, in which EH and non-EH IoT devices co-exist, are therefore emerging. This paper explores the routing problem in such a hybrid distributed IoT system. We first propose a comprehensive multi-hop routing mechanism for this hybrid system. We then propose a distributed multi-agent deep reinforcement learning algorithm, spatial asynchronous advantage actor-critic (SAC). SAC optimizes the system routing policy and energy allocation while maximizing both the total amount of transmitted data and the overall data delivery to the sink node. The experimental results indicate that SAC achieves, on average, at least 1:5 transmission rate and 12:9 sink packet delivery rate compared with the baselines.