SPHINCS+, a hash-based signature scheme, is one of the four winners of the post-quantum cryptography (PQC) competition hosted by the U.S. National Institute of Standards and Technology (NIST). However, its slow signing speed is a bottleneck for applications. For this reason, Haraka, a hash function designed for short inputs, is recommended as the third instantiation of SPHINCS+ because of its high processing speed. In this work, we propose four hardware architectures for Haraka in SPHINCS+, denoted Case I to Case IV. Several optimization methods are combined and applied in the different cases to trade off area against throughput for different application scenarios. We implement our designs in SystemVerilog and synthesize them in TSMC 28-nm CMOS technology. Experimental results show that Case IV achieves the best throughput and the best area efficiency, about 81.92 Gbps and 1.26 Mbps/GE, respectively, and significantly outperforms both the state-of-the-art implementation of Haraka and an advanced hardware implementation of the SHA-3 hash function.