%0 Conference Proceedings
%T FPGA-Based Multi-precision Architecture for Accelerating Large-Scale Floating-Point Matrix Computing
%+ State Key Laboratory of High Performance Computing
%+ School of Computer [China]
%A Zhang, Longlong
%A Peng, Yuanxi
%A Hu, Xiao
%A Huang, Ahui
%A Tian, Tian
%Z Part 4: Architecture and Hardware
%< peer-reviewed
%( Lecture Notes in Computer Science
%B 17th IFIP International Conference on Network and Parallel Computing (NPC)
%C Zhengzhou, China
%Y Xin He
%Y En Shao
%Y Guangming Tan
%I Springer International Publishing
%3 Network and Parallel Computing
%V LNCS-12639
%P 191-202
%8 2020-09-28
%D 2020
%R 10.1007/978-3-030-79478-1_17
%Z Computer Science [cs]
%Z Conference papers
%X Matrix computing plays a vital role in many scientific and engineering applications, but previous FPGA-based work can only handle data of a single specified precision. This study first presents algorithms, data flows, and mapping strategies that match the hardware structure for matrix computing at different precisions. We then propose a unified multi-precision matrix computing unit core that handles three precisions and three matrix operation modes and can serve as a coprocessor for large-scale matrix computing, with the advantages of low storage and high efficiency. Finally, we build a complete matrix computing acceleration system and deploy it on an FPGA using 128 processing elements (PEs). The experimental results show that the accelerator achieves a maximum frequency of 180 MHz, and that matrix computing on double-precision, single-precision, and half-precision floating-point data reaches 46.1 GFLOPS, 92.1 GFLOPS, and 184.3 GFLOPS, respectively, which is superior to other current designs in terms of application range and performance.
%G English
%Z TC 10
%Z WG 10.3
%2 https://inria.hal.science/hal-03768753/document
%2 https://inria.hal.science/hal-03768753/file/511910_1_En_17_Chapter.pdf
%L hal-03768753
%U https://inria.hal.science/hal-03768753
%~ IFIP-LNCS
%~ IFIP
%~ IFIP-TC
%~ IFIP-TC10
%~ IFIP-NPC
%~ IFIP-WG10-3
%~ IFIP-LNCS-12639