%0 Conference Proceedings
%T Semi- and Fully-Random Access LUTs for Smooth Functions
%+ University of California [Riverside] (UC Riverside)
%+ North Carolina State University [Raleigh] (NC State)
%+ Yeditepe University
%+ Özyeğin University
%A Gener, Y. Serhan
%A Aydin, Furkan
%A Gören, Sezer
%A Ugurdag, H. Fatih
%< peer-reviewed
%( IFIP Advances in Information and Communication Technology
%B 27th IFIP/IEEE International Conference on Very Large Scale Integration - System on a Chip (VLSI-SoC)
%C Cusco, Peru
%Y Carolina Metzler
%Y Pierre-Emmanuel Gaillardon
%Y Giovanni De Micheli
%Y Carlos Silva-Cardenas
%Y Ricardo Reis
%I Springer International Publishing
%3 VLSI-SoC: New Technology Enabler
%V AICT-586
%P 279-306
%8 2019-10-06
%D 2019
%R 10.1007/978-3-030-53273-4_13
%Z Computer Science [cs]
%Z Conference papers
%X Look-Up Table (LUT) implementation of complicated functions often offers lower latency than algebraic implementations, at the expense of a significant area penalty. If the function is smooth, the MultiPartite table method (MP) can circumvent the area problem by breaking up the implementation into multiple smaller LUTs. However, even some of these smaller LUTs may be large in high-accuracy MP implementations. Lossless LUT compression can be applied to these LUTs to further improve area and, in some cases, timing. The state of the art in the literature decomposes the Table of Initial Values (TIV) of MP into a table of pivots and tables of differences from the pivots. Our technique instead places differences of consecutive elements in the difference tables, which results in a smaller range of differences that fit in fewer bits. Constraining the difference between consecutive input values (hence semi-random access) allows us to further optimize the designs. We also propose variants of our techniques with variable-length coding. We implemented Verilog generators of MP for sine and exponential using a conventional LUT as well as different versions of the state of the art and our technique. We synthesized the generated designs on an FPGA and found that our techniques produce up to 29% improvement in area, 11% improvement in timing, and 26% improvement in area-time product over the state of the art.
%G English
%Z TC 10
%Z WG 10.5
%2 https://inria.hal.science/hal-03476614/document
%2 https://inria.hal.science/hal-03476614/file/501403_1_En_13_Chapter.pdf
%L hal-03476614
%U https://inria.hal.science/hal-03476614
%~ IFIP
%~ IFIP-AICT
%~ IFIP-TC
%~ IFIP-WG
%~ IFIP-VLSISOC
%~ IFIP-TC10
%~ IFIP-WG10-5
%~ IFIP-AICT-586