Di Wu (Ph.D.)
Published: 2023
In the last decade, deep learning has gained soaring popularity in both academia and industry and now plays an indispensable role in many aspects of daily life. The resource-intensive nature of deep neural networks (DNNs), especially of their core operation, general matrix multiplication (GEMM), has prompted extensive efforts to optimize execution on conventional hardware, aiming to democratize the transformative capabilities of deep learning. However, conventional hardware based on binary computing does not achieve satisfactory efficiency. To reach unprecedented levels of hardware efficiency and enable new applications, my research leverages two paradigms that diverge from conventional binary computing: unary computing and approximate computing. The primary focus of my research is power-efficient computer architecture built on these paradigms, especially in the context of deep learning.

Unary computing manipulates data in the form of serial bitstreams using extremely simple hardware. My research on unary computing answers two questions: 1) what are the fundamental challenges of unary computing, and how can they be solved? 2) how can power-efficient computer architecture be designed with unary computing? This research holistically presents how to understand, study, and leverage unary computing through innovations across the design stack, including theory, simulation, circuits, architecture, and applications. It targets mainly GEMM operations, with extended work on nonlinear operations.

Approximate computing, in contrast, intentionally introduces controlled defects into the computation while keeping the associated errors acceptably low. My research on approximate computing answers two questions: 1) given a certain fault tolerance, how can accuracy be traded against energy? 2) given the high cost of complex nonlinear operations, how can hardware area and power overheads be minimized? This research identifies favorable trade-offs between DNN accuracy and hardware energy, achieving dynamic accuracy-energy scaling at both the single-operation level and the whole-DNN level to meet the demands of varied application scenarios. It targets emerging DNNs with heavy, complex nonlinear operations, in addition to GEMM operations.
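To give a flavor of why unary hardware can be so simple, the following sketch (an illustration of classic unary/stochastic computing, not code from the dissertation) represents a value in [0, 1] as the fraction of 1s in a serial bitstream; multiplying two such values then requires only a single AND gate per cycle. The function names and the bitstream length of 1024 are illustrative choices.

```python
# Illustrative sketch: unary/stochastic bitstream multiplication.
# A value p in [0, 1] is encoded as the fraction of 1s in a bitstream;
# a bitwise AND of two independent streams approximates their product.
import random

def to_bitstream(p, length, rng):
    """Encode probability p as a random serial bitstream."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def bitstream_value(bits):
    """Decode a bitstream back to the value it represents."""
    return sum(bits) / len(bits)

rng = random.Random(0)
a = to_bitstream(0.5, 1024, rng)
b = to_bitstream(0.5, 1024, rng)
# Elementwise AND approximates 0.5 * 0.5 = 0.25 (up to sampling noise).
product = [x & y for x, y in zip(a, b)]
print(bitstream_value(product))
```

The accuracy of the result grows with bitstream length, which is exactly where the fundamental challenges mentioned above (correlation between streams, long latency) arise.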
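As a minimal example of the accuracy-versus-cost trade-off in approximating an expensive nonlinear operation (a standard technique, not the dissertation's specific method), a sigmoid can be replaced by a piecewise-linear "hard sigmoid" whose slope 0.25 is just a 2-bit shift in hardware, eliminating the exponential unit at the cost of a bounded error:

```python
# Illustrative sketch: trading accuracy for hardware cost by replacing
# an exact sigmoid with a piecewise-linear approximation.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hard_sigmoid(x):
    # 0.25 * x is a cheap 2-bit right shift in fixed-point hardware;
    # min/max clamping needs only comparators.
    return min(max(0.25 * x + 0.5, 0.0), 1.0)

# Worst-case error over [-6, 6] stays bounded (around 0.12 near x = 2).
max_err = max(abs(sigmoid(x / 10) - hard_sigmoid(x / 10))
              for x in range(-60, 61))
print(max_err)
```

Whether such an error is acceptable depends on the DNN's fault tolerance, which is precisely the trade-off question posed above.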