Residential College | false |
Status | Published |
A 47nW Mixed-Signal Voice Activity Detector (VAD) Featuring a Non-Volatile Capacitor-ROM, a Short-Time CNN Feature Extractor and an RNN Classifier | |
Lin, Jinhai1; Un, Ka Fai1; Yu, Wei Han1; Mak, Pui In1; Martins, Rui P.1,2 | |
2023-03-23 | |
Conference Name | 2023 IEEE International Solid-State Circuits Conference (ISSCC) |
Source Publication | Digest of Technical Papers - IEEE International Solid-State Circuits Conference |
Volume | Volume 2023-February |
Pages | 214 - 216 |
Conference Date | 2023-02-19 to 2023-02-23 |
Conference Place | San Francisco, CA, USA |
Country | United States |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Abstract | Real-time speech recognizers and translators rely on an always-on voice activity detector (VAD) to enable and disable the main system for effective power savings. A feature extractor and a memoryless classifier form the basic structure of recent VADs, as depicted in Fig. 13.2.1 (upper-left). The feature extractor [1], [2] using Mel-frequency cepstral coefficients (MFCCs) regrettably occupies a substantial area and incurs a long latency, since an extraction window of 16 to 25ms is needed to cover frequencies down to ~100Hz. For example, the required filter bank in [1] consumes 1μW with a 25ms extraction window, leading to a 30ms latency. The mixer-based analog filter in [2] succeeds in squeezing the feature extractor's power to 60nW through time interleaving, but the latency lengthens to 512ms. The time-domain convolutional neural network (TD-CNN) in [3], which exploits passive switched-capacitor computation, saves both the power and area of the feature extractor. Still, the area of the analog memory and the signal leakage within the extraction window (10ms) are large. The large threshold in its sensitivity threshold control also incurs a latency of 50ms to sustain the VAD hit rate. Finally, all of [1]-[3] require area-and-power-hungry on-chip memory to store the network parameters; preloading them to the volatile memory at every power-on increases the system complexity substantially. |
DOI | 10.1109/ISSCC42615.2023.10067728 |
Scopus ID | 2-s2.0-85151755600 |
Document Type | Conference paper |
Collection | INSTITUTE OF MICROELECTRONICS Faculty of Science and Technology DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING |
Corresponding Author | Un, Ka Fai |
Affiliation | 1. University of Macau 2. Instituto Superior Tecnico/University of Lisboa, Lisbon, Portugal |
Recommended Citation GB/T 7714 | Lin, Jinhai,Un, Ka Fai,Yu, Wei Han,et al. A 47nW Mixed-Signal Voice Activity Detector (VAD) Featuring a Non-Volatile Capacitor-ROM, a Short-Time CNN Feature Extractor and an RNN Classifier[C]:Institute of Electrical and Electronics Engineers Inc., 2023, 214 - 216. |
APA | Lin, Jinhai., Un, Ka Fai., Yu, Wei Han., Mak, Pui In., & Martins, Rui P. (2023). A 47nW Mixed-Signal Voice Activity Detector (VAD) Featuring a Non-Volatile Capacitor-ROM, a Short-Time CNN Feature Extractor and an RNN Classifier. Digest of Technical Papers - IEEE International Solid-State Circuits Conference, Volume 2023-February, 214 - 216. |
Files in This Item: | There are no files associated with this item. |