PyGaborSTM

PyGaborSTM is a Python library that extracts Rate-Scale-Frequency (RSF) representations from audio signals using bio-inspired auditory spectrograms and 2D Gabor filterbanks, following Chi, Ru & Shamma (2005) and Bellur & Elhilali (2017).

What it does

audio ──▶ AuditorySpectrogram ──▶ GaborFilterbank ──▶ RSF
          (n_freq × n_time)                           (n_frames × n_rates × n_scales × n_freq)
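
To make the shapes in the diagram concrete, here is a minimal NumPy-only sketch of the Gabor stage. It is not PyGaborSTM's implementation or API: the helper names `gabor_kernel_2d` and `toy_rsf`, the normalized units (cycles per frame for rate, cycles per channel for scale), the Gaussian envelope widths, and the choice n_frames = n_time are all assumptions made for illustration.

```python
import numpy as np

def gabor_kernel_2d(n_freq, n_time, rate, scale):
    """Hypothetical complex 2D Gabor kernel: a plane wave under a Gaussian envelope.

    rate  -- temporal modulation in cycles per time frame (nonzero)
    scale -- spectral modulation in cycles per frequency channel (nonzero)
    """
    f = np.arange(n_freq) - n_freq // 2          # channel offsets, zero at the center
    t = np.arange(n_time) - n_time // 2          # frame offsets, zero at the center
    F, T = np.meshgrid(f, t, indexing="ij")      # both (n_freq, n_time)
    sigma_f = 0.5 / abs(scale)                   # assumed envelope widths
    sigma_t = 0.5 / abs(rate)
    envelope = np.exp(-0.5 * ((F / sigma_f) ** 2 + (T / sigma_t) ** 2))
    carrier = np.exp(2j * np.pi * (scale * F + rate * T))
    return envelope * carrier

def toy_rsf(spec, rates, scales):
    """Filter a (n_freq, n_time) spectrogram with every (rate, scale) pair.

    Returns magnitudes with shape (n_time, n_rates, n_scales, n_freq).
    """
    n_freq, n_time = spec.shape
    spec_fft = np.fft.fft2(spec)
    out = np.empty((n_time, len(rates), len(scales), n_freq))
    for i, rate in enumerate(rates):
        for j, scale in enumerate(scales):
            kernel = gabor_kernel_2d(n_freq, n_time, rate, scale)
            # Move the kernel center to the origin, then multiply in the
            # Fourier domain (equivalent to a circular 2D convolution).
            kernel_fft = np.fft.fft2(np.fft.ifftshift(kernel))
            response = np.fft.ifft2(spec_fft * kernel_fft)
            out[:, i, j, :] = np.abs(response).T
    return out

# Stand-in for an auditory spectrogram: 16 channels by 32 frames of noise.
spec = np.random.default_rng(0).random((16, 32))
rsf = toy_rsf(spec, rates=[0.05, 0.1], scales=[0.1, 0.2, 0.25])
print(rsf.shape)
```

The output axes follow the order in the diagram: frames first, then rates, scales, and frequency channels.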

CPU or GPU: drop-in NumPy/CuPy backend.
Memory-adaptive Gabor stage: caches kernel FFTs when memory allows and falls back to streaming otherwise.
Custom batched-SOS CUDA kernel for the cochlear filter stage.
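
The first two points can be illustrated with a small sketch. It assumes nothing about PyGaborSTM's internals: `get_backend`, `ToyFilterbank`, and the `memory_budget_bytes` parameter are hypothetical names, and the byte estimate simply assumes complex128 kernel FFTs. The array module is passed around as `xp`, so the same code runs on NumPy or, if it is installed, CuPy.

```python
import numpy as np

def get_backend(prefer_gpu=False):
    """Return CuPy if requested and importable, otherwise NumPy."""
    if prefer_gpu:
        try:
            import cupy as cp
            return cp
        except ImportError:
            pass
    return np

class ToyFilterbank:
    """Hypothetical filterbank that caches kernel FFTs only if they fit a budget."""

    def __init__(self, kernels, memory_budget_bytes, xp=np):
        self.xp = xp
        self.kernels = [xp.asarray(k) for k in kernels]
        # Assumed cost model: one complex128 value (16 bytes) per kernel sample.
        needed = sum(k.size for k in self.kernels) * 16
        self.cached = needed <= memory_budget_bytes
        # fft2 transforms the last two axes, so a (K, n_freq, n_time) stack works.
        self.kernel_ffts = xp.fft.fft2(xp.stack(self.kernels)) if self.cached else None

    def apply(self, spec):
        xp = self.xp
        spec_fft = xp.fft.fft2(xp.asarray(spec))
        if self.cached:
            # Cached path: one broadcast multiply against all stored kernel FFTs.
            return xp.abs(xp.fft.ifft2(spec_fft[None] * self.kernel_ffts))
        # Streaming path: recompute one kernel FFT at a time to bound memory use.
        out = xp.empty((len(self.kernels),) + spec_fft.shape)
        for i, kernel in enumerate(self.kernels):
            out[i] = xp.abs(xp.fft.ifft2(spec_fft * xp.fft.fft2(kernel)))
        return out

xp = get_backend(prefer_gpu=False)
rng = np.random.default_rng(0)
kernels = rng.random((4, 8, 16))      # four toy kernels, same shape as the input
spec = rng.random((8, 16))

cached_bank = ToyFilterbank(kernels, memory_budget_bytes=10**9, xp=xp)
streaming_bank = ToyFilterbank(kernels, memory_budget_bytes=0, xp=xp)
same = bool(xp.allclose(cached_bank.apply(spec), streaming_bank.apply(spec)))
print(cached_bank.cached, streaming_bank.cached, same)
```

Both paths compute the same responses; the budget only decides whether kernel FFTs are kept between calls or recomputed on each one.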

Where to go next

Getting Started — install and run the pipeline on one audio file.
API Reference — generated from the source docstrings.
GitHub repository — issues, PRs.

References

Bellur, A., & Elhilali, M. (2017). Feedback-driven sensory mapping adaptation for robust speech activity detection. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(3), 481–492.
Chi, T., Ru, P., & Shamma, S. A. (2005). Multiresolution spectrotemporal analysis of complex sounds. The Journal of the Acoustical Society of America, 118(2), 887–906.