
That's because we don't want to average samples across the different channels (and risk losing some of the signal in the data) just to collapse the audio into a 1-D array, especially when HuBERT can handle multi-dimensional audio arrays.
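To make the risk concrete, here is a minimal sketch of what averaging across channels can do. It uses plain Python lists rather than a real audio file, and everything in it is an assumption for illustration: the `downmix_to_mono` helper is hypothetical, the clip is a synthetic 440 Hz tone, and the right channel is deliberately a phase-inverted copy of the left, which is the worst case for a mean-based downmix.

```python
import math


def downmix_to_mono(channels):
    """Average equal-length channels sample by sample (hypothetical helper)."""
    n_channels = len(channels)
    return [sum(samples) / n_channels for samples in zip(*channels)]


# Synthetic stereo clip (assumed values): 160 samples of a 440 Hz tone at 16 kHz.
sample_rate = 16_000
left = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(160)]
right = [-x for x in left]  # phase-inverted copy of the left channel

stereo = [left, right]          # layout: (channels, samples)
mono = downmix_to_mono(stereo)  # layout: (samples,)

peak_stereo = max(abs(x) for channel in stereo for x in channel)
peak_mono = max(abs(x) for x in mono)
print(f"peak per channel: {peak_stereo:.3f}, peak after averaging: {peak_mono:.3f}")
```

Both channels carry a clear tone, yet the averaged signal is silent because the two channels cancel each other out. Real recordings are rarely this extreme, but partial cancellation of the same kind is why keeping the channels separate can preserve more of the signal.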



Written by Dr. Varshita Sher

FTSE 100 Tech Leader 🚀 | Data Science & Generative AI | Explain like I am 5 | Oxford Alumni | 2x Top writer on Medium | Editor of Trusted Data Science @ Haleon
