Nov 11, 2022
That's because we don't want to average samples across the different channels (and risk losing some signal in the data) just to convert the audio into a 1-D array, especially when HuBERT can handle multi-dimensional audio arrays.
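To make the "losing some signal" point concrete, here is a minimal sketch with NumPy. The signal is synthetic and deliberately extreme (the right channel is a phase-inverted copy of the left), and the variable names are just for illustration, not from the original post. Real recordings rarely cancel this completely, but partial cancellation follows the same logic.

```python
import numpy as np

# Hypothetical 1-second stereo clip at 16 kHz (synthetic, for illustration only).
sr = 16_000
t = np.arange(sr) / sr
left = np.sin(2 * np.pi * 440 * t)   # 440 Hz tone in the left channel
right = -left                        # phase-inverted copy in the right channel

# Keep both channels: shape is (channels, samples).
stereo = np.stack([left, right])

# Downmix to mono by averaging across the channel axis.
mono = stereo.mean(axis=0)

peak_stereo = float(np.abs(stereo).max())
peak_mono = float(np.abs(mono).max())

print("stereo shape:", stereo.shape, "peak:", peak_stereo)
print("mono shape:  ", mono.shape, "peak:", peak_mono)
```

Here the averaged 1-D array is silent even though each channel carries a clear tone, which is the kind of loss that keeping the multi-dimensional array avoids.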
Senior Data Scientist | Explain like I am 5 | Oxford & SFU Alumni | https://podurama.com | Top writer on Medium