
I'm trying to make a neural audio codec using a variety of misguided methods. In one, I am using ESNs (echo state networks) wrong, spreading the leak rates logarithmically so the reservoir acts like a digital cochlea. In the other, I am trying to do the same with a complex mass-spring-damper system that simulates the various hairs of the cochlea. Both approaches make super interesting visuals and appear to cluster reasonably well, but I am still learning about RVQ and audio loss (which involves GANs and spectral loss). I kinda wanna beat SNAC if I can.
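
For anyone wondering what "leak rates spread logarithmically" could look like, here is a minimal sketch of a leaky-integrator ESN in Python with NumPy. It is not the code from the repo linked further down: the function names, reservoir size, leak range, and spectral radius are all assumptions picked for illustration. Units with a small leak rate respond slowly and units with a leak rate near 1 respond fast, which is roughly the cochlea-like spread of time scales described above.

```python
import numpy as np


def make_reservoir(n_units=64, min_leak=1e-3, max_leak=1.0,
                   spectral_radius=0.9, seed=0):
    """Build a random reservoir (all default values are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    # One leak rate per unit, spaced evenly on a log scale.
    leak = np.geomspace(min_leak, max_leak, n_units)
    # Random input weights and recurrent weights.
    w_in = rng.uniform(-1.0, 1.0, size=n_units)
    w = rng.normal(size=(n_units, n_units))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return leak, w_in, w


def run_reservoir(signal, leak, w_in, w):
    """Drive the reservoir with a 1-D signal and return every state."""
    n_units = leak.shape[0]
    x = np.zeros(n_units)
    states = np.empty((len(signal), n_units))
    for t, u in enumerate(signal):
        pre = np.tanh(w_in * u + w @ x)
        # Leaky integration: each unit blends its old state with the new
        # activation at its own rate.
        x = (1.0 - leak) * x + leak * pre
        states[t] = x
    return states


if __name__ == "__main__":
    sample_rate = 16_000
    t = np.arange(1_000) / sample_rate
    tone = np.sin(2 * np.pi * 440.0 * t)  # 440 Hz test tone
    leak, w_in, w = make_reservoir()
    states = run_reservoir(tone, leak, w_in, w)
    print(states.shape)
```

The `states` array is what you would plot for the visuals or feed to a quantizer; nothing here is trained, only the readout or codebook on top would be.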


Do you have a log available somewhere?


I keep everything in my self-hosted Gitea. Just made it public.

https://gitter.swolereport.com/robviren/cspace


Thanks, I’ll check it out

Edit: timed out


Reminds me of https://github.com/RobViren/kvoicewalk where people take voice clips and train a text-to-speech voice using random walks.

Not related, misguided methods :D


Well, it’s the same author so it is kind of related.



