
Notably, this isn't even close to realtime on an M4 Max.




True :)

After some performance improvements, it's realtime on my DGX Spark with an RTF of 0.416, and I'm now getting ~19.5 tokens per second. Check it out and see if it's better for you.
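
For anyone comparing numbers on their own hardware, here is a rough sketch of how RTF is usually computed. It assumes the common convention that RTF is processing time divided by audio duration, so anything below 1.0 is faster than realtime. The function names and the 4.16 s / 10 s timings are made up for illustration and are not measurements from this project.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent generating / duration of the audio produced (assumed convention)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def is_realtime(rtf: float) -> bool:
    """Below 1.0 means the audio is produced faster than it plays back."""
    return rtf < 1.0


if __name__ == "__main__":
    # Hypothetical timings: 4.16 s of compute for 10 s of audio.
    rtf = real_time_factor(4.16, 10.0)
    print(f"RTF: {rtf:.3f}, realtime: {is_realtime(rtf)}")
```

Time the generation call on your machine, divide by the length of the output audio, and you have a number that can be compared directly against the 0.416 above.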



