
It's not even certain that they are few. What's rather unsettling is that, with these local moves of SGD, the parameters settle on a good-enough local minimum despite the fact that we know many local minima exist with zero or near-zero training loss. There are glimmers of insight here and there, but the thing is yet to be fully understood.
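
To make one of those glimmers concrete, here is a toy sketch, not a neural network: a linear model with 3 weights and only 2 training samples, so there is a whole line of weight vectors with zero training loss. The data, learning rate, step count, and helper names (`sgd_underdetermined`, `loss`, `null_dir`) are all made up for illustration. Under those assumptions, plain SGD started from zero only ever moves along the training inputs, so it ends up at the zero-loss point with no component along the "free" direction, which is the smallest-norm one. Whether anything like this carries over to deep nets is exactly the open part.

```python
import math
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def loss(w, xs, ys):
    """Mean squared error of the linear model w on the training set."""
    return sum((dot(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def sgd_underdetermined(xs, ys, lr=0.05, steps=20000, seed=0):
    """Plain SGD (one random sample per step) from a zero initialization."""
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    for _ in range(steps):
        i = rng.randrange(len(xs))
        err = dot(w, xs[i]) - ys[i]
        # Gradient of (w.x - y)^2 with respect to w is 2 * err * x,
        # so every update is a multiple of a training input.
        w = [wj - lr * 2.0 * err * xj for wj, xj in zip(w, xs[i])]
    return w


# Hypothetical toy data: 2 samples, 3 weights (more parameters than data).
xs = [[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]]
ys = [1.0, 2.0]

# Direction orthogonal to both inputs: moving along it never changes the loss.
null_dir = [2.0, -1.0, 1.0]

w_sgd = sgd_underdetermined(xs, ys)

# Another zero-loss solution, obtained by sliding along the free direction.
w_other = [wj + 3.0 * nj for wj, nj in zip(w_sgd, null_dir)]

print("SGD solution:", w_sgd, "loss:", loss(w_sgd, xs, ys))
print("Other solution:", w_other, "loss:", loss(w_other, xs, ys))
print("norms:", math.sqrt(dot(w_sgd, w_sgd)), math.sqrt(dot(w_other, w_other)))
```

Both weight vectors fit the training data, but SGD never visits the second one from this starting point. In this convex toy every zero-loss point is a global minimum, so it only illustrates the "which of the many minima do the local moves pick" part of the question, not the non-convex part.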

