Hacker News | oac's comments

It's done in xet as a replacement for git lfs: https://huggingface.co/blog/from-files-to-chunks


It runs a model with 260K params, so hardly a "Large" LM. Nevertheless, a cool project.


And also not exactly a Commodore 64 is it, since it requires an addon with 30x the RAM. Still very cool and impressive though!


It's a borderline thing. The official Commodore REU only supported 8x the RAM (512 KB), but you could modify it yourself to 32x. Creative Micro Designs also sold a third-party 1750 REU that supported 32x the RAM (2 megabytes).

So it is somewhat period accurate, albeit very expensive at the time.
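For anyone checking the multiples above, here is a quick arithmetic sketch. It assumes the stock C64's 64 KiB as the baseline; the helper name `expansion_kib` is made up for illustration.

```python
C64_RAM_KIB = 64  # base RAM of a stock Commodore 64, in KiB

def expansion_kib(multiple: int) -> int:
    """Size in KiB of an expansion holding `multiple` times the base 64 KiB."""
    return C64_RAM_KIB * multiple

official_kib = expansion_kib(8)    # largest official REU mentioned above
modified_kib = expansion_kib(32)   # modified or third-party unit

# prints: 512 2048 2  (that is, 512 KiB, 2048 KiB, 2 MiB)
print(official_kib, modified_kib, modified_kib // 1024)
```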


Imagine if someone had run that LLM on that hardware in the 1980s, though. Incredible!

(Probably couldn't have trained the model, but still.)


You'd have needed a ZX Spectrum for that!


I might be mistaken, but I think this is partly because of the undirected structure of RBMs, so you can't build a computational graph in the same way as with feed-forward networks.


By "undirected structure" I assume you are referring to the presence of cycles in the graph? I was taught to call such networks "recurrent", but it seems that term has evolved to mean something slightly different. Anyway, yeah: because of the cycles, Gibbs sampling is key to the network's operation. One still employs gradient descent during training, but the procedure for calculating the gradient itself involves Gibbs sampling.

Edit: Actually, I was talking about the General Boltzmann Machine. For the Restricted Boltzmann Machine, an approximation is used that obviates the need for full Gibbs sampling during training. Then (quoting the article, emphasis mine) "after training, it can sample new data from the learned distribution using Gibbs sampling."
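To make the two roles of Gibbs sampling concrete, here is a minimal pure-Python sketch of a tiny binary RBM. It is an illustration, not the article's code. It assumes the approximation in question is one-step contrastive divergence (CD-1), and the sizes, learning rate, toy data, and function names are all made up.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def transpose(m):
    return [list(col) for col in zip(*m)]

def sample_layer(inputs, weights, biases, rng):
    """Sample binary units of one layer given the states of the other layer.

    `weights` has one row per unit being sampled. Returns (states, probs).
    """
    probs = [sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
             for row, b in zip(weights, biases)]
    states = [1 if rng.random() < p else 0 for p in probs]
    return states, probs

def gibbs_step(v, W, b_vis, b_hid, rng):
    """One block Gibbs step v -> h -> v'. W is shaped [n_hidden][n_visible]."""
    h, _ = sample_layer(v, W, b_hid, rng)
    v_new, _ = sample_layer(h, transpose(W), b_vis, rng)
    return v_new, h

def cd1_update(v0, W, b_vis, b_hid, rng, lr=0.1):
    """One CD-1 update (assumed approximation): a single Gibbs step
    stands in for sampling from the model's equilibrium distribution."""
    h0, ph0 = sample_layer(v0, W, b_hid, rng)
    v1, _ = sample_layer(h0, transpose(W), b_vis, rng)
    _, ph1 = sample_layer(v1, W, b_hid, rng)
    for j in range(len(W)):
        for i in range(len(v0)):
            # data-driven statistics minus one-step reconstruction statistics
            W[j][i] += lr * (ph0[j] * v0[i] - ph1[j] * v1[i])
        b_hid[j] += lr * (ph0[j] - ph1[j])
    for i in range(len(v0)):
        b_vis[i] += lr * (v0[i] - v1[i])

rng = random.Random(0)
n_vis, n_hid = 4, 2
W = [[rng.gauss(0, 0.1) for _ in range(n_vis)] for _ in range(n_hid)]
b_vis = [0.0] * n_vis
b_hid = [0.0] * n_hid
data = [[1, 1, 0, 0], [0, 0, 1, 1]]  # toy training patterns

for _ in range(200):          # training: one Gibbs step per update
    for v in data:
        cd1_update(v, W, b_vis, b_hid, rng)

sample = [1, 0, 0, 0]         # after training: run a longer Gibbs chain
for _ in range(10):
    sample, _ = gibbs_step(sample, W, b_vis, b_hid, rng)
print(sample)
```

Because the visible and hidden layers have no within-layer connections, each layer can be sampled in one block given the other, which is what keeps both the training step and the sampling chain cheap.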


Nice and clean explanation!

It brings up a lot of memories! Shameless plug: I made a visualization of an RBM being trained years ago: https://www.youtube.com/watch?v=lKAy_NONg3g


This doesn't make motion capture obsolete: 1) mocap can be applied to rigged characters, and 2) mocap can animate full-body rigs, not just facial expressions.


use larger machine learning models in the browser

