oac's comments

oac · 2025-10-01T06:34:35 1759300475

It's done in Xet as a replacement for Git LFS: https://huggingface.co/blog/from-files-to-chunks

oac · 2025-05-18T19:11:53 1747595513

It runs a model with 260K params, so hardly a "Large" LM. Nevertheless, a cool project.

aaronharnly · 2025-05-18T23:57:18 1747612638

And also not exactly a Commodore 64, is it, since it requires an add-on with 30x the RAM. Still very cool and impressive though!

actionfromafar · 2025-05-21T12:02:39 1747828959

It's a borderline thing. The official Commodore REU only supported 8x the RAM, but you could modify it yourself to 32x. Creative Micro Designs also had the third-party 1750 REU, which supported 32x the RAM (2 megabytes).

So it is somewhat period accurate, albeit very expensive at the time.
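To make the multiples above concrete, here is a minimal arithmetic sketch. It assumes the stock Commodore 64's 64 KiB of RAM as the baseline; the helper name `expansion_kib` is made up for illustration.

```python
# Baseline: a stock Commodore 64 has 64 KiB of RAM.
C64_RAM_KIB = 64


def expansion_kib(multiple):
    """Size in KiB of an expansion given as a multiple of the stock RAM.

    Hypothetical helper, only here to spell out the arithmetic.
    """
    return C64_RAM_KIB * multiple


official_kib = expansion_kib(8)      # the "8x" official REU figure
modified_kib = expansion_kib(32)     # the "32x" modified / third-party figure
modified_mib = modified_kib / 1024   # the same size expressed in MiB
```

Under that assumption, 8x works out to 512 KiB and 32x to 2048 KiB, which matches the "2 megabytes" figure quoted in the comment.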

robertlagrant · 2025-05-21T13:01:11 1747832471

Imagine if someone had run that LLM on that hardware in the 1980s, though. Incredible!

(Probably couldn't have trained the model, but still.)

bunchofnumbers · 2025-05-21T13:10:53 1747833053

You'd have needed a ZX Spectrum for that!

oac · 2025-05-15T19:59:44 1747339184

I might be mistaken, but I think this is partly because of the undirected structure of RBMs, so you can't build a computational graph in the same way as with feed-forward networks.

alimw · 2025-05-16T10:21:41 1747390901

By "undirected structure" I assume you refer to the presence of cycles in the graph? I was taught to call such networks "recurrent", but it seems that term has evolved to mean something slightly different. Anyway, yeah: because of the cycles, Gibbs sampling is key to the network's operation. One still employs gradient descent during training, but the procedure to calculate the gradient itself involves Gibbs sampling.

Edit: Actually, I was talking about the general Boltzmann machine. For the restricted Boltzmann machine, an approximation has been assumed which obviates the need for full Gibbs sampling during training. Then (quoting the article, emphasis mine) "after training, it can sample new data from the learned distribution using Gibbs sampling."
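A rough sketch of what that can look like in code, assuming the approximation meant here is one-step contrastive divergence (CD-1), which the comment does not name. All function names, the learning rate, and the binary-unit setup are illustrative assumptions, not taken from the article.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, b_h, rng):
    """Sample all hidden units at once, given the visible units."""
    probs = [sigmoid(b_h[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(b_h))]
    return [1 if rng.random() < p else 0 for p in probs], probs


def sample_visible(h, W, b_v, rng):
    """Sample all visible units at once, given the hidden units."""
    probs = [sigmoid(b_v[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b_v))]
    return [1 if rng.random() < p else 0 for p in probs], probs


def cd1_update(v0, W, b_v, b_h, rng, lr=0.1):
    """One CD-1 step: a single Gibbs sweep stands in for a long chain."""
    h0, ph0 = sample_hidden(v0, W, b_h, rng)   # positive phase (data)
    v1, _ = sample_visible(h0, W, b_v, rng)    # one-step reconstruction
    _, ph1 = sample_hidden(v1, W, b_h, rng)    # negative phase
    for i in range(len(b_v)):
        for j in range(len(b_h)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
    for i in range(len(b_v)):
        b_v[i] += lr * (v0[i] - v1[i])
    for j in range(len(b_h)):
        b_h[j] += lr * (ph0[j] - ph1[j])


def gibbs_sample(W, b_v, b_h, rng, steps=100):
    """After training: alternate hidden/visible sampling to draw a sample."""
    v = [rng.randint(0, 1) for _ in b_v]
    for _ in range(steps):
        h, _ = sample_hidden(v, W, b_h, rng)
        v, _ = sample_visible(h, W, b_v, rng)
    return v
```

To try it, initialize `W` as a list of rows (one per visible unit) with zero biases, call `cd1_update` repeatedly on binary training vectors, then call `gibbs_sample`. Because an RBM has no visible-visible or hidden-hidden links, each layer can be sampled in one block given the other, which is what keeps a single sweep cheap.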

oac · 2025-05-15T19:05:24 1747335924

Nice and clean explanation!

It brings up a lot of memories! Shameless plug: I made a visualization of an RBM being trained years ago: https://www.youtube.com/watch?v=lKAy_NONg3g

oac · on Oct 23, 2024

This doesn't make motion capture obsolete: 1) mocap can be applied to rigged characters, and 2) mocap can animate full-body rigs, not just facial expressions.

oac · on March 1, 2017

use larger machine learning models in the browser