Who cares how large an npm library is? The issue is what you send to your clients, not how monstrous the one-time download on your integration server is.
If anyone really cared, they would fix the broken ecosystem that encourages people to redownload the same packages millions of times.
I picked a popular package at random: webpack. npm says version 5.88.2, released 3 months ago, has had 5,992,398 downloads in the last 7 days.
I don't know how anyone can look at that and see it as anything other than a massive failure.
Fast connections and free bandwidth have caused people to completely ignore the fact that every time some CI pipeline runs, npm goes off and downloads 100MB of dependencies. Dependencies that haven't changed since the pipeline last ran 30 seconds ago.
npm could fix this by aggressively rate limiting clients that have already downloaded the same package multiple times, but I guess as long as the VC funding is paying the bandwidth bill it's not a problem, and those "millions of downloads" make you look good.
> Fast connections and free bandwidth have caused people to completely ignore the fact that every time some CI pipeline runs, npm goes off and downloads 100MB of dependencies. Dependencies that haven't changed since the pipeline last ran 30 seconds ago.
Maybe it's just me, but I've always thought it was a well-known best practice to cache your deps[0].
I'm pretty certain that this can be achieved with most CI/CD tools.
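As a sketch of what that caching looks like in a CI job: persist npm's cache directory between runs, keyed on the lockfile. The `cache_restore`/`cache_save` commands here are hypothetical stand-ins for whatever your CI tool actually provides (e.g. a cache action or step); the npm flags themselves are real config options.

```shell
# Key the cache on the lockfile, so it's only invalidated when deps change.
KEY="npm-$(sha256sum package-lock.json | cut -c1-16)"

# Hypothetical CI helpers: restore/save a directory under a cache key.
cache_restore "$KEY" ~/.npm || true

# "npm ci" will reuse packages already in the cache instead of re-downloading.
npm ci --cache ~/.npm --prefer-offline

cache_save "$KEY" ~/.npm
```

With this in place, the registry is only hit when the lockfile actually changes, not on every 30-second pipeline run.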
I've seen a lot of pipelines that simply don't bother. Or maybe they tried, but the caching isn't working, and since the build works in the end, no one notices an extra 30 seconds.
The vast majority of those are from CI on ephemeral cloud instances.
Do you think CI should not be run?
Or CI should be run, but not on ephemeral cloud instances?
Or CI should be run on ephemeral cloud instances, but the packages should be cached using a separate service from npmjs.com (e.g. S3)? If so, what makes this other service preferable?
> Or CI should be run on ephemeral cloud instances, but the packages should be cached using a separate service from npmjs.com (e.g. S3)? If so, what makes this other service preferable?
Yes, you should vendor external dependencies.
A build should ideally not require internet access to complete.
> You've got a non-internet CI with non-internet source code repository with non-internet vendored dependencies??
Vendored dependencies are pulled down from an internal s3 bucket (and cached locally) before the build starts, the rest of the build runs with no internet access.
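Something like the following, as a minimal sketch: the bucket name and archive layout are made up for illustration, but `npm ci --offline` is a real flag that makes npm fail rather than touch the network.

```shell
# Fetch the vendored dependency archive from an internal bucket
# (bucket name and tarball layout are illustrative).
aws s3 cp s3://internal-build-cache/npm-deps.tar.gz .
tar -xzf npm-deps.tar.gz   # unpacks a pre-populated ./npm-cache directory

# Install strictly from the local cache; --offline errors out instead of
# reaching for registry.npmjs.org, so the build is provably hermetic.
npm ci --offline --cache ./npm-cache
```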
> In real terms that means a saving of around 1MB. That doesn’t sound like much, but at 4 million weekly downloads, that would save 4TB of bandwidth per week.
Yeah, who cares about 4TB/week? What is this, the '90s?!
I mean, they care to some degree. If they _really_ cared, presumably everything would be compressed with zstd and served to the more modern npm-cli installations, and npm-cli would refuse to upload binaries that are not explicitly allowlisted.
Lots. See the "npm node_modules blackhole" meme. For one practical reason, confirming the quality of the code in node_modules is so impossible a task it just isn't even attempted. So people are shipping code they have 0 knowledge of. For a paranoia-fueled reason, even devDep packages run the risk of being harmful (malware in the build process).