These days, it feels like the sole use of User-Agent is as a weak defence against web scraping. I've written a couple of scrapers (legitimate ones, for site owners that requested machine-readable versions of their own data!) where the site would reject me if I did a plain `curl`, but as soon as I hit it with `-H "User-Agent: [my Chrome browser's UA string]"`, it'd work fine. Kind of silly, when it's such a small deterrent to actually-malicious actors.
(Also kind of silly in that even real browser-fingerprinting setups can be defeated by a sufficiently-motivated attacker using e.g. https://www.npmjs.com/package/puppeteer-extra-plugin-stealth, but I guess sometimes a corporate mandate to block scraping comes down, and you just can't convince them that it's untenable.)
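For anyone who hasn't tried it, here is a rough Python equivalent of the `curl -H "User-Agent: ..."` trick described above. It's only a sketch: `BROWSER_UA` is a made-up placeholder rather than any particular browser's real string, and `build_request` / `fetch` are hypothetical helper names, not part of any library.

```python
import urllib.request

# Placeholder value (an assumption): substitute the UA string your own browser sends.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def build_request(url: str, user_agent: str = BROWSER_UA) -> urllib.request.Request:
    """Build a GET request that presents a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url: str) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(build_request(url), timeout=10) as resp:
        return resp.read()
```

That one header is the entire "bypass" for sites that only look at User-Agent, which is the point: the check costs a scraper a single line.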
Preventing scraping is an entirely futile effort. I've lost count of the number of times I've had to tell a project manager that if a user can see it in their browser, there is a way to scrape it.
Best I've ever been able to do is implement server-side throttling to force the scrapers to slow down. But I manage some public web applications with data that is very valuable to certain other players in the industry, so they will invest the time and effort to bypass any measures I throw at them.
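To make the throttling idea concrete, here is a minimal token-bucket sketch in Python. It isn't what I actually run in production; the class name, the `client_id` key (an IP address, an API key, a session) and the injectable `clock` are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Per-client token bucket: `rate` tokens per second, up to `capacity` in reserve."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._state = {}    # client_id -> (tokens, timestamp of last update)

    def allow(self, client_id: str) -> bool:
        """Return True if this client may make a request right now."""
        now = self.clock()
        tokens, last = self._state.get(client_id, (self.capacity, now))
        # Refill in proportion to elapsed time, capped at the bucket size.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._state[client_id] = (tokens - 1.0, now)
            return True
        self._state[client_id] = (tokens, now)
        return False
```

A request handler would call `allow(client_id)` and answer with HTTP 429 when it returns False. This only slows a scraper down per key; someone rotating through many IP addresses gets a fresh bucket each time.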
As a person who scrapes sites (ethically), I think it's impossible or pretty damn near impossible to prevent a motivated actor from scraping your website. However, I've avoided scraping websites because their anti-scraping measures made it not worth the effort of figuring out their site. I think it's still worth doing minimal things like minifying/obfuscating your client-side JS and using some type of one-time-use request token to restrict replayability. The difference between knowing that I can figure it out in 30 minutes vs 4 hours vs a few days is going to filter out a lot of people.
Of course, sometimes obfuscating how your website works can make it needlessly more complicated, so it's a trade off.
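For the one-time-use request token, something along these lines is what I have in mind. Treat it as a sketch under assumptions, not a description of any specific site's scheme: the `OneTimeTokens` name, the `nonce.expiry.signature` layout and the in-memory set of used nonces are all invented for the example.

```python
import hashlib
import hmac
import secrets
import time


class OneTimeTokens:
    """Issue HMAC-signed tokens that expire and can each be redeemed only once."""

    def __init__(self, secret: bytes, ttl: float = 60.0, clock=time.time):
        self.secret = secret
        self.ttl = ttl
        self.clock = clock
        # Redeemed nonces; a real deployment would need a shared store with expiry.
        self._used = set()

    def _sign(self, payload: str) -> str:
        return hmac.new(self.secret, payload.encode(), hashlib.sha256).hexdigest()

    def issue(self) -> str:
        """Create a token to embed in the page that makes the follow-up request."""
        payload = f"{secrets.token_hex(16)}.{int(self.clock() + self.ttl)}"
        return f"{payload}.{self._sign(payload)}"

    def redeem(self, token: str) -> bool:
        """Accept a token at most once, and only if it is authentic and unexpired."""
        try:
            nonce, expires, sig = token.split(".")
            expires_at = int(expires)
        except ValueError:
            return False  # malformed token
        expected = self._sign(f"{nonce}.{expires}")
        if not hmac.compare_digest(sig.encode(), expected.encode()):
            return False  # forged or tampered
        if self.clock() > expires_at or nonce in self._used:
            return False  # expired or replayed
        self._used.add(nonce)
        return True
```

A scraper can still load the page, pull out the token and use it, so this doesn't stop anyone determined. What it does is kill naive replay of a captured request, which raises the effort from "copy as cURL" to "write something that follows the page flow".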
Checking the user-agent string for scrapers doesn't work anyway. In addition to using dozens of proxies in different IP address blocks, archive.is spoofs its user agent to match the latest Chrome release and updates it often.