A few years back I ported Haskell style monadic parsing to Java, basically as a shit post entry to an event at work. I sadly don't have the code since it was a work thing, but I got it to work.
... at tremendous cost to my sanity. I've later been told I may have inadvertently broken two of the seven seals.
haha timely comment for me. I'm currently working on a project where I have no choice but to target a limited form of lua 5.1, but have total freedom within that constraint.
Was considering what it would take to write a small ML that compiles to lua. "Almost certainly more than I'm willing to put in" was my tentative unresearched conclusion but this validates it a bit.
Unfortunately, I haven't found a ton of "easily-digestible" and, at the same time, comprehensive guides on compiling functional languages. Generally you'll find a mix of blog posts/class notes/papers covering a single step.
Some resources I like:
- Andrew Kennedy's 2007 paper Compiling with Continuations, Continued [1]. This one is the most clear IMO
- Andrew Appel's Compiling with Continuations book (a bit outdated though... assembly code is for VAX)
I spent a few hours with this list a few weeks ago and nah. They're all abandoned, or undocumented, or incomplete, or target luajit, or intensely nonstandard, or surface the underlying lua behavior I'm trying to avoid. Usually a mix of some of those.
The lisp ones look solid and I love lisp, but for this specific thing a good type system would be most valuable.
There's a lot more to writing a compiler/interpreter than parsing, and I would almost posit that parsing is the _least_ interesting part of writing a compiler.
But what this library and some others are aiming for is the NEXT level of parsing difficulty: Parsers that work AND also provide high-fidelity output for SEVERAL consumers (IDE, editors, linters, ...) AND provide high-quality errors (aka: diagnostics) AND provide suggestion to fix that errors.
I agree that it's the least interesting part (at least for me), but a necessary part to be able to get to the juicy stuff. Getting a quick parser ready helps keep me wanting to code since I don't have to trudge though to get to the fun part.
Can you tell any other part of the compiler ecosystem that helps you write a new programming language in 1 hour?
This is supposed to be a fun article for experimentation in programming languages, not a new advancement in a garbage collection algorithm for increasing the efficiency of a datacenter by 2%.
My favourite new programming language is Svelte for example (just a parser that emits javascript), it's just a little different from HTML/CSS/JS, but just enough to make web development fun again for me.
Rust already has Nom for parser combinators, though. This library looks fine (at first glance) for a parser library, but it doesn't seem like it helps with building languages beyond just parsing (which is what I expected from the title).
I mostly agree, this just seems like a case of a weird description in the HN submission. As it is... it's a parser library, the Github project calls it a parser library, and that's fine.
A programming language implementation is either a compiler or interpreter. Title says "write your own programming language..." which implies an implementation. But that's not the case. As other commenters point out parser is a tiny, usually inconsequential part of a lang implementation.
Now do indentation-based. All parser libraries tend to trip on this small, yet widely relevant, feature. To be honest, most parser libraries should at least include a Python sample as some sort of acid test.
I have never used Chumsky before but the tutorial has the following to say about whitespace:
> Whitespace is not automatically ignored. Chumsky is a general-purpose parsing library, and some languages care very much about the structure of whitespace, so Chumsky does too
TL;DR: The best approach, in my view, is to detect whitespace in the lexer and emit fake delimiter tokens for the parser so that the parser-level syntax is still context-free.
Wow, Chumsky (and Ariadne!) looks really powerful. Got me wondering if I should port my language off of my hand-written parser; I haven't even begun to wrestle with parsing-error recovery.
Likely controversial opinion: Parser combinators are fun but feel a little too magic for me.
Maybe I’m being stubborn, but I prefer to write lexers and parsers by hand. Although, I would like to get around to writing a “common lexer and parser utility library” one day to abstract away some common patterns while simultaneously not forcing design decisions. Perhaps I’ll end up with a parser combinator library by the time I’m done and I’ll have to eat my words.
I tend to agree. If you're using an imperative language then the basic primitives of structured programming (loops, conditionals and functions) are all you need to write a recursive descent parser. Combinators like Parsec's 'sepBy' can make your parser more concise, but at the cost of making it more difficult to reason about performance and more difficult to read for anyone who hasn't memorized all the combinators.
Most parser combinator libraries give you lots of help with the easy stuff (e.g. parse a list of integers separate by commas), but not so much help with the hard stuff like generating good error messages or avoiding unbounded backtracking.
My impression is that he’s pretty controversial politically but that his contributions to linguistics/theory of grammars are generally seen as very impactful. And I imagine that in this case the reference is more related to the latter (?)
Curious to find out more about critiques on his work though (in all spheres)
One reason is that his fame among culture warriors on youtube is related to his third act (or arguably fourth). His first act in Linguistics, back in the 1950s, is relevant to Computer Science (eg: "Chomsky Normal Form")
As the author of the library, I'd like to note that I consider some of Chomsky's views to be... questionable. The choice of name was largely down to his work in the world of linguistics and parser theory.
anarchy and libertarian socialism is particularly popular among the American left. Though I think OP is from UK. Either way as a individualist and capitalist, I rolled my eyes.
Even among the right, I think anarchism and libertarian socialism are viewed more sympathetically than authoritarian leftist tendencies. Much like how the left is less vitriolic (in principle, at least) towards capitalist libertarians than they are to fascists.
The title here on HN seems kinda misleading; this is a parser library (and describes itself as that in the readme and github description), which isn't the only thing needed for writing a language.
Yes, the library is only a parser, but the tutorial does cover using the parsed structure for interpretation as well. I think it's mostly a fair title but could probably use the word "tiny" in it (as in, "write your own tiny programming...").
Don't let that mislead you: This is much more than a simple parser library. It's really a big chunk of what a compiler does, as the title advertises. Yeah, you need to handle the actual runtime part, of course.
I'm looking at the API docs and just seeing parser combinators and parsing-related stuff, like reporting/handling/recovering from parse errors. That's very handy to have, certainly, but I feel like I'm missing something if there's supposed to be support for implementing type-checking, nontrivial semantics, optimization, codegen, etc.
... at tremendous cost to my sanity. I've later been told I may have inadvertently broken two of the seven seals.