It looks way too long in terms of the number of distinct tokens.
This is the TXR Lisp interactive listener of TXR 233.
Quit with :quit or Ctrl-D on empty line. Ctrl-X ? for cheatsheet.
1> (defun merge-hashes (hlist)
     (let ((hout (hash)))
       (each ((h hlist))
         (dohash (k v h) (push v [hout k])))
       hout))
merge-hashes
2> (merge-hashes '(#H(() (a 1) (b 2) (c 3)) #H(() (b 5) (c 7) (d 8))))
#H(() (c (7 3)) (b (5 2)) (a (1)) (d (8)))
Now here is that function fully code golfed with unnecessary spaces removed and all symbols one character long:
(defun m(x)(let((o(hash)))(each((h x))(dohash(k v h)(push v[o k])))o))
That's down to 70 characters: only about double the K3 size.
If defun and the other built-ins were one character long, it would be down to 50 chars:
(d m(x)(l((o(H)))(e((h x))(D(k v h)(p v[o k])))o))
Now we have a fair comparison where we have leveled the field, eliminating the difference due to whitespace elimination and token condensation.
Though the Lisp version is still significantly longer by raw character count (35 versus 50), there is less clutter in it. Also, it defines the name m, whereas the {.+...} syntax needs a few more characters to define a named function, I think.
But wait; that's far from the shortest code necessary. What I wrote is a decently efficient way of doing it which iterates the input hashes imperatively and builds up the output. It can be done in other ways, like this:
We can obtain a flat "assoc list" of all the key value pairs from all the dictionaries by mappend-ing them through hash-alist. hash-alist retrieves an assoc list from a single dictionary, and we map over that, appending these together.
Then we can group-reduce the assoc list; group-reduce populates a hash by classifying elements from an input sequence into keys, which denote individual reduce accumulators. So all the a elements are subject to their own reduce job, and so are the b elements and so forth. The reduce function is (op cons (cdr @2) @1): an anonymous function that takes the cdr (value element) of each pair (that pair coming in as argument 2), and conses it onto the accumulator (coming in as argument 1), returning the new accumulator. Since nil is the empty list, and fetching a nonexistent hash key yields nil, the accumulation can bootstrap implicitly from an empty hash.
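Written out with full names, that approach looks like this (an untested sketch, reconstructed by expanding the one-character names in the golfed form that follows):

```lisp
;; Flatten all hashes into one assoc list with mappend/hash-alist,
;; then group-reduce: classify pairs by car (the key) and cons each
;; cdr (the value) onto that key's accumulator in a fresh hash.
(defun merge-hashes (hlist)
  [group-reduce (hash) car (op cons (cdr @2) @1)
                [mappend hash-alist hlist]])
```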
Now if we were to remove all unnecessary whitespace and give a one-character name to everything, we now get:
(d m(h)[R(H)c(o n(r @2)@1)[M L h]])
That shows there is a potential to get this down to 35 characters, with suitable function/operator and variable names.
Moreover, these 35 characters are yet easier to parse because all the parentheses/brackets are there: they comprise 14 out of the 35 characters! And we still have 5 spaces. So 19 characters out of the 35 are punctuation having to do with shaping the syntax; 16 are semantic.
If you remember that R is group-reduce, you know that its arguments are (H) c (o ...) and [M ...]. There is no guesswork about what goes with what.
The above K expression is actually verbose because it doesn't use very many characters for shaping the syntax tree. Most of the characters you see do something.
Languages like K leverage their terseness not just from compression of the input notation, but from the semantics of the operations. But they do not have a monopoly in semantics. Good semantics that enables terse programming can be found in languages that don't use terse notations at the token level.
The K3 example we have here is not leveraging good semantics; if that's the best that can be done, it suggests that k doesn't have a well-rounded library of operations for concisely manipulating dictionaries. (Could it be that the insistence on one-character names creates a pressure against that?)
It's better to have thousands of functions with descriptive names, and then be able to choose the best ones for expressing a given problem tersely, than to reach for one-letter naming as the primary means for achieving terseness, and then try to pull a one-size-fits-all library within that naming convention.
I think that k3 example does illustrate weaknesses of k3. At that time, dictionary types were very limited and inexpressive. Pretty much all you could do with them was index into them and destructure them.
In k5/k6, dictionaries had a more complete algebra. If you only wanted to take the union of a set of dictionaries (more common) it's just raze:
,/([a:1];[b:2];[a:3;c:4])
[a:3;b:2;c:4]
The full problem could be expressed as
{k!{x@&~^x}'x@'/:k:!,/x}
Note that there's no longer any clumsy transposition to assemble the dictionary; generally much less nesting as well. The syntax for creating symbol-keyed dictionaries is also much nicer.
Something that apparently I left on my workbench for oK was the "adverb form" of drop (_), which would let you express filter more directly:
I added some arguments to TXR Lisp's hash-uni (which merges two dictionaries) to be able to do this. It already has an optional join function to resolve values with clashing keys. Now it will take two additional functions to map the left and right values.
If we have a pair of dicts with values that we'd like to list, we can do [hash-uni h1 h2 append list list]. The values are turned into lists, and clashes resolved with append.
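For example, with the hashes from the earlier session (a sketch; the order in which the two map functions are passed, left then right, is assumed):

```lisp
;; Non-clashing keys a and d are simply listified; the clashing
;; keys b and c have their listified values joined with append.
[hash-uni #H(() (a 1) (b 2) (c 3)) #H(() (b 5) (c 7) (d 8))
          append list list]
```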
A list of hashes could be merged into lists with left-reduce, like this: the initial accumulator is an empty hash #H(), and the reduce kernel function is hash-uni. The values of every hash in hlist are turned into one-element lists, and then accumulated with append onto the lists in the hash merged so far.
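A sketch of that reduction, using the extended hash-uni (assuming reduce-left takes the function, the sequence and the initial value in that order, and that identity is acceptable as the left map function, since the accumulator's values are already lists):

```lisp
;; Fold hash-uni over the list of hashes: listify each incoming
;; value on the right, pass accumulated values through unchanged,
;; and resolve clashing keys with append.
(defun merge-hashes (hlist)
  [reduce-left (op hash-uni @1 @2 append identity list)
               hlist (hash)])
```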
It's more than vaguely monadic: the two functions are like "unit" and so on.
Yes, but your comparison isn't fair. How many distinct functions does TXR Lisp have? I guarantee you that k has less.
Moreover, if I can do something in a couple of characters, why do I need names? This trend towards huge stdlibs and intellisense and heavyweight IDEs is ridiculous. I can write k or q with notepad if I felt like it. The ability to understand the entire language and never have to reach for documentation is very powerful.
I've noticed you criticizing k and array languages in general in some other threads so I'm going to ask you a genuine question: do you have any experience with them? Not even professional experience, just like, spent a weekend programming something cool? Because before I tried them I was very dismissive and now I absolutely adore k.
A few more than it had a month ago. When you can name things in any way you want, the sky is the limit. You have to watch the image size and startup times, of course. Nowadays you can cram a lot into something that is tiny by modern standards.
> Moreover, if I can do something in a couple of characters, why do I need names?
Firstly, those characters are names.
Secondly, you need bigger names than one character if you want more functions, so then you have a better choice of functions for writing programs that are smaller in terms of the number of operations they combine.
The merge-hashes definition above doesn't introduce any names, other than the one it is defining, merge-hashes, and the local variable/parameter hlist.
Everything else is a built-in, resulting in "point-free" style.
If by "why do I need names" we mean locally introduced names for intermediate values, I completely agree. I don't need to introduce names if I can achieve the calculation just by a functional combination of operations.
I mean, I'm just saying that most functions are unnecessary. We don't need them, and they actually make things worse. For example, I just counted, and q has a total of 209 distinct functions in the language (and that's pretty large for some apl/k purists). And it's not like the language is lacking functionality, additional functions are simply not necessary. Compare that to really any other non-trivial language, and the difference is pretty stark. Like, look at Common Lisp, for example.