Transcript

Transcript prepared by Bob Therriault, Igor Kim and
Sanjay Cherian

Show Notes

Transcript

00:00:00 [Elias Mårtenson]

Even though I considered myself fairly knowledgeable in terms of how tacit expressions worked, I still always felt some kind of mental exhaustion every time I was reading a tacit expression.

00:00:25 [Conor Hoekstra]

Welcome to episode 72 of ArrayCast. I'm your host, Conor. And today with us, we have a special guest who we will get to in a few minutes with an introduction. But before we do that, we're gonna go around and do brief introductions from the panelists. We'll start with Bob, then go to Stephen, then to Marshall, and then finish with Rich.

00:00:38 [Bob Therriault]

I'm Bob Therriault. I am a J enthusiast, and I'm very interested in the language that we're gonna talk about today.

00:00:46 [Stephen Taylor]

I'm Stephen Taylor. I'm an APL and q enthusiast, and I'm looking forward to learning something about the language we're gonna talk about today.

00:00:54 [Marshall Lochbaum]

I'm Marshall Lochbaum. I've worked with other array languages, but now I prefer my own, which is BQN, and I also work on Singeli.

00:01:01 [Rich Park]

I'm Rich Park. I'm an APL programmer, and I work in media and outreach at Dyalog Limited.

00:01:05 [CH]

And as mentioned before, my name's Conor. I am a polyglot programmer, massive fan of all array languages, and also super excited to talk about Kap, spoiler. It was probably in the title today, but we'll get to that in a few minutes. And I really have been waiting to have this conversation since November, I think was when the first time I really learned a little bit more about Kap. But before we do that, I think we have two announcements, both from Rich, so I'll throw it over to him, and then we'll get into our conversation.

00:01:29 [RP]

Yeah, so if you're interested in APLs, to which Kap is similar, I don't know if we can yet say that it is one, we'll find out just now, but the APL Challenge, [01] the rebranding of the APL Problem Solving Competition, has just launched today as of recording, so by the time you're listening to this, it will be live. It's targeted very much at rank beginners, so very accessible, and you can enter and solve some problems in APL for a chance to win one of the $300 cash prizes. So if there's some incentive to get involved learning APL, then there it is. And then if you find you have fun with that, and you want to learn a bit more and keep discovering further, this is another reminder that on the 27th of March, Dyalog will be hosting the APL Seeds '24 online event. So it's gonna be live, online; the link to join will be coming out well before the actual event date. It's not there on the website yet, but there will be a link in the show notes to where you can find that. We'll be having a series of panel discussions with various APL users, professionals, and other people about topics ranging from getting started, your first steps learning APL, and things you might find challenging, through to using APL in academia, using APL to write scientific software, and things like using APL in industry. So what does it look like for a professional APL developer to use APL? So very excited for that coming up in a couple of months. Yeah, those are my announcements.

00:03:00 [CH]

Awesome, so as always, links will be in the show notes, both on our website and all of your podcast apps, as well as the transcript, shout out to our transcribers, they're fantastic. But without further ado, we are going to introduce, here's my best attempt, he will correct the pronunciation in a second, Elias Mårtenson, I think, and well, I definitely know that's his name, whether that's the pronunciation or not, we will be corrected in a sec. And he is the creator of the Kap array programming language. So we've mentioned this one or two times before on the podcast, I think. We know that unofficially or officially, Kap stands for Kotlin APL, maybe we can get the full story behind that from Elias. And this language was, I think, created around the same time as BQN. So it's one of the most modern array languages, it shares a lot of similarities with APL, but there's a bunch of differences as well. A few of them, I really hope we get to today, because like I said, I've been waiting to have this conversation and to ask them the questions that I have. I think before we throw it over to get your whole background of sort of how you got into computing, I'll start off with my first question, just to get it out of the way. Is Kap spelt with all uppercase letters, or is it, you know, Pascal case with a capital K and lowercase AP? Because depending on the APL wiki or the landing site or the GitHub repo, it's all different. And people have told me on Twitter, whenever I tweet, you know, I don't spell it the right way. So first things first.

00:04:21 [ML]

Whichever one you choose, right?

00:04:25 [CH]

Yeah. (laughs) What's the capitalization? And then maybe if you want after that, take us back to, you know, whenever you want and how you got to creating your own array language.

00:04:31 [EM]

Well, it was all uppercase and then I got tired of it. So now it's just the K being uppercase. I think it looks better when you write it. I, it's, yeah, that's pretty much it.

00:04:44 [ML]

And I did go into the APL wiki and do a text replace a week or so ago. So now that's all fixed.

00:04:51 [EM]

Yeah, I did the search and replace on my old blog posts and my own documentation as well, to try to make it clean.

00:05:00 [CH]

This makes me feel so much better too. So the real answer is at one point it was all uppercase, but now cutting edge, breaking news, it is capital K, lowercase AP. All right, awesome. So yeah, take us back to whenever you want to, how you got into computing and yeah, bring us up to, you know, 2024.

00:05:16 [EM]

You know, I kind of guessed that you were gonna ask that. So I thought to myself a couple of days ago, I wonder how long it will take me to talk. And then, 45 minutes later, I realized that I probably shouldn't go into that much detail. So I'll try to speed run through the relevant parts. I started out in '83, '84, programming on a Commodore 64, doing BASIC and then 6502 assembly. And I think that's actually quite interesting, because starting out with BASIC, it's a REPL-based development environment, especially the old BASICs on the 8-bit systems. And I realized that REPL-based development is certainly something I prefer, which is why I did a lot of Lisp and now APL. Then I moved to Atari ST, Motorola 68000 assembly, some PC, Pascal, C, C++. Then I joined a computer club with a lot of Unix. So I started using Emacs, and Emacs Lisp, of course, and all the scripting languages that go hand in hand with Unix. Then I went to work with Unix; I was working for Sun Microsystems, so obviously a lot of Java there. And I've also been doing a bunch of Common Lisp, some Erlang, some Elixir. So I have reasonable breadth in terms of my programming language experience, but you might notice a distinct lack of array languages in that list. I was thinking back to when I actually learned about the existence of APL for the first time, and it must have been at the computer club sometime in the early '90s. And like most computer clubs at the time, they had a big stack of old hardware. And one of those pieces of hardware was an old terminal, and the terminal had APL symbols on it. And I remember someone telling me about it, and he said, "Oh, oh, have you seen that? Those symbols, those crazy, crazy things, horrible language, unreadable." We've all heard those things, right? I mean, completely unreadable, you can write an entire program in a single line, but no one will be able to understand it afterwards. I can't remember who it was, but if it was his goal to stop me from being interested, it utterly failed. I was incredibly fascinated. But I didn't have any information about it.

00:07:48 [EM]

Around '91, there were not many opportunities to actually run any APL or any other kind of array programming. So I think the first array programming language I touched must have been A+. [02] And I do remember installing it at the computer club there, having a lot of trouble, mainly because it is a little bit of a hassle to use, because it's not Unicode, of course. So you had to install fonts, and it was messy, and something as simple as copying and pasting from one place to another, you can't read what you wrote because the characters are the way they are. After that, again, a lull, if you like, and then it must have been APL2 and NARS2000. That's where I started actually learning APL. And it was interesting, but those two products have the problem of running on Windows only, and I'm not a Windows guy. So whenever I rebooted my machine into Windows to play a game, I played around a little bit with APL, enough to learn a little, but I didn't do anything with it. And APL2, of course, was a demo version at the time. I don't think you can even download it anymore, but at the time, I think it was like one month, two months, something like that. So again, a lull until GNU APL showed up. And that was really the thing, really cool. It was open source, it was primarily Linux, and it had everything you needed except for one thing, and that was an easy way to input the characters. And the general recommendation at the time was, well, you have this special keysym file you can install on your Linux system and that will work, which I didn't want to do because I had my own personal customized keysym file at the time. So I did what any other enterprising Emacs user would do: you write an Emacs mode for it. So I did that, and that became what is today available as GNU APL mode for Emacs. And once I implemented the basic input, I took it a little bit further, took ideas from the SLIME development environment for Common Lisp, which allows the editor to communicate directly with the underlying interpreter, so that, for example, you can get command completion and all of those things. And to do that, I needed to be able to talk directly to GNU APL, and there was no way to do it at the time. So I had to learn how GNU APL worked, so I could implement the protocol that Emacs could use to talk directly to the interpreter to get this information, because if you open a window in Emacs that shows the content of a variable, when the variable changes, you want that to update in the editor. So you need some way of pushing information from the interpreter to Emacs. So I implemented that, and it ended up getting merged into GNU APL proper. And once I had learned how the internals worked (and the internals of GNU APL are actually quite nice: easy to work with, pretty straightforward, cleanly written code), I implemented some other things that I needed. Primarily, I needed to be able to talk to SQL. I wanted to specifically talk to Postgres, so I implemented a SQL API for GNU APL, which is also part of GNU APL today, and which supports Postgres and SQLite. So yeah, I was hacking on GNU APL for a while. And when you hack on an APL interpreter, you inevitably start to see things you want to change. And the changes that I wanted to make were quite, shall we say, what's the word, invasive.
And I wanted to make very fundamental changes to the underlying architecture of the platform, which wasn't really compatible with the path that GNU APL was taking, because they focus very much on being an APL2-compatible system, and I wanted to experiment with other things. So, again, I did what any enterprising programmer does, and that's start your own project. And so, yeah, I guess that's the origin, if you like, of Kap. I had a list of things that I wanted to do with it. And the one thing that got me started was the idea of doing certain types of parallelism. GNU APL has the ability to parallelize, especially scalar operations. So let's say you have an array of a million elements, and you have two cores. What it does, or can do, if you enable it (at the time you had to enable it when you compiled; I don't know what the default is now, because I haven't used it in a while), is that it performs one half of the operations on one core and the other half on the other core, and then it collects everything into the resulting array, and that improves performance. Now, for various reasons, you can get maybe five, six times performance increase as you add more cores. I ran some tests on a 128-core machine, or 192, I can't remember, lots of cores. And yeah, it sort of peaked at about six cores of parallelism. And actually, I'm not 100% sure what the real reason for that is, but one of the reasons I hypothesized was this: let's say you have a big array, a gigabyte in size, and you put that in variable X, and then you write the expression one plus two plus three plus four plus X. What GNU APL does is that it performs the four plus X in parallel across all the cores. And I think you're going to run into some memory issues there as well; there is going to be too much hitting the memory at the same time. I'm not too good with memory optimization for multiprocessing, but I think that's one of the things that happens. Anyway, after it performs that addition, there's a sequence point where it collects all the results, gets an output array, and then it starts all over again and adds three, and then performs all of that again, adds two, and then one. So I was thinking, well, surely you must be able to optimize things by parallelizing differently. You take one cell, you add four plus three plus two plus one, put that in the output, and then the next cell, and then you can parallelize that. That's gotta be faster. So I was thinking about various ways of doing that. And I had some ideas how to do it in GNU APL. I did some proofs of concept, but like I said, I would have needed a huge amount of changes to the underlying architecture to do it. And I think GNU APL wasn't really suited to that, because I wanted to do it in ways that weren't really compatible with the implementation. So I was thinking of another way to approach this. Instead of trying to write an optimizer that identifies these situations, like one plus two plus three plus four plus X, how about I don't perform the computation at all? Instead of doing the computation, I simply return an object that represents a future result. And then if I do three plus that, then I can merge them together or not, and just stack them. And then you end up with a stack of lazily evaluated cells.
And then I noted, hey, wait a minute: what if, after I set up this computation, I drop half of the results for various reasons, maybe I do some filtering or searching for the first result? Then I don't even have to evaluate those cells at all. So that was interesting enough for me to write a small prototype. I had a few false starts, but then I implemented it; this would have been late 2019, early 2020. It was just before COVID, so around there. And I rather quickly had a proof of concept running without even a parser. In order to test code, I had to write the syntax tree manually in code, but it was enough to see that the idea seemed to work. So the next step, of course, was to write the parser. So I did that. And then I needed to be able to input easily, and I had the same problem as when I started using GNU APL, that I didn't have a good way to input the characters. So I wrote the UI for it. And at that point, the snowball was rolling already. Then COVID hit, and like many others, I tended to focus very much on my hobby project during that time. So coming out of COVID, I had pretty much a fully functional APL work-alike. And that was an interesting base that I could then put more stuff on top of. And that's what I've been doing for the last however long. And I'm not done yet, because as we'll probably get into, there are plenty of things that are not really complete and where work is needed. But I would say that the core of the system is there, and it's usable at the moment. So I've been working a bit lately on trying to make the onboarding experience nicer, getting the user interface better. I'm not a very good designer, so it's hard; especially the web version is really difficult to work on, because, yeah, I'm not good at web at all. So, working on documentation and things like that as well. So if you have questions, I'll be happy to answer, but I think I've been doing a monologue for a bit too long.
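
To make the idea concrete, here is a minimal Kotlin sketch of the deferred, per-cell evaluation Elias describes. The names (APLValue, LazyScalarOp, and so on) are hypothetical and far simpler than Kap's actual internals:

    // A value whose cells are computed on demand. Instead of materializing
    // 1+(2+(3+(4+x))) as four intermediate arrays, each scalar operation
    // wraps the previous value with a per-cell function.
    interface APLValue {
        val size: Int
        fun cellAt(i: Int): Long
    }

    class EagerArray(private val cells: LongArray) : APLValue {
        override val size get() = cells.size
        override fun cellAt(i: Int) = cells[i]
    }

    class LazyScalarOp(
        private val parent: APLValue,
        private val op: (Long) -> Long
    ) : APLValue {
        override val size get() = parent.size
        // Nothing is computed until someone asks for this cell, so cells
        // that are later filtered away are never evaluated at all.
        override fun cellAt(i: Int) = op(parent.cellAt(i))
    }

    fun add(k: Long, v: APLValue): APLValue = LazyScalarOp(v) { it + k }

    fun main() {
        val x = EagerArray(LongArray(1_000_000) { it.toLong() })
        val r = add(1, add(2, add(3, add(4, x)))) // no arrays built yet
        println(r.cellAt(0)) // 10; only this one cell was ever computed
    }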

00:18:00 [CH]

No, this is... we've had, I don't know how many, maybe 10 or 20 of our guests, a large fraction of them, always say, "Oh, I've gone on for too long," but literally I'm sitting here and it's just absolutely fantastic. It's everything you hoped for, going back to the '80s. We got BASIC, we got all the functional languages. You might be the first person to ever mention Erlang and Elixir, those kinds of languages, on this podcast; we don't talk about them much here, but they're all fantastic. So the first thing I'll say is, you've mentioned both the lazy evaluation and the parallel evaluation. Are those what you would say are the two biggest differences from the existing APLs, Js, BQNs? And then the second question is, it sounded like you said you got up to 6x performance, regardless of the number of cores. When you were benchmarking that, was that versus GNU APL, or were you benchmarking it versus a bunch of different targets?

00:18:51 [EM]

No, so that was GNU APL itself, right? So I got 6x in GNU APL. In Kap, I haven't hit the limit. I only have a 16-core machine here, and if I run my test program, which is a Mandelbrot, I saturate those CPUs, and I get exactly 16x performance on a 16-core machine. But that is because of one thing that may be controversial. If you read modern computer science papers, there's a lot of focus on type safety and ensuring that you cannot do the wrong thing. And the state of the art in at least some research today is to try to get type-safe multiprocessing working, so that you can guarantee that even your multi-threaded program is doing the right thing. Somewhat controversially, I don't do that. So essentially, the way Kap guarantees multi-threaded safety is that as long as your functions are pure, it's fine. Don't worry about it. You cannot do it wrong, because you don't have any mutable objects. However, if you have side effects, and if you change variables that are shared across multiple threads, then you can screw things up. You have to be careful. And I'm doing it intentionally. It's sort of the same principle as Common Lisp has in terms of its guarantees, because doing so allows me to do interesting things without having to spend too much time trying to design a type system that ties the user's hands behind their back. The Mandelbrot code is in the examples directory in the source repository. So the way it works is that you have one function that computes a single pixel. And then I create an array of, say, 400 by 400 pixels, and then you take the Mandelbrot function each on that array. That computes it linearly on one CPU. But if you do Mandelbrot function each parallel, which is two vertical bars, that means that you perform a parallel each, if you like. I think they call it parallel each in k as well. So in Kap, it's not a special function. It's an operator, and the operator acts on the specific function. So right now, only each has a parallel version. So if you try to apply the parallel operator to anything else, you just get an error saying parallel is not supported for the function. But one plan I have is to add it for reduction, for example. So if you do a plus reduction parallel, then you would do a sort of, what is it called, a map-reduce kind of thing. But that would then, of course, introduce new constraints on the function, because now the evaluation order changes, which is why I didn't implement it. I don't know if it needs a separate operator or something like that. But that's the idea, basically: to have the parallel operator so that you can apply it to other functions too, other than just each. But yeah, I haven't been able to hit any limitations there, primarily because it takes quite a bit of time, comparatively, to compute the result of a single pixel in Mandelbrot. [03] And it's very CPU-intensive; it doesn't hit memory. So there is really very little contention happening on the bus when you do that, and I don't expect it to be much of a problem. So it depends very much on what it is you're doing, whether or not this is useful.
And the benefit here, of course, is that if the thing that you do an each on is a lazy value, then the computation of each individual cell in that lazy value will, of course, also be parallelized, because by the time you call the parallel operator, the results have not been materialized yet in the lazy array, if that makes any sense.
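
As a rough illustration of the parallel-each contract (a pure function per cell, with the work split across cores), here is a Kotlin sketch using coroutines. It is not Kap's implementation, and parallelEach is a made-up name:

    import kotlinx.coroutines.*

    // Apply a pure function to every cell, splitting the work across
    // worker threads. This is safe precisely because `f` has no side
    // effects, which is the same contract Kap asks of the user.
    fun <T, R> parallelEach(cells: List<T>, f: (T) -> R): List<R> = runBlocking {
        val workers = Runtime.getRuntime().availableProcessors()
        val chunkSize = maxOf(1, (cells.size + workers - 1) / workers)
        cells.chunked(chunkSize)
            .map { chunk -> async(Dispatchers.Default) { chunk.map(f) } }
            .awaitAll()
            .flatten()
    }

    fun main() {
        // One expensive, CPU-bound computation per pixel, as in the
        // Mandelbrot example (the real pixel function is omitted here).
        val pixels = (0 until 400 * 400).toList()
        val image = parallelEach(pixels) { p -> p * p }
        println(image.take(5))
    }

A parallel reduction would additionally require the operation to tolerate reordering, which is the evaluation-order constraint Elias gives as the reason it is not implemented yet.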

00:23:44 [CH]

No, yeah, it's one of the, not complaints, but things I've thought about since learning about array languages. And I think this came up very briefly when I came back from KXCon earlier in the year, because I was talking to Oleg and Pierre, who are two of the C developers on q at KX. I didn't actually have a conversation with Pierre, I think, but I chatted with Oleg, and Oleg mentioned that it was one of Pierre's ideas too, this kind of streams versus arrays: the same idea where you have that one plus two plus three plus four plus X, and that's gonna necessitate the materialization of four or five different arrays, when in an ideal world, your interpreter could see through all that, fuse it all, and especially if it's followed by a reduction, then technically nothing needs to be materialized. You can just do a reduction with a couple of transformations over your array. So yeah, it's super fascinating that that's all built into Kap.

00:24:39 [EM]

Right, so there are several optimizations like that that you get for free. So one example that I like to bring up is that, if you look at APLcart, and you look for the best way, or the shortest way, to do a string prefix match: let's say you have a long string, and you want to check, does this string begin with this other string, right? So ABCDEF, yeah, it starts with ABC, so you want to take the ABC, the pattern, and the string. And the shortest way to do it is just to do a first of where, where being the epsilon with the underscore.

00:25:24 [ML]

Find I think.

00:25:27 [EM]

Which is nice, right? You check all of the places where you can find the string, but you're only interested in the first one, so you throw away the rest. Now, I'm not entirely sure if Dyalog actually optimizes that as an idiom, but if not, then this thing is O(N) in the length of the string you search. But in Kap it's O(1), because even though you do the matching, you're not looking at those results, so you're throwing them away right away, and you're only going to check the beginning. So then it's O(N) in the length of the string that you match against, if that makes any sense.

00:25:59 [CH]

Yeah, so basically any primitive like you mentioned, where, but then also find, as Marshall and Rich said, all of those that can essentially be computed linearly. And if you're calling first on that, you're gonna end up with that optimization, which is why there are a lot of languages that have these kinds of lazily chained things: Rust has iterators, Java has streams, C++ has ranges. It's all to take advantage of this exact thing, where you call some kind of filter operation, but then you're only interested in the first N, whether that's one or whether that's 10. And if you're not using this kind of API or language, you end up doing a whole bunch of extra work for no real benefit.
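
Kotlin's lazy Sequences (the JVM-side analogue of the Rust iterators and Java streams just mentioned) show the same effect; the helper names here are illustrative only:

    // Eager: tests every candidate position and materializes all matches,
    // even though only the first one is wanted.
    fun startsWithEager(s: String, prefix: String): Boolean =
        (0..s.length - prefix.length)
            .filter { s.regionMatches(it, prefix, 0, prefix.length) }
            .firstOrNull() == 0

    // Lazy: firstOrNull() pulls elements only until it has an answer, so
    // when the string does start with the prefix, only position 0 is ever
    // tested, like Kap's lazy first-of-find.
    fun startsWithLazy(s: String, prefix: String): Boolean =
        (0..s.length - prefix.length).asSequence()
            .filter { s.regionMatches(it, prefix, 0, prefix.length) }
            .firstOrNull() == 0

    fun main() {
        val long = "abc" + "x".repeat(1_000_000)
        println(startsWithLazy(long, "abc")) // true, touches only the front
    }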

00:26:41 [EM]

Right, so I think if you were to summarize the idea here, it's that very often in APL, you have a really nice, concise, fantastically beautiful solution, like the string prefix match, that is also very, very slow, because you're doing a lot of extra work. But the code is so nice and so easy to show why it works, prove why it works, and teach. So my goal is that this kind of sloppy, lazy, naive solution, which is still very short and nice, should be acceptable performance-wise. Now, I don't go for a super high performance solution where the programmer is really programming at a low level, because obviously all of this stuff adds overhead. So for optimized solutions, I don't think I could ever beat BQN or Dyalog. But for these naive solutions, I want them to be at least not terrible. And the string prefix match is a good example of that, I think. So when it comes to performance, I believe that if I can stay within no worse than 10 times slower than Dyalog for the same thing, if you ask someone, "Do this the proper fast way in Dyalog, and you do it the proper fast way in Kap," it shouldn't be worse than 10 times. In my experience, when I do these tests, I'm typically hitting maybe four or five times slower than Dyalog, which I think is okay, because I'm running on the JVM with code that really has so many more layers of stuff on top of it. So I don't think that's too bad. And maybe we'll get into it later, but there are certain things about the language: since it can be parsed into a syntax tree and you can run an optimizer on the actual code, on essentially the compiled code, it allows me to make things like calls to custom functions like dfns much, much faster. So if you do repeated dfn calls, Kap is pretty much always faster than Dyalog. So, you know, it's a give and take.

00:28:57 [RP]

That was one thing I noticed looking at the Kap versus APL page you've got on your documentation. And then I only played with it very briefly, but you know, you have to assign your dfns with a different arrow. And this is so that you can do this kind of thing, I assume, because in Dyalog, you're interpreting that block every time the dfn is called.

00:29:18 [EM]

Yeah, actually, I could, if I wanted to, get rid of the double arrow and use the normal arrow, because at compile time...

00:29:26 [RP]

I did wonder that.

00:29:28 [EM]

At compile time, it can do it. It knows what it is, because it parses the right-hand side and sees, okay, this is a bare function. And in Kap, a bare function is a function that doesn't have an argument on the right. And then it comes to the arrow and sees, well, I have a bare function on the right; therefore, it can do a function assignment. But thinking about it, I said, yeah, no, because these are two very, very different things: function assignment happens at compile time, while variable assignment happens at runtime. And that would mean having compile-time versus runtime effects change depending on what is all the way on the right. All the way on the right might be an argument to this function, and all of a sudden, this entire thing turns into a variable assignment. So I wanted to make it very, very clear, so that if you accidentally put a value on the right, you don't change the semantics of the entire code. That's why. So the parser could do it, but I didn't want to, because it got confusing. And if we talk later about tacit, we'll probably come back to that, because I have perhaps some controversial ideas in terms of what I believe is readable versus not readable. [04]

00:30:49 [CH]

We will definitely come back to that. Or we can just pivot right now. But I'll pause in case we want to stay on the topic of parallel evaluation or lazy evaluation. Are there other questions?

00:31:00 [ML]

Yeah, well, I will say, with regards to the speed, I've seen some of Elias's benchmarks of Kap, and I'm always like, well, that is much faster than I would have thought lazy evaluation could be. And so, I mean, it's not as fast; in particular, you don't get vectorization as easily, and if you're running on the JVM, probably not at all. So you're missing out on some stuff, but the overhead is a lot lower than I would have expected.

00:31:29 [EM]

Yeah, I remember when I did my first benchmark, when the code had reached a level of maturity where I could actually write code to benchmark. I was a little bit surprised myself at first, because I think a lot of the performance actually comes down to the JVM, which is almost magical in terms of performance. Now, Java has an API to do vectorized computation. And I've been thinking about playing around with that to create a special array type that supports vector operations, but that would mean I have to duplicate a lot of code, and I don't really want to. You could in principle, but I don't know if it's worth it. Because again, performance is not really the number one priority. It should be fast enough. And I think my rule of thumb initially was, as long as I am always faster than Python, it's okay. Because people are apparently happy about using Python for some reason. So as long as I'm faster than that, and that has been the case; even for very naive imperative looping code, at least the last time I did tests, which arguably was a couple of years ago, I was consistently faster than Python. And that's fine. That's really all I'm aiming for. My second goal, like I said, is to keep within 10 times of Dyalog's optimal performance, which I have roughly been doing so far. But if you're after highly optimized, low-level code, yeah, BQN seems like a good choice.

00:33:06 [ML]

I hope so.

00:33:08 [EM]

I have to say that because I don't really, it's not really on my roadmap to beat BQN at its own game.

00:33:16 [ML]

Yeah, although Dzaima did do a little thing where he was compiling Singeli to Java to use the vector API. So I know it exists, and it's sort of a new experimental feature. On the other hand, the fact that he wants to write Singeli instead might tell you something about how much you want to use it.

00:33:34 [CH]

Bob, were you gonna say something as well, a couple of seconds ago?

00:33:37 [BT]

Well, I was gonna bring it back to Tacit 'cause that's always a hit with the audience.

00:33:42 [CH]

All right, here we go, folks. This is almost gonna be, I feel like a mini episode. Oh wait, Stephen's got his hand up. So we'll take a quick pause.

00:33:50 [ST]

Tacit's a very deep rabbit hole, as we all know here. So a quick question first. Elias, how did you decide that making arrays immutable would be a good idea? What was your insight there?

00:34:02 [EM]

Because I wanted to be able to parallelize things arbitrarily without worry. So my original plan was for everything to be immutable. I mean, we're talking Haskell-level immutability. I walked back on that. Variables are mutable. You can change them. You don't have to, and I'm thinking you could declare variables as being constant, so you're not allowed to change them. But it's much too useful to be able to change variables. And like I said, I'm not a functional purist, which is why I'm not a huge fan of Haskell. I respect Haskell very much for what it is, but I don't know. I'm more of a Common Lisp guy when it comes to those kinds of languages, because Lisp is a language that allows you to do anything you want, because it supports all the different paradigms. And I kind of like that. But yes, the original plan was to make everything purely immutable. So for example, you have hash tables in Kap, and the hash table itself is immutable. You can't change it. The hash modification functions return a new hash table that is the modified version.
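
Kotlin's read-only maps behave much like the hash tables described here: "modifying" one builds a new map and leaves the original untouched. A small sketch of the principle (not Kap's actual API):

    fun main() {
        val m1: Map<String, Int> = mapOf("a" to 1, "b" to 2)
        val m2 = m1 + ("c" to 3)  // plus returns a NEW map
        val m3 = m2 - "a"         // so does minus
        println(m1) // {a=1, b=2} (original untouched)
        println(m2) // {a=1, b=2, c=3}
        println(m3) // {b=2, c=3}
        // Because no value is ever mutated in place, values can be shared
        // across threads without locks, the property Kap relies on.
    }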

00:35:14 [RP]

You've also got these lists as distinct from one-dimensional arrays as well, I noticed, although I haven't played with it very much.

00:35:23 [EM]

Yeah, lists are actually less special than you might think. What it is, it's an n-tuple. It's an object that is a scalar, atomic object in itself. In APL, I think it's called a simple scalar. But it can contain some number of objects inside of it. And the design of that came out of... what happened was, I was implementing support for the standard APL array lookup. Because we haven't actually addressed this: the language syntax itself is very, very close to APL, perhaps too close sometimes, because people might think that it's a full APL, then they try to do APL things, and it doesn't work. So maybe I should have changed symbols, like BQN did. I don't know, but it is what it is. So I was implementing support for the bracket array dereferencing, which in APL, of course, if you have a two-dimensional array, you write the array name, the variable name, an opening bracket, then some coordinates with semicolons in between. So what I decided was that I didn't want to have special syntax there. So I said, well, what if a list with semicolons in between was an object in itself? So what you can do is you can say A equals, opening parenthesis, two semicolon two, closing parenthesis. You need the parentheses because semicolon binds weaker than everything else. And then you can look up into a two-dimensional array by saying array name, bracket, A. And I thought I was really clever, until I read an old document from the '70s, and I realized that it was already invented by APL\3000, if I remember right; I think it was that one that already did it.

00:37:25 [ML]

That language was named after the year it was developed, right?

00:37:29 [EM]

Yeah, obviously. So they took it from me. But it always happens, right? You think that you're clever, and then it turns out someone else already came up with the idea. So yeah, that's where the concept of the list, or the n-tuple, came from. Then I realized I can reuse the same thing to pass multiple arguments to functions. If you have a function that takes multiple arguments, well, the multiple arguments are actually a single argument, which is a list. So what that means is that if you have a function called F, you write F, opening parenthesis, the arguments, closing parenthesis. And the only difference between the way you write that function call with multiple arguments in C versus Kap is that you're using semicolons instead of commas. And I realized that might actually be quite a welcoming way to bring people in and say, well, you see, it sort of looks like your favorite C or Python. That was the idea there. But then you can put multiple arguments on the left also, and then some responsibility from the API designer is required, but the capability is there. So when you define a function, you specify the function name, and when you give named arguments, it will do an automatic destructuring of the list into the individual components, basically. So that's where it came from.
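
A rough Kotlin model of the n-tuple idea: the list is one atomic value, and a function receiving it destructures it into named components. All types and names here are hypothetical, not Kap's internals:

    import kotlin.math.hypot

    // A Kap-style list: a single scalar value holding several others.
    // Being a scalar, scalar arithmetic refuses to map into it.
    data class TupleValue(val items: List<Double>)

    // A "two-argument" function: its argument is really one value, a
    // list, destructured into named components on entry.
    fun hypotenuse(arg: TupleValue): Double {
        val (x, y) = arg.items // destructuring, like Kap's named arguments
        return hypot(x, y)
    }

    fun main() {
        // hypotenuse(3; 4) in Kap-ish notation: one list, two parts,
        // semicolons instead of commas.
        println(hypotenuse(TupleValue(listOf(3.0, 4.0)))) // 5.0
    }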

00:38:59 [CH]

And it is morally the same thing as what you do in APLs and Js and BQNs, where if you need more than two, you just put things in a list and pass it, but you just end up with a more familiar syntax for folks that are coming from more popular languages.

00:39:13 [EM]

Right, right. So it's an object in its own right, and that object is itself a simple scalar or an atom, which means the normal rules apply when you do enclose and things like that. So if you try to add something to it, for example, it won't work, because of the scalar operation rules. You can't do one plus a list; it says, no, you can't operate on this, you have to destructure it first. So it is really meant to hold a group of values together.

00:39:45 [RP]

Any other thing? I'm sorry to derail this from obviously people chomping at the bit to talk about tacit, but.

00:39:52 [CH]

It's all right, we'll get there, we'll get there.

00:39:54 [RP]

The thing that I don't know much about that I saw on that page was, you can make an array of symbols, right? You actually have symbols as a type you can do things with rather than just as names or references to values or functions.

00:40:08 [EM]

Right, yes, so you have symbols. Right now, you can't do that much with them. Essentially, it's a way to give you a high-performance object that represents some kind of identity. The most common thing you use it for right now is as some kind of identifier, some key in a hash map or something like that. You use it in exactly the same way as in Lisp. You can also ask whether a symbol is bound, which you need to do if you want to have an ambivalent function that is either monadic or dyadic. You need to check if the left argument is bound, and you can do that with a function that acts on a symbol and returns that information to you. Now, the language itself has first-class functions. So you don't need to refer to a function by a symbol. What you do is you capture a closure, because the language has lexical scoping and full closures, which means that you can create an array of functions, and those functions can then capture variables in their outer scope, even after you leave the scope, things like that, like any other functional language can do, which is quite useful. And if anyone wants to look at an example of how it's used, it's actually used in the code that does the rendering, the formatting of arrays for printing. In the source repository, it's in a file called output3.kap. And the different functions that render the different data types are in a hash map keyed by a symbol that is the data type. So you have a function called typeof that returns the data type of a certain value. That data type is a symbol, and the symbol is a key. So it looks up the function, calls the function. So if anyone wants to look at how that is used, that's a good example.
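
The dispatch pattern described here (a hash map from a type symbol to a first-class rendering function) looks roughly like this in Kotlin; the names are invented for illustration, not taken from output3.kap:

    // A symbol-like tag returned by a typeof-style function.
    enum class TypeTag { INT, STRING, OTHER }

    fun typeOf(v: Any): TypeTag = when (v) {
        is Int, is Long -> TypeTag.INT
        is String       -> TypeTag.STRING
        else            -> TypeTag.OTHER
    }

    // A map keyed by the tag, holding first-class render functions.
    val renderers: Map<TypeTag, (Any) -> String> = mapOf(
        TypeTag.INT    to { v -> v.toString() },
        TypeTag.STRING to { v -> "\"" + v + "\"" },
        TypeTag.OTHER  to { v -> "<" + v + ">" }
    )

    // Look up the function by the value's type tag, then call it.
    fun render(v: Any): String = renderers.getValue(typeOf(v))(v)

    fun main() {
        println(render(42))   // 42
        println(render("hi")) // "hi"
    }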

00:42:21 [RP]

That sort of reminds me of one last tangent, which is, I got almost halfway through this video before I had to go, but you did a video that referenced the previous ArrayCast, where we were talking about labeling axes in multidimensional arrays. [05] I only got partway through it, but you seem to just have your array, and then the array sort of continues to live on almost as it was, but it's got this metadata attached somehow. I didn't get as far as seeing how that's used, how you reference using the name.

00:42:57 [EM]

Right. So yes, arrays can have metadata that carries along with the value. So it's not attached to a variable, it's attached to a value. So if you have an array, let's say it's a two-dimensional array, two by three, right? So you have two rows by three columns. You can give the rows names; that would be an array in itself of two strings. And you can give the columns names too; that would be a three-element array of strings. These names are carried along wherever it makes sense. So if I do a transpose, yeah, it transposes the labels as well. Column labels are displayed in the UI, so if you print it, they're displayed when you render the output as text. Row labels are not; that's mainly because I've been lazy and haven't implemented it, but they're still kept. If you, for example, take the first two columns, that result will keep the labels, because it makes sense. If you concatenate two arrays, then the labels will be concatenated. If you concatenate them on top of each other, the column labels will be kept if they match in both arrays, things like that. The benefit of that is that when you take this array and do something with it, like export to CSV, then you can include the headers. When you import from Excel, you can grab the headers there as well; you can say the first row of a CSV that you import is all the labels, and you get the labels there. And because the UI has an array editor that kind of looks like a spreadsheet, of course it renders the labels there as well. And also when you display a chart, like a line chart or whatever, then the names come up as labels. So it's not as extensive as the data frames in R, for example, but it's sort of inspired by that. And then I have a couple of helper functions that allow me to do a selection: I grab columns or rows by label instead of by index, which is very, very nice when you work, like I do, with a lot of data that other people make in Excel. When I want to do something with it, I don't want to do those calculations in Excel, so I just pull it straight in. I've added a bunch of features that allow you to copy and paste data from Excel and preserve as much as possible of that information. So I can pull it in from Excel, do the editing, do a little bit of graphical manipulation in the UI first, and then I push it into a variable, and then I can do my computations. So having the labels there is really, really useful for that purpose. But the key is that this metadata is kept together with the value. So if I reassign the value, put that array inside another array and pull it back out again, the labels are of course kept. So that's the idea. And it's a flexible thing. You can do whatever you want with that metadata. It's there; you can read it, you can write it, you can view it, wherever it makes sense. So, for example, the SQL API, right? If you do a select statement and you get an array result, the table's columns have names, right? And those will then be the labels. So it's very easy to work with. I mean, R is an array language, right? So it's not something new, but I'm not sure any other APL-like language has done exactly that.
I think it's an interesting idea, and I certainly would like to see it in BQN, Marshall.
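
A toy Kotlin version of labels-as-metadata, where transposing the data also swaps the axis labels; this is far simpler than Kap's mechanism, and every name here is hypothetical:

    // A 2-D value carrying optional axis labels along with the data.
    data class Labeled(
        val rows: Int, val cols: Int,
        val data: List<Double>,
        val rowLabels: List<String>? = null,
        val colLabels: List<String>? = null
    ) {
        operator fun get(r: Int, c: Int) = data[r * cols + c]

        // Transpose moves the labels with their axes.
        fun transpose() = Labeled(
            cols, rows,
            List(rows * cols) { i -> this[i % rows, i / rows] },
            rowLabels = colLabels,
            colLabels = rowLabels
        )
    }

    fun main() {
        val t = Labeled(
            2, 3, listOf(1.0, 2.0, 3.0, 4.0, 5.0, 6.0),
            rowLabels = listOf("2023", "2024"),
            colLabels = listOf("q1", "q2", "q3")
        )
        println(t.transpose().rowLabels) // [q1, q2, q3]
    }

Operations without a clear label correspondence would simply drop the metadata, as the discussion below gets into.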

00:46:56 [ML]

Well, I can tell you why I haven't. And yeah, I don't know of any APL-family thing that has that. I mean, it's not that different from k's maps, or dicts, but in a dictionary in k, the keys are fundamentally part of the dictionary. There are no array indices for a dictionary. I mean, you can convert it to a list, but fundamentally it's just a map from keys to values. So I'm not aware of anything that does it in the annotation style. The reason why I wouldn't do it in BQN is, I mean, it is very useful when it works, but there are a lot of array operations that don't have obvious label correspondences. So one we ran into when we were discussing this was selection. There are actually two ways that you could think about the labels; I'm just talking about one vector selecting another vector, to keep it simple. The first is to say, well, all the elements of the result correspond to the indices you gave, so it should have the same labels as those indices. But the other is to say, well, the elements came from this other vector, so you should also select the labels to get the new labels. So there are a lot of things like that, where there are multiple interpretations of the array operation you're doing, and as a result, there's no one clear choice of what labels to have. BQN is supposed to be a very basic array language that says, well, these are some simple array operations, and you can use them however you like. The idea is that other people can build on it in whatever way they want, and it'll be a solid foundation. So having something that's up to interpretation like that is not really the direction I wanna go.

00:48:48 [EM]

Yeah, right. I mean, I discovered this also, right? Because most operations that do any form of transformation where it's not clear just drop the labels. So the main use for this is when you bring in data from an external source, and you keep these labels because it's easy to work with, because it's easy to see what your raw data is. But then once you pull some part of that data out of the array, you don't really care about the labels that much anymore. And for the most part, Kap will just drop them unless it's obvious. Like transpose, yes, obviously it will just keep the labels. But when you add one to an array with labels, what are you supposed to do? The result is not necessarily the same. What Kap does is drop them. Probably it should keep them if the labels in the source and result match, but how often would that happen? Probably never. So, yeah.

00:49:47 [CH]

All right, here we go. I mean, my mind's buzzing. I've got technically four or five questions I could ask, and I've also got a couple tangents we could go on, but we're gonna put a pin in all those for, you know, the second time we bring Elias back, 'cause there's just so much, so many things I have to ask, and yeah, mind is buzzing. This is just fantastic. We are going to start Tacit Episode 5.2. You might be wondering, 5.2, since when did we start doing incremental episodes? A while ago... I was supposed to actually mention this on the failed Tacit 6 episode from a couple weeks ago. On my combinatory logic site, if you go to links.htm, link will be in the show notes, it shows all these links on combinatory logic and tacit programming, and at the very bottom, there's a podcast episode section. It used to just be ADSP episode 47, where I waxed rhapsodic for like 30 minutes on the history of combinatory logic, but I've added all of the tacit ArrayCast episodes. So Episode 9 was number 1, 11 was number 2, 15 number 3, 17 number 4, then we had a bit of a break, but then in Episode 64, we came back with Tacit number 5, and then I labeled Episode 65 Tacit 5.1, because I said it was only gonna be 5 or 10 minutes, but it ended up being the first 20 minutes of that episode, which I believe the actual topic was game programming. I asked Marshall a question there that was prompted by looking into the Kap language. I will let Elias give the overall high-level view of tacit programming in Kap and how it differs from APL, but the one thing that really stood out was that Kap does not have 3-trains in the way that J, BQN, and APL do. [06] It supports forks, but there are 2 different symbols that you use to spell basically A, symbol, B, symbol, C, where A, B, and C are your functions, and then you end up with a fork. However, in terms of trains, where it's the juxtaposition of functions, they only have 2-trains, and that is the same B-combinator compose, where you're composing 2 unary functions, which is fantastic, because it makes Kap the only language that can string together unary functions in a tacit way where you don't end up with some flip-flopping...

00:51:59 [ML]

That's not K.

00:52:01 [CH]

Sorry, not K.

00:52:02 [RP]

K does it.

00:52:03 [ML]

K does that.

00:52:05 [CH]

All right. It is the second array language, then. It's the only array language with Unicode symbols, there, I'll qualify my statement, that allows you to do this. And you can do this obviously in Dyalog APL and J by using dfns, where you use the braces, but you can't do it nicely tacitly. You either need to use J's, what do they call it, break, and BQN calls it nothing, in order to get this pattern.

00:52:28 [RP]

Do atops over and over, is it? You could do that.

00:52:30 [CH]

Yeah, atops over and over, which, you know, honestly, the simplest composition pattern other than 2 unary functions is multiple unary functions.

00:52:38 [BT]

And J uses a thing called Cap.

00:52:40 [CH]

Oh, Cap, yes, sorry, I think I said break, but Cap is the correct term for it.

00:52:43 [RP]

J uses Cap?

00:52:45 [CH]

Oh, yes, CAP with a C. C-A-P, yeah, this is confusing now, folks. We've already gone into the deep end.

00:52:49 [ML]

You can spell it in all caps if you want to.

00:52:53 [CH]

So I'll try not to ask six questions here in a row. Maybe my first one will be, let's just go with a basic overview. And then the second one, the one that I am dying to know the answer to, is how did you come to the decision to throw away forks... or not throw away, give them a different syntax, and stick with only two trains? Because that is really what I think is the very interesting design choice: there are no three trains in Kap, and that actually creates a huge difference in the spelling of certain tacit expressions. All right, over to you.

00:53:24 [EM]

Oh, where to begin? I guess we have to roll back time all the way to before Kap existed and go back to GNU APL, because GNU APL doesn't have tacit at all. It does have dfns, and the general opinion within the GNU APL community is that that's enough. A lot of people there love APL2, which I believe doesn't even have dfns, so some people say dfns are a bit too much. In a sense, I kind of still agree that dfns are enough, technically. I was very, very much against all forms of tacit programming, the way it's done in APL and J, especially J, because I could never understand it. And so for the longest time, Kap didn't have any tacit at all. But what happened was that once I had something that you could use, and especially once I had something on the web that people could try out, inevitably the first thing people tried to do was to implement something using tacit, because it worked in APL, right? And this is an APL, so of course. And it never worked, and I had to explain to them there's no tacit. And after people had been playing around with it, I mean, it's not a super popular language, but a handful of people had exactly the same experience at least two or three times. Then I looked at it, because I had learned rather quickly how tacit works in Dyalog. It's not that I don't understand the syntax. The syntax is simple; the Kap parser is perfectly capable of parsing these tacit expressions as well. So I just added it. So the first version had tacit that was more or less identical to Dyalog. And I did it partly to show myself that it could be done, and also to better understand how tacit worked. What better way than to implement it, right? Because once I did, I was able to take the parser that parses an expression, and that gives me a parse tree, which represents the evaluation tree. So I could write the UI, and then I could render a literal graph; it's available in the UI if you download the Java client. You type an expression, and it will draw boxes with lines that show how things are being evaluated. And it was actually quite useful for me, because since I had the same syntax as Dyalog, that meant that when someone on the chat gave an example of a tacit expression, I could just paste it into Kap, and then I had a graphical representation that was much easier to read. So as I was using it, and once tacit existed in the language, you start to use it a little bit, in the simple, obvious cases: you want to take the average, that's a classic, right, the sum over the count; or you want to do something like a sort, which is also a fork, with the right tack on the right and the grade up or grade down on the left, and then you do the select in the middle. You know, I started using those left and right. Okay, it's fine. But even though I considered myself fairly knowledgeable in terms of how tacit expressions worked, I still always felt some kind of mental exhaustion every time I was reading a tacit expression, because I felt like, look, the first thing I have to do when I look at this expression is decode it. And I have to jump back and forth to try to figure out why something evaluates the way it does, and then usually I end up converting it to a dfn, and that makes it clear how things are evaluated. So it becomes easier.
So the longer I thought about it, I realized that there's really one thing that is painful, and that is the three chain, or the three train, if you like (internally in the Kap implementation, it was called chain three). Because I realized, if you have an expression, and let's talk about monadic, because monadic calls are simpler: if you have three functions, a, b, and c, and x is the value, and you say a b c x, okay, well, x is supplied to c, whose result is supplied to b, then to a. Standard right-to-left evaluation. You put parentheses around those functions, and all of a sudden, not only does the calling order change from c, b, a to c, a, b, the arity of b also changes. Now it's a dyadic call instead of a monadic one. And you know, it's fine when you have three functions, which is why taking the average is such a beautiful example, because it's really, really tacit. You know, it's the sum over the count. But make that into a longer train, and all of a sudden you have to think about the even-odd rule, and you have to check, oh, there's a function here, and how is it called. And I thought, okay, what if I just remove that? What if I just remove that single thing? What do we end up with? And, you know, I understand why the syntax in J, and subsequently Dyalog, was invented. Clearly, in order to do it better, you need two extra symbols, and J certainly had no extra symbols to spare, because you absolutely don't want to use a digraph for this; then, for a fork, you'd end up with four extra characters. It just wouldn't work, right? Same for Dyalog. I assume they don't want to take a lot of extra symbols; they're trying to be a little bit more conservative about that. And also, it existed in J, so why not take it? I have the benefit of not having to care about backwards compatibility one bit, because this is an experimental language, so I can do what I want. So I could just throw out the old J-style fork and see how it works. So I did. The left and right symbols that you mentioned are, of course, the French quotation signs, the left and right guillemets, like doubled less-than and greater-than signs. So once I did that, I realized that what was left was just, if you have a sequence of functions now, it's just a chain. It's a two chain attached to another two chain attached to another two chain. And this natural way of calling just fell out of it. And another benefit is that if you want a train that binds the left argument, in Dyalog, you need the jot, the bind symbol, for that. [07] In Kap, if you just say one plus, and there's nothing on the right, that's a bare function: a train that adds one to its argument. That's a monadic function in itself. So that means that if I take a bunch of those monadic functions and string them together, I get one plus two plus three plus four plus... well, that's a train, which adds four plus three plus two plus one to its argument. That means that if you have the expression one plus two plus three plus X, that's just the explicit form, right? But if you put parentheses around everything except the final X, that's a tacit expression that does exactly the same thing. So all of a sudden, the evaluation order is identical in tacit and explicit form. And that was not really the plan; I just realized it happened by itself once I removed the three chain.
And so at that point, I realized: okay, I'm keeping this, because now I can actually understand tacit, and I don't feel this mental exhaustion every time I have to decode a tacit expression. Of course, that doesn't mean you can't make complex, unreadable monsters in Kap. You can, because you have left and right hooks, and those are quite good at making things a little bit more difficult to read, but it's not as bad. And in fact, as it turns out, at least for me, I don't use forks very often. It happens, but almost all the time, even when I do something in Dyalog and use a three chain, either the left or the right hand side is a tack. Well, you don't need that in Kap, because you have the hooks. In the dyadic case you sometimes do, but I feel that nine times out of ten I can do it with a hook instead. And I know some people really don't like the syntax of the fork in Kap at all. They find the symbols grating or something like that. Understandable; a matter of taste. But what I noticed is that I don't use those symbols much at all, because, like I said, hooks essentially do the job most of the time. So, yeah, that's the short summary of where it came from. It wasn't really invented so much as discovered. That makes it sound like some kind of deep insight. No, it was sort of a natural thing.

01:02:56 [ML]

It was happened upon.

01:02:58 [EM]

Yeah, it was happened upon. It was a lucky coincidence that I had already implemented the left-bound functions without the jot at the time, because that made me realize: hey, wait a minute. So very often, actually, when I write something explicit and want to turn it into tacit, most of the time I just put parentheses around the thing and it just works, which I love, because I don't have to think about it so much.

01:03:22 [CH]

I think there actually are a couple of, if not deep, at least insights. And I would argue that they are deep insights. At some point, before this podcast or since its beginning, I fell in love with tacit programming. And there is this common question that always comes up: a lot of people say, or I say, that it's the epitome of elegance, but is it actually a practical tool for programming? When I was at Minnowbrook back in October, a couple months ago, I was talking with Morten Kromberg, and I think this anecdote has come up on the show before: he was saying that he was actually warming up to tacit programming. He's kind of not really been a fan, I think because of the mental exhaustion that you experienced; those aren't the words he used, but he would completely agree. And one of the things he pointed out was that it's not actually tacit per se, it's tacit combined with the ambivalence. And what you pointed out in your description I think is brilliant, and I had never thought of this before: when you're going from that a b c, not tacit, to putting it in parentheses, not only are you changing the order of evaluation, but you're changing the arity of the b function. I've never noticed that before, but that's completely brilliant. You're already doing some mental gymnastics, which experienced tacit programmers don't consider gymnastics; you just get used to reading code that way. But the fact that you are changing the arity, and now you need to stare at this and go: okay, we've got unary, binary, unary. And we've been ignoring the fact that we're talking about monadic forks right now, not dyadic forks, and the fact that even the forks themselves are overloaded. So the fact that you experienced this mental exhaustion, said: let's experiment with something, got rid of it, and ended up with something a lot nicer, I think is very, very insightful. And when I was going through, and we'll link this in the show notes, I have a little markdown file called one-liners, which is 10 different problems that I like solving, and I've done them in BQN, Uiua, Kap, Dyalog APL, J and Jelly. And there's two different problems where you end up needing either caps in J or Nothings in BQN. In one of the solutions for APL, I just do it explicitly in a dfn, because they don't have that; you're going to end up adding two parentheses in order to get these two chains, the B combinator twice in a row. Whereas in Uiua, because they've got the stack, you don't have to worry about it. And in Kap, you can just compose these things together. So what you end up with is a four-character solution in both Uiua and Kap. Five in BQN, because you need the Nothing. And then I think it's six in Dyalog APL, because I didn't want to do tacit. It just ends up not looking elegant if I have to use parentheses to get this, because that's your only option in Dyalog APL if you want to stay in tacit land.

01:06:02 [ML]

Well, often you can use an atop, like the atop operator.

01:06:06 [CH]

Yes, that is true. That is true. And potentially in my case, you could have done that. But all this to say: you said, "oh, to say that you discovered something maybe is a bit much". I really think that there hasn't been a ton of exploration outside of the classic two-train/three-train model that you see in J, APL and BQN, other than the fact that J initially did the S combinator (the hook) for the two-train, and then APL and BQN changed that because of Roger Hui. But I think Kap is the first quote-unquote "array language" that said: "let's experiment with something different". And then it led to this kind of quote-unquote "discovery". Anyways, Rich, you had your hand up.

01:06:47 [RP]

Well, I'm going back and forth on saying this. My question is: what do you gain from doing tacit if it's basically the same as a dfn with a bunch of monadic calls and an omega, or, in the dyadic case, where the last function becomes dyadic? I don't know how I feel about this, because obviously you're saying that in practice, in a lot of cases, you're not using the forks; especially for long ones, they add potential confusion. So I can sort of see that. But then I'm thinking: why bother, if all you're doing is adding a reference to your argument at the end? You're calling it a train, or calling it this tacit function, but it's actually just a bunch of monadic function calls, and the syntax of these Iversonian array languages is already very simple. Just, you know, function, function, function.

01:07:35 [EM]

Yeah. So, I mean, I'm not saying that you have ... [sentence left incomplete] Okay, how to phrase this? First of all, the fork is still there, right? You do need it sometimes, especially in the dyadic case when you want to do something complicated; then you use those symbols to attach the functions together. Now, if you want to compare the number of characters needed in Dyalog versus Kap in this case, it's actually the same, because Kap uses these two symbols, but they bind stronger, which means you usually don't need the parentheses around them. In Dyalog, you need the parentheses, so the number of characters is actually roughly the same. In my experience, in practice, a well-written tacit expression in Dyalog versus the same tacit expression in Kap is plus or minus a few characters. The main difference is that I find the Kap version easier to read, easier to understand, because in the case of the fork, you still have the weird calling order where you call the middle function last, but at least it's very obvious, because the middle function is surrounded by these symbols. And you have the left and right hook symbols, which are the jot and the jot underscore. Jot underscore does not exist in Dyalog, and the jot does not work the same as it does in Dyalog. If you have, for example, A jot B with an argument X, what happens is that B is called monadically with X, and then A is called dyadically with X on the left and the result of B on the right. And the jot underscore is the opposite, right? What I'm trying to get at: those are the ones I use the most. And I very often put parentheses around the thing to the left or the right of the jot.
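
[Ed. note: a small sketch of the two hooks as Elias describes them here, using the jot (∘) and jot underscore (⍛) glyphs; the results are worked out from that description rather than from a verified run.]

    (×∘-) 5    ⍝ jot: B first, then A dyadically, i.e. 5 × (- 5), gives ¯25
    (-⍛×) 5    ⍝ jot underscore: the opposite, i.e. (- 5) × 5, also ¯25
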

01:09:42 [RP]

I like what you were saying, because in Dyalog, you would often use: tack something, something. Yeah, okay.

01:09:49 [EM]

Exactly, so I avoid the tack in all of those cases. But obviously here you end up with something that is not too dissimilar from a fork in Dyalog, because you still end up reusing one argument for two parameters. But the key difference that is important to me at least (and I don't know about others, but it's certainly something that helps me a lot) is that the jot and the jot underscore, those two symbols, make it very, very obvious that something weird is going on. When you just string a series of functions together, I want that series of functions to be called in the expected order, right to left, right? And so I was thinking about how to describe [this], because there is one difference. Say you have a series of functions A, B, C, D, E, F, you make that into a tacit chain, and you call it with X and Y dyadically. Now, of course, removing the parentheses will change the semantics, because with the parentheses around it, the last function in the chain will be called dyadically and the rest will be called monadically. Whereas if you remove the parentheses, the first function (A in this case) will be the one called dyadically, right? Now, I was thinking about it, because I knew obviously this was going to come up, so I was thinking: how do I justify the fact that that is actually different? And I think the way I justify it is that, well, if I see a chain of functions, I read from right to left, and if I see X, then a closing parenthesis, then a function, I already know this is a train. I just have to look at the end. Then all I have to do is find the corresponding parenthesis and see: "OK, that's the beginning of the train", and the rest is just the train inside. I don't have to think at all about what's going on inside those parentheses. My eyes just have to jump to the end, maybe one jump to the beginning and then back again, but there's no trying to figure out what's going on in the middle. That's how I try to justify to myself why it is that I understand this better than the J style, because J is the one that I really, really struggle with. Dyalog I can handle; I can read Dyalog code and reasonably quickly understand what's going on. I've been recently trying to learn more J. I'm still at the stage where I have no idea what's going on [laughs], but that's because I don't have enough experience.
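
[Ed. note: a sketch of the dyadic difference Elias describes, with hypothetical functions a, b, c; only the parentheses change which function is called dyadically.]

    x a b c y      ⍝ explicit: x a (b (c y)); the first function, a, is the dyadic one
    x (a b c) y    ⍝ tacit chain: a (b (x c y)); the last function, c, is the dyadic one
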

01:12:21 [BT]

Well, and J is a challenge in that area, especially because of the way that the two train was developed with Roger, [08] and Roger's admitted that that wasn't the approach that he would take if he did it over again.

01:12:33 [ML]

Well, I mean, that comes from the original paper "Phrasal Forms" by Iverson and Eugene McDonnell. So I don't know which of them was responsible, but it wasn't Roger.

01:12:43 [RP]

No, but when he came to Dyalog, that's why ours are different, right?

01:12:47 [ML]

Oh yeah, that's what he changed. So yeah, he fixed it in my view and many other people's.

01:12:52 [CH]

And then, yeah, there's a link we'll put in the show notes to a J Software article that he wrote called "Hook Conjunctions?", which I've probably brought up four or five times in the past [chuckles], which highlights that, yeah, if he could go back and do it over, he probably would have chosen this, which is why both Dyalog APL and BQN ... [sentence left incomplete]. And just to clarify, because we've been kind of referring to these as different things: these hooks (and you can correct me if I'm wrong, Elias) are called different things in different languages. I think your documentation actually calls them compose and inverse compose, but colloquially you're just referring to them as hook and reverse hook. And I think "I" was actually the very first language where I ever came across something referring to them that way.

01:13:36 [ML]

"I" think that was probably the first language that had them. I can also point out that there's some of the documentation in "I" that indicates that you should be using the two train. I was writing that before Dyalog did it, not before Roger said it [Conor laughs], but so I didn't get it from Dyalog.

01:13:51 [CH]

And which is why I usually like to include "I" in my historical language lineage, because even though Marshall said: "don't use I", it introduced some important ideas. So: inverse compose and compose in Kap; hook and reverse hook in "I"; BQN calls them "before" and "after". And then J, as we just mentioned, actually has the hook as the two train. Adám's not with us today, but if he was here, he'd be shooting his hand up saying that even though Dyalog doesn't have it, his Extended Dyalog does. I think it actually is on the roadmap for Dyalog, not necessarily identical to Elias's, because they already have some of the definitions for compose, but there is a reverse compose planned (the actual glyph to be determined). I don't think it's coming in 19.0, but I think it's on the roadmap for 20.0 or in the future. And I can confirm that these are very, very useful. A lot of times it's just a tack, like when you try to do sort (if you don't have a sort primitive like BQN does). I'm not sure if it's on the horizon for Kap; I remember tweeting at one point, if only Kap had sort ... [sentence left incomplete]
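
[Ed. note: for reference, the correspondence being discussed, with the BQN glyphs added: Kap's A∘B (compose) and BQN's A⟜B ("after") both evaluate as x A (B x), while Kap's A⍛B (inverse compose) and BQN's A⊸B ("before") both evaluate as (A x) B x. The Kap side follows Elias's description above; the BQN side is the editor's addition.]
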

01:15:02 [ML]

Didn't you implement it like the next day?

01:15:03 [EM]

I implemented it roughly one day after I heard [Conor expresses amazement]. I'm not using Twitter, but Marshall forwarded the screenshot and the same evening I implemented it and I pushed it the next morning.

01:15:18 [CH]

I didn't think this would come up, but I am not on Mastodon, but you are the single reason that I have thought about getting on Mastodon. Specifically [so] that I could interact. But I need like one more person because I have had no desire to be on Mastodon except when I found out that you weren't on Twitter [chuckles] and you were on Mastodon. I was like: "well, that might be reason alone for me to add another social network". So there you have it folks. Kap has the sort primitives, but Dyalog APL does not. And if you want to spell that tacitly, that's one of the famous examples that uses the right tack as the unary operation on the right. Which I think whenever we bring that up, Adám mentions that you will be able to do the shorter three primitive expression in the future.

01:16:04 [ML]

Yeah. Well, and you also need a different select primitive to do that. So.

01:16:07 [CH]

Right. Yes, yes, yes.

01:16:08 [ML]

Dyalog at present is a ways off from that, but version 20 maybe.

01:16:12 [CH]

We could get there.

01:16:14 [EM]

In the screenshot of that Twitter message, you also mentioned that you had looked at the code and couldn't make sense of it. I did make a short video showing how to add a function (a native function) to the language. It's a screen share, about 10 minutes long I think, that just shows how to add a simple function that swaps two elements in an argument array, to show how it's done. And I did it because you mentioned that you tried and failed. I'm not expecting people to start adding functions to it, but you can watch that at your leisure if you want.

01:16:58 [CH]

We'll throw a link in the show notes as always. Very quickly: as soon as you posted it and DM-ed it to me, I went and watched it. It has been on my horizon to go and add that, and it was a super helpful video; I think I could probably do it. But now you've added the sort primitives yourself. The reason I was trying to do it is that I was so excited when I was going through my 10 problems that there was one where you needed a sort. That was the other example where J needed to use "cap", I think, twice and BQN needed to use "Nothing" twice. I was like: the only thing Kap was missing was a sort primitive. And it was adding like three or four [characters], and I was like: Kap's going to be not just the most elegant for one of the 10, it's going to be the most elegant for like two or three of the 10 [problems]. I was so excited about it, I was like: "you know, how hard could this be?" I don't know Kotlin. I don't even know how to build a Kotlin project. But I went and downloaded the source. ChatGPT, 30 minutes later, told me the commands I was missing. Got the project building, and then I ran out of steam after about an hour. The code is actually quite clean. I'm just not enough of a Kotlin/Java/object-oriented [programmer]. I was looking at the inheritance structures, and there's a lot there.

01:18:09 [EM]

Actually implementing the sort is kind of difficult. It's more difficult than the more trivial functions. It wouldn't be too hard if performance were not important at all. But if you want it to be a little bit more performant, you have to use certain more complicated internal things. You want to have type optimization and handle specialized arrays differently and things like that. If you just wanted to do a simple sort, that would be, I don't know, 10 lines of code, probably. Something like that. But if you look at the implementation (if you look at the commit that I actually did), I think it's more like 40 lines of code, because it handles more special cases and things like that. Anyway, someone else wanted to talk and I interrupted. Sorry.

01:19:03 [CH]

Yeah, yeah. No, no worries. I think, yeah, Bob had something.

01:19:06 [BT]

Well, I was going to mention the idea behind tacit, because Rich was asking: why would you do it that way? And I think the deal is that you're really looking at representing combinators with tacit, and certain combinations stick out at you when you use them that way. And when we're talking about forks, that fork is a specific combinator. I don't remember which; Conor's got all the bird names and the letter names and stuff.

01:19:32 [CH]

As someone once mentioned: the ornithological equivalence. Someone mentioned it on a website once and I thought: that's a fantastic way to refer to the bird names. It's not bird names but "ornithological equivalence".

01:19:43 [RP]

Yeah, let's make it more accessible, yeah.

01:19:47 [BT]

[Laughs heartily] But the point is that there's certain functions [which] when they're put in certain combinations, you can isolate them and they're used over and over again. The fork is one of them. One of the things I do like about Kap is that you have used the double arrows as a way to designate those, which I think is an easier way to spot it and immediately tells you that that center function is dyadic. And that's something that is not apparent when you use the parentheses with J. Now, over time, you do get used to it with J and you learn how to recognize it. But just [as] a matter of recognition, I think the easier it is to make the initial contact recognizable, the quicker you get into: "this is a pattern that happens over and over again". So the sequential pattern is just another combinator and the fork is just another combinator. The back hook or the forward hook are just other combinators. When you can spot them, you can immediately understand: "this is what's going on". And that's why tacit is useful. Because you're not worrying about the arguments outside. You're really just worrying about the pattern that you're seeing with the functions. And I think it's a good way to program for that reason.

01:20:55 [ML]

One other thing I did notice in BQN that I think has made trains a little bit easier [09] (still, ambivalence is really the big problem, and I don't do anything about that) is having this context-free grammar where you can see at a glance whether something is a subject or a function, so whether it can be used as an argument or not. What I gradually came to realize as I programmed more with BQN was that I had been spending a fair amount of time in J reading names and then remembering whether they were functions or arguments. Even in normal expressions, that's more overhead; but in trains, to even figure out whether something is a train, you have to read the name. In BQN, you've got the capitalization for names, and even the syntax highlighting follows that, so it's much easier to make out the patterns quickly. And I guess you could do this in J or APL by just having a naming convention, which is something that I never did in those languages. But that's one thing that helps a little.

01:22:04 [EM]

Yeah, I had the same idea. Initially I was thinking about saying that if it starts with an uppercase character, it's a function; if it starts with a lowercase character, it's a variable. If you look at older code that I wrote (the Mandelbrot example, for instance), it has functions starting with uppercase because of that. I chose not to do it (and I definitely chose not to enforce it) because I just thought it looked ugly.

01:22:33 [ML]

Yeah, there is that.

01:22:35 [EM]

The other reason is that I also wanted to make it possible to define new symbols. A couple of episodes ago, I believe you were talking about symbols. In Kap, you can declare a character to be a symbol. Now I've overloaded the words: "symbol" should really be reserved for the same [meaning] as the Lisp symbol, so let's say a character, or a glyph if you like. If you declare a glyph to be what Kap calls a single char, that means that if you put two of those glyphs next to each other, they are parsed as two individual tokens, because it would be annoying if you had to write "+ +" and have "++" be one token [Marshall agrees]. So what you can do is take any character in Unicode and say: "this is my new glyph", and it is going to act just like any other function. Then you can declare it as a function, and it's added to the language like that. I wanted to add that. Once you do that, you realize: well, most characters in Unicode do not have an uppercase and a lowercase representation. At that point, I realized: "OK, this would have to be a convention." And then I realized that I didn't like the convention of having uppercase functions, because they're ugly, so I didn't do it. But probably a better idea (because, as you say, you have to know if it's a function or not, because it completely changes the semantics of an expression) ... [sentence left incomplete]. Now, thankfully, Kap does not allow you to redefine: once something is defined as a function, it's always a function. And it needs to be like that; otherwise, the parser wouldn't be able to parse an expression before running it.

01:24:26 [ML]

Yeah, you get all sorts of circular stuff.

01:24:27 [CH]

And that is (to echo what you've said, Marshall) one of my favorite things, not just the rules themselves, but the fact that when you're defining custom modifiers, or user-defined modifiers, in BQN, especially when you're in BQNPAD, as soon as you put that underscore in front of the function, it changes colors. And if you put the trailing underscore, it changes again. Small things like that ... [sentence left incomplete]. It's the same thing in Uiua: in the REPL, when you're typing and you've typed enough of the prefix that it recognizes whether it's unary or binary, it changes the color. That is such a small thing, but it changes the experience. You mentioned at the beginning, Elias, that REPL-driven programming is the way you prefer to program. You don't actually hear people talking about REPL-driven programming as the way that they develop. You hear it as, like: "oh, Python comes with a REPL". How many people are writing their Python programs in the REPL? But for certain languages, especially Lisps (especially Clojure), if you watch people livestream programming in Clojure, they've built all these tools that [let] you edit an expression live and see the result update. It is actually a way to develop. And when you come from a compiled language, like C++, it just completely changes the way that you feel, and the certainty; it's like: "well, I'm writing; we'll see if this even compiles, and once it compiles, we'll see if it's ...". You can incrementally see the result of the expression, and that completely changes the confidence that you have. Anyways, say what you will about the aesthetics of the trailing underscores. I actually use them in a lot of my personal C++ libraries, and I always get the comment: "isn't a prefix underscore reserved for the compiler?", and then anyone that is familiar with the standard will say: "actually, no, that's not the case." It's two prefix underscores that you're not supposed to use in your code, but a single one is okay. Anyways, I really like these small things, because they add to the readability of the code, in my opinion, like a massive amount.

01:26:21 [EM]

Yeah, the real-time error reporting and real-time displaying of results is something I want to ... [sentence left incomplete]. I have all the pieces in place to do that in the web version and the native version, but especially in the web version it involves writing a lot of web stuff, and I have to gather a lot of energy before I can sit down and do it, because I'm not very good at it and it's kind of uncomfortable. But all of the server-side components are there: you can send things to the syntax checker, you can get all the results back. It is a question of rendering it, and I'm a bit jealous of both Uiua and BQN, because their web interfaces are so nice. I am nowhere near being able to do that. I spent days just getting the new editor working so it looks better; at least I get token highlights now, but that took a while to implement [chuckles]. But I really would like to get to where Uiua and BQN are.

01:27:35 [ML]

One story about this, which I still haven't gotten to, is that we mentioned a few episodes ago an announcement about this new funmaker/bqneditor [10] and its way of typing the glyphs. You type the prefix character, and then it puts, next to every one of those clickable glyphs in the key bar, the next character you should type to get it. So after that episode, Elias implemented that functionality just based on my description [chuckles], and then went to the funmaker version and said [chuckles]: "oh, well, that's much better executed, I guess". But it really was the same thing, and it's nice to have it. So now I'm sitting there with my BQN REPL going: "gosh, I wish I had that functionality".

01:28:23 [EM]

I should have looked at that implementation first, because it's so pretty, and mine is ... [sentence left incomplete]. I mean, it does its job, but it's not great compared to that absolute work of art, which is the BQN one.

01:28:37 [CH]

Yeah, the last thing I think we'll mention before winding down: in the back of my head, I've been thinking about what would be amazing (whether it's an online version of this, or like a RIDE 2.0, because I'm not sure if you could modify RIDE, the remote IDE for Dyalog APL, to do this). Instead of different J Playgrounds, BQNPADs, and Kap onlines, we'd have just a single one, where somehow we create some array intermediate representation, where you set up your key bindings and whatever in an executable. Then you have a combo box with Kap, BQN, J, choose your array language, and all of this work that the different communities and people are doing to replicate each other's features, we could just share, and I think we're all ... [sentence left incomplete]. No one's trying to destroy anyone's array language or compete; everyone's borrowing these ideas and implementing them, and I think it would be fantastic if I could just go to one spot. Eventually, you could hit some button and say: I want to take this BQN expression and convert it to Uiua, or convert it to Kap, but that would be a whole other thing. The starting point is just one online REPL to rule them all. We've even got ngn/apl and dzaima/APL out there and all the different [APLs] (you know, NARS2000); we could technically hook them all up. There's nothing stopping us if you have the right kind of executable and piping between them. Will it happen? I'm throwing the idea out there. Maybe in 2024 some people out there will be like: "hey, let's get some open source initiative behind this". One can dream. Anyways, we have gone way over, but, you know [laughs], I was expecting this, because I knew I had questions, and obviously others were going to have questions. Any last comments that we want to make? Or maybe we'll throw it over to Elias: is there anything you want to encourage listeners to go and do, or are you looking for contributors? We'll throw the last thing over to you if you've got anything you want to plug. And, of course, you're on Mastodon for those that are there, too.

01:30:40 [ML]

If I can interject, I did link a website in the chat, which does let you run a whole bunch of array languages and switch between them, but the editor for it is very basic. So, if anybody wants to do that with a fancy editor, maybe that project would be a starting point.

01:30:58 [RP]

It's got Incunabulum in it.

01:31:00 [CH]

Look at that. All right, so.

01:31:02 [ML]

Interesting. All right, so we've already got BQN; we've got J; we've got APL/?; we've got a bunch of Ks, we've got Kap. So, we've already got a starting point, folks. We've just got to combine this. Uiua's not here yet, though. So, we're missing Uiua, we're missing a couple. We've got to add the missing ones, combine this with BQNpad. All right, folks, it's already started. Anyways, back to Elias, if you want to send us off for any last announcements or calls to action for our listeners.

01:31:35 [EM]

Well, I wrote down a little list of the things that are unique in Kap, and I was looking at it to see if there was anything that should have been mentioned. There are two things. First of all, one thing that wasn't mentioned: you said that it's implemented in Kotlin. It's actually implemented in multiplatform Kotlin. So, right now, I can compile it to JavaScript, to WebAssembly, to native Linux and to the JVM. If anyone wants to help make it compile for Windows or macOS, that would be fun. I don't have the necessary knowledge; I don't know enough about Windows APIs to do it, and I don't really have a Mac to play with. So, that would be useful. The reason I'm mentioning this is the website you mentioned. The guy who made it ... [sentence left incomplete] I was talking to him on the chat, and I explained to him how to integrate the Kap JavaScript version. You run the Kap JavaScript version in a WebWorker, and then you have an API, or a protocol, that you use to communicate with the WebWorker. I have changed that since then, and I think he's running off a pretty old version; I don't think it's been updated for the more modern version. So, yeah, if someone wants to help out with that or build something new, they would probably have to take a look at how Kap is doing it now.

01:33:18 [EM]

We never had the opportunity to talk about the numeric tower in Kap, [11] which is stolen from Common Lisp, more or less. It supports bignums and rational numbers. And there was some talk about rational numbers in one of your older episodes, I think. If you take 2 and divide it by 3, the result is not some decimal number (0.6666 or whatever); it's actually 2/3. If you multiply that by 3 again, you get exactly 2 back. The benefit, of course, is that any rational computation is exact, which is closer to what a lot of people expect. You have to be explicit if you want to use floating point. Of course, rational arithmetic is way slower than floating point, but it goes hand in hand with the idea that the simple, straightforward approach, someone writing the most naive code, should just do the right thing. If you want to do something more complicated, well, fine: you add 0.0 to a number, it becomes a floating point number, and after that, you just do floating point computations. So, rationals, I like! I wanted to give a shout out to rationals, because I think far too few languages support them. It's really only the mathematical languages like Mathematica (and whatever) and Lisp, of course. There aren't many others that do it, and I think it's kind of cool. So, that's something I wanted to promote [chuckles]. Other than that, there are of course a million other things that could be discussed, but that would put us off on a completely different tangent that would probably last another hour, so we probably shouldn't do that.
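
[Ed. note: a minimal sketch of the rational behaviour described here; the expressions follow Elias's description directly, and the display of results is an assumption.]

    2÷3         ⍝ exact rational: 2/3, not a truncated decimal
    3×2÷3       ⍝ exactly 2 again; no rounding error
    0.0+2÷3     ⍝ adding 0.0 switches to floating point: ≈0.6666666666666666
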

01:35:12 [CH]

No, don't worry. If you are willing, we will definitely have you back to dig into all of these questions. This has absolutely been fantastic, and yeah, my mind is absolutely buzzing. It was everything I hoped for and more, and this will definitely get labeled, at least on the combinatory logic side, Tacit 5.2. I have a bunch of notes here that we will save for the next episode, because if I mention any now, we're going to end up talking for another 10 or 20 minutes, like half the time we do. We're already past the hour-and-40-minute mark; last time I looked, it was an hour and 30 minutes [chuckles]. We'll see how much Bob edits this down. But yeah, this was absolutely fantastic. Thank you so much for taking the time to come and chat with us. And hopefully this will put Kap on the radar of some people it wasn't on before. They'll be checking out Kap, and the parallelism, the laziness, and the different tacit model, I think, are all just awesome features. Hopefully, this will inspire folks to check it out.

01:36:08 [EM]

And if I can just finish off by saying: if anyone wants to help out, the biggest thing that I need is anyone at all willing to just try it out. If they do, let me know on Matrix, on Mastodon, through the bug tracker, why not, or email, or whatever means of communication you like. Because I might not be on Twitter, but I'm available in other places. Any form of feedback, positive or negative, is good, because if someone feels that something is missing, then I'll be happy to add it. My biggest problem right now is deciding what to work on, and if it's something someone wants, then I will work on that, because it's very, very satisfying to know that you've implemented something that someone is interested in. Which is why I implemented the sort immediately after I learned that ... [sentence left incomplete]. I had been thinking about implementing sort primitives for a while; I just didn't bother, because it's three characters, what's the problem? Then someone mentioned it, and it took 15 minutes and it was there. So, the best thing anyone can do is just to let me know, and I will do my best to accommodate whatever feature requests fit the general design of the system as a whole. So, yeah, that's it.

01:37:46 [CH]

And to add on to that list, if you don't have any of that stuff and you just want to send us an email, you can reach us and give us feedback for either the podcast or Kap, and we can pass it along. I'll throw it over to Bob, because he always knows the email.

01:37:58 [BT]

The email is contact@arraycast.com, and if you want to get in touch with us, please let us know. We also have show notes. I'd like to give a shout out to the transcribers, because we'll have a transcription. Because we're actually recording this a bit late, the transcription may end up coming out a bit late; but three or four days after we're on, this won't make any sense at all, so I'm not even sure whether I'll keep it in [chuckles]. Two other things I'll add: I love the shout out to rational numbers. J does have rational numbers, by the way. I like them too; I think that's a great thing, and I think it's really great that Kap's put them in. Of course, it is inherited from Lisp and those languages. And the final thing is that Elias completes our continents at this point, because he's in Singapore. There you go. Asia is represented.

01:38:51 [CH]

There we go. And yeah, you actually mentioned that we've talked about rational numbers. I'm not sure if it was rationals, but we've definitely talked about extended precision with Henry on a previous episode about J. It also came up when we were talking to Rob Pike about his Ivy project, because he's done a lot of work there in terms of high-precision stuff. We'll link those in the show notes as well. I mean, you should just go back and listen to all the episodes, because the content's phenomenal. And do we need the views? I don't know. Maybe we don't need the views, but you know.

01:39:18 [BT]

And it was actually Raul Miller as well. That was one of the things that he was talking about: porting [and] improving the rationals and extended integers in J.

01:39:23 [CH]

Right.

01:39:26 [EM]

Rob Pike is on Mastodon, by the way, if you need a second person.

01:39:29 [ML]

That's true.

01:39:30 [CH]

Oh, there we go. I think he has a handle on Twitter. Actually, he might not. Maybe that's it. All right, folks, we'll be signing up for Mastodon by the time Episode 73 rolls around. But yeah, once again [chuckles] Elias, thanks for coming on. This has been fantastic. And with that, we will say, happy Array Programming!

01:39:46 [everyone]

Happy Array Programming!

[MUSIC]