Transcript

Transcript prepared by Bob Therriault, Adám Brudzewsky, Sanjay Cherian and Igor Kim.
[ ] reference numbers refer to Show Notes

00:00:00 [Conor Hoekstra]

So, so far we've got: you can solve it nicely in an array language if you just change the problem.

00:00:07 [Marshall Lochbaum]

Well, if you restrict the problem to the most useful form of the problem.

00:00:13 [John Earnest]

OK, the most useful form of the problem. You can put that on a T-shirt.

00:00:19 [Music Theme]

00:00:29 [CH]

Welcome to another episode of ArrayCast. I'm your host Conor and today with me I've got four panelists. We're gonna go around and do brief introductions. We'll start with Bob, then we'll go to Marshall, then we'll go to Rich, and then we'll go to our special panelist today, John.

00:00:43 [Bob Therriault]

I am Bob Therriault. I am a J enthusiast. Very enthusiastic about J.

00:00:48 [ML]

It's the earliest I've ever announced before. I'm Marshall Lochbaum. I was formerly a J programmer. I worked at Dyalog for a while. Now I develop BQN.

00:00:57 [Richard Park]

I'm Richard Park. I'm an APL programmer and educator working for Dyalog Limited.

00:01:03 [JE]

And I'm John Earnest. I'm a professional k programmer, and currently I'm working on an interactive development environment called Decker with an array-flavored language.

00:01:16 [CH]

And as mentioned before, my name is Conor. I am a research scientist/polyglot programmer with a huge huge passion for array languages, including all the languages we talk about on this podcast. And yeah, welcome to John. I think if we've got regular listeners, they'll know that Stephen, I think, has departed and is in the midst of a bicycle cycling trip right now. And so as a replacement for the representative of the k/q, mostly k today languages or language, we've got John, who was a past guest that we interviewed two different times. I believe on episode 41 and 43, if I am recalling those numbers correctly from when I looked at them this morning. [01] I highly recommend you go and check out those episodes. Especially, I mean, both of them are fantastic, but I really, really enjoyed the history that you gave, John, of the different k versions, because I think that is something that there are little bits and pieces of on the internet, but there's not a ton of folks that are able to speak super coherently to going all the way back to the beginning up to where we are now with K7 and K9 or Shakti or whatever it's called. So huge plug if you haven't heard those episodes already, go back and listen. We'll link them in the show notes. And yeah, welcome to the suite of panelists that we have here, John.

00:02:30 [JE]

It's great to be back and to be promoted, I guess.

00:02:33 [CH]

Yeah, yeah. It's a small, small promotion. We're going to create... We may or may not create a little section on the ArrayWiki APL site that mentions all, you know, part-time panelists, because there's been a couple of them now. Marshall actually used to be one before. Were you a guest first or were you a guest panelist first, Marshall?

00:02:53 [ML]

A guest.

00:02:54 [CH]

So you went from guest to guest panelist.

00:02:56 [ML]

Yeah, and now I'm even second panelist, which is just crazy.

00:03:00 [JE]

Senior principal panelist.

00:03:02 [ML]

Nope. Nope, still second panelist. I'm working on my paneling skills and maybe I'll be first panelist someday.

00:03:10 [CH]

We should go full corporate and give people titles. So yeah, executive vice president panelist.

00:03:18 [BT]

As long as I get a corner office.

00:03:20 [CH]

No, you get the office that you have, because this is all remote.

00:03:25 [CH]

All right, so we've got.

00:03:26 [CH]

All right. We've got a few announcements slash follow-up slash ad, so I guess we'll go to Bob with the first announcement. And then after that we'll go to Rich with two announcements, and then I'll spin the bottle from there.

00:03:35 [BT]

And I'm going to go off script right off the bat because I said I was going to make an announcement.

I will make that announcement, but the first thing I want to do is congratulations to Conor's dad, who won a national investigative journalist award for Canada. That is really impressive. Congratulations to your dad. [02]

00:03:54 [CH]

Thanks. He will not know that this congratulatory statement was made because he definitely doesn't listen to this, but I will pass it along and I'm sure he'll be very grateful for that. Uh, thanks.

00:04:08 [BT]

Well, I think that's a big deal. Good for him. Uh, and the announcement I was going to make was, uh, Ed Gottsman has been working on the J Wiki browser, and we actually have about a seven minute video that we've put together that just gives you the functionality of it. Now, this thing works through the JQT interface at this point, but it gives you access to the J Wiki. You literally hover over the different areas and it provides the pages of it. So it's very, very quick. And not only does it do the J Wiki, it also does all the J forums. So if you're looking back through the forums, you literally hover over a year and a month and all the threads show up. And then as you go to an individual thread, all the postings show up. So it's actually really worth looking at. We're hoping to expand it in the future and give general access to people. Right now it's just a prototype, but it's a very impressive prototype.

00:05:02 [CH]

And is it online at the moment that you can go play around with it, or this is... Is it just a video?

00:05:07 [BT]

No, you cannot go play around with it yet. You know, it's not quite yet. He's just in the process. He's got four or five people testing it. I'm one of them. I'm one of the lucky few. And he is looking to expand it, but it'll probably be a gradual rollout over the next month or two, I'm guessing. And then we'll have it out as a general release. You'll still need to go through JQT.

So you'll still need to have J downloaded at this point. But we are looking at other things, possibly trying to do something with WASM or something like that.

00:05:35 [CH]

All right, cool. So for now, check out the link in the description to a video preview of what will be available in one or two months. Rich, over to you.

00:05:45 [RP]

So yeah, if you can't get enough array language content, specifically APL content, there is a new episode of the APL Show, [03] this little podcast I do with another regular ArrayCast panelist, Adám Brudzewsky. We're talking about primitives, primitives that we'd like to see, primitives that have been modeled, but mainly ones that don't exist yet in Dyalog APL specifically, at least in that episode. But also at the beginning of that episode is an announcement that I'm not 100% sure has been brought up very recently on the ArrayCast. I know it has probably been talked about in the past. And that's the development of a proper array notation, a literal array notation for APL. So currently, for anything beyond vectors, beyond lists, you sort of have to construct the array using primitive functions and things like that. This idea has been talked about for a long time: to have a literal way of denoting multidimensional nested arrays in your code. There is finally an official proposal, ostensibly written by Adám for Dyalog Limited, but because this is for APL, we want to establish this as something that any APL implementation might want to use, and so we think we should make it consistent across those as much as possible. So we're currently asking for feedback on this proposal. So we'll put a link to the APL Wiki page on array notation in the show notes, no doubt. And there will be... well, there's an info box at the top that gives you more information about where to find the proposal and where to give feedback.

00:07:40 [CH]

Awesome. So yeah, if you have feedback to give, check out that link in the description as well. We'll go to John for the ad slash next announcement.

00:07:50 [JE]

Sure. So I've been asked to point out to our listeners that 1010data is currently hiring k developers. [04] So if you are a person who is interested in working with K3, and probably also some overlap with C and/or web technology, you should check out 1010data. If you go to 1010data.com and check out the careers section, you can take a look at the job listings there.

00:08:23 [CH]

And we'll once again include a link in the description in case by the time you finish this episode, you forgot what the link was.

00:08:28 [ML]

So, for 1010data.com. Why didn't you sing it?

00:08:34 [CH]

Come on. Next time, next time you come on with the plug, we're gonna get you to sing a little jingle.

00:08:38 [JE]

I've been sternly instructed by the marketing department that I'm not supposed to sing this anymore.

00:08:43 [CH]

This is a problem that you've run into before, I understand. All right, well, yeah, last but not least, we'll hop over to Marshall, who's got a follow-up that may or may not bleed into our topic for today.

00:08:56 [ML]

So, yeah. So last time we talked about this problem: you have a list and you want the largest sum of any slice from the list, any contiguous subarray. The maximum subarray sum problem. And the well-known algorithm for this is called Kadane's algorithm, [05] where the idea is that you go across the array sequentially and you build up what the largest sum is that ends where you currently are. And then you also keep track of the largest sum that ends anywhere. So both of these are pretty easy to maintain, and that turns out to be fairly fast. And Conor said what he wants to do is to write this algorithm in a typical style in APL, with a scan and then a fold, and then just have the language, you know, do whatever's fastest. And as part of that, he can clarify maybe, but he said, well, the fastest way to do this is to interleave the loops and write it in sequential C. And I said, I'm not sure about that. And now I'm very not sure about that.

Because what I did was I looked at how you would parallelize this, and I found there's actually a nicer, more algebraic way to approach it, which is, instead of just having this left-to-right information about, you know, the largest sum that ends here and the largest sum total, you really want four values, four values for any slice of the array, not just one that starts at the beginning. So you want to know what the largest sum is total, but also the largest sum that ends at the right, the largest one that starts at the left, and the sum of the entire array. And then once you have that, you can combine slices in any order, which means that you can rearrange it and do it in parallel. So I did a handwritten SIMD implementation of that, and I found that it's faster than the C implementation. I don't know what conclusion you really draw from this, because you couldn't write the algorithm in APL now, really. I mean, you could write something like it, but it wouldn't resemble the SIMD code at all, because you have to think about how things are packed together in registers. And actually I even had to split the whole array in half in order to fill up a whole register. So the algorithm is kind of more different from Kadane's algorithm than I thought. But on the other hand, you know, maybe a language that is really performance oriented should be pushing you towards identifying these kinds of things: if you have a fold, you should try to make it associative, and this is a way to make it associative. So C doesn't really encourage you to do that. I don't think any other language does right now, but that, as far as I know, is the fastest way to do it. And also, that way you can split it across cores, run it on any number of cores, run it on a GPU, whatever. So yeah, that's the follow-up.
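
For listeners who want to see the two formulations side by side, here is a rough Python sketch (my own, not Marshall's SIMD code) of plain Kadane's algorithm and of the four-value summary that makes the combine step associative, so slices can be reduced in any order and hence in parallel:

```python
from functools import reduce

def kadane(a):
    best = ending_here = a[0]
    for v in a[1:]:
        ending_here = max(v, ending_here + v)   # best sum ending at this position
        best = max(best, ending_here)           # best sum ending anywhere so far
    return best

def summarize(v):
    # (best slice sum, best prefix sum, best suffix sum, total) for a one-element slice
    return (v, v, v, v)

def combine(l, r):
    lb, lp, ls, lt = l
    rb, rp, rs, rt = r
    return (max(lb, rb, ls + rp),   # best: inside left, inside right, or spanning the join
            max(lp, lt + rp),       # best prefix of the combined slice
            max(rs, rt + ls),       # best suffix of the combined slice
            lt + rt)                # total sum

def max_subarray_sum(a):
    # combine is associative, so any grouping (sequential, blocked, parallel) agrees
    return reduce(combine, map(summarize, a))[0]

assert kadane([1, -3, 4, -1, 2, -5, 3]) == 5
assert max_subarray_sum([1, -3, 4, -1, 2, -5, 3]) == 5
```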

00:11:53 [CH]

So this is fantastic. I agree with everything you just said. And what I should say, and I think this is something I was just talking about with someone in the last couple of weeks, is that I don't think I personally do a good enough job articulating what I'm thinking with my GPU brain or my CPU brain. And I think a lot of the time I'm implicitly thinking with my GPU brain, because that's mostly what I do for work. And then Marshall ends up responding with vectorization or something. And then I'm like, "Okay, yeah, that's like a 4x speed up." But I was thinking about something on the GPU.

00:12:25 [ML]

Yeah, well, in this case, it's not a huge speed up. It's like 30% or something, I think. I mean because there's a lot of overhead in packing it all into vectors, and you know, having four pieces of information.

00:12:36 [JE]

And dealing with the ragged edges, right?

00:12:39 [ML]

I just didn't bother with that. I mean, it's a constant cost.

00:12:44 [JE]

But it adds a lot of complexity to the implementation every time you do, you know, one of these striding SIMD things.

00:12:51 [CH]

So what I was going to say is that when I was talking about folding both loops, AKA one that's a scan and one that's a reduce, into a single loop, I was talking about that being the fastest, like, sequential implementation.

00:13:04 [ML]

I mean, yeah, that's the fastest that you can do, like, directly from what you write, without introducing new algorithmic thinking. I wasn't sure. I mean, I thought maybe there's a way to take this scan and do something that's fairly mechanical, that's a known process, and turn it into... like, if it was associative, then yeah, there are standard ways to implement an associative scan fast. With this, you have to, you know, add this extra data and do a lot of thinking. Maybe there's a systematic... maybe there's, you know, some transformation that turns it into that, but I don't know what that would be.

00:13:46 [CH]

So my, my ask... I completely agree with you. And my hesitation is, like, whether that translation is possible: this "keep four pieces of information, turn this into an associative operation that you can then do in parallel" thing. I don't know that the statement I'm about to make is true, but I think it is possible, because I know that I've talked to Troels Henriksen about this problem and problems like it and how he approaches it in Futhark. [06] And he says it's an extremely challenging problem and doesn't know of an existing way to do it. But I also know that the folks out of Carnegie Mellon have a library called ParlayLib, a collection of parallel algorithms, where they have a Kadane's implementation of this problem, and they do the same thing: keep four pieces of information. And I know that there are enough problems that are like this, where if you just do it sequentially, it's not parallelizable, but if you modify the operation... I think even Guy Steele gave a talk back in the 2000s when he was working on Fortress, which is like a parallel Fortran, and in that talk he talks about a similar problem called rainwater, where you're given... there's different versions of the problem where, you know, it's a skyline of a downtown or something; in this one it's supposed to just be like a mountainous range. Anyways, I don't need to explain the problem in detail, but he does this exact thing where he solves it sort of serially, but then shows: hey, look, if we try and do some algebraic thing where we build up a monoid or half a monoid or whatever the algebraic term is, then once we have this operation we can go and parallelize it. So I don't know that it's possible...

00:15:22 [ML]

Well, I mean it's clearly possible because the two the two methods give the same result so.

00:15:28 [CH]

So no, I don't... it's the mechanization of it. I want to write the... and also we should clarify, you said APL solution earlier. Technically the APL one's broken because their scan's broken. You know, insert extra inflammatory statements. But it'd be q, and it works, and that's where you write the, you know, max reduce and then scan with your fork for the binary operation. Like, I want to be able to write that and then have a sophisticated enough interpreter or compiler that sees that expression and goes, oh, we can parallelize this by generating what you said, which basically looks extremely different and is a different algorithmic solution. But like, that is my pipe dream: write the easiest, most expressive thing that, naively, when you look at it, it's like, oh, you're materializing an array from the scan and then doing a reduction on that. And, like, best case is you're not materializing that array; best best case is it completely changes what you would think naively happens and creates this associative operation that can then be parallelized.

00:16:27 [ML]

Well, I mean, so we know that's technically possible. What you do is you go through sequentially every possible algorithm, and for each algorithm you go through every possible proof that it's equivalent to the algorithm that you asked for, and then finally find one.

00:16:45 [JE]

You know, I think Rice's theorem might have a bone to pick with that.

00:16:49 [ML]

You can figure out the best possible algorithm with a bound, like, if you bound the length of the algorithm and the length of the proof. So I mean, and then we know, because I did it, that it's possible to have this other algorithm. So eventually that would find this algorithm, but it would take a long time.

00:17:11 [RP]

But is that the Mathematica?

00:17:13 [RP]

So is that the Mathematica, the Wolfram model of computation? Isn't that their idea, more or less? You type in "solve Kadane's" and it comes up with it.

00:17:21 [JE]

Well, the Wolfram model is also the somewhat more realistic version of what Marshall is proposing, which is where instead of diagonalizing on the fly when you compile things, you build a database of transformations that you can do algebraically on code. And as long as you're dealing with compositions of primitives that are well behaved in certain ways, then you can identify special combinations and optimize them. The problem is that there is an enormous combinatoric explosion of possible combinations that you might want to make special.

00:17:55 [ML]

Well, I mean, so obviously another thing the compiler could do is just have Kadane's algorithm in particular hard-coded.

00:18:03 [JE]

Just make it a primitive then and then problem solved.

00:18:06 [ML]

We know there are various ways that a compiler could handle this. And the question is, you know, is it practical for it to handle this and similar problems? And that's what we don't know. And part of the reason why it's hard to figure out whether it's practical or not, is that we can't prove it's impossible because it is possible. So it's hard to say where like, how much could a compiler do if you just gave it the scan and reduce thing as input?

00:18:32 [CH]

I think it's an incredibly interesting problem to solve though. Like, in my pipe dream world there is a hierarchy of algorithms, and there's actually only a very small set of fundamental algorithms at the root of your hierarchy: you know, there's reductions, there's scans, there's maps, you get into the cut category as they call it in J. But you can very quickly implement 80% of other algorithms just as specializations of those fundamental ones. And then the problem becomes: how can you define a coherent set of transformations just based on that primitive, fundamental, small set? And if you can do that correctly, I think, like you said, the generalization of this, where we're not hardcoding recognizing the eight symbols that spell Kadane's in BQN, but a generalized "oh, we see this pattern, we can do X, Y, and Z." I think you can actually get quite far with that.

00:19:37 [JE]

There's a language design principle that Stevan Apter [07] and I have talked about a few times: tiling space. You think of the collection of primitives that you have available in k. There are first order compositions of those, and second order, and so on. It is clearly the case that k is a Turing complete language, and it is clearly the case that there are many, many simple compositions that are useful. But the open questions are: if you change the set of primitives that you have available, would there be a wider range of things that you cover that nicely fall out of simple compositions? Are more of them useful? And are there things that are just ugly, nasty, inherently twisty knots that can't be nicely decomposed into a set of primitives, or that can't be built up from a set of primitives?

00:20:37 [ML]

Well, I think that is actually one of the really nice properties of programming with primitives: they don't cover all this space very easily.

That means, and yes, there are definitely these horrible twisty things. You take a nice problem and then you introduce a bug. So you say, "I have this nice problem. Also the third element of the result should come out wrong." Right. And that's horrible and twisty.

And so it's very good that these array algorithms made out of primitives can't easily get into that space because it's much harder to have a bug.

00:21:10 [CH]

Yeah, I was going to say I had the exact same thought about primitive choice as like a combinator/juxtaposition choice in array languages. [08] Like, to my knowledge, the only languages that have experimented with combinators and composition patterns are the array languages. Like, J was the first one that chose as their two train the S combinator and the D combinator. And then for their dyadic-- or for their three trains, phi combinator and phi 1, which they call hook and fork and some other stuff. But then APL came along, Dyalog APL, and like changed that choice. They kept the phi and phi one combinators, but they changed S and D to be B and B one. But like, that was it. I think every other language since then has sort of done that.

I know

00:21:57 [ML]

Well, the stack based languages do a lot of combinator type stuff.

00:22:01 [CH]

Yeah, but that's kind of different. I don't think any of the stack languages have, at least, implemented them as a part of the standard library.

00:22:07 [ML]

Like, these languages aren't all going to do the combinators the same anyways.

00:22:11 [CH]

Anyways, I just think it's interesting that, like, you know... and the same thing with the alternating, you know, three trains versus four trains, five trains: that model just is what it is. But no one's ever experimented, like, oh, instead of it toggling between even and odd, what if it's like three train, four train, five train, and then it cycles, right? Like, I'm not saying it's a good idea. I'm just saying there is an infinite number of options, and the array languages are the only ones to explore it. They tested it out once in J, Roger thought that was a mistake, they switched it, and we called it a day since then.

00:22:48 [JE]

Well, and arguably, you know, there's the dissenting opinion in k, which has, you know, its straight train composition model instead of having forking as, you know, a primitive composition.

00:22:59 [CH]

I mean, a lot of the times that's what you want, right? Simple problems. Pipelines. Yeah, that's the, you know, a lot of the times I like to do point-free programming or tacit programming and then I'll be building up just a sequence of unary compositions and it's like, well, I've got in BQN, I've got two options. I can just keep them in the braces and have the x or I can remove the braces and then add a bunch of like, what is it called? Nothing in BQN where the dot dot dot, which is like, I actually like it quite a bit because it's,

00:23:28 [ML]

I mean, the only difference between k trains and this restricted set of BQN trains is Nothing. [09]

00:23:33 [CH]

Right, yeah. Which is a funny joke.

00:23:40 [JE]

It's also what TAC and APL... right?

00:23:43 [ML]

So all you would have to do to translate a k train to either APL or BQN is just add parentheses around every monadic application. So in BQN, you also have this alternate thing where instead you can put Nothing as the left argument. And then it says, well, that means there's no left argument. So that's a little nicer because you need one symbol instead of two. But yeah, so k trains, other than the syntax, which is important, are a special case of APL and BQN trains, not quite J trains.
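
To make the train and combinator vocabulary in this discussion concrete, here is a small Python model (an analogy of my own, not how any of APL, J, BQN, or k actually spells these) of the fork, the Dyalog-style atop 2-train, and the J-style monadic hook:

```python
import operator

# fork (3-train): (f g h) x -> g(f(x), h(x));  x (f g h) y -> g(f(x,y), h(x,y))
def fork(f, g, h):
    def derived(x, y=None):
        if y is None:                   # monadic case: the phi combinator
            return g(f(x), h(x))
        return g(f(x, y), h(x, y))      # dyadic case: the phi1 combinator
    return derived

# atop (Dyalog-style 2-train): (g h) x -> g(h(x));  x (g h) y -> g(h(x, y))
def atop(g, h):
    return lambda x, y=None: g(h(x)) if y is None else g(h(x, y))

# J-style monadic hook (2-train): (f g) y -> f(y, g(y))
def hook(f, g):
    return lambda y: f(y, g(y))

mean = fork(sum, operator.truediv, len)     # the classic fork: sum divided by length
assert mean([1, 2, 3, 4]) == 2.5
```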

00:24:15 [CH]

All right, well, this discussion has been great. Should we segue into a different problem? If the listeners all haven't, because that definitely wasn't dense at all.

00:24:28 [BT]

Now that your brains are warmed up.

00:24:30 [CH]

Yeah, so at the end of our last episode, Marshall and I and the other panelists got into a discussion because I just randomly brought up this problem. Let's get the exact name of it; it'll be in the show notes. It's called Sliding Subarray Beauty. [10] I believe it was a part of LeetCode Contest 342. Doesn't really matter. But I brought this problem up because I like to solve these problems. And when I stumbled across this one, my first thought was that this is a problem that doesn't actually lend itself very well to the array languages. And honestly, a lot of the times when I hit these problems, I just skip them. I'm just like, well, this was not meant for this paradigm. Move on. And then the next time I see a problem, and it's like, oh, you know, calculate the maximum number in the columns of this matrix. And I'm like, oh, boy, oh, boy, that's two characters. I'll solve this one. And so I brought this up, and then we ended up having a mini discussion about it. And so I think the rest of this episode, for the next 45 minutes or so, will be kind of just...

00:25:32 [ML]

He knows it's going to happen for 45 entire minutes.

00:25:37 [CH]

Yeah, that's a bit, you know, what's the word? The hubris. Yes, I should take a step back because we rarely get five minutes of knowing where we're going. So I guess I'll start off by re-articulating the problem and then probably I'm going to change it slightly immediately. So I mean, everyone looked at this problem statement, and then I'm going to change it because I think it makes it more interesting. So you're basically given an array of integers. And if we're being explicit, the problem states that these values can be between -50 and 50. This is my first change, is they can be any integer value, because I think that's how I explained the problem in the last video.

00:26:21 [JE]

-50 to 50 is one of those like immediately suspicious things in a programming puzzle, isn't it?

00:26:27 [ML]

I think I want a hundred-and-one-element lookup table.

00:26:31 [CH]

Exactly. That was what a couple of the solutions were, and I was like, well, I mean... and yeah, so it's more interesting if it can be any integer value, whatever, int32 or int64, the details don't matter a ton. And then you're going to be given two other values, k and x. k is going to be the sliding window length, and x is going to be sort of the x-th smallest value that you're looking for. And so I'll read out the second example because I think it's the easiest. You're given the values -1 to -5. So -1, -2, -3, -4, -5. And you're given a sliding window value of 2, so you're going to look at two elements at a time, and then you're given an x value of 2, which means you're going to be looking at, technically the largest... the second smallest, but in this case because k is equal to x, you're always going to be looking for the largest. And so then you basically just do a sliding window, you know, an n-wise reduction in APL, you can do it a couple different ways in BQN, and you just calculate what's the x smallest. So in this case, you could technically do a...

00:27:35 [ML]

So what do they want to know, the x smallest for each window?

00:27:41 [CH]

For each window, yes.

00:27:42 [ML]

So your result is a list, yeah.

00:27:44 [CH]

So yeah, the wrinkle that I haven't mentioned is what we're gonna ignore.

00:27:48 [JE]

Oh oh.

00:27:49 [CH]

And that's because I just think it's a... Well, I mean, we can talk about if...

00:27:53 [JE]

I mean, it really twists the knife on having a nice, neat solution to it, though.

00:27:59 [CH]

Well, so we can talk about two different solutions, but the wrinkle is that it only wants you to say what the x smallest is if it's negative. So if the x smallest... so if you're looking for, in this case, the second smallest value, but that value is positive, it basically wants you to floor it to zero. Which, I mean, I don't think... for my solution, it was an extra two characters, zero floor or whatever.

00:28:24 [ML]

Yeah, I mean, you can just do it at the end.

00:28:25 [CH]

But it sounds like maybe for John's solution, it is a bit nicer.

00:28:31 [JE]

Yeah, it just adds some complication.

00:28:35 [CH]

And I think, maybe... so, ramble over, maybe one person here can very briefly rearticulate what I said, just in case there's a listener that's on a run or doing dishes and doesn't have the ability to go look at this problem statement and read it for themselves, because sometimes it's good to hear a problem statement two different ways. Anyone want to?

00:28:55 [JE]

Okay, so you're given a list. You're asked to view it in sliding, overlapping windows of size k, and for each of those windows, they want to know the x-th order statistic of the negative numbers capped at zero in that sliding window.
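
For listeners following along without the problem statement in front of them, a minimal Python sketch of the naive windowed approach just described (sort each window, take the x-th smallest, floor it at zero):

```python
def sliding_subarray_beauty(nums, k, x):
    out = []
    for i in range(len(nums) - k + 1):
        window = sorted(nums[i:i + k])   # sort each length-k window: O(k log k)
        xth = window[x - 1]              # x-th smallest (1-indexed)
        out.append(min(0, xth))          # report it only if negative, else zero
    return out

# the second example from the problem: windows of 2, second smallest each time
assert sliding_subarray_beauty([-1, -2, -3, -4, -5], 2, 2) == [-1, -2, -3, -4]
```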

00:29:23 [CH]

Perfect. Yep. I understood that. Hopefully the listener understands that. Before we actually talk about the solutions, I thought, and this is what we disagreed on, and maybe Marshall has had time to realize I was right or definitely realized that he was right and I was wrong.

00:29:40 [ML]

Or explain that we didn't disagree.

00:29:43 [CH]

I thought the time complexity of this was Big O [11] of the length of the array times k log(k). Because you're basically doing a sort. Like, the way that I, naively I should say, this is for the naive solution; I'm not sure if there's a better one out there. But the way that I did this in BQN was basically just use windows or ranges, whatever the dyadic version of range is.

00:30:10 [ML]

Windows, yeah.

00:30:12 [CH]

Windows. And so that creates a matrix. And then I basically sort each row, and then I just pick the x-th. So you're doing an n log n, linearithmic, sort, which is in the length of your window k. So that's k log k. And you're doing that, technically it's n minus x, but x is constant, so it's big O of n times k log k. But I think Marsh...

00:30:34 [ML]

Well, and on the restricted range, it would be a linear-time sort. So then it would be n k.

00:30:43 [CH]

Right, correct. So I'm not sure, are folks on the--because I remember, Marshall, maybe it was that we were not thinking--because I think you had said that there was like a strictly linear solution that you were thinking about.

00:30:54 [ML]

Well, I don't remember what exactly it is, but for--if you want just the total minimum value from every window. So this is x is zero, I think, or maybe one, depending on how they're indexing it.

00:31:10 [CH]

Yeah, x is one.

00:31:11 [ML]

Yeah, x is one. Then I believe there's a way where you keep a queue of all the smallest elements you've seen. Well, some sort of queue of smallest elements, and you'll update this as you see new values. And the way it works is you're not guaranteed to have some number of operations at every step, but the total number of operations should come out linear. But I'm not exactly sure how, like what the algorithm was. I'd have to work it out again.
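
The linear-time idea Marshall is reaching for here is usually implemented with a monotonic deque; a rough sketch (not from the episode) for the sliding-window minimum, that is, the x equal to one case:

```python
from collections import deque

def sliding_min(nums, k):
    out, dq = [], deque()                    # dq holds indices with increasing values
    for i, v in enumerate(nums):
        while dq and nums[dq[-1]] >= v:
            dq.pop()                         # drop values that can never be a future minimum
        dq.append(i)
        if dq[0] <= i - k:
            dq.popleft()                     # drop indices that have slid out of the window
        if i >= k - 1:
            out.append(nums[dq[0]])
    return out                               # each index is pushed and popped once: O(n) total

assert sliding_min([4, 2, 12, 11, -5], 3) == [2, 2, -5]
```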

00:31:38 [CH]

Yeah. So my thought, and I think I said this last episode, was that, you know, the time complexity that I just stated was for doing it naively in, like, an array language, 'cause I don't really know how to translate my C++ solution of this. The idea is that there's a data structure called a min heap, which has log n insertions, that you use for... like, there's a classic interview problem of, like, you need to keep track of some sort of, what do they call it, telemetrics data for, like, the last three days, and so you're updating some blah blah blah value. And the way you do it is you basically look at the first k values, create a min heap from that, and then, because you know what value you're popping off each time you step, you do like a delete, an insert, and then you can get the value very quickly. And then I think the time complexity of that ends up being big O of the length of the array times log k or something like that.

00:32:30 [CH]

So instead of doing a sort, you're doing two different deletions and insertions, which are each log k. And...

00:32:36 [JE]

Well, and the key thing is that by using a heap-backed priority queue in that way, not only do you get the ability to update it as you roll it along, but also it allows you to get the nth order statistic. So it doesn't rely on baking in the assumption that it's the first element. It could be the second, third, etc. Because you have the heap ordering to guide the efficient lookup.

00:32:58 [CH]

Is it actually log k or is it log x? Because you actually only need to store... oh no, I think you do have to store all of it.

00:33:10 [JE]

The size of the heap has to be the size of the window. But the amount that you walk down to find the nth order statistic as an alternative to quick selecting or something is based on log of that.
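
A rough model (assumptions mine) of the incremental delete-and-insert idea being described: keep the current window in sorted order, update it as the window slides, and read off the x-th order statistic. Python's bisect on a plain list makes each update O(k), whereas the heap or balanced tree discussed here would make it O(log k), so treat this as a model of the bookkeeping rather than of the performance:

```python
import bisect

def sliding_beauty_incremental(nums, k, x):
    window = sorted(nums[:k])                                # build the first window once
    out = [min(0, window[x - 1])]
    for i in range(k, len(nums)):
        window.pop(bisect.bisect_left(window, nums[i - k]))  # delete the outgoing value
        bisect.insort(window, nums[i])                       # insert the incoming value
        out.append(min(0, window[x - 1]))                    # x-th smallest, floored at zero
    return out

assert sliding_beauty_incremental([-1, -2, -3, -4, -5], 2, 2) == [-1, -2, -3, -4]
```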

00:33:14 [CH]

So I guess that's... hopefully the listeners are still hanging on here. But now, my goal of bringing it up last episode: is there an array solution to this that doesn't fall into that naive category where you're just basically sorting every sliding window?

00:33:45 [JE]

Yeah, 'cause to recap, the naive approach in an array language is you slice the whole structure into the windows, and then for every one of those slices, you want to filter it, sort it, and extract the nth element from it, capped appropriately, based on how your language handles indexing or what have you.

00:34:07 [CH]

Which is definitely suboptimal compared to some imperative solution where you're updating some min heap or something like that.

00:34:13 [ML]

So a minor point on when we talk about complexity, the usual measure for complexity is big O. And when you say a problem is in big O of f of x, that means that the optimal runtime is bounded by f of x. So it's actually like, if I tell you, oh, this problem is big O of e to the x, and you say, no, it's linear, it's big O of x, I was still right, because it is bounded by e to the x. So when you say, I've shown that this algorithm is big O of this number, or of this function, that's just an upper bound. And people don't generally-- I mean, it's pretty hard to prove a lower bound. But if you had a lower bound, you would use theta, which says that it actually grows corresponding to this function.
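
For reference, the textbook definitions behind the O-versus-Theta distinction being drawn here (standard definitions, not quotes from the episode):

```latex
% Big O is only an upper bound (up to a constant factor, for large enough n):
f(n) \in O(g(n)) \iff \exists\, c > 0,\ n_0 \text{ such that } f(n) \le c \cdot g(n) \text{ for all } n \ge n_0
% Big Theta is a matching upper and lower bound:
f(n) \in \Theta(g(n)) \iff f(n) \in O(g(n)) \text{ and } f(n) \in \Omega(g(n))
```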

00:35:04 [JE]

But the point is...

00:35:05 [JE]

You generally want a relatively tight bound, because otherwise it isn't a very helpful categorization.

00:35:10 [ML]

I mean, yeah, obviously if I tell you the algorithm is O(e^x), it's like, yeah, I'm right, but...

00:35:16 [JE]

Not very helpful.

00:35:18 [ML]

You're not going to say, oh great, problem solved.

00:35:20 [JE]

Well, another thing that I don't think got brought up in the previous discussion for this is that the efficient priority queue approach to this has the right complexity class in terms of overall performance. But it is also inherently sequential, right? So whereas the naive k approach or APL or what have you is in fact inherently parallel. So when we start talking about performance on a real computer, it's important to consider that you can be in the right complexity class and still be slower in practice for many reasons.

00:36:06 [ML]

Well, I will say it's not totally sequential, because you can run multiple different sets of windows. Like, you can split the array up large scale. But still, it's not... like, the path you follow every time you insert into the heap is different. So you're doing like a different thing every time. I mean, it's more like it could be evaluated concurrently than really in parallel.

00:36:30 [JE]

Well, I mean, but in the APL approach, the parallelism is just immediately there. Like in K, there's an each there. Whereas you can recognize that it is possible to contort the C version of this even more in order to harvest more parallelism. But that isn't actually the same algorithm. It's another even more complicated algorithm.

00:36:54 [CH]

And just in case, for the listener, 'cause we all kind of nodded our heads when John said, "Oh, the priority queue version is inherently sequential and the sliding one is parallel," and we all went, "Yeah, yeah, yeah." But in case the listener didn't hang onto that, the reason that it's inherently parallel is because you can imagine, like, a sliding primitive that does a reduction on each of those. I mean, this is a very sophisticated reduction where we're doing some partial sort or sort and then picking out a number; it's not actually like a max reduction, but we're doing some operation on each sliding window. So we could do like a for-each sliding window, and that primitive could just be parallel, where you're passing it off to different cores, or, you know, there's a bunch of different strategies. Whereas the priority queue, the way we explained it, you're doing basically a sequential left-to-right pass on your sequence, which, as Marshall pointed out, you could break up, you could, you know... we're doing something that does not actually carry...

00:37:46 [ML]

But you are still redoing some work at the overlap Like you have to do a full window before you get anything out, right? I mean, I guess the same is true in the array parallel version. You have to sort a whole window.

00:37:57 [JE]

Yeah, well, in k and in Q, depending on which dialect you're dealing with, there's each and there's peach. [12] And the peach implementation is allowed to use different strategies in terms of how it stripes or divides up the work across cores.

00:38:17 [CH]

I used peach a couple of... the peach, for those that haven't guessed, stands for parallel each, at least I think it does...

00:38:23 [JE]

I think it's for the fruit actually.

00:38:27 [CH]

I think they should just replace it with the peach emoji, you know, what are we doing here? Uh, you know, we're all about character length and ...

00:38:32 [JE]

Maybe that'll be in the next version of shakti.

00:38:35 [CH]

Yeah, that would be amazing if they, as an April Fools', released an actual executable.

00:38:42 [JE]

Arthur has played a little bit with Unicode characters, you know, that's where this madness leads.

00:38:48 [CH]

But, uh, anyway, I was just gonna say, I've played around with peach a bit, and I've never really... I don't think I've ever used it correctly, because I've tried to speed some stuff up, and it's like running the exact same. But also, I'm using the trial version that limits you to whatever some n number of cores, so it's also potentially...

00:39:04 [JE]

I think that what it does... Remember, this is based on no secret knowledge of anything. I think it's really relatively simply just, you know, bucketing it up across n cores and not doing anything particularly fancy above that. But what's, you know, what's nice about it and very neat is the fact that you can just drop it in instead of an each and, you know, and it doesn't change anything else about the behavior of your program. It just, you know, takes advantage of parallelism when it's available, which is conceptually pleasant.

00:39:39 [CH]

So I guess to wind back a little bit, does anybody have... so, the problem's been stated, we talked about a couple different solutions. Is everyone's gut reaction to solving this like the naive way, or does the elevated array language programmer have that solution but then go, oh yeah, there's a different way to get a better performance profile, similar to the priority queue thing that you might be doing in a non-array language?

00:40:07 [ML]

So I know if x is one, or of course k, so if you're picking the smallest element or the greatest element. This is what I did in Dyalog before, that I sort of described on the last episode, but I don't think I made it very clear. What you can do... well, so first, the case with two windows. That's a pretty standard thing to do, and the array language is going to... I think most array languages are going to do this for you. In BQN, you don't even have... you could do this with windows, but it would be slow. What you do is you take your array... all you want is the min of each pair of elements or something like that. So what you do is you take two views of your array that are offset by one. So you're going to take one drop of the array and minus one drop of the array. And then you take the minimum of those two arrays. So then that's completely parallel. And that's definitely what Dyalog does, and I'm pretty sure what J does. I think ngn/k might not...
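
A tiny sketch of that shifted-views trick for the pair case, with NumPy slices standing in for "one drop" and "minus one drop":

```python
import numpy as np

def pairwise_min(a):
    a = np.asarray(a)
    return np.minimum(a[1:], a[:-1])   # min of each adjacent pair, fully vectorized

assert pairwise_min([3, 1, 4, 1, 5]).tolist() == [1, 1, 1, 1]
```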

00:41:15 [RP]

Yeah, you can do it with just a 2 min slash for that.

00:41:18 [ML]

Yeah, yeah, exactly. If you write "2 min slash" in Dyalog, that's what it's going to actually translate to, because that's just the fast way to do it. You don't want to, like, if you did one element at a time, it'd be super slow.

00:41:31 [JE]

Well, and this particular concept of a sliding window of two is so common that k actually elevates it to being an adverb. That's each prior, essentially.

00:41:43 [ML]

So yeah, ngn/k probably optimizes that as opposed to doing it one pair at a time. So yeah, and that's like the standard pattern in BQN for doing a two-wise thing. I recommend not using windows for that because it's doing a lot of extra work in producing the actual list of all the pairs. Okay, so that gets you two. If you want three, well, maybe we'll skip to four. If you want four, what you do is you...

00:42:13 [JE]

If you want three, you're wrong, stop. It's not a good number.

00:42:19 [CH]

Just for the listener, just per the problem statement, k can be anything up to the length of the array. So we've solved k equal to two.

00:42:30 [ML]

Yeah, and so we're fixing X at one, but k keeps getting higher.

00:42:34 [CH]

No, so k is bounded by N, where n is the length of our sequence. So technically you could have a 10,000 element array and you could say k is equal to 10,000 and x can be anything, one to 10,000. But we've solved two, we're skipping three.

00:42:53 [ML]

We'll go back to three, it's okay. So if you want the minimum of four elements, right? You can split those four elements into two pairs and you say, well, I want the minimum of this pair and that pair. So what you do is you first do your pairwise thing on the array. And then, so you know element 0 is the minimum of the original 0 and 1, element 1 is the original 1 and 2, and element 2 is the original 2 and 3. So you actually want to combine your element 0 and element 2 to get the first two pairs that don't overlap put together. So what you'll do is take two drop of the array, min with minus two drop of the array. And so on: if you want eight, you can do those two steps and then you do four drop, minus four drop, and so on. So that gets you the powers of two.

00:43:50 [CH]

For X equal to one though.

00:43:51 [ML]

Always for X equal to one. I don't know how to do the higher X values. So we'll keep talking about that, I guess. But if X is one. So all these steps get you the power of two.

00:44:04 [CH]

So k equal to two for any values of x equal to one or two, and then k equal to four for, or k equal to powers of two with x equal to one. So we're getting there.

00:44:18 [JE]

Tiling space.

00:44:21 [CH]

So basically your algorithm is gonna be like a, you know, you're gonna have three different implementations specializing for like, you know, the first one, the second one.

00:44:30 [RP]

That's the array language interpreter way, isn't it? That's how you do it.

00:44:36 [ML]

That's what Dyalog does. If you tell it, you know, k min slash list, it does this. So then if you've got a number like three or five or so on, what you want is actually to overlap. So you'll say, when I'm doing three, I don't want two pairs that don't overlap; I want them to overlap by one element. So you're going to do one step with one and then another step with one, and that adds up and you get three. There's an off-by-one error in this, which is annoying. And so for five, you would have to go up to four, but then you just overlap by one after that, and then you're done. So the number of steps to get you to five is the same as eight, and so on with that.
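
Putting the doubling steps and the final overlapping step together, a sketch (my own formulation of what Marshall describes, again with NumPy slices standing in for drops) of the windowed minimum for arbitrary k using only shifted views and elementwise min:

```python
import numpy as np

def windowed_min(a, k):
    m = np.asarray(a)                        # m holds the min over windows of width w
    w = 1
    while 2 * w <= k:                        # powers of two: combine two disjoint halves
        m = np.minimum(m[:-w], m[w:])
        w *= 2
    if w < k:                                # leftover: combine two overlapping width-w windows
        s = k - w
        m = np.minimum(m[:-s], m[s:])
    return m                                 # O(n log k) work, and every step is fully parallel

assert windowed_min([4, 2, 12, 11, -5, 7], 3).tolist() == [2, 2, -5, -5]
```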

00:45:20 [JE]

It ends up being a lot like, you know, reducing multiplies to shifts in sort of an analogous sense.

00:45:29 [ML]

Maybe. I mean, yeah, you could call it a kind of strength reduction. You could also say it's sort of like a little bit of dynamic programming, where you're saying that, you know, the minimum along all these elements is actually the minimum of these two different subsets, but the subsets are shared between different windows. So maybe actually that's a way to approach higher values of x: to say, well, if x is two and my window size is 100... well, yeah. Then you want the smallest two values. You could split it up into two windows of 50, and you say, get the smallest two of each of those, and then you combine those together, I guess. Which... that sounds complicated, but possible.

00:46:15 [CH]

So far we've got: you can solve it nicely in an array language if you just change the problem.

00:46:24 [ML]

Well, if you restrict the problem to the most useful form of the problem, I will pick it up.

00:46:31 [JE]

The most useful form of the problem. You put that on a t-shirt.

00:46:35 [ML]

See, this is the thing. The array language won't let you do dumb stuff like take the fourth smallest value from every window of size 100. Who does that?

00:46:43 [RP]

It's true. It's another common thing, isn't it? A user comes up to an array language person and goes, "I've got this problem in my code. Can you help me do this thing?" And then the array language person goes, "Well, what are you actually trying to do?" And then tells you to write something entirely different, because you're actually approaching the problem wrong there.

00:46:58 [CH]

I don't... are we approaching the problem wrong right now? I don't think so.

00:47:01 [RP]

Well no, so we got as far as we're not trying to do sorts or grades.

00:47:06 [ML]

Yeah, well, although to be a little more serious on this, like a min... I mean, this is sort of like a median filter or an ordered statistic filter. I mean, I think this is actually useful with x other than one. But the most common would be, you know, a total minimum.

00:47:23 [CH]

I can see Marshall in a Google interview. They ask him a question. They say, "Yeah, that's not really useful here. So what we're going to do is we're going to solve the actual useful version of this." The guy across the table is going to be like, "What is going on?" And they're going to try and be like, "No, we want to do this." And he's going to be like, "Listen, listen, okay? You clearly haven't done much user-facing stuff."

00:47:43 [ML]

My worry is just they're going to ask me how to sort a million random integers. And I'm going to say it's this algorithm I came up with. You do a quick sort down to size 2 to the 16th, and then you do Robin Hood sort after that. Trust me, they're going to be like, "What?"

00:48:00 [CH]

I mean, potentially the answer to all of this is just that, yeah, we just throw our hands up and we take the performance hit and that's it.

00:48:09 [JE]

Well, I mean again, the thing is it's a problem that is very clearly designed to be solved using a sliding window and a priority queue backed by a heap. That's what it's intended to be done with. And there are problems that are specifically concocted around that kind of conceit. But the fact of the matter is that there's still a fairly elegant parallel solution available in APLs with the caveat that it has a difference in how many allocations it has to do and so on.

00:48:53 [RP]

The only other thing I've got to add on this ... but it's also kind of specific to at least Dyalog APL ... [sentence left incomplete]. Because I'm guessing, Marshall, your windows thing: that's a flat 3D array and then your interpreter is doing the traversal when you pick out the Nth (what is it, column or something?) after you've sorted all of them. I don't know but it's the sort rank one that I'm interested in. Or the "sort each" after you've partitioned it or whatever is going to be "sort each" in k, right? Is that there are these flat array partitioned vector sorting idioms. You can find them on APLCart [13] where I just looked. So this one involves 2 grades and a plus scan, but I think in this scenario it means replicating the list in some way so that you could ... [sentence left incomplete]. You basically give the boolean list on the left (a boolean vector, where a 1 indicates the start of each of your partitions, which will be your windows). And then on the right is your list of numbers. And it gives you the grade for sorting each subarray.

00:50:04 [ML]

Yeah

00:50:06 [RP]

But then you have to do a bunch of modular arithmetic on that after, because you're only gonna pick the second from each sub array. What I'm saying is in Dyalog APL, when you create this list ... when you do the windows, you get a nested ... [sentence left incomplete]
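
A rough model (in Python rather than APL, and not the APLCart idiom itself) of the idea Rich is describing: grade the whole flat vector once, keyed by partition, then pick the x-th element out of each partition:

```python
import numpy as np

def xth_smallest_per_partition(values, starts, x):
    values = np.asarray(values)
    group = np.cumsum(starts) - 1              # plus-scan of the boolean start vector
    order = np.lexsort((values, group))        # one big grade: by partition, then by value
    sizes = np.bincount(group)
    firsts = np.concatenate(([0], np.cumsum(sizes)[:-1]))
    return values[order[firsts + (x - 1)]]     # x-th smallest within each partition

vals   = [5, 1, 3,  2, 9,  7, 4, 6]
starts = [1, 0, 0,  1, 0,  1, 0, 0]            # 1 marks the start of each partition
assert xth_smallest_per_partition(vals, starts, 2).tolist() == [3, 9, 6]
```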

00:50:22 [ML]

So you're saying you could turn the individual sorting into one big grade. I mean, I think that would work. The issue is that sorting generally is like O(N log N). So by combining all these arrays, you do more work. Unless it's a restricted integer range, which, if they're 32-bit integers, they are a restricted range. But you know, whether it takes advantage of that depends on how many you have. But I know there is code for sort rank one, although all that really does is take out all the decision making about the size of the array, because it knows the size is fixed. But there are parallel things you could do. Like a sorting network is a really fast way to sort small arrays. You just have a fixed arrangement of comparisons that you do on pairs of elements, or swaps: you compare and swap if they're out of order. And so if you have a lot of arrays actually, you can then pack those into vectors, and since you're doing the same swaps for all the different arrays, you have a bunch of very simple vector operations that go together. So yeah, you can actually do something with saying: "well, I have a bunch of sorts all happening together", but still fundamentally, your complexity is O(n * k) or higher there, so if k is large, that's not good.
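
A sorting network, as described here, is just a fixed list of compare-and-swap steps; a small sketch (the classic five-comparator network for four elements, my example rather than anything from a particular interpreter):

```python
# The same five exchanges sort any 4-element input, so many small arrays can be
# pushed through the network in lockstep, which is what makes it SIMD-friendly.
NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]

def sort4(a):
    a = list(a)
    for i, j in NETWORK_4:
        if a[i] > a[j]:
            a[i], a[j] = a[j], a[i]   # compare-exchange
    return a

assert sort4([3, 1, 4, 2]) == [1, 2, 3, 4]
```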

00:52:02 [CH]

All right, one comment and one question. The one comment that is [a] response to something you said, John, is that [this] technically actually ... [sentence left incomplete]. Yes, this is a concocted leet code problem. But this is actually like a thing that comes up all the time.

00:52:17 [ML]

Yeah, this is pretty good for leet code.

00:52:19 [CH]

Like technically, I think the primitives in q. I'm not sure if there's actually like a moving median, but like the moving averages and stuff (I know that's different because you can actually update that in constant time). You don't need the heap backed priority queue kind of thing. But this kind of streaming telemetrics where you're trying to update something over the last 24 hour period and you're popping stuff off and pushing stuff in to update these statistics is definitely a thing that happens in the real world. So I wouldn't say that this is a completely contrived example. So that was the comment.

00:52:51 [CH]

The question is [this]. Your response was: "we have actually a decent solution in APLs (using that umbrella term) if they're parallelized so that you can chunk this stuff up". So the question is, today, what of J, Dyalog APL, BQN, k, q (any of the other ones if we want to mention them), can you actually write that version [in]? So I know that J now has ...

00:53:20 [BT]

Threads

00:53:22 [CH]

Yeah, threads, colon T. So I think J can do it. q should be able to do it with the "peach" primitive, yeah?

00:53:30 [ML]

So I think that's the smallest change because you just, you know, change "each" to "peach". But [with] J, you have to add the threading primitives though it's not much either. But you have to add a whole primitive here.

00:53:45 [RP]

In Dyalog you have to download a workspace that's a model of peach.

00:53:46 [ML]

Yeah

00:53:47 [CH]

And do k, BQN and Dyalog APL, do they have any way to do that or would stuff need to be added?

00:53:53 [JE]

Well, peach is in some dialects of k.

00:53:57 [CH]

As the word "peach"?

00:53:58 [JE]

Yeah, or sometimes it's underbar.

00:54:00 [CH]

Underbar, I thought Underbar was drop.

00:54:03 [JE]

Well, it depends on the dialect [both laugh]. In older versions, in K2 and K3, an underbar prefix on a name is how you supply all of the built-ins. You know, the stuff that's technically a primitive, but it's infrequently used enough that it gets pushed off into this ugly space of getting actual English names for things, like _sv and _vs for decode and encode.

00:54:36 [ML]

This is what k-ers think English is [everyone laughs]

00:54:40 [JE]

Yes!

00:54:41 [CH]

Was that you reading off a well-named function? "SV, SBNS" or something like?

00:54:46 [JE]

"Underbar SV" and "underbar VS" are a pair of built-ins in K3 for doing what is called decode and encode in several other array languages. Scalar/vector, vector/scalar: that's where the name comes from.

00:55:04 [CH]

Oh yeah, I've actually seen that somewhere. I'm not sure if it was in a paper.
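
For readers who haven't met these before, a plain-Python sketch of what decode ("scalar from vector") and encode ("vector from scalar") compute, base conversion in effect; the k spellings _sv and _vs are as John describes, while the Python names here are mine:

```python
def decode(base, digits):            # digits in a given base -> the number they denote
    n = 0
    for d in digits:
        n = n * base + d
    return n

def encode(base, n, width):          # number -> fixed-width digit vector in that base
    digits = []
    for _ in range(width):
        digits.append(n % base)
        n //= base
    return digits[::-1]

assert decode(2, [1, 0, 1, 0]) == 10
assert encode(2, 10, 4) == [1, 0, 1, 0]
```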

00:55:07 [JE]

And then, in K5 ... [sentence left incomplete]

00:55:10 [ML]

There's also an APL SV and an APL VS if you're interested in history [chuckles]. The SV was shared variables I think.

00:55:18 [JE]

Oh great.

00:55:19 [CH]

Oh no [chuckles]. So in some cases you can do it.

00:55:22 [RP]

There's an old document called "Peach for Dyalog APL", [14] but that is what evolved into what's called futures and isolates, which is right now a workspace you have to download to do it and it doesn't do threads on different cores, so it's TCP ... [sentence left incomplete] Well, I guess technically it is, but it's like sending stuff over TCP and the setup for that is quite intensive, I think.

00:55:47 [ML]

So you like, set up a web server and then you can run things in parallel? [everyone laughs]

00:55:51 [RP]

Yeah, literally, right, yeah. You know, there's a symbol assigned for it.

00:55:59 [ML]

And yeah, [with] BQN, if you want to do it in parallel, you have to spawn processes, really. You're doing that through the shell pretty much on your own.

00:56:09 [JE]

And it's one of those things where, like, you know, semantically "each" and "peach" are supposed to be the same operation. It's just that one of them is faster if you use it in the right situations, and so there's kind of the ... [sentence left incomplete]

00:56:20 [ML]

Yeah, I mean, assuming your operand doesn't have side effects and change stuff.

00:56:24 [JE]

Which is part of the judgment that goes into that but like in a pure sense.

00:56:29 [ML]

That's why they're different primitives at all.

00:56:31 [JE]

Well, and there are situations where you know that the set is small; you know it's not going to be faster. So don't bother.

00:56:40 [RP]

That goes back to Conor's question of you wanna just spell it and it should know. Is that what you're getting to? Sorry, I interrupted you as well.

00:56:48 [JE]

Yeah, that's where I'm getting. Like from a certain perspective you should say like: well, "peach" shouldn't be its own separate primitive because ideally the interpreter just identifies, "oh, this is a good situation; I'm just going to do this in parallel". That would be ideal from a programmer's perspective, but there's a big trade off in complexity in order to make that choice (right or wrong), versus make it a really, really easy change for the programmer to explicitly put in that piece of information.

00:57:21 [ML]

Yeah, it's that high level versus low level distinction again.

00:57:24 [JE]

Oh and you know, it's not a very low level ... it doesn't require the programmer to change their reasoning about it.

00:57:32 [ML]

It's just 1 little bit of high-level-ness versus low-level-ness and I mean the right answer is not always to go to the highest level. A lot of people intuitively think that, but especially when you want performance, you have to have some control.

00:57:46 [BT]

Which I think for J Threads, Henry decided that he was going to put it in automatically for matrix multiplication. But past that, he hasn't put anything else in yet.

00:57:56 [ML]

Yeah, well, and presumably it's got a size threshold that it's testing for that.

00:58:01 [JE]

Which is the same kind of reasoning that the programmer makes when they're deciding whether or not ... [sentence left incomplete]. Like in J, you have an elevated ability to know whether or not you've got a complicated composition with no side effects, versus in k, once you go and do a lambda, everything goes out the window and it could do nasty things that need to be accounted for.

00:58:23 [ML]

But, I mean the language, the implementation, could inspect the source of that right? Or even the bytecode.

00:58:28 [JE]

Yeah, I mean potentially, but it's sort of the banana and the monkey's hand in the jungle kind of problem [chuckles], where in the simplest case, it's easy to do that inspection. And in a contrived case, you can make it very expensive and complicated to do that inspection.

00:58:46 [ML]

Well, you could make it undecidable if you wanted [John agrees]. But I mean, I think there would be a lot of cases where you've got a function that you write and it uses only local variables inside the function, right? It's not a crazy thing to expect. It's just that they don't do it. And I mean, BQN doesn't do any of this either. Neither does APL. Nobody's really doing this, but you know it's possible.

00:59:09 [JE]

I mean, in the same sense that you could kind of constructively determine that some subset of compositions represents an associative operation ... or not. But you know, for the general case, there are lots of cases where you can't make that determination.

00:59:26 [ML]

Yeah, but if you're working with side effects: if you have a local variable in a function, you can change that all you want, it's not going to escape. If you change a global variable, yeah, all bets are off. If you call a function that reads a file, no hope. Things like that. But I mean, there's a whole lot of useful stuff that you can do with just a lambda function that does some computations and keeps its own variables and stuff. Like, that doesn't seem to me anywhere near the amount of work of proving things are associative and actually having that work a fair amount of the time.
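
A toy sketch of the kind of inspection Marshall is describing, in Python rather than k or BQN: look at a function's compiled code and flag anything that writes to globals or even reads names defined outside the function. This is a deliberately conservative heuristic, not a real effect analysis, and the helper name is made up for illustration.

import dis

def might_have_side_effects(fn):
    # Flag any bytecode that writes to non-local names.
    writes_globals = any(
        ins.opname in ("STORE_GLOBAL", "DELETE_GLOBAL", "STORE_NAME")
        for ins in dis.get_instructions(fn)
    )
    # co_names lists globals/builtins the function refers to; calling any
    # of them could, in principle, do I/O or mutate shared state.
    touches_outside_names = bool(fn.__code__.co_names)
    return writes_globals or touches_outside_names

def pure(x):
    y = x * x        # only local variables
    return y + 1

def impure(x):
    print(x)         # refers to the global name `print` and does I/O
    return x

print(might_have_side_effects(pure))    # False
print(might_have_side_effects(impure))  # True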

01:00:03 [JE]

Sure

01:00:04 [ML]

Yep, something for the future [chuckles]

01:00:05 [JE]

You know, the design of k involves a lot of trying to find the global minimum complexity. And that means that a certain amount of stuff that would be nice for programmers gets thrown out the window in the pursuit of making it so that your program is simpler and the interpreter is simpler and there's a little bit more that the user has to think about. Or there's something that the user would like to have and: no! you're not allowed to have that because it would add too much complexity to the implementation.

01:00:40 [ML]

Yeah, and having an implementation you can like think about easily is really nice.

01:00:44 [JE]

That doesn't have surprising ... [sentence left incomplete]

01:00:46 [ML]

It's all trade-offs.

01:00:47 [JE]

Right.

01:00:47 [BT]

And if you don't put that constraint, you end up with J.

01:00:50 [ML]

But J doesn't do it either; just for matrix multiplication.

01:00:53 [JE]

Well, but J does some more of that kind of thing, and J has a richer numeric tower. It has a larger set of special combinations and a whole bunch of bells and whistles that k dispenses with.

01:01:10 [BT]

I think Stevan Apter said (when we interviewed him), I think he said J was a challenge for him because it was so big, such a big language. There was so much to learn about it. Whereas k is so much simpler. And equally as powerful, if not more so.

01:01:26 [CH]

So the TL;DL is: when you have a program that isn't quote/unquote "nicely solvable in an array language", you have to redefine your definition of nice, because of the primitives that exist ... [sentence left incomplete]. Like, a lot of array problems can be solved with these tiling kinds of operations or cutting operations, and those are very often easily parallelizable. So then the question is: is it actually nicer? And I actually prefer ... [sentence left incomplete]. Like, I wrote the solution to this in BQN and, minus the fact that it is a problem that takes 3 arguments (so really the most verbose part of the problem is destructuring the x, the k and the list of numbers), this is a one-liner. And technically I did it the slow way using windows, which is suboptimal, but in an array language where (I believe) q and k have the implicit x, y and z arguments (so you wouldn't have to do that destructuring), and in a language that had a fast windows implementation so it wasn't a performance hit, you can write this in basically a single line of code. And I prefer writing that compared to my heap-backed priority queue.
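
For illustration only, here is roughly what the "windows" shape of solution looks like, sketched in Python rather than BQN. The exact problem isn't spelled out in the episode, so this assumes something like: for each length-k window of the numbers, report the x-th smallest value. Sorting every window is the simple-but-suboptimal route, analogous to reaching for a windows primitive.

def xth_smallest_per_window(x, k, nums):
    # One sorted slice per window; short, obvious, not asymptotically optimal.
    return [sorted(nums[i:i + k])[x - 1]
            for i in range(len(nums) - k + 1)]

print(xth_smallest_per_window(2, 3, [5, 1, 4, 2, 8, 3]))
# windows: [5,1,4] [1,4,2] [4,2,8] [2,8,3] -> 2nd smallest of each: [4, 2, 4, 3]

This is roughly O(n * k log k), versus something like O(n log k) for a heap-backed priority queue, which is the performance trade-off being discussed.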

01:02:50 [ML]

It's a lot easier.

01:02:51 [JE]

Yeah, I mean it's worth considering that we're very sensitive to what we describe as being an ugly or complicated solution in the array languages [everyone laughs]. But it's still a 20-character solution versus, you know, easily a full page of fairly naughty C++ with lots of situations where you could go wrong. Whereas the k version is a little bit more complicated than 3 adverbs chained together, but there's no real potential for "off by 1" errors. There's not a lot of places where it can go wrong. It's still pretty concise and it's still reasonably performant.

01:03:32 [ML]

I mean, the big problem is the performance if you get into huge window sizes. I mean, I could definitely see a situation where you're, you know, building some product and at some point you want the smallest 20% filter and you do this and it's just not fast enough. Like it takes a few seconds to run. And like, that's really a pain.

01:03:54 [JE]

Well, you know, in the world of industrial k programming, [do] you know how we solve those problems?

01:03:59 [ML]

C?

01:04:00 [JE]

Yeah. You write a C implementation, you wrap it in a little DLL and call it from k. No problem, just like all of the built-ins in k [chuckles]. You know, k has an FFI and at 1010data for example, there's a fair number of things that involve thorny string manipulation, dealing with Unicode (Unicode is not something that's just for free built into K3). And every time inherently sequential things like that have to happen, you just write the hot parts of it in C and you wrap it up in a nice library and then you don't have to worry about that. You can write your higher level logic in k and it works.
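
The same "write the hot part in C, wrap it up, call it from the high-level language" pattern exists in most languages with an FFI. As a rough Python analogue (not k), this borrows a function from the C math library via ctypes; the library name is platform-dependent, so this sketch assumes a Unix-like system.

import ctypes
import ctypes.util

# Load the C math library (libm.so.6 is the common Linux name).
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

print(libm.cos(0.0))   # 1.0, computed by the C library, called from Python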

01:04:45 [ML]

I mean, at the same time, if we found an array solution to arbitrary X values, I mean that would be pretty cool. It would probably be, you know, less error prone than the C version. Maybe not. You know, it might be really crazy complicated. Hard to say before you actually solve these problems. That's the other thing: if you find a nice array solution, it's almost always the case that (maybe not like implemented in k or APL), but it's almost always the case that this is a fundamentally faster algorithm than what you would write in C. So if you took this algorithm and wrote it out, by hand specifying which vector instructions you're using [and] where, it would end up at least as fast as the sequential C solution. And probably 2 times faster or something. Really depends on the problem, but I mean, this array thinking is a really good way to develop fast algorithms, even if they're hard to actually implement in a fast way.

01:05:49 [JE]

And from a different angle entirely, consider that in any of these array languages, you could write the heap backed priority queue version of this. You get the correct complexity class. It would be very slow and very ugly, but you could (if you wanted to, just hypothetically), you could express that algorithm and it might even be more concise than the C++ version. It's just not really working with the grain of these languages.

01:06:20 [ML]

It's not like unbearably slow. I mean, BQN is pretty fast (the scalar stuff). It's about 10 times slower than C, usually. I think k would be about 20. Dyalog really is kind of slow.

01:06:31 [JE]

It means you're in the same ballpark as garden-variety scripting languages at that point. And you know, a lot of people do actually write Python and somehow endure its performance. And what do the Python programmers do when it's too slow? They write C and then they wrap it in a DLL. And then they call it. Or they write Fortran, or they coerce someone else to write Fortran, and then they wrap it in a DLL and call it.

01:06:57 [CH]

This is a big point in favor of when Henry was on and he was talking about TPLs, [15] the "true programming languages", and then I said (not because I believe this, but I know that there are people out there that would say), well, in different cases you're saying: "oh, and then I'll just write some C code and call it". Like, it kind of seems like C maybe is the TPL (for some definition of TPL). And that's what I was actually thinking at one point when you started saying that. I wonder if at some point the solution is just solving this in a different language and then calling into it.

01:07:31 [JE]

I mean, have you ever wondered if they named C after the speed of light?

01:07:34 [CH]

Is that what they did? [chuckles]

01:07:36 [JE]

I mean, that's my interpretation. I think it's a lot more exciting than saying: "well, you know, it's the language that came after B" [everyone laughs]. But from a philosophical standpoint, C is definitely the speed of light in the mind of many programmers. Despite the fact that in a lot of situations, when you're doing numerical code, Fortran is better because it has simpler aliasing behavior and your compiler is more able to reason about the programs.

01:08:01 [CH]

All right, we've only got a couple minutes left. Are there closing thoughts, things people want to say on the air before we wind things down?

01:08:10 [BT]

My TLDR of this was: if you get a problem that doesn't fit in an array language, convince the person they should be solving a different problem [Conor laughs]

01:08:21 [ML]

I mean, that's it. Like, anytime you get a problem, try and solve a different one. Any problem at all!

01:08:28 [JE]

Chuck Moore, the designer of Forth, had the saying: "quantize brutally". You know, if you're tasked with a problem that has a lot of moving parts, just throw some of them out. Simplify the problem and obviously there are lots of situations where you know your client or what have you, will balk at this approach. But in those situations, perhaps Chuck was right all along, when everything is said and done.

01:08:56 [RP]

You're saying chuck the client out? Is that what you're saying? [others laugh]

01:09:00 [BT]

If they're the problem [everyone laughs]

01:09:04 [JE]

I mean, do you want to make money or do you want to write beautiful programs? [everyone laughs] You gotta decide which one is going to satisfy you more when you're on your death bed. Did you create a bunch of value for shareholders or did you create a couple of really beautiful jewels of functional programming?

01:09:22 [CH]

I mean, hopefully both. Not that I care about creating shareholder value, but I do care about having money [laughs]. There's only a couple of people that I know that sold their language for a couple of million dollars.

01:09:38 [RP]

If you wanna do both, I hear there's an opening at 1010data [Conor laughs].

01:09:45 [CH]

All right. I mean, I feel like it's hard to top that.

01:09:48 [BT]

Well, I'll go with contact@arraycast.com, if you want to get in touch with us. We've had a number of different people e-mail us over the last week, one of which was an interesting topic for a future show, perhaps: large systems written in APL and how you develop those. Whether APL, or the array languages generally, are just basically built for leet code. Which, of course, we know they're not just built for leet code.

01:10:18 [BT]

Quite the opposite.

01:10:19 [ML]

The opposite, yeah.

01:10:21 [JE]

Leet code is built for APL? [everyone laughs]

01:10:26 [CH]

As proven by this episode.

01:10:28 [ML]

Maybe the contrapositive? Leet code is built for not APL.

01:10:33 [BT]

But I saw a recent discussion where somebody started to talk about large programs that were written in the array languages. But the code that's written isn't that large. So that kind of proves their point: it's still small, even though it's large for an array language. Anyway, it would be worth exploring at some point, I think, how we approach working in larger systems: how you work in a large system and play nice in the playground.

01:11:04 [JE]

I'm willing to disappoint listeners by saying that 1010data's codebase does not fit on one sheet of paper despite being implemented in k.

01:11:16 [BT]

And that's probably interaction with the real world and problems to be solved and not making customers into problems.

01:11:23 [ML]

He didn't specify font size, you need to. [Bob laughs]

01:11:31 [JE]

But you know the Planck length does exist, so.

01:11:35 [BT]

I didn't know you wanted spaces between the numbers, I think is one of the answers that somebody gave. Anyway, yes, contact@arraycast.com [16] if you want to get in touch with us. That's my spiel.

01:11:49 [CH]

And also, if you're listening to this and you've been, uh, smoke has been coming out of your ears because you're thinking "why didn't they talk about this solution?", feel free to [send it to us]. You can do it on GitHub or just send us the code by e-mail, and if we get any very interesting solution ideas that we totally missed, we'll be happy to, maybe not have a whole part two on this, but definitely mention it as a follow-up in a future episode. All right, well, John, once again, thanks for coming on and being promoted to, you know, guest panelist: from guest to guest panelist. That's part of the promotion. We'll come up with some sort of Pikachu evolution (Pokémon, that's what they call it? That's what the kids used to call it back in my day). So you're now, I don't know what level Pokémon you are, but ...

01:12:35 [ML]

What moves does he know? [everyone laughs]

01:12:37 [JE]

Well, it's Struggle for sure. [everyone laughs]

01:12:40 [CH]

But yeah, thanks for coming on. This has been awesome. Hopefully we'll have you back at some point as either a guest panelist or something else in the future.

01:12:48 [JE]

Thanks for having me.

01:12:50 [CH]

And with that, we'll say happy array programming

01:12:53 [everyone together]

Happy array programming!

[closing music]