Transcript

Transcript prepared by Bob Therriault, Adám Brudzewsky, Sanjay Cherian and Igor Kim.
[ ] reference numbers refer to Show Notes


00:00:00 [Conor Hoekstra]

Are you going to be writing the book while cycling or are those going to be two separate activities?

00:00:04 [Stephen Taylor]

That's actually quite dangerous, Conor.

00:00:08 [CH]

You're talking to someone who's been in like five bicycle accidents, like hospitalized twice because of it, so you know, it's probably right there.

00:00:14 [Marshall Lochbaum]

So he needs all the advice he can get.

00:00:16 [Music Intro]

00:00:28 [CH]

Welcome to another episode of ArrayCast. I'm your host Conor, and today with us we've got our four panelists. We're gonna go around and do brief introductions. We'll start with Bob, then go to Stephen, then go to Marshall, and then finish with Rich.

00:00:38 [Bob Therriault]

I'm Bob Therriault. I am a J enthusiast, and I'm very enthusiastic about J, as usual.

00:00:44 [ST]

I'm Stephen Taylor. I'm an APL and q programmer.

00:00:48 [ML]

I'm Marshall Lochbaum. I started as a J programmer. I worked for Dyalog for a while, and now I'm a BQN developer.

00:00:55 [Rich Park]

I'm Rich Park. I do media and training at Dyalog APL.

00:00:59 [CH]

And as mentioned before, my name is Conor. I am a polyglot programmer, primarily in C++, but I also love the array languages, which is why I host this podcast. And with that out of the way, we are going to do a few announcements. I think we'll start with Bob, who's got two, and then I'll announce who goes next after that (laughing), to keep it simple.

00:01:21 [BT]

Keep the cats going in the right direction. Okay, so I've got two announcements. [01] The first one is you can now get J903 on iOS devices. Ian Clark has upgraded it, so now you can go all the way up to 903. Now, the way you need to do it, unfortunately, because of the way Apple has worked with us on naming, Ian has upgraded 901, which is the one that you look for. So look for J901. When you get that and you upgrade it, it's gonna be 903 actually. But we're gonna get that naming worked out in the next version. For now, we're just gonna upgrade and it'll actually be 903. And that means you can run it on your iPad or any iOS device you have. Second thing is, hot on the heels of J9.4 being released, the J9.5 beta is now in the works, and it includes modular arithmetic, a new primitive. So if you're interested in that stuff, you can download the beta and try it out. And the more people that do, the quicker we find the problems. And it very much is an alpha beta right now, 'cause it's fresh off the presses. And that's the new thing happening in J. My guess is sometime around December to January next year that one will be released, 'cause that seems to be the usual process. And that's what we've got in the J world right now.

00:02:44 [CH]

Awesome. And we'll probably have links to both of those things in the show notes. We'll spin the bottle and go to Stephen next.

00:02:51 [ST]

I'm going to be away for the next few weeks. I'm setting off on a cycle tour. I'm writing a book. I've got the first draft of it with me. It's on how you make the transition from the kind of one potato, two potato scalar programming, a wonderful phrase Joel Kaplan coined on this show, to vector programming. So the title is post-atomic vector programming in q. If this is a subject which interests you, or you have some examples of vector solutions that you think are helpful in making the transition, or some exercises, please do write to me. It's sjt@5jt.com. [02] And maybe you can help me get this book written. More news on this later. You can also follow my travels on 5jt.com, where I'll post from time to time.

00:03:54 [CH]

Are you gonna be writing the book while cycling or are those gonna be two separate activities?

00:03:58 [ST]

That's actually quite dangerous, Conor.

00:04:02 [CH]

You're talking to someone who's been in like five bicycle accidents, like hospitalized twice because of it. So, you know, it's probably right there.

0:04:08 [ML]

So he needs all the advice he could get.

00:04:11 [ST]

Just to prove it was a mistake. Now, as my friend, Julian Cera likes to say, this is not for charity. I'm not out to prove anything or challenge myself. I'm just going around visiting family and friends. It's slow travel and rainy weather is likely to find me in a library or a cafe, tapping away on this tiny ancient netbook.

00:04:32 [CH]

Ooh, netbook.

00:04:33 [ST]

That's got quite good battery life and it's very simple. In fact, I wrote the first drafts of code.kx.com at Whitney's bitch uh beach house some years ago.

00:04:45 [ML]

We know what you really said. Interesting place, huh?

00:04:50 [RP]

Do you have to cut that?

00:04:52 [CH]

Yeah, yeah, are we allowed to leave that in? It was a Freudian slip. It wasn't intentional. So we'll have to consult the... Marshall, oh my goodness. My goodness.

00:05:05 [ST]

Do we have a house psychotherapist here?

00:05:11 [CH]

I don't even know what to say. We'll just go straight to Marshall with his announcement, probably unrelated to what just happened, but I hope you have fun on the…

00:05:19 [ML]

It is indeed unrelated. My announcement is just that we have started using version numbers for the CBQN interpreter. Not the specification for the BQN language; that's at the 1.0 candidate. But the CBQN interpreter is now up to version 0.2.0. This doesn't change anything about our development process. For a few months, we've had a develop branch that's separate from master, and we occasionally move master up to develop. What we are doing now is just also increasing the number when we do that. So there will be tags for version releases. And so if you want a number that increases as development goes on, there it is. We're not gonna do anything different. We're not going to backport any fixes to old versions or anything like that, but we have a number now.

00:06:10 [CH]

Awesome. I assume there'll be some kind of link in the description as well, pointing at some kind of show note or something.

00:06:16 [ML]

Well, there's not really a note on it. It's just that there are numbers.

00:06:21 [CH]

All right, okay, so maybe no link. (laughs) Just another link to BQN if you wanna go check it out. All right, we'll round up our announcements with Rich. Go over to you, Rich.

00:06:30 [RP]

All right, so last week I published a video, which is a follow-up to my talk from the APL Seeds user meeting that happened at the end of March. In that talk, I sort of showed a particular problem from a previous APL problem solving competition, and then I talked through my thought process, the kind of reasoning behind the solution, and then how that translated into some APL code. [03] So, Stephen, feel free to take that example for your book if you think it's good. Don't say anything if you think it's not. But this video that I've just published is, well, it's like an hour and 20 minutes or so, going through the full solution that what I presented previously is just a part of. So it's very much aimed at people who've maybe just about got started with APL and want to see something that's maybe a bit more substantial than the typical one-liner that you might see.

00:07:33 [CH]

Awesome. Yeah, I can say there'll definitely be a link if there's a YouTube video. All right, I think that's all the announcements out of the way, which means we will transition to today's topic, which I think is not necessarily a follow-up to two episodes ago, but it was on the topic list of things that we might have talked about, I think, when we had Henry Rich on, [04] and that topic is folds and scans: specifically, I believe, how they differ in BQN and in k/q. And to start the conversation off, I guess what I'll say is part of the reason BQN is my favorite array language is because BQN's scan, and also k and q's, is not broken. So let's just start off with an inflammatory statement like that. Because in my opinion... I remember, here's a small little anecdotal story: when I wrote my first mini interpreter, which obviously was not complete, and I've probably written like two or three at this point, all to varying levels of incompleteness, I remember when implementing the scan, I was writing unit tests and I was modeling this after APL. I think the very first one I called "Ahi". It was the APL Hoekstra interpreter. No one's ever seen this. It's not on GitHub. There's no link to this. It's private. It's terrible code.

00:08:53 [ML]

It's terrible code, but you still named it.

00:08:54 [CH]

You've always gotta have a name. And this is pre-ChatGPT, so ChatGPT didn't name this. Now ChatGPT names everything for me. And I remember trying to implement scan, and wrote some unit tests based on what APL gave me. And then I ended up getting so confused. And I remember naming the implementation of scan internally in my interpreter scan_fold_right_WTF or something like that, because it was like repeated right folds, which meant that my interpreter had like a quadratic time complexity on scans. And I think in APL there's some idiom recognition: if you have a certain, you know, binary operation that's built in, you'll get like a linear time complexity. But one, it's just like bad performance in some cases, and two, it's impossible to solve certain problems. For instance, one of my favorite problems of all time, and I think the most beautiful line of code I've ever seen, is the solution to Kadane's algorithm. And you can't do that elegantly and beautifully in J or in APL because of the way that their scan works, whereas in BQN you can do it absolutely beautifully, and I believe k and q are the same.
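
[For illustration, a minimal Python sketch, not from the episode: the APL-style scan defined as a fresh right-to-left fold over each prefix, versus a left-to-right scan that reuses the previous result. Function names are hypothetical.]

```python
from functools import reduce

def scan_apl_style(f, xs):
    # APL defines the i-th scan element as a right-to-left fold over the
    # first i+1 items; done naively, every fold is recomputed from scratch,
    # so this makes O(n^2) applications of f.
    def foldr(ys):
        return reduce(lambda acc, y: f(y, acc), reversed(ys[:-1]), ys[-1])
    return [foldr(xs[:i + 1]) for i in range(len(xs))]

def scan_left(f, xs):
    # A left-to-right scan reuses the previous result: O(n) applications.
    out, acc = [xs[0]], xs[0]
    for x in xs[1:]:
        acc = f(acc, x)
        out.append(acc)
    return out

# The two agree for associative, commutative f like +, but not for -:
print(scan_apl_style(lambda a, b: a - b, [1, 2, 3, 4]))  # [1, -1, 2, -2]
print(scan_left(lambda a, b: a - b, [1, 2, 3, 4]))       # [1, -1, -4, -8]
```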

00:10:13 [ML]

I think the k one looks nicer.

00:10:14 [CH]

Even the k one looks nicer.

00:10:16 [CH]

Oh, wow. I should go take a look, but I'll stop. That's my little monologue, and you know, I've started the flame wars internally here on this little panel. So have at it, you know. Let's play a four-player ping-pong and we'll see who wins. Who wants to start?

00:10:28 [BT]

I'll start, because we're sort of basing this on what Henry was talking about with Fold. And I think with Fold, although there's all these extra characters, it can actually do Kadane's algorithm the way you like now.

00:10:36 [CH]

This is correct.

00:10:38 [BT]

It couldn't back then probably, because you had to do a bunch of reverses. And the thing that you're working against with scan is that APL and J and BQN and everybody evaluates right to left.

00:10:54 [ML]

Not K actually, which we should go into.

00:10:59 [BT]

OK, go into it, 'cause I'm over my head already.

00:11:03 [CH]

Well, before we go into it, you're definitely correct, Bob, and I think actually I did it in J. Because in APL there's basically no hope for it, unless you want to do something where you're reversing things and reversing them again. But J has under, [05] so in J you can actually do something where you combine what I think is a broken scan with under in order to get the solution. Or you can use fold. But with fold you have to specify at least two binary operations, and one of them is identity, which isn't really necessary. So you can do it two different ways in J, but both of them, in my opinion, are much less elegant than what you can do in BQN or in k or q. But yeah, Marshall, you dive into the semantic differences that you were just mentioning.

00:11:50 [ML]

Yeah, well, so scan and fold are pretty closely related. In APL, scan is defined as a bunch of folds. And well, I guess the issue that we're going to get into is that scan has a direction to it and fold has a direction to it, and in APL they go opposite ways. And you can fix this in a number of ways. What K does is just to define folds to go left to right instead of APL's right to left, which I think is the more obvious definition that everybody would naturally think of. I do think there are some reasons to say, in mathematics in particular, you might want the right to left fold, which is why in BQN I actually kept APL's right to left fold, even though for the scan I made it so that scans go left to right in, you know, both the kind of outer loop and the inner loop. So in BQN scan is not really compatible with fold. I don't think that was right, now. So I think really for a programming language, you just always want it to go left to right. So I would say k does the right thing there.
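
[A small Python sketch of the direction mismatch being described: k's fold goes left to right, APL's and BQN's go right to left, and the last element of a left-to-right scan agrees with the left fold, not the right one.]

```python
from functools import reduce
from itertools import accumulate

xs = [1, 2, 3, 4]
sub = lambda a, b: a - b

foldl = reduce(sub, xs)  # k/q style, left to right: ((1-2)-3)-4 = -8
foldr = reduce(lambda acc, x: sub(x, acc), reversed(xs[:-1]), xs[-1])
# APL/BQN style, right to left: 1-(2-(3-4)) = -2

scan = list(accumulate(xs, sub))  # left-to-right scan: [1, -1, -4, -8]
assert scan[-1] == foldl          # matches the left fold, not the right
```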

00:12:58 [CH]

What's the math reason for wanting motivation for wanting to have a right to left hold, or is it too is it too dicey to get into here.

00:13:04 [ML]

No, I mean, it's kind of fuzzy. But one of the really important things about it is that, say you're doing a minus reduction; this is something that you would do in APL and you would never do in k, because it doesn't make sense going left to right. The reason why it's nice in APL is, what does minus do, right? It negates its right argument and it adds it to its left argument. So, I mean, that's a weird way to put it, but I'm going to break it into those two operations. And then if you look at a fold, that breaks down into a sum of all the elements, but some are negated and some are not. And if you look at the number of times each element gets negated in this process, it's actually equal to its index. So that's a really nice property. Basically the number of times that a certain value is passed as an argument, if you include when the result is passed as well, is equal to its index, if you do a right to left fold. So in math I think it's kind of nicer, but in programming a lot of the time you want to just express an iteration with a fold, and then it's just crazy to go backwards through the list, because you're seeing the values in the reverse of their index order. So I think there are arguments for both, and in functional programming you often have a fold L and a fold R. I can see the arguments for both, but it seems like the left fold overall makes more sense as a general programming tool.
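
[A quick check of the property Marshall describes, as a Python sketch: in a right-to-left minus fold, each element gets negated a number of times equal to its index, so the fold equals the alternating sum.]

```python
from functools import reduce

xs = [5, 7, 11, 13]
# Right-to-left minus fold: 5 - (7 - (11 - 13))
foldr = reduce(lambda acc, x: x - acc, reversed(xs[:-1]), xs[-1])
# Element i is negated i times, which gives the alternating sum:
alternating = sum(x * (-1) ** i for i, x in enumerate(xs))
assert foldr == alternating == -4  # 5 - 7 + 11 - 13
```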

00:14:28 [CH]

I know that a minus scan in APL gives you... I think they're called alternating sums, but I have no idea if there's anything to... is there... Obviously a plus reduction shows up everywhere, like in, yeah, life and in math. But is there an actual thing, I don't know of an actual thing, where like a minus reduction is a thing? Is that a thing? Because that's what you're saying, like, oh, it has this nice...

00:14:56 [ML]

Um, well, if you look at, say, the Taylor series for cosine, [06] I think that alternates plus and minus. There are a lot of formulas that alternate. Although I think usually you can actually... like in cosine, where this actually comes from is that the Taylor series for the exponential has a factor of i in it, so it goes around, and cosine takes half of those terms, so you get an i squared factor, which goes plus and minus.

00:15:23 [CH]

So you're saying there's certain polynomials where, like technically, the way in math where you have like a capital sigma, you can express some polynomial that way; you can do the same thing if you have like a capital delta or whatever that was for a minus reduction? Like, you could express a Taylor series that way?

00:15:41 [ML]

Yeah, and I mean, that's the normal way to write the math. But also what you can do is just take... so there's a factor of x squared that you need for the cosine; it's really a polynomial in x squared, not x. And if you just make that a minus x squared instead, you'll also get this alternating property. So I think usually there's a better way to write these alternating sums.
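
[For reference, the series being described: cosine's Taylor series alternates sign, and rewriting it as a series in minus x squared absorbs the alternation.]

```latex
\cos x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!}
       = \sum_{n=0}^{\infty} \frac{(-x^2)^n}{(2n)!}
```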

00:16:01 [CH]

Okay, but there is like a... someone could build an argument that there is a use case out there.

00:16:09 [ML]

But also, if all you have is a plus scan, what you can do is just negate every other term of the sequence. And that's fine, because it isn't an alternating sum, it's just every other term is negative. And I mean, that's why I converted the scan order, but I didn't convert the fold order. And, you know, now I think I should have, but I'm not going to change that, because there's a lot of code that relies on that, or that takes into account that particular fold order.
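
[The workaround Marshall mentions, sketched in Python: negate every other term, and a plain plus scan then reproduces what APL's minus scan computes.]

```python
from itertools import accumulate

xs = [5, 7, 11, 13]
signed = [x * (-1) ** i for i, x in enumerate(xs)]  # negate every other term
print(list(accumulate(signed)))  # [5, -2, 9, -4], matching APL's -\ on xs
```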

00:16:31 [CH]

And so k and q are the same way, or are they slightly different from everyone else?

00:16:37 [ML]

I just know about k but k's fold is left to right and I mean I assume q is the same.

00:16:46 [ST]

Yeah that was a nod for the listener.

00:16:46 [CH]

Yeah, interesting. So does that mean k and q are the only left-to-right array languages, maybe? Like, in our Iversonian circle. Yeah, and I'm sure the k derivatives keep that intact too, but there aren't any super popular k derivatives other than q. Interesting. I wonder... I assume not, but is there any insight into why Arthur made that choice? I assume it was Arthur.

00:17:16 [ML]

You know, no, I don't think I've seen anything written about it.

00:17:19 [CH]

That's why you got to come on the podcast, Arthur. We know you're listening.

00:17:24 [BT]

So if you're going left to right, in addition to the... I mean, it might work against the mathematics, but in terms of computer languages, is it easier to do it that way? Does it make more sense when somebody's programming?

00:17:37 [ML]

Well, it depends on what you're doing. And the way I think of it is, there are different uses for these two different folds. I mean, I think usually the left to right one is easier to use, because of the way you use a fold. And I guess we should talk about initialized folds too, but if you're using a fold as a form of program flow control, what you'll say is, "I start with some value, and then I want to process each incoming value and change the value that I'm keeping in some way." In that case, you're always going to want to represent a stream of incoming values as an array in order.

00:18:21 [BT]

So the right-to-left fold just makes you reverse that array a lot. And you're kind of working against the functions that are non-commutative, so things like subtraction. I mean, I found when I was first trying to do reduces and stuff with subtraction, that's when you have to say, "Okay, well, if I'm doing right-to-left, now I've got to reverse those two things so they're in the right order, and then I have to reverse the whole list." And that's where it gets complicated in your head to keep up with, and you don't have that problem if you're going left to right.

00:18:56 [ML]

It's pretty easy to get yourself into a complicated situation with folds.

00:19:00 [BT]

And the other thing, earlier, Conor, you were mentioning that with J's Fold, you have to specify two verbs to be included with the fold. And that's true. And one of them needs to be the identity if you're only going to use one. But I think that does kind of make sense. It's extra work some of the time, but there are other times when it really gives you something, because you can jump into that loop that you're putting the fold through and send something else back out. Which I think is how Henry gets around, you know, maintaining state and things like that: you don't have to keep everything. You can choose what you see and then work with that.

00:19:40 [CH]

Yes, I agree that in the cases where they're useful, they're very useful. Also, this actually never came up at the end, because I think that episode with Henry Rich went super long. We tried to end it like three times, but then got really excited about something that someone said. And part of the reason, actually, that I was very curious to have that conversation was because of Kadane's problem. So I keep mentioning Kadane's, and I'm sure that there's at least one listener that's not familiar. Kadane's algorithm is alternatively known as the maximum subarray sum problem, which also probably doesn't mean much to the same person. But it basically means: if you're given a list of numbers, positive and negative (it's important that there's some negatives, because if there's no negative, the answer is just the whole list), what's the maximum sum you can attain by choosing a contiguous subsequence, or subarray, of that array? There's a couple of different ways to solve it. You can use dynamic programming, but there's a cute way where basically you define a binary operation where you are doing a scan. First you add your adjacent elements, your accumulator and whatever your current element is, and then you take the maximum of that temporary accumulated sum with your current element. So if at any point the current element is greater than the accumulated sum up to that point, you basically reset your sum, because what that basically means is you've encountered a negative, or some sequence of elements where it's suboptimal to include that whole thing, or whatever portion you're looking at, instead of just resetting your sequence. And then once you have that scanned sequence, you just do a max reduction on that. And part of the motivation, I wanted to know more about the folds in general, but I wanted to know: could I solve Kadane's algorithm with a single fold primitive in J? But the answer to that question is no. Because I need two binary operations that are applied.
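
[A Python sketch of the scan-then-reduce formulation Conor describes; the BQN spelling he alludes to is roughly ⌈´ (⊢⌈+)`. Linear time, but it materializes the scanned array.]

```python
from itertools import accumulate

def kadane(xs):
    # Scan: best sum of a subarray ending at each position. Whenever the
    # element alone beats the running sum, the running prefix is dropped.
    ending_here = accumulate(xs, lambda acc, x: max(acc + x, x))
    # Reduce: the best of all the "best sums ending here".
    return max(ending_here)

print(kadane([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6, from [4, -1, 2, 1]
```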

00:22:06 [ML]

I think you probably can, actually. So what J has, the normal scan, is you use two modifiers. [07] You use the fold and the prefixes modifier. And that's the normal way to do a scan that matches APL. But you can also use the fold and the suffixes modifier, which is the backslash dot. And then suffixes goes backwards across the array. So then your folding direction is right to left, and also your outer suffixes direction is right to left. So then it can reuse the results.

00:22:43 [CH]

So, you're definitely correct; that's how I solved that in BQN and in other languages. What I was hoping to be able to do with the J fold, though, so like the capital F and then insert dot or colon or whatever sequence you need: could I do it just with that, without a reduction at the end, without a scan and a reduction, just with the fold? Because what the folds do is they avoid that... basically, what you end up doing with the scan plus a reduction is you materialize the scan, and then you do a reduction on that. And it seemed like there was a possibility that the fold in J could do it all at once, because it had sort of two arguments.

00:23:39 [ML]

I think you can do this with any fold all at once.

00:23:44 [CH]

Definitely, but then you have to hand roll a lambda that's doing both of your operations at the same time. Yeah. And what I was hoping with the J fold is that I could just specify, like, the max as one, and then that binary operation, which is, you know, a fork with max in the middle, plus, and then right to get the current element. But the way that the J folds work, it doesn't do that. It basically is applying like a unary operation on the state, I believe, and then the binary operation. I'm sort of forgetting this; I need to look at the diagram again.

00:24:19 [ML]

Well, so if you're gonna pack it into one fold, you have to have two elements of state: one which is what the scan normally gets, which is the greatest sum that ends at the current position, and another which is the greatest sum that you've seen so far in total. You can't pack that information into one number easily, so I think you would always have to write a complicated function if you wanted to use one fold.
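
[A sketch of the two-element-state fold Marshall describes, in Python: one left fold carries both the best sum ending at the current position and the best seen so far, in constant memory.]

```python
from functools import reduce

def kadane_one_fold(xs):
    def step(state, x):
        ending, best = state
        ending = max(ending + x, x)        # best sum ending here
        return ending, max(best, ending)   # best sum seen anywhere so far
    return reduce(step, xs[1:], (xs[0], xs[0]))[1]

print(kadane_one_fold([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6
```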

00:24:49 [CH]

Yes and no. I mean, this is getting down into the weeds, but I hand rolled my own. I called this once a two-op fold, and then I renamed that to a double fold, but then I don't like double fold anymore, 'cause it sort of implies that you're doing two folds. But I also don't like two-op fold, 'cause two-op fold sounds very non-elegant. But I wrote a two-op fold in C++ that basically just takes a sequence, or an array, and the two binary operations, and calls it a day, and does the unpacking. And you're right, you need those two pieces of state, but you can write an algorithm that does the right application of operations, and stores what it needs to store, and then returns you what you need at the end of the day. And I was hoping that the J Fold does that, but it doesn't. It does something close to that, but not exactly what I needed. I feel like we're gonna probably be losing the listener at this point, 'cause I'm talking about the implementation of a C++ algorithm that no one's looking at right now.

00:25:43 [BT]

Well, one of the things about Kadane's is that it's linear, right? That you don't have to...

00:25:49 [CH]

Yes. Yeah.

00:25:50 [BT]

If you were thinking about trying to get sub-arrays of lists and stuff like that, you think, well, I'd have to look at every single sub-array, and all these combinations. And that's gonna be probably quadratic, I think, that you end up with. So the longer your list gets, the process gets disproportionately longer. But if you're linear, it's the same: the longer it gets, the longer it gets. It doesn't blow up on you; you're just gonna work a little bit longer for the longer list. And that's the advantage of Kadane's. And in the case of J, and I think all the others, having to do that second reduction, that's still linear. So you're still linear, even though you're doing something twice.

00:26:34 [CH]

Yeah, so the motivation for all this is that... you're totally correct. I don't even know what the time complexity of generating every single sub-array is, but it's crazy.

00:26:45 [ML]

There's a starting position and an ending position, so that's two. Okay, yeah. I mean, the total number of steps is a triangular number.

00:27:01 [CH]

So yeah, that's what the naive implementation is. Then if you switch to Kadane's algorithm implemented in an array language, you get linear time complexity, but also linear space complexity, because of the materialization of that scan. But ultimately, my goal is: I want linear time complexity, because it's just a single algorithm, and constant memory complexity, because you only have to carry like one or two pieces of state. Stephen, you had your hand up; what did you want to jump in and say?

00:27:35 [ST]

I'm wondering if it'd be helpful if I were to describe how the q scan works, and how it's able to generate the converge, do, and while forms all from the same iterator. I think if I were to do that, you might see, as a kind of side effect, why it inevitably does the fold left to right.

00:28:02 [CH]

All right, let's go there.

00:28:04 [ST]

Okay, so the scan iterator [08] is a way of repeatedly applying a function. Let's call the function f. So we have a derived function, fscan. And let's say f is, for example, ternary: it takes three arguments. fscan on x, y, and z will apply f to the arguments and keep on applying it till it's done. That's the core concept here: we'll keep on applying it till it's done. How does it know when it's done? Well, the first iteration, it's gonna take f of x, y zero, and z zero. And assume that y and z are conformable lists, which is to say that at least one of them has gotta be a list and the others could be atoms, although they have to be the same length. So basically, the result you get back is going to be f on x, y zero, and z zero, and then the result of that becomes the x, the first argument, for the next iteration. So it's f on the calculated argument, y one, and z one. Clear so far? And we could have other, you know, up to eight arguments and so forth. So for a non-unary f, the termination condition is clear: you just work your way through the length of the other arguments. For a unary f, it's not clear when to stop. So fscan on x is just gonna keep going until it's done. When's it done? It's done when the result you get is the same as, or indistinguishable from, the previous result. It's like, oh, and we're done here. Or if the result matches your original value, which is: oh, we've come around in a circle and we're done. So that gives us the converge form of the scan operator. It's a unary f applied to x until it stops changing or you get back to where you started. Now, the second key thing you have to understand is that an iterator derives a function. So fscan is a derived function. And that function is variadic: the derived function can be used either as a unary or as a binary. So fscan x can take a left argument. If it doesn't, if it's applied as a unary, then you get the converge that I just described: it keeps on going until it decides it's done. But you can give it a left argument to tell it when it's done. And if the left argument is an integer, like three, it'll apply f three times and then it's done. And if the left argument is a function, then it will apply f until that function, applied to the result, returns a zero. So that you can regard as your test. Now, a nice little wrinkle in that is that because q and k see a kind of dualism between functions and arrays, you can replace the test function with a list. And for the same reason, you can replace f with a list. So you can apply a list to an index until you get back to the original index; that is, if your list represents a finite state machine. You can apply f until you get back to the original index, or you can work through your finite state machine a specific number of steps, or you can apply a test to see whether you've got the index that you want. By finite state machine here, I mean a list in which every item is a valid index of the list. Or you can replace your test function with another list, indexed by the results of the first, which gives you a zero when you're done. Which is kind of cute, because you can get an iterated expression in which there is no function involved whatsoever.
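
[A rough Python sketch of the unary behaviours Stephen describes; the names are hypothetical and this is not q's implementation. Converge stops when the result repeats or returns to the start; an integer left argument means "apply n times"; a function left argument is a while-test; and a list acts as a function by indexing, so converging a list from an index walks a finite state machine.]

```python
def converge_scan(f, x):
    # Apply f until the result matches the previous result or the original x.
    out = [x]
    while True:
        nxt = f(out[-1])
        if nxt == out[-1] or nxt == x:
            return out
        out.append(nxt)

def do_scan(n, f, x):
    # Integer left argument: apply f exactly n times.
    out = [x]
    for _ in range(n):
        out.append(f(out[-1]))
    return out

def while_scan(test, f, x):
    # Function left argument: keep applying f while test(result) is nonzero.
    out = [x]
    while test(out[-1]):
        out.append(f(out[-1]))
    return out

# List-as-function: every item of fsm is a valid index into fsm, so this
# walks the machine 0 -> 2 -> 1 and stops before cycling back to 0.
fsm = [2, 0, 1]
print(converge_scan(lambda i: fsm[i], 0))  # [0, 2, 1]
```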

00:33:32 [BT]

What an awesome rabbit hole. I feel like I've dropped through something.

00:33:38 [CH]

It sounded like there were fixed-point scans in there, which I think is what they've been referred to as. Are k and q the only languages that have those, too?

00:33:46 [ST]

I'm sorry, I don't know what a fixed point is.

00:33:47 [CH]

Fixed point is like one that converges to the same answer. Newton-Raphson's method is an example of a fixed-point algorithm. So it's exactly what you described: when it converges, when the answer no longer changes within some specified threshold, you stop.

00:34:01 [ST]

Dyalog APL has got something of that kind too. I forget what it is.

00:34:05 [RP]

Yeah, doesn't BQN have power as well? It's a different operator.

00:34:09 [ML]

So BQN has repeat, but it's only ever a fixed number of times. I don't like this fixed point thing, because it means that hidden in here, there's the condition on what values are the same or indistinguishable. So for numbers, you often want to stop when it hits a certain error threshold or something. And so I don't like the language deciding for you when's a good time to stop. I think you should write that out yourself as the programmer.

00:34:39 [ST]

So you would use the while form of that and apply the test yourself?

00:34:41 [ML]

Yeah. I mean, so that kind of makes sense. The other thing in BQN is that there are no primitives that can keep going forever. There's stuff that takes a variable amount of time, depending on what argument it gets, but there's nothing that does one iteration and then decides, am I going to do the next iteration? So that's really nice because like if you have a tacit function that's just made of primitives, you always know that it's going to stop eventually, that it describes a finite computation. So that's another reason why I wouldn't want to add that even if I had, I mean, I wouldn't add like a while operator, which I know how a while operator is supposed to work, and there's a system function for it even. But I don't want a primitive that loops forever.

00:35:26 [BT]

And in J, they actually use infinity as the argument to the power conjunction to keep looping, and when you put them together, it ends up looping until it converges. And when it converges, there's no change. That's how J does it, the same thing that Stephen...

00:35:46 [ML]

Yeah, well, so in J it just stops; it only ever stops if the value it gets matches the previous one. So if you do minus power infinity of three, that'll just go forever, 'cause it alternates. But q, I guess it also compares it to the first value, but it doesn't compare it to any of the values in the middle, does it? So if you're running, like, the Collatz conjecture, where you can start at some number and go along the way and eventually go into a cycle, does it find that?

00:36:25 [BT]

No, I don't think it does.

00:36:25 [ML]

Yeah. So it seems like J's version where you only compare to one value is cleaner to me than the one where you compare to two values, but they're just kind of arbitrary.
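
[A sketch contrasting the two stopping rules just discussed, with hypothetical helper names: comparing only to the previous value (as described for J) never stops on negation, while also comparing to the starting value (as described for q) catches that 2-cycle, though not longer ones.]

```python
def converge_prev_only(f, x, max_iter=10):
    # Stop only when the new value equals the previous one.
    for _ in range(max_iter):
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt
    return None  # no fixed point: negation alternates 3, -3, 3, ...

def converge_prev_or_start(f, x):
    # Also stop when the result comes back around to the start.
    start = x
    while True:
        nxt = f(x)
        if nxt == x or nxt == start:
            return x
        x = nxt

print(converge_prev_only(lambda v: -v, 3))      # None (capped here)
print(converge_prev_or_start(lambda v: -v, 3))  # -3, since -(-3) == 3
```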

00:36:37 [BT]

Well, they're arbitrary, but it's reasonably arbitrary. I mean, you're starting with your starting value.

00:36:41 [ML]

I mean, I'm not saying it's awful, but it seems like it just doesn't sit well with me.

00:36:44 [BT]

The other thing I should mention with Kadane's is that Tom McGuire on the J Wiki actually did a whole tutorial on Kadane's, [09] and he did that recently. And at the end of it, he realized that he'd done all these reverses and suffix scans and all those kinds of things, and then he applied Fold to it. So he's got it in a Fold version as well. And I guess while I'm thinking about it, Will Gajate also, and I think he referred to this in a previous episode, Conor, had actually done an equivalence between the k scans and the J versions with folds and everything. So we can put those in the notes as well. If you're interested, you can sort of see what the equivalences are. And I think he's got most of the equivalences. I think there's a couple of fixed points where he hasn't come up with versions in J that would be sort of optimal.

00:37:31 [ML]

Yeah. K definitely has like the most varieties of this.

00:37:36 [CH]

I just googled "Kadane" on the jsoftware website and it gave me one result. Trash.

00:37:47 [BT]

Okay, so Tom McGuire, if you, well, I'm not sure googling, but if you're on the J Wiki and you do Tom McGuire, well, I mean, we'll put the link in the show notes.

00:37:57 [CH]

I need to see if... if someone solved this in J using a fold, maybe I can go confirm whether I was right or wrong. I spelt McGuire wrong.

00:38:08 [BT]

M-C-G-U-I-R-E.

00:38:10 [CH]

Oh, I really spelt it wrong. Added an A, added a Q.

00:38:15 [ML]

One thing I wanted to say about the reverses, I mean, the whole problem is not defined in any particular ordering. So you can do a forward scan or a backward scan equally well. You should never have to reverse the array because, you know, a subsequence has no ordering.

00:38:34 [BT]

Oh, for Kadane's, yeah. No, you're right. Yeah. For Kadane's, that's right.

00:38:39 [CH]

For a second I got very excited, because I searched on the page, I found the page (we'll link it), and there was an expression that was just using F colon dot, where the fork using right, max, and plus is the binary operation, and then it uses identity on the left. But that's the stepping stone to the final solution, which adds a max reduction at the end of it. So for a second I was like, oh, it works. But yeah, all right. Well, it's good to know that the conclusion I came to is maybe not correct, but at least it's consistent with what Tom has here.

00:39:21 [ML]

And I did want to say, about the memory: using linear memory is just how array languages work, because your arrays are immutable. For pretty much any problem, you're going to have to make a bunch of new arrays.

00:39:35 [CH]

In existing array languages, there could be an array language that avoids arrays when possible.

00:39:43 [ML]

Yeah. And you can split up the problem.

00:39:46 [BT]

There could be an array language that avoids arrays?

00:39:48 [ML]

You've got a point.

00:39:50 [CH]

Alright, alright, I messed that up. It avoids materializing arrays when possible.

00:39:57 [ML]

But it's going very much against the grain of the language to say, "All right, I've specified my answer in these high-level array terms, and now I want you to turn it into a C program for me."

00:40:08 [RP]

I don't know, that's some people's dream, isn't it?

00:40:10 [ML]

I mean, it's still nice to specify a problem this way, but actually this array form for specifying gives you some pretty big advantages. You can have a function... you won't necessarily have a function for this, you know, maximum and plus scan, but for the max reduction at the end of it, you can definitely do that very quickly. So if you try to pack that all into a big iteration, your algorithm is no longer expressed as an array operation, and these array operations are things that we know how to do really quickly. So you are even giving up some performance information if you tell it: well, yes, I'm in an array language, but don't actually make me any arrays.

00:40:56 [CH]

I agree up to a certain extent. I guess I'm not... What's that quote that says the boundary of a language is the boundary of my imagination, or something like that? I'll Google it when someone else is talking later and then get the real quote, because that's not it. But I mean, my dream is that I want to be able to write the most expressive solutions to problems and then have it do the most performant thing. Like for Kadane's, for example, the most performant thing is to hand roll that reduction yourself. It's going to be faster than materializing.

00:41:43 [ML]

I'm not convinced of that.

00:41:46 [CH]

Well, here we go. Marshall and I will be giving a talk together at some point at some conference, in a very adversarial... I think, for definitely a certain size of data. And that's the thing: you have to understand, I work for a company where, you know, our hardware is optimized for, you know, national labs and people working with supercomputers. Like, I was having a conversation with someone at work the other day, and they were dealing with the question of how many data points are we dealing with here. And like, 2 billion was a small number. Like...

00:42:23 [ML]

2 billion is a small number.

00:42:26 [CH]

I know. But isn't that it? Well, I mean, because I think at one point you said that the most common size of your average array is like 10,000 or so.

00:42:34 [ML]

Well, but what you can do, even when you have an array algorithm, you can split it into smaller arrays. And this is often a lot better, because you get to use vector operations with these. So for Kadane's algorithm in particular, I don't know how to express that purely with vector operations, but I think there probably is a way. And in that case, if you write it in C style, where you interleave the scan and the reduction, then it's much harder to go from that to a vectorized algorithm, which would be, if it exists, almost definitely the fastest way. Whereas the array algorithm more naturally splits up. And it's still hard to get... I mean, the way you would get the array thing to be cache friendly is that you run it in these blocks. And then within a block, it's doing a bunch of vector operations. But what you really want is, like, you know, working on two vector registers, say, at a time.

00:43:46 [CH]

Aren't all these solutions where you're chunking data, though, like, you're handrolling that in your array language, correct? Like, we don't have that in any of these array languages.

00:43:55 [ML]

Yeah, the array language doesn't automatically chunk, but it's not that hard ...

00:44:00 [CH]

But that's the thing you have to understand: my goal is, I wanna write the BQN max reduce, paren, you know, right max plus, end paren, scan.

00:44:12 [ML]

Yeah, and I think the way for the implementation to get the best, I mean, maybe not with this particular problem, but definitely for things that are friendlier to arrays where you don't have any compound functions inside scans. The way to optimize those is not to immediately break it down into a series of scalar operations, but instead to be more careful and start with your whole array stuff and break that as necessary, and maybe even compile these array operations into operations of individual registers, which is hard. Nobody's really done that, but coming from the other side, from C, there have been, you know, who knows how many man hours poured into work on auto vectorization. And it's still pretty terrible. It would have absolutely no chance of handling something like Kadane's algorithm. So, I mean, you know, getting the best implementation of an idea is pretty difficult. But I think actually starting from an array representation, you do have a pretty good chance without going through a C style scalar representation first.

00:45:25 [CH]

I mean, I'm not advocating necessarily for compiling down to some C thing. My goal is to not have to do anything to the original solution: just have, like I said, the most expressive solution handwritten, that then is accelerated maximally, without having to do the chunking and stuff. So like, I agree you can do all this stuff you said, but my dream is to have that stuff automatic behind the scenes, and that whatever sandbox you create to put this language in, it will see through all of that: see that there's the juxtaposition of a reduce, or a maximum reduction, with some kind of scan; see through the binary operation that you're passing to the scan; understand the algebraic properties of what that binary operation is. It knows max reduce, you can maximally parallelize that; the binary operation depends on the algebraic properties of that. And then it sees through all of that. And I understand it's a pipe dream. What was it? Guy Steele, he had a project very similar to this called Fortress, [10] which was supposed to be this parallel Fortran that actually had definable Unicode symbols and was aware of the algebraic properties of binary operations. It had a lot of these ideas, and it ultimately ended up failing. I think it was mostly 'cause of funding, not necessarily because his ideas weren't good. But I understand what I'm asking for is lofty. But that's what the dream is. That's the whole, you know... I think at some point Bob asked a question, and then we went on this big Kadane's rabbit hole with a couple tangents here and there. But at the end of the day, like...

00:47:07 [ML]

Well, so the important thing to know is: is this operation associative? And I think what I should do is just try it out on a table of arguments; make a three-dimensional table, in either order.

00:47:18 [CH]

It's associative, I believe, but it's not commutative.

00:47:22 [ML]

Yeah, so if it's associative, there's a general optimization for associative scans. So it could be vectorized.

00:47:33 [CH]

Anyways, I feel like we're way... anyone wanna pull us out of this rabbit hole that we're extremely, extremely deep down?

00:47:40 [BT]

One of the things that might give us some leverage back out: I think your ideal situation, the way Henry's approached it with J, is his special code. [11] So there are certain combinations that will give you quicker results, because it's optimized. And there are certain operations that will save you space, because they can do them in place. But the challenge is, you have to know those combinations. If you miss those combinations by a little bit, it's going to do it the way it interprets it, and it may not be the most efficient. And I guess what you're looking for is, in all cases, it's just going to figure out what you're doing and say, oh, well, this would be the fastest way to do it.

00:48:20 [RP]

Is that related to the idea of thunks?

00:48:23 [CH]

Yes, yeah. I mean, this sort of overlaps a little bit with... I give this lightning talk internally called collection-oriented programming, which array languages fit into, Smalltalk, functional languages, and then, within other languages, libraries like C++ ranges, Rust iterators, Java streams. And all these libraries, at least, not so much Smalltalk, but a lot of the functional languages, implement stream fusion. So it's a lazy implementation, so that you basically only ever have constant memory, even if you're doing, you know, a filter after a transform, after another filter, after a reverse. It manages to build up these kind of... it depends on the language, but in C++ it's basically structs wrapped inside of structs, wrapped inside of structs. And then you call some operation at the end of the day, and then poof, it does this whole evaluation of this thing that you've built up, which Rich referred to as a thunk, which is I think what they call them in functional languages. So it's this unevaluated thing that you need to call some certain type of operation on. And there's no reason you couldn't implement an array language sort of designed around that. It just means that not all operations... like, not everything can be lazy, right?
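
[A small Python sketch of the lazy, fused pipeline being described: generator stages wrap one another (the "structs wrapped inside structs" shape), and nothing runs until the final reduction forces the chain, in constant memory. A reverse, by contrast, can't stream this way: it needs the whole input first.]

```python
xs = range(1_000_000)                   # lazy source, never materialized
evens = (x for x in xs if x % 2 == 0)   # filter: an unevaluated "thunk"
squares = (x * x for x in evens)        # transform wrapped around the filter
total = sum(x for x in squares if x % 3 == 0)  # the reduce forces it all
print(total)
```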

00:49:45 [ML]

Well, so there are array languages that do this. There's Apex, which is the APL parallel executor.

00:49:52 [CH]

Is that Bob Bernecky's?

00:49:53 [ML]

Yeah.

00:49:55 [CH]

That's what, that's what Apex does?

00:49:56 [ML]

I believe so. That's what I got from ...

00:49:59 [CH]

We gotta get Bob on. We gotta get Bob on. Bob, if you're listening, we gotta get Bob on.

00:50:04 [BT]

Yeah, we gotta get Bob on. You're right, we gotta get Bob on.

00:50:06 [CH]

I had no idea. I know of Apex. [12] I think I've seen a couple papers. But I did not know that was what it was doing. Are there any other examples? Or is that just the one that comes to mind?

00:50:19 [ML]

So I think... I mean, it's hard to remember what exactly does what. I think this is pretty much what Futhark does, although I'm not... I know it has some sort of fusion and fission. I mean, the thing is, it's just not a good idea to split things into single operations. So the array languages that are doing this sort of fusion are doing other stuff as well, to try and keep some of the array power. I mean, Futhark's working on the GPU, which is a very different model. And splitting a stream operation on the GPU into sequential processing would be saying: I'll just ignore the fact that I'm running on a GPU at all, pretend it's a CPU, and then get worse performance.

00:51:11 [CH]

Sorry, wait. When you say it's a bad idea to split operations, what do you mean by that?

00:51:16 [ML]

If you want to compile a stream, the way to do this reliably is to split it into one operation at a time. And then you can always compile the stream, this sequence of stream operations, and it always uses linear memory. But it's just not very good at the end of it. So if you have an addition and then a filter and then another addition or something, it's pretty tough to-- the filter might pick out every other element. And at that point, if you're trying to keep things in vectors, it gets very complicated because once you filter out half a vector, you're not aligned to the registers anymore. So what you have to do to compile this sequence of streams in a way that's simple is just to split it into one operation and then another. And I don't know which array languages have tried to do this. I think most compiled array languages sometimes do something like this. But usually, they're trying to keep the array structure more intact so they're not doing exactly this. And so I'm not sure which ones do what from this perspective. Because the benefit of an array compiler is that it does array stuff. So no one's gonna talk about, well, yeah, we compile, we can compile scalar things with it.

00:52:44 [CH]

We gotta have Bob on. All right, back to k and q and all the other languages' scans and folds. What do we have left to say? BQN's, k's and q's are better.

00:52:57 [ML]

Well, so we more talked about fold. We didn't say that much about scan, I guess. [13]

00:53:02 [CH]

What happens to the scans?

00:53:05 [ML]

J has got ... [sentence left incomplete]. I guess there's some interesting progression from APL to J, where it split up the scan. And K didn't do that. In BQN I decided not to do that either.

00:53:22 [RP]

[By] splitting up the scan you mean getting the prefixes and reducing on each?

00:53:27 [BT]

You can separate prefixes from suffixes before you do your reduce.

00:53:31 [ML]

And there's kind of a question to me of how closely related is scan to reduce really?

00:53:39 [RP]

I was gonna ask ... APL didn't always have general scans and reductions, right?

00:53:46 [ML]

No, you could only use arithmetic with it.

00:53:48 [RP]

So I think that goes partly to explaining the reasoning for having ... [sentence left incomplete]. Because the alternating sums ones ... [sentence left incomplete]

00:53:56 [ML]

Yeah, in old APL syntax you wouldn't even have the concept of saying: forward slash, then backward slash (/\) because you can't apply an operator to a function that has another operator applied to it.

00:54:07 [RP]

So then, because they were thought of as the fold, but with the in-between values, then they're thought of as very, very closely related. You're saying you might separate them out.

00:54:19 [ML]

Well, I don't know. I mean, obviously you can define a scan as the intermediate results from a fold. And there's this other thing that I mentioned; (I guess it was two podcasts ago) about how I think the kind of natural way to split it up is to have the result from the fold and also all the results from the scan that includes the identity element. Yeah, I'm not sure I said this, but another thing that's kind of cleaner about that is that ... [sentence left incomplete].

00:54:50 [CH]

Wait, did you say that there's a version of a scan that prefixes the identity element to the result?

00:54:55 [ML]

Yeah. So.

00:54:58 [CH]

How does that work for any custom binary operation though?

00:55:01 [ML]

Well, you'd have to pass in an initial element. So that's another thing: So K and Q and BQN all have this initial element. And I think of that as kind of the most ... [sentence left incomplete]. Like leaving that off and just using the element (the identity element that you get from the function) that to me is just a convenience; that's an optimization. You know that for a plus scan, it always makes sense to start with zero. But I think there are all sorts of problems with more complicated functions, if you leave off the identity element.
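
[The seed-versus-identity distinction, sketched with Python's accumulate: an explicit initial value yields n+1 results and still works for an empty input, while omitting it relies on the function having a known identity.]

```python
from itertools import accumulate
from operator import add

print(list(accumulate([1, 2, 3], add)))             # [1, 3, 6]
print(list(accumulate([1, 2, 3], add, initial=0)))  # [0, 1, 3, 6]
print(list(accumulate([], add, initial=0)))         # [0]: still informative
```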

00:55:39 [RP]

I was about to ask: what's the difference between the seed value, this initial value, and just prefixing your actual argument with a different value.

00:55:52 [ML]

Those would be the same, but you have this weird thing that, you know, if you have a homogeneous array but you want your initial value to have a different type, then you prefix it and you get an array that's not homogeneous. So you're kind of, you know, spitting in the face of your array language there [chuckles]

00:56:10 [BT]

And the reason you got an identity element is because you got a binary function and you've only got one argument. So what you would return is the identity element and then get on with the rest of the binary.

00:56:22 [ML]

And so I think that in some cases (now I can't remember what I ran into) sometimes that ends up a lot cleaner with an empty array. So what happens with an empty array is that then the ... [sentence left incomplete]. Oh yeah, yeah! What I was thinking about is [this]: if you have an inclusive scan, you might think it's nice because you can get the result of the fold from just the last element of the scan. So all you have to do is scan, and then you can get the fold out of it. But if you have an empty array as an input, then the result of your scan is empty, and you don't have any information. So you need a way to get that initial element back. But having the exclusive scan and the fold together solves that. What happens with an empty array is that you get an empty scan, and then actually the identity element goes into the fold. So the fold is the exact same, and the scan is this shifted version that starts with the identity element. An example that I couldn't come up with on that episode was if you have the strides in an array (this is a backward scan). This comes up a lot if you're implementing an array language, but sometimes also if you're just using it. What you want a lot is the strides of an array, which is to say, along any particular axis of the array: how many elements do I have to move, in index or ravel order, to go one step along this axis? For the last axis, it's always 1, which is the identity element of multiplication. For the next axis up, it's the length of that last axis. So to get to the next element (like, if you're in a table, to go vertically), you have to go horizontally all the way across that row and wrap around to the next element. And so on going up.

What happens is this is a reverse scan with multiplication, but it's an exclusive scan, because it ends with one. And then the next stride is that times the last axis, and the next stride is that times the second-to-last axis, and so on. And the final fold that you get out is not a stride, but the total number of elements in the array, which is also a useful value, but which you usually don't want to keep in the same array as those strides, because it's not a stride in the array: if you go [forward] the total number of elements starting at any element, you'll run off the end of the array. So that's two pieces of information that you use a lot when you're implementing. And with those, it's pretty obvious that the best form to have is this exclusive scan plus the fold. So that still has fold and scan [being] really closely related, but the relation is different than this other view which K and Q use, to say that ... [sentence left incomplete]. Like in q, where you call the plus scan "sums", because it's just every sum along the different lengths of the array. So that's, I guess, the relationship: the way that APL and q and everything view it is that a scan is kind of a fold with extra information, but in this other view, the scan and the fold both tell you different information about what you get when you're traversing the array.
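
[Marshall's strides example, sketched in Python: a reverse exclusive scan with multiplication over the shape. The scan yields the strides, starting from multiplication's identity 1 for the last axis, and the fold that "drops off the end" is the total element count, which is not itself a stride.]

```python
def strides_and_count(shape):
    strides, acc = [], 1
    for length in reversed(shape):
        strides.append(acc)    # exclusive: record before multiplying in
        acc *= length
    return strides[::-1], acc  # acc is the fold: total number of elements

print(strides_and_count([2, 3, 4]))  # ([12, 4, 1], 24)
print(strides_and_count([]))         # ([], 1): empty input still yields the fold
```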

01:00:04 [CH]

So if I understood that correctly, your exclusive scan is similar to the C++ one, where it takes an initial value and gives you back N elements. Meaning that it doesn't include the final result, which is what you're saying the fold is for.

01:00:27 [ML]

And you know, it should return that. C has this nice thing where you can just write to pointers, so it should write all those intermediate results and return the final fold. But maybe it doesn't do that.

01:00:39 [CH]

Oh. Because my response to that (given now that I have confirmed that I understood that correctly) is like, if that's the model you're advocating for (which it doesn't sound like you're necessarily advocating for, you're ... [sentence left incomplete]).

01:00:51 [ML]

I'm not necessarily advocating that this is the thing that you want your programming language to do. But I think in a lot of cases this is the way that you should think about it.

01:01:00 [CH]

You should think about it. OK, so it's a mental exercise.

01:01:03 [ML]

OK, so you should keep in mind that, you know, maybe if you're doing a scan, what you actually want is this exclusive scan. So in BQN, what you would do is ... [sentence left incomplete]. I mean, it depends on whether you think the array is gonna be empty, but maybe you take the scan and you put the identity element in front, or maybe you shift the identity element in so that the fold drops off the end. But you do something to get it closer to that format and then work with that, and often it will turn out that your program ends up nicer, because the format that you're thinking of is the more natural fit.

01:01:43 [CH]

Yeah, I don't disagree with that as a mental model. My remark was just going to be like: it's upsetting if that's the way it works, because then if I ever want that last value, I either actually have to go and call a fold, which after just having done linear ... [sentence left incomplete].

01:01:58 [ML]

Well, you would want a function that returns two values and that's also really messy in an array language. I mean it's ugly if you want the fold to just be calling this function and saying well, then take the last element of that and do that every time you want to fold over an array.

01:02:19 [CH]

So yeah, as a mental model for it, though, I completely agree that it's useful to think about it that way. And yeah, [in] the world of scans there's, as we discovered when talking with Henry, there's many, many different flavors. Not enough good names. [all laugh] Yeah, what do we call all these things? Pre-scans, scans, post-scans. And actually, I think there was a paper. I'll try and find [it]. Maybe it wasn't a paper, but definitely there have been GitHub issues in different languages that have libraries implementing these things. Rust is the one that comes to mind. I don't think they necessarily had it for scans, but they had it for folds, because some of the folds have the same issue. Like, do you use an initial value with the first element for your first application of your binary operation? Or do you just use the first two? And that means, if the input might be empty, you get back an optional value. So what did they call these? I think at one point Rust had called this like fold-with-first or fold-first, but then I think they ended up axing that. And when C++ was proposing these, they were calling them fold-first, but now I think the final things they got in were like fold-left and fold-right, and then fold-left-first and fold-right-last.

But anyways, the point is that this issue actually is a thing in the wild in these different languages, because of all these different flavors. And I think we even have a version that returns an iterator, not a value. So that one's called like "fold left iter". And I'm not even sure, because the thing is, every single time you add one of these binary options, it doubles the number. So we went from 2, and then we needed like: "oh, does it have an initial value?". That makes it 4. And then it's like: "oh, well, do we want to return an iterator or do we want to return a value?". And that makes it 8. But I don't think they actually did the full matrix. I think they just did "iter" for fold-left and fold-right, not for fold-left-first and fold-right-last. I could be making this up, but the point is, all of this stuff is incredibly mucky, which is why I kind of enjoy thinking about this stuff: because at the end of the day, really what you want is the most primitive versions of things that you can just compose, so that you don't get this explosion.

Like, the C++ algorithm library is notoriously bad for this. We have all these (and don't get me wrong, the algorithm header is amazing and it is a work of art by Stepanov) suffixes and prefixes that aren't even used consistently. So we'll have an algorithm like copy, and then we'll have "copy if", which is essentially a filter. We have find and "find if". But when it comes to transform, which is like a functional map (the thing that's implicit in array languages), we have transform but we don't have "transform-if", right? So we have this incomplete matrix of stuff. But even if we were to complete the matrix, we would go from like 100 algorithms to like 400, because, you know, there's no composability and scalability. It's just you're hard coding all of this stuff in. Anyways, it's a problem that exists in many places in computer science. Bob, you were gonna say something?
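
[Ed. — the C++23 names landed as std::ranges::fold_left, fold_right, fold_left_first and fold_right_last (plus _with_iter variants). A sketch of why the "first" flavors return an optional:]

    #include <algorithm>   // C++23: std::ranges::fold_left and friends
    #include <functional>
    #include <optional>
    #include <vector>

    int main() {
        std::vector<int> v{1, 2, 3};
        std::vector<int> e;                    // empty
        // With an initial value, the fold is total: empty input is fine.
        int a = std::ranges::fold_left(v, 0, std::plus<>{});   // 6
        int b = std::ranges::fold_left(e, 0, std::plus<>{});   // 0
        // Without one, the empty case forces an optional result.
        std::optional<int> c = std::ranges::fold_left_first(v, std::plus<>{}); // 6
        std::optional<int> d = std::ranges::fold_left_first(e, std::plus<>{}); // nullopt
        (void)a; (void)b; (void)c; (void)d;
    }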

01:05:43 [BT]

I was just gonna say, I think that brings it back around to what Stephen talked about in Q and K. They've got a very, very simple set of primitives that do this amazingly complex thing, because when you get in deeper, you can find all the stuff you can do, but the actual expression of it at the surface is actually very simple. So rather than having all these different libraries that could do this thing, what they do is they give you something simple that can be used in a very complex way. And I think that's the power of what Stephen described there with Q, and I would imagine the same is true for K and, to some extent, BQN. [It] is that you've got a simple approach to your language which can be used in very complex ways. You've got building blocks that can make some very complex structures.

01:06:32 [ML]

I mean, the whole idea is to just break things down as far as you can. If you really do it right, then putting two things together to make something that would be one thing in the C library is ... [sentence left incomplete]. It's not necessarily the same, but it's easier to think about, it's easier to remember, and, you know, often it can perform pretty close. So that's, you know, one of the big things about the array languages. And also, we don't have this array [vs] iterator distinction: everything's an array. So cutting things out or splitting things up, breaking it down to the simplest possible thing that works: that is a really good way to manage complexity in the language.

01:07:28 [CH]

I mean, it's to a limit though, right? Like cut in J [14] is a great example of that. [here Marshall agrees] Even in talking about the scans, you can see that the proliferation of small flavors, binary options like "do this [or] do that", leads to this explosion, and that's kind of what happened with cut, right? Cut is, in my opinion, a function that takes like four or five arguments. But the language is limited to two-argument functions, so then they end up doing this weird stuff [where] a value is one argument, but whether it's negative or positive is one of those binary options, which, admittedly, I think is quite a poor design. But you're constrained, so you have to find a way to do those flavors. And, you know, APL doesn't really have anything matching the richness that J offers: they've got partition and partitioned enclose, but if you want to do something else, you're going to have to mess around with the primitives that they give you and figure out a way to get exactly what you want. I think I talked to you about this one time, Marshall. We messaged back and forth, and your take on splits and stuff was that there was just too much variability in the different types of chunking and splitting you can do. Like, do you include the delimiter? Do you include it with the first one or the second one? And your take was that that stuff just belongs in a library, which goes full circle back to what Bob just said. Yes, we do try and solve this stuff at the language level, but at a certain point, because of the explosion of different flavors ("how do you implement this? It could be this way, this way, that way"), you [multiply] five different binary options and you instantly have 32 different flavors of algorithms that can't be exactly boiled down to just a composition of two or three different things. That's when you kind of do need a library.

And actually, this is totally a random tangent, but there was a problem (I think it was LeetCode) that I looked at the other day that was like: given an array of integers and a subsequence length k, and then a value N, return an array that gives you the Nth smallest value for each length-k subsequence (like, that is, windowed). So if you have 10 elements and your value [of] k [is 5] and N equals 2, you first look at the first five elements and say: what's the second smallest? And then you look at the 2nd to 6th [elements] (those are your next 5 elements) and you say: what's the second smallest in here? And coming from C++ land, the way you would do that is with a heap. So you would store something so that you can pop things off and insert things in O(log N) time, so you don't need to constantly sort your windows. And then my first thought was: how would you solve this with the same performance profile in an array language? Like, I just gave up [chuckles].
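
[Ed. — a sketch of the ordered-structure approach Conor describes, using two std::multisets (balanced trees standing in for a heap); the function name is invented for illustration, and it assumes 1 <= n <= k <= a.size().]

    #include <set>
    #include <vector>
    #include <iterator>
    #include <cstddef>

    // For each length-k window of a, the n-th smallest element (1-based),
    // in O(log k) per step: "lo" holds the n smallest of the current
    // window, "hi" the rest, so the answer is always the largest of "lo".
    std::vector<int> windowed_nth_smallest(const std::vector<int>& a,
                                           std::size_t k, std::size_t n) {
        std::multiset<int> lo, hi;
        auto rebalance = [&] {
            while (lo.size() > n) {                 // demote lo's largest
                hi.insert(*std::prev(lo.end()));
                lo.erase(std::prev(lo.end()));
            }
            while (lo.size() < n && !hi.empty()) {  // promote hi's smallest
                lo.insert(*hi.begin());
                hi.erase(hi.begin());
            }
        };
        std::vector<int> out;
        for (std::size_t i = 0; i < a.size(); ++i) {
            if (hi.empty() || a[i] <= *hi.begin()) lo.insert(a[i]);
            else hi.insert(a[i]);
            rebalance();
            if (i + 1 >= k) {
                out.push_back(*std::prev(lo.end()));   // n-th smallest
                int leaving = a[i + 1 - k];            // slide the window
                if (auto it = lo.find(leaving); it != lo.end()) lo.erase(it);
                else hi.erase(hi.find(leaving));
                rebalance();
            }
        }
        return out;
    }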

01:09:52 [ML]

My thought is, this problem was written by a C programmer who needs to do this [all laugh]

01:10:26 [CH]

I thought it was interesting because I was like: it's pretty simple. You've got windows in BQN, or you've got the sliding reductions; you've got sort primitives. Very easy to solve, but it's going to be the naive way of solving it in an array language. Anyways, the reason that this popped into my head is that I love the array language paradigm, and I love what Bob said about trying to find the perfect set of primitives that can represent and solve all problems simply. But definitely, there is a category of problems and certain algorithms where you hit walls in array languages. [But] I'm also not an array language expert, so maybe there is actually a very nice way to solve that with heap performance.

01:11:04 [ML]

Well, yeah. So I will say, I don't know about the second smallest values. But I looked at windowed reduce in Dyalog (so finding the smallest, the very smallest value from every window of, say, 10 elements or 100 elements or something). When I looked at this, Roger Hui had written the C-style queue-based algorithm to do this, so you know you ... [sentence left incomplete].

01:11:36 [CH]

C-style queue? So he wrote the solution in q, but like ... [sentence left incomplete].

01:11:39 [ML]

Q-U-E-U-E

01:11:42 [CH]

Oh, geez! Thank goodness I clarified that!

01:11:45 [ML]

So where you have a queue of the smallest values and you add a new value and you compare it to the values you've got or whatever it is.

01:11:53 [CH]

Although a queue ... [sentence left incomplete]? How does it [work]? Cause a queue is just like a pop-off, you know? Pop front, [then] push back.

01:12:00 [ML]

Yeah, you have to add it in some sort of ... [sentence left incomplete]. I don't remember the exact details, but I mean, there's a known way to do this: to keep enough information so that you can find the smallest of the window at every step. And it runs in linear time, and it uses maybe (in the worst case) an amount of memory that's equal to the window size. But I looked at this and I said: well, if the window is small, what I should really be doing is taking pairwise maximums in powers of two, because that lets me use array operations. So I'll start with taking the (I've been flipping between minimum and maximum), but say taking the maximum of every pair of elements. And then taking the maximum of the ones that are three apart, and so on until you get up to your window size. Or maybe 1, 2 and 4, and something like that. And then I tested this out and it performed very well. And what I actually found was, I couldn't find any combination of arguments where that linear-time C solution was better than this logarithmic-time array solution. So, I mean, I know eventually ... [sentence left incomplete].
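
[Ed. — the queue algorithm Marshall is recalling is usually a "monotonic" deque; a sketch follows (no claim that Roger Hui's C code looked exactly like this). Each index is pushed and popped at most once, hence linear total time; the function name is invented for illustration.]

    #include <deque>
    #include <vector>
    #include <cstddef>

    // Sliding-window minimum: the deque holds indices whose values are
    // increasing, so the front is always the current window's minimum.
    std::vector<int> window_min(const std::vector<int>& a, std::size_t k) {
        std::deque<std::size_t> q;
        std::vector<int> out;
        for (std::size_t i = 0; i < a.size(); ++i) {
            while (!q.empty() && a[q.back()] >= a[i]) q.pop_back(); // dominated
            q.push_back(i);
            if (q.front() + k <= i) q.pop_front();   // fell out of the window
            if (i + 1 >= k) out.push_back(a[q.front()]);
        }
        return out;
    }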

01:13:16 [CH]

So, wait, wait, wait. When you say linear, do you just mean for each window? Because it's like N times, or it's the length of the sequence times K, which is the window size. We'll call the length of the sequence (because I already used N, so I'm messing this up), we'll call the length of the sequence A. The window size is K, and your top Nth value is N. So it's ... and you're describing A times K ... [sentence left incomplete].

01:13:43 [ML]

Well, here you just want the maximum overall. I don't know how you'd find the second highest. That's more complicated.

01:13:52 [CH]

OK. But even for the smallest, you're saying it's still A times ... it's O(A * K), [15] whereas for the heap solution it's O(A * log K), correct?

01:14:04 [ML]

I think the heap can be done in just linear time, order of A. It's complicated, and it's not using O(1) time on each step, but when I looked into it, I think I convinced myself that, on average (if you took the total time), it was linear for the whole thing in any situation.

01:14:27 [CH]

Well, we'll have to kick this to another [episode] cause I'm so ... [sentence left incomplete]. It's possible, but you're saying there's a linear solution for the whole thing?

01:14:36 [ML]

And practically when I timed it, it was always pretty close to the same amount of time, regardless of how the data [was arranged]. If it was always increasing or always decreasing or other complicated patterns.

01:14:48 [CH]

So this solution you're describing is big O(A).

01:14:53 [ML]

Yeah, and the array solution would be O(A * log K).

01:14:57 [CH]

[incredulously] The array solution? I'm so lost now.

01:15:01 [ML]

So where you do the pairwise thing at the different distances. The way you actually implement that is that you take slices of the array that are different distances [apart] and do the array operation on those. So it's this length-log(N) sequence of array operations that are each A steps long (possibly a little shorter). But I know eventually the O(K) solution has to beat the ... [sentence left incomplete].
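
[Ed. — a sketch of the slice-based doubling approach just described. Each inner loop stands in for one whole-array operation on two shifted slices; there are about log2(k) of them, plus one final overlapping pass, hence O(A log k) total work. Written with minimum (maximum is symmetric); the function name is invented for illustration.]

    #include <algorithm>
    #include <vector>
    #include <cstddef>

    // Window minimum in log-many array-style passes: a[i] comes to cover
    // the minimum of a width-w slice starting at i, with w doubling each
    // pass; one final overlapping pass of offset k - w reaches width k.
    std::vector<int> window_min_doubling(std::vector<int> a, std::size_t k) {
        std::size_t w = 1;
        while (2 * w <= k) {
            for (std::size_t i = 0; i + w < a.size(); ++i)
                a[i] = std::min(a[i], a[i + w]);    // width w -> width 2w
            w *= 2;
        }
        // w is now the largest power of two <= k, so the two slices of
        // width w at offsets 0 and k - w overlap to cover exactly k.
        std::vector<int> out;
        for (std::size_t i = 0; i + k <= a.size(); ++i)
            out.push_back(std::min(a[i], a[i + (k - w)]));
        return out;
    }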

01:15:34 [CH]

O(A), not O(K), right? A is the length of the sequence. There's no way it can be O(K) or O of log ... [sentence left incomplete].

01:15:44 [CH]

No, there's no way it can be O(K). Yeah, technically it's O of A minus ... [sentence left incomplete].

01:15:47 [ML]

Why not? Alright, that is O(K). O(K) covers everything, fast or slow. You have to process every element in the input sequence. It can't be any faster than linear time.

01:16:03 [CH]

You said O(K) and I think you meant O(A). So A is the length of the sequence. K is the length ... [sentence left incomplete].

01:16:08 [ML]

No, I said O(A).

01:16:10 [CH]

Roll it back, Bob! And I want you to copy and paste the part where he said that (or either copy or paste when he said that, or add something like "Conor was wrong"), cause one of us is right and one of us is wrong.

01:16:25 [ML]

We don't even need the N anymore. We shouldn't have been using N.

01:16:27 [CH]

Yeah, we shouldn't have. I messed that up by saying the Nth largest ... I messed that up. Alright, I switched back to my recording cause it was in a different thing, and I thought at one point we were at 55 minutes, and I thought 10 minutes went by, and I was like: alright, we should wrap things up, and here we are at almost an hour and a half ... [sentence left incomplete]. We will resume this. Everyone will have time to go and work on this, the listeners and the panelists as well. If you have an array solution, feel free to send it to ...

01:16:56 [BT]

Contact@arraycast.com. [16] As always, we look forward to your solutions in this case, but yeah, you can get in touch with us at contact@arraycast.com. And many people do, and it's great to get interaction with people and hear what they think. And I'm sure on this episode there'll be a lot of people who have different thoughts and different feelings about the descriptions and stuff, and that would be great. We'd love to hear back from you, and we can include it in future shows.

01:17:26 [CH]

Awesome. Yeah. I will spend some time. I mean, maybe not this week.

01:17:30 [ML]

Well, and I checked the Kadane's algorithm operation, and it's not associative. [17] And that's because it depends on one side of the operation: it's taking the maximum of the right argument in the sum, or something like that. Or, I mean, there might be a version where you can use 0. Ohh yeah, so there is a version where you can use zero, and I think that's probably not associative either. Well, now I can check this.

01:17:59 [CH]

We'll we'll follow up in a future episode.

01:18:02 [CH]

Maybe we'll do a whole episode on Kadane's.

01:18:04 [ML]

No, it's not associative.

01:18:05 [CH]

It's not associative, alright.

01:18:07 [ML]

So yeah, and neither operation is associative. So yeah, you'll have to be smarter to find a real array-style [solution], or an implementation in terms of array-style operations that runs fast on a CPU, for Kadane's algorithm. But I think it probably exists.
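
[Ed. — a quick check of the non-associativity, using one common formulation of the Kadane step; the zero-clamped variant max(0, prev + x) fails similarly.]

    #include <algorithm>
    #include <cstdio>

    // Kadane's "best sum ending here" recurrence as a scan operator:
    //   op(prev, x) = max(x, prev + x)
    // A three-element counterexample shows it isn't associative, so it
    // can't be handed to a parallel/array scan as-is.
    int op(int p, int x) { return std::max(x, p + x); }

    int main() {
        int a = 1, b = -2, c = 3;
        std::printf("%d vs %d\n", op(op(a, b), c), op(a, op(b, c)));  // 3 vs 4
    }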

01:18:27 [CH]

Time will tell. We'll do a full episode on this whatever sub-window sorting problem: is there a good array solution, or is the naive array solution better than any other solution? And then we'll follow up on Kadane's, because it's a great problem. And yeah, we'll leave links to everything, including the J blog or J tutorial that talks about Kadane's, as well as everything else. Any last thoughts, comments?

01:18:56 [BT]

Well, the last guy who came up with a solution to that question got the algorithm named after him, [18] so you know pretty big deal [everyone laughs].

01:19:05 [CH]

This is very true. This is very true.

01:19:06 [ML]

I think he was actually the first guy, not the last.

01:19:09 [BT]

OK well, the first guy to find [it]. Naming things is hard! [laughs]

01:19:17 [CH]

This is true. This is true. All right. With that, we will say: Happy Array Programming.

Happy Array programming!

[Music]