Transcript

Transcript prepared by Bob Therriault, Adám Brudzewsky, Sanjay Cherian and Igor Kim.
[ ] reference numbers refer to Show Notes

00:00:00 [Nick Psaris]

There's been a lot of complaints from my colleagues around the community that the ArrayCast doesn't have enough q representation.

And I feel like I'm doing my best here to hold that. But if we can get a few more people on some of the people that we've highlighted today, that would be great for the community, I think.

00:00:22 [Adám Brudzewsky]

Maybe if we get enough q people on, then Arthur Whitney will come up to balance it with a k. Okay. Oh, you like my strategy here?

00:00:29 [Conor Hoekstra]

I like that. I like that, Adám. I like that, Adám. Very nice.

00:00:35 [Music Theme]

00:00:45 [CH]

I'm your host, Conor. And today with me, I've got four panelists, technically three regular panelists and a guest panelist. We'll go around and do brief introductions, starting with Bob, then going to Adám, then Marshall, and then Nick.

00:00:58 [Bob Therriault]

I'm Bob Therriault. I am a J enthusiast. In fact, I am a J programming language enthusiast. I just want to make that point.

00:01:05 [AB]

I'm Adám Brudzewsky. I've been promoted to head of language design at Dyalog. So I do APL.

00:01:12 [ML]

Hi, I'm Marshall Lochbaum. I've been a J programmer, been around, worked at Dyalog. Now I do BQN.

00:01:17 [NP]

And I'm Nick Psaris. I'm a q enthusiast. I'm a q programmer, a q author, and a q adjunct professor as well.

00:01:28 [CH]

Awesome. And as mentioned before, my name is Conor. I am a research scientist and polyglot programmer, and an array language enthusiast in all of my free time and some of my work time as well. So I think we've got four different announcements, plus or minus for some definition of the word announcement. So I think we'll start off with Bob and then we'll go to Adám and then I'll introduce the next two after that.

00:01:50 [BT]

Well, my announcement I've actually pre-announced. [01] J is now referring to itself as J programming language. And the reason for that is so many people have been concerned about doing a search for J and coming up with, well, you can imagine what happens if you search for the letter J. So in order to do that, if you write something about J and you wish it to be more visible to a Google community, you write J programming language somewhere in your SEO and it should be able to be picked up by people. A little bit easier if we all consistently do that. Of course, if you really wanna have your article or whatever you're writing known to the J community, you've got a J Wiki, and that is actually one of the best repositories of information about J. So always consider putting something into the J Wiki because I think that's where people generally could go to get information. But J programming language should open us up to the rest of the web.

00:02:53 [CH]

Awesome, yep, that's not confusing at all, but J programming language is what to use when searching now, is what people should take away.

00:03:00 [AB]

APL is also confusable with other things named APL, but you can't call it the APL programming language, because that would be redundant.

00:03:07 [CH]

At least you've got three letters instead of one, you know, k, q, J, they've got more of an uphill battle.

00:03:13 [AB]

That's true. So, not actually an APL announcement, [02] but on Stack Exchange, which is a network of sites of which Stack Overflow is by far the best known one, and that one deals with people's issues in programming. But they also have sites about all kinds of other topics, and people can suggest new sites. Some enthusiasts suggested a new site for programming language design and implementation. And that passed through the first gate of approval and now it's in a beta phase. So it means it's not really listed among the sites,

but it's still open to the public to go there. And while I say it's not an APL announcement, obviously I've been clobbering it with answers to people's questions when they ask about what are some viable approaches to do this or that, and I say, oh, APL does this, APL does that. So, check it out. There's some cool stuff there.

00:04:08 [CH]

And I'm sure we'll leave a link in the show notes for folks if they want to find it. All right, I think so, that's our two official announcements out of the way. We're going to rotate to Marshall now, who has, I believe, a follow-up to what we talked about in the previous episode.

00:04:22 [ML]

Yeah, I don't have any problem calling this an announcement. But on the last episode, we talked about a problem of moving a sliding window along an array and finding the kth smallest value, basically. And a special case of this is finding

the absolute smallest or largest-- doesn't really matter-- value. And so I discussed an array-style solution to this that is very fast in practice, but it's not linear time. So there are a few actually scalar approaches that work in linear time in the length of the array. But so at the end of the episode, we had not found an array approach that works in linear time. But I got an email afterwards from Phineas Porter, [03] who actually linked me to a blog post he'd written two years ago, which will be in the show notes, about a method for doing this. The idea is if you want the smallest value from every length k window in an array, what you're going to do is actually split the whole array into non-overlapping slices of some fixed length. I worked it out; it's actually best to use k minus one. And then once you split that up, every window is split across two of these slices. So you've got the left half in one slice and the right half in the other slice. And then there's a completely linear time method to get all the minimums. And that is, for each of these chunks of the array that you split it into, you're going to take a forward scan of the minimum and a backward scan of the minimum. And then for each window, you take the backward scan on the left slice and the forward scan on the right slice,

and you min those together. And the way it works out is that you end up with-- you get every window that falls into those slices. You get the minimum for all those windows. And the scan is a linear time operation, and it's very array-friendly, of course. So if your windows are long enough-- if your windows are really short, you do a bunch of small scans, and that's not good. But if your windows are long enough, this is really fast. And I found that it beats the method that I talked about when you get to around 500. k is 500, so windows of 500 or more. So that's really cool. Now we have two different methods. One is really fast for small windows. One is fast for large windows. So I'm ready to declare total array superiority on the sliding window minimum or maximum problem. We still don't know about the kth smallest, or even the kth largest, but--
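A minimal sketch in q of the chunked-scan method Marshall describes, for windows of length k over a numeric vector; the function name swmin and the example data are ours, not from the episode:

```q
/ sliding-window minimum via non-overlapping chunks of length k-1:
/ prefix minima (forward scan) and suffix minima (backward scan) per chunk,
/ then each window is the min of a suffix from its left chunk and a prefix
/ from its right chunk (assumes 1<k and k<=count x)
swmin:{[k;x]
  c:k-1;
  ch:c cut x;                              / non-overlapping chunks
  f:raze mins each ch;                     / forward (prefix) minima per chunk
  b:raze {reverse mins reverse x} each ch; / backward (suffix) minima per chunk
  ((neg c) _ b) & c _ f}                   / combine left suffix with right prefix

swmin[3;8 2 4 7 1 1 5 6 9 3]   / 2 2 1 1 1 1 5 3
```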

00:07:01 [CH]

I was about to say, I was like, what, you're declaring total. And I was waiting for the prefix of k equal to 1, which you did add subtly at the very end.

00:07:10 [ML]

What we discussed is the generalization of that, which I still don't know how to do, but that's pretty cool. I mean, as far as I know, these are the fastest ways to do this on a CPU at least, probably on a GPU too. There's the one that I talked about, where you do offsets in powers of two, and then there's the other where you do these scans.

00:07:32 [CH]

Yeah, if it's basically boiling down to scans on a GPU for k equal to one, this will be the fastest as well.

00:07:37 [ML]

Yeah, GPUs are pretty good at scans.

00:07:39 [CH]

Yeah, we eat them for breakfast.

00:07:41 [ML]

Probably better than CPUs.

00:07:43 [CH]

Yeah, so we'll definitely leave a link to that in the show notes. This is a great transition because Phineas was one of the q gods in Nick's presentation, which I think we'll get to in a moment, but Nick introduced himself earlier. I looked it up and I believe this is your fourth time as a guest panelist. [04] So technically I'm not actually sure what happened. We demoted you from guest panelist to guest, and now we've re-promoted you back to guest panelist. So Nick was on episode three, four, and five, representing the q language, and then we had him on as a guest, actually not too long ago, December 9th, I think, in episode 42. So links to all of those episodes if you haven't heard them and you're a recent listener, you definitely want to go listen to those. But Nick is going to, I think, recap some of the announcements from KXCon, which is basically what today's episode discussion is going to be about. Last week from Wednesday till basically Friday or Saturday, we were at a two-day conference put on by KX and First Derivatives, who are the owners of the q language. And yeah, I'll throw it over to Nick, who has, in my opinion, had one of the best talks

at the conference. But we'll do some announcements first, and then we'll hop into what we thought of the conference and any questions that the other panelists have.

00:09:01 [NP]

Thanks, Conor. There were a lot of announcements. Some of them were commercial alliances with Microsoft or Azure, Amazon, things like that. But I think the ones that are relevant to the community listening to this podcast are things that pertain to the language itself.

One of them that is actually quite interesting is they always have a test version of the language that owners of licenses have access to at downloads.kx.com. You can go see all the new pieces of functionality that are going to be in the next test version. So 4.0 is the public version. 4.1 has been in test for a couple of years now, so it'd be 4.1T. They announced that 4.1T would now be available to everyone to play with. I think one of the things is there's so much functionality, they want to make sure they get it right when 4.1T becomes actual 4.1. A lot of this is the KX company shifting more towards being developer friendly. And on that front, PyKX, [05] which gives you the ability to run a q process, not a separate thread, but just the q memory space inside Python, that is now open sourced. And so you can do a pip install PyKX and everything will be pulled down. And with a license (you can get your own personal license or an enterprise license), you can then run q inside your Python memory space, and, given enough examples, you know, if you have a q table instead of pandas, things compute much, much faster. In addition to that, the vectors that make up the q table columns are indeed vectors. And when you want to grab them as NumPy vectors, that can be done with zero copy as well. So there's a lot of efficiencies that they put in there. And I believe, and I find it quite exciting, that this is going to open up a lot of people who know Python coming out of school, going to the workplace or starting to play with it at home, seeing the performance enhancements that importing PyKX would give you, and then starting to want to learn more about the underlying language itself. We can go into some of the additional changes to the actual language that haven't been released yet, but I think one of them that I wanted to bring up, and we can talk about in some more detail, is that the parallelism is all throughout the language on many, many different fronts. And one of the things I think, Conor, you mentioned last time was that you tried to go from each to peach in your program and it didn't make much of a difference. I just wanted to confirm whether or not you started it with the dash S and then some number that

would indicate the number of secondary threads.
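For concreteness, a small sketch of the point about secondary threads; the function f and the sizes here are made up, and the two timings only differ if q was actually started with -s:

```q
/ start q with secondary threads, e.g.:  q -s 4
/ without -s (or with \s 0), peach falls back to running serially
f:{sum exp x?1f}          / some CPU-bound work on x random floats
\t f each 8#1000000       / serial: one list element at a time
\t f peach 8#1000000      / spread across secondary threads, if there are any
```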

00:11:49 [CH]

Yeah, so the answer to that question is no, and I had no idea you had to do that. And actually, so Phineas and I drove into Montauk together and ended up chatting for like three hours along the way. We got into a big argument about the implementation and design, the semantics of prior. And then one of the other things we talked about was why I wasn't getting a difference from Peach. [06] And he mentioned two different things. He said, one, you gotta do it with hyphen S, but two, he was also shocked that I didn't have a full q license, like, that I was on the demo version. He was like, isn't Stephen on the podcast? Can't he hook you up with full access to all the cores? 'Cause he said that really makes a difference too. Like if you're stuck to two cores, you know, you're definitely not gonna see the, or actually I'm not sure what the limit is on the demo license, 'cause I usually run it with taskset, limiting it to two, on Linux, because if you don't limit it, like, I think I have 32 or 64 cores on my computer, and if you don't limit it, the executable just won't run. So you have to run it with taskset hyphen C and then the number of cores. But then on top of that, Phineas said that you also have to do the dash S in order for them to actually launch on multiple threads when you're using Peach. So yeah, fail on my part.

00:13:07 [NP]

One of the things historically about Peach was that the data on the way in to the secondary threads was, I think, shared with a pointer, but on the way out, it was serialized and then deserialized. And so you really paid a massive penalty. It's not a surprising penalty; I mean, you can imagine that when you want to send data to a separate process, you need to serialize it and then deserialize on the way back. But I forget, there's been a recent version, maybe recent as in two years or three years, maybe more, I'm dating myself, but they've even removed the serialization and deserialization. So there is no copy from the master, the main thread, to the secondary thread and then back again. So you can literally just put Peach anywhere you wanted an each and there's really limited

overhead. I mean, there's some marshalling perhaps, I agree. But it's quite powerful. And so because there is this memory sharing, that's why they've added Peach into all of the primitives under the hood. I guess let me just get into some of the other places they've added it while we're talking about it. So the CSV loader and all the other loaders that bucket it, whether it's, I think, binary or text, it chunks the file into pieces and then parses them in parallel. And now that q is making a big push, KX is making a big push to get q to be able to query objects on the cloud, you can imagine the delay if you were to load each column one at a time serially. And so they've changed some of the column loading and the partition loading. All of that is now parallel as well.

And that would make a massive difference. I think on a single machine, if you're IO bound, maybe serially is just as good as in parallel, but on the cloud, I think that that's kind of where this comes into play. So cross partitions, database loading, and data serialization and deserialization as well.

It also does that in parallel. So there's a lot of emphasis on parallelism in the q language.

00:15:13 [CH]

Yeah, and it was clear from some of the presentations, can't remember exactly which one it was, it might have been the Goldman Sachs presentation that they were talking about that they like very aggressively use Peach basically on everything and they see like huge performance wins because of it, because they're running stuff across so many different machines or cores. Yeah, we will, and I think all these videos are getting released online, including Nick's, and we will, when we have access to those, I don't know how long it's gonna take, some conferences it's like a week, some conferences it's like six months, but we will announce it in a future episode

when we have access to the video links. I would guess that they're gonna try and get them on sooner rather than later while it's topical, but time will tell.

00:15:55 [NP]

I mean, I guess there's a few other things in the language that were interesting, I think. They've added a new version of compression, Zstandard, [07] I think it's from Facebook. There's many different ones; that's, I think, the fifth compression type, you know, there's snappy and then LZ4, and then there's the native q compression. They're always adding new compression levels. There's new options to the garbage collection. How aggressively do you want it? Do you want it to be quick or do you want it to be complete? There's a way you can load new databases, KDB databases, that do or do not load the code that's in there, which might make it a little faster if you just want to reload the database without the actual code again. You can now run sv and vs. I think you mentioned those in the last episode. You can run those on byte vectors, and that would also include globally unique identifiers, GUIDs. And one of the ones that got a lot of clapping was that in the console, backslash C would give you 10 rows and 200 columns, backslash C 10 and 200.

Now they've added the ability to do automatic sizing. So I don't know if it's backslash C with no arguments, or zero zero, I don't remember, but that would allow you, as you change the size of your console, to have the display match that size as well. So that got a round of applause too.

00:17:18 [ML]

So what does backslash C do exactly?

00:17:21 [NP]

It's just the columns and rows, or rows and columns. So backslash C with an integer and another integer says how many rows and columns it should display.

00:17:29 [ML]

Oh, so it sets the size of your terminal.

00:17:31 [NP]

Yeah, it probably stands for console, yeah.

00:17:33 [AB]

It's kind of like ⎕PW in APL; they've added auto-sizing to it.
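For reference, the existing console-size command being discussed (the automatic sizing they mention is in the unreleased build, so it isn't shown here):

```q
\c            / show the current console size as rows and columns
\c 10 200     / display at most 10 rows and 200 columns in the console
```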

00:17:38 [NP]

Yeah. And I think one of the most exciting ones was, and maybe we can come back to this because, I don't know, it could take a whole episode if we can't remember exactly all the details, but Pierre brought out a new syntax for reversing any operation, like assignment or... Yeah, but basically assignment. You know, in Python, you have multi-variable assignment. So you have a list of parameters on the left-hand side and any list on the right-hand side, and when you assign it, it will unpack that list into a list of variables. We've been asking for that for a long time. And not only do they give that (I don't think it's available yet from the download), but, um, he's extended that beyond anything anyone ever thought was possible. And we can get into that in a bit.

00:18:28 [CH]

Yeah, it was mind blowing. In my opinion, best presentation of the conference, I'm biased 'cause I'm like a programming language nerd. So I'm sure there's tons of other folks that there was some other announcement that was like top of their list, but Pierre Kovalev, he's on the KX Core team. So he works on the q language. And I just was like, I was sitting next to Nick during that presentation and I'm not like a q expert by any means. I'm like, I'm barely even a q novice. And I understood 75% of it,

but then it got to certain things and I'm like watching what Pierre is doing and then I would lean into Nick and be like, "I'm gonna repeat what I think I just understood and you just tell me am I correct or am I wrong?" Because the stuff he was showing was just like, I've never seen it in any programming language before. Like it was like pattern matching, destructuring, like monadic lift and like structural under all in one. Yeah, anyways, like we said, we can come back to this and Pierre, if you're listening, which I don't think you are. But the whole joke for the rest of the conference was me trying to get Pierre and Oleg. Oleg is also on the KX Core team. We were trying to get them on. We invited them on and he was like, "No, I don't want to be on the podcast." And then at some point someone told me just to go ask Ashok to force them to be on. But that's the running joke now: we can't get Arthur on and we can't get Pierre and Oleg on. But I think we've got a better shot with Pierre and Oleg if we bring them up enough.

00:19:51 [NP]

I definitely won't do it justice, you know, trying to describe what he's got there. And it's constantly moving and it's not, like, production, you know, it's not ready for release yet. But I had, at the end of my presentation, a list of gripes I had with the language and PyKX, and legitimately they had solved the PyKX one like, you know, within days, because I'd already told them personally that I thought this wasn't going to work properly. They'd already fixed that. I wanted a native assert in q, and I wanted this multi-variable assign. And Pierre just was like, "Yeah, we've got that." He actually gave the presentation before I had my presentation. So it's as if they read my mind. He preempted me, and it was really fantastic to have basically all your gripes and wishes; by the end of the conference, you found that, you know what, they've been listening and they've solved them.

00:20:42 [ML]

And I thought K4 did this too already. If you do, like, (a;b;c):1 2 3, I mean that works fine in ngn/k. So that's unpacking a list into three variables, with three assignments. So that's not in k4?

00:21:04 [NP]

No, that would definitely not work. Now, K9 has it, I believe. I think Shakti has it. It's definitely not in K4.

00:21:10 [ML]

But it will be.

00:21:10 [NP]

Yeah, but that's just the tip of the iceberg.

00:21:13 [ML]

Yeah, I mean, I figured it sounded like a lot more than that.
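For reference, a sketch of what you do in current q without multi-assignment; the variable names are ours, and Pierre's new syntax is not yet public, so it isn't reproduced here:

```q
r:(1;`abc;2023.05.24)   / a mixed list you would like to unpack
a:r 0; b:r 1; c:r 2     / today: index out each element by hand
/ (a;b;c):r             / the k-style multi-assignment Marshall mentions; not valid in current q (k4)
```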

00:21:15 [CH]

All right, so should we pause, if the other panelists have questions, and we can answer them, and then maybe after that we can get Nick to give us a little summary? Because after Pierre's, I mean, there were fantastic talks. I mean, maybe we can get Nick to list off his favorites, but Nick also gave an amazing, entertaining, very entertaining and also very information-dense talk; he solved the same problem, or a plethora of problems, all in q, and I think the slide deck and, like I said, the presentation will be online. But before we get to Nick giving us a summary of that, questions? I saw Bob, you had your hand up.

00:21:51 [BT]

Yeah, I guess in regards to Peach, essentially what's happening there is when you do the Peach and it breaks it into the parallels,

it's not determining what you do with the parallels. So it's not necessarily sending it to GPU or CPU or cores, is it? Or is it just it breaks it up into sections and then you deal with it as you wish after that?

00:22:10 [NP]

So there's two choices here. You have a monadic function, you call peach, and then you have a vector or a list of whatnot. And then it will call that function; whatever you wanna do in the function, you do. That's if you don't have remote processes. If you want, you can run peach in a multi-process mode, in which case you have to set up the file descriptors so that when you run Peach, it will take the data, send it, and run it on the remote process. So when Conor was mentioning that Goldman Sachs is using Peach heavily, it was in that multi-process mode. So you have a bunch of remote processes, and maybe a single server doesn't have enough RAM to compute everything all at once. And so you set up a bunch of servers, you open up a port, your gateway then connects to each of those separate processes and then parallelizes across the processes.

That takes a lot of setup and, you know, configuration and things like that. The one that's just multi-threaded per se on a single process, multi-core, that's a lot easier and you don't even have to think about it other than starting it with the dash S.
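A rough sketch of the multi-process setup being described, based on our reading of the documented .z.pd hook; the ports, the worker processes, and the function f are hypothetical and would have to exist already:

```q
/ workers:  q -p 5001   and   q -p 5002   (already running elsewhere)
/ gateway:  q -s -2     (a negative -s selects the multi-process mode)
.z.pd:`u#hopen each 5001 5002   / handles that peach will distribute work over
f:{sum exp x?1f}
f peach 8#1000000               / each call is shipped to one of the workers
```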

00:23:27 [NP]

Yeah, one thing that you are allowed to do inside the process is, if, let's say, you decide that this routine should not be multi-core for whatever reason, maybe it accidentally tries to update a global variable, you can set the number of cores back down to zero.

So you can do a backslash S zero, and it will then turn that off until you decide to turn it back on. And you can turn it on again, up to as many cores as the process was started with. So I know in Python, I think you can determine in-process how many cores you want,

but in q you can't hijack a process and then suddenly decide I'm gonna use more cores than the process was started with. I guess maybe that's so you can't bypass the license, because it checks at startup and you don't wanna allow more than that number of cores,

but you can deduct or reduce from the total that you started the process with and then go back up to that maximum number.
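A quick illustration of the in-process control just described:

```q
\s          / show the current number of secondary threads
\s 0        / turn them off, e.g. around code that updates globals
\s 8        / turn them back on, up to the number given with -s at startup
```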

00:24:32 [BT]

And we're talking about CPU cores. Is anybody doing anything with GPUs at this point, with that same extension?

00:24:38 [CH]

Not at KX, but I did run into a guy that worked for Citadel. His name was Ryan, and although I'm not sure he wants me mentioning his little mini project from years ago, he had a project called QUDA, that's Q-U-D-A, where he was trying to call into some cuBLAS GPU functions. So there are people out there, like I discovered at the conference, that have tried things in the past with GPUs and q, trying to bundle them together. But I don't think, currently, Nick can correct me if I'm wrong, but I talked to a bunch of folks and they said that it's on the periphery of, you know, things that they would like to do in the future, but there isn't currently anything yet.

00:25:20 [ML]

Yeah, well, I mean the thing about GPUs [08] is that they work so differently from CPUs you pretty much just have to rewrite all your code. I mean if you're implementing at the low level, implementing primitives, I mean the same techniques

don't even work. There are some similarities between things, but you pretty much have to rewrite everything. So I mean that would be a whole new k implementation.

00:25:43 [CH]

All right, if there's no more questions, maybe we'll throw it back to Nick and you can tell us, give us a mini, I mean I saw the talk, it was fantastic, but I'll give you an opportunity to give a summary of it. And I thought the best part was when you brought folks up, but I won't steal your thunder.

00:25:56 [NP]

I guess we can talk about my talk a little bit. We can also go through some of the other people's talks as well. The talk I gave was about what is very famously known as deferred acceptance. [09] That's a very nice way to put it. But it's also called the stable marriage problem. And the idea is you have two separate populations, and without money changing hands, you need to match people up. If there was a dollar value to these matches, then an optimization problem could be used. But because there's no money, there's only a ranked list. What algorithm should you use to match them? In the case of the presentation, I pulled 10 of the q gods up to the stage, and then I allowed 10 QBs to come up to the stage. And I gave them note cards. The note cards had the numbers zero through nine and A through J, I believe it was, on them. And then on the back it had their ranked preferences, and then they kind of went off, they tried to pair themselves up. That got a lot of laughs. I enjoyed that and tried to lighten the mood a little bit. And then I went into the actual code that implemented the algorithm. It turns out that the party that gets the optimal solution is the proposing side. And they didn't realize that until a few years later. But so if you were to rerun the algorithm,

where if this was men and women, with the men proposing, the men get the optimal pairings; and if you let the women propose, the women would get the optimal pairing. And so a lot of the implementations these days don't refer to men and women, but suitors and reviewers. That was kind of the on-stage demonstration. Then we went into the stable marriage algorithm, how that extends to the stable roommates problem, and then how that extends to the hospital residency problem, which in the US they run as the National Residency Matching Program, which, you know, brings people together to ensure that you most perfectly get residents into their hospitals of choice. And then finally, there's something called the student allocation problem, where you have professors, you have students, and instead of having them rank each other, the students are actually ranking projects that the professors own. And so there's kind of this one layer of indirection between the rankings. And, you know, I think the algorithms were pretty cool. I pulled it into Python using the PyKX package, ran the Python implementation, ran the q implementation straight from Python without having to add any code because it just loads it directly. And we saw that in this particular case, a vectorial approach, which is how I implemented it, you know, was 10 times faster than the object-oriented approach as it was implemented in Python. So yeah, I didn't know I was going to go down to implementing four different algorithms, and two of them had suitor-optimal and reviewer-optimal versions. So in the end it was six algorithms I needed to implement, but I thought it was fascinating going down that deep rabbit hole, and I hope that the audience enjoyed it as well.
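Not Nick's vectorised implementation from the talk, just a minimal iterative sketch of the suitor-proposing deferred acceptance (Gale-Shapley) idea in q to pin down what the algorithm does; the function name gs and the data layout are our own:

```q
/ sp[i]: suitor i's reviewers, best first;  rp[j]: reviewer j's suitors, best first
gs:{[sp;rp]
  n:count sp;
  rnk:{x?til count x} each rp;   / rnk[j;i]: how reviewer j ranks suitor i
  nxt:n#0;                       / next reviewer each suitor will propose to
  m:n#0N;                        / m[j]: suitor currently held by reviewer j
  free:til n;                    / suitors still unmatched
  while[count free;
    s:first free;
    r:sp[s;nxt s];
    nxt[s]+:1;
    $[null m r;
      [m[r]:s; free:1_free];            / reviewer was free: engage
      rnk[r;s]<rnk[r;m r];
      [free:(1_free),m r; m[r]:s];      / reviewer prefers s: bump the old match
      ()]];                             / otherwise s is rejected and stays free
  m}                                    / m[j] is the suitor matched to reviewer j

gs[(0 1;0 1);(1 0;0 1)]   / 1 0: reviewer 0 takes suitor 1, reviewer 1 takes suitor 0
```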

00:29:17 [CH]

It was great. And also an unintentional, or maybe it was intentional, you can tell us, Nick, byproduct of this talk is he basically lined up for us 10 different guests to have on our podcast in the future. I'll try and see if I can read them off. I'm not going to get them all. So Phineas was one of them, who we've mentioned a couple times now. Now he works at Jump Trading. I believe so. Andrew was the co-presenter along with Pierre from the KX Core team, and Oleg. So that was three of them. Aaron Davies, who I believe was at Morgan Stanley. Yes. Attila, who actually we've already [10] had on. So we've already had one of the q gods on before. And that was way back. I think that was, you know, he was our sixth or seventh guest. I know who wasn't, and we can talk about this. I know who wasn't in the q gods: Johnny Press; he's the CTO of Data Intellect, formerly known as AquaQ. And that might've been the funniest part of the whole conference, because he went up and thought he was in the top 10, but he wasn't.

00:30:18 [NP]

Because it was Anjani, Anjani was there. The problem was I pronounced it Anjani and I think if you were to do it correctly, it would have been Anjani. And so if I have a shout out to Anjani here, I'm sorry for mispronouncing your name. Johnny Press ended up coming up and it should have just been you.

00:30:38 [CH]

Oh, it was so funny because he was up there and then people were like, what are you doing here? And he's like, oh, I belong. And that was not the case. And we will bring Johnny Press on, because I think he would definitely love to be on the podcast. Or I'm just, you know, forcing it, putting out into the world what I want to materialize. Um, but I think he would be a great person to talk to, because he's got a long history in the q community. So that's what, we're at seven or eight at this point?

00:31:02 [NP]

Yeah, I think you're at eight, and there's two more. There's Igor Kantor. He gave a talk as well. Oh, sorry, that was seven. And then there was Uday Koli. I'd worked with him at Morgan Stanley back in the day, and so he has a lot of experience. I think he's at Citadel right now. And then there was Mohammad Noor as well.

00:31:22 [CH]

Yeah, so basically we got, you know, we haven't had enough q guests on the podcast. We got a list curated by Nick for us, and we'll see if we can get them all. Pierre and Oleg, I think, are gonna be the most difficult. And I think Andrew said, really, the person from the KX Core team that we should bring on is Charlie. I actually don't know Charlie's last name, who's in Zurich, Switzerland. And Charlie, I believe, is the manager of the KX Core team and worked with Arthur for a very, very long time. I think, when KX was initially acquired by First Derivatives, it was just Charlie and Arthur that were working on the q executable. So yeah, if we can get him on the podcast, I think that he would be another amazing guest.

00:32:06 [NP]

So I guess some of the other topics, I can go through a few of them. I think one of them that I found quite fascinating was Mohammed from Citadel, [11] who basically talked about how they incorporate q into the PyKX Python environment. And actually, I've heard from other developers as well, it's becoming quite a popular way of doing things. You use asyncio. Q, when you pull it into Python, doesn't have its main thread anymore. It's fully controlled by Python. And so if you want to have market data coming in, updating your q tables, you use asyncio with all of your real-time feeds. And then you update data structures inside q and then you make your business decisions based on that. So they gave an example where you had C++ hooked into Python and KDB, all in one process. And that allows you to not have to rewrite all your data feeds as q native. You leverage Python for that, get your data from Kafka or wherever you get it from, pull it into the q process, or delegate it to the q process, and then do your heavy lifting there. And then on the way out, again, you go through Python and send it to where it belongs. So, like, if you want a web server, you don't have to create a web server in q anymore. You have a Tornado web server in Python and just do all of your work in q under the hood. So I thought that was very exciting. Kevin Webster gave a presentation. [12] I personally like using q for business algorithms and analytics to transform businesses and not just as a pure technology play. Kevin, I guess he's in between jobs right now. He wrote a whole book on transaction cost modeling, which he had just recently published. He reviewed a bunch of papers in that domain. And at the very end of his book, he published a few KDB functions, giving examples of why KDB is so much better than any other language when it comes to tick data, market data, to analyze for TCM. And he went through all the different papers. He listed them out, but I think his book is probably gonna be quite awesome. I have a copy and that's my plan for summer reading for sure. Then Erin Stanton gave a presentation on how her job at Virtu was just completely and utterly transformed between the time they used to use a different database [13] and when they moved everything over to KDB; her productivity was, like, you know, tenfold. She gets to the office, and within 30 minutes to an hour she can download the data she wants, do a bunch of AI tests on the data, come to different conclusions, and just basically focus on the business rather than waiting to obtain the data. That got a lot of excitement.

00:35:04 [CH]

Yeah, she was also a fantastic presenter. When I gave my presentation, I started off by apologizing to the crowd, because the amount of, I called it Oprah-level energy, that she brought to her presentation was just, uh, yeah, it's hard to match her presentation skills.

00:35:24 [NP]

I think Conor was the last person on the last day. And so people were checked out. Were you the last one or second to last?

00:35:30 [CH]

I think I was second last. The presenter from Stack, uh, was, was the last one. Uh, and then he started his talk off by saying, I apologize for being in between you and the cocktail hour.

00:35:42 [NP]

So, yeah, so then Johnny Press gave his presentation on the business of AquaQ, now called Data Intellect. He used some slide animation there that got the crowd excited, up until Conor presented his, which took that and put it on steroids.

00:35:58 [CH]

Yeah. The only question I got was how do you do your slide transitions, which is not surprising because it's overwhelmingly the number one question I get at other conferences.

00:36:09 [NP]

You don't have to do them if you're upset that people are not listening to your presentation. You can just remove the transition.

00:36:16 [CH]

No, no, I can't do it. They excite me probably more than they excite the people that are watching them. Literally that was the question: "Can you go back a few slides and show us the transition again?" I was like, "Okay."

00:36:30 [NP]

Yeah, when Johnny did it, I think people were zoning out and he's like, "Wait, wait. Let me do that again." So he backs up the slide and said, "Okay, everyone pay attention. Here it comes." And the Data Intellect logo has two brackets, like array indexing brackets. And so one of the brackets went from the lower right-hand side and just got bigger and bigger and bigger, and it appeared on the upper left-hand side. So that was very nice. What else? Then of course, Andrew Wilson gave a long presentation about everything that the core team has been doing for the past couple of years, even though it hasn't been released officially in new versions. I think I talked a little bit about that. I think it's exciting: under the hood, you can query all the cloud object stores as if they were files on the file system. So you can take them as partitioned data sets in the future if you get that license and you want to move your data to the cloud. They do the best they can to make it as transparent as possible. So take some partitions that happen to be on local disk; maybe you don't want to spend that much money on the SSD, you can move them to slower, still local disks, maybe it's NFS storage. And then, you know, maybe even further back in time, where you don't need the performance, you can move those partitions out onto the cloud. And the end user shouldn't be able to tell the difference. It's just a matter of how you set your KDB database up.

00:37:54 [CH]

Yeah. Bob, I heard you were going to.

00:37:56 [BT]

Yeah. There's a question I've got. It seems to me like, with Python, is k or q moving towards being more like an embedded thing, where Python programmers may not know q (I mean, they're encouraged to learn it), but it may be other people creating the q modules that get dropped into Python? Is that a future direction?

00:38:17 [NP]

I think that's, I mean, I think that is definitely, it's enabled now, basically. Whether it's promoted is a different story. I mean, people who know q don't wanna code in anything else, and so they're gonna keep writing in q. But if you have a highly efficient algorithm written in q, you no longer have to rewrite it to get it into Python. So if you have a whole team of people building very data-intensive algorithms that work perfectly well, you can now access them from Python with no overhead. You just put those q files in the directory and they get loaded and are runnable. So, like, you know, in my presentation, I had a dictionary with NumPy vectors in it, and I pass it directly to the q function and it shows up inside q as a native dictionary as well. And it's shared, like I said, very low overhead, zero copy. And then you run your algorithm and pass the result back. So it's a possibility. I think most likely, well, I think the hope is that it's like a gateway drug, in the sense that you start Python with some q ability in it. And you're like, well, what is this? And then you start poking around and then you realize how great it is, and then you continue down that path, and then you get more q developers from there. But even if you don't end up like that, if the person just wants to solve business problems and is not caring about the actual implementation and things like that, it should be just perfectly usable without having to learn anything else. So the goal has been to make the API as Pythonic as possible and intuitive to anyone just showing up in Python. One possibility that they've tried to do is, instead of saying import pandas as pd, you can import pykx.table as pd. So if you just kind of swap in another import, they're adding a lot of the pandas functionality into this wrapper object, this QTable object, and you immediately get a boost in performance. Of course, the API is quite large and some of the functions are not efficient to start with and wouldn't be in q either. And so there wouldn't be a reason to redirect those functions back into q, and maybe they won't even implement them. And so we kind of get, like, an NYI, not-yet-implemented, error, or it would just redirect back into a Python implementation. But when and where you can get a boost in productivity just kind of replacing pandas with a q table, I think that would be a brilliant way of getting people using the code.

00:41:03 [BT]

I'm just thinking in terms of, you know, structuring a programming team. You might have several Python programmers who are supporting a q programmer, and then the interface between them, you're trying to make those connections clearer and better. So, well, essentially, I guess, you end up with a q programmer who's got the muscle to do the work, but then Python programmers who are providing an interface to other programmers or other applications that can easily bring it to them, and then they can get down to work, pound out the answer, and then blow it back out to Python.

00:41:38 [NP]

Yeah, I think in practice that is how it actually gets done. There are people who just love data, who love efficiency, and don't want to be bothered with the business use. And so they're just optimizing, optimizing, optimizing, and that's what they love to do. Or maybe I should say that's what we love to do. But there's also the side where, you know, you need to actually get things done. And so I try to balance myself on both of those sides. But yeah, I think there's room for both sides. Like I mentioned, there's some vendors. One of them was Snowflake, who gave a presentation on how, you know, there's a lot of data out there that is managed by Snowflake. And they're giving the ability through PyKX to get the data out of Snowflake [14] and return it to you as, like, a q table. They're not moving the data to KDB format. They're just, in their Snowpark, giving the ability to pull the data out from the PyKX interface. And then Syneos Health gave a presentation on how, starting with a small team of people (I think it was RX Data Sciences at the time), they wanted to manage a large set of data and they said, you know, I think this is a perfect case for q and KDB. They had a four-month deadline, but they finished it up in three months, something to that effect. And now, I think you've seen the quote a hundred times: a hundred times faster at a tenth of the cost. And so the question is where did that quote come from? And it was exactly in this particular case. They were able to take an old database that used to take hours to run reports, and they chopped that down to be a hundred times faster. And on the actual price, the number of servers they required used to be, let's say, 10 X servers; all of that was now managed by a single server. So that was the source of that famous quote that KX has on their website as well.

00:43:49 [CH]

Yeah, that was one of my favorite talks as well. It was by Nataraj Dasgupta, [15] I think. I apologize if I got that name or pronunciation wrong. But he was super entertaining at the beginning and sort of gave a couple of anecdotal funny stories, and his talk was a little bit about the history of him deciding to sort of step away, I can't remember if he was working at First Derivatives or KX at the time, but I think he did at one point, and decided to do a startup, not in the finance or fintech space but in, like, the healthcare space, which I don't think had been done before. So it was kind of like a startup, but also in a space that hadn't been using kdb+ and q. And he ended up selling the company, you know, five years later, and it was like a pretty massive success story. So it was just very interesting to hear it sort of in the words of the founder, the person that was doing it. You know, taking a risk and then it working out because of the technology is, I think, obviously a huge win for him, but it's also just a great story for array languages in general: that, like, you know, people can build technologies and stacks on top of this stuff and build companies and be successful, you know, things like that.

00:45:00 [NP]

Yeah, I think a lot of the presentations are about people building companies on KDB and q, and obviously they're not showing the failures, but many of them are success stories. I think the next couple of presentations were on crypto. There was B2C2. They're recording all their data in KDB. They not only record historical prices, but also the blockchain. How do you use the blockchain to verify how much money your counterparty has? Can you monitor it and build reports to kind of give you advance warning that a particular counterparty's wallet is suddenly starting to draw down? So I thought that was pretty cool.

00:45:45 [CH]

Yeah, they said they were logging, what was it, like 50 gigs of data daily? (laughing) Or something ridiculous like that.

00:45:52 [NP]

I think OPRA data probably might, I don't know if it beats that, but like, yeah, crypto data, it's trading all day, all the time, in all the different markets and all the different coins. It's quite a lot of data for sure. StratMaker was a backtesting, sorry, I mean, calling it a backtesting platform would not be doing it service. It's pretty much, from beginning to end, build a trading strategy. So you have your market data, you have your reference data, it's got blockchain data as well. And then they're giving you an optimizer to optimize the risk in your portfolio. They give you factor models, factor risk models, and the ability to actually create your own factor risk models. And then there's the backtesting component and the order generation as well. And so you can go from concept to semi-production, or you can call it production, but what they don't do is execute your orders for you, because that takes on a whole other level of compliance and regulatory risk. So you can do your research, you can do your portfolio analysis, your risk analysis, and order generation, but not execution. That looked very powerful. Then Amand Flynn from State Street gave a fireside chat, which cannot be repeated here, but it was very fascinating. And he's a great, great, great person. He had a little startup with KDB and then he got bought out by State Street. And so now he's working there. Yeah. A very senior person.

00:47:24 [CH]

With four different titles or something like that.

00:47:28 [NP]

Yeah. Who's next? Igor Kantor came next. [16] He talked about what's the best way to build data as a service. He's built it many times in many different banks, and he's very passionate about the right way to properly structure your data as a service, in the sense that you do not give people direct access to the data, you give them APIs that do the standard transformations that they would want. And he gave a list of dos and don'ts. There should be clear objectives. There should be good data quality. The data should be stored in standard formats. You should obviously have security and privacy, and you should manage for scalability. On the don't side, don't store any unnecessary data that you can easily compute, don't ignore the data quality, and don't skimp on the documentation. Yeah. The one part that I thought was pretty funny is, like, don't violate the human rights of data. There was a discussion about what's the best way to handle data, but data to him is clearly very valued... "Data is the new oil," he gave that quote from Clive Humby. And so you need to respect your data and treat it with the utmost respect. Rebecca Kelly gave a presentation from her company INQData. [17] There's a little chatbot, a Qbot they call it. You type a question and it gives you some answers. I thought that was pretty exciting. Aaron Davies gave a demonstration of a q implementation of the RASP language. [18] The RASP language is a very simple summary of the transformer architecture in large language models, as one example; you can boil down the transformer architecture into three primitives. I think it's a select and aggregate and a select where, select size, something like that. And with those three primitives, you can start building other things like a reverse function or a flip function and things like that. So he gave that example of the transformer architecture. And Karthik Murali from Goldman Sachs, we discussed that earlier. He gave a presentation on the way they use data on the cloud. They're moving all of their data sets from on-prem onto the cloud, and how they use a lot of parallel servers. And prior to that was Conor's, sorry, I did skip Conor. Prior to that was Conor's presentation, where he took us through three different, I think they were all three LeetCode problems. I forgot the names.

00:50:29 [CH]

Technically one of them was actually from the APL Seeds talk. I actually don't remember the individual that gave the talk, but it was like the skyline problem. That one I actually haven't found on LeetCode. There are similar problems to it, but yeah, they're just LeetCode-esque if they're not actually from LeetCode.

00:50:44 [NP]

Nothing more than five characters. That was the maximum, I believe. Something to that effect. It was very close.

00:50:51 [CH]

In APL, yeah, if you don't include the braces and the omega, it was, I think, four characters. In q, it was the keyword versions of those, so a couple more characters. But yeah.

00:51:08 [NP]

Oh, characters. Sure. Agreed. Yeah, you presented your, what is it? Street View?

00:51:12 [CH]

Oh, yeah. Well, so, a small digression, yeah. There's eight Conors that work for KX, and three of them were there. So I was the fourth Conor at this conference, and mind you, there's only, I think, less than 200 people at this conference, so it's not like there's thousands of people there. And it turns out that there's actually a pretty huge running culture at KX. So Ashok, he's a marathoner; Conor Twomey, Manus, a bunch of the folks that I met there, they have sort of like a WhatsApp running group. We actually never ended up getting out running together because, you know, it was pretty crazy while we were there. But anyways, I ended up showing that I became, while I was there, the number two city strider in Montauk, which is, like, a website that tracks what percentage of a city's roads you've run. So in the three days that I was there, I was trying to become number one, but there was some lady named Caitlin who had run 48% of Montauk, and that would have been the equivalent of probably, like, I don't know, 150 kilometers, which, believe it or not, I didn't have time to log while I was there. But anyways, yeah, I threw a slide in there. And I think also Fintan, who doesn't work at KX anymore but works at Shakti, he's also, I've seen posts on LinkedIn that he runs like a 2:56 or 2:54 marathon, which is very fast. So yeah, I don't know if it's just like a q thing in general or a KX thing in general, that, like, you know, it's not just fast executables that people like; they like going fast. There's something there.

00:52:42 [NP]

I probably left out a few of the presentations. There's two more that I do know that I left out. One of them was Phineas Porter, whom we mentioned earlier. [19] And he gave a wonderful visual demonstration of how you can use edge detection. I think it's like you create a convolution, a CNN. He did it in q. Basically, you take your little window, you shift it left and right and up and down, you collapse that, and then you can detect edges on a picture. And then, if you want to shrink your picture while making sure that all of your edges don't get messed up, you just fix those and then shrink everything else other than that. And so, in his Jupyter notebook, he uses q code to take a GIF, or I don't know what the format was, a PNG, and it has a little slider. And as he slid it, in the nice beautiful picture, all of the corn fields where there were no edges, they just all shrank. And the people and the trees all stayed pretty much the same size. And so you can shrink the picture and not distort it in any way that's observably erroneous. But obviously, after a certain size, it starts to fall apart. But I thought that was a really good demonstration. And it was a minimal amount of code to do that.

00:54:04 [CH]

I don't actually know what the right word for that is, like, the title of Phineas's talk, which was a great talk. I mean, I'm biased. Any talk that shows code, so like Nick's talk, Phineas's talk, Aaron's talk, obviously Pierre's talk, I always prefer those, but yeah. He called it liquid rescaling and...

00:54:22 [ML]

Yeah, I've heard that. I think that Photoshop introduced that.

00:54:24 [CH]

Yeah, but it's not really, like, when I hear rescaling, I think of just, like, you're resizing the image. But really, I don't even know if compaction is the right word, 'cause, like, you're technically deleting parts of the photo, right? Like, um, but yeah, I guess, I guess...

00:54:40 [ML]

But that's definitely like liquid rescaling is that specific, um, I mean, I don't know if it's exactly the same algorithm, but it's the idea of, um, of resizing by cutting out the bits where nothing's happening, basically.

00:54:51 [CH]

Yeah. Very cool. It's very cool, and he does it in, like, a Jupyter notebook at the end of the day, where he actually has a dial and then he just drags it back and forth. And you can see it in real time; it's very cool.

00:55:01 [NP]

And then the other person I left out (it was in the beginning of the conference) was Alex Donohue from Toronto Dominion. [20] I think Conor got excited about that [Conor and Nick laugh] and he was basically saying how passionate he was that people use q. How do you get your colleagues to use it and adopt it? There's just so many things that just make it such a better environment to develop in. He did have some things that are important. Again, you should not give people direct access to tables, because there's so many things that can go wrong. You don't want people to have to re-implement the data quality problems. His example was: if you have data that shows up on a weekend, you don't want people to look at it because they keep coming to you and saying, "what is this data?". Well then your API should handle that for you. And the other thing: he did talk about typing. [21] KDB is more statically typed than Python (which I think Conor and I were kind of wondering what that was referring to). So I thought about this for a while and Python in general, it has the float and it has a long integer, maybe it has a string type. So there's not that many types involved. But KDB has many, many more types and they don't typically change under the hood. You have an int and you keep adding to it. You're not gonna accidentally get a long. That's very different than in J and, I believe, in APL, where there's this auto-promotion (after it gets beyond a certain size, you'll get a new type). In q, the types are pretty much fixed the way they are. And there's a lot of them: you have time types, you have minute types, you have date types, datetime types. It's just like a profusion of different types for just exactly what you're hoping to achieve and at the minimum size that you can pack it into. They're both statically typed; it's just that one has more types than the other. Perl, on the other hand, is not statically typed in some sense, right? Sorry, I completely made that statement wrong. They're all dynamically typed languages. Sorry, I'm sorry. I'm not trying to imply that it's not a C++ compiled language. But the types... [sentence left incomplete]. Yeah, go on.

00:57:24 [ML]

And what that means specifically is that if you write a function and you write (a + b) in the function, then the type of the plus is not even known when you write that function. When you call the function, a and b can have any type. That's pretty useful actually, because then you can write this one function and it applies to differently typed things. [Nick agrees]. But you also don't get safety guarantees or maybe some optimizations. Who knows?
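A tiny q illustration of the point Marshall is making; the function and the values are arbitrary:

```q
f:{x+y}                 / the type of + is not fixed when f is defined
f[1;2]                  / 3            long + long
f[1 2 3;10]             / 11 12 13     vector + atom
f[2023.05.24;7]         / 2023.05.31   date + long gives a date
```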

00:57:54 [NP]

I do think that it's important to compare it with Perl, [22] where you could have this (in awk I guess) you have string "1" and string "2", and you can actually legitimately say string "1" plus string "2", and you'll get the number three out. So that is very different. When I was referring to statically typed, it's not a statically typed language, but the types are strongly typed. I think that would be the way to phrase it: the types are strong and they don't morph into each other.

00:58:26 [CH]

Yeah, and that was actually what was confusing. Alex's talk was great, but that technically was a mistake on his slide: he had a comparison (half of his talk [was about] Python versus KDB and q), and on one of the slides he says: "a con of Python is that it's not strongly typed", which is actually false. Python is strongly typed. There is disagreement in the academic community about what "strong" really means. It's not as explicit.

00:58:51 [ML]

It's not meaningful [chuckles]

00:58:52 [CH]

It's not as explicit as: static versus dynamic.

00:58:54 [ML]

Like there are degrees of strong. There's no strong and not-strong.

00:58:56 [CH]

Exactly, it's not binary, but in general there's a lot of agreement that languages that have implicit casting from type to type ... [sentence left incomplete]. People will disagree in the C++ community. There are some people that say C++ is a very strongly typed language. In my opinion it's not, because we have implicit casting all over the place: null pointers to booleans to integers. There's the classic thing when you have a while loop of (i != 0) and that can be your condition because integers ... [sentence left incomplete]. Well, actually you can do (i--). You're subtracting, and the result of that is an integer, but when that hits zero, that's the equivalent of the boolean false and so it'll cut out. In a language like Java you can't do that, because Java does not convert integers into booleans. But the point is that Python technically is strong, but it is dynamically typed, and so you can reassign an integer to be a string and that does create issues. So yeah, it's a complicated thing, but technically, on his slide, he was arguing a correct point while giving the wrong reason for it [chuckles]. But that gets into academic PL theory, because when you talk to people, everyone has a different definition of what is strong versus weak, and it's kind of a landmine that you step on.

01:00:15 [NP]

I mean, even though I mentioned that integers stay integers in q, in fact, there's a short type: if you add two shorts to each other, you'll end up with an integer. So when you get down to the smaller pieces, things do sometimes get auto-promoted, maybe to some people's frustration. Maybe there's just some historical reason for it. But most of the types maintain themselves when you try to manipulate them.
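
[Note: the one widening Nick mentions, sketched at the console; not from the episode.]

q)type 1h       / short atom
-5h
q)type 1h+1h    / short plus short comes back as an int
-6h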

01:00:36 [ML]

So that is also a lot like NumPy, [which] does that as well. I assert!

01:00:43 [NP]

[chuckles] Okay, I haven't played with it enough. Well, I think that covers all my notes [that] I took. Forgive me; I think there was a presentation by Amazon to start it off, I believe.

01:00:56 [CH]

Yeah, that was Tim (I'm going to mess up his last name; Grishbach, I think, something like that) Tim G. He's @AWSTim on Twitter and I think that's his handle as well inside of Amazon. He was a blast to chat with and super fun guy.

01:01:13 [NP]

They're really hands-on, like if you've got a project to work on and you're spinning up some KDB servers on the cloud, they just want your collaboration. They asked for anyone to kind of reach out and they're willing to put the time and energy into getting it working with you.

01:01:32 [CH]

So I guess a question: I thought this was the first KX Con. That was entirely false. There's been multiple KX Cons in the past: the last one was in 2016. Only one?

01:01:42 [NP]

The one in 2016 was, I thought, the only other KX Con, to my knowledge.

01:01:50 [CH]

I heard there was one in 2008 or '09 or something like that, or '07 (maybe pre-crisis [chuckles]).

01:01:56 [NP]

It's possible, okay. Then I was not at it if there was one. There was another KX25, which was in New York City, and I presented there. But the only other KX conference I was aware of was in 2016, [23] I believe in May. Same place, the same Montauk. And that was it. But yes, I was at that one.

01:02:19 [CH]

Okay, maybe I am thinking of the KX25. When would that have been?

01:02:25 [NP]

I thought it was around '18.

01:02:26 [CH]

2018? So that was after 2000. Okay, so maybe there was one, maybe there were multiple.

01:02:32 [NP]

They have all these conferences (Iverson, you know) and they bring everyone out. They even had them, but I don't think they called them KX Con. It was more of like an array ... [sentence left incomplete]. There were these off-sites for sure, but I don't think they called it KX Con per se.

01:02:46 [CH]

So yeah, my question (even if there weren't multiple ones, the only thing there needs to be is the one in 2016) is: what are your thoughts overall on the conference that just happened versus 2016? And, yeah, in general, just overall thoughts, you know, should people go next year? I think that is one of the big announcements (slash, it is not official yet), but the people that I talked to at KX said that they're trying to make this an annual occurrence and it's going to hop around from location to location. I think someone said they're even going to alternate: one in the States and then one in Europe somewhere, back and forth, potentially.

01:03:25 [NP]

Yeah. I mean (clearly I was a lot younger then; I was a lot less experienced in my coding) [24] I felt that back then it was just like: "here's a new toy that everyone has recently found; let me show you what I've done with this toy". It was just a lot of fun. It was really impressive to see what people were doing with the language itself. It was the beginning of my machine learning foray with q and KDB that ended up being a book. I tried to run some of this stuff on GPUs by redirecting all matrix multiplications into ... [sentence left incomplete]. There's this package out there you can download called QML, which is for q, and it doesn't stand for "machine learning" (it stands for "matrix library", I think). And so if you just rope that in, you can call a bunch of code. It doesn't do GPUs; it uses Fortran under the hood. And so it's a lot faster, and I thought that if you could then find something like cuBLAS, you could compile against that and then get all of your matrix multiplications and offload [them] onto the GPU. But I didn't take it that far. But [at] the last KX Con, it felt like nobody was showing what they had achieved. It was more like what we're doing and what we're playing with. Now it's a bit more mature. You know, you've got vendors (I don't think there were any vendors back then, if I recall correctly). It's a lot more mature, and I do think that because of the time between the last time and this time, there was a lot of hugging and being like: "oh, it's been so long since I've seen you; it's so great to spend the time". We've been communicating over email or maybe I've seen you on a Zoom, but seeing you in person: it really brought the community closer together. I thought that was fantastic.

01:05:27 [CH]

So yeah, everybody should go next year, q user or not. Location TBD. I'm sure actually, next time we can just bring in one of the q gods, although if Steve was here, he'd be upset with us because he wants to banish that [chuckles] term, but I don't know. I'm torn because it's fun. It's fun to call people q gods. When the lineup happened I thought it's like the Avengers are assembling, you know? And I'm there live at the conference watching the Iron Man, the Thor, the Hulk of the q community [chuckles].

01:06:04 [NP]

I think what Stephen was saying is that he likes the word q'ist, right? I believe that's what it is: a q'ist.

01:06:10 [CH]

I mean I would say that it's one of the best conferences if not the best conference I've been to. Which isn't really surprising because I'm an array language person at heart and C++ conferences are great [chuckles], but you go to an array language conference and it's like your people, you know? You're talking about scans and talking to Oleg and Pierre was so awesome. And yeah, my question was gonna be: are you guys inspired? I mean, technically Dyalog APL already has their conference. [25] I've just never been.

01:06:40 [ML]

Yeah, there's nothing technical about it [chuckles]

01:06:45 [CH]

What? Technical? Oh, "technically". Yes good point [laughs]. That was an unnecessary use of that word.

01:06:51 [ML]

It's like people fly in and all hang around and present to each other.

01:06:55 [AB]

Do conference things, in a conference setting!

01:07:00 [CH]

Yeah, purportedly there's a conference that happens (put on by Dyalog APL), but you know, if the tree falls in the woods, does it actually ... [sentence left incomplete].

01:07:07 [ML]

Yeah, so J had its conferences. [26] 2012 and 2014 were the last set and they said it was just too much work to put on another one, which is very understandable.

01:07:18 [CH]

It is a lot of work. Yeah, shout out to the KX folks: Ann, James (I'm gonna miss a couple of people). But just like, the whole KX team did an amazing job. And kudos to them. Please keep on doing all that hard work because it was an awesome experience. And yeah, it's a lot of work. I've been involved in organizing conferences and it lasted for a year and then I immediately stepped down not because [chuckles] I didn't want the conference to happen, but it's just so much work and I just didn't have that amount of time to dedicate to helping organize it.

01:07:49 [BT]

And key point: it wasn't at the request of the other organizers that you stepped down. [everyone laughs]

01:07:55 [CH]

No, no, it was not! I think actually Mike Dom (the organizer) sent out a tweet or something to clarify that it wasn't that I did a terrible job the first time and then I got fired! It was that I've got too many (what do you call it) pots on the stove or whatever.

01:08:14 [ML]

Array programming could help with that!

01:08:16 [BT]

I was at the 2012 J conference (or was it the 2014 one?) and it was a really good conference. And there's so much community building that goes on there that's, I think, really essential to building strong communities, and that I just don't think you can really do online. [Conor agrees] But the balance is: if you've got a small community, the work that needs to be done in order to do that becomes, I think, a question of critical mass. If you get to a certain level of organization, it can be done and the returns are greater than the work put in. But I think for a small community, the work put in is much more than the return that you get initially. I'm guessing that's kind of where J found itself at that time. Hopefully at some point it gets to the point where you have people that are as interested in organizing community as they are in, you know, programming the language. But, yeah, it's a critical mass kind of thing, I think.

01:09:15 [CH]

Yeah. And there's definitely something because I've attended the APL seeds [27] conferences, which are great, and I've attended other online conferences, but hands down, the best part of KX Con (at least for me) is getting to chat with people over dinner and over lunch. That is very hard to do at an online conference, because even if you do have some sort of breakout room thing, a lot of people just disappear as soon as the talks end. Even if everyone does stick around, it's still very hard to kind of ... [sentence left incomplete]. It's like you have these fixed tables with a fixed number of people. It's just very hard to replicate that hallway track (at an in-person conference) online. At least I have not attended an online conference that has managed to do it. Which means that online conferences are still great, but I still think there's a space for in-person ones.

01:10:06 [AB]

I think that's why we do the live ones at Dyalog as well. Because obviously all the presentations really could be done online. Maybe the workshops less so. But getting people together is a whole different thing. You *should* come to the user meeting, by the way, Conor.

01:10:23 [CH]

I mean, I wanted to. It was held in Portugal, right? The most recent one?

01:10:28 [AB]

Yeah

01:10:31 [CH]

I can't remember, but I think maybe I had a half marathon that clashed.

01:10:36 [ML]

Not even a whole marathon! [Conor laughs]

01:10:40 [CH]

I think also part of the reason was that ... [sentence left incomplete]. Yeah, there was other reasons anyways, but it is on my list of things to do: to make it out to a Dyalog APL conference.

01:10:50 [AB]

It seems to be, since I joined Dyalog, every other time it's in Elsinore. It's not really a classic conference center; it's kind of a special place. And that place has (in its corridors and open areas) all these little alcoves in the wall, kind of like in a restaurant. Sometimes you have this, where people are sitting at a table, but with a wall behind the seats and these little nooks in the walls. There are benches and tables inside. People really get into those, and all evening after the last talk you see people sitting there: customers that have issues with something can grab some of the Dyalog staff, or people are discussing various things. And that's really great. I'm looking forward to that. It's happening in October. Also it's right next to where I live, so it'll be convenient.

01:11:48 [CH]

So yeah, fingers-crossed that all the array language conferences start popping up.

01:11:54 [AB]

And that we'll have an array conference, right?

01:11:56 [CH]

Uh, I mean, that was talked about, yeah.

01:11:59 [ML]

But all right, did anybody rig up a bridge, you know, calling J from APL from q from BQN from k? Anybody try that?

01:12:10 [AB]

Well, I suggested it to Pierre and Oleg. They showed me that you can evaluate k in q: you apply the letter of the language to another string and that evaluates it. So why not just have J and APL too? You could evaluate in any other language you want. If something has a nicer formulation in BQN, you just formulate it in BQN from inside q.

01:12:41 [CH]

Sounds like a great project for Marshall, master of all array languages.

01:12:45 [ML]

I definitely would not call myself a q master. Not great at k, but q really throws me with all the names.

01:12:53 [NP]

Too obvious [everyone laughs]

01:12:55 [ML]

No, I mean, about half of them are obvious. And then the other half, I mean, I know if I learned them, then they'd be no problem, but I'm just not familiar with them. So I have to guess: "there are a few things this could mean". I just don't know.

01:13:06 [CH]

Here's actually a very random, specific question, Nick [chuckles] (and feel free, anyone else, if you know the answer), something I was looking for the other day. They [meaning q] have all the moving averages, moving mins, moving maxes. Is there a generic version of that function that just creates windows or slices or something like that?

01:13:25 [NP]

I mean, it doesn't exist. You've got prior, right? But that's not actually how the moving functions are implemented.

01:13:31 [ML]

That's the two-element window.

01:13:33 [NP]

Yeah, moving average is actually implemented by stacking them up and then just averaging across them. Like it creates the windows.

01:13:39 [CH]

Yeah, yeah.

01:13:41 [NP]

It's much more efficient than it would be in Python. But no, there is not, and inevitably anyone who starts a new library creates that moving-window f, you know, a moving-f or whatever you want to call it. Like: give me a function and apply it to the moving window. It's not going to be performant per se, but people roll their own. So no, there isn't one.
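
[Note: one common shape of the hand-rolled helper Nick describes, using scan to build the windows. The names swin and mwin are made up, not built-ins, and this is the simple, not especially fast version he alludes to.]

q)swin:{[w;s] {1_x,y}\[w#0n;s]}      / sliding windows of width w; the first w-1 results use null padding
q)mwin:{[f;w;s] f each swin[w;s]}    / apply any aggregate f to each window
q)mwin[avg;3;1 1 2 3 5 8f]           / behaves like the built-in mavg, since avg ignores the null padding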

01:14:02 [CH]

Interesting. Okay. Well, good to know that I wasn't missing it under the wrong name (as a built-in). And I guess that's another thing (just to round out the end of this discussion): I was talking to Jonny Press and he mentioned to me TorQ (t-o-r-q), and it's this open source q framework that is completely hosted on GitHub, that has I don't know how many lines of code, but it's definitely a large project. And then I ended up finding on the KX website a collection of a massive number of open source q libraries. Like, there's a huge ecosystem of q code out there. Basically, I'm not going to say there's something for everything that you might be looking for, but Nick, you were just saying people write their own libraries. Anyways, we'll link to that in the show notes. That is something I feel like Stephen has mentioned before, Awesome q, [28] which is a curated list of libraries that he hosts on a repo, but I didn't know about the ... [sentence left incomplete]. Or maybe he mentioned it and [it] just never stuck in my memory that they have a huge list of things, sort of by category. And that is, I think, one of the criticisms of (maybe not J) but other array languages: that we don't really have a huge package management/library ecosystem similar to Python. And definitely this does not equal Python's ecosystem, but it's definitely way larger than the impression that I had about the q community.

01:15:26 [NP]

Yeah, I mean, I think it's one of the first things ... the second or third episode that I was on, it was like: why aren't the array languages more adopted? And part of that is package management and package deployment. There is no single source where people can put their libraries and, with a pip install, just install them. It'd be really great if the community could come and bring that together, or if KX were to do it. KX has something called QPacker, which does package things up and deploy them, but it's really only for their own enterprise software. Maybe there's a way to leverage that and make it for the community as well. But I think our conversation devolved into: why would you need a library if all these functions are only four characters long?

01:16:15 [CH]

Yeah, I mean, that's one of the things that Aaron Hsu says that some people disagree with and some people agree with. It's like the libraries versus idioms, you know? Who needs a library when you can just type everything out yourself because it's so short. Which I think works to a certain extent, but then, I prefer people writing stuff for me, even if it's terse, because if they're writing it, I don't have to. [chuckles]

01:16:41 [BT]

And there's an advantage to consistency. If you've got somebody who's come up with a good algorithm, something that works, then you don't have to. If you start from scratch, there can be edge cases that maybe don't work as well.

01:16:52 [ML]

Yeah. There's a lot of stuff where the easy version is very simple, but then the version that's good is many times more complicated.

01:17:02 [CH]

Last thing I'll note too, a plug for a plugin I came across called VS Code q. Because the first thing, when I started digging around in some of these open source repos, was that there doesn't seem to be any standard or even non-standard code formatter for q. Which, coming from automatically code formatted languages, drives me nuts when I see like two lines next to each other, one that has like a space between the ':' and the definition of a function and [in] the next one, there's no space. That stuff just drives me nuts because I'm like: let's just ... [sentence left incomplete]. I don't even care what the style is.

01:17:37 [ML]

But you're saying a q programmer wrote a space that was unnecessary?

01:17:41 [CH]

Oh, all over the place. All over the place. [chuckles]

01:17:44 [ML]

I guess they weren't a K programmer [everyone laughs]

01:17:46 [AB]

If it was k, that would basically make a difference, wouldn't it?

01:17:49 [CH]

It depends where the space is.

01:17:52 [NP]

It's a carriage return.

01:17:53 [CH]

Carriage return?

01:17:54 [NP]

That would make a difference.

01:17:55 [CH]

Oh, yeah, yeah, yeah, for sure.

01:17:56 [NP]

In K.

01:17:57 [AB]

That's in every language that I know of. Or at least Iversonian languages. None of them use a terminating character for expressions.

01:18:07 [CH]

Anyway, so I went searching for one. Didn't find one, but then found a pretty popular plugin that does build ... [sentence left incomplete]. They don't call it a q language server, but it's a mini language server written on top of TreeSitter (I believe that was the tool they were using). And I don't think TreeSitter has support for code formatting, but there's another open source repo called Topiary which I think you can build on top of TreeSitter. Anyways, I may go and do this myself or ... [sentence left incomplete]. Here's my nerd snipe: if there's someone out there that wants to help out with getting standard q formatting across some open source plugins and stuff, write to Bob and then Bob can forward it to me, or just DM me on Twitter. And yeah, I think that would be a very nice thing for the q ecosystem and also just array languages in general, but I think q mostly, because q strikes me as the closest thing to a non-niche array language. Because they actually use words, you start to desire that kind of code formatting stuff, but a lot of the APL and BQN stuff is a lot more terse. Like if you look at Aaron Hsu's stuff, I don't look at that code and go: "hey, you know what this would benefit from? A code formatter". [chuckles]

01:19:21 [AB]

That's funny. I use the built-in code formatter in Dyalog all the time. It's not really a code formatter. It's a whitespace adjuster. That's it.

01:19:31 [CH]

Yeah, comment aligner. Yeah, I use that as well. That's very nice.

01:19:32 [AB]

No, not just the comment aligning (I would occasionally use that too), but making all the indents standardized (4 spaces) and making all the braces be in the right places and so on. It never moves them to a different line, but it will move everything horizontally. And that I use all the time.

01:19:54 [NP]

So yeah, on that front, a few items to mention: KX is working with Microsoft, and on that front they are building a VS Code plugin. It's been an imminent release since like last December, but soon, maybe this month [or] next month, there's gonna be a VS Code plugin owned and produced by Microsoft and KX, so it should be quite good, with the documentation when you hover over things: all of the things you would expect out of a VS Code plugin. So that should be coming soon, and I think it should be freely available at that point.

01:20:29 [CH]

So rest in peace to VS Code q, I guess? Microsoft is implementing their own one.

01:20:34 [NP]

I believe so. The other thing is, I obviously like Emacs, so I have my q-mode, and that puts the comments [in place], and you can hit meta-q (I don't remember the exact binding, but meta-q or whatever) and it will re-indent and everything like that. What I wanted to add to it was flycheck, which basically ... [sentence left incomplete]. In Python, when you have a typo, it sends the code off to a Python process, and if there was an exception, it'll highlight that record (that line). And I wanted to do the same thing for q, but there's no way of passing a q script to q, having it attempt to syntax-parse it, and throw an exception. So I thought it would be pretty good if we had that. That doesn't exist yet.
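
[Note: a rough sketch of the kind of check Nick is asking for, using q's parse under protected evaluation. The name checkq and the file myscript.q are made up, and because parse is applied line by line this only flags single-line expressions, so it is not the full script-level check he wants.]

q)checkq:{[f] where not {@[{parse x;1b};x;0b]} each read0 f}   / 0-based indices of lines that fail to parse
q)checkq `:myscript.q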

01:21:24 [CH]

Very exciting. Wow!

01:21:26 [BT]

And I was gonna say that if you do wanna get in touch with us, of course, the email address is contact@arraycast.com [29] and we welcome your suggestions. If you were going to do some kind of an interface to q and this was your imminent release and you think you can beat out KX and Microsoft, then let us know and we'll publicize it for as long as it might exist before it gets thumped by something much bigger. But yeah, contact@arraycast.com.

01:22:03 [CH]

Awesome, yeah. And thanks, Nick, for coming on. This has been great to recap, and hopefully we'll get to do one of these in T minus 12 months, if KX does go ahead and do another one in 2024, and who knows, maybe I'll do the Dyalog APL one and we can do a recap, because I don't think we've ever done a ... [sentence left incomplete]. Uh, we did do the live one from when it was online [30] back during COVID ... the great COVID times that we all loved and miss so much [chuckles].

01:22:29 [BT]

And actually the discussion that we had was kind of neat cause it did bring together all the presenters at the end, which I thought was a neat way to do it.

01:22:36 [CH]

Yeah, I mean, pros and cons definitely of both in person and online. Like it's definitely, [actually] probably not possible to have an in-person "live from KX Con" where you've got a room with all the speakers who can just hop up to the mic, right? That's only going to be possible in an online format. So definitely pros and cons to both. But yeah, once again, we'll say thank you so much for taking the time, Nick. We know it's not always easy to find time to come on these, but we love having you, especially as a fourth-time guest panelist. I think you're probably the most frequent guest panelist at this point.

01:23:08 [NP]

So well, there's been a lot of complaints from my colleagues around the community that Arraycast doesn't have enough q representation. I feel like I'm doing my best here to hold that. If we can get a few more people (some of the people that we've highlighted today) that would be great for the community.

01:23:30 [CH]

I think we'll definitely get Jonny on. I mean, I talked to his wife (was it Victoria or Charlotte?), Charlotte or Victoria, who works for KX, [and she] said that it would be easy to get Jonny on. We'll get Jonny on. I think I talked to Aaron too and he said he's gotta go back and listen to all the podcasts first before he wants to come on. But there's a couple of people in the pipeline now that we'll definitely add to our list, and our listeners can expect to hear from those folks in the future.

01:23:55 [AB]

Maybe if we get enough q people on then Arthur Whitney will come up to balance it with k.

01:24:02 [CH]

Ooooooh!

01:24:03 [AB]

You like my strategy here?

01:24:09 [CH]

I like that, I like that, Adám. I like that, Adám. Nice, very nice. Just start talking about K4, how it's the best k. There's no k better.

01:24:11 [AB]

Last word [Conor laughs]

01:24:15 [CH]

I think with that we will say: Happy Array Programming!

[everyone in unison]

Happy Array programming!

[music]