@simonw
Created November 14, 2025 19:59
1
00:00:00,080 --> 00:00:09,920
You can't talk about NotebookLM without talking about their brilliantly weird podcast thing, where they can generate a podcast about whatever content you've thrown in there.
2
00:00:09,920 --> 00:00:28,320
So I like doing things like: I fed in a big boring report about some generative AI study, and I told them, you are banana slugs, and I want you to have a conversation about this paper from the perspective of banana slugs talking about how this could affect your slug society. And they did it.
3
00:00:28,320 --> 00:00:30,880
It's like a Monty Python skit.
4
00:00:30,960 --> 00:00:32,560
Oh my god.
5
00:00:36,480 --> 00:00:46,480
Welcome to episode six of Accessibility and Gen AI, a podcast that interviews the newsmakers and thought leaders in the world of accessibility and artificial intelligence.
6
00:00:46,480 --> 00:00:50,720
I'm Joe Devon, and I'm joined by my co-host Eamon McGuinness.
7
00:00:50,720 --> 00:00:56,160
And today we are interviewing Simon Willison, a true OG of the web.
8
00:00:56,160 --> 00:00:58,960
He created my favorite framework called Django.
9
00:00:58,960 --> 00:01:09,760
He created Lanyrd and Datasette, and now he is a prolific blogger talking about and building projects on top of AI on a daily basis.
10
00:01:09,760 --> 00:01:16,000
Simon has contributed so much to the world we live in, and it is a true pleasure to have him join us today.
11
00:01:16,000 --> 00:01:17,920
Simon, welcome to the pod.
12
00:01:18,320 --> 00:01:19,200
Thank you very much.
13
00:01:19,200 --> 00:01:21,040
I'm excited to be here.
14
00:01:21,040 --> 00:01:28,320
You know, when I think about all the things that you've done, your impact on the web spans so many influential projects.
15
00:01:28,320 --> 00:01:35,520
But I'd love to understand what has been the most meaningful to you personally, and what does a typical day look like for you?
16
00:01:35,600 --> 00:01:39,200
So I think the most impactful project I've worked on is still Django.
17
00:01:39,200 --> 00:01:42,720
Like, it's extraordinary how far that framework's gone.
18
00:01:42,720 --> 00:01:46,480
You know, NASA have used it, Instagram and Pinterest were built on top of it.
19
00:01:46,480 --> 00:01:52,320
And Meta's Threads is just another Django application, I found out recently.
20
00:01:52,320 --> 00:01:53,680
So that's amazing.
21
00:01:53,680 --> 00:01:59,600
And I love the fact that Django can be classified now as boring technology in that it's the safe choice, right?
22
00:01:59,600 --> 00:02:04,360
If you just want to build something on the web, if you pick Django, you're not going to run into any sharp edges.
23
00:02:04,360 --> 00:02:05,640
There won't be any surprises.
24
00:02:05,640 --> 00:02:07,880
I'm really proud that it's made it to that point.
25
00:02:08,280 --> 00:02:21,240
But these days I'm really excited about the more recent stuff I'm working on: open source tools for data journalism, where the dream is that I want the tools I'm building to help somebody win a Pulitzer Prize in investigative reporting.
26
00:02:21,240 --> 00:02:30,200
And that sounds wildly ambitious, and it is, but that would be such a cool example of the kinds of problems I want to solve.
27
00:02:30,200 --> 00:02:35,720
Like, I build tools for other people, and I want those other people to achieve world-changing results with them.
28
00:02:35,800 --> 00:02:37,080
That is so cool.
29
00:02:37,080 --> 00:02:43,160
And I'm just going to tell you an anecdote, which I hadn't mentioned when we met before.
30
00:02:43,160 --> 00:02:49,960
I had never touched Python, and I needed to deliver an e-commerce project for a client.
31
00:02:49,960 --> 00:02:55,880
And I learned Python and Django and built the app in two weeks.
32
00:02:55,880 --> 00:02:58,040
And my client was so happy.
33
00:02:58,040 --> 00:03:05,800
Because, you know, it had that admin that you guys put in there, and my client was like, you should sell this whole admin thing.
34
00:03:05,800 --> 00:03:10,120
And I was just laughing so loud because this was a two week project for me.
35
00:03:10,120 --> 00:03:11,640
That is so thrilling.
36
00:03:11,640 --> 00:03:13,080
And I've heard a lot of that.
37
00:03:13,080 --> 00:03:15,640
A lot of people got their start in programming, learning Django.
38
00:03:15,640 --> 00:03:18,760
And when we built Django, we never dreamed it would be a beginner's project.
39
00:03:18,760 --> 00:03:23,720
We thought it was, you know, for experienced engineers to quickly build web applications.
40
00:03:23,720 --> 00:03:28,280
But since then there's the Django Girls tutorials that have been running for I think nearly a decade now.
41
00:03:28,280 --> 00:03:29,960
There's been so much activity on that front.
42
00:03:29,960 --> 00:03:34,520
And yeah, I meet people and they're like, oh, Django was my introduction to programming and web development.
43
00:03:34,600 --> 00:03:35,800
That's so exciting.
44
00:03:35,800 --> 00:03:39,640
You know, that wasn't the plan, but it's so thrilling to see that happening.
45
00:03:39,640 --> 00:03:44,520
Yeah, though, to be fair, I did start with PHP and MySQL before.
46
00:03:44,520 --> 00:03:46,560
That was my intro to Python.
47
00:03:46,560 --> 00:03:51,120
It was really hard to get Python working, but the Django part of it was awesome and easy.
48
00:03:51,120 --> 00:03:51,600
Nice.
49
00:03:51,600 --> 00:03:52,480
It's funny you say that.
50
00:03:52,480 --> 00:03:53,520
Simon, pleased to meet you.
51
00:03:53,520 --> 00:03:54,640
You and I have never met before.
52
00:03:54,640 --> 00:03:57,120
So thanks for spending your time with us today.
53
00:03:57,120 --> 00:04:00,160
Um I've heard Python a lot over the past several weeks.
54
00:04:00,160 --> 00:04:05,120
Our youngest son, who's a sophomore at Oregon State, is currently taking a Python class.
55
00:04:05,200 --> 00:04:07,040
He loves it compared to C.
56
00:04:07,040 --> 00:04:08,320
He's like, I'm all over it.
57
00:04:10,080 --> 00:04:11,200
He's a fan.
58
00:04:11,200 --> 00:04:13,120
He's a huge fan, he is.
59
00:04:13,120 --> 00:04:21,680
You mentioned in your initial response your core goal of helping people and making things easier.
60
00:04:21,680 --> 00:04:25,600
Tying that into accessibility, digital accessibility:
61
00:04:25,600 --> 00:04:30,240
How do you see the Gen AI roadmap and accessibility coming together?
62
00:04:30,240 --> 00:04:35,440
And maybe the improvements from an inclusiveness perspective.
63
00:04:35,440 --> 00:04:41,120
So, well, I'm not an expert in accessibility.
64
00:04:41,120 --> 00:04:43,040
It's something I've cared about throughout my entire career.
65
00:04:43,040 --> 00:04:51,040
Like, Django from the very start was always built with semantic HTML and all of those accessibility concerns of 20 years ago in place.
66
00:04:51,040 --> 00:04:57,040
And even these days there's an accessibility working group, which I'm not involved with, that's doing amazing work on the Django admin and so forth.
67
00:04:57,040 --> 00:04:58,960
And so it's something I care very deeply about.
68
00:04:58,960 --> 00:05:04,080
I'm fascinated to learn more about that intersection between generative AI and accessibility myself.
69
00:05:04,080 --> 00:05:10,080
I feel like the most exciting trend over the past year for me has been the vision models, or these multimodal models, right?
70
00:05:10,080 --> 00:05:17,600
They can consume images and video, and the audio stuff has got incredibly powerful in the past two months.
71
00:05:17,600 --> 00:05:24,720
Like, as of now it's not even surprising that you can have an audio conversation with one of these models; two months ago that was barely a feature.
72
00:05:24,720 --> 00:05:25,920
That's so cool.
73
00:05:25,920 --> 00:05:35,560
And I feel like the accessibility benefits of these seem underexplored, at least from my perspective, but so promising.
74
00:05:35,560 --> 00:05:46,520
Some people I've talked to have been skeptical about the accessibility benefits, because their argument is, you know, if you give somebody unreliable technology that might hallucinate and make things up, surely that's harming them.
75
00:05:46,520 --> 00:05:48,360
I don't think that's true.
76
00:05:48,360 --> 00:05:53,000
I feel like people who use screen readers are used to unreliable technology.
77
00:05:53,000 --> 00:05:58,040
You know, a guide dog is a wonderful thing and a very unreliable piece of technology.
78
00:05:58,040 --> 00:06:04,520
So, you know, when you consider that people with accessibility needs have agency, they can understand the limitations of the technology they're using.
79
00:06:04,520 --> 00:06:09,000
I feel like giving them a tool where they can point their phone at something and it can describe it to them.
80
00:06:09,000 --> 00:06:11,640
It's got really good OCR capabilities built in.
81
00:06:11,640 --> 00:06:13,480
You can have an audio conversation with this.
82
00:06:13,480 --> 00:06:19,400
This just feels like a world away from the accessibility technology of just three or four years ago.
83
00:06:19,480 --> 00:06:20,520
Agree, completely agree.
84
00:06:20,680 --> 00:06:29,000
I think that concern ties into, well, maybe the concern of LLMs not being fully inclusive.
85
00:06:29,000 --> 00:06:32,840
And if LLMs are not inclusive, there will be gaps.
86
00:06:32,840 --> 00:06:34,040
But we can solve that.
87
00:06:34,040 --> 00:06:46,280
We can solve that by engaging individuals with disabilities with the prompts, and we can resolve it by making sure that we have a truly comprehensive, non-biased data set when we're building data sets.
88
00:06:46,280 --> 00:06:52,040
I think as long as we get ahead of it, as long as we're aware of that potential gap, we can solve it.
89
00:06:52,040 --> 00:06:53,800
I do, I believe that.
90
00:06:53,800 --> 00:06:57,880
Simon, we are now, I think, on day 12.
91
00:06:57,880 --> 00:07:00,920
Was it 12 days of OpenAI, or 14 days?
92
00:07:00,920 --> 00:07:04,200
I think it's 12 days, and we've reached the final day.
93
00:07:04,200 --> 00:07:08,280
But it also feels like the 12 days of Google AI.
94
00:07:08,280 --> 00:07:14,440
And when Google was doing their announcements, OpenAI sort of did their own bunch of announcements.
95
00:07:14,440 --> 00:07:19,120
It just feels like we're in a war right now, massive competition.
96
00:07:19,120 --> 00:07:22,320
Uh, so much to talk about with respect to that.
97
00:07:22,320 --> 00:07:29,120
But what were you most excited about that was released in the last couple of weeks?
98
00:07:29,120 --> 00:07:31,600
Like, what are the top items?
99
00:07:31,600 --> 00:07:42,880
And as an aside, if any of those tie into accessibility, I would love your thoughts, because I haven't had enough time to even look at most of these.
100
00:07:42,960 --> 00:07:44,240
It's been bewildering.
101
00:07:44,240 --> 00:07:47,200
The whole month of December has just been a whirlwind.
102
00:07:47,200 --> 00:07:50,080
And since when is December the month that people release everything, right?
103
00:07:50,080 --> 00:07:55,680
You'd expect people to be dialing down for the holidays, but no, we've had extraordinary releases from OpenAI.
104
00:07:55,680 --> 00:07:59,360
Google have managed to undercut OpenAI, which has never happened before.
105
00:07:59,360 --> 00:08:06,480
Like last year, every time Google made a Gemini announcement, OpenAI would launch something better that morning, almost as sort of a power move.
106
00:08:06,480 --> 00:08:09,520
The opposite is happening today, which is so fascinating.
107
00:08:09,520 --> 00:08:12,560
Like Google's Gemini team are really ramping up.
108
00:08:12,560 --> 00:08:15,840
There's a bunch of Gemini stuff that's really exciting.
109
00:08:15,840 --> 00:08:24,400
They released Gemini 2.0 Flash, which is sort of the cheapest version of their Gemini 2.0 series, and it's a really impressive model.
110
00:08:24,400 --> 00:08:26,320
I've been playing around with that one a whole lot.
111
00:08:26,320 --> 00:08:33,280
The Gemini models can do audio input and they can do video input, which puts them a step ahead of OpenAI.
112
00:08:33,280 --> 00:08:39,200
OpenAI have some preview audio models, but nothing like what Gemini can do on that sort of multimodal front.
113
00:08:39,200 --> 00:08:55,040
The really fun thing is that Gemini and OpenAI both now have streaming video APIs, where you can literally point your webcam at something, stream video images into the model, talk over them, and have it talk back to you.
114
00:08:55,040 --> 00:08:57,920
And this is absolute science fiction.
115
00:08:57,920 --> 00:09:03,720
Like Gemini managed to squeeze their version of this out the day before OpenAI did, which was extraordinary.
116
00:09:03,880 --> 00:09:07,960
OpenAI, however, productized it; it's in the ChatGPT mobile app now.
117
00:09:08,120 --> 00:09:15,160
So I can fire up ChatGPT, turn on my camera, point it at things, and start having a conversation, including with Santa Claus.
118
00:09:15,160 --> 00:09:18,120
They've got a gimmicky Santa Claus voice that you can talk to.
119
00:09:18,120 --> 00:09:19,800
Like I introduced it to my chickens.
120
00:09:19,800 --> 00:09:22,120
I said, here are my chickens, these are their names.
121
00:09:22,120 --> 00:09:24,920
And then a few minutes later I pointed at a chicken and said, which chicken is this?
122
00:09:24,920 --> 00:09:26,040
And it got the answer right.
123
00:09:26,040 --> 00:09:28,040
Like, what are we even doing?
124
00:09:28,040 --> 00:09:29,640
That's amazing, right?
125
00:09:29,880 --> 00:09:31,720
That's so exciting.
126
00:09:31,720 --> 00:09:38,040
And the accessibility implications of streaming video and audio into these things, that's extraordinary.
127
00:09:38,040 --> 00:09:39,240
Absolutely extraordinary.
128
00:09:39,240 --> 00:09:42,360
Those capabilities became available, what, three or four days ago?
129
00:09:42,360 --> 00:09:44,680
Like this is absolute cutting edge.
130
00:09:44,680 --> 00:09:46,760
This stuff is available over the APIs as well.
131
00:09:46,760 --> 00:09:48,120
Was it just yesterday?
132
00:09:48,120 --> 00:09:49,400
I'm losing track of the days.
133
00:09:49,400 --> 00:10:02,840
OpenAI now have a new WebRTC API for their real-time stuff, and I knocked out a little test web page where you can click a button, and now you're having an audio conversation with one of their GPT voices.
134
00:10:02,840 --> 00:10:06,440
And it was like a dozen lines of JavaScript to get that working.
135
00:10:06,680 --> 00:10:08,200
Unbelievable, right?
136
00:10:08,200 --> 00:10:10,920
So that stuff is, again, just so new.
137
00:10:10,920 --> 00:10:13,880
Like these streaming APIs didn't exist two weeks ago.
138
00:10:13,880 --> 00:10:16,040
Now they're rolling out.
139
00:10:16,040 --> 00:10:20,760
I feel like we've hardly even started dipping our toes into what those can do.
140
00:10:20,760 --> 00:10:27,240
The other exciting thing is that OpenAI dropped the prices of their audio API by a lot.
141
00:10:27,240 --> 00:10:31,480
Like previously it was prohibitively expensive, now it's just about affordable.
142
00:10:31,480 --> 00:10:39,480
Gemini haven't announced the pricing on their audio API yet, but all of their other models are at bargain-basement prices already.
143
00:10:39,480 --> 00:10:44,360
Part of the benefit of the competition is that the pricing just keeps on going down.
144
00:10:44,360 --> 00:10:48,400
It's unbelievably inexpensive to use these vision models right now.
145
00:10:48,640 --> 00:10:58,160
A little while ago, and when I say a while ago I mean like two weeks ago, Amazon announced their Amazon Nova models, which are effectively their version of the Google Gemini models.
146
00:10:58,160 --> 00:11:00,640
They're similarly priced, they have similar capabilities.
147
00:11:00,640 --> 00:11:22,480
And I did a napkin calculation and found that if I wanted to take 67,000 photographs from my photo library and run all of those through either Gemini 1.5 Flash or the cheap Amazon Nova one, it would cost me $10.25 for all 67,000 photos, to get an actually useful text description of each of them.
148
00:11:22,480 --> 00:11:28,160
I ran those numbers three times because I didn't believe them the first time I calculated them.
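That kind of napkin math is easy to reproduce. Here is a sketch of the calculation, with illustrative budget-tier prices and per-image token counts; the specific figures below are assumptions for demonstration, not the exact rates used in the interview:

```python
# Back-of-envelope cost estimate for captioning a photo library with a
# cheap vision model. Prices and token counts are illustrative assumptions.

def batch_cost(num_images, tokens_per_image, output_tokens_per_image,
               input_price_per_m, output_price_per_m):
    """Total cost in dollars for describing num_images photos."""
    input_cost = num_images * tokens_per_image * input_price_per_m / 1_000_000
    output_cost = num_images * output_tokens_per_image * output_price_per_m / 1_000_000
    return input_cost + output_cost

# Hypothetical budget-tier pricing: $0.075 per 1M input tokens,
# $0.30 per 1M output tokens, ~260 tokens per image, ~100 tokens per caption.
cost = batch_cost(67_000, 260, 100, 0.075, 0.30)
print(f"${cost:.2f}")  # prints $3.32
```

Whatever exact rates you plug in, the result comes out at single-digit or low double-digit dollars for tens of thousands of photos, which is why the number is hard to believe at first.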
149
00:11:28,240 --> 00:11:29,440
That's amazing.
150
00:11:29,440 --> 00:11:30,240
Wow, right?
151
00:11:30,240 --> 00:11:31,120
Absolutely incredible.
152
00:11:32,160 --> 00:11:35,680
Yeah, I was gonna ask you what tools you currently use.
153
00:11:35,680 --> 00:11:38,960
It sounds like you use everything on a daily basis.
154
00:11:38,960 --> 00:11:50,160
But from what you see now, or where you see things growing, do you believe there's a current front-runner, or somebody that's gaining constant momentum and getting ahead of the game?
155
00:11:50,160 --> 00:11:51,760
No, and this is new.
156
00:11:51,760 --> 00:11:54,560
Like, six months ago it was OpenAI, right?
157
00:11:54,560 --> 00:11:59,920
OpenAI launched GPT-4, and GPT-4 was the best available model for like nine months.
158
00:11:59,920 --> 00:12:01,520
It felt like an eternity.
159
00:12:01,520 --> 00:12:06,320
Then that broke when Anthropic released their Claude 3 Opus model.
160
00:12:06,320 --> 00:12:08,320
I'd have to look it up; I think it was March this year.
161
00:12:08,320 --> 00:12:11,680
And then they followed that up with Claude 3.5 Sonnet.
162
00:12:11,680 --> 00:12:13,520
None of these names are very catchy.
163
00:12:13,520 --> 00:12:18,000
That's been the best model, the model I use every day, for about six months now.
164
00:12:18,000 --> 00:12:21,600
And I think it's still just ahead of the rest of the pack.
165
00:12:21,600 --> 00:12:24,320
But the Gemini models are catching up really quickly.
166
00:12:24,320 --> 00:12:33,080
The OpenAI models: there are the new o1 models, and they released another one of those a few days ago; those are really interesting as well.
167
00:12:29,920 --> 00:12:43,400
But meanwhile, the Chinese labs put out Qwen 2.5, an openly licensed model that I can run on my laptop, which is in that GPT-4 space.
168
00:12:43,400 --> 00:12:49,320
The Llama models from Meta: Llama 3.3, again GPT-4 level, runs on my laptop directly.
169
00:12:49,320 --> 00:12:50,520
I never thought that would happen.
170
00:12:50,520 --> 00:12:55,400
I thought I'd need a $50,000 mainframe-like computer to run that kind of model.
171
00:12:55,400 --> 00:12:57,480
This is all changing so quickly.
172
00:12:57,480 --> 00:13:05,160
The flip side is there's a suspicion that the scaling thing is no longer holding.
173
00:13:05,160 --> 00:13:10,760
Like it used to be you chuck in more data and more parameters and more GPU time, and you get better models.
174
00:13:10,760 --> 00:13:15,960
But everything seems to be leveling off at the GPT-4o and Claude Sonnet level.
175
00:13:15,960 --> 00:13:20,440
Like the new models are slightly better, but they're not like two times better.
176
00:13:20,440 --> 00:13:26,600
And so the new frontier appears to be this idea of inference scaling.
177
00:13:26,600 --> 00:13:33,720
It's this thing where you take a model that's really strong and you just give it longer to churn away to try and come up with answers.
178
00:13:33,720 --> 00:13:36,920
OpenAI started that out with their o1 model.
179
00:13:36,920 --> 00:13:45,480
And four hours ago, Google released their first version of this, something called Gemini Flash Thinking, I think it's called.
180
00:13:45,480 --> 00:13:50,040
I just literally got some software working against that like half an hour ago.
181
00:13:50,040 --> 00:13:52,920
Um, and it looks... I'm happy. You saw it half an hour ago.
182
00:13:52,920 --> 00:13:54,280
You already have code working.
183
00:13:54,280 --> 00:13:56,440
This is Simon for you.
184
00:13:56,440 --> 00:14:02,200
And the Chinese ones: there's a Qwen model called QwQ, which is the same pattern.
185
00:14:02,200 --> 00:14:03,320
There's another one.
186
00:14:03,320 --> 00:14:05,400
And so this is all happening now.
187
00:14:05,400 --> 00:14:20,080
And what's interesting about those is that they're not better at every task, but they are better at tasks like coding and mathematics, where, if you were a human being with a big notepad and you wrote down your thinking step by step, that would help you solve the problem.
188
00:14:20,080 --> 00:14:22,400
These models are now capable of doing that kind of stuff.
189
00:14:22,400 --> 00:14:24,320
So it's a different kind of better.
190
00:14:24,320 --> 00:14:32,960
They're not just getting universally better at everything, but at certain problem-solving tasks; that's the new frontier that people are working on.
191
00:14:32,960 --> 00:14:35,360
And all of this happened so recently as well.
192
00:14:35,360 --> 00:14:40,080
Like, it's a full-time job keeping up with it, definitely.
193
00:14:40,160 --> 00:14:41,040
It really is.
194
00:14:41,040 --> 00:14:45,920
And it's definitely eating into my days, just trying to keep up.
195
00:14:45,920 --> 00:14:51,760
I am curious, since Eamon brought it up: what do you use on a daily basis?
196
00:14:51,760 --> 00:15:01,680
And what is good enough for you to have signed up for the subscription? Including, are you spending that $200 a month on the new OpenAI model?
197
00:15:01,680 --> 00:15:03,120
Is that any good?
198
00:15:03,120 --> 00:15:08,480
I've seen some people say it's not that good, and then a few people are like, this is the best thing out there.
199
00:15:08,480 --> 00:15:09,680
I'm so torn on that one.
200
00:15:09,680 --> 00:15:11,680
So I'm not spending the $200 a month yet.
201
00:15:11,680 --> 00:15:17,200
At the moment, I pay $20 a month for ChatGPT and $20 a month for Claude.
202
00:15:17,200 --> 00:15:21,520
I would be paying for GitHub Copilot, but I get it for free as an open source maintainer.
203
00:15:21,520 --> 00:15:23,680
And that's it for my subscriptions.
204
00:15:23,680 --> 00:15:28,480
But then I've also got API accounts with basically everyone, and I'm constantly experimenting with the APIs.
205
00:15:28,480 --> 00:15:32,960
The thing is, they're so cheap that most months my API bill across everyone comes to like 10 bucks.
206
00:15:32,960 --> 00:15:38,480
Like, I've never managed to spend more than $20 across all of the APIs in any given month.
207
00:15:38,480 --> 00:15:41,760
So it's not a huge amount of money that I'm spending right now.
208
00:15:41,760 --> 00:15:48,560
Yeah, the $200 thing gets you fewer caps on the o1 model.
209
00:15:48,560 --> 00:15:57,760
Like the o1 model, I think you can use it 50 times before you get locked out of it for a few days, which is a bit frustrating because I've started using that one a little bit more.
210
00:15:57,760 --> 00:16:05,560
And you get this thing called o1 Pro, which I really want to try, but I don't want to spend $200 just to try it.
211
00:16:05,720 --> 00:16:08,600
Just give me one free go at it, you know?
212
00:16:08,600 --> 00:16:14,680
I don't know, I might end up paying $200 a month at some point, but I haven't quite justified it to myself yet.
213
00:16:14,760 --> 00:16:22,440
We just had Ed Summers, the head of accessibility for GitHub, and he announced that Copilot is now free.
214
00:16:22,440 --> 00:16:26,040
So that's now, with some limits, for everyone.
215
00:16:26,120 --> 00:16:27,080
Yeah, it's uh yes.
216
00:16:27,080 --> 00:16:32,040
I forget what the limits are, but the great thing about that is it's not just a free trial.
217
00:16:32,040 --> 00:16:35,720
This is a free tier that GitHub are planning to make permanent.
218
00:16:35,720 --> 00:16:39,080
It's especially important for people around the world, you know.
219
00:16:39,080 --> 00:16:44,200
People in India are much less likely to be able to set up that credit card subscription and so forth.
220
00:16:44,200 --> 00:16:47,640
Now they get access to the Copilot experience.
221
00:16:47,640 --> 00:16:48,840
I'm really excited about that.
222
00:16:48,840 --> 00:16:54,360
I think it's actually the oldest generative AI tool in mainstream use.
223
00:16:54,360 --> 00:16:59,240
Copilot, it turns out, predates ChatGPT by nearly two years.
224
00:16:59,240 --> 00:17:02,680
They released the first version of Copilot in 2021.
225
00:17:02,680 --> 00:17:08,440
And I love that it's not a chat; or at least originally it was that autocomplete interface, which was really innovative.
226
00:17:08,440 --> 00:17:11,960
It was a really interesting way of interacting with those models.
227
00:17:11,960 --> 00:17:15,400
So yeah, I'm a huge Copilot user.
228
00:17:15,400 --> 00:17:20,040
I'm at that point now where if it's not running in my text editor, I feel restricted.
229
00:17:20,040 --> 00:17:24,360
Like, wow, now I'm having to actually type the code out in full myself.
230
00:17:24,360 --> 00:17:36,280
Yeah, and now that they've added a system prompt for accessibility, that's so helpful, because then you can really make sure that what it spits out is much more likely to be accessible.
231
00:17:36,280 --> 00:17:39,240
I've been asking them about that for a year.
232
00:17:39,240 --> 00:17:42,920
So they announced it just now at GitHub Universe.
233
00:17:43,000 --> 00:17:44,040
That's amazing.
234
00:17:44,040 --> 00:17:51,600
I wrote up a thing last night about one of the things you can do with Claude, and ChatGPT has this as well now.
235
00:17:51,600 --> 00:17:58,000
There's this thing called a project, where you set up a project and you can dump a bunch of files into it, but you can also set custom instructions in there.
236
00:17:58,000 --> 00:18:00,640
So it's a nice, easy way of doing system prompts.
237
00:18:00,640 --> 00:18:14,560
And I've been setting up little projects for different types of code that I write, with custom instructions like: always start my HTML documents in this way, always include box-sizing: border-box in the CSS, little things like that.
238
00:18:14,560 --> 00:18:16,080
Um it's fantastic.
239
00:18:16,080 --> 00:18:24,560
Like, I can now one-shot prompt a full page of working code and have all of those little ideas baked into it.
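What a "project" with custom instructions boils down to is a reusable system message prepended to every request. A minimal sketch of that pattern using the common chat-messages shape; the function name is mine for illustration, not any vendor's API:

```python
# Sketch: per-project custom instructions ride along as a system message
# on every request, so each one-off prompt inherits your house rules.

def build_messages(custom_instructions, user_prompt):
    """Combine reusable project instructions with a one-off user prompt."""
    return [
        {"role": "system", "content": custom_instructions},
        {"role": "user", "content": user_prompt},
    ]

instructions = (
    "Always start HTML documents with a doctype and a lang attribute. "
    "Always include box-sizing: border-box in the CSS."
)
messages = build_messages(instructions, "Build me a signup form page.")
print(messages[0]["role"])  # prints system
```

The point is that the instructions are written once and then apply to every prompt in the project, which is why one good accessibility example can shape everything the model produces afterwards.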
240
00:18:24,560 --> 00:18:29,280
It's also interesting because it means that you can use the model for things that aren't in its training data.
241
00:18:29,280 --> 00:18:42,000
Like, I've started using the Python uv tool a lot, which has a way of running Python scripts where you list the dependencies in a magic comment at the top of the file, and then you don't have to think about your dependencies; it just uses them correctly.
242
00:18:42,000 --> 00:18:48,560
And so I built a little custom project which teaches it, gives it one example of, here's how you list your dependencies.
243
00:18:48,560 --> 00:18:52,400
And now I can one-shot a fully working, self-contained Python script.
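The "magic comment" being described is uv's inline script metadata (standardized as PEP 723). A minimal sketch with a hypothetical file name and an empty dependency list, so it also runs under plain Python:

```python
# hello.py (hypothetical name); run with: uv run hello.py
# uv reads the inline script metadata comment below and provisions any
# listed dependencies automatically. This sketch lists none and uses only
# the standard library, so plain `python hello.py` works too.
# /// script
# requires-python = ">=3.9"
# dependencies = []
# ///
import platform

def describe_runtime():
    """Report the interpreter this self-contained script is running under."""
    return f"Running under Python {platform.python_version()}"

if __name__ == "__main__":
    print(describe_runtime())
```

In a real script you would list actual packages, e.g. `# dependencies = ["httpx"]`, and uv would install them in an isolated environment before running the file.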
244
00:18:52,400 --> 00:19:09,280
So yeah, I absolutely buy that if you have expertise in accessibility, in ARIA stuff that the models aren't doing, you give them one example, just one example of your ideal framework, your ideal layout, and from that point on they'll be really good at producing code that fits that example.
245
00:19:09,280 --> 00:19:10,960
I love to see your enthusiasm here.
246
00:19:10,960 --> 00:19:14,000
Yeah, it's clear how much you love it, it really is.
247
00:19:14,000 --> 00:19:20,240
You personally created a command line tool called LLM.
248
00:19:20,240 --> 00:19:24,640
Can you share with our listeners what that's all about and how useful that can be?
249
00:19:24,640 --> 00:19:35,000
Yeah, so the initial idea around this was that I spend a lot of time in the terminal. I'm on macOS, but you know, I'm a terminal person.
250
00:19:35,160 --> 00:19:44,200
And I realized that large language models and the terminal are a really good match, because at their most basic, a large language model is a thing where you give it a prompt and it gives you a response.
251
00:19:44,200 --> 00:19:47,480
And in the terminal, you're always piping things from one tool to another.
252
00:19:47,480 --> 00:19:53,240
So wouldn't it be great if you could just pipe text into a language model and then have the response come back out again?
253
00:19:53,240 --> 00:19:55,560
So the first version of LLM was exactly that.
254
00:19:55,560 --> 00:19:58,120
It was using the OpenAI API.
255
00:19:58,120 --> 00:20:02,600
And I noticed that nobody had LLM on the Python package repository yet.
256
00:20:02,600 --> 00:20:03,560
So I grabbed that.
257
00:20:03,560 --> 00:20:09,000
It was like a namespace grab, because a three-letter acronym tool felt like a cool thing to have.
258
00:20:09,000 --> 00:20:11,080
And so I built that, and it turns out it is great.
259
00:20:11,080 --> 00:20:18,520
It's really fun being able to say cat myfile.py | llm 'explain this code' and have it spit out an explanation.
260
00:20:18,520 --> 00:20:19,800
That's really fun.
261
00:20:19,800 --> 00:20:25,560
And then I added plugin support so that it could support additional models.
262
00:20:25,560 --> 00:20:31,800
Because, you know, why talk to just OpenAI when you could talk to Anthropic or Gemini or all of these other models as well?
263
00:20:31,800 --> 00:20:37,480
And because that's based on plugins, anyone else can write a plugin that adds support for a new model.
264
00:20:37,480 --> 00:20:40,120
And I also can write plugins that do local models.
265
00:20:40,120 --> 00:20:43,560
So now my little command line tool out of the box does OpenAI.
266
00:20:43,560 --> 00:20:47,240
And if you install a plugin, it can do Gemini, and Anthropic's Claude, and so on.
267
00:20:47,240 --> 00:20:50,840
And then you can install some plugins that will install models on your laptop.
268
00:20:50,840 --> 00:20:54,600
And now you've got a completely offline language model environment.
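The plugin idea can be pictured as a registry that maps model names to interchangeable backends. This is a toy sketch of that pattern for illustration only, not the tool's actual plugin API:

```python
# Toy sketch of a plugin-style model registry: the core tool exposes a
# registry, and each "plugin" registers extra backends under new names.
MODELS = {}

def register_model(name, respond):
    """A plugin calls this to make a new model available by name."""
    MODELS[name] = respond

def prompt(model_name, text):
    """Route a prompt to whichever backend registered under that name."""
    return MODELS[model_name](text)

# The core install ships one backend; imagine it calling a hosted API.
register_model("echo", lambda text: f"echo: {text}")

# Installing a plugin just registers more backends, including fully
# local ones, without touching the core tool at all.
register_model("reverse", lambda text: text[::-1])

print(prompt("echo", "hello"))     # prints echo: hello
print(prompt("reverse", "hello"))  # prints olleh
```

The design payoff is that new models, hosted or local, slot in behind the same `prompt` interface, which is why supporting a newly released model is a matter of writing one small plugin.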
269
00:20:54,600 --> 00:20:55,640
So much fun.
270
00:20:55,640 --> 00:20:59,720
Like, it also means that whenever a new model comes out, I've got something I can do with it.
271
00:20:59,720 --> 00:21:03,000
Like I can be like, okay, new Gemini model, tap, tap, tap, tap, tap.
272
00:21:03,000 --> 00:21:06,200
Now my LLM plugin for that can support that model.
273
00:21:06,200 --> 00:21:13,320
So it sort of helps me stay on top of new developments, because I'm actually writing code that interacts with these models.
274
00:21:13,320 --> 00:21:14,680
And I use that on a daily basis.
275
00:21:14,680 --> 00:21:29,200
Like there are all sorts of things where it's convenient to be in the terminal and quickly ask a question or quickly analyze something, or you can do things like curl a URL and pipe that into the model, and now you can ask questions against a web page.
276
00:21:36,960 --> 00:21:39,040
A lot of people don't know how to use a terminal.
277
00:21:39,040 --> 00:21:41,280
Like it's a power user tool.
278
00:21:41,280 --> 00:21:47,440
And it bothers me that a lot of the stuff I'm building is then only available to people who are terminal users.
279
00:21:47,440 --> 00:21:51,840
So I have an ongoing goal to build a sort of web application on top of LLM.
280
00:21:51,840 --> 00:22:00,000
So you can type LLM space web enter, it runs a local web server, it pops open your browser, and now you've got a GUI where you can start playing with models.
281
00:22:00,000 --> 00:22:03,600
And I'm forever two weeks away from getting that feature working.
282
00:22:03,840 --> 00:22:04,640
Forever.
283
00:22:04,640 --> 00:22:05,520
Yeah.
284
00:22:05,840 --> 00:22:06,880
That'll be fun.
285
00:22:06,880 --> 00:22:10,240
And you're also doing multimodal on the command line.
286
00:22:10,240 --> 00:22:18,640
Oh, and I think I saw something you did that could be used for accessibility, like alt text or image description or something like that, correct?
287
00:22:18,640 --> 00:22:24,720
So that's something I actually use large language models for: most of my alt text these days.
288
00:22:24,720 --> 00:22:40,400
Like whenever I tweet an image or whatever, I've got a Claude Project called Alt Text Writer, and it's got a prompt and an example, and I dump an image in and it gives me the alt text. And I very rarely use it as-is, because that's rude, right?
289
00:22:40,400 --> 00:22:46,640
You should never just dump text onto people that you haven't reviewed yourself, but it's always a good starting point.
290
00:22:46,640 --> 00:22:53,280
And normally I'll edit a tiny little bit, I'll like delete an unimportant detail, or I'll bulk something up, and then I've got alt text that works.
291
00:22:53,280 --> 00:22:56,400
And often it's actually got really good taste.
292
00:22:56,400 --> 00:23:05,000
Like a great example is if you've got a screenshot of an interface, there's a lot of words on it.
293
00:22:59,840 --> 00:23:06,360
And most of those words don't matter.
294
00:23:06,440 --> 00:23:09,160
Like the message you're trying to convey in the alt text is, okay.
295
00:23:09,160 --> 00:23:14,360
It's two panels: on the left is a conversation, on the right there's a preview of the SVG file or something.
296
00:23:14,360 --> 00:23:16,840
My alt text writer normally gets that right.
297
00:23:16,840 --> 00:23:27,320
It's even good at summarizing tables of data, where it will notice that actually what really matters is that Gemini got a score of 57 and Nova got a score of 53.
298
00:23:27,320 --> 00:23:31,320
And so it'll pull those details out and ignore like the release dates and so forth.
299
00:23:31,320 --> 00:23:32,440
That's really cool.
300
00:23:32,840 --> 00:23:39,560
So it's able to prioritize what's on the overall screen, to pick out the key components and key metrics.
301
00:23:39,640 --> 00:23:40,840
It just does it already.
302
00:23:40,840 --> 00:23:42,920
It's just got good taste by default.
303
00:23:42,920 --> 00:23:44,120
And then you can always talk to it.
304
00:23:44,120 --> 00:23:50,120
So it can give you alt text, and you can reply and say, yeah, ignore this column, and then boom, it'll try again.
305
00:23:50,120 --> 00:23:51,160
I love that.
306
00:23:51,160 --> 00:23:56,280
Like I take pride in the alt text on these images, because so many people don't bother.
307
00:23:56,280 --> 00:24:14,680
And I'll often try and drop in little in-jokes, not Easter eggs that would spoil the experience for somebody who's actually using a screen reader, but just little things that make it clear that I'm trying to convey the message that's embedded in the image.
308
00:24:14,680 --> 00:24:16,920
It's really fun, you know.
309
00:24:17,160 --> 00:24:34,040
Simon, I listened to about a one-hour podcast that you did a couple of months ago, an engineering podcast, and you really did a good job of explaining where fine-tuning made sense, which I think you said usually doesn't really make sense.
310
00:24:34,040 --> 00:24:36,600
And you compared it to RAG.
311
00:24:36,600 --> 00:24:43,080
And recently there was another release that OpenAI did to improve fine-tuning.
312
00:24:43,080 --> 00:24:46,240
And where I'm going with this is I'd love for you to explain.
313
00:24:46,320 --> 00:24:54,480
Like, I'm very passionate about coding LLMs, and that the code they produce should be accessible by default.
314
00:24:54,480 --> 00:24:56,640
So there are two aspects to this.
315
00:24:56,640 --> 00:25:03,040
One is: how do you create a model, or a version of the model, that is more accessible?
316
00:25:03,040 --> 00:25:09,200
In other words, you feed it accessible code, right?
317
00:25:09,200 --> 00:25:29,600
Just to try and counteract the bad code that's in the training data, which, as you mentioned on that podcast, you're going to really struggle to do, because there's too much data that might in this case not be accessible.
318
00:25:29,600 --> 00:25:34,080
So, how would you recommend customizing the model?
319
00:25:34,080 --> 00:25:59,520
And then part two of the question is: how would you recommend that I go ahead with my mission of trying to get AI researchers to pay a little more attention to this, perhaps with a benchmark, or by somehow getting them to compete with each other, so that every time there's a new foundation model they would take a look at the accessibility and say, hey, we perform well on this benchmark?
320
00:25:59,600 --> 00:26:06,160
A benchmark is an incredibly good idea; like, an absolutely fantastic idea, to have an accessibility benchmark.
321
00:26:06,160 --> 00:26:11,840
And I feel like it's difficult, like building benchmarks is not a trivial thing, but it's definitely achievable.
322
00:26:11,840 --> 00:26:13,280
There's lots of examples out there.
323
00:26:13,280 --> 00:26:15,040
There are people who could help with that.
324
00:26:15,040 --> 00:26:16,400
I love that idea.
325
00:26:16,400 --> 00:26:28,800
Because yeah, one of the things that's becoming increasingly clear with these models is that a lot of people have this idea that all you do is scrape the entire internet, dump it into the model, and then try and get even more data and dump that in too.
326
00:26:28,800 --> 00:26:32,120
That's not actually an accurate model of how this stuff works.
327
00:26:32,120 --> 00:26:37,800
The more the AI labs experiment, the more it's becoming clear that the quality of the data matters enormously.
328
00:26:37,800 --> 00:26:41,720
Like you really don't just want a random scrape of a bunch of junk.
329
00:26:41,720 --> 00:26:44,760
You want really high quality, well curated data.
330
00:26:44,760 --> 00:26:56,920
There's a lot of work going on right now with synthetic data, where people are artificially creating vast amounts of data and feeding it into their models, because then they know they've fed in, say, a bunch of Python code that passes its unit tests.
331
00:26:56,920 --> 00:26:58,680
And that's just better.
332
00:26:58,680 --> 00:27:07,000
And the flip side of that is that occasionally you hear little hints that the labs are hiring experts just to help with their training data.
333
00:27:07,000 --> 00:27:14,440
Like they will hire expert biologists to help refine and dump in way more high-quality biology data.
334
00:27:14,440 --> 00:27:20,520
There is no reason at all that they couldn't hire expert accessibility engineers to help curate and dump stuff in.
335
00:27:20,520 --> 00:27:23,160
They just need to see that there's demand for that.
336
00:27:23,160 --> 00:27:27,160
So yeah, if there were benchmarks, that would help move the needle on that one.
337
00:27:27,480 --> 00:27:31,960
I've written to all of the foundation model companies and not gotten any response so far.
338
00:27:31,960 --> 00:27:34,200
But anyway, what were you gonna say?
339
00:27:34,200 --> 00:27:37,320
Well, yeah, we should talk a little bit more about fine-tuning.
340
00:27:37,320 --> 00:27:43,880
So everyone who starts working on these models, one of the first things they think is: I wish it knew my stuff, right?
341
00:27:43,880 --> 00:27:47,240
I wish it had been trained on all of the documents within my company.
342
00:27:47,240 --> 00:27:51,960
And so obviously, I should fine-tune a model to train it to understand that information.
343
00:27:51,960 --> 00:27:54,280
That's the thing which mostly doesn't work.
344
00:27:54,280 --> 00:28:01,440
I mean, you can try to do it, but it turns out that dumping a little bit of extra information into a model that's been trained on a giant scrape of the internet doesn't get you far.
345
00:28:01,440 --> 00:28:05,080
There's so much in there already, it's very difficult to bias it in the correct direction.
346
00:28:05,080 --> 00:28:09,480
And my big frustration with fine-tuning is lots of people will sell it to you.
347
00:28:09,480 --> 00:28:12,120
There are very expensive APIs from all of these providers.
348
00:28:12,120 --> 00:28:14,840
There are companies and startups that will help you do this.
349
00:28:15,040 --> 00:28:24,240
But when you ask them for demos, I just want somebody to show me a really clear demo: look, here's the default model, here's the fine-tuned one; the default one sucks at answering this question.
350
00:28:24,240 --> 00:28:26,080
The fine-tuned one is really good at it.
351
00:28:26,080 --> 00:28:33,120
And these demos are really hard to come across, which is one of the reasons I remain skeptical of fine-tuning as a technique.
352
00:28:33,120 --> 00:28:42,800
I think someday it's going to be useful and people will have those demos, but right now I feel like you can spend a lot of time and money and energy and just not get really great results out of it.
353
00:28:42,800 --> 00:28:51,680
The flip side is that the thing that's getting increasingly easy these days is just straight-up prompting using these long-context models.
354
00:28:51,680 --> 00:29:01,840
So just two years ago, most models only accepted up to like 8,000 tokens, which is maybe 20 pages of text.
355
00:29:01,840 --> 00:29:03,760
I'd have to look that up.
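The 20-pages figure roughly checks out. A back-of-the-envelope conversion, assuming common rules of thumb of about 0.75 English words per token and about 300 words per printed page (both approximations, not exact figures):

```python
# Back-of-the-envelope check of "8,000 tokens is about 20 pages".
# Both constants are rough rules of thumb, not exact figures:
# ~0.75 English words per token, ~300 words per printed page.

WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 300

def tokens_to_pages(tokens: int) -> float:
    """Approximate page count for a given token budget."""
    return tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE

print(round(tokens_to_pages(8_000)))      # 8,000 tokens: about 20 pages
print(round(tokens_to_pages(1_000_000)))  # a million tokens: about 2,500 pages
```

By the same arithmetic, a million-token context window works out to thousands of pages, which is why "multiple novels in one go" is plausible.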
356
00:29:03,760 --> 00:29:11,520
Today, almost all of the good models will accept 100,000 tokens, and Gemini takes a million or two million tokens.
357
00:29:11,520 --> 00:29:16,640
That's enough that you can dump multiple novels' worth of information into Gemini in one go.
358
00:29:16,640 --> 00:29:31,200
Which means if you wanted to build a model that was really good at accessibility engineering, find like 10,000 or 50,000 tokens' worth of really high quality code, stick that in the prompt, and it'll pick up from those examples.
359
00:29:31,200 --> 00:29:33,520
Models are amazingly receptive to examples.
360
00:29:33,520 --> 00:29:37,440
Like that's the most powerful way to work with them is to give them examples of what you want.
361
00:29:37,440 --> 00:29:48,080
And honestly, even three or four really good examples of well-written accessible code might be enough to start the models along the right route.
362
00:29:48,080 --> 00:29:50,160
And that's a really cheap experiment to run.
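The few-shot approach being described is just prompt assembly: instructions first, then a handful of curated high-quality examples, then the actual task. A minimal sketch, where the example snippets, the wording, and the function name are all hypothetical rather than from any real curated dataset:

```python
# Sketch of the few-shot approach: instructions, then curated examples of
# accessible HTML, then the task. All examples and wording are hypothetical.

EXAMPLES = [
    '<img src="chart.png" alt="Bar chart: revenue grew 40% from 2022 to 2023">',
    '<button aria-expanded="false" aria-controls="menu">Open menu</button>',
    '<label for="email">Email address</label> <input id="email" type="email">',
]

def build_few_shot_prompt(task: str) -> str:
    """Assemble a prompt that leads with accessible-code examples."""
    parts = [
        "You write HTML that follows accessibility best practices.",
        "Imitate the style of these examples:",
    ]
    for i, example in enumerate(EXAMPLES, 1):
        parts.append(f"Example {i}:\n{example}")
    parts.append(f"Task: {task}")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt("Build a signup form with two fields.")
```

The resulting string would be sent as the prompt to whichever model you're testing, which is what makes this such a cheap experiment to iterate on.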
363
00:29:50,160 --> 00:30:02,280
There are also these prompt caching mechanisms that a lot of the providers have now, where if you send the exact same prompt prefix, it costs way less money on the second and third and fourth goes.
364
00:29:59,840 --> 00:30:03,240
And that's really useful as well.
365
00:30:03,320 --> 00:30:09,400
If you're gonna have a long prompt full of examples, you pay money up front for the first call, and from then on it gets cheaper.
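The caching economics can be illustrated with toy arithmetic. The per-token price and the discount on cached prefix tokens below are made-up placeholders, not any provider's real rates:

```python
# Toy arithmetic for prompt caching. The per-token price and the discount on
# cached prefix tokens are made-up placeholders, not real provider rates.

PRICE_PER_TOKEN = 0.000003   # hypothetical dollars per input token
CACHED_DISCOUNT = 0.9        # hypothetical: cached prefix tokens cost 90% less

def request_cost(prefix_tokens: int, new_tokens: int, prefix_cached: bool) -> float:
    """Cost of one request: a long shared prefix plus some fresh tokens."""
    prefix_rate = PRICE_PER_TOKEN * (1 - CACHED_DISCOUNT) if prefix_cached else PRICE_PER_TOKEN
    return prefix_tokens * prefix_rate + new_tokens * PRICE_PER_TOKEN

# First call pays full price for a 50,000-token prompt full of examples;
# later calls hit the cache for that prefix and cost a fraction as much.
first = request_cost(50_000, 200, prefix_cached=False)
later = request_cost(50_000, 200, prefix_cached=True)
```

The bigger the shared example-filled prefix is relative to the fresh part of each request, the more of the bill the cache discount covers.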
366
00:30:09,400 --> 00:30:10,520
I think that's the way to do it.
367
00:30:10,520 --> 00:30:12,680
It's also really quick to iterate on these.
368
00:30:12,680 --> 00:30:17,560
Like you build a really big prompt, try it, then you tweak it and try it again and see if you get better results.
369
00:30:17,560 --> 00:30:20,840
So I think that's the most promising avenue right now.
370
00:30:20,920 --> 00:30:27,000
Now I finally understand the caching, because I didn't totally get it before: it's the system prompt.
371
00:30:27,000 --> 00:30:32,200
That makes so much sense, because that's going to be a hit every single time.
372
00:30:32,200 --> 00:30:35,400
I mean, it's also common prefixes.
373
00:30:35,400 --> 00:30:43,720
So you might have a system prompt that says you're a useful robot that answers questions based on this document, and then a regular prompt that's the document, and then questions after that.
374
00:30:43,720 --> 00:30:46,920
And as long as the document stays the same, you'll get that benefit.
375
00:30:46,920 --> 00:30:55,080
Also, if you're thinking about doing chat interfaces, the way chats work is each time you say something new, it replays the previous conversation.
376
00:30:55,080 --> 00:30:57,320
And again, that's where caching kicks in.
377
00:30:57,320 --> 00:31:02,440
So if the caching is happening, the subsequent posts in the conversation save a lot of money.
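The replay behavior being described is why chat plus caching works so well: each new turn re-sends the whole prior conversation, so the transcript forms a growing common prefix. A minimal sketch, using an illustrative message format rather than any specific provider's API:

```python
# Why chat plus caching works: each new turn replays the whole prior
# conversation, so the transcript is a growing common prefix that a
# provider-side cache can recognise. The message shape is illustrative.

def next_request(history: list[dict], user_message: str) -> list[dict]:
    """Build the messages for the next turn by replaying the history."""
    return history + [{"role": "user", "content": user_message}]

history = [{"role": "system", "content": "You are a helpful assistant."}]
turn1 = next_request(history, "Summarise this document.")
# The model's reply is appended, then the next turn replays everything:
history = turn1 + [{"role": "assistant", "content": "Here is a summary..."}]
turn2 = next_request(history, "Now shorten it to one sentence.")

# turn2 begins with turn1's messages verbatim: the cacheable prefix.
assert turn2[:len(turn1)] == turn1
```

Because every request starts with the previous request verbatim, the cache can cover everything except the newest messages.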
378
00:31:02,440 --> 00:31:03,240
That makes sense.
379
00:31:03,240 --> 00:31:04,280
That makes sense.
380
00:31:04,280 --> 00:31:13,720
And then for your approach of these really long prompts with examples in them, how would you contrast that with RAG as an approach?
381
00:31:13,720 --> 00:31:15,800
I think it's the same kind of idea.
382
00:31:15,800 --> 00:31:19,640
So RAG stands for retrieval-augmented generation.
383
00:31:19,640 --> 00:31:39,240
And the first version of it was a trick where you ask the model a question, and rather than just answering, it goes and looks in your big corpus of documents for anything that looks roughly similar. Basically, it does a search, gets the results, sticks those into the prompt hidden from you, and then tries to answer the question.
384
00:31:39,240 --> 00:31:40,840
It's a really effective trick.
385
00:31:40,840 --> 00:31:46,800
The answer to "how do I teach the model about my company?" isn't fine-tune a model.
386
00:31:46,800 --> 00:31:50,080
It's to set up a RAG system that can run searches against things.
387
00:31:50,080 --> 00:31:57,440
And really the lesson from that is most of prompt engineering, most of building on top of LLMs is thinking about the context.
388
00:31:57,440 --> 00:32:06,320
It's thinking: okay, what is the best thing I can cram into those 8,000 or 100,000 or a million tokens to increase the chance that I get a good answer?
389
00:32:06,320 --> 00:32:26,400
And yeah, the examples thing is almost like a fixed version of RAG. You could have a system where the user says, I want to build an interface that does this, and you effectively do a RAG search against a hundred examples, find the five most relevant pieces of example code, bung those in the prompt, and then answer the question that way.
390
00:32:26,400 --> 00:32:27,440
And that would work really well.
391
00:32:27,440 --> 00:32:29,440
That's like a very effective technique.
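That search-then-prompt loop can be sketched in a few lines. This toy version scores documents by word overlap instead of embeddings, and the corpus and wording are invented purely for illustration:

```python
# Toy sketch of the RAG pattern just described: score documents by word
# overlap with the question, put the best matches into the prompt hidden
# from the user, then ask the model. Real systems usually use embedding
# search, but the overall shape is the same. Corpus and wording are invented.

CORPUS = {
    "billing": "Invoices are sent on the 1st of each month via email.",
    "support": "Support hours are 9am to 5pm Pacific, Monday to Friday.",
    "refunds": "Refunds are processed within 5 business days of a request.",
}

def _words(text: str) -> set[str]:
    """Lowercased words with trailing punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()}

def search(question: str, k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the question."""
    return sorted(CORPUS.values(),
                  key=lambda doc: len(_words(question) & _words(doc)),
                  reverse=True)[:k]

def build_prompt(question: str) -> str:
    """Hide the retrieved context inside the prompt, RAG-style."""
    context = "\n".join(search(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("How quickly are refunds processed?")
```

The assembled prompt, context plus question, is what actually gets sent to the model; the user only ever sees their question and the answer.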
392
00:32:29,440 --> 00:32:36,240
You touched upon it a few minutes ago, how AI has actually got the potential to generate jobs.
393
00:32:36,240 --> 00:32:42,800
As you well know, there's a general concern out there right now about AI replacing many jobs.
394
00:32:42,800 --> 00:32:44,000
What's your response to that?
395
00:32:44,000 --> 00:32:48,160
I know it's a very general question, but it is one of the larger concerns out there right now.
396
00:32:48,160 --> 00:32:49,520
And it deservedly should be.
397
00:32:49,520 --> 00:32:52,640
This is a very disruptive technology.
398
00:32:52,640 --> 00:32:59,760
There are jobs that will be damaged by this, and there are jobs that will be enhanced by this.
399
00:32:59,760 --> 00:33:05,680
And so there's the pessimistic and the optimistic way to look at this.
400
00:33:05,680 --> 00:33:13,520
And I can actually focus on this as a software engineer because it turns out writing code is one of the things these models are best at.
401
00:33:13,520 --> 00:33:24,160
It's interesting: the great thing about code is that it's got fact-checking built in, because if a model spits out code and you run it and get an error, then obviously there's a mistake.
402
00:33:24,160 --> 00:33:32,200
If it spits out an essay, you need to fact check every line of that essay, and that's a much harder process than just trying to run the compiler or whatever.
403
00:33:32,200 --> 00:33:35,080
So a lot of software engineers are terrified of this.
404
00:33:35,080 --> 00:33:41,400
They're like, hey, this is a technology where anyone can get it to write code now.
405
00:33:41,400 --> 00:33:43,880
My 20 years of experience are no longer valuable.
406
00:33:43,880 --> 00:33:50,280
I need to find a new career in plumbing or something which won't be replaced by AI.
407
00:33:50,280 --> 00:34:01,080
My perspective, as a developer who's been using these systems on a daily basis for a couple of years now, is that they enhance my value.
408
00:34:01,080 --> 00:34:08,120
Like I am so much more competent and capable as a developer because I've got these tools assisting me.
409
00:34:08,120 --> 00:34:14,840
I can write code in dozens of new programming languages that I never learned before, but I still get to benefit from my 20 years of experience.
410
00:34:14,840 --> 00:34:35,080
Take somebody off the street who's never written any code before and ask them to build an iPhone app with ChatGPT, and they are going to run into so many pitfalls, because programming isn't just about whether you can write code; it's about thinking through the problems, understanding what's possible and what's not, understanding how to QA, knowing what good code is, having good taste.
411
00:34:35,080 --> 00:34:38,120
There's so much depth to what we do as software engineers.
412
00:34:38,120 --> 00:34:48,360
And I've said before that generative AI probably gives me a two-to-five-times productivity boost on the part of my job that involves typing code into a laptop.
413
00:34:48,360 --> 00:34:50,120
But that's only 10% of what I do.
414
00:34:50,120 --> 00:34:56,360
Like as a software engineer, most of my time isn't actually spent with the typing of the code, it's all of those other activities.
415
00:34:56,360 --> 00:34:59,000
The AI systems help with those other activities too.
416
00:34:59,000 --> 00:35:07,640
They can help me think through architectural decisions and research library options and so on, but I still have to have the agency to understand what I'm doing.
417
00:35:07,640 --> 00:35:10,920
So, as a software engineer, I don't feel threatened.
418
00:35:10,920 --> 00:35:20,480
My most optimistic view of this is that the cost of developing software goes down, because an engineer like myself can be more ambitious and can take on more things.
419
00:35:20,480 --> 00:35:22,880
As a result, demand for software goes up.
420
00:35:22,880 --> 00:35:38,720
Because if you're a company that previously would never have dreamed of building, like, a custom CRM for your industry, because it would have taken 20 engineers a year before you got any results, if it now takes four engineers three months to get results, maybe you're in the market for software engineers in a way that you weren't before.
421
00:35:38,720 --> 00:35:40,880
But that's the software engineering side of things.
422
00:35:40,880 --> 00:35:43,520
That's my rose-tinted-glasses version.
423
00:35:43,520 --> 00:35:49,280
There are other industries where this stuff is just massively disruptive, and I don't think there's a happy ending.
424
00:35:49,280 --> 00:35:52,000
My favorite example there is language translation, right?
425
00:35:52,000 --> 00:36:14,400
If you are a human being who earns money translating text from one language into another, the models are not as good at it as you, but they are good enough, and so vanishingly inexpensive, that a lot of translators now are finding that their job has changed from "translate this from one language to another" to "here is an AI-generated translation, fix it up", and you get paid less for that.
426
00:36:14,400 --> 00:36:15,280
And that sucks.
427
00:36:15,280 --> 00:36:24,800
That's an entire industry of people where, even before ChatGPT, Google Translate about five years ago got good enough that the industry took a massive hit.
428
00:36:24,800 --> 00:36:29,360
The question then is: how many other examples are there like that?
429
00:36:29,360 --> 00:36:46,480
Professional illustrators who worked at the lower end of the scale, like being commissioned to illustrate blog posts, are having a terrible time, because you can now prompt an image generation model and get an illustration that's good enough for your blog post, right?
430
00:36:46,480 --> 00:36:54,400
It's nowhere near what a professional illustrator could do, but it's really taking a chunk out of that end of the market.
431
00:36:54,400 --> 00:37:00,440
I think in the movie industry, the group that are most affected, to my understanding, is concept artists.
432
00:37:00,760 --> 00:37:09,320
It used to be that if a director was dreaming up a sci-fi scenario, they paid artists to come up with those initial concepts to help them think it through.
433
00:37:09,320 --> 00:37:12,840
That's the kind of thing which maybe they're turning to generative AI for.
434
00:37:12,840 --> 00:37:20,520
So yeah, I am not gonna say that there aren't huge negative implications for a whole bunch of people around this stuff.
435
00:37:20,520 --> 00:37:33,480
And ideally this shakes out to a point where maybe concept artists are taking on more ambitious projects and find a new niche in the market that pays well and so forth, but I can't guarantee that's going to happen.
436
00:37:33,480 --> 00:37:34,920
And that sucks, you know.
437
00:37:34,920 --> 00:37:47,080
It sucks. You talk about cars and horses, where the automobile came along and caused mass unemployment among people in the horse industry, and created loads more jobs.
438
00:37:47,080 --> 00:37:49,560
Like now there are more jobs as drivers and people making cars.
439
00:37:49,560 --> 00:37:53,720
It still sucks if you're a professional farrier working on horses, you know.
440
00:37:53,720 --> 00:37:59,560
Being disrupted by technology that creates new jobs isn't great if you're one of those jobs that gets disrupted.
441
00:37:59,640 --> 00:38:00,600
It's interesting.
442
00:38:00,600 --> 00:38:08,440
Justine Bateman, who in my youth was Mallory on Family Ties.
443
00:38:08,440 --> 00:38:11,880
She has a computer science degree, I believe.
444
00:38:11,880 --> 00:38:19,720
And she has spoken out as an actress and a filmmaker; she has been railing against AI in a really big way.
445
00:38:19,720 --> 00:38:31,800
And it's kind of interesting to see a techie go against it, because even though it is awful, the jobs that are going to be lost, there's nothing we're gonna do to stop it.
446
00:38:31,800 --> 00:38:38,560
And it will at least take it away from the big names in Hollywood and democratize it.
447
00:38:38,560 --> 00:38:42,640
So a lot of people are going to be able to make movies for a lot cheaper.
448
00:38:42,640 --> 00:38:43,160
Right?
449
00:38:43,160 --> 00:38:43,480
Right.
450
00:38:43,480 --> 00:38:45,680
It's like the thing with the translators.
451
00:38:45,920 --> 00:38:46,880
There's a trade-off here.
452
00:38:44,920 --> 00:38:51,760
On the one hand, it really sucks if that was your profession and it's been impacted in that way.
453
00:38:51,920 --> 00:38:59,200
But if we now have technology, that means a billion people can have conversations who never could have afforded a human translator before.
454
00:38:59,200 --> 00:39:08,480
I mean, I don't like to make straight-up statements about one thing being worth more than another, but that's a pretty compelling trade-off at that point, you know.
455
00:39:08,480 --> 00:39:16,960
And yeah, so much of filmmaking is expensive and slow-moving and frustrating.
456
00:39:16,960 --> 00:39:23,360
As a programmer, I love when frustrating parts of my job get sped up.
457
00:39:23,360 --> 00:39:36,800
But does the economy shake out so that people who were doing skilled but frustrating aspects of their work still get employed, and earn more money doing more creative things?
458
00:39:36,800 --> 00:39:38,080
God, I hope so.
459
00:39:38,080 --> 00:39:38,960
But I don't know.
460
00:39:38,960 --> 00:39:41,920
I'm not in that industry, I'm not an economist.
461
00:39:41,920 --> 00:39:46,880
I can't say with any certainty that it's gonna play out in a good way.
462
00:39:47,040 --> 00:39:48,240
That's my fear too.
463
00:39:48,240 --> 00:39:50,560
And here's another angle for you.
464
00:39:50,560 --> 00:40:10,240
So yesterday, the sister of a friend of mine got scammed out of $20,000; they kept her on the phone the entire time as she went from Bitcoin ATM to Bitcoin ATM, throwing in some money here, some money there.
465
00:40:10,240 --> 00:40:14,560
Otherwise she would get arrested, and she totally fell for it.
466
00:40:14,560 --> 00:40:41,880
And it hit me that it won't be long before you're going to see scammers take the voice of somebody you know. A lot of people in AI have talked about this, and the proposed solution is that everybody should create a safe word. But then it hit me that all you have to do to get that safe word is call the person whose safe word you need and whose voice you're going to grab.
467
00:40:41,880 --> 00:40:42,840
And you know what I mean?
468
00:40:42,840 --> 00:40:53,240
Like if you have, let's say, two siblings, you get both of their voices, you use ElevenLabs or something to emulate them, and then you play man-in-the-middle, right?
469
00:40:53,240 --> 00:40:56,600
And you grab that safe word and then you hand it over.
470
00:40:56,600 --> 00:41:07,080
It sort of feels like you need a double safe word; you really have to spend some time to get this right, and most people are not going to be able to handle that.
471
00:41:07,080 --> 00:41:12,920
Yeah, I mean, bad people can use this technology to do a lot of bad things.
472
00:41:12,920 --> 00:41:16,680
And in most of these cases, there's always an argument they could have done it before.
473
00:41:16,680 --> 00:41:21,320
Like if you're talented at impersonating voices, you could have pulled off that scam before.
474
00:41:21,320 --> 00:41:29,000
But not a lot of people are talented at impersonating voices, and that extra friction meant it wasn't necessarily a widespread scam.
475
00:41:29,000 --> 00:41:33,080
The quality of voice cloning and so forth these days is shockingly good.
476
00:41:33,080 --> 00:41:41,880
I recently found out that the really good OpenAI voices can be trained on like a 15-second audio sample.
477
00:41:41,880 --> 00:41:45,720
They pay a professional voice actor for 15 seconds of their voice.
478
00:41:45,720 --> 00:41:52,040
And they've deliberately not made that capability available to everyone else, but that's just the way the models work.
479
00:41:52,040 --> 00:42:03,400
So yeah, one of the other things that scares me about this stuff is: as a society, are we ready to understand and cope with this?
480
00:42:03,400 --> 00:42:07,560
And if not, how quickly can we get up to speed?
481
00:42:07,560 --> 00:42:12,280
The one that worries me the most isn't voice cloning, it's the romance scams, right?
482
00:42:12,280 --> 00:42:34,720
The thing where you get a text message out of the blue, you reply, and they try to form a relationship with you. Those romance scams have been run out of effectively sweatshops in places like Indonesia and the Philippines for years, where they get people with good written English skills and effectively force them to pull these scams on people. It's even cheaper if you can get a generative AI model to do that.
483
00:42:34,720 --> 00:42:59,360
And yeah, I think that's just going to be a growing problem: scams are going to become more prevalent because they're cheaper to run. It's always scale. So many of these problems come down to the fact that the bad thing was possible before, but now it's possible at a hundred times the volume. And how do we fight back against that? I don't know.
484
00:42:59,360 --> 00:43:12,480
Yeah, I watched a segment about the romance scams, it was like 16 minutes on some show last week, and it was exactly about that, how it's growing and growing globally. It's just so unfair, it really is.
485
00:43:13,360 --> 00:43:23,200
Many of our listeners on today's podcast will be interested to get a little more insight into your Django story, how you created it.
486
00:43:23,200 --> 00:43:53,800
I know we don't have a lot of time, but could you give us an overview of how it was initiated, your journey there, and where it's at today? Absolutely. So this is going back a long time, 21 years ago, to 2003. I was a university student and I had a blog, and in 2003 there were only about a hundred people with blogs talking about web development, so we all knew each other. And this chap, Adrian Holovaty, was a journalist and web developer working in Kansas, and on his blog he put up a job ad.
487
00:43:53,800 --> 00:43:57,480
And my university offered a year-in-industry placement program.
488
00:43:57,480 --> 00:44:01,960
So you could take a year off from university, go and work somewhere, and then come back again.
489
00:44:01,960 --> 00:44:04,600
And it meant that you could get a student visa.
490
00:44:04,600 --> 00:44:11,960
So I got in touch with Adrian and said, hey, would this work as a year-long paid internship kind of arrangement?
491
00:44:11,960 --> 00:44:12,680
And it did.
492
00:44:12,680 --> 00:44:14,600
So I moved out, moved from London.
493
00:44:14,600 --> 00:44:20,840
I moved from England to Lawrence, Kansas, and spent a year working at this tiny little local newspaper.
494
00:44:20,840 --> 00:44:28,440
And yeah, Adrian and I were both PHP developers who wanted to use Python, and none of the Python web frameworks at the time quite did what we wanted.
495
00:44:28,440 --> 00:44:36,440
So we ended up building our own little thin abstraction layer over the mod_python Apache module to build newspaper websites with.
496
00:44:36,440 --> 00:44:39,960
And honestly, we had no idea that it was ever going to be an open source thing.
497
00:44:39,960 --> 00:44:43,960
We thought it was the CMS that we were using to build these newspaper websites.
498
00:44:43,960 --> 00:44:45,480
But um, I was there for a year.
499
00:44:45,480 --> 00:44:56,600
I left, and then six months after I left, they got the go-ahead from the newspaper's publishers to release this as open source, partly because Ruby on Rails had just come out and was taking the world by storm.
500
00:44:56,600 --> 00:45:01,560
And they were looking at their thing and saying, hey, we've got a thing that looks a bit like Ruby on Rails, but it's for Python.
501
00:45:01,560 --> 00:45:04,600
This company, 37signals, are doing well out of their release.
502
00:45:04,600 --> 00:45:06,680
We should go ahead and put that out into the world.
503
00:45:06,680 --> 00:45:07,320
And they did.
504
00:45:07,320 --> 00:45:13,240
And they called it Django because Adrian Holovaty is a huge Django Reinhardt gypsy jazz fan.
505
00:45:13,240 --> 00:45:18,600
He actually has YouTube videos where he does gypsy jazz guitar covers of different things.
506
00:45:18,840 --> 00:45:20,600
He's a very talented musician.
507
00:45:20,600 --> 00:45:26,360
Yeah, and that put Django out into the world, and it just grew and grew and grew.
508
00:45:26,360 --> 00:45:29,240
And it's been, yeah, it's been out for nearly 20 years now.
509
00:45:29,240 --> 00:45:35,400
We're planning a 20th birthday party for it, hopefully next year, which would be really fun.
510
00:45:35,400 --> 00:45:41,000
But yeah, so I was involved at the very start, and then tangentially involved after that.
511
00:45:41,000 --> 00:45:49,040
I haven't been a core contributing developer for a very long time, but I'd throw ideas over the fence and occasionally knock up a few patches and so forth.
512
00:45:49,040 --> 00:45:50,480
Um, it's just been amazing.
513
00:45:50,480 --> 00:45:57,280
Like watching that grow, watching the community around it grow around the world and seeing all of these things that people have built on top of it.
514
00:45:57,280 --> 00:46:00,480
Yeah, I'm really excited to see how that's worked out.
515
00:46:00,480 --> 00:46:04,720
The engineering quality of uh Django is just top notch.
516
00:46:04,720 --> 00:46:08,480
The only ORM I ever liked, I always hate ORMs.
517
00:46:08,480 --> 00:46:09,920
It's like just goes straight.
518
00:46:10,400 --> 00:46:11,920
And that was nothing to do with me.
519
00:46:11,920 --> 00:46:20,960
The ORM, so, when I was working on it, Adrian built a code generator, because database code is really repetitive.
520
00:46:20,960 --> 00:46:25,280
So he built a thing that generated Python code for you to talk to your models.
521
00:46:25,280 --> 00:46:36,560
And then Malcolm Tredinnick was the person who joined the Django community and helped turn that into what we have today, an incredible piece of work.
522
00:46:36,560 --> 00:46:37,680
It's such a good design.
523
00:46:37,680 --> 00:46:42,000
And yeah, for years after the ORM came out, I still wasn't very good at SQL.
524
00:46:42,000 --> 00:46:44,000
I just relied on what the Django ORM did.
525
00:46:44,000 --> 00:46:49,920
It's only in the past maybe five years that I've got super confident in using SQL for these things instead.
526
00:46:49,920 --> 00:46:51,200
Yeah, interesting.
527
00:46:51,200 --> 00:46:58,560
Well, speaking of data, you have another project called Datasette, which I would love for you to explain.
528
00:46:58,560 --> 00:47:01,680
What does it do that no other database does?
529
00:47:01,680 --> 00:47:06,400
Like what problem is it that you're trying to solve, and where are you going with Datasette?
530
00:47:06,400 --> 00:47:08,320
This is a very interesting question.
531
00:47:08,320 --> 00:47:14,160
And I wish I had the one-sentence answer, but I don't, so I'll have to give you a few paragraphs.
532
00:47:14,160 --> 00:47:23,440
So, Datasette is an open source tool I've been building for nearly seven years now, and it's a Python web application for exploring, analyzing, and publishing data.
533
00:47:23,440 --> 00:47:31,080
The initial idea actually came out of work I'd done at newspapers, where, when you're a newspaper, you often publish data-driven stories.
534
00:47:31,160 --> 00:47:38,200
You'll have a story about the number of hospital beds currently available across the state or whatever.
535
00:47:38,200 --> 00:47:39,960
And those stories come with data.
536
00:47:39,960 --> 00:47:45,080
And I wanted to start encouraging newspapers to publish the data behind the stories.
537
00:47:45,080 --> 00:47:59,000
This is something we started doing when I worked at The Guardian back in 2010, 2011, where the idea was you'd publish a story and then we'd put out the data behind the story, and we'd just publish it as a Google spreadsheet.
538
00:47:59,000 --> 00:48:03,800
We'd have a Google spreadsheet with these are the raw numbers that went into this piece of reporting.
539
00:48:03,800 --> 00:48:10,120
I always felt there should be a better thing for that than a Google spreadsheet, something a little bit more open, with more capabilities.
540
00:48:10,120 --> 00:48:12,280
And so the first version of Datasette was just that.
541
00:48:12,280 --> 00:48:29,720
It was like, okay, take a bunch of data, stick it in a SQLite database, because SQLite doesn't require a separate server, it's just a file, and then deploy an application that gives you a little interface, like a web UI on top of this database where you can click around through it, and a JSON API so that you can start building things against it as well.
542
00:48:29,720 --> 00:48:33,880
And then other features like the ability to export CSV versions and so forth.
543
00:48:33,880 --> 00:48:35,160
So that was the initial idea.
544
00:48:35,160 --> 00:48:38,600
It was what's the best possible way of publishing data online?
545
00:48:38,600 --> 00:48:41,480
Because to my surprise, there weren't really any solutions to that.
546
00:48:41,480 --> 00:48:48,360
Like if you want to publish a million rows of data online, your options are basically stick a CSV file in an S3 bucket.
547
00:48:48,360 --> 00:48:49,560
And how's that useful?
548
00:48:49,560 --> 00:48:56,600
Like, it's great for CSV nerds, but it's not exactly something the general public can engage with.
549
00:48:56,600 --> 00:49:03,080
So that was the initial idea, and then I added plugins, and I realized that my inspiration there was actually WordPress, right?
550
00:49:03,080 --> 00:49:12,360
Where WordPress is a perfectly decent blogging engine with tens of thousands of plugins that mean any publishing problem you have, you can solve with WordPress plus some plugins.
551
00:49:12,360 --> 00:49:17,120
And I thought, okay, what if that was the answer for data exploration and analysis projects?
552
00:49:17,120 --> 00:49:29,440
Like any project you have that involves data, which is basically everything, if you could take dataset plus specific plugins for visualizations that you want to run or export formats, that would be a really cool thing to build.
553
00:49:29,440 --> 00:49:32,800
So that's how the project's been evolving over the past few years.
554
00:49:33,200 --> 00:49:45,440
And so there are, I think, 150 plugins now for things like GeoJSON export, or visualizing everything on a map, and I've started building plugins for editing data as well.
555
00:49:45,440 --> 00:49:56,880
So you can actually use Datasette as a kind of Airtable alternative, where you're loading in your data, making edits to it, running transformations against it, doing geocoding operations, all of that kind of thing.
556
00:49:56,880 --> 00:50:05,600
And I love this project because, thanks to plugins, if there's anything in the world that I think is interesting, I can justify it as a plugin for Datasette.
557
00:50:05,600 --> 00:50:18,160
I can be like, okay, this week I'm into GIS and I'm gonna do geospatial plugins, and then next week I'm doing some weird AI stuff, and I could write plugins for Datasette that use language models to generate SQL queries or whatever it is.
558
00:50:18,160 --> 00:50:21,840
So effectively, this is the project I want to work on for the rest of my life.
559
00:50:21,840 --> 00:50:24,560
If I'm gonna do that, it needs to earn its keep.
560
00:50:24,560 --> 00:50:32,400
So for the past year and a bit, I've been putting together the business model side of it, which is effectively the WordPress thing again.
561
00:50:32,400 --> 00:50:34,880
It's the hosted SaaS version of Datasette.
562
00:50:34,880 --> 00:50:36,800
So Datasette is completely open source.
563
00:50:36,800 --> 00:50:42,160
If you are comfortable running an Ubuntu virtual machine somewhere, go ahead and install it and run it.
564
00:50:42,160 --> 00:50:52,160
Or you'll be able to pay me a sum a month and I will run a private Datasette instance for you and your team, with all of the plugins and the integrations and API stuff and all of that kind of thing.
565
00:50:52,160 --> 00:51:02,280
And I'm working on it at the moment; I need to put the final touches on the billing side so I can actually turn on self-service payments for it, but it's getting there.
566
00:51:02,280 --> 00:51:06,040
That's an exciting one; that's called Datasette Cloud.
567
00:50:59,600 --> 00:51:12,520
It's datasette.cloud, as opposed to Datasette, which is datasette.io, spelt like the word cassette.
568
00:51:12,520 --> 00:51:15,960
So it's d-a-t-a-c-a-s-e-t-t-e.
569
00:51:15,960 --> 00:51:21,880
But yeah, so that's something I'm spending a lot of time on at the moment: the commercial side of the open source project.
570
00:51:21,880 --> 00:51:23,240
Did you say C?
571
00:51:23,240 --> 00:51:24,360
Do you mean S?
572
00:51:24,360 --> 00:51:25,160
I did mean S.
573
00:51:25,160 --> 00:51:26,520
D-A-T-A-S-E-T-T-E.
574
00:51:27,960 --> 00:51:28,680
Thank you for that.
575
00:51:28,680 --> 00:51:29,320
Yeah.
576
00:51:29,400 --> 00:51:35,080
So, Simon, obviously the engagement between developers and accessibility community is key.
577
00:51:35,080 --> 00:51:39,560
It's key for the ongoing progress, both from a conformance and usability perspective.
578
00:51:39,560 --> 00:51:43,480
As a lead developer yourself, you know, how can we bridge that gap?
579
00:51:43,640 --> 00:51:47,480
What are the specific areas we can assist with or share with you?
580
00:51:47,560 --> 00:52:05,160
So, the thing I find most difficult about building accessible sites at the moment: there are the WCAG guidelines, and there are Chrome extensions that do audits and so forth, and I just don't trust them, because just because my site passed an audit, that doesn't mean it's going to work in a screen reader.
581
00:52:05,160 --> 00:52:17,080
Like, especially with modern JavaScript things, where if I'm building an interface where I click through tabs and parts of the page update, how should I make sure that screen readers are notified about the correct piece of the page?
582
00:52:17,080 --> 00:52:20,520
I don't just want to be told that we meet WCAG guidelines, I want demos.
583
00:52:20,520 --> 00:52:29,240
The thing that's missing for me is, I would like almost like a cookbook of accessibility patterns, where it's like, here's how to do modal dialogues and here's how to do tab switching.
584
00:52:29,240 --> 00:52:33,880
And for each one, I want a video of what a screen reader does with that demo.
585
00:52:33,880 --> 00:52:35,360
Or actually, I want multiple videos.
586
00:52:35,360 --> 00:52:42,920
Like show me for the two or three most popular screen readers, how do they behave when you lay out your modal dialogues or your tab interface?
587
00:52:42,920 --> 00:52:47,040
The one that's really relevant right now is chat interfaces, right?
588
00:52:47,040 --> 00:52:52,000
LLMs do that streaming text thing where you ask them a question and everything comes back a word at a time.
589
00:52:52,000 --> 00:52:55,280
How do I make that accessible to a screen reader?
590
00:52:55,280 --> 00:52:57,040
Like there must be patterns.
591
00:52:57,040 --> 00:53:05,760
What those patterns are, it's very difficult for me to find examples of those put together by experts, with proof in video form that they do the right thing.
592
00:53:05,760 --> 00:53:08,800
I would love to see more of that kind of documentation.
593
00:53:08,800 --> 00:53:14,800
Yeah, I think from our perspective, you know, the accessibility community shares as much as it possibly can across the board.
594
00:53:14,800 --> 00:53:17,680
And Joe has played a phenomenal part in that.
595
00:53:17,680 --> 00:53:22,240
But as you well know, different tech stacks; there are commonalities, to your point.
596
00:53:22,240 --> 00:53:30,320
And I do believe there should be some type of base documentation, lessons learned, and again, the videos would go a long way.
597
00:53:30,320 --> 00:53:36,000
We've done it from our perspective, from a ServiceNow perspective, and we're more than willing to share what we've built.
598
00:53:36,000 --> 00:53:41,200
But a more open source, if you will, access to that type of content could go a long way.
599
00:53:41,280 --> 00:53:44,480
My technical preference here: I like HTML and JavaScript.
600
00:53:44,480 --> 00:53:47,040
I don't want React and I don't want Vue and so forth.
601
00:53:47,040 --> 00:53:55,440
I just want an accessible HTML and JavaScript demo of how to do five or six of these common interactions.
602
00:53:55,440 --> 00:54:01,040
And that's enough, like then if somebody's using React, they should be able to port that to React, you know.
603
00:54:01,040 --> 00:54:04,320
But having those examples, the thing I want is proof.
604
00:54:04,320 --> 00:54:11,120
Like every time a new JavaScript library comes out, one of the first things I do is check to see if they've got any documentation about their accessibility.
605
00:54:11,120 --> 00:54:12,400
Most of them don't at all.
606
00:54:12,400 --> 00:54:17,040
Some of them will say, we have the right ARIA tags.
607
00:54:17,040 --> 00:54:18,080
That's still not enough for me.
608
00:54:18,080 --> 00:54:25,680
I won't believe that it's accessible until I see video evidence that it's been tested and shown to work with a screen reader.
609
00:54:25,680 --> 00:54:31,560
And I feel like Adobe do have some of this documentation for some of their accessible React things.
610
00:54:31,960 --> 00:54:45,000
Yeah, so what I'd recommend you look at is Charlie Triplett, who's one of the accessibility architects that helps create what we're working on at my company.
611
00:54:45,000 --> 00:54:48,040
He wrote The Book on Accessibility.
612
00:54:48,040 --> 00:54:58,040
So you can just go to thebookonaccessibility.com, and he was working for T-Mobile, and for T-Mobile he built Magenta A11y.
613
00:54:58,040 --> 00:55:00,920
So MagentaA11y.com.
614
00:55:01,240 --> 00:55:16,440
And it's got acceptance criteria, it's got sample HTML, it's got videos of screen readers and assistive technology, and I think it's gonna provide a lot of the solutions that you're looking for.
615
00:55:16,440 --> 00:55:17,560
That's really exciting.
616
00:55:17,560 --> 00:55:25,320
Like just that, for me, the thing that would make my life easier as an engineer who cares about this: it's demos.
617
00:55:25,320 --> 00:55:34,840
It's straight-up demos that are proven to work, with explanations of why this is the right pattern for doing a tabbed interface or replacing part of the screen.
618
00:55:34,840 --> 00:55:40,120
The other problem I've got at the moment, which is a really interesting one, is tables, just actual tables of data.
619
00:55:40,120 --> 00:55:44,280
Because my software, Datasette, presents tables of data.
620
00:55:44,280 --> 00:55:46,520
I have no control over what those tables are.
621
00:55:46,520 --> 00:55:51,240
Often it'll be like a 40 column table with like 2,000 rows in it.
622
00:55:51,240 --> 00:55:59,000
I don't care how good my table markup is, that's gonna suck in a screen reader if you're trying to make sense of like 2,000 rows of data with 40 columns.
623
00:55:59,000 --> 00:56:06,840
Um, I had a great conversation actually with Ed Summers about this, where we talked about how this is a great opportunity for chat-based interfaces, right?
624
00:56:06,840 --> 00:56:15,920
If you've got that table and you can then say to it, what's the school with the highest number of children from this particular background?
625
00:56:16,080 --> 00:56:17,280
And it gives you an answer.
626
00:56:17,280 --> 00:56:18,160
That's just better.
627
00:56:18,160 --> 00:56:19,920
That's just a better way of interacting.
628
00:56:19,920 --> 00:56:28,080
So I'm really excited, for my own Datasette tool, about what plugins I can build that use language models to give you that sort of conversational interface.
629
00:56:28,080 --> 00:56:29,760
Because, I'll admit, I thought that was a gimmick.
630
00:56:29,760 --> 00:56:33,680
I thought having a conversation with the table feels like that's kind of fun, but is it really useful?
631
00:56:33,680 --> 00:56:36,560
From an accessibility perspective, it's amazingly useful.
632
00:56:36,560 --> 00:56:37,760
Completely, completely agree.
633
00:56:37,760 --> 00:56:40,240
And we touched upon this yesterday with Ed.
634
00:56:40,240 --> 00:56:57,120
From a ServiceNow perspective, we will be releasing a conversational AI with Now Assist, a full conversational AI, with the goal of creating a full conversational AI for an entire user journey from start to finish.
635
00:56:57,120 --> 00:57:01,600
Then we're going to be hopefully overlaying that with what we call intuitive page summarization.
636
00:57:01,600 --> 00:57:10,000
So a user will hear what you just touched upon before: the key parts of that page, the most critical parts of that page, will be read back right away.
637
00:57:10,000 --> 00:57:12,000
So they don't have to be tapping across everything.
638
00:57:12,000 --> 00:57:14,320
So we're actually working on both of them right now.
639
00:57:14,320 --> 00:57:19,120
And isn't it amazing that what you just described is now something that can be built?
640
00:57:19,120 --> 00:57:21,600
Like it's now feasible to build these kinds of things.
641
00:57:21,600 --> 00:57:32,400
With just interacting with a web page, straight up a screenshot of an entire web page fed into a high-quality model will do a good enough job right now that it can start being useful.
642
00:57:32,400 --> 00:57:34,400
And that's fascinating.
643
00:57:34,400 --> 00:57:35,440
Yep, love it.
644
00:57:35,440 --> 00:57:36,000
Yeah.
645
00:57:36,000 --> 00:57:45,760
And in fact, to prepare for this podcast, I took your blog, which is just incredible, and I threw it into NotebookLM.
646
00:57:45,760 --> 00:58:00,120
I took a whole bunch of your podcasts, threw those into NotebookLM, and then some of the questions I asked you, I did honestly get out of NotebookLM, because it was able to munch through so much data and provide that information.
647
00:58:00,120 --> 00:58:01,400
It's such a cool tool.
648
00:58:01,400 --> 00:58:02,760
Have you played with it?
649
00:57:59,840 --> 00:58:06,760
Yeah, I'm really fascinated by NotebookLM.
650
00:58:06,840 --> 00:58:11,240
What you've just described is the best current consumer RAG product.
651
00:58:11,240 --> 00:58:18,440
Like really what you're doing there is you're just dumping a bunch of documents into a RAG system, and when you ask it a question, it looks things up for you and so forth.
652
00:58:18,440 --> 00:58:19,960
And it works really, really well.
653
00:58:19,960 --> 00:58:27,880
The questions that you gave me, I've been on a few podcasts, and they were by far the best selection of questions in advance I've ever seen.
654
00:58:27,880 --> 00:58:31,960
It's so interesting to hear that that was part of your process for putting those together.
655
00:58:31,960 --> 00:58:42,840
The other thing: you can't talk about NotebookLM without talking about their brilliantly weird podcast thing, where they can generate a podcast about whatever content you've thrown in there.
656
00:58:42,840 --> 00:58:43,800
I love that.
657
00:58:43,800 --> 00:58:47,000
It's both a gimmick and incredibly useful.
658
00:58:47,000 --> 00:58:49,880
Um it is spookily good.
659
00:58:49,880 --> 00:58:55,400
Like it's the best version of like fake human voices that I've heard from anything.
660
00:58:55,400 --> 00:59:01,720
They just released a new preview feature last week, I think, where you can now interrupt the podcast and ask them a question.
661
00:59:01,720 --> 00:59:06,680
And so you can basically join in and be part of the the podcast experience.
662
00:59:06,680 --> 00:59:07,800
Very, very weird.
663
00:59:07,800 --> 00:59:10,280
They also added um custom instructions to it.
664
00:59:10,280 --> 00:59:28,680
So I like doing things like, I fed in a big boring report about some generative AI study, and I told them: you are banana slugs, and I want you to have a conversation about this paper from the perspective of banana slugs, talking about how this could affect your slug society. And they did it.
665
00:59:28,680 --> 00:59:33,080
It's like a Monty Python skit. Oh my god.
666
00:59:33,080 --> 00:59:34,520
That's incredible.
667
00:59:35,240 --> 00:59:36,440
How fun, yeah.
668
00:59:36,520 --> 00:59:42,040
And their team announced that they're leaving and starting their own company, so that should be a good one.
669
00:59:42,280 --> 00:59:44,040
That's the constant pattern with this thing.
670
00:59:44,040 --> 00:59:50,400
If you build anything good at all around AI, you should quit your job and raise a million, raise 10 million dollars off of VCs straight away.
671
00:59:44,840 --> 00:59:51,040
Or a billion.
672
00:59:51,200 --> 00:59:52,640
Some of them are getting a billion.
673
00:59:52,800 --> 00:59:54,400
The money is flowing freely right now.
674
00:59:54,400 --> 01:00:00,480
If you ever want to do a startup, if you can prove yourself on a product, then yeah, it's the time to be doing that.
675
01:00:00,480 --> 01:00:04,800
That's something I always like to emphasize is it's important to have fun with this stuff.
676
01:00:04,800 --> 01:00:10,640
Like a lot of people haven't gotten on to how deeply entertaining these things can be if you give them the chance.
677
01:00:10,640 --> 01:00:14,160
Like don't just ask them to tell you jokes, they'll tell you crap jokes.
678
01:00:14,160 --> 01:00:21,120
But if you ask them to be banana slugs and talk about the impact of generative AI on their society, that keeps me entertained all the time.
679
01:00:21,120 --> 01:00:22,960
There's just so much you can do with that.
680
01:00:22,960 --> 01:00:24,080
I could listen to you all day.
681
01:00:24,080 --> 01:00:25,600
I really, really could, Simon.
682
01:00:25,600 --> 01:00:29,040
Um enjoyable, uh, educational.
683
01:00:29,040 --> 01:00:37,760
And just honestly, again, as I mentioned before, your passion for what you do is pretty apparent, and your authenticity is pretty apparent.
684
01:00:37,760 --> 01:00:39,680
So thank you so much for your time today.
685
01:00:39,840 --> 01:00:41,360
Greatly, greatly appreciate it.
686
01:00:41,360 --> 01:00:42,080
Thanks very much.
687
01:00:42,080 --> 01:00:43,840
This has been a really fun conversation.
688
01:00:43,840 --> 01:00:44,640
Thanks, Simon.
689
01:00:44,640 --> 01:00:49,760
And just let our audience know where they can reach you and and read your stuff.
690
01:00:49,760 --> 01:00:56,960
So I'm online at simonwillison.net, that's my blog, which I have updated every day since January the first this year.
691
01:00:56,960 --> 01:01:01,520
So I'm just about to hit a year-long streak, which I'm quite excited about.
692
01:01:01,520 --> 01:01:04,160
Um that will link to all of my other stuff.
693
01:01:04,160 --> 01:01:09,040
I have a very active GitHub account with 900 projects on it at the moment.
694
01:01:09,040 --> 01:01:13,200
And I'm on Bluesky and Mastodon and Twitter and so forth as well.
695
01:01:13,200 --> 01:01:13,600
Great.
696
01:01:13,600 --> 01:01:16,160
Under simonw, @simonw, right?
697
01:01:16,160 --> 01:01:23,760
Uh yes, or @simonwillison.net on Bluesky, and Simon at simonwillison.net on Mastodon.