| 1 | |
| 00:00:00,080 --> 00:00:09,920 | |
| You can't talk about NotebookLM without talking about their brilliantly weird podcast thing, where they can generate a podcast about whatever content you've thrown in there. | |
| 2 | |
| 00:00:09,920 --> 00:00:28,320 | |
| So I like doing things like this: I fed in a big, boring report about some generative AI study, and I told them, you are banana slugs, and I want you to have a conversation about this paper from the perspective of banana slugs talking about how this could affect your slug society. And they did it. | |
| 3 | |
| 00:00:28,320 --> 00:00:30,880 | |
| It's like a Monty Python skit. | |
| 4 | |
| 00:00:30,960 --> 00:00:32,560 | |
| Oh my god. | |
| 5 | |
| 00:00:36,480 --> 00:00:46,480 | |
| Welcome to episode six of Accessibility and Gen AI, a podcast that interviews the newsmakers and thought leaders in the world of accessibility and artificial intelligence. | |
| 6 | |
| 00:00:46,480 --> 00:00:50,720 | |
| I'm Joe Devon, and I'm joined by my co-host Eamon McErlean. | |
| 7 | |
| 00:00:50,720 --> 00:00:56,160 | |
| And today we are interviewing Simon Willison, a true OG of the web. | |
| 8 | |
| 00:00:56,160 --> 00:00:58,960 | |
| He created my favorite framework called Django. | |
| 9 | |
| 00:00:58,960 --> 00:01:09,760 | |
| He created Lanyrd and Datasette, and now he is a prolific blogger talking about and building projects on top of AI on a daily basis. | |
| 10 | |
| 00:01:09,760 --> 00:01:16,000 | |
| Simon has contributed so much to the world we live in, and it is a true pleasure to have him join us today. | |
| 11 | |
| 00:01:16,000 --> 00:01:17,920 | |
| Simon, welcome to the pod. | |
| 12 | |
| 00:01:18,320 --> 00:01:19,200 | |
| Thank you very much. | |
| 13 | |
| 00:01:19,200 --> 00:01:21,040 | |
| I'm excited to be here. | |
| 14 | |
| 00:01:21,040 --> 00:01:28,320 | |
| You know, when I think about all the things that you've done, your impact on the web spans so many influential projects. | |
| 15 | |
| 00:01:28,320 --> 00:01:35,520 | |
| But I'd love to understand what has been the most meaningful to you personally, and what a typical day looks like for you. | |
| 16 | |
| 00:01:35,600 --> 00:01:39,200 | |
| So I think the most impactful project I've worked on is still Django. | |
| 17 | |
| 00:01:39,200 --> 00:01:42,720 | |
| It's extraordinary how far that framework's gone. | |
| 18 | |
| 00:01:42,720 --> 00:01:46,480 | |
| You know, NASA have used it, Instagram and Pinterest were built on top of it. | |
| 19 | |
| 00:01:46,480 --> 00:01:52,320 | |
| Meta's Threads is just another Django application, I found out recently. | |
| 20 | |
| 00:01:52,320 --> 00:01:53,680 | |
| So that's amazing. | |
| 21 | |
| 00:01:53,680 --> 00:01:59,600 | |
| And I love the fact that Django can now be classified as boring technology, in that it's the safe choice, right? | |
| 22 | |
| 00:01:59,600 --> 00:02:04,360 | |
| If you just want to build something on the web and you pick Django, you're not going to run into any sharp edges. | |
| 23 | |
| 00:02:04,360 --> 00:02:05,640 | |
| There won't be any surprises. | |
| 24 | |
| 00:02:05,640 --> 00:02:07,880 | |
| I'm really proud that it's made it to that point. | |
| 25 | |
| 00:02:08,280 --> 00:02:21,240 | |
| But these days I'm really excited about the more recent stuff I'm working on: open source tools for data journalism, where the dream is that I want the tools I'm building to help somebody win a Pulitzer Prize in investigative reporting. | |
| 26 | |
| 00:02:21,240 --> 00:02:30,200 | |
| And that sounds wildly ambitious, and it is ambitious, but that would be such a cool example of the kinds of problems I want to solve. | |
| 27 | |
| 00:02:30,200 --> 00:02:35,720 | |
| I build tools for other people, and I want those other people to achieve world-changing results with them. | |
| 28 | |
| 00:02:35,800 --> 00:02:37,080 | |
| That is so cool. | |
| 29 | |
| 00:02:37,080 --> 00:02:43,160 | |
| And I'm just going to tell you an anecdote, which I didn't mention to you when we met before. | |
| 30 | |
| 00:02:43,160 --> 00:02:49,960 | |
| I had never touched Python, and I needed to deliver an e-commerce project for a client. | |
| 31 | |
| 00:02:49,960 --> 00:02:55,880 | |
| And I learned Python and Django and built the app in two weeks. | |
| 32 | |
| 00:02:55,880 --> 00:02:58,040 | |
| And my client was so happy. | |
| 33 | |
| 00:02:58,040 --> 00:03:05,800 | |
| Because it had that admin that you guys put in there, my client was like, you should sell this whole admin thing. | |
| 34 | |
| 00:03:05,800 --> 00:03:10,120 | |
| And I was just laughing so loud, because this was a two-week project for me. | |
| 35 | |
| 00:03:10,120 --> 00:03:11,640 | |
| That is so thrilling. | |
| 36 | |
| 00:03:11,640 --> 00:03:13,080 | |
| And I've heard a lot of that. | |
| 37 | |
| 00:03:13,080 --> 00:03:15,640 | |
| A lot of people got their start in programming, learning Django. | |
| 38 | |
| 00:03:15,640 --> 00:03:18,760 | |
| And when we built Django, we never dreamed it would be a beginner's project. | |
| 39 | |
| 00:03:18,760 --> 00:03:23,720 | |
| We thought it was for experienced engineers to quickly build web applications. | |
| 40 | |
| 00:03:23,720 --> 00:03:28,280 | |
| But since then there's the Django Girls tutorials that have been running for I think nearly a decade now. | |
| 41 | |
| 00:03:28,280 --> 00:03:29,960 | |
| There's been so much activity on that front. | |
| 42 | |
| 00:03:29,960 --> 00:03:34,520 | |
| And yeah, I meet people and they're like, oh, Django was my introduction to programming and web development. | |
| 43 | |
| 00:03:34,600 --> 00:03:35,800 | |
| That's so exciting. | |
| 44 | |
| 00:03:35,800 --> 00:03:39,640 | |
| That wasn't the plan, but it's so thrilling to see that happening. | |
| 45 | |
| 00:03:39,640 --> 00:03:44,520 | |
| Yeah, though, to be fair, I did start with PHP and MySQL before. But so did I. | |
| 46 | |
| 00:03:44,520 --> 00:03:46,560 | |
| That was my intro to Python. | |
| 47 | |
| 00:03:46,560 --> 00:03:51,120 | |
| It was really hard to get Python working, but the Django part of it was awesome and easy. | |
| 48 | |
| 00:03:51,120 --> 00:03:51,600 | |
| Nice. | |
| 49 | |
| 00:03:51,600 --> 00:03:52,480 | |
| It's funny you say that. | |
| 50 | |
| 00:03:52,480 --> 00:03:53,520 | |
| Simon, pleased to meet you. | |
| 51 | |
| 00:03:53,520 --> 00:03:54,640 | |
| You and I never met before. | |
| 52 | |
| 00:03:54,640 --> 00:03:57,120 | |
| So thanks for spending your time with us today. | |
| 53 | |
| 00:03:57,120 --> 00:04:00,160 | |
| I've heard Python a lot over the past several weeks. | |
| 54 | |
| 00:04:00,160 --> 00:04:05,120 | |
| Our youngest son, who's a sophomore at Oregon State, is currently doing a Python class. | |
| 55 | |
| 00:04:05,200 --> 00:04:07,040 | |
| He loves it compared to C. | |
| 56 | |
| 00:04:07,040 --> 00:04:08,320 | |
| He's like, I'm all over it. | |
| 57 | |
| 00:04:10,080 --> 00:04:11,200 | |
| He's a fan. | |
| 58 | |
| 00:04:11,200 --> 00:04:13,120 | |
| He's a huge fan, he is. | |
| 59 | |
| 00:04:13,120 --> 00:04:21,680 | |
| You mentioned in your initial response your core goal of helping people and making things easier. | |
| 60 | |
| 00:04:21,680 --> 00:04:25,600 | |
| Tying that into accessibility, digital accessibility. | |
| 61 | |
| 00:04:25,600 --> 00:04:30,240 | |
| How do you see the Gen AI roadmap and accessibility coming together? | |
| 62 | |
| 00:04:30,240 --> 00:04:35,440 | |
| And maybe the improvements from an inclusiveness perspective. | |
| 63 | |
| 00:04:35,440 --> 00:04:41,120 | |
| So, I'm not an expert in accessibility. | |
| 64 | |
| 00:04:41,120 --> 00:04:43,040 | |
| It's something I've cared about throughout my entire career. | |
| 65 | |
| 00:04:43,040 --> 00:04:51,040 | |
| Django from the very start was built with semantic HTML and all of those accessibility concerns of 20 years ago in place. | |
| 66 | |
| 00:04:51,040 --> 00:04:57,040 | |
| And even these days there is an accessibility working group, which I'm not involved with, that's doing amazing work on the Django admin and so forth. | |
| 67 | |
| 00:04:57,040 --> 00:04:58,960 | |
| And so it's something I care very deeply about. | |
| 68 | |
| 00:04:58,960 --> 00:05:04,080 | |
| I'm fascinated to learn more about that intersection between generative AI and accessibility myself. | |
| 69 | |
| 00:05:04,080 --> 00:05:10,080 | |
| I feel like the most exciting trend over the past year for me has been the vision models, these multimodal models, right? | |
| 70 | |
| 00:05:10,080 --> 00:05:17,600 | |
| They can consume images and video, and the audio stuff has got incredibly powerful in the past two months. | |
| 71 | |
| 00:05:17,600 --> 00:05:24,720 | |
| As of now, it's not even surprising that you can have an audio conversation with one of these models; two months ago that was hardly a feature. | |
| 72 | |
| 00:05:24,720 --> 00:05:25,920 | |
| That's so cool. | |
| 73 | |
| 00:05:25,920 --> 00:05:35,560 | |
| And I feel like the accessibility benefits of these seem underexplored, at least from my perspective, but so, so promising. | |
| 74 | |
| 00:05:35,560 --> 00:05:46,520 | |
| Some people I've talked to have been skeptical about the accessibility benefits, because their argument is, you know, if you give somebody unreliable technology that might hallucinate and make things up, surely that's harming them. | |
| 75 | |
| 00:05:46,520 --> 00:05:48,360 | |
| I don't think that's true. | |
| 76 | |
| 00:05:48,360 --> 00:05:53,000 | |
| I feel like people who use screen readers are used to unreliable technology. | |
| 77 | |
| 00:05:53,000 --> 00:05:58,040 | |
| You know, a guide dog is a wonderful thing and a very unreliable piece of technology. | |
| 78 | |
| 00:05:58,040 --> 00:06:04,520 | |
| So, you know, when you consider that people with accessibility needs have agency, they can understand the limitations of the technology they're using. | |
| 79 | |
| 00:06:04,520 --> 00:06:09,000 | |
| I feel like giving them a tool where they can point their phone at something and it can describe it to them. | |
| 80 | |
| 00:06:09,000 --> 00:06:11,640 | |
| It's got really good OCR capabilities built in. | |
| 81 | |
| 00:06:11,640 --> 00:06:13,480 | |
| You can have an audio conversation with this. | |
| 82 | |
| 00:06:13,480 --> 00:06:19,400 | |
| This just feels like a world away from the accessibility technology of just three or four years ago. | |
| 83 | |
| 00:06:19,480 --> 00:06:20,520 | |
| Agree, completely agree. | |
| 84 | |
| 00:06:20,680 --> 00:06:29,000 | |
| I think that concern ties into, well, maybe the concern of LLMs not being fully inclusive. | |
| 85 | |
| 00:06:29,000 --> 00:06:32,840 | |
| And if LLMs are not inclusive, there will be gaps. | |
| 86 | |
| 00:06:32,840 --> 00:06:34,040 | |
| But we can solve that. | |
| 87 | |
| 00:06:34,040 --> 00:06:46,280 | |
| We can solve that by engaging individuals with disabilities on the prompts, and we can resolve that by making sure we have a truly comprehensive, non-biased data set when we're building data sets. | |
| 88 | |
| 00:06:46,280 --> 00:06:52,040 | |
| I think as long as we get ahead of it, as long as we're aware of that potential gap, we can solve it. | |
| 89 | |
| 00:06:52,040 --> 00:06:53,800 | |
| I do, I believe that. | |
| 90 | |
| 00:06:53,800 --> 00:06:57,880 | |
| Simon, we are now, I think, at day 12. | |
| 91 | |
| 00:06:57,880 --> 00:07:00,920 | |
| Was it 12 days of OpenAI, or 14 days? | |
| 92 | |
| 00:07:00,920 --> 00:07:04,200 | |
| I think it's 12 days, and we've reached the final day. | |
| 93 | |
| 00:07:04,200 --> 00:07:08,280 | |
| But it also feels like the 12 days of Google AI. | |
| 94 | |
| 00:07:08,280 --> 00:07:14,440 | |
| And when Google was doing their announcements, OpenAI sort of did their own bunch of announcements. | |
| 95 | |
| 00:07:14,440 --> 00:07:19,120 | |
| It just feels like we're in a war right now, massive competition. | |
| 96 | |
| 00:07:19,120 --> 00:07:22,320 | |
| So much to talk about with respect to that. | |
| 97 | |
| 00:07:22,320 --> 00:07:29,120 | |
| But what were you most excited about from what was released in the last couple of weeks? | |
| 98 | |
| 00:07:29,120 --> 00:07:31,600 | |
| Like, what are the top items? | |
| 99 | |
| 00:07:31,600 --> 00:07:42,880 | |
| And then as an aside, if any of those tie into accessibility, I would love your thoughts, because I haven't had enough time to even look at most of these. | |
| 100 | |
| 00:07:42,960 --> 00:07:44,240 | |
| It's been bewildering. | |
| 101 | |
| 00:07:44,240 --> 00:07:47,200 | |
| The whole month of December has just been a whirlwind. | |
| 102 | |
| 00:07:47,200 --> 00:07:50,080 | |
| And since when is December the month that people release everything, right? | |
| 103 | |
| 00:07:50,080 --> 00:07:55,680 | |
| You'd expect people to be dialing down for the holidays, but no, we've had extraordinary releases from OpenAI. | |
| 104 | |
| 00:07:55,680 --> 00:07:59,360 | |
| Google have managed to undercut OpenAI, which has never happened before. | |
| 105 | |
| 00:07:59,360 --> 00:08:06,480 | |
| Like last year, every time Google made a Gemini announcement, OpenAI would launch something better that morning, almost as sort of a power move. | |
| 106 | |
| 00:08:06,480 --> 00:08:09,520 | |
| The opposite is happening today, which is so fascinating. | |
| 107 | |
| 00:08:09,520 --> 00:08:12,560 | |
| Like Google's Gemini team are really ramping up. | |
| 108 | |
| 00:08:12,560 --> 00:08:15,840 | |
| There's a bunch of Gemini stuff that's really exciting. | |
| 109 | |
| 00:08:15,840 --> 00:08:24,400 | |
| They released Gemini 2.0 Flash, which is the cheapest version of their Gemini 2.0 series, and it's a really impressive model. | |
| 110 | |
| 00:08:24,400 --> 00:08:26,320 | |
| I've been playing around with that one a whole lot. | |
| 111 | |
| 00:08:26,320 --> 00:08:33,280 | |
| The Gemini models can do audio input and they can do video input, which puts them a step ahead of OpenAI. | |
| 112 | |
| 00:08:33,280 --> 00:08:39,200 | |
| OpenAI have some preview audio models, but nothing like what Gemini can do on that sort of multimodal front. | |
| 113 | |
| 00:08:39,200 --> 00:08:55,040 | |
| The really fun thing is that Gemini and OpenAI both now have streaming video APIs, where you can literally point your webcam at something, stream video images into the model, talk over them, and have it talk back to you. | |
| 114 | |
| 00:08:55,040 --> 00:08:57,920 | |
| And this is absolute science fiction. | |
| 115 | |
| 00:08:57,920 --> 00:09:03,720 | |
| Like Gemini managed to squeeze their version of this out the day before OpenAI did, which was extraordinary. | |
| 116 | |
| 00:09:03,880 --> 00:09:07,960 | |
| OpenAI, however, productized it; it's in the ChatGPT mobile app now. | |
| 117 | |
| 00:09:08,120 --> 00:09:15,160 | |
| So I can fire up ChatGPT, I can turn on my webcam, I can point, and I can start having a conversation, including with Santa Claus. | |
| 118 | |
| 00:09:15,160 --> 00:09:18,120 | |
| They've got a gimmicky Santa Claus voice that you can talk to. | |
| 119 | |
| 00:09:18,120 --> 00:09:19,800 | |
| Like I introduced it to my chickens. | |
| 120 | |
| 00:09:19,800 --> 00:09:22,120 | |
| I said, here are my chickens, these are their names. | |
| 121 | |
| 00:09:22,120 --> 00:09:24,920 | |
| And then a few minutes later I pointed at a chicken and said, which chicken is this? | |
| 122 | |
| 00:09:24,920 --> 00:09:26,040 | |
| And it got the answer right. | |
| 123 | |
| 00:09:26,040 --> 00:09:28,040 | |
| It's like, what are we even doing? | |
| 124 | |
| 00:09:28,040 --> 00:09:29,640 | |
| That's amazing, right? | |
| 125 | |
| 00:09:29,880 --> 00:09:31,720 | |
| That's so exciting. | |
| 126 | |
| 00:09:31,720 --> 00:09:38,040 | |
| And the accessibility implications of streaming video and audio into these things, that's extraordinary. | |
| 127 | |
| 00:09:38,040 --> 00:09:39,240 | |
| Absolutely extraordinary. | |
| 128 | |
| 00:09:39,240 --> 00:09:42,360 | |
| Those capabilities became available, what, three or four days ago? | |
| 129 | |
| 00:09:42,360 --> 00:09:44,680 | |
| Like this is absolute cutting edge. | |
| 130 | |
| 00:09:44,680 --> 00:09:46,760 | |
| This stuff is available over APIs as well. | |
| 131 | |
| 00:09:46,760 --> 00:09:48,120 | |
| Was it just yesterday? | |
| 132 | |
| 00:09:48,120 --> 00:09:49,400 | |
| I'm losing track of the days. | |
| 133 | |
| 00:09:49,400 --> 00:10:02,840 | |
| OpenAI now have a new WebRTC API for their real-time stuff. I knocked out a little test web page where you can click a button, and now you're having an audio conversation with one of their GPT voices. | |
| 134 | |
| 00:10:02,840 --> 00:10:06,440 | |
| And it was like a dozen lines of JavaScript to get that working. | |
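His dozen lines of JavaScript aren't reproduced in this transcript, but as a rough sketch of the same handshake, here is the idea in Python using the aiortc WebRTC library. The endpoint, model name, and microphone device are assumptions based on OpenAI's published Realtime API announcement, not his actual code:

```python
# Illustrative sketch of the Realtime WebRTC handshake; endpoint, model name,
# and microphone device are assumptions, and playback of the returned audio
# track is omitted for brevity.
import asyncio
import os

import httpx
from aiortc import RTCPeerConnection, RTCSessionDescription
from aiortc.contrib.media import MediaPlayer


async def main():
    pc = RTCPeerConnection()
    # Capture microphone audio to send to the model
    # ("none:default" = audio-only on macOS avfoundation; platform-specific).
    mic = MediaPlayer("none:default", format="avfoundation")
    pc.addTrack(mic.audio)

    # Create an SDP offer and POST it to the API; the response body is the
    # model's SDP answer, which completes the peer connection.
    offer = await pc.createOffer()
    await pc.setLocalDescription(offer)
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "https://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview",
            content=pc.localDescription.sdp,
            headers={
                "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
                "Content-Type": "application/sdp",
            },
        )
    await pc.setRemoteDescription(
        RTCSessionDescription(sdp=resp.text, type="answer")
    )
    await asyncio.sleep(60)  # hold the conversation open for a minute


asyncio.run(main())
```

The browser version follows the same shape: create a peer connection, attach the microphone, exchange SDP with the API, and audio flows both ways.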
| 135 | |
| 00:10:06,680 --> 00:10:08,200 | |
| Unbelievable, right? | |
| 136 | |
| 00:10:08,200 --> 00:10:10,920 | |
| So that stuff is, again, just so new. | |
| 137 | |
| 00:10:10,920 --> 00:10:13,880 | |
| Like these streaming APIs didn't exist two weeks ago. | |
| 138 | |
| 00:10:13,880 --> 00:10:16,040 | |
| Now they're rolling out. | |
| 139 | |
| 00:10:16,040 --> 00:10:20,760 | |
| I feel like we've hardly even started dipping our toes into what those can do. | |
| 140 | |
| 00:10:20,760 --> 00:10:27,240 | |
| The other exciting thing is OpenAI dropped the prices of their audio API by a lot. | |
| 141 | |
| 00:10:27,240 --> 00:10:31,480 | |
| Like previously it was prohibitively expensive, now it's just about affordable. | |
| 142 | |
| 00:10:31,480 --> 00:10:39,480 | |
| Gemini haven't announced the pricing on their API yet, but all of their other models are at bargain-basement prices already. | |
| 143 | |
| 00:10:39,480 --> 00:10:44,360 | |
| Part of the benefit of the competition is that the pricing just keeps on going down. | |
| 144 | |
| 00:10:44,360 --> 00:10:48,400 | |
| It's unbelievably inexpensive to use these vision models right now. | |
| 145 | |
| 00:10:48,640 --> 00:10:58,160 | |
| A little while ago, and I say a while ago, like two weeks ago, Amazon announced their Amazon Nova models, which are effectively their version of the Google Gemini models. | |
| 146 | |
| 00:10:58,160 --> 00:11:00,640 | |
| They're similarly priced, they have similar capabilities. | |
| 147 | |
| 00:11:00,640 --> 00:11:22,480 | |
| And I did a napkin calculation and found that if I wanted to take 67,000 photographs from my photo library and run all of those through either Gemini 1.5 Flash or the cheap Amazon Nova one, it would cost me $10.25 to do 67,000 photos and get an actually useful text description of each photo. | |
| 148 | |
| 00:11:22,480 --> 00:11:28,160 | |
| I ran those numbers three times because I didn't believe them the first time I calculated them. | |
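The shape of that napkin calculation is easy to reproduce. Every number below is an illustrative placeholder, not the actual December 2024 price sheet behind his $10.25 figure:

```python
# Back-of-envelope cost estimate; all four numbers are illustrative
# placeholders, not real price-sheet values.
photos = 67_000
input_tokens_per_photo = 260      # assumed: one low-res image tile
output_tokens_per_photo = 100     # assumed: a short description back
price_per_input_token = 0.075 / 1_000_000   # assumed $/token, input
price_per_output_token = 0.30 / 1_000_000   # assumed $/token, output

total = photos * (
    input_tokens_per_photo * price_per_input_token
    + output_tokens_per_photo * price_per_output_token
)
print(f"${total:.2f}")  # ~$3.32 with these made-up numbers
```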
| 149 | |
| 00:11:28,240 --> 00:11:29,440 | |
| That's amazing. | |
| 150 | |
| 00:11:29,440 --> 00:11:30,240 | |
| Wow, right? | |
| 151 | |
| 00:11:30,240 --> 00:11:31,120 | |
| Absolutely incredible. | |
| 152 | |
| 00:11:32,160 --> 00:11:35,680 | |
| Yeah, I was gonna ask you what tools you currently use. | |
| 153 | |
| 00:11:35,680 --> 00:11:38,960 | |
| It sounds like you use everything on a daily basis. | |
| 154 | |
| 00:11:38,960 --> 00:11:50,160 | |
| But from what you see now, or where you see things growing, do you believe that there's a current front-runner, or somebody that's just gaining constant momentum and getting ahead of the game? | |
| 155 | |
| 00:11:50,160 --> 00:11:51,760 | |
| No, no, and this is new. | |
| 156 | |
| 00:11:51,760 --> 00:11:54,560 | |
| Like, six months ago, it was OpenAI, right? | |
| 157 | |
| 00:11:54,560 --> 00:11:59,920 | |
| OpenAI with GPT: they launched GPT-4, and GPT-4 was the best available model for like nine months. | |
| 158 | |
| 00:11:59,920 --> 00:12:01,520 | |
| It felt like an eternity. | |
| 159 | |
| 00:12:01,520 --> 00:12:06,320 | |
| Then that broke when Anthropic released their Claude 3 Opus model. | |
| 160 | |
| 00:12:06,320 --> 00:12:08,320 | |
| I'd have to look it up; I think it was March this year. | |
| 161 | |
| 00:12:08,320 --> 00:12:11,680 | |
| And then they followed that up with Claude 3.5 Sonnet. | |
| 162 | |
| 00:12:11,680 --> 00:12:13,520 | |
| None of these names are very catchy. | |
| 163 | |
| 00:12:13,520 --> 00:12:18,000 | |
| That's been the best model, like the model I use every day for about six months now. | |
| 164 | |
| 00:12:18,000 --> 00:12:21,600 | |
| And I think it's still just ahead of the rest of the pack. | |
| 165 | |
| 00:12:21,600 --> 00:12:24,320 | |
| But the Gemini models are catching up really quickly. | |
| 166 | |
| 00:12:24,320 --> 00:12:33,080 | |
| The OpenAI models: the new o1 models, and they released another one of those a few days ago, those are really interesting as well. | |
| 167 | |
| 00:12:29,920 --> 00:12:43,400 | |
| But meanwhile, the Chinese labs put out Qwen 2.5, an openly licensed model that I can run on my laptop, which is in that GPT-4 space. | |
| 168 | |
| 00:12:43,400 --> 00:12:49,320 | |
| The Llama models from Meta, Llama 3.3, again, GPT-4 level, run on my laptop directly. | |
| 169 | |
| 00:12:49,320 --> 00:12:50,520 | |
| I never thought that would happen. | |
| 170 | |
| 00:12:50,520 --> 00:12:55,400 | |
| I thought I'd need a $50,000 mainframe computer to run that kind of model. | |
| 171 | |
| 00:12:55,400 --> 00:12:57,480 | |
| This is all changing so quickly. | |
| 172 | |
| 00:12:57,480 --> 00:13:05,160 | |
| The flip side is that there is a suspicion that the scaling thing is no longer holding. | |
| 173 | |
| 00:13:05,160 --> 00:13:10,760 | |
| Like it used to be you chuck in more data and more parameters and more GPU time, and you get better models. | |
| 174 | |
| 00:13:10,760 --> 00:13:15,960 | |
| But everything seems to be leveling off at the GPT-4o, Claude Sonnet level. | |
| 175 | |
| 00:13:15,960 --> 00:13:20,440 | |
| Like the new models are slightly better, but they're not like two times better. | |
| 176 | |
| 00:13:20,440 --> 00:13:26,600 | |
| And so the new frontier appears to be this idea of inference scaling. | |
| 177 | |
| 00:13:26,600 --> 00:13:33,720 | |
| It's this thing where you take a model that's really strong and you just give it longer to churn away and try to come up with answers. | |
| 178 | |
| 00:13:33,720 --> 00:13:36,920 | |
| OpenAI started that out with their o1 model. | |
| 179 | |
| 00:13:36,920 --> 00:13:45,480 | |
| And four hours ago, Google released their first version of this, a thing called Gemini Flash Thinking, I think it's called. | |
| 180 | |
| 00:13:45,480 --> 00:13:50,040 | |
| I literally just got some software working against that, like, half an hour ago. | |
| 181 | |
| 00:13:50,040 --> 00:13:52,920 | |
| And it looks... You saw it only half an hour ago. | |
| 182 | |
| 00:13:52,920 --> 00:13:54,280 | |
| You already have code working. | |
| 183 | |
| 00:13:54,280 --> 00:13:56,440 | |
| This is Simon for you. | |
| 184 | |
| 00:13:56,440 --> 00:14:02,200 | |
| And the Chinese ones: there's a Qwen model called QwQ, which is the same pattern. | |
| 185 | |
| 00:14:02,200 --> 00:14:03,320 | |
| There's another. | |
| 186 | |
| 00:14:03,320 --> 00:14:05,400 | |
| And so this is all happening now. | |
| 187 | |
| 00:14:05,400 --> 00:14:20,080 | |
| And what's interesting about those is they're not better at every task, but they are better at tasks like coding and mathematics, where, if you were a human being with a big notepad and you wrote down your thinking step by step, that would help you solve the problem. | |
| 188 | |
| 00:14:20,080 --> 00:14:22,400 | |
| These models are now capable of doing that kind of stuff. | |
| 189 | |
| 00:14:22,400 --> 00:14:24,320 | |
| So it's a different kind of better. | |
| 190 | |
| 00:14:24,320 --> 00:14:32,960 | |
| They're not just getting universally better at everything, but for certain problem-solving tasks we've got a new frontier that people are working on. | |
| 191 | |
| 00:14:32,960 --> 00:14:35,360 | |
| And all of this happened so recently as well. | |
| 192 | |
| 00:14:35,360 --> 00:14:40,080 | |
| Like, it's a full-time job keeping up with it, definitely. | |
| 193 | |
| 00:14:40,160 --> 00:14:41,040 | |
| It really is. | |
| 194 | |
| 00:14:41,040 --> 00:14:45,920 | |
| And it's definitely eating into my days, just trying to keep up. | |
| 195 | |
| 00:14:45,920 --> 00:14:51,760 | |
| I am curious, since Eamon brought it up: what do you use on a daily basis? | |
| 196 | |
| 00:14:51,760 --> 00:15:01,680 | |
| And what is good enough for you to have done the subscription for? Including, are you spending that $200 a month on the new OpenAI model? | |
| 197 | |
| 00:15:01,680 --> 00:15:03,120 | |
| Is that any good? | |
| 198 | |
| 00:15:03,120 --> 00:15:08,480 | |
| I've seen some people say it's not that good, and then a few people are like, this is the best thing out there. | |
| 199 | |
| 00:15:08,480 --> 00:15:09,680 | |
| I'm so torn on that one. | |
| 200 | |
| 00:15:09,680 --> 00:15:11,680 | |
| So I'm not spending the $200 a month yet. | |
| 201 | |
| 00:15:11,680 --> 00:15:17,200 | |
| At the moment, I pay for ChatGPT at $20 a month and Claude at $20 a month. | |
| 202 | |
| 00:15:17,200 --> 00:15:21,520 | |
| I would be paying for GitHub Copilot, but I get it for free as an open source maintainer. | |
| 203 | |
| 00:15:21,520 --> 00:15:23,680 | |
| And that's it for my subscriptions. | |
| 204 | |
| 00:15:23,680 --> 00:15:28,480 | |
| But then I've also got API accounts with basically everyone, and I'm constantly experimenting with the APIs. | |
| 205 | |
| 00:15:28,480 --> 00:15:32,960 | |
| The thing is, they're so cheap that most months my API bill across everyone comes to like 10 bucks. | |
| 206 | |
| 00:15:32,960 --> 00:15:38,480 | |
| Like, I've never managed to spend more than $20 across all of the APIs in any given month. | |
| 207 | |
| 00:15:38,480 --> 00:15:41,760 | |
| So it's not a huge amount of money that I'm spending right now. | |
| 208 | |
| 00:15:41,760 --> 00:15:48,560 | |
| Yeah, the $200 thing gets you fewer caps on the o1 model. | |
| 209 | |
| 00:15:48,560 --> 00:15:57,760 | |
| Like, the o1 model I think you can use 50 times before you get locked out of it for a few days, which is a bit frustrating, because I've started using that one a little bit more. | |
| 210 | |
| 00:15:57,760 --> 00:16:05,560 | |
| And you get this thing called o1 Pro, which I really want to try, but I don't want to spend $200 just to try it. | |
| 211 | |
| 00:16:05,720 --> 00:16:08,600 | |
| So just give me one free go at it, you know. | |
| 212 | |
| 00:16:08,600 --> 00:16:14,680 | |
| I don't know, I might end up paying $200 a month at some point, but I haven't quite justified it to myself yet. | |
| 213 | |
| 00:16:14,760 --> 00:16:22,440 | |
| We just had Ed Summers, the head of accessibility for GitHub, and he announced that Copilot is now free. | |
| 214 | |
| 00:16:22,440 --> 00:16:26,040 | |
| So that's now, with some limits, free for everyone. | |
| 215 | |
| 00:16:26,120 --> 00:16:27,080 | |
| Yeah, it is, yes. | |
| 216 | |
| 00:16:27,080 --> 00:16:32,040 | |
| I forget what the limits are, but absolutely, and the great thing about that is it's not just a free trial. | |
| 217 | |
| 00:16:32,040 --> 00:16:35,720 | |
| This is a free tier that GitHub are planning to make permanent. | |
| 218 | |
| 00:16:35,720 --> 00:16:39,080 | |
| So you will always have it, and it's especially important for people around the world, you know. | |
| 219 | |
| 00:16:39,080 --> 00:16:44,200 | |
| People in India are much less likely to be able to set up that credit card subscription and so forth. | |
| 220 | |
| 00:16:44,200 --> 00:16:47,640 | |
| Now they get access to the Copilot experience. | |
| 221 | |
| 00:16:47,640 --> 00:16:48,840 | |
| I'm really excited about that. | |
| 222 | |
| 00:16:48,840 --> 00:16:54,360 | |
| I think it's actually the oldest generative AI tool in mainstream use. | |
| 223 | |
| 00:16:54,360 --> 00:16:59,240 | |
| Copilot, it turns out, predates ChatGPT by, what, nearly two years. | |
| 224 | |
| 00:16:59,240 --> 00:17:02,680 | |
| They released the first version of Copilot in 2021. | |
| 225 | |
| 00:17:02,680 --> 00:17:08,440 | |
| And I love that it's not a chat; at least originally, it was that autocomplete interface, which was really innovative. | |
| 226 | |
| 00:17:08,440 --> 00:17:11,960 | |
| It was a really really interesting way of interacting with those models. | |
| 227 | |
| 00:17:11,960 --> 00:17:15,400 | |
| So yeah, I'm a huge Copilot user. | |
| 228 | |
| 00:17:15,400 --> 00:17:20,040 | |
| I'm at that point now where if it's not running in my text editor, I feel restricted. | |
| 229 | |
| 00:17:20,040 --> 00:17:24,360 | |
| Like, wow, now I'm having to actually type the code out in full myself. | |
| 230 | |
| 00:17:24,360 --> 00:17:36,280 | |
| Yeah, and now that they've added a system prompt for accessibility, that's so helpful, because then you can really make sure that what it spits out is much more likely to be accessible. | |
| 231 | |
| 00:17:36,280 --> 00:17:39,240 | |
| I've been asking them about that for a year. | |
| 232 | |
| 00:17:39,240 --> 00:17:42,920 | |
| So they announced it just now at GitHub Universe. | |
| 233 | |
| 00:17:43,000 --> 00:17:44,040 | |
| That's amazing. | |
| 234 | |
| 00:17:44,040 --> 00:17:51,600 | |
| I wrote up a thing last night about one of the things you can do with Claude, and ChatGPT has this as well now. | |
| 235 | |
| 00:17:51,600 --> 00:17:58,000 | |
| There's this thing called a project, where you set up a project and you can dump a bunch of files into it, but you can also set custom instructions in there. | |
| 236 | |
| 00:17:58,000 --> 00:18:00,640 | |
| So it's a nice, easy way of doing system prompts. | |
| 237 | |
| 00:18:00,640 --> 00:18:14,560 | |
| And I've been setting up little projects for different types of code that I write, with custom instructions that say things like: always start my HTML documents in this way, always include box-sizing: border-box in the CSS, little things like that. | |
| 238 | |
| 00:18:14,560 --> 00:18:16,080 | |
| It's fantastic. | |
| 239 | |
| 00:18:16,080 --> 00:18:24,560 | |
| I can now one-shot prompt a full page of working code and have all of those little ideas baked into it. | |
| 240 | |
| 00:18:24,560 --> 00:18:29,280 | |
| It's also interesting because it means that you can use the model for things that aren't in its training data. | |
| 241 | |
| 00:18:29,280 --> 00:18:42,000 | |
| Like, I've started using the Python uv tool a lot, which has a way of running a Python script where you list the dependencies in a magic comment at the top of the file, and then you don't have to think about your dependencies; it just uses them correctly. | |
| 242 | |
| 00:18:42,000 --> 00:18:48,560 | |
| And so I built a little custom project which teaches it, gives it one example of: here's how you list your dependencies. | |
| 243 | |
| 00:18:48,560 --> 00:18:52,400 | |
| And now I can one-shot a fully working, self-contained Python script. | |
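The magic comment he's describing is Python's inline script metadata format (PEP 723), which uv supports. A minimal self-contained script, with an arbitrary example dependency, looks like this:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = ["requests"]
# ///
# Inline script metadata (PEP 723): running `uv run fetch.py` reads the
# comment block above, installs requests into a throwaway environment,
# and then executes the script.
import requests

print(requests.get("https://example.com").status_code)
```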
| 244 | |
| 00:18:52,400 --> 00:19:09,280 | |
| So yeah, I absolutely buy that if you have expertise in accessibility, in ARIA stuff that the models aren't doing, you give them one example, just one example of your ideal framework, your ideal layout, and from that point on they'll be really good at producing code that fits that example. | |
| 245 | |
| 00:19:09,280 --> 00:19:10,960 | |
| I love to see your enthusiasm here. | |
| 246 | |
| 00:19:10,960 --> 00:19:14,000 | |
| Yeah, it's clear how much you love it, it really is. | |
| 247 | |
| 00:19:14,000 --> 00:19:20,240 | |
| You personally created a command-line tool called LLM. | |
| 248 | |
| 00:19:20,240 --> 00:19:24,640 | |
| Can you share with our listeners what that's all about and how useful that can be? | |
| 249 | |
| 00:19:24,640 --> 00:19:35,000 | |
| Yeah, so the initial idea around this was: I spend a lot of time in the terminal. I'm on macOS, but you know, I'm a terminal person. | |
| 250 | |
| 00:19:35,160 --> 00:19:44,200 | |
| And I realized that large language models and the terminal are a really good match, because in their most basic form, a large language model is a thing where you give it a prompt and it gives you a response. | |
| 251 | |
| 00:19:44,200 --> 00:19:47,480 | |
| And in the terminal, you're always piping things from one tool to another. | |
| 252 | |
| 00:19:47,480 --> 00:19:53,240 | |
| So wouldn't it be great if you could just pipe text into a language model and then have the response come back out again? | |
| 253 | |
| 00:19:53,240 --> 00:19:55,560 | |
| So the first version of LLM was exactly that. | |
| 254 | |
| 00:19:55,560 --> 00:19:58,120 | |
| It was using the OpenAI API. | |
| 255 | |
| 00:19:58,120 --> 00:20:02,600 | |
| And I noticed that nobody had LLM on the Python Package Index yet. | |
| 256 | |
| 00:20:02,600 --> 00:20:03,560 | |
| So I grabbed that. | |
| 257 | |
| 00:20:03,560 --> 00:20:09,000 | |
| It was like a namespace grab, because a three-letter acronym tool felt like a cool thing to have. | |
| 258 | |
| 00:20:09,000 --> 00:20:11,080 | |
| And so I built that, and it turns out it is great. | |
| 259 | |
| 00:20:11,080 --> 00:20:18,520 | |
| It's really fun being able to say cat myfile.py, pipe, llm, explain this code, and have it spit out an explanation. | |
| 260 | |
| 00:20:18,520 --> 00:20:19,800 | |
| That's really fun. | |
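The invocations he's describing look like this at a shell prompt; the file name and URL are placeholders, and the curl example is one he mentions again later in the conversation:

```sh
# Pipe a file into a model and get an explanation back.
cat myfile.py | llm "explain this code"

# Pipe a fetched web page in and ask questions about it.
curl -s https://example.com/ | llm "what is this page about?"
```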
| 261 | |
| 00:20:19,800 --> 00:20:25,560 | |
| And then I added plugin support to it so that you could have it support additional models. | |
| 262 | |
| 00:20:25,560 --> 00:20:31,800 | |
| Because, you know, why talk to just OpenAI when you could talk to Anthropic or Gemini or all of these other models as well? | |
| 263 | |
| 00:20:31,800 --> 00:20:37,480 | |
| And because that's based on plugins, anyone else can write a plugin that adds support for a new model. | |
| 264 | |
| 00:20:37,480 --> 00:20:40,120 | |
| And I can also write plugins that do local models. | |
| 265 | |
| 00:20:40,120 --> 00:20:43,560 | |
| So now my little command line tool out of the box does OpenAI. | |
| 266 | |
| 00:20:43,560 --> 00:20:47,240 | |
| And if you install a plugin, it can do Gemini, and then Anthropic's Claude, and so on. | |
| 267 | |
| 00:20:47,240 --> 00:20:50,840 | |
| And then you can install some plugins that will install models on your laptop. | |
| 268 | |
| 00:20:50,840 --> 00:20:54,600 | |
| And now you've got a completely offline language model environment. | |
| 269 | |
| 00:20:54,600 --> 00:20:55,640 | |
| So much fun. | |
| 270 | |
| 00:20:55,640 --> 00:20:59,720 | |
| It also means that whenever a new model comes out, I've got something I can do with it. | |
| 271 | |
| 00:20:59,720 --> 00:21:03,000 | |
| Like I can be like, okay, new Gemini model, tap, tap, tap, tap, tap. | |
| 272 | |
| 00:21:03,000 --> 00:21:06,200 | |
| Now my LLM plugin for that can support that model. | |
| 273 | |
| 00:21:06,200 --> 00:21:13,320 | |
| So it sort of helps me stay on top of new developments, because I'm actually writing code that interacts with these models. | |
| 274 | |
| 00:21:13,320 --> 00:21:14,680 | |
| And I use that on a daily basis. | |
| 275 | |
| 00:21:14,680 --> 00:21:29,200 | |
| There are all sorts of things where it's convenient to be in the terminal and to quickly ask a question or quickly analyze something, or you can do things like curl a URL and pipe that into the model, and now you can ask questions against a web page. | |
| 276 | |
| 00:21:36,960 --> 00:21:39,040 | |
| A lot of people don't know how to use a terminal. | |
| 277 | |
| 00:21:39,040 --> 00:21:41,280 | |
| Like, it's a power-user tool. | |
| 278 | |
| 00:21:41,280 --> 00:21:47,440 | |
| And it bothers me that a lot of the stuff I'm building is then only available to people who are terminal users. | |
| 279 | |
| 00:21:47,440 --> 00:21:51,840 | |
| So I have an ongoing goal to build a sort of web application on top of LLM. | |
| 280 | |
| 00:21:51,840 --> 00:22:00,000 | |
| So you can type LLM space web enter, it runs a local web server, it pops open your browser, and now you've got a GUI where you can start playing with models. | |
| 281 | |
| 00:22:00,000 --> 00:22:03,600 | |
| And I'm forever two weeks away from getting that feature working. | |
| 282 | |
| 00:22:03,840 --> 00:22:04,640 | |
| Forever. | |
| 283 | |
| 00:22:04,640 --> 00:22:05,520 | |
| Yeah. | |
| 284 | |
| 00:22:05,840 --> 00:22:06,880 | |
| That'll be fun. | |
| 285 | |
| 00:22:06,880 --> 00:22:10,240 | |
| And you're also doing multimodal on the command line. | |
| 286 | |
| 00:22:10,240 --> 00:22:18,640 | |
| Oh, and I think I saw something you did that could be used for accessibility, like alt text or image description or something like that, correct? | |
| 287 | |
| 00:22:18,640 --> 00:22:24,720 | |
| So that's something... I actually use large language models for most of my alt text these days. | |
| 288 | |
| 00:22:24,720 --> 00:22:40,400 | |
| Whenever I tweet an image or whatever, I've basically got a Claude project called alt text writer, and it's got a prompt and an example, and I dump an image in and it gives me the alt text. And I very rarely just use it as-is, because that's rude, right? | |
| 289 | |
| 00:22:40,400 --> 00:22:46,640 | |
| You should never just dump text onto people that you haven't reviewed yourself, but it's always a good starting point. | |
| 290 | |
| 00:22:46,640 --> 00:22:53,280 | |
| And normally I'll edit a tiny little bit; I'll delete an unimportant detail, or I'll bulk something up, and then I've got alt text that works. | |
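His actual Claude Project prompt isn't included in the transcript, but a minimal sketch of the same alt-text workflow against Anthropic's Messages API might look like this; the prompt wording and model alias are assumptions:

```python
# Minimal sketch of an alt-text-writer workflow using Anthropic's Python SDK.
# The system prompt wording is invented for illustration; the real Claude
# Project prompt and example are not in the transcript.
import base64

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

with open("screenshot.png", "rb") as f:
    image_b64 = base64.standard_b64encode(f.read()).decode()

message = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=300,
    system=(
        "Write concise alt text. Convey the key message of the image; "
        "skip unimportant words and decorative detail."
    ),
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64", "media_type": "image/png",
                        "data": image_b64}},
            {"type": "text", "text": "Write alt text for this image."},
        ],
    }],
)
print(message.content[0].text)  # always review and edit before publishing
```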
| 291 | |
| 00:22:53,280 --> 00:22:56,400 | |
| And often it's actually got really good taste. | |
| 292 | |
| 00:22:56,400 --> 00:23:05,000 | |
| A great example: if you've got a screenshot of an interface, there are a lot of words on a screenshot of an interface. | |
| 293 | |
| 00:22:59,840 --> 00:23:06,360 | |
| And most of those words don't matter. | |
| 294 | |
| 00:23:06,440 --> 00:23:09,160 | |
| The message you're trying to convey in the alt text is, okay: | |
| 295 | |
| 00:23:09,160 --> 00:23:14,360 | |
| It's two panels; on the left is a conversation, on the right there's a preview of the SVG file, or something. | |
| 296 | |
| 00:23:14,360 --> 00:23:16,840 | |
| My alt text writer normally gets that right. | |
| 297 | |
| 00:23:16,840 --> 00:23:27,320 | |
| It's even good at summarizing tables of data, where it will notice that what really matters is that Gemini got a score of 57 and Nova got a score of 53. | |
| 298 | |
| 00:23:27,320 --> 00:23:31,320 | |
| And so it'll pull those details out and ignore like the release dates and so forth. | |
| 299 | |
| 00:23:31,320 --> 00:23:32,440 | |
| That's really cool. | |
| 300 | |
| 00:23:32,840 --> 00:23:39,560 | |
| So it'll be able to prioritize what matters on the overall screen; it'll be able to pick out the key components and key metrics on the screen. | |
| 301 | |
| 00:23:39,640 --> 00:23:40,840 | |
| It just does it already. | |
| 302 | |
| 00:23:40,840 --> 00:23:42,920 | |
| It's just got good taste by default. | |
| 303 | |
| 00:23:42,920 --> 00:23:44,120 | |
| And then you can always talk to it. | |
| 304 | |
| 00:23:44,120 --> 00:23:50,120 | |
| So it can give you alt text, and you can reply and say, yeah, ignore this column, and it'll try again. | |
| 305 | |
| 00:23:50,120 --> 00:23:51,160 | |
| I love that. | |
| 306 | |
| 00:23:51,160 --> 00:23:56,280 | |
| Like, I take pride in the alt text on these images, because so many people don't bother. | |
| 307 | |
| 00:23:56,280 --> 00:24:14,680 | |
| And I'll often try and drop little in-jokes in, or things that are sort of Easter eggs; not jokes that would spoil the experience for somebody who's actually using a screen reader, but just little things that make it clear that I'm trying to convey the message that's embedded in the image. | |
| 308 | |
| 00:24:14,680 --> 00:24:16,920 | |
| It's really fun, you know. | |
| 309 | |
| 00:24:17,160 --> 00:24:34,040 | |
| Simon, I listened to a one-hour podcast that you did a couple of months ago, an engineering podcast, and you really did a good job of explaining where fine-tuning makes sense, which I think you said it usually doesn't. | |
| 310 | |
| 00:24:34,040 --> 00:24:36,600 | |
| And you compared it to RAG. | |
| 311 | |
| 00:24:36,600 --> 00:24:43,080 | |
| And recently there was another release that OpenAI did to improve fine-tuning. | |
| 312 | |
| 00:24:43,080 --> 00:24:46,240 | |
| And where I'm going with this is I'd love for you to explain. | |
| 313 | |
| 00:24:46,320 --> 00:24:54,480 | |
| Like, I'm very passionate about the coding LLMs, and that they should be accessible by default. | |
| 314 | |
| 00:24:54,480 --> 00:24:56,640 | |
| So there are two aspects to this. | |
| 315 | |
| 00:24:56,640 --> 00:25:03,040 | |
| One is: how do you create a model, or a version of the model, that is more accessible? | |
| 316 | |
| 00:25:03,040 --> 00:25:09,200 | |
| In other words, you feed it accessible code, right? | |
| 317 | |
| 00:25:09,200 --> 00:25:29,600 | |
| Just to try and counteract the bad code that's in the training data, which, as you mentioned on that podcast, you're going to really struggle to make a difference with if you're adding data, because there's too much data that might, in this case, not be accessible. | |
| 318 | |
| 00:25:29,600 --> 00:25:34,080 | |
| So, how would you recommend customizing the model? | |
| 319 | |
| 00:25:34,080 --> 00:25:59,520 | |
| And then part two of the question is: how would you recommend that I go ahead with my mission of trying to get AI researchers to pay a little more attention to this, perhaps with a benchmark, or somehow get them to compete with each other, so that every time there's a new foundation model, they would take a look at the accessibility and say, hey, we perform well on this benchmark? | |
| 320 | |
| 00:25:59,600 --> 00:26:06,160 | |
| A benchmark is an incredibly good idea; it's an absolutely fantastic idea to have an accessibility benchmark. | |
| 321 | |
| 00:26:06,160 --> 00:26:11,840 | |
| And it's difficult, like building benchmarks is not a trivial thing, but it's definitely achievable. | |
| 322 | |
| 00:26:11,840 --> 00:26:13,280 | |
| There's lots of examples out there. | |
| 323 | |
| 00:26:13,280 --> 00:26:15,040 | |
| There are people who could help with that. | |
| 324 | |
| 00:26:15,040 --> 00:26:16,400 | |
| I love that idea. | |
| 325 | |
| 00:26:16,400 --> 00:26:28,800 | |
| Because, yeah, one of the things that's becoming increasingly clear with these models is that a lot of people have this idea that all you do is scrape the entire internet, dump it into the model, and then try and get even more data and dump it in. | |
| 326 | |
| 00:26:28,800 --> 00:26:32,120 | |
| That's not actually an accurate model of how this stuff works. | |
| 327 | |
| 00:26:32,120 --> 00:26:37,800 | |
| The more the AI labs experiment, the more it's becoming clear that the quality of the data matters so much. | |
| 328 | |
| 00:26:37,800 --> 00:26:41,720 | |
| You really don't just want a random scrape of a bunch of junk. | |
| 329 | |
| 00:26:41,720 --> 00:26:44,760 | |
| You want really high quality, well curated data. | |
| 330 | |
| 00:26:44,760 --> 00:26:56,920 | |
| There's a lot of work going on right now with synthetic data, where people are artificially creating vast amounts of data and feeding it into their models, because then they know they've just fed in a bunch of Python code that passes its unit tests, for example. | |
| 331 | |
| 00:26:56,920 --> 00:26:58,680 | |
| And that's just better. | |
| 332 | |
| 00:26:58,680 --> 00:27:07,000 | |
| And the flip side of that is that occasionally you hear little hints that the labs are hiring experts just to help with their training data. | |
| 333 | |
| 00:27:07,000 --> 00:27:14,440 | |
| They will hire experts, like expert biologists, to help refine and dump in way more high-quality biology data. | |
| 334 | |
| 00:27:14,440 --> 00:27:20,520 | |
| There is no reason at all that they couldn't hire expert accessibility engineers to help curate and dump stuff in. | |
| 335 | |
| 00:27:20,520 --> 00:27:23,160 | |
| They just need to see that there's demand for that. | |
| 336 | |
| 00:27:23,160 --> 00:27:27,160 | |
| So yeah, if there were benchmarks, that would help push the needle on that one. | |
| 337 | |
| 00:27:27,480 --> 00:27:31,960 | |
| I've written to all of the foundation model labs and not gotten any response so far. | |
| 338 | |
| 00:27:31,960 --> 00:27:34,200 | |
| But anyway, what were you going to say? | |
| 339 | |
| 00:27:34,200 --> 00:27:37,320 | |
| Well, yeah, we should talk a little bit more about fine-tuning. | |
| 340 | |
| 00:27:37,320 --> 00:27:43,880 | |
| So, everyone who starts working on these models, one of the first things they think is: I wish it knew my stuff, right? | |
| 341 | |
| 00:27:43,880 --> 00:27:47,240 | |
| I wish it had been trained on all of the documents within my company. | |
| 342 | |
| 00:27:47,240 --> 00:27:51,960 | |
| And so obviously, I should fine-tune a model to train it to understand that information. | |
| 343 | |
| 00:27:51,960 --> 00:27:54,280 | |
| That's the thing which mostly doesn't work. | |
| 344 | |
| 00:27:54,280 --> 00:28:01,440 | |
| I mean, you can try to do it, but it turns out you're dumping a little bit of extra information into a model that's been trained on a giant scrape of the internet. | |
| 345 | |
| 00:28:01,440 --> 00:28:05,080 | |
| There's so much in there already, it's very difficult to bias it in the correct direction. | |
| 346 | |
| 00:28:05,080 --> 00:28:09,480 | |
| And my big frustration with fine-tuning is lots of people will sell it to you. | |
| 347 | |
| 00:28:09,480 --> 00:28:12,120 | |
| There are very expensive APIs from all of these providers. | |
| 348 | |
| 00:28:12,120 --> 00:28:14,840 | |
| There are companies and startups that will help you do this. | |
| 349 | |
| 00:28:15,040 --> 00:28:24,240 | |
| When you ask them for demos... I just want somebody to show me a really clear demo of: look, here's the default model, here's the fine-tuned one; the default one sucks at answering this question. | |
| 350 | |
| 00:28:24,240 --> 00:28:26,080 | |
| The fine-tuned one is really good at it. | |
| 351 | |
| 00:28:26,080 --> 00:28:33,120 | |
| And these demos are really hard to come across, which is one of the reasons I remain skeptical of fine-tuning as a technique. | |
| 352 | |
| 00:28:33,120 --> 00:28:42,800 | |
| I think someday it's going to be useful and people will have those demos, but right now I feel like you can spend a lot of time and money and energy and just not get really great results out of it. | |
| 353 | |
| 00:28:42,800 --> 00:28:51,680 | |
| The flip side is that the thing that's getting increasingly easy these days is just straight-up prompting using these long-context models. | |
| 354 | |
| 00:28:51,680 --> 00:29:01,840 | |
| Just two years ago, most models only accepted up to around 8,000 tokens, which is maybe 20 pages of text. | |
| 355 | |
| 00:29:01,840 --> 00:29:03,760 | |
| I'd have to look that up. | |
| 356 | |
| 00:29:03,760 --> 00:29:11,520 | |
| Today, almost all of the good models will accept 100,000 tokens, and Gemini takes a million or two million tokens. | |
| 357 | |
| 00:29:11,520 --> 00:29:16,640 | |
| That means you can dump multiple novels' worth of information into Gemini in one go. | |
| 358 | |
| 00:29:16,640 --> 00:29:31,200 | |
| Which means if you wanted to build a model that was really good at accessibility engineering, find 10,000 or 50,000 tokens' worth of really high-quality code, stick that in the prompt, and it'll pick up from those examples. | |
| 359 | |
| 00:29:31,200 --> 00:29:33,520 | |
| Models are amazingly receptive to examples. | |
| 360 | |
| 00:29:33,520 --> 00:29:37,440 | |
| The most powerful way to work with them is to give them examples of what you want. | |
| 361 | |
| 00:29:37,440 --> 00:29:48,080 | |
| And honestly, even three or four really good examples of well-written accessible code might be enough to start the models along the right route. | |
| 362 | |
| 00:29:48,080 --> 00:29:50,160 | |
| And that's a really cheap experiment to run. | |
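As a concrete sketch of that examples-in-the-prompt approach: concatenate a few hand-curated accessible components into a system prompt. The file names and instruction wording here are placeholders:

```python
# Sketch of example-driven prompting: a handful of hand-curated accessible
# snippets become the system prompt. File names are placeholders.
from pathlib import Path

example_files = ["accessible_dialog.html", "accessible_tabs.html",
                 "accessible_form.html"]
examples = "\n\n".join(Path(name).read_text() for name in example_files)

system_prompt = (
    "You write accessible HTML with correct ARIA usage. "
    "Follow the patterns demonstrated in these examples:\n\n" + examples
)
# system_prompt can now be sent to any chat model alongside the user request.
```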
| 363 | |
| 00:29:50,160 --> 00:30:02,280 | |
| There are also these prompt caching mechanisms that a lot of the providers have now, where if you give it the same exact system prompt, it costs way less money on the second and third and fourth goes. | |
| 364 | |
| 00:29:59,840 --> 00:30:03,240 | |
| And that's really useful as well. | |
| 365 | |
| 00:30:03,320 --> 00:30:09,400 | |
| If you're going to have a long prompt full of examples, you pay money up front for the first one, and then from then on it gets cheaper. | |
| 366 | |
| 00:30:09,400 --> 00:30:10,520 | |
| I think that's the way to do it. | |
| 367 | |
| 00:30:10,520 --> 00:30:12,680 | |
| It's also really quick to iterate on these. | |
| 368 | |
| 00:30:12,680 --> 00:30:17,560 | |
| Like you build a really big prompt, try it, then you tweak it and try it again and see if you get better results. | |
| 369 | |
| 00:30:17,560 --> 00:30:20,840 | |
| So I think that's the most promising avenue right now. | |
| 370 | |
| 00:30:20,920 --> 00:30:27,000 | |
| Now I finally understand the caching, because I didn't totally get it before, but it's the system prompt. | |
| 371 | |
| 00:30:27,000 --> 00:30:32,200 | |
| That makes so much sense, because that's going to be a hit every single time. | |
| 372 | |
| 00:30:32,200 --> 00:30:35,400 | |
| I mean, it's also common prefixes. | |
| 373 | |
| 00:30:35,400 --> 00:30:43,720 | |
| So you might have a system prompt that says you're a useful robot that answers questions based on this document, and then a regular prompt that's the document, and then questions after that. | |
| 374 | |
| 00:30:43,720 --> 00:30:46,920 | |
| And as long as the document stays the same, you'll get that benefit. | |
| 375 | |
| 00:30:46,920 --> 00:30:55,080 | |
| Also, if you're thinking about doing chat interfaces, the way chats work is each time you say something new, it replays the previous conversation. | |
| 376 | |
| 00:30:55,080 --> 00:30:57,320 | |
| And again, that's where caching kicks in. | |
| 377 | |
| 00:30:57,320 --> 00:31:02,440 | |
| So if the caching is happening, the subsequent posts in the conversation save a lot of money. | |
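As one concrete example of the caching being described, Anthropic's API lets you mark the big shared prefix as cacheable. A rough sketch (details vary by provider, and older SDK versions required a prompt-caching beta flag):

```python
# Rough sketch of prompt caching with Anthropic's API: the large, repeated
# prefix (instructions + document) is marked cacheable, so later calls that
# share the same prefix are billed at the cheaper cached rate.
import anthropic

client = anthropic.Anthropic()
document = open("big_document.txt").read()

response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=500,
    system=[
        {"type": "text",
         "text": "You are a useful robot that answers questions based on this document."},
        {"type": "text", "text": document,
         "cache_control": {"type": "ephemeral"}},  # cache everything up to here
    ],
    messages=[{"role": "user",
               "content": "What does the document say about pricing?"}],
)
print(response.content[0].text)
```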
| 378 | |
| 00:31:02,440 --> 00:31:03,240 | |
| That makes sense. | |
| 379 | |
| 00:31:03,240 --> 00:31:04,280 | |
| That makes sense. | |
| 380 | |
| 00:31:04,280 --> 00:31:13,720 | |
| And then for your approach of these really long prompts with examples in them, how would you contrast that with RAG as an approach? | |
| 381 | |
| 00:31:13,720 --> 00:31:15,800 | |
| I think it's the same kind of idea. | |
| 382 | |
| 00:31:15,800 --> 00:31:19,640 | |
| So RAG stands for retrieval-augmented generation. | |
| 383 | |
| 00:31:19,640 --> 00:31:39,240 | |
| And the first version of it was a trick where you ask the model a question, and rather than just answering, it goes and looks in your big corpus of documents for anything that looks roughly similar. Basically, it does a search, gets the results, sticks those into the prompt hidden from you, and then tries to answer the question. | |
| 384 | |
| 00:31:39,240 --> 00:31:40,840 | |
| It's a really effective trick. | |
| 385 | |
| 00:31:40,840 --> 00:31:46,800 | |
| Like, the answer to how do I teach the model about my company isn't fine-tune a model. | |
| 386 | |
| 00:31:46,800 --> 00:31:50,080 | |
| It's: set up a RAG system that can run searches against things. | |
| 387 | |
| 00:31:50,080 --> 00:31:57,440 | |
| And really the lesson from that is most of prompt engineering, most of building on top of LLMs is thinking about the context. | |
| 388 | |
| 00:31:57,440 --> 00:32:06,320 | |
| It's thinking, okay, what is the best thing I can cram into those 8,000, or 100,000, or a million tokens to increase the chance that I get a good answer? | |
| 389 | |
| 00:32:06,320 --> 00:32:26,400 | |
| And yeah, the examples thing is almost like a fixed version of RAG. There are actually things you can do where you have a system where the user says, I want to build an interface that does this, and you effectively do a RAG search against a hundred examples, find the five most relevant pieces of example code, bung those in the prompt, and then answer the question that way. | |
| 390 | |
| 00:32:26,400 --> 00:32:27,440 | |
| And that would work really well. | |
| 391 | |
| 00:32:27,440 --> 00:32:29,440 | |
| That's like a very effective technique. | |
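A toy version of that retrieve-the-best-examples idea, self-contained for illustration; a real system would use embedding-based search rather than word overlap, and the corpus snippets are placeholders:

```python
# Toy sketch of the retrieval trick described above: score a corpus of example
# snippets against the user's request, put the top few into the prompt, then
# answer. Word overlap stands in for real embedding-based search.
def overlap(query: str, doc: str) -> int:
    return len(set(query.lower().split()) & set(doc.lower().split()))

def build_prompt(query: str, corpus: list[str], k: int = 5) -> str:
    best = sorted(corpus, key=lambda doc: overlap(query, doc), reverse=True)[:k]
    joined = "\n\n".join(best)
    return (f"Here are relevant examples:\n\n{joined}\n\n"
            f"Using the same patterns, {query}")

corpus = ['<dialog aria-labelledby="title">...</dialog>',  # placeholder snippets
          '<nav aria-label="Main">...</nav>']
print(build_prompt("build an accessible tabbed interface", corpus))
```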
| 392 | |
| 00:32:29,440 --> 00:32:36,240 | |
| You touched upon it a few minutes ago, how AI actually has the potential to generate jobs. | |
| 393 | |
| 00:32:36,240 --> 00:32:42,800 | |
| As you well know, there's a general concern out there right now about AI replacing many jobs. | |
| 394 | |
| 00:32:42,800 --> 00:32:44,000 | |
| What's your response to that? | |
| 395 | |
| 00:32:44,000 --> 00:32:48,160 | |
| I know it's a very general question, but it is one of the larger concerns out there right now. | |
| 396 | |
| 00:32:48,160 --> 00:32:49,520 | |
| And it deservedly should be. | |
| 397 | |
| 00:32:49,520 --> 00:32:52,640 | |
| This is a very disruptive technology. | |
| 398 | |
| 00:32:52,640 --> 00:32:59,760 | |
| There are jobs that will be damaged by this, and there are jobs that will be enhanced by this. | |
| 399 | |
| 00:32:59,760 --> 00:33:05,680 | |
| And so there's the pessimistic and the optimistic way to look at this. | |
| 400 | |
| 00:33:05,680 --> 00:33:13,520 | |
| And I can actually focus on this as a software engineer because it turns out writing code is one of the things these models are best at. | |
| 401 | |
| 00:33:13,520 --> 00:33:24,160 | |
| It's interesting: the great thing about code is that it's got fact-checking built in, because if a model spits out code and you run it and get an error, then obviously there's a mistake. | |
| 402 | |
| 00:33:24,160 --> 00:33:32,200 | |
| If it spits out an essay, you need to fact check every line of that essay, and that's a much harder process than just trying to run the compiler or whatever. | |
| 403 | |
| 00:33:32,200 --> 00:33:35,080 | |
| So a lot of software engineers are terrified of this. | |
| 404 | |
| 00:33:35,080 --> 00:33:41,400 | |
| They're like, hey, this is a technology where anyone can get it to write code now. | |
| 405 | |
| 00:33:41,400 --> 00:33:43,880 | |
| My 20 years of experience are no longer valuable. | |
| 406 | |
| 00:33:43,880 --> 00:33:50,280 | |
| I need to find a new career in plumbing or something which won't be replaced by AI. | |
| 407 | |
| 00:33:50,280 --> 00:34:01,080 | |
| Um, my perspective on this, as a developer who's been using these systems on a daily basis for a couple of years now, is that they enhance my value. | |
| 408 | |
| 00:34:01,080 --> 00:34:08,120 | |
| Like I am so much more competent and capable as a developer because I've got these tools like assisting me. | |
| 409 | |
| 00:34:08,120 --> 00:34:14,840 | |
| I can write code in dozens of new programming languages that I never learned before, um, but I still get to benefit from my 20 years of experience. | |
| 410 | |
| 00:34:14,840 --> 00:34:35,080 | |
| Like, take somebody off the street who's never written any code before and ask them to build an iPhone app with ChatGPT, and they are going to run into so many pitfalls, because, you know, programming isn't just about whether you can write code. It's about thinking through the problems, understanding what's possible and what's not, understanding how to QA, knowing what good code is, having good taste. | |
| 411 | |
| 00:34:35,080 --> 00:34:38,120 | |
| There's so much depth to what we do as software engineers. | |
| 412 | |
| 00:34:38,120 --> 00:34:48,360 | |
| And I've said before that generative AI probably gives me like a two to five times productivity boost on the part of my job that involves typing code into a laptop. | |
| 413 | |
| 00:34:48,360 --> 00:34:50,120 | |
| But that's only 10% of what I do. | |
| 414 | |
| 00:34:50,120 --> 00:34:56,360 | |
| Like as a software engineer, most of my time isn't actually spent with the typing of the code, it's all of those other activities. | |
| 415 | |
| 00:34:56,360 --> 00:34:59,000 | |
| The AI systems help with those other activities too. | |
| 416 | |
| 00:34:59,000 --> 00:35:07,640 | |
| They can help me think through architectural decisions and research library options and so on, but I still have to have the agency to understand what I'm doing. | |
| 417 | |
| 00:35:07,640 --> 00:35:10,920 | |
| So, as a software engineer, I don't feel threatened. | |
| 418 | |
| 00:35:10,920 --> 00:35:20,480 | |
| My most optimistic view of this is that the cost of developing software goes down, because an engineer like myself can be more ambitious and can take on more things. | |
| 419 | |
| 00:35:20,480 --> 00:35:22,880 | |
| As a result, demand for software goes up. | |
| 420 | |
| 00:35:22,880 --> 00:35:38,720 | |
| Because maybe you're a company that previously would never have dreamed of building, like, a custom CRM for your industry, because it would have taken 20 engineers a year before you got any results. If it now takes four engineers three months to get results, maybe you're in the market for software engineers now when you weren't before. | |
| 421 | |
| 00:35:38,720 --> 00:35:40,880 | |
| But that's the software engineering side of things. | |
| 422 | |
| 00:35:40,880 --> 00:35:43,520 | |
| That's sort of my rose-tinted-glasses view. | |
| 423 | |
| 00:35:43,520 --> 00:35:49,280 | |
| There are other industries where this stuff is just massively disruptive, and I don't think there's a happy ending. | |
| 424 | |
| 00:35:49,280 --> 00:35:52,000 | |
| Like my favorite example there is uh language translation, right? | |
| 425 | |
| 00:35:52,000 --> 00:36:14,400 | |
| If you are a human being who earns money translating text from one language into another, the models are not as good at it as you, but they are good enough, and they are vanishingly inexpensive, such that I know a lot of translators now are finding that their job has changed from "translate from one language to another" to "here is an AI-generated translation, fix it up," and you get paid less for that. | |
| 426 | |
| 00:36:14,400 --> 00:36:15,280 | |
| And that sucks. | |
| 427 | |
| 00:36:15,280 --> 00:36:24,800 | |
| Like that's an entire industry of people for whom, even before ChatGPT, Google Translate about five years ago got good enough that the industry took a massive hit. | |
| 428 | |
| 00:36:24,800 --> 00:36:29,360 | |
| Um the question then is how many other examples are there like that? | |
| 429 | |
| 00:36:29,360 --> 00:36:46,480 | |
| Um, professional illustrators who worked at the lower end of the scale, like being commissioned to do illustrations for blog posts, are having a terrible time, because you can now prompt an image generation model and get an illustration that's good enough for your blog post, right? | |
| 430 | |
| 00:36:46,480 --> 00:36:54,400 | |
| It's nowhere near what a professional illustrator could do, but it's really taking a chunk out of that end of the market. | |
| 431 | |
| 00:36:54,400 --> 00:37:00,440 | |
| I think in the movie industry, the group that are most affected, to my understanding, is concept artists. | |
| 432 | |
| 00:37:00,760 --> 00:37:09,320 | |
| It used to be that if a director is dreaming up a sci-fi scenario, they are paying artists to just come up with those initial concepts to help them think through. | |
| 433 | |
| 00:37:09,320 --> 00:37:12,840 | |
| That's the kind of thing which maybe they're turning to generative AI for. | |
| 434 | |
| 00:37:12,840 --> 00:37:20,520 | |
| Um, so yeah, I am not gonna say that there aren't huge negative implications for a whole bunch of people around this stuff. | |
| 435 | |
| 00:37:20,520 --> 00:37:33,480 | |
| And ideally, hopefully, this shakes out to a point where maybe concept artists are taking on more ambitious projects and find a new niche in the market that pays well and so forth, but I can't guarantee that's going to happen. | |
| 436 | |
| 00:37:33,480 --> 00:37:34,920 | |
| And that sucks, you know. | |
| 437 | |
| 00:37:34,920 --> 00:37:47,080 | |
| It sucks to be in the, you know, you talk about cars and horses, where the automobile came along and caused mass unemployment among people in the horse industry and created loads more jobs. | |
| 438 | |
| 00:37:47,080 --> 00:37:49,560 | |
| Like now there are more jobs as drivers and people making cars. | |
| 439 | |
| 00:37:49,560 --> 00:37:53,720 | |
| It still sucks if you're a professional farrier working on horses, you know. | |
| 440 | |
| 00:37:53,720 --> 00:37:59,560 | |
| Being disrupted by technology that creates new jobs isn't great if you're one of those jobs that gets disrupted. | |
| 441 | |
| 00:37:59,640 --> 00:38:00,600 | |
| It's interesting. | |
| 442 | |
| 00:38:00,600 --> 00:38:08,440 | |
| Justine Bateman, uh, who in my youth was on Family Ties, she was Mallory on Family Ties. | |
| 443 | |
| 00:38:08,440 --> 00:38:11,880 | |
| Uh she has a computer science degree, I believe. | |
| 444 | |
| 00:38:11,880 --> 00:38:19,720 | |
| And, um, she has spoken out as an actress and a filmmaker; she has been railing against AI in a really big way. | |
| 445 | |
| 00:38:19,720 --> 00:38:31,800 | |
| And it's kind of interesting to see a techie go against it, um, because even though it is awful, the jobs that are going to be lost, there's nothing we're gonna do to stop it. | |
| 446 | |
| 00:38:31,800 --> 00:38:38,560 | |
| Um, and it will at least take it away from the big names in Hollywood and democratize it. | |
| 447 | |
| 00:38:38,560 --> 00:38:42,640 | |
| So a lot of people are going to be able to make movies for a lot cheaper. | |
| 448 | |
| 00:38:42,640 --> 00:38:43,160 | |
| Right? | |
| 449 | |
| 00:38:43,160 --> 00:38:43,480 | |
| Right. | |
| 450 | |
| 00:38:43,480 --> 00:38:45,680 | |
| It's like um the thing with the translators. | |
| 451 | |
| 00:38:45,920 --> 00:38:46,880 | |
| There's a trade-off here. | |
| 452 | |
| 00:38:44,920 --> 00:38:51,760 | |
| Like on the one hand, it really sucks if that was your profession and it's been impacted in that way. | |
| 453 | |
| 00:38:51,920 --> 00:38:59,200 | |
| But if we now have technology, that means a billion people can have conversations who never could have afforded a human translator before. | |
| 454 | |
| 00:38:59,200 --> 00:39:08,480 | |
| I mean, I don't like to make straight-up statements about one thing being worth more than something else, but that's a pretty compelling trade-off at that point, you know. | |
| 455 | |
| 00:39:08,480 --> 00:39:16,960 | |
| And yeah, for filmmaking, so much of filmmaking is expensive and slow-moving and frustrating. | |
| 456 | |
| 00:39:16,960 --> 00:39:23,360 | |
| As a programmer, I love when frustrating parts of my job get sped up. | |
| 457 | |
| 00:39:23,360 --> 00:39:36,800 | |
| But does the economy shake out such that people who were doing skilled but frustrating aspects of their work still get employed and earn more money doing more creative things? | |
| 458 | |
| 00:39:36,800 --> 00:39:38,080 | |
| God, I hope so. | |
| 459 | |
| 00:39:38,080 --> 00:39:38,960 | |
| But I don't know. | |
| 460 | |
| 00:39:38,960 --> 00:39:41,920 | |
| I'm not in that industry, I'm not an economist. | |
| 461 | |
| 00:39:41,920 --> 00:39:46,880 | |
| I can't say with any certainty that it's gonna play out in a good way. | |
| 462 | |
| 00:39:47,040 --> 00:39:48,240 | |
| That's my fear too. | |
| 463 | |
| 00:39:48,240 --> 00:39:50,560 | |
| And and here's another angle for you. | |
| 464 | |
| 00:39:50,560 --> 00:40:10,240 | |
| So yesterday, the sister of a friend of mine got scammed out of $20,000, and they kept her on the phone the entire time, and she went from Bitcoin ATM to Bitcoin ATM, throwing in some money here, some money there. | |
| 465 | |
| 00:40:10,240 --> 00:40:14,560 | |
| Um, otherwise she would get arrested, and she totally fell for it. | |
| 466 | |
| 00:40:14,560 --> 00:40:41,880 | |
| And it hit me that it won't be long before you're going to see some scammers take the voice of somebody you know. A lot of people in AI have talked about this, and the solution is that everybody should create a safe word. But then it hit me that all you have to do to get that safe word is call the person whose safe word you need and whose voice you're gonna grab. | |
| 467 | |
| 00:40:41,880 --> 00:40:42,840 | |
| And you know what I mean? | |
| 468 | |
| 00:40:42,840 --> 00:40:53,240 | |
| Like if you have, let's say, two siblings, you get both of their voices, you use ElevenLabs or something to emulate their voices, and then you play man in the middle, right? | |
| 469 | |
| 00:40:53,240 --> 00:40:56,600 | |
| And you grab that safe word and then you hand it over. | |
| 470 | |
| 00:40:56,600 --> 00:41:07,080 | |
| It sort of feels like you need a double safe word, and you really have to spend some time to get this right, and most people are not going to be able to handle that. | |
| 471 | |
| 00:41:07,080 --> 00:41:12,920 | |
| Yeah, I mean, there are a lot of bad things; bad people can use this technology to do a lot of bad things. | |
| 472 | |
| 00:41:12,920 --> 00:41:16,680 | |
| And in most of these cases, there's always an argument they could have done it before. | |
| 473 | |
| 00:41:16,680 --> 00:41:21,320 | |
| Like if you're talented at impersonating voices, you could have pulled off that scam. | |
| 474 | |
| 00:41:21,320 --> 00:41:29,000 | |
| But not a lot of people are talented at impersonating voices, and that extra friction meant it wasn't necessarily a widespread scam. | |
| 475 | |
| 00:41:29,000 --> 00:41:33,080 | |
| The quality of voice cloning and so forth these days is shockingly good. | |
| 476 | |
| 00:41:33,080 --> 00:41:41,880 | |
| Like, I recently found out that the really good OpenAI voices, they can train those on like a 15-second audio sample. | |
| 477 | |
| 00:41:41,880 --> 00:41:45,720 | |
| They pay a professional voice actor for 15 seconds of their voice. | |
| 478 | |
| 00:41:45,720 --> 00:41:52,040 | |
| And they've deliberately not made that capability available to everyone else, but it's it's it's the way the models work. | |
| 479 | |
| 00:41:52,040 --> 00:42:03,400 | |
| Um, so yeah, that's one of the other things that scares me about this stuff: as a society, are we ready to understand and to cope with this? | |
| 480 | |
| 00:42:03,400 --> 00:42:07,560 | |
| And if not, how quickly can we get up to speed? | |
| 481 | |
| 00:42:07,560 --> 00:42:12,280 | |
| The one that worries me the most isn't voice cloning, it's the romance scams, right? | |
| 482 | |
| 00:42:12,280 --> 00:42:34,720 | |
| The thing where you get a text message out of the blue, you reply, and they try and form a relationship with you. Those romance scams have been run out of effectively sweatshops in places like Indonesia and the Philippines for years, where they get people with good written English skills and effectively force them to pull these scams on people. It's even cheaper if you can get a generative AI model to do that. | |
| 483 | |
| 00:42:34,720 --> 00:42:59,360 | |
| And yeah, I think that's just going to be a growing problem: scams are going to become more prevalent, and they'll be cheaper to run. And it's always scale. So many of these problems come down to the fact that the bad thing was possible before, but now it's possible at a hundred times the volume. And how do we fight back against that? I don't know. | |
| 484 | |
| 00:42:59,360 --> 00:43:12,480 | |
| Yeah, you talk about the romance scams, there was a show on last week, like 16 minutes, exactly around that, and how that is growing and growing and growing globally, and it's just so unfair, it really is. | |
| 485 | |
| 00:43:13,360 --> 00:43:23,200 | |
| Many of our listeners on today's podcast will be interested to get a little bit more insight into your Django story, kind of how you created it. | |
| 486 | |
| 00:43:23,200 --> 00:43:53,800 | |
| I know we don't have a lot of time, but could you give us just an overview of how that was initiated, your journey there, and where it's at today? Absolutely. So this is going back a long time, this is 21 years ago, this is 2003. I was a university student and I had a blog, and in 2003 there were only about a hundred people with blogs talking about web development, so we all knew each other. And this chap, Adrian Holovaty, was a journalist and web developer working in Kansas, and on his blog he put up a job ad. | |
| 487 | |
| 00:43:53,800 --> 00:43:57,480 | |
| And my university offered a year-in-industry placement program. | |
| 488 | |
| 00:43:57,480 --> 00:44:01,960 | |
| So you could take a year off of university, go and work somewhere, and then come back again. | |
| 489 | |
| 00:44:01,960 --> 00:44:04,600 | |
| And it meant that you could get a student visa. | |
| 490 | |
| 00:44:04,600 --> 00:44:11,960 | |
| So I got in touch with Adrian and said, hey, would this work as like a year-long paid internship kind of arrangement? | |
| 491 | |
| 00:44:11,960 --> 00:44:12,680 | |
| And it did. | |
| 492 | |
| 00:44:12,680 --> 00:44:14,600 | |
| So I moved out, moved from London. | |
| 493 | |
| 00:44:14,600 --> 00:44:20,840 | |
| Um, I moved from England to Lawrence, Kansas, and spent a year working at this tiny little local newspaper. | |
| 494 | |
| 00:44:20,840 --> 00:44:28,440 | |
| And yeah, Adrian and I were both PHP developers who wanted to use Python, and none of the Python web frameworks at the time quite did what we wanted. | |
| 495 | |
| 00:44:28,440 --> 00:44:36,440 | |
| So we ended up building our own little thin abstraction layer over the mod_python Apache module to build newspaper websites with. | |
| 496 | |
| 00:44:36,440 --> 00:44:39,960 | |
| And honestly, we had no idea that it was ever going to be an open source thing. | |
| 497 | |
| 00:44:39,960 --> 00:44:43,960 | |
| We thought it was the CMS that we were using to build these newspaper websites. | |
| 498 | |
| 00:44:43,960 --> 00:44:45,480 | |
| But um, I was there for a year. | |
| 499 | |
| 00:44:45,480 --> 00:44:56,600 | |
| I left, and then six months after I left, they got the go-ahead from the newspaper's publisher to release this as open source, partly because Ruby on Rails had just come out and was like taking the world by storm. | |
| 500 | |
| 00:44:56,600 --> 00:45:01,560 | |
| And they were looking at their thing and saying, hey, we've got a thing that looks a bit like Ruby on Rails, but it's for Python. | |
| 501 | |
| 00:45:01,560 --> 00:45:04,600 | |
| This company, 37signals, are doing well out of their release. | |
| 502 | |
| 00:45:04,600 --> 00:45:06,680 | |
| We should go ahead and put that out into the world. | |
| 503 | |
| 00:45:06,680 --> 00:45:07,320 | |
| And they did. | |
| 504 | |
| 00:45:07,320 --> 00:45:13,240 | |
| And, um, they called it Django because Adrian Holovaty is a huge Django Reinhardt gypsy jazz fan. | |
| 505 | |
| 00:45:13,240 --> 00:45:18,600 | |
| He actually, like, has a YouTube video where he does gypsy jazz guitar covers of different things. | |
| 506 | |
| 00:45:18,840 --> 00:45:20,600 | |
| He's a very talented musician. | |
| 507 | |
| 00:45:20,600 --> 00:45:26,360 | |
| Um, yeah, and that put Django out into the world, and it just grew and grew and grew. | |
| 508 | |
| 00:45:26,360 --> 00:45:29,240 | |
| And it's been, yeah, it's been out for nearly 20 years now. | |
| 509 | |
| 00:45:29,240 --> 00:45:35,400 | |
| We're planning a 20th birthday party for it, um, hopefully next year, which would be really fun. | |
| 510 | |
| 00:45:35,400 --> 00:45:41,000 | |
| Um, but yeah, so I was involved at the very start, and then tangentially involved after that. | |
| 511 | |
| 00:45:41,000 --> 00:45:49,040 | |
| I haven't been a core contributing developer for a very long time, but I'd throw ideas over the fence and occasionally knock up a few patches and so forth. | |
| 512 | |
| 00:45:49,040 --> 00:45:50,480 | |
| Um, it's just been amazing. | |
| 513 | |
| 00:45:50,480 --> 00:45:57,280 | |
| Like watching that grow, watching the community around it grow around the world and seeing all of these things that people have built on top of it. | |
| 514 | |
| 00:45:57,280 --> 00:46:00,480 | |
| Yeah, I'm really excited to see how that's worked out. | |
| 515 | |
| 00:46:00,480 --> 00:46:04,720 | |
| The engineering quality of uh Django is just top notch. | |
| 516 | |
| 00:46:04,720 --> 00:46:08,480 | |
| It's the only ORM I ever liked; I always hate ORMs. | |
| 517 | |
| 00:46:08,480 --> 00:46:09,920 | |
| It's like just goes straight. | |
| 518 | |
| 00:46:10,400 --> 00:46:11,920 | |
| And that was nothing to do with me. | |
| 519 | |
| 00:46:11,920 --> 00:46:20,960 | |
| The ORM, so when I was working on it, um, Adrian built a code generator, because, like, database code is really repetitive. | |
| 520 | |
| 00:46:20,960 --> 00:46:25,280 | |
| So he built a thing that generated Python code for you to talk to your models. | |
| 521 | |
| 00:46:25,280 --> 00:46:36,560 | |
| Um, and then Malcolm Tredinnick was the person who joined the Django community and helped turn that into what we have today, an incredible piece of work. | |
| 522 | |
| 00:46:36,560 --> 00:46:37,680 | |
| It's such a good design. | |
| 523 | |
| 00:46:37,680 --> 00:46:42,000 | |
| And yeah, for years after the ORM came out, I still wasn't very good at SQL. | |
| 524 | |
| 00:46:42,000 --> 00:46:44,000 | |
| I just relied on what the Django ORM did. | |
| 525 | |
| 00:46:44,000 --> 00:46:49,920 | |
| It's only in the past maybe five years that I've got super confident in using SQL for these things instead. | |
| 526 | |
| 00:46:49,920 --> 00:46:51,200 | |
| Yeah, interesting. | |
| 527 | |
| 00:46:51,200 --> 00:46:58,560 | |
| Well, speaking of data, you have another project called Datasette, which I would love for you to explain. | |
| 528 | |
| 00:46:58,560 --> 00:47:01,680 | |
| What does it do that no other database does? | |
| 529 | |
| 00:47:01,680 --> 00:47:06,400 | |
| Like, what problem is it that you're trying to solve, and where are you going with Datasette? | |
| 530 | |
| 00:47:06,400 --> 00:47:08,320 | |
| This is a very interesting question. | |
| 531 | |
| 00:47:08,320 --> 00:47:14,160 | |
| And, um, I wish I had the one-sentence answer, but I don't, so I'll have to give you a few paragraphs. | |
| 532 | |
| 00:47:14,160 --> 00:47:23,440 | |
| So, Datasette is an open source tool I've been building for nearly seven years now, and it's a Python web application for exploring, analyzing, and publishing data. | |
| 533 | |
| 00:47:23,440 --> 00:47:31,080 | |
| So the initial idea, it actually came out of work I'd done at newspapers, where when you're a newspaper, you often publish data-driven stories. | |
| 534 | |
| 00:47:31,160 --> 00:47:38,200 | |
| You'll have a story about the number of hospital beds currently available across the state or whatever. | |
| 535 | |
| 00:47:38,200 --> 00:47:39,960 | |
| And those stories come with data. | |
| 536 | |
| 00:47:39,960 --> 00:47:45,080 | |
| And I wanted to start encouraging newspapers to publish the data behind the stories. | |
| 537 | |
| 00:47:45,080 --> 00:47:59,000 | |
| This is something we started doing when I worked at The Guardian back in 2010, 2011, where the idea was you'd publish a story, and then we'd put out the data behind the story, and we'd just publish it as a Google spreadsheet. | |
| 538 | |
| 00:47:59,000 --> 00:48:03,800 | |
| We'd have a Google spreadsheet with these are the raw numbers that went into this piece of reporting. | |
| 539 | |
| 00:48:03,800 --> 00:48:10,120 | |
| I always felt there should be a better thing than a Google spreadsheet, something a little bit more open, but with more capabilities. | |
| 540 | |
| 00:48:10,120 --> 00:48:12,280 | |
| And so the first version of Datasette was just that. | |
| 541 | |
| 00:48:12,280 --> 00:48:29,720 | |
| It was like, okay, take a bunch of data, stick it in a SQLite database, because SQLite doesn't require a separate server, it's just a file, and then deploy an application that gives you a little interface, like a web UI on top of this database where you can click around through it, and a JSON API so that you can start building things against it as well. | |
| 542 | |
| 00:48:29,720 --> 00:48:33,880 | |
| And then other features like the ability to export CSV versions and so forth. | |
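For a sense of what that JSON API enables: Datasette exposes each table at a predictable URL, so published data can be consumed by scripts as well as browsers. A sketch against a made-up instance; the host, database, and table names here are invented, while appending `.json` or `.csv` to a table page is the documented Datasette convention (the exact JSON keys vary between versions):

```python
import json
import urllib.request

# Hypothetical instance, database, and table names.
BASE = "https://example-newsroom.example.com"

with urllib.request.urlopen(f"{BASE}/reporting/hospital_beds.json") as resp:
    data = json.load(resp)

# In pre-1.0 Datasette the default shape includes "columns" and "rows".
print(data.get("columns"))
for row in data.get("rows", [])[:5]:
    print(row)

# The same table is also exportable as CSV at:
#   https://example-newsroom.example.com/reporting/hospital_beds.csv
```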
| 543 | |
| 00:48:33,880 --> 00:48:35,160 | |
| So that was the initial idea. | |
| 544 | |
| 00:48:35,160 --> 00:48:38,600 | |
| It was what's the best possible way of publishing data online? | |
| 545 | |
| 00:48:38,600 --> 00:48:41,480 | |
| Because to my surprise, there weren't really any solutions to that. | |
| 546 | |
| 00:48:41,480 --> 00:48:48,360 | |
| Like if you want to publish a million rows of data online, your options are basically stick a CSV file in an S3 bucket. | |
| 547 | |
| 00:48:48,360 --> 00:48:49,560 | |
| And how's that useful? | |
| 548 | |
| 00:48:49,560 --> 00:48:56,600 | |
| Like, it's great for CSV nerds, but it's not exactly something that people can generally engage with. | |
| 549 | |
| 00:48:56,600 --> 00:49:03,080 | |
| So that was the initial idea, and then I added plugins, and I realized that my inspiration there was actually WordPress, right? | |
| 550 | |
| 00:49:03,080 --> 00:49:12,360 | |
| Where WordPress is a perfectly decent blogging engine with tens of thousands of plugins that mean any publishing problem you have, you can solve with WordPress plus some plugins. | |
| 551 | |
| 00:49:12,360 --> 00:49:17,120 | |
| And I thought, okay, what if that was the answer for data exploration and analysis projects? | |
| 552 | |
| 00:49:17,120 --> 00:49:29,440 | |
| Like, any project you have that involves data, which is basically everything: if you could take Datasette plus specific plugins for the visualizations that you want to run or the export formats, that would be a really cool thing to build. | |
| 553 | |
| 00:49:29,440 --> 00:49:32,800 | |
| So that's how the project's been evolving over the past few years. | |
| 554 | |
| 00:49:33,200 --> 00:49:45,440 | |
| And so there are, I think, 150 plugins now, for things like GeoJSON export or visualizing everything on a map, or, um, I've started building plugins for editing data as well. | |
| 555 | |
| 00:49:45,440 --> 00:49:56,880 | |
| So you can actually use Datasette as a kind of Airtable alternative, where you're loading in your data, making edits to it, running transformations against it, doing geocoding operations, all of that kind of thing. | |
| 556 | |
| 00:49:56,880 --> 00:50:05,600 | |
| And I love this project because, thanks to plugins, if there's anything in the world that I think is interesting, I can justify it as a plugin for Datasette. | |
| 557 | |
| 00:50:05,600 --> 00:50:18,160 | |
| I can be like, okay, this week I'm into GIS and I'm gonna do geospatial plugins, and then next week I'm doing some weird AI stuff, and I could write plugins for Datasette that use language models to generate SQL queries or whatever it is. | |
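For the curious, a Datasette plugin is just a Python module that implements one or more of its documented pluggy hooks. A minimal sketch using the real `render_cell` hook; the "shout:" prefix behavior is an invented example, and pluggy lets an implementation accept only the arguments it needs:

```python
from datasette import hookimpl

@hookimpl
def render_cell(value):
    """Datasette calls this hook for every table cell it renders.
    Returning None falls through to the default rendering; returning
    a value overrides it."""
    if isinstance(value, str) and value.startswith("shout:"):
        return value[len("shout:"):].upper()  # invented transformation
    return None
```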
| 558 | |
| 00:50:18,160 --> 00:50:21,840 | |
| So effectively, this is the project I want to work on for the rest of my life. | |
| 559 | |
| 00:50:21,840 --> 00:50:24,560 | |
| If I'm gonna do that, it needs to earn its keep. | |
| 560 | |
| 00:50:24,560 --> 00:50:32,400 | |
| So for the past year and a bit, I've been putting together the business model side of it, which is effectively the WordPress thing again. | |
| 561 | |
| 00:50:32,400 --> 00:50:34,880 | |
| It's the hosted SaaS version of Datasette. | |
| 562 | |
| 00:50:34,880 --> 00:50:36,800 | |
| So Datasette is completely open source. | |
| 563 | |
| 00:50:36,800 --> 00:50:42,160 | |
| If you are comfortable running an Ubuntu virtual machine somewhere, go ahead and install it and run it. | |
| 564 | |
| 00:50:42,160 --> 00:50:52,160 | |
| Or you'll be able to pay me a sum a month and I will run a private Datasette instance for you and your team, with all of the plugins and the integrations and API stuff and all of that kind of thing. | |
| 565 | |
| 00:50:52,160 --> 00:51:02,280 | |
| And I'm working on it at the moment; I need to put the final touches on the billing side so I can actually turn on self-service payments for it, but it's getting there. | |
| 566 | |
| 00:51:02,280 --> 00:51:06,040 | |
| It's, um, that's exciting. That's called Datasette Cloud. | |
| 567 | |
| 00:50:59,600 --> 00:51:12,520 | |
| It's datasette.cloud, as opposed to Datasette, which is datasette.io, spelt like the word cassette. | |
| 568 | |
| 00:51:12,520 --> 00:51:15,960 | |
| So it's d-a-t-a-c-a-s-e-t-t-e. | |
| 569 | |
| 00:51:15,960 --> 00:51:21,880 | |
| But yeah, so that's something I'm spending a lot of time on at the moment, that commercial side of the open source project. | |
| 570 | |
| 00:51:21,880 --> 00:51:23,240 | |
| Did you say C? | |
| 571 | |
| 00:51:23,240 --> 00:51:24,360 | |
| Do you mean S? | |
| 572 | |
| 00:51:24,360 --> 00:51:25,160 | |
| I did mean S. | |
| 573 | |
| 00:51:25,160 --> 00:51:26,520 | |
| D-A-T-A-S-E-T-T-E. | |
| 574 | |
| 00:51:27,960 --> 00:51:28,680 | |
| Thank you for that. | |
| 575 | |
| 00:51:28,680 --> 00:51:29,320 | |
| Yeah. | |
| 576 | |
| 00:51:29,400 --> 00:51:35,080 | |
| So, Simon, obviously the engagement between developers and accessibility community is key. | |
| 577 | |
| 00:51:35,080 --> 00:51:39,560 | |
| It's key for the ongoing progress, both from a conformance and usability perspective. | |
| 578 | |
| 00:51:39,560 --> 00:51:43,480 | |
| As a lead developer yourself, you know, how can we bridge that gap? | |
| 579 | |
| 00:51:43,640 --> 00:51:47,480 | |
| What are the specific areas we can assist with by sharing with you? | |
| 580 | |
| 00:51:47,560 --> 00:52:05,160 | |
| So the thing I find most difficult about building accessible sites at the moment is there are the WCAG guidelines, and there are like Chrome extensions that do audits and so forth, and I just don't trust them, because just because my site passed an audit, that doesn't mean it's going to work in a screen reader. | |
| 581 | |
| 00:52:05,160 --> 00:52:17,080 | |
| Like, especially with modern JavaScript things, where if I'm building an interface where I click through tabs and parts of the page update, how should I make sure that screen readers are notified about the correct piece of the page? | |
| 582 | |
| 00:52:17,080 --> 00:52:20,520 | |
| I don't just want to be told that we meet the WCAG guidelines, I want demos. | |
| 583 | |
| 00:52:20,520 --> 00:52:29,240 | |
| The thing that's missing for me is, I would like almost like a cookbook of accessibility patterns, where it's like, here's how to do modal dialogues and here's how to do tab switching. | |
| 584 | |
| 00:52:29,240 --> 00:52:33,880 | |
| And for each one, I want a video of what a screen reader does with that demo. | |
| 585 | |
| 00:52:33,880 --> 00:52:35,360 | |
| Or, actually, I want multiple videos. | |
| 586 | |
| 00:52:35,360 --> 00:52:42,920 | |
| Like show me for the two or three most popular screen readers, how do they behave when you lay out your modal dialogues or your tab interface? | |
| 587 | |
| 00:52:42,920 --> 00:52:47,040 | |
| The one that's really relevant right now is chat interfaces, right? | |
| 588 | |
| 00:52:47,040 --> 00:52:52,000 | |
| LLMs do that streaming text thing where you ask them a question and everything comes back a word at a time. | |
| 589 | |
| 00:52:52,000 --> 00:52:55,280 | |
| How do I make that accessible to a screen reader? | |
| 590 | |
| 00:52:55,280 --> 00:52:57,040 | |
| Like there must be patterns. | |
| 591 | |
| 00:52:57,040 --> 00:53:05,760 | |
| What those patterns are, it's very difficult for me to find examples of those, put together by experts, with proof in video form that they do the right thing. | |
| 592 | |
| 00:53:05,760 --> 00:53:08,800 | |
| I would love to see more of that kind of documentation. | |
| 593 | |
| 00:53:08,800 --> 00:53:14,800 | |
| Yeah, I think from our perspective, you know, we, the accessibility community, share as much as we possibly can across the board. | |
| 594 | |
| 00:53:14,800 --> 00:53:17,680 | |
| And Joe has played a phenomenal part in that. | |
| 595 | |
| 00:53:17,680 --> 00:53:22,240 | |
| But as you well know, different tech stacks, um, there are commonalities, to your point. | |
| 596 | |
| 00:53:22,240 --> 00:53:30,320 | |
| And I do believe there should be some type of base documentation, lessons learned, and again, the videos would go a long way. | |
| 597 | |
| 00:53:30,320 --> 00:53:36,000 | |
| We've done it from our perspective, from a ServiceNow perspective, and we're more than willing to share what we've built. | |
| 598 | |
| 00:53:36,000 --> 00:53:41,200 | |
| But a more open source, if you will, access to that type of content could go a long way. | |
| 599 | |
| 00:53:41,280 --> 00:53:44,480 | |
| My technical preference here: I like HTML and JavaScript. | |
| 600 | |
| 00:53:44,480 --> 00:53:47,040 | |
| I don't want React and I don't want, like, Vite and so forth. | |
| 601 | |
| 00:53:47,040 --> 00:53:55,440 | |
| Just give me an accessible HTML and JavaScript demo of how to do like five or six of these common interactions. | |
| 602 | |
| 00:53:55,440 --> 00:54:01,040 | |
| And that's enough, like then if somebody's using React, they should be able to port that to React, you know. | |
| 603 | |
| 00:54:01,040 --> 00:54:04,320 | |
| But having those examples, the thing I want is proof. | |
| 604 | |
| 00:54:04,320 --> 00:54:11,120 | |
| Like, every time a new JavaScript library comes out, one of the first things I do is check to see if they've got any documentation about their accessibility. | |
| 605 | |
| 00:54:11,120 --> 00:54:12,400 | |
| Most of them don't at all. | |
| 606 | |
| 00:54:12,400 --> 00:54:17,040 | |
| Some of them will say we have the right ARIA tags. | |
| 607 | |
| 00:54:17,040 --> 00:54:18,080 | |
| That's still not enough for me. | |
| 608 | |
| 00:54:18,080 --> 00:54:25,680 | |
| I won't believe that it's accessible until I see video evidence that it's been tested and shown to work with a screen reader. | |
| 609 | |
| 00:54:25,680 --> 00:54:31,560 | |
| And I feel like um I think Adobe do have some of this documentation for some of their accessible React things. | |
| 610 | |
| 00:54:31,960 --> 00:54:45,000 | |
| Yeah, so what I'd recommend that you look at is, um, Charlie Triplett, who's one of the accessibility architects that helps create what we're working on at my company. | |
| 611 | |
| 00:54:45,000 --> 00:54:48,040 | |
| Um, he wrote The Book on Accessibility. | |
| 612 | |
| 00:54:48,040 --> 00:54:58,040 | |
| Uh, so you can just go to The Book on Accessibility dot com, and he was working for T-Mobile, and for T-Mobile he built Magenta A11y. | |
| 613 | |
| 00:54:58,040 --> 00:55:00,920 | |
| So, MagentaA11y.com. | |
| 614 | |
| 00:55:01,240 --> 00:55:16,440 | |
| And it's got acceptance criteria, it's got sample HTML, it's got videos of screen readers and assistive technology, and I think it's gonna provide a lot of the solutions that you're looking for. | |
| 615 | |
| 00:55:16,440 --> 00:55:17,560 | |
| That's really exciting. | |
| 616 | |
| 00:55:17,560 --> 00:55:25,320 | |
| Like, just that. For me, the thing that would make my life easier as an engineer who cares about this, it's demos. | |
| 617 | |
| 00:55:25,320 --> 00:55:34,840 | |
| It's straight-up demos that are proven to work, with explanations of why this is the right pattern for doing a tabbed interface or replacing part of the screen. | |
| 618 | |
| 00:55:34,840 --> 00:55:40,120 | |
| The other problem I've got at the moment, which is a really interesting one, is tables, just actual tables of data. | |
| 619 | |
| 00:55:40,120 --> 00:55:44,280 | |
| Because my software, Datasette, presents tables of data. | |
| 620 | |
| 00:55:44,280 --> 00:55:46,520 | |
| I have no control over what those tables are. | |
| 621 | |
| 00:55:46,520 --> 00:55:51,240 | |
| Often it'll be like a 40 column table with like 2,000 rows in it. | |
| 622 | |
| 00:55:51,240 --> 00:55:59,000 | |
| I don't care how good my table markup is, that's gonna suck in a screen reader if you're trying to make sense of like 2,000 rows of data with 40 columns. | |
| 623 | |
| 00:55:59,000 --> 00:56:06,840 | |
| Um, I had a great conversation actually with Ed Summers about this, where we talked about how this is a great opportunity for chat-based interfaces, right? | |
| 624 | |
| 00:56:06,840 --> 00:56:15,920 | |
| If you've got that table and you can then say to it, what's the school with the highest number of children from this particular background? | |
| 625 | |
| 00:56:16,080 --> 00:56:17,280 | |
| And it gives you an answer. | |
| 626 | |
| 00:56:17,280 --> 00:56:18,160 | |
| That's just better. | |
| 627 | |
| 00:56:18,160 --> 00:56:19,920 | |
| That's just a better way of interacting. | |
| 628 | |
| 00:56:19,920 --> 00:56:28,080 | |
| So I'm really excited, for my own Datasette tool, about what plugins I can build that use language models to give you that sort of conversational interface. | |
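One plausible shape for such a plugin, reduced to its core: hand the model the table's schema, ask it to translate the question into a single SELECT, and run the query against a read-only connection. A sketch with Python's built-in sqlite3; `call_llm` is again a hypothetical stand-in, and a real system would validate the generated SQL far more carefully:

```python
import sqlite3

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for whatever LLM API you use."""
    raise NotImplementedError

def ask_table(db_path: str, table: str, question: str) -> list[tuple]:
    # Open read-only so a bad generated query can't modify anything.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    schema = conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = ?", (table,)
    ).fetchone()[0]
    prompt = (
        f"Given this SQLite table:\n{schema}\n\n"
        f"Write one SELECT query that answers: {question}\n"
        "Reply with only the SQL."
    )
    sql = call_llm(prompt).strip()
    if not sql.lower().startswith("select"):
        raise ValueError("Refusing to run non-SELECT SQL")
    return conn.execute(sql).fetchall()

# Hypothetical usage, e.g. for the screen-reader scenario above:
# ask_table("schools.db", "schools",
#           "Which school has the most children from this background?")
```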
| 629 | |
| 00:56:28,080 --> 00:56:29,760 | |
| Because, I realized, I had thought that was a gimmick. | |
| 630 | |
| 00:56:29,760 --> 00:56:33,680 | |
| I thought having a conversation with the table feels like that's kind of fun, but is it really useful? | |
| 631 | |
| 00:56:33,680 --> 00:56:36,560 | |
| From an accessibility perspective, it's amazingly useful. | |
| 632 | |
| 00:56:36,560 --> 00:56:37,760 | |
| Completely, completely agree. | |
| 633 | |
| 00:56:37,760 --> 00:56:40,240 | |
| And we touched upon this yesterday with Ed. | |
| 634 | |
| 00:56:40,240 --> 00:56:57,120 | |
| Um, from a ServiceNow perspective, we will be releasing a conversational AI with Now Assist, or Assist AI, a full conversational AI, with the goal of creating a full conversational AI for an entire user journey from start to finish. | |
| 635 | |
| 00:56:57,120 --> 00:57:01,600 | |
| Then we're going to be hopefully overlaying that with what we call intuitive page summarization. | |
| 636 | |
| 00:57:01,600 --> 00:57:10,000 | |
| So, as you just touched upon before, the key parts of that page, the most critical parts of that page, will be read back right away. | |
| 637 | |
| 00:57:10,000 --> 00:57:12,000 | |
| So they don't have to be tapping across everything. | |
| 638 | |
| 00:57:12,000 --> 00:57:14,320 | |
| So we're actually working on both of them right now. | |
| 639 | |
| 00:57:14,320 --> 00:57:19,120 | |
| And isn't it amazing that what you just described is now something that can be built? | |
| 640 | |
| 00:57:19,120 --> 00:57:21,600 | |
| Like it's now feasible to build these kinds of things. | |
| 641 | |
| 00:57:21,600 --> 00:57:32,400 | |
| Um, with, like, just interacting with a web page: straight up a screenshot of an entire web page fed into a high quality model will do a good enough job right now that it can start being useful. | |
| 642 | |
| 00:57:32,400 --> 00:57:34,400 | |
| And that's fascinating. | |
| 643 | |
| 00:57:34,400 --> 00:57:35,440 | |
| Yep, love it. | |
| 644 | |
| 00:57:35,440 --> 00:57:36,000 | |
| Yeah. | |
| 645 | |
| 00:57:36,000 --> 00:57:45,760 | |
| And in fact, to prepare for this podcast, I took your blog, which is just incredible, and I threw it into Notebook LM. | |
| 646 | |
| 00:57:45,760 --> 00:58:00,120 | |
| I took a whole bunch of your podcasts, threw those into Notebook LM, and then some of the questions I asked you, I did honestly get out of Notebook LM, because it was able to munch through so much data and provide that information. | |
| 647 | |
| 00:58:00,120 --> 00:58:01,400 | |
| It's such a cool tool. | |
| 648 | |
| 00:58:01,400 --> 00:58:02,760 | |
| Have you played with it? | |
| 649 | |
| 00:57:59,840 --> 00:58:06,760 | |
| Yeah, I'm really fascinated by Notebook LM. | |
| 650 | |
| 00:58:06,840 --> 00:58:11,240 | |
| Um, what you've just described, this is the best current consumer RAG product. | |
| 651 | |
| 00:58:11,240 --> 00:58:18,440 | |
| Like, really, what you're doing there is you're just dumping a bunch of documents into a RAG system, and when you ask it a question, it looks things up for you and so forth. | |
| 652 | |
| 00:58:18,440 --> 00:58:19,960 | |
| And it works really, really well. | |
| 653 | |
| 00:58:19,960 --> 00:58:27,880 | |
| The questions that you gave me, and I've been on a few podcasts, they were by far the best selection of questions in advance I've ever seen. | |
| 654 | |
| 00:58:27,880 --> 00:58:31,960 | |
| It's so interesting to hear that that was part of your process for putting those together. | |
| 655 | |
| 00:58:31,960 --> 00:58:42,840 | |
| The other thing: you can't talk about Notebook LM without talking about their brilliantly weird podcast thing, where they can generate a podcast about whatever content you've thrown in there. | |
| 656 | |
| 00:58:42,840 --> 00:58:43,800 | |
| I love that. | |
| 657 | |
| 00:58:43,800 --> 00:58:47,000 | |
| It's both a gimmick and incredibly useful. | |
| 658 | |
| 00:58:47,000 --> 00:58:49,880 | |
| Um it is spookily good. | |
| 659 | |
| 00:58:49,880 --> 00:58:55,400 | |
| Like it's the best version of like fake human voices that I've heard from anything. | |
| 660 | |
| 00:58:55,400 --> 00:59:01,720 | |
| They just released a new preview feature last week, I think, where you can now interrupt the podcast and ask them a question. | |
| 661 | |
| 00:59:01,720 --> 00:59:06,680 | |
| And so you can basically join in and be part of the podcast experience. | |
| 662 | |
| 00:59:06,680 --> 00:59:07,800 | |
| Very, very weird. | |
| 663 | |
| 00:59:07,800 --> 00:59:10,280 | |
| They also added custom instructions to it. | |
| 664 | |
| 00:59:10,280 --> 00:59:28,680 | |
| So I like doing things like, I fed in a big boring report about some generative AI study, and I told them, you are banana slugs, and I want you to have a conversation about this paper from the perspective of banana slugs, talking about how this could affect your slug society, and they did it. | |
| 665 | |
| 00:59:28,680 --> 00:59:33,080 | |
| It's like a Monty Python sketch. Oh my god. | |
| 666 | |
| 00:59:33,080 --> 00:59:34,520 | |
| That's incredible. | |
| 667 | |
| 00:59:35,240 --> 00:59:36,440 | |
| How fun, yeah. | |
| 668 | |
| 00:59:36,520 --> 00:59:42,040 | |
| And their team announced that they're leaving and starting their own company, so that should be a good one. | |
| 669 | |
| 00:59:42,280 --> 00:59:44,040 | |
| That's the constant pattern with this thing. | |
| 670 | |
| 00:59:44,040 --> 00:59:50,400 | |
| If you build anything good at all around AI, you should quit your job and raise a million, raise 10 million dollars of VC money straight away. | |
| 671 | |
| 00:59:44,840 --> 00:59:51,040 | |
| Or a billion. | |
| 672 | |
| 00:59:51,200 --> 00:59:52,640 | |
| Some of them are getting a billion. | |
| 673 | |
| 00:59:52,800 --> 00:59:54,400 | |
| The money is flowing freely right now. | |
| 674 | |
| 00:59:54,400 --> 01:00:00,480 | |
| If you ever want to do a startup, if you can prove yourself on a product, then yeah, it's the time to be doing that. | |
| 675 | |
| 01:00:00,480 --> 01:00:04,800 | |
| That's something I always like to emphasize is it's important to have fun with this stuff. | |
| 676 | |
| 01:00:04,800 --> 01:00:10,640 | |
| Like a lot of people haven't gotten on to how deeply entertaining these things can be if you give them the chance. | |
| 677 | |
| 01:00:10,640 --> 01:00:14,160 | |
| Like don't just ask them to tell you jokes, they'll tell you crap jokes. | |
| 678 | |
| 01:00:14,160 --> 01:00:21,120 | |
| But if you ask them to be banana slugs and talk about the impact of generative AI on their society, that keeps me entertained all the time. | |
| 679 | |
| 01:00:21,120 --> 01:00:22,960 | |
| There's just so much you can do with that. | |
| 680 | |
| 01:00:22,960 --> 01:00:24,080 | |
| I could listen to you all day. | |
| 681 | |
| 01:00:24,080 --> 01:00:25,600 | |
| I really, really could, Simon. | |
| 682 | |
| 01:00:25,600 --> 01:00:29,040 | |
| Um enjoyable, uh, educational. | |
| 683 | |
| 01:00:29,040 --> 01:00:37,760 | |
| Um, and just honestly, again, as I mentioned before, your passion for what you do is pretty apparent, and your authenticity is pretty apparent. | |
| 684 | |
| 01:00:37,760 --> 01:00:39,680 | |
| So thank you so much for your time today. | |
| 685 | |
| 01:00:39,840 --> 01:00:41,360 | |
| Greatly, greatly appreciate it. | |
| 686 | |
| 01:00:41,360 --> 01:00:42,080 | |
| Thanks very much. | |
| 687 | |
| 01:00:42,080 --> 01:00:43,840 | |
| This has been a really fun conversation. | |
| 688 | |
| 01:00:43,840 --> 01:00:44,640 | |
| Thanks, Simon. | |
| 689 | |
| 01:00:44,640 --> 01:00:49,760 | |
| And just let our audience know where they can reach you and read your stuff. | |
| 690 | |
| 01:00:49,760 --> 01:00:56,960 | |
| So I'm online at simonwillison.net, which is my blog, which I have updated every day since January the first this year. | |
| 691 | |
| 01:00:56,960 --> 01:01:01,520 | |
| So I'm just about to hit a year-long streak, which I'm quite excited about. | |
| 692 | |
| 01:01:01,520 --> 01:01:04,160 | |
| Um that will link to all of my other stuff. | |
| 693 | |
| 01:01:04,160 --> 01:01:09,040 | |
| Um, I have a very active GitHub account with 900 projects on it at the moment. | |
| 694 | |
| 01:01:09,040 --> 01:01:13,200 | |
| Um, and I'm on Blue Sky and Mastodon and Twitter and so forth as well. | |
| 695 | |
| 01:01:13,200 --> 01:01:13,600 | |
| Great. | |
| 696 | |
| 01:01:13,600 --> 01:01:16,160 | |
| Under simonw, at simonw, right? | |
| 697 | |
| 01:01:16,160 --> 01:01:23,760 | |
| Uh, yes, or @simonwillison.net on Bluesky, and simon@simonwillison.net on Mastodon. | |