
Human Insight + LLM Grunt Work = Creative Publishing Solution

A hybrid Google Docs / Markdown workflow, with the help of LLMs, can speed production and unify the review/revise cycle.
Jun 25th, 2024 7:03am by

If you publish product documentation to a website, this might be a familiar pattern. You write an initial draft in Google Docs so your team can review and propose changes. It’s a rich authoring environment in which you can insert images by pasting bits copied from screens, and easily create and rearrange tables. When the team reaches a consensus, you have a nice representation of the web page you’d like to appear on your site.

But how do you make that happen? In this post, we’ll look at a surprisingly simple method that’s proven effective. In our case, it’s tuned for a workflow that pushes GitHub repositories into Next.js-based sites hosted by Vercel, so our Markdown text and associated images are copied into those repos. But the method will work for any Markdown-oriented publishing system.


The naive solution is, of course, to just export the Google Doc as HTML. That gets you most of the way there, but the last mile is a slippery slope. Google Docs won’t let you create custom styles to align elements with their counterparts in your published web page. And the images are sourced from googleusercontent, which is convenient (they seem to just magically work), but you probably want those images stored as named files in your publishing system.

So what I’ve seen happen, in several different environments, is manual transfer and reformatting that becomes a tax on the collaborative benefit of Google Docs.

I’d taken a few runs at this problem in the pre-LLM era, so for starters I revisited what my new team of assistants might bring to the party. Feeding an HTML export from Google Docs into Python’s Markdownify got us tantalizingly close, but that last mile really is a slippery slide into the weeds of document conversion.

Then it occurred to me: people are already writing triple backticks for code blocks, single backticks for inline code, ** and * for bold and italic, and Markdown-style links and lists. Why not use that Markdown syntax in Google Docs and pass it through to publishable Markdown? That turned out to be a really fruitful idea that was easy to implement.
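To make the pass-through idea concrete, here is a minimal sketch using only Python's standard library as a stand-in for Markdownify. The class name `PassThroughText`, the function `html_to_passthrough_markdown`, and the sample HTML are all hypothetical, and a real Google Docs export is much noisier. The point is only that Markdown typed into the doc arrives in the export as ordinary text, so stripping the tags leaves it intact.

```python
from html.parser import HTMLParser


class PassThroughText(HTMLParser):
    """Collect the text of each <p>, leaving Markdown typed in the doc untouched."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &lt; back into characters.
        super().__init__(convert_charrefs=True)
        self.paragraphs = []  # finished paragraphs
        self._buffer = None   # pieces of the current paragraph, or None outside <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            self.paragraphs.append("".join(self._buffer).strip())
            self._buffer = None


def html_to_passthrough_markdown(html: str) -> str:
    """Drop the styling markup and join paragraphs with blank lines."""
    parser = PassThroughText()
    parser.feed(html)
    parser.close()
    return "\n\n".join(p for p in parser.paragraphs if p)


# Hypothetical fragment in the style of an exported doc.
exported = (
    '<p><span style="font-weight: 400">Run </span><span>`npm install`</span>'
    '<span>, then read the **release notes** at [docs](https://example.com/docs).</span></p>'
    '<p><span>Use &lt;Tab&gt; to indent.</span></p>'
)
print(html_to_passthrough_markdown(exported))
```

The backticks, asterisks, and link syntax come out exactly as typed. A full converter such as Markdownify does more (tables, lists, images), so it is worth checking that it does not escape the Markdown characters you typed on purpose.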

Progressive Pattern Simplification

Here’s the syntax I’m using to identify images in Google Docs.

[image: my_image_1 70%]

Here’s how that looks in the exported HTML.

[image: my_image_1 70%]<br></span><span style="overflow: hidden; display: inline-block; margin: 0.00px 0.00px; border: 0.00px solid #000000; transform: rotate(0.00rad) translateZ(0px); -webkit-transform: rotate(0.00rad) translateZ(0px); width: 620.50px; height: 648.64px;"><img alt="" src="https://lh7-us.googleusercontent.com/FcQFpX3FcEHFx0WmbZlMSKryKxd9hnkdtvSwnlS-v5Y92x6VKfYFhb_b8yKlrx6q8j2gARKB5AGfQRJlMVpz4JbTzdQEg0GiHnreS7U0bD-XehoZT6S_DydXtSgpnlPutG8pso1XBZChKvl0" style="width: 620.50px; height: 648.64px; margin-left: 0.00px; margin-top: 0.00px; transform: rotate(0.00rad) translateZ(0px); -webkit-transform: rotate(0.00rad) translateZ(0px);" title="">

Here’s the Markdownified version of it.

[image: my_image_1 70%] \n \n![](https://lh7-us.googleusercontent.com/FcQFpX3FcEHFx0WmbZlMSKryKxd9hnkdtvSwnlS-v5Y92x6VKfYFhb_b8yKlrx6q8j2gARKB5AGfQRJlMVpz4JbTzdQEg0GiHnreS7U0bD-XehoZT6S_DydXtSgpnlPutG8pso1XBZChKvl0)

Now two things need to happen: the image needs to be downloaded to a file with the indicated name, and the whole string needs to be replaced with this element.

<p><img alt="my_image_1" style={{"width":"70%"}} src="/images/docs/my_image_1.png"/></p>

Since the site is fed by Markdown, why not a Markdown-style link? You can’t specify the optional width attribute in Markdown, so HTML syntax is necessary. (Why the double braces? There’s another level here: for attributes, the publishing system requires JSX syntax.)

Here are the functions that do the work.
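As a rough illustration of the two steps (download the image under the indicated name, then replace the whole string), here is a minimal sketch that uses only Python's standard library. It is not the production code: the names `IMAGE_PATTERN`, `image_element`, `download_image`, and `replace_images` are hypothetical, the `.png` extension and `/images/docs` path are taken from the example above, and the pattern assumes the marker and the Markdown image are separated only by whitespace.

```python
import re
import urllib.request
from pathlib import Path
from typing import Optional

# Long-form (re.VERBOSE) pattern for a marker followed by a Markdown image,
# e.g. "[image: my_image_1 70%]", whitespace, then "![](https://...)".
IMAGE_PATTERN = re.compile(
    r"""
    \[image:\s*               # start of the marker typed in the doc
    (?P<name>[\w-]+)          # image name, reused as alt text and filename
    (?:\s+(?P<width>\d+%))?   # optional width such as 70%
    \s*\]                     # end of the marker
    \s*                       # spaces and newlines before the image
    !\[[^\]]*\]               # Markdown image with (usually empty) alt text
    \((?P<url>[^)\s]+)\)      # URL of the hosted image
    """,
    re.VERBOSE,
)


def image_element(name: str, width: Optional[str], image_dir: str = "/images/docs") -> str:
    """Build the <img> element; the double braces are the JSX style attribute."""
    style = f' style={{{{"width":"{width}"}}}}' if width else ""
    return f'<p><img alt="{name}"{style} src="{image_dir}/{name}.png"/></p>'


def download_image(url: str, dest: Path) -> None:
    """Save the image at url to dest (needs network access)."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response:
        dest.write_bytes(response.read())


def replace_images(markdown: str, download_dir: Optional[Path] = None) -> str:
    """Replace each marker + image pair; download only if a directory is given."""

    def _swap(match: re.Match) -> str:
        name, width, url = match["name"], match["width"], match["url"]
        if download_dir is not None:
            download_image(url, download_dir / f"{name}.png")
        return image_element(name, width)

    return IMAGE_PATTERN.sub(_swap, markdown)


# Placeholder URL; no download happens because download_dir is omitted.
sample = "[image: my_image_1 70%] \n \n![](https://example.com/abc123)"
print(replace_images(sample))
# <p><img alt="my_image_1" style={{"width":"70%"}} src="/images/docs/my_image_1.png"/></p>
```

Keeping the download optional makes the text replacement easy to test on its own, without touching the network.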

My assistants helped with the coding, as usual, but they didn’t come up with the idea, nor would I have expected them to. While large language models (LLMs) can certainly be useful rubber ducks, I think that for now this kind of creative and non-obvious solution will require human insight. Maybe chain-of-thought prompting could have gotten me there, and if so I’d be delighted to learn how.

But meanwhile, I’m just grateful for the coding help. In this case, that included writing the regular expression, testing it, and then revising it to use long form with comments. I’m better at having an idea like this than hammering out the details, so I appreciate the help.

Please, ChatGPT, Don’t Write Code Unless I Ask!


That’s a sentence I never imagined I’d write. But as I worked through this exercise, I became increasingly frustrated by both ChatGPT’s and Claude’s eagerness to dive straight into coding. I asked both to refrain so that we could first discuss strategy. Claude responded pretty well to that request, but ChatGPT was stubbornly determined to write code.

When I mentioned this on Mastodon, Josh Kellendonk helpfully replied.

“I added instructions to my user base prompt not to generate code unless explicitly asked. ChatGPT has been much better behaved since.”

It was news to me that you can do that by way of the “Customize ChatGPT” link on your profile dropdown. And that does seem to have helped. But wow, ChatGPT really wants to code, and I’m finding it’s still a struggle to keep it focused on strategy.

Frictionless Screenshots

Screenshots are fundamental to software documentation. As developers, we use words and pictures of screens together to explain the various states that applications can be in. If the pictures are costly to create and maintain, we’ll use fewer of them than we ideally would. It’s easy enough to grab a region from a screen, and maybe tweak it in an image editor.

But there’s a nontrivial cost for moving that image into the publishing system. You have to name it in the image editor, save it to the right place, and then insert a reference to it in the page where it will display. And when the application’s UX changes, you have to redo those steps.

Reducing that overhead has been a major benefit of this approach. Just capture bits from a screen, paste them into a Google Doc, write a descriptive name (which also serves as alt text), and that’s it. You’re done in seconds and, if you need to revise, that’s just as quick.

I can envision a knock-on benefit too. In “How to Learn Unfamiliar Software Tools with ChatGPT,” we saw how it’s now useful to upload pictures of application states. As the distinction between text and pictures of text fades away, pictures become rich prompts that can augment (or even supersede) verbal descriptions.

I first experienced this powerful mode when learning how to plot an equation in an unfamiliar tool, GeoGebra, by showing ChatGPT screenshots of failed efforts. That was effective, I think, because GeoGebra is a popular tool that’s well-represented in the corpus of documents that LLMs feed on.


As noted in “How AI Can Help Improve Our Documentation,” we’re making effective use of Unblocked, which has ingested our documentation and can answer questions with help from that corpus. It doesn’t yet work with images but, if it gains that capability — and/or as the docs make their way into the public LLMs — screenshots will become a powerful alternate way to ask questions like: How did I land on this screen? What went wrong? Where do I go next? The more application states pictorially represented in the docs, the likelier it will be that a user’s captured screenshot will be an effective prompt.

If things turn out that way, it’ll be very different to what I once envisioned. It’s always bugged me that we have to write docs that say things like: “Click the Save button (at the top right of the screen).” My notion was always that the web platform affords a better solution: links. If an application’s URL surface area were sufficiently rich, you could just cite the link that results from following that instruction.

But there’s a practical limit to how much application state you can represent in a URL. And ultimately the state that matters is the one that’s rendered to the screen. So I think screenshots will play an even more important role than they already do. And less publishing friction means we can have more of them.

Integrated Review and Revision

Although streamlined publishing of screenshots is nice, the biggest win comes from reviewing and revising in Google Docs which, for better or worse, has become the de facto collaboration standard for many of us. Vercel does support comments on drafts, but you can’t just click “Accept” to incorporate a suggestion into a published doc. And it’s hard to keep track of comments in two places.

Accepting suggestions in Google Docs, and thereby automatically updating the published web page, is much cleaner and simpler. Requested changes can be a lot easier to make too. I’d rather add a column to a table in a Google Doc than wrangle Markdown in a text editor!

Of course, Google Docs has its share of bugs. One that’s bitten me on this project: assigning H-level headings in close proximity to normal text. It can be fussy to separate the two, and while it’s nice to see styled headings in a Google Doc, I might fall back to using Markdown headings. But that’s a minor nit, because on the whole this hybrid method has been a big win.
