Notes

Minecraft Bluemap Plugins & Managing a Minecraft Server

For the last few years I’ve shared a survival minecraft world with my friend Cam. It’s hosted on WiseHosting and, apart from a bunch of performance mods, the only non-vanilla stuff running on it is Distant Horizons and Bluemap.

While I could save some money by running my own server, I don’t really want to. I play minecraft to get away from my day-to-day, and I also don’t want to be responsible if something goes wrong and ~3 years of building in our world suddenly disappears. Hence I outsource to WiseHosting and avoid messing with the default installation too much.

But last month I found myself wanting three things which didn’t exist:

  1. A Bluemap plugin which shows Chunkbase-style structure markers on our actual world map.
  2. A Bluemap plugin to assist with chunk trimming.
  3. A cheap way to run a persistent shared creative world with Cam, and also to spin up and manage ephemeral minecraft servers for various reasons.

After a few days of vibe coding and learning about the minecraft ecosystem I have all three…

Bluemap Structures

mc-bluemap-structures is a Fabric mod and Bluemap plugin which reads your world seed and adds Chunkbase-like structure markers to your Bluemap maps.

mc-bluemap-structures.png

It works by replicating Minecraft’s algorithms and creating BlueMap markers. You can read about how it works here. It’s currently somewhat limited by the fact that the BlueMap web app doesn’t handle very large numbers of markers performantly, so if you run it with a very large radius BlueMap will get janky. At some point I’ll look into rendering the markers myself as an overlay, which might help.
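For the curious, the core trick for most “spread out” structures (desert pyramids, villages and friends) is that placement in a given region depends only on the world seed and the region coordinates, via Java’s `Random`. Here’s a rough Python sketch of my understanding of that algorithm – the spacing, separation and salt values vary per structure type, and this is not the mod’s actual code:

```python
# Minimal re-implementation of java.util.Random's LCG, which Minecraft's
# legacy structure placement is built on (illustrative sketch only).
MASK_48 = (1 << 48) - 1
MULTIPLIER = 0x5DEECE66D
INCREMENT = 0xB

class JavaRandom:
    def __init__(self, seed):
        self.seed = (seed ^ MULTIPLIER) & MASK_48

    def _next(self, bits):
        self.seed = (self.seed * MULTIPLIER + INCREMENT) & MASK_48
        return self.seed >> (48 - bits)

    def next_int(self, bound):
        if bound & (bound - 1) == 0:          # power of two: take high bits
            return (bound * self._next(31)) >> 31
        while True:                            # rejection loop avoids modulo bias
            bits = self._next(31)
            val = bits % bound
            if bits - val + (bound - 1) <= 0x7FFFFFFF:
                return val

def structure_chunk(world_seed, region_x, region_z,
                    spacing=32, separation=8, salt=14357617):
    """Chunk coords of the single structure attempt in one region.

    Defaults are the desert pyramid values as I understand them; every
    structure type has its own spacing/separation/salt.
    """
    rng = JavaRandom(world_seed
                     + region_x * 341873128712
                     + region_z * 132897987541
                     + salt)
    cx = region_x * spacing + rng.next_int(spacing - separation)
    cz = region_z * spacing + rng.next_int(spacing - separation)
    return cx, cz
```

Each region of spacing×spacing chunks gets exactly one attempt, which is why structure grids look evenly-spread-but-jittered – and why Chunkbase (and this mod) can predict positions from the seed alone, without touching the world files.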

The process of building this was super interesting – I learned a whole bunch about how Minecraft decides where to generate structures, and also how Chunkbase manages to replicate the algorithms with such accuracy. I was surprised to discover that Chunkbase does this entirely on the client-side with a ~4 MB WebAssembly module compiled from Rust.

The project includes a command-line tool which uses Playwright to extract structure positions directly from Chunkbase and save them to a JSON file. I built this as a way to verify my own algorithms against Chunkbase, but it could easily be used to create a much simpler Bluemap plugin which just reads a manually-generated JSON file from Chunkbase and shows markers based on that.

Bluemap Chunk Trimmer

As a minecraft map grows through player exploration, it becomes necessary to occasionally trim away chunks which have been generated but contain nothing valuable. This helps keep the world size (on disk) in check and also prevents players having to travel further and further afield to experience new features which will only appear in newly-generated chunks.

To do this well, you need a detailed map of your explored world and the ability to visually select chunks for deletion/retention. It’s also very helpful to know the InhabitedTime for each chunk – if it’s only a few seconds we can assume the chunk was only generated because someone flew near it and has probably never been visited. If it’s tens of hours we probably don’t want to trim it even if nothing’s been built, because someone’s spent enough time hanging out there that it’d feel weird if it changed.
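InhabitedTime is stored per-chunk in game ticks (20 per second), so turning it into a trim/keep/review decision is trivial. A sketch with my own illustrative thresholds – the plugin may well draw its lines elsewhere:

```python
TICKS_PER_SECOND = 20

def classify_chunk(inhabited_ticks):
    """Triage a chunk by its InhabitedTime NBT value (game ticks).

    Thresholds here are illustrative, not the plugin's.
    """
    seconds = inhabited_ticks / TICKS_PER_SECOND
    if seconds < 60:
        return "trim-candidate"    # likely only generated in passing
    if seconds >= 10 * 3600:
        return "keep"              # heavily inhabited; leave it alone
    return "review"                # needs a human decision
```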

Most of the GUI tools for chunk trimming only work on Windows or on local worlds (or both).

mc-bluemap-chunky-trimming is a Bluemap plugin which does two things:

1. Heatmap

mc-bluemap-chunky-heatmap.png

Reads the InhabitedTime NBT data from .mca region files and renders a heatmap as an overlay in Bluemap’s 2D flat mode. Chunks with less than 1 minute of inhabited time don’t appear on the heatmap at all, and the rest are colored according to how long they’ve been inhabited, with the highest level being 10+ hours.
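Finding the chunks in the first place is mostly region-file bookkeeping: a .mca file starts with a 4 KiB location table of 1024 big-endian entries (a 3-byte sector offset plus a 1-byte sector count), one per chunk. A sketch of that first step – actually decoding the zlib-compressed NBT payload to get at InhabitedTime is left out:

```python
import struct

SECTOR_BYTES = 4096

def generated_chunks(region_bytes):
    """Yield (x, z, byte_offset, sector_count) for each chunk present
    in a .mca region file. x/z are chunk coords within the region (0-31)."""
    for index in range(1024):
        (entry,) = struct.unpack_from(">I", region_bytes, index * 4)
        offset_sectors = entry >> 8       # top 3 bytes
        sector_count = entry & 0xFF       # bottom byte
        if offset_sectors == 0 and sector_count == 0:
            continue                      # chunk was never generated
        x, z = index % 32, index // 32
        yield x, z, offset_sectors * SECTOR_BYTES, sector_count
```

At the given byte offset sits a 4-byte payload length, a 1-byte compression type (2 = zlib) and then the chunk’s NBT data, which is where InhabitedTime lives.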

When the heatmap is on, a little HUD will show the inhabited time of the chunk under the cursor.

2. Chunk Selector

mc-bluemap-chunky-selector.png

With the Chunk Selector toggled on, control-clicking a chunk selects (or deselects) it. Chunks can also be selected by dragging a box, or by “painting” with the mouse.

Selected chunks can be exported as either JSON or MCA Selector-compatible CSV. I intentionally decided against building the actual chunk deletion into the plugin.

mc-infra

While the two plugins above might work for other people, mc-infra is very much for me alone.

It’s some fairly simple tooling for managing minecraft servers deployed to a Hetzner VPS.

1. Manifest System

The main feature is a manifest system which allows me to specify a bunch of stuff in YAML and use that to generate a suitable docker-compose file for itzg’s docker-minecraft-server and mc-router, as well as a bunch of other bits and pieces.

Given a manifest.yml like this:

mod_groups:
  fabric-base:
    - fabric-api
    - lithium
    - ferrite-core
    - c2me-fabric
    - scalablelux
    - noisiumforked

servers:
  mynewworld:
    type: FABRIC
    version: LATEST
    mode: creative
    tier: permanent
    seed: '493527618652710797'
    mod_groups: [fabric-base]
    modrinth_mods: [bluemap, distanthorizons, simple-voice-chat]
    svc: true
    pregen:
      radius: 1500
    backup:
      interval: 24h
      keep: 3
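Under the hood this is just templating: each server entry becomes a docker-compose service for itzg/docker-minecraft-server, driven by its environment variables. A minimal sketch of the idea – mc-infra’s real generator also handles tiers, mod groups, mc-router, backups and Bluemap, and I’m assuming nothing beyond itzg’s documented basics (EULA, TYPE, VERSION, MODE, SEED, MODRINTH_PROJECTS):

```python
def compose_service(name, spec):
    """Turn one manifest server entry (as a dict) into a docker-compose
    service definition for itzg/docker-minecraft-server. Illustrative
    sketch, not mc-infra's actual generator."""
    env = {
        "EULA": "TRUE",
        "TYPE": spec["type"],
        "VERSION": spec["version"],
        "MODE": spec["mode"],
    }
    if "seed" in spec:
        env["SEED"] = spec["seed"]
    if spec.get("modrinth_mods"):
        # itzg's image installs Modrinth projects from a comma-separated list
        env["MODRINTH_PROJECTS"] = ",".join(spec["modrinth_mods"])
    return {
        name: {
            "image": "itzg/minecraft-server",
            "environment": env,
            "volumes": [f"./data/{name}:/data"],
            "restart": "unless-stopped",
        }
    }
```

Serialise the merged result of all server entries to YAML and you have a docker-compose file; mc-router then only needs a hostname→container mapping to route mynewworld.mc.danny.is to the right server.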

I’ll end up with a properly-configured Fabric server called mynewworld with:

  • Suitable memory, disk usage, CPU limits etc. for a “permanent”-tier world.
  • My standard Fabric mods installed, plus Bluemap, DH and Simple Voice Chat.
  • Proper config and port-forwarding to support Simple Voice Chat.
  • The Chunky mod installed and configured properly to pregen chunks to a 1500-block radius.
  • 24h backups configured via itzg/docker-mc-backup.
  • Proper configuration so the server is available at mynewworld.mc.danny.is.
  • Because Bluemap is included:
    • Proper configuration & setup of Bluemap
    • Nginx configured to serve Bluemap at map-mynewworld.mc.danny.is

2. Control Scripts

A bunch of executable shell scripts are available on the PATH for managing the minecraft servers. Some of them are just very thin wrappers around docker commands. Others (like mc-nether-roof) are more complex.

Together, these scripts give me an interface for working with the minecraft servers on the box once I’ve ssh’d in.

3. Setup Scripts & Dev Tooling

The Hetzner box itself is configured with the tooling I want to work on it, including the developer tools I need to work on mods directly on the box. Scripts like setup.sh make it a little easier to recreate this whole thing on a fresh VPS if I ever need to.

4. Monitoring Web App

A simple Hono app provides a web interface for checking the status of the running minecraft servers, and includes some nice minecraft-specific stuff.

mc-infra-dashboard.png

mc-infra-worlddata.png

Wrapping Up

This whole thing was a super-interesting side-quest, and I learned a bunch of stuff despite most of the code here being written by an LLM. I don’t expect I’ll maintain these repos beyond keeping them all working for my own needs.


Tracking Biggles Books

Thirty-odd years ago I borrowed some Biggles audiobooks (on cassette tape) from my local library and enjoyed them so much I asked the librarian if they had any more. Turned out that while they had a few modern editions on the shelves, they had a whole collection of much older ones downstairs. And so for a year or two in the late ’90s I’d pop in to Eastbourne library every week or two: the lady always had 2-3 ready for me, and took the time to ring round other libraries asking them to dust off any they had and send them to Eastbourne.

Some time in 2000 I decided I wanted to read every Biggles book written and after digging about on a computer in the school library I eventually found a complete list on biggles.nl, which is a wonderful example of the kind of weird, niche hand-rolled website which first got me interested in the internet. (I’m happy to see that the International Biggles Association is still going in 2026!)


I never did manage to read them all, but sometime in my early 20s I started to pick up old copies if I saw them in second-hand bookshops and accidentally became a collector of Biggles books. I now have 64 of the 98 published, and someday hope to own a first edition of all 98.

Managing my collection

During the COVID pandemic I accidentally spent fifty quid on eBay buying Biggles books I already owned, so I set up an Airtable database to track my current collection and provide an easy list of those I’m still looking for. This was partly for me, and partly a reference for friends who spot a Biggles book in a charity shop and message me.

Last week I realised my Biggles database is the only thing still on Airtable. So I grabbed a CSV export and had Claude Code create some stuff in Obsidian for me to keep notes, and a tiny interactive website at https://biggles.danny.is/.

The website is just three files:

  • canonical.json – All published books ordered chronologically with the title and publication date, and labelled as Pre-WW1, WW1, Interwar, WW2 or Post War.
  • collection.json – All the copies I own with the metadata I want to track.
  • index.html – Reads the JSON files and renders the site.

All three files are statically served by GitHub Pages.
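I haven’t published a schema, but based on the description above an entry in canonical.json presumably looks something like this (field names and values are my guesses, purely illustrative):

```json
{
  "title": "Biggles Learns to Fly",
  "published": 1935,
  "era": "WW1"
}
```

index.html can then join each canonical entry against collection.json to mark every book as owned or still wanted.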

Reflections

  1. This kind of project simply wouldn’t have happened without Claude Code. Sure, I could have done this without AI – but I just wouldn’t.
  2. Sites like biggles.nl and biggles.info are wonderful examples of the old-school “open web”. They have existed largely unchanged for decades and depend only on someone keeping their servers alive. And they’re still useful.
  3. Libraries are awesome.

Shoutout to the folks who worked at Eastbourne Library in the late ’90s, and to the folks who maintain biggles.nl and biggles.info. And if you’re reading this and have any books on my wanted list, please drop me an email!


Code has always been the easy part | Quoting Kellan Elliott-McCrea

We’ve always had this tension. We’ve always fetishized the act of writing code, the quality of the code, the code as the primary artifact and IP. And on the other hand successful teams have always known that the value is the system, the value is human-technology hybrid that allows a product to be delivered, meet customer needs, evolve to provide more value over time, meet the spoken and unspoken needs of the problem domain, etc. This confusion in our thinking has laid at the heart of why, for example, technical hiring was such a disaster for so long. (Hiring continues to be terrible but it’s actually much better than it used to be. We now make fun of teams that ask you to reverse a linked list on a whiteboard while evaluating if they’d like to have a beer with you. That used to be the norm)

Said another way, we’ve known for a long time that code is the easy part. Has arguably always been the easy part, but certainly has been the easiest part of building software for the last several decades.


Quoting Jasmine Sun on My Claude Code Psychosis

I had Claude resurface texts I forgot to respond to, and realized that the real blocker—obviously—was that I didn’t want to reply. I’ve tried countless apps to shut out distractions, but procrastinate just as well by staring blankly at walls. What’s actually tough about my job is coming up with novel frames for important ideas and devising sentences that are equal parts sharp, lively, and true. You can have the best Deep Research reports in the world, and still lack a unique point of view.

I’m not the only one having this issue. Just because Claude Code can be wielded by a nontechnical person does not mean it’ll be a big productivity boon. Sentence generation is a software problem, but insight is not. Sending reminders is a software problem, but motivation is not. When you have a hammer, everything looks like a nail; when you can conjure solutions at will, you won’t stop to ponder why you built them.


Claude Code hallucinating like it's 2024

I’ve just been using Claude Code and my task management skill to work through my current life areas and projects and help me define next actions. I’ve had enough time with Opus 4.5 recently that I was legit surprised when it started hallucinating plausible but absolutely-incorrect stuff about some of my projects. Not least because it lurched so suddenly from what I’m used to into GPT-3.5-levels of batshit hallucination.

Its explanation of why this happened is interesting.

Brief Context

I manage my stuff with a GTD/PARA-esque model: Areas, Projects and Tasks. Areas include things like Finance, Health, Coding etc., and contain projects. Projects can contain tasks. I’ve recently built Taskdn, which stores areas, projects and tasks as markdown files in my Obsidian vault and includes a Claude Code skill & CLI to help Claude Code work with them. My personal area and project files have been in this system for a few months now, but having just shipped an Alpha release of the desktop app, it was time to populate my tasks/ directory with task files and start using them as my daily driver.

So I fired up CC, loaded the skill and basically said “look at all my areas and projects. Let’s define next actions for them all and create the appropriate task files. Some projects have clear checklists in the project docs, others we’ll need to discuss. Let’s do them one by one. What order would you suggest?”

Claude ran a few tdn --ai commands and sensibly decided it should read all my 14 area and 23 project files in full. It gave a very good summary of the current situation, sensibly suggested we skip a few projects and proposed an order of attack. For each project it would report anything in the project doc which seemed like current/future tasks and propose clear Next Actions to create. If it was unsure, it would ask me for more info. Once we’d agreed on the tasks to create it would do so and edit the project doc accordingly.

Which worked perfectly for the first three projects on the list:

  1. Tax Return YE April 2025
  2. End-of-Year Finance Reset (renamed to Jan Finance Reset + created new Credit File and History project)
  3. [REDACTED]

And then we got to the next three – all of which sit under my RAFAC area and have to do with my voluntary work with the RAF Air Cadets:

RAFAC Re-qualification
Currently in project doc:
eLearning modules:
- Safeguarding (yearly), Data Protection (yearly), Prevent, Fire Safety, GDPR, Cyber Security, AMS 5 & 6
Courses:
- DCCT (in-progress, have place)
- Staff Ride (Feb 2026)
- M Qual & LR (need applications)
Admin:
- ID photos, Bader photo, DBS update, First Aid cert, SMS competencies

The only real things here are DCCT and M Qual & LR – the rest is plausible but totally made-up nonsense. (Fuck knows what Feb’s Staff Ride is but… it sounds fun? Maybe I’m joining the Cavalry!)

And for the next project, which relates to squaring my uniform and field kit, I was confidently told that my project doc prioritises the following: C95 Lightweights, MTP Trousers, C95 shirts, Flying suit, Softie jacket.

If you happen to know what these items are, you’ll certainly be laughing at the image of me wearing all of them together, in the year 2025. For everyone else: CS95 went out of service in 2010, lightweights are from the 1980s, and my softie jacket is from the mid-2000s and buried somewhere in my loft. I do not, and never will, have need of a flying suit.

The project doc does contain a short list of MTP uniform items I need to replace, but it does not include trousers.

I’d maybe expect this kind of hallucination from contextless ChatGPT, especially with something as niche as this. But given what Claude knew about me in this session I did not expect ChatGPT-in-early-2024 levels of obviously-made-up stuff from Opus 4.5. To be clear, the most surprising things were:

  1. It’s so obviously wrong. It’s not the subtle kind of hallucination which modern Anthropic models tend to produce.
  2. Somewhat uncharacteristically, Claude didn’t spot its own nonsense. There was no “wait, I might have made that up, lemme check”, and no “let me clarify” questions.
  3. It was so sudden. Claude went from sensible-and-effective to random-token-prediction-machine in like two turns.

So what’s going on?

Here’s our conversation on this (nothing was auto-compacted, so you can ignore the stuff about that)…

Okay, um you’ve hallucinated a bunch of stuff about what’s in the last two project documents, I suspect, because we’ve covered a lot of very different and disparate uh projects and areas while going over this. And you haven’t actually read the project documents in a while, so you’re just making up stuff you think will be in them

I’d like you to try to explain why you have just for the last couple of projects (so 4, 5 and 6) suddenly started to hallucinate credible but very incorrect information about what’s in the doc. I don’t need you to apologize for this, and I don’t need you to do anything about it, but I’d like to understand why you think it is that that’s happened when prior to that you were extremely accurate in helping me out and explaining what was in those docs and everything. Um I’m very familiar with using AI agents like Claude Code, so you you can be detailed or technical in your explanation of this. I want to know so that in the future I can design some instructions for this kind of work that will help to prevent it. I don’t need you to write those instructions for me now. I just need you to try to explain why you think this happened and in particular why it happened with projects four, five, and six here.

## Why the Hallucination Happened
The root cause is **stale context from early reads combined with pattern-matching fabrication**.
At the start of this session, I read all the project files in a batch during initial context gathering. That was ~50+ turns ago. As the conversation progressed:
1. **Context compression**: Those early file reads got pushed further back in my context window. The model may have summarized or partially dropped details to make room for new content (task creation outputs, your detailed explanations, etc.).
2. **No re-read before presenting**: For projects 1-3, I happened to have fresher memory or you provided enough detail that my errors were corrected. By projects 4-6, I was confidently presenting "what's in the doc" without actually re-reading it — I was reconstructing from degraded memory plus plausible-sounding details based on project titles.
3. **Pattern matching from similar content**: I'd read many RAFAC-related files, kit lists, workshop docs. My outputs for projects 4-6 were plausible *patterns* of what such docs typically contain — but not what *your* docs actually said.
4. **Confidence without verification**: The format "Currently in project doc:" followed by a code block looked authoritative, but I wasn't actually quoting the file — I was generating what I *expected* to find.
**The fix is simple**: Always run `Read` on the project file immediately before discussing its contents, even if I "remember" reading it earlier. Long conversations with many file operations require fresh reads, not memory.

Got it, that’s interesting. Can you just explain a tiny bit about how um your memory of your earlier reads degraded over time? Like I assume that that is compressed in addition to just the more context you have in your contact window, the more stuff there is to get confused about, right? Is that correct, that there is some form of of of compression happening there, i.e. summaries for for much older information. I’m also particularly interested that this seemed to happen with RAFAC-related stuff. Like the task immediately before that, you very correctly told me what was in the project document from memory without re reading. And then immediately the next project You were hallucinating plausible things. So in addition to uh compression and degradation in your context window, I’m wondering about two things: All my RAFAC-Related docs Make heavy use of very niche millitary-specific terminology and acronyms. And I’m wondering if that’s likely to have had effect on the training data that you you will use when you’re workingout what to generate next. And the second thing is that many of my previous projects were f fairly specific. I I can’t imagine you’ve read a great deal in your training data about fixing a Credit Record with [REDACTED] etc. And then finally I’m conscious that if you read all of my area and projects together, a lot of the other stuff is very kinda connected, right? Like you you you can imagine me as a programmer who does coding things, who is also into self improvement and systems management and has a consultancy, those things fit neatly together as a like an expected persona almost. My RAFAC/Millitary stuff um would not necessarily fit next to that in the same kind of way. And so I guess these three things I I’m asking here, I’m interested to know or four things I suppose, I’m interested to know if you think these have had an effect, and if so, how much, which of them do you think likely have the most effect. 
I’d really just like to explore this to help understand how you work herea little bit more.

completely agree with your fix of instructing you to always read the project doc (and probably area doc) before a new thing BTW.

Well what are your thoughts on all this?

This is a great set of questions. Let me work through each factor:
**On compression/degradation:**
There are two mechanisms at play:
1. **Attention diffusion** — As context grows, my attention over earlier tokens becomes more spread out. I don't literally "forget" earlier content, but my ability to precisely retrieve specific details weakens. It's less like deleting a file and more like trying to read a book where earlier pages are progressively out of focus.
2. **Autocompact summarization** — Claude Code specifically has an infrastructure-level feature that summarizes older conversation when context gets long. Your `/context` output showed "Autocompact buffer: 45.0k tokens" — this means there's a mechanism that can literally rewrite earlier exchanges into compressed summaries. If my file reads got autocompacted, the specific line-by-line content would be replaced with something like "read several RAFAC project files containing training requirements and kit lists." That's enough to generate plausible content, but not accurate content.
**On RAFAC terminology:**
I think this is a significant factor. When I encounter dense domain-specific acronyms (DCCT, MT600, SMS competencies, Bader Learn, DLE, CFAV, ODI), several things happen:
- These tokens are rarer in my training data, so my associations are weaker and more generic
- I pattern-match to "military/organizational training documentation" as a category rather than your specific situation
- When reconstructing, I generate what military training docs *typically* contain rather than what *yours* actually say
Contrast this with "Credit File and History" — that's unusual *content* but uses common *vocabulary*. I can represent "contact [REDACTED] at [REDACTED] about credit file" precisely because those are all common tokens with clear meanings. But "complete CFAV SPO Refresher on DLE" is a string of rare tokens where I'm essentially guessing at the relationships.
**On persona coherence:**
I think you've identified something real here. My training data contains strong clusters of co-occurring personas:
- Programmer + consultant + productivity systems + side projects + self-improvement
- These fit together as a recognizable "type" with predictable project structures
Your RAFAC involvement doesn't fit that cluster. It's a different persona — military volunteer instructor — that would typically appear alongside different life contexts in my training data.
When I'm reconstructing content from degraded memory, I'm essentially asking "what would plausibly be in this document?" My answer draws on:
1. What I vaguely remember reading (degraded)
2. What documents like this typically contain (training data patterns)
3. What fits this person's overall context (persona coherence)
For your coding/finance/productivity projects, (2) and (3) align well with your actual situation. For RAFAC, they don't — so my reconstructions pulled from "generic military training documents" rather than "Danny's specific RAFAC situation."
**Which factors matter most:**
Ranked by impact:
1. **Not re-reading** — Proximate cause. Everything else is downstream of this.
2. **Autocompact summarization** — If the system compressed "Read RAFAC Re-qualification.md containing [specific content]" into "read several RAFAC files," I've literally lost the ground truth.
3. **Niche terminology** — Weak token associations mean my reconstructions are more "vibes-based" and less precise.
4. **Persona mismatch** — My priors for "what this person's projects contain" are calibrated to the programmer-consultant cluster, not the military-instructor cluster.
The interesting thing is that for project 3 ([REDACTED]), I was still reasonably accurate. That project sits firmly in the productivity/self-improvement cluster where my pattern-matching works well. RAFAC was where the wheels came off — exactly where you'd predict based on this analysis.
**Practical implication:**
For future instructions, you might add something like: "For any domain with specialized terminology or that sits outside typical tech/productivity contexts, always re-read source files immediately before discussing them — don't rely on earlier reads regardless of how recent they seem."

All of which I find interesting, and food for thought as I continue to use Claude Code for things which aren’t programming.

