[{"content":"While prepping for interviews, I spent some time really digging into how RocksDB works \u2014 how its storage engine is designed, how data gets written, and how it gets read back. RocksDB (and the LSM-tree underneath it) is one of those things a lot of people have heard of but can&rsquo;t quite explain \u2014 I couldn&rsquo;t either, before I sat down with it. Once it clicked, I wrote up the core ideas as these notes, to share with anyone else trying to get it.\nI won&rsquo;t claim this is exhaustive or deeply expert, but I hope it leaves you (and future me) with a clear overall picture of how RocksDB actually turns.\nWhat RocksDB is In one line: an embeddable, persistent key-value store.\nEmbeddable: it isn&rsquo;t a standalone server like MySQL \u2014 it&rsquo;s a library you compile directly into your program, which cuts out inter-process communication overhead. Persistent: data lives on disk; nothing is lost on a crash. Forked from Google&rsquo;s LevelDB in 2012, written in C++, optimized specifically for SSDs and write-heavy workloads. Meta, Microsoft, Netflix, and Uber all use it. It is not distributed \u2014 replication and sharding are your job at a higher layer. The operations it exposes are humble: put(key, value) to write, get(key) to read, delete(key) to remove, merge(key, value) to combine, and iterator.seek() for range scans.\nThe core idea: the LSM-tree Everything in RocksDB is built on the LSM-tree (Log-Structured Merge-Tree).\nThe core tension it tackles: disks hate random writes and love sequential ones. The LSM-tree&rsquo;s trick is to buffer writes in memory, keep them sorted, then flush them to disk sequentially all at once. 
In other words, it batches a flood of random writes into sequential writes \u2014 and that&rsquo;s the fundamental reason it writes so fast.\nStructurally, data is split across many levels: the top level lives in memory, and below it sit level after level on disk, numbered L0, L1, L2\u2026 The deeper you go, the older and larger the data (each level is typically ~10\u00d7 the one above it).\nmemory \u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510 \u2502 MemTable (writable, sorted) \u2502 \u2190 new data lands here first \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518 - - - - - - - - - - - - - - - - - - - - - - flush disk L0 [SST] [SST] [SST] \u2190 newest; key ranges may overlap across files L1 [SST][SST][SST][SST] \u2190 no overlap within a level, and bigger L2 [SST][SST] ...... \u2190 older and larger the deeper you go (~\u00d710) ... This structure dates back to 1996 and was designed for write-intensive workloads. Besides RocksDB, Bigtable, HBase, and Cassandra are LSM-tree based; WiredTiger, the engine behind MongoDB, also supports LSM trees, though it defaults to B-trees.\nWriting: how data gets in A single write lands in two places at once:\nput(key, value) \u2502 \u251c\u2500\u2500\u25ba WAL (appended sequentially to disk, for crash safety) \u2502 \u2514\u2500\u2500\u25ba MemTable (kept sorted in memory) \u2502 fills up at ~64MB \u25bc turns read-only; a background thread flushes it to one SST file \u2192 L0 MemTable: the in-memory write buffer where every insert, update, and delete goes first. It&rsquo;s kept sorted by key internally (the default implementation is a skip list), which is what makes the later flush and range queries efficient. 
One detail: a delete doesn&rsquo;t actually erase anything \u2014 it writes a tombstone record meaning &ldquo;this key is deleted.&rdquo; The real cleanup is left to compaction later.\nWAL (Write-Ahead Log): the MemTable is in memory, so a power loss would wipe it. So every write also appends a record to a WAL file on disk \u2014 key, value, operation type, and a checksum. After a crash, RocksDB replays the WAL to reconstruct the MemTable. Note the WAL is appended in write order, not sorted \u2014 it&rsquo;s optimizing purely for speed.\nFlush: once a MemTable fills up, it turns read-only and a fresh one takes over; a background thread then flushes the read-only MemTable into a single SST file on L0. Once that&rsquo;s done, the corresponding WAL can be discarded. Because the MemTable was already sorted, this flush is one sequential write \u2014 which is the whole point of the LSM-tree.\nWhat an SST file looks like An SST (Static Sorted Table) is the file that actually holds data on disk, and it&rsquo;s never modified once written. Inside is a pile of sorted key-value pairs, laid out in a carefully designed block format (blocks default to 4KB and can be compressed with Snappy, LZ4, ZSTD, etc.).\nAn SST is roughly split into a few sections:\nData blocks: the sorted key-value pairs. Since adjacent keys are similar, only the differences need to be stored (delta encoding) to save space. Index: records, for each data block, &ldquo;last key \u2192 offset in the file,&rdquo; so a lookup can binary-search straight to the right block instead of scanning the whole file. Bloom filter (optional): a probabilistic structure that very quickly answers &ldquo;this key is definitely not in this file.&rdquo; It may give a false &ldquo;yes,&rdquo; but never a false &ldquo;no&rdquo; \u2014 perfect for skipping, on a read, a whole batch of files you don&rsquo;t need to touch. 
Reading: how data gets found To read a key, you search newest to oldest, level by level \u2014 newer values sit higher, older ones lower, so the first hit is the latest value:\nCheck the active MemTable; Then the read-only MemTables not yet flushed; Then each SST file in L0 (L0 files can overlap in key range, so you have to check them one by one, newest to oldest); From L1 down, each level has non-overlapping key ranges, so you only need to locate and check one file per level. And within a single SST file, it&rsquo;s again three steps: first ask the Bloom filter whether the key is present \u2014 if not, skip the file entirely; if so, use the index to binary-search to the right data block; finally read that block and find the key inside it.\nSo the cost of a read comes down to how many levels and files you have to wade through \u2014 which leads straight into the next section.\nCompaction: the background cleanup that never stops As noted, a delete just writes a tombstone, and an update just writes a new value on top of the old one. Over time, the disk fills up with stale old versions and tombstones: they waste space and force reads to wade through more files.\nCompaction is the background job that cleans this up: it takes some SST files from one level, merges them with the overlapping files in the next level, throws away the shadowed old values and deleted keys, and writes fresh, clean SSTs into the lower level. Since every file is already sorted, the merge uses a k-way merge \u2014 a scaled-up version of the &ldquo;merge&rdquo; step in merge sort. It all runs on background threads, so it doesn&rsquo;t block foreground reads and writes.\nRocksDB defaults to leveled compaction:\nL0 is special: its files may overlap in key range (since they&rsquo;re flushed straight from MemTables); compaction triggers once the L0 file count hits a threshold (4 by default). 
L1 and below: within each level, all files have non-overlapping key ranges and are globally ordered; when a level&rsquo;s total size exceeds its target, the excess is merged down into the next level \u2014 sometimes cascading down several levels in a chain. It&rsquo;s all trade-offs: the three amplifications The key to understanding RocksDB tuning (really, all LSM engines) is three amplification factors:\nSpace amplification: disk space actually used \u00f7 size of the logical data. The more stale versions and tombstones pile up, the higher it gets. Read amplification: how many I\/O operations a single logical read actually performs. The more levels and files to wade through, the higher it gets. Write amplification: how many times a single logical write is actually written to disk. The same piece of data gets rewritten to lower levels over and over during compaction, so this can get large. These three are a game of whack-a-mole: the more aggressively you compact, the smaller your space and read amplification, but the larger your write amplification \u2014 and vice versa. The right balance depends entirely on your workload, and the knobs are many and interdependent. Even the RocksDB authors admit it&rsquo;s hard to pin down the exact effect of each parameter, and recommend benchmarking a lot while keeping an eye on those three amplification factors.\nAn aside: the merge operation\nBesides put and delete, RocksDB has merge. When you need to apply lots of incremental updates to a value (say, repeatedly incrementing a counter or appending to a list), the traditional approach is read-modify-write: read it out, change it, write it back \u2014 clunky. merge lets you write just the increment and hands off the combining to a merge function you define, computing the final value only at read or compaction time. 
The upside is lower write amplification, plus it&rsquo;s thread-safe; the cost is that reads get more expensive \u2014 until the increments are consolidated, every read has to recompute them.\nThe bits worth remembering If I keep just one mental map, it&rsquo;s this:\nRocksDB = an embeddable, persistent KV store, descended from LevelDB, built on the LSM-tree; Writes: into the in-memory MemTable (sorted) + a sequential WAL (crash safety) \u2192 once full, flushed to an SST file on L0 \u2192 compaction slowly tidies things downward in the background; Reads: search newest to oldest, level by level, using a Bloom filter + index to skip and locate so you read as few stray files as possible; The essence: it trades &ldquo;write amplification&rdquo; for the high throughput of &ldquo;turning random writes into sequential ones&rdquo; \u2014 and between space, read, and write amplification, it&rsquo;s always a trade-off; there&rsquo;s no free lunch. Hold onto those few lines and the overall shape of RocksDB stands up. The finer details \u2014 skip lists, delta encoding, the various compaction strategies, how to tune the knobs \u2014 you can dive into whenever you actually need them.\nA lot of my understanding here comes from Artem Krylysov&rsquo;s How RocksDB Works, which goes into far more depth \u2014 highly recommended if you want to go deeper.\n","permalink":"https:\/\/neilmin.com\/posts\/how-rocksdb-works\/","summary":"While prepping for interviews, I spent some time really digging into how RocksDB works \u2014 how its storage engine is designed, how data gets written, and how it gets read back. RocksDB (and the LSM-tree underneath it) is one of those things a lot of people have heard of but can\u2019t quite explain \u2014 I couldn\u2019t either, before I sat down with it. 
Once it clicked, I wrote up the core ideas as these notes, to share with anyone else trying to get it.\n","tags":["Databases","RocksDB","LSM-Tree","Interview"],"title":"How RocksDB Works: A Minimal LSM-Tree Primer"},{"content":"I&rsquo;ve been prepping for coding interviews lately, and I went back through the sorting algorithms from scratch. The process gave me a bit of a scare: a lot of this I genuinely used to know \u2014 how quicksort&rsquo;s partition actually works, why it degrades \u2014 and now I had to pause to remember it. By the time I got to the non-comparison sorts \u2014 counting, radix, bucket \u2014 I realized that whole area had become more or less a blank.\nSo I decided to write this review down. Partly as a reference for other people getting ready for interviews, and partly as a record for my future self: next time I need to interview, I can come back here, skim through, and quickly figure out &ldquo;this one I still know, this one I forgot, let me focus there.&rdquo;\nHow to use this post:\nFirst look at the cheat sheet below \u2014 one glance tells you which algorithms you&rsquo;ve forgotten; For anything you want to dig into, use the table of contents (TOC) on the right to jump straight there; Every algorithm follows the same template: one-line idea \u2192 Python implementation \u2192 complexity \u2192 stability and in-place \u2192 interview notes, so they&rsquo;re easy to compare. All the code is in Python, because it reads closest to pseudocode and makes the logic easiest to see.\nOne-page cheat sheet Conclusions first. The table below covers all 11 sorts in this post. 
When an interviewer asks about complexity or stability, this is the table that should flash into your head.\nAlgorithm Best Average Worst Space Stable In-place Bubble O(n) O(n\u00b2) O(n\u00b2) O(1) \u2705 \u2705 Selection O(n\u00b2) O(n\u00b2) O(n\u00b2) O(1) \u274c \u2705 Insertion O(n) O(n\u00b2) O(n\u00b2) O(1) \u2705 \u2705 Shell O(n log n) \u2248O(n^1.3) O(n\u00b2) O(1) \u274c \u2705 Merge O(n log n) O(n log n) O(n log n) O(n) \u2705 \u274c Quick O(n log n) O(n log n) O(n\u00b2) O(log n) \u274c \u2705 Heap O(n log n) O(n log n) O(n log n) O(1) \u274c \u2705 Counting O(n+k) O(n+k) O(n+k) O(n+k) \u2705 \u274c Radix O(d\u00b7(n+k)) O(d\u00b7(n+k)) O(d\u00b7(n+k)) O(n+k) \u2705 \u274c Bucket O(n+k) O(n+k) O(n\u00b2) O(n+k) \u2705* \u274c Timsort O(n) O(n log n) O(n log n) O(n) \u2705 \u274c A few notes so the table doesn&rsquo;t mislead you:\nShell sort&rsquo;s complexity depends on the gap sequence; its best case changes with the sequence you pick, so the numbers here are just typical orders of magnitude. Quick sort&rsquo;s listed space is the average recursion-stack depth O(log n); the worst case degrades to O(n). It partitions in place, but the recursion itself uses the stack. Bucket sort&rsquo;s stability has an asterisk: it&rsquo;s only stable if the per-bucket sort (e.g. insertion sort) is stable. k is the range of values, d is the number of digits \u2014 the complexity of the non-comparison sorts is always tied to properties of the data itself, which I&rsquo;ll get into below. Before we start: a few unavoidable concepts Before going through the algorithms one by one, there are four concepts that nearly every sorting interview question relies on. Getting them straight first means I won&rsquo;t have to keep re-explaining them.\nComparison vs. non-comparison sorts A comparison sort decides order using only one operation: &ldquo;which of these two elements is bigger?&rdquo; Bubble, insertion, merge, quick, and heap are all comparison sorts. 
What they share: their worst-case running time has a theoretical lower bound of \u03a9(n log n) comparisons \u2014 no comparison sort can beat it (the reason is below).\nA non-comparison sort doesn&rsquo;t compare; instead it uses the element values themselves to compute where each one belongs. Counting, radix, and bucket are all like this. Because they sidestep comparison, they can hit linear time O(n) \u2014 but the price is extra requirements on the data (e.g. it must be integers in a bounded range).\nStability If two elements have equal sort keys, and their relative order is preserved after sorting, the sort is stable.\nA concrete example. Say you have a batch of orders already sorted by time, and now you want to re-sort by amount:\nBefore (sorted by time): ($100, 9:00) ($50, 9:01) ($100, 9:02) Stable sort (by amount): ($50, 9:01) ($100, 9:00) ($100, 9:02) \u2190 the two $100s keep their time order Unstable sort: ($50, 9:01) ($100, 9:02) ($100, 9:00) \u2190 the two $100s got scrambled Why do interviewers love this? Because multi-key sorting depends on it: sort by the secondary key first, then use a stable sort on the primary key, and the secondary order is preserved. Knowing which sorts are stable (bubble, insertion, merge, counting, radix, Timsort) and which aren&rsquo;t (selection, shell, quick, heap) is almost guaranteed to come up.\nIn-place sorting If a sort needs only O(1) or O(log n) extra space, it&rsquo;s in-place. Merge sort allocates an extra O(n) array, so it isn&rsquo;t in-place; quick and heap only shuffle the original array, so they are. When an interviewer presses &ldquo;what if memory is tight?&rdquo;, this is usually what they&rsquo;re asking about.\nWhy complexity splits into best \/ average \/ worst The same algorithm can behave wildly differently on different inputs. Quicksort is the classic case: O(n log n) on random input, but if the input is already sorted and you keep picking the worst pivot, it degrades to O(n\u00b2). 
When you state complexity in an interview, it&rsquo;s best to say which case you mean \u2014 that&rsquo;s exactly where you show how deeply you understand it.\nWhy can&rsquo;t comparison sorts beat O(n log n)? Any comparison sort can be drawn as a decision tree: each internal node is one comparison, each leaf is one possible final arrangement. There are n! possible arrangements of n elements, so the tree must have at least n! leaves. A binary tree of height h has at most 2\u02b0 leaves, so 2\u02b0 \u2265 n!, i.e. h \u2265 log\u2082(n!). By Stirling&rsquo;s approximation, log\u2082(n!) \u2248 n log n. The tree&rsquo;s height is the number of comparisons in the worst case, so the lower bound is \u03a9(n log n). This also explains why going faster means dropping &ldquo;comparison&rdquo; entirely \u2014 which is what the non-comparison sorts do.\nOK, concepts done. Let&rsquo;s go through them one by one.\nComparison-based sorts Bubble Sort One-line idea: compare adjacent elements pairwise, swap if out of order; each pass &ldquo;bubbles&rdquo; the current largest element to the end.\ndef bubble_sort(arr): n = len(arr) for i in range(n - 1): swapped = False # each pass bubbles the largest of the unsorted region to the right end for j in range(n - 1 - i): if arr[j] &gt; arr[j + 1]: arr[j], arr[j + 1] = arr[j + 1], arr[j] swapped = True if not swapped: # a whole pass with no swaps means it&#39;s sorted; quit early break return arr Complexity: worst and average are both O(n\u00b2); with the swapped early-exit, it&rsquo;s O(n) on already-sorted input. Space O(1). Stability \/ in-place: stable (only swaps on a strict greater-than), in-place. Interview notes: basically never used in practice, but it&rsquo;s the textbook example of &ldquo;stable + early-exit reaches O(n)&rdquo;. Watch out for that swapped optimization \u2014 it&rsquo;s a common gotcha. LeetCode: 912. 
Sort an Array \u2014 there&rsquo;s no problem dedicated to bubble sort, but this generic sorting problem is a fine sandbox to practice the implementation (pure O(n\u00b2) will time out on large inputs, so it&rsquo;s practice only). Selection Sort One-line idea: each pass picks the smallest element from the unsorted region and places it at the end of the sorted region.\ndef selection_sort(arr): n = len(arr) for i in range(n - 1): min_idx = i # find the index of the minimum in the unsorted region [i+1, n) for j in range(i + 1, n): if arr[j] &lt; arr[min_idx]: min_idx = j arr[i], arr[min_idx] = arr[min_idx], arr[i] return arr Complexity: O(n\u00b2) no matter what the input looks like \u2014 it never speeds up on sorted data. Space O(1). Stability \/ in-place: unstable, in-place. For example [5a, 5b, 2]: the first pass swaps 2 with 5a, and the two 5s flip relative order. Interview notes: its one redeeming trait is the minimum number of swaps (at most n\u22121), which matters when writes are expensive. It&rsquo;s also the counterexample to &ldquo;best case can save you&rdquo; \u2014 it never does \u2014 and is often compared against insertion sort. LeetCode: 912. Sort an Array \u2014 practice the implementation on this generic problem; selection sort is a good way to feel &ldquo;few swaps, but no fewer comparisons.&rdquo; Insertion Sort One-line idea: like sorting a hand of cards \u2014 go left to right, inserting each new card into its correct spot among the already-sorted cards on the left.\ndef insertion_sort(arr): for i in range(1, len(arr)): key = arr[i] j = i - 1 # shift everything bigger than key one slot right to make room while j &gt;= 0 and arr[j] &gt; key: arr[j + 1] = arr[j] j -= 1 arr[j + 1] = key return arr Complexity: worst and average O(n\u00b2); close to O(n) on nearly-sorted input. Space O(1). Stability \/ in-place: stable (the while condition uses &gt;, not &gt;=), in-place. Interview notes: don&rsquo;t underestimate it. 
On small or nearly-sorted data, insertion sort beats quicksort, which is exactly why production-grade sorts like Timsort and Introsort fall back to it on small chunks. Of the three basic sorts, it&rsquo;s the most practically useful. LeetCode: 147. Insertion Sort List \u2014 a problem built for insertion sort: insert in place on a linked list. Shell Sort One-line idea: an upgraded insertion sort. First do insertion sort on elements spaced by a large &ldquo;gap&rdquo;, then shrink the gap step by step; the last pass uses gap 1 (plain insertion sort), but by then the array is &ldquo;mostly sorted&rdquo;, so it flies.\ndef shell_sort(arr): n = len(arr) gap = n \/\/ 2 while gap &gt; 0: # insertion sort on each subsequence with stride gap for i in range(gap, n): key = arr[i] j = i - gap while j &gt;= 0 and arr[j] &gt; key: arr[j + gap] = arr[j] j -= gap arr[j + gap] = key gap \/\/= 2 return arr Complexity: depends on the gap sequence. The n\/\/2 halving sequence above is O(n\u00b2) in the worst case; better sequences (Knuth&rsquo;s 1, 4, 13, 40, \u2026 generated by h = 3h + 1, or Sedgewick&rsquo;s) reach O(n^1.5) or better. Space O(1). Stability \/ in-place: unstable (gapped swaps scramble the relative order of equal elements), in-place. Interview notes: it&rsquo;s the poster child for &ldquo;making data roughly sorted first lets insertion sort go faster.&rdquo; Rarely asked directly, but worth knowing as the bridge \u2014 it pushes a simple O(n\u00b2) sort toward O(n log n). LeetCode: 912. Sort an Array \u2014 use it to practice shell sort and experiment with how different gap sequences affect runtime. Merge Sort One-line idea: divide and conquer. 
Split the array in half until you can&rsquo;t split further, then merge two already-sorted small arrays into one larger sorted array.\ndef merge_sort(arr): if len(arr) &lt;= 1: return arr mid = len(arr) \/\/ 2 left = merge_sort(arr[:mid]) right = merge_sort(arr[mid:]) return merge(left, right) def merge(left, right): result = [] i = j = 0 # two pointers, take the smaller of the two each time; &lt;= keeps it stable while i &lt; len(left) and j &lt; len(right): if left[i] &lt;= right[j]: result.append(left[i]) i += 1 else: result.append(right[j]) j += 1 result.extend(left[i:]) # whatever&#39;s left just gets appended result.extend(right[j:]) return result Complexity: best, average, and worst are all O(n log n) \u2014 rock solid, never degrades. Space O(n) (the merge needs an extra array). Stability \/ in-place: stable, not in-place. Interview notes: bulletproof complexity and naturally stable \u2014 the first choice when you need &ldquo;stable + guaranteed O(n log n) worst case.&rdquo; LeetCode: 148. Sort List \u2014 the optimal solution for sorting a linked list is merge sort; for the array version use 912. Sort an Array. Two high-frequency extensions:\nConcept break: linked-list sorting and external sorting\nLinked-list sorting: merge sort is especially friendly to linked lists \u2014 merging only rewires pointers, no extra array needed, so it can achieve O(1) extra space (not counting the recursion stack). This is why the standard answer to &ldquo;sort a linked list in O(n log n)&rdquo; is merge sort, not quicksort.\nExternal sorting: when the data is too big to fit in memory (the classic interview question: &ldquo;how do you sort a 10 GB file with 1 GB of memory?&rdquo;), the answer is external merge sort \u2014 split the big file into chunks small enough to fit in memory, read each in, sort it, write it back to disk, then use a k-way merge to combine those sorted files into the final result. 
Merge&rsquo;s essence \u2014 &ldquo;combining multiple sorted sequences&rdquo; \u2014 is taken to the extreme here.\nQuick Sort This is the section I most needed to pick back up \u2014 the partition details got fuzzy after five years untouched. Let&rsquo;s take it slow.\nOne-line idea: divide and conquer. Pick a pivot, partition the array into &ldquo;less than the pivot&rdquo; and &ldquo;greater than the pivot&rdquo;, put the pivot in its final place, then recurse on both sides.\n1. Lomuto partition (the easiest to memorize) def quick_sort(arr, low=0, high=None): if high is None: high = len(arr) - 1 if low &lt; high: p = partition(arr, low, high) quick_sort(arr, low, p - 1) # recurse left half quick_sort(arr, p + 1, high) # recurse right half return arr def partition(arr, low, high): pivot = arr[high] # Lomuto: always take the rightmost element as pivot i = low - 1 # i is the right boundary of the &#34;less than pivot&#34; region for j in range(low, high): if arr[j] &lt; pivot: i += 1 arr[i], arr[j] = arr[j], arr[i] arr[i + 1], arr[high] = arr[high], arr[i + 1] # put the pivot in place return i + 1 Lomuto&rsquo;s advantage is that it advances a single pointer i, so the logic is intuitive and easy to remember. For hand-writing quicksort in an interview, this version is the default.\n2. 
Why it degrades, and how to fix it Always taking the rightmost element as pivot has a fatal flaw: when the input is already sorted (or reverse-sorted), every partition splits the array into sizes 0 and n\u22121, the recursion depth becomes n, complexity degrades to O(n\u00b2), and it can blow the stack.\nThe fix is to stop letting the input &ldquo;predict&rdquo; the pivot \u2014 pick one at random, or take the median of the first, middle, and last elements (median-of-three):\nimport random def partition(arr, low, high): rand = random.randint(low, high) arr[rand], arr[high] = arr[high], arr[rand] # random pivot, swap it to the right, reuse the logic above pivot = arr[high] i = low - 1 for j in range(low, high): if arr[j] &lt; pivot: i += 1 arr[i], arr[j] = arr[j], arr[i] arr[i + 1], arr[high] = arr[high], arr[i + 1] return i + 1 Two added lines plug the most common pitfall \u2014 degrading on sorted input. When the interviewer presses &ldquo;what about quicksort&rsquo;s worst case&rdquo;, this is the standard answer.\n3. Three-way quicksort: handling lots of duplicates If the array has many duplicate values (say, all 0s and 1s), plain quicksort still does a lot of pointless recursion. Three-way quicksort (based on the &ldquo;Dutch national flag problem&rdquo;) splits the array into &lt; pivot, == pivot, and &gt; pivot, and skips the whole equal-to-pivot segment:\ndef quick_sort_3way(arr, low=0, high=None): if high is None: high = len(arr) - 1 if low &gt;= high: return arr pivot = arr[low] lt, i, gt = low, low, high # [low,lt)&lt;pivot [lt,i)==pivot (gt,high]&gt;pivot while i &lt;= gt: if arr[i] &lt; pivot: arr[lt], arr[i] = arr[i], arr[lt] lt += 1 i += 1 elif arr[i] &gt; pivot: arr[gt], arr[i] = arr[i], arr[gt] gt -= 1 # the swapped-in element isn&#39;t checked yet, so i stays else: i += 1 quick_sort_3way(arr, low, lt - 1) quick_sort_3way(arr, gt + 1, high) return arr Complexity: average O(n log n), worst O(n\u00b2) (almost never seen once you use a random pivot). 
Space O(log n), for the recursion stack. Stability \/ in-place: unstable (the long-distance swaps in partitioning scramble equal elements), in-place. Interview notes: default to Lomuto when hand-writing; bring up random \/ median-of-three when asked about the worst case; bring up three-way quicksort when asked about lots of duplicates. LeetCode: 912. Sort an Array \u2014 remember to use a random pivot on submission, or sorted input will time out or overflow the recursion stack; heavily-duplicated input additionally needs the three-way partition, since a random pivot alone doesn&rsquo;t help when most elements are equal. One more high-frequency extension:\nConcept break: Quickselect\n&ldquo;Find the k-th largest \/ smallest element&rdquo; is an interview regular. If you only need the k-th one, there&rsquo;s no need to fully sort: use quicksort&rsquo;s partition, and after each partition look at where the pivot landed \u2014 then recurse into only the side that contains k. Average O(n), faster than sorting first and then indexing (O(n log n)).\ndef quickselect(arr, k): &#34;&#34;&#34;return the k-th smallest element, k counting from 1&#34;&#34;&#34; low, high, target = 0, len(arr) - 1, k - 1 while low &lt;= high: p = partition(arr, low, high) if p == target: return arr[p] elif p &lt; target: low = p + 1 # target is on the right else: high = p - 1 # target is on the left Practice: 215. Kth Largest Element in an Array \u2014 solve it with quickselect at average O(n), a nice contrast to the heap solution.\nHeap Sort One-line idea: first build the array into a max-heap (every parent \u2265 its children), so the root is the maximum; swap the root to the end, shrink the heap by one, sift the new root down, and repeat until sorted.\ndef heap_sort(arr): n = len(arr) # 1. build the heap: starting from the last non-leaf node, sift each one down for i in range(n \/\/ 2 - 1, -1, -1): sift_down(arr, i, n) # 2. 
repeatedly swap the root (max) to the end, then fix the remaining heap for end in range(n - 1, 0, -1): arr[0], arr[end] = arr[end], arr[0] sift_down(arr, 0, end) return arr def sift_down(arr, root, size): while True: largest = root left, right = 2 * root + 1, 2 * root + 2 if left &lt; size and arr[left] &gt; arr[largest]: largest = left if right &lt; size and arr[right] &gt; arr[largest]: largest = right if largest == root: # the parent is already the largest; stop sinking break arr[root], arr[largest] = arr[largest], arr[root] root = largest Complexity: best, average, and worst are all O(n log n). Building the heap is O(n) (not O(n log n) \u2014 a commonly-tested counterintuitive point), then n sift-downs of O(log n) each. Space O(1). Stability \/ in-place: unstable, in-place. Interview notes: it&rsquo;s the only sort that both guarantees worst-case O(n log n) and uses only O(1) space \u2014 pick it when memory is extremely tight and you can&rsquo;t afford to degrade. Note that it&rsquo;s the same machinery as a priority queue \/ heap: heapq, Top-K problems, the heap inside Dijkstra \u2014 all variants of this sift_down. Maintaining a size-k min-heap to find the Top-K is a chained follow-up in this area. LeetCode: 215. Kth Largest Element in an Array \u2014 the classic heap problem (maintain a size-k min-heap); it can also be solved with quickselect, a nice way to contrast the two approaches. Non-comparison-based sorts Every algorithm so far relies on &ldquo;comparison&rdquo;, which is why they&rsquo;re stuck at the O(n log n) line. The next three sidestep comparison, using the element values themselves as indices to place items \u2014 which lets them hit linear time. The price is requirements on the data. 
This is also the area where my own memory was blankest, so I&rsquo;ll go into a bit more detail.\nCounting Sort One-line idea: count how many times each value occurs, then use a prefix sum to compute each value&rsquo;s position in the result and drop it straight in. Good for integers with a small value range k.\ndef counting_sort(arr): if not arr: return arr lo, hi = min(arr), max(arr) k = hi - lo + 1 count = [0] * k for x in arr: # 1. count count[x - lo] += 1 for i in range(1, k): # 2. prefix sum: count[i] becomes &#34;number of elements &lt;= lo + i&#34; count[i] += count[i - 1] result = [0] * len(arr) for x in reversed(arr): # 3. fill back-to-front to stay stable count[x - lo] -= 1 result[count[x - lo]] = x return result Complexity: O(n + k), where n is the element count and k is the value range. Space O(n + k). Stability \/ in-place: stable (the key is iterating back-to-front in step 3), not in-place. Interview notes: when k is far smaller than n (e.g. sorting a hundred thousand scores in 0\u2013100), it crushes any O(n log n) sort. But once k is large (e.g. sorting arbitrary 32-bit integers), the space blows up \u2014 that&rsquo;s exactly its limit, and the problem radix sort exists to solve. LeetCode: 75. Sort Colors \u2014 only three values (0, 1, 2), so counting sort (or three-way quicksort) handles it in one pass. Radix Sort One-line idea: sort digit by digit. Starting from the least significant digit (ones place), run one stable counting sort per digit, all the way up to the most significant digit. Because each pass is stable, the whole thing is sorted once you finish the highest digit.\ndef radix_sort(arr): if not arr: return arr max_val = max(arr) exp = 1 # current digit: 1=ones, 10=tens, 100=hundreds... 
while max_val \/\/ exp &gt; 0: arr = counting_sort_by_digit(arr, exp) exp *= 10 return arr def counting_sort_by_digit(arr, exp): count = [0] * 10 # base 10, each digit is only 0-9 for x in arr: count[(x \/\/ exp) % 10] += 1 for i in range(1, 10): count[i] += count[i - 1] result = [0] * len(arr) for x in reversed(arr): # back-to-front to keep this digit&#39;s sort stable (key to radix sort&#39;s correctness) digit = (x \/\/ exp) % 10 count[digit] -= 1 result[count[digit]] = x return result Complexity: O(d\u00b7(n + k)), where d is the number of digits in the largest value and k is the base (here, base 10, so k=10). Space O(n + k). Stability \/ in-place: stable, not in-place. Interview notes: it solves counting sort&rsquo;s &ldquo;space blows up when the range is large&rdquo; problem \u2014 by breaking a big integer into a few small digits. The version above only handles non-negative integers; to support negatives, shift everything to be non-negative first, or handle positives and negatives separately. Common interview questions: why must you go from low digit to high digit? Why must each digit&rsquo;s sort be stable? (Because sorting a higher digit relies on stability to preserve the order already established by the lower digits.) LeetCode: 164. Maximum Gap \u2014 it demands linear time and space, and the standard solution is exactly radix sort or bucket sort. Bucket Sort One-line idea: distribute the data evenly into a number of &ldquo;buckets&rdquo; by value, sort each bucket internally, then concatenate the buckets in order. 
Good for uniformly distributed data.\ndef bucket_sort(arr): if not arr: return arr n = len(arr) buckets = [[] for _ in range(n)] for x in arr: # assume elements are uniformly distributed in [0, 1) buckets[int(n * x)].append(x) result = [] for bucket in buckets: insertion_sort(bucket) # stable sort within buckets keeps the whole thing stable result.extend(bucket) return result Complexity: average O(n + k) when the data is uniformly distributed; the worst case (all elements crammed into one bucket) degrades to O(n\u00b2). Space O(n + k). Stability \/ in-place: depends on the per-bucket sort \u2014 stable if you use insertion sort; not in-place. Interview notes: its performance rides entirely on &ldquo;is the data uniformly distributed&rdquo;, which is the biggest difference from counting and radix. Counting and radix are insensitive to the shape of the data; bucket sort is sensitive to it. Classic use case: sorting a batch of floats uniformly distributed in [0, 1). LeetCode: 347. Top K Frequent Elements \u2014 bucketing by frequency is the slickest solution to this one. What gets used in the real world: Timsort Everything above is a &ldquo;textbook algorithm&rdquo;. But every time you call sorted(), what Python actually runs underneath is Timsort \u2014 a hybrid carefully tuned for real-world data. It&rsquo;s worth a section of its own, because being able to bring it up in an interview is often a plus.\nCore idea: real data is rarely fully random \u2014 it&rsquo;s often partially sorted already. Timsort seizes on this:\nFirst scan the array for naturally-sorted contiguous segments, called runs; Pad runs that are too short up to a minimum length (minrun, usually 32\u201364) using insertion sort \u2014 as noted earlier, insertion sort is fastest on small arrays; Then merge these runs pairwise following a set of rules, with a &ldquo;galloping&rdquo; mode to speed up the merges. 
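To make the first step concrete, here is a minimal sketch of the run scan. It is a simplified illustration, not CPython's actual implementation: the name find_runs is my own, it skips the minrun padding and the merge rules entirely, and it only shows how natural runs can be detected. A strictly descending run is reversed in place so that every run ends up ascending; the comparison is strict on purpose, because reversing a run that contains equal elements would swap their order and break stability.

```python
def find_runs(arr):
    # Hypothetical helper: split arr into maximal natural runs.
    # Returns a list of (start, end) index pairs, end exclusive.
    # Strictly descending runs are reversed in place so all runs are ascending.
    runs = []
    n = len(arr)
    i = 0
    while i < n:
        j = i + 1
        if j < n and arr[j] < arr[i]:
            # strictly descending run: extend it, then flip it
            while j < n and arr[j] < arr[j - 1]:
                j += 1
            arr[i:j] = arr[i:j][::-1]
        else:
            # non-decreasing run: equal neighbours are allowed here
            while j < n and arr[j] >= arr[j - 1]:
                j += 1
        runs.append((i, j))
        i = j
    return runs


data = [1, 2, 3, 9, 7, 5, 6, 6, 8]
runs = find_runs(data)
```

On an already-sorted list the scan finds a single run covering the whole array, which is where the O(n) behaviour on nearly-sorted input comes from.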
So Timsort = the skeleton of merge sort + the small-chunk optimization of insertion sort + special-casing for already-sorted data.\nComplexity: worst O(n log n), but down to O(n) on nearly-sorted data. Space O(n). Stability: stable. This is why Python&rsquo;s sorted() and list.sort() are guaranteed stable. Trivia: Java&rsquo;s object sort (Arrays.sort(Object[])) also uses a Timsort variant; while C++&rsquo;s std::sort uses a different hybrid, Introsort (quicksort as the base, switching to heap sort when recursion gets too deep to avoid degrading, and insertion sort on small chunks). &ldquo;Quick + heap + insertion&rdquo; rolled into one \u2014 the same spirit as Timsort: there&rsquo;s no silver bullet; production-grade sorts are all hybrids. LeetCode: 56. Merge Intervals \u2014 sort first, then sweep and merge; in Python that sorted() call is running Timsort, so it&rsquo;s a good way to feel the speedup from &ldquo;real data is partially sorted&rdquo;. How to actually choose \/ how to answer in interviews Here&rsquo;s everything above boiled down to a &ldquo;which one should I use&rdquo; checklist:\nNo special requirements, just want speed \u2192 quicksort (random pivot). The default for most situations. Need stable, and the worst case must stay O(n log n) \u2192 merge sort. Memory is extremely tight (need O(1) space) and can&rsquo;t degrade \u2192 heap sort. Very small data (a few dozen) or nearly sorted \u2192 insertion sort. Sorting a linked list \u2192 merge sort. Integers with a small value range \u2192 counting sort. Integers but a huge range (e.g. fixed-length integers \/ strings) \u2192 radix sort. Data uniformly distributed over an interval \u2192 bucket sort. Only need the k-th largest \/ the median, not a full sort \u2192 quickselect. Data too big to fit in memory \u2192 external merge sort. 
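For the k-th largest entry in that checklist, a minimal quickselect sketch looks roughly like this. The function name and the choice of a three-way partition are my own assumptions for illustration; the idea is simply to partition like quicksort but only keep going into the side that contains the target index. Average time is O(n); a random pivot makes the O(n²) worst case unlikely but does not remove it.

```python
import random


def quickselect_kth_largest(nums, k):
    # Hypothetical helper: return the k-th largest element (k is 1-based).
    # Works on a copy so the caller's list is left untouched.
    items = list(nums)
    target = len(items) - k  # index of the answer in ascending order
    lo, hi = 0, len(items) - 1
    while True:
        pivot = items[random.randint(lo, hi)]
        # three-way partition of items[lo..hi]: < pivot | == pivot | > pivot
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if items[i] < pivot:
                items[lt], items[i] = items[i], items[lt]
                lt += 1
                i += 1
            elif items[i] > pivot:
                items[i], items[gt] = items[gt], items[i]
                gt -= 1
            else:
                i += 1
        if target < lt:
            hi = lt - 1      # answer is among the smaller elements
        elif target > gt:
            lo = gt + 1      # answer is among the larger elements
        else:
            return pivot     # target falls inside the block equal to pivot
```

Compared with the size-k min-heap approach, this needs the whole array in memory, but it avoids the log k factor on average.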
A few common chained follow-ups \u2014 have the answers ready:\n&ldquo;Which sorts are stable?&rdquo; \u2192 bubble, insertion, merge, counting, radix, bucket (when the per-bucket sort is stable), Timsort. &ldquo;Quicksort&rsquo;s worst case and how to avoid it?&rdquo; \u2192 sorted input + a bad pivot degrades to O(n\u00b2); use a random pivot or median-of-three. &ldquo;Can you beat O(n log n)?&rdquo; \u2192 comparison sorts can&rsquo;t (the decision-tree lower bound); but if the data is integers in a bounded range, non-comparison sorts get you to O(n). &ldquo;Is there a sort that&rsquo;s O(n log n), stable, and in-place?&rdquo; \u2192 not in typical implementations; merge is stable but not in-place, heap is in-place but not stable, quick is in-place but not stable. A great prompt for testing whether you understand the trade-offs. A few common mistakes Pitfalls I stepped in (or nearly did) while reviewing:\nSelection sort doesn&rsquo;t speed up on sorted input \u2014 it has no early-exit, so it&rsquo;s always O(n\u00b2). Don&rsquo;t confuse it with bubble \/ insertion. Building a heap is O(n), not O(n log n). Intuitively it looks like n elements at O(log n) each, but a careful count (most nodes are near the bottom) gives O(n). A common counterintuitive question. Counting \/ radix sort fill the result back-to-front \u2014 that step is the source of stability; do it the other way and it&rsquo;s unstable, and an unstable radix sort is simply wrong. Quicksort&rsquo;s space isn&rsquo;t O(1) \u2014 it partitions in place, but the recursion stack is O(log n) on average and O(n) at worst. Bucket sort&rsquo;s worst case is O(n\u00b2) \u2014 don&rsquo;t only remember the O(n) average; it degrades when the distribution is skewed. Stable \u2260 in-place \u2014 these are two independent dimensions, often asked together in interviews. Don&rsquo;t conflate them. 
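The back-to-front pitfall is easy to see with a tiny experiment. This is a sketch under my own assumptions: counting_sort_by_key is a hypothetical variant of the counting sort above that sorts (key, label) pairs, with a flag that switches the fill direction while keeping the same decrement-then-place step.

```python
def counting_sort_by_key(records, k, back_to_front=True):
    # Hypothetical helper: records is a list of (key, label), with 0 <= key < k.
    count = [0] * k
    for key, _ in records:
        count[key] += 1
    for i in range(1, k):
        count[i] += count[i - 1]
    result = [None] * len(records)
    order = reversed(records) if back_to_front else records
    for rec in order:
        count[rec[0]] -= 1
        result[count[rec[0]]] = rec
    return result


records = [(1, 'a'), (0, 'b'), (1, 'c'), (0, 'd')]
stable = counting_sort_by_key(records, 2)
unstable = counting_sort_by_key(records, 2, back_to_front=False)
```

Both results are sorted by key, but only the back-to-front fill keeps b before d and a before c. The front-to-back fill reverses every group of equal keys, which is exactly what would corrupt a radix sort that relies on the order left by the previous digit.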
Wrapping up The framework for this topic is actually pretty clear:\nComparison sorts are stuck at O(n log n); among them quicksort is usually fastest in practice but can degrade to O(n\u00b2), merge is stable but space-hungry, heap is in-place but unstable \u2014 there&rsquo;s no all-rounder, it&rsquo;s all trade-offs; Non-comparison sorts reach linear time, but only by imposing requirements on the data (a small integer range, fixed-length keys, or a uniform distribution); Production-grade sorts (Timsort, Introsort) are all hybrids, stitching together the strengths of several algorithms. If, like me, you&rsquo;re picking this back up after a few years, I&rsquo;d suggest running through that cheat sheet at the top: skip anything you can implement from memory and explain the complexity and stability of, and go back to the relevant section for whatever you stumble on. Next time you&rsquo;re prepping for interviews, just come back and skim it again.\nGood luck with your interviews.\n","permalink":"https:\/\/neilmin.com\/posts\/sorting-algorithms-interview-reference\/","summary":"I\u2019ve been prepping for coding interviews lately, and I went back through the sorting algorithms from scratch. The process gave me a bit of a scare: a lot of this I genuinely used to know \u2014 how quicksort\u2019s partition actually works, why it degrades \u2014 and now I had to pause to remember it. By the time I got to the non-comparison sorts \u2014 counting, radix, bucket \u2014 I realized that whole area had become more or less a blank.\n","tags":["Algorithms","Sorting","Interview","Python"],"title":"Sorting Algorithms for Coding Interviews: A Python Reference from Bubble Sort to Timsort"},{"content":"A little while ago, I kept running into personality-test projects everywhere.\nSome of them were the familiar polished MBTI-style sites, especially things like 16Personalities, where the whole experience feels surprisingly complete and serious. 
Others were much more online and much more unserious in tone, like SBTI, which feels almost designed to be screenshotted, forwarded, and argued about. On top of that, I also saw a few more niche variants floating around, including programmer-themed ones and projects like cbti-test, which showed how far you could push a pure frontend quiz product.\nAfter seeing a few of those in a row, I had a very simple thought:\nWhat if I built one for programmers too?\nAnd not just a quick joke page, but something a little more committed: a site that looks like a serious personality assessment on the outside, while the actual questions and results are clearly about the very specific ways programmers behave at work.\nWhat interested me was the contrast If you just make a \u201cprogrammer MBTI,\u201d it is very easy for it to turn into a disposable meme page. You laugh, send it to a friend, and forget about it the next day.\nI wanted something a little more specific than that.\nThe interesting part to me was the contrast:\nthe interface should feel real the interaction should feel real the result page should feel real but the content should quietly be about programmer habits, coping mechanisms, and work-brain damage Things like:\ndo you actually write code, or do you mostly orchestrate AI into writing it for you? are you the kind of person who hears \u201csmall internal tool\u201d and immediately starts talking about architecture? when something breaks, do you read logs and trace the source, or do you add five console.logs and hope the restart fixes it? are you in this because you love tech, or because you love not getting PIP\u2019d before vesting? 
Those are funny questions, but they are also concrete enough that they feel like actual behavior rather than vague personality labels.\nThe hard part was not coding it This project started out pretty fuzzy.\nAt first it was just a vibe:\nI wanted to make a programmer personality test, ideally one that felt closer to the North American Chinese tech-worker context than to generic internet programmer humor.\nBut once you actually start building something like this, you quickly realize there are a lot of decisions hiding underneath that initial idea.\nFor example:\nShould it start in Chinese only, or be bilingual from the beginning? Should it reuse traditional MBTI axes at all? Should the result page feel like a formal report or more like a shareable poster? Should the frontend use a router? Should deployment live under a path inside my main blog or on its own subdomain? Should the English version be a translation, or should it be rewritten naturally? None of those questions looks huge on its own, but together they decide whether the final thing feels like a toy or a real product.\nThat was probably my biggest takeaway from this build:\nthe part that takes time is not implementation, it is forcing all the product decisions to become explicit.\nSometimes that meant arguing over code. Sometimes it meant arguing over one line of copy, how aggressive a result title should sound, or whether the quiz should feel more like a real assessment or more like an internet joke.\nThose choices sound soft, but they end up shaping the entire product.\nI ended up making a custom four-axis framework called SHIP At some point I stopped trying to map everything back to classic MBTI and just made a custom framework for the app.\nI called it SHIP.\nPartly because the acronym reads well, and partly because it fits the culture: you can write code, talk architecture, and debate abstractions all day, but eventually the whole job is still about shipping.\nThe four dimensions ended up being:\n1. 
Source: where does your code actually come from? C = Copilot T = Typecraft In other words:\nare you the kind of person who sees repetitive work and immediately lets Gemini, Claude, or Copilot generate the skeleton? or are you still the kind of person who wants to hand-write the core logic even if AI already produced something usable? 2. Hierarchy: how addicted are you to architecture? O = Overdesign A = ASAP Some programmers see a basic CRUD feature and immediately start thinking about:\ndecoupling scalability HA backward compatibility reusable platform-level services Other programmers are only thinking one thought:\nCan this get pushed to prod tonight?\n3. Investigation: is your debugging style logical or mystical? L = Logic P = Pray Some people see a production issue and go straight to logs, traces, and source code. Others do some combination of:\nadd a few prints restart first write a defensive fallback script if the service is back up, root cause can wait 4. Purpose: what is actually driving you? G = Geek W = Worker Some people really will spend a weekend building a side project, trying a new framework, or reading technical docs for fun.\nOthers are much more directly driven by:\nperf review promo PIP H1B \/ PERM pressure layoff anxiety getting out of work on time The more I worked on it, the more I liked these axes. 
They are obviously satirical, but they are still grounded enough to feel recognizable.\nI kept the scoring algorithm deliberately simple On the implementation side, I had a very strong bias here:\nI did not want fake complexity.\nIt would have been easy to dress the project up with some more mysterious matching system, nearest-neighbor logic, or personality vectors that sound more \u201cscientific.\u201d But for this product, I honestly thought that would make it worse.\nThe structure is already simple:\n4 dimensions 2 poles per dimension 7-point Likert answers So the scoring model stayed simple too:\neach question belongs to one dimension each Likert choice maps to a score from +3 to -3 each dimension accumulates its own score the sign of the score determines the winning pole the four winning poles become a result code like CAPW That approach had a few advantages:\nit is easy to explain it is easy to test it fits the percentage-bar UI naturally it let me add bilingual support later without touching the scoring engine I ended up documenting the scoring logic in the project README in detail for exactly that reason. This kind of project is funny on the surface, but if the internals become harder to reason about than they need to be, it stops being fun to maintain very quickly.\nThe technical architecture stayed intentionally restrained Even though the site looks like a complete product now, the actual architecture is pretty restrained.\nI did not put it inside my blog repo. I split it out into its own repository and gave it its own subdomain:\nindependent repo Vite + React static deployment GitHub Pages custom subdomain: mbti.neilmin.com And I intentionally avoided a few things that would have made it look fancier without actually helping this project:\nno React Router no full i18n framework no backend no database Why no router? 
Because this app is still fundamentally one flow:\nintro questions result If I had introduced a full routing model just to make it look like a bigger SPA, I would have bought myself a bunch of GitHub Pages edge cases for very little benefit.\nSo the app just uses React state for screen transitions, and the only meaningful URL state it preserves is:\n?result=CODE That way shared result links still work, but the deployment model stays trivial.\nWhy no i18n framework? Later on I added English too.\nBut I still did not bring in a heavy i18n system, because this project is extremely copy-heavy and I did not want the English version to feel like a translation layer pasted over Chinese source text.\nSo the structure became:\none lightweight locale state one shared scoring engine one shared result-code system two separate content layers: Chinese questions English questions Chinese personality writeups English personality writeups That let the English version sound like natural English instead of translated Chinese.\nThe most addictive part turned out to be the character art and the share poster If I had stopped after the quiz flow and result page, the project would already have been \u201cdone enough.\u201d\nBut personality-test sites are not really judged only by how they read in the browser. They are also judged by what gets screenshotted and forwarded.\nThat is the point where I got pulled into two extra rabbit holes.\n1. Making character art for all 16 personas I wanted something in the general visual family of 16Personalities: clean, low-poly, geometric, readable.\nBut I still wanted the characters to feel like they belonged to this project\u2019s own world.\nAt first I thought about generating them individually, but that would have made consistency much harder. 
So I went with a more scalable workflow:\ngenerate full character sheets with AI cut them into separate assets with a script reuse those cutouts across the homepage, result page, and poster It felt very modern in a funny way: let the model handle the creative batch generation, then let scripts do the tedious mechanical cleanup.\n2. Building a dedicated vertical share poster The result page is for reading.\nThe poster is for sending around.\nSo instead of expecting people to screenshot the result page directly, I made a dedicated mobile-first vertical poster. It includes:\nthe result code the title the quote dimension summaries the longer description the lifestyle\/social profile a QR code the site URL And importantly, the QR code does not deep-link back to the sharer\u2019s result. It goes to the homepage of the test itself.\nThat distinction mattered to me:\nthe shared result is the content, but the QR code is the acquisition path.\nIn the end, this was really just something I found interesting enough to make Looking back, this project does not feel like some grand product experiment to me.\nIt feels more like one of those things that started with: \u201cthis would be kind of fun,\u201d and then I actually committed to making it real.\nI happened to run into a few personality-test projects in a row, thought the \u201cserious shell, unserious content\u201d contrast was funny, and realized programmers have more than enough specific habits and stereotypes to support that kind of format. So I built one.\nThe current version already does the things I wanted it to do:\nit feels reasonably polished the quiz itself reads well the result page works the share poster works But I definitely do not think it is \u201cfinished\u201d in some permanent sense. 
The question wording, the persona copy, the English details, the visuals, and the interactions all still have room to improve.\nIf you want to try it, it\u2019s live here:\nhttps:\/\/mbti.neilmin.com\nAnd if you finish it and think, \u201cthis is way too accurate,\u201d then that probably means the joke landed at least halfway.\n","permalink":"https:\/\/neilmin.com\/posts\/building-a-programmer-personality-test\/","summary":"A little while ago, I kept running into personality-test projects everywhere.\nSome of them were the familiar polished MBTI-style sites, especially things like 16Personalities, where the whole experience feels surprisingly complete and serious. Others were much more online and much more unserious in tone, like SBTI, which feels almost designed to be screenshotted, forwarded, and argued about. On top of that, I also saw a few more niche variants floating around, including programmer-themed ones and projects like cbti-test, which showed how far you could push a pure frontend quiz product.\n","tags":["React","Vite","Side Project","MBTI"],"title":"After Seeing MBTI and SBTI Everywhere, I Built a Programmer Personality Test"},{"content":"I&rsquo;ve looked through quite a few personal websites lately, and the more I look, the more interesting this whole category feels.\nThey are not all chasing the same style. Some are very restrained, some go really far with interaction, but the common thread is that they all feel unmistakably personal. I wanted to save a few of my favorites here, partly as inspiration for my own site later on.\ngkoberger.com What makes this one so fun is that the author built a virtual version of himself sitting at a desk. Then you can click almost everything in the scene, whether it&rsquo;s the person, the objects on the desk, or even the board behind him, and each one takes you to a different page. The first time I saw it, I immediately thought: wow, so this is another way a personal website can work. 
alanagoyal.com\/finder One of the coolest things about this site is that it turns almost the entire interface into something that feels like macOS. And it&rsquo;s not just for show. A lot of it is actually interactive. The whole experience feels light, playful, and very hard not to click through. sharyap.com This one is a bit similar to the previous site. Every time you open a new page, it feels like opening a new window on a desktop. What makes it even more charming is that many of the graphics, logos, and little visual details were drawn by the author herself, which makes the whole thing feel cohesive and incredibly cute. merodev.net This is a very cool, very futuristic 3D website. The rendering quality grabs you right away. The lighting, the scene, and the overall atmosphere all feel polished in a way that makes you stop and stare for a few seconds. bruno-simon.com This site is famous, and I first saw it years ago. Looking at it again now, it has evolved even more. The most ridiculous and unforgettable thing about it is that Bruno somehow turned his personal website into a 3D driving game. You drive around while exploring his work and personal info. Even now, I still wonder how he managed to fit all of that into one website. He also has a YouTube channel where he talks about how he built it. logartis.info This site feels a bit like watching a movie. You move through a forest-like environment, and the whole atmosphere feels incredibly complete. What I especially like is how well the weather and environmental transitions are handled. It creates a really strong sense of immersion. wodniack.dev This is another site with a very strong personal style. A lot of the interactions and scroll-triggered effects feel unique and artistic. You can tell the author is not just using a template, but actually using the website as a way to express a distinct visual point of view. getcoleman.com The thing that really made me stop here is how creatively the author introduces himself. 
He uses progress bars to show his timeline and different stages of his life. It instantly makes you think: oh, so this is another way to do a personal introduction. animejs.com This isn&rsquo;t a personal website, but a product showcase site. Still, the animation work here is incredibly smooth. The transitions between sections while scrolling, the pacing, and the motion all feel very well tuned. It&rsquo;s one of those sites where you immediately think: this is just really well made. joshwcomeau.com This site feels great from top to bottom. The typography, the colors, and even the little interactive game on the homepage all make you feel that the author&rsquo;s taste is excellent, not in a flashy way, but in a way that keeps feeling better the longer you look at it. On top of that, he writes a lot of technical articles, so it&rsquo;s not just beautiful, it&rsquo;s genuinely useful too. portfolio.ohevan.com This site feels more like a personal showcase of someone&rsquo;s portfolio, the places they&rsquo;ve been, and the photos they&rsquo;ve taken, and those photos genuinely look cinematic. I really like this direction myself. If I ever end up with enough photos that feel worth sharing, I&rsquo;d love to seriously think about making something like this too. What I love most about these sites probably comes down to two things: either the interactions are exceptionally good, or the visual design is strong enough to make you stop and look a little longer. But more importantly, they all reminded me just how far a website can be pushed. So many things that seem impossible, or that you would never expect someone to do on a website, can actually be done. 
For me, this whole list was genuinely eye-opening, which is exactly why I wanted to share it.\n","permalink":"https:\/\/neilmin.com\/posts\/favorite-personal-websites\/","summary":"I\u2019ve looked through quite a few personal websites lately, and the more I look, the more interesting this whole category feels.\nThey are not all chasing the same style. Some are very restrained, some go really far with interaction, but the common thread is that they all feel unmistakably personal. I wanted to save a few of my favorites here, partly as inspiration for my own site later on.\ngkoberger.com What makes this one so fun is that the author built a virtual version of himself sitting at a desk. Then you can click almost everything in the scene, whether it\u2019s the person, the objects on the desk, or even the board behind him, and each one takes you to a different page. The first time I saw it, I immediately thought: wow, so this is another way a personal website can work. alanagoyal.com\/finder One of the coolest things about this site is that it turns almost the entire interface into something that feels like macOS. And it\u2019s not just for show. A lot of it is actually interactive. The whole experience feels light, playful, and very hard not to click through. sharyap.com This one is a bit similar to the previous site. Every time you open a new page, it feels like opening a new window on a desktop. What makes it even more charming is that many of the graphics, logos, and little visual details were drawn by the author herself, which makes the whole thing feel cohesive and incredibly cute. merodev.net This is a very cool, very futuristic 3D website. The rendering quality grabs you right away. The lighting, the scene, and the overall atmosphere all feel polished in a way that makes you stop and stare for a few seconds. bruno-simon.com This site is famous, and I first saw it years ago. Looking at it again now, it has evolved even more. 
The most ridiculous and unforgettable thing about it is that Bruno somehow turned his personal website into a 3D driving game. You drive around while exploring his work and personal info. Even now, I still wonder how he managed to fit all of that into one website. He also has a YouTube channel where he talks about how he built it. logartis.info This site feels a bit like watching a movie. You move through a forest-like environment, and the whole atmosphere feels incredibly complete. What I especially like is how well the weather and environmental transitions are handled. It creates a really strong sense of immersion. wodniack.dev This is another site with a very strong personal style. A lot of the interactions and scroll-triggered effects feel unique and artistic. You can tell the author is not just using a template, but actually using the website as a way to express a distinct visual point of view. getcoleman.com The thing that really made me stop here is how creatively the author introduces himself. He uses progress bars to show his timeline and different stages of his life. It instantly makes you think: oh, so this is another way to do a personal introduction. animejs.com This isn\u2019t a personal website, but a product showcase site. Still, the animation work here is incredibly smooth. The transitions between sections while scrolling, the pacing, and the motion all feel very well tuned. It\u2019s one of those sites where you immediately think: this is just really well made. joshwcomeau.com This site feels great from top to bottom. The typography, the colors, and even the little interactive game on the homepage all make you feel that the author\u2019s taste is excellent, not in a flashy way, but in a way that keeps feeling better the longer you look at it. On top of that, he writes a lot of technical articles, so it\u2019s not just beautiful, it\u2019s genuinely useful too. 
portfolio.ohevan.com This site feels more like a personal showcase of someone\u2019s portfolio, the places they\u2019ve been, and the photos they\u2019ve taken, and those photos genuinely look cinematic. I really like this direction myself. If I ever end up with enough photos that feel worth sharing, I\u2019d love to seriously think about making something like this too. What I love most about these sites probably comes down to two things: either the interactions are exceptionally good, or the visual design is strong enough to make you stop and look a little longer. But more importantly, they all reminded me just how far a website can be pushed. So many things that seem impossible, or that you would never expect someone to do on a website, can actually be done. For me, this whole list was genuinely eye-opening, which is exactly why I wanted to share it.\n","tags":["Web Design","Inspiration","Portfolio"],"title":"Some Personal Websites I Love"},{"content":"This whole thing, honestly, was basically the software version of buying a dish of vinegar and ending up making an entire batch of dumplings.\nA while ago, I ran into a bug that was both obscure and genuinely hard to fix. Stack Overflow was not giving me anything useful. Google was not giving me much either. In the end I had to grind through it myself and solve it the hard way.\nOnce I finally got it resolved, my first reaction was that I should write the whole winding story down.\nBecause if I do not write down problems like that, there is a very good chance that a few months later even I will no longer remember how I reasoned my way through them. And after burning that many brain cells on one issue, not turning it into something useful would feel like a waste.\nBut then the obvious problem showed up:\nI had written the post. 
But where exactly was I supposed to publish it?\nThe real starting point was not &ldquo;I want a personal website&rdquo; I did not begin with some grand plan to build a personal website.\nMy actual thought process was much simpler: I just wanted a decent place to publish that debugging write-up.\nSo I spent some time researching where it should live.\nOn the Chinese internet, platforms like CSDN, Zhihu, and Blog Garden were all options. It was not like I had nowhere to post. But they all felt too constrained, and some of those platforms have a reputation for pulling stunts that make your content feel like it does not fully belong to you anymore. A clean technical post gets wrapped in platform noise, and that was not what I wanted.\nPlatforms like Medium or Blogger feel lighter, sure, but at the end of the day you are still renting space in someone else&rsquo;s house. You can decorate it a little, but the walls are not yours, the address is not yours, and if the platform changes its rules, some part of your setup stops being fully under your control.\nI was not doing this to monetize anything either. As an engineer, I ended up at the most predictable conclusion possible:\nIf I wanted full control, I should just build the thing myself.\nAnd once that idea showed up, the rest of the stack choice became pretty straightforward. Static hosting, free infrastructure, a workflow that fits naturally with Git, and no need to think too hard about servers. 
It is hard to imagine something more like a proper little digital home than GitHub Pages.\nIf I was going to build it, I might as well build it properly When it came to choosing the framework, I looked at the usual suspects first.\nHexo on the Node.js side Jekyll, the old GitHub Pages classic in the Ruby world Hugo, which has a reputation for being absurdly fast After comparing them a bit, I ended up choosing Hugo.\nThe reason was actually very simple: I had seen another site built with Hugo, liked how it looked overall, and for a site this small, the technical differences between these frameworks are not life-or-death anyway.\nFor the theme, I went with PaperMod.\nI like how restrained it is. The default look is clean, not noisy, and it keeps the content front and center. For someone like me, who mostly wants to write technical posts and keep a lightweight personal site around them, it was a really solid starting point.\nIf you want to build something similar, the basic setup is honestly pretty small:\n# 1. Install Hugo (macOS) brew install hugo # 2. Create the site hugo new site my-blog cd my-blog git init # 3. Add PaperMod as a submodule git submodule add https:\/\/github.com\/adityatelange\/hugo-PaperMod.git themes\/PaperMod There is one small detail here that is worth calling out very explicitly: PaperMod really should be added as a submodule.\nDo not just git clone it into themes\/ and assume you are done. That can look perfectly fine locally and still bite you later when you push to GitHub. Nested repositories and theme directories do not always behave the way people expect, and it is a very easy way to end up with a mysteriously broken site, missing theme files, or a pretty embarrassing 404 situation after deployment. 
I am, unfortunately, speaking from lived experience here.\nThe funniest part is that Vibe Coding actually worked really well If this had happened a few years ago, the project would still have been manageable, but it absolutely would have been the kind of thing that quietly consumed an entire weekend.\nYou would have to set up the theme, wire the Hugo config, decide how bilingual routing should work, figure out deployment, get GitHub Actions going, deal with the custom domain, tweak the styling, wire comments, wire analytics, and then spend way too much time on front-end details that individually look tiny but collectively refuse to go away.\nNow we live in the era of Vibe Coding.\nWhat really stood out to me this time is that AI was not most useful because it magically knew what I wanted. It was useful because once I knew the direction, it could help me chew through a huge amount of tedious, fragmented, annoying implementation work.\nThis site did not come together because an AI produced one perfect answer. It came together because I kept refining the target and using AI to try things with me:\nhow to structure bilingual routes how to pair English and Chinese content files how to make the language switcher prefer the current page&rsquo;s translation instead of always dumping you back on the homepage how fast the animated glow background should move how to make it visible enough in light mode what to use for page views after the first counter turned out to be flaky how to wire comments so the UI language follows the page language And a lot of those details were not &ldquo;one and done&rdquo; decisions. 
We went back and forth on them many times.\nThe bilingual content convention, for example, ended up being very simple:\nEnglish lives at \/ Chinese lives at \/zh\/ default English content files stay unsuffixed Chinese translations use .zh.md So it looks like this:\ncontent\/posts\/my-post.md content\/posts\/my-post.zh.md Even with vibe coding making things much easier, none of this really came out perfect in one shot. A lot of the site was not &ldquo;AI gave me the final answer.&rdquo; It was much closer to me standing next to a very fast collaborator and repeatedly saying:\nThis is not right, move it a bit left.\nThat animation is too fast, slow it down.\nThat color is ugly, try another one.\nThis interaction feels like magnets. I want it to feel like something soft being gently pushed aside.\nThat is why I have grown to like this workflow so much. I am not personally typing every line, but I am still doing the design work, the judgment, and the steering. AI feels less like a replacement and more like a teammate with a lot of execution stamina who does not get tired of being told to tweak things one more time.\nOnce I started, I could not help making it more complete Originally I only wanted a place to publish one technical post.\nBut once the site was half-built, the dangerous sentence showed up in my head:\nWell, since I am already here, I might as well finish it properly.\nAnd that was how things escalated.\n1. Bilingual support I already knew I wanted the site to support both Chinese and English, so from the beginning this was not designed as a single-language blog.\nThe current structure is:\nEnglish at \/ Chinese at \/zh\/ posts, search, and resume pages all follow the bilingual model On paper, that sounds like just a few Hugo config settings. 
In reality, the annoying part is all the edges: should menus be localized, should search be language-specific, should the language switcher jump to the homepage or the translated page, should the comment widget switch its UI language too? None of those decisions is huge on its own, but together they determine whether the site feels properly bilingual or merely &ldquo;technically available in two languages.&rdquo;\n2. GitHub Actions deployment For deployment, I kept it simple and just let GitHub Actions handle the job.\nThe Pages workflow is already in .github\/workflows\/gh-pages.yml, and the day-to-day workflow is basically:\nChange content or styling locally git add git commit git push Then CI\/CD takes care of the rest.\nThis has one dangerous side effect: it lowers the cost of polishing things so much that you start fixing everything. Sometimes even one sentence that feels slightly off is enough to trigger a whole new commit.\nAnd honestly, Hugo is very pleasant in this setup. Build with hugo --gc --minify, let GitHub Pages host the output, let GitHub Actions publish it, and you are done. There is almost no extra ops drama in the middle, which is exactly the level of complexity I wanted for a content-first site.\n3. Buying the domain Using neilmin.github.io would have been perfectly fine.\nBut then I made the mistake of checking domain prices and discovered that a .com with my name on it cost something like ten or twelve dollars a year. That is a very effective way to destroy a person&rsquo;s self-control. For less than the cost of a meal, I could have a little corner of the internet with my own name on it. Hard to argue with that.\nSo I bought neilmin.com.\nFunctionally, a custom domain does not change much. Emotionally, it changes a lot. 
The moment you type that URL into the browser, the site immediately feels more real and more yours.\nAfter the basics worked, the little extras became the most addictive part Once a site like this is functional, the next wave of fun is usually not &ldquo;can it work?&rdquo; but &ldquo;can I make it feel a little nicer?&rdquo;\nThat was the point where I started adding extra little toys.\nThe animated glowing background If you are reading this on a desktop, the homepage and search page should have a few softly moving glowing shapes in the background.\nThat effect is not an image or a video. It is just CSS plus a bit of JavaScript. It started as a mesh-gradient-like atmosphere thing, and then I kept iterating until it became much more interactive.\nThe funny part is how many rounds it took to get the feel right. The first version looked too mechanical. Another version felt weirdly magnetic and slippery. Eventually I rewrote the interaction to feel more like soft bubbles being gently pushed aside, using lerp-style easing and a smoother falloff curve. Now when you move the mouse through it, it should feel more like you are nudging something viscous than poking three glowing objects that want to fly away.\nPage views: from Busuanzi to Vercount At first I tried the usual Chinese blog counter, Busuanzi.\nBut these days its reliability feels&hellip; situational. So I eventually switched to Vercount.\nWhat I really liked there was that it is almost a drop-in replacement for Busuanzi at the DOM level.\nIn practice that meant I barely had to touch the existing busuanzi_* IDs. I could swap the script source, keep the markup mostly intact, and move on. That is exactly the kind of migration I appreciate when something is already wired into templates and I do not want to rebuild the whole feature from scratch.\nGiscus comments For comments, I ended up choosing Giscus.\nI really like the model: no separate database, no extra backend to maintain, just GitHub Discussions underneath. 
For a small site that already treats GitHub as home base, that is a very natural fit.\nAnd it matches the rest of the site nicely:\nno custom comment backend to operate dark mode support out of the box comment UI language can follow the page language So now the Chinese posts render a Chinese Giscus UI, and the English posts render an English one. Those details are small, but once they line up, the whole site feels much more coherent.\nAnd honestly, this is exactly the kind of thing that makes projects like this addictive. You think you are just adding comments, and then a minute later you care about whether the comment UI language switches correctly and whether the theme tracks light\/dark mode properly too.\nLooking back, this really was a full-on dumpling project At the beginning, I just wanted to publish one technical write-up.\nBy the end, I had:\nset up Hugo wired in PaperMod implemented bilingual support added search configured GitHub Pages deployment bought a custom domain added visitor stats added comments tuned the animated background refined the interaction feel All that because I wanted somewhere to post one debugging story.\nAnd honestly, I am glad it happened.\nOn one level it solved a very practical problem: I now have a place that is fully mine, where I can publish the kinds of things I actually want to write. On another level, it brought back a very familiar kind of engineering joy, the kind that says, &ldquo;Well, since I already got this far, I might as well make it a little better.&rdquo;\nFinal thoughts That original bug write-up is now out in the world.\nAnd this site, in a way, is the side effect it dragged into existence.\nIf you happened to land on this post, feel free to scroll down and leave a comment. 
Log in with GitHub, say hi, and if you want, help me test whether this whole setup is actually as stable as I think it is.\nBecause for all the extra polish it has now, the origin story of this site is still extremely simple:\nI just wanted a good place to publish one post that felt worth preserving.\n","permalink":"https:\/\/neilmin.com\/posts\/building-my-hugo-blog-with-github-pages\/","summary":"This whole thing, honestly, was basically the software version of buying a dish of vinegar and ending up making an entire batch of dumplings.\nA while ago, I ran into a bug that was both obscure and genuinely hard to fix. Stack Overflow was not giving me anything useful. Google was not giving me much either. In the end I had to grind through it myself and solve it the hard way.\n","tags":["Hugo","Web Development","DevOps"],"title":"How I Vibe-Coded This Blog Website Just to Publish One Post"},{"content":"A little while ago, we ran into a customer issue that turned out to be way more interesting than it looked at first glance. The customer was trying to generate a backup using pg_dump, and the job kept failing halfway through with this error:\npg_dump: error: could not write to output file: No space left on device When I first saw it, I honestly felt pretty relaxed. This kind of error looks like a standard problem. If the disk is full, you make the disk bigger and move on.\nThe Bizarre Beginning: A Black Hole That Wouldn&rsquo;t Fill The most obvious thing to try: add more disk The customer&rsquo;s database was already a few terabytes in size, so we did the straightforward thing and resized the target disk to 10 TB. In a situation like this, the first thing you usually check is:\ndf -h At that point I was genuinely thinking, &ldquo;Alright, this should be an easy win.&rdquo; We reran pg_dump, and somehow it failed again.\nAt first we still did not think too much of it. Maybe 10 TB was somehow still not enough? 
So we kept going and expanded the disk again, this time to well over 20 TB, and ran the job one more time.\nSame error. Same failure.\nThat was the moment it stopped making sense. df -h clearly showed plenty of free space, yet the system kept insisting: No space left on device.\nAt that point I was pretty sure this error was not saying what it appeared to be saying.\nFollowing the Clues: 200 Million Hidden Large Objects If this was not a simple hardware-capacity problem, then the next step was to look at the data itself. We started inspecting the contents of the database. At first, nothing jumped out: no badly bloated TOAST data, no unusually highly compressed data, nothing obviously suspicious.\nThen we noticed something that was definitely unusual: this customer had an enormous number of Large Objects stored in the database.\nConcept Break: What are PostgreSQL Large Objects? In PostgreSQL, a Large Object, or LO, is a mechanism for storing large chunks of data such as images, audio files, or documents. It exposes a file-like API with operations such as open, read, write, and seek.\nWhen you run pg_dump using Directory Format (-Fd), it generates files for table schemas and data. It also generates a separate dump file for every Large Object in the database. Those files are typically named after the LO&rsquo;s OID, something like:\nblob_12345.dat Under normal circumstances, Large Objects are usually used to store data that is huge in size but not huge in count. This customer was doing almost the exact opposite: they had stored more than 200 million very small Large Objects.\nAnd that was where things started to get interesting. pg_dump clearly was not optimized for the &ldquo;massive quantity of tiny LOs&rdquo; case. 
It just kept following its normal logic and created one file per object.\nLocal Reproduction: The 14 Million File Barrier Once we suspected the sheer number of LOs was the issue, we started reproducing the environment locally.\nWe were able to reproduce the same error surprisingly quickly, and the reproduction was very consistent. As we monitored the backup process, one pattern kept showing up: the failure always happened when the output directory reached roughly 14 million files.\nMy first instinct at that point was: maybe we are running out of inodes.\nThat would have been a very normal Linux explanation. Every file needs an inode, and with a huge number of tiny files, it is absolutely possible to run out of inodes long before you run out of disk blocks. So we checked:\ndf -i And that was not it either. We still had plenty of inodes left.\nSo then we went through the usual list of suspects: container limits, process limits such as ulimit -a, and other system-level resource constraints. We kept checking and checking, and nothing looked wrong.\nThe only thing that stayed stubbornly consistent was that 14-million-file threshold. Every time we hit it, the job died as if it had smashed into an invisible wall.\nThe Truth Revealed: EXT4 HTree and Hash Collisions With that &ldquo;14 million file barrier&rdquo; in mind, I dug through a lot of material online and also asked people who know Linux filesystems far better than I do. Eventually the root cause became clear: this was coming from EXT4&rsquo;s directory indexing behavior.\nDeep Dive: EXT4 HTree Limitations and Hash Collisions In Linux, a directory is fundamentally a file that stores filenames and pointers to their inodes. To make lookup efficient when a directory contains huge numbers of entries, EXT4 uses a hash-based tree index called HTree, which is conceptually similar to a B-Tree in databases.\nThe catch is that the traditional EXT4 HTree has a limited depth, usually two levels. 
If you keep dumping millions upon millions of files into one single directory, the hash space for those filenames starts getting crowded and severe hash collisions can occur. Once a hash bucket is full and the HTree has reached its depth limit and cannot split further, the filesystem refuses to create more files and returns ENOSPC back to the OS, which surfaces as No space left on device.\nThat explains perfectly why the crash always happened at around 14 million files. pg_dump generates files sequentially, and the filenames are derived from LO IDs, so the naming pattern is deterministic. In other words, every run walks into the exact same collision path and hits the exact same wall.\nHonestly, that was the satisfying part of the whole investigation. Up until then it felt like wandering around in the dark, touching random things and getting nowhere. Once this clicked, all the weird symptoms suddenly lined up.\nThe Fix: How to Get Around It Once we understood the root cause, the solution space became much clearer. There are really three levels where you can address this.\n1. Backup strategy: split the dump If Directory Format is choking because it is trying to put every LO into one folder, then one practical fix is to separate table data from Large Objects:\nDump regular table data using Directory Format (-Fd) while excluding blobs. Dump Large Objects separately into a single large file, such as plain SQL or custom format, so you avoid generating hundreds of millions of tiny files. In practice, that means using options such as --no-blobs for the table dump and exporting blobs separately.\n2. Filesystem level: enable large_dir Modern EXT4 is actually aware of this edge case and provides a feature called large_dir. Once enabled, it supports a 3-level HTree and allows directories to exceed 2 GB in size. 
That greatly reduces the probability of hash collisions and, in practice, almost eliminates this per-directory bottleneck.\nYou can enable it like this:\n# Unmount the disk first umount \/dev\/sdX # Enable the large_dir feature tune2fs -O large_dir \/dev\/sdX # Check the filesystem and remount e2fsck -f \/dev\/sdX mount \/dev\/sdX \/backup_dir 3. Architecture level: directory sharding More broadly, if your system genuinely needs to store tens of millions of physical files, putting all of them into one folder is simply not a good design. Even if you enable large_dir, basic operations such as ls can still become painfully slow.\nThe standard industry practice here is directory sharding based on filename hashes or IDs.\nFor example, if a file is named:\n1234567.dat Instead of placing it directly in one giant folder, you split it into subdirectories:\nUse the first two digits 12 as the first-level directory. Use the next two digits 34 as the second-level directory. The final path becomes:\n\/backup_dir\/12\/34\/1234567.dat That way, millions of files get distributed across thousands of subdirectories, and the number of files per directory stays low enough that filesystem bottlenecks never build up in the first place.\nFinal Thoughts Looking back, this was one of those debugging sessions that starts out feeling almost too easy. You think you already know the answer. Then the obvious fix does nothing, and suddenly you are forced to question every assumption you made along the way.\nAt first, we really thought the answer was just, &ldquo;make the disk bigger.&rdquo; Then we went from 10 TB to more than 20 TB and the problem still sat there, completely unmoved. After that we started suspecting inodes, containers, process limits, and every other system-level resource we could think of. 
None of them fit.\nAnd then the actual problem turned out to be hiding in EXT4 directory indexing and hash collisions, which is not exactly the first place your mind goes when you see a disk-space error.\nThat is probably the part I find most worth writing down. Not just that we fixed the issue, but that this kind of problem forces you to revisit system behaviors you normally take for granted. No space left on device sounds incredibly direct, but what it really means can be much more subtle than it looks.\nSo if you ever run into one of those situations where there is clearly still free space and yet the system refuses to write another byte, it might be worth resisting the urge to trust the error message too literally. Sometimes the real problem is hiding a layer or two deeper.\n","permalink":"https:\/\/neilmin.com\/posts\/linux-disk-bug-triage\/","summary":"A little while ago, we ran into a customer issue that turned out to be way more interesting than it looked at first glance. The customer was trying to generate a backup using pg_dump, and the job kept failing halfway through with this error:\npg_dump: error: could not write to output file: No space left on device When I first saw it, I honestly felt pretty relaxed. This kind of error looks like a standard problem. If the disk is full, you make the disk bigger and move on.\n","tags":["PostgreSQL","Linux","Debugging","Database"],"title":"Why PostgreSQL Kept Saying \u201cNo space left on device\u201d with 20TB Still Free"},{"content":"An MBTI-style test that tells you what kind of programmer personality you have. Hugo GitHub Actions Web UI","permalink":"https:\/\/neilmin.com\/projects\/#programmer-mbti-test","summary":"An MBTI-style test that tells you what kind of programmer personality you have.","tags":["Hugo","GitHub Actions","Web UI"],"title":"Programmer MBTI Test"},{"content":"A small collection of AI covers where my own voice replaces the original singers. 
Python PyTorch AI Tuning Audio Processing","permalink":"https:\/\/neilmin.com\/projects\/#ai-cover-songs","summary":"A small collection of AI covers where my own voice replaces the original singers.","tags":["Python","PyTorch","AI Tuning","Audio Processing"],"title":"AI Cover Songs"}]