@tmck-code
Owner

TL;DR

I lowered the number of cowfiles from ~5000 to ~3500 by skipping duplicates.

Context

While doing some unrelated debugging, I noticed pokemon entries that seemed to be straight-up duplicates, e.g. the variations of Lament here (between gen7 and gen8):

image

Upon inspecting the text, I found that this was indeed the case:

 ☯ ~/d/p/build diff -s assets/cows/4348.cow assets/cows/1641.cow
Files assets/cows/4348.cow and assets/cows/1641.cow are identical
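
The same check can be run across the whole directory rather than one pair at a time. The following is a minimal sketch, not part of this PR: it assumes the cowfiles sit in a single directory and match `*.cow`, and the function name `find_duplicate_groups` is hypothetical. It groups files by a SHA-256 digest of their bytes, which stands in for the byte-for-byte comparison that `diff -s` performs.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_groups(directory, pattern="*.cow"):
    """Return lists of file names whose contents are byte-identical.

    Hypothetical helper: the directory layout and the "*.cow" pattern
    are assumptions, not taken from the project's build code.
    """
    by_digest = defaultdict(list)
    # Sort so that the order of names inside each group is deterministic.
    for path in sorted(Path(directory).glob(pattern)):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        by_digest[digest].append(path.name)
    # A digest shared by more than one file marks a set of duplicates.
    return [names for names in by_digest.values() if len(names) > 1]
```

Calling `find_duplicate_groups("assets/cows")` would then list every group of identical files, so the pair above should show up together in one group.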

Changes

  • collect list of all cowfile data before each is written
  • check all data against previous entries before writing
  • skip if data was already written

This skips around 1500 files, which lowers the total from ~5000 to ~3500.

This results in a small file size decrease and a small speed increase (no more than 10%).
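
The three steps listed under Changes can be sketched roughly as below. This is an illustration under assumptions, not the code from this PR: the function name `write_unique_cowfiles`, the `(name, data)` input shape, and the use of Python are all hypothetical, and a set is used for the "previous entries" lookup where the description mentions a list.

```python
from pathlib import Path


def write_unique_cowfiles(entries, out_dir):
    """Write each (name, data) pair unless identical data was already written.

    Hypothetical sketch: `entries` is an iterable of (file name, bytes)
    pairs. Returns the names written and the number of entries skipped.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    seen = set()   # data of every cowfile written so far
    written = []
    skipped = 0
    for name, data in entries:
        if data in seen:
            # Identical data was already written under another name.
            skipped += 1
            continue
        seen.add(data)
        (out / name).write_bytes(data)
        written.append(name)
    return written, skipped
```

Only the first occurrence of each distinct cowfile is kept, so which name survives depends on the order in which the entries arrive.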

tmck-code added 2 commits May 28, 2025 23:27
@tmck-code tmck-code merged commit a129b66 into master May 28, 2025
1 check passed
@tmck-code tmck-code deleted the skip-duplicate-cowfiles branch May 28, 2025 13:38
@tmck-code tmck-code added the performance Related to speed/resource usage label May 28, 2025