Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite chunk system #8177

Merged
merged 96 commits into from
Sep 26, 2022
Merged

Rewrite chunk system #8177

merged 96 commits into from
Sep 26, 2022

Conversation

Spottedleaf
Copy link
Member

@Spottedleaf Spottedleaf commented Jul 23, 2022

In the future, I want to experiment with ticking separate independent regions of the world in parallel. However, the current chunk system is just not up to that task: The scheduling is tied to the main thread, the chunk holder state is tied to the main thread, and the chunk system cannot fully utilise the worker threads it is given.

The new chunk system can be scheduled by any thread and as a result, the only time the chunk generation logic needs to go to the main thread is when it needs to execute generation/load logic that can only be done by the main thread (i.e loading in entity/poi chunks). As a result, chunk generation should be far less affected by lower TPS.

The new chunk system is also capable of executing several chunk status generation tasks in parallel. In my testing, chunky was capable of pushing all 8 threads I gave it for generating a new world, unlike the current chunk system.

Current things left to do before this can be merged:

  • Verify stability (i.e no save/unload/shutdown/crash issues)
  • Plugin compatibility
  • Figure out a merge strategy that will allow for this patch to be dropped for game updates, ensuring that it cannot stall any update
  • Complete the todo list

Current todo list:

  • Make the WatchdogThread and the main thread instances of TickThread and rewrite all of the main thread checks to just check for instanceof. This will result in allowing the watchdog thread to perform chunk system operations without worrying about stalling forever, which is a very common issue that prevents the world from properly saving.
  • Do not use ConcurrentHashMap for chunk holders. Need a solution that does not use any locks. Currently, CHM is used so the testing can validate the chunk system without worrying about the internal chunk holder map being broken.
  • Make player chunk scheduling asynchronous. Currently, the main thread has to aggressively add and propagate ticket levels and do the initial scheduling work, which isn't negligible and requires mostly good TPS. Once the chunk scheduling is moved to its own thread, chunk generation speed should be mostly independent of TPS and frees the main thread of the burden.
  • ALL of the TODO comments...

There are currently many aspects of the chunk system that need to be tested:

  • Chunk generation
  • Chunk loading
  • Chunk (poi/entity/block) saving
  • Chunk (poi/entity/block) unloading, especially with players that are passengers of an entity or entities that are passengers or vehicles
  • All of the above combined with the server shutting down (there is a new shutdown process)
  • Plugin compatibility (custom world generators for example)

In order to assist with debugging, I've rewritten /paper debug chunks and added a new command /paper holderinfo. The new paper holderinfo command can be used to see how many read only chunks are in memory as well as protochunks.

At this point, the chunk system is very new (in alpha) and as such will likely have problems that can result in data loss or even possibly corruption. Do not use this branch for anything you aren't OK with losing.

Current build: https://cdn.discordapp.com/attachments/876902758366736457/1020824757953904701/paper-paperclip-1.19.2-R0.1-SNAPSHOT-reobf.jar

@Malfrador
Copy link
Member

chunks-2022-07-23_23.13.42.txt
Pre-genned world, originally in 1.17. About a 5x5 chunks area below and ahead of me refusing to load at all.

Second time this happened in about 90 minutes of testing with several different worlds. Didn't know about the debug command the first time it happened, so no report from that. Other than this it seems to work pretty well so far, and is a noticeable improvement over the normal Paper build. Haven't tested anything related to saving though.

@Astralchroma
Copy link
Contributor

Done a little bit of testing with FastAsyncWorldEdit, and a Movecraft-Like Plugin I maintain, blocks are being set correctly server side, but those changes are not being sent to the client and are not visible to the client until a relog.

I don't know how FAWE does it, but the plugin I maintain does this with LevelChunkSection#setBlockState and uses ClientboundLevelChunkWithLightPacket to send updates. Both names are Mojang Mappings.

I don't know if any of this is helpful, I can do more testing if you want, just wanted to try and be somewhat helpful.

@Cryptite
Copy link
Contributor

Perhaps worth finally closing out #1001 and its other couple children?

Also ensure that during the unload tick that scheduling occurs
only out of the schedule lock.

Add more scheduling debug info to debug json and chunk task toString
@Spottedleaf
Copy link
Member Author

chunks-2022-07-23_23.13.42.txt Pre-genned world, originally in 1.17. About a 5x5 chunks area below and ahead of me refusing to load at all.

Second time this happened in about 90 minutes of testing with several different worlds. Didn't know about the debug command the first time it happened, so no report from that. Other than this it seems to work pretty well so far, and is a noticeable improvement over the normal Paper build. Haven't tested anything related to saving though.

Fixed in commit 7dd4845, jar has been updated

Done a little bit of testing with FastAsyncWorldEdit, and a Movecraft-Like Plugin I maintain, blocks are being set correctly server side, but those changes are not being sent to the client and are not visible to the client until a relog.

I don't know how FAWE does it, but the plugin I maintain does this with LevelChunkSection#setBlockState and uses ClientboundLevelChunkWithLightPacket to send updates. Both names are Mojang Mappings.

I don't know if any of this is helpful, I can do more testing if you want, just wanted to try and be somewhat helpful.

You need to provide code or i can't even begin to guess about this

@EterNityCH
Copy link
Member

I was able to get over 300 cps with chunky generation 🚀 the rat is running overdrive!!

@pop4959
Copy link
Member

pop4959 commented Jul 24, 2022

This is some really interesting stuff, did a couple quickie performance comparisons to really get an idea of how this new chunk system works.

Five in total, all using Aikar's flags with 8GB allocated and with level-seed=0, generating a 2000 block radius (63001 chunks) in all 3 vanilla dimensions. Worlds completely deleted between each run.

I posted logs for all of these tests here but I'm just going to summarize the results below.

Test 1: paper-77 (old chunk system) as a control

overworld: 63001 chunks in 1038966 ms ≈ 60.6 chunks per second average
nether: 63001 chunks in 839342 ms ≈ 75.1 chunks per second average
end: 63001 chunks in 182556 ms ≈ 345.1 chunks per second average

Test 2: paper-7dd4845

overworld: 63001 chunks in 1098012 ms ≈ 57.4 chunks per second average (≈94% of control)
nether: 63001 chunks in 504580 ms ≈ 124.9 chunks per second average (≈166% of control)
end: 63001 chunks in 163540 ms ≈ 385.2 chunks per second average (≈111% of control)

Test 3: paper-7dd4845 (with -DPaper.WorkerThreadCount=16 -Dchunky.maxWorkingCount=200 flags added)

overworld: 63001 chunks in 514717 ms ≈ 122.4 chunks per second average (≈201% of control)
nether: 63001 chunks in 281947 ms ≈ 223.4 chunks per second average (≈297% of control)
end: 63001 chunks in 114282 ms ≈ 551.3 chunks per second average (≈159% of control)

Test 4: paper-7dd4845 (with only -DPaper.WorkerThreadCount=16 flag added)

overworld: 63001 chunks in 527374 ms ≈ 119.5 chunks per second average (≈197% of control)
nether: 63001 chunks in 294822 ms ≈ 213.7 chunks per second average (≈284% of control)
end: 63001 chunks in 123304 ms ≈ 510.9 chunks per second average (≈148% of control)

Test 5: paper-7dd4845 (with only -Dchunky.maxWorkingCount=200 flag added)

overworld: 63001 chunks in 1094049 ms ≈ 57.6 chunks per second average (≈94% of control)
nether: 63001 chunks in 512030 ms ≈ 123.0 chunks per second average (≈163% of control)
end: 63001 chunks in 134355 ms ≈ 468.9 chunks per second average (≈135% of control)

With otherwise default configs, the new chunk system seems to be noticeably faster, especially with more worker threads in Paper. I also played around with max working count but it didn't seem to have a noticeable impact.

The new chunk system looks to be able to achieve generation speeds up to about 2-3 times faster than previously. Though I acknowledge I did not actually test the control with the WorkerThreadCount flag, as I recall at least, it did not have nearly as much of an impact as with the new chunk system due to the bottlenecks in the old chunk system. If someone wants to go and confirm that, be my guest, but I think I've done enough testing of this for tonight. 😅

When the server shuts down, the chunk system is halted.
So, any queued async chunk unload would not complete.
To fix this, just execute the async chunk unload task
synchronously in the save method.
Unload at least 50 chunks or 5% of the unload queue,
whichever is larger
@Spottedleaf
Copy link
Member Author

The jar has been updated to fix some unloading chunks not saving during shutdown, and to stagger chunk unloading across multiple ticks.

Since it was not initialised, it would always be BLOCKING.
@Spottedleaf
Copy link
Member Author

Jar has been updated to fix incorrect initial priority for I/O load tasks

Also change all thread checks to check for whether the
current thread is an instance of TickThread. This will
improve reliability on watchdog shutdown as now the
watchdog thread should now pass all of the thread checks
in the chunk system.
@NamiUni
Copy link

NamiUni commented Jul 27, 2022

I took a video for those who want to visually see how fast this new chunk system is.
I tested it on an i7-8700k.

default.mp4

@CDFN
Copy link

CDFN commented Jul 29, 2022

java.lang.reflect.InvocationTargetException: null
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) ~[?:?]
	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) ~[?:?]
	at java.lang.reflect.Method.invoke(Unknown Source) ~[?:?]
	at eu.okaeri.commands.meta.InvocationMeta.call(InvocationMeta.java:43) ~[screalms.jar:?]
	at eu.okaeri.commands.bukkit.CommandsBukkit.handleExecution(CommandsBukkit.java:271) ~[screalms.jar:?]
	at eu.okaeri.commands.bukkit.CommandsBukkit.executeCommand(CommandsBukkit.java:232) ~[screalms.jar:?]
	at eu.okaeri.commands.bukkit.CommandsBukkit.access$000(CommandsBukkit.java:47) ~[screalms.jar:?]
	at eu.okaeri.commands.bukkit.CommandsBukkit$Wrapper.execute(CommandsBukkit.java:182) ~[screalms.jar:?]
	at org.bukkit.command.SimpleCommandMap.dispatch(SimpleCommandMap.java:155) ~[paper-api-1.19-R0.1-SNAPSHOT.jar:?]
	at org.bukkit.craftbukkit.v1_19_R1.CraftServer.dispatchCommand(CraftServer.java:909) ~[paper-1.19.jar:git-Paper-"4ec2e6e"]
	at net.minecraft.server.network.ServerGamePacketListenerImpl.handleCommand(ServerGamePacketListenerImpl.java:2382) ~[?:?]
	at net.minecraft.server.network.ServerGamePacketListenerImpl.lambda$handleChatCommand$18(ServerGamePacketListenerImpl.java:2153) ~[?:?]
	at net.minecraft.server.TickTask.run(TickTask.java:18) ~[paper-1.19.jar:git-Paper-"4ec2e6e"]
	at net.minecraft.util.thread.BlockableEventLoop.doRunTask(BlockableEventLoop.java:153) ~[?:?]
	at net.minecraft.util.thread.ReentrantBlockableEventLoop.doRunTask(ReentrantBlockableEventLoop.java:24) ~[?:?]
	at net.minecraft.server.MinecraftServer.doRunTask(MinecraftServer.java:1349) ~[paper-1.19.jar:git-Paper-"4ec2e6e"]
	at net.minecraft.server.MinecraftServer.wrapRunnable(MinecraftServer.java:183) ~[paper-1.19.jar:git-Paper-"4ec2e6e"]
	at net.minecraft.util.thread.BlockableEventLoop.pollTask(BlockableEventLoop.java:126) ~[?:?]
	at net.minecraft.server.MinecraftServer.pollTaskInternal(MinecraftServer.java:1326) ~[paper-1.19.jar:git-Paper-"4ec2e6e"]
	at net.minecraft.server.MinecraftServer.pollTask(MinecraftServer.java:1319) ~[paper-1.19.jar:git-Paper-"4ec2e6e"]
	at net.minecraft.util.thread.BlockableEventLoop.managedBlock(BlockableEventLoop.java:136) ~[?:?]
	at net.minecraft.server.MinecraftServer.waitUntilNextTick(MinecraftServer.java:1297) ~[paper-1.19.jar:git-Paper-"4ec2e6e"]
	at net.minecraft.server.MinecraftServer.runServer(MinecraftServer.java:1182) ~[paper-1.19.jar:git-Paper-"4ec2e6e"]
	at net.minecraft.server.MinecraftServer.lambda$spin$0(MinecraftServer.java:303) ~[paper-1.19.jar:git-Paper-"4ec2e6e"]
	at java.lang.Thread.run(Unknown Source) ~[?:?]
Caused by: java.lang.UnsupportedOperationException
	at org.bukkit.craftbukkit.v1_19_R1.entity.CraftEntity.teleportAsync(CraftEntity.java:539) ~[paper-1.19.jar:git-Paper-"4ec2e6e"]
	at org.bukkit.craftbukkit.v1_19_R1.entity.CraftPlayer.teleportAsync(CraftPlayer.java:1141) ~[paper-1.19.jar:git-Paper-"4ec2e6e"]
	at org.bukkit.entity.Entity.teleportAsync(Entity.java:241) ~[paper-api-1.19-R0.1-SNAPSHOT.jar:?]
	at dev.cdfn.screalms.bukkit.core.command.RealmCommand.tp(RealmCommand.kt:82) ~[screalms.jar:?]
	... 26 more

Code which triggers this:

    @Executor
    @Completion(arg = ["realmUUID"], value = ["@uuidWorlds"])
    fun tp(@Sender player: Player, @Arg realmUUID: RealmUUID) {
        val world = plugin.server.getWorld(realmUUID.toString()) ?: throw IllegalArgumentException("non-loaded/existing realm uuid")
        player.teleportAsync(world.spawnLocation)
    }

This error occurs while attempting to use Player#teleportAsync method to world with no loaded chunks. I'm not sure if that's known or not so I thought it's better to report.

@MiniDigger
Copy link
Member

@CDFN, Entity#teleportAsync isn't implemented rn, thats a todo

@Spottedleaf
Copy link
Member Author

The jar has been updated to 1.19.1 and has fixed chunks/entity spawning and re-added incremental chunk saving

By dropping the data from stage 1, we did not run
the unload logic in stage 2. Now, the data is simply
not saved in stage 2, but still unloaded.
@Spottedleaf
Copy link
Member Author

I have updated the jar to support mustNotSave API for chunks properly and to fix some issues with saving unnecessary chunks during shutdown / world unload, as well as to improve the performance of /save-all flush
I am hoping this is the last build I need to make

@lynxplay
Copy link
Contributor

Can you provide the errors you are getting ?

@FrankHeijden
Copy link
Contributor

FrankHeijden commented Sep 19, 2022

@lynxplay @jokbon the error in Insights is because the field public final PersistentEntitySectionManager<Entity> entityManager was removed from net.minecraft.server.level.ServerLevel, this needs to be addressed in Insights in the future, so not an issue with this PR.

@lynxplay
Copy link
Contributor

Okay, sweet :) thanks for the quick feedback 👍

@Imptovskii
Copy link

Imptovskii commented Sep 21, 2022

I checked the test build, and on my old server Fujitsu Primergy TX100 S3P, with Intel Xeon E3 1230v2, 16GB DDR3-1600 ECC, HDD 4x500gb RAID-10, and the speed of chunks increased significantly.

I'm waiting for this patch for prod env, I think on my Core i5 11400f the results will be much better.

@Chronoken
Copy link

[18:07:03] [Craft Scheduler Thread - 21 - ModelEngine/WARN]: java.lang.NoSuchFieldError: P
[18:07:03] [Craft Scheduler Thread - 21 - ModelEngine/WARN]: at Model-Engine-R3.0.0.jar//com.ticxo.modelengine.nms.v1_19_1_R1.world.WorldHandlerImpl.asyncGetEntities(WorldHandlerImpl.java:130)
[18:07:03] [Craft Scheduler Thread - 21 - ModelEngine/WARN]: at Model-Engine-R3.0.0.jar//com.ticxo.modelengine.nms.v1_19_1_R1.world.WorldHandlerImpl.asyncGetNearbyEntities(WorldHandlerImpl.java:107)
[18:07:03] [Craft Scheduler Thread - 21 - ModelEngine/WARN]: at Model-Engine-R3.0.0.jar//com.ticxo.modelengine.nms.v1_19_1_R1.world.WorldHandlerImpl.asyncRayTraceEntities(WorldHandlerImpl.java:73)
[18:07:03] [Craft Scheduler Thread - 21 - ModelEngine/WARN]: at Model-Engine-R3.0.0.jar//com.ticxo.modelengine.nms.v1_19_1_R1.world.WorldHandlerImpl.asyncRayTrace(WorldHandlerImpl.java:41)
[18:07:03] [Craft Scheduler Thread - 21 - ModelEngine/WARN]: at Model-Engine-R3.0.0.jar//com.ticxo.modelengine.api.model.mananger.InteractionTicker.run(InteractionTicker.java:41)
[18:07:03] [Craft Scheduler Thread - 21 - ModelEngine/WARN]: at org.bukkit.craftbukkit.v1_19_R1.scheduler.CraftTask.run(CraftTask.java:101)
[18:07:03] [Craft Scheduler Thread - 21 - ModelEngine/WARN]: at org.bukkit.craftbukkit.v1_19_R1.scheduler.CraftAsyncTask.run(CraftAsyncTask.java:57)
[18:07:03] [Craft Scheduler Thread - 21 - ModelEngine/WARN]: at com.destroystokyo.paper.ServerSchedulerReportingWrapper.run(ServerSchedulerReportingWrapper.java:22)
[18:07:03] [Craft Scheduler Thread - 21 - ModelEngine/WARN]: at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[18:07:03] [Craft Scheduler Thread - 21 - ModelEngine/WARN]: at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[18:07:03] [Craft Scheduler Thread - 21 - ModelEngine/WARN]: at java.base/java.lang.Thread.run(Thread.java:833)

ModelEngine vR3.0.0 has some issues with the latest build (git-Paper-"9875e83" (MC: 1.19.2) (Implementing API version 1.19.2-R0.1-SNAPSHOT) (Git: 9875e83 on dev/chunksystem)).

@FrankHeijden
Copy link
Contributor

FrankHeijden commented Sep 21, 2022

@Chronoken ModelEngine needs to fix this, it is the same issue as mentioned before with Insights.

@Spottedleaf Spottedleaf marked this pull request as ready for review September 25, 2022 21:09
@Spottedleaf Spottedleaf requested a review from a team as a code owner September 25, 2022 21:09
@Spottedleaf Spottedleaf changed the title [DEV-BRANCH] Rewrite chunk system Rewrite chunk system Sep 25, 2022
@Spottedleaf
Copy link
Member Author

Spottedleaf commented Sep 25, 2022

Patches that were dropped that will need to be looked into later:

  • Delay Chunk Unloads based on Player Movement
  • Avoid hopper searches if there are no items [the container search patch] (already broken on current master, needs to be re-implemented anyways)
  • Optimise WorldServer#notify ?

Copy link
Member Author

@Spottedleaf Spottedleaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raw diff looks OK to me

@mrfloris
Copy link

I know we're not supposed to use this for comments, but I just want to say a massive thank you to spotted for this terrific beast of a todo item. Best of luck with this new Paper build in October as more server setups will start using it.

@adalmo
Copy link

adalmo commented Sep 27, 2022

[12:32:50 WARN]: [UltimateStacker] Plugin UltimateStacker v2.3.0 generated an exception while executing task 832
java.lang.NoSuchFieldError: P
at com.songoda.ultimatestacker.core.nms.v1_19_R1v2.world.SWorldImpl.getLivingEntities(SWorldImpl.java:26) ~[UltimateStacker-2.3.0.jar:?]
at com.songoda.ultimatestacker.core.world.SWorld.getLivingEntities(SWorld.java:35) ~[UltimateStacker-2.3.0.jar:?]
at com.songoda.ultimatestacker.tasks.StackingTask.run(StackingTask.java:86) ~[UltimateStacker-2.3.0.jar:?]
at org.bukkit.craftbukkit.v1_19_R1.scheduler.CraftTask.run(CraftTask.java:101) ~[paper-1.19.2.jar:git-Paper-176]
at org.bukkit.craftbukkit.v1_19_R1.scheduler.CraftAsyncTask.run(CraftAsyncTask.java:57) ~[paper-1.19.2.jar:git-Paper-176]
at com.destroystokyo.paper.ServerSchedulerReportingWrapper.run(ServerSchedulerReportingWrapper.java:22) ~[paper-1.19.2.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
at java.lang.Thread.run(Thread.java:1589) ~[?:?]

Panilla & UltimateStacker Plugins (latest) are making this error on Paper#174 with new chunk rewrite system

@Insprill
Copy link

Please create a new issue for issues instead of commenting here. That's an error UltimateStacker will have to fix as they're getting a field through reflection that no longer exists.

@adalmo
Copy link

adalmo commented Sep 27, 2022

oh ok thanks i will try to ask

@PaperMC PaperMC locked as resolved and limited conversation to collaborators Sep 27, 2022
@JRoy
Copy link
Member

JRoy commented Sep 27, 2022

5j3cBNdKmZ3vOQi9.mp4

@kennytv kennytv deleted the dev/chunksystem branch January 9, 2023 21:46
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.