Depending on what structure you are opting for, you might want to take a look at this thread:
A few Forum members, myself included, discussed and worked out a solution for semantically chunking a document using GPT-4-turbo. In essence, the approach uses GPT-4-turbo to create an outline of the document (including the start and end positions of the individual sections) and then uses that information to programmatically extract the text verbatim from the document into structured JSON.
The benefit of this approach is that you only need one API call to get the document’s basic structure and you don’t have to worry about output token constraints. It also saves a lot of cost compared to asking the model to return the text verbatim. That said, the approach is currently most applicable to documents with clearly defined sections.
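To make the second step concrete, here is a minimal sketch of the programmatic extraction, assuming the model was asked to return its outline as JSON with character offsets. The `extract_sections` helper and the hard-coded `outline` are illustrative stand-ins for what a real GPT-4-turbo response might contain:

```python
import json

def extract_sections(document: str, outline: list[dict]) -> list[dict]:
    # Slice the original document verbatim using the model-provided offsets,
    # so no output tokens are spent reproducing the text itself.
    return [
        {"title": s["title"], "text": document[s["start"]:s["end"]]}
        for s in outline
    ]

# In practice, the outline would come from a single GPT-4-turbo call asked to
# return JSON like [{"title": ..., "start": <offset>, "end": <offset>}, ...].
# It is hard-coded here so the sketch runs stand-alone.
document = "1. Intro\nSome intro text.\n2. Methods\nSome methods text."
outline = [
    {"title": "Intro", "start": 0, "end": 25},
    {"title": "Methods", "start": 25, "end": len(document)},
]

chunks = extract_sections(document, outline)
print(json.dumps(chunks, indent=2))
```

Because the text is sliced locally, the extracted chunks are guaranteed to match the source verbatim; the model only has to get the offsets right.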