Code and annotated data for "The Shifted and The Overlooked: A Task-oriented Investigation of User-GPT Interactions"
The ShareGPT collection we used for annotation, which contains 94k user queries, is publicly available here.
To run the annotation, first configure the GPT API, then run python annotation.py --data_file $DATA$ --demo_file $DEMO$ --output_file $OUTPUT$. The annotation uses GPT-4 as the default engine.
Because running the annotation is resource-intensive, we provide our annotated version here. Each sample has the following attributes:
"id": the ID of the user query in the ShareGPT collection.
"domain": the annotated domain/topic of the query.
"summary": a one-sentence summary of the user query.
"task_type": the specific task generated in a free-form manner by GPT-4.