* add coder version
* merge coder and feedback prompts
* align v2 and v3 proposal prompts
* fix a small bug
* fix a bug
* fix another bug
* support both function calling and json mode in v2 proposal
* fix minor bug
* reformat
* remove proposal v3
* fix a small bug in json mode
* fix CI
* remove tmp file
* remove v3 check
---------
Co-authored-by: Xu Yang <[email protected]>
rdagent/scenarios/data_science/dev/prompts.yaml
7 additions & 3 deletions
@@ -15,7 +15,9 @@ exp_feedback:
  - Recommend corrective actions explicitly.
  - Set `"Replace Best Result": "no"`.
  - Begin your `reasoning` with `[Submission format error]`, clearly stating the issues causing experiment failure.
- - If submission passes, proceed to Step 2.
+ - If submission passes the submission format check:
+ - If this is the first valid submission ever, set `"Replace Best Result": "yes"`.
+ - Otherwise, proceed to Step 2.

  Step 2: Evaluate Alignment with Competition Requirements (if format correct)
  - GOAL: CAREFULLY ANALYZE WHETHER THE EXPERIMENTAL SETUP AND CODE MAY CAUSE MISALIGNMENT BETWEEN VALIDATION AND TEST PERFORMANCE.
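
The Step 1 branching in this hunk can be read as a small decision function. The sketch below is only an illustration of the prompt's wording, not code from the repository; `step1_decision` and its parameters are hypothetical names.

```python
from typing import Optional


def step1_decision(format_ok: bool, has_prior_valid_submission: bool) -> Optional[str]:
    """Return the "Replace Best Result" value settled in Step 1.

    Returns None when Step 1 does not settle the answer and the
    evaluation should continue with Step 2. (Hypothetical helper.)
    """
    if not format_ok:
        # Submission format error: never replace the best result.
        return "no"
    if not has_prior_valid_submission:
        # First valid submission ever: it becomes the best result.
        return "yes"
    # Valid, but not the first valid submission: defer to Step 2.
    return None
```

Under this reading, only a valid submission that follows an earlier valid one reaches the alignment check in Step 2.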
@@ -59,6 +61,8 @@ exp_feedback:
  Provide detailed and constructive feedback structured as follows:
  Example JSON Structure for Result Analysis:
  {
+ "Submission Format Check": "yes or no",
+ "First Valid Submission": "yes or no",
  "Observations": "Clearly summarize current and SOTA ensemble results with exact scores and notable patterns. Limit to no more than three concise, data-focused sentences. Your observation must be grounded by explicit evidence from scenario description or code implementation, not just validation scores.",
  "Feedback for Hypothesis": Explicitly confirm or refute the hypothesis based on specific data points or performance trends. Limit to two sentences.",
  "Evaluation Aligned With Task": "yes or no",
@@ -110,11 +114,11 @@ exp_feedback:
  {{ cur_exp.experiment_workspace.all_codes }}

  ## Feedback of past experiments
- {{ feedback_desc }}
+ {{ feedback_desc or "There has not been any experiments yet." }}
  Please refer to these hypotheses and feedback to help you recommend new experiment and hypothesis

  Tips:
- - Step 1: If submission format has issues, prioritize fixing them before proceeding.
+ - Step 1: If submission format has issues, prioritize fixing them before proceeding. If the format is correct and it's the first valid submission ever (there has never been valid submissions in the past), set `"Replace Best Result": "yes"`. If the format is correct and this is not the first valid submission, proceed to Step 2.
  - Step 2: If evaluation alignment issues are identified (validation approach does not follow competition requirements), address these methodological discrepancies immediately.
  - Step 3: If new results significantly worse than SOTA, or repeated hyperparameter adjustments yield no improvement, it might be time to rethink or shift focus.
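
The `{{ feedback_desc or "..." }}` change relies on the template's `or` picking the first truthy operand, so an empty or missing `feedback_desc` yields the fallback sentence instead of a blank section. A minimal plain-Python sketch of that behavior follows; `feedback_section` is a hypothetical helper and does not call the template engine itself.

```python
from typing import Optional

# Fallback text copied verbatim from the changed template line.
FALLBACK = "There has not been any experiments yet."


def feedback_section(feedback_desc: Optional[str]) -> str:
    """Mimic the template expression `feedback_desc or FALLBACK`."""
    # None and the empty string are both falsy, so both fall back.
    return feedback_desc or FALLBACK


print(feedback_section(""))
print(feedback_section("Exp 1: format error, fixed in Exp 2."))
```

A whitespace-only description is still truthy, so it would be rendered as-is rather than replaced.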