rdagent/scenarios/data_science/proposal/exp_gen/prompts_v2.yaml
15 additions & 11 deletions
@@ -1,7 +1,7 @@
  scenario_problem:
  system: |-
  {% include "scenarios.data_science.share:scen.role" %}
- You will be given scenario and competition description and the current SOTA implementation and feedback.
+ You will be given the scenario description and the current SOTA implementation and feedback.
  Your task is to analyze the given information and extract the **Scenario Problems** from the given materials.

  ## Scenario Problems
@@ -25,9 +25,6 @@ scenario_problem:
  # Scenario Description
  {{ scenario_desc }}

- # Competition Description
- {{ competition_desc }}
-
  # Current SOTA Implementation
  {{ sota_exp_desc }}

@@ -98,8 +95,11 @@ hypothesis_gen:
  - If the problem relates to time/memory constraints, suggest smaller model sizes or alternative algorithms with reduced complexity.
  - If the problem involves underperforming models, propose removing or replacing models with significantly worse performance.
  - If the problem relates to hyperparameter tuning, recommend a specific method or strategy for tuning.
+ 4. Specific and Non-Vague
+ - Avoid vague statements like "improve the model" or "optimize the pipeline." Instead, specify the exact changes to be made.
+ - No phrases like "for example" or "eg.," should be used in the hypothesis. Give a clear decision in the hypothesis.
  {% if enable_idea_pool %}
- 4. Idea Reference
+ 5. Idea Reference
  - Each idea is a method, technique or trick that contributes to high performance from other competition implementation under similar problem. You are free to use them as an inspiration for your hypothesis proposal.
  {% endif %}

@@ -114,7 +114,7 @@ hypothesis_gen:
  Please score the proposed hypothesis from 1 to 10 for each of the following dimensions (where 1 means lowest and 10 means highest):
  1. Problem-Hypothesis Alignment: How well the hypothesis addresses the identified problem.
  2. Expected Impact: The estimated improvement after applying the hypothesis to current SOTA implementation.
- 3. Novelty: Degree of innovation compared to previous attempts. If the proposed hypothesis is very similar to previous experiments' hypothesis, assign low novelty score.
+ 3. Novelty: Degree of innovation compared to previous attempts. If the proposed hypothesis is similar to previous experiments' hypothesis, assign novelty score to one.
  4. Feasibility: The ease of implementing the proposed hypothesis in the current SOTA implementation.
  5. Risk-Reward Balance: The exploration-exploitation balance of the proposed hypothesis.

@@ -147,10 +147,13 @@ task_gen:
  {{ task_specification }}

  ## Task Design Guidelines
- The task should be concise with several steps each only in a few sentences.
- DO NOT repeat the details which has already included in the SOTA code. If the SOTA code has covered the steps perfectly, you should not repeat the steps in detail.
- DO NOT write any code in the task description!
- Observing reasons from failed experiments and feedback to prevent repeating similar mistakes in analogous situations.
+ 1. The task should be concise with several steps each only in a few sentences.
+ 2. DO NOT repeat the details which has already included in the SOTA code. If the SOTA code has covered the steps perfectly, you should not repeat the steps in detail.
+ 3. DO NOT write any code in the task description!
+ 4. Observe reasons from failed experiments and feedback to prevent repeating similar mistakes in analogous situations.
+ 5. Specific and Non-Vague
+ - Avoid vague statements like "choose a proper model" Instead, specify the exact task to be made.
+ - No phrases like "for example" or "eg.," should be used in the task. Give a clear decision in the task.

  ## [Partial Response Format 1] Task Output Format:
  {{ task_output_format }}
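Review note: the "Specific and Non-Vague" rule added to `hypothesis_gen` and `task_gen` is enforced only by the prompt wording. If a programmatic check on the model's output were ever wanted, it could look roughly like the sketch below. The helper name `find_hedging_phrases` and the regular expression are hypothetical and are not part of this change; the pattern covers only the two phrases the prompt names ("for example" and "eg.", with or without the inner period).

```python
import re
from typing import List

# Hypothetical pattern: matches "for example", "eg." and "e.g." (case-insensitive).
# The \b before "e" keeps words such as "leg." from matching.
_HEDGE_PATTERN = re.compile(r"\bfor example\b|\be\.?g\.", re.IGNORECASE)


def find_hedging_phrases(text: str) -> List[str]:
    """Return every hedging phrase found in a generated hypothesis or task."""
    return [match.group(0) for match in _HEDGE_PATTERN.finditer(text)]


if __name__ == "__main__":
    vague = "Use a stronger tree model, e.g. LightGBM."
    specific = "Replace the CNN with LightGBM and set learning_rate to 0.1."
    print(find_hedging_phrases(vague))     # one hit, so the text would be flagged
    print(find_hedging_phrases(specific))  # no hits
```

An empty list means the text passes; any hit could be used to trigger a retry.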
@@ -214,6 +217,7 @@ specification:
  problem: |-
  1. The problem should be specific and fine-grained. Avoid general or vague statements.
  2. The problem should technical or methodological. Focus on design and implementation flaws, not runtime errors.
+ 3. The problem should be strictly aligned with the improvement of target metric. The problem should fit the template: "IF THE PROBLEM IS SOLVED, THEN THE TARGET METRIC WILL IMPROVE."

  hypothesis: |-
  1. The hypothesis should be precise, testable, and directly actionable. Avoid general or vague statements. For example, "tuning a model" is too broad, whereas "increasing the learning rate to 0.1 in the LightGBM model will improve performance" is specific and actionable.
@@ -230,7 +234,7 @@ output_format:
  problem: |-
  For each of the identified problem, you should strictly adhere to the following JSON schema.
  Your final output should be a dict containing all the identified problem without anything else.
- Please respond at most five problems considering the most valuable and recently not explored.
+ Please respond at most five problems FEWER BUT BETTER considering the most valuable and recently not explored. Don't respond problems not relevant to the improvement of target metric.
  {
  "problem name 1": {
  "problem": "Description of the first issue in no more than three sentences.",
rdagent/scenarios/data_science/scen/prompts.yaml
16 additions & 11 deletions
@@ -11,18 +11,16 @@ scenario_description: |-
  ------The name of the evaluation metric used------
  {{ metric_name }}

- ------The time limit to your code------
+ {% if time_limit %}------The time limit to your code------
  You code running is limit to {{ time_limit }}, please change yor model type and model parameters to make sure your code can run within the time limit.

- {% if evaluation is not none %}
- ------Evaluation------
- {{ evaluation }}
  {% endif %}
+ {% if evaluation is not none %}------Evaluation------
+ {{ evaluation }}

- The evaluation metrics used is directed as:
- {% if metric_direction %} The metric is better when it is bigger.
- {% else %} The metric is better when it is smaller.
  {% endif %}
+ The evaluation metrics used is directed as:
+ The metric is better when it is {% if metric_direction %}bigger{% else %}smaller{% endif %}.

  {% if eda_output is not none %}------Data Overview(EDA)------
  {{ eda_output }}
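Review note: this hunk uses two different guards, `{% if time_limit %}` (a truthiness test) and `{% if evaluation is not none %}` (a `none` test), and folds the metric direction into a single inline conditional. The sketch below shows how those constructs render. It assumes the `jinja2` package is installed and that the prompts are rendered with Jinja2; the template strings `TRUTHY` and `NOT_NONE` are reduced stand-ins written for illustration, not the project's actual loader or full prompt text.

```python
from jinja2 import Template

# The inline conditional from the new metric-direction line.
METRIC_LINE = Template(
    "The metric is better when it is "
    "{% if metric_direction %}bigger{% else %}smaller{% endif %}."
)

# Illustrative stand-ins for the two guard styles used in the hunk.
TRUTHY = Template("{% if value %}shown{% else %}hidden{% endif %}")
NOT_NONE = Template("{% if value is not none %}shown{% else %}hidden{% endif %}")

if __name__ == "__main__":
    print(METRIC_LINE.render(metric_direction=True))
    print(METRIC_LINE.render(metric_direction=False))
    # An empty string is falsy but is not None, so the two guards disagree.
    print(TRUTHY.render(value=""), NOT_NONE.render(value=""))
    # None is rejected by both guards.
    print(TRUTHY.render(value=None), NOT_NONE.render(value=None))
```

Under these assumptions, an empty `time_limit` string would drop the time-limit section, whereas an empty `evaluation` string would still emit the Evaluation header.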
@@ -60,11 +58,18 @@ competition_background: |-
  Your knowledge spans cutting-edge data analysis techniques, advanced machine learning algorithms, and their practical applications to solve complex real-world problems.
  You are dedicated to producing accurate, efficient, and innovative solutions.

- The task type for this competition is {{ task_type }}.
- The data type used in this competition is {{ data_type }}.
+ The task type for this competition is **{{ task_type }}**.
+ The data type used in this competition is **{{ data_type }}**.
+
  Briefly, the competition involves: {{ brief_description }}.
- The dataset used in this competition is: {{ dataset_description }}.
- The evaluation metric of this competition is: {{ metric_description }}.
+
+ The dataset used in this competition is:
+ {{ dataset_description }}.
+
+ Submission channel number to each sample is: {{ model_output_channel }}.
+
+ The evaluation metric of this competition is:
+ {{ metric_description }}.

  rich_style_description: |-
  ### {{ name }} Agent: Automated Feature Engineering & Model Tuning Evolution