Commit 76d8536

XianBW and peteryang1 authored
fix: align competion_full_desc and scenario_all_desc, remove redundant info in problems proposal (#808)
* align competition desc & scenario desc string
* remove competition_desc when having used scenario_desc in problem gen
* fix bug
* remove redundant competition desc in naive expgen
* improve proposal prompt
* modify phrase

Co-authored-by: Xu Yang <[email protected]>
1 parent b7d2c12 commit 76d8536

File tree

6 files changed (+44, -42 lines)

rdagent/scenarios/data_science/proposal/exp_gen/naive.py

Lines changed: 0 additions & 2 deletions
@@ -15,7 +15,6 @@ class NaiveExpGen(ExpGen):
     def gen(self, trace: DSTrace) -> DSExperiment:
         sota_exp = trace.sota_experiment()
         scenario_desc = trace.scen.get_scenario_all_desc()
-        competition_desc = trace.scen.get_competition_full_desc()
         sota_exp_desc = T("scenarios.data_science.share:describe.exp").r(
             exp=sota_exp, heading="Best of previous exploration of the scenario"
         )
@@ -28,7 +27,6 @@ def gen(self, trace: DSTrace) -> DSExperiment:
         sys_prompt = T(".naive:naive_gen.system").r()

         user_prompt = T(".naive:naive_gen.user").r(
-            competition_desc=competition_desc,
             sota_exp_desc=sota_exp_desc,
             scenario_desc=scenario_desc,
             exp_and_feedback_list_desc=exp_and_feedback_list_desc,

rdagent/scenarios/data_science/proposal/exp_gen/naive.yaml

Lines changed: 0 additions & 3 deletions
@@ -25,9 +25,6 @@ naive_gen:
     # Scenario Description
     {{ scenario_desc }}

-    # Competition Description
-    {{ competition_desc }}
-
     # Previous Experiments and Feedbacks:
     {{ exp_and_feedback_list_desc }}

rdagent/scenarios/data_science/proposal/exp_gen/prompts_v2.yaml

Lines changed: 15 additions & 11 deletions
@@ -1,7 +1,7 @@
 scenario_problem:
   system: |-
     {% include "scenarios.data_science.share:scen.role" %}
-    You will be given scenario and competition description and the current SOTA implementation and feedback.
+    You will be given the scenario description and the current SOTA implementation and feedback.
     Your task is to analyze the given information and extract the **Scenario Problems** from the given materials.

     ## Scenario Problems
@@ -25,9 +25,6 @@ scenario_problem:
     # Scenario Description
     {{ scenario_desc }}

-    # Competition Description
-    {{ competition_desc }}
-
     # Current SOTA Implementation
     {{ sota_exp_desc }}

@@ -98,8 +95,11 @@ hypothesis_gen:
       - If the problem relates to time/memory constraints, suggest smaller model sizes or alternative algorithms with reduced complexity.
       - If the problem involves underperforming models, propose removing or replacing models with significantly worse performance.
       - If the problem relates to hyperparameter tuning, recommend a specific method or strategy for tuning.
+    4. Specific and Non-Vague
+      - Avoid vague statements like "improve the model" or "optimize the pipeline." Instead, specify the exact changes to be made.
+      - No phrases like "for example" or "eg.," should be used in the hypothesis. Give a clear decision in the hypothesis.
     {% if enable_idea_pool %}
-    4. Idea Reference
+    5. Idea Reference
       - Each idea is a method, technique or trick that contributes to high performance from other competition implementation under similar problem. You are free to use them as an inspiration for your hypothesis proposal.
     {% endif %}

@@ -114,7 +114,7 @@ hypothesis_gen:
     Please score the proposed hypothesis from 1 to 10 for each of the following dimensions (where 1 means lowest and 10 means highest):
     1. Problem-Hypothesis Alignment: How well the hypothesis addresses the identified problem.
     2. Expected Impact: The estimated improvement after applying the hypothesis to current SOTA implementation.
-    3. Novelty: Degree of innovation compared to previous attempts. If the proposed hypothesis is very similar to previous experiments' hypothesis, assign low novelty score.
+    3. Novelty: Degree of innovation compared to previous attempts. If the proposed hypothesis is similar to previous experiments' hypothesis, assign novelty score to one.
     4. Feasibility: The ease of implementing the proposed hypothesis in the current SOTA implementation.
     5. Risk-Reward Balance: The exploration-exploitation balance of the proposed hypothesis.

@@ -147,10 +147,13 @@ task_gen:
     {{ task_specification }}

     ## Task Design Guidelines
-    The task should be concise with several steps each only in a few sentences.
-    DO NOT repeat the details which has already included in the SOTA code. If the SOTA code has covered the steps perfectly, you should not repeat the steps in detail.
-    DO NOT write any code in the task description!
-    Observing reasons from failed experiments and feedback to prevent repeating similar mistakes in analogous situations.
+    1. The task should be concise with several steps each only in a few sentences.
+    2. DO NOT repeat the details which has already included in the SOTA code. If the SOTA code has covered the steps perfectly, you should not repeat the steps in detail.
+    3. DO NOT write any code in the task description!
+    4. Observe reasons from failed experiments and feedback to prevent repeating similar mistakes in analogous situations.
+    5. Specific and Non-Vague
+      - Avoid vague statements like "choose a proper model" Instead, specify the exact task to be made.
+      - No phrases like "for example" or "eg.," should be used in the task. Give a clear decision in the task.

     ## [Partial Response Format 1] Task Output Format:
     {{ task_output_format }}
@@ -214,6 +217,7 @@ specification:
   problem: |-
     1. The problem should be specific and fine-grained. Avoid general or vague statements.
     2. The problem should technical or methodological. Focus on design and implementation flaws, not runtime errors.
+    3. The problem should be strictly aligned with the improvement of target metric. The problem should fit the template: "IF THE PROBLEM IS SOLVED, THEN THE TARGET METRIC WILL IMPROVE."

   hypothesis: |-
     1. The hypothesis should be precise, testable, and directly actionable. Avoid general or vague statements. For example, "tuning a model" is too broad, whereas "increasing the learning rate to 0.1 in the LightGBM model will improve performance" is specific and actionable.
@@ -230,7 +234,7 @@ output_format:
   problem: |-
     For each of the identified problem, you should strictly adhere to the following JSON schema.
     Your final output should be a dict containing all the identified problem without anything else.
-    Please respond at most five problems considering the most valuable and recently not explored.
+    Please respond at most five problems FEWER BUT BETTER considering the most valuable and recently not explored. Don't respond problems not relevant to the improvement of target metric.
     {
       "problem name 1": {
         "problem": "Description of the first issue in no more than three sentences.",

rdagent/scenarios/data_science/proposal/exp_gen/proposal.py

Lines changed: 1 addition & 4 deletions
@@ -226,14 +226,13 @@ def _f(user_prompt):


 class DSProposalV2ExpGen(ExpGen):
-    def identify_scenario_problem(self, scenario_desc: str, competition_desc: str, sota_exp_desc: str) -> Dict:
+    def identify_scenario_problem(self, scenario_desc: str, sota_exp_desc: str) -> Dict:
         sys_prompt = T(".prompts_v2:scenario_problem.system").r(
             problem_spec=T(".prompts_v2:specification.problem").r(),
             problem_output_format=T(".prompts_v2:output_format.problem").r(),
         )
         user_prompt = T(".prompts_v2:scenario_problem.user").r(
             scenario_desc=scenario_desc,
-            competition_desc=competition_desc,
             sota_exp_desc=sota_exp_desc,
         )
         response = APIBackend().build_messages_and_create_chat_completion(
@@ -445,7 +444,6 @@ def gen(self, trace: DSTrace, pipeline: bool = False) -> DSExperiment:
         else:
             eda_output = sota_exp.experiment_workspace.file_dict.get("EDA.md", None)
         scenario_desc = trace.scen.get_scenario_all_desc(eda_output=eda_output)
-        competition_desc = trace.scen.get_competition_full_desc()

         sota_exp_desc = T("scenarios.data_science.share:describe.exp").r(
             exp=sota_exp, heading="Best of previous exploration of the scenario"
@@ -463,7 +461,6 @@ def gen(self, trace: DSTrace, pipeline: bool = False) -> DSExperiment:
         # Step 1: Identify problems
         scen_problems = self.identify_scenario_problem(
             scenario_desc=scenario_desc,
-            competition_desc=competition_desc,
             sota_exp_desc=sota_exp_desc,
         )
         for problem_name in scen_problems:

rdagent/scenarios/data_science/scen/__init__.py

Lines changed: 12 additions & 11 deletions
@@ -92,17 +92,6 @@ def _analysis_competition_description(self):
         self.metric_name = response_json_analysis.get("Metric Name", "custom_metric")
         self.metric_direction_guess = response_json_analysis.get("Metric Direction", True)

-    def get_competition_full_desc(self) -> str:
-        return f"""Task Type: {self.task_type}
-Data Type: {self.data_type}
-Brief Description: {self.brief_description}
-Dataset Description: {self.dataset_description}
-Submission Specifications: {self.submission_specifications}
-Model Output Channel: {self.model_output_channel}
-Metric Evaluation Description: {self.metric_description}
-Metric Name: {self.metric_name}
-"""
-
     @property
     def background(self) -> str:
         background_template = T(".prompts:competition_background")
@@ -111,6 +100,7 @@ def background(self) -> str:
             data_type=self.data_type,
             brief_description=self.brief_description,
             dataset_description=self.dataset_description,
+            model_output_channel=self.model_output_channel,
             metric_description=self.metric_description,
         )
         return background_prompt
@@ -122,6 +112,17 @@ def rich_style_description(self) -> str:
             competition=self.competition,
         )

+    def get_competition_full_desc(self) -> str:
+        return T(".prompts:scenario_description").r(
+            background=self.background,
+            submission_specifications=self.submission_specifications,
+            evaluation=self.metric_description,
+            metric_name=self.metric_name,
+            metric_direction=self.metric_direction,
+            time_limit=None,
+            eda_output=None,
+        )
+
     def get_scenario_all_desc(self, eda_output=None) -> str:
         """
         eda_output depends on dynamic .md files from current workspace, not fixed.
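
Taken together, the removal of the old f-string and the new template-backed method mean both description strings now come from the same scenario_description template. A minimal stand-in sketch of that alignment is below (toy names, not the repository's classes; the body of get_scenario_all_desc is not part of this diff and is assumed to use the same template).

# Minimal stand-in sketch (NOT the repository's scenario class): it only
# illustrates the alignment introduced here, i.e. both description methods
# funnel through one scenario_description renderer instead of two divergent
# strings. All names below are illustrative.
from dataclasses import dataclass


def render_scenario_description(*, background, metric_name, metric_direction,
                                time_limit=None, eda_output=None):
    # Toy replacement for T(".prompts:scenario_description").r(...)
    parts = [background, f"Metric: {metric_name}"]
    if time_limit:
        parts.append(f"Time limit: {time_limit}")
    parts.append("The metric is better when it is "
                 + ("bigger" if metric_direction else "smaller") + ".")
    if eda_output is not None:
        parts.append(f"EDA:\n{eda_output}")
    return "\n".join(parts)


@dataclass
class ToyScen:
    background: str = "A toy competition."
    metric_name: str = "AUC"
    metric_direction: bool = True

    def get_competition_full_desc(self):
        # Mirrors the new method above: same template, no time limit / EDA.
        return render_scenario_description(
            background=self.background,
            metric_name=self.metric_name,
            metric_direction=self.metric_direction,
        )

    def get_scenario_all_desc(self, eda_output=None):
        # The real method is not shown in this diff; it is assumed to render
        # the same template, optionally enriched with workspace EDA output.
        return render_scenario_description(
            background=self.background,
            metric_name=self.metric_name,
            metric_direction=self.metric_direction,
            eda_output=eda_output,
        )


scen = ToyScen()
# With no EDA output the two descriptions coincide, which is why callers in
# this commit stop passing competition_desc alongside scenario_desc.
assert scen.get_competition_full_desc() == scen.get_scenario_all_desc()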

rdagent/scenarios/data_science/scen/prompts.yaml

Lines changed: 16 additions & 11 deletions
@@ -11,18 +11,16 @@ scenario_description: |-
   ------The name of the evaluation metric used------
   {{ metric_name }}

-  ------The time limit to your code------
+  {% if time_limit %}------The time limit to your code------
   You code running is limit to {{ time_limit }}, please change yor model type and model parameters to make sure your code can run within the time limit.

-  {% if evaluation is not none %}
-  ------Evaluation------
-  {{ evaluation }}
   {% endif %}
+  {% if evaluation is not none %}------Evaluation------
+  {{ evaluation }}

-  The evaluation metrics used is directed as:
-  {% if metric_direction %} The metric is better when it is bigger.
-  {% else %} The metric is better when it is smaller.
   {% endif %}
+  The evaluation metrics used is directed as:
+  The metric is better when it is {% if metric_direction %}bigger{% else %}smaller{% endif %}.

   {% if eda_output is not none %}------Data Overview(EDA)------
   {{ eda_output }}
@@ -60,11 +58,18 @@ competition_background: |-
   Your knowledge spans cutting-edge data analysis techniques, advanced machine learning algorithms, and their practical applications to solve complex real-world problems.
   You are dedicated to producing accurate, efficient, and innovative solutions.

-  The task type for this competition is {{ task_type }}.
-  The data type used in this competition is {{ data_type }}.
+  The task type for this competition is **{{ task_type }}**.
+  The data type used in this competition is **{{ data_type }}**.
+
   Briefly, the competition involves: {{ brief_description }}.
-  The dataset used in this competition is: {{ dataset_description }}.
-  The evaluation metric of this competition is: {{ metric_description }}.
+
+  The dataset used in this competition is:
+  {{ dataset_description }}.
+
+  Submission channel number to each sample is: {{ model_output_channel }}.
+
+  The evaluation metric of this competition is:
+  {{ metric_description }}.

 rich_style_description: |-
   ### {{ name }} Agent: Automated Feature Engineering & Model Tuning Evolution
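
The scenario_description edits above mainly wrap the time-limit block in {% if time_limit %} and collapse the metric-direction sentence into a single line. A standalone Jinja2 sketch of that behaviour follows, using a trimmed, paraphrased excerpt rendered directly with jinja2.Template rather than the repository's T(...) helper.

# Standalone Jinja2 sketch of the refactored conditionals; the template text
# below is a trimmed, paraphrased excerpt, not the full scenario_description.
from jinja2 import Template

excerpt = (
    "{% if time_limit %}------The time limit to your code------\n"
    "Your code is limited to {{ time_limit }}.\n"
    "{% endif %}"
    "The metric is better when it is "
    "{% if metric_direction %}bigger{% else %}smaller{% endif %}."
)

# With time_limit=None the whole time-limit block disappears, which is the
# point of the new {% if time_limit %} guard.
print(Template(excerpt).render(time_limit=None, metric_direction=True))
# -> The metric is better when it is bigger.

print(Template(excerpt).render(time_limit="2 hours", metric_direction=False))
# -> ------The time limit to your code------
#    Your code is limited to 2 hours.
#    The metric is better when it is smaller.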
