feat: add model removal and adjust some framework logic (#681)
* prune model task
* add component_description
* add model removal logic to component, hypo, and task gen
* fix ci
* adjust coder to meet the requirement of model removal
* fix and refine the logic of model removal
* add model removal logic in model_eval
* fix ci
* fix ci
* prune some unnecessary code
**rdagent/components/coder/data_science/model/prompts.yaml** (+40, -0)
````diff
@@ -138,3 +138,43 @@ model_eval:
     --------- Whole workflow test stdout ---------
     {{ workflow_stdout }}
     {% endif %}
+
+model_eval_rm:
+  system: |-
+    You are a data scientist responsible for evaluating the model removal process.
+
+    ## Task Description
+    {{ task_desc }}
+
+    {% if workflow_stdout is not none %}
+    ## Whole Workflow Consideration
+    The model building code is part of the whole workflow. The user has executed the entire pipeline and provided additional stdout.
+
+    **Workflow Code:**
+    ```python
+    {{ workflow_code }}
+    ```
+
+    You should evaluate both the model removal test results and the overall workflow results. **Approve the code only if both tests pass.**
+    {% endif %}
+
+    ## Evaluation Criteria
+    You will be given the standard output (`stdout`) from the model removal test and, if applicable, the workflow test.
+
+    Please respond with your feedback in the following JSON format and order:
+    ```json
+    {
+      "execution": "Describe how well the model removal executed, including any errors or issues encountered. Append all error messages and full traceback details without summarizing or omitting any information.",
+      "return_checking": "Check the generated value, including whether the value is generated and comparing the shape of the model output with the requirement in spec.md.",
+      "code": "Assess code quality, readability, and adherence to specifications.",
````
**rdagent/scenarios/data_science/proposal/prompts.yaml** (+22, -20)
```diff
@@ -213,10 +213,6 @@ direct_exp_gen:
     {% if workflow_check %}"workflow_update": [Partial Response Format 3], {% endif %}
     }
 
-    {% if extra_requirement %}
-    {{ extra_requirement }}
-    {% endif %}
-
   user: |-
     # All former successful experiments and their feedbacks, the current SOTA solution is the combination of the best solutions of these trials:
     {{ sota_exp_and_feedback_list_desc }}
@@ -226,8 +222,23 @@ direct_exp_gen:
     The user has made several hypotheses on this scenario and did several evaluations on them.
     {{ failed_exp_and_feedback_list_desc }}
 
-    {% if targets == "Building model" %}
+    {% if targets == "Model" %}
     Based on the feedback from previous experiment failures, if the failure was due to exceeding the time limit or memory constraints, start with the smallest model size or choose alternative algorithms or methods with significantly lower time or space complexity instead of using a neural network. You can then iteratively refine and optimize the model in later stages.
+
+    Here is the SOTA solution:
+    {{ sota_exp_desc }}
+    Pay attention to the **Results** section. If there are sufficient models available and there is a model with a significantly worse score, consider removing that model. In this case, `model_name` in task_design should be the model you are going to remove (the name must be the same as the name in the model column of the **Results** section), and `description` should start with "Model removal".
+    Otherwise, if the number of available models is insufficient, your task is to first decide whether to:
+    - Tune an existing model: Select one of the current models for further tuning and improvement.
+    - Add a new model: Introduce a new model to expand the hypothesis space.
+
+    The information about each model is described by the code in the workspace.
+
+    Make a decision and proceed accordingly:
+    - If you decide to tune an existing model, select the existing model file and generate a new hypothesis.
+    - If you decide to add a new model, specify the type of model you would add and generate a new hypothesis related to the new model.
+
+    When building the model, if the runtime permits, consider incorporating hyperparameter search methods to improve performance.
    {% endif %}
 
    {% endif %}
```
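The `targets == "Model"` branch above asks the LLM to choose between removing, tuning, or adding a model based on the **Results** table. Here is a rough procedural paraphrase of that rule; the thresholds are invented placeholders, since in the framework this judgment is made by the LLM rather than hard-coded.

```python
def decide_model_action(results: dict, min_models: int = 3,
                        rel_gap: float = 0.10, higher_is_better: bool = True):
    """Return ("remove", name), ("tune", name), or ("add", None).

    `results` maps model names (as in the **Results** section) to scores.
    `min_models` and `rel_gap` stand in for "sufficient models" and
    "significantly worse", which the prompt leaves to the LLM's judgment.
    """
    if not results:
        return "add", None
    ranked = sorted(results, key=results.get, reverse=higher_is_better)  # best first
    best, worst = ranked[0], ranked[-1]
    if len(results) >= min_models:
        gap = abs(results[best] - results[worst])
        if gap > rel_gap * max(abs(results[best]), 1e-12):
            # Task design: model_name = the removed model, and the
            # description starts with "Model removal".
            return "remove", worst
    # Too few models or no clear laggard: tune the best existing model
    # (adding a new model is the other option the prompt offers).
    return "tune", best
```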
```diff
@@ -258,13 +269,12 @@ component_gen:
   system: |-
     You are a Kaggle Grandmaster. You are going to provide a solution for a kaggle competition.
 
-    Here is the description of the competition scenario:
-    ```
+    # Here is the description of the competition scenario:
     {{ scenario }}
-    ```
 
     # Here is the current best version of implementation:
     {{ sota_exp_desc }}
+    [Notice] Pay attention to the **Results** section. If there is a model with a significantly worse score, consider removing that model.
 
     {% if last_exp_diff %}
     # Here are the differences between the latest version of implementation and the current best version of implementation
@@ -274,7 +284,9 @@ component_gen:
 
     You will be provided the feedback for the latest implementation.
 
-    Please select the component you are going to improve the latest implementation or sota implementation.
+    Please select the component with which you are going to improve the SOTA implementation.
+    # Here is the brief description of the components you can select:
+    {{ component_desc }}
 
     Please generate the output in JSON format following the format below:
     {{ component_output_format }}
@@ -346,17 +358,7 @@ output_format:
     The output should follow JSON format. The schema is as follows:
     {
         "model_name": "model name, must start with 'model_' and only contain letters, numbers, and underscores",
-        "description": "A precise and comprehensive description of the model",
-        "extra_params":
-        {
-            "model_type": "The type of the model, e.g., neural network, tree-based model, etc.",
-            "architecture": "A detailed description of the model's architecture, e.g., neural network layers or tree structures",
-            "hyperparameters": {
-                "hyperparameter_name_1": "value of hyperparameter 1",
-                "hyperparameter_name_2": "value of hyperparameter 2",
-                "hyperparameter_name_3": "value of hyperparameter 3"
-            },
-        },
+        "description": "A precise and comprehensive description of the model. Start with [Model building/tuning] or [Model removal].",
     }
 
   ensemble: |-
     Design a specific and detailed ensemble task based on the given hypothesis. The output should be detailed enough to directly implement the corresponding code.
```
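Since the simplified schema now constrains both `model_name` and the `description` prefix, the output lends itself to mechanical validation. A small sketch of such a check follows; this is a hypothetical helper, not part of the PR.

```python
import json
import re

MODEL_NAME_RE = re.compile(r"^model_[A-Za-z0-9_]+$")
DESC_PREFIXES = ("[Model building/tuning]", "[Model removal]")

def validate_model_task(raw: str) -> dict:
    """Check an LLM response against the simplified model-task schema."""
    task = json.loads(raw)
    if not MODEL_NAME_RE.match(task.get("model_name", "")):
        raise ValueError("model_name must start with 'model_' and contain "
                         "only letters, numbers, and underscores")
    if not task.get("description", "").startswith(DESC_PREFIXES):
        raise ValueError(f"description must start with one of {DESC_PREFIXES}")
    return task

# Example:
# validate_model_task('{"model_name": "model_lgbm", '
#                     '"description": "[Model removal] Drop the weakest model."}')
```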
**rdagent/scenarios/data_science/scen/prompts.yaml** (+6, -6)
```diff
@@ -1,21 +1,21 @@
 scenario_description: |-
   ------Background of the scenario------
-  {{background}}
+  {{background}}
 
   ------ Guidelines for participating in the competition ----
   Before submitting your results, we have numerous tests ready to check your code. Please ensure your submission is genuine and do not manipulate data or return values just to pass the tests, as this will not lead to successful final results.
 
   ------The expected output & submission format specifications------
-  {{submission_specifications}}
+  {{submission_specifications}}
 
   {% if evaluation is not none %}
   ------Evaluation------
-  {{evaluation}}
+  {{evaluation}}
   {% endif %}
 
   The evaluation metric used is directed as follows:
-  {% if metric_direction %}The metric is better when it is bigger.
-  {% else %}The metric is better when it is smaller.
+  {% if metric_direction %}The metric is better when it is bigger.
+  {% else %}The metric is better when it is smaller.
   {% endif %}
 
   {% if eda_output is not none %}
@@ -57,7 +57,7 @@ competition_background: |-
   The data type used in this competition is {{ data_type }}.
   Briefly, the competition involves: {{ brief_description }}.
   The dataset used in this competition is: {{ dataset_description }}.
-  Your goal in this competition is to: {{target_description }}.
+  Your goal in this competition is to: {{target_description }}.
 
 rich_style_description: |-
   ### {{ name }} Agent: Automated Feature Engineering & Model Tuning Evolution
```
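The changes in this file are whitespace-only, but the `metric_direction` branch is worth a quick sanity check: it renders to exactly one of the two sentences depending on the flag. A tiny Jinja2 demonstration, assuming `metric_direction` is a boolean:

```python
from jinja2 import Template

snippet = (
    "{% if metric_direction %}The metric is better when it is bigger.\n"
    "{% else %}The metric is better when it is smaller.\n"
    "{% endif %}"
)
print(Template(snippet).render(metric_direction=True))   # bigger is better
print(Template(snippet).render(metric_direction=False))  # smaller is better
```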
**rdagent/scenarios/data_science/share.yaml** (+13, -0)
```diff
@@ -60,3 +60,16 @@ describe: # some template to describe some object
       Reason: {{ exp_and_feedback[1].reason }}
     {% endfor %}
   {% endif %}
+
+
+component_description:
+  data_loader: |-
+    Loads and preprocesses competition data, ensuring proper data types, handling missing values, and providing an exploratory data analysis summary.
+  feature: |-
+    Transforms raw data into meaningful features while maintaining shape consistency, avoiding data leakage, and optimizing for model performance.
+  model: |-
+    Performs one of three tasks: model building, which develops a model to address the problem; model tuning, which optimizes an existing model for better performance; or model removal, which discards models that do not contribute effectively.
+  ensemble: |-
+    Combines predictions from multiple models using ensemble strategies, evaluates their performance, and generates the final test predictions.
+  workflow: |-
+    Integrates all pipeline components, from data loading to ensemble prediction, ensuring efficient execution and correct output formatting.
```
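These descriptions presumably feed the `{{ component_desc }}` placeholder added to `component_gen` earlier in this PR. A sketch of glue code that could assemble that string; the file path and list formatting are assumptions rather than the framework's actual implementation.

```python
import yaml

# Assumed location of the YAML shown above.
SHARE_YAML = "rdagent/scenarios/data_science/share.yaml"

def build_component_desc() -> str:
    """Join the per-component blurbs into one string for the prompt."""
    with open(SHARE_YAML) as f:
        share = yaml.safe_load(f)
    return "\n".join(
        f"- {name}: {desc.strip()}"
        for name, desc in share["component_description"].items()
    )
```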
**rdagent/utils/agent/tpl.yaml** (+3, -3)
```diff
@@ -12,10 +12,10 @@ BatchEditOut: |-
   For example:
   Inject the code into the folder. Your file name should always contain the suffix. Your file name keys should be unique to avoid delete or replace conflicts.
   {
-      <file name1>: "<code>", // indicate writing <code> into <file name1> (create new file or replace existing file)
+      <file name1>: "<code>", // indicate writing <code> into <file name1> (create a new file or update an existing file)
   {% if with_del %}
-      <file name2>: "__DEL__" // indicate removing file name2. When we want to replace a file to a new one, we usually use this
+      <file name2>: "__DEL__" // indicate removing file name2. When we want to just remove a file or replace it with a new one, we usually use this
   {% else %}
-      <file name2>(optional): "<code>" // indicate writing <code> into <file name2> (create new file or replace existing file)
+      <file name2>(optional): "<code>" // indicate writing <code> into <file name2> (create a new file or update an existing file)
```
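The `__DEL__` sentinel gives the batch-edit format both write and delete semantics. A minimal applier consistent with the documented format is sketched below; this is a hypothetical helper, and RD-Agent's actual applier may differ.

```python
import json
from pathlib import Path

DEL_SENTINEL = "__DEL__"

def apply_batch_edit(workspace: Path, batch_edit_json: str) -> None:
    """Apply a {filename: code-or-__DEL__} mapping to a workspace folder."""
    edits = json.loads(batch_edit_json)
    for name, code in edits.items():
        target = workspace / name
        if code == DEL_SENTINEL:
            target.unlink(missing_ok=True)  # delete the file (ignore if absent)
        else:
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(code)         # create a new file or update an existing one

# Example:
# apply_batch_edit(Path("ws"), '{"model_old.py": "__DEL__", "model_new.py": "print(1)"}')
```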