rdagent/scenarios/kaggle/prompts.yaml
77 additions & 24 deletions
@@ -47,27 +47,80 @@ hypothesis_output_format: |-
     "concise_knowledge": "One line summary. Transferable knowledge based on theoretical principles. Use conditional grammar, e.g., "If ..., ...; When ..., ...". Make sure that you state things clearly without ambiguity, e.g., avoid saying "previous hypothesis", because one wouldn't know what that is."
   }
 
-factor_hypothesis_specification: |-
-  1. **Type of Feature and Data Characteristics:**
-    - Define the type of feature introduced.
-    - Explain the data characteristics or patterns captured by this feature.
-    - Omit unnecessary or redundant details.
-  2. **Simple and Effective Features First:**
-    - Start with features that are simple and likely effective.
-    - Concisely explain why these features are expected to work.
-    - Avoid complex or combined features initially.
-  3. **Gradual Complexity Increase:**
-    - Introduce more complex features as more experimental results are gathered.
-    - Discuss potential advantages and complexities.
+hypothesis_specification: |-
+  There are different types of hypotheses that correspond to different types of actions. The specifications are quite important here:
+  1) feature_engineering:
+    description: Engineer the most influential features to achieve the best model performance.
+    type_of_feature_and_data_characteristics:
+      - Clearly define the feature type being introduced.
+      - Highlight the specific data patterns or characteristics the feature captures.
+      - Keep it focused; omit unnecessary details.
+    start_with_simple_features:
+      - Begin with straightforward and impactful features.
+      - Briefly explain why these features are expected to work.
+      - Avoid combining complex features at the outset.
+    increase_complexity_gradually:
+      - Add more complex features only after gathering experimental results.
+      - Discuss potential advantages and the trade-offs involved.
       - Combine features only after simpler ones are tested and validated.
-  4. **New Directions and Optimizations:**
-    - If a new direction is needed, explain why based on data analysis, domain knowledge, or observed patterns.
-    - Suggest only one new direction at a time for clarity.
-    - If a previous hypothesis did not surpass the previous best, but seems optimizable, you may continue in the same direction.
-    - Highlight that features surpassing the previous best are included in the feature library to avoid re-implementation.
-  5. **1-3 Feature Tasks per Generation:**
-    - Ensure each generation produces 1-3 feature tasks.
-    - Balance simplicity and complexity to build a robust feature library.
+    new_directions_and_optimizations:
+      - Justify any new direction based on data analysis or domain knowledge.
+      - Focus on one new direction at a time for clarity.
+      - If a hypothesis shows optimization potential (even without surpassing previous best results), explain why and proceed.
+    feature_library_and_task_management:
+      - Include features that improve performance in the feature library.
+      - Each generation should focus on 1-3 feature tasks, balancing simplicity with complexity.
+
+  2) feature_processing:
+    define_the_processing_method:
+      - Clearly state the type of feature processing.
+      - Explain how this processing captures data patterns or improves feature usefulness.
+      - Avoid redundant details.
+    begin_with_simple_processing:
+      - Start with simple, effective processing methods.
+      - Concisely explain why these methods should improve model performance.
+      - Introduce complex processing only after gathering experimental results.
+    introduce_complexity_gradually:
+      - Add more sophisticated processing methods step by step, after validation.
+      - Discuss the advantages, challenges, and trade-offs of advanced processing.
+      - Validate simpler methods before combining them with complex ones.
+
+  3) model_feature_selection:
+    selection_based_on_model_type:
+      - Specify which features are being selected and explain why, considering the model type (e.g., NN, Random Forest, LightGBM, XGBoost).
+      - Ensure the relationship between features and the model type is well-defined, as different features perform better on different models.
+    pattern_recognition:
+      - Explain the data characteristics or patterns that influenced feature selection for the specific model.
+      - Clarify how the selected features complement the model's strengths and handle its potential weaknesses.
+
+  4) model_design_and_tuning:
+    Explain the hypothesis clearly and with valuable information. What kind of model are you building or tuning? What do you believe is true? How are you revising it, and why? What are some innovations?
+      - Focus on designing new model architectures one at a time, on hyper-parameter tuning, or on both.
+      - Each hypothesis should introduce a novel architecture or a significant modification to an existing one, while leveraging previous experiences and the hypothesis history.
+      - Optimize one model at a time, iterating until its potential is fully explored. Switch to a new model only when you believe the current model's potential has been exhausted.
+    specific_to_model_type:
+      - Any tuning or model design must be specific to the model types available in our workspace.
+      - Clearly define the model type (e.g., Neural Network models such as MLP, CNN, RNN, LSTM, GRU; XGBoost; RandomForest; LightGBM) and the architecture/tuning being introduced.
+      - Ensure the architecture or tuning aligns with the data characteristics and the strengths or limitations of the specific model.
+    rationale_behind_architecture_and_tuning:
+      - Explain the innovation or reasoning behind the architectural design or tuning approach.
+      - Justify how the new structure or parameter change captures data patterns more effectively, improves learning efficiency, or enhances predictive power.
+    start_simple_innovate_gradually:
+      - Start with innovative yet simple changes to ensure each iteration is well tested and the results are well understood.
+      - Gradually introduce more complex architectural changes or hyper-parameter adjustments based on gathered results and insights.
+    introduce_one_innovation_at_a_time:
+      - Focus on testing one key innovation at a time to isolate its impact on performance.
+      - Avoid combining multiple innovations in a single iteration to maintain clarity in performance results.
+    balance_innovation_with_performance:
+      - Strive for a balance between creative design and practical, effective performance.
+      - If a design or tuning shows strong performance, document it in a "library" for future iterations.
+    iterative_testing_and_refinement:
+      - After each test, evaluate and refine the model architecture or tuning based on observed performance and data patterns.
+      - If a hypothesis shows potential but doesn't surpass previous results, continue optimizing in that direction.
+    hypothesis_statement:
+      - For each hypothesis, specify the exact innovation or tuning approach and explain why it's expected to enhance performance for the chosen model type.
+
 
 feature_experiment_output_format: |-
   According to the hypothesis, please help the user design one or more feature engineering tasks.
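For context, sections like `hypothesis_specification` and `feature_experiment_output_format` are typically consumed at runtime by key lookup on the parsed YAML. A minimal sketch of that pattern (the dict contents and the `build_prompt` helper below are hypothetical stand-ins, not RD-Agent's actual loader):

```python
# Stand-in for the parsed prompts.yaml; in practice this dict would come
# from a YAML loader reading rdagent/scenarios/kaggle/prompts.yaml.
prompts = {
    "hypothesis_specification": (
        "There are different types of hypotheses that correspond to "
        "different types of actions: 1) feature_engineering, "
        "2) feature_processing, 3) model_feature_selection, "
        "4) model_design_and_tuning."
    ),
    "feature_experiment_output_format": (
        "According to the hypothesis, please help the user design one "
        "or more feature engineering tasks."
    ),
}


def build_prompt(section: str) -> str:
    """Return one prompt section, failing loudly on an unknown key."""
    if section not in prompts:
        raise KeyError(f"unknown prompt section: {section!r}")
    return prompts[section]


print(build_prompt("feature_experiment_output_format"))
```

Failing loudly on unknown keys (rather than returning a default) is useful here because a diff like this one renames sections, e.g. `factor_hypothesis_specification` to `hypothesis_specification`, and a silent fallback would mask stale lookups.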