SWE Java Task

This guide outlines a structured approach for evaluating GitHub issues and Pull Requests (PRs) with an emphasis on Docker-based testing. It details the evaluation process through four phases: issue assessment, PR validation, Docker testing, and adding hints if necessary, along with specific criteria for both issues and PRs. The document also includes step-by-step instructions for conducting the evaluations and highlights the importance of ensuring proper testing and validation of code changes.

Comprehensive Guide to GitHub Issue and PR Evaluation with Docker Testing

This guide provides a structured approach to evaluating GitHub issues and their corresponding
Pull Requests (PRs), with a focus on Docker-based validation. Following these steps will help
you efficiently assess whether a task meets the required criteria and properly test the
implementation.

Introduction

The evaluation process consists of four main phases:

1. Issue assessment
2. PR validation
3. Docker-based testing
4. Adding hints (if needed)

Each phase has specific criteria that must be satisfied for a task to be considered complete and
acceptable. This guide walks you through the entire process from start to finish.

Issue and PR Evaluation Criteria

Issue Requirements

When reviewing an issue, ensure it meets these criteria:


• Consistency: Issue description aligns with the PR and test changes
• Clarity: Description is clear and understandable, even if not perfect
• Independence: Not a follow-up solution to another PR
• Validity: Not requesting a revert to a previous version
• Feasibility: Can be solved by a senior developer with reasonable effort

PR Requirements

For a PR to be considered valid, it must:


• Timeliness: Be merged after November 1, 2024
• Core Functionality: Include at least one new or updated non-test file
• Test Coverage: Contain new tests relevant to the issue
• Substantial Changes: Have at least 20 lines of non-test file changes (excluding comments)
• Meaningful Code: Include logical code changes, not just name or text changes
• Quality Tests: Contain meaningful tests (not just “assert true” type tests)
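
As a rough aid for the Quality Tests criterion, the sketch below lists assertions that can never fail. The function name, the *Test*.java file pattern, and the two assertion forms it looks for are assumptions made for illustration; a clean result does not prove the tests are meaningful, so a manual read of the tests is still needed.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original guide.
# Usage: find_trivial_asserts <directory>
# Prints file:line:text for assertTrue(true) / assertFalse(false) calls found in
# files named *Test*.java under the given directory.
find_trivial_asserts() {
  # /dev/null is passed as an extra file so grep always prints the file name.
  find "$1" -type f -name '*Test*.java' -exec \
    grep -nE 'assert(True\([[:space:]]*true|False\([[:space:]]*false)[[:space:]]*\)' /dev/null {} +
}
```

An empty result only means that these two patterns were not found.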

Step-by-Step Evaluation Process

1. Initial Assessment

1. Open the issue and PR links in separate browser tabs

2. Read the issue description thoroughly to understand the problem

3. Review the PR description to understand the proposed solution

4. Cross-reference to ensure they address the same problem

2. Code Review

1. Check the PR merge date to confirm it’s after November 1, 2024

2. Identify the non-test files that were changed or added

3. Count the lines of code changes in non-test files (excluding comments)

4. Examine the test files to ensure they properly validate the new functionality

5. Evaluate the quality and relevance of the tests
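
The merge-date and line-count checks above can be partly scripted. The following is a minimal sketch under stated assumptions: the function names are hypothetical, "after November 1, 2024" is read as strictly later than that calendar day, and test files are recognised only by a test/ or tests/ directory or a *Test.java name. The count includes comment lines, so it is an upper bound that still has to be corrected by hand.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original guide.

CUTOFF=20241101   # November 1, 2024 as YYYYMMDD

# Usage: merged_after_cutoff <YYYY-MM-DD or ISO timestamp>
# Succeeds (exit 0) when the date is strictly after the cutoff day.
merged_after_cutoff() {
  day=$(printf '%s' "$1" | cut -c1-10 | tr -d '-')
  [ "$day" -gt "$CUTOFF" ]
}

# Reads `git diff --numstat` output (added<TAB>deleted<TAB>path) on stdin and
# prints the total number of added plus deleted lines in non-test files.
count_non_test_changes() {
  awk -F '\t' '
    $1 == "-"             { next }   # binary files have no line counts
    $3 ~ /(^|\/)tests?\// { next }   # anything under a test/ or tests/ directory
    $3 ~ /Tests?\.java$/  { next }   # Java test classes by naming convention
    { total += $1 + $2 }
    END { print total + 0 }
  '
}

# Example (placeholders to be filled in by the evaluator):
#   git diff --numstat <base_commit_hash> <head_commit_hash> | count_non_test_changes
```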


Dockerization:

The goal is to write a Dockerfile that builds successfully and runs the tests so that they output pass or fail.

1. The Dockerfile should be executed on the base commit of the PR.

2. Install git and other build essentials:

    apt-get update && \
    apt-get install -y git build-essential python3 && \
    apt-get clean

3. Clone the repo and check out the provided base_commit:

    git clone <repo_url>
    cd <repo_name>
    git checkout <base_commit_hash>

4. Build your Dockerfile:

    docker build -t test-image -f Dockerfile .

   If the Dockerfile fails to build at this stage, adjust it until it builds, for example by installing additional packages. Never modify the repository code, only the Dockerfile.

5. Run the container and validate test execution. It should run without crashing due to missing dependencies or setup issues:

    docker run --rm test-image

6. Run all the tests on the head commit as well.

7. Copy the final Dockerfile and a screenshot of it running successfully into the repository of the task so we can assess its correctness.
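
The Dockerization steps can be tied together roughly as follows. This is only a sketch: the ubuntu:22.04 base image, the openjdk-17-jdk and maven packages, the /workspace path, and the mvn test command are assumptions for a Maven-based Java repository, and both function names are hypothetical. Adapt all of them to the project under evaluation.

```shell
#!/bin/sh
# Sketch only; see the assumptions listed in the paragraph above.

DOCKER=${DOCKER:-docker}   # overridable so the script can be dry-run without Docker

# Usage: write_dockerfile <repo_url> <commit_hash>
# Prints a Dockerfile that clones the repo, checks out the commit, and runs the tests.
write_dockerfile() {
  repo_url=$1
  commit=$2
  cat <<EOF
FROM ubuntu:22.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \\
    apt-get install -y git build-essential python3 openjdk-17-jdk maven && \\
    apt-get clean
RUN git clone ${repo_url} /workspace
WORKDIR /workspace
RUN git checkout ${commit}
CMD ["mvn", "-B", "test"]
EOF
}

# Usage: run_and_report <image>
# Runs the image and prints PASS or FAIL based on the container's exit status.
run_and_report() {
  if "$DOCKER" run --rm "$1"; then
    echo "PASS: $1"
  else
    echo "FAIL: $1"
  fi
}

# Example (placeholders to be filled in by the evaluator):
#   write_dockerfile <repo_url> <base_commit_hash> > Dockerfile
#   docker build -t test-image -f Dockerfile .
#   run_and_report test-image
```

Because the test command is the container's default command, a failing test run makes the container exit with a non-zero status, which is what run_and_report turns into FAIL.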

Adding hints:

1. Sometimes you need to add hints, either because the problem statement/issue description is incomplete or because there is a dependency between the golden tests and the golden solution. Typical cases:
    1. The problem statement only describes a general issue and the cause of the issue is not clear; it might become clear in the issue conversation or the PR conversation.
    2. The problem statement has multiple suggestions/possible fixes, and the choice is only settled later in the issue conversation or the PR conversation.
    3. The problem statement mentions a function, variable, or other name, and the PR uses a different name.
    4. The golden tests depend on a new module/variable/text added in the golden solution.
2. Hints should guide the agent as if you were instructing a developer on how the solution is structured (file names, variable names, function names, if they are hard-coded in tests).
3. Hints should not explain to the model how to solve the problem or refer to the PR and what it includes. Assume the PR does not exist and the model is trying to solve a new issue.
4. Example scenarios where hints are needed:
    1. The issue references other issues
    2. The PR creates a new non-test file that is imported in the tests
    3. The issue description is inconsistent with the PR and test changes
    4. The PR includes changes beyond those proposed in the issue statement
    5. A new variable/method/module is added and used in the tests
    6. The issue is a feature, not a bug
    7. The issue description contains images and/or videos
    8. The issue description has a link to a sandbox
5. Provide the hint as a text file.

Deliverable:

Upload your results (text, images, screenshots) using the file upload button.

Task 1
PR [Link]
Issue [Link]

Task 2
PR [Link]
Issue [Link]

Task 3
PR [Link]
Issue [Link]
