kimi-dev/resources at master · morphym/kimi-dev

This directory provides single-pass-eval.jsonl, which contains the evaluation results for the trajectories generated by the model. For test-time scaling, it provides 40patches.jsonl, which contains the 40 patches used, and 40tests.jsonl, which contains the 40 tests. The evaluation results for the final patches selected after test-time scaling are in test-time-final-single-pass-eval.jsonl. The evaluation code will be released soon.

single-pass-eval.jsonl
For each item in the file:
- item['instance_id'] is the unique identifier of the instance
- item['patch'] is the patch generated by the model to fix the bug described in the issue
- item['judge_res']['EVAL_EXEC_str'] is the test execution log produced after applying item['result']['model_patch']
- item['judge_res']['test_status'] records the per-test details for FAIL_TO_PASS, PASS_TO_PASS, FAIL_TO_FAIL, and PASS_TO_FAIL
- item['judge_res']['resolved'] is true if none of the tests fail
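
The fields above can be read with a few lines of Python. The following is a minimal sketch, not the evaluation code mentioned above: it assumes each line of the file is one JSON object with the keys listed, and the helper names (`iter_items`, `resolved_rate`, `summarize`) are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_items(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line (JSON Lines)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def resolved_rate(items: Iterable[dict]) -> float:
    """Return the fraction of items whose judge_res['resolved'] is truthy."""
    total = 0
    resolved = 0
    for item in items:
        total += 1
        # Assumption: every item carries judge_res['resolved'] as described above.
        if item["judge_res"]["resolved"]:
            resolved += 1
    return resolved / total if total else 0.0


def summarize(path: str) -> float:
    """Compute the resolved rate for a .jsonl file on disk."""
    with open(path, encoding="utf-8") as fh:
        return resolved_rate(iter_items(fh))
```

Calling `summarize("single-pass-eval.jsonl")` from this directory would give the share of instances marked as resolved, provided the file matches the layout described above.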

40patches.jsonl
This file is used for test-time scaling and contains 40 independent rollouts of the patches.

40tests.jsonl
This file is used for test-time scaling and contains 40 independent rollouts of the tests.

test-time-final-single-pass-eval.jsonl
This file contains the evaluation log of the patch ultimately selected for each instance after performing test-time scaling. The structure of each item in the file is similar to that of single-pass-eval.jsonl.
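
Because the two evaluation files are described as similar in structure, their outcomes can be compared per instance. The sketch below is illustrative only: it assumes both files expose item['instance_id'] and item['judge_res']['resolved'], which the README states for single-pass-eval.jsonl but only implies for the test-time file. The function names are hypothetical.

```python
import json
from typing import Dict, Iterable, List


def resolved_by_instance(lines: Iterable[str]) -> Dict[str, bool]:
    """Map each instance_id to its resolved flag, reading JSON Lines input."""
    out: Dict[str, bool] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        item = json.loads(line)
        # Assumption: both keys exist in every item of both files.
        out[item["instance_id"]] = bool(item["judge_res"]["resolved"])
    return out


def compare(single: Dict[str, bool], final: Dict[str, bool]) -> Dict[str, List[str]]:
    """List instances whose resolved status differs between the two runs."""
    shared = sorted(set(single) & set(final))
    return {
        # Resolved only after test-time scaling.
        "gained": [i for i in shared if final[i] and not single[i]],
        # Resolved in the single pass but not by the final selected patch.
        "lost": [i for i in shared if single[i] and not final[i]],
    }
```

To use it, open each file, pass the file object to `resolved_by_instance`, and hand the two resulting dictionaries to `compare`. Instances present in only one file are ignored.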