EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories

J Li, G Li, X Zhang, Y Dong, Z Jin - arXiv preprint arXiv:2404.00599, 2024 - arxiv.org
How to evaluate Large Language Models (LLMs) in code generation is an open question.
Existing benchmarks demonstrate poor alignment with real-world code repositories and are
insufficient to evaluate the coding abilities of LLMs. This paper proposes a new benchmark,
EvoCodeBench, to address the preceding problems, which has three primary advances. (1)
EvoCodeBench aligns with real-world repositories in multiple dimensions, e.g., code
distributions and dependency distributions. (2) EvoCodeBench offers comprehensive …
