ProgramBench

Developed by facebookresearch

Open Source Python Global free #llm-benchmarking #reverse-engineering #code-generation #program-synthesis

ABOUT

ProgramBench is a benchmark from facebookresearch that evaluates whether large language models (LLMs) can rebuild programs from scratch. Given only a compiled binary and its documentation, an AI agent must architect and implement a complete codebase that reproduces the original program's behavior. The benchmark is intended for assessing LLM performance on reverse engineering and code generation tasks.
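The idea of "reproducing the original program's behavior" can be illustrated with a small black-box comparison. The sketch below is not ProgramBench's harness or scoring method, which this listing does not describe. It assumes, purely for illustration, that behavior is compared on exit code and standard output, and the names `run` and `behavior_match_rate` are hypothetical.

```python
import subprocess
from typing import List, Sequence, Tuple

# Hypothetical case format: (command-line arguments, text fed to stdin).
Case = Tuple[List[str], str]


def run(cmd: Sequence[str], stdin_text: str = "", timeout: float = 10.0) -> Tuple[int, str]:
    """Run a command and return its (exit code, stdout)."""
    proc = subprocess.run(
        list(cmd),
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout


def behavior_match_rate(
    reference_cmd: Sequence[str],
    candidate_cmd: Sequence[str],
    cases: Sequence[Case],
) -> float:
    """Fraction of cases where the candidate matches the reference.

    Assumption: two runs "match" when exit code and stdout are identical.
    A real benchmark may compare other observable behavior as well.
    """
    if not cases:
        return 0.0
    matches = 0
    for args, stdin_text in cases:
        reference_result = run([*reference_cmd, *args], stdin_text)
        candidate_result = run([*candidate_cmd, *args], stdin_text)
        if reference_result == candidate_result:
            matches += 1
    return matches / len(cases)
```

Here `reference_cmd` would point at the original compiled binary and `candidate_cmd` at the program rebuilt by the agent. Treating both as opaque commands keeps the check independent of the language the agent chose for its reimplementation.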

CAPABILITIES

Evaluates binary-to-source code reconstruction
Assesses how AI agents architect and implement programs
Provides a standard dataset and leaderboard for performance comparison
Supports quick deployment in Python environments
Focuses on the reverse engineering capabilities of language models

SUPPORTED PLATFORMS

desktop

EXTERNAL RESOURCES

Visit Website ↗
GitHub Repository ↗