HumanStudy-Bench
provenance:github:AISmithLab/HumanStudy-Bench
WHAT THIS AGENT DOES
In plain English: HumanStudy-Bench lets researchers use AI to simulate how people would behave in social science experiments. This can speed up research and reduce its cost by letting scientists test ideas virtually instead of always recruiting and running real human participants. Researchers in fields such as psychology, economics, and sociology can use it to explore research questions quickly and refine their study designs.
README
<div align="center">
<img src="docs/img/new-HS-bench_logo.png" alt="HumanStudy-Bench Logo" width="300">
<h1>HumanStudy-Bench: Towards AI Agent Design for Participant Simulation</h1>
[Website](https://www.hs-bench.clawder.ai)
[Paper (arXiv:2602.00685)](https://arxiv.org/abs/2602.00685)
[License: MIT](https://opensource.org/licenses/MIT)
[GitHub](https://github.com/HumanStudy-Hub/HumanStudy-Bench)
</div>
---
> **New contributions are now accepted at the [Community Edition](https://github.com/HumanStudy-Hub/HumanStudy-Bench).** Please fork and submit PRs there.
> LLMs are increasingly used to simulate human participants in social science research, but existing evaluations conflate base model capabilities with agent design choices, making it unclear whether results reflect the model or the configuration.
## Overview
HumanStudy-Bench treats participant simulation as an *agent design problem* and provides a **standardized testbed** for *replaying human-subject experiments end-to-end*, with alignment evaluated at the level of scientific inference. The testbed combines an **Execution Engine**, which reconstructs full experimental protocols from published studies, with a **Benchmark** that supplies standardized evaluation metrics.
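As a concrete illustration of the replay idea, consider a minimal sketch in Python. Every name below (`Condition`, `replay`, `same_direction`, the stub participant) is hypothetical and invented for this example; none of it is the HumanStudy-Bench API. A real agent would call an LLM where the stub returns a fixed rating, and "alignment at the level of scientific inference" here is reduced to the simplest possible check: does the simulated effect point the same way as the published human effect?

```python
# Hypothetical sketch of "replay + inference-level alignment".
# Class and function names are illustrative, NOT the HumanStudy-Bench API.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Condition:
    name: str
    prompt: str  # the stimulus shown to each simulated participant


def replay(conditions: List[Condition],
           participant: Callable[[str], float],
           n: int = 50) -> Dict[str, float]:
    """Run n simulated participants through each condition; return mean response."""
    return {
        c.name: sum(participant(c.prompt) for _ in range(n)) / n
        for c in conditions
    }


def same_direction(sim: Dict[str, float], human: Dict[str, float],
                   a: str, b: str) -> bool:
    """Does the simulated effect (condition a vs. b) match the human effect's sign?"""
    return (sim[a] - sim[b]) * (human[a] - human[b]) > 0


# Stub "participant": a real agent design would prompt an LLM here.
def stub_participant(prompt: str) -> float:
    return 7.0 if "gain" in prompt else 4.0


conditions = [Condition("gain", "Framed as a gain ..."),
              Condition("loss", "Framed as a loss ...")]
sim_means = replay(conditions, stub_participant, n=10)
human_means = {"gain": 6.1, "loss": 4.8}  # made-up numbers, not from any study
print(same_direction(sim_means, human_means, "gain", "loss"))  # → True
```

The point of the sketch is the separation it illustrates: the protocol (`conditions`), the agent design (`participant`), and the inference-level evaluation (`same_direction`) are independent pieces, which is what lets a benchmark vary the agent while holding the study and the metric fixed.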
This repository contains the **original 12 foundational studies** and benchmark code as described in the paper. For the paper's exact results, use tag `v1.0.0`.
## Community Edition
**Want to contribute a study or use the latest community-expanded benchmark?**
Head to the **[Community Edition](https://github.com/HumanStudy-Hub/HumanStudy-Bench)** — an open, community-driven repo where anyone can submit new studies via PR. The community version includes the same 12 foundational studies plus all community contributions.
[Community Edition on GitHub](https://github.com/HumanStudy-Hub/HumanStudy-Bench)
## Reproduction (Paper Results)
To reproduce the exact benchmark and results reported in the paper:
```bash
# from within a clone of this repository
git checkout v1.0.0
```
## Citation & Hugging Face
If you use HumanStudy-Bench, please cite:
```bibtex
@misc{liu2026humanstudybenchaiagentdesign,
  title={HumanStudy-Bench: Towards AI Agent Design for Participant Simulation},
  author={Xuan Liu and Haoyang Shang and Zizhang Liu and Xinyan Liu and Yunze Xiao and Yiwen Tu and Haojian Jin},
  year={2026},
  eprint={2602.00685},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2602.00685},
}
```
**Hugging Face:** Benchmark and resources are available on the [Hugging Face Hub](https://huggingface.co/) — `fuyyckwhy/HS-Bench-results`.
## License
MIT License. See [LICENSE](LICENSE) for details.
PUBLIC HISTORY
First discovered: Mar 21, 2026
IDENTITY
inferred
Identity inferred from code signals. No PROVENANCE.yml found.
METADATA
platform: github
first seen: Jan 31, 2026
last updated: Mar 20, 2026
last crawled: 12 days ago
version: —