ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works with Claude Code, Codex, OpenClaw, or any LLM agent.
# Auto-claude-code-research-in-sleep (ARIS ⚔️🌙)

💡 *Use ARIS workflows directly in Claude Code / Cursor / Trae, or get the full experience with the standalone CLI — enjoy it any way you like!*

🔥 [**ARIS-Code CLI — standalone install (中文)**](docs/ARIS-Code-README_CN.md) · [English](docs/ARIS-Code-README_EN.md) | [⬇️ Download](https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep/releases/latest)

<img src="docs/aris-code-banner.png" width="600" alt="ARIS-Code CLI">

[Chinese README](README_CN.md) | English

> 🌙 **Let Claude Code do research while you sleep.** Wake up to find your paper scored, weaknesses identified, experiments run, and narrative rewritten — autonomously.
>
> 🪶 **Radically lightweight — zero dependencies, zero lock-in.** The entire system is plain Markdown files. No framework to learn, no database to maintain, no Docker to configure, no daemon to babysit. Every skill is a single `SKILL.md` readable by any LLM — swap Claude Code for [Codex CLI](skills/skills-codex/), [OpenClaw](docs/OPENCLAW_ADAPTATION.md), [Cursor](docs/CURSOR_ADAPTATION.md), [Trae](docs/TRAE_ARIS_RUNBOOK_EN.md), [Antigravity](docs/ANTIGRAVITY_ADAPTATION.md), Windsurf, or your own agent, and the workflows still work. Fork it, rewrite it, adapt it to your stack.
>
> *💡 ARIS is a methodology, not a platform. What matters is the research workflow — take it wherever you go. 🌱*

[WeChat post](https://mp.weixin.qq.com/s/tDniVryVGjDkkkWl-5sTkQ) · [WeChat post](https://mp.weixin.qq.com/s/KLFU74lAL2FAIc9K6i1Kqg) · [Awesome Agent Skills](https://github.com/VoltAgent/awesome-agent-skills) · [aidigitalcrew.com](https://aidigitalcrew.com) · [💬 Join Community](#-community) · [Citation](#-citation)

Custom [Claude Code](https://docs.anthropic.com/en/docs/claude-code) skills for autonomous ML research workflows. These skills orchestrate **cross-model collaboration** — Claude Code drives the research while an external LLM (via [Codex MCP](https://github.com/openai/codex)) acts as a critical reviewer.
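To illustrate how little machinery a skill needs, a hypothetical `SKILL.md` in the spirit described above might look like the fragment below. The skill name, sections, and steps are invented for illustration; the real skills in `skills/` define their own structure.

```markdown
# SKILL: paper-score (hypothetical example)

## When to use
The user asks for an honest pre-submission assessment of a paper draft.

## Steps
1. Read the draft and restate its core claim in two sentences.
2. Ask the external reviewer model to score novelty, rigor, and clarity (1-10).
3. Report each score with one concrete weakness per dimension.
```

Because the whole skill is plain Markdown, any agent that can read a file can run it; nothing here is specific to one vendor's tooling.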
🔀 **Also supports [alternative model combinations](#-alternative-model-combinations) (Kimi, LongCat, DeepSeek, etc.) — no Claude or OpenAI API required.** For example, [MiniMax-M2.7 + GLM-5 or GLM-5 + MiniMax-M2.7](docs/MiniMax-GLM-Configuration.md).

🤖 **[Codex CLI native](skills/skills-codex/)** — full skill set also available for OpenAI Codex.

🖱️ **[Cursor](docs/CURSOR_ADAPTATION.md)** — works in Cursor too.

🖥️ **[Trae](docs/TRAE_ARIS_RUNBOOK_EN.md)** — ByteDance's AI IDE.

🚀 **[Antigravity](docs/ANTIGRAVITY_ADAPTATION.md)** — Google's agent-first IDE.

🆓 **[Free tier via ModelScope](docs/MODELSCOPE_GUIDE.md) — zero cost, zero lock-in.**

> 💭 **Why not self-play with a single model?** Using Claude Code subagents or agent teams for both execution and review is technically possible, but tends to fall into **local minima** — the same model reviewing its own patterns creates blind spots.
>
> *Think of it like adversarial vs. stochastic bandits: a single model self-reviewing is the stochastic case (predictable reward noise), while cross-model review is adversarial (the reviewer actively probes weaknesses the executor didn't anticipate) — and adversarial bandits are fundamentally harder to game.*
>
> 💭 **Why two models, not more?** Two is the minimum needed to break self-play blind spots, and 2-player games converge to Nash equilibrium far more efficiently than n-player ones. Adding more reviewers increases API cost and coordination overhead with diminishing returns — the biggest gain is going from 1→2, not from 2→4.
>
> Claude Code's strength is fast, fluid execution; Codex (GPT-5.4 xhigh) is slower but more deliberate and rigorous in critique. These complementary styles — **speed × rigor** — produce better outcomes than either model talking to itself.

## 🎯 More Than Just a Prompt

> These are full pipelines — but you can also use each workflow independently. Already have an idea? Skip to Workflow 1.5. Have results? Jump to Workflow 3. Got reviews? Jump to Workflow 4.
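The cross-model review loop described earlier can be expressed in a few lines. This is a conceptual sketch, not ARIS's implementation: `executor_revise` and `reviewer_critique` are hypothetical stand-ins for calls to the executor model (Claude Code) and the reviewer model (Codex).

```python
# Conceptual sketch of a two-model review loop: one model drafts, a
# *different* model critiques, and the draft is only accepted once the
# reviewer raises no blocking concerns. Both model calls are stubbed.

def executor_revise(draft: str, concerns: list[str]) -> str:
    """Stand-in for the executor model: fold reviewer concerns into the draft."""
    for concern in concerns:
        draft += f"\n[addressed: {concern}]"
    return draft

def reviewer_critique(draft: str, round_no: int) -> list[str]:
    """Stand-in for the reviewer model: return blocking concerns, if any."""
    # Toy heuristic: the reviewer objects on the first round, then passes.
    return ["missing ablation"] if round_no == 0 else []

def review_loop(draft: str, max_rounds: int = 3) -> tuple[str, int]:
    """Alternate revise/critique until the reviewer passes or rounds run out."""
    for round_no in range(max_rounds):
        concerns = reviewer_critique(draft, round_no)
        if not concerns:          # reviewer satisfied: stop early
            return draft, round_no
        draft = executor_revise(draft, concerns)
    return draft, max_rounds

final, rounds = review_loop("initial experiment writeup")
print(rounds)  # number of review rounds before the reviewer passed
```

The asymmetry is the point: the reviewer decides when the loop stops, so the executor cannot declare its own work finished.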
See [Quick Start](#-quick-start) for all commands and [Workflows](#-workflows) for the full breakdown.

**Basic mode** — give ARIS a research direction and it handles everything:

```
/research-pipeline "factorized gap in discrete diffusion LMs"
```

**🔥 Targeted mode** — got a paper you want to improve? Give ARIS the paper + the code:

```
/research-pipeline "improve method X" — ref paper: https://arxiv.org/abs/2406.04329, base repo: https://github.com/org/project
```

ARIS reads the paper → finds its weaknesses → clones the codebase → generates ideas that specifically fix *those* weaknesses with *that* code → runs experiments → writes your paper. Like telling a research assistant: *"read this paper, use this repo, find what's missing, and fix it."*

> Mix and match: `ref paper` only = "what can be improved?", `base repo` only = "what can I build with this code?", both = "improve *this* paper using *this* code."

**🔥 Rebuttal mode** — reviews just dropped? Don't panic. ARIS reads every concern, builds a strategy, and drafts a rebuttal that's grounded, structured, and under the character limit:

```
/rebuttal "paper/ + reviews" — venue: ICML, character limit: 5000
```

| Parameter | Default | What it does |
|-----------|---------|--------------|
| `venue` | `ICML` | Target venue (ICML/NeurIPS/ICLR/CVPR/ACL/AAAI/ACM) |
| `character limit` | — | **Required.** Hard character limit for rebuttal text |
| `quick mode` | `false` | Stop after parsing + strategy (Phase 0-3). See what reviewers want before drafting |
| `auto experiment` | `false` | Auto-run supplementary experiments via `/experiment-bridge` when reviewers ask for new evidence |
| `max stress test rounds` | `1` | How many times GPT-5.4 xhigh stress-tests the draft |
| `max followup rounds` | `3` | Per-reviewer follow-up round limit |

Three safety gates — the rebuttal will NOT finalize if any fails:

- 🔒 **No fabrication** — every claim maps to a paper, review, or user-confirmed result
- 🔒 **No overpromise** — every promise is user-approved
- 🔒 **Full coverage** — every reviewer concern is tracked

Two outputs: `PASTE_READY.txt` (exact char count, paste to venue) + `REBUTTAL_DRAFT_rich.md` (extended version for manual editing).

**After acceptance** — your paper is in; now prepare the presentation:

```
/paper-slides "paper/"   # → Beamer PDF + PPTX + speaker notes + Q&A prep
/paper-poster "paper/"   # → A0/A1 poster PDF + editable PPTX + SVG
```

> *💡 From idea to paper to podium — one toolchain. 🌱*

## 🏆 Papers Accepted with ARIS

| Paper | Score | Venue | Author | Stack |
|-------|:-----:|-------|--------|-------|
| CS Paper | **8/10** "clear accept" | CS Conference | [@DefanXue](https://github.com/DefanXue) & [@Monglitay](https://github.com/Monglitay) | Claude Code + GPT-5.4 |
| AAAI Paper | **7/10** "good paper, accept" | AAAI 2026 Main Technical | [@xinbo820-web](https://github.com/xinbo820-web) | Pure Codex CLI |

> 🎉 Built entirely with ARIS — from idea to acceptance. [Full details + reviewer screenshots →](#-community-showcase--papers-built-with-aris)

## 📢 What's New

- **2026-03-30** — 🔥 **Auto-debug & exhaust-before-surrender** — experiment-bridge now auto-diagnoses failures (OOM, import, path, CUDA, NaN) and retries up to 3× before giving up. auto-review-loop must try 2+ solution paths before conceding any reviewer [truncated…]
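Of the rebuttal checks described earlier, the `character limit` is the one a script can verify mechanically before anything is pasted into a venue's form. A minimal sketch, assuming nothing about ARIS's internals (the helper name is hypothetical):

```python
# A minimal sketch of a hard character-limit gate like the one PASTE_READY.txt
# must satisfy. check_character_limit is a hypothetical helper, not ARIS's
# actual code: it reports whether a drafted rebuttal fits the venue's limit,
# along with the exact character count.

def check_character_limit(rebuttal: str, limit: int) -> tuple[bool, int]:
    """Return (fits, exact character count) for a drafted rebuttal."""
    count = len(rebuttal)
    return count <= limit, count

fits, count = check_character_limit("We thank the reviewers...", 5000)
print(fits, count)
```

A gate this strict is deliberately dumb: counting characters leaves no room for a model to rationalize an over-length draft as "close enough".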