# AXL — Multi-Scale Agentic Transformer for CPU-Optimized Code Generation
[License](LICENSE) · [Hugging Face](https://huggingface.co/Koinic) · [GitHub](https://github.com/Koinic/AXL)
**AXL** (Architecture eXperimental Lab) — A family of 27 agentic coding models optimized for training and running on consumer CPUs. No GPU required. Train on a Ryzen 5 5600G. Deploy via Python API (full quality) or Ollama (degraded quality).
---
## Quick Start
### Installation
```bash
git clone https://github.com/Koinic/AXL.git
cd AXL
pip install -e .
```
### Run a Model (Full Quality via Python API)
```bash
# Start the API server (full AXL multi-scale quality)
python AXL/API/serve_model.py --model checkpoints/axl_micro_lion --port 8880 --name axl-micro-lion
# Then call the OpenAI-compatible endpoint:
curl http://localhost:8880/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"prompt": "def fibonacci(n):", "max_tokens": 100}'
```
This works with any OpenAI-compatible tool (Continue.dev, LlamaIndex, LangChain, Cursor).
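The same endpoint can be called from Python with only the standard library. This is a minimal sketch assuming the server above is running on port 8880 and returns the standard OpenAI completions shape (`choices[0].text`); `build_completion_request` and `complete` are illustrative helper names, not part of AXL:

```python
import json
from urllib import request

def build_completion_request(prompt, max_tokens=100, base_url="http://localhost:8880"):
    """Build the same HTTP request the curl example sends."""
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def complete(prompt, max_tokens=100):
    """POST the prompt and return the generated text."""
    with request.urlopen(build_completion_request(prompt, max_tokens), timeout=60) as resp:
        # Assumes the standard OpenAI completions response shape.
        return json.loads(resp.read())["choices"][0]["text"]
```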
### Run a Model via Ollama (Degraded Quality)
> **Warning:** Ollama GGUF files use only the fine-scale encoder (1/3 of the AXL architecture). The reported PPL values apply to the full multi-scale model. For full quality, use the Python API above.
```bash
cd AXL/HuggingFace/AXL-Micro-Lion
ollama create axl-micro-lion -f Modelfile
ollama run axl-micro-lion "def fibonacci(n):"
```
### Train Your Own Model (3 minutes)
```bash
# 1. Generate training data
python scripts/generate_all_training_data.py --skip-hf
# 2. Train AXL-Micro with Lion optimizer
python scripts/retrain_all_lion.py --models micro
# 3. Your model is saved in checkpoints/axl_micro_lion/
```
### Python Inference
```python
import torch
from multiscale_transformer.model.config import ModelConfig
from multiscale_transformer.model.model import MultiScaleTransformer
from multiscale_transformer.training.tokenizer import ByteTokenizer

# Load model
ckpt = torch.load("checkpoints/axl_micro_lion/axl_micro_lion.pt", map_location="cpu")
cfg = ckpt["config"]
config = ModelConfig(
    vocab_size=cfg.get("vocab_size", 258), d_model=cfg.get("d_model", 256),
    n_heads=cfg.get("n_heads", 4), d_ff=cfg.get("d_ff", 688),
    n_layers_per_scale=cfg.get("n_layers_per_scale", 3),
    n_cross_attn_layers=cfg.get("n_cross_attn_layers", 1),
    max_seq_len=cfg.get("max_seq_len", 256),
)
model = MultiScaleTransformer(config)
model.load_state_dict(ckpt["model_state_dict"])
model.eval()

# Generate
tokenizer = ByteTokenizer()
ids = torch.tensor([tokenizer.encode("def fibonacci(n):\n")], dtype=torch.long)
with torch.no_grad():
    out = model.generate(ids, max_new_tokens=100, temperature=0.8, top_k=40)
print(tokenizer.decode(out[0].tolist()))
```
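The `temperature` and `top_k` arguments to `generate()` follow standard top-k sampling. A minimal illustrative sketch of that decoding rule follows; it is an assumption about the general technique, not AXL's actual `generate()` internals:

```python
import math, random

def sample_top_k(logits, k=40, temperature=0.8):
    """Sample one token id: keep the k highest logits, rescale by temperature."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in top]
    mx = max(scaled)
    probs = [math.exp(s - mx) for s in scaled]   # softmax numerators (stable)
    r, acc = random.random() * sum(probs), 0.0
    for idx, p in zip(top, probs):
        acc += p
        if acc >= r:
            return idx
    return top[-1]
```

Lower temperatures sharpen the distribution toward the highest-logit tokens, while `k` caps how many candidates can be sampled at all.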
---
## What is AXL?
AXL is a **multi-scale transformer** architecture designed by **Koinic** for **CPU-first** training and inference. It processes token sequences at three parallel resolution scales — fine (1x), medium (2x), and coarse (4x) — each with a dedicated encoder stack, cross-scale attention, and adaptive gating fusion.
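As a rough sketch of what "three parallel resolution scales" means for the input, the following builds fine, medium, and coarse views of an embedded sequence. Mean pooling and the name `multiscale_views` are illustrative assumptions; the actual downsampling operator lives inside AXL's encoder stacks:

```python
def multiscale_views(embeddings):
    """Build the three resolution views: fine (1x), medium (2x), coarse (4x)."""
    def pool(xs, stride):
        # Mean-pool non-overlapping windows of `stride` positions.
        return [sum(w) / len(w) for w in
                (xs[i:i + stride] for i in range(0, len(xs), stride))]
    return {"fine": embeddings,
            "medium": pool(embeddings, 2),
            "coarse": pool(embeddings, 4)}
```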
**Why CPU-first?** Training a 1B-parameter model on GPUs can cost $10,000+. AXL trains its largest model on a consumer CPU for roughly **$0.004** in electricity.
**Key innovations:**
- **Lion optimizer**: 20x faster convergence than SGD, 50% less memory than AdamW
- **Byte-level tokenizer** (vocab=258): No vocabulary training, works with any language
- **GGUF export**: Real Q4_K_M quantization via llama.cpp
- **Progressive training**: Scale to 1B+ params on 16GB RAM
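The Lion update rule is compact enough to sketch in a few lines. This scalar version, based on the published Lion algorithm, illustrates why it needs only one momentum buffer (half of AdamW's optimizer state); it is not AXL's actual optimizer code:

```python
def lion_step(w, g, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    """One Lion update for a scalar parameter w with gradient g and momentum m."""
    sign = lambda x: (x > 0) - (x < 0)
    update = sign(beta1 * m + (1 - beta1) * g)   # direction only, magnitude set by lr
    w = w - lr * (update + wd * w)               # decoupled weight decay
    m = beta2 * m + (1 - beta2) * g              # single momentum buffer
    return w, m
```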
---
## Project Structure
```
AXL/ # Distribution package
├── GitHub/ # Code for others to train AXL
│ ├── multiscale_transformer/ # Core library
│ │ ├── model/ # Attention, blocks, config, model, layers
│ │ ├── training/ # Lion, GaLore, viz, distill, streaming, dataset
│ │ ├── data/ # Quality filter, data mixture
│ │ ├── export/ # GGUF export, HF compat, Ollama server
│ │ ├── axl_v2/ # Agentic extensions (tool router, self-debug, vision)
│ │ ├── benchmarks/ # HumanEval, MBPP, Perplexity, CodeBLEU
│ │ └── tests/ # Test suite
│ ├── scripts/ # All training, export, benchmark, utility scripts
│ ├── configs/ # YAML training configs + BPE tokenizer
│ ├── data/ # Training datasets
│ ├── docs/ # Paper, references, quickstart
│ ├── examples/ # Inference and training examples
│ ├── llama.cpp/ # Quantization tools (Windows binaries)
│ ├── README.md # This file
│ ├── LICENSE # Apache 2.0
│ ├── CHANGELOG.md # Version history
│ ├── requirements.txt # Python dependencies
│ ├── pyproject.toml # Package config
│ ├── Dockerfile # Container build
│ └── docker-compose.yml # Multi-service orchestration
│
└── HuggingFace/ # Ready-to-use models
├── README.md # Organization overview
├── paper_axl.tex # Research paper
├── references.bib # Bibliography
├── AXL_ARCHITECTURE.md # Architecture documentation
├── AXL-Code-1B-Lion/ # 27 model directories
├── AXL-Reasoning-Lion/
├── AXL-Micro-Lion/
├── ...
└── AXL-Vision-v2/
```
---
## How to Create Training Data
### Option 1: Download from HuggingFace
```bash
python scripts/download_training_data.py --max_gb 5
# Downloads Python code from HuggingFace datasets
```
### Option 2: Generate Synthetic Data
```bash
python scripts/generate_all_training_data.py --skip-hf
# Generates training data for all model types
```
### Option 3: Use Your Own Data
1. Place your text files in `data/` (any `.txt` or `.jsonl` file)
2. Point the training script to your file:
```bash
python scripts/train_axl_micro.py --data_path data/my_code.txt --max_time 600
```
### Data Format
- **Text files** (`.txt`): Raw text, one file per dataset
- **JSONL files** (`.jsonl`): JSON lines with `source_code`/`target_code` fields (for translation)
- **Byte-level tokenizer**: Any file works — no preprocessing needed
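A byte-level tokenizer of this kind is simple to sketch. The split of the 258-entry vocabulary below (256 raw byte values plus two reserved special IDs) is an assumption about the layout, and `SimpleByteTokenizer` is an illustrative stand-in for AXL's `ByteTokenizer`:

```python
class SimpleByteTokenizer:
    """Byte-level tokenizer sketch: ids 0-255 are raw bytes, 256/257 reserved."""
    BOS, EOS = 256, 257  # assumed special-token assignment

    def encode(self, text):
        # UTF-8 bytes map directly to token ids, so no vocab training is needed.
        return list(text.encode("utf-8"))

    def decode(self, ids):
        return bytes(i for i in ids if i < 256).decode("utf-8", errors="replace")
```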
### Data Quality Filtering
```python
from multiscale_transformer.data.quality_filter import DataFilter
data_filter = DataFilter(min_lines=2, max_lines=500, require_syntax_valid=True)
clean = data_filter.filter_file("data/raw_code.txt", "data/clean_code.txt")
```
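The kinds of checks such a filter applies can be sketched with the standard library. `looks_trainable` is a hypothetical helper mirroring the `DataFilter` parameters above (line-count bounds, syntax validity), not the library's implementation:

```python
import ast

def looks_trainable(snippet, min_lines=2, max_lines=500):
    """Accept a Python snippet only if its length and syntax pass basic checks."""
    lines = snippet.strip().splitlines()
    if not (min_lines <= len(lines) <= max_lines):
        return False
    try:
        ast.parse(snippet)   # require syntactically valid Python
    except SyntaxError:
        return False
    return True
```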
---
## How to Train a Model
### Step 1: Choose a Config
Configs are in `configs/`. Each defines architecture:
- `axl_micro.yaml` — 12.8M params, fastest training (3 min)
- `axl_code_1b.yaml` — 318M params, largest model (20 min)
- Or use the auto-builder: `from multiscale_transformer.training.model_builder import ModelBuilder`
### Step 2: Train
```bash
# Train one model with Lion optimizer (recommended)
python scripts/retrain_all_lion.py --models micro # 3 minutes
python scripts/retrain_all_lion.py --models code_1b # 20 minutes
# Train all models sequentially
python scripts/retrain_all_lion.py # ~50 minutes total
# Train with custom data
python scripts/train_axl_code_1b_lion.py --max_time 1200
```
### Step 3: Export to GGUF
```bash
python scripts/quantize_all_models.py --models code_1b_lion
# Produces F16 + Q4_K_M GGUF files in checkpoints/
```
### Step 4: Deploy
```bash
# Primary: Python API server (full multi-scale quality)
python AXL/API/serve_model.py --model checkpoints/axl_code_1b_lion --port 8880 --name axl-code-1b
```