# AXL — Multi-Scale Agentic Transformer for CPU-Optimized Code Generation
[License](LICENSE) · [Hugging Face](https://huggingface.co/Koinic) · [GitHub](https://github.com/Koinic/AXL)
**AXL** (Architecture eXperimental Lab) — A family of 27 agentic coding models optimized for training and running on consumer CPUs. No GPU required. Train on a Ryzen 5 5600G. Deploy via Python API (full quality) or Ollama (degraded quality).
---
## Quick Start
### Installation
```bash
git clone https://github.com/Koinic/AXL.git
cd AXL
pip install -e .
```
### Run a Model (Full Quality via Python API)
```bash
# Start the API server (full AXL multi-scale quality)
python AXL/API/serve_model.py --model checkpoints/axl_micro_lion --port 8880 --name axl-micro-lion
# Then call the OpenAI-compatible endpoint:
curl http://localhost:8880/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"prompt": "def fibonacci(n):", "max_tokens": 100}'
```
This works with any OpenAI-compatible tool (Continue.dev, LlamaIndex, LangChain, Cursor).
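The same endpoint can be called from Python with only the standard library. This is a minimal sketch assuming the server above is running on port 8880 and returns the standard OpenAI completions shape (`choices[0].text`); `build_completion_request` and `complete` are illustrative helper names, not part of AXL:

```python
import json
from urllib import request

def build_completion_request(prompt, max_tokens=100, base_url="http://localhost:8880"):
    """Build the same HTTP request the curl example sends."""
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def complete(prompt, max_tokens=100):
    """POST the prompt and return the generated text."""
    with request.urlopen(build_completion_request(prompt, max_tokens), timeout=60) as resp:
        # Assumes the standard OpenAI completions response shape.
        return json.loads(resp.read())["choices"][0]["text"]
```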
### Run a Model via Ollama (Degraded Quality)
> **Warning:** Ollama GGUF files use only the fine-scale encoder (1/3 of the AXL architecture). The reported PPL values apply to the full multi-scale model. For full quality, use the Python API above.
```bash
cd AXL/HuggingFace/AXL-Micro-Lion
ollama create axl-micro-lion -f Modelfile
ollama run axl-micro-lion "def fibonacci(n):"
```
### Train Your Own Model (3 minutes)
```bash
# 1. Generate training data
python scripts/generate_all_training_data.py --skip-hf
# 2. Train AXL-Micro with Lion optimizer
python scripts/retrain_all_lion.py --models micro
# 3. Your model is saved in checkpoints/axl_micro_lion/
```
### Python Inference
```python
import torch
from multiscale_transformer.model.config import ModelConfig
from multiscale_transformer.model.model import MultiScaleTransformer
from multiscale_transformer.training.tokenizer import ByteTokenizer

# Load model
ckpt = torch.load("checkpoints/axl_micro_lion/axl_micro_lion.pt", map_location="cpu")
cfg = ckpt["config"]
config = ModelConfig(
    vocab_size=cfg.get("vocab_size", 258), d_model=cfg.get("d_model", 256),
    n_heads=cfg.get("n_heads", 4), d_ff=cfg.get("d_ff", 688),
    n_layers_per_scale=cfg.get("n_layers_per_scale", 3),
    n_cross_attn_layers=cfg.get("n_cross_attn_layers", 1),
    max_seq_len=cfg.get("max_seq_len", 256),
)
model = MultiScaleTransformer(config)
model.load_state_dict(ckpt["model_state_dict"])
model.eval()

# Generate
tokenizer = ByteTokenizer()
ids = torch.tensor([tokenizer.encode("def fibonacci(n):\n")], dtype=torch.long)
with torch.no_grad():
    out = model.generate(ids, max_new_tokens=100, temperature=0.8, top_k=40)
print(tokenizer.decode(out[0].tolist()))
```
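The `temperature` and `top_k` arguments to `generate()` follow standard top-k sampling. A minimal illustrative sketch of that decoding rule follows; it is an assumption about the general technique, not AXL's actual `generate()` internals:

```python
import math, random

def sample_top_k(logits, k=40, temperature=0.8):
    """Sample one token id: keep the k highest logits, rescale by temperature."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in top]
    mx = max(scaled)
    probs = [math.exp(s - mx) for s in scaled]   # softmax numerators (stable)
    r, acc = random.random() * sum(probs), 0.0
    for idx, p in zip(top, probs):
        acc += p
        if acc >= r:
            return idx
    return top[-1]
```

Lower temperatures sharpen the distribution toward the highest-logit tokens, while `k` caps how many candidates can be sampled at all.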
---
## What is AXL?
AXL is a **multi-scale transformer** architecture designed by **Koinic** for **CPU-first** training and inference. It processes token sequences at three parallel resolution scales — fine (1x), medium (2x), and coarse (4x) — each with a dedicated encoder stack, cross-scale attention, and adaptive gating fusion.
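As a rough sketch of what "three parallel resolution scales" means for the input, the following builds fine, medium, and coarse views of an embedded sequence. Mean pooling and the name `multiscale_views` are illustrative assumptions; the actual downsampling operator lives inside AXL's encoder stacks:

```python
def multiscale_views(embeddings):
    """Build the three resolution views: fine (1x), medium (2x), coarse (4x)."""
    def pool(xs, stride):
        # Mean-pool non-overlapping windows of `stride` positions.
        return [sum(w) / len(w) for w in
                (xs[i:i + stride] for i in range(0, len(xs), stride))]
    return {"fine": embeddings,
            "medium": pool(embeddings, 2),
            "coarse": pool(embeddings, 4)}
```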
**Why CPU-first?** Training a 1B-parameter model on GPUs can cost $10,000+. AXL trains its largest model on a consumer CPU for roughly **$0.004** in electricity.
**Key innovations:**
- **Lion optimizer**: 20x faster convergence than SGD, 50% less memory than AdamW
- **Byte-level tokenizer** (vocab=258): No vocabulary training, works with any language
- **GGUF export**: Real Q4_K_M quantization via llama.cpp
- **Progressive training**: Scale to 1B+ params on 16GB RAM
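The Lion update rule is compact enough to sketch in a few lines. This scalar version, based on the published Lion algorithm, illustrates why it needs only one momentum buffer (half of AdamW's optimizer state); it is not AXL's actual optimizer code:

```python
def lion_step(w, g, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    """One Lion update for a scalar parameter w with gradient g and momentum m."""
    sign = lambda x: (x > 0) - (x < 0)
    update = sign(beta1 * m + (1 - beta1) * g)   # direction only, magnitude set by lr
    w = w - lr * (update + wd * w)               # decoupled weight decay
    m = beta2 * m + (1 - beta2) * g              # single momentum buffer
    return w, m
```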
---
## Project Structure
```
AXL/ # Distribution package
├── GitHub/ # Code for others to train AXL
│ ├── multiscale_transformer/ # Core library
│ │ ├── model/ # Attention, blocks, config, model, layers
│ │ ├── training/ # Lion, GaLore, viz, distill, streaming, dataset
│ │ ├── data/ # Quality filter, data mixture
│ │ ├── export/ # GGUF export, HF compat, Ollama server
│ │ ├── axl_v2/ # Agentic extensions (tool router, self-debug, vision)
│ │ ├── benchmarks/ # HumanEval, MBPP, Perplexity, CodeBLEU
│ │ └── tests/ # Test suite
│ ├── scripts/ # All training, export, benchmark, utility scripts
│ ├── configs/ # YAML training configs + BPE tokenizer
│ ├── data/ # Training datasets
│ ├── docs/ # Paper, references, quickstart
│ ├── examples/ # Inference and training examples
│ ├── llama.cpp/ # Quantization tools (Windows binaries)
│ ├── README.md # This file
│ ├── LICENSE # Apache 2.0
│ ├── CHANGELOG.md # Version history
│ ├── requirements.txt # Python dependencies
│ ├── pyproject.toml # Package config
│ ├── Dockerfile # Container build
│ └── docker-compose.yml # Multi-service orchestration
│
└── HuggingFace/ # Ready-to-use models
├── README.md # Organization overview
├── paper_axl.tex # Research paper
├── references.bib # Bibliography
├── AXL_ARCHITECTURE.md # Architecture documentation
├── AXL-Code-1B-Lion/ # 27 model directories
├── AXL-Reasoning-Lion/
├── AXL-Micro-Lion/
├── ...
└── AXL-Vision-v2/
```
---
## How to Create Training Data
### Option 1: Download from HuggingFace
```bash
python scripts/download_training_data.py --max_gb 5
# Downloads Python code from HuggingFace datasets
```
### Option 2: Generate Synthetic Data
```bash
python scripts/generate_all_training_data.py --skip-hf
# Generates training data for all model types
```
### Option 3: Use Your Own Data
1. Place your text files in `data/` (any `.txt` or `.jsonl` file)
2. Point the training script to your file:
```bash
python scripts/train_axl_micro.py --data_path data/my_code.txt --max_time 600
```
### Data Format
- **Text files** (`.txt`): Raw text, one file per dataset
- **JSONL files** (`.jsonl`): JSON lines with `source_code`/`target_code` fields (for translation)
- **Byte-level tokenizer**: Any file works — no preprocessing needed
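A byte-level tokenizer of this kind is simple to sketch. The split of the 258-entry vocabulary below (256 raw byte values plus two reserved special IDs) is an assumption about the layout, and `SimpleByteTokenizer` is an illustrative stand-in for AXL's `ByteTokenizer`:

```python
class SimpleByteTokenizer:
    """Byte-level tokenizer sketch: ids 0-255 are raw bytes, 256/257 reserved."""
    BOS, EOS = 256, 257  # assumed special-token assignment

    def encode(self, text):
        # UTF-8 bytes map directly to token ids, so no vocab training is needed.
        return list(text.encode("utf-8"))

    def decode(self, ids):
        return bytes(i for i in ids if i < 256).decode("utf-8", errors="replace")
```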
### Data Quality Filtering
```python
from multiscale_transformer.data.quality_filter import DataFilter
data_filter = DataFilter(min_lines=2, max_lines=500, require_syntax_valid=True)
clean = data_filter.filter_file("data/raw_code.txt", "data/clean_code.txt")
```
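The kinds of checks such a filter applies can be sketched with the standard library. `looks_trainable` is a hypothetical helper mirroring the `DataFilter` parameters above (line-count bounds, syntax validity), not the library's implementation:

```python
import ast

def looks_trainable(snippet, min_lines=2, max_lines=500):
    """Accept a Python snippet only if its length and syntax pass basic checks."""
    lines = snippet.strip().splitlines()
    if not (min_lines <= len(lines) <= max_lines):
        return False
    try:
        ast.parse(snippet)   # require syntactically valid Python
    except SyntaxError:
        return False
    return True
```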
---
## How to Train a Model
### Step 1: Choose a Config
Configs are in `configs/`. Each defines architecture:
- `axl_micro.yaml` — 12.8M params, fastest training (3 min)
- `axl_code_1b.yaml` — 318M params, largest model (20 min)
- Or use the auto-builder: `from multiscale_transformer.training.model_builder import ModelBuilder`
### Step 2: Train
```bash
# Train one model with Lion optimizer (recommended)
python scripts/retrain_all_lion.py --models micro # 3 minutes
python scripts/retrain_all_lion.py --models code_1b # 20 minutes
# Train all models sequentially
python scripts/retrain_all_lion.py # ~50 minutes total
# Train with custom data
python scripts/train_axl_code_1b_lion.py --max_time 1200
```
### Step 3: Export to GGUF
```bash
python scripts/quantize_all_models.py --models code_1b_lion
# Produces F16 + Q4_K_M GGUF files in checkpoints/
```
### Step 4: Deploy
```bash
# Primary: Python API server (full multi-scale quality)
python AXL/API/serve_model.py --model checkpoints/axl_code_1b_lion --port 8880 --name axl-code-1b
```