CVPR 2026 · AI Art Gallery

Autoregressive Mosaics

Probing emergent 2D spatial reasoning in text-only LLMs

Ashwin Nedungadi

Universität Rostock · Institute for Visual and Analytical Computing

Paper (coming soon) · Live Demo · Video · GitHub

Overview

The idea is simple: humans are naturally great at creating mosaic art. From the Roman Empire to French Neo-Impressionism, we can effortlessly place individual strokes to form a larger, coherent image — balancing local action with global structure. Large Language Models, however, struggle with this because they fundamentally lack spatial grounding.

If a language model is trained primarily on text and code, to what extent can it still recover coherent 2D visual concepts when forced to act as a pixel-level or programmatic painter?

Autoregressive Mosaics attempts to force an LLM trained only on text to paint one discrete pixel at a time. The system gives the model a blank M × N grid and a text prompt; the model must infer where to place structure and color step by step, using only its linguistic priors.
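A minimal sketch of that setup, assuming cells are emitted in row-major order (the emission order is an assumption here) and using a hypothetical `choose_color` callable as a stand-in for the LLM. The toy painter below is a hand-written rule, not a model; it only shows the shape of the loop and how a 1D step index maps onto a 2D position.

```python
from typing import Callable, List, Optional

Grid = List[List[Optional[str]]]


def paint_autoregressively(
    rows: int,
    cols: int,
    prompt: str,
    choose_color: Callable[[str, Grid, int, int], Optional[str]],
) -> Grid:
    """Fill a rows x cols grid one cell at a time in row-major order.

    `choose_color` is a hypothetical stand-in for the LLM: it receives the
    prompt, the partially painted grid, and the current position.
    """
    grid: Grid = [[None] * cols for _ in range(rows)]
    for step in range(rows * cols):
        r, c = divmod(step, cols)  # 1D step index -> 2D (row, col)
        grid[r][c] = choose_color(prompt, grid, r, c)
    return grid


def toy_painter(prompt: str, grid: Grid, r: int, c: int) -> Optional[str]:
    """Toy rule (not a model): copy the cell above, else alternate colors."""
    if r > 0:
        return grid[r - 1][c]  # the cell above was painted `cols` steps ago
    return "red" if c % 2 == 0 else "blue"


canvas = paint_autoregressively(3, 4, "vertical stripes", toy_painter)
```

In this sketch, keeping a column aligned means recalling a decision made `cols` steps earlier, which is the kind of long-range bookkeeping a real model has to perform implicitly.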

The results are often visually primitive, unstable, or unintentionally abstract — and that is exactly the point. They offer a raw, unfiltered look into how text-only models represent and fracture geometry, shape, and everyday visual concepts. As with any art, outputs are open to interpretation. Squint a little: what do you see?

Demonstration

Generations

Generation Methods

Two distinct pipelines explore the same core phenomenon from different angles, probing where geometry emerges, degrades, or collapses under autoregressive pressure.

ver2 — ASCII Canvas

The Literal Painter

In a single generation pass, the LLM emits an ASCII topology grid alongside a symbol-to-color palette. Every grid cell is a deliberate per-position decision. Because LLMs predict tokens in a strict 1D sequence, a cell's vertical neighbor sits a full row of tokens away, so 2D consistency (object boundaries, symmetry, position memory) quickly degrades. Shapes drift, tear, and collapse into fragmented, often compelling abstractions.

Single Pass · Direct Pixel · ASCII + Palette
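To make the ASCII + palette idea concrete, here is a small sketch of how such output could be turned into pixels. The function name `rasterize_ascii`, the RGB tuple format, and the sample grid are all illustrative assumptions, not the project's actual format. Ragged rows are padded with a default color, since uneven row lengths are one plausible symptom of the drift described above.

```python
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]


def rasterize_ascii(
    grid_text: str,
    palette: Dict[str, RGB],
    default: RGB = (0, 0, 0),
) -> List[List[RGB]]:
    """Map each character of an ASCII grid to a color via the palette.

    Rows shorter than the widest row are padded, and unknown symbols fall
    back to `default`, so malformed model output still renders.
    """
    lines = grid_text.strip("\n").splitlines()
    if not lines:
        return []
    width = max(len(line) for line in lines)
    pixels: List[List[RGB]] = []
    for line in lines:
        padded = line.ljust(width)  # pad ragged rows with spaces
        pixels.append([palette.get(ch, default) for ch in padded])
    return pixels


# Hypothetical model output: the last row is two cells short.
model_output_grid = """
..##..
.####.
..##
"""
palette = {".": (135, 206, 235), "#": (255, 200, 0)}
image = rasterize_ascii(model_output_grid, palette)
```

Each entry of `image` is one row of RGB tuples, ready to be scaled up into mosaic tiles by whatever display layer is used.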

ver3 — Code Canvas

The Symbolic Artist

Instead of raw pixels, the LLM outputs Python rendering logic via a constrained drawing API (fill, rect, line, circle, triangle). A deterministic renderer rasterizes the result. This neuro-symbolic pipeline aligns with LLM strengths — symbolic decomposition, procedural logic, code synthesis — yielding sudden spatial coherence absent from the ASCII approach.

Code Generation · Neuro-Symbolic · Deterministic
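The sketch below illustrates what a constrained drawing API with a deterministic renderer might look like. Only the five primitive names (fill, rect, line, circle, triangle) come from the description above; the `Canvas` class, every signature, the coordinate convention, and the sample "house" program are assumptions made for illustration.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
Point = Tuple[int, int]


class Canvas:
    """Tiny deterministic raster canvas; all signatures are illustrative."""

    def __init__(self, width: int, height: int) -> None:
        self.width, self.height = width, height
        self.pixels: List[List[RGB]] = [[(0, 0, 0)] * width for _ in range(height)]

    def _set(self, x: int, y: int, color: RGB) -> None:
        if 0 <= x < self.width and 0 <= y < self.height:  # clip out-of-bounds
            self.pixels[y][x] = color

    def fill(self, color: RGB) -> None:
        self.rect(0, 0, self.width, self.height, color)

    def rect(self, x: int, y: int, w: int, h: int, color: RGB) -> None:
        for yy in range(y, y + h):
            for xx in range(x, x + w):
                self._set(xx, yy, color)

    def circle(self, cx: int, cy: int, r: int, color: RGB) -> None:
        for yy in range(cy - r, cy + r + 1):
            for xx in range(cx - r, cx + r + 1):
                if (xx - cx) ** 2 + (yy - cy) ** 2 <= r * r:
                    self._set(xx, yy, color)

    def line(self, x0: int, y0: int, x1: int, y1: int, color: RGB) -> None:
        # Bresenham's line algorithm (integer-only, all octants).
        dx, dy = abs(x1 - x0), -abs(y1 - y0)
        sx = 1 if x0 < x1 else -1
        sy = 1 if y0 < y1 else -1
        err = dx + dy
        while True:
            self._set(x0, y0, color)
            if x0 == x1 and y0 == y1:
                break
            e2 = 2 * err
            if e2 >= dy:
                err += dy
                x0 += sx
            if e2 <= dx:
                err += dx
                y0 += sy

    def triangle(self, p0: Point, p1: Point, p2: Point, color: RGB) -> None:
        # Filled triangle: a pixel is inside if all edge functions share a sign.
        def edge(a: Point, b: Point, p: Point) -> int:
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        xs, ys = (p0[0], p1[0], p2[0]), (p0[1], p1[1], p2[1])
        for y in range(min(ys), max(ys) + 1):
            for x in range(min(xs), max(xs) + 1):
                w = (edge(p0, p1, (x, y)), edge(p1, p2, (x, y)), edge(p2, p0, (x, y)))
                if all(v >= 0 for v in w) or all(v <= 0 for v in w):
                    self._set(x, y, color)


# Hypothetical program of the kind a model might emit for "a house".
SKY, WALL, ROOF, SUN, GRASS = (135, 206, 235), (200, 160, 120), (170, 40, 40), (255, 220, 0), (40, 160, 60)
canvas = Canvas(16, 16)
canvas.fill(SKY)
canvas.rect(4, 8, 8, 8, WALL)
canvas.triangle((3, 8), (12, 8), (8, 3), ROOF)
canvas.circle(13, 2, 1, SUN)
canvas.line(0, 15, 15, 15, GRASS)
```

The design point this sketch tries to capture is that the model only has to name shapes and coordinates; alignment of individual pixels is handled by the renderer, so a single correct `rect` call yields a perfectly straight wall.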

Citation

@misc{ned2026autoregressivemosaics,
  author       = {Nedungadi, Ashwin},
  title        = {Autoregressive Mosaics},
  year         = {2026},
  publisher    = {GitHub},
  note         = {CVPR AI Art Gallery},
  howpublished = {\url{https://github.com/ashwin-ned/autoregressive-mosaics}}
}

Autoregressive Mosaics · CVPR 2026 AI Art Gallery

ashwin-ned.com
