CeADAR Tech Talk – Lizard: An Efficient Linearization Framework for Large Language Models – Dr. Trung Bui

We are delighted to host Dr. Trung Bui of Adobe Research for our next Tech Talk, where he will present Lizard, a breakthrough framework designed to transform pretrained Transformer-based Large Language Models (LLMs) into flexible, subquadratic architectures capable of infinite-context generation.

Date: October 16, 2025
Time: 3:00 PM
Online Webinar

As context lengths grow, Transformer-based LLMs face major memory and compute bottlenecks. Lizard, a new framework presented by Dr. Trung Bui, tackles these by introducing a subquadratic attention mechanism, adaptive memory control, and a hardware-aware training algorithm, achieving near-lossless recovery of the teacher model's performance while outperforming previous linearization methods by 18 points on the 5-shot MMLU benchmark.

Abstract

We propose Lizard, a linearization framework that transforms pretrained Transformer-based Large Language Models (LLMs) into flexible, subquadratic architectures for infinite-context generation. Transformer-based LLMs face significant memory and computational bottlenecks as context lengths increase, due to the quadratic complexity of softmax attention and the growing key-value (KV) cache. Lizard addresses these limitations by introducing a subquadratic attention mechanism that closely approximates softmax attention while preserving the output quality. Unlike previous linearization methods, which are often limited by fixed model structures and therefore exclude gating mechanisms, Lizard incorporates a gating module inspired by recent state-of-the-art linear models.

This enables adaptive memory control, supports constant-memory inference, offers strong length generalization, and allows more flexible model design. Lizard combines gated linear attention for global context compression with sliding window attention enhanced by meta memory, forming a hybrid mechanism that captures both long-range dependencies and fine-grained local interactions. Moreover, we introduce a hardware-aware algorithm that accelerates the training speed of our models. Extensive experiments show that Lizard achieves near-lossless recovery of the teacher model’s performance across standard language modeling tasks, while significantly outperforming previous linearization methods. On the 5-shot MMLU benchmark, Lizard improves over prior models by 18 points and shows significant improvements on associative recall tasks.
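To make the hybrid mechanism described above more concrete, the sketch below shows one possible decoding step that combines a gated linear-attention state (global context compression with constant memory) with softmax attention over a short sliding window (fine-grained local interactions). This is only an illustrative approximation, not the authors' implementation: the variable names, shapes, gating form, window size, and the way the two branches are mixed are all assumptions.

```python
# Minimal sketch (assumptions throughout): one token-by-token step of a hybrid
# attention combining a gated linear-attention state with sliding-window softmax
# attention, so memory stays constant regardless of sequence length.
import torch
import torch.nn.functional as F

def hybrid_attention_step(q_t, k_t, v_t, state, window_k, window_v, g_t, window=64):
    """q_t, k_t, v_t : (d,) query/key/value of the current token
    state           : (d, d) recurrent matrix summarizing all past tokens
    window_k/v      : (<=window, d) keys/values of the most recent tokens
    g_t             : (d,) forget gate in (0, 1) controlling memory decay
    """
    d = q_t.shape[-1]

    # Gated linear attention: decay the old state, then write the new key-value pair.
    # state_t = diag(g_t) @ state_{t-1} + k_t v_t^T
    state = g_t.unsqueeze(-1) * state + torch.outer(k_t, v_t)
    global_out = q_t @ state / (d ** 0.5)          # compressed global context

    # Sliding-window softmax attention over the last `window` tokens.
    window_k = torch.cat([window_k, k_t.unsqueeze(0)])[-window:]
    window_v = torch.cat([window_v, v_t.unsqueeze(0)])[-window:]
    scores = window_k @ q_t / (d ** 0.5)
    local_out = F.softmax(scores, dim=-1) @ window_v

    # Naive fixed mix of the two branches; a trained model would likely learn this.
    out = 0.5 * global_out + 0.5 * local_out
    return out, state, window_k, window_v
```

Because the state matrix and the window buffers have fixed sizes, a step like this keeps inference memory constant as generation proceeds, which is the property the abstract highlights for infinite-context generation.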
