Apple’s Fast and Accurate Long Text Generation Model

Apple researchers have developed a new diffusion model that can generate long text passages up to 128 times faster than its counterparts. Here’s how it works:

Diffusion models generate text in parallel, refining many tokens at once over a series of steps until the full response takes shape. A related family, flow-matching models, skips that iterative refinement and instead learns to produce the final result in a single pass.
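To make that contrast concrete, here is a minimal, hypothetical sketch of the kind of iterative parallel refinement a discrete diffusion-style sampler performs. The toy_model, vocabulary, and commit schedule below are illustrative stand-ins, not taken from Apple's paper.

```python
# Toy sketch of iterative parallel refinement over a discrete token sequence.
# toy_model, VOCAB, and the commit schedule are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["the", "cat", "sat", "on", "mat", "[MASK]"]
MASK = len(VOCAB) - 1          # index of the placeholder token
SEQ_LEN, STEPS = 8, 16         # many small refinement steps

def toy_model(tokens):
    """Stand-in for a learned denoiser: returns, for every position, a
    probability distribution over the real (non-mask) vocabulary entries.
    A uniform distribution is used here just to show the control flow."""
    return np.full((len(tokens), MASK), 1.0 / MASK)

# Start from an all-masked sequence and refine every position in parallel.
tokens = np.full(SEQ_LEN, MASK)
for step in range(STEPS):
    probs = toy_model(tokens)
    proposals = np.array([rng.choice(MASK, p=p) for p in probs])
    # Commit a growing fraction of positions each step (a common schedule);
    # the rest stay as they were and get revisited in later steps.
    commit = rng.random(SEQ_LEN) < (step + 1) / STEPS
    tokens = np.where(commit, proposals, tokens)

print([VOCAB[t] for t in tokens])  # the fully refined sequence
```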

A new study by Apple researchers proposes a model called Few-Step Discrete Flow-Matching (FS-DFM). The model wrote full-length passages in just eight refinement rounds, matching the quality of diffusion models that needed more than a thousand steps. The researchers achieved this through a three-part approach: training the model to handle varying refinement budgets, using a guiding "teacher" model to help it make larger, more accurate updates without overshooting the intended text, and adjusting how each iteration updates the text so the model reaches the final result in fewer, steadier steps.
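As a rough illustration of the "few steps" idea only (not the paper's actual update rule), a sampler trained to tolerate different step sizes can split the noise-to-text path into whatever budget it is given, so eight large jumps replace roughly a thousand small ones. The make_schedule helper and the model.update placeholder below are hypothetical.

```python
# Hedged sketch of a few-step sampling schedule; make_schedule and the
# model.update(...) placeholder it mentions are hypothetical, not FS-DFM's API.
import numpy as np

def make_schedule(num_steps):
    """Split the flow time t in [0, 1] into `num_steps` equal jumps."""
    ts = np.linspace(0.0, 1.0, num_steps + 1)
    return list(zip(ts[:-1], ts[1:]))

# Eight large jumps instead of ~a thousand small ones. A model trained on
# varying step budgets must stay accurate even when step_size is this large.
for t_start, t_end in make_schedule(8):
    step_size = t_end - t_start
    # tokens = model.update(tokens, t_start, step_size)  # placeholder call
    print(f"refine from t={t_start:.3f} to t={t_end:.3f} (jump of {step_size:.3f})")
```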

When compared to larger diffusion models, FS-DFM performed well on perplexity and entropy metrics. Perplexity measures text quality, with lower scores indicating more accurate, natural-sounding text. Entropy measures how confidently the model chooses each word: too little entropy produces repetitive, predictable text, while too much produces random or incoherent output.
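For readers who want the two metrics in concrete terms, the short sketch below computes them from model-assigned probabilities; the example numbers are made up and are not results from the paper.

```python
# Minimal sketch of the two metrics mentioned above, computed from the
# per-token probabilities a language model assigns to a generated passage.
import math

def perplexity(token_probs):
    """exp of the average negative log-likelihood; lower is better."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

def entropy(distribution):
    """Shannon entropy of one next-token distribution, in nats."""
    return -sum(p * math.log(p) for p in distribution if p > 0)

# Example: probabilities the model assigned to each generated token (made up).
print(perplexity([0.25, 0.5, 0.1, 0.4]))       # ~3.76
print(entropy([0.7, 0.1, 0.1, 0.05, 0.05]))    # ~1.01 nats
```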

The researchers plan to release code and model checkpoints to facilitate reproducibility and further research. The full paper is available on arXiv and includes performance examples that color-code the iteration at which each word was last changed.

Source: https://9to5mac.com/2025/10/13/apples-new-language-model-can-write-long-texts-incredibly-fast