Developing autonomous agents that effectively interact with Graphical User Interfaces (GUIs) remains a challenging open problem, especially for small on-device models. In this paper, we present Ferret-UI Lite, a compact, end-to-end GUI agent that operates across diverse platforms, including mobile, web, and desktop. Using techniques optimized for developing small models, we build our 3B Ferret-UI Lite agent by curating a diverse GUI data mixture from real and synthetic sources, strengthening inference-time performance through chain-of-thought reasoning and visual tool-use, and applying reinforcement learning with designed rewards. Ferret-UI Lite achieves competitive performance with other small-scale GUI agents. In GUI grounding, Ferret-UI Lite attains scores of 91.6%, 53.3%, and 61.2% on the ScreenSpot-V2, ScreenSpot-Pro, and OSWorld-G benchmarks, respectively. For GUI navigation, Ferret-UI Lite achieves success rates of 28.0% on AndroidWorld and 19.8% on OSWorld. We share our methods and lessons learned from developing compact, on-device GUI agents.

