Embedding projections are popular for visualizing large datasets and models. However, people often encounter “friction” when using embedding visualization tools: (1) barriers to adoption, e.g., tedious data wrangling and loading, scalability limits, and no integration of results into existing workflows, and (2) limitations in possible analyses, since the tools do not integrate with external tools to additionally show coordinated views of metadata. In this paper, we present Embedding Atlas, a scalable, interactive visualization tool designed to make interacting with large embeddings as easy as possible. Embedding Atlas uses modern web technologies and advanced algorithms, including density-based clustering and automated labeling, to provide a fast and rich data analysis experience at scale. We evaluate Embedding Atlas with a competitive analysis against other popular embedding tools, showing that Embedding Atlas’s feature set specifically helps reduce friction, and report a benchmark on its real-time rendering performance with millions of points. Embedding Atlas is available as open source to support future work in embedding-based analysis.

