Name: Fara-1.5
Author: Microsoft Research

Fara-1.5 is a family of three browser computer use agent models from Microsoft Research AI Frontiers, released May 22, 2026. Built on Qwen3.5 base models at 4B, 9B, and 27B parameter counts, the models are trained specifically to control a web browser - clicking, filling forms, navigating menus, logging into sites, and completing multi-step tasks from a plain-language prompt. The 27B variant scores 72.0% on Online-Mind2Web, beating OpenAI Operator (58.3%) and Gemini 2.5 Computer Use (57.3%).

TL;DR

Open-weight browser agent family (4B/9B/27B) built on Qwen3.5, trained for web task completion
Fara1.5-27B scores 72.0% on Online-Mind2Web - higher than OpenAI Operator and Gemini 2.5 Computer Use at similar cost
Inference code on GitHub (MIT license); Fara1.5-9B available now on Microsoft Foundry, integrated with MagenticLite

The release accompanies the launch of MagenticLite, Microsoft's redesigned agentic application that pairs Fara-1.5 (browser tasks) with MagenticBrain-14B (orchestration and local file system work). Together they form a self-hostable agentic stack that runs without frontier-scale compute. Unlike GPT-5.4's computer use, which runs fully managed in the cloud, Fara-1.5 can be rolled out on a single 24GB GPU using vLLM.

The model family is a direct successor to Fara-7B, released in late 2025. The 9B variant nearly doubles Fara-7B's Online-Mind2Web score (34.1% to 63.4%). The previous model used Qwen2.5-VL-7B as its base and was trained on 145K synthetic trajectories. Fara-1.5 scales this up dramatically with around 2 million training samples and a more sophisticated data pipeline.

Key Specifications

Specification	Details
Provider	Microsoft Research AI Frontiers
Model Family	Fara
Parameters	4B, 9B, 27B
Base Model	Qwen3.5 (all three sizes)
Context Window	Not disclosed
Input Price	Not disclosed
Output Price	Not disclosed
Release Date	May 22, 2026
License	MIT (inference code); model weights terms follow Microsoft Foundry terms
GitHub	microsoft/fara

Benchmark Performance

Fara-1.5 was assessed on two primary web agent benchmarks. Online-Mind2Web is the more demanding of the two: 300 tasks across 136 live websites, scored on end-to-end task completion. WebVoyager covers 643 tasks across 15 websites.

Benchmark	Fara1.5-4B	Fara1.5-9B	Fara1.5-27B	OpenAI Operator	Gemini 2.5 CU	Yutori Nav n1
Online-Mind2Web	57.3%	63.4%	72.0%	58.3%	57.3%	64.7%
WebVoyager	80.8%	86.6%	88.6%	87.0%	-	-

The 27B model is the clearest story: it beats every other system on Online-Mind2Web including Yutori's Navigator n1, which itself topped the web agent benchmarks leaderboard earlier this year. The 9B model matches or exceeds Operator and Gemini 2.5 Computer Use while running on consumer-grade hardware.

The WebVoyager results are tighter. Fara1.5-9B (86.6%) trails OpenAI Operator (87.0%) by less than one point. The 27B model (88.6%) edges past. WebVoyager's tasks are generally less complex than Online-Mind2Web, which explains why the gap between the Fara1.5 models and their proprietary rivals is smaller there.

The paper also reports additional evaluations. On WebTailBench v1.5 - a harder benchmark highlighting multi-step tasks that require backtracking - Fara1.5-9B reaches 64.5% process success and 32.3% outcome success. On ScreenSpot-Pro, the 9B model improves 18.1 points over Fara-7B, and on OSWorld-G Refined it improves 8.9 points. These numbers confirm gains aren't just benchmark-specific.

For context on where Claude sits: Claude Opus 4.8 achieves 84% on Online-Mind2Web, above all Fara-1.5 variants, though it runs as a proprietary API without any self-hosting option.

Microsoft Research Fara1.5 overview page The Microsoft Research article on Fara1.5, which details benchmark results and training methodology. Source: microsoft.com

Key Capabilities

Fara-1.5 uses an observe-think-act loop. Each inference step receives the conversation history plus the three most recent browser screenshots, outputs reasoning, and then predicts a single action. The action space covers standard mouse and keyboard inputs, web-specific operations like URL search, and context management tools - memory writes, note-taking, and user query requests. The memory tools matter for long-horizon tasks: the model can store intermediate results across hundreds of steps without losing track.

The training data composition reflects deliberate engineering choices. Of the ~2 million samples, 60% are web trajectories from live sites. The remaining 40% covers synthetic environments (12.8%), form filling (12.5%), visual grounding (8.8%), visual question answering (4.9%), GUI drag operations (0.8%), and instruction following and safety (0.1%). The diversity is intentional - pure web scraping produces too many trivial page loads and not enough recovery from errors.

The six synthetic environments in FaraGen1.5 are Mail, Calendar, Stream, ML, Stay, and Scheduler. Each was produced using GitHub Copilot CLI to produce a functional website replica with a real database-backed API. The value of synthetic environments is access: real booking sites and banking portals can't be used for training without user accounts and the risk of irreversible real-world actions. The synthetic equivalents let solvers generate training trajectories for exactly those task types.

Safety training runs through the dataset at 0.1% but isn't negligible. The model is trained to detect "critical points" - form submissions, email sends, account creations - and request user approval before proceeding. MagenticLite surfaces these approval requests in the UI, and the sandboxed Quicksand VM prevents the browser session from reaching the host file system without explicit permission.

MagenticLite and Fara1.5 integration on Microsoft Research blog MagenticLite pairs MagenticBrain-14B for orchestration with Fara-1.5 for browser-specific tasks, forming a self-hostable agentic stack. Source: microsoft.com

The FaraGen1.5 Training Pipeline

The data pipeline is the main technical contribution of the accompanying arXiv paper. FaraGen1.5 has three components: environments, solvers, and verifiers.

Environments mix live web URLs with six synthetic domains. The synthetic sites handle tasks where real execution would require credentials or cause irreversible changes.

Solvers create training trajectories. The primary solver is GPT-5.4, acting as a teacher model with custom tools replicating Fara-1.5's action space. GPT-5.4 reaches 83% on Online-Mind2Web as a solver, meaning most of the produced trajectories are high quality. A user simulator handles cases where the teacher needs to ask for clarification.

Verifiers filter the output across three criteria: correctness (using rubric-based LLM evaluation), efficiency (penalizing unnecessary steps), and safety (checking for critical-point mishandling or missing permission requests). Only trajectories passing all three go into the training set.

The synthetic-to-real transfer data is compelling. On a held-out four-domain test (Allrecipes, Apple, HuggingFace, GitHub), using synthetic replicas during training lifts performance from 73.4% baseline to 83.4%, and the full Fara1.5-9B training reaches 89.8%. The synthetic environments aren't a toy - they fill in real coverage gaps.

Pricing and Availability

Fara1.5-9B is available now on Microsoft Foundry and is integrated with MagenticLite. Fara1.5-4B and 27B were listed as "coming soon" at launch time; check the Microsoft Foundry catalog for current status. No per-token pricing has been published. Deployments on Foundry use managed compute billing, which depends on instance type and region.

For self-hosting, the inference code is on GitHub under the MIT license. Requirements: a GPU with at least 24GB VRAM, Python 3.12, vLLM, and Playwright. The vLLM serve command exposes an OpenAI-compatible endpoint that the included fara-cli connects to. Model weights for Fara-7B were released on HuggingFace at microsoft/Fara-7B; the Fara-1.5 HuggingFace pages for the 4B, 9B, and 27B variants were not yet live today (the GitHub README notes they are coming soon). The repo also lists LM Studio and Ollama as alternative local hosts for Windows and Mac.

MagenticLite itself is available on GitHub separately. MagenticBrain-14B, the orchestration model, is on Microsoft Foundry. Running the full stack requires deploying both models through Foundry managed compute and connecting them via OpenAI-compatible endpoints.

Strengths and Weaknesses

Strengths

State-of-the-art on Online-Mind2Web at 72.0% (27B), beating all other published systems including those from OpenAI and Google
Self-hostable on a single 24GB GPU via vLLM - no managed API dependency
MIT-licensed inference code, weights to be released on HuggingFace
Strong scaling: going from 4B to 27B gains 14.7 points on Online-Mind2Web
Built-in critical-point detection and user approval workflow reduces autonomous error risk
Integrated into MagenticLite for a complete browser + file system agent stack

Weaknesses

Model weights for 4B and 27B variants not yet on HuggingFace at launch (listed as coming soon)
No per-token pricing published, making cost estimation for API use opaque
Context window length not disclosed
27B model (72.0% on Online-Mind2Web) still trails Claude Opus 4.8 (84%), which is the current ceiling for this benchmark
WebVoyager results show Fara1.5-27B (88.6%) beats the 9B variant but only narrowly leads OpenAI Operator (87.0%)
Benchmark coverage is browser-only; no evaluation on desktop OSWorld or multi-application tasks

Web Agent Benchmarks Leaderboard - Full rankings across WebVoyager, Online-Mind2Web, and WebArena
Computer Use Leaderboard - Desktop agent rankings including OSWorld scores
GPT-5.4 Computer Use - OpenAI's competing computer use system, which Fara-1.5-27B beats on Online-Mind2Web

FAQ

What is Fara-1.5 best used for?

Browser automation tasks: form filling, product comparison, flight booking, event scheduling. It handles credentialed sites via user-provided approvals and supports tasks spanning hundreds of steps through built-in memory tools.

Is Fara-1.5 free to use?

The inference code is MIT-licensed and free. Model weights will be on HuggingFace. Running via Microsoft Foundry incurs managed compute costs that depend on instance type; no per-token price has been published.

How does Fara-1.5 compare to OpenAI Operator?

On Online-Mind2Web, Fara1.5-9B (63.4%) beats Operator (58.3%) and the 27B model (72.0%) extends that lead further. On WebVoyager, Operator (87.0%) narrowly beats the 9B but trails the 27B (88.6%).

Can I run Fara-1.5 locally?

Yes. The GitHub repo provides inference code. You need a 24GB VRAM GPU, vLLM, Playwright, and Python 3.12. The CLI wraps vLLM's OpenAI-compatible endpoint.

What is MagenticLite?

MagenticLite is Microsoft's agentic application that pairs Fara-1.5 (browser tasks) with MagenticBrain-14B (orchestration and file system work). It runs inside a sandboxed VM and requires explicit user approval before irreversible actions.

How was Fara-1.5 trained?

Supervised fine-tuning on roughly 2 million samples produced by the FaraGen1.5 pipeline, which uses GPT-5.4 as a teacher, six synthetic web environments, and three-stage verification (correctness, efficiency, safety).

Sources: