LM Studio Launches LM Link - Access Your GPU Rig's Models From Anywhere via Encrypted Mesh
LM Studio 0.4.5 introduces LM Link, built on Tailscale's tsnet library, letting users access local AI models on remote hardware through end-to-end encrypted connections with zero port forwarding.

LM Studio 0.4.5 shipped on February 25 with a feature that solves one of the most persistent annoyances in local AI: getting your models to follow you across devices. LM Link, built in partnership with Tailscale, creates an encrypted peer-to-peer mesh between your machines so remote models appear in the LM Studio interface alongside local ones. No port forwarding, no VPN configuration, no cloud subscriptions.
The pitch is simple. You have a GPU rig at home running a 70B model. You're on a laptop at a coffee shop. LM Link connects the two through WireGuard encryption and routes inference requests transparently. Your laptop's LM Studio shows the remote model as if it were loaded locally.
| Feature | LM Link | Ollama (remote) | vLLM | Manual Tailscale + API |
|---|---|---|---|---|
| Setup | `lms link enable` | Port forwarding + firewall rules | Server config + API setup | Tailscale install + API key mgmt |
| Encryption | WireGuard E2E (built in) | None by default | None by default | WireGuard E2E (separate install) |
| Auth | Identity-based (account) | None by default | API key | Tailscale ACL + API key |
| Port forwarding | Not required | Required | Required | Not required |
| API compatibility | OpenAI + Anthropic | OpenAI | OpenAI | OpenAI |
| Use case | Personal/small team remote access | Local-first serving | High-throughput production | DIY remote access |
| Pricing | Free (2 users, 5 devices each) | Free | Free | Free (up to 3 users) |
TL;DR
- LM Studio 0.4.5 introduces LM Link for accessing models on remote machines through end-to-end encrypted connections
- Built on Tailscale's tsnet library (userspace WireGuard) - no kernel-level permissions, no port forwarding, works through CGNAT and firewalls
- Remote models appear in LM Studio's model loader with local ones - compatible with chat, tools, and API calls
- Works with Claude Code, Codex, OpenCode, and any tool pointing at localhost:1234
- Free for up to 2 users with 5 devices each; currently in preview with staged rollout
How It Works
The Networking Layer
LM Link embeds Tailscale's tsnet library directly into LM Studio. tsnet is a Go library that runs Tailscale's mesh networking stack entirely in userspace, so an application can join the mesh without touching kernel sockets, modifying system routing tables, or requiring privileged access. Each LM Studio instance becomes a standalone node on a private encrypted network.
The practical effect: two devices running LM Studio can find and connect to each other regardless of network topology. CGNAT, corporate firewalls, different subnets - none of it matters. The WireGuard protocol handles encryption, and Tailscale's coordination servers handle device discovery. All inference data - prompts, responses, model listings - travels peer-to-peer. Neither Tailscale nor LM Studio's backend service sees the content.
LM Link runs alongside any existing Tailscale VPN installation. If you already use Tailscale for other purposes, LM Link creates a separate, isolated mesh that won't interfere.
Setting It Up
On the host machine (your GPU rig):
`lms link enable`
On the client machine, log into your LM Studio account and remote models appear in the standard model loader. They are served through the same localhost:1234 endpoint that local models use. Any tool configured to hit that endpoint - Claude Code, Codex, OpenCode, LangChain, custom Python scripts - works without changes.
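Because remote models surface through the same OpenAI-compatible endpoint, client code never changes. A minimal sketch using only the standard library; the model name is a placeholder, and the `/v1/chat/completions` path is the standard OpenAI-compatible route LM Studio serves:

```python
import json
import urllib.request

# LM Studio's local OpenAI-compatible endpoint. LM Link routes the request
# to the remote host behind the scenes, so this URL never changes.
BASE_URL = "http://localhost:1234/v1"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request against the local endpoint."""
    body = json.dumps({
        "model": model,  # placeholder; list available models via GET /v1/models
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("llama-3.3-70b", "Summarize WireGuard in one line.")
# resp = urllib.request.urlopen(req)  # uncomment against a live LM Studio host
```

The same script runs unmodified whether the model is loaded locally or served from a rig across the country.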
For headless setups (servers, cloud VMs without displays), LM Studio offers llmster, a daemon that provides the same inference and LM Link capabilities without a GUI.
What the Server Sees
The privacy boundary is clearly defined:
- LM Studio backend sees: Device list (for discovery/connection purposes)
- LM Studio backend doesn't see: Prompts, responses, model information, hardware details
- Tailscale sees: Connection metadata for peer coordination
- Tailscale doesn't see: Any inference data (encrypted end-to-end)
What It Enables
The immediate use case is obvious: access your home GPU from anywhere. But the architecture enables some workflows that are harder to set up with existing tools.
Personal GPU as a Service
A single NVIDIA RTX 4090 or Mac Studio at home becomes a private inference server reachable from any device. The local LLM tools ecosystem has been building toward this, but networking has always been the friction point. LM Link removes the friction without requiring users to learn networking.
Team Inference Without Cloud
A small team can share a GPU workstation. Two users with five devices each can connect to the same host. The free tier covers this. For teams that need more, enterprise plans are available.
Coding Agents on Thin Clients
This is where LM Link gets interesting for the AI coding tools crowd. Claude Code, Codex, and OpenCode all support custom API endpoints. Point them at localhost:1234 on a laptop and LM Link routes the actual inference to whatever hardware you have at home or in the office. You get the convenience of a thin client with the privacy and cost savings of local inference.
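As a sketch of what that wiring looks like in practice: most of these tools honor base-URL environment variables. The variable names below are the individual tools' documented conventions, not LM Link settings, and the exact path each tool expects can vary, so treat this as an assumption to verify against each tool's docs:

```python
import os

# LM Studio's local endpoint. LM Link surfaces remote models behind it,
# so the tools being launched need no awareness of the remote host.
LOCAL = "http://localhost:1234"

agent_env = {
    **os.environ,
    "OPENAI_BASE_URL": f"{LOCAL}/v1",  # OpenAI-style clients and agents
    "ANTHROPIC_BASE_URL": LOCAL,       # Anthropic-style clients, e.g. Claude Code
}

# Hypothetical invocation: launch an agent with the endpoint overridden.
# subprocess.run(["claude"], env=agent_env)
```

The laptop runs the agent loop; the GPU rig runs the model; the tunnel in between is invisible to both.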
What the Announcement Doesn't Tell You
Latency and Bandwidth
The announcement and documentation are silent on latency expectations. WireGuard adds minimal overhead, but the real bottleneck in remote inference is network bandwidth for token streaming, especially with long-context models. If your home connection has 10 Mbps upload, that constrains throughput regardless of how elegant the tunnel is. No benchmarks were published at launch.
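A back-of-envelope sketch of where bandwidth actually bites, using assumed byte counts rather than measured figures: chat APIs re-send the full conversation each turn, and streamed tokens carry per-event framing overhead on the wire.

```python
def upload_seconds(context_tokens: int, upload_mbps: float,
                   bytes_per_token: int = 4) -> float:
    """Rough time to push an N-token prompt upstream (~4 bytes/token assumed)."""
    return context_tokens * bytes_per_token * 8 / (upload_mbps * 1_000_000)

def stream_ceiling_tok_per_s(link_mbps: float,
                             bytes_per_event: int = 200) -> float:
    """Upper bound on streamed tokens/s, assuming ~200 bytes of SSE framing
    and JSON wrapping per token event."""
    return link_mbps * 1_000_000 / 8 / bytes_per_event

# Re-sending a 50k-token context over a 10 Mbps uplink, per request:
print(round(upload_seconds(50_000, 10), 2))   # -> 0.16 seconds
# Streaming ceiling on the same link:
print(round(stream_ceiling_tok_per_s(10)))    # -> 6250 tokens/s
```

Under these assumptions the tunnel itself is rarely the throughput limit; round-trip latency per request and the host GPU's generation speed are more likely to dominate. Real numbers would require the benchmarks LM Studio hasn't published.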
Staged Rollout
LM Link is in "preview" with staged rollout access. Not everyone updating to 0.4.5 gets it right away. The changelog doesn't specify the rollout timeline or criteria.
Device Limits
The free tier allows 2 users with 5 devices each. There's currently no paid option for additional capacity - just an enterprise contact form. For a solo developer this is fine. For a team of five it isn't enough.
No Distributed Inference
LM Link routes requests to a single remote machine. It does not split a model across multiple nodes. If you need to run a 405B model that doesn't fit on one machine, you still need something like Exo or tensor parallelism across multiple GPUs. LM Link solves remote access, not model parallelism.
Build Quality
Build 2 of 0.4.5 fixed a bug where the LM Link connector was missing from installs delivered through the in-app updater. The feature shipped incomplete in its first build; the preview label is warranted.
LM Link is the right feature at the right time. The local LLM movement has hit the point where the hardware works, the models are good enough, and the remaining barrier is plumbing. Making a GPU rig accessible from anywhere with a single CLI command and built-in encryption is the kind of infrastructure improvement that changes daily workflows. The missing latency benchmarks and staged rollout are worth noting, but the architecture - userspace WireGuard mesh, zero kernel modification, identity-based auth - is sound. If it works as described, the days of SSH tunnels and manual port forwarding for local LLM access are numbered.