Browser Harness Review: Letting an LLM Drive Your Browser for Real
browser-use/browser-harness is a 10k-star open-source project that connects LLMs directly to your real browser via CDP. I spent two days with it — here's what actually works.
I caught wind of a new project from the browser-use team called browser-harness. It’s sitting at around 10k stars and makes a bold claim: “Self-healing harness that enables LLMs to complete any task.” In plain English — let a large language model directly control your browser, and it can fix itself when things break.
I’m already pretty familiar with the main browser-use project, which is probably the most mature browser-automation AI agent out there right now. This harness takes a completely different approach. Instead of wrapping everything in pre-defined actions, it hands the LLM a direct WebSocket connection to Chrome via the DevTools Protocol and essentially says: “Figure it out.”
How it actually works
The core idea is surprisingly radical. Traditional automation stacks give agents a fixed playbook of actions to execute. The harness flips that — it provides an ultra-thin connection layer (about 1,000 lines across 4 core files) and lets the LLM talk directly to the browser over CDP. When the agent notices a missing helper for a task, it writes one on the spot.
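To make "talk directly to the browser over CDP" concrete, here is a minimal sketch of the message format involved. CDP commands are JSON objects with an `id`, a `method`, and `params`; replies echo the `id`, while events arrive with no `id`. `Page.navigate` and `Runtime.evaluate` are real CDP methods, but the `CdpSession` class is my own illustration, not code from the harness, and it only builds and parses frames rather than opening a WebSocket.

```python
import itertools
import json


class CdpSession:
    """Hypothetical helper: builds CDP command frames and pairs replies by id."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}  # command id -> method name still awaiting a reply

    def command(self, method, **params):
        """Return the JSON text that would be sent over the WebSocket."""
        msg_id = next(self._ids)
        self._pending[msg_id] = method
        return json.dumps({"id": msg_id, "method": method, "params": params})

    def handle(self, raw):
        """Classify an incoming frame as a command reply or a browser event."""
        msg = json.loads(raw)
        if "id" in msg:
            method = self._pending.pop(msg["id"])
            if "error" in msg:
                raise RuntimeError(f"{method} failed: {msg['error'].get('message')}")
            return ("reply", method, msg.get("result", {}))
        # Events carry a method and params but no id.
        return ("event", msg["method"], msg.get("params", {}))


session = CdpSession()
navigate = session.command("Page.navigate", url="https://example.com")
evaluate = session.command(
    "Runtime.evaluate", expression="document.title", returnByValue=True
)
print(navigate)
print(evaluate)
```

A protocol this small explains how the connection layer can stay thin: the agent only needs to know which CDP method to call, and the layer carries the frames.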
Here’s a concrete example: the agent wants to upload a file but agent_helpers.py doesn’t have a helper for that. So it writes one, adds it to the file, and proceeds. Next time the same action comes up, it just reuses the helper. That’s the “self-healing” part — every run accumulates domain knowledge.
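The write-once, reuse-later loop can be sketched roughly like this. It is a simplified illustration under my own assumptions, not the harness's actual mechanism: `ensure_helper`, `load_helpers`, and the body of `upload_file` are hypothetical, and in the real tool the LLM generates the helper source.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_helpers(path):
    """Import the helpers file fresh so newly appended functions are visible."""
    spec = importlib.util.spec_from_file_location("agent_helpers", str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def ensure_helper(path, name, source):
    """Return helper `name`, appending `source` to the file first if it is missing."""
    helpers = load_helpers(path)
    if not hasattr(helpers, name):
        with open(path, "a", encoding="utf-8") as f:
            f.write("\n\n" + source.strip() + "\n")
        helpers = load_helpers(path)
    return getattr(helpers, name)


# Hypothetical helper source standing in for what the LLM would write.
UPLOAD_SRC = '''
def upload_file(selector, file_path):
    return {"action": "upload", "selector": selector, "file": file_path}
'''

# A throwaway workspace so the sketch does not touch a real checkout.
workspace = Path(tempfile.mkdtemp())
helpers_file = workspace / "agent_helpers.py"
helpers_file.write_text("# helpers written by the agent\n", encoding="utf-8")

upload = ensure_helper(helpers_file, "upload_file", UPLOAD_SRC)        # first run: written
upload_again = ensure_helper(helpers_file, "upload_file", UPLOAD_SRC)  # later run: reused
print(upload("#resume", "cv.pdf"))
```

The second call finds the function already defined and appends nothing, which is the "accumulates domain knowledge" behavior in miniature.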
The architecture breaks down like this:
- install.md: first-time setup and browser bootstrap
- SKILL.md: day-to-day usage patterns
- src/browser_harness/: protected core package the agent can’t touch
- agent-workspace/agent_helpers.py: helper code the agent edits freely
- agent-workspace/domain-skills/: reusable site-specific skills
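One way to picture the split between the protected core and the editable workspace is a simple path check. I don't know how the project actually enforces the boundary, so treat `agent_may_edit` and the `AGENT_WRITABLE` list as hypothetical; only the directory names come from the layout above.

```python
from pathlib import PurePosixPath

# Locations the agent is allowed to modify, per the layout described above.
AGENT_WRITABLE = (
    PurePosixPath("agent-workspace/agent_helpers.py"),
    PurePosixPath("agent-workspace/domain-skills"),
)


def agent_may_edit(relative_path):
    """Return True only for repo-relative paths inside the agent's writable areas."""
    path = PurePosixPath(relative_path)
    # Reject absolute paths and ".." segments that could escape the workspace.
    if path.is_absolute() or ".." in path.parts:
        return False
    return any(path == root or root in path.parents for root in AGENT_WRITABLE)


print(agent_may_edit("agent-workspace/agent_helpers.py"))
print(agent_may_edit("src/browser_harness/core.py"))  # core.py is a made-up file name
```

Keeping the writable surface to one file and one directory means a bad helper can be reverted without touching the connection layer.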
My real-world experience
Setup is genuinely simple. The README gives you a setup prompt you can paste straight into Claude Code. It opens chrome://inspect/#remote-debugging for you to tick the remote debugging checkbox, then you click Allow on the permission popup.
I tested two tasks:
Auto-filling a form — I asked the agent to send a message on LinkedIn. The first run got stuck at the file upload step, but about 30 seconds later it wrote its own helper and the second attempt went through smoothly. Honestly, the self-healing process impressed me more than the end result.
Batch data extraction — I needed to scrape order lists from an old dashboard with no API. The agent figured out the login flow, pagination logic, and even wrote a domain-skills/legacy-dashboard/ skill to persist that knowledge. Next time I ran the same site, it basically worked with zero extra configuration.
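The persistence idea behind domain skills can be sketched as a per-site notes directory that is checked before exploring. The `notes.md` file name, the `save_skill` and `load_skill` functions, and the note text are all made up for illustration; I have not verified what the harness actually stores inside a skill directory.

```python
import tempfile
from pathlib import Path

# Temporary stand-in for agent-workspace/domain-skills/ in a real checkout.
SKILLS_ROOT = Path(tempfile.mkdtemp()) / "agent-workspace" / "domain-skills"


def save_skill(site, notes):
    """Persist what the agent learned about a site for future runs."""
    skill_dir = SKILLS_ROOT / site
    skill_dir.mkdir(parents=True, exist_ok=True)
    (skill_dir / "notes.md").write_text(notes, encoding="utf-8")


def load_skill(site):
    """Return saved notes for a site, or None if it has not been visited yet."""
    notes = SKILLS_ROOT / site / "notes.md"
    return notes.read_text(encoding="utf-8") if notes.exists() else None


first_visit = load_skill("legacy-dashboard")   # nothing saved yet, so the agent explores
save_skill("legacy-dashboard", "Orders table paginates via the Next button.\n")
second_visit = load_skill("legacy-dashboard")  # the next run starts from saved notes
print(first_visit, second_visit)
```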
Quick start
git clone https://github.com/browser-use/browser-harness.git
cd browser-harness
# Follow install.md or paste the setup prompt into Claude Code
The harness also integrates with Browser Use Cloud, which gives you 3 concurrent browsers on the free tier, plus proxies and CAPTCHA solving. Worth trying if your local network is unreliable.
Pros and cons, straight up
What I liked:
- Extremely lightweight at ~1k core lines — you can actually read and modify the source
- Self-healing isn’t just marketing; the agent genuinely writes and reuses helpers
- Controls a real browser, not a headless simulation, so compatibility is excellent
- Domain skills accumulate over time — the same site gets faster with each run
What frustrated me:
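- You need Chrome running with remote debugging enabled, and anything that can reach that debugging connection gets full control of your logged-in browser sessions, so treat it as a real security risk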
- First runs on new sites are slow because the agent has to “explore” first
- Heavily geared toward Claude Code or similar AI coding tools — manual setup is more involved
- Documentation is still lean; edge cases often require reading the source code
Browser-use vs. browser-harness: which one?
If you want something that works out of the box without thinking about internals, stick with the main browser-use project. If you want maximum freedom for the agent, need it to evolve its own workflows, or you’re dealing with complex browser tasks that pre-defined actions can’t cover, harness is the better fit. Think of it as automatic vs. manual transmission — harness gives you full control over the clutch.
Bottom line
Browser Harness is one of those projects that feels underwhelming at first glance but grows on you the more you use it. The self-healing design is genuinely forward-thinking. I’m planning to migrate a few recurring data-scraping tasks over to it and see how much efficiency improves after a few weeks of accumulated domain skills. I’ll report back when I have numbers.
About the Author
Liudingyu is a full-stack developer and heavy GitHub user who has starred 900+ repos over the past 3 years. This site only covers tools the author has actually used or deeply researched.