# Tools: Cheaper Than Perplexity, but Local — and Works With Any Agent: Agent Browser Workspace

Source: Dev.to

If you've used Perplexity Deep Research, you probably know two feelings at once:

1) "Wow, it really digs deep."
2) "Too bad I can't see what's happening inside, intervene, restart a step, or expand the search — it's a black box."

Agent Browser Workspace is not a "one-button SaaS". It's a local toolkit that gives any AI agent (Cursor, your own agent, an LLM orchestrator) a real browser for research. It runs on your machine, through your Chrome, but in a separate profile. No Docker, no exotic environments.

- Cheaper than Perplexity: on DeepResearch Bench we scored 44.37 (RACE overall) on Claude Haiku 4.5 — a model significantly cheaper than typical "frontier" stacks.
- Local and transparent: a real Chrome right in front of you — you can stop it, log in, close a banner, restart a step, expand the search, refine the query. As many iterations as you need.
- Not just deep research: also a tool for browser automation + content, form, and HTML data extraction.
- Extensible: new sites are added as profiles in scripts/sites/*.json — selectors and controls live separately from code and prompts.

## What's Inside?

Agent Browser Workspace is a repository with two layers:

## 1) Low Level (utils/)

- utils/browserUse.js — control a real Chrome via Playwright: navigation, clicks, input, scrolling (including infinite scroll), screenshots, file and image downloads, JS execution on the page, tabs, CDP.
- utils/getDataFromText.js — parse raw HTML without a browser: finds navigation, main content, and forms, and converts content to Markdown.

## 2) High Level (scripts/)

Ready-made building blocks for a research pipeline:

- getContent — save a page as Markdown, download images, and rewrite links to local files.
- getForms — find forms, classify them (search/auth/filter/contact/subscribe), and build ready-to-use CSS selectors for filling.
- getAll — content + forms in one pass (single HTML snapshot).
- googleSearch — step-by-step Google search: query → organic links → open → extract → close tab → pagination.

## How This Beats a Typical Deep Research SaaS

With SaaS deep research, you usually see a progress bar and a final result. Here it's different:

## 1) You Keep Control

- The browser is real — not a "virtual screenshot black box".
- You can intervene: close a cookie banner, log in, confirm age, adjust a filter.
- You can restart a specific step: open the next link, re-extract content, change the wait strategy (SPA/JS rendering), scroll through infinite scroll before extraction.
- You can expand endlessly: "add 10 more sources", "double-check the numbers", "add a comparison table", "collect a list of alternatives", "follow the snowball of links".

Research becomes iterative. One failed step doesn't kill the whole process.

## 2) Artifacts and Reproducibility

Deep research isn't just the final text. It's also the evidence base:

- links.json — a stable snapshot of Google results across all queries (you can continue later without re-running searches).
- Downloaded pages in Markdown + images/ — sources stay on disk.
- insights.md — a cumulative draft (in the RESEARCH.md methodology, this is part of the process).

## 3) Local, No Extra Infrastructure

No containers, no remote browsers, no special platforms:

- npm install
- npx playwright install chrome
- npm run chrome (starts Chrome with CDP on port 9222)

Three commands — done. Details in INSTALLATION.md.

## Extensibility: Site Profiles Instead of Hardcoding

A common pain point in browser agents: selectors break. Sites change their markup, and the agent starts guessing. Here it's different — through site profiles:

- scripts/sites/*.json stores selectors and "controls" (which elements matter, what to do with them).
- Scripts return a site field, and the agent uses ready-made selectors without guessing.

Need to support a new site? Add a JSON profile. Google changed its markup? Edit scripts/sites/google-search.json instead of rewriting code.

## DeepResearch Bench: Why Numbers Matter

When everyone claims "we have the best deep research", you need an external benchmark. That benchmark is DeepResearch Bench (DRB) — 100 "PhD-level" tasks, two metrics (RACE/FACT), and a public evaluation methodology.

- Official DRB site: https://deepresearch-bench.github.io/
- Repository: https://github.com/Ayanami0730/deep_research_bench

## Numbers You Can Verify

On the official DRB page, in the Main Results section for the "Deep Research Agent" category (RACE overall):

- Gemini-2.5-Pro Deep Research: 48.88
- OpenAI Deep Research: 46.98
- Perplexity Deep Research: 42.25

And here's Agent Browser Workspace's result: 44.37 (RACE overall) — above Perplexity Deep Research, closer to OpenAI/Gemini, and running on Claude Haiku 4.5 (a model significantly cheaper than "frontier" stacks). Results have been submitted to the leaderboard and are under review.

## Why "44.37 on Haiku" Is More Than Just a Number

Most comparisons forget about cost and controllability. Here you win on three fronts at once:

1) Quality close to the top (DRB overall near OpenAI/Gemini).
2) Lower cost (Haiku-class models).
3) Control and reproducibility — on your machine, with real artifacts (links.json + downloaded sources).

## Try It in 5 Minutes

## 1) Install

```bash
npm install
npx playwright install chrome
```

## 2) Start a Local Chrome for the Agent

```bash
npm run chrome
```

## 3) Save Any Page as Markdown (with Images)

```bash
node scripts/getContent.js --url https://example.com --dir ./output --name page.md
```

## 4) Deep Research: Google → Open → Save Source

```bash
# Stable snapshot of search results (links.json)
node scripts/googleSearch.js "best AI newsletters 2026" --links --dir ./archive/my-research

# Open result 0 and save content
node scripts/googleSearch.js "best AI newsletters 2026" --open 0 --dir ./archive/my-research --name source-0.md
```

PDFs are supported too: if a .pdf shows up in the results, getContent/googleSearch automatically extract the text.

## When a Site Looks "Empty" (SPA, JS Rendering, Lazy-Load)

The classic failure of "fast" web scrapers: the HTML arrives, but there's no content. The project has an escalation path (details in AGENTS.md):

- gotoAndWaitForContent() — wait for the DOM to stabilize after JS rendering
- evaluate(() => document.body.innerText) — pull visible text directly
- scroll({ times: N }) — load lazy content or a feed
- screenshot({ fullPage: true }) — if text isn't accessible programmatically

The logic is simple: if the page matters, don't skip it — escalate the extraction level.

## Not Just Deep Research: Where Else It Helps

## 1) Product and Marketing Research

You can collect search results, lock down links.json, save 30–60 sources as Markdown, then ask the agent to "expand / compare / double-check / build a table" using local artifacts.

## 2) Web Routine Automation

Log in, click, download, fill out forms, take screenshots, save evidence — it's all here.

## 3) Form Discovery and Ready-Made Selectors

getForms finds forms and fields and returns ready-to-use CSS selectors. From there, the agent calls browser.fill() or browser.fillForm() without guessing.

## Why "Local + Observable" Is a Big Deal

Closed deep research products work well when you need a quick answer. But if you're working with research, business decisions, sources, verification, and iterations, you need a different mode:

- fix obstacles,
- restart steps,
- refine the report,
- keep going until the result is right.

That's what Agent Browser Workspace is about.

## Where to Read More

- GitHub: https://github.com/k-kolomeitsev/agent-browser-workspace
- Tool overview and working rules: AGENTS.md
- Installation and QOL instructions (profiles/shortcuts/checks): INSTALLATION.md
- Deep research methodology: RESEARCH.md

If you'd like to contribute to the open source project, here are the most impactful areas:

- new and improved site profiles in scripts/sites/
- better content extraction on tricky sites (SPA, paywall overlays, lazy rendering)
- smarter form and field detection rules

Made it this far? Star the repo — it's the best way to help the project grow.
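To make the site-profiles idea more concrete, here is what a minimal profile could look like. This is a hypothetical sketch: the field names (site, urlPattern, controls) are illustrative assumptions, not the actual schema used in scripts/sites/google-search.json.

```json
{
  "site": "example-search",
  "urlPattern": "https://example.com/search*",
  "controls": {
    "searchInput": "input[name='q']",
    "submitButton": "button[type='submit']",
    "organicResult": "div.result > a"
  }
}
```

The point is the separation of concerns: when the site's markup changes, you edit selectors in a JSON file instead of touching code or prompts.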
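The two googleSearch commands from the "Try It in 5 Minutes" section scale naturally to N sources. A minimal Node sketch, assuming only the CLI flags shown in this article (--links, --open, --dir, --name); the helper itself is hypothetical and not part of the repository:

```javascript
// Hypothetical helper: build the argv lists for a two-step research run —
// one snapshot command (--links), then one --open command per result.
function buildResearchCommands(query, dir, count) {
  const cmds = [
    ["node", "scripts/googleSearch.js", query, "--links", "--dir", dir],
  ];
  for (let i = 0; i < count; i++) {
    cmds.push([
      "node", "scripts/googleSearch.js", query,
      "--open", String(i),
      "--dir", dir,
      "--name", `source-${i}.md`,
    ]);
  }
  return cmds;
}

const cmds = buildResearchCommands("best AI newsletters 2026", "./archive/my-research", 3);
console.log(cmds.length); // 4: one snapshot + three opens
```

Each argv list could then be handed to child_process.execFileSync, and because links.json is a stable snapshot, a crashed run can resume at result i without re-searching.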
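The escalation path for "empty" pages (gotoAndWaitForContent → innerText → scroll → screenshot) is easy to express as a decision rule. A sketch under assumptions: the level names mirror the utils mentioned in this article, but this function and its 200-character threshold are illustrative, not repository code.

```javascript
// Hypothetical sketch of the escalation logic: given what the current
// extraction level returned, decide whether to stop or move up a level.
const LEVELS = [
  "gotoAndWaitForContent", // wait for the DOM to stabilize
  "evaluate-innerText",    // pull visible text directly
  "scroll",                // load lazy content, then re-extract
  "screenshot-fullPage",   // last resort: capture the rendered page
];

function nextExtractionLevel(levelIndex, extractedText) {
  // Enough real text extracted? Stop escalating.
  if (extractedText && extractedText.trim().length >= 200) return null;
  // Otherwise escalate to the next level, if one remains.
  return levelIndex + 1 < LEVELS.length ? LEVELS[levelIndex + 1] : null;
}

console.log(nextExtractionLevel(0, ""));  // "evaluate-innerText"
console.log(nextExtractionLevel(3, "")); // null — nothing left to try
```

The shape matters more than the threshold: an agent loops over levels until the content looks real, instead of silently saving an empty page.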
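Finally, the form-discovery workflow ("getForms returns selectors, then the agent fills without guessing") can be sketched as a small planning step. The field shape below is an assumption for illustration; it is not the actual output format of getForms.

```javascript
// Hypothetical sketch: turn a getForms-style result (fields with
// ready-made CSS selectors) into a list of fill actions that an agent
// could pass to something like browser.fill(selector, value).
function planFormFill(form, values) {
  return form.fields
    .filter((f) => f.name in values)
    .map((f) => ({ selector: f.selector, value: values[f.name] }));
}

// Example input, shaped like a discovered "auth" form.
const loginForm = {
  type: "auth",
  fields: [
    { name: "email", selector: "#login input[type='email']" },
    { name: "password", selector: "#login input[type='password']" },
  ],
};

const plan = planFormFill(loginForm, { email: "me@example.com", password: "hunter2" });
console.log(plan.length); // 2 fill actions, one per matched field
```

Because the selectors come from discovery rather than from the LLM's guesses, the fill step stays deterministic even when the model is small.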