Show HN: Autotab – Programmable AI browser for turning web tasks into APIs

88 points · jonasnelle · 10 hours ago

Hey HN, we're Alexi and Jonas, the co-founders of Autotab (https://autotab.com). Autotab is a Chrome-based browser you can teach to do complex tasks, with a simple API for running them from your app or backend.

Here is a walkthrough of how it works: https://youtu.be/63co74JHy1k, and you can try it for free at https://autotab.com by downloading the app.

Why a dedicated editor?

The number one blocker we've found in building more flexible, agentic automations is, by far, performance quality (https://www.langchain.com/stateofaiagents#barriers-and-chall...). For all the talk of cost, latency, and safety, the fact is most people are still struggling just to get agents to work. The keys to reliability are better models, yes, but also better intent specification. Even humans don't zero-shot these tasks from a prompt: they need to be shown how to perform them, and then they improve through question-asking and feedback over time. It is also quite difficult to formulate complete requirements on the spot, from memory.

The editor makes it easy to build the specification up as you step through your workflow, while generating successful task trajectories for the model. This is the only way we've been able to get the reliability we need for production use cases.

But why build a browser?

Autotab started as a Chrome extension (with a Show HN post! https://news.ycombinator.com/item?id=37943931). As we iterated with users, we realized that we needed to focus on creating the control surface for intent specification, and that being stuck in a Chrome side panel wasn't going to work. We also knew that we needed a level of control for the model that we couldn't get without owning the browser. In Autotab, the browser becomes a canvas on which the user and the model take turns showing and explaining the task.

Key features:

1. Self-healing automations that don't break when sites change

2. Dedicated authoring tool that builds memory for the model while defining steps for the automation

3. Control flows and deep configurability to keep automations on track, even when navigating complex reasoning tasks

4. Works with any website (no site-specific APIs needed)

5. Runs securely in the cloud or locally

6. Simple REST API + client libraries for Python, Node

We'd love to get any early feedback from the HN community, ideas for where you'd like the product to go, or experiences in this space. We will be in the comments for the next few hours to respond!

39 comments

thedays · 1 hour ago

Is Autotab able to scrape data from multiple websites with different structures and combine it into structured data in one CSV or JSON file? Example: scrape the interest rates offered on savings accounts from multiple bank websites, extract the bank name, bank logo, product name, and interest rate for each account, and run this saved query on a regular schedule (daily, weekly, etc.).
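To make the target output concrete, here is a minimal sketch of the "combine into one file" half of that question. It assumes each site's extraction already yields a flat dict of strings; the `SavingsRate` schema, the `FIELD_ALIASES` key names, and the helper functions are all hypothetical and say nothing about how Autotab itself works.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class SavingsRate:
    """Shared schema every bank's record is normalized into (hypothetical)."""
    bank_name: str
    bank_logo_url: str
    product_name: str
    interest_rate_pct: float


# Hypothetical per-site key names, mapped onto the shared schema.
FIELD_ALIASES = {
    "bank_name": ("bank_name", "bank", "institution"),
    "bank_logo_url": ("bank_logo_url", "logo", "logo_url"),
    "product_name": ("product_name", "product", "account"),
    "interest_rate_pct": ("interest_rate_pct", "rate", "apy"),
}


def parse_rate(text: str) -> float:
    """Turn strings like '4.25% APY' or ' 3.9 % ' into a float percentage."""
    return float(text.split("%")[0].strip())


def normalize(raw: dict[str, str]) -> SavingsRate:
    """Map one site's differently named fields onto the shared schema."""
    values = {}
    for field, aliases in FIELD_ALIASES.items():
        key = next((alias for alias in aliases if alias in raw), None)
        if key is None:
            raise KeyError(f"no value for {field!r} in {sorted(raw)}")
        values[field] = raw[key]
    values["interest_rate_pct"] = parse_rate(values["interest_rate_pct"])
    return SavingsRate(**values)


def to_csv(rows: list[SavingsRate]) -> str:
    """Serialize normalized rows into a single CSV document with a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(SavingsRate)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()
```

The scheduling part (daily, weekly) would sit outside this, for example in cron or whatever job runner you already use.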

pugio · 7 hours ago

I love the idea - owning the browser definitely seems like the right approach.

I tried it out on a workflow I've been manually piecing together and it gave me a bunch of "Error encountered, contact support" messages when doing things like clicking on a form input field, or even a button.

The more complex "Instruction" block did work correctly (literally things like "click the 'Sign In' button"), but then I ran out of the 5 minutes of free run time while trying to go through the full flow. I expect this kind of thing will be fixed soon, as the product matures.

In terms of ultimate utility, what I really want is something that can export scripts that run entirely locally, but fall back to the more dynamic, AI-enhanced version when an error is encountered. I would want Autotab to generate the workflow, which I could then run on my own hardware in bulk.
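A minimal sketch of that local-first, AI-fallback pattern, assuming each exported step is a plain callable that raises when it cannot complete. `Step`, `StepFailed`, `run_workflow`, and the `ai_fallback` callback are hypothetical names for illustration only; none of them come from Autotab.

```python
from dataclasses import dataclass
from typing import Callable


class StepFailed(Exception):
    """Raised when a deterministic step cannot complete (e.g. a selector is missing)."""


@dataclass
class Step:
    name: str
    run_local: Callable[[], str]  # fast, deterministic version of the step


def run_workflow(
    steps: list[Step],
    ai_fallback: Callable[[Step, Exception], str],
) -> list[tuple[str, str, str]]:
    """Run each step locally; hand it to the AI fallback only if it fails.

    Returns (step name, path taken, result) for every step, so a bulk run
    can report how often the slower fallback was needed.
    """
    results = []
    for step in steps:
        try:
            results.append((step.name, "local", step.run_local()))
        except StepFailed as error:
            # Only the failing step is delegated; later steps still try local first.
            results.append((step.name, "ai", ai_fallback(step, error)))
    return results
```

Recording which path each step took is a deliberate choice here: a step that keeps landing on the fallback is a hint that its local script needs regenerating.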

Anyway, great work! This is definitely the best implementation I've seen of that glimpsed future of capable AI web browsing agents.

adamkhakhar · 3 hours ago

This is awesome! What is your most common use case? Have you thought of competing with https://scribehow.com/ in the documentation space?

MattDaEskimo · 9 hours ago

Very neat in theory but I'm failing to find any technical details.

At which layer is the automation happening? Inside the browser, using DevTools? Multiple layers?

What is the self-healing mechanic? I'm guessing invoking an LLM to find what happened and fix it?

I guess what I'm wondering is: is this some sort of hybrid between computer use and DevTools usage?

diegolazcano · 5 hours ago

This is awesome. I was just trying to build a rudimentary version of this for some data extraction that is heavy on "user" interaction. Definitely giving it a try.

For a case with lots of requests, how does Autotab handle IP blocking? Does each run use a different portal instance?
