Launch HN: Midship (YC S24) – Turn PDFs, docs, and images into usable data

51 points · maxmaio · 7 hours ago

Hey HN, we are Max, Kieran, and Aahel from Midship (https://midship.ai). Midship makes it easy to extract data from unstructured documents like PDFs and images.

Here’s a video showing it in action: https://www.loom.com/share/ae43b6abfcc24e5b82c87104339f2625?..., and a demo playground (no signup required!) to test it out: https://app.midship.ai/demo

We started 5 months ago, initially trying to make an AI natural language workflow builder that would be a simpler alternative to Zapier or Make.com. However, most of our users seemed to be much more interested in the basic (and not very good) document extraction feature we had. Seeing how people were spending hours a day manually extracting data from PDFs inspired us to build what has become Midship!

The problem is that despite all our progress in software, huge amounts of business data still live in PDFs and images. Sure, you can OCR them, but getting clean, structured data out is still painful. Most existing tools just give you a blob of Markdown, leaving you to figure out which parts matter and how they relate.

We've found that combining OCR with language models lets us do something more useful: extract specific fields and tables that users actually care about. The LLMs help correct OCR mistakes and understand context (like knowing that "Inv#" and "Invoice Number" mean the same thing).
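To make the field-matching idea concrete, here is a toy sketch. It is not Midship's pipeline: a hand-written alias table stands in for the language model, and the names `FIELD_ALIASES` and `extract_fields` are hypothetical, chosen only for illustration. It assumes the OCR output arrives as simple "label: value" lines.

```python
import re

# Hypothetical alias table: canonical field name -> label variants seen in documents.
# In a real system an LLM would infer these equivalences from context.
FIELD_ALIASES = {
    "invoice_number": ["inv#", "inv no", "invoice number", "invoice no"],
    "total": ["total", "amount due", "total due"],
}


def _norm(label: str) -> str:
    """Lowercase, drop periods and colons, and collapse whitespace."""
    label = re.sub(r"[.:]", "", label.lower())
    return re.sub(r"\s+", " ", label).strip()


# Reverse lookup: normalized label variant -> canonical field name.
ALIAS_TO_FIELD = {
    _norm(alias): field
    for field, aliases in FIELD_ALIASES.items()
    for alias in aliases
}


def extract_fields(ocr_lines):
    """Map 'label: value' OCR lines onto canonical field names.

    Lines without a colon, or with an unrecognized label, are skipped.
    """
    result = {}
    for line in ocr_lines:
        label, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'label: value' line
        field = ALIAS_TO_FIELD.get(_norm(label))
        if field is not None:
            result[field] = value.strip()
    return result


if __name__ == "__main__":
    print(extract_fields(["Inv#: 1042", "Amount Due: $250.00", "Thank you"]))
```

Both "Inv#: 1042" and "Invoice Number: 1042" end up under the same `invoice_number` key. A static table like this breaks on any label it has not seen, which is the gap a language model is meant to close.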

We have two main kinds of users today: non-technical users who extract data via our web app, and developers who use our extraction API. We were initially focused on the first group, as they seemed like an underserved part of the market, but we’ve received a lot of interest from developers who face the same issues.

For pricing, we currently charge a monthly SaaS fee per seat for the web app and volume-based pricing for the API.

We’re really excited to share what we’ve built so far and look forward to any feedback from the community!


31 comments
ctippett · 2 hours ago
Congrats on the launch. I just sent y'all an email – I'm curious what you can do with airline crew rosters.
monkeydust · 3 hours ago
Here's a real-world use case: our company has moved to a new pension provider. This provider, like the old one, sucks at providing me with a good way to navigate through the 120 funds I can invest in.

I want to create something that can paginate through 12 pages of HTML, perform clicks, download each fund's PDF factsheet, and extract data from the factsheet into Excel or CSV. Can this help? What's the best way to deal with the initial task of automating webpage interactions systematically?

serjester · 4 hours ago
Honest question, but how do you see your business being affected as foundation models improve? While I have massive complaints about them, Gemini + structured outputs is working remarkably well for this internally and it's only getting better. It's also an order of magnitude cheaper than anything I've seen commercially.

ivanvanderbyl · 5 hours ago
Congrats on the launch!

I’m curious to hear more about your pivot from AI workflow builder to document parsing. I can see correlations there, but that original idea seems like a much larger opportunity than parsing PDFs to tables in what is an already very crowded space. What verticals did you find have this problem specifically that gave you enough conviction to pivot?
