OpenAI crawler burning money for nothing
12 points · babuskov
My blog posts have URLs like:
https://mywebsite/1-post-title
https://mywebsite/2-post-title-second
https://mywebsite/3-post-title-third
https://mywebsite/4-etc
For some reason, it tries every combination of numbers, so the requests look like this:
https://mywebsite/1-post-title/2-post-title-second
https://mywebsite/1-post-title/3-post-title-third
etc.
Since the blog engine simply discards everything after the leading number (1, 2, 3, ...) and serves the content for blog post #1, #2, #3, ..., the web server returns a valid page for every one of these requests. However, all those pages are identical.
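The arithmetic works out exactly: one request per real post, plus one per ordered pair of distinct slugs, gives 90,000 fetches for a 300-post blog. A quick sketch (the slugs here are made up for illustration):

```python
# Why pairing every post slug with every other slug explodes quadratically.
slugs = [f"{i}-post-title" for i in range(1, 301)]  # 300 posts, made-up slugs

single = len(slugs)                       # 300 legitimate post URLs
compound = len(slugs) * (len(slugs) - 1)  # 89,700 bogus /a-slug/b-slug URLs
total = single + compound                 # 90,000 requests for a 300-post blog
print(total)  # → 90000
```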
The main problem here is that no page on the website actually contains compound links like https://mywebsite/1-post-title/2-post-title-second
So it's clearly a bug in the crawler.
Maybe OpenAI is using AI-generated code for their crawler, because it has bugs so dumb you cannot believe any human would write them.
They will make 90,000 requests to crawl my small blog of 300 posts.
I cannot imagine what happens to larger websites with thousands of blog posts.
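One server-side mitigation, sketched here against a hypothetical blog engine (the `resolve` helper, the `POSTS` table, and the URL scheme are all assumptions, not the poster's actual setup): instead of silently serving post N for any path that merely starts with N's number, answer URLs with trailing junk with a 301 redirect to the canonical post URL, so a crawler converges on one URL per post instead of crawling every combination.

```python
# Hypothetical sketch: route only canonical post URLs, and 301-redirect
# anything with extra trailing segments instead of serving duplicate content.
import re

POSTS = {1: "1-post-title", 2: "2-post-title-second"}  # id -> canonical slug

def resolve(path):
    """Return ('ok', url) for a canonical URL, ('redirect', url) for a
    URL with junk appended, or ('404', None) for anything else."""
    m = re.match(r"^/(\d+)-[^/]*", path)
    if not m:
        return ("404", None)
    post_id = int(m.group(1))
    if post_id not in POSTS:
        return ("404", None)
    canonical = "/" + POSTS[post_id]
    if path == canonical:
        return ("ok", canonical)
    return ("redirect", canonical)  # a 301 here tells crawlers the real URL

print(resolve("/1-post-title/2-post-title-second"))  # → ('redirect', '/1-post-title')
```

A plain 404 for the compound paths would also stop the duplicate crawling; the redirect has the advantage of preserving any links outsiders may have mangled.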
codemusings ·22 hours ago
It's clear they've all gone mad. The traffic spiked 400% overnight and made the CMS unresponsive a few times a day.
readyplayernull ·1 day ago
https://news.ycombinator.com/item?id=42660377