Show HN: LLMpeg

github.com

157 points · jjcm · 6 days ago

Inspired by the "ffmpeg by examples" comments, here's a simple script that pulls it all together. Set your OpenAI API key env var and make the script executable, and you're golden.

79 comments
PaulKeeble · 2 days ago
FFMpeg is one of those tools that is really quite hard to use. The sheer surface area of the possible commands and options is incredible and then there is so much arcane knowledge around the right settings. Its defaults aren't very good and lead to poor quality output in a lot of cases and you can get some really weird errors when you combine certain settings. Its an amazingly capable tool but its equipped with every foot gun going.

Show replies

vunderba · 2 days ago
It's good that you have a "read" statement to force confirmation by the user of the command, but all it takes is one errant accidental enter to end up running arbitrary code returned from the LLM.

I'd constrain the tool to only run "ffmpeg" and extract the options/parameters from the LLM instead.

Show replies

minimaxir · 2 days ago
The system prompt may be a bit too simple, especially when using gpt-4o-mini as the base LLM that doesn't adhere to prompts well.

> You write ffmpeg commands based on the description from the user. You should only respond with a command line command for ffmpeg, never any additional text. All responses should be a single line without any line breaks.

I recently tried to get Claude 3.5 Sonnet to solve an FFmpeg problem (write a command to output 5 equally-time-spaced frames from a video) with some aggressive prompt engineering and while it seems internally consistent, I went down a rabbit hole trying to figure out why it didn't output anything, as the LLMs assume integer frames-per-second which is definitely not the case in the real world!

Show replies

davmar · 2 days ago
i think this type of interaction is the future in lots of areas. i can imagine we replace API's completely with a single endpoint where you hit it up with a description of what you want back. like, hit up 'news.ycombinator.com/api' with "give me all the highest rated submissions over the past week about LLMs". a server side LLM translates that to SQL, executes the query, returns the results.

this approach is broadly applicable to lots of domains just like FFMpeg. very very cool to see things moving in this direction.

Show replies

leobg · 1 days ago
This should be a terminal utility.

   xx ffmpeg video1.mp4 normalize audio without reencoding video to video2.mp4
And have sensible defaults. Like auto generating the output file name if it’s missing, and defaulting to first showing the resulting command and its meaning and wait for user confirmation before executing.

Show replies