XPost: alt.comp.freeware, rec.photo.digital
On Sun, 2/9/2025 6:20 PM, Marion wrote:
Don't read this line by line... but you might want to skim it quickly.
I'm new to AI where I realized AI can help me figure out what ffmpeg
commands to use when I need to slightly modify videos for posting.
Normally I ask here - and Paul gives me the answer! :)
But today, I shunned Paul in favor of my (new) good friend, Mr. AI!
The transcript below shows how AI usually gives the wrong answer at
first but you can hone that answer, little by little, to solve issues.
Here's what happened:
a. I needed to upload a video to Amazon Vine that was in two parts
b. So all I needed to do was concatenate two short videos I took
(same camera, same everything)
c. But the second video kept being rotated upside down (still is!)
In desperation, I asked for AI to help solve the problem...
Q: Hey AI. What is the Windows ffmpeg command to rotate a video 180 degrees clockwise
... snip session, to keep the response a bit shorter
I know this is frustrating, but I'm confident we can find the cause.
Please provide the information requested above, and we'll get to the
bottom of this.
It's up to you to decide what conversational format you use with the AI.
To start with, your query can open with a description of the inputs,
then the question, and then a series of constraint lines.
One of the properties we know of is that the models have limits on tokens,
and they seem to forget things easily while doing a symbolic manipulation.
One of the reasons you got as far as you did is that the quality assurance
stage likely kept cutting in and forcing it to go back and refine the
question, for each of your thirteen questions.
On the subject matter, FFMPEG is a garbage in garbage out tool :-)
I think you knew, before starting this exercise, that
FFMPEG is hit or miss on things. Or, at least, a human's ability
to guess exactly which parameters, once switched in, would give the
response you wanted, is hit or miss.
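
For reference, here is a minimal sketch of the two operations being discussed: a 180 degree rotation and a concatenation. It only builds the command lines (as Python lists) and prints them, it does not run FFMPEG. The file names are made up for illustration. The hflip/vflip filters and the concat demuxer are standard FFMPEG features, but whether this particular combination fixes the upside-down clip depends on the rotation metadata in each file, since FFMPEG may already auto-rotate on a re-encode.

```python
def rotate_180_cmd(src, dst):
    """Build an ffmpeg command that re-encodes the video rotated by 180 degrees."""
    return [
        "ffmpeg",
        "-i", src,
        "-vf", "hflip,vflip",  # horizontal flip + vertical flip == 180 degree rotation
        "-c:a", "copy",        # leave the audio stream untouched
        dst,
    ]


def concat_list_text(parts):
    """Text of the list file that ffmpeg's concat demuxer reads."""
    return "".join(f"file '{p}'\n" for p in parts)


def concat_cmd(list_file, dst):
    """Build an ffmpeg command that joins the listed files without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", dst]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(rotate_180_cmd("part2.mp4", "part2_fixed.mp4")))
    print(concat_list_text(["part1.mp4", "part2_fixed.mp4"]), end="")
    print(" ".join(concat_cmd("list.txt", "joined.mp4")))
```

The idea would be to fix the second clip first, then join the two, rather than hoping a single command line does both.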
The AI is probably right that it needs you to provide all
available metadata from each file, in order to correct
what is happening.
For example, one way of doing that would be to say "each video segment
was shot on an iPhone7 in RAW mode using the 20Mpixel front camera,
while holding the camera in tall formation, rather than wide formation".
The AI could then better guess at what metadata had been injected
into the video, by the camera. The AI would also have a better idea
that an iPhone shoots in High Profile and so on. This would reduce
some of the stabs in the dark it is making.
But my experience (not really a lot of questions) with the AI,
is you're damned if you do and damned if you don't. If you try to
"lead" the AI, by perhaps including the wrong kind of parameter
or construction as part of your input, the stupid thing will try
and make an answer that *includes* your guess. This is bad. On the
one hand, we don't want to pollute the problem space with
unnecessary observations. On the other hand, we do want to provide
enough color commentary, so it can guess better at what is wrong.
If we were to grab three Youtube videos and try and splice them
together, there's no guarantee they have all been reduced to some
clean baseline before we get them. Whereas if the AI knows all
three were shot with the same (named) camera, it will at least
know for example, that the camera automatically includes the
rotation metadata, as a function of how you held the camera and so on.
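
As a sketch of how that rotation metadata could be pulled out and pasted into the question: ffprobe can dump stream information as JSON. The helper names below are made up, and the sample JSON is hand-written for illustration, not real output from any camera. My assumption is that older builds report a "rotate" tag on the video stream while newer ones report a "rotation" field in the stream's side data, so the sketch checks both.

```python
import json


def ffprobe_cmd(path):
    """Build an ffprobe command that prints container and stream info as JSON."""
    return ["ffprobe", "-v", "error", "-print_format", "json",
            "-show_format", "-show_streams", path]


def video_rotation(probe):
    """Return the rotation (degrees) recorded for the first video stream, or None."""
    for stream in probe.get("streams", []):
        if stream.get("codec_type") != "video":
            continue
        # Assumption: older builds expose a "rotate" tag on the stream.
        tag = stream.get("tags", {}).get("rotate")
        if tag is not None:
            return int(tag)
        # Assumption: newer builds expose "rotation" in the side data list.
        for side in stream.get("side_data_list", []):
            if "rotation" in side:
                return int(side["rotation"])
        return None
    return None


# Hand-written sample, NOT real ffprobe output.
SAMPLE = json.loads("""
{"streams": [
  {"codec_type": "audio"},
  {"codec_type": "video", "tags": {"rotate": "180"}}
]}
""")

if __name__ == "__main__":
    print(" ".join(ffprobe_cmd("part2.mp4")))  # hypothetical file name
    print(video_rotation(SAMPLE))
```

Running the probe on both clips and comparing the two values would at least show whether the camera tagged them differently.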
But just again as a general comment, I expect every session with
an AI to go like this. It very much depends on "your own intelligence",
to turn the "story summarizer" into a "problem solver". It's not AGI,
it's not even remotely close to AGI.
Similar to a USENET thread, you'll notice how threads go to hell
as a function of missing details. The participants here are better
at guessing some things, but they will flounder (and sub-threads result),
when the answer is looking too broad.
My very first question to the AI illustrates this. As I was
sitting at the machine, I said to myself "got to avoid giving
unbounded questions! You know a thing like this will go crazy
if you do that". And silly me, one of my thoughts on what
the machine would have was "canned intro answers for noobs",
sort of like a user manual that says not to take it into
the bath with you. So I asked the machine:
What are your capabilities ?
I was expecting an answer such as "I summarize text", "I have a
primitive image drawing module for artwork", "I can do OCR if
you give me an image" and so on.
Instead, I got yards and yards of text until the limit timer
went off... and it erased all the text on the screen.
So this teaches you, in terms of computer languages that
have a "workspace" concept, like BASIC and APL, that as soon
as you step into the machine, you are "in the workspace". The
guard rails are gone. There is no user manual in there. It seems
to have the ability to tell the difference between "continuation
of previous chain" versus "new question". I expect the human
is providing enough hints for the machine to figure that out.
It doesn't apply a framework to anything it is doing. For example,
I don't see in your three questions, any reference at all by the
AI, as to what version of FFMPEG supports a certain parameter format.
It's interesting, that for you, the machine realizes it needs to
"gather samples and run them for its very self". Yet, if it
did that, I would expect there would be a token overflow. Even in the
data center, it has a 128K or 256K token limit (a token is less than
a word). On the DeepSeek distilled models, the limit is something
like 4K tokens. Then there's the Excel spreadsheet joke someone released,
which can accept about seven words of input or so. I would consider
a model to be "sufficiently capable", if you could give it the
URL of the Firefox tarball, and tell the machine to "rewrite that code".
Which is hundreds of megabytes of material :-)
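
To put rough numbers on that gap, here is a back-of-envelope sketch. The four-characters-per-token figure is only a rule of thumb I am assuming (it varies with the tokenizer and the text), "hundreds of megabytes" is taken as 300 MiB, and 128K is taken as 128*1024.

```python
CHARS_PER_TOKEN = 4  # assumption: rough rule of thumb, varies by tokenizer


def estimated_tokens(num_bytes, chars_per_token=CHARS_PER_TOKEN):
    """Very rough token estimate for a body of text of the given size."""
    return num_bytes // chars_per_token


def fits(num_bytes, limit_tokens):
    """True if the estimated token count is within the context limit."""
    return estimated_tokens(num_bytes) <= limit_tokens


SOURCE_BYTES = 300 * 1024 * 1024  # assumption: "hundreds of megabytes" as 300 MiB
LIMIT_128K = 128 * 1024           # assumption: 128K read as 128*1024 tokens

if __name__ == "__main__":
    tokens = estimated_tokens(SOURCE_BYTES)
    print(tokens)                 # 78643200
    print(tokens // LIMIT_128K)   # 600, i.e. about 600 times over the limit
    print(fits(SOURCE_BYTES, LIMIT_128K))  # False
```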
Summary: In my personal opinion, a conversational style is not appropriate.
For each question, open a copy of Notepad, provide a good description
of inputs, the one-liner question you've got, and then any constraints.
The constraints don't mean anything, and will quite likely be ignored.
"Work slowly and step by step." Meaningless stuff like that. It's
already got one of those in the prompt, at a guess. Then copy your
Notepad text, into the query box.
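
The Notepad layout described above could look something like this sketch. The function name and the section labels are my own invention, not anything the AI requires, and the example text is only loosely based on the problem in this thread.

```python
def build_prompt(inputs, question, constraints=()):
    """Assemble a one-shot query: inputs first, then the question, then constraints."""
    lines = ["Inputs:"]
    lines += [f"  {item}" for item in inputs]
    lines += ["", "Question:", f"  {question}"]
    if constraints:
        lines += ["", "Constraints:"]
        lines += [f"  {c}" for c in constraints]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example wording is illustrative only.
    print(build_prompt(
        inputs=[
            "Two short video clips from the same camera, same settings.",
            "After concatenation, the second clip plays upside down.",
        ],
        question="What Windows ffmpeg command joins the two clips "
                 "with both right side up?",
        constraints=["Give a single command line."],
    ), end="")
```

Keeping the three parts separate makes it easier to change one of them and re-paste the whole thing as a fresh question.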
Maybe some day, it will be able to accept your stub film segments and
run them for itself. But will it be able to recognize the second clip
is upside-down ?
Paul