I don't quite understand the need for "reset" buttons on products.
That function is always available by cycling power -- even for devices
where that is difficult for the user (e.g., PoE, BBU, etc.)
Shouldn't a device be able to get itself out of a "pickle" without
requiring the user to intervene? Particularly devices that are
intended to "run forever"?
I.e., it seems like the presence of a reset button is a tacit admission
that the engineering is "lacking"...
Even the earliest microprocessors had a reset pin. At power-up the state of the electronics is unknown, so a short time after power is applied the reset line is triggered by a timer (a 555 or whatever).
Then, there are many designs where you can not pull power, because there is an
unreachable battery.
Then, it is impossible to guarantee that the device will never find itself in a
pickle. No matter how fantastic the designers are.
I don't quite understand the need for "reset" buttons on products.
That function is always available by cycling power -- even for devices
where that is difficult for the user (e.g., PoE, BBU, etc.)
Shouldn't a device be able to get itself out of a "pickle" without
requiring the user to intervene? Particularly devices that are
intended to "run forever"?
I.e., it seems like the presence of a reset button is a tacit admission
that the engineering is "lacking"...
Sometimes it is the user that is lacking, such as forgetting a
password, or putting the device in the wrong mode and not being able
to remember how to get it out of that mode.
With so many lines of code it is impossible to check every
possible combination of things that could go wrong in a reasonable time.
There was never much of a chance of it, but in the early days bits could be randomly changed by stray radiation.
Don Y <[email protected]d> wrote:
I don't quite understand the need for "reset" buttons on products.
That function is always available by cycling power -- even for devices
where that is difficult for the user (e.g., PoE, BBU, etc.)
Shouldn't a device be able to get itself out of a "pickle" without
requiring the user to intervene? Particularly devices that are
intended to "run forever"?
I.e., it seems like the presence of a reset button is a tacit admission
that the engineering is "lacking"...
Nowadays 'reset' often means 'reset to factory settings' rather than 'reboot'. The factory settings reset is needed because maybe you forgot the password and have no other way to reconfigure the thing. Or you need to
make it go back into the initial pairing mode so you can attach it to
another network/etc.
Having a physical factory reset button means that somebody with physical access can always regain access to it. You can also use it to verify destructive actions (eg 'to wipe all the data, now hold the button') so that they can't be done remotely.
In practical terms such a button might just be a GPIO rather than wired to a reset line, and the software pays attention to it at certain times such as during boot.
For more developer-focused devices, a true reset button is also better than yanking the power cord which can cause wear on the connectors. So when version 497 of your code crashes you can hit the button and upload v498.
Usually the "reset" button is really "factory reset", which is intended
to revert persistent state back to known values. A typical use for this,
that absolutely must not be done on a power cycle, is setting a login password to a known value ("1234", or some more secure unique string
printed on a label if the designer is sensible).
As a convenience (for who?), a quick press of the reset button usually
just does a POR, and you need to hold the button for some length of time
to reset the configuration. I think this has just become a convention
that every device seems to follow.
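The short-press/long-press convention described above can be sketched as a small decision function. This is only an illustration: the 10-second threshold and the action names are assumptions, not values taken from any particular device.

```python
def classify_press(held_seconds: float, long_press_seconds: float = 10.0) -> str:
    """Map how long the button was held to an action name.

    The threshold and the action names are hypothetical; real products
    choose their own values.
    """
    if held_seconds <= 0:
        return "ignore"          # no press registered
    if held_seconds < long_press_seconds:
        return "reboot"          # quick press: plain restart, settings kept
    return "factory_reset"       # long hold: wipe persistent configuration
```

Keeping the destructive action behind the long hold means an accidental bump of the button costs the user a reboot at worst.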
Yes, a lot of complex devices that really should be able to run forever sometimes can't, and lock up. I blame lack of understanding of the whole system by the programmers (and it is usually software that is at fault, rarely the hardware). It's going to get worse with completely clueless
people throwing together something that barely works with the help of automated idiots.
I write microcontroller code in assembler that runs 24x7x365x∞ (that's "infinity", for the UTF8-challenged :)
On 24/05/2025 23:34, Don Y wrote:
I don't quite understand the need for "reset" buttons on products.
There are things like routers and mobile phones where there is a very significant difference between power on/off and a hard factory reset. The former recovers it from having crashed internally whilst the latter trashes all
previous settings into oblivion.
That function is always available by cycling power -- even for devices
where that is difficult for the user (e.g., PoE, BBU, etc.)
Not always available - there are quite a few different levels of reset too. I recall one particularly annoying one on an early Android device that required holding the on/off button and volume down for 4 minutes. It did do a hard factory reset on about the fifth attempt, the previous four having failed because my fingers slipped. One minor annoyance was that the very hard reset put it into Chinese language mode.
ISTR the ordinary soft factory reset was about 10s holding the magic buttons in
(but didn't work on this unit).
Shouldn't a device be able to get itself out of a "pickle" without
requiring the user to intervene? Particularly devices that are
intended to "run forever"?
In an ideal world yes. But I have seen such devices in a state where the only process still running was the one pressing the dead man's handle to say that everything is OK. Many routers tend to go haywire after a continuous uptime of
about 2 or 3 months having fragmented their stack.
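One way to avoid the "only the watchdog kicker is still alive" failure is to let the kicker run only when every monitored task has checked in recently. The following is a minimal sketch under that assumption; the class name, task names and timeout are all hypothetical.

```python
import time


class Watchdog:
    """Toy supervisor: kick the hardware watchdog only if all tasks are alive."""

    def __init__(self, task_names, timeout_s, clock=time.monotonic):
        self._clock = clock
        self._timeout_s = timeout_s
        now = clock()
        # Every task starts out "seen" at construction time.
        self._last_seen = {name: now for name in task_names}

    def check_in(self, name):
        """Called by a task to prove it is still making progress."""
        self._last_seen[name] = self._clock()

    def stalled_tasks(self):
        """Names of tasks that have not checked in within the timeout."""
        now = self._clock()
        return sorted(
            name for name, seen in self._last_seen.items()
            if now - seen > self._timeout_s
        )

    def should_kick(self):
        """True only when no task is stalled."""
        return not self.stalled_tasks()
```

If `should_kick()` returns False the kicker simply stops servicing the hardware watchdog, and the resulting reset is the device getting itself out of the pickle.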
I.e., it seems like the presence of a reset button is a tacit admission
that the engineering is "lacking"...
It could also be to reset to a known state. Several of the more annoying gadgets have one time programmability from a PC but to make them secure the moment you make them active all further communication is impossible.
The only way to reprogram is factory reset and start again from scratch. User interface designed by someone who really enjoyed the maze in Zork.
(some TV tuning menus fall into this category)
On 5/24/2025 5:37 PM, Carlos E. R. wrote:
Then, there are many designs where you can not pull power, because
there is an unreachable battery.
So, the battery is inaccessible but a reset button wouldn't be?
All the more reason NOT to need a reset button!
Then, it is impossible to guarantee that the device will never find
itself in a pickle. No matter how fantastic the designers are.
Barring hardware failures, software should be able to sort itself out,
On 2025-05-25 00:34, Don Y wrote:
I don't quite understand the need for "reset" buttons on products.
That function is always available by cycling power -- even for devices
where that is difficult for the user (e.g., PoE, BBU, etc.)
Shouldn't a device be able to get itself out of a "pickle" without
requiring the user to intervene? Particularly devices that are
intended to "run forever"?
I.e., it seems like the presence of a reset button is a tacit admission
that the engineering is "lacking"...
Even the initial microprocessors have a reset pin. When they are powered
up, the status of the electronics is unknown, so a small time after
power up, the line is triggered by a timer (555 or whatever).
Then, there are many designs where you can not pull power, because there
is an unreachable battery.
Then, it is impossible to guarantee that the device will never find
itself in a pickle. No matter how fantastic the designers are.
While not really much of a chance but in the early days bits could be randomly changed by stray radiation.
And cycling power won't suffice?
Not always available - there are quite a few different levels of reset
too. I recall one particularly annoying one on an early Android device
that required holding the on/off button and volume down for 4 minutes.
It did do a hard factory reset on about the fifth attempt. The previous
four having failed because my fingers slipped. One minor annoyance was
that the very hard reset put it into Chinese language mode.
On 2025-05-25 07:58, Don Y wrote:
And cycling power won't suffice?
Some designs preserve the status on power cycle. Intentionally.
On 2025-05-25 05:47, Ralph Mowery wrote:
While not really much of a chance but in the early days bits could be
randomly changed by stray radiation.
And today too. The chance is even bigger now, with lower logic voltages. Chips are not
shielded.
On 2025-05-25 04:01, Don Y wrote:
On 5/24/2025 5:37 PM, Carlos E. R. wrote:
...
Then, there are many designs where you can not pull power, because there is
an unreachable battery.
So, the battery is inaccessible but a reset button wouldn't be?
All the more reason NOT to need a reset button!
Think phones.
Then, it is impossible to guarantee that the device will never find itself
in a pickle. No matter how fantastic the designers are.
Barring hardware failures, software should be able to sort itself out,
Should.
Actually, nobody warrants this. Oh, wait, I remember a company that did warrant it, for code of limited size, and at huge expense. They did the core software on ATMs or the servers behind them, I don't remember which.
Exactly. I recall a customer wanting us to verify all possible paths
through a bit of air traffic control radar software, about 100,000
lines of plain C. Roughly one in five executable lines was an IF
statement, which is 20,000 IF statements. So there are 2^20000 ≈
10^6020 such paths.
The testing campaign will have only scratched the surface when the Sun
runs out of hydrogen and goes supernova. Tomorrow's problem.
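The arithmetic behind that estimate is easy to check with logarithms, since 2^n = 10^(n * log10 2). The sketch below treats every IF as independent, which is the worst case; the test rate of one billion paths per second is purely an assumption for illustration.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def log10_path_bound(independent_branches: int) -> float:
    """log10 of 2**n, the path count if every IF were independent."""
    return independent_branches * math.log10(2)


def log10_years_to_test(independent_branches: int, tests_per_second: float) -> float:
    """log10 of the years needed to run every path once at the given rate."""
    return (
        log10_path_bound(independent_branches)
        - math.log10(tests_per_second)
        - math.log10(SECONDS_PER_YEAR)
    )
```

For 20,000 branches `log10_path_bound` gives roughly 6020.6, and even at the assumed 10^9 tests per second the campaign would need on the order of 10^6004 years.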
On 5/25/2025 12:33 PM, Joe Gwinn wrote:
Exactly. I recall a customer wanting us to verify all possible paths
through a bit of air traffic control radar software, about 100,000
lines of plain C. Roughly one in five executable line was an IF
statement, which is 20,000 IF statements. So there are 2^20000 =
10^6020 such paths.
And probably 99.9% of them are superfluous.
Experience teaches you to construct your code so that testability is
enhanced. Instead of waiting until it seems to be "done" and then
trying to reassure yourself that it works as intended -- usually by
throwing EXPECTED conditions at it and hoping for the expected
results (that's not testing).
You need 2^20000 if there are 2^20000 distinct outcomes (leaf nodes) in
your code. I strongly doubt that to be the case.
The testing campaign will have only scratched the surface when the Sun
runs out of hydrogen and goes supernova. Tomorrow's problem.
How do you test an electronic circuit? Let's impose an infinite number of
discrete voltages on each of the input signals and verify the correct
outputs for each? (Do you deliberately verify all ranges of signal values
and frequencies? Or, just say "operation outside of these conditions is
indeterminate"?)
On Tue, 27 May 2025 14:13:02 -0700, Don Y
<[email protected]d> wrote:
On 5/25/2025 12:33 PM, Joe Gwinn wrote:
Exactly. I recall a customer wanting us to verify all possible paths
through a bit of air traffic control radar software, about 100,000
lines of plain C. Roughly one in five executable line was an IF
statement, which is 20,000 IF statements. So there are 2^20000 =
10^6020 such paths.
And probably 99.9% of them are superfluous.
[snip]
The problem is that you have no way to know which cases are
irrelevant. And practical hardware will have many things able to
retain state.
Experience teaches you to construct your code so that testability is
enhanced. Instead of waiting until it seems to be "done" and then
trying to reassure yourself that it works as intended -- usually by
throwing EXPECTED conditions at it and hoping for the expected
results (that's not testing).
It is not usually the expected that causes trouble: It ain't what you
don't know that matters, it's what you know that ain't so that's the
problem.
You need 2^20000 if there are 2^20000 distinct outcomes (leaf nodes) in
your code. I strongly doubt that to be the case.
So assume only 1000 IF statements, so it's 2^1000, or about 10^301.
You'll still run out of lifetime.
The testing campaign will have only scratched the surface when the Sun
runs out of hydrogen and goes supernova. Tomorrow's problem.
How do you test an electronic circuit? Let's impose an infinite number of
discrete voltages on each of the input signals and verify the correct
outputs for each? (Do you deliberately verify all ranges of signal values
and frequencies? Or, just say "operation outside of these conditions is
indeterminate"?)
You do all the tests for required behavior - does it meet stated requirements.
Then you random probe it for weeks and see what goes Bang!
One form of this is Fuzzing.
<https://www.usenix.org/conference/usenixsecurity22/presentation/trippel>
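The "random probe it and see what goes Bang" idea can be sketched in a few lines. This is a toy, not the method from the linked presentation: `parse_record` is a hypothetical target with a deliberately planted bug, and the harness treats `ValueError` as the only documented way to reject input.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical target: the first byte gives the payload length."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    # Deliberate bug for the demo: the last payload byte is read without
    # first checking that the payload is as long as the header claims.
    _last = payload[length - 1]
    return payload


def fuzz(target, iterations: int, seed: int = 0, max_len: int = 8):
    """Feed random byte strings to target; collect unexpected exceptions."""
    rng = random.Random(seed)  # seeded so that any crash is reproducible
    crashes = []
    for _ in range(iterations):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass  # documented rejection, not a crash
        except Exception as exc:
            crashes.append((data, type(exc).__name__))
    return crashes
```

Calling `fuzz(parse_record, 1000)` returns the inputs that triggered the planted `IndexError`; each one is a ready-made regression test.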
On Sun, 25 May 2025 02:37:09 +0200, "Carlos E. R."
<[email protected]d> wrote:
On 2025-05-25 00:34, Don Y wrote:
I don't quite understand the need for "reset" buttons on products.
That function is always available by cycling power -- even for devices
where that is difficult for the user (e.g., PoE, BBU, etc.)
Shouldn't a device be able to get itself out of a "pickle" without
requiring the user to intervene? Particularly devices that are
intended to "run forever"?
I.e., it seems like the presence of a reset button is a tacit admission
that the engineering is "lacking"...
Even the initial microprocessors have a reset pin. When they are powered
up, the status of the electronics is unknown, so a small time after
power up, the line is triggered by a timer (555 or whatever).
Then, there are many designs where you can not pull power, because there
is an unreachable battery.
Then, it is impossible to guarantee that the device will never find
itself in a pickle. No matter how fantastic the designers are.
Exactly. I recall a customer wanting us to verify all possible paths
through a bit of air traffic control radar software, about 100,000
lines of plain C. Roughly one in five executable line was an IF
statement, which is 20,000 IF statements. So there are 2^20000 =
10^6020 such paths.
The testing campaign will have only scratched the surface when the Sun
runs out of hydrogen and goes supernova. Tomorrow's problem.
Exactly. I recall a customer wanting us to verify all possible paths
through a bit of air traffic control radar software, about 100,000
lines of plain C. Roughly one in five executable line was an IF
statement, which is 20,000 IF statements. So there are 2^20000 =
10^6020 such paths.
Executing each path in the code at least once is a much more tractable problem.
McCabe's CCI metric will tell you how many test vectors it will take for a given complexity of code. And it should be done.
A sufficiently high CCI index for a routine also means that such code is highly
unlikely to be correct.
I recall one instance on a mainframe (brand withheld to protect the guilty) where a rogue program that ran continuously was slowly using up IO handles repeatedly opening the tracker ball interface each time a new user accessed it
(and never letting go).
One day after a particularly long uptime it completely ran out of IO handles. Guess what the first thing the error handler tried to do?
Yup! It tried to obtain a new IO handle to report the error!
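One common defence against that kind of failure, offered here as an assumption rather than anything the original system did, is to set aside the resources the error path needs before they can be exhausted. A toy model with hypothetical names:

```python
class HandlePool:
    """Toy pool of I/O handles with some held back for error reporting."""

    def __init__(self, total: int, reserved_for_errors: int = 1):
        if reserved_for_errors > total:
            raise ValueError("cannot reserve more handles than exist")
        self._free = total - reserved_for_errors
        self._reserved = reserved_for_errors

    def acquire(self) -> bool:
        """Ordinary allocation; fails once the general pool is empty."""
        if self._free == 0:
            return False
        self._free -= 1
        return True

    def acquire_for_error_report(self) -> bool:
        """Allocation from the reserve, for the error handler only."""
        if self._reserved == 0:
            return False
        self._reserved -= 1
        return True

    def release_error_handle(self) -> None:
        """Return an error-reporting handle to the reserve."""
        self._reserved += 1
```

With the reserve in place, a leak that drains the general pool still leaves the error handler able to say so.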
The testing campaign will have only scratched the surface when the Sun
runs out of hydrogen and goes supernova. Tomorrow's problem.
You can't test every possible combination of events, but you can make sure that every code path, executed in at least one scenario, doesn't do anything horribly bad. A lot of faults can lurk in the rarely used error recovery code that only gets used after something else has gone wrong.
Ariane 5 was an example of that sort of thing.
comp.risks is littered with them.
On Tue, 27 May 2025 14:13:02 -0700, Don Y
<[email protected]d> wrote:
On 5/25/2025 12:33 PM, Joe Gwinn wrote:[snip]
Exactly. I recall a customer wanting us to verify all possible paths
through a bit of air traffic control radar software, about 100,000
lines of plain C. Roughly one in five executable line was an IF
statement, which is 20,000 IF statements. So there are 2^20000 =
10^6020 such paths.
And probably 99.9% of them are superfluous.
The problem is that you have no way to know which cases are
irrelevant. And practical hardware will have many things able to
retain state.
On 28/05/2025 01:13, Joe Gwinn wrote:
On Tue, 27 May 2025 14:13:02 -0700, Don Y
<[email protected]d> wrote:
On 5/25/2025 12:33 PM, Joe Gwinn wrote:[snip]
Exactly. I recall a customer wanting us to verify all possible paths
through a bit of air traffic control radar software, about 100,000
lines of plain C. Roughly one in five executable line was an IF
statement, which is 20,000 IF statements. So there are 2^20000 =
10^6020 such paths.
And probably 99.9% of them are superfluous.
The problem is that you have no way to know which cases are
irrelevant. And practical hardware will have many things able to
retain state.
A concept you are looking for here is "cyclomatic complexity" :
<https://en.wikipedia.org/wiki/Cyclomatic_complexity>
Conditionals are not independent (in most cases) - thus the number of
paths through code is not simply two to the power of the number of if
statements.
(McCabe cyclomatic complexity measures are not the only way to look at
this, and like anything else, the measure has its advantages and
disadvantages, and is not suitable for everything. But it's a
reasonable place to start.)
The way you handle complexity in software is exactly the same as any
other complexity - you break things down into manageable parts. For
software, that can be libraries, modules, files, classes, functions, and
blocks. You specify things from the top down, and test them from the
bottom up. How much testing you do, and how you do it, is going to
depend on the application - an air traffic control system will need more
thorough testing than a mobile phone game.
When you are looking at a function, the cyclomatic complexity can be
calculated by tools. It will then give you a good idea of how much
testing you need for the function. Too high a complexity, and you will
never be able to test the function to get a solid idea of its
correctness, as you would need too many test cases for practicality.
(You may still be able to use other analysis methods.) The complexity
can also give a good indication of how easy it is for humans to
understand the function and judge its correctness. (Again, no one
measure gives a complete picture.)
Other tools that can be useful in testing are code coverage tools - you
can check that your test setups check all paths through the code.
But remember that testing cannot prove the absence of bugs - only their
presence. And it only works on the assumption that the hardware is
correct - even when the software is perfect, you might still need that
reset button or a watchdog!
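As a rough illustration of what such a tool computes, the sketch below counts decision points in Python source and adds one, a common approximation of McCabe's number for a single function. The choice of node types is an assumption; real tools handle more constructs and differ in the details.

```python
import ast

# Node types treated as one decision point each (an approximation).
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: decision points plus one."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, _DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra short-circuit branches.
            decisions += len(node.values) - 1
    return decisions + 1
```

A function with n sequential, independent `if` statements scores n + 1 on this measure, while the number of distinct paths through it can be as high as 2^n, which is why the per-function count is a far more tractable testing target than the full path count.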
On Wed, 28 May 2025 14:41:56 +0200, David Brown
<[email protected]> wrote:
On 28/05/2025 01:13, Joe Gwinn wrote:
On Tue, 27 May 2025 14:13:02 -0700, Don Y
<[email protected]d> wrote:
On 5/25/2025 12:33 PM, Joe Gwinn wrote:[snip]
Exactly. I recall a customer wanting us to verify all possible paths
through a bit of air traffic control radar software, about 100,000
lines of plain C. Roughly one in five executable line was an IF
statement, which is 20,000 IF statements. So there are 2^20000 =
10^6020 such paths.
And probably 99.9% of them are superfluous.
The problem is that you have no way to know which cases are
irrelevant. And practical hardware will have many things able to
retain state.
A concept you are looking for here is "cyclomatic complexity" :
<https://en.wikipedia.org/wiki/Cyclomatic_complexity>
Conditionals are not independent (in most cases) - thus the number of
paths through code is not simply two to the power of the number of if
statements.
True, but it's the bounding case.
(McCabe cyclomatic complexity measures are not the only way to look at
this, and like anything else, the measure has its advantages and
disadvantages, and is not suitable for everything. But it's a
reasonable place to start.)
The way you handle complexity in software is exactly the same as any
other complexity - you break things down into manageable parts. For
software, that can be libraries, modules, files, classes, functions, and
blocks. You specify things from the top down, and test them from the
bottom up. How much testing you do, and how you do it, is going to
depend on the application - an air traffic control system will need more
thorough testing than a mobile phone game.
When you are looking at a function, the cyclomatic complexity can be
calculated by tools. It will then give you a good idea of how much
testing you need for the function. Too high a complexity, and you will
never be able to test the function to get a solid idea of its
correctness, as you would need too many test cases for practicality.
(You may still be able to use other analysis methods.) The complexity
can also give a good indication of how easy it is for humans to
understand the function and judge its correctness. (Again, no one
measure gives a complete picture.)
I recall those days. Some managers thought they could simply decree that
no module could have a complexity (computed in various ways) exceeding
some arbitrary limit. The problem was that real-world problems are
vastly more complex, so the decree caused atomization of the inherent
complexity into a bazillion tiny modules, hiding the structure and
imposing large added processing overheads from traversing all those
inter-module interfaces.
Other tools that can be useful in testing are code coverage tools - you
can check that your test setups check all paths through the code.
We still do this, but the limitation is that all such tools yield far
more false alarms than valid hits, so all hits must be manually
verified.
But remember that testing cannot prove the absence of bugs - only their
presence. And it only works on the assumption that the hardware is
correct - even when the software is perfect, you might still need that
reset button or a watchdog!
Absolutely. This was true in the days of uniprocessors with one
megahertz clocks and kilobyte memories. Now it's hundreds of
processors with multi-gigahertz clocks and terabyte physical memories.
On 28/05/2025 18:07, Joe Gwinn wrote:
On Wed, 28 May 2025 14:41:56 +0200, David Brown
<[email protected]> wrote:
On 28/05/2025 01:13, Joe Gwinn wrote:
On Tue, 27 May 2025 14:13:02 -0700, Don Y
<[email protected]d> wrote:
On 5/25/2025 12:33 PM, Joe Gwinn wrote:[snip]
Exactly. I recall a customer wanting us to verify all possible paths
through a bit of air traffic control radar software, about 100,000
lines of plain C. Roughly one in five executable line was an IF
statement, which is 20,000 IF statements. So there are 2^20000 =
10^6020 such paths.
And probably 99.9% of them are superfluous.
The problem is that you have no way to know which cases are
irrelevant. And practical hardware will have many things able to
retain state.
A concept you are looking for here is "cyclomatic complexity" :
<https://en.wikipedia.org/wiki/Cyclomatic_complexity>
Conditionals are not independent (in most cases) - thus the number of
paths through code is not simply two to the power of the number of if
statements.
True, but it's the bounding case.
(McCabe cyclomatic complexity measures are not the only way to look at
this, and like anything else, the measure has its advantages and
disadvantages, and is not suitable for everything. But it's a
reasonable place to start.)
The way you handle complexity in software is exactly the same as any
other complexity - you break things down into manageable parts. For
software, that can be libraries, modules, files, classes, functions, and
blocks. You specify things from the top down, and test them from the
bottom up. How much testing you do, and how you do it, is going to
depend on the application - an air traffic control system will need more
thorough testing than a mobile phone game.
When you are looking at a function, the cyclomatic complexity can be
calculated by tools. It will then give you a good idea of how much
testing you need for the function. Too high a complexity, and you will
never be able to test the function to get a solid idea of its
correctness, as you would need too many test cases for practicality.
(You may still be able to use other analysis methods.) The complexity
can also give a good indication of how easy it is for humans to
understand the function and judge its correctness. (Again, no one
measure gives a complete picture.)
I recall those days. Some managers thought that if they decreed that
no module could have a complexity (computed in various ways) exceeding
some arbitrary limit. The problem was that real-world problems are
vastly more complex, causing atomization of the inherent complexity
into a bazillion tiny modules, hiding the structure and imposing large
added processing overheads from traversing all those inter-module
interfaces.
The problem with any generalisation and rules is that they are sometimes
inappropriate. /Most/ functions, modules, pages of schematic diagram,
or whatever, should have a low complexity however you compute it. But
there are always some that are exceptions, where the code is clearer
despite being "complex" according to the metrics you use.
Other tools that can be useful in testing are code coverage tools - you
can check that your test setups check all paths through the code.
We still do this, but the limitation is that all such tools yield far
more false alarms than valid hits, so all hits must be manually
verified.
A false alarm for a code coverage report would mean code that is not
reported as hit, but actually /is/ hit when the code is run. How does
that come about?
But it is certainly true that any kind of automatic testing or
verification is only going to get you so far - false hits or missed
cases are inevitable.
But remember that testing cannot prove the absence of bugs - only their
presence. And it only works on the assumption that the hardware is
correct - even when the software is perfect, you might still need that
reset button or a watchdog!
Absolutely. This was true in the days of uniprocessors with one
megahertz clocks and kilobyte memories. Now it's hundreds of
processors with multi-gigahertz clocks and terabyte physical memories.
And it's still true - modern systems give more scope for hardware issues
than simpler systems (as well as more scope for subtle software bugs).
A cosmic ray in the wrong place can render all your software
verification void.
On Fri, 30 May 2025 17:53:59 +0200, David Brown
<[email protected]> wrote:
On 28/05/2025 18:07, Joe Gwinn wrote:
I recall those days. Some managers thought that if they decreed that
no module could have a complexity (computed in various ways) exceeding
some arbitrary limit. The problem was that real-world problems are
vastly more complex, causing atomization of the inherent complexity
into a bazillion tiny modules, hiding the structure and imposing large
added processing overheads from traversing all those inter-module
interfaces.
The problem with any generalisation and rules is that they are sometimes
inappropriate. /Most/ functions, modules, pages of schematic diagram,
or whatever, should have a low complexity however you compute it. But
there are always some that are exceptions, where the code is clearer
despite being "complex" according to the metrics you use.
No, all of the complexity metrics were blown away by practical
software running on practical hardware. Very few modules were that
simple, because too many too small modules carry large inter-module
interface overheads.
Other tools that can be useful in testing are code coverage tools - you
can check that your test setups check all paths through the code.
We still do this, but the limitation is that all such tools yield far
more false alarms than valid hits, so all hits must be manually
verified.
A false alarm for a code coverage report would mean code that is not
reported as hit, but actually /is/ hit when the code is run. How does
that come about?
The code coverage vendors hold the details close, so we usually don't
know how hits are declared, and probably never will.
The one whose details I did manage to obtain turned out to be
looking for certain combinations of certain words and arrangements. It
had zero understanding of what the code did, never mind why.
Maybe modern AI will do better, but may be too expensive to make
business sense.
But it is certainly true that any kind of automatic testing or
verification is only going to get you so far - false hits or missed
cases are inevitable.
Yes, in fact the false hits dominate by a large factor, and the main
expense in using such tools is the human effort needed to extract
those few true hits.
But remember that testing cannot prove the absence of bugs - only their
presence. And it only works on the assumption that the hardware is
correct - even when the software is perfect, you might still need that
reset button or a watchdog!
Absolutely. This was true in the days of uniprocessors with one
megahertz clocks and kilobyte memories. Now it's hundreds of
processors with multi-gigahertz clocks and terabyte physical memories.
And it's still true - modern systems give more scope for hardware issues
than simpler systems (as well as more scope for subtle software bugs).
A cosmic ray in the wrong place can render all your software
verification void.
I must say that there was much worry about cosmic ray hits back in the
day, but they never turned out to matter in practice, except in space systems.
The dominant source of errors turned out to be electrical cross-talk
and interference in the backplanes, and meta-stability in interfaces
between logic clock domains in the larger hardware system.
I vaguely recall doing an analysis on this issue, some decades ago.
On 30/05/2025 19:39, Joe Gwinn wrote:
On Fri, 30 May 2025 17:53:59 +0200, David Brown wrote:
On 28/05/2025 18:07, Joe Gwinn wrote:
I recall those days. Some managers decreed that no module could have
a complexity (computed in various ways) exceeding some arbitrary
limit. The problem was that real-world problems are
vastly more complex, causing atomization of the inherent complexity
into a bazillion tiny modules, hiding the structure and imposing large
added processing overheads from traversing all those inter-module
interfaces.
The problem with any generalisation and rules is that they are sometimes
inappropriate. /Most/ functions, modules, pages of schematic diagram,
or whatever, should have a low complexity however you compute it. But
there are always some that are exceptions, where the code is clearer
despite being "complex" according to the metrics you use.
No, all of the complexity metrics were blown away by practical
software running on practical hardware. Very few modules were that
simple, because too many too small modules carry large inter-module
interface overheads.
That changes nothing of the principles.
You aim for low and controlled complexity, at all levels, so that you
can realistically test, verify, and check the code and systems at the
different levels. (Checking can be automatic, manual, human code
reviews, code coverage tools, etc., - usually in combination.) Any part
with particularly high complexity is going to take more specialised
testing and checking - that costs more time and money, and is higher
risk. Sometimes it is still the right choice, because alternatives are
worse (such as the "too many small modules" issues you mention) or
because there are clear and reliable ways to test due to particular
patterns (as you might get in a very large "dispatch" function).
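To illustrate the "dispatch" case, here is a minimal sketch in C. The command codes and function names are invented for this example and do not come from any code base discussed in the thread. A metric tool would count each case as a decision point, so the score grows with the number of commands, yet every branch is trivial and independent, and a table with one row per branch tests it reliably.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command codes, invented for this sketch. */
enum cmd { CMD_NOP, CMD_INC, CMD_DEC, CMD_DOUBLE, CMD_NEGATE, CMD_CLEAR };

/* One case per command: each case adds to the complexity score,
 * but no branch depends on any other branch. */
int dispatch(enum cmd c, int value)
{
    switch (c) {
    case CMD_NOP:    return value;
    case CMD_INC:    return value + 1;
    case CMD_DEC:    return value - 1;
    case CMD_DOUBLE: return value * 2;
    case CMD_NEGATE: return -value;
    case CMD_CLEAR:  return 0;
    }
    return value; /* unknown command: leave the value unchanged */
}

/* Table-driven check with one row per branch.
 * Returns the number of rows that failed. */
int dispatch_self_test(void)
{
    static const struct { enum cmd c; int in; int expect; } rows[] = {
        { CMD_NOP,    7,  7 }, { CMD_INC,    7,  8 }, { CMD_DEC,   7, 6 },
        { CMD_DOUBLE, 7, 14 }, { CMD_NEGATE, 7, -7 }, { CMD_CLEAR, 7, 0 },
    };
    int failures = 0;
    for (size_t i = 0; i < sizeof rows / sizeof rows[0]; i++) {
        if (dispatch(rows[i].c, rows[i].in) != rows[i].expect)
            failures++;
    }
    return failures;
}
```

A caller could run `assert(dispatch_self_test() == 0);` at start-up or in a unit test. Adding a command means adding one case and one table row, which is why this kind of "complex" function stays easy to review.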
You don't just throw your hands in the air and say it's better with
spaghetti in a module than spaghetti between modules, and therefore you
can ignore complexity! I don't believe that is what you are actually
doing, but it sounds a little like that.
Other tools that can be useful in testing are code coverage tools - you
can check that your test setups check all paths through the code.
We still do this, but the limitation is that all such tools yield far
more false alarms than valid hits, so all hits must be manually
verified.
A false alarm for a code coverage report would mean code that is not
reported as hit, but actually /is/ hit when the code is run. How does
that come about?
The code coverage vendors hold the details close, so we usually don't
know how hits are declared, and probably never will.
Do the gcc and gcov developers hold their details secret? I'm sure
there are many good reasons for picking different code coverage tools,
and I'm not suggesting that gcov is in any way the "best" (for many
reasons, code coverage tools would be of very limited use for most of my
work). And there are all sorts of different coverage metrics. But it
would surprise me if major vendors keep information about the prime
purpose of the tool a secret. Who would buy a coverage tool that
doesn't tell you what it measures?
The one for which I did manage to obtain the details turned out to be
looking for certain combinations of certain words and arrangements. It
had zero understanding of what the code did, never mind why.
I am not sure what kind of tool you are referring to here. Code
coverage tools track metrics about the functions, blocks and code lines
that are run. Different tools (or options) track different metrics -
counts, times, or just "at least once". They might track things at
different levels. Some are intrusive and accurate, others are
non-intrusive but statistically based. If you are wanting to use code
coverage tools in combination with branch testing, you just want to know
that during your test suite runs, every branch is tested at least once
in each direction.
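As a rough illustration of "every branch tested at least once in each direction", here is a hand-rolled sketch in C. It is not how gcov or any particular tool works internally. The `probe` helper, the counters, and the `clamp` function under test are all invented for this example, and a compile-time instrumenting tool would insert something equivalent automatically.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical hand-rolled instrumentation: one pair of counters
 * per branch in the code under test. */
enum { BRANCH_COUNT = 2 };
static unsigned long taken[BRANCH_COUNT];      /* condition was true  */
static unsigned long not_taken[BRANCH_COUNT];  /* condition was false */

/* Record the outcome of branch `id` and pass the condition through. */
static bool probe(int id, bool cond)
{
    if (cond)
        taken[id]++;
    else
        not_taken[id]++;
    return cond;
}

/* Code under test: clamp x into the range [lo, hi]. */
int clamp(int x, int lo, int hi)
{
    if (probe(0, x < lo)) return lo;
    if (probe(1, x > hi)) return hi;
    return x;
}

/* True only when every branch has gone both ways at least once. */
bool full_branch_coverage(void)
{
    for (int i = 0; i < BRANCH_COUNT; i++) {
        if (taken[i] == 0 || not_taken[i] == 0)
            return false;
    }
    return true;
}
```

With this scheme, a test suite that only calls `clamp(5, 0, 10)` exercises both conditions but only in the "false" direction, so `full_branch_coverage()` stays false until inputs below `lo` and above `hi` are also tried.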
Maybe modern AI will do better, but may be too expensive to make
business sense.
We can pretty much guarantee that commercial vendors will add claims of
AI to their tools and charge more for them. Whether or not they will be
better for it, is another matter.
I would expect AI to be more useful in the context of static error
checkers, simulators, and fuzz testers rather than code coverage at
run-time.
But it is certainly true that any kind of automatic testing or
verification is only going to get you so far - false hits or missed
cases are inevitable.
Yes, in fact the false hits dominate by a large factor, and the main
expense in using such tools is the human effort needed to extract
those few true hits.
Just to be clear - are you using non-intrusive statistical code coverage
tools (i.e., a background thread, timer, etc., that samples the program
counter of running code)? Or are you using a tool that does
instrumentation when compiling? I'm trying to get an understanding of
the kinds of "false hits" you are seeing.
But remember that testing cannot prove the absence of bugs - only their
presence. And it only works on the assumption that the hardware is
correct - even when the software is perfect, you might still need that
reset button or a watchdog!
Absolutely. This was true in the days of uniprocessors with one
megahertz clocks and kilobyte memories. Now it's hundreds of
processors with multi-gigahertz clocks and terabyte physical memories.
And it's still true - modern systems give more scope for hardware issues
than simpler systems (as well as more scope for subtle software bugs).
A cosmic ray in the wrong place can render all your software
verification void.
I must say that there was much worry about cosmic ray hits back in the
day, but they never turned out to matter in practice, except in space
systems.
I guess there are many factors for that. If something weird happens,
and you have no explanation and it never happens again, then you can
easily say it was probably a cosmic ray - without any direct evidence.
It is also the case that many systems or subsystems are tolerant of an
occasional single-event upset - be it from cosmic rays or anything else.
If you have ECC memory, or other kinds of redundancy or error
checking, rare errors there are not an issue. So many types of memory,
buses, and communication protocols are effectively immune to such
things. However, critical parts of the system will still be vulnerable
to hardware glitches. It is not without justification that
safety-critical electronics often has two cores running in lockstep or
other types of redundancy.
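For readers who have not looked inside an error-correcting code, here is a toy Hamming(7,4) sketch in C: four data bits, three parity bits, and correction of any single flipped bit. This is only an illustration of the principle. Real ECC memory uses wider codes, and the function names here are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Bit at 1-based position `pos` of codeword `cw` (position 1 is bit 0). */
static unsigned bit_at(uint8_t cw, int pos)
{
    return (cw >> (pos - 1)) & 1u;
}

/* Encode the low 4 bits of `nibble` into a 7-bit codeword.
 * Layout by position: 1=p1, 2=p2, 3=d1, 4=p4, 5=d2, 6=d3, 7=d4. */
uint8_t hamming74_encode(uint8_t nibble)
{
    unsigned d1 = nibble & 1u;
    unsigned d2 = (nibble >> 1) & 1u;
    unsigned d3 = (nibble >> 2) & 1u;
    unsigned d4 = (nibble >> 3) & 1u;
    unsigned p1 = d1 ^ d2 ^ d4;   /* parity over positions 1,3,5,7 */
    unsigned p2 = d1 ^ d3 ^ d4;   /* parity over positions 2,3,6,7 */
    unsigned p4 = d2 ^ d3 ^ d4;   /* parity over positions 4,5,6,7 */
    return (uint8_t)(p1 | (p2 << 1) | (d1 << 2) | (p4 << 3) |
                     (d2 << 4) | (d3 << 5) | (d4 << 6));
}

/* Decode a codeword, correcting at most one flipped bit.
 * If `corrected_pos` is non-null it receives the 1-based position
 * that was repaired, or 0 when the codeword was already consistent. */
uint8_t hamming74_decode(uint8_t cw, int *corrected_pos)
{
    unsigned s1 = bit_at(cw, 1) ^ bit_at(cw, 3) ^ bit_at(cw, 5) ^ bit_at(cw, 7);
    unsigned s2 = bit_at(cw, 2) ^ bit_at(cw, 3) ^ bit_at(cw, 6) ^ bit_at(cw, 7);
    unsigned s4 = bit_at(cw, 4) ^ bit_at(cw, 5) ^ bit_at(cw, 6) ^ bit_at(cw, 7);
    int syndrome = (int)(s1 | (s2 << 1) | (s4 << 2));

    if (syndrome != 0)
        cw ^= (uint8_t)(1u << (syndrome - 1));  /* flip the bad bit back */
    if (corrected_pos)
        *corrected_pos = syndrome;

    return (uint8_t)(bit_at(cw, 3) | (bit_at(cw, 5) << 1) |
                     (bit_at(cw, 6) << 2) | (bit_at(cw, 7) << 3));
}
```

Note the limit of this minimal code: two flipped bits also give a non-zero syndrome, so the decoder "corrects" a third, innocent bit and returns wrong data without complaint. The checking logic is itself something that has to be got right and tested.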
The dominant source of errors turned out to be electrical cross-talk
and interference in the backplanes, and meta-stability in interfaces
between logic clock domains in the larger hardware system.
Sure. Cosmic rays were only an example (pulled out of thin air :.) ).
Glitches on power lines, unlucky coincidences on bit patterns,
production flaws or ESD damage eroding electrical tolerances - there are
lots of possibilities. I'm not trying to suggest relative likelihoods
here, as that will be highly variable.
I vaguely recall doing an analysis on this issue, some decades ago.
I recall something of the opposite - a long time ago, we had to add a
variety of "safety" features to a product to fulfil a customer's safety
/ reliability checklist, without regard to how realistic the failure
scenarios were and without spending time and money on analysis. The
result was, IMHO, lower reliability because it was more likely for the
extra monitoring and checking hardware and software to fail than for the
original functional stuff to fail. Many of these extra checks were in
themselves impossible to test.
On Wed, 4 Jun 2025 12:58:21 +0200, David Brown wrote:
On 30/05/2025 19:39, Joe Gwinn wrote:
On Fri, 30 May 2025 17:53:59 +0200, David Brown wrote:
On 28/05/2025 18:07, Joe Gwinn wrote:
I recall those days. Some managers decreed that no module could have
a complexity (computed in various ways) exceeding some arbitrary
limit. The problem was that real-world problems are
vastly more complex, causing atomization of the inherent complexity
into a bazillion tiny modules, hiding the structure and imposing large
added processing overheads from traversing all those inter-module
interfaces.
The problem with any generalisation and rules is that they are sometimes
inappropriate. /Most/ functions, modules, pages of schematic diagram,
or whatever, should have a low complexity however you compute it. But
there are always some that are exceptions, where the code is clearer
despite being "complex" according to the metrics you use.
No, all of the complexity metrics were blown away by practical
software running on practical hardware. Very few modules were that
simple, because too many too small modules carry large inter-module
interface overheads.
That changes nothing of the principles.
You aim for low and controlled complexity, at all levels, so that you
can realistically test, verify, and check the code and systems at the
different levels. (Checking can be automatic, manual, human code
reviews, code coverage tools, etc., - usually in combination.) Any part
with particularly high complexity is going to take more specialised
testing and checking - that costs more time and money, and is higher
risk. Sometimes it is still the right choice, because alternatives are
worse (such as the "too many small modules" issues you mention) or
because there are clear and reliable ways to test due to particular
patterns (as you might get in a very large "dispatch" function).
In theory, sure. In practice, it didn't help enough to make it
worthwhile.
You don't just throw your hands in the air and say it's better with
spaghetti in a module than spaghetti between modules, and therefore you
can ignore complexity! I don't believe that is what you are actually
doing, but it sounds a little like that.
Peer review of the code works better, because no pattern scanning tool
can tell spaghetti from inherent complexity.
And this goes double for operating system kernel code, which violates
essentially all of the coding standards developed for user-level
application code.
Other tools that can be useful in testing are code coverage tools - you
can check that your test setups check all paths through the code.
We still do this, but the limitation is that all such tools yield far
more false alarms than valid hits, so all hits must be manually
verified.
A false alarm for a code coverage report would mean code that is not
reported as hit, but actually /is/ hit when the code is run. How does
that come about?
The code coverage vendors hold the details close, so we usually don't
know how hits are declared, and probably never will.
Do the gcc and gcov developers hold their details secret? I'm sure
there are many good reasons for picking different code coverage tools,
and I'm not suggesting that gcov is in any way the "best" (for many
reasons, code coverage tools would be of very limited use for most of my
work). And there are all sorts of different coverage metrics. But it
would surprise me if major vendors keep information about the prime
purpose of the tool a secret. Who would buy a coverage tool that
doesn't tell you what it measures?
I was dealing with a proprietary code coverage tool that management
was quite enamored with and so was pressuring us to use. But we had
only a sales brochure to go from, and I point-blank refused to use it
without knowing what it did and how. This caused a copy of the
requirements document of the scanner to appear.
I don't think gcov existed then. We used gcc, so the software folk
would have used it were it both available and mature enough.
Maybe modern AI will do better, but may be too expensive to make
business sense.
We can pretty much guarantee that commercial vendors will add claims of
AI to their tools and charge more for them. Whether or not they will be
better for it, is another matter.
Yes. Don't forget Quantum.
I would expect AI to be more useful in the context of static error
checkers, simulators, and fuzz testers rather than code coverage at
run-time.
Why? I would think that an LLM could follow the thread far better than
any static checker.
The US financial firm Morgan Stanley is using AI to analyze and
summarize nine million lines of code (in languages such as COBOL) for re-implementation in modern languages. This from The Wall Street
Journal, 3 June 2025 issue:
"Morgan Stanley is now aiming artificial intelligence at one of
enterprise software's biggest pain points, and one it said Big Tech
hasn't quite nailed yet: helping rewrite old, outdated code into
modern coding languages.
In January, the company rolled out a tool known as DevGen.AI, built
in-house on OpenAI's GPT models. It can translate legacy code from
languages like COBOL into plain English specs that developers can then
use to rewrite it.
So far this year it's reviewed nine million lines of code, saving
developers 280,000 hours, said Mike Pizzi, Morgan Stanley's global
head of technology and operations."
And it's still true - modern systems give more scope for hardware issues
than simpler systems (as well as more scope for subtle software bugs).
A cosmic ray in the wrong place can render all your software
verification void.
I must say that there was much worry about cosmic ray hits back in the
day, but they never turned out to matter in practice, except in space
systems.
I guess there are many factors for that. If something weird happens,
and you have no explanation and it never happens again, then you can
easily say it was probably a cosmic ray - without any direct evidence.
It is also the case that many systems or subsystems are tolerant of an
occasional single-event upset - be it from cosmic rays or anything else.
If you have ECC memory, or other kinds of redundancy or error
checking, rare errors there are not an issue. So many types of memory,
buses, and communication protocols are effectively immune to such
things. However, critical parts of the system will still be vulnerable
to hardware glitches. It is not without justification that
safety-critical electronics often has two cores running in lockstep or
other types of redundancy.
What happened is that semiconductor technology progressed to the point
that the amount of charge (or whatever) that distinguished symbols
became very small and thus vulnerable to random errors, for which an error-correcting code had to be built in. At this point, cosmic rays
were lost in the random noise, so to speak. So ECC is now inherent,
not an extra-cost bolt-on.
The dominant source of errors turned out to be electrical cross-talk
and interference in the backplanes, and meta-stability in interfaces
between logic clock domains in the larger hardware system.
Sure. Cosmic rays were only an example (pulled out of thin air :.) ).
Glitches on power lines, unlucky coincidences on bit patterns,
production flaws or ESD damage eroding electrical tolerances - there are
lots of possibilities. I'm not trying to suggest relative likelihoods
here, as that will be highly variable.
And this is still true.
I vaguely recall doing an analysis on this issue, some decades ago.
I recall something of the opposite - a long time ago, we had to add a
variety of "safety" features to a product to fulfil a customer's safety
/ reliability checklist, without regard to how realistic the failure
scenarios were and without spending time and money on analysis. The
result was, IMHO, lower reliability because it was more likely for the
extra monitoring and checking hardware and software to fail than for the
original functional stuff to fail. Many of these extra checks were in
themselves impossible to test.
Yes. I recall directly testing the issue with ECC as implemented in
early DEC VAX computers, in the early 1980s. We had a customer who
specified ECC, so we had ECC. And soon discovered that the computer
was more reliable with ECC disabled than enabled. That was the end of
ECC.
On 04/06/2025 16:55, Joe Gwinn wrote:
On Wed, 4 Jun 2025 12:58:21 +0200, David Brown wrote:
On 30/05/2025 19:39, Joe Gwinn wrote:
On Fri, 30 May 2025 17:53:59 +0200, David Brown wrote:
On 28/05/2025 18:07, Joe Gwinn wrote:
I recall those days. Some managers decreed that no module could have
a complexity (computed in various ways) exceeding some arbitrary
limit. The problem was that real-world problems are
vastly more complex, causing atomization of the inherent complexity
into a bazillion tiny modules, hiding the structure and imposing large
added processing overheads from traversing all those inter-module
interfaces.
The problem with any generalisation and rules is that they are sometimes
inappropriate. /Most/ functions, modules, pages of schematic diagram,
or whatever, should have a low complexity however you compute it. But
there are always some that are exceptions, where the code is clearer
despite being "complex" according to the metrics you use.
No, all of the complexity metrics were blown away by practical
software running on practical hardware. Very few modules were that
simple, because too many too small modules carry large inter-module
interface overheads.
That changes nothing of the principles.
You aim for low and controlled complexity, at all levels, so that you
can realistically test, verify, and check the code and systems at the
different levels. (Checking can be automatic, manual, human code
reviews, code coverage tools, etc., - usually in combination.) Any part
with particularly high complexity is going to take more specialised
testing and checking - that costs more time and money, and is higher
risk. Sometimes it is still the right choice, because alternatives are
worse (such as the "too many small modules" issues you mention) or
because there are clear and reliable ways to test due to particular
patterns (as you might get in a very large "dispatch" function).
In theory, sure. In practice, it didn't help enough to make it
worthwhile.
OK.
You don't just throw your hands in the air and say it's better with
spaghetti in a module than spaghetti between modules, and therefore you
can ignore complexity! I don't believe that is what you are actually
doing, but it sounds a little like that.
Peer review of the code works better, because no pattern scanning tool
can tell spaghetti from inherent complexity.
That's certainly true in some cases. It surprises me a little that your
experience was so much like that, but of course experiences differ. My
experience (and I freely admit I haven't used complexity analysis tools
much) is that most functions can be relatively low complexity - the
inherently high complexity stuff is only a small proportion of the code.
In one situation where this was not the case, I asked the programmer
to re-structure the whole thing - the code was badly designed from the
start and had become an incomprehensible mess. Peer review did not
help, because the peer (me) couldn't figure out what was going on in the
code.
However, it is entirely true that some code will be marked as very high
complexity by tools and yet easily and simply understood by human
reviewers. If that is happening a lot in a code base, automatic tools
(at least the ones you are trying) are not going to be much use.
And this goes double for operating system kernel code, which violates
essentially all of the coding standards developed for user-level
application code.
Different code has different needs and standards, yes.
Other tools that can be useful in testing are code coverage tools - you
can check that your test setups check all paths through the code.
We still do this, but the limitation is that all such tools yield far
more false alarms than valid hits, so all hits must be manually
verified.
A false alarm for a code coverage report would mean code that is not
reported as hit, but actually /is/ hit when the code is run. How does
that come about?
The code coverage vendors hold the details close, so we usually don't
know how hits are declared, and probably never will.
Do the gcc and gcov developers hold their details secret? I'm sure
there are many good reasons for picking different code coverage tools,
and I'm not suggesting that gcov is in any way the "best" (for many
reasons, code coverage tools would be of very limited use for most of my
work). And there are all sorts of different coverage metrics. But it
would surprise me if major vendors keep information about the prime
purpose of the tool a secret. Who would buy a coverage tool that
doesn't tell you what it measures?
I was dealing with a proprietary code coverage tool that management
was quite enamored with and so was pressuring us to use. But we had
only a sales brochure to go from, and I point-blank refused to use it
without knowing what it did and how. This caused a copy of the
requirements document of the scanner to appear.
No software tool can fix management problems :-(
I don't think gcov existed then. We used gcc, so the software folk
would have used it were it both available and mature enough.
Fair enough. I haven't done anything significant with gcov, so I can't
say how good it might be. (It is very difficult to use tools that write
data to files when you are working on small microcontrollers with no
filesystem and at most a small RTOS.)
Maybe modern AI will do better, but may be too expensive to make
business sense.
We can pretty much guarantee that commercial vendors will add claims of
AI to their tools and charge more for them. Whether or not they will be
better for it, is another matter.
Yes. Don't forget Quantum.
We are already into post-quantum algorithms, at least in some fields!
I would expect AI to be more useful in the context of static error
checkers, simulators, and fuzz testers rather than code coverage at
run-time.
Why (AI)? I would think that an LLM could follow the thread far better than
any static checker.
I mean that I think there is more potential for adding useful AI
algorithms to static checkers and simulators than there is for using AI
algorithms in run-time code coverage tools. But that's just a guess,
not backed up by any evidence.
The US financial firm Morgan Stanley is using AI to analyze and
summarize nine million lines of code (in languages such as COBOL) for
re-implementation in modern languages. This from The Wall Street
Journal, 3 June 2025 issue:
"Morgan Stanley is now aiming artificial intelligence at one of
enterprise software's biggest pain points, and one it said Big Tech
hasn't quite nailed yet: helping rewrite old, outdated code into
modern coding languages.
I can see AI being a help here - just as many existing tools can be
helpful for figuring out what old code does. I am not holding my breath
waiting for AI to manage such conversions on its own.
In January, the company rolled out a tool known as DevGen.AI, built
in-house on OpenAI's GPT models. It can translate legacy code from
languages like COBOL into plain English specs that developers can then
use to rewrite it.
So far this year it's reviewed nine million lines of code, saving
developers 280,000 hours, said Mike Pizzi, Morgan Stanley's global
head of technology and operations."
And it's still true - modern systems give more scope for hardware issues
than simpler systems (as well as more scope for subtle software bugs).
A cosmic ray in the wrong place can render all your software
verification void.
I must say that there was much worry about cosmic ray hits back in the
day, but they never turned out to matter in practice, except in space
systems.
I guess there are many factors for that. If something weird happens,
and you have no explanation and it never happens again, then you can
easily say it was probably a cosmic ray - without any direct evidence.
It is also the case that many systems or subsystems are tolerant of an
occasional single-event upset - be it from cosmic rays or anything else.
If you have ECC memory, or other kinds of redundancy or error
checking, rare errors there are not an issue. So many types of memory,
buses, and communication protocols are effectively immune to such
things. However, critical parts of the system will still be vulnerable
to hardware glitches. It is not without justification that
safety-critical electronics often has two cores running in lockstep or
other types of redundancy.
What happened is that semiconductor technology progressed to the point
that the amount of charge (or whatever) that distinguished symbols
became very small and thus vulnerable to random errors, for which an
error-correcting code had to be built in. At this point, cosmic rays
were lost in the random noise, so to speak. So ECC is now inherent,
not an extra-cost bolt-on.
For very dense and small feature size electronics, that is mostly true -
though even then there are parts that are vulnerable. It's just that
those parts are a tiny proportion of the die size, compared to memory
arrays, and the like.
The dominant source of errors turned out to be electrical cross-talk
and interference in the backplanes, and meta-stability in interfaces
between logic clock domains in the larger hardware system.
Sure. Cosmic rays were only an example (pulled out of thin air :.) ).
Glitches on power lines, unlucky coincidences on bit patterns,
production flaws or ESD damage eroding electrical tolerances - there are
lots of possibilities. I'm not trying to suggest relative likelihoods
here, as that will be highly variable.
And this is still true.
I vaguely recall doing an analysis on this issue, some decades ago.
I recall something of the opposite - a long time ago, we had to add a
variety of "safety" features to a product to fulfil a customer's safety
/ reliability checklist, without regard to how realistic the failure
scenarios were and without spending time and money on analysis. The
result was, IMHO, lower reliability because it was more likely for the
extra monitoring and checking hardware and software to fail than for the
original functional stuff to fail. Many of these extra checks were in
themselves impossible to test.
Yes. I recall directly testing the issue with ECC as implemented in
early DEC VAX computers, in the early 1980s. We had a customer who
specified ECC, so we had ECC. And soon discovered that the computer
was more reliable with ECC disabled than enabled. That was the end of
ECC.
Quis custodiet ipsos custodes?
Sometimes these fault monitors and error checking systems are just
kicking the can further down the road, and not actually improving anything.
On Wed, 4 Jun 2025 17:53:19 +0200, David Brown wrote:
On 04/06/2025 16:55, Joe Gwinn wrote:
On Wed, 4 Jun 2025 12:58:21 +0200, David Brown wrote:
All true, but at the end of the day, complexity metrics and coverage
tools didn't come even close to paying for themselves, and so they
gradually faded.
Fair enough. I haven't done anything significant with gcov, so I can't
say how good it might be. (It is very difficult to use tools that write
data to files when you are working on small microcontrollers with no
filesystem and at most a small RTOS.)
In those cases, the development computers were far larger than the
target systems.
Why (AI)? I would think that an LLM could follow the thread far better than
any static checker.
I mean that I think there is more potential for adding useful AI
algorithms to static checkers and simulators than there is for using AI
algorithms in run-time code coverage tools. But that's just a guess,
not backed up by any evidence.
The US financial firm Morgan Stanley is using AI to analyze and
summarize nine million lines of code (in languages such as COBOL) for
re-implementation in modern languages. This from The Wall Street
Journal, 3 June 2025 issue:
"Morgan Stanley is now aiming artificial intelligence at one of
enterprise software's biggest pain points, and one it said Big Tech
hasn't quite nailed yet: helping rewrite old, outdated code into
modern coding languages.
I can see AI being a help here - just as many existing tools can be
helpful for figuring out what old code does. I am not holding my breath
waiting for AI to manage such conversions on its own.
Nor is MS - the new code is written by humans.
Although MS is trying automatic translation into modern languages, I
gather that it doesn't work all that well. In my world, there was a
lot of talk of building automatic converters to translate from one
computer language to another. It never worked because no machine
could understand the inventive ways people on tiny machines used
bespoke memory structures to improve performance.
War story: In the 1970s, I was involved in writing Fortran code to
implement a simulator used for training. This required 32-bit words
used as bit arrays, but the Fortran of the day had no bitwise
operations, only one-bit-per-word logic. That was crippling, so I
wrote assembly-coded Fortran functions that did the bitwise
operations needed for this and that.
Our application programmers were intimidated by being asked to write a
little assembly, but I had set it up so they could easily do it, once
the shock wore off. It all worked.
Thirty years later, I got a phone call out of the blue from someone in a
company that had won a contract to recode the simulator code in C,
wondering what to do with those assembly-coded functions. He was very
relieved when I said that if the ISA library had been available, we
would have used that, and to just read the assembly source to find the
bitwise operations being performed, and write that directly in C.
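For anyone wondering what "write that directly in C" looks like, here is a minimal sketch of 32-bit words used as bit arrays. The helper names are invented for this example. The original assembly routines are not shown in the thread, so this is only a guess at the kind of operations involved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: flag number i lives in word i/32, bit i%32. */
static inline void bit_set(uint32_t *words, unsigned i)
{
    words[i / 32] |= (UINT32_C(1) << (i % 32));
}

static inline void bit_clear(uint32_t *words, unsigned i)
{
    words[i / 32] &= ~(UINT32_C(1) << (i % 32));
}

static inline int bit_test(const uint32_t *words, unsigned i)
{
    return (int)((words[i / 32] >> (i % 32)) & 1u);
}

/* Word-wide logic: 32 flags combined per operation, instead of
 * one flag per word as in one-bit-per-word logic. */
static inline uint32_t word_and(uint32_t a, uint32_t b) { return a & b; }
static inline uint32_t word_or(uint32_t a, uint32_t b)  { return a | b; }
static inline uint32_t word_xor(uint32_t a, uint32_t b) { return a ^ b; }
```

Each helper is a single C expression, which is the point of the story: once the intent of an assembly routine is read off, the C replacement is usually shorter than the comment describing it.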
On Wed, 4 Jun 2025 17:53:19 +0200, David Brown wrote:
On 04/06/2025 16:55, Joe Gwinn wrote:
Peer review of the code works better, because no pattern scanning tool
can tell spaghetti from inherent complexity.
That's certainly true in some cases. It surprises me a little that your
experience was so much like that, but of course experiences differ. My
experience (and I freely admit I haven't used complexity analysis tools
much) is that most functions can be relatively low complexity - the
inherently high complexity stuff is only a small proportion of the code.
In one situation where this was not the case, I asked the programmer
to re-structure the whole thing - the code was badly designed from the
start and had become an incomprehensible mess. Peer review did not
help, because the peer (me) couldn't figure out what was going on in the
code.
All true, but at the end of the day, complexity metrics and coverage
tools didn't come even close to paying for themselves, and so they
gradually faded.
However, it is entirely true that some code will be marked as very high
complexity by tools and yet easily and simply understood by human
reviewers. If that is happening a lot in a code base, automatic tools
(at least the ones you are trying) are not going to be much use.
I would expect AI to be more useful in the context of static error
checkers, simulators, and fuzz testers rather than code coverage at
run-time.
Forgotten from earlier: "Just to be clear - are you using
non-intrusive statistical code coverage tools (i.e., a background
thread, timer, etc., that samples the program counter of running code)?
Or are you using a tool that does instrumentation when compiling? I'm
trying to get an understanding of the kinds of "false hits" you are
seeing."
The focus here is non-intrusive code evaluation tools.
We also use intrusive tools and instrumentation in the integration
lab.
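
For readers following the distinction raised in the question above, here is a minimal sketch of what the compile-time instrumentation variant amounts to. The names (`COV`, `cov_hits`, `clamp_to_byte`, `cov_sites_hit`) are hypothetical and not taken from any tool mentioned in this thread; real coverage tools insert equivalent probes automatically. A sampling tool, by contrast, leaves the code untouched and only records where the program counter happened to be at each sample, so a short or rarely taken path may never show up.

```c
/* Hypothetical hand-written instrumentation: one hit counter per probe site.
 * Real coverage tools generate probes like these at compile time. */
enum { COV_SITES = 3 };
static unsigned long cov_hits[COV_SITES];

/* Record that execution passed through the given probe site. */
#define COV(site) (cov_hits[(site)]++)

/* Example function under test, with a probe on each path. */
int clamp_to_byte(int value)
{
    if (value < 0) {
        COV(0);
        return 0;
    }
    if (value > 255) {
        COV(1);
        return 255;
    }
    COV(2);
    return value;
}

/* Number of probe sites that have been executed at least once. */
int cov_sites_hit(void)
{
    int hit = 0;
    for (int i = 0; i < COV_SITES; i++) {
        if (cov_hits[i] != 0) {
            hit++;
        }
    }
    return hit;
}
```

Every path that runs is counted exactly, at the cost of extra code and memory traffic on each path, which is the intrusive part.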
On 04/06/2025 20:54, Joe Gwinn wrote:
On Wed, 4 Jun 2025 17:53:19 +0200, David Brown
<[email protected]> wrote:
I can see AI being a help here - just as many existing tools can be
helpful for figuring out what old code does. I am not holding my breath
waiting for AI to manage such conversions on its own.
Nor is MS - the new code is written by humans.
Although MS is trying automatic translation into modern languages, I
gather that it doesn't work all that well. In my world, there was a
lot of talk of building automatic converters to translate from one
computer language to another. It never worked because no machine
could understand the inventive ways people on tiny machines used
bespoke memory structures to improve performance.
Automatic translation between programming languages is unlikely to be
successful unless the languages have very similar structures (say,
Pascal to C) or you are using the output just as an intermediary
language for compilation rather than as a new version of the program
(like using cfront for C++ to C transcompilation). Good code written in
one language is going to be structured differently from good code
written in a different programming language. And these old code bases
might have started as "good code" at one time - they are very unlikely
to have remained so over decades of fiddling, fixing and expanding.
On 04/06/2025 19:54, Joe Gwinn wrote:
On Wed, 4 Jun 2025 17:53:19 +0200, David Brown
<[email protected]> wrote:
On 04/06/2025 16:55, Joe Gwinn wrote:
Peer review of the code works better, because no pattern scanning tool
can tell spaghetti from inherent complexity.
That's certainly true in some cases. It surprises me a little that your
experience was so much like that, but of course experiences differ. My
experience (and I freely admit I haven't used complexity analysis tools
much) is that most functions can be relatively low complexity - the
inherently high complexity stuff is only a small proportion of the code.
In one situation where this was not the case, I asked the programmer
to re-structure the whole thing - the code was badly designed from the
start and had become an incomprehensible mess. Peer review did not
help, because the peer (me) couldn't figure out what was going on in the
code.
All true, but at the end of the day, complexity metrics and coverage
tools didn't come even close to paying for themselves, and so they
gradually faded.
Complexity metrics and dataflow analysis tools worked OK for me when
working as a consultant digging companies out of deep holes they had got
themselves into. They invariably looked hurt at the long list of defects
that I would find almost immediately, and I was almost always right about
code with an insane value for McCabe's CCI being full of latent bugs.
Some code is irreducibly complex and necessarily so, but a lot of it is
just the software equivalent of a rat's nest prototype in electronics but
treated as if it was production quality code. That comes from management's
"ship it and be damned" policy, since they always wanted their sales
target bonus.
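
For anyone who has not met the metric, McCabe's cyclomatic complexity of a control-flow graph is M = E - N + 2P, with E edges, N nodes and P connected components. The sketch below just evaluates that formula; the function names are made up for illustration and are not from any particular analysis tool. For a single structured function this works out to the number of decision points plus one.

```c
/* McCabe's cyclomatic complexity from control-flow graph counts:
 * M = E - N + 2P (edges, nodes, connected components). */
int cyclomatic_complexity(int edges, int nodes, int components)
{
    return edges - nodes + 2 * components;
}

/* Shortcut for one structured function: each binary decision
 * (if, loop condition, case label) adds one independent path. */
int cyclomatic_from_decisions(int decision_points)
{
    return decision_points + 1;
}
```

A lone if/else has 4 nodes (condition, two branches, join) and 4 edges, so `cyclomatic_complexity(4, 4, 1)` and `cyclomatic_from_decisions(1)` both give 2.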
However, it is entirely true that some code will be marked as very high
complexity by tools and yet easily and simply understood by human
reviewers. If that is happening a lot in a code base, automatic tools
(at least the ones you are trying) are not going to be much use.
That goes with the territory but you only have to look at it once.
I would expect AI to be more useful in the context of static error
checkers, simulators, and fuzz testers rather than code coverage at
run-time.
Forgotten from earlier: "Just to be clear - are you using
non-intrusive statistical code coverage tools (i.e., a background
thread, timer, etc., that samples the program counter of running code)?
Or are you using a tool that does instrumentation when compiling? I'm
trying to get an understanding of the kinds of "false hits" you are
seeing."
The focus here is non-intrusive code evaluation tools.
We also use intrusive tools and instrumentation in the integration
lab.
I find instrumentation often disturbs the problem that I am looking at. YMMV
I'm a fan of Intel's vTune for finding hotspots in serious code.
I recall something of the opposite - a long time ago, we had to add a
variety of "safety" features to a product to fulfil a customer's safety
/ reliability checklist, without regard to how realistic the failure
scenarios were and without spending time and money on analysis. The
result was, IMHO, lower reliability because it was more likely for the
extra monitoring and checking hardware and software to fail than for the
original functional stuff to fail. Many of these extra checks were in
themselves impossible to test.
In article <101p8sd$phe5$[email protected]>,
David Brown <[email protected]> wrote:
<SNIP>
I recall something of the opposite - a long time ago, we had to add a
variety of "safety" features to a product to fulfil a customer's safety
/ reliability checklist, without regard to how realistic the failure
scenarios were and without spending time and money on analysis. The
result was, IMHO, lower reliability because it was more likely for the
extra monitoring and checking hardware and software to fail than for the
original functional stuff to fail. Many of these extra checks were in
themselves impossible to test.
I worked on the Dutch railway systems safety and control software.
Once they added external control checking.
I've seen the code. In places there was an 8 level indentation
caused by ifs, switches and loops.
There was also a ban on automatic testing. I got into a row, because
I used a 3 line batch file (.BAT) to save on repetitive typing.
Regards, Albert
On 06/06/2025 13:45, [email protected] wrote:
In article <101p8sd$phe5$[email protected]>,
David Brown <[email protected]> wrote:
<SNIP>
I recall something of the opposite - a long time ago, we had to add a
variety of "safety" features to a product to fulfil a customer's safety
/ reliability checklist, without regard to how realistic the failure
scenarios were and without spending time and money on analysis. The
result was, IMHO, lower reliability because it was more likely for the
extra monitoring and checking hardware and software to fail than for the
original functional stuff to fail. Many of these extra checks were in
themselves impossible to test.
I worked on the Dutch railway systems safety and control software.
Once they added external control checking.
I've seen the code. In places there was an 8 level indentation
caused by ifs, switches and loops.
I've occasionally seen that kind of thing from developers who come from
the PLC world, having trained as automation engineers. In that world,
it's not uncommon to have long chains of conditionals, which used to be
implemented as chains of relays. If there are a dozen safety checks,
each with an "OK" output, then you have them all in a line. If this is
later translated into C code, it might then be translated into a series
of indented if statements. You get similarly bad code design from
people who are not programmers or trained in programming (at least, not
that type of programming), but have somehow ended up with the task.
On 2025-06-06 14:57, David Brown wrote:
On 06/06/2025 13:45, [email protected] wrote:
In article <101p8sd$phe5$[email protected]>,
David Brown <[email protected]> wrote:
<SNIP>
I recall something of the opposite - a long time ago, we had to add a
variety of "safety" features to a product to fulfil a customer's safety
/ reliability checklist, without regard to how realistic the failure
scenarios were and without spending time and money on analysis. The
result was, IMHO, lower reliability because it was more likely for the
extra monitoring and checking hardware and software to fail than for
the original functional stuff to fail. Many of these extra checks were
in themselves impossible to test.
I worked on the Dutch railway systems safety and control software.
Once they added external control checking.
I've seen the code. In places there was an 8 level indentation
caused by ifs, switches and loops.
I've occasionally seen that kind of thing from developers who come
from the PLC world, having trained as automation engineers. In that
world, it's not uncommon to have long chains of conditionals, which
used to be implemented as chains of relays.
I think it should be a chain of ANDs/ORs, not IFs.
If there are a dozen safety checks, each with an "OK" output, then you
have them all in a line. If this is later translated into C code, it
might then be translated into a series of indented if statements. You
get similarly bad code design from people who are not programmers or
trained in programming (at least, not that type of programming), but
have somehow ended up with the task.
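
To make the relay-chain point concrete, here is a rough sketch of the same three-check interlock written both ways, plus a loop form for a longer list. All names are invented for illustration; nothing here is taken from the railway code or any real PLC translation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Relay-chain logic translated literally: one indentation level per check. */
bool all_ok_nested(bool door_ok, bool brake_ok, bool signal_ok)
{
    if (door_ok) {
        if (brake_ok) {
            if (signal_ok) {
                return true;
            }
        }
    }
    return false;
}

/* The same series circuit as a chain of ANDs: no nesting at all. */
bool all_ok_flat(bool door_ok, bool brake_ok, bool signal_ok)
{
    return door_ok && brake_ok && signal_ok;
}

/* For a dozen checks, keep the "OK" outputs in an array and scan it. */
bool all_checks_ok(const bool *ok, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (!ok[i]) {
            return false; /* first failed check breaks the chain */
        }
    }
    return true;
}
```

The nested and flat versions return the same result for every input; the flat and loop forms simply stay at one indentation level however many checks are added.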
...