188 comments
joshka · 5 days ago
The problem with this article's name is that we're in an era where actually checking whether "nearly all sorts are broken" against all publicly available implementations of binary searches would be almost feasible, but that's not actually what the article is claiming.

> Moreover, to be really certain that a program is correct, you have to test it for all possible input values, but this is seldom feasible.

This is incorrect. Generally it's just a little excessive to try to solve the halting problem in a library's unit tests ;). You don't have to test a program for all possible inputs, only for all materially unique state transitions. In a binary search, each variable can be zero, one, some(*), max, overflowed. The combination of these is not infinite as the values of some that cause different behavior is much more reasonable when grouped into chunks of the same behavior.

Show replies

LiamPowell · 5 days ago
This bug, and many others, can be detected with a trivial amount of formal verification. I really think formal verification should see much wider use for things that see as much use as standard libraries, even if it's just for the trivial things like overflow and out-of-bounds access.

In the below code we can see a formal verification tool (GNATProve) detect both the original error and the out-of-bounds access that it causes. Doing the arithmetic using a larger type clears both reported errors without the need for any additional annotations for GNATProve.

    function Search (A : A_Type; Target : Integer) return Integer is
       Left : Integer := A'First;
       Right : Integer := A'Last;
    begin
       while Left <= Right loop
          declare
             Mid : Integer := (Left + Right) / 2;
          begin
             if A (Mid) = Target then
                return Mid;
             elsif A (Mid) < Target then
                Left := Mid + 1;
             elsif A (Mid) > Target then
                Right := Mid - 1;
             end if;
          end;
       end loop;
    end Search;
    
GNATProve output:

    Phase 1 of 2: generation of Global contracts ...
    Phase 2 of 2: flow analysis and proof ...

    wrapper.adb:12:36: medium: overflow check might fail, cannot prove lower bound for Left + Right
       12 |            Mid : Integer := (Left + Right) / 2;
          |                             ~~~~~~^~~~~~~~
      reason for check: result of addition must fit in a 32-bits machine integer
    wrapper.adb:12:45: info: division check proved
    
    wrapper.adb:14:19: medium: array index check might fail
       14 |            if A (Mid) = Target then
          |                  ^~~
      reason for check: value must be a valid index into the array

Show replies

bhouston · 5 days ago
It makes the case that we need Math.mean or Math.avg function that we can use in these cases rather than than reinventing it.

Basically we should favor using built in functions for as much as possible because those should be reliable and tested more than ad hoc code we write. And compilers should optimize those built in functions well so there is no extra cost in using them.

Show replies

djoldman · 5 days ago
> The bug is in this line:

> In Programming Pearls Bentley says that the analogous line "sets m to the average of l and u, truncated down to the nearest integer." On the face of it, this assertion might appear correct, but it fails for large values of the int variables low and high. Specifically, it fails if the sum of low and high is greater than the maximum positive int value (231 - 1). The sum overflows to a negative value, and the value stays negative when divided by two. In C this causes an array index out of bounds with unpredictable results. In Java, it throws ArrayIndexOutOfBoundsException.

At some point we have to draw an arbitrary line. Even an "arbitrary precision" calculation is bounded by system memory.

"bug" is not well-defined, or perhaps "bug" is more of a continuous label as opposed to discrete.

Show replies