an unknown image width solver for memgaze

memgaze-pdl.pl (118KB, requires libperl-gtk2, PDL, and 'aplay' binary for sound)

This is another release of memgaze but with a bunch of quality of life fixes and a few new features like the tools for better extracting specific images and algorithmically solving for their likely original image widths.

This algorithm is based on the idea that while manually adjusting the horizontal image width with a slider I'd see diagonal lines (aliasing?) form whenever the horizontal width wasn't just right (or some harmonic multiple of right). If it was right then the pixel at X position 89 in one line would likely be the same as or similar to the pixel at X position 89 in the line below it. Vertical lines. Only when it was the right width would the slanted lines disappear. Telling this to gemini-2.5-pro resulted in the code to do it.

It steps through a large range of candidate horizontal resolutions and for each one it compares vertically adjacent pairs of lines in the image to see how different they are. Specifically it computes line_n - line_n+1 value by value, takes the absolute value of each difference, and sums those over all line pairs in the image to get the score for that test width. The lower the score, the closer adjacent lines are to being the same, and the more likely it is that that's the right resolution without wrapping or aliasing. So it goes through and computes the net score for each resolution, then reports the one with the lowest as the likely horizontal resolution and sets the image object to display it.

This usually works. If there's no real order in the data it'll often have a bias for settling on lower horizontal resolutions rather than higher. But for actual image data it will at least settle on an alias of the image from which you can figure out (based on overlapping/mirror repetition) whether to go higher or lower manually to find the true width. And if the image representation in RAM has blank lines between actual lines this will only find harmonic multiples of the true resolution.

sub calculate_vertical_coherence_fast {
    my ($pixels, $width, $height, $step) = @_;

    my $rowstride = $width * 3;
    my $total_difference = 0;

    # Iterate with a step for both x and y to sample the image
    for (my $y = 0; $y < $height - $step; $y += $step) {
        for (my $x = 0; $x < $width; $x += $step) {
            my $offset1 = ($y * $rowstride) + ($x * 3);
            my ($r1, $g1, $b1) = unpack('CCC', substr($pixels, $offset1, 3));

            # Compare with the pixel $step rows below
            my $offset2 = (($y + $step) * $rowstride) + ($x * 3);
            my ($r2, $g2, $b2) = unpack('CCC', substr($pixels, $offset2, 3));

            $total_difference += abs($r1 - $r2) + abs($g1 - $g2) + abs($b1 - $b2);
        }
    }
    return $total_difference;
}
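
To make the search loop concrete, here is a minimal pure-perl sketch of driving a scoring function like the one above across a range of test widths. It is not the code memgaze uses: mean_vertical_difference is a hypothetical helper that compares every pixel (no $step sampling) and averages the result, and the 37 pixel wide test pattern is made up purely so the right answer is known in advance.

```perl
use strict;
use warnings;

# Hypothetical helper: mean absolute difference between vertically adjacent
# RGB bytes when $pixels is interpreted as an image $width pixels wide.
sub mean_vertical_difference {
    my ($pixels, $width) = @_;
    my $rowstride = $width * 3;
    my $height    = int(length($pixels) / $rowstride);
    return if $height < 2;    # need at least one pair of rows

    my ($total, $count) = (0, 0);
    for my $y (0 .. $height - 2) {
        my @top    = unpack('C*', substr($pixels, $y * $rowstride, $rowstride));
        my @bottom = unpack('C*', substr($pixels, ($y + 1) * $rowstride, $rowstride));
        for my $i (0 .. $#top) {
            $total += abs($top[$i] - $bottom[$i]);
            $count++;
        }
    }
    return $total / $count;
}

# Made-up test data: 24 identical rows of a 37 pixel wide colour pattern.
my $true_width = 37;
my $row = join '', map {
    pack('CCC', ($_ * 7) % 256, ($_ * 13) % 256, ($_ * 29) % 256)
} 0 .. $true_width - 1;
my $pixels = $row x 24;

# Try every candidate width and remember the one with the lowest score.
my ($best_width, $lowest_score);
for my $test_width (8 .. 64) {
    my $score = mean_vertical_difference($pixels, $test_width);
    next unless defined $score;
    if (!defined $lowest_score or $score < $lowest_score) {
        ($lowest_score, $best_width) = ($score, $test_width);
    }
}
print "likely width: $best_width (score $lowest_score)\n";
```

Because every row of the test pattern is identical, the score at the true width is exactly zero. Widening the range to include 74 would produce a second zero at the harmonic, and the strict less-than comparison would keep the first one found.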

In native perl these many comparisons were kind of slow. So I asked gemini-2.5-pro for help (again) and it made a really clever PDL (Perl Data Language, compiled C/Fortran internals for fast operations) version that is at least 10x as fast, at the cost of slightly increased RAM usage for memgaze.pl. With PDL piddle objects, slicing never really copies or moves anything. Instead PDL creates different "views" of the same underlying data and this is much faster.

    for my $test_width ( $min_width_to_test .. $max_width_to_test ) {
        ...
        my $trimmed_size = $test_width * $test_height * 3;
        my $image_pdl = $pdl_data->slice("0:" . ($trimmed_size - 1));

        my $reshaped = $image_pdl->reshape(3 * $test_width, $test_height);
        my $top_rows = $reshaped->slice(":,0:-2");
        my $bottom_rows = $reshaped->slice(":,1:-1");
        my $score = sum(abs($top_rows - $bottom_rows));
        
        my $normalized_score = $score / $test_height;
        ...
        if ($lowest_score == -1 or $normalized_score < $lowest_score) {
            $lowest_score = $normalized_score;
            $best_width = $test_width;
        }
    }

$top_rows is one of those "views". It includes all columns (:) but only rows from the first (0) through the second-to-last (-2), so everything except the last row. For a test where $test_height is 1080 that'd be rows 0 through 1078. $bottom_rows is a view that includes all columns (:) but only rows from the second (1) up to the very end (-1). So that'd be rows 1 to 1079. That's two arrays of the exact same dimensions, perfectly aligned so that row n in $top_rows corresponds to the original row n, and row n in $bottom_rows corresponds to the original row n+1. So when it does $top_rows - $bottom_rows it subtracts the entire bottom_rows array from the top_rows array, element by element, all at once in highly optimized C code. The result is a new piddle of the same size containing all the differences, ready for taking the absolute value and sum.
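
Here's the same row pairing on a toy array small enough to check by hand. This is only a sketch: the 4x3 values are made up. The long() conversion assumes the raw data might otherwise be an unsigned byte piddle, where a subtraction that goes negative could wrap around instead.

```perl
use strict;
use warnings;
use PDL;

# Three rows of four values each; PDL dims are (columns, rows) = (4, 3).
my $image = long([ [10, 20, 30, 40],
                   [11, 22, 33, 44],
                   [11, 22, 33, 50] ]);

my $top_rows    = $image->slice(":,0:-2");   # rows 0 and 1
my $bottom_rows = $image->slice(":,1:-1");   # rows 1 and 2

# Row 0 vs row 1 differ by 1+2+3+4, row 1 vs row 2 differ by 0+0+0+6.
my $score = sum(abs($top_rows - $bottom_rows));

print "view dims: ", join('x', $top_rows->dims), "\n";
print "score: $score\n";
```

Both views come out as 4 columns by 2 rows, and the summed absolute difference is 10 + 6 = 16.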

Another problem for me personally writing the thing was that as soon as I added use PDL; it overrode a lot of the CORE:: perl functions like index(). Suddenly my 'Process' list search button stopped working and started giving wild errors with line numbers that didn't match the code they were talking about. At the time I had no idea that use PDL; overloaded/replaced functions like this. I spent half a day trying to figure it out before eventually someone on IRC mentioned PDL did this and I realized that index() had been replaced with PDL::index(). Changing all my index and related calls to CORE::index fixed it, but I'd never have figured this out on my own. It stumped gemini-2.5-pro too.

            while (defined $search_iter) {
                my $display_text = lc($model->get($search_iter, 0));
                # use CORE::index to avoid conflict with PDL::index
                if (CORE::index($display_text, $search_text) != -1) {
                    $selection->select_iter($search_iter);
                    my $path = $model->get_path($search_iter);
                    $proc_tree_view->scroll_to_cell($path, undef, FALSE, 0, 0);
                    return; # Found a match, we're done.
                }
                $search_iter = $model->iter_next($search_iter);
            }
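
The clash comes down to perl letting an imported sub take over a built-in's name. Here's a minimal sketch of that general mechanism, using a made-up My::Exporter module as a stand-in rather than PDL itself:

```perl
use strict;
use warnings;

# Hypothetical stand-in for any module that exports a sub named index().
package My::Exporter {
    sub index { return "My::Exporter::index called" }

    sub import {
        my $caller = caller;
        no strict 'refs';
        # Install our sub into the caller's namespace, as an exporter would.
        *{"${caller}::index"} = \&index;
    }
}

package main;

# Roughly what "use Some::Module;" does at compile time.
BEGIN { My::Exporter->import }

# A plain index() call now resolves to the imported sub...
my $shadowed = index("hello world", "world");

# ...while CORE::index always reaches the real built-in.
my $builtin = CORE::index("hello world", "world");

print "index():       $shadowed\n";
print "CORE::index(): $builtin\n";
```

Spelling the call as CORE::index keeps it pointed at the built-in no matter what a module has imported into the package.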

p.s. the funnest feature in this new version is wildly moving the image width slider around and seeing the aliasing "animate".
