Adnan Aman
In the early 20th century, photographer Sergei Prokudin-Gorskii captured monochromatic glass plate images using separate red, green, and blue filters. These glass plates represent an important part of photographic history, and with image processing techniques we can combine the three exposures into a single full-color image.
The process began by dividing each original scan into three equal parts to isolate the blue, green, and red channels. To align these channels, I used normalized cross-correlation (NCC), which measures the similarity between two images by normalizing their pixel values and computing the dot product of the resulting vectors. This score provides an effective way to estimate how closely two color channels are aligned.
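A minimal NumPy sketch of this metric (the `ncc` name and the zero-mean normalization are illustrative choices, not necessarily my exact implementation):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized images.

    Each image is flattened, shifted to zero mean, scaled to unit norm,
    and the score is the dot product of the two resulting vectors
    (1.0 means identical up to brightness/contrast).
    """
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b)))
```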
To improve the accuracy of the alignment, I cropped a small portion from the edges of each channel. This step was crucial for avoiding interference from unwanted details, such as borders or noise, that could bias the NCC score. After cropping, I tested a range of displacements in both the x and y directions and kept the shift that maximized the NCC score, aligning the green and red channels to the blue channel to create the final color image.
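The search itself can be sketched as a brute-force loop over candidate shifts. This is only an illustration building on the `ncc` sketch above; the `align_exhaustive` name, the default window, and the `border` crop fraction are assumptions:

```python
import numpy as np

def align_exhaustive(channel, reference, max_shift=15, border=0.1):
    """Try every (dy, dx) shift in a window and return the one that
    maximizes NCC against the reference channel.

    `border` crops a fraction of each edge so plate borders and noise
    do not dominate the score.
    """
    h, w = reference.shape
    ch, cw = int(h * border), int(w * border)
    crop = lambda img: img[ch:h - ch, cw:w - cw]

    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(crop(shifted), crop(reference))
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```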
However, as image sizes increased, the alignment process became more computationally expensive. I
initially experimented with testing larger displacement ranges, but these proved too slow for
high-resolution images. To overcome this issue, I implemented an image pyramid approach. This method
involves progressively downscaling the image, performing alignment at a lower resolution, and then
refining the results as the image is scaled back up. By applying this technique, I reduced the search
space at each level, which drastically improved the efficiency of the alignment process for larger
images.
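A coarse-to-fine version of the same idea can be written recursively. The sketch below reuses `align_exhaustive` from above and assumes `skimage.transform.rescale` for downsampling; the size threshold and refinement window are illustrative values:

```python
import numpy as np
import skimage.transform as sktr

def align_pyramid(channel, reference, window=15):
    """Estimate the shift on a downscaled copy, then double it and
    refine with a small search window at the current resolution."""
    # Base case: the image is small enough for a full exhaustive search.
    if min(reference.shape) < 400:
        return align_exhaustive(channel, reference, max_shift=window)

    # Recurse on half-resolution copies of both channels.
    small_c = sktr.rescale(channel, 0.5, anti_aliasing=True)
    small_r = sktr.rescale(reference, 0.5, anti_aliasing=True)
    dy, dx = align_pyramid(small_c, small_r, window)

    # Scale the coarse estimate back up, then refine it locally.
    dy, dx = 2 * dy, 2 * dx
    shifted = np.roll(channel, (dy, dx), axis=(0, 1))
    rdy, rdx = align_exhaustive(shifted, reference, max_shift=2)
    return dy + rdy, dx + rdx
```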
In some cases, such as the church image, the basic alignment method struggled to produce accurate results. To address this, I refined the process further by cropping the image before applying the NCC metric, so that the score focused on the central areas of the image rather than the edges. Prioritizing the critical features in the center during alignment improved accuracy and reduced artifacts, particularly for images that initially posed challenges.
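That cropping step can be expressed as a small helper; the name `center_crop` and the 50% default fraction are assumptions for illustration:

```python
def center_crop(img, frac=0.5):
    """Keep only the central `frac` of the image in each dimension,
    so borders and edge artifacts do not influence the NCC score."""
    h, w = img.shape
    dh, dw = int(h * (1 - frac) / 2), int(w * (1 - frac) / 2)
    return img[dh:h - dh, dw:w - dw]
```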
Through this refined approach, I achieved a balance between accuracy and efficiency. The use of smaller
displacement ranges, combined with the pyramid scaling technique, allowed for faster computations
without sacrificing quality. By gradually refining the alignment from a low-resolution version of the
image up to the full resolution, the algorithm avoided the need to test every possible displacement at
the highest resolution, making the process much more manageable.
| Image | Displacement (G, R) (x, y) | Runtime |
|---|---|---|
|  | G: (2, 5), R: (3, 12) | <1 sec |
|  | G: (4, 25), R: (-4, 58) | 16.9 sec |
|  | G: (24, 49), R: (56, 104) | 16.6 sec |
|  | G: (16, 60), R: (13, 124) | 16.5 sec |
|  | G: (17, 41), R: (23, 89) | 17.3 sec |
|  | G: (9, 54), R: (12, 116) | 17 sec |
|  | G: (11, 82), R: (13, 178) | 16.8 sec |
|  | G: (2, 3), R: (2, -3) | <1 sec |
|  | G: (26, 51), R: (36, 108) | 17.6 sec |
|  | G: (-11, 33), R: (-27, 140) | 17.2 sec |
|  | G: (29, 79), R: (37, 176) | 17 sec |
|  | G: (14, 53), R: (11, 112) | 16.7 sec |
|  | G: (3, 3), R: (3, 6) | <1 sec |
|  | G: (6, 43), R: (32, 87) | 17.4 sec |
|  | G: (-12, 46), R: (-41, 86) | 16.30 sec |
|  | G: (-16, 32), R: (-25, 78) | 17.88 sec |
|  | G: (10, 26), R: (11, 70) | 16.03 sec |