Adnan Aman
In the early 20th century, photographer Sergei Prokudin-Gorskii captured monochromatic glass plate images using separate red, green, and blue filters. These glass plates represent an important part of photographic history, and with image processing techniques we can combine the three exposures into a single full-color image.
The process began by dividing each original scan into three equal parts to isolate the blue, green, and red channels. To align these channels, I used normalized cross-correlation (NCC), which measures the similarity between two images by normalizing their pixel values and computing the dot product of the resulting vectors. This score provides an effective way to estimate how closely two color channels are aligned.
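A minimal NumPy sketch of this metric (the `ncc` name and the zero-mean normalization are illustrative choices, not necessarily my exact implementation):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized images.

    Each image is flattened, shifted to zero mean, scaled to unit norm,
    and the score is the dot product of the two resulting vectors
    (1.0 means identical up to brightness/contrast).
    """
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b)))
```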
To improve the accuracy of the alignment, I cropped a small portion from the edges of each channel. This step was crucial for avoiding interference from unwanted details, such as borders or noise, that could bias the NCC score. After cropping, I tested a range of displacements in both the x and y directions and kept the shift that maximized the NCC score, aligning the green and red channels to the blue channel to create the final color image.
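The search itself can be sketched as a brute-force loop over candidate shifts. This is only an illustration building on the `ncc` sketch above; the `align_exhaustive` name, the default window, and the `border` crop fraction are assumptions:

```python
import numpy as np

def align_exhaustive(channel, reference, max_shift=15, border=0.1):
    """Try every (dy, dx) shift in a window and return the one that
    maximizes NCC against the reference channel.

    `border` crops a fraction of each edge so plate borders and noise
    do not dominate the score.
    """
    h, w = reference.shape
    ch, cw = int(h * border), int(w * border)
    crop = lambda img: img[ch:h - ch, cw:w - cw]

    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(crop(shifted), crop(reference))
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```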
However, as image sizes increased, the alignment process became more computationally expensive. I
initially experimented with testing larger displacement ranges, but these proved too slow for
high-resolution images. To overcome this issue, I implemented an image pyramid approach. This method
involves progressively downscaling the image, performing alignment at a lower resolution, and then
refining the results as the image is scaled back up. By applying this technique, I reduced the search
space at each level, which drastically improved the efficiency of the alignment process for larger
images.
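A coarse-to-fine version of the same idea can be written recursively. The sketch below reuses `align_exhaustive` from above and assumes `skimage.transform.rescale` for downsampling; the size threshold and refinement window are illustrative values:

```python
import numpy as np
import skimage.transform as sktr

def align_pyramid(channel, reference, window=15):
    """Estimate the shift on a downscaled copy, then double it and
    refine with a small search window at the current resolution."""
    # Base case: the image is small enough for a full exhaustive search.
    if min(reference.shape) < 400:
        return align_exhaustive(channel, reference, max_shift=window)

    # Recurse on half-resolution copies of both channels.
    small_c = sktr.rescale(channel, 0.5, anti_aliasing=True)
    small_r = sktr.rescale(reference, 0.5, anti_aliasing=True)
    dy, dx = align_pyramid(small_c, small_r, window)

    # Scale the coarse estimate back up, then refine it locally.
    dy, dx = 2 * dy, 2 * dx
    shifted = np.roll(channel, (dy, dx), axis=(0, 1))
    rdy, rdx = align_exhaustive(shifted, reference, max_shift=2)
    return dy + rdy, dx + rdx
```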
In some cases, such as the church image, the basic alignment method struggled to produce accurate results. To address this, I refined the process further by cropping the image before applying the NCC metric, so that the score focused on the central areas of the image rather than the edges. Prioritizing the critical features in the center during alignment improved accuracy and reduced artifacts, particularly for images that initially posed challenges.
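That cropping step can be expressed as a small helper; the name `center_crop` and the 50% default fraction are assumptions for illustration:

```python
def center_crop(img, frac=0.5):
    """Keep only the central `frac` of the image in each dimension,
    so borders and edge artifacts do not influence the NCC score."""
    h, w = img.shape
    dh, dw = int(h * (1 - frac) / 2), int(w * (1 - frac) / 2)
    return img[dh:h - dh, dw:w - dw]
```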
Through this refined approach, I achieved a balance between accuracy and efficiency. The use of smaller
displacement ranges, combined with the pyramid scaling technique, allowed for faster computations
without sacrificing quality. By gradually refining the alignment from a low-resolution version of the
image up to the full resolution, the algorithm avoided the need to test every possible displacement at
the highest resolution, making the process much more manageable.
| Image | Displacement (G, R) (x, y) | Runtime |
|---|---|---|
|  | G: (2, 5), R: (3, 12) | <1 sec |
|  | G: (4, 25), R: (-4, 58) | 16.9 sec |
|  | G: (24, 49), R: (56, 104) | 16.6 sec |
|  | G: (16, 60), R: (13, 124) | 16.5 sec |
|  | G: (17, 41), R: (23, 89) | 17.3 sec |
|  | G: (9, 54), R: (12, 116) | 17 sec |
|  | G: (11, 82), R: (13, 178) | 16.8 sec |
|  | G: (2, 3), R: (2, -3) | <1 sec |
|  | G: (26, 51), R: (36, 108) | 17.6 sec |
|  | G: (-11, 33), R: (-27, 140) | 17.2 sec |
|  | G: (29, 79), R: (37, 176) | 17 sec |
|  | G: (14, 53), R: (11, 112) | 16.7 sec |
|  | G: (3, 3), R: (3, 6) | <1 sec |
|  | G: (6, 43), R: (32, 87) | 17.4 sec |
|  | G: (-12, 46), R: (-41, 86) | 16.30 sec |
|  | G: (-16, 32), R: (-25, 78) | 17.88 sec |
|  | G: (10, 26), R: (11, 70) | 16.03 sec |