Image Colorization - CS 639


Authors

Project Source Code

https://github.com/danieldmiller/image-colorization

Presentation


Project Description

Image colorization is the task of adding color to black-and-white (grayscale) images. Approaches fall into two main categories: interactive and automatic colorization. In an interactive method, a user manually maps colors to individual pixels; this is extremely labor-intensive, prone to human error, and the result depends heavily on the user. An automatic method takes one or several images as input and transfers color from a reference image to the target image. A key challenge in automatic methods is parsing through large numbers of inputs and parameters.

In this project, we implement image colorization using feature transfer from similar images. The idea is to transfer color from one image to another, under the constraint that the two images are semantically similar. The method extracts superpixels from the reference (input) image and computes features for each, such as mean intensity, standard deviation, Gabor features, and Speeded-Up Robust Features (SURF).

Motivation

Our motivation came from World War II photographs. We tend to think of WW2 entirely in terms of grayscale photos and videos, and we wanted to bring some color perspective to those photos.

Current State of the Art Coloring Techniques

Automatic Techniques

These techniques use a pre-trained coloring model to transfer color to an image. The state-of-the-art techniques use deep learning and deep convolutional neural networks (CNNs).

While this technique has become very accurate over the years, it requires a lot of computing power. Depending on the number of layers in the network, the program can be slowed down significantly: roughly speaking, more layers improve accuracy but reduce speed, while fewer layers run faster at some cost in accuracy.


Below is an example result from a state-of-the-art CNN method (Colorful Image Colorization). Take note of the grass color seeping into the handle of the mallet: the CNN mistook the handle for grass.

Neural net results

Weakness:

Requires huge data sets to train properly. Backpropagation in CNNs is tricky and can slow a system down.

Manual Techniques

Requires the user to manually map colors onto an image. The state of the art is Adobe Photoshop.

Weakness:

Tedious and tiresome. The output is heavily dependent on the person doing the coloring.

Problem we are trying to solve

Automatic image colorization is underconstrained: infinitely many colors can map to a given pixel. With manual coloring this problem is less prevalent, but manually coloring even 20 images is labor-intensive.

Deep learning and deep CNNs require a lot of computational power and can bottleneck an average machine. We are trying to implement a method that is direct and doesn't require much computation.

We are trying to find the right constraint for defining a map between target and reference regions of the images. Without proper constraints, transferring color would be essentially random. Texture is one feature that can constrain the problem: if two regions of an image share similar texture, they are likely similar in color. For example, two forts built in the same time period and the same region will have similarities: their window textures will match, and they will probably share the same style of architecture and similar styles of garden. Likewise, two passport photos of different people will share features such as eyes, nose, and potentially hair (if both have hair); from a human's standpoint, it is obvious that we can map hair color from one person to the other. Beyond this, we also need to find such regions regardless of their orientation and direction.

Report

Approach

We used a computationally lightweight approach to color an image. It takes a reference image similar to the target image and maps color based on similar texture. This technique differs from deep learning in that it is a direct, procedural method: deep learning adjusts its weights after learning from each image, and needs a huge data set to do so. Here, all we need are two images: one to take color from, and one to color.

Based On:

Paper: Image Colorization Using Similar Images

Citation: R. K. Gupta, A. Y. Chia, D. Rajan, E. S. Ng, and H. Zhiyong. Image Colorization Using Similar Images. https://people.cs.clemson.edu/~jzwang/ustc13/mm2012/p369-gupta.pdf

Method

Let R be the RGB reference image and T be the grayscale target.

- Extract n superpixels from each image.

- For each superpixel, extract Gabor features and SURF descriptors.

- Cascade-match the features to find the right reference superpixel for each target superpixel.

- Use image-space voting to filter out mismatched areas.
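
The sketch below shows how these steps could fit together in Python. The helper names are ours and are fleshed out in the sections that follow; image-space voting is omitted since it remains future work.

```python
# Hypothetical end-to-end pipeline; each helper is sketched in the
# corresponding section below. A sketch, not our exact implementation.
def colorize(reference_rgb, target_gray, n_segments=400):
    ref_labels = extract_superpixels(reference_rgb, n_segments)  # step 1
    tgt_labels = extract_superpixels(target_gray, n_segments)
    ref_feats = mean_gabor_features(reference_rgb, ref_labels)   # step 2
    tgt_feats = mean_gabor_features(target_gray, tgt_labels)
    matches = cascade_match(tgt_feats, ref_feats)                # step 3
    # Seed micro-scribbles at superpixel centers; color is then spread
    # by the optimization-based algorithm of Levin et al.
    return place_scribbles(target_gray, tgt_labels,
                           reference_rgb, ref_labels, matches)
```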

Intermediate Step Results

Below are the intermediate results that our algorithm produces

Uncolorized (Grayscale) Target Images

uncolorized fort image
Grayscale Fort
uncolorized flower image
Grayscale Flower

Superpixel Extraction

Superpixels are areas of an image that are similar in characteristics; working at the superpixel level speeds up computation. We found that the greater the number of superpixels, the more accurate the color mapping.

superpixel image
Superpixels of a fort
superpixel image
Superpixels of a flower
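
Below is a minimal sketch of this step using scikit-image's SLIC segmentation. The segment count and compactness values here are assumptions, not necessarily our exact settings.

```python
from skimage.color import gray2rgb
from skimage.segmentation import slic

def extract_superpixels(image, n_segments=400):
    """Return an integer label map assigning each pixel to a superpixel."""
    if image.ndim == 2:
        image = gray2rgb(image)  # SLIC is run on a 3-channel image here
    return slic(image, n_segments=n_segments, compactness=10, start_label=0)
```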

Gabor Features

Gabor features help with texture analysis of an image. We aim to find similar superpixels using Gabor features: if two superpixels have similar mean Gabor responses, they are likely similar in texture.

gabor features for fort
Some of the Gabor features for Fort
gabor features for flower
Some of the Gabor features for Flower
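
Here is a sketch of how mean Gabor features per superpixel can be computed with scikit-image; the particular filter bank (frequencies and orientation count) is an assumption on our part.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.filters import gabor

def mean_gabor_features(image, labels,
                        frequencies=(0.1, 0.25, 0.4), n_orientations=4):
    """Return an (n_superpixels, n_filters) array of mean Gabor magnitudes."""
    gray = rgb2gray(image) if image.ndim == 3 else image
    responses = []
    for f in frequencies:
        for k in range(n_orientations):
            real, imag = gabor(gray, frequency=f,
                               theta=k * np.pi / n_orientations)
            responses.append(np.hypot(real, imag))  # magnitude response
    n_sp = labels.max() + 1
    counts = np.bincount(labels.ravel(), minlength=n_sp)
    feats = np.zeros((n_sp, len(responses)))
    for j, r in enumerate(responses):
        # Sum each filter response over every superpixel, then average.
        sums = np.bincount(labels.ravel(), weights=r.ravel(), minlength=n_sp)
        feats[:, j] = sums / counts
    return feats
```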

SURF

Speeded-Up Robust Features (SURF) behave much like SIFT features, but are faster to compute and highly discriminative. The method detects interest points using a determinant-of-Hessian blob detector, and uses square-shaped filters as an approximation of Gaussian smoothing. The descriptor provides a unique and robust description of an image feature, in our case the intensity distribution.
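
For reference, a sketch of SURF extraction with OpenCV. SURF is patented and lives in the contrib xfeatures2d module, so this only works with an opencv-contrib build that enables the non-free algorithms.

```python
import cv2

def surf_descriptors(gray_uint8, hessian_threshold=400):
    """Detect interest points with the determinant-of-Hessian detector
    and return their SURF descriptors (requires a non-free OpenCV build)."""
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=hessian_threshold)
    keypoints, descriptors = surf.detectAndCompute(gray_uint8, None)
    return keypoints, descriptors
```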

Cascade Feature Mapping

Let C be the set of all reference superpixels.

For every superpixel t in T (the grayscale target):

- Compute the Euclidean distance between the mean Gabor features of t and those of c, for every c in C.

- If this distance is below a certain error threshold, add c to a list of reference superpixels close to the target superpixel in texture.

In the second level, we intend to use SURF to halve the number of candidate superpixels for each target superpixel t. Finally, a function f narrows the candidates down to a single superpixel:

f = w * d(g_r, g_t), where w is a predefined weight and d(g_r, g_t) is the Euclidean distance between the target and reference Gabor features. The pair of superpixels minimizing f is taken as the match.

This function can be extended to incorporate SURF by adding another weight and distance term, and any other feature can be incorporated in the same way.

The weight we chose for Gabor is arbitrary for now, because with a single feature we are simply taking the minimum; we have set it to 0.2 for when we incorporate SURF.
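
Putting the cascade together, here is a sketch of the matching step as described above. The error threshold is an assumed placeholder, the SURF level is left stubbed out, and w is the 0.2 weight mentioned above.

```python
import numpy as np

def cascade_match(tgt_feats, ref_feats, gabor_error=1.0, w_gabor=0.2):
    """For each target superpixel, return the index of its matched
    reference superpixel, or -1 if no candidate passes the first level."""
    matches = np.full(len(tgt_feats), -1, dtype=int)
    for i, g_t in enumerate(tgt_feats):
        # Level 1: keep reference superpixels within the Gabor error.
        d = np.linalg.norm(ref_feats - g_t, axis=1)
        candidates = np.flatnonzero(d < gabor_error)
        if candidates.size == 0:
            continue
        # Level 2 (future work): use SURF distances to halve `candidates`.
        # Final level: minimize f = w * d(g_r, g_t) over the candidates.
        matches[i] = candidates[np.argmin(w_gabor * d[candidates])]
    return matches
```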

In order to transfer color from the reference superpixel to the target, chromatic color values are transferred between the centers of these superpixels and then spread across the image with an optimization-based colorization algorithm.

Micro-Scribble Colorization

The optimization-based algorithm we use to spread out the color values is a micro-scribble approach. After running the Gabor match for each superpixel in the (uncolorized) target image to find its best-matched superpixel in the reference image, we draw a micro-scribble at the center of that superpixel on the target image, using the mean color value of the matched reference superpixel.
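
A sketch of the scribble-placement step follows; the propagation itself is handled by the Colorization Using Optimization code linked in the references. This assumes the reference and target images share the same intensity range.

```python
import numpy as np

def place_scribbles(target_gray, tgt_labels, reference_rgb,
                    ref_labels, matches):
    """Return the grayscale target as RGB with one small colored dot
    (micro-scribble) stamped at the center of each matched superpixel."""
    scribbled = np.dstack([target_gray] * 3).astype(float)
    for t, r in enumerate(matches):
        if r < 0:
            continue  # superpixel left unmatched by the cascade
        mean_color = reference_rgb[ref_labels == r].mean(axis=0)
        ys, xs = np.nonzero(tgt_labels == t)
        cy, cx = int(ys.mean()), int(xs.mean())  # superpixel center
        scribbled[max(cy - 1, 0):cy + 2, max(cx - 1, 0):cx + 2] = mean_color
    return scribbled
```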

scribble fort image
Micro-scribbles of a fort
scribble flower image
Micro-scribbles of a flower

Final Coloration Results

Here are the final colorization results we achieved, using only Gabor features.


result for fort
Final Coloration of fort using Gabor Features

result for flower
Final Coloration of flower using Gabor Features

Discussion

We were able to successfully transfer color using the cascade step for Gabor features. The result was not accurate when coloring from a different image: it seems that SURF and intensity features are needed to get better estimates of the reference superpixel for each target superpixel, because Gabor texture analysis alone is not a sufficient constraint.

While we were unable to implement the research paper fully, we found that color transfer is possible with this method, just not very accurate. If we can add more layers of features to narrow down the reference superpixel, we could develop a better program. Again, image colorization is underconstrained, and Gabor and SURF features still do not properly constrain the problem.

Color is heavily dependent on the reference image. If the reference flower were green, the target flower would come out green. In practice, a human would need to pick a proper reference image.

There are a lot of color smudges that spill beyond where they should be, which is why we need an image-space voting algorithm or a median filter.

Right now, the only constraint on our color-transfer map is Gabor features, i.e. similarity in texture. Transferring color from an image to its own grayscale counterpart is very accurate, because the two images have the same superpixels and hence the same Gabor features: for each target superpixel there is a reference superpixel whose feature distance is 0.

We took this method from the referenced research paper and modified it to the best of our understanding. It is not a fully defined method yet, so the output also depends on our skill level.

Pros

- Far less data-intensive than neural networks (i.e. it doesn't need libraries with pre-trained data sets).

- Conceptually a lot easier to understand.

- Faster (though it will eventually slow down as more layers of features are added).

Cons

- Harder to implement than a Neural Network

- Getting a higher-accuracy result needs more features to narrow down the map, which somewhat defeats the reason we chose this method: more features means more data elements per superpixel.

- Color is heavily dependent on the reference image.

- As of now, deep nets provide better methods to color a grayscale image. Even online image-coloring sites would be more efficient.

Overall, this method could prove better than CNNs and deep learning, but only if implemented right. The method is tricky and requires decent knowledge of how colors work in an image.

As of now, the state of the art wins.

Challenges

This was harder than we thought. The method is conceptually easy to understand, but gets very complicated in its technicality.

We overcame a challenge with mapping Gabor features per superpixel. Initially we were getting no results out of the cascade step: we were computing Gabor features per image rather than per superpixel. Once we found this bug, we changed our method to build a proper per-superpixel map.

We are using a coloring scheme we do not fully understand. Transferring color is tricky: initially we transferred the RGB channels directly, but this just copied a superpixel to another location. Felipe told us that color is related to the low frequencies of an image, so filtering out the higher frequencies could give us color. We intend to research this method more in the future.

It is difficult to work with superpixels. We eventually figured out how to extract them from an image, but at first glance it is hard to extract non-rectangular regions. Our first method divided the image into grids; this was easy to implement, but with a shifted reference image (or any reference other than the colored version of the target) it produced no results, in part because the grids may be laid out such that no target grid cell matches any reference grid cell.

Future Work

Finish up SURF and add other features.

Color transfer by filtering out the higher frequencies of an image (suggested by Felipe). If we can successfully implement this, we won't need micro-scribbles to transfer color.

References

Paper: Image Colorization Using Similar Images

Citation: R. K. Gupta, A. Y. Chia, D. Rajan, E. S. Ng, and H. Zhiyong. Image Colorization Using Similar Images. https://people.cs.clemson.edu/~jzwang/ustc13/mm2012/p369-gupta.pdf


Paper: SURF: Speeded Up Robust Features

Citation: H. Bay, A. Ess, T. Tuytelaars, and L. V. Gool. Surf: Speeded up robust features. Computer Vision and Image Understanding (CVIU), 110(3):346–359, 2008


Paper: Colorization Using Optimization

Citation: A. Levin, D. Lischinski, and Y. Weiss. Colorization using optimization. ACM Transactions on Graphics, 23(3):689–694, 2004.


Wikipedia: Gabor features and SURF

Gabor citation: https://en.wikipedia.org/wiki/Gabor_filter

SURF citation: https://en.wikipedia.org/wiki/Speeded_up_robust_features


Microscribbles:

Colorization using Optimization

Microscribble implementation used:

Colorization using optimization github page


State of the art for image colorization:

Colorful Image Colorization

Image Colorization with CNN

Image Colorization using Deep Learning