The objective of the user study was to subjectively assess
the quality of color transfer methods. To this end, we first computed 120 color transfer results
using 6 different color transfer methods. The set of input and target images
consisted of indoor and outdoor images, macro images, and multi-color and monochromatic images.
We developed an online platform for the user study (as shown in the figure below). The study
was run on a Linux server. The users were presented with triplets of input, result and target images.
The input and the target images were shown first, and several seconds later, the result was displayed.
That way, the users were given the time to imagine the final result and compare their expectation
to the displayed result. This procedure was chosen 1) to make the user evaluation easier and 2) to
better study the intent of the users. We simply asked the users to evaluate how close their
expectation was to the presented result. The user's expectation combines many
factors: the quality of the color transfer, the quality of the final result, the representation
of the target image style, the contrast, the content and the specifics of the triplet of images, etc.
Our main goal was to capture as many of these factors as possible and to summarize them into a single
score.
In our user study, we included an image triplet, which we called a baseline. The baseline
consists of input and target images and a result whose colors are dissimilar to the target colors.
We proceeded this way so that we could detect untrustworthy users. Moreover, the users were
first encouraged to familiarize themselves with the platform and the form of evaluation by trying out
a short demo (before the actual study). Finally, we asked the users for some personal information (age, expertise, etc.).
The analysis of the data from the user study can be found in the paper.
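As an illustration of how the baseline triplet could be used to screen participants, the following is a minimal sketch and not the procedure used in the study. It assumes that each user's ratings are stored in a dictionary keyed by triplet identifier, that higher scores mean the result is closer to the user's expectation, and that a user who rates the baseline above a chosen threshold is discarded. The function name `trustworthy_users`, the identifier `"baseline"`, the rating scale and the threshold are all hypothetical.

```python
from typing import Dict, List


def trustworthy_users(
    ratings: Dict[str, Dict[str, float]],
    baseline_id: str,
    max_baseline_score: float,
) -> List[str]:
    """Return the users whose baseline rating is at most max_baseline_score.

    ratings maps a user name to that user's scores per triplet identifier.
    The baseline result is deliberately dissimilar to the target, so a high
    baseline score is treated here (by assumption) as a sign of unreliability.
    """
    kept = []
    for user, scores in ratings.items():
        baseline_score = scores.get(baseline_id)
        # Users who never rated the baseline cannot be checked, so skip them.
        if baseline_score is None:
            continue
        if baseline_score <= max_baseline_score:
            kept.append(user)
    return sorted(kept)


# Hypothetical ratings on an assumed 1-to-5 scale (illustrative values only).
example_ratings = {
    "alice": {"baseline": 1.0, "triplet_01": 4.0},
    "bob": {"baseline": 5.0, "triplet_01": 5.0},
    "carol": {"baseline": 2.0, "triplet_01": 3.0},
    "dave": {"triplet_01": 4.0},
}

print(trustworthy_users(example_ratings, "baseline", 2.0))
```

With these illustrative values, `bob` is excluded for rating the baseline highly and `dave` is excluded because no baseline rating is available. The threshold is a design choice that would need to be fixed before looking at the remaining scores.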
The following figure presents several color transfer results and their perceptual scores, computed using the proposed metric.
The following 10 scores (SSIM, luminance similarity, color similarity, saliency, brightness, lightness, chroma, colorfulness, saturation and the scores from the user study) for each of the 120 results can be found here.