Let's start from a particular nominal solution and work backwards. Suppose we had some information about the composition which took place. If we assume that the foreground is a uniform color with uniform opacity, what information would help us?
- a mask describing the foreground region, or information which could be used to recreate the mask (e.g. the font name and size)
- the foreground color
- the foreground opacity
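All of this rests on the assumption that the text was composed by simple uniform alpha blending, i.e. that each pixel in the watermarked region obeys

R = FG.*alpha + BG.*(1-alpha)

where R is the observed image, FG is the foreground color, alpha is the opacity, and BG is the original background. Everything that follows is just a rearrangement of this equation.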
The proof of concept and its limitations
So let's say we have a mask, and we know the foreground color and opacity. We could back-calculate the original background image like so.
inpict = imread('https://www.mathworks.com/matlabcentral/answers/uploaded_files/499168/image.jpeg');
mask = im2double(imread('textmask.png'));    % soft mask of the text region
% FG and alpha are the estimated foreground color and opacity; the values
% here are illustrative, and their estimation is covered below
FG = permute([0.72 0.72 0.72],[1 3 2]);
alpha = 0.6;
R = im2double(inpict);                       % the observed composite
BG = (R - FG.*alpha)./(1-alpha);             % invert the compositing equation
BG = BG.*mask + R.*(1-mask);                 % replace only the masked region
imshow(BG,'border','tight')
It's not perfect, but it is a lot better. If you look closely, you'll notice that much of the remaining error surrounds the text. In other words, our ability to recover the original background regions is limited by the damage done by JPG compression. This is a problem both because it adds error to the background estimation and because artifacts of the composed region extend outside the mask.
We could try to fudge the rest with some masked filtering, though it's questionable whether that makes things better or worse. If you're going to slap another bunch of text over the top of it, it might not be objectionable.
mask = imdilate(mask,strel('disk',5));   % grow the mask to cover the JPG fringing
filtered = medfilt3(BG,[11 11 1]);       % median filter each channel independently
BG = filtered.*mask + BG.*(1-mask);      % blend filtered pixels into the masked area
imshow(BG,'border','tight')
Finding FG color and opacity
Now that we've shown that limited success is possible, how can we obtain those preconditions? Let's start with obtaining the color and opacity parameters. Let's say we sample some points from the image where we can assume that the original gradient is small. In a short time, we can build two color tables describing colors represented in the given image (R) and their assumed color in the original background image (BG).
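For example, the sample pairs might be collected interactively. This is only a sketch of one hypothetical workflow, where inpict is the watermarked image loaded earlier; the corresponding BG entry is whatever you judge the original color to have been, sampled from a nearby unwatermarked pixel.

figure; imshow(inpict)
[x,y] = ginput(1);              % click a point inside the text region
Rsample = impixel(inpict,x,y)   % observed (composited) color at that point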
In order to calculate the foreground color, we can invert the compositing equation as FG = (R - BG.*(1-alpha))./alpha, but note that we know neither FG nor the opacity alpha.
If we can assume that the foreground color is a neutral gray, we can find the opacity without yet having knowledge of the FG color. This can be done simply by picking an opacity, calculating FG, and then checking the average FG color's saturation. By any means appropriate, minimize the observed FG saturation. At that point, we have estimates for both FG and opacity.
R = uint8([255 255 255; 217 217 217; 169 177 224; 139 165 214; 153 179 216;
165 179 224; 148 167 223; 255 255 255; 220 212 199; 219 212 134;
224 210 135; 212 218 216; 255 255 255; 219 209 137; 221 170 167]);
BG = uint8([255 255 255; 255 255 255; 78 121 234; 8 91 243; 58 113 240;
85 125 249; 34 96 232; 255 255 255; 255 242 175; 254 233 13;
253 233 0; 245 247 246; 255 255 255; 252 228 0; 240 88 84]);
np = 100; opacityvec = linspace(0.3,0.8,np);    % the opacities to test
R0 = im2double(R); BG0 = im2double(BG);         % the math needs floating point
for k = np:-1:1                                 % reverse loop preallocates meansat
    FG = (R0 - BG0.*(1-opacityvec(k)))./opacityvec(k);   % trial inversion
    [~,S,~] = imsplit(rgb2hsv(permute(FG,[1 3 2])));
    meansat(k) = mean(S(:));                    % mean saturation of the trial FG
end
[~,idx] = min(meansat);                         % the most neutral FG wins
opacity = opacityvec(idx)
FG = mean((R0 - BG0.*(1-opacity))./opacity,1)   % average FG color estimate
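The coarse grid search above is simple, but the same idea can be phrased as a continuous 1-D minimization. This is a sketch using fminbnd; the local function would need to live at the end of the script or in its own file.

costfn = @(a) satcost(a,im2double(R),im2double(BG));
opacity = fminbnd(costfn,0.3,0.8);

function s = satcost(a,R0,BG0)
    FG = (R0 - BG0.*(1-a))./a;       % trial inversion for opacity a
    hsvFG = rgb2hsv(permute(FG,[1 3 2]));
    s = mean(hsvFG(:,:,2),'all');    % mean saturation of the trial FG colors
end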
You might notice that the estimates obtained here differ slightly from the ones used in the example above. In practice, you'll probably want to use more than the small handful of test points used in this example. For the sake of compactness, I'm not using the full color tables here.
Getting a mask
So now we know the foreground color and opacity. Now all we need is a mask describing the FG region. Extracting this directly from the image is going to be very difficult. The resolution is limited, and the JPG compression has done its damage. This limits our ability to use color information (e.g. local changes in saturation) to isolate the text. Given the poor FG-BG contrast near the white BG areas, we should expect our results to be poor in those areas. A combination of color selections could be used, and the rest touched up manually using polygon/freehand ROI tools (or even manual brushwork in an external editor).
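As a starting point, a crude color-based selection might look something like the following sketch. The thresholds here are guesses that exploit the fact that the gray text desaturates the colored regions it covers; they would need tuning, and the result would still need manual cleanup.

hsvpict = rgb2hsv(im2double(inpict));
S = hsvpict(:,:,2); V = hsvpict(:,:,3);
mask0 = (S < 0.25) & (V > 0.5) & (V < 0.95);   % desaturated, mid-bright pixels
mask0 = bwareaopen(mask0,20);                  % discard small speckles
imshow(mask0)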
Another possibility is to reconstruct the mask by discerning the font face and estimating the other settings to obtain a best-fit mask. That would need to be done in an external editor, as MATLAB has no practical tools for the job. It would also require familiarity with the subtle variations in some of the most mundane fonts.
I don't have that familiarity or a large selection of fonts, so I opted to use a combination of color-based masks, brushwork, and character replication to construct a crude mask.
The punch line
So we can obtain an estimate of the original background image if we invest the time and effort to estimate the unknown elements of the original composition. Those estimates take a significant amount of time, judgment, and manual effort to produce. For compressed images, the best possible outcomes will be marginal, and well below likely expectations. After all, unless we filter the result, the presence of the original watermark text is still plainly noticeable, and the text itself is still plainly readable.
Are there alternative solutions? Yes. You could simply block out the text and place your text atop the block. You could even just blur that region of the image and do the same.
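If covering the text is acceptable, the blur option is only a couple of lines; a sketch, reusing the mask from before:

blurred = imgaussfilt(im2double(inpict),8);        % heavy gaussian blur
out = blurred.*mask + im2double(inpict).*(1-mask); % blur only the masked region
imshow(out,'border','tight')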
Alternatively, you could just go find the original source image and use that.
That's right. This took me less than five minutes to find on my super-slow connection. I'm sure anyone else could have done so in less than a minute.
I feel I should also point out that "open-source" doesn't mean the same thing as "I found it somewhere". Additionally, since the source image is available, the watermark on the original image accomplishes nothing of value. Similarly, adding your watermark accomplishes nothing. If you're worried that someone might use your images, then I must point out:
- they aren't your images
- people can just get the originals
- if people are too lazy to do that, they'll just use your watermarked image anyway
So by watermarking someone else's image, you arguably encourage the laziest fraction of like-minded people to impersonate you.