New experimental feature: similar images

Chainer
Apr 04, 2021 - permalink

Did you base your approach on an existing algorithm somewhere?

No, I just messed around with various aspects of it until it was working well.

Apr 05, 2021 - permalink

Hi, great feature!

It doesn't always actually pick "similar images", but that's expected as it has no notion of content. I think some deep learning could do a lot here, I have many years of experience and would be happy to help.

To give an idea, it would be possible to train a model on a relevant task (e.g. correctly assigning tags) and then use it to extract features from the images. Once you have the image features it would be trivial to simply pick the nearest neighbors in feature space. If the model was trained properly the nearest images should be semantically and visually similar (depending on what task was used to train the model).
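To make the nearest-neighbor step concrete, here is a minimal sketch in plain Python. It assumes each image already has a feature vector from some trained model; the image ids, the tiny 3-dimensional vectors, and the function names are all made up for illustration, and cosine similarity is just one reasonable choice of distance.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest_neighbors(query_id, features, k=3):
    """Return the ids of the k images whose features are closest to query_id's."""
    query = features[query_id]
    scored = [
        (cosine_similarity(query, vec), image_id)
        for image_id, vec in features.items()
        if image_id != query_id  # never return the query image itself
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:k]]


# Hypothetical features; a real model would output hundreds of dimensions.
features = {
    "img_a": [0.9, 0.1, 0.0],
    "img_b": [0.8, 0.2, 0.1],
    "img_c": [0.0, 0.1, 0.9],
    "img_d": [0.1, 0.9, 0.2],
}

print(nearest_neighbors("img_a", features, k=2))
```

A brute-force scan like this is fine for a sketch. For a large collection you would probably precompute the features offline and use an approximate nearest-neighbor index instead.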

Apr 05, 2021 - edited Apr 05, 2021 - permalink

It doesn't always actually pick "similar images"

At first I was also expecting visually similar images. Maybe "Users also liked/favorited" would be more self-explanatory, but I don't mind the current text.

To give an idea, it would be possible to train a model on a relevant task (e.g. correctly assigning tags) and then use it to extract features from the images. Once you have the image features it would be trivial to simply pick the nearest neighbors in feature space. If the model was trained properly the nearest images should be semantically and visually similar (depending on what task was used to train the model).

I was also thinking about playing with something like this to suggest tags when I upload something.

Chainer
Apr 06, 2021 - permalink

Hi, great feature!

It doesn't always actually pick "similar images", but that's expected as it has no notion of content. I think some deep learning could do a lot here, I have many years of experience and would be happy to help.

To give an idea, it would be possible to train a model on a relevant task (e.g. correctly assigning tags) and then use it to extract features from the images. Once you have the image features it would be trivial to simply pick the nearest neighbors in feature space. If the model was trained properly the nearest images should be semantically and visually similar (depending on what task was used to train the model).

Do you have any tips for frameworks / libraries / general approaches to get started with something like this? I do have more than 0 knowledge of ML but I am not very experienced in it.

Any application of ML to GWM seems to me like it would involve heavy image processing (and probably not apply to videos).

Apr 06, 2021 - permalink

Do you have any tips for frameworks / libraries / general approaches to get started with something like this? I do have more than 0 knowledge of ML but I am not very experienced in it.

Any application of ML to GWM seems to me like it would involve heavy image processing (and probably not apply to videos).

Agreed that videos would be trickier, but for single images, running a deep model once it is trained is not very expensive. I would estimate that a regular GeForce GTX 1060 or 1070 could process at least 60 images per second. Training the model is expensive, but it can be done offline once in a while.

As for libraries, I would go with PyTorch, which is very intuitive. This link is a good starting point: transfer_learning_tutorial.html (search for it on the PyTorch website, I can't post links)

The given example is for single-label classification, but it is easy to adapt it to multiple labels (the tags). If we find a platform to collaborate on, I'd be willing to help with code.
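Roughly, the adaptation means giving each tag its own independent sigmoid output and training with binary cross-entropy, instead of a softmax over mutually exclusive classes. In PyTorch that usually means swapping a loss like `torch.nn.CrossEntropyLoss` for `torch.nn.BCEWithLogitsLoss`. The sketch below shows the same idea in plain Python so it runs without any dependencies; the tag names, logits, and threshold are made up for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def multilabel_bce(logits, targets):
    """Mean binary cross-entropy, treating every tag as its own yes/no problem.

    Uses the stable form max(z, 0) - z*t + log(1 + exp(-|z|)) per tag.
    """
    total = 0.0
    for z, t in zip(logits, targets):
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def predict_tags(logits, tag_names, threshold=0.5):
    """Keep every tag whose predicted probability reaches the threshold."""
    return [name for z, name in zip(logits, tag_names) if sigmoid(z) >= threshold]


tag_names = ["tag_a", "tag_b", "tag_c"]  # hypothetical tags
logits = [2.0, -1.5, 0.3]                # hypothetical raw model outputs
targets = [1, 0, 1]                      # this image has tag_a and tag_c

print(predict_tags(logits, tag_names))
print(multilabel_bce(logits, targets))
```

Unlike the single-label case, any number of tags (including none) can be predicted for one image, which is what you want for tagging.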

Chainer
Apr 13, 2021 - permalink

I updated this a bit:

  • I changed the weights the algorithm uses to result in slightly better matches (as subjectively eyeballed by me)
  • I lowered the limit for images where the link shows up to a score of 10 (down from 25). Since that's also the lowest possible score that shows up in the results, you can now go "infinite" by clicking through to any result, and then clicking "Similar images" under that pic.
  • If you run this on an image you favorited, it no longer uses your favorites in computing the result. This was problematic when running on an image with a low score (say, 10), that you also favorited, since too many of the results would be repeats of your favorites and thus boring. This also means different people can see slightly different results for a particular picture, if at least one of them has favorited it.
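To illustrate the last point only: the actual scoring and weights are not described here, so the sketch below uses a plain co-favorite count as a stand-in. The function name, the user names, and the data are all hypothetical. The part that matters is that the viewer's own favorites list is skipped when the results are computed.

```python
from collections import Counter


def similar_images(image_id, favorites_by_user, viewer=None, limit=5):
    """Rank other images by how many users favorited them along with image_id.

    The viewer's own favorites are left out of the count, so results are not
    dominated by images the viewer has already favorited.
    """
    counts = Counter()
    for user, favorites in favorites_by_user.items():
        if image_id not in favorites:
            continue
        if user == viewer:
            continue  # skip the viewer's own favorites
        for other in favorites:
            if other != image_id:
                counts[other] += 1
    # Sort by count (descending), then by id so ties come out in a fixed order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [img for img, _ in ranked[:limit]]


# Hypothetical data: which images each user has favorited.
favorites_by_user = {
    "alice": {"p1", "p2", "p3"},
    "bob": {"p1", "p4"},
    "carol": {"p1", "p4", "p5"},
}

print(similar_images("p1", favorites_by_user))                  # anonymous viewer
print(similar_images("p1", favorites_by_user, viewer="alice"))  # alice favorited p1
```

In this toy data the two calls return different lists, which matches the note above that different people can see slightly different results for the same picture.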