New experimental feature: similar images

Chainer★

Apr 04, 2021 - permalink

Did you base your approach on an existing algorithm somewhere?

No, I just messed around with various aspects of it until it was working well.

testimony

Apr 05, 2021 - permalink

Hi, great feature!

It doesn't always actually pick "similar images", but that's expected as it has no notion of content. I think some deep learning could do a lot here, I have many years of experience and would be happy to help.

To give an idea, it would be possible to train a model on a relevant task (e.g. correctly assigning tags) and then use it to extract features from the images. Once you have the image features it would be trivial to simply pick the nearest neighbors in feature space. If the model was trained properly the nearest images should be semantically and visually similar (depending on what task was used to train the model).
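A minimal sketch of the nearest-neighbor step described above, assuming each image already has a feature vector from some trained model. The feature values, image ids, and the choice of cosine similarity are made up for illustration; nothing here reflects how the site actually works.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_images(query_id, features, k=3):
    """Return the ids of the k images closest to query_id in feature space."""
    query = features[query_id]
    scored = [
        (cosine_similarity(query, vec), image_id)
        for image_id, vec in features.items()
        if image_id != query_id
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:k]]


# Hypothetical 3-dimensional features; a real model would produce
# hundreds or thousands of dimensions per image.
features = {
    "img_a": [0.9, 0.1, 0.0],
    "img_b": [0.8, 0.2, 0.1],
    "img_c": [0.0, 0.1, 0.9],
    "img_d": [0.1, 0.9, 0.1],
}

print(nearest_images("img_a", features, k=2))
```

A brute-force scan like this is fine for a small collection; a large one would probably need an approximate nearest-neighbor index instead.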

Imagine★

Apr 05, 2021 - edited Apr 05, 2021 - permalink

It doesn't always actually pick "similar images"

At first I was also expecting visually similar images. Maybe "Users also liked/favorited" would be more self-explanatory, but I don't mind the current text.

To give an idea, it would be possible to train a model on a relevant task (e.g. correctly assigning tags) and then use it to extract features from the images. Once you have the image features it would be trivial to simply pick the nearest neighbors in feature space. If the model was trained properly the nearest images should be semantically and visually similar (depending on what task was used to train the model).

I was also thinking about playing with something like this to suggest tags when I upload something.

Chainer★

Apr 06, 2021 - permalink

Hi, great feature!

It doesn't always actually pick "similar images", but that's expected as it has no notion of content. I think some deep learning could do a lot here, I have many years of experience and would be happy to help.

To give an idea, it would be possible to train a model on a relevant task (e.g. correctly assigning tags) and then use it to extract features from the images. Once you have the image features it would be trivial to simply pick the nearest neighbors in feature space. If the model was trained properly the nearest images should be semantically and visually similar (depending on what task was used to train the model).

Do you have any tips for frameworks / libraries / general approaches to get started with something like this? I do have more than 0 knowledge of ML but I am not very experienced in it.

Any application of ML to GWM seems to me like it would involve heavy image processing (and probably not apply to videos).

testimony

Apr 06, 2021 - permalink

Do you have any tips for frameworks / libraries / general approaches to get started with something like this? I do have more than 0 knowledge of ML but I am not very experienced in it.

Any application of ML to GWM seems to me like it would involve heavy image processing (and probably not apply to videos).

Agreed that videos would be trickier, but for single images, running a deep model once it has been trained is not very expensive. I would estimate that a regular GeForce 1060 or 1070 could process at least 60 images per second. Training the model is expensive, but it can be done offline once in a while.

As for libraries, I would go with PyTorch, which is very intuitive. This link is a good starting point: transfer_learning_tutorial.html (search for it on the PyTorch website; I can't post links).

The given example is for single-label classification, but it is easy to adapt it to multiple labels (the tags). If we find a platform to collaborate on, I'd be willing to help with code.
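To illustrate the single-label to multi-label change, here is a rough sketch in plain Python: each tag gets its own sigmoid and its own binary cross-entropy term, instead of one softmax over all classes. In PyTorch the corresponding loss is `torch.nn.BCEWithLogitsLoss`. The tag names, logits, and the 0.5 threshold are assumptions chosen only for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def multilabel_loss(logits, targets):
    """Mean binary cross-entropy, treating every tag as an independent yes/no."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total += -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
    return total / len(logits)


def predict_tags(logits, tag_names, threshold=0.5):
    """Keep every tag whose predicted probability reaches the threshold."""
    return [name for z, name in zip(logits, tag_names) if sigmoid(z) >= threshold]


# Hypothetical model outputs for one image with three possible tags.
tag_names = ["tag_a", "tag_b", "tag_c"]
logits = [2.0, -1.5, 0.3]
targets = [1, 0, 1]  # the image really has tag_a and tag_c

print(predict_tags(logits, tag_names))
print(multilabel_loss(logits, targets))
```

Because the tags are scored independently, an image can end up with several tags or none, which a softmax classifier cannot express.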

Chainer★

Apr 13, 2021 - permalink

I updated this a bit:

I changed the weights the algorithm uses to result in slightly better matches (as subjectively eyeballed by me)
I lowered the limit for images where the link shows up to a score of 10 (down from 25). Since that's also the lowest possible score that shows up in the results, you can now go "infinite" by clicking through to any result, and then clicking "Similar images" under that pic.
If you run this on an image you favorited, it no longer uses your favorites in computing the result. This was problematic when running on an image with a low score (say, 10), that you also favorited, since too many of the results would be repeats of your favorites and thus boring. This also means different people can see slightly different results for a particular picture, if at least one of them has favorited it.

ourgang

Apr 23, 2021 - permalink

I have been using this a lot and I love it! Great idea! :)

Chainer★

Apr 24, 2021 - permalink

I updated this so now it returns up to 5,000 results per image rather than 150 (with pagination). Results might get less consistent as you get to higher page numbers though.

Chainer★

Apr 30, 2021 - permalink

I made an update to try to counteract some of the bias where it would return mainly images from the same time period as the starting image, especially for newer images. So if you do it with a newer image, ideally you should now see a better mix of new and slightly older pics.

goodoc2001

Apr 30, 2021 - permalink

Thanks Chainer

asqwert

Apr 30, 2021 - permalink

Thanks for all the work chainer,

Mind if I ask, how does the highest rating (this day, this week, this year, all time) work? All time is pretty self explanatory but I'm wondering if "this year" works like highest rating for the past 365 days or does it begin with the current year's first day (January 1, 2021)?

Chainer★

Apr 30, 2021 - permalink

Past X amount of time.

Elinedo

Apr 30, 2021 - permalink

It works way better than tag search! Awesome

SabreMucho555

May 02, 2021 - permalink

Nice feature!

fn1a

May 07, 2021 - permalink

I don't know your algorithm, but if I choose a distinctive image (e.g., chest flies for a lean, physique athlete), it provides similar images. To save CPU time, you may want to cache the algorithm's results. I got so many hits, I wound up going through a list of images and the list was probably recalculated several times. Good feature.

Chainer★

May 08, 2021 - permalink

The results are cached for an hour after you first load them for a particular image, so if you navigate the pages it doesn't get recomputed. (Unless you wait for an hour to click "next page".)
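A minimal sketch of that kind of time-limited cache, assuming an in-memory dictionary keyed by image id. The class name, the injectable clock, and the stand-in compute function are hypothetical; the actual site may store cached results quite differently.

```python
import time


class TTLCache:
    """Keeps a computed value for ttl_seconds after it was first computed."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the expiry can be tested
        self._store = {}

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # still fresh: reuse the stored result
        value = compute()
        self._store[key] = (now, value)  # timestamp is set only on compute
        return value


def page(results, page_number, per_page=50):
    """Slice one page (1-based) out of a cached result list."""
    start = (page_number - 1) * per_page
    return results[start:start + per_page]


cache = TTLCache(ttl_seconds=3600)
# Stand-in for the expensive similarity computation.
results = cache.get_or_compute("image_123", lambda: list(range(120)))
print(page(results, 3))  # served from the same cached list
```

The timestamp is not refreshed on reads, so the entry expires an hour after the first load regardless of how many pages are viewed in between.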

6Karrel6

Aug 08, 2021 - permalink

Don't know if that's the correct place for it, but I just want to say that I LOVE this feature! The results got better with every improvement to the algorithm and now it works like a charm. Really great work!

Reggieiv

Aug 08, 2021 - permalink

I love it.

gorgar

Dec 15, 2021 - permalink

Great feature! Would it be possible to limit results to videos when the image you like is a video? Video-specific searches would be a dream elsewhere, too.

[deleted]

Jun 09, 2025 - permalink

This feature is fucking sick. For the mods: I was wondering how you guys made it, because most of the time it's absolutely the physique I'm looking for.

Chainer★

Jun 10, 2025 - permalink

Thanks! Basically it picks some of your recent favorites, and looks at users who favorited those, and then at images those users favorited. It also tries to control for some confounders, like images having a high score (wouldn't want to show only those or that would get boring) or users who favorite tons of pics (so each of their favorites individually doesn't give as good of a signal).
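The description above could be sketched roughly like this. It is only an illustration of the general co-favoriting idea: the log damping for heavy users, the square-root penalty for popular images, the `exclude_user` parameter, and all the data are assumptions, not the weights or code the site actually uses.

```python
import math
from collections import defaultdict


def similar_by_cofavorites(seed_images, favorites, exclude_user=None, top_n=5):
    """Rank images favorited by users who also favorited the seed images.

    favorites maps a user id to the set of image ids that user favorited.
    """
    seeds = set(seed_images)

    # How many users favorited each image (a stand-in for its score).
    popularity = defaultdict(int)
    for favs in favorites.values():
        for image in favs:
            popularity[image] += 1

    scores = defaultdict(float)
    for user, favs in favorites.items():
        if user == exclude_user:
            continue  # e.g. leave out the viewer's own favorites
        overlap = len(favs & seeds)
        if overlap == 0:
            continue
        # Assumed damping: users with huge favorite lists count for less.
        user_weight = overlap / math.log(1 + len(favs))
        for image in favs - seeds:
            scores[image] += user_weight

    # Assumed penalty so very popular images do not dominate every result.
    for image in scores:
        scores[image] /= math.sqrt(popularity[image])

    ranked = sorted(scores, key=lambda image: (-scores[image], image))
    return ranked[:top_n]


# Hypothetical data: u3 favorites almost everything, u4 shares nothing with "a".
favorites = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "c"},
    "u3": {"a", "b", "c", "d", "e", "f", "g", "h"},
    "u4": {"d", "e"},
}

print(similar_by_cofavorites(["a"], favorites, top_n=3))
print(similar_by_cofavorites(["a"], favorites, exclude_user="u2", top_n=3))
```

Leaving one user's favorites out changes the ranking, which is consistent with different people seeing slightly different results for the same picture.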