PicArrange 3.0 now finds images with words
07/24/2024

Find your photos with words

Watch the video on YouTube

How does PicArrange’s visual and text search work?

Over the years, research has continually improved image descriptors, enabling efficient and compact representation of the content and appearance of images. A significant advancement came with OpenAI’s CLIP model, which embeds textual descriptions of images and their visual representations into a shared latent space.
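The idea behind such a shared latent space is that a text query and an image can be compared directly by embedding both and measuring their similarity. The following minimal sketch illustrates this with the open-source OpenCLIP library; the model name, checkpoint, and file paths are illustrative examples and not necessarily what PicArrange uses internally.

```python
import torch
from PIL import Image
import open_clip

# Load a pretrained CLIP model (example checkpoint, for illustration only)
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

# Embed a few images into the shared latent space (placeholder file names)
image_paths = ["photo1.jpg", "photo2.jpg", "photo3.jpg"]
images = torch.stack([preprocess(Image.open(p)) for p in image_paths])

with torch.no_grad():
    image_features = model.encode_image(images)
    image_features /= image_features.norm(dim=-1, keepdim=True)

    # Embed the text query into the same space
    tokens = tokenizer(["dinner in a fast food restaurant"])
    text_features = model.encode_text(tokens)
    text_features /= text_features.norm(dim=-1, keepdim=True)

# Cosine similarity between the query and every image; higher means a better match
similarities = (text_features @ image_features.T).squeeze(0)
for idx in similarities.argsort(descending=True):
    print(f"{image_paths[idx]}: {similarities[idx]:.3f}")
```

Ranking a photo library by this similarity score is what makes text-to-image search possible without any manual tagging.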

However, the CLIP model has its limitations. For instance, the same text can describe vastly different-looking images. A phrase like “dinner in a fast food restaurant” might refer to an image of a typical fast food dish or one of people eating in a fast food restaurant. This ambiguity in visual concepts hampers the effectiveness of image searches using the CLIP model.

To address this issue, we developed a method to fine-tune CLIP models. This enhancement significantly improves retrieval quality while preserving the effectiveness of the joint text-image embedding for text-to-image searches. By leveraging this technique, PicArrange offers a more accurate and reliable visual search experience, ensuring that you can quickly find the exact images you need based on both visual content and textual descriptions.

Project page

Download PicArrange