Google Image Search Explained

Here are the questions we hear most. How does it work? How does the computer know that two images are similar?

The principle is easy to understand. It relies on what Dr. Neal Krawetz describes as a fast algorithm.

More specifically, the key technique is called a "perceptual hash algorithm." It generates a "fingerprint" string for each image and then compares the fingerprints. The closer two fingerprints are, the more similar the two images are.

Below is a simple implementation:

Step 1: Downsize the image.

Downsize the image to 8x8 pixels, 64 pixels in total. This step removes the detail in the image and retains only the basic structure and the light and shade, which discards differences caused by image size and aspect ratio.
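
As a rough illustration of the downsizing step, the sketch below shrinks an image by averaging blocks of pixels. It assumes the image is already loaded as a list of rows of (r, g, b) tuples and is at least 8 pixels in each dimension. The function name `downsize` and the block-averaging approach are illustrative choices, not something the article prescribes; a real program would normally call an imaging library's resize function instead.

```python
def downsize(pixels, size=8):
    """Shrink a grid of (r, g, b) tuples to size x size by block averaging.

    Assumption: `pixels` is a list of equal-length rows and both dimensions
    are at least `size`, so every block contains at least one pixel.
    """
    height, width = len(pixels), len(pixels[0])
    small = []
    for row in range(size):
        # Range of source rows that map onto this output row.
        top, bottom = row * height // size, (row + 1) * height // size
        out_row = []
        for col in range(size):
            # Range of source columns that map onto this output column.
            left, right = col * width // size, (col + 1) * width // size
            block = [pixels[y][x]
                     for y in range(top, bottom)
                     for x in range(left, right)]
            # Average each colour channel over the block.
            out_row.append(tuple(sum(ch) // len(block) for ch in zip(*block)))
        small.append(out_row)
    return small
```

Because every output pixel is an average over a region, fine detail disappears and only the coarse structure survives, whatever the size of the input.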

Step 2: Simplify the color.

Convert the downsized image to 64-level grayscale. In other words, every pixel takes one of only 64 gray values.
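
One possible way to do this conversion for a single pixel is sketched below. The article does not say how the gray value is computed, so the luma weights used here (the common 0.299 / 0.587 / 0.114 weighting) and the function name `to_gray64` are assumptions for illustration only.

```python
def to_gray64(rgb):
    """Map an (r, g, b) pixel with 0-255 channels to a gray level in 0-63."""
    r, g, b = rgb
    # Assumed weighting: integer form of the 0.299 / 0.587 / 0.114 luma weights.
    gray = (299 * r + 587 * g + 114 * b) // 1000
    # Dropping the two lowest bits reduces 256 levels to 64 levels.
    return gray >> 2
```

Applying this function to each of the 64 pixels of the thumbnail gives the 64 gray values used in the next steps.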

Step 3: Calculate the average value.

Calculate the average grayscale value of all 64 pixels.

Step 4: Compare the pixel grayscale.

Compare the grayscale of each pixel with the average value. If the grayscale is greater than or equal to the average, record a 1; otherwise, record a 0.
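
Steps 3 and 4 together can be sketched in a few lines. The function name `threshold_bits` is hypothetical, and the input is assumed to be a flat list of gray values (64 of them for an 8x8 thumbnail).

```python
def threshold_bits(grays):
    """Return one bit per pixel: 1 if the pixel is at or above the mean, else 0."""
    mean = sum(grays) / len(grays)
    return [1 if g >= mean else 0 for g in grays]
```

Note that a pixel exactly equal to the average is recorded as 1, as described in Step 4.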

Step 5: Calculate the hash value.

Combine the comparison results from the previous step into a 64-bit integer, which is the "fingerprint" of the image. The order in which the bits are combined does not matter, as long as every image uses the same order.
Example fingerprint (in hexadecimal): 8f373714acfcf4d0
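
A minimal sketch of this packing step follows. It assumes the first bit in the list becomes the most significant bit, which is only one of many valid orders; the names `bits_to_fingerprint` and `fingerprint_hex` are hypothetical.

```python
def bits_to_fingerprint(bits):
    """Pack a sequence of 0/1 values into one integer, first bit most significant."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def fingerprint_hex(fingerprint):
    """Format a 64-bit fingerprint as 16 lowercase hexadecimal digits."""
    return format(fingerprint, "016x")
```

Sixty-four bits fit exactly into 16 hexadecimal digits, which is why a fingerprint can be written as a string such as 8f373714acfcf4d0.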

Once you have the fingerprints, you can compare two images by counting how many of the 64 bits differ. This count is the "Hamming distance" between the two fingerprints. If fewer than 5 bits differ, the two images are similar; if more than 10 bits differ, the two images are different.
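
The comparison can be sketched as follows, assuming the fingerprints are stored as Python integers. The thresholds 5 and 10 come from the text above; the label "uncertain" for distances in between is an assumption added here, since the article does not say how to treat that range.

```python
def hamming_distance(a, b):
    """Count the bit positions in which two fingerprints differ."""
    # XOR leaves a 1 exactly where the two values disagree.
    return bin(a ^ b).count("1")


def classify(a, b):
    """Apply the article's thresholds to a pair of fingerprints."""
    distance = hamming_distance(a, b)
    if distance < 5:
        return "similar"
    if distance > 10:
        return "different"
    return "uncertain"  # assumed label; the article leaves 5-10 undefined
```

For example, `classify(0x8f373714acfcf4d0, 0x8f373714acfcf4d1)` reports "similar", because the two values differ in a single bit.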

For a concrete implementation, see imgHash.py, written in Python by Wote. The code is short (only 52 lines). The first argument is the benchmark image and the second is the directory containing the other images to compare against. The result is the number of differing bits between the two images (the Hamming distance).
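
The snippet below is not imgHash.py. It is only a sketch of the same kind of comparison, assuming the fingerprints have already been computed and stored in a dictionary keyed by file name. The function name `rank_by_distance` and the dictionary layout are hypothetical.

```python
def rank_by_distance(benchmark, candidates):
    """Return (name, distance) pairs sorted from most to least similar.

    `benchmark` is the fingerprint of the reference image and `candidates`
    maps each file name to that image's fingerprint (assumed layout).
    """
    distances = {
        name: bin(benchmark ^ fingerprint).count("1")
        for name, fingerprint in candidates.items()
    }
    return sorted(distances.items(), key=lambda item: item[1])
```

Sorting by distance puts the closest matches to the benchmark image first.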

The advantage of this algorithm is that it is simple and fast, regardless of the size of the image. Its disadvantage is that the image's content cannot change: if you add some text to the image, the algorithm no longer recognizes it. Its best use is therefore to locate the original picture from a thumbnail.

In practice, more robust algorithms such as pHash and SIFT are used, because they can recognize modified images. They can match the original image as long as the deformation is less than 25 percent. These algorithms are more complicated, but they follow the same principle as the simple algorithm explained above, namely converting the image to a hash string and then making the comparison.

See! Not that complex after all.

Interesting? Click here to read about two more image search methods.

Time: 2024-09-27 16:42:22
