30 Jun, 2008
Have you ever forgotten the exact address of a site that you wanted to visit? Not a problem - just type the name of the site into the Google search box and hopefully it appears at the top of the search results page.
We call this “teleporting”, and we’re pleased that we have been able to reduce the need to remember an alphabet soup of .coms, .nets, and .orgs. However, one of the trends we noticed while studying teleporting was that many searchers would type the name of a specific website as if they wanted to teleport, but would then immediately issue another, more refined search within that site.
For example, someone looking for official information about the Hubble Space Telescope on the NASA website might first search for [NASA] and then for [NASA Hubble Telescope], like this:
Through experimentation, we found that presenting users with a search box as part of the result increases their likelihood of finding the exact page they are looking for. So over the past few days we have been testing, and today we have fully rolled out, a search box that appears within some of the search results themselves. The search box now appears when we detect a high probability that a user wants more refined search results within a specific site. Like the rest of our snippets, the sites that display the site search box are chosen algorithmically based on metrics that measure how useful the search box is to users.
We hope that you will make use of the site search box in order to get the information you’re looking for as quickly and easily as possible.
30 Jun, 2008
The main goal of our AJAX APIs team is to provide developers with the tools needed to create the next generation of great web applications. Our 20% goal is world peace. What better way to help further both objectives than to launch a Language API?
The API helps developers automatically translate content in their applications. Users of those applications will have an easier time communicating across language boundaries.
The Language API provides both translation and language detection. Here’s an example of the translation tool in action:
You can play around with the language detection capabilities via this example widget:
For more information on how to use the Language API in your code, please refer to the documentation here.
http://googleblog.blogspot.com/2008/03/new-google-ajax-language-api-tools-for.html
30 Jun, 2008
Google has just begun supporting Unicode 5.1, less than one month after it was released. It’s available in search, so people who use languages such as Malayalam can now search for words containing the characters that are new in Unicode 5.1.
Web pages can use a variety of different character encodings, like ASCII, Latin-1, Windows-1252, or Unicode. Most encodings can only represent a few languages, but Unicode will handle anything from Chinese to French to Arabic. We have long used Unicode as the internal format for all the text we search: any other encoding is first converted to Unicode for processing. So we regularly update to each new version of Unicode (and related standards like CLDR and BCP 47) to make sure we are current. Thus Unicode plays a key role in our mission.
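To make the "convert to Unicode first" idea concrete, here is a minimal sketch in Python. It is not our actual pipeline; the helper name `to_unicode` and the sample bytes are assumptions made up for illustration. It only shows that the same word stored under different encodings becomes one and the same Unicode string once decoded.

```python
def to_unicode(raw: bytes, declared_encoding: str) -> str:
    """Decode bytes in a declared encoding into a Unicode string.

    Hypothetical helper for illustration; raises UnicodeDecodeError
    if the bytes are not valid in the declared encoding.
    """
    return raw.decode(declared_encoding)


# The word "café" as it would be stored under three different encodings.
samples = {
    "latin-1": b"caf\xe9",      # é is the single byte 0xE9
    "cp1252": b"caf\xe9",       # Windows-1252 uses the same byte for é
    "utf-8": b"caf\xc3\xa9",    # UTF-8 uses two bytes for é
}

# After decoding, every sample is the same Unicode text.
decoded = {name: to_unicode(raw, name) for name, raw in samples.items()}

for name, text in decoded.items():
    print(f"{name:8} -> {text}")
```

Once everything is in one internal form, later processing no longer needs to care which encoding the page originally used.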
Uptick in native Unicode webpages
Just last December there was an interesting milestone on the web. For the first time, Unicode was the most frequent encoding we found on web pages. It overtook both ASCII and the Western European encodings and, by coincidence, passed the two within 10 days of one another. What’s more impressive than simply overtaking them is the speed with which this happened; take a look at the blue line in this graph.
You can see a long-term decline in pages encoded in ASCII (unaccented letters A through Z). More recently, there’s been a significant drop in the use of encodings covering only Western European letters (ASCII and a few accented letters like Ä, Ç, and Ø). We’re seeing similar declines in other language-specific encodings. Unicode, on the other hand, is showing a sharp increase in usage.
This is based on our indexing of web pages, and thus may vary somewhat from what other search engines find. However, the trends are pretty clear, and the continued rise in use of Unicode makes it even easier to do the processing for the many languages that we cover.
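The three categories discussed above (ASCII, Western European, and Unicode) can be illustrated with a small Python sketch. This is not how our indexing classifies pages; the function name `narrowest_encoding` is hypothetical, and Latin-1 stands in here for the Western European encodings. It simply reports the narrowest of the three that can represent a given piece of text.

```python
def narrowest_encoding(text: str) -> str:
    """Return the narrowest of ascii, latin-1, utf-8 that can encode text.

    Hypothetical helper for illustration only.
    """
    for name in ("ascii", "latin-1"):
        try:
            text.encode(name)
            return name
        except UnicodeEncodeError:
            # This encoding cannot represent some character; try the next.
            continue
    # UTF-8 can encode any Unicode text that has no lone surrogates.
    return "utf-8"


examples = ["Hello", "Ärger Ø", "മലയാളം", "中文"]
results = {text: narrowest_encoding(text) for text in examples}

for text, name in results.items():
    print(f"{text!r}: {name}")
```

Text that uses only unaccented letters fits in ASCII, accented Western European letters need at least Latin-1, and scripts such as Malayalam or Chinese fall through to Unicode.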