search engine
A search engine is a coordinated set of programs that searches for and identifies items in a database that
match specified criteria. Search engines are used to access information on the World Wide Web.
Google is the most commonly used internet search engine. Google search takes place in the following
three stages:
1. Crawling. Crawlers discover what pages exist on the web. A search engine constantly looks for new
and updated pages to add to its list of known pages. This is referred to as URL discovery. Once a
page is discovered, the crawler examines its content. The search engine uses an algorithm to choose
which pages to crawl and how often.
2. Indexing. After a page is crawled, the textual content is processed, analyzed and tagged with
attributes and metadata that help the search engine understand what the content is about. This also
enables the search engine to weed out duplicate pages and collect signals about the content, such as
the country or region the page is local to and the usability of the page.
3. Searching and ranking. When a user enters a query, the search engine searches the index for
matching pages and returns the results that appear most relevant on the search engine results
page (SERP). The engine ranks content based on a number of factors, such as the authoritativeness
of a page, backlinks to the page and the keywords the page contains.
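The three stages above can be illustrated with a toy pipeline. This is a minimal sketch under stated assumptions, not a description of how Google or any production engine works: the "web" is a hypothetical in-memory dictionary (TOY_WEB), duplicates are detected only by exact text match, and the score (term count plus half the number of inbound links) is an invented stand-in for real ranking factors. The names crawl, build_index and search are illustrative.

```python
import hashlib
import re
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example/home": ("search engines crawl the web",
                       ["a.example/about", "b.example/news"]),
    "a.example/about": ("crawlers discover new pages", ["a.example/home"]),
    "b.example/news": ("search engines rank pages",
                       ["a.example/home", "b.example/copy"]),
    "b.example/copy": ("search engines rank pages", []),  # duplicate of /news
}


def crawl(seed, web):
    """Stage 1: breadth-first URL discovery starting from a seed URL."""
    seen, queue, order = {seed}, deque([seed]), []
    while queue:
        url = queue.popleft()
        if url not in web:  # unknown or unreachable page
            continue
        order.append(url)
        for link in web[url][1]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


def build_index(urls, web):
    """Stage 2: tokenize pages, skip exact duplicates, build an inverted index."""
    index = defaultdict(dict)   # term -> {url: term count}
    inbound = defaultdict(int)  # url -> number of links pointing at it
    seen_hashes = set()
    for url in urls:
        text, links = web[url]
        for link in links:
            inbound[link] += 1
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen_hashes:  # same text already indexed elsewhere
            continue
        seen_hashes.add(digest)
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term][url] = index[term].get(url, 0) + 1
    return index, inbound


def search(query, index, inbound):
    """Stage 3: score matching pages (assumed formula) and sort best first."""
    scores = defaultdict(float)
    for term in re.findall(r"[a-z0-9]+", query.lower()):
        for url, count in index.get(term, {}).items():
            scores[url] += count + 0.5 * inbound[url]
    # Highest score first; ties broken alphabetically by URL.
    return sorted(scores, key=lambda u: (-scores[u], u))


urls = crawl("a.example/home", TOY_WEB)
index, inbound = build_index(urls, TOY_WEB)
results = search("search engines", index, inbound)
print(results)
```

In this toy data the duplicate page is crawled but never indexed, so it cannot appear in the results. A real engine would use near-duplicate detection and far richer signals than the two combined here.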
Specialized content search engines are more selective about the parts of the web they crawl and index.
For example, Creative Commons Search is a search engine for content shared explicitly for reuse under
a Creative Commons license. This search engine only looks for that specific type of content.
Country-specific search engines may prioritize websites presented in the native language of the country
over English websites. Individual websites, such as large corporate sites, may use a search engine to
index and retrieve only content from that company's site. Some of the major search engine companies
license or sell their search engines for use on individual sites.
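One way a site-specific search engine can limit itself to a single company's content is to filter candidate URLs by host before crawling or indexing them. The sketch below shows that idea only; the function name in_scope and the example.com URLs are hypothetical, and commercial site-search products may scope their crawls differently.

```python
from urllib.parse import urlsplit


def in_scope(url, allowed_host):
    """Return True if the URL's host is allowed_host or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    allowed = allowed_host.lower()
    return host == allowed or host.endswith("." + allowed)


# Hypothetical URLs discovered while crawling.
candidates = [
    "https://www.example.com/products",
    "https://example.com/about",
    "https://example.org/blog",
    "https://notexample.com/",
]

# Keep only pages that belong to the company's own site.
site_urls = [u for u in candidates if in_scope(u, "example.com")]
print(site_urls)
```

Matching on the full host name, or on a suffix that begins with a dot, avoids accepting look-alike domains such as notexample.com.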