How search engines organize the web


03/01/1999


If you've ever used a search engine for research on the web, you know that the number of matching web sites returned can vary dramatically depending on the search terms and the search engine used. For example, if you enter the word "automation" on the Yahoo! search engine, you get a list of 96 categories and 2,688 specific sites to peruse. If you enter the same word on Excite's site, you get no fewer than 79,254 individual sites. The same search on Lycos' site returns only 3 categories and about 100 individual sites, selected by how closely they match the word "automation."

How search engines work

Why such a difference between the sites? It comes down to how each search engine chooses to find and keep track of the web sites out there. Search engines do not search the entire World Wide Web. That would be impossible, given the number of web sites in existence and the fact that new sites are added every day. Instead, most search engines use robots to go out onto the Internet and collect the URLs and descriptions of as many web sites as possible. They are not robots in the true sense of the word; rather, they are computer programs designed to automatically locate and index URLs, along with the URLs referenced within those pages. Like web surfing itself, following links can seem like an endless process, which is why it must be automated.
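To make the idea of a robot following "the URLs referenced within" concrete, here is a minimal sketch in Python. It is not how any particular search engine works. The FAKE_WEB dictionary, the example.com addresses, and the crawl function are hypothetical names invented for illustration, and the "web" is a small in-memory table so the example runs without a network connection.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical stand-in for the web: URL -> page HTML.
FAKE_WEB = {
    "http://example.com/": '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    "http://example.com/a.html": '<a href="/b.html">B</a> <a href="/c.html">C</a>',
    "http://example.com/b.html": '<a href="/">home</a>',
    "http://example.com/c.html": "no links here",
    "http://example.com/orphan.html": "nothing links to this page",
}


def crawl(seed_urls, fetch, max_pages=100):
    """Visit pages breadth-first from the seeds; return URLs in visit order."""
    queue = deque(seed_urls)
    seen = set(seed_urls)  # URLs already queued, so no page is visited twice
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page could not be retrieved
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


found = crawl(["http://example.com/"], FAKE_WEB.get)
print(found)
```

Starting from the home page, the sketch reaches the three pages linked from it, directly or indirectly, but never finds orphan.html, because no other page links to it.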

The robots, sometimes called "spiders," then index the URLs they find into a database, which is what the search engine actually searches. Because each search engine has its own robot program or programs, and each one indexes URLs differently, the number and presentation of subject-relevant web sites in their databases differ from one search engine to the next. That is not to say that any one search engine is better than the others. It all depends on how you prefer your information organized and presented.
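One common way to build a searchable database of this kind is a word-to-URL lookup table, often called an inverted index. The sketch below assumes that approach purely for illustration; the article does not say which method any given search engine uses. The PAGES data and the build_index and search functions are hypothetical.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose description contains it."""
    index = defaultdict(set)
    for url, description in pages.items():
        for word in re.findall(r"[a-z0-9]+", description.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs whose descriptions contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


# Hypothetical URLs and descriptions, as a robot might have collected them.
PAGES = {
    "http://example.com/plc": "Automation products and PLC programming tips",
    "http://example.com/motors": "Motor drives for factory automation",
    "http://example.com/recipes": "Weeknight dinner recipes",
}

index = build_index(PAGES)
print(search(index, "automation"))
print(search(index, "factory automation"))
```

The search consults only the index, never the live web, so a search engine whose robots collected a different set of pages would return a different list for the same word.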

Where to go?

Most search engines' robots or spiders start in places where they can find large numbers of URLs to visit, usually pages that contain many links. From there they can explore countless different avenues and collect URLs and site descriptions efficiently. However, because some sites are not linked from anywhere else and robots cannot reach them, most search engines also allow users to submit URLs for robots to visit, usually by filling out an online form. Together, these two methods help ensure that the search engine has a diverse collection of URLs in its database. Each search engine's database is then organized into categories to make searching easier. Users can then search the database in two ways: by entering a specific word or phrase, or by browsing the categories that interest them.
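The pieces described above (user-submitted URLs waiting for a robot visit, a database organized into categories, and the two ways of searching it) can be sketched in a few lines of Python. This is a toy model under assumed names: the Catalog class, its methods, and the sample entries are all hypothetical and do not describe any real search engine's database.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy search-engine database: URL -> (category, description)."""

    entries: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)  # submitted, not yet visited

    def submit_url(self, url):
        """Queue a user-submitted URL for a later robot visit."""
        if url not in self.entries and url not in self.pending:
            self.pending.append(url)

    def add_entry(self, url, category, description):
        """Record a visited page under a category."""
        self.entries[url] = (category, description)
        if url in self.pending:
            self.pending.remove(url)

    def search(self, word):
        """Keyword search: URLs whose description contains the word."""
        word = word.lower()
        return sorted(
            url
            for url, (_, description) in self.entries.items()
            if word in description.lower().split()
        )

    def browse(self, category):
        """Category browsing: every URL filed under the category."""
        return sorted(
            url for url, (cat, _) in self.entries.items() if cat == category
        )


catalog = Catalog()
catalog.add_entry("http://example.com/plc", "Manufacturing", "PLC automation basics")
catalog.add_entry("http://example.com/vision", "Manufacturing", "Machine vision systems")
catalog.add_entry("http://example.com/cooking", "Food", "Kitchen automation gadgets")
catalog.submit_url("http://example.com/orphan")

print(catalog.search("automation"))
print(catalog.browse("Manufacturing"))
print(catalog.pending)
```

The keyword search cuts across categories, while browsing returns everything filed under one heading, which is why the same database can answer the same interest with two different lists.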

Virtual libraries

Many of the places where robots start collecting are online "virtual libraries" dedicated to specific subjects or fields. Some virtual libraries operate exactly as a regular search engine does, differing only in the content they keep in their databases. Others simply keep an online list of links to related web sites that the user can browse, with no search option. Virtual libraries can be valuable tools for researching a particular subject, and are worth keeping in a bookmark list for frequent reference. Most can be found through the larger search engines.


Author Information

Laura Zurawski, web editor, lzurawski@cahners.com


Virtual Libraries for Manufacturing and Engineering

Manufacturing Marketplace

Control Engineering Virtual Library

1999 Automation Integrator Guide

Most Commonly Used Search Engines

Alta Vista

Excite

HotBot

InfoSeek

Lycos

WebCrawler

Yahoo!


