SEARCH ENGINE JARGON GLOSSARY
Search engine jargon used on the Internet often means more than is
obvious, or has more than one meaning.
A
Agent Name
Names that web browsers and other web-related programs use to identify
themselves. Microsoft Internet Explorer 5.5 reports:
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98), and Internet Explorer 6.0 appears as:
Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) (here with plus signs in
place of spaces, as some server logs record it),
while for Google's spider it is: Googlebot/2.1+(+http://www.googlebot.com/bot.html)
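As a rough illustration of how an agent name can be inspected, the sketch
below labels the example strings from this entry. The function name
classify_agent and the two substring checks are assumptions made for this
example only; real agent detection needs a far longer list of patterns.

```python
def classify_agent(agent_name):
    """Return a rough label for an agent name string (illustrative only)."""
    # Some server logs write spaces as plus signs; normalise them first.
    normalized = agent_name.replace("+", " ").lower()
    if "googlebot" in normalized:
        return "spider"
    if "msie" in normalized:
        return "browser"
    return "unknown"


print(classify_agent("Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"))
print(classify_agent("Googlebot/2.1+(+http://www.googlebot.com/bot.html)"))
```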
Agent Name Delivery
A web server can be set up to deliver different web pages to different
visitors, depending on which agent requests the page. Because this technique
can artificially increase the relevancy of a web site, it is treated
as spam by most search engines, especially Google.
ALT tags
The HTML IMG tag is used to place images on your website. The ALT attribute
(commonly called the ALT tag) describes those images for people who can't
see them. Search engine spiders can't see the images either. If you have
images on your website, use ALT text both for people with disabilities and
for search engine spiders; search engines also consider it important.
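One way to check a page for images without ALT text is sketched below,
using Python's standard html.parser module. The class name MissingAltFinder
and the sample HTML are made up for this example, and the check is
deliberately simple: it flags any IMG tag whose ALT text is absent or empty.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> tag that has no non-empty alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # Attribute values can be None when no value is given, hence `or ""`.
        if not (attributes.get("alt") or "").strip():
            self.missing.append(attributes.get("src"))


sample_html = '<img src="logo.gif" alt="Company logo"><img src="spacer.gif">'
finder = MissingAltFinder()
finder.feed(sample_html)
print(finder.missing)  # ['spacer.gif']
```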
Applet
A small program that is part of a web page (often written in Java and
then called a Java applet).
ASP
Active Server Pages use a server-based scripting language to provide dynamic,
database-driven content. Can also be used for cloaking.
B
Bait-and-Switch
Submitting one page to a search engine, and switching it with another
after the search engine has spidered it.
Bridge Page
See Doorway Page.
C
CGI
Common Gateway Interface is a standard interface between web server
software and other programs running on the same machine. CGI programs
can be used for processing forms, online orders and database queries on
web pages, and to provide dynamic web page content. CGI-driven web sites
can cause problems for search engines and may not be indexed correctly by
most of them, so at least a few pages of the web site should be
plain HTML.
Click through
A visitor's click on a link in a search engine's listings that leads
to the indexed site.
Cloaking
Hiding a page's content, normally done to stop the theft
of well-optimized pages, but also used to serve one page to a
search engine or directory and a different page to human visitors.
Because this technique can artificially increase the relevancy of a web
site and was abused in the past, it is treated as spam by most search engines,
especially Google (which supplies data for Yahoo, AOL, Iwon, Netscape,
Kanoodle, etc.).
Cold Fusion
A server-side program for building database-driven web pages (.cfm
extension). Can also be used by some cloaking programs.
Comment tags
HTML comment tags are used to hide text from browsers. Some search
engines ignore text between these symbols, but others index it as
if the comment tags were not there. Because comment tags, like the Keywords
tag, were heavily abused in the past, improper use can lead to a
penalty on search engines.
Crawler
See Spider
D
Description
Descriptive text which appears in a search engine's listing of a page.
Some search engines take this description from the Description meta tag,
while others generate their own description from the text on the page.
Directories use text supplied at registration or modified by editors.
Directory
A server that indexes Internet web pages and provides lists of pages suitable
for particular queries. Web pages are usually collected manually, through
user submission, and are commonly selected and categorized by editors
(ODP - dmoz.org, LookSmart, Yahoo, etc.).
Domain
A domain is a subset of Internet addresses. Typical top-level domains are
.com, .edu, .gov and .org (divided by area of use). There are also
geographic top-level domains for particular countries (.us, .uk, .ca,
.de, etc.). Keyword-rich domain names should be used if possible, and
sub folders should not be nested too deeply, as such sites can achieve
better positioning on search engines than others.
Doorway page
A page made specifically to target particular keywords or phrases. Because
this technique can artificially increase the relevancy of a web site and
was abused in the past, it is treated as spam by most search engines. Also
known as bridge pages, gateway pages, entry pages, portals or portal pages.
Dynamic content
Information on web pages that changes automatically on the basis of database
content or user information (.asp, .cfm, .cgi, .shtml). Dynamic content
can cause problems for search engines and may not be indexed correctly
by most of them, especially if the URL contains a ? (or another special)
character. At least a few pages of the web site should be plain HTML if
possible.
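A minimal sketch of how a URL could be screened for the signs of dynamic
content named in this entry: a query string after the ? character, or one
of the listed file extensions. The function name looks_dynamic and the
example URLs are hypothetical, and real search engines apply their own,
unpublished rules.

```python
from urllib.parse import urlsplit

# Extensions listed in the glossary entry above.
DYNAMIC_EXTENSIONS = (".asp", ".cfm", ".cgi", ".shtml")


def looks_dynamic(url):
    """Return True if the URL has a query string or a dynamic extension."""
    parts = urlsplit(url)
    return bool(parts.query) or parts.path.lower().endswith(DYNAMIC_EXTENSIONS)


print(looks_dynamic("http://www.example.com/about.html"))      # False
print(looks_dynamic("http://www.example.com/item.asp?id=42"))  # True
```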
F
Frames
An HTML technique that divides a web page into multiple
windows or sub windows. Framed web sites can cause problems for search
engines and may not be indexed correctly by most of them. Search
engines will usually index just the part within the <NOFRAMES> section,
so this section should include text relevant to the topic of the
web page. In general, frames should be avoided if possible.
G
Gateway Page
See Doorway page
H
Heading
The text inside HTML heading sections is taken into account by most
search engine algorithms when ordering web pages in their query results.
Hidden text
Text on a web page which is visible to search engine spiders but not
visible to human visitors (e.g. text in the same or a very similar color
as the background of the web page, an extremely small font, multiple
TITLE tags, text in HTML comments, etc.). Because this technique
can artificially increase the relevancy of a web site and was abused
in the past, it is treated as spam by most search engines.
HTML
HTML stands for Hypertext Markup Language, the markup language used for
web pages.
HTTP
HTTP stands for Hypertext Transfer Protocol - the protocol used for communication
between web servers and web browsers.
Hyperlinks
Hyperlinks link the pages of a website, documents, etc. together.
Links are used to move through a website and/or to other websites, portals,
search engines, etc. on the World Wide Web.
I
Image map
The image map consists of a set of hyperlinks attached to areas of an
image. Although most search engines should have no particular problems
following the links, it is better to provide text links as well.
Inbound Link
A hypertext link pointing to a particular web site. Search engines use
inbound links as a measure of a site's popularity when positioning
web pages in their indexes.
IP
Internet Protocol number: a unique number which consists of 4 parts separated
by dots (e.g. 64.132.199.2). Each and every machine on the Internet has
a unique IP number.
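The four-part form can be checked with Python's standard ipaddress module.
The sketch below validates a dotted address and returns its four numeric
parts; the helper name dotted_parts is an assumption for this example.

```python
import ipaddress


def dotted_parts(text):
    """Validate a dotted IPv4 address and return its four numeric parts."""
    address = ipaddress.IPv4Address(text)  # raises ValueError if malformed
    return list(address.packed)


print(dotted_parts("64.132.199.2"))  # [64, 132, 199, 2]
```

Each part is a number from 0 to 255, so a string such as 64.132.199 (only
three parts) is rejected with a ValueError.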
IP delivery
A web server can be set up to deliver different web pages to different
visitors, depending on the IP address of the client. Because this
technique, like agent name delivery, can artificially increase
the relevancy of a web site, it is treated as spam by most search engines,
especially Google.
J
Java
A programming language. Programs in Java can run on different types of
computers and/or operating systems.
Javascript
JavaScript is a simple programming language, normally interpreted on the
client computer by the web browser, used for small programming tasks within
HTML web pages. Although some search engines can index these scripts,
they can't follow the links in them.
K
Keyword
A word used in a search engine query. This term is also used for targeted
main words on a web page in search engine optimization.
Keyword density
The number of occurrences of a keyword as a percentage of all words in the
text of a web page. To be treated as a keyword by the search engines, a
word should be used more than once on a web page. Search engines use
keyword density when positioning web pages in their indexes.
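The percentage described above can be worked out as follows. This is a
minimal sketch: the function name keyword_density, the word-splitting rule
and the sample sentence are assumptions, and each search engine counts
words in its own way.

```python
import re


def keyword_density(text, keyword):
    """Return occurrences of `keyword` as a percentage of all words in `text`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return 100.0 * words.count(keyword.lower()) / len(words)


sample = "Cheap flights to Rome. Compare cheap flights and book flights online."
# "flights" appears 3 times among 11 words.
print(round(keyword_density(sample, "flights"), 1))  # 27.3
```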
Keyword Domain Name
Also called a keyword-rich domain name. The use of keywords in the URL
of a website to improve search engine positioning.
Keyword phrase
A phrase used in a search engine query. This term is also used for targeted
main phrases on a web page in search engine optimization.
Keyword stuffing
The repetition of keywords and/or phrases in META tags or anywhere else
on the web site. Because this technique can artificially increase the
relevancy of a web site and was abused in the past, it is treated as spam
by most search engines.
L
Link popularity
A measure of the quantity and quality of links pointing to a particular
web site (inbound links). Search engines use this measure when positioning
web pages in their indexes.
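The quantity half of this measure is easy to illustrate by counting inbound
links in a small, made-up set of links. The site names below are
hypothetical, and the sketch ignores link quality, which search engines
weigh in ways they do not publish.

```python
from collections import Counter

# Hypothetical (source site, target site) link pairs.
links = [
    ("a.example", "c.example"),
    ("b.example", "c.example"),
    ("c.example", "a.example"),
    ("c.example", "c.example"),  # a site's link to itself is not inbound
]

inbound = Counter(target for source, target in links if source != target)
print(inbound.most_common())  # [('c.example', 2), ('a.example', 1)]
```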
Log File
A file kept on a server in which all details of file accesses are stored.
Log file analysis can reveal much useful data about a web site's
traffic.
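A simple form of log file analysis is counting requests per page. The
sketch below assumes the widely used Common Log Format; the glossary does
not say which format a given server writes, and the sample lines and
addresses are invented for illustration.

```python
import re
from collections import Counter

# Pattern for one Common Log Format line (an assumed format).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)

sample_log = """\
192.0.2.10 - - [10/Oct/2003:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
192.0.2.11 - - [10/Oct/2003:13:56:01 -0700] "GET /glossary.html HTTP/1.0" 200 5120
192.0.2.10 - - [10/Oct/2003:13:57:12 -0700] "GET /index.html HTTP/1.0" 200 2326
"""

hits = Counter()
for line in sample_log.splitlines():
    match = LOG_PATTERN.match(line)
    if match:  # skip lines that do not fit the assumed format
        hits[match.group("path")] += 1

print(hits.most_common())  # [('/index.html', 2), ('/glossary.html', 1)]
```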
M
Meta Search Engine
A server which submits queries to many search engines and directories,
sorts the results, removes duplicates, and then summarizes all the results
(Ask Jeeves, Metacrawler, Dogpile, etc.).
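The merging step can be pictured with the sketch below, which interleaves
ranked result lists and drops duplicates. The interleaving order is an
assumption chosen for simplicity, not the method of any named meta search
engine, and the function name merge_results and the URLs are hypothetical.

```python
from itertools import zip_longest


def merge_results(*result_lists):
    """Interleave ranked URL lists, keeping only the first sighting of each."""
    merged, seen = [], set()
    # Take the first result of every engine, then the second, and so on.
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


engine_a = ["http://a.example/", "http://b.example/", "http://c.example/"]
engine_b = ["http://b.example/", "http://d.example/"]
print(merge_results(engine_a, engine_b))
```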
Meta tag
Meta tags are placed in the HTML header of a web page to provide information
that is not displayed in the page itself. The most commonly used are the
Title, Description and Keywords tags (Title is strictly a separate HTML tag,
but it is usually grouped with them). Because of past abuse, the Keywords tag
is very rarely taken into consideration for search engine positioning,
but search engines might reduce the ranking of, or even penalize, a web site
that repeats keywords in its Keywords tag.
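To show where these tags sit and how a program can read them, here is a
minimal sketch using Python's standard html.parser module. The class name
HeadInfoParser and the sample page are made up for this example.

```python
from html.parser import HTMLParser


class HeadInfoParser(HTMLParser):
    """Collect the <title> text and the name/content pairs of <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attributes.get("name"):
            self.meta[attributes["name"].lower()] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


page = """<html><head>
<title>Search Engine Glossary</title>
<meta name="description" content="Definitions of common search engine terms.">
<meta name="keywords" content="search engine, glossary, jargon">
</head><body></body></html>"""

info = HeadInfoParser()
info.feed(page)
print(info.title)
print(info.meta["description"])
print(info.meta["keywords"].split(", "))
```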
Mirror sites
Identical copies of a web site on different servers. Treated as spam by
search engines, as mirrors can artificially increase the relevancy of a
web site. There are special filters for removing multiple mirror sites
from search engine indexes (e.g. Infoseek Sniffer).
Multiple Keyword Tags
The use of more than one Keywords META tag. Because this can artificially
increase the relevancy of a web site and was abused in the past, it is
treated as spam by search engines.
Multiple Titles
The use of more than one Title tag in the header section of a web page.
Because this can artificially increase the relevancy of a web site and
was abused in the past, it is treated as spam by search engines.
O
Optimization
Quality changes to a web page's content that improve its search engine
positioning, thus helping potential customers find the web site more easily.
P
Placement
See Positioning
Positioning
Search engines and directories order web sites so that the most relevant
web sites appear first in search engine results for a particular query.
This process is called search engine positioning. This term is also used
to describe various techniques used by search engine (web site) optimizers
to help the web site rank higher in search engines.
Positioning algorithm
Each search engine or directory uses some method to compare the keywords
or phrases in a query with the content of each web page in its index,
in order to rank the pages suitably in the query results. Most search
engines use different algorithms, and develop or change them over time
to improve their listings.
Positioning techniques
Various methods of modifying a web site's content so that it is more
relevant to queries on search engines.
Q
Query
A word, or more often a group of words or a phrase, used on a search engine
or directory to find web pages or sites with appropriate content and
information on that topic.
R
Ranking
See Positioning.
Refresh tag
A meta tag used to refresh page content after a given number of seconds.
Because this technique was often abused in the past to force browsers to a
different page or site (gateway pages), it should be avoided. Most search
engines will index only the final page, but might also reduce the ranking
or even penalize the web site.
Robot
Browser-like programs, not under human control, that access web pages
and follow hypertext links. Search engine spiders are one example.
robots.txt
A text file which restricts robots' access to certain pages or sub directories
of a web site. Only robots that follow the Robots Exclusion Standard
will read and obey the commands written in this file.
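How a well-behaved robot might consult such a file can be sketched with
Python's standard urllib.robotparser module. The rules, the robot name
ExampleBot and the URLs below are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# A small, hypothetical robots.txt that closes one sub directory to all robots.
rules = """\
User-agent: *
Disallow: /private/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

print(parser.can_fetch("ExampleBot", "http://www.example.com/index.html"))      # True
print(parser.can_fetch("ExampleBot", "http://www.example.com/private/a.html"))  # False
```

Nothing in the file itself enforces these rules; a robot that ignores the
standard can still request the excluded pages.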
S
Search Engine
A server that indexes Internet web pages and provides lists of pages
suitable for particular queries. The indexes are usually generated using
spiders.
Search Term
See Query
Spamming, Spamdexing, Spoofing
Any technique that artificially increases the relevancy of a web
site, thus raising its potential position in search engine listings
but lowering the quality of the search engine's database.
The term spamming is also used for the sending of unsolicited bulk
electronic mail.
Spider
Programs that are part of a search engine and are not under
human control. They surf the web, finding and indexing web sites, their
content, keywords, text, links, etc., while storing their URLs as well.
SSI
Server Side Includes are used to add dynamically generated content to
a web page. If SSI commands are used within the HTML code, the web server
will execute them and replace them with their results.
Generally, this is considered a better solution for search engine indexing
and positioning than more complicated PL/ASP database queries.
Stop Word
Words which are ignored in search engine queries because they are so commonly
used that they make no contribution to relevancy (web, get, the, etc.).
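The effect of stop words on a query can be sketched in a few lines. The
first three words in the set come from this entry; the others, the function
name strip_stop_words and the sample query are assumptions, since every
search engine keeps its own list.

```python
# The first three words come from the entry above; the rest are assumed.
STOP_WORDS = {"web", "get", "the", "a", "of", "and"}


def strip_stop_words(query):
    """Return the query's words in lower case with stop words removed."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]


print(strip_stop_words("Get the best web hosting"))  # ['best', 'hosting']
```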
T
Title
The text in the Title tag is displayed at the top of the window by
the web browser. Title text is important both because it is displayed
in search engine listings, and because it's taken into consideration for
search engine positioning. Not to be confused with heading text within
the web page.
Traffic
The number of visitors that come to a web site over a given period of
time.
U
URL
Uniform Resource Locator (URL) is an address which can uniquely specify
any Internet resource. It begins with a scheme such as http (for web pages),
ftp (for file transfers) or mailto (for e-mail addresses).
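The parts of a URL can be pulled apart with Python's standard urllib.parse
module, as in this minimal sketch. The three example addresses are
hypothetical and only illustrate the schemes named above.

```python
from urllib.parse import urlsplit

examples = (
    "http://www.example.com/catalog/item.asp?id=42",
    "ftp://ftp.example.com/pub/readme.txt",
    "mailto:webmaster@example.com",
)

for url in examples:
    parts = urlsplit(url)
    # scheme, host, path and query string; empty when a part is absent
    print(parts.scheme, parts.netloc, parts.path, parts.query)
```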
W
Website copywriting
The writing of text specifically for a web page. It can significantly
influence search engine positioning and is therefore considered a major
part of search engine optimization.
