Search Engine Optimization (SEO)

Search Engine Definition

Search Engine Optimization (SEO) is the process of improving the volume and quality of traffic to a web site from search engines via “natural” (“organic” or “algorithmic”) search results. Usually, the earlier a site is presented in the search results, or the higher it “ranks”, the more searchers will visit that site. SEO can also target different kinds of search, including image search, local search, and industry-specific vertical search engines.

Before going into the details of how search engines work, there are a few things to consider.

  1. All of the work you do to improve your rankings does not actually guarantee results
  2. Search engines are constantly changing their policies and the way they work in attempts to stay ahead of those people that are trying to “trick the system”
  3. Search engine marketing should be part of a broader marketing strategy. Promote your site through other means to get people to the site, and do not rely solely on search engines. Blogs, forum postings and emails are just a few methods.
  4. It takes time for any changes you make to have an effect on your rankings, if they have one at all. If you need immediate short-term results, look into becoming a sponsored link.

How Search Engines Work

The term “search engine” is often used generically to describe both crawler-based search engines and human-powered directories. These two types of search engines gather their listings in radically different ways.

Crawler-Based Search Engines

Crawler-based search engines, such as Google, create their listings automatically. They “crawl” or “spider” the web, then people search through what they have found.

If you change your web pages, crawler-based search engines eventually find these changes, and that can affect how you are listed. Page titles, body copy and other elements all play a role.

The Parts Of A Crawler-Based Search Engine

Crawler-based search engines have three major elements. First is the spider, also called the crawler. The spider visits a web page, reads it, and then follows links to other pages within the site. This is what it means when someone refers to a site being “spidered” or “crawled.” The spider returns to the site on a regular basis, such as every month or two, to look for changes.

Everything the spider finds goes into the second part of the search engine, the index. The index, sometimes called the catalog, is like a giant book containing a copy of every web page that the spider finds. If a web page changes, then this book is updated with new information.

Sometimes it can take a while for new pages or changes that the spider finds to be added to the index. Thus, a web page may have been “spidered” but not yet “indexed.” Until it is indexed — added to the index — it is not available to those searching with the search engine.

Search engine software is the third part of a search engine. This is the program that sifts through the millions of pages recorded in the index to find matches to a search and rank them in order of what it believes is most relevant.
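
The three parts described above can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine is built: the in-memory PAGES dictionary stands in for the web, and the names crawl, build_index and search are made up for this example.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "/": ("welcome to our stamp site", ["/collecting", "/history"]),
    "/collecting": ("stamp collecting tips for beginners", ["/"]),
    "/history": ("the history of the postage stamp", ["/collecting"]),
    "/orphan": ("nobody links to this stamp page", []),
}

def crawl(start):
    """Spider: visit a page, read it, then follow its links."""
    seen, queue, found = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        found[url] = text
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return found

def build_index(found):
    """Index: map each word to the set of pages that contain it."""
    index = {}
    for url, text in found.items():
        for word in text.split():
            index.setdefault(word, set()).add(url)
    return index

def search(index, query):
    """Search software: return the pages that contain every query word."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)

index = build_index(crawl("/"))
print(search(index, "stamp collecting"))
print(search(index, "nobody"))
```

Note that the page at /orphan never reaches the index because no crawled page links to it, so a search for its words finds nothing. A real engine would also rank the matches rather than simply sorting them by URL.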

Human-Powered Directories

A human-powered directory, such as the Open Directory, depends on humans for its listings. You submit a short description to the directory for your entire site, or editors write one for sites they review. A search looks for matches only in the descriptions submitted.

Changing your web pages has no effect on your listing. Things that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory. The only exception is that a good site, with good content, might be more likely to get reviewed for free than a poor site.

Check out dmoz.org to see an example of a Human-Powered Directory.

“Hybrid Search Engines” Or Mixed Results – METACRAWLERS

In the web’s early days, a search engine presented either crawler-based results or human-powered listings. Today, it is extremely common for both types of results to be presented. Some search engines maintain an associated directory and also have part of their listings supplied by a third party. Some engines are known as metacrawlers because they can “crawl” through several other search engines at the same time. Metacrawlers can be very useful when you are searching because you can compare results from several engines at once.

Major Search Engines: The Same, But Different

All crawler-based search engines have the basic parts described above, but there are differences in how these parts are tuned. That is why the same search on different search engines often produces different results.

Getting listings

The leading search engines, Google, Yahoo! and Microsoft, use crawlers to find pages for their algorithmic search results. Pages that are linked from other pages already in a search engine’s index do not need to be submitted because they are found automatically. Some search engines, notably Yahoo!, operate a paid submission service that guarantees crawling for either a set fee or a cost per click. Such programs usually guarantee inclusion in the database, but do not guarantee a specific ranking within the search results. Yahoo’s paid inclusion program has drawn criticism from advertisers and competitors. Two major directories, the Yahoo Directory and the Open Directory Project, both require manual submission and human editorial review. Google offers Google Sitemaps, for which an XML feed can be created and submitted for free to help ensure that all pages are found, especially pages that aren’t discoverable by automatically following links.
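
As a rough illustration of such an XML feed, the sketch below builds a minimal sitemap file with Python’s standard library. It assumes the sitemaps.org 0.9 schema (the public successor to the original Google Sitemaps format); the function name build_sitemap and the example.com URLs are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org protocol (assumed here; check the
# current documentation of the search engine you submit to).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a sitemap XML document, as a string, listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical pages, including one that no other page links to.
print(build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/archive/1998.html",
]))
```

The resulting text would be saved as a file on your site and submitted through the search engine’s own tools. Listing a page this way helps it get found; it does not guarantee a ranking.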

Search engine crawlers may look at a number of different factors when crawling a site. Not every page is indexed by the search engines. Distance of pages from the root directory of a site may also be a factor in whether or not pages get crawled.

White hat versus black hat

SEO techniques are classified by some into two broad categories: techniques that search engines recommend as part of good design, and those techniques that search engines do not approve of and attempt to minimize the effect of, referred to as spamdexing. Some industry commentators classify these methods, and the practitioners who employ them, as either white hat SEO, or black hat SEO. White hats tend to produce results that last a long time, whereas black hats anticipate that their sites will eventually be banned once the search engines discover what they are doing.

A SEO tactic, technique or method is considered white hat if it conforms to the search engines’ guidelines and involves no deception. As the search engine guidelines are not written as a series of rules or commandments, this is an important distinction to note. White hat SEO is not just about following guidelines, but is about ensuring that the content a search engine indexes and subsequently ranks is the same content a user will see.

White hat advice is generally summed up as creating content for users, not for search engines, and then making that content easily accessible to the spiders, rather than attempting to game the algorithm. White hat SEO is in many ways similar to web development that promotes accessibility, although the two are not identical.

Black hat SEO attempts to improve rankings in ways that are disapproved of by the search engines, or involve deception. One black hat technique uses text that is hidden, either as text colored similar to the background, in an invisible div, or positioned off screen. Another method gives a different page depending on whether the page is being requested by a human visitor or a search engine, a technique known as cloaking.

Search engines may penalize sites they discover using black hat methods, either by reducing their rankings or eliminating their listings from their databases altogether. Such penalties can be applied either automatically by the search engines’ algorithms, or by a manual site review.

How Search Engines Rank Web Pages

Search for anything using your favorite crawler-based search engine. Nearly instantly, the search engine will sort through the millions of pages it knows about and present you with ones that match your topic. The matches will even be ranked, so that the most relevant ones come first.

Of course, the search engines don’t always get it right. Non-relevant pages make it through, and sometimes it may take a little more digging to find what you are looking for. But, by and large, search engines do an amazing job.

Imagine walking up to a librarian and saying, ‘travel.’ You will probably get a response full of questions, as it would be nearly impossible for someone to give you advice based on such vague information.

Unfortunately, search engines don’t have the ability to ask a few questions to focus your search, as a librarian can. They also can’t rely on judgment and past experience to rank web pages, in the way humans can.

So, how do crawler-based search engines go about determining relevancy, when confronted with hundreds of millions of web pages to sort through? They follow a set of rules, known as an algorithm. Exactly how a particular search engine’s algorithm works is a closely-kept trade secret. However, all major search engines follow the general rules below.

Location, Location, Location…and Frequency

One of the main rules in a ranking algorithm involves the location and frequency of keywords on a web page. Call it the location/frequency method, for short.

Remember the librarian mentioned above? They need to find books to match your request of “travel,” so it makes sense that they first look at books with travel in the title. Search engines operate the same way. Pages with the search terms appearing in the HTML title tag are often assumed to be more relevant than others to the topic.

Search engines will also check to see if the search keywords appear near the top of a web page, such as in the headline or in the first few paragraphs of text. They assume that any page relevant to the topic will mention those words right from the beginning.

Frequency is the other major factor in how search engines determine relevancy. A search engine will analyze how often keywords appear in relation to other words in a web page. Those with a higher frequency are often deemed more relevant than other web pages.
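
To make the location/frequency idea concrete, here is a toy scoring function in Python. It is only a sketch: the function name and the weights (3 for the title, 2 for appearing near the top, 10 times the keyword share) are invented for illustration, since real engines keep their formulas secret.

```python
import re

def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def location_frequency_score(title, body, query, early_words=50):
    """Toy relevance score; the weights 3, 2 and 10 are arbitrary."""
    title_words = tokenize(title)
    body_words = tokenize(body)
    score = 0.0
    for term in tokenize(query):
        if term in title_words:
            score += 3.0  # location: the term is in the title tag
        if term in body_words[:early_words]:
            score += 2.0  # location: the term appears near the top
        if body_words:
            # frequency: share of the body's words that are this term
            score += 10.0 * body_words.count(term) / len(body_words)
    return score

guide = location_frequency_score(
    "Travel Guide", "Travel tips for cheap travel abroad", "travel")
diary = location_frequency_score(
    "Cooking", "Recipes from my travel diary", "travel")
print(guide > diary)
```

The page with “travel” in its title and repeated in its text outscores the page that only mentions the word in passing, which is the behavior the librarian analogy describes.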

Spice In The Recipe

Now it’s time to qualify the location/frequency method described above. All the major search engines follow it to some degree, in the same way cooks may follow a standard chili recipe. But cooks like to add their own secret ingredients. In the same way, search engines add spice to the location/frequency method. Nobody does it exactly the same, which is one reason why the same search on different search engines produces different results.

To begin with, some search engines index more web pages than others. Some search engines also index web pages more often than others. The result is that no search engine has the exact same collection of web pages to search through. That naturally produces differences, when comparing their results.

Search engines may also penalize pages or exclude them from the index, if they detect search engine “spamming.” An example is when a word is repeated hundreds of times on a page, to increase the frequency and propel the page higher in the listings. Search engines watch for common spamming methods in a variety of ways, including following up on complaints from their users.
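
A crude version of such a check might look like the sketch below. The function name and the thresholds (a single word making up more than 30 percent of a page of at least 20 words) are assumptions chosen for illustration, not values any search engine has published.

```python
import re
from collections import Counter

def looks_stuffed(text, max_share=0.3, min_words=20):
    """Flag text in which one word makes up an unusually large share."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < min_words:
        return False  # too little text to judge
    _, count = Counter(words).most_common(1)[0]
    return count / len(words) > max_share

spam = "stamps " * 200
normal = ("Our shop sells rare stamps from around the world and we ship "
          "every order within two days of receiving your payment online")
print(looks_stuffed(spam), looks_stuffed(normal))
```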

Off The Page Factors

Crawler-based search engines have plenty of experience now with webmasters who constantly rewrite their web pages in an attempt to gain better rankings. Some sophisticated webmasters may even go to great lengths to “reverse engineer” the location/frequency systems used by a particular search engine. Because of this, all major search engines now also make use of “off the page” ranking criteria.

Off the page factors are those that webmasters cannot easily influence. Chief among these is link analysis. By analyzing how pages link to each other, a search engine can determine both what a page is about and whether that page is deemed to be “important” and thus deserving of a ranking boost. In addition, sophisticated techniques are used to screen out attempts by webmasters to build “artificial” links designed to boost their rankings.
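
One well-known form of link analysis is PageRank, the method Google’s founders published. The sketch below is a simplified version of that idea, not the algorithm any engine runs today; the damping value of 0.85 and the tiny four-page link graph are assumptions for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical site graph: three pages all link to "hub".
links = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
ranks = pagerank(links)
print(max(ranks, key=ranks.get))
```

The page that collects the most links ends up with the highest score, and the page it links to inherits part of that score. This is why a link from an “important” page counts for more than a link from an obscure one.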

Another off the page factor is clickthrough measurement. In short, this means that a search engine may watch what results someone selects for a particular search, then eventually drop high-ranking pages that aren’t attracting clicks, while promoting lower-ranking pages that do pull in visitors. As with link analysis, systems are used to compensate for artificial links generated by eager webmasters.
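
Here is one way the clickthrough idea could be sketched. Everything in it is an assumption made for illustration: the function name, the made-up counts, and the rule that a page must be shown at least 100 times before its click rate is trusted.

```python
def rerank_by_clicks(results, shown, clicked, min_shown=100):
    """Reorder results by click-through rate (clicks / times shown).

    Pages shown fewer than min_shown times keep their position,
    because their click rate is not yet reliable.
    """
    def is_measured(url):
        return shown.get(url, 0) >= min_shown

    def click_rate(url):
        return clicked.get(url, 0) / shown[url]

    measured = [u for u in results if is_measured(u)]
    ranked = iter(sorted(measured, key=click_rate, reverse=True))
    # Fill the measured pages' slots in click-rate order.
    return [next(ranked) if is_measured(u) else u for u in results]

results = ["a", "b", "c"]                  # current ranking, best first
shown = {"a": 1000, "b": 1000, "c": 10}    # hypothetical counts
clicked = {"a": 10, "b": 200, "c": 9}
print(rerank_by_clicks(results, shown, clicked))
```

Page “b” attracts far more clicks than “a” when both are shown, so it moves ahead. Page “c” stays where it is because ten showings are too few to judge.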

Search Engine Basics

Search engines are one of the primary ways that Internet users find Web sites. That’s why a Web site with good search engine listings may see a dramatic increase in traffic.

Everyone wants those good listings. Unfortunately, many Web sites appear poorly in search engine rankings or may not be listed at all because they fail to consider how search engines work.

In particular, submitting to search engines (as covered in the Essentials section) is only part of the challenge of getting good search engine positioning. It’s also important to prepare a Web site through “search engine optimization.”

Search engine optimization means ensuring that your Web pages are accessible to search engines and are focused in ways that help improve the chances they will be found.

This next section provides information, techniques and a good grounding in the basics of search engine optimization. By using this information where appropriate, you may tap into visitors who previously missed your site.

The guide is not a primer on ways to trick or “spam” the search engines. In fact, there are not any “search engine secrets” that will guarantee a top listing. But there are a number of small changes you can make to your site that can sometimes produce big results.

Let’s go forward and first explore the two major ways search engines get their listings; then you will see how search engine optimization can especially help with crawler-based search engines.

Search Engine Placement Tips

A query on a crawler-based search engine often turns up thousands or even millions of matching web pages. In many cases, only the ten most “relevant” matches are displayed on the first page.

Naturally, anyone who runs a web site wants to be in the “top ten” results. This is because most users will find a result they like in the top ten. Being listed 11 or beyond means that many people may miss your web site.

The tips below will help you come closer to this goal, both for the keywords you think are important, and for phrases you may not even be anticipating.

Pick Your Target Keywords

How do you think people will search for your web page? The words you imagine them typing into the search box are your target keywords.

For example, say you have a page devoted to stamp collecting. Anytime someone types “stamp collecting,” you want your page to be in the top ten results. Accordingly, these are your target keywords for that page.

Each page in your web site will have different target keywords that reflect the page’s content. For example, say you have another page about the history of stamps. Then “stamp history” might be your keywords for that page.

Your target keywords should always be at least two words long. Usually, too many sites will be relevant for a single word, such as “stamps.” This “competition” means your odds of success are lower. Don’t waste your time fighting the odds. Pick phrases of two or more words, and you’ll have a better shot at success.

Position Your Keywords

Make sure your target keywords appear in the crucial locations on your web pages. The page’s HTML title tag is most important. Failure to put target keywords in the title tag is the main reason why perfectly relevant web pages may be poorly ranked. More about the title tag can be found on the How HTML Meta Tags Work page.

Build your titles around the top two or three phrases that you would like the page to be found for. The titles should be relatively short and attractive. Think of newspaper headlines. With a few words, they make you want to read a story. Similarly, your page titles are like headlines for your pages. They appear in search engine listings, and a short, attractive title may help encourage users to click through to your site.

Search engines also like pages where keywords appear “high” on the page, as described more fully on the Search Engine Ranking page. To accommodate them, use your target keywords for your page headline, if possible. Have them also appear in the first paragraphs of your web page.
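
If you want to check your own pages for this, a short script can report whether a target phrase appears in the title, the headline and the first paragraph. The sketch below uses Python’s standard HTML parser; the class and function names are made up, and it assumes the headline is an h1 element and the opening text sits in the first p element.

```python
from html.parser import HTMLParser

class PlacementParser(HTMLParser):
    """Collects the text of <title>, <h1> and the first <p> element."""

    def __init__(self):
        super().__init__()
        self.text = {"title": "", "h1": "", "p": ""}
        self.current = None
        self.seen_p = False

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self.current = tag
        elif tag == "p" and not self.seen_p:
            self.current = "p"
            self.seen_p = True

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current:
            self.text[self.current] += data

def keyword_placement(html, phrase):
    """Report where the phrase appears: title, h1, first paragraph."""
    parser = PlacementParser()
    parser.feed(html)
    phrase = phrase.lower()
    return {place: phrase in text.lower()
            for place, text in parser.text.items()}

page = ("<html><head><title>Stamp Collecting for Beginners</title></head>"
        "<body><h1>Getting Started</h1>"
        "<p>Stamp collecting is a relaxing hobby.</p>"
        "<p>More text.</p></body></html>")
print(keyword_placement(page, "stamp collecting"))
```

For this sample page the report shows the phrase in the title and the first paragraph but not in the headline, which suggests rewording the headline.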

Keep in mind that tables can “push” your text further down the page, making keywords less relevant because they appear lower on the page. This is because tables break apart when search engines read them. For example, picture a typical two-column page, where the first column has navigational links, while the second column has the keyword loaded text.

Humans see such a page like this:

Home      Guitar Playing
Page 1    Rock
Page 2    Jazz
Page 3    Blues
Page 4    Folk

Search engines see the page like this:

Home
Page 1
Page 2
Page 3
Page 4

Guitar Playing

Rock
Jazz
Blues
Folk

See how the keywords have moved down the page? There is no easy way around this, other than simplifying your table structure. Consider how tables might affect your page, but don’t necessarily stop using them. I like tables, and I’ll continue to use them.
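
You can see this effect for yourself by reading a page’s text in source order, the way a simple crawler would. The sketch below uses Python’s standard HTML parser on a made-up two-column table like the one described above; the class and function names are invented for the example.

```python
from html.parser import HTMLParser

class TextInSourceOrder(HTMLParser):
    """Collects visible text in the order it appears in the HTML source."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

# Hypothetical two-column layout: navigation cell first, content second.
PAGE = """
<table><tr>
  <td>Home<br>Page 1<br>Page 2<br>Page 3<br>Page 4</td>
  <td><h1>Guitar Playing</h1>Rock<br>Jazz<br>Blues<br>Folk</td>
</tr></table>
"""

def linearize(html):
    parser = TextInSourceOrder()
    parser.feed(html)
    parser.close()
    return parser.chunks

chunks = linearize(PAGE)
print(chunks)
print(chunks.index("Guitar Playing"))
```

The headline comes out after all five navigation links, even though a human reader sees it at the top of the page.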

Large sections of JavaScript can also have the same effect as tables. The search engine reads this information first, which causes the normal HTML text to appear lower on the page. Place your script further down on the page, if possible.

Using Your Competitors’ Keywords for Search Engine Optimization

There are quite a few new search engine optimization products and software packages out there these days. Most of them concentrate on getting the best keywords for your site. Some even offer to scan your competitor’s web site to find out which keywords they are using, suggesting that all you have to do is copy them and you too will show up within the top ten listings.

Unfortunately, most companies who offer these types of services are merely preying on the lack of consumer knowledge in regards to true search engine optimization (SEO). They are banking on the fact that there is an uninitiated market out there which will unknowingly purchase these products. Usually the web site selling the product is a plethora of testimonials and hyped marketing jargon designed to do one thing, convince you the product is the greatest thing on Earth. I feel sorry for the consumers who do purchase these types of so-called search engine marketing programs, only to discover after their money is spent, that there is a lot more to online marketing than meets the eye.

One of the latest marketing forays by this type of pseudo-optimization software is the claim that all you need to do is analyze what your competition has on their web site for their keywords and phrases, and simply copy that formula for success. Yeah right! They guarantee you a top ten placement by simply following this procedure. See what the competition uses and do the same. It worked for them, so it will work for you, right? Please don’t fall into this trap. If it were that easy don’t you think everyone would be doing it? As a web site owner, use a little common sense and look beyond the hype.

A competitive analysis of a web site similar in nature to your own is in itself an excellent idea, but let’s be realistic about it. Sites which rank near the top of their categories are likely to have great keywords and phrases placed throughout the web site, but that’s only one part of the equation. The top sites also have a lot more going for them, such as excellent content, great meta tags and a good number of incoming links, all relevant to the subject matter on the site, to mention only a few requirements. By simply copying their keywords into your own site, you haven’t helped yourself at all. In fact you may have seriously damaged your chances for success.

Here’s why. Let’s say you are in the hot tub business. Your competitor sells fiberglass hot tubs; you do not. If you simply copy the keyword phrase “fiberglass hot tubs” into your own keywords, not only is that foolish, but if the search engine spiders come by and see that phrase within your keywords and find no content on your site to support it, you may be penalized. In layman’s terms, your site could have a ‘black mark’ against it. Search engines tend to frown upon web sites whose keywords and phrases are unsupported by the site’s textual content. It’s one of the measures they take to weed out unscrupulous marketers, who will pack a site full of car insurance terms to get a high ranking when the link, once clicked, leads to an adult site.

Although misrepresenting your site may not have been your intent, by merely copying your competitors’ keywords and using them as your own, you run the risk of being found guilty by association. Try explaining that to the search engines once the damage has been done.

The analysis of your competitors’ keywords and phrases is a good exercise to ensure you have similar terminology within the content of your site. Use them as an example of how it should be done. Do not, I repeat, do not simply copy them to your own site! Your site should have its own tone and individuality. Treat your own keywords with respect. You must have appropriate textual content on your own pages to support the keywords. Anything less is an exercise in futility.

At its best, a competitive keyword analysis will give you a good basic idea which words you should be using within the content of your site. Incorporate some of them into your own sentences and descriptions but try and keep a realistic approach to the process.

One of the most valuable aspects of a keyword analysis is the ability to spot words and terms you may be missing from your content. But before simply cutting and pasting, take a little time to find out which terms have the most ‘weight’, or most value, when it comes to usage by the searching public. Think about the keywords and terms from the searchers’ point of view. If you were trying to find your site without knowing it was on the Internet, what would you search for? What would Aunt Martha search for? What words and terms is your targeted customer likely to use when searching for your goods or services? Here’s a real tip: Don’t guess!

There are tools you can use to assist you in this decision making process. Take your competitive analysis keywords and run them through a site such as Overture’s Keyword Selection tool: http://inventory.overture.com/d/search inventory/suggestion/

Originally designed for advertisers to select the best terms for pay-per-click advertising, it will show you how many times the term you entered was searched for within the past few months. You may discover that “hot tubs”, although generic enough to score highly, could be greatly enhanced as a term by adding a single word such as “accessory” or “portable”. The power of a single word (different from your competitors’) also helps to set your site apart and, dare we say, in some cases may help your site to rise above them.

Remember – Choose the most popular ‘searched for words and terms’ and ensure they are contained within the content on your site before using them as keywords within tags or title descriptions.
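
The rule above is simple enough to express as a short script: keep only the candidate phrases your page text actually supports, then order them by how often people search for them. In this sketch the function name is invented and the search counts are made-up numbers standing in for whatever a keyword tool reports.

```python
def pick_keywords(candidates, search_counts, page_text, limit=3):
    """Keep phrases found in the page text, most-searched first."""
    text = page_text.lower()
    supported = [k for k in candidates if k.lower() in text]
    supported.sort(key=lambda k: search_counts.get(k, 0), reverse=True)
    return supported[:limit]

# Hypothetical figures; real counts would come from a keyword tool.
search_counts = {
    "hot tubs": 5000,
    "fiberglass hot tubs": 900,
    "portable hot tubs": 1200,
}
page_text = "We sell portable hot tubs and accessories for all hot tubs."
print(pick_keywords(list(search_counts), search_counts, page_text))
```

The phrase “fiberglass hot tubs” is dropped even though people search for it, because nothing on the page supports it.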

Create Relevant Content

Changing your page titles is not necessarily going to help your page do well for your target keywords if the page has nothing to do with the topic. Your keywords need to be reflected in the page content.

In particular, that means you need HTML text on your page. Sometimes, sites present large sections of copy via graphics. It looks pretty, but search engines can’t read those graphics. That means they miss out on text that might make your site more relevant. Some of the search engines will index ALT text and comment information. But to be safe, use HTML text whenever possible. Some of your human visitors will appreciate it, also.

Be sure that your HTML text is “visible.” Some designers try to spam search engines by repeating keywords in a tiny font or in the same color as the background color to make the text invisible to browsers. Search engines are well aware of these and other tricks. Expect that if the text is not visible in a browser, then a search engine may not index it.

Finally, consider “expanding” your text references, where appropriate. For example, a stamp collecting page might have references to “collectors” and “collecting.” Expanding these references to “stamp collectors” and “stamp collecting” reinforces your strategic keywords in a legitimate and natural manner. Your page really is about stamp collecting, but edits may have reduced its relevancy unintentionally.

Avoid Search Engine Stumbling Blocks

Some search engines see the web the way someone using a very old browser might. They may not read image maps. They may not read frames. You need to anticipate these problems, or a search engine may not index any or all of your web pages.

Create HTML links

Often, designers create only image map links from the home page to inside pages. A search engine that can’t follow these links won’t be able to get “inside” the site. Unfortunately, the most descriptive, relevant pages are often inside pages rather than the home page.

Solve this problem by adding some HTML hyperlinks to the home page that lead to major inside pages or sections of your web site. This is something that will help some of your human visitors, also. Put these hyperlinks down at the bottom of the page. The search engine will find and follow them.

Also consider creating a site map page with text links to every page within your site. You can submit this page, which will help the search engines locate pages within your web site.

Finally, be sure you do a good job of linking internally between your pages. If you naturally point to different pages from within your site, you increase the odds that search engines will follow links and find more of your web site.
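
A quick way to test your internal linking is to follow the links from your home page, as a crawler would, and list any pages that are never reached. The sketch below assumes you already have a map of which page links to which (the site_links dictionary here is made up); the function name is also invented for the example.

```python
from collections import deque

def unreachable_pages(site_links, home="/"):
    """Return pages that cannot be reached by following links from home.

    site_links maps each page to the internal pages it links to.
    """
    seen, queue = {home}, deque([home])
    while queue:
        page = queue.popleft()
        for target in site_links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(set(site_links) - seen)

# Hypothetical site: the catalog pages link out, but nothing links in.
site_links = {
    "/": ["/about"],
    "/about": ["/"],
    "/catalog": ["/"],
    "/catalog/rare": [],
}
print(unreachable_pages(site_links))
```

Any page this reports is a candidate for a new text link from the home page or from a site map page.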

Build Inbound Links

Every major search engine uses link analysis as part of its ranking algorithm. This is done because it is very difficult for webmasters to “fake” good links, in the way they might try to spam search engines by manipulating the words on their web pages. As a result, link analysis gives search engines a useful means of determining which pages are good for particular topics.

By building links, you can help improve how well your pages perform in link analysis systems. The key is understanding that link analysis is not about “popularity.” In other words, it’s not an issue of getting lots of links from anywhere. Instead, you want links from good web pages that are related to the topics you want to be found for.

Here’s one simple means to find those good links. Go to the major search engines. Search for your target keywords. Look at the pages that appear in the top results. Now visit those pages and ask the site owners if they will link to you. Not everyone will, especially sites that are extremely competitive with yours. However, there will be non-competitive sites that will link to you — especially if you offer to link back.

Why is this system good? By searching for your target keywords, you’ll find the pages that the search engines deem authoritative, evidenced by the fact that they rank well. Hence, links from these pages are more important (and important for the terms you are interested in) than links from other pages. In addition, if these pages are top ranked, then they are likely to be receiving many visitors. Thus, if you can gain links from them, you might receive some of the visitors who initially go to those pages.

There are also other ways to attract quality links. One that has recently gained traction is linkbaiting. Linkbaiting refers to a variety of techniques used on a web site to attract links from other web sites. This can include content, online tools, downloads, or anything else that other site owners might find compelling enough to link to.

What is Link Popularity?

Link popularity is used by a number of search engines to help measure the ‘weight’ of a website. Weight can be measured by the number and quality of other web sites that link to yours. Link popularity is an essential part of any search engine optimization strategy. Good quality links are necessary to achieve high search engine rankings, as they are an important factor that the search engines look for in reviewing a web site. If everything else about a website is perfectly optimized, the website still may not show up at all, or will lag behind other, less valuable websites, if it has no incoming links.

Developing link popularity for a website is a sound investment in the future of its search engine positioning and placement. We have put together a short page to help you understand what relevant reciprocal links are. This page of the website will help you, the website operator, understand how to add support to this area of search engine optimization, to build reciprocal links, and to increase your link popularity. Be forewarned, however: it will be a lengthy process, and it is only one part of the search engine optimization (SEO) service offered by Metamend.

How Can A Web Site Increase Its Link Popularity?

A structured, well-planned linking campaign is the best way to develop and maintain the link popularity of a website. It involves contacting other website owners or webmasters directly, and requesting that they create a link back to you. Successful link campaigns are time-consuming and tedious, but they will have a positive effect on your search engine rankings.

Some of the criteria we suggest you follow when building links for your website are:

* Ensure that the links you receive are relevant to your content in some way.
* Make sure that these are permanent links, and not short term.
* Ensure that you can control link text – what is said about your website.
* Ensure that the website you are linking with is reputable. Their reputation may affect yours. The higher they are ranked, the better the result for you!
* Regularly check that the links to and from your website still work.
* Link from your links page to the other party’s top page, and vice versa.
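
Part of that checking can be automated. The sketch below takes the HTML of a partner’s page (which you would download separately) and lists the links that point back to your domain, along with their link text. The class and function names, the sample HTML and the example.com addresses are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def links_back(partner_html, my_domain):
    """Return the partner's links whose host matches my_domain."""
    collector = LinkCollector()
    collector.feed(partner_html)
    return [(href, text) for href, text in collector.links
            if urlparse(href).netloc.lower() == my_domain.lower()]

partner_html = ('<p>See the <a href="http://www.example.com/stamps.html">'
                'rare stamp catalog</a> and '
                '<a href="http://other.example.org/">other sites</a>.</p>')
print(links_back(partner_html, "www.example.com"))
```

An empty result means the partner no longer links to you. The link text in each result shows what is being said about your site.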

What is a reciprocal link?

A reciprocal link is a text and/or banner link to a web site that also has a text or banner link back (in reciprocation) to your own website. A reciprocal link is a reference. The website operator who exchanged links has essentially said that the “website that we linked to is worth looking at.”

Lastly, always remember that just like any other part of an SEO strategy, a reciprocal link is not a quick fix to bring in traffic. It’s only part of a search engine optimization strategy.

Why Build Reciprocal Links?

A properly executed reciprocal links campaign will help ‘push’ a website to the top of the search engine listings for a relevant keyword search. Think about it: the entire goal of a search engine marketing strategy is to attract visitors to a website. One of the steps is to ensure that when potential visitors to your website enter a relevant search phrase into a search engine, the search results page lists a link to your site, hopefully near the top.

By having as many websites as possible linking to your own, you are essentially getting that many recommendations from other web site operators. These recommendations are counted and weighed by the search engines as part of their algorithms.

Lastly, search engine robots navigate the Internet via links, so that by linking, you are helping the search engines build a better Internet. Therefore linking helps your score in the search engines, and helps the search engines find any websites you deem to be of interest.

Submit Your Key Pages

Most search engines will index the other pages from your web site by following links from a page you submit to them. But sometimes they miss, so it’s good to submit the top two or three pages that best summarize your web site.

Don’t trust the submission process to automated programs and services. Some of them are excellent, but the major search engines are too important. There aren’t that many. Submit manually, so that you can see if there are any problems reported.

Also, don’t bother submitting more than the top two or three pages. It doesn’t speed up the process to submit more. Submitting alternative pages is only insurance. In case the search engine has trouble reaching one of the pages, you’ve covered yourself by giving it another page from which to begin its crawl of your site.

Be patient. It can take one to two months for your “non-submitted” pages to appear in a search engine. Additionally, some search engines may not list every page from your site.

Verify and Maintain Your Listing

Check on your pages and ensure they get listed, in the ways described on the Check URL page. Once your pages are listed in a search engine, monitor your listing every week or two. Strange things happen. Pages disappear from catalogs. Links go screwy. Watch for trouble, and resubmit if you spot problems.

Resubmit your site any time you make significant changes. Search engines should revisit on a regular schedule. However, some search engines have grown smart enough to realize some sites only change content once or twice a year, so they may visit less often. Resubmitting after major changes will help ensure that your site’s content is kept current.

Beyond Search Engines

It’s worth taking the time to make your site more search engine friendly because some simple changes may pay off with big results. Even if you don’t come up in the top ten for your target keywords, you may find an improvement for target keywords you aren’t anticipating. The addition of just one extra word can suddenly make a site appear more relevant, and it can be impossible to guess what that word will be.

Also, remember that while search engines are a primary way people look for web sites, they are not the only way. People also find sites through word-of-mouth, traditional advertising, traditional media, blog posts, web directories, and links from other sites. Since the advent of Web 2.0 applications, people are finding sites through feeds, blogs, podcasts, vlogs and many other means. Sometimes, these alternative forms can be more effective draws than search engines. The most effective marketing strategy is to combine search marketing with other online and offline media.

Finally, know when it’s time to call it quits. A few changes may be enough to achieve top rankings in one or two search engines. But that’s not enough for some people, and they will invest days creating special pages and changing their sites to try and do better. This time could usually be put to better use pursuing non-search engine publicity methods.

Don’t obsess over your ranking. Even if you follow every tip and find no improvement, you still have gained something. You will know that search engines are not the way you’ll be attracting traffic. You can concentrate your efforts in more productive areas, rather than wasting your valuable time.

Meta Tags

How To Use HTML Meta Tags

http://searchenginewatch.com/2167931/print

Want to get a top ranking in search engines? No problem! All you need to do is add a few magical “meta tags” to your web pages, and you’ll skyrocket to the top of the listings.

If only it were so easy. Let’s make it clear:

* Meta tags are not a magic solution.
* Meta tags are not a magic solution.
* Meta tags are not a magic solution.

Meta tags have never been a guaranteed way to gain a top ranking on crawler-based search engines. Today, the most valuable feature they offer the web site owner is the ability to control to some degree how their web pages are described by some search engines. They also offer the ability to prevent pages from being indexed at all. This page explores these and other meta tag-related features in more depth.

Meta Tag Overview

What are meta tags? They are information inserted into the “head” area of your web pages. Other than the title tag (explained below), information in the head area of your web pages is not seen by those viewing your pages in browsers. Instead, meta information in this area is used to communicate information that a human visitor may not be concerned with. Meta tags, for example, can tell a browser what “character set” to use or whether a web page has self-rated itself in terms of adult content.

Meta tags go in between the “opening” and “closing” HEAD tags.

The Title Tag

The HTML title tag isn’t really a meta tag, but it’s worth discussing in relation to them. Whatever text you place in the TITLE tag will appear in the title bar of someone’s browser when they view the web page.

Some browsers also supplement whatever you put in the title tag by adding their own name. The title tag is also used as the words to describe your page when someone adds it to their “Favorites” or “Bookmarks” lists.

But what about search engines? The title tag is crucial for them. The text you use in the title tag is one of the most important factors in how a search engine may decide to rank your web page (see the Search Engine Placement Tips section for more details). In addition, all major crawlers will use the text of your title tag as the text they use for the title of your page in your listings.

In review, think about the key terms you’d like your page to be found for in crawler-based search engines, then incorporate those terms into your title tag in a short, descriptive fashion. That text will then be used as your title in crawler-based search engines, as well as the title in bookmarks and in browser title bars.

The Meta Description Tag

The meta description tag allows you to influence the description of your page in the crawlers that support the tag. The text you want to be shown as your description goes between the quotation marks after the “content=” portion of the tag (generally, 200 to 250 characters may be indexed, though only a smaller portion of this amount may be displayed).

Is the description tag always used? Not with every search engine. For example, Google ignores the meta description tag and instead will automatically generate its own description for your page. Others may support it partially.

In review, it is worthwhile to use the meta description tag for your pages, because it gives you some degree of control with various crawlers. Often, an easy way to do this is to take the first sentence or two of body copy from your web page and use that for the meta description content.
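As a rough illustration of the title and meta description tags, the Python sketch below embeds a small sample page and reads back the two values a crawler that supports them would have available. The sample page, the HeadReader class and the read_head function are all invented for this example; they do not reproduce how any particular search engine works.

```python
from html.parser import HTMLParser

# Hypothetical sample page: the title and description text are made up.
SAMPLE_PAGE = """<html>
<head>
<title>Stamp Collecting for Beginners</title>
<meta name="description" content="A short guide to starting a stamp collection, from choosing albums to finding your first Penny Black.">
</head>
<body><p>Welcome to the guide.</p></body>
</html>"""


class HeadReader(HTMLParser):
    """Collects the title text and any named meta tags from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name:
            # Store the content under the lowercased meta name.
            self.meta[name.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def read_head(html):
    """Return (title, meta_dict) for the given HTML text."""
    reader = HeadReader()
    reader.feed(html)
    return reader.title.strip(), reader.meta


if __name__ == "__main__":
    title, meta = read_head(SAMPLE_PAGE)
    print("Title:", title)
    print("Description:", meta.get("description", "(none)"))
    print("Description length:", len(meta.get("description", "")))
```

Running a check like this against your own pages is a quick way to confirm that every page has a title and that the description stays within the 200 to 250 characters mentioned above.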

The Meta Keywords Tag

The meta keywords tag allows you to provide additional text for crawler-based search engines to index along with your body copy. How does this help you? Well, for most major crawlers, it doesn’t. That’s because most crawlers now ignore the tag.

The meta keywords tag is sometimes useful as a way to reinforce the terms you think a page is important for ON THE FEW CRAWLERS THAT SUPPORT IT. For instance, if you had a page about stamp collecting — AND you say the words stamp collecting at various places in your body copy — then mentioning the words “stamp collecting” in the meta keywords tag MIGHT help boost your page a bit higher for those words.

Remember, if you don’t use the words “stamp collecting” on the page at all, then just adding them to the meta keywords tag is extremely unlikely to help the page do well for the term. The text in the meta keywords tag, FOR THE FEW CRAWLERS THAT SUPPORT IT, works in conjunction with the text in your body copy.

The meta keywords tag is also sometimes useful as a way to help your page come up for synonyms or unusual words that don’t appear on the page itself. For instance, let’s say you had a page all about the “Penny Black” stamp. You never actually say the word “collecting” on this page. By having the word in your meta keywords tag, you may help increase the odds of coming up if someone searched for “penny black stamp collecting.” Of course, you would increase the odds even more if you just used the word “collecting” in the body copy of the page itself.

Here’s another example. Let’s say you have a page about horseback riding, and you’ve written your page using “horseback” as a single word. You realize that some people may instead search for “horse back riding,” with “horse back” in their searches being two separate words. If you listed these words separately in your meta keywords tag, THEN MAYBE FOR THE FEW CRAWLERS THAT SUPPORT IT, your page might rank better for “horse back” riding. Sadly, the best way to ensure this would be to write your pages using both “horseback riding” and “horse back riding” in the text — or perhaps on some of your pages, use the single word version and on others, the two word version.

I’m using all these capital letters on purpose. Far too many people new to search engine optimization obsess with the meta keywords tag. FEW crawlers support it. For those that do, it MIGHT! MAYBE! PERHAPS! POSSIBLY! BUT WITH NO GUARANTEE! help improve the ranking of your page. It also may very well do nothing for your page at all. In fact, repeat a particular word too often in a meta keywords tag and you could actually harm your page’s chances of ranking well. Because of this, I strongly suggest that those new to search engine optimization not even worry about the tag at all.

Even those who are experienced in search engine optimization may decide it is no longer worth using the tags. Search Engine Watch doesn’t. Any meta keywords tags you find on the site were written in the past, when the keywords tag was more important. There’s no harm in leaving up existing tags you may have written, but going forward, writing new tags probably isn’t worth the trouble.

Inktomi says that you should include up to 25 words or phrases, with each word or phrase separated by commas. In the past, when the tag was supported by other search engines, they generally indexed up to 1,000 characters of text and commas were not required.

Meta Tag Generators, Builders and Evaluators

Meta Tag Builder
This form allows you to create very complicated meta tags using much more than the keywords and description tags, if you wish. Note that it will place a commented credit line into the tag. This can easily be removed, if you wish.

Robots.txt Files

A robots.txt file is a file placed on your server to tell the various search engine spiders not to crawl or index certain sections or pages of your site. You can use it to prevent indexing totally, to prevent certain areas of your site from being indexed, or to issue individual indexing instructions to specific search engines.

The file itself is a simple text file, which can be created in Notepad or any ASCII text editor. It needs to be saved to the root directory of your site, that is, the directory where your home page or index page is.

Why Do I Need One?

All search engines, or at least all the important ones, now look for a robots.txt file as soon as their spiders or bots arrive on your site. So, even if you currently do not need to exclude the spiders from any part of your site, having a robots.txt file is still a good idea; it can act as a sort of invitation into your site.

There are a number of situations where you may wish to exclude spiders from some or all of your site.

  • You are still building the site, or certain pages, and do not want the unfinished work to appear in search engines
  • You have information that, while not sensitive enough to bother password protecting, is of no interest to anyone but those it is intended for and you would prefer it did not appear in search engines.
  • Most people will have some directories they would prefer were not crawled. For example, do you really need to have your cgi-bin indexed, or a directory that simply contains thank-you or error pages?
  • If you are using doorway pages (similar pages, each optimized for an individual search engine) you may wish to ensure that individual robots do not have access to all of them. This is important in order to avoid being penalized for spamming a search engine with a series of overly similar pages.
  • You would like to exclude some bots or spiders altogether, for example those from search engines you do not want to appear in or those whose chief purpose is collecting email addresses.

The very fact that search engines are looking for them is reason enough to put one on your site. Have you looked at your site statistics recently? If your stats include a section on ‘files not found’, you are sure to see many entries where search engine spiders looked for, and failed to find, a robots.txt file on your site.

Creating the robots.txt file

There is nothing difficult about creating a basic robots.txt file. It can be created using Notepad or whatever your favorite text editor is. Each entry has just two lines:

User-Agent: [Spider or Bot name]
Disallow: [Directory or File Name]

The Disallow line can be repeated for each directory or file you want to exclude, and the whole entry can be repeated for each spider or bot you want to exclude.

A few examples will make it clearer.

1. Exclude a file from an individual Search Engine

You have a file, privatefile.htm, in a directory called ‘private’ that you do not wish to be indexed by Google. You know that the spider that Google sends out is called ‘Googlebot’. You would add these lines to your robots.txt file:

User-Agent: Googlebot
Disallow: /private/privatefile.htm

2. Exclude a section of your site from all spiders and bots

You are building a new section to your site in a directory called ‘newsection’ and do not wish it to be indexed before you are finished. In this case you do not need to specify each robot that you wish to exclude; you can simply use a wildcard character, ‘*’, to exclude them all.

User-Agent: *
Disallow: /newsection/

Note that there is a forward slash at the beginning and end of the directory name, indicating that you do not want any files in that directory indexed.

3. Allow all spiders to index everything

Once again you can use the wildcard, ‘*’, to let all spiders know they are welcome. Leave the second line, Disallow, empty: you are disallowing nothing.

User-agent: *
Disallow:

4. Allow no spiders to index any part of your site

This requires just a tiny change from the command above – be careful!

User-agent: *
Disallow: /

If you use this command while building your site, don’t forget to remove it once your site is live!
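The four examples above can be checked before you upload anything. The sketch below uses Python’s standard urllib.robotparser module to ask whether a given robot may fetch a given path. The can_crawl helper, the robot name ‘AnyBot’ and the page names other than privatefile.htm are assumptions made for this illustration, and Python’s parser is only an approximation of how a real search engine reads the file.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, agent, path):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Example 1: hide one file from Googlebot only.
EXCLUDE_FILE = "User-Agent: Googlebot\nDisallow: /private/privatefile.htm"

# Example 2: hide a whole directory from every robot.
EXCLUDE_SECTION = "User-Agent: *\nDisallow: /newsection/"

# Example 3: an empty Disallow line lets every robot in everywhere.
ALLOW_ALL = "User-agent: *\nDisallow:"

# Example 4: a single slash shuts every robot out of the whole site.
ALLOW_NONE = "User-agent: *\nDisallow: /"

if __name__ == "__main__":
    print(can_crawl(EXCLUDE_FILE, "Googlebot", "/private/privatefile.htm"))
    print(can_crawl(EXCLUDE_FILE, "Scooter", "/private/privatefile.htm"))
    print(can_crawl(EXCLUDE_SECTION, "AnyBot", "/newsection/draft.htm"))
    print(can_crawl(ALLOW_ALL, "AnyBot", "/index.htm"))
    print(can_crawl(ALLOW_NONE, "AnyBot", "/index.htm"))
```

Comparing examples 3 and 4 side by side shows how much depends on that one slash, which is why it is worth testing the file rather than trusting it by eye.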

Getting More Complicated

If you have a more complex set of requirements, you are going to need a robots.txt file with a number of different commands. You need to be quite careful creating such a file; you do not want to accidentally disallow access to spiders or to areas you really want indexed.

Let’s take quite a complex scenario. You want most spiders to index most of your site, with the following exceptions:

1. You want none of the files in your cgi-bin (a folder that is often used to hold special scripts) indexed at all, nor do you want any of the FrontPage-specific folders indexed, e.g. _private, _themes, _vti_cnf and so on.
2. You want to exclude your entire site from a single search engine – let’s say AltaVista.
3. You do not want any of your images to appear in the Google Image Search index.
4. You want to present a different version of a particular page to Lycos and Google.
(Caution here, there are a lot of question marks over the use of ‘doorway pages’ in this fashion. This is not the place for a discussion of them but if you are using this technique you should do some research on it first.)

Let’s take this one in stages!

1. First you would ban all search engines from the directories you do not want indexed at all:

User-agent: *
Disallow: /cgi-bin/
Disallow: /_borders/
Disallow: /_derived/
Disallow: /_fpclass/
Disallow: /_overlay/
Disallow: /_private/
Disallow: /_themes/
Disallow: /_vti_bin/
Disallow: /_vti_cnf/
Disallow: /_vti_log/
Disallow: /_vti_map/
Disallow: /_vti_pvt/
Disallow: /_vti_txt/

It is not necessary to create a new command for each directory; it is quite acceptable to just list them as above.

2. The next thing we want to do is to prevent AltaVista from getting in there at all. The AltaVista bot is called Scooter.

User-Agent: Scooter
Disallow: /

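This entry can be thought of as an amendment to the first entry, which allowed all bots in everywhere except the listed directories. We are now saying that all bots can index the whole site apart from the directories specified in step 1, except Scooter, which can index nothing. Bear in mind that a robot which finds an entry naming it specifically normally follows that entry instead of the wildcard entry, not in addition to it. If you want the robots named in steps 3 and 4 kept out of the step 1 directories as well, repeat those Disallow lines in their own entries.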

3. Now you want to keep Google away from those images. Google grabs these images with a separate bot, called Googlebot-Image, from the one that indexes pages generally. You have a couple of choices here:

User-Agent: Googlebot-Image
Disallow: /images/

That will work if you are very organized and keep all your images strictly in the images folder.

User-Agent: Googlebot-Image
Disallow: /

This one will prevent the Google image bot from indexing any of your images, no matter where they are in your site.

4. Finally, you have two pages called content1.html and content2.html, which are optimized for Google and Lycos respectively. So, you want to hide content1.html from Lycos (The Lycos spider is called T-Rex):

User-Agent: T-Rex
Disallow: /content1.html

and content2.html from Google.

User-Agent: Googlebot
Disallow: /content2.html
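Putting the four stages together gives one file, and it is worth testing the combined result. The sketch below is one way to do that with Python’s standard urllib.robotparser module. The directory list is shortened, the second Googlebot-Image option is used, and the robot name ‘SomeOtherBot’ and the file names form.cgi, index.html and logo.gif are invented for the test. Python’s parser uses the first entry whose name matches the robot and falls back to the wildcard entry only when none matches, which is an approximation of how real crawlers behave.

```python
from urllib.robotparser import RobotFileParser

# The combined file from stages 1 to 4 (directory list shortened).
# Googlebot-Image is listed before Googlebot, as in the text, because
# this parser stops at the first entry whose name matches.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /_private/
Disallow: /_themes/
Disallow: /_vti_cnf/

User-Agent: Scooter
Disallow: /

User-Agent: Googlebot-Image
Disallow: /

User-Agent: T-Rex
Disallow: /content1.html

User-Agent: Googlebot
Disallow: /content2.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def check(agent, path):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for agent, path in [
        ("SomeOtherBot", "/cgi-bin/form.cgi"),
        ("Scooter", "/index.html"),
        ("Googlebot-Image", "/images/logo.gif"),
        ("T-Rex", "/content1.html"),
        ("Googlebot", "/content2.html"),
        # Googlebot has its own entry, so the wildcard rules are not applied.
        ("Googlebot", "/cgi-bin/form.cgi"),
    ]:
        print(agent, path, check(agent, path))
```

The last check is the one to watch: because Googlebot matches its own entry, the wildcard Disallow lines for cgi-bin no longer apply to it in this parser. Repeating those lines inside the Googlebot and T-Rex entries closes that gap.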

Summary and Links

Writing a robots.txt file is, as you have seen, a relatively simple matter. However, it is important to bear in mind that it is not a security method. It may stop your specified pages from appearing in search engines, but it will not make them unavailable. There are many hundreds of bots and spiders crawling the Internet now, and while most will respect your robots.txt file, some will not, and there are even some designed specifically to visit the very pages you are specifying as being out of bounds.

What if I can’t make a robots.txt file?

Sometimes you cannot make a robots.txt file, because you don’t administer the entire server. All is not lost: there is a new standard for using HTML META tags to keep robots out of your documents.

The basic idea is that if you include a tag like:

<META NAME="ROBOTS" CONTENT="NOINDEX">

in your HTML document, that document won’t be indexed.

If you do:

<META NAME="ROBOTS" CONTENT="NOFOLLOW">

the links in that document will not be parsed by the robot.

The Robots META tag, placed in the HTML <HEAD> section of a page, can specify either or both of these actions. Many, but not all, search engine robots will recognize this tag and follow the rules for each page.

Do not index, but follow links
<META name="ROBOTS" content="NOINDEX">
Use this for pages with many links on them, but not much useful data. Because “follow” is the default, you don’t have to include it.

Index, but do not follow links
<META name="ROBOTS" content="NOFOLLOW">
Use this for pages which have useful content but links which may be irrelevant or obsolete.

Do not index or follow links
<META name="ROBOTS" content="NOINDEX,NOFOLLOW">
This is for pages which should not be indexed at all. If you put that in every page, the site should not be indexed.

Index and follow links
<META name="ROBOTS" content="INDEX,FOLLOW">
This is the default behavior: you don’t have to include this tag.
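The four combinations above come down to two yes/no answers: may the page be indexed, and may its links be followed. The short Python sketch below shows one way a robot could read the content value, with both answers defaulting to yes. The function name parse_robots_meta is invented for this illustration and is not part of any standard.

```python
def parse_robots_meta(content):
    """Return (index, follow) for the content value of a Robots META tag.

    Both values default to True, matching the INDEX,FOLLOW default.
    """
    directives = {part.strip().upper() for part in content.split(",")}
    index = "NOINDEX" not in directives
    follow = "NOFOLLOW" not in directives
    return index, follow


if __name__ == "__main__":
    for value in ("NOINDEX", "NOFOLLOW", "NOINDEX,NOFOLLOW", "INDEX,FOLLOW"):
        print(value, parse_robots_meta(value))
```

Treating anything not explicitly forbidden as allowed mirrors the note above that INDEX and FOLLOW never need to be written out.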

Sitemap.xml

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.

Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.
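To make the format concrete, here is a minimal Python sketch that writes a Sitemap using only the standard library. The element names (urlset, url, loc, lastmod, changefreq, priority) and the namespace come from the Sitemap protocol published at sitemaps.org. The page addresses, date and values in PAGES, and the build_sitemap function itself, are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>\n'


def build_sitemap(pages):
    """Return Sitemap XML text for a list of page dictionaries.

    Each page needs a "loc" (the URL); "lastmod", "changefreq" and
    "priority" are optional metadata.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return XML_DECLARATION + ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, for illustration only.
PAGES = [
    {
        "loc": "http://www.example.com/",
        "lastmod": "2009-01-01",
        "changefreq": "monthly",
        "priority": "0.8",
    },
    {"loc": "http://www.example.com/about.html"},
]

SITEMAP_XML = build_sitemap(PAGES)

if __name__ == "__main__":
    print(SITEMAP_XML)
```

Only loc is required for each URL; the other three values are hints, which is why the second page in the example leaves them out.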

Where do I place my Sitemap?

It is strongly recommended that you place your Sitemap at the root directory of your HTML server; that is, place it at http://example.com/sitemap.xml.

In some situations, you may want to produce different Sitemaps for different paths on your site — e.g., if security permissions in your organization compartmentalize write access to different directories.

We assume that if you have the permission to upload http://example.com/path/sitemap.xml, you also have permission to report metadata under http://example.com/path/.

All URLs listed in the Sitemap must reside on the same host as the Sitemap. For instance, if the Sitemap is located at http://www.example.com/sitemap.xml, it can’t include URLs from http://subdomain.example.com. If the Sitemap is located at http://www.example.com/myfolder/sitemap.xml, it can’t include URLs from http://www.example.com.

How big can my Sitemap be?

Sitemaps should be no larger than 10MB (10,485,760 bytes) and can contain a maximum of 50,000 URLs. These limits help to ensure that your web server does not get bogged down serving very large files. This means that if your site contains more than 50,000 URLs or your Sitemap is bigger than 10MB, you must create multiple Sitemap files and use a Sitemap index file. You should use a Sitemap index file even if you have a small site but plan on growing beyond 50,000 URLs or a file size of 10MB. A Sitemap index file can include up to 1,000 Sitemaps and must not exceed 10MB (10,485,760 bytes). You can also use gzip to compress your Sitemaps.
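For a site that exceeds the URL limit, the splitting described above might look like the following Python sketch. It divides a list of URLs into groups of at most 50,000 and builds a Sitemap index file that points at one Sitemap per group. The sitemapindex, sitemap and loc element names come from the Sitemap protocol; the file names sitemap1.xml, sitemap2.xml and so on, and all three function names, are assumptions made for this example. The sketch does not check the file size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50000


def chunk_urls(urls, limit=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into lists of at most `limit` entries."""
    return [urls[start:start + limit] for start in range(0, len(urls), limit)]


def sitemap_locations(count, base="http://www.example.com/"):
    """Hypothetical naming scheme: sitemap1.xml, sitemap2.xml, ..."""
    return [f"{base}sitemap{number}.xml" for number in range(1, count + 1)]


def build_sitemap_index(locations):
    """Return Sitemap index XML text listing each Sitemap file's URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for location in locations:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = location
    declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(index, encoding="unicode")


if __name__ == "__main__":
    urls = [f"http://www.example.com/page{n}.html" for n in range(120001)]
    chunks = chunk_urls(urls)
    print([len(chunk) for chunk in chunks])
    print(build_sitemap_index(sitemap_locations(len(chunks))))
```

Each group would then be written out as its own Sitemap file, and only the index file needs to be reported to the search engines.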

Once you have created the Sitemap file and placed it on your webserver, you need to inform the search engines that support this protocol of its location.

Submitting your Sitemap via the search engine’s submission interface

To submit your Sitemap directly to a search engine, which will enable you to receive status information and any processing errors, refer to each search engine’s documentation.

Specifying the Sitemap location in your robots.txt file

You can specify the location of the Sitemap using a robots.txt file. To do this, simply add the following line:

Sitemap: <sitemap_location>

The <sitemap_location> should be the complete URL to the Sitemap, such as: http://www.example.com/sitemap.xml

This directive is independent of the user-agent line, so it doesn’t matter where you place it in your file. If you have a Sitemap index file, you can include the location of just that file. You don’t need to list each individual Sitemap listed in the index file.
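The point about placement can be checked with Python’s standard urllib.robotparser module, whose site_maps() method (available from Python 3.8) returns the Sitemap locations found in a robots.txt file. In this sketch the same Sitemap line is placed first in one file and last in another; the load helper, the robot name ‘AnyBot’ and the page name draft.htm are invented for the example.

```python
from urllib.robotparser import RobotFileParser

SITEMAP_LINE = "Sitemap: http://www.example.com/sitemap.xml\n"
RULES = "User-agent: *\nDisallow: /newsection/\n"

# The same directive, placed before and after the user-agent entry.
SITEMAP_FIRST = SITEMAP_LINE + RULES
SITEMAP_LAST = RULES + SITEMAP_LINE


def load(robots_txt):
    """Parse robots.txt text and return the parser object."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    for text in (SITEMAP_FIRST, SITEMAP_LAST):
        parser = load(text)
        print(parser.site_maps(), parser.can_fetch("AnyBot", "/newsection/draft.htm"))
```

Both versions report the same Sitemap location and the same exclusion, so where you put the line is a matter of tidiness only.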


SEO Online Tools

SEO Analysis : Use for a complete analysis of your website
www.1-hit.com
www.websitegrader.com

Link Popularity : Check
Link Popularity Check

Indexing
Index visibility Checker
HTTP Viewer
Spider Viewer

Keywords
Keyword Density Analyser
Keyword Density
Keyword Difficulty
Keyword Optimizer
Keyword Suggestions
Keyword Typo Generator

Meta Tag Generator
Submit Express
Any Browser

Search Engine Position
Position Ranking

Robots
Robots text file generator
Robot code generator
Robots text validator

Sitemaps
xml-sitemaps

Test Your Site

http://www.websitegrader.com

Exercises

  • Write a list of keywords associated with your site
  • Write a description for your site
  • Create a robots.txt file for your site and post to the server
  • Create a sitemap.xml file and post to the server
  • Start working on content for the home page of your site (text! – using keywords)
  • Rethink (and change if necessary) your headings / titles, etc. for your site
