There are two ways to retrieve visitor data for your Web site. The first is to read the server visitor logs and the second is to use a real-time visitor tracking system.
The difference between them is as follows.
Log analyzers
When someone requests a Web page from a server, the server records the request in its raw visit logs. A typical server log record looks like this:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"
This visitor record contains the following information:
- The IP address the request came from (“127.0.0.1”).
- The user ID under which the visitor passed HTTP authentication, if the requested document is protected, as it is in our case (here, the user authenticated as “frank”).
- The date and time of the visit.
- The HTTP request sent by the browser (here, the browser wants to GET the image “apache_pb.gif” from the server root directory).
- Whether or not the request was successful (200 is the HTTP response code returned by the server; it means the request was successful).
- The size of the content sent in response to the request (in our case, the image file is 2326 bytes long).
- The referrer, that is, where the visitor came from (most probably, a Web page that links to or contains the file “apache_pb.gif”).
- The user agent, most typically the browser used by the visitor.
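To make the fields described above concrete, here is a minimal sketch of how a log analyzer might split such a record into its parts. It assumes the record follows the layout of the sample line; the names `COMBINED_LOG` and `parse_record` are hypothetical and are not taken from any particular log analyzer.

```python
import re
from datetime import datetime

# Regular expression for one record laid out like the sample line above.
COMBINED_LOG = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_record(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED_LOG.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert the timestamp, status code, and size to native types.
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    # A well-formed request line has three parts: method, path, protocol.
    parts = record["request"].split()
    if len(parts) == 3:
        record["method"], record["path"], record["protocol"] = parts
    return record


SAMPLE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)

if __name__ == "__main__":
    for field, value in parse_record(SAMPLE).items():
        print(field, "=", value)
```

A real log analyzer has to cope with malformed lines and other log layouts as well; this sketch simply returns `None` for anything that does not match.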
You may think that the server access log contains lots of useful information, and it does. Many programs, called “log analyzers”, are able to parse server logs and display the information they contain in a user-friendly form. One of the most widespread log analyzers is Webalizer (http://www.webalizer.com/). It comes bundled with many hosting service packages to provide Web site owners with comprehensive statistics of user visits.
However, if we take a deeper look, the records in the server access log do not provide enough information to be truly useful for your marketing. They cannot give you detailed, distinct information about repeat and unique visits, transaction cycle time, details of the visitor’s browser, screen and operating system, effectiveness of your advertising campaigns, keywords and referrers, etc. Even where it is possible to extract this information from a server log, doing so requires very sophisticated log analysis software and a great deal of time spent on integration and organization.
Moreover, server logs often do not provide information with the precision we want. The main reasons are as follows:
- Visits made by a person who uses a dynamic IP address (e.g. accesses the Internet via dial-up) will be recorded as made by several visitors.
- Visits made through a corporate network with only one external IP address will count as made by one unique visitor.
- The server writes a visit to the log as soon as it has sent back the response header, even if the page hasn’t finished (or hasn’t even started) loading into the user’s browser. The visitor may not have seen the page at all.
For these and several other reasons, we want a better system of tracking visits. We need a real-time visitor tracking system like HitLens.
Real-time visitor tracking
Most Web browsers, including Internet Explorer, are able to execute browser-side scripts. That is, when you compose a Web page, you can embed program code that tells the browser not only to show the page statically on the screen, but also to perform certain other actions at the same time.
As a rule, these scripts (written in JavaScript or VBScript) are used to create on-page animation, rollover effects, menu effects, etc. Also, they can be used to process data entered by the user into a form.
HitLens puts browser-side scripts to another use: reporting various data about visitor page views to a remote database. This makes it well suited to collecting marketing statistics.
When someone opens a Web page in their browser, the browser executes the tracking JavaScript embedded into the page. The script contains commands to send every available piece of information about the visitor to the tracking center. Certainly, it cannot report highly personal data like the visitor’s name or sex. What it can report is the visitor’s screen resolution, operating system and browser details, cookie information (if available), time zone, referrer, name of the page viewed and so on, as well as the information embedded directly into the script at design time.
This process remains invisible to the Web surfer viewing the page. The data passed to the tracking center are stored in the database, and when you request the statistics for your site, they are organized into the necessary report and delivered to your computer by HitLens.
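As an illustration of the receiving side, the sketch below shows how a tracking center might decode one report sent by a tracking script, assuming the script passes its data as query-string parameters on a request URL. The source does not describe how HitLens actually transmits data, so the URL, the parameter names (`site`, `page`, `ref`, `res`, `tz`) and the function `decode_beacon` are all hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def decode_beacon(url):
    """Turn one (hypothetical) tracking request URL into a visit record."""
    query = parse_qs(urlsplit(url).query)

    def first(name, default=""):
        # parse_qs returns a list of values per parameter; take the first.
        return query.get(name, [default])[0]

    # Screen resolution is assumed to arrive as "WIDTHxHEIGHT".
    width, _, height = first("res").partition("x")
    screen = (int(width), int(height)) if width.isdigit() and height.isdigit() else None

    # Time zone is assumed to arrive as an offset in minutes.
    try:
        tz_offset = int(first("tz", "0"))
    except ValueError:
        tz_offset = 0

    return {
        "site": first("site"),
        "page": first("page"),
        "referrer": first("ref"),
        "screen": screen,
        "tz_offset_minutes": tz_offset,
    }


if __name__ == "__main__":
    example = (
        "http://tracker.example.com/hit?site=42&page=%2Fstart.html"
        "&ref=http%3A%2F%2Fwww.example.com%2F&res=1024x768&tz=420"
    )
    print(decode_beacon(example))
```

A real tracking center would then write each record to its database; that step is left out here.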
This kind of reporting is able to provide a lot more information than simple server logs because the data is gathered directly on the visitor’s computer at the time of the visit. This is also the reason why this kind of tracking is called “real-time” tracking.
To implement this kind of tracking, you must have access to the source code of your pages, since you need to embed the tracking script into them. You should also have privileges to upload files to your hosting server. There aren’t any more technical requirements.
Here’s how real-time tracking compares to log analysis:
| Factor | Log analyzer | Visitor tracking system |
|---|---|---|
| Dynamic IP addresses | Counts multiple visitors | Counts one visitor |
| Large network accessing the Internet through a proxy server | Counts one visitor | Counts multiple visitors |
| Page caching by a proxy server | Views of cached pages are not counted | All page views are counted |
| Non-browser activity | Counted | Not counted |
| Interrupted requests | A visit is counted as soon as the request header is received | Page loading and tracking script execution need some time, so if the visit is too short, it will not be counted |
| Framed sites | A page consisting of a few frames is considered as multiple pages | A page consisting of a few frames is considered as a single page |

Note (dynamic IP addresses and proxy servers): Visitor tracking systems identify unique visitors by cookies (this method is backed up by IP tracking when cookies are not supported). Log analyzers identify visitors by their IP addresses. Cookie tracking provides much more accurate results than IP tracking when a) a visitor uses rotating (dynamic) IP addresses: server logs will record many visitors based on different IP addresses, while the Visitor tracking system will identify the visitor by checking the cookie; b) visitors come to your site from behind a corporate firewall and use the same proxy server: there will be one visitor in the log files, whereas Visitor tracking systems will recognize all visitors by reading their cookies.

Note (page caching by a proxy server): When a proxy server caches Web pages requested from the server, repeated requests for the same page will not be recorded in the server log files, because the page will be taken from the cache of the proxy server and the Web server will not get the request.

Note (non-browser activity): Most robots do not support images and do not execute JavaScript; this is why their visits are excluded from reports in Visitor tracking systems. Human visitors with images and scripts turned off in their browsers will also not be counted in Visitor tracking systems.

Note (interrupted requests): Visitor tracking systems need script execution. The time needed for a browser to load the page and execute the script depends on page size and the position of the script within the HTML code of the page. This is why, if the server receives the page request, the hit (visit) will be recorded in server logs; but if the request has been interrupted and the tracking script has not been executed, this visit will not be counted by the Visitor tracking system.

Note (framed sites): Framed sites present some difficulty for accurate page view measurement because they consist of a few pages loaded in frames. Servers count all loaded pages in frames as multiple page views, while an advanced Visitor tracking system (like HitLens) will report one page view.
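The difference between IP-based and cookie-based visitor identification in the comparison above can be illustrated with a small sketch. It assumes each hit is a dictionary with an `ip` field and an optional `cookie` field; these field names, the sample data and the helper functions are hypothetical and only mirror the two scenarios described in the text (one dial-up user with changing IP addresses, and several office users behind one proxy).

```python
def count_by_ip(hits):
    """Count unique visitors the way a log analyzer does: one per IP address."""
    return len({hit["ip"] for hit in hits})


def visitor_key(hit):
    """Identify a visitor by cookie, falling back to the IP address
    when no cookie is available."""
    cookie = hit.get("cookie")
    return ("cookie", cookie) if cookie else ("ip", hit["ip"])


def count_by_cookie(hits):
    """Count unique visitors the way a cookie-based tracking system does."""
    return len({visitor_key(hit) for hit in hits})


# One dial-up user whose IP address changes between visits.
dialup_hits = [
    {"ip": "10.0.0.1", "cookie": "visitor-a"},
    {"ip": "10.0.0.2", "cookie": "visitor-a"},
    {"ip": "10.0.0.3", "cookie": "visitor-a"},
]

# Three office users sharing one external IP address through a proxy.
office_hits = [
    {"ip": "192.0.2.10", "cookie": "visitor-b"},
    {"ip": "192.0.2.10", "cookie": "visitor-c"},
    {"ip": "192.0.2.10", "cookie": "visitor-d"},
]

if __name__ == "__main__":
    print("dial-up:", count_by_ip(dialup_hits), "by IP,",
          count_by_cookie(dialup_hits), "by cookie")
    print("office: ", count_by_ip(office_hits), "by IP,",
          count_by_cookie(office_hits), "by cookie")
```

With this sample data, counting by IP reports three visitors for the single dial-up user and one visitor for the three office users, while counting by cookie reports one and three respectively.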
HitLens is one of the industry-leading solutions for real-time browser-side visitor tracking. In the next few lessons you will find out how to set up HitLens statistics for your Web site(s).
What you should remember
- There are two main ways to analyze traffic: one is to parse server access logs and the other is to embed tracking scripts into the pages themselves. Special software exists to simplify both techniques.
- Real-time visitor tracking with the help of a tracking script helps you gather more precise and valuable marketing information than the analysis of server access logs.