| United States Patent | 6189030 |
| Inventor(s) | Kirsch; Steven T. (Los Altos, CA);
Lindblad; Christopher J. (Oakland, CA) |
| Abstract | A message is provided to a tracking server system in response to a client
system referencing a predetermined resource locator that corresponds to a
resource external to the tracking server system. The tracking server
system indirectly provides for the client system to have an informational
element selectable by the client system, where the informational element
is graphically identified on the client system with informational content
obtainable from a content server system through use of a content resource
locator. The informational element includes a tracking resource locator,
referencing the tracking server system, and data identifying the
informational element. The selection of the informational element causes
the client system to use the tracking resource locator to provide the data
to the tracking server system and to use the content resource locator to
obtain the informational content from the content server system. |
Method and apparatus for redirection of server external hyper-link
references |
| Publication Date |
February 13, 2001 |
| Parent Case |
CROSS REFERENCE TO RELATED APPLICATIONS
This application is a continuation-in-part of application Ser. No.
08/999,727, filed Dec. 23, 1997, now U.S. Pat. No. 5,870,546, which is a
continuation of application Ser. No. 08/604,468, filed Feb. 21, 1996, now
U.S. Pat. No. 5,751,956. |
Description  |
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention is generally related to the control of network
information server systems supporting World Wide Web based data pages and,
in particular, to a server system and process for efficiently redirecting
external server hyper-link references for purposes of controlling,
moderating, and accounting for such references.
2. Description of the Related Art
The recent substantial growth and use of the internationally connected
network generally known as the Internet has largely been due to widespread
support of the hypertext transfer protocol (HTTP). This protocol permits
client systems connected through Internet Service Providers (ISPs) to
access independent and geographically scattered server systems also
connected to the Internet. Client side browsers, such as Netscape
Mozilla® and Navigator® (Netscape Communications Corp.), Microsoft
Internet Explorer® and NCSA Mosaic™, provide efficient graphical
user interface based client applications that implement the client side
portion of the HTTP protocol.
Server side application programs, generically referred to as HTTPd servers,
implement the server side portion of the HTTP protocol. HTTP server
applications are available both commercially, from companies such as
Netscape, and as copyrighted freeware available in source code form from
NCSA.
The distributed system of communication and information transfer made
possible by the HTTP protocol is commonly known as the World Wide Web (WWW
or W3) or as simply "the Web." From a client side user interface
perspective, a system of uniform resource locators (URLs) is used to
direct the operation of a web browser in establishing atomic transactional
communication sessions with designated web server computer systems. In
general, each URL is of the basic form:
http://<server_name>.<sub_domain.top_level_domain>/<path>
The server_name is typically "www" and the sub_domain.top_level_domain is a
standard Internet domain reference. The path is an optional additional URL
qualifier.
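As an illustration of the basic URL form described above, the following minimal Python sketch splits a URL into its server name, domain, and path. The function name split_basic_url and the example URL are assumptions made for illustration only; treating the first host label as the server name follows the basic form given above and is not a general rule for all URLs.

```python
from urllib.parse import urlsplit

def split_basic_url(url):
    """Split a URL of the basic form into server name, domain, and path."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Per the basic form, the first host label is the server name
    # (typically "www") and the remainder is the domain reference.
    server_name, _, domain = host.partition(".")
    return {
        "scheme": parts.scheme,
        "server_name": server_name,
        "domain": domain,
        "path": parts.path,  # empty string when the optional path is omitted
    }

example = split_basic_url("http://www.example.com/docs/index.html")
no_path = split_basic_url("http://www.example.com")
```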
Specification by user selection of a URL on the client side results in a
transaction being established in which the client sends the server an HTTP
message referencing a default or explicitly named data file constructed in
accordance with the hypertext mark up language (HTML). This data file or
web page is returned in one or more response phase HTTP messages by the
server, generally for display by the client browser. Additional embedded
image references may be identified in the returned web page resulting in
the client browser initiating subsequent HTTP transactions to retrieve
typically embedded graphics files. A fully reconstructed web page image is
then presented by the browser through the browser's graphical user
interface.
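The follow-on retrieval of embedded images can be sketched as below. This is a simplified illustration, not a description of any particular browser: the class ImageRefCollector, the function embedded_image_urls, and the sample page are hypothetical. The sketch only collects the image URLs that a browser would subsequently request; it does not fetch them.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ImageRefCollector(HTMLParser):
    """Collect the absolute URL of every <img src=...> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Relative references are resolved against the page URL.
                self.image_urls.append(urljoin(self.base_url, src))

def embedded_image_urls(page_url, html_text):
    collector = ImageRefCollector(page_url)
    collector.feed(html_text)
    return collector.image_urls

page = ('<html><body><img src="/logo.gif">'
        '<img src="http://www.target.com/cgi-bin/count.cgi"></body></html>')
urls = embedded_image_urls("http://www.example.com/index.html", page)
```

Each URL in the resulting list would be the subject of a separate request issued by the browser while reconstructing the page.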
Due to the completely distributed client/server architecture of the Web, as
made possible by the URL system further supported by the existing Internet
name resolution services and routing conventions, HTTP servers can be
independently established with little difficulty. Consequently, the Web
has no centrally or even regionally enforced organization other than
loosely by name of the top level domain. Searching for information or
other resources provided by individual HTTP servers is therefore
problematic almost by definition. Because of the time, cost and complexity
of assembling comprehensive, yet efficiently searchable databases of web
information and resources, commercial Internet Business Services (IBS)
have been established to provide typically fee based or advertising
revenue supported search engine services that operate against compilations
of the information and resources available via the Web correlated to
source URLs. Access to such search engines is usually provided through
server local web pages served by the Internet Business Services. The
results of a search are served in the form of local web pages with
appropriate embedded remote or hyper-linked URLs dynamically constructed
by the server of the Internet Business Service.
Because of the opportunity presented by the likely repeated client access
and retrieval of search engine and search result web pages, providers of
other Internet based services have begun to actively place advertisements
on these web pages. As is typical in advertising mediums, the frequency of
display of an advertisement generally defines the compensation paid to the
advertisement publisher. Thus, the number of times that an advertisement
is simply transferred to a client browser provides an indication of how
effectively the advertisement is being published. A more direct measure of
the effectiveness of a particular advertisement on a particular web page
is the number of times a client web browser chooses to actively pursue the
URL represented by the advertisement. Thus, there is a need to be able to
track information obtainable from a client browser when a hyper-linked
advertiser's URL is selected.
The difficulty in obtaining direct reference information arises from the
fact that a web page with an embedded advertisement and corresponding
remote URL is served in its entirety to the client browser upon first
reference to the web page. The selection of a particular advertiser's URL
is then by definition performed through an independent transaction
directed to the HTTPd server associated with the advertiser. Since the
advertisement-publishing HTTPd server is not part of this subsequent
transaction, the publishing server is conventionally incapable of tracking
client browser hyper-links actually executed to an advertiser's URL or any
other URLs embedded in a web page previously served to the client browser.
Simple web page access counters are relatively well known and used
throughout the Web. These access counters are based on a common gateway
interface (CGI) facility supported by modern HTTPd server systems. The CGI
facility permits generally small programs, at least typically in terms of
function, to be executed by a server in response to a client URL request.
That is, the HTML web page definition provides for the embedding of a
specific HTML reference that will specify execution of a server side CGI
program as part of the process of the web browser reconstructing an image
of a served web page. Such a HTML reference is typically of the form:
<img src="http://www.target.com/cgi-bin/count.cgi">
Thus, a counter value incremented with each discrete execution of the CGI
program (count.cgi) dynamically provides part of the displayable image of
the reconstructed web page. The time, remote client requester, client
domain, client browser type and other information that may be known
through the operation of the HTTP protocol may be logged as part of the
CGI program's function. Consequently, a reasonable manner of accounting
and auditing for certain web page accesses exists.
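The counting and logging behavior of such an access counter can be sketched as follows. This is a minimal sketch under stated assumptions, not the count.cgi program itself: the function count_access and the file names are hypothetical, the counter value is returned as a number rather than rendered as an image, and no file locking is attempted. REMOTE_ADDR and HTTP_USER_AGENT are standard CGI environment variables.

```python
import os
import tempfile
import time

def count_access(counter_path, log_path, environ):
    """Increment a persistent counter and log what CGI exposes about the client."""
    try:
        with open(counter_path) as f:
            count = int(f.read().strip() or 0)
    except FileNotFoundError:
        count = 0  # first invocation: no counter file yet
    count += 1
    # Note: no locking, so concurrent invocations could lose an update.
    with open(counter_path, "w") as f:
        f.write(str(count))
    with open(log_path, "a") as log:
        log.write("\t".join([
            time.strftime("%Y-%m-%dT%H:%M:%S"),
            environ.get("REMOTE_ADDR", "-"),
            environ.get("HTTP_USER_AGENT", "-"),
        ]) + "\n")
    return count

# Demonstration with a simulated CGI environment in a temporary directory.
workdir = tempfile.mkdtemp()
counter_file = os.path.join(workdir, "count.txt")
log_file = os.path.join(workdir, "access.log")
client = {"REMOTE_ADDR": "192.0.2.10", "HTTP_USER_AGENT": "ExampleBrowser/1.0"}
first = count_access(counter_file, log_file, client)
second = count_access(counter_file, log_file, client)
```

Each call reads and rewrites files on persistent storage, which corresponds to the per-invocation cost that a separately executed counter program incurs.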
Access counters, however, fundamentally log only server local web page
accesses. The client browser's reference to the CGI program is evaluated by the
client in connection with the initial serving of the web page to the client
browser. The initial serving of the web page to the client browser can be
counted, but any subsequent selection of a URL that provides a
hyper-link reference to an external server is not observed and therefore
is not counted by a CGI program based access counter. Other limitations of
access counters arise from the fact that the implementing CGI program is
an independently loadable executable. The CGI program must be discretely
loaded and executed by the server computer system in response to each URL
reference to the CGI program. The repeated program loading and execution
overhead, though potentially small for each individual invocation of the
CGI program, can represent a significant if not substantial load to the
server computer system. The frequent execution of CGI programs is commonly
associated with a degradation of the effective average access time of the
HTTPd server in responding to client URL requests. Since an Internet
Business Service providing access to a search engine logs millions of
requests each day, even small reductions in the efficiency of serving web
pages can seriously degrade the cost efficiency of the Internet Business
Service. As of December, 1995, Infoseek Corporation, in particular,
handles an average of five million retrievals a day.
The execution overhead associated with CGI programs is often rather
significant. Many CGI programs are implemented at least in part through
the use of an interpreted language such as Perl or TCL. Consequently, a
substantial processing overhead is involved in multiple mass storage
transfers to load both the interpreter and CGI program scripts, to process
the scripts through the execution of the interpreter, and then actually
log whatever useful data is generated, typically to persistent mass
storage. Finally, the interpreter and/or CGI program may have to be
unloaded.
In addition, external CGI programs present a significant problem in terms
of maintenance, including initial and ongoing server configuration and
control, and security in the context of a busy server system. Individual
CGI programs will likely be needed for each independent web page in order
to separately identify web page service counts. Alternatively, a CGI
program can be made sufficiently complex to be able to distinguish the
precise manner in which the program is called so as to identify a
particular web page and log an appropriately distinctive access count.
Maintenance of such CGI programs on a server system where large numbers of
page accesses are being separately counted is non trivial.
Further, the existence of external programs, particularly of scripts that
are interpreted dynamically, represents a potential security problem. In
particular, the access and execute permissions of interpreted scripts must
be carefully managed and monitored to prevent any unauthorized script from
being executed that could, in turn, compromise the integrity of the data
being collected if not the fundamental integrity of the server computer
system itself. Consequently, known access counters provide no solution
directly in full or in part to the need to account or audit URL references
to external servers based on hyper-links from previously served web pages.
The HTTP protocol itself provides for a basic server based system of URL
redirection for servers and clients supporting the 1.5 or later versions
of the HTTP protocol. A configuration file associated with an HTTP server
(typically srm.conf) can specify a redirect directive that effectively
maps a server local directory URL reference to an external URL reference
through the use of a configuration directive of the form:
Redirect /dir1 http://newserver.widget.com/dir1
When a Version 1.5 or later HTTP server receives a URL reference to a local
directory (/dir1) that is specified as above for redirection, a redirect
message is returned to the client browser including a new location in the
form of an URL (http://newserver.widget.com/dir1). This redirect URL is
then used by the client browser as the basis for a conventional client URL
request.
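The effect of such a directive can be sketched in Python as a prefix mapping. This is an illustrative sketch only, not the server's actual implementation: the names REDIRECTS and apply_redirect are hypothetical, and the use of status code 302 with a Location header is an assumption about how the redirect message is conveyed.

```python
# Hypothetical in-memory equivalent of: Redirect /dir1 http://newserver.widget.com/dir1
REDIRECTS = {"/dir1": "http://newserver.widget.com/dir1"}

def apply_redirect(request_path, redirects=REDIRECTS):
    """Return (status, headers) if the path falls under a redirected directory."""
    for prefix, target in redirects.items():
        if request_path == prefix or request_path.startswith(prefix + "/"):
            # Everything below the directory is mapped to the same place
            # below the target, without distinguishing individual resources.
            location = target + request_path[len(prefix):]
            return 302, {"Location": location}
    return None  # not redirected; the server handles the request locally

moved = apply_redirect("/dir1/page.html")
local = apply_redirect("/dir2/page.html")
```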
This existing server based redirection function is insufficient to support
external server access tracking since, in its usual form, the redirection
is of the entire directory hierarchy that shares a common redirected base
directory. Even in the most restricted form, the redirection is performed
on a per directory reference basis. Thus, every access to the directory,
independent of the particular web page or graphics image or CGI program
that is the specific object of an access request is nonetheless discretely
redirected without distinction. Any potential use of the existing server
redirect function is therefore exceedingly constrained if not practically
prohibited by the HTTP protocol defined operation of the redirect
directive.
Furthermore, the redirect directive capability of the HTTP protocol server
does not provide for the execution of a CGI program or other executable
coincident with the performance of the redirection thereby essentially
precluding any action to capture information related to the redirect URL
request. In addition, the complexity of the resource configuration file
necessary to specify redirection down to a per directory configuration
again raises significant configuration, maintenance and, to a lesser
degree, security issues. Thus, server redirection does not possess even
the basic capabilities necessary to support external URL hyper-link
reference auditing or accounting.
Finally, a form of redirection might be accomplished through the utilization
of a relatively complex CGI program. Such a redirection CGI program would
likely need to perform some form of alternate resource identification as
necessary to identify a redirection target URL. Assuming that a unique
target URL can be identified, a redirection message can then be returned
to a client from the CGI program through the HTTP server as necessary to
provide a redirection URL to the client browser.
Unfortunately, any such CGI program would embody all of the disadvantages
associated with even the simplest access counter programs. Not only would
problems of execution load and latency, as well as configuration,
maintenance and security remain, but such an approach to providing
redirection is inherently vulnerable to access spoofing. Access spoofing
is a problem particular to CGI programs arising from the fact that the
HTML reference to the CGI program may be issued without relation to any
particular web page. Consequently, any CGI program implementing an access
counter or other auditing or accounting data collecting program can
produce an artificially inflated access count from repeated reference to
the CGI program HTML statement outside and independent of a proper web
page. Access spoofing inherently undermines the apparent if not actual
integrity of any data gathered by a CGI program. Since, at minimum, the
ability to ensure the accuracy of even a simple access count would be of
fundamental importance to an Internet service advertiser, the use of CGI
programs to provide even basic accounting or auditing functions is of
limited practical use. Finally, HTML does not provide a tamper-proof way
for two URLs to be accessed in sequence with just one URL reference
button, such as, for example, a server CGI counter URL reference followed
by external server URL reference.
SUMMARY OF THE INVENTION
Thus, a general purpose of the present invention is to provide a system and
method of reliably tracking and redirecting hyper-link references to
external server systems.
This is achieved by the present invention through the provision of a
message to a tracking server system in response to a client system
referencing a predetermined resource locator that corresponds to a
resource external to the tracking server system. The tracking server
system indirectly provides for the client system to have an informational
element selectable by the client system, where the informational element
is graphically identified on the client system with informational content
obtainable from a content server system through use of a content resource
locator. The informational element includes a tracking resource locator,
referencing the tracking server system, and data identifying the
informational element. The selection of the informational element causes
the client system to use the tracking resource locator to provide the data
to the tracking server system and to use the content resource locator to
obtain the informational content from the content server system.
Thus, an advantage of the present invention is that URL reference data is
captured in an expedient manner that interposes a minimum latency in
returning the ultimately referenced web page while imposing minimum
visibility of the redirection protocol on client users.
Another advantage of the present invention is that independent invocations
of server external support programs and multiple external data references
are not required as a consequence of the present invention, thereby
minimizing the CPU and disk intensive load on the web server computer
system and the resulting latency.
A further advantage of the present invention is that the reference
identifier and a redirection directive can both be maintained wholly
within the URL specification discretely provided by a client HTTP request.
Thus, the present invention is superior in both efficiency and maintenance
requirements to a CGI counter, or any method that incorporates a CGI
counter.
Still another advantage of the present invention is that program
modifications necessary to support the protocol of the present invention
are implemented entirely at the server end of a protocol transaction.
Client side participation in the transaction is within the existing client
side defined HTTP protocol.
A still further advantage of the present invention is that the
implementation of the invention introduces minimum exposure to additional
security breaches due to the closed form of the protocol while providing
substantial security against inappropriate URL and protocol references.
This is accomplished preferably by the inclusion of validation codes
inside the URL specification.
BRIEF DESCRIPTION OF THE DRAWINGS
These and other advantages and features of the present invention will
become better understood upon consideration of the following detailed
description of the invention when considered in connection with the
accompanying drawings, in which like reference numerals designate like
parts throughout the figures thereof, and wherein:
FIG. 1 provides a schematic representation of client and server computer
systems inter-networked through the Internet;
FIG. 2 provides a block diagram of a server computer system implementing an
HTTP daemon (HTTPd) server in accordance with a preferred embodiment of
the present invention;
FIG. 3 provides a flow diagram illustrating the process performed by a
preferred embodiment of the present invention in receiving and processing
client URL requests;
FIG. 4 provides a flow diagram illustrating the server side processing of
special redirect URLs in accordance with another preferred embodiment of
the present invention;
FIG. 5 provides a generalized process representation of client and server
computer systems implementing the alternate processes of the present
invention;
FIG. 6 is a flow diagram illustrating a server-side process that provides
for the issuance of a content request message in accordance with a
preferred embodiment of the present invention; and
FIG. 7 is a flow diagram illustrating a client-side process that provides
for the issuance of a tracking message in accordance with a preferred
embodiment of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
A typical environment 10 utilizing the Internet for network services is
shown in FIG. 1. Client computer system 12 is coupled directly or through
an Internet service provider (ISP) to the Internet 14. By logical
reference via a uniform resource locator, a corresponding Internet server
system 16, 18 may be accessed. A generally closed hypertext transfer
protocol transaction is conducted between a client browser application
executing on the client system 12 and an HTTPd server application
executing on the server system 16. In a preferred embodiment of the
present invention, the server system 16 represents an Internet Business
Service (IBS) that supports or serves web pages that embed hyper-link
references to other HTTPd server systems coupled to the Internet 14 and
that are at least logically external to the server system 16.
Within this general framework, the present invention enables the tracking
of the selection of embedded hyper-link references by client system 12.
That is, an embedded hyper-link reference is associated with a graphical
banner or other Web page element that is selectable, or clickable, by a
user of the client system 12. A banner click on a client system is
typically made to obtain information, identified in some fashion by the
banner graphic, that is of interest to the client system user. Tracking is
preferably enabled by embedding HTML information in the Web page served to
the client system 12. This information is served from any prearranged
HTTPd server system to the client system 12. The prearrangement is with an
IBS to track banner clicks, on Web pages served by or on behalf of a
designated tracking HTTPd server system, such as system 16, that operates
to collect the served page provided tracking information.
The embedded information is, in accord with the present invention,
sufficient to enable the client computer system 12 to provide tracking
information to the HTTPd server system 16. As will be seen, this
information is also sufficient, directly or indirectly, to enable the
client computer to request the information associated with the banner
graphic. As will also be seen, there are a number of possible
implementations of the present invention. These implementations can
generally be categorized as predominately using either a server-side or
client-side process, as involving proprietary, plug-in, and interpreted
control processes, and as using any of a number of specific data transfer
protocols.
The preferred embodiment of the present invention utilizes a server-side
process implemented as a proprietary modification to the HTTPd server
application executed by the server system 16 and that uses the HTTP
redirection directive. Thus, a web page served by an HTTPd server system,
such as the server system 16 or another server system (not shown) to the
client 12 embeds a URL reference to a web page served by the logically
external server system. Selection of this embedded URL through the client
browser of the client computer system 12 results initially in an HTTP
transaction with the server system 16 rather than the external server. The
information stored in the embedded URL first served with the web page to
client system 12 is thus provided back to the server system 16 upon
selection of the URL even though the apparent target of the URL is the
external server system. A redirection response is then provided by the
server system 16 to the client system 12 providing the corresponding
redirection URL.
As shown in FIG. 2, the server system 16 receives the redirection request
information via a network connection 20 to a network interface 22 within
the server system 16. The network interface 22 is coupled through an
internal bus 24 to a central processing unit (CPU) 26. The CPU 26 executes
a network operating system 28 in support of the network interface 22 and
other functional aspects of the server system 16. The network operating
system 28 supports the execution by the CPU 26 of an HTTPd server
application 30 that defines the responsive operation of the server system
16 to HTTP requests received via the network 20. Finally, the network
operating system 28 provides for temporary and persistent storage of data
in a mass storage device 32 preferably including a persistent storage
media such as provided by a conventional hard disk drive.
In accordance with the preferred embodiment of the present invention, the
embedded redirection information provided as part of a URL HTTP request is
processed by the HTTPd server 30. Preferably, the processing by the HTTPd
server 30 is performed through the execution of the server 30 itself as
opposed to the execution of any external CGI programs or the like. The
redirection information is processed by the execution of the server 30 to
identify and validate the particular URL reference that provided the
redirection information and to generate a redirection target URL.
In a preferred embodiment of the present invention, an embedded URL
containing redirection information is formatted as follows:
http://<direct_server>/redirect?<data>?http://<redirect_server>
The direct_server portion of the embedded URL specifies the HTTP server
target of a transaction that is to be initially established by the client
system 12. The remaining information is provided to the tracking or
targeted direct server. The direct server may be any HTTPd server
accessible by the client system 12 that has been designated to service
redirection requests in accordance with the present invention.
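The embedded URL format given above can be decomposed as in the following minimal Python sketch. The function parse_redirect_url and the example host names are hypothetical; the sketch assumes that the data field itself contains no "?" character, so that the first "?" after the key word separates the data from the redirection target.

```python
from urllib.parse import urlsplit

def parse_redirect_url(url, keyword="redirect"):
    """Split http://<direct_server>/<keyword>?<data>?<target> into its parts."""
    parts = urlsplit(url)
    if parts.path != "/" + keyword:
        return None  # not a redirection URL
    # Everything after the first "?" is the query: "<data>?<target>".
    data, sep, target = parts.query.partition("?")
    if not sep or not target:
        return None  # malformed: data or target missing
    return {"direct_server": parts.netloc, "data": data, "target": target}

parsed = parse_redirect_url(
    "http://tracker.example.com/redirect?ad42?http://www.advertiser.example.com/home"
)
```

The direct server receives the request, so both the data and the target are available to it before the client ever contacts the target server.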
The term "redirect" in the embedded redirection URL is a key word that is
pre-identified to the HTTPd server 30 to specify that the URL corresponds
to a redirection request in accordance with the present invention.
Although the term "redirect" is the preferred term, any term or code may
be selected provided that the term can be uniquely identified by the HTTPd
server 30 to designate a redirection URL. The recognition processing of
the "redirect" term is preferably performed through the execution of the
server 30 by way of a corresponding modification to the HTTPd server
application. That is, the HTTPd server application is modified to
recognize the term "redirect" as a key word and to execute a subprogram to
implement the server-side process of this preferred embodiment.
Alternately, the modification to the HTTPd server application can be
implemented as a "plug-in" binary program operative through a conventional
interface provided with the HTTPd server application to obtain essentially
the same functionality. Although of possibly lesser performance, a server
application embedded language, such as Java® or JavaScript®, may
be also alternately used to implement the server-side process of
recognizing the "redirect" key word and performing the further processing
to implement the present invention.
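The key word recognition described above, performed within the server's own request handling rather than by an external program, can be sketched as follows. This is a simplified illustration under stated assumptions and not the modified HTTPd server application: handle_request, serve_local_page, and click_counts are hypothetical names, the in-memory tally stands in for whatever accounting the server keeps, and the 302 and 400 status codes are assumed choices.

```python
from collections import Counter

REDIRECT_KEYWORD = "redirect"  # any uniquely recognizable term would do
click_counts = Counter()       # in-process tally keyed by the <data> field

def serve_local_page(request_target):
    """Placeholder for the server's ordinary page handling (hypothetical)."""
    return 200, {"Content-Type": "text/html"}

def handle_request(request_target):
    """Return (status, headers) for a request target such as '/redirect?data?url'."""
    path, _, rest = request_target.partition("?")
    if path != "/" + REDIRECT_KEYWORD:
        return serve_local_page(request_target)
    data, sep, target = rest.partition("?")
    if not sep or not target:
        return 400, {}  # key word present but data or target missing
    # Accounting happens inside the server process; no external program runs.
    click_counts[data] += 1
    return 302, {"Location": target}

status, headers = handle_request("/redirect?ad42?http://www.advertiser.example.com/")
```

Because the tally is updated in the same process that builds the response, no separate program needs to be loaded for each selected link.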
The "data" term of the redirection URL provides reference identifier data
to the HTTPd server 30 that can be used to further identify and
potentially validate a redirection URL to the HTTPd server 30. The data
thus permits an accounting of the redirection URL to be made by the HTTPd
server 30. In the context of an advertisement, the data may encode a
particular advertising client for whom access data may be kept, a
particular instance of the graphic image provided to a client system 12 in
association with the redirection URL, and potentially a validation code
that may serve