| United States Patent | 6144962 |
| Link to this page | http://www.wikipatents.com/6144962.html |
| Inventor(s) | Weinberg; Amir (Zoran, IL); Pogrebisky; Michael (Herzliya, IL) |
| Abstract | A visual Web site analysis program, implemented as a collection of software
components, provides a variety of features for facilitating the analysis
and management of Web sites and Web site content. A mapping component
scans a Web site over a network connection and builds a site map which
graphically depicts the URLs and links of the site. Site maps are
generated using a unique layout and display methodology which allows the
user to visualize the overall architecture of the Web site. Various map
navigation and URL filtering features are provided to facilitate the task
of identifying and repairing common Web site problems, such as links to
missing URLs. A dynamic page scan feature enables the user to include
dynamically-generated Web pages within the site map by capturing the
output of a standard Web browser when a form is submitted by the user, and
then automatically resubmitting this output during subsequent mappings of
the site. The Web site analysis program is implemented using an extensible
architecture which includes an API that allows plug-in applications to
manipulate the display of the site map. Various plug-ins are provided
which utilize the API to extend the functionality of the analysis program,
including an action tracking plug-in which detects user activity and
behavioral data (link activity levels, common site entry and exit points,
etc.) from server log files and then superimposes such data onto the site
map. |
| Title | Visualization of web sites and hierarchical data structures |
| Publication Date | November 7, 2000 |
| Filing Date | April 11, 1997 |
| Parent Case |
PRIORITY CLAIM
This application claims the benefit of U.S. Provisional Application No.
60/028,474 titled SOFTWARE SYSTEM AND ASSOCIATED METHODS FOR FACILITATING
THE ANALYSIS AND MANAGEMENT OF WEB SITES, filed Oct. 15, 1996, which is
hereby incorporated by reference.
MICROFICHE APPENDIX
This specification includes a microfiche appendix (consisting of 1 sheet
with 45 frames) which contains a partial source code listing and an API
(application program interface) listing of a preferred embodiment of the
invention, as Appendices A and B, respectively. These materials form part
of the disclosure of the specification. |
Claims  |
What is claimed is:
1. A computer program for facilitating the visualization of a web site, the computer program comprising, on a computer readable medium:
executable scanning code which scans the web site and generates a representation of the web site within a memory of a computer, the web site representation including representations of content objects and links of the web site;
executable hierarchy identification code which reduces the web site representation generated by the scanning code to a hierarchical tree representation in which each content object corresponds to a respective node of the tree; and
executable mapping code which generates and displays a graphical map of the web site on a display screen of the computer based on the tree representation, the mapping code configured to graphically represent the content objects of the web site as
respective content object icons, the mapping code comprising a layout routine which positions and interconnects the content object icons within the map based upon parent-child relationships of the content objects within the tree, the layout routine
positioning the icons such that icons of parent content objects ("parents") are spatially grouped together with icons of the respective immediate children content objects ("children") of said parents to form a hierarchical arrangement of parent-child
icon clusters in which the children generally surround their respective immediate parents over multiple parent-child levels.
2. The computer program according to claim 1, wherein the hierarchy identification code is configured to apply a shortest path span tree algorithm to the web site representation.
3. The computer program according to claim 1, wherein the layout routine implements a layout method which positions the icons of a plurality of immediate children of a common parent at angular increments around the icon of the parent, and wherein the layout routine is configured to recursively apply the layout method at least once for each parent content object that has multiple children.
4. The computer program according to claim 3, wherein the layout routine automatically positions the respective icons of the plurality of immediate children away from the icon of the parent by substantially equal radial distances when all of the
immediate children correspond to leaf nodes of the tree, said radial distances being directly proportional to a total number of the plurality of immediate children.
5. The computer program according to claim 1, wherein the layout routine is configured to select the display sizes of the content object icons within the map in direct proportion to respective numbers of outgoing links, so that web pages with
relatively large numbers of outgoing links are represented by relatively large icons throughout the map.
6. The computer program according to claim 1, wherein the mapping code is configured to graphically represent each link of the tree as a respective line which connects an icon of a parent to an icon of an immediate child of the parent within the
map.
7. The computer program according to claim 6, wherein the mapping code is configured to position an icon of a home page content object of the web site generally at the center of the map such that the hierarchical arrangement of icon clusters
extends outward from the icon of the home page content object.
8. The computer program according to claim 1, further comprising a navigation interface which allows a user of the computer to interactively navigate the graphical map, the navigation interface including magnification controls for allowing the
user to select a portion of the map and view the selected portion in a magnified, zoomed-in display mode.
9. The computer program according to claim 8, wherein the navigation interface presents the user with a contextual view of a navigation position within the map by displaying, within a window of the display screen, a zoomed-out view of the map
together with a graphical indication of the portion being viewed in the zoomed-in mode.
10. The computer program according to claim 8, wherein the navigation interface includes executable code for automatically exposing object-specific information within the map in response to magnification of the map via the magnification
controls.
11. The computer program according to claim 10, wherein the code for automatically exposing object-specific information comprises code for automatically displaying a textual annotation of a content object when an icon of the content object is
magnified beyond a predetermined magnification level.
12. The computer program according to claim 1, wherein the scanning code includes code for automatically scanning a dynamically-generated web page using a data set captured during a prior web site browsing session, the dynamically-generated web
page being a page which is generated dynamically in response to data submitted by a user.
13. The computer program according to claim 1, further comprising map comparison code which compares a current map of the web site to a previously-generated map of the web site, and generates a comparison map which graphically represents
differences between the current and previously-generated maps, to thereby allow a user to view changes that have been made to the web site.
14. The computer program according to claim 1, further comprising an application program interface ("API"), the API including methods which allow application programs to superimpose site data onto the graphical map.
15. The computer program according to claim 14, further comprising an application program which makes calls to the API to superimpose web site usage data onto the map.
16. The computer program according to claim 1, wherein the layout routine positions icons of children around icons of their respective parents over a range of substantially 360 degrees over multiple parent-child levels.
17. The computer program according to claim 1, further comprising filtering code which provides a user option to apply a content filter for hiding icons of content objects of a predetermined type.
18. The computer program according to claim 1, wherein the mapping code provides a user option to re-apply the layout routine to a filtered map generated by the filtering code.
19. The computer program according to claim 1, wherein the layout routine positions a largest satellite cluster of a first parent using the same entry angle as used to position the first parent, so that the largest satellite, the first parent,
and a parent of the first parent fall generally along the same line.
20. A method of facilitating the visualization of a web site by a user, the method comprising the computer-implemented steps of:
(a) scanning the web site to generate a representation of the web site;
(b) generating a graphical site map of the web site on a display screen of a computer using the representation generated in step (a) to enable the user to view a structural representation of the web site, the site map comprising a plurality of
interconnected icons which represent content objects of the web site; and
(c) providing a navigation interface which allows the user to interactively navigate the site map, the navigation interface including a zoom control which allows the user to interactively zoom in and out on user-selected portions of the map.
21. The method according to claim 20, wherein step (b) comprises displaying textual annotations within the site map in association with the icons.
22. The method according to claim 20, wherein step (b) comprises automatically adjusting sizes of at least the icons such that the entire site map fits on the display screen.
23. The method according to claim 22, wherein the navigation interface concurrently displays first and second views of the site map when the user views the site map in a zoomed-in mode, the first view being a magnified view of a user-selected
portion of the site map, the second view being a perspective view of the site map together with a graphical indication of the user-selected portion.
24. The method according to claim 20, wherein step (b) comprises displaying the icons within the map such that groups of sibling icons generally surround their respective parent icons.
25. A method of facilitating the visualization of a web site, the method comprising the computer-implemented steps of:
identifying a hierarchy of content objects of the web site; and
generating a graphical map of the hierarchy on a display screen, the graphical map comprising interconnected content object icons that represent respective content objects, the step of generating comprising:
(a) identifying a parent content object ("parent") of the hierarchy that has multiple immediate children content objects ("children");
(b) identifying the multiple immediate children of the parent identified in step (a);
(c) positioning respective icons of the immediate children identified in step (b) around an icon of the parent identified in step (a) such that the icons of the children generally surround the icon of the parent within the map; and
(d) repeating steps (a), (b) and (c) for each of a plurality of additional parents of the hierarchy that have multiple immediate children.
26. The method according to claim 25, wherein the step of identifying a hierarchy comprises generating a tree representation of the content objects and links of the web site.
27. The method according to claim 25, wherein step (c) is performed such that the icons of the immediate children are spaced at substantially equal angular intervals around the icon of the parent when none of the immediate children has a child.
28. The method according to claim 25, wherein step (c) is performed such that all of the icons of the immediate children are radially spaced away from the icon of the parent by substantially the same distance when none of the immediate children
has a child.
29. The method according to claim 28, wherein the distance is directly proportional to the number of immediate children.
30. The method according to claim 25, wherein the step of generating the graphical map further comprises selecting icon display sizes such that the icon of each parent that has no grandchildren is directly proportional in size to the number of
immediate children of that parent.
31. The method according to claim 25, wherein the step of generating a graphical map further comprises connecting, within the map, the respective icon of each child of the hierarchy to the icon of the child's immediate parent, to graphically
represent the links of the hierarchy.
32. The method according to claim 31, wherein the step of identifying the hierarchy comprises:
generating a graph data structure which represents at least some of the content objects and links of the web site; and
applying a span tree algorithm to the graph data structure to generate a hierarchical tree representation of the web site, the step of applying causing a plurality of the links of the graph data structure to be omitted from the graphical map.
33. The method according to claim 32, wherein the step of applying the span tree algorithm comprises assigning a home page content object of the web site as a root node of the tree.
34. The method according to claim 25, wherein steps (a)-(d) are performed such that an icon of a home page content object is positioned generally at the center of the map, and such that respective icons of a plurality of immediate children of
the home page content object are positioned around the home page icon over a range of substantially 360 degrees.
35. The method according to claim 25, wherein step (c) comprises calculating angular spacings for positioning the icons of the children identified in step (b) around the icon of the parent identified in step (a).
36. The method according to claim 25, further comprising the steps of:
presenting a user with a content filter which enables the user to selectively hide icons of content objects of a predetermined type within the map; and
automatically hiding icons of content objects of the predetermined type in response to user actuation of the content filter, to generate a filtered map.
37. The method according to claim 36, wherein the step of automatically hiding is performed without substantially changing a general layout of content object icons within the map, so that the filtered map is presented to the user as a skeletal
view of the original map.
38. The method according to claim 37, further comprising the steps of:
presenting the user with a layout control which enables the user to selectively apply a layout method to the filtered map; and
reformatting the filtered map on the display screen in response to user actuation of the layout control to generate a non-skeletal view of the filtered map, the step of reformatting comprising reapplying steps (a)-(d) to the content objects of
the filtered map.
39. The method according to claim 38, wherein the step of reformatting comprises modifying respective sizes of at least some of the content object icons of the filtered map.
40. The method according to claim 25, further comprising the steps of:
presenting a user with a variable zoom control which enables the user to magnify a user-selected portion of the graphical map; and
enlarging the user-selected portion of the map on the display screen in response to user actuation of the zoom control.
41. The method according to claim 40, wherein the step of enlarging comprises displaying, within the map, content object information that is not displayed when the map is viewed in a zoomed-out display mode.
42. The method according to claim 40, further comprising the step of presenting the user with a contextual view of a navigation position within the map by displaying, within a window of the display screen, a zoomed-out view of the map together
with a graphical indication of the user-selected portion.
43. The method according to claim 25, further comprising positioning a largest satellite cluster of a parent using the same entry angle as used to position the parent.
44. A method of generating a graphical map of a tree data structure on a display screen, the tree structure comprising a plurality of objects, the method comprising the computer-implemented steps of:
(a) representing the objects as respective icons within the map;
(b) identifying, within the tree structure, a parent object ("parent") that has multiple immediate children objects ("children");
(c) identifying the multiple immediate children of the parent identified in step (b);
(d) positioning the respective icons of the immediate children identified in step (c) around the icon of the parent identified in step (b) on the display screen such that the icons of the children surround the icon of the parent within the map;
(e) displaying a respective parent-child connection within the map between each child icon and the parent icon positioned in step (d); and
(f) repeating steps (b), (c), (d) and (e) for each of a plurality of additional parents of the tree structure that have multiple immediate children.
45. The method according to claim 44, wherein step (d) comprises calculating angular spacings for positioning the icons of the children identified in step (c) around the icon of the parent identified in step (b).
46. The method according to claim 44, wherein steps (b)-(f) are performed such that icons of children objects are positioned around icons of their respective parents over a range of substantially 360 degrees.
47. The method according to claim 44, wherein step (d) is performed such that the icons of the immediate children are spaced at substantially equal angular intervals around the icon of the parent when none of the immediate children has a child.
48. The method according to claim 44, wherein step (d) is performed such that all of the icons of the immediate children are radially spaced away from the icon of the parent by substantially the same distance when none of the immediate children
has a child.
49. The method according to claim 48, wherein the distance is directly proportional to the number of immediate children.
50. The method according to claim 44, further comprising selecting icon display sizes such that the icon of each parent object that has no grandchildren objects is directly proportional in size to the number of immediate children of that parent
object.
51. The method according to claim 44, wherein the tree data structure represents an arrangement of content objects of a web site.
52. The method according to claim 44, wherein the tree data structure represents a locally-stored arrangement of files and file directories.
53. The method according to claim 44, further comprising the steps of:
presenting a user with a variable zoom control which enables the user to selectively zoom-in on portions of the graphical map; and
enlarging a portion of the map on the display screen in response to user actuation of the zoom control.
54. The method according to claim 53, further comprising the step of presenting the user with a contextual view of a navigation position within the map by displaying, within a window of the display screen, a zoomed-out view of the map together
with a graphical indication of the portion.
55. The method according to claim 44, further comprising positioning a largest satellite cluster of a parent using the same entry angle as used to position the parent.
56. A method of representing a hierarchical node-link structure on a display screen, comprising:
(a) identifying a plurality of parent nodes of the structure that have multiple children nodes, including parent nodes at multiple different levels of the hierarchical structure;
(b) positioning a parent node identified in step (a) on the display screen, and positioning the children nodes of the parent node around the parent node over an angular range which exceeds 180 degrees; and
(c) repeating step (b) recursively for multiple additional parent nodes identified in step (a), including parent nodes at multiple different levels of the hierarchical structure;
wherein the method produces a map which comprises a hierarchical arrangement of parent-child node clusters.
57. The method of claim 56, further comprising positioning a largest satellite node cluster of a first parent node on the display screen using the same entry angle as used to position the first parent node, so that the largest satellite cluster,
the first parent node, and a parent of the first parent fall generally along the same line.
58. A method of graphically representing a web document on a display screen, the web document comprising a main document portion which comprises a plurality of links to a plurality of respective document components, the method comprising the
computer-implemented steps of:
representing the main document portion as a first icon on the display screen;
representing the plurality of document components as a plurality of respective additional icons that are positioned on the display screen generally around the first icon, the step of representing comprising spacing each of the additional icons
away from the first icon by substantially the same distance, and calculating said distance based at least in part on a total number of said document components; and
representing each of the plurality of links as a respective interconnection between the first icon and a respective one of the additional icons.
59. The method according to claim 58, wherein the step of representing the plurality of document components comprises positioning the additional icons circularly around the first icon on the display screen.
60. The method according to claim 59, wherein the step of positioning comprises spacing the additional icons around the first icon at substantially equal angular intervals.
61. The method according to claim 58, wherein the step of representing the plurality of document components comprises positioning the additional icons circularly around the first icon on the display screen over an angular range of less than 360
degrees.
62. The method according to claim 61, further comprising:
representing a link to the main document portion as at least a line which extends from the first icon on the display screen; and
selecting the angular range such that the angular range does not encompass the line.
63. The method according to claim 62, wherein the step of representing the plurality of document components comprises spacing the additional icons apart from one another at substantially equal angular intervals within the angular range.
64. The method according to claim 58, further comprising modifying icon display sizes so that the first icon is larger than any of the plurality of additional icons.
65. The method according to claim 58, further comprising displaying, in close proximity to each additional icon, a respective textual annotation which identifies at least a content type of the respective document component.
66. The method according to claim 58, further comprising the computer-implemented steps of:
parsing the main document component to identify the plurality of links and the plurality of document components;
attempting to access each of the plurality of document components identified during the step of parsing, and representing a document component for which the attempt to access is unsuccessful with a special icon that represents a failed access
attempt.
67. The method according to claim 58, wherein the main document component further comprises a hyperlink to a second web document, and the method further comprises displaying the hyperlink as an interconnection between the first icon and an icon
that represents the second web document. |
Description  |
FIELD OF THE INVENTION
The present invention relates generally to database management, analysis and visualization tools. More particularly, the present invention relates to software tools for facilitating the management and analysis of World Wide Web sites and other
types of database systems which utilize hyperlinks to facilitate user navigation.
BACKGROUND OF THE INVENTION
With the increasing popularity and complexity of Internet and intranet applications, the task of managing Web site content and maintaining Web site effectiveness has become increasingly difficult. Company Webmasters and business managers are
routinely faced with a wide array of burdensome tasks, including, for example, the identification and repair of large numbers of broken links (i.e., links to missing URLs), the monitoring and organization of large volumes of diverse,
continuously-changing Web site content, and the detection and management of congested links. These problems are particularly troublesome for companies that rely on their respective Web sites to provide mission-critical information and services to
customers and business partners.
Several software companies have developed software products which address some of these problems by generating graphical maps of Web site content and providing tools for navigating and managing the content displayed within the maps. Examples of
such software tools include WebMapper™ from Netcarta Corporation and WebAnalyzer™ from InContext Corporation. Unfortunately, the graphical site maps generated by these products tend to be difficult to navigate, and fail to convey much of the
information needed by Webmasters to effectively manage complex Web sites. As a result, many companies continue to resort to the burdensome task of manually generating large, paper-based maps of their Web sites. In addition, many of these products are
only capable of mapping certain types of Web pages, and do not provide the types of analysis tools needed by Webmasters to evaluate the performance and effectiveness of Web sites.
The present invention addresses these and other limitations in existing products and technologies.
SUMMARY OF THE INVENTIVE FEATURES
In accordance with the present invention, a software package ("Web site analysis program") is provided which includes a variety of features for facilitating the management and analysis of Web sites. In the preferred embodiment, the program runs
on a network-connected PC under the Windows® 95 or Windows® NT operating system, and utilizes the standard protocols and conventions of the World Wide Web ("Web"). In other embodiments, the program may be adapted to provide for the analysis of
other types of hypertextual-content sites, including sites based on non-standard protocols.
In the preferred embodiment, the program includes Web site scanning routines which use conventional webcrawling techniques to gather information about the content objects (HTML documents, GIF files, etc.) and links of a Web site via a network
connection. Mapping routines of the program in turn use this information to generate, on the computer's display screen, a graphical site map that shows the overall architecture (i.e., the structural arrangement of content objects and links) of the Web
site. A user interface of the program allows the user to perform actions such as initiating and pausing the scanning/mapping of a Web site, zooming in and out on portions of a site map, applying content filters to the site map to filter out content objects of
specific types, and saving and retrieving maps to/from disk. A map comparison tool allows the user to generate a comparison map which highlights changes that have been made to the Web site since a previous mapping of the site.
In accordance with one aspect of the invention, the Web site analysis program implements a map generation method which greatly facilitates the visualization by the user of the overall architecture of the Web site, and allows the user to navigate
the map in an intuitive manner to explore the content of the Web site. To generate the site map, a structural representation of the Web site (specifying the actual arrangement of content objects and links) is initially reduced, for purposes of
generating the site map, to a hierarchical tree representation in which each content object of the Web site is represented as a node of the tree. A recursive layout method is then applied which uses the parent-child node relationships, as such
relationships exist within the tree, to spatially position the nodes (represented as respective icons within the map) on the display screen such that children nodes are positioned around and connected to their respective immediate parents. (This layout
method can also be used to display other types of hierarchical data structures, such as the tree structure of a conventional file system.) The result is a map which comprises a hierarchical arrangement of parent-child node (icon) clusters in which
parent-child relationships are immediately apparent.
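The two stages described above (reducing the link graph to a tree, then recursively placing children around their parents) can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a breadth-first search from the home page as the tree reduction (claim 2 mentions a shortest-path span tree, but the text here does not fix an algorithm), a fixed starting radius, and a hypothetical shrink factor of 0.4 per level. The names `spanning_tree` and `layout` are illustrative only.

```python
import math
from collections import deque


def spanning_tree(links, root):
    """Reduce a link graph to a tree by breadth-first search from `root`.

    `links` maps each URL to the URLs it links to. Each URL is attached to
    the first page that reaches it; all other links are left out of the tree.
    """
    children = {root: []}
    queue = deque([root])
    while queue:
        url = queue.popleft()
        for target in links.get(url, []):
            if target not in children:
                children[target] = []
                children[url].append(target)
                queue.append(target)
    return children


def layout(children, node, x=0.0, y=0.0, entry_angle=None, radius=100.0, pos=None):
    """Recursively place the children of each node around that node.

    `entry_angle` is the direction (radians) from the node's parent to the node.
    The root spreads its children over the full circle. Other nodes leave one
    angular slot free in the direction of their own parent.
    """
    if pos is None:
        pos = {}
    pos[node] = (x, y)
    kids = children[node]
    if not kids:
        return pos
    if entry_angle is None:
        step = 2 * math.pi / len(kids)
        angles = [i * step for i in range(len(kids))]
    else:
        back = entry_angle + math.pi  # direction pointing back to the parent
        step = 2 * math.pi / (len(kids) + 1)
        angles = [back + (i + 1) * step for i in range(len(kids))]
    for kid, angle in zip(kids, angles):
        layout(children, kid,
               x + radius * math.cos(angle),
               y + radius * math.sin(angle),
               angle,
               radius * 0.4,  # hypothetical shrink factor for deeper clusters
               pos)
    return pos


site = {"/": ["/a", "/b"], "/a": ["/b", "/c"], "/b": ["/"]}
tree = spanning_tree(site, "/")
positions = layout(tree, "/")
```

In this sketch the link from `/a` to `/b` and the link from `/b` back to `/` are dropped from the tree, so each page appears exactly once in the map.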
As part of the layout method, the relative sizes of the node icons are preferably adjusted such that nodes with relatively large numbers of outgoing links have a relatively large icon size, and thus stand out in the map. In addition, the node
and link display sizes are automatically adjusted such that the entire map is displayed on the display screen, regardless of the size of the Web site. As the user zooms in on portions of the map, additional details of the Web site's content objects are
automatically revealed within the map.
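As a rough illustration of the sizing behaviour just described, the following Python sketch grows each icon linearly with its number of outgoing links and rescales a set of node positions so the whole map fits a given screen. The function names, the minimum size, the scale factor and the margin are all assumptions made for the example and are not taken from the specification.

```python
def icon_sizes(out_links, min_size=4.0, scale=2.0):
    """Return an icon size per node that grows with its outgoing link count.

    `out_links` maps each node to its number of outgoing links.
    `min_size` and `scale` are hypothetical tuning values.
    """
    return {node: min_size + scale * count for node, count in out_links.items()}


def fit_to_screen(positions, width, height, margin=10.0):
    """Uniformly scale and shift positions so that every node is on screen.

    Returns the new positions and the scale factor that was applied, which
    a caller could also apply to icon and link display sizes.
    """
    xs = [p[0] for p in positions.values()]
    ys = [p[1] for p in positions.values()]
    min_x, min_y = min(xs), min(ys)
    span_x = (max(xs) - min_x) or 1.0  # avoid division by zero for a single column
    span_y = (max(ys) - min_y) or 1.0
    k = min((width - 2 * margin) / span_x, (height - 2 * margin) / span_y)
    scaled = {
        node: (margin + (x - min_x) * k, margin + (y - min_y) * k)
        for node, (x, y) in positions.items()
    }
    return scaled, k


sizes = icon_sizes({"/": 10, "/a": 0})
scaled, k = fit_to_screen({"a": (0.0, 0.0), "b": (200.0, 100.0)}, 120, 120)
```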
In accordance with another aspect of the invention, the Web site analysis program is based on an extensible architecture that allows software components to be added that make extensive use of the program's mapping functionality. Specifically,
the architecture includes an API (application program interface) which includes API procedures ("methods") that allow other applications ("plug-ins") to, among other things, manipulate the display attributes of the nodes and links within a site map.
Using these methods, a plug-in application can be added which dynamically superimposes data onto the site map by, for example, selectively modifying display colors of nodes and links, selectively hiding nodes and links, and/or attaching alphanumeric
annotations to the nodes and links. The API also includes methods for allowing plug-in components to access Web site data (both during and following the Web site scanning process) retrieved by the scanning routines, and for adding menu commands to the
user interface of the main program.
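The plug-in arrangement described above might look roughly like the following Python sketch. The actual API is listed in Appendix B and is not reproduced here, so every class and method name below (`SiteMap`, `set_color`, `hide`, `annotate`, `add_menu_command`, `BrokenLinkPlugin`) is hypothetical and serves only to show the shape of the interaction.

```python
class SiteMap:
    """Hypothetical host-side object exposing node display attributes to plug-ins."""

    def __init__(self, nodes):
        self.attrs = {n: {"color": "gray", "hidden": False, "note": ""} for n in nodes}
        self.menu = {}

    def set_color(self, node, color):
        self.attrs[node]["color"] = color

    def hide(self, node):
        self.attrs[node]["hidden"] = True

    def annotate(self, node, text):
        self.attrs[node]["note"] = text

    def add_menu_command(self, label, callback):
        # Lets a plug-in add its own command to the main program's menu.
        self.menu[label] = callback


class BrokenLinkPlugin:
    """Hypothetical plug-in that superimposes HTTP status data onto the map."""

    def __init__(self, status_by_node):
        self.status_by_node = status_by_node

    def register(self, site_map):
        site_map.add_menu_command("Highlight broken links", lambda: self.apply(site_map))

    def apply(self, site_map):
        for node, status in self.status_by_node.items():
            if status == 404:
                site_map.set_color(node, "red")
                site_map.annotate(node, "404 Not Found")


site_map = SiteMap(["/", "/old"])
BrokenLinkPlugin({"/": 200, "/old": 404}).register(site_map)
site_map.menu["Highlight broken links"]()  # simulate the user choosing the menu item
```

The point of the sketch is that the plug-in never draws anything itself. It only changes display attributes through the host's methods, which is how the text describes data being superimposed onto an existing map.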
In accordance with another aspect of the invention, software routines (preferably implemented within a plug-in application) are provided for processing a Web site's server access log file to generate Web site usage data, and for displaying the
usage data on a site map. This usage data may, for example, be in the form of the number of "hits" per link, the number of Web site exit events per node, or the navigation paths taken by specific users ("visitors"). This usage data is preferably
generated by processing the entries within the log file on a per-visitor basis to determine the probable navigation path taken by each respective visitor to the Web site. (Standard-format access log files which record each access to any page of the Web
site are typically maintained by conventional Web servers.) In a preferred implementation, the usage data is then superimposed onto the site map (using the API methods) using different node and link display colors to represent different respective levels
of user activity. Using this feature, Webmasters can readily detect common "problem areas" such as congested links and popular Web site exit points. In addition, by looking at individual navigation paths on a per-visitor basis, Webmasters can identify
popular navigation paths taken by visitors to the site.
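The per-visitor processing described above can be sketched as follows. This is a simplified illustration, not the decision process of the preferred embodiment: it assumes the log has already been parsed into (visitor, timestamp, url) tuples, and it treats each pair of consecutive requests from one visitor as a traversal of a link, which a real implementation would need to verify against the site map. The function name usage_from_log is hypothetical.

```python
from collections import Counter, defaultdict


def usage_from_log(entries):
    """Derive usage data from (visitor, timestamp, url) tuples.

    Returns (link_hits, exit_counts, paths):
      link_hits   counts each (source_url, target_url) pair,
      exit_counts counts how often each URL was a visitor's last request,
      paths       maps each visitor to the ordered list of URLs requested.
    """
    by_visitor = defaultdict(list)
    for visitor, timestamp, url in entries:
        by_visitor[visitor].append((timestamp, url))

    link_hits = Counter()
    exit_counts = Counter()
    paths = {}
    for visitor, visits in by_visitor.items():
        visits.sort()  # order each visitor's requests by timestamp
        path = [url for _, url in visits]
        paths[visitor] = path
        # Assumption: consecutive requests correspond to a followed link.
        for source, target in zip(path, path[1:]):
            link_hits[(source, target)] += 1
        exit_counts[path[-1]] += 1
    return link_hits, exit_counts, paths
```

The resulting counts could then be mapped to display colors for the corresponding links and nodes, with higher counts assigned to the colors that represent higher activity levels.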
In accordance with yet another aspect of the invention, the Web site analysis program includes software routines and associated user interface controls for automatically scanning and mapping dynamically-generated Web pages, such as Web pages
generated "on-the-fly" in response to user-specified database queries. This feature generally involves the two-step process of capturing and recording a dataset manually entered by the user into an embedded form of a Web page (such as a page of a
previously-mapped Web site), and then automatically resubmitting the dataset (within the form) when the Web site is later re-scanned. As will be appreciated, this feature of the invention can also be applied to conventional Internet search engines.
To effectuate the capture of one or more datasets in the preferred implementation, the user initiates a capture session from the user interface; this causes a standard Web browser to be launched and temporarily configured to use the Web site
analysis program as an HTTP-level proxy to communicate with Web sites. Thereafter, until the capture session is terminated by the user, any pages retrieved with the browser, and any forms (datasets) submitted from the browser, are automatically recorded
by the Web site analysis program into the site map. When the site map is subsequently updated (using an "automatic update" option of the user interface), the scanning routines automatically re-enter the captured datasets into the corresponding forms and
recreate the form submissions. The dynamically-generated Web pages returned in response to these automatic form submissions are then added to the updated site map as respective nodes. A related aspect of the invention involves the associated method of
locally capturing the output of the Web browser to generate a sequence that can subsequently be used to automatically evaluate a Web site.
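The two-step capture and resubmission process can be sketched in simplified form. The names CaptureSession, simulated_submit, and rescan are hypothetical, and the sketch performs no network access: in place of a real HTTP form submission it merely builds the URL that a GET-style submission would request, so that the record-then-replay flow can be seen in isolation.

```python
from urllib.parse import urlencode


class CaptureSession:
    """Records form submissions observed while the user browses."""

    def __init__(self):
        self.recorded = []

    def record_submission(self, form_url, dataset):
        # Store a copy so later changes by the caller do not alter the record.
        self.recorded.append((form_url, dict(dataset)))


def simulated_submit(form_url, dataset):
    """Stand-in for a real form submission: returns the resulting GET URL."""
    return form_url + "?" + urlencode(dataset)


def rescan(recorded, submit=simulated_submit):
    """Resubmit every captured dataset and return the resulting page URLs."""
    return [submit(form_url, dataset) for form_url, dataset in recorded]
```

Each URL returned by rescan corresponds to a dynamically-generated page that would be added to the updated site map as a node.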
BRIEF DESCRIPTION OF THE DRAWINGS
The various features of the invention will now be described in greater detail with reference to the drawings of a preferred software package referred to as the Astra™ SiteManager™ Web site analysis tool ("Astra"), its screen displays, and
various related components. In these drawings, reference numbers are re-used, where appropriate, to indicate a correspondence between referenced items.
FIG. 1 is a screen display which illustrates an example Web site map generated by Astra, and which illustrates the menu, tool and filter bars of the Astra graphical user interface.
FIGS. 2 and 3 are screen displays which illustrate respective zoomed-in views of the site map of FIG. 1.
FIG. 4 is a screen display which illustrates a split-screen display mode, wherein a graphical representation of a Web site is displayed in an upper window and a textual representation of the Web site is displayed in a lower window.
FIG. 5 is a screen display which illustrates a navigational aid of the Astra graphical user interface.
FIG. 6 is a screen display illustrating a feature which allows a user to selectively view the outbound links of a URL in a hierarchical display format.
FIG. 7 is a block diagram which illustrates the general architecture of Astra, which is shown in the context of a client computer communicating with a Web site.
FIG. 8 illustrates the object model used by Astra.
FIG. 9 illustrates a multi-threaded process used by Astra for scanning and mapping Web sites.
FIG. 10 illustrates the general decision process used by Astra to scan a URL.
FIG. 11 is a block diagram which illustrates a method used by Astra to scan dynamically-generated Web pages.
FIG. 12 is a flow diagram which further illustrates the method for scanning dynamically-generated Web pages.
FIGS. 13-15 are a sequence of screen displays which further illustrate the operation of Astra's dynamic page scanning feature.
FIG. 16 is a screen display which illustrates the site map of FIG. 1 following the application of a filter which filters out all URLs (and associated links) having a status other than "OK."
FIG. 17 illustrates the general program sequence followed by Astra to generate filtered maps of the type shown in FIG. 16.
FIG. 18 illustrates the filtered map of FIG. 16 redisplayed in Astra's Visual Web Display format.
FIG. 19 is a screen display which illustrates an activity monitoring feature of Astra.
FIG. 20 illustrates a decision process used by Astra to generate link activity data (of the type illustrated in FIG. 19) from a server access log file.
FIG. 21 is a screen display which illustrates a map comparison tool of Astra.
FIG. 22 is a screen display which illustrates a link repair feature of Astra.
FIGS. 23 and 24 are partial screen displays which illustrate layout features in accordance with another embodiment of the invention.
The screen displays included in the figures were generated from screen captures taken during the execution of the Astra code. In order to comply with patent office standards, the original screen captures have been modified to reduce shading and
to replace certain color-coded regions with appropriate cross hatching. All copyrights in these screen displays are hereby reserved.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The description of the preferred embodiments is arranged within the following sections:
I. Glossary of Terms and Acronyms
II. Overview
III. Map Layout and Display Methodology
IV. Astra Graphical User Interface
V. Astra Software Architecture
VI. Scanning Process
VII. Scanning and Mapping of Dynamically-Generated Pages
VIII. Display of Filtered Maps
IX. Tracking and Display of Visitor Activity
X. Map Comparison Tool
XI. Link Repair Plug-in
XII. Conclusion
I. Glossary of Terms and Acronyms
The following definitions and explanations provide background information pertaining to the technical field of the present invention, and are intended to facilitate an understanding of both the invention and the preferred embodiments thereof.
Additional definitions are provided throughout the detailed description.
Internet. The Internet is a collection of interconnected public and private computer networks that are linked together by a set of standard protocols (such as TCP/IP, HTTP, FTP and Gopher) to form a global, distributed network.
Document. Generally, a collection of data that can be viewed using an application program, and that appears or is treated as a self-contained entity. Documents typically include control codes that specify how the document content is displayed
by the application program. An "HTML document" is a special type of document which includes HTML (HyperText Markup Language) codes to permit the document to be viewed using a Web browser program. An HTML document that is accessible on a World Wide Web
site is commonly referred to as a "Web document" or "Web page." Web documents commonly include embedded components, such as GIF (Graphics Interchange Format) files, which are represented within the HTML coding as links to other URLs. (See "HTML" and
"URL" below.)
Hyperlink. A navigational link from one document to another, or from one portion (or component) of a document to another. Typically, a hyperlink is displayed as a highlighted word or phrase that can be clicked on using the mouse to jump to the
associated document or document portion.
Hypertext System. A computer-based informational system in which documents (and possibly other types of data entities) are linked together via hyperlinks to form a user-navigable "web." Although the term "text" appears within "hypertext," the
documents and hyperlinks of a hypertext system may (and typically do) include other forms of media. For example, a hyperlink to a sound file may be represented within a document by a graphic image of an audio speaker.
World Wide Web. A distributed, global hypertext system, based on a set of standard protocols and conventions (such as HTTP and HTML, discussed below), which uses the Internet as a transport mechanism. A software program which allows users to
request and view World Wide Web ("Web") documents is commonly referred to as a "Web browser," and a program which responds to such requests by returning ("serving") Web documents is commonly referred to as a "Web server."
Web Site. As used herein, "web site" refers generally to a database or other collection of inter-linked hypertextual documents ("web documents") and associated data entities, which is accessible via a computer network, and which forms part of a
larger, distributed informational system. Depending upon its context, the term may also refer to the associated hardware and/or software server components used to provide access to such documents. When used herein with initial capitalization (i.e.,
"Web site"), the term refers more specifically to a web site of the World Wide Web. (In general, a Web site corresponds to a particular Internet domain name, such as "merc-int.com," and includes the content of or associated with a particular
organization.) Other types of web sites may include, for example, a hypertextual database of a corporate "intranet" (i.e., an internal network which uses standard Internet protocols), or a site of a hypertext system that uses document retrieval protocols
other than those of the World Wide Web.
Content Object. As used herein, a data entity (document, document component, etc.) that can be selectively retrieved from a web site. In the context of the World Wide Web, common types of content objects include HTML documents, GIF files, sound
files, video files, Java applets and aglets, and downloadable applications, and each object has a unique identifier (referred to as the "URL") which specifies the location of the object. (See "URL" below.)
URL (Uniform Resource Locator). A unique address which fully specifies the location of a content object on the Internet. The general format of a URL is protocol://machine-address/path/filename. (As will be apparent from the context in which it
is used, the term "URL" is also used herein to refer to the corresponding content object itself.)
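As an illustration of the general URL format given above, the following sketch splits a URL into the four parts named in the definition. The function name split_url and the example address are assumptions; the parsing itself uses the standard Python library.

```python
import posixpath
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into protocol, machine address, path, and filename."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "machine_address": parts.netloc,
        "path": directory,
        "filename": filename,
    }
```

For example, split_url("http://www.example.com/docs/index.html") yields the protocol "http", the machine address "www.example.com", the path "/docs", and the filename "index.html".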
Graph/Tree. In the context of database systems, the term "graph" (or "graph structure") refers generally to a data structure that can be represented as a collection of interconnected nodes. As described below, a Web site can conveniently be
represented as a graph in which each node of the graph corresponds to a content object of the Web site, and in which each interconnection between two nodes represents a link within the Web site. A "tree" is a specific type of graph structure in which
exactly one path exists from a main or "root" node to each additional node of the structure. The terms "parent" and "child" are commonly used to refer to the interrelationships of nodes within a tree structure (or other hierarchical graph structure),
and the term "leaf" or "leaf node" is used to refer to nodes that have no children. For additional information on graph and tree data structures, see Alfred V. Aho et al., Data Structures and Algorithms, Addison-Wesley, 1982.
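The relationship between a graph and a tree can be made concrete with a short sketch. Here a Web site is represented as a mapping from each URL to the URLs it links to, and a breadth-first traversal from the root assigns each node a single parent, which yields a tree. The use of breadth-first traversal and the names spanning_tree and leaves are assumptions made for illustration; this passage does not specify how a tree is derived from the site graph.

```python
from collections import deque


def spanning_tree(links, root):
    """Return a tree (node -> list of children) reachable from root.

    links maps each URL to the list of URLs it links to. A node becomes
    the child of the first node from which it is reached, so exactly one
    path exists from the root to every node in the result.
    """
    children = {root: []}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for target in links.get(node, []):
            if target not in children:  # first visit fixes the parent
                children[target] = []
                children[node].append(target)
                queue.append(target)
    return children


def leaves(children):
    """Return the nodes of the tree that have no children."""
    return [node for node, kids in children.items() if not kids]
```

Links that would give a node a second parent, or that point back toward the root, remain part of the graph but are left out of the tree.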
TCP/IP (Transmission Control Protocol/Internet Protocol). A standard Internet protocol which specifies how computer