The invention provides an information retrieval system in which, even when a plurality of databases are searched simultaneously, information retrieval can be performed at high speed without special hardware and without being affected by the size of the databases to be searched. The information retrieval system includes a plurality of retrieval servers for performing retrieval processing, and a retrieval management server for managing operation of the plurality of retrieval servers. The retrieval management server is constructed so as to divide a database to be searched, together with related information regarding the database, and to allocate the resulting divided parts of the database and the corresponding related information in sets to some or all of the plurality of retrieval servers. The plurality of retrieval servers are constructed so as to perform information retrieval on the divided parts of the database allocated by the retrieval management server in parallel with and independently of each other.
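The divide-and-allocate arrangement described above can be illustrated with a minimal Python sketch. It is not the patented implementation: threads stand in for retrieval servers, each record is modeled as an (identifier, text) pair so that a data entry and its related information stay in the same set, and the names `divide`, `search_part`, and `managed_search` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def divide(records, n_parts):
    """Split records into n_parts contiguous slices of near-equal size."""
    size, extra = divmod(len(records), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(records[start:end])
        start = end
    return parts

def search_part(part, keyword):
    """Hypothetical retrieval server: scans only the part allocated to it."""
    return [(doc_id, text) for doc_id, text in part if keyword in text]

def managed_search(records, keyword, n_servers=3):
    """Hypothetical management server: divide, allocate, then merge the hits."""
    parts = divide(records, n_servers)
    with ThreadPoolExecutor(max_workers=n_servers) as pool:
        # Each part is searched in parallel and independently of the others.
        results = pool.map(search_part, parts, [keyword] * n_servers)
        return [hit for part_hits in results for hit in part_hits]

records = [
    (1, "parallel retrieval"),
    (2, "database index"),
    (3, "retrieval server"),
    (4, "cache"),
    (5, "retrieval management"),
]
hits = managed_search(records, "retrieval")
print(hits)
```

Because each part is searched separately, the time per server depends on the size of its part rather than on the size of the whole database, which is the property the abstract emphasizes.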
A document retrieval system of the present invention includes an agent system that facilitates document retrieval by mediating a search request from a user to databases provided with search engines that search for a document with a keyword. When a facilitating agent is supplied with a search request containing a keyword, the facilitating agent refers to a facilitating database that stores, for each keyword, keyword information indicating the relationship between that keyword and the databases provided with search engines, and thereby determines the database agent to which the search request is to be sent.
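The routing step described above can be sketched as follows. This is an illustrative Python model under assumptions, not the system of the abstract: the facilitating database is reduced to an in-memory dictionary from keyword to database names, and the classes `DatabaseAgent` and `FacilitatingAgent` and their methods are hypothetical.

```python
class DatabaseAgent:
    """Hypothetical agent fronting one database and its search engine."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents

    def search(self, keyword):
        return [doc for doc in self.documents if keyword in doc.lower()]

class FacilitatingAgent:
    """Hypothetical mediator that routes a request by keyword information."""

    def __init__(self, agents):
        self.agents = {agent.name: agent for agent in agents}
        self.keyword_info = {}  # keyword -> names of databases known to hold it

    def register(self, keyword, db_name):
        self.keyword_info.setdefault(keyword, set()).add(db_name)

    def route(self, keyword):
        """Return the database agents to which the request should be sent."""
        return sorted(self.keyword_info.get(keyword, ()))

    def search(self, keyword):
        # Only the agents selected by the keyword information are queried.
        return {name: self.agents[name].search(keyword) for name in self.route(keyword)}

patents = DatabaseAgent("patents", ["Parallel retrieval server", "Cache design"])
news = DatabaseAgent("news", ["Retrieval startup raises funds", "Weather report"])
facilitator = FacilitatingAgent([patents, news])
facilitator.register("retrieval", "patents")
facilitator.register("retrieval", "news")
facilitator.register("cache", "patents")

print(facilitator.route("cache"))
print(facilitator.search("retrieval"))
```

A keyword with no entry in the keyword information is routed to no agent, so databases that are not known to hold the keyword are never queried.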
An information retrieving method improves retrieval performance without fine-grain processing and allows sequential retrieval engines to be parallelized easily. To this end, the information retrieving method according to this invention connects retrieving servers in parallel to a retrieval managing server through a parallel framework, which makes the retrieving servers conduct parallel processing by integrating their functions and using those functions directly, without modification. The data to be retrieved is distributed equally or substantially equally to the retrieving servers, while the retrieval requests from clients are successively broadcast to the retrieving servers without waiting for the retrieval results from the retrieving servers. This invention can be effectively applied to fetching necessary information from a database retaining various kinds of information.
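The two ideas in this abstract, near-equal data distribution and broadcasting requests without waiting for results, can be sketched in Python as below. The sketch is an assumption-laden illustration: a thread pool plays the role of the parallel framework, `sequential_engine` stands for an unmodified sequential retrieval engine, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def distribute(records, n_servers):
    """Deal records round-robin so that server loads differ by at most one."""
    return [records[i::n_servers] for i in range(n_servers)]

def sequential_engine(shard, query):
    """Stand-in for an existing sequential engine, reused without changes."""
    return [record for record in shard if query in record]

def serve(records, queries, n_servers=2):
    """Broadcast every query to every server, then collect the results.

    Assumes the queries are distinct, since they are used as dictionary keys.
    """
    shards = distribute(records, n_servers)
    with ThreadPoolExecutor(max_workers=n_servers) as pool:
        # All requests are submitted before any result is awaited.
        pending = {
            query: [pool.submit(sequential_engine, shard, query) for shard in shards]
            for query in queries
        }
        return {
            query: sorted(hit for future in futures for hit in future.result())
            for query, futures in pending.items()
        }

records = ["apple pie", "banana split", "apple tart", "cherry pie", "banana bread"]
answers = serve(records, ["apple", "pie"])
print(answers)
```

Submitting all requests before reading any result keeps every server busy while earlier requests are still being processed, which is the effect the abstract attributes to successive broadcasting.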
A method, system and program product for automatically retrieving documents is provided. Specifically, a hit list of documents is generated directly from an input file of requests. Once generated, the hit list is sorted according to system, data object, storage node identification, storage drive and/or cache. Once sorted, a plurality of retrieval programs are launched and executed in parallel to retrieve the requested documents.
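The flow from request file to sorted hit list to parallel retrieval can be sketched in Python as follows. This is only an illustration under assumptions: the catalog layout, the field names `system`, `node`, and `drive`, and the functions shown are hypothetical, and the retrieval itself is simulated by building a string.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby

def build_hit_list(requests, catalog):
    """Look up each requested document; unknown identifiers are skipped."""
    return [catalog[doc_id] for doc_id in requests if doc_id in catalog]

def sort_key(hit):
    """Order hits by storage location so co-located documents are adjacent."""
    return (hit["system"], hit["node"], hit["drive"])

def retrieve_batch(batch):
    """Hypothetical retrieval program handling one storage location."""
    return [f"fetched {h['doc_id']} from {h['system']}/{h['node']}/{h['drive']}" for h in batch]

def retrieve_all(requests, catalog, workers=4):
    hits = sorted(build_hit_list(requests, catalog), key=sort_key)
    batches = [list(group) for _, group in groupby(hits, key=sort_key)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One retrieval task per storage location, executed in parallel.
        return [line for batch in pool.map(retrieve_batch, batches) for line in batch]

catalog = {
    "A": {"doc_id": "A", "system": "s1", "node": "n2", "drive": "d1"},
    "B": {"doc_id": "B", "system": "s1", "node": "n1", "drive": "d1"},
    "C": {"doc_id": "C", "system": "s1", "node": "n2", "drive": "d1"},
}
fetched = retrieve_all(["A", "B", "C", "Z"], catalog)
print(fetched)
```

Sorting before launching the retrieval programs means each program works through documents that share a storage location, rather than jumping between nodes and drives in request order.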
In a distributed system, where data is maintained in at least two databases and the data includes at least one data element, the amount of data transmitted during data recovery is minimized by comparing a first total of the data elements of the data in a first database with a second total of the corresponding data elements of the corresponding data in a second database. An updating procedure for the data element is initiated if the first total and the second total are not the same.
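The total-comparison step can be sketched in Python as below. The sketch assumes numeric data elements and a plain sum as the total; the abstract does not specify how the total is computed, and the names `total` and `recover` are hypothetical.

```python
def total(elements):
    """Total of the data elements; a plain sum is assumed for illustration."""
    return sum(elements)

def recover(primary, replica):
    """Update only the replica entries whose totals differ from the primary.

    Returns the keys that had to be transmitted.
    """
    updated = []
    for key, elements in primary.items():
        if total(elements) != total(replica.get(key, [])):
            replica[key] = list(elements)  # updating procedure for this entry
            updated.append(key)
    return updated

primary = {"acct1": [10, 20, 30], "acct2": [5, 5]}
replica = {"acct1": [10, 20, 30], "acct2": [5, 4]}
sent = recover(primary, replica)
print(sent, replica)
```

Only one total per entry needs to be exchanged when the two databases agree, so full data is transmitted only for entries that differ. With a plain sum, different element values can produce the same total (for example `[1, 3]` and `[2, 2]`), so a practical system would need a total that makes such collisions unlikely.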