Modular content framework and document format methods and systems are described. The described framework and format define a set of building blocks for composing, packaging, distributing, and rendering document-centered content. These building blocks define a platform-independent framework for document formats that enable software and hardware systems to generate, exchange, and display documents reliably and consistently. The framework and format have been designed in a flexible and extensible fashion. In addition to this general framework and format, a particular format, known as the reach package format, is defined using the general framework. The reach package format is a format for storing paginated documents. The contents of a reach package can be displayed or printed with full fidelity among devices and applications in a wide range of environments and across a wide range of scenarios.
RELATED APPLICATIONS
This application is a continuation of and claims priority to U.S. patent application Ser. No. 10/837,040, filed on Apr. 30, 2004, the disclosure of which is incorporated by reference herein.
A system determines an ordered sequence of documents and determines an amount of novel content contained in each document of the ordered sequence of documents. The system assigns a novelty score to each document based on the determined amount of novel content.