Description
BACKGROUND OF THE INVENTION
This invention relates generally to an apparatus and method for composing
visual source material. In particular, the invention provides an apparatus
and method for dynamically composing stored source material for producing
a composition sequence, the electronic data necessary to form the
composition sequence, or edited output.
Over the past two decades, video tape has substantially replaced the
traditional photographic, e.g. silver halide, and other "non-electronic"
film as the preferred media on which to film or compose a movie, news, or
other program material. The increasing use of video tape has occurred
despite certain inherent limitations associated with video tape in
comparison with traditional film. Video tape, like a developed
photographic film, is inherently a "serial access" medium; unlike film, however, an
editor is unable to "see" the images on the video tape medium. The video
editor must instead rely upon electronic apparatus to read and view the
images and to compose them to produce an edited product. By contrast,
the film editor is able to have "hands-on" access to the film and can
directly view the visual scenes thereon. The film editor can cut and
splice the film in the editing room.
The shift from film to video tape has dampened creative talents in some
respects, in that the director is no longer able to apply subjective
talents directly to the program medium. Instead, intermediate
technically-skilled operating personnel are employed to control the
electronic composing process, taking orders from the director. The orders
are in terms of data, e.g., alphanumeric addresses of different taped
sequences, rather than in terms of visual images.
The intermediate personnel thus perform the real time hands-on manipulation
of the video tape in an abstract environment of alphanumeric information
and work with bays of switches on a complex control panel. The director's
feel for the composition process is diminished, and the composing process
is, as a consequence, slow and tedious, with lessened subjective
interaction.
It is also known that one advantage to composing film media is the ability
to react to the temporal nature of the media. Edited film can be browsed
back and forth, picked up and viewed, like a book, and physically spliced.
These advantages do not yet exist in present day video composing
equipment.
Therefore, primary objects of the invention are increasing the throughput
in the composing of video source material, lessening or even removing the
need for intermediate personnel so that the director is closer and more
involved in the composing process, and solving the time-space problem
inherent in video tape composition. Other objects of the invention are a
flexible composition apparatus and method, and a reliable and
user-friendly apparatus and method that can be employed directly, or
indirectly, to create automatically a final edited master. Other objects
of the invention will in part be obvious and will in part appear
hereinafter.
OVERVIEW OF THE INVENTION
Stated broadly, the invention provides equipment and methods for processing
image information with improved human interaction. In a preferred practice
of the invention, the image information is video images, as conventionally
recorded and stored with electronic signals. The equipment and method have
many applications.
In one aspect, the video processing equipment according to the invention
makes it possible for an operator to scan visually through a vast library
of stored video images with greater speed and control than previously
possible. This new search capability which the invention provides has many
uses. An illustrative one is for a news service to search a video data
base for film clips of a subject that has suddenly become newsworthy.
Another aspect of video processing equipment according to the invention
makes it possible for an operator to assemble a collection of video images
into a program sequence, with a new degree of speed, facility and ease. An
example of this use of the invention is to compose a television program
from a collection of shots recorded at different times and/or from
different sources.
In each application of the invention, the video source material is in the
form of groups of frames, typically sequential, as results from filming a
scene with a video camera. The groups of frames, referred to herein as
segments, can be stored, when received by the equipment, in an unknown or
an undesired order. The composition equipment enables an operator to
search the sequences of video segments, examine them as desired, and to
select portions of any sequences for sorting or reordering, for trimming,
and for introducing different transitions from one segment to another--all
with human ease, responsiveness, and subjective interaction akin to that
of a skilled driver of a performance automobile.
Equipment according to the invention generally employs a bank of
independently operable video tape recorders for storing two or more
duplicate counterparts of the video source material. The equipment also
has several monitors on which the video source material and video labels
can be displayed for operator viewing. A video label is, in the context of
this invention, a representation derived from a frame of source material.
A typical label is a low resolution digital representation of a high
resolution source image. Such a label can be electronically stored and
accessed at high speeds, yet when viewed by an operator, the label
provides nearly the same information as the corresponding high resolution
source material.
In one illustrative embodiment of this equipment, there is a first, main
monitor on which a selected sequence or other video segment can be
repeatedly displayed, as if recorded on an endless loop. A set of
secondary monitors can display selected scenes of a video sequence,
typically of the sequence being displayed on the main monitor. In
addition, there preferably are two linear arrays of passive display
monitors. One array is positioned above the other so that each monitor of
the upper array is paired with, and aligned above, a monitor of the lower
array.
An operator standing or sitting before this video display system controls
it with two sets of controls, one for each hand. Each control set has a
cluster of finger switches, e.g., push-buttons, arranged with a large
wheel for tactile operation with minimal hand movement.
In general operation, the illustrated embodiment of this equipment can
include operating modes termed "input", "sort", "trim" and "splice". An
operator enters video source material into the equipment, i.e. stores it
on the video recorders, with the input mode. The operator can view the
video images, typically on the main monitor, as they are being entered.
The operator generally controls the equipment to prepare and store label
pairs of the source material as it is being entered. The label pairs can
be operator selected incoming video frames, or can be automatically
selected by the equipment on a repeating basis, e.g., every thirtieth
input frame. Each label includes information identifying the corresponding
segment of source material, preferably identifying an address where that
segment is stored on the video tape recorders.
In the illustrated sort mode of operation, an operator assembles selected
label pairs, representing stored video source material, in a desired
program sequence. The sequence of the label pairs is independent of the
sequence according to which the source material is stored in the
equipment.
More particularly, in this sort mode, an operator can select one or more
labels representing any stored video segment and place it in selected
sequence with one or more labels representing another video segment.
Further, the operator can rearrange the sequence of the selected labels.
In one use of the equipment, the operator selects a label pair and the
video segment which it represents is then displayed with continuous
repetition on the main monitor screen while the labels for that segment
are displayed on the secondary monitors.
In the trim mode, the operator can shorten or lengthen any selected segment
as it is being repeatedly displayed on the main monitor. The secondary
monitors in this mode of operation display labels representing the first
and last frames of the "trimmed" segment.
When the operator has assembled two or more selected and sequentially
ordered and trimmed segments in this manner, the beginning and ending
labels of each trimmed segment can be displayed on one set of the passive
monitors, in the selected sequence. Further, the operator can collapse two
or more of such sorted segments, if they are contiguous, and represent the
plural segments forming the collapsed group as a single label pair
displayed on the passive monitors.
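By way of illustration only, the collapsing of contiguous sorted segments into a single label pair can be sketched in software. The sketch below is not the circuitry of the invention. The `Segment` and `Group` types, the string label addresses, and the reading of "contiguous" as "neighboring positions in the sorted sequence" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Segment:
    """Hypothetical sorted segment, identified by its two label addresses."""
    begin_label: str
    end_label: str


@dataclass(frozen=True)
class Group:
    """Hypothetical collapsed group of neighboring segments."""
    members: Tuple[Segment, ...]

    @property
    def begin_label(self) -> str:
        # The group shows the beginning label of its first member.
        return self.members[0].begin_label

    @property
    def end_label(self) -> str:
        # The group shows the ending label of its last member.
        return self.members[-1].end_label


def collapse(sequence: list, start: int, stop: int) -> list:
    """Replace sequence[start:stop] with one Group showing a single label pair."""
    if not (0 <= start < stop <= len(sequence)) or stop - start < 2:
        raise ValueError("need at least two neighboring segments")
    group = Group(tuple(sequence[start:stop]))
    return sequence[:start] + [group] + sequence[stop:]


program = [Segment("a-in", "a-out"), Segment("b-in", "b-out"),
           Segment("c-in", "c-out"), Segment("d-in", "d-out")]
collapsed = collapse(program, 1, 3)  # segments b and c become one entry
```

In this sketch the collapsed entry occupies one position in the sequence and would therefore occupy one vertical pair of passive monitors, while its members remain available for later expansion.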
An operator uses the splice mode of operation to edit the transition
between sorted video segments. The equipment in one embodiment of the
invention enables the operator to control the length of a transition, the
type of transition, the starting and stopping points of the transition,
and the number of frames over which the transition occurs.
Several particular features of the invention further increase the
effectiveness of this equipment and method. For example, the illustrative
equipment, according to the invention, provides the operator, in the sort
mode, with the capability of smoothly scrolling label images across the
array of passive monitors. The display in essence slides the label images
across the displays, in line with preceding and subsequent labels.
DETAILED SUMMARY OF THE INVENTION
The invention, as previously noted, relates to an apparatus and method for
composing image source material stored on at least one image storage
medium. The source material is composed of a sequence of stored frames
representing a time sequential visual image. Sequences of the frames are
associated to form a video segment.
The apparatus features a plurality of pictorial display elements arranged
in an ordered array. Each display element provides a visual presentation
of a video label selected from a sequence of labels, the label sequence
representing a sequence of the video segments. Each selected label
identifies one video segment. The invention further features an operator
control device and a composing control circuitry operative with the medium
and the operator control device.
The composing control circuitry includes elements for selectively supplying
each pictorial display device with electrical data signals representing a
selected one of the sequence of labels. Elements of the control circuitry
are responsive to the operator control device for changing the labels
displayed by the plural pictorial display elements. The supply elements
further have circuitry responsive to the changing element for smoothly
scrolling the labels across the ordered array of display elements.
In a particular embodiment, the smoothly scrolling system has a display
processing control circuit and a display processing unit having at least
one row processor. Each row processor features a plurality of digitized
picture generators for receiving digitized picture data representing a
picture to be displayed and for generating an analog picture output
representing the digitized picture data. Each row processor further
features a plurality of analog routing elements, each element having as
its inputs at least a plurality of the picture generator analog
outputs. Each routing element is then adapted to select its output from
among the picture generator output signals in response to a controlling
input signal.
These and other features of the equipment and method of the invention
described and illustrated herein provide significant improvements in the
human processing of image information.
BRIEF DESCRIPTION OF THE DRAWINGS
Other objects, features, and advantages of the invention will appear from
the following description of preferred embodiments, taken together with
the drawings in which:
FIG. 1 is a perspective view of the composing apparatus according to the
invention;
FIG. 2 is a detailed schematic block diagram of the electrical circuitry
according to the invention;
FIG. 3 is a detailed plan view of a typical console according to the
invention;
FIG. 4 is an electrical block diagram showing those elements used for the
smooth scrolling display generation and control for the passive display
screens;
FIG. 5 is a detailed electrical diagram showing the elements necessary for
providing a smooth scroll of the video images across the passive display
screens;
FIG. 6 is a partial electrical schematic diagram corresponding to FIG. 2
and showing the elements necessary for displaying and controlling video on
the active display screens;
FIG. 7 is a partial electrical schematic diagram corresponding to FIG. 2
and showing the portions of the system which input video to the apparatus;
FIG. 8 is a detailed block diagram of the video tape recorder interface of
FIG. 2;
FIG. 9 is an enlarged plan view of a manual control assembly for the
apparatus;
FIG. 10 is a flow chart showing controller operation during the input mode
of operation;
FIGS. 11A-11F are flow charts showing controller operation during the sort
mode of operation;
FIG. 12 is a flow chart showing controller operation during the trim mode
of operation;
FIG. 13 is a flow chart showing controller operation during the splice mode
of operation;
FIG. 14 is another embodiment illustrating an alternate routing circuitry
to that of FIG. 2;
FIG. 15 is a detailed block diagram of a video tape recorder interface as
modified for the routing circuit of FIG. 14;
FIG. 16 is a more detailed electrical block diagram of the routing circuit
of FIG. 14; and
FIG. 17 is a detailed electrical diagram of the cross-point array circuitry
of FIG. 16.
DESCRIPTION OF PREFERRED EMBODIMENTS
General Description of the Apparatus
The illustrated embodiment of the invention is directed to composing image
source material stored in a memory medium, for example, video tape used
with a video tape recorder, to produce a sequential grouping of segments
making up a program or story. In some applications, composition can, but
need not, further include the editing function of creating a final edited
master. Typically, the image source material either is derived from
already existing, production quality video tape(s) or is provided, in real
time, from one or more video cameras for recording on video tape.
In its standard format, the video signal has a plurality of frames, each
frame having two fields. The video is displayed for normal viewing at a
rate of thirty frames per second. When the video tape is prepared, the
recording device associates with each field a specific address or
identification tag. The address is typically written in accordance with
the SMPTE time code, a standard used throughout the television industry.
Thus, irrespective of the source of the video material, there is
associated with each field of the recorded signal, a unique address or
location which is read when the field is read or otherwise retrieved.
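For illustration, the relationship between a time code address and a position on the tape can be sketched as follows. This is a minimal sketch under stated assumptions: the address is taken to be a non-drop-frame code of the form HH:MM:SS:FF counted at the nominal thirty frames per second given above, and the function names are hypothetical.

```python
FPS = 30  # nominal display rate stated in the text


def timecode_to_frame(tc: str, fps: int = FPS) -> int:
    """Convert 'HH:MM:SS:FF' (non-drop-frame, an assumption) to a frame count."""
    hours, minutes, seconds, frames = (int(part) for part in tc.split(":"))
    if not (0 <= minutes < 60 and 0 <= seconds < 60 and 0 <= frames < fps):
        raise ValueError(f"invalid time code: {tc}")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def frame_to_timecode(frame: int, fps: int = FPS) -> str:
    """Inverse of timecode_to_frame for non-negative frame counts."""
    seconds, ff = divmod(frame, fps)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"
```

With these assumptions, `timecode_to_frame("00:00:01:00")` gives 30, and converting that count back yields the same address.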
Referring to FIG. 1, a video composition system 10 has a control console 12
from which an operator/editor controls the operation of the entire system
and provides the composing instructions which enable the system to prepare
a listing of video segments to be serially connected to form a finished
composition sequence. As used herein, "segment" refers to a sequence of
frames. The frame sequence may form a shot, a scene (a sequence of shots),
a picture sector (a sequence of scenes), a program or story (a sequence of
picture sectors), or any other desired grouping of frames.
According to the illustrated embodiment of the invention, the system
employs a plurality of serial storage media 13. The storage media need not
be limited to serial storage; however, present technology has not provided
a random access storage medium of sufficient capacity and reasonable cost
to replace the serial storage medium. In the illustrated embodiment, the
serial storage media are video tape recorders (VTR's) such as those
commercially sold and manufactured by Sony Corporation under model No.
2500. This VTR operates according to a Beta II or Beta III tape format,
has multiple heads for both simultaneous recording and retrieval, provides
a times-two and a times-ten or times-fifteen speed control, a frame freeze
for both forward and reverse modes, and allows significant remote control
capability. Referring also to FIG. 2, the illustrated VTR's, labeled 14,
16, 18, and 20, interface with a computer/controller 22 through respective
recorder interfaces 24, 26, 28, and 30. While only four VTR's are shown in
the figures, it is contemplated that the system 10 will include up to
twenty-two or more VTR's to accomplish the many functions and provide the
many features to be described hereinafter. For purposes of simplicity in
the drawing, more VTR's have not been shown but have been indicated by the
plurality of dots between the various recorders 14, 16; and 16, 18; and
18, 20.
The control of the entire system depends upon the man-machine interaction
available from the control console 12. Referring to FIGS. 1 and 3, the
illustrated control console has a large main display screen 32 flanked by
a plurality of smaller display screens 34, 36, 38, and 40. There are
further provided a plurality of yet smaller label display screens 42, 44,
46, . . . , 68, arranged in a two row ordered array. Below the label
display screens is a manual control panel 70 which includes a right hand
control wheel 74 and a left hand control wheel 76. Adjacent the control
wheels are a number of control keys 78, 79, 80, . . . , 99, and levers
100, 101, 102, 103 whose functions are described in greater detail below.
The apparatus 10 also has a keyboard unit 104 having a simplified
typewriter keyboard for entering alphanumeric information into the
computer/controller 22 and for responding to requests for information or
instructions which appear on a display screen monitor 105. In the
illustrated embodiment of the invention, the typewriter keyboard and
monitor are housed apart from console 12, although the monitor and
keyboard could also be integrated with the console 12 as described in
copending U.S. application, Ser. No. 452,287. The preferred and described
layout of the control console 12 can be changed in accordance with the needs of
the particular application. Thus, other applications can require a
different arrangement of the components and/or different numbers of
display screens or other controls.
The apparatus is heavily user interactive. From the control panel 70 the
operator/editor can effect substantially any operating mode which is
required for composing a video program from one or more available source
video tapes. As noted above, the scenes recorded on video tape are made up
of a sequence of frames, each frame being composed of two interlaced
television fields. The composition apparatus 10, in the illustrated
embodiment, is capable of operating upon the frames forming the source
video at any of a number of operating levels. According to the preferred
embodiment of the invention, each operating level can be viewed as a
"bin". Each bin contains a plurality of segments displayed as groups of
label pairs, and each class of bins has a separate and distinct purpose.
For example, at the lowest or most elementary level, there is a "source
bin". The source bin represents the operating level at which source
material is read by and stored in the apparatus. At another operating
level, there exists a "discard bin." The discard bin, as its name implies,
contains those segments which, while once belonging in the source bin,
have been "discarded" and removed, for example from the source bin. The
"discarded" segments can be later retrieved as described in more detail
below.
Another operating level, the so-called "select bin", acts like a temporary
scratch pad memory in which the apparatus stores segments on a last in,
first out (LIFO) basis. The select bin operating level is useful for
moving segments from, for example, the source bin to a higher
level bin. The higher level bins, of which there are four according to the
illustrated embodiment, are "program bins". It is at the program bin
operating level that program material is sorted, trimmed, and spliced.
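The last in, first out behavior of the select bin can be sketched, by way of example only, with ordinary lists. The bin names, the helper functions, and the choice to remove a segment from the source bin when it is selected are assumptions made for the illustration and are not taken from the described apparatus.

```python
from typing import Dict, List


def new_bins(source_segments: List[str]) -> Dict[str, List[str]]:
    """Create hypothetical bins; only one of the four program bins is shown."""
    return {"source": list(source_segments), "discard": [],
            "select": [], "program1": []}


def select(bins: Dict[str, List[str]], segment: str) -> None:
    """Move a segment from the source bin onto the top of the select bin."""
    bins["source"].remove(segment)
    bins["select"].append(segment)


def place(bins: Dict[str, List[str]], program_bin: str) -> str:
    """Move the most recently selected segment (LIFO) into a program bin."""
    segment = bins["select"].pop()
    bins[program_bin].append(segment)
    return segment


bins = new_bins(["seg1", "seg2", "seg3"])
select(bins, "seg1")
select(bins, "seg3")
moved = place(bins, "program1")  # seg3 was selected last, so it moves first
```

After these steps the select bin still holds seg1, which would be the next segment placed.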
(In an alternate embodiment of the invention, by way of further example,
the operating levels can be designed according to a completely different
philosophy. According to one alternate operating method, at the lowest or
most elementary level (the zero level), the apparatus can operate upon one
frame at a time. At a higher level, the apparatus can operate upon
predetermined segments of m^n frames where n, an integer, represents
the operational level (level "1", level "2", etc.) and m is an arbitrary
integer greater than 1. For example, if m equals seven, level one operates
upon segments of seven frames, level two upon segments of forty-nine
frames, etc.
A predetermined segment, however, will not generally correspond to a shot,
a scene, etc. Therefore, the apparatus has the further capability, in this
alternate operating level embodiment, of allowing the operator to
designate segments of connected frames. At the operator designated levels
of operation, the frames, when sequentially connected together, in the
most elementary sense form "shots" (analogous to film clips). A plurality
of shots (or clips) can be spliced together to form a scene and a
plurality of scenes can be spliced together for forming a video sector.
Correspondingly, a plurality of video sectors together forms an entire
program or story. In this alternate embodiment of the invention, the
apparatus operates at any of the levels of shot, scene, or sector as well
as at predetermined levels "0", "1", "2", and "3" described above. Thus
depending upon the level of operation selected, in the alternate
embodiment, the apparatus can operate upon either individual frames (level
0), a predetermined group of frames (levels 1, 2, and 3), or at the shot
level (level 4), the scene level (level 5), or the sector level (level
6).)
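The predetermined levels of the alternate embodiment follow directly from the m^n rule, which the short sketch below works through. The function names are hypothetical, and the default of m equal to seven simply repeats the example given in the text.

```python
def segment_length(level: int, m: int = 7) -> int:
    """Frames per predetermined segment at a given level, i.e. m ** level."""
    if level < 0 or m <= 1:
        raise ValueError("level must be >= 0 and m must be > 1")
    return m ** level


def segment_index(frame: int, level: int, m: int = 7) -> int:
    """Index of the predetermined segment that contains a given frame."""
    if frame < 0:
        raise ValueError("frame must be >= 0")
    return frame // segment_length(level, m)


lengths = [segment_length(n) for n in range(4)]  # levels 0 through 3
```

For m equal to seven, the four predetermined levels cover 1, 7, 49, and 343 frames respectively, so frame 100 falls in segment 2 at level two.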
Referring again to the preferred and illustrated embodiment of the
invention, source material read into the source bin can have segments (or
clips) marked off (but not physically divided) in a number of ways. The
segments can be designated by, for example, regular sampling, wherein a
segment is marked with labels extracted at a repeating predetermined time
duration such as one second. Another method employed, according to the
invention, for marking off source video into segments, relies upon an
operator actuated control panel key which enables the operator to mark off
the incoming source material into segments by making preliminary decisions
on-the-fly.
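The two ways of marking off segments described above can be sketched as follows. This is an illustrative sketch only: frames are represented by integer indices, the period of thirty frames corresponds to the one second example at thirty frames per second, and the function name and the representation of operator key presses as a list of frame indices are assumptions.

```python
from typing import Iterable, List, Tuple


def mark_segments(total_frames: int, period: int = 30,
                  operator_marks: Iterable[int] = ()) -> List[Tuple[int, int]]:
    """Return (first_frame, last_frame) pairs; the material itself is not cut.

    If operator marks are supplied they define the segment starts;
    otherwise a new segment starts every `period` frames.
    """
    if total_frames <= 0 or period <= 0:
        raise ValueError("total_frames and period must be positive")
    marks = list(operator_marks)
    if marks:
        starts = sorted({m for m in marks if 0 < m < total_frames})
    else:
        starts = list(range(period, total_frames, period))
    boundaries = [0] + starts + [total_frames]
    return [(a, b - 1) for a, b in zip(boundaries, boundaries[1:])]
```

Each returned pair names the frames from which a beginning label and an ending label could be taken, for example `mark_segments(90)` yields three one second segments.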
In either instance, the composition apparatus 10 uses pictorial labels to
designate each segment (or a sequence of frames) of the video material
being composed. Thus, instead of forcing the user to manually identify and
record a video segment by either the SMPTE time codes or another
artificial determination, one or more fields or frames of the segment
(preferably digitized frames together with their machine retrievable SMPTE
address codes), are employed to pictorially "label" the segment. The
labels can be, as described below, the frames at the beginning and the end
of the segment. In other circumstances, the labels can be near the
beginning and the end of the segment, or elsewhere.
Furthermore, more than one label can be used for a segment. In the
illustrated embodiment of the invention, two labels are used, one
pictorial label corresponding to the frame at the beginning of the segment
and a second pictorial label corresponding to the frame at the end of the
segment. (Alternately, one label can be employed during an initial "rough
cut" and two labels can be used for the later composition work.) As the
segments are assembled, in a desired order as described hereinafter, the
labels corresponding to the segments are similarly ordered.
In the illustrated embodiment of the invention the display screens 42, 44,
46, . . . , 68 are designated "passive displays" and are generally
employed for presenting a spatial display of the label pairs associated
with a sequence of segments, one vertical pair of display screens showing
the beginning label (top display) and the ending label (bottom display) of
a segment. The video segment associated with a selected one of the label
pairs, designated by a control cursor, will typically be displayed on the
main screen or "active display" 32. The beginning and ending labels of the
segment being displayed on the active display 32 will typically be
displayed on various of screens 34, 36, 38, 40 depending upon the mode of
operation as described below.
In the illustrated embodiment, if the control cursor, the location of which
is indicated by illumination elements 324 and controlled by lever 100
(FIG. 3), is set to the center screen pair of the passive displays, that
is, to displays 54, 56, the segment corresponding to displays 54, 56 will
generally be displayed on the main screen 32. Further, the apparatus
displays pictorial label pairs corresponding to the three immediately
preceding segments on the three preceding vertical display screen pairs,
i.e., display pairs 42, 44; 46, 48; and 50, 52. Similarly, the pictorial
label pairs corresponding to the next three succeeding segments
are presented on passive display screens 58, 60; 62, 64; and 66, 68.
Thereby, the control console provides a spatial display corresponding to
the temporal image presentation. This snapshot-type multiple label display
enables the user to keep in temporal perspective where the segment
presently displayed on screen 32 "fits" in the segment sequence.
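The mapping from a segment sequence to the seven vertical screen pairs can be sketched as a sliding window. The sketch is illustrative only: segments are represented by their positions in the sequence, the function name is hypothetical, and leaving a screen pair blank when the window runs past either end of the sequence is an assumption.

```python
from typing import List, Optional

PAIRS = 7            # seven vertical screen pairs (screens 42 through 68)
CENTER = PAIRS // 2  # center pair, corresponding to screens 54, 56


def passive_window(num_segments: int, current: int,
                   cursor: int = CENTER) -> List[Optional[int]]:
    """Segment index shown on each screen pair, or None for a blank pair."""
    if not 0 <= current < num_segments:
        raise ValueError("current segment out of range")
    if not 0 <= cursor < PAIRS:
        raise ValueError("cursor position out of range")
    first = current - cursor  # segment index that lands on the leftmost pair
    return [i if 0 <= i < num_segments else None
            for i in range(first, first + PAIRS)]
```

With the cursor on the center pair and segment 10 of 20 selected, the window holds segments 7 through 13; near the start of the sequence the leftmost pairs are blank in this sketch.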
Referring now to FIG. 2, the communications and data management center of
the apparatus is the composing computer/controller 22. The
computer/controller has a central processing unit which can be for example
an Omnibyte OB68K1A, manufactured by Omnibyte of West Chicago, Ill.
Associated with the controller 22 is a printer 106, for example a dot
matrix printer such as the Versatec V80 manufactured by the Versatec
Division of Xerox Corp. The controller 22 further has a digital data bus
107 for transmitting digital data between the computer, a disk controller
108, a picture cache memory 109, a video digitizer 110, a display
processing unit 112, and a display processing control 114. The controller
22 is further adapted to receive inputs from the control panel 70 through
an interface unit 120. As noted above, the controller 22 is in direct
communication with the various VTR interfaces 24, 26, 28, and 30 as well
as with video port interfaces 122, 124, and 126. (Interfaces 122, 124 and
126 operate in response to controller 22 for controlling external video
equipment, for example external VTR's.) The controller 22 also operates
video routing circuitry 130, 132, and 134, and a video effects switching
circuit 136. In the illustrated embodiment, the controller 22 operates
with a system clock generator 140 for system signal synchronization.
According to the illustrated embodiment, at the beginning of a composing
session, controller 22 operates in a default mode, which is an automatic
segmenting mode, for dividing "raw video" source material into plural
segments. The illustrated apparatus is thus designed to effect a
segmentation of the source material according to a predetermined method
and sequence. This segmentation process is described above as a periodic
sampling process. On the other hand, as noted above, it is also desirable
for the operator to review the source material quickly and roughly and
indicate his initial feel for the divisions between segments. This
operator controlled segmentation function can be implemented in any
arbitrary manner, and is described in detail below.
Controller 22 is further responsive to the operator console for providing a
storyboard output to printer 106. The storyboard output includes a
sequence of labels, generally at a program bin level, which describes the
flow of the story. In addition, if textual material had been entered from
the keyboard 104 with respect to any segment label, that material is also
printed on the storyboard. The operator/editor can then use the storyboard
as a "hard copy" guide and aid during the composition process.
Passive Display Operation
Referring to FIGS. 2-7, in accordance with the illustrated embodiment of
the invention, each passive display screen 42, 44, 46, . . . , 68 is a 3.7
inch monitor on which a relatively low resolution, 128 x 120 picture
element (pixel) raster is displayed. In the preferred embodiment of the
invention, the raster has sixteen levels of gray scale corresponding to
four bits of information. In other embodiments of the invention, more or
less resolution, both spatially and in gray scale, or color, can be
employed.
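The reduction of a picture to sixteen gray levels, four bits per pixel, can be sketched in software as follows. The sketch assumes an 8-bit gray value per pixel as input and a packing of two pixels per byte; neither detail is taken from the described hardware, and the spatial reduction to the 128 x 120 raster is not shown.

```python
from typing import List

WIDTH, HEIGHT = 128, 120  # label raster stated in the text


def quantize(pixel: int) -> int:
    """Map an 8-bit gray value (0-255) to one of sixteen levels (0-15)."""
    if not 0 <= pixel <= 255:
        raise ValueError("pixel must be in 0..255")
    return pixel >> 4  # keep the four most significant bits


def pack_label(pixels: List[int]) -> bytes:
    """Pack a 128 x 120 raster of 8-bit values at four bits per pixel."""
    if len(pixels) != WIDTH * HEIGHT:
        raise ValueError("expected a 128 x 120 raster")
    levels = [quantize(p) for p in pixels]
    # Two neighboring pixels share one byte: first in the high nibble.
    return bytes((hi << 4) | lo for hi, lo in zip(levels[0::2], levels[1::2]))
```

A full raster packed this way occupies 7,680 bytes.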
The digital display data, which represents the pictorial labels, is
generated by the video digitizer 110 under control of the controller 22.
Digitizer 110 receives analog video input data from the video routing
circuitry 130 over a line 143. The video digitizer, which includes a fast
A-D converter and a two picture capacity random access memory, stores the
digitized video, digitized to four bits, for later presentation over the
digital bus either to the display processing unit 112, to a disk storage
146, or to the cache memory 109. Controller 22 controls the flow of
digital data from the video digitizer, disk, or cache storage to the
display processing unit and is capable of dynamically updating the
pictorial labels displayed at the console 12 at a rapid rate, for example,
twenty-four per second.
The digitizer, through its computer interface, receives instructions from
controller 22 over the computer bus 107. The digitizer is fast enough to
grab a frame on-the-fly from an ongoing stream of video information over
line 143. The interface can therefore be instructed by the controller 22,
upon recognition of the time code location, to trigger upon recognition of
the next vertical interval pulse, and the video or video segment
associated therewith will then be digitized and stored. The frame time
code is used by the apparatus to identify the frame. The digitizer can
also digitize a frame displayed in the freeze mode of VTR operation, read
its time code, and store the data for future use by the controller.
The video output from the video routing circuitry 130 to the video
digitizer is selected and dictated by the signal levels from the
controller 22 over lines 142. The video routing circuitry 130 is an
EXCLUSIVE OR routing circuitry which takes one of the video inputs (from
the VTR's 14, . . . , 16, . . . , 18, . . . , 20, from video input ports
275, 276, and from routing circuit 132) and provides that selected input
to the video digitizer over line 143. The selected video input signal can
thus be digitized to become available to be displayed as a pictorial label.
The video input and frame selection process is at least partially
controlled, as described below, by the operator/editor at control console
12.
Controller 22 has associated with its disk controller 108, the high speed
disk storage device 146. Storage device 146 can be employed, for example,
to store all labels of interest so that they can be output to the display
processing unit 112 as needed. Since each illustrated passive display
screen requires only eight kilobytes of information, the disk controller
and disk storage are fully capable of changing all of the displays stored
by the display processing unit 112 within a short time duration and
therefore provide a great flexibility to operation of the pictorial label
presentation.
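The figure of about eight kilobytes per screen follows from the raster size and gray scale depth given earlier, as the short calculation below shows. The count of fourteen screens is taken from the listing of passive display screens 42 through 68; the variable names are hypothetical.

```python
WIDTH, HEIGHT = 128, 120  # label raster stated in the text
BITS_PER_PIXEL = 4        # sixteen gray levels
SCREENS = 14              # passive screens 42 through 68, two rows of seven

bytes_per_label = WIDTH * HEIGHT * BITS_PER_PIXEL // 8
bytes_all_screens = bytes_per_label * SCREENS

print(bytes_per_label)    # 7680, i.e. about eight kilobytes
print(bytes_all_screens)  # 107520
```

Replacing every passive display at once therefore involves roughly 105 kilobytes of label data under these figures.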
Even though the disk controller and disk storage can operate with access
speeds on the order of ten milliseconds, the retrieval of labels from
different sections of the disk can result in a non-uniform rate of change
for the passive displays. The apparatus therefore employs the picture
cache memory 109, a high speed solid state memory attached to the
controller bus 107, for maintaining a fast uniform label change rate. The
cache memory typically has sufficient storage capacity for sixty label
pairs and has an access time on the order of tens of microseconds which is
significantly faster than the access time for disk storage 146. The cache
memory operat