TECHNICAL FIELD
The present invention relates to videoconferencing systems and more
particularly to a videoconferencing system which can accommodate a
plurality of different devices and which provides for ease of operation by
the user.
BACKGROUND OF THE INVENTION
Typical prior art videoconferencing systems fall into one of two
categories: those where the intelligence is centralized in the
coder-decoder (codec) or a system control unit; and those where the
intelligence is distributed so that each peripheral device controller has
the intelligence necessary to directly control other peripheral devices in
the system. One shortcoming of centralized intelligence systems is that
such systems are not readily adaptable to accommodate new devices and new
versions of existing devices. The addition of another peripheral device
beyond the number originally planned for, or the addition of a new type of
peripheral device, can require a substantial investment in time and money
to accommodate the desired additional device or new device. Furthermore,
most centralized intelligence systems have a limited capacity with respect
to the number of ports available to connect to peripheral devices. Once
this capacity has been reached, new devices can be added only by removing
existing devices, such as lesser used devices, or by obtaining another
codec or system controller which can accommodate the increased number of
devices.
Distributed intelligence systems, such as that shown in U.S. Pat. No.
5,218,627 to Corey, have the shortcoming that each peripheral device
controller must have the intelligence necessary to control every type of
peripheral device connected to the network, and every additional
peripheral device must have a peripheral device controller which has the
intelligence necessary to control all the existing devices on the network.
Therefore, the addition of a new type of peripheral device requires new
programming to be provided for each of the existing peripheral device
controllers, and requires programming of the controller for the new type
of device to accommodate the existing peripheral devices.
Therefore, there is a need for a videoconferencing system which can readily
accommodate both additional peripheral devices and new types of peripheral
devices.
Positioning of video cameras is required for videoconferencing as well as
for a number of other activities, such as surveillance. The terms pan,
tilt, zoom and focus are industry standards which define the four major
axes for which a camera may be adjusted. Traditional camera positioning
provides for manual adjustment of these axes, as well as buttons which
provide for automatically positioning the camera to a preset location. A
preset function recalls the pan, tilt, zoom and focus settings that have
been previously ascertained and stored for that preset location.
Traditional videoconferencing systems provide for rather rudimentary
control of these camera functions. That is, the user has a control panel
for manually controlling camera functions, such as buttons for up/down,
left/right, zoom in/out, and focus. The user can also typically select one
of several preset camera settings so that, by the press of a single
button, the camera will automatically position and focus itself at some
preselected target. Of course, the preset function requires planning
because the camera must be manually adjusted for the preset, and then the
settings stored. The preset button then merely recalls these settings and
adjusts the camera accordingly. If a location has not been preset then the
user must manually adjust the pan, tilt, zoom, and focus settings for that
location.
However, these controls are not intuitively obvious or easy to use, partly
because the user may think that the camera should pan in one direction to
center an object whereas, because of the position of the camera with
respect to the user and the object, which object may be the user, the
camera should actually move in the opposite direction. For example, the
user typically sits at a table and faces the camera, and beside the camera
is a monitor screen which allows the user to see the picture that the
camera is capturing. If the user is centered in the picture, and wishes
the camera to center on his right shoulder, the user may think that he
wants the camera to pan left because, on the screen as seen by the user,
the user's right shoulder is to the left of the user's center. However,
the camera should actually pan to the right because, from the camera's
viewpoint, the user's right shoulder is to the right of the user's center.
Also, current manual camera positioning techniques typically use a fixed
motor speed. This results in the panning being too rapid and the scene
flying by when the camera is zoomed in on an object, or in the panning
being too slow and the scene taking a prolonged time to change to the
desired location when the camera is in a wide field of view setting
(zoomed out).
Furthermore, in traditional videoconferencing systems, when the camera is
moving to a preset location the pan and tilt systems move at the same
rate. If the required pan movement is different than the required tilt
movement then the camera will have completed its movement along one axis
before it has completed its movement along the other axis. This makes the
camera movement appear to be jerky and unnatural.
After the user has completed the process of changing the camera position
the user may have to refocus the camera. As chance would have it, the
first attempt to refocus the camera usually is in the wrong direction.
That is, the user inadvertently defocuses the camera. The learning process
is short, but the need to focus creates delays and frustration.
When the system has multiple cameras which are subject to control by the
user, typical systems require the user to use buttons on the control
keyboard to manually select the camera to be controlled, and/or assign
separate keys to separate cameras. Frequently, the user will select the
wrong camera, or adjust the wrong camera.
SUMMARY OF THE INVENTION
The present invention provides a video teleconferencing system which
combines a central intelligence with distributed intelligence to provide a
versatile, adaptable system. The system comprises a controller and a
plurality of network converters. Each network converter is connected to a
system network as well as to one or more peripheral devices. The
controller contains the software necessary for its own operation as well
as the operation of each of the network converters. The user selects the
type of device that is connected to a network converter and the controller
sends the software appropriate to that type of device to the network
converter. The network converter loads the software into its own memory
and is thereby configured for operation with that type of device. This
allows a network converter to be quickly programmed for a particular
peripheral device. This also allows for quick and convenient upgrading of
the system to accommodate new devices. Rather than having to design a new
network converter for each type of new peripheral device, software for
that new device is written and stored in the controller. The software can
then be loaded into a network converter when that new device is added to
the system. Therefore, existing network converters can be used to
accommodate new devices. This reduces the number and type of network
converters that must be maintained in inventory and also minimizes the
obsolescence of network converters as new devices and new versions of
existing devices become available.
In addition, the present invention provides that the controller will
perform conversion of instructions from the initiating device, such as a
mouse, to the controlled device, such as a camera. This allows for easy
and convenient upgrading of the system to accommodate new devices because
the peripheral devices do not need to understand the signals from other
peripheral devices. The controller performs the necessary device-to-device
signal translation. For example, one network converter will convert
signals from a mouse into network standard control signals which represent
the mouse movement, such as left, right, up, down, button 1 depressed,
button 1 released, etc., regardless of the type of mouse being used. The
controller then inspects these network standard control signals to
determine the type of action requested by the user. The controller then
generates network standard control signals corresponding to the desired
action and places these signals onto the network. Examples of network
standard control signals intended for the control of a camera might be pan
left, pan right, etc. The camera network converter then performs a
conversion of the network standard signals from the controller into the
type of control signals required for that particular camera, such as +12
volts, -12 volts, binary command 0110, etc. When a new type of peripheral
device, such as a new camera, is added the new device may require control
signals which are completely different than any existing device so the
control signals presently provided by the camera network converter would
not give the desired results. In the present invention the network
standard signals do not change. Rather, new software is written for the
camera network converter so that the camera network converter provides the
appropriate signals to the new camera, such as +7 volts, -3 volts, binary
command 100110, etc. In this manner, peripheral devices from different
manufacturers and new peripheral devices are readily accommodated by
adding new software for the controller. The user can then instruct the
controller to load the new software into the converter so that the
converter is now configured for the new device.
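By way of illustration only, the translation chain described above may be sketched as follows. The device names, the raw mouse codes, and the assignment of each voltage to a pan direction are assumptions made for the example; the specification does not define a message format.

```python
# Hypothetical driver tables. The raw codes, device names, and the mapping of
# each voltage to a direction are assumptions for illustration only.
MOUSE_DRIVERS = {
    "mouse_type_1": {0x01: "MOUSE_LEFT", 0x02: "MOUSE_RIGHT"},
    "mouse_type_2": {0x4C: "MOUSE_LEFT", 0x52: "MOUSE_RIGHT"},
}
CAMERA_DRIVERS = {
    "camera_old": {"PAN_LEFT": "+12 volts", "PAN_RIGHT": "-12 volts"},
    "camera_new": {"PAN_LEFT": "+7 volts", "PAN_RIGHT": "-3 volts"},
}
# Device-to-device translation performed by the controller.
ACTIONS = {"MOUSE_LEFT": "PAN_LEFT", "MOUSE_RIGHT": "PAN_RIGHT"}


def mouse_converter(mouse_type, raw_code):
    """Convert a device-specific mouse code into a network standard signal."""
    return MOUSE_DRIVERS[mouse_type][raw_code]


def controller(standard_signal):
    """Decide which network standard camera command the user requested."""
    return ACTIONS[standard_signal]


def camera_converter(camera_type, standard_command):
    """Convert a network standard command into a device-specific signal."""
    return CAMERA_DRIVERS[camera_type][standard_command]


def handle(mouse_type, raw_code, camera_type):
    """Run one user action through the whole chain."""
    signal = mouse_converter(mouse_type, raw_code)
    command = controller(signal)
    return camera_converter(camera_type, command)
```

In this sketch, supporting a new camera means adding one entry to the hypothetical CAMERA_DRIVERS table; the network standard commands and the mouse side are untouched.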
The present invention also provides for control of devices on remote
systems. The use of network standard signals allows a user at a local site
to easily control a device at a remote site, even if the controller at the
local site does not have software appropriate for that type of device. The
controller at the local site receives the network standard signals
corresponding to the action taken by the user and determines the action
(pan left, pan right, etc.) required at the remote site. The local
controller then sends the network standard signals for the action to the
remote controller. The remote controller receives the network standard
signals from the local controller and sends these network standard signals
to the remote network converter for the device, and the remote network
converter does have the appropriate software for the remote device. The
remote network converter then converts the network standard signals into
the signals appropriate for that type of peripheral device.
The present invention provides alternative methods of adjusting the pan,
tilt, zoom and focus of a camera. In one method the user positions a
pointer over an object displayed on a monitor and clicks a mouse button.
This causes the camera to be automatically positioned so as to center the
object in the monitor display. In another method the user uses the pointer
to draw a rectangle around the object or area of interest. This causes the
camera to be automatically positioned to center the object in the monitor
display and adjust the zoom and focus so that the designated area in the
rectangle fills the display. This is a substantial improvement over prior
art systems in that a camera may be automatically positioned for objects
or areas for which there are no preset values.
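By way of illustration only, one possible geometry for the point-and-click and draw-a-rectangle methods is sketched below. It assumes a simple linear mapping between screen position and camera angle, an unmirrored display, and known horizontal and vertical fields of view; the function names and the sign convention (positive pan toward the right edge of the displayed image, positive tilt upward) are hypothetical and are not taken from the specification.

```python
def click_to_offsets(x, y, width, height, hfov_deg, vfov_deg):
    """Return (pan_right_deg, tilt_up_deg) needed to center the clicked point.

    Assumes angle is proportional to distance from the screen center, which
    is only an approximation for wide fields of view.
    """
    pan_right = (x - width / 2) / width * hfov_deg
    tilt_up = (height / 2 - y) / height * vfov_deg  # screen y grows downward
    return pan_right, tilt_up


def rectangle_to_move(x0, y0, x1, y1, width, height, hfov_deg, vfov_deg):
    """Return (pan_right_deg, tilt_up_deg, zoom_factor) for a drawn rectangle."""
    rect_w, rect_h = abs(x1 - x0), abs(y1 - y0)
    if rect_w == 0 or rect_h == 0:
        raise ValueError("rectangle must have non-zero width and height")
    center_x, center_y = (x0 + x1) / 2, (y0 + y1) / 2
    pan_right, tilt_up = click_to_offsets(
        center_x, center_y, width, height, hfov_deg, vfov_deg
    )
    # Zoom until the rectangle fills the display in its tighter dimension.
    zoom_factor = min(width / rect_w, height / rect_h)
    return pan_right, tilt_up, zoom_factor
```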
The present invention provides an improvement to panning. The panning speed
is automatically adjusted in accordance with the current zoom (field of
view) setting. When the camera is zoomed in, panning will occur at a slow
rate so that objects do not fly by at high speed. When the camera is
zoomed out, panning will occur at a fast rate so that objects do not crawl
by at slow speed. The result is that, regardless of the zoom setting,
objects appear to move across the scene at a fixed, comfortable rate,
which is user selectable.
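By way of illustration only, the relationship between zoom setting and panning speed may be sketched as follows, assuming the user-selectable rate is expressed as the time an object takes to cross the full width of the display. The function and parameter names are hypothetical.

```python
def pan_rate_deg_per_s(fov_deg, crossing_time_s):
    """Pan rate that moves an object across the full display in a fixed time.

    fov_deg: current horizontal field of view (small when zoomed in).
    crossing_time_s: user-selected time for an object to cross the display.
    """
    if crossing_time_s <= 0:
        raise ValueError("crossing_time_s must be positive")
    return fov_deg / crossing_time_s
```

With a crossing time of 4 seconds, a 60 degree field of view gives 15 degrees per second, while a 6 degree field of view gives 1.5 degrees per second, so the apparent motion on the display is the same in both cases.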
The present invention provides an improvement to panning and tilting the
camera. When the camera position is to be changed, the time to complete
the change in the pan position is determined and the time to complete the
change in the tilt position is determined. Then, the faster process is
slowed down so as to be completed at the same time as the slower process.
This causes the camera to move smoothly and linearly from the starting
position to the ending position.
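By way of illustration only, the coordination of the pan and tilt movements may be sketched as follows. The sketch assumes that each axis has a known maximum rate and that the rates can be set continuously; the names are hypothetical.

```python
def plan_move(pan_delta, tilt_delta, max_pan_rate, max_tilt_rate):
    """Return (duration_s, pan_rate, tilt_rate) so both axes finish together.

    Deltas are signed angles in degrees; maximum rates are positive degrees
    per second. The axis that would finish first is slowed to match the other.
    """
    pan_time = abs(pan_delta) / max_pan_rate
    tilt_time = abs(tilt_delta) / max_tilt_rate
    duration = max(pan_time, tilt_time)  # the slower process sets the pace
    if duration == 0:
        return 0.0, 0.0, 0.0  # already at the target position
    return duration, pan_delta / duration, tilt_delta / duration
```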
The present invention provides a method for automatically focusing the
camera. Each time that the camera is positioned toward and manually
focused on an object or area the system automatically stores the camera
position and the focus setting. When the camera is next positioned toward
the object or area the system automatically recalls the stored focus
setting and implements that setting. The present invention defines
relationships between regions so that a focus setting may be determined
even if that region has not been used before.
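By way of illustration only, storing and recalling focus settings by region may be sketched as follows. The division of the pan and tilt range into fixed-size regions and the nearest-region fallback are assumptions made for the example; the fallback is a stand-in for, and not a statement of, the relationships between regions defined by the present invention.

```python
class FocusMemory:
    """Remembers the last manual focus setting for each camera region."""

    def __init__(self, region_size_deg=5.0):
        self.region_size_deg = region_size_deg  # hypothetical region size
        self._settings = {}

    def _region(self, pan_deg, tilt_deg):
        """Quantize a camera position into a region key."""
        size = self.region_size_deg
        return (int(pan_deg // size), int(tilt_deg // size))

    def store(self, pan_deg, tilt_deg, focus):
        """Record the focus setting chosen by the user at this position."""
        self._settings[self._region(pan_deg, tilt_deg)] = focus

    def recall(self, pan_deg, tilt_deg):
        """Return a focus setting for this position, or None if none is known."""
        region = self._region(pan_deg, tilt_deg)
        if region in self._settings:
            return self._settings[region]
        if not self._settings:
            return None
        # Assumed fallback: borrow the setting of the nearest stored region.
        nearest = min(
            self._settings,
            key=lambda r: (r[0] - region[0]) ** 2 + (r[1] - region[1]) ** 2,
        )
        return self._settings[nearest]
```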
The present invention further provides for automatic selection of the
camera to be controlled. The user simply positions a pointer over the
desired scene and the system automatically selects, for further control,
the camera which is providing that scene. This method is particularly
useful when picture-within-picture, split screen, and four-quadrant screen
displays are in use.
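By way of illustration only, the automatic selection of a camera from the pointer position may be sketched as a hit test against the display windows. The representation of each window as a camera name with a rectangle, listed from back to front, is an assumption made for the example.

```python
def select_camera(pointer, windows):
    """Return the camera whose picture lies under the pointer, or None.

    pointer: (x, y) in screen pixels.
    windows: list of (camera_name, (left, top, right, bottom)) in drawing
    order, back to front, so an inset picture comes after the main picture.
    """
    x, y = pointer
    # Check the front-most window first so an inset wins over the picture
    # it overlays.
    for camera, (left, top, right, bottom) in reversed(windows):
        if left <= x < right and top <= y < bottom:
            return camera
    return None
```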
BRIEF DESCRIPTION OF THE DRAWING
FIG. 1 is a block diagram of the preferred embodiment of the present
invention.
FIG. 2 is a block diagram of a serial interface-type network converter.
FIG. 3 is a block diagram of a parallel interface-type network converter.
FIG. 4 is a block diagram of a specialized-type network converter.
FIGS. 5A and 5B are a flow chart of the method used for positioning a
camera.
FIGS. 6A and 6B are an illustration of the operation of the automatic zoom
feature of the present invention.
FIG. 7 is a flow chart of the method for controlling the aim point and the
zoom operation of the camera.
FIG. 8 is a schematic block diagram of a video unit control node.
FIG. 9 is a schematic block diagram of an audio unit control node.
FIGS. 10A-10C are illustrations of the relationship between regions.
FIGS. 11A and 11B are a flow chart of the camera focusing process.
FIG. 12A is an illustration of the preferred embodiment of a camera of the
present invention.
FIG. 12B is an illustration of the feedback system associated with the
camera controls.
FIG. 13 is an illustration of a two-monitor videoconferencing system of the
present invention.
DETAILED DESCRIPTION
Turning now to the drawings, in which like numerals reference like
components throughout the several figures, the preferred embodiment of the
present invention will be described.
System Overview
FIG. 1 is a block diagram of the preferred embodiment of the present
invention. The videoconferencing system comprises a controller 10, a
plurality of network converters (C) 11A-11K connected to a network 23, a
mouse 12, a control panel 13, an audio unit control node 14, a video unit
control node 15, a coder-decoder (codec) 16, a camera unit control node
17, a joystick 18, a power supply 19, a video cassette recorder/playback
unit (VCR) 20, monitors 21, and a modem 22. The video teleconferencing
system also comprises items which, for the sake of clarity, are not shown
in FIG. 1, such as: cameras, pan/tilt and zoom/focus units for the
cameras, microphones, speakers, audio cabling, video cabling, and
telephone and power wiring. Each device 10, 12-22 is connected to a
converter 11A-11K. The converters are connected, preferably in a
daisy-chain (serial) manner, via the network designated generally as 23.
Converter 11A is shown as part of controller 10, and converters 11B-11K
are shown as being stand alone components which are separate from their
respective connected devices 12-22. However, this is merely a preference
and any converter 11 may be a stand alone component or may be a part of
its associated device. In the preferred embodiment, the network 23 is the
LON-based network developed by Echelon, Inc., Palo Alto, Calif. However,
other networks, such as Ethernet, may be used.
Each converter 11 contains information which either converts network
standard signals on network 23 into control signals for the connected
device 10, 12-22, converts control/status signals for the connected
device(s) into network standard signals for network 23, or both. For
example, network converter 11B will convert signals from the mouse 12
into network standard control signals which represent the mouse movement,
such as left, right, up, down, button 1 depressed, button 1 released, etc.
Network converter 11B provides the same network standard control signals
for a particular type of mouse movement regardless of the type of mouse
being used. In operation, network standard control signals from control
devices such as mouse 12, control panel 13, joystick 18 or codec 16, are
sent, via converters 11 and network 23, to controller 10. It is also
possible for a single converter to service two or more devices, such as
converter 11B servicing mouse 12 and joystick 18, and converter 11I
servicing two monitors 21A and 21B. When sending information concerning
the user's movement of devices 12 or 18, converter 11B also sends
information as to whether the activity is associated with the mouse 12 or
the joystick 18. The controller 10 then inspects these network standard
control signals to determine the type of action requested by the user and
the device which should take the action, generates network standard
control signals corresponding to the desired action, and places these
signals onto the network 23. As in any network, a converter 11 inspects
the address of the incoming network standard signals on the network 23 to
determine if the data is intended for that converter or its connected
device. If so, then the converter 11 will capture the data, which is a
network standard control signal representing the desired action, and
convert the data into the appropriate type of signal for the connected
device.
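By way of illustration only, the address inspection performed by each converter may be sketched as follows. The frame layout (a destination address and a network standard signal) is an assumption made for the example and does not reflect the format of any particular network.

```python
from dataclasses import dataclass, field


@dataclass
class Converter:
    """A network converter that captures only frames addressed to it."""

    address: str
    captured: list = field(default_factory=list)

    def on_frame(self, frame):
        """Return True and keep the signal if the frame is for this converter."""
        if frame["to"] != self.address:
            return False  # not for this converter; ignore it
        self.captured.append(frame["signal"])
        return True


def broadcast(converters, frame):
    """Offer a frame to every converter; return the addresses that took it."""
    return [c.address for c in converters if c.on_frame(frame)]


converters = [Converter(name) for name in ("11A", "11B", "11G")]
takers = broadcast(converters, {"to": "11G", "signal": "PAN_LEFT"})
```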
For example, assume that the user has used the mouse 12 to select a camera
(not shown in FIG. 1) and has moved the mouse in a direction which
indicates that the selected camera should pan to the left. The mouse
movement signals are converted by converter 11B into network standard
control signals indicating, for example, the direction of the movement of
the mouse and the status of the buttons on the mouse (depressed, not
depressed). Converter 11B then generates an address for controller 10 and
places these network standard signals on network 23. Converters 11C-11K
ignore these signals because the address indicates that the signals are
not for them. Converter 11A recognizes the address as its own, captures
these signals, and provides the signals to controller 10. Controller 10
determines that the network standard control signals signify a mouse
movement corresponding to an instruction for the selected camera to pan to
the left and, accordingly, generates network standard control signals
corresponding to such camera movement. Controller 10 then instructs
converter 11A to address these signals to the network converter for
pan/tilt unit control node 17 and to place these signals on network 23.
Converter 11G recognizes the address as its own (or as intended for its
connected pan/tilt device), and captures the network standard signals.
Converter 11G then generates control signals appropriate for the type of
pan mechanism (not shown) used with the selected camera.
Therefore, even if the type of mouse is changed or the type of pan/tilt
mechanism is changed, the network standard signals from the mouse or to
the pan/tilt mechanism will not change. Rather, the network converters 11
will convert the signals from the mouse 12 into network standard signals
and will convert the network standard signals into signals appropriate for
the pan/tilt mechanism.
As an example, the signals from mouse 12 may indicate that mouse 12 is
being moved to the left at a certain rate and the appropriate signals
provided to the pan motor may be +12 volts or, if the pan motor has a
digital controller or interface, the signals provided by converter 11G may
be a binary signal such as 101011 or some other code which corresponds to
the code and format required to achieve the specified action.
It will be appreciated that, for a simple action, such as pan left or right
and tilt up or down, controller 10 may not be required and converters 11B
and 11G may be programmed to achieve the desired correspondence between
the movement of the mouse 12, the depression of keys on control panel 13,
and movement of the pan motor. However, in the preferred embodiment, mouse
12 is also used to specify functions which do not have a one-to-one
correspondence between mouse movement and pan motor action, such as the
point-and-click and the draw-and-release operations described below and
therefore all network signals are directed to or come from controller 10.
Similarly, status information from monitor control node 21 is addressed by
converter 11I to controller 10 (converter 11A) and then placed on network
23. Controller 10 then inspects the status information to determine if the
selected monitor (not shown) is in the proper mode, such as on or off.
Control panel 13 is a conventional videoconferencing system control panel,
well known in the art, and provides, via buttons, such functions as pan
left, pan right, tilt up, tilt down, mute on/off, zoom in/out, focusing,
presettable camera settings, and volume up/down. Audio unit control node
14 controls the flow of audio signals among the devices which send or
receive audio signals, such as microphones, speakers, codec 16, telephone
lines, and VCR 20. Video unit control node 15 controls routing of video
signals among the different devices which send or receive video signals
such as codec 16, VCR 20, cameras, and monitors 21. Codec 16 provides
conventional codec functions. Camera unit control node 17 controls the
pan, tilt, zoom, and focus of the cameras and provides feedback regarding
these parameters. Power supply 19 provides operating power for the
converters 11 and also for the other devices 10, 12-18, 20-22 connected to
the system. VCR 20 is a conventional video cassette recorder/playback
device. Monitors 21 are commercially available monitors and, in the
preferred embodiment, are Mitsubishi color televisions, model CS-35EX1,
available from Mitsubishi Electronics America, Inc., Cypress, Calif. Modem
22 is a conventional modem, preferably having a data communications rate
of at least 9600 bits per second.
Those of skill in the art will appreciate that a typical codec 16 has a
port for connection to one or more dial-up or dedicated telephone lines.
There are several different protocols which can be used for codec-to-codec
communications. If the codecs are using the same protocol then they can
negotiate as to what features, such as data transfer rate, data
compression algorithms, etc., are to be used in the videoconferencing
session. However, the codecs must be configured to use the same protocol
or information transfer is not possible. If one codec has been configured
to use a first protocol and a second codec has been configured to use a
second protocol then the codecs will not be able to communicate. Codecs
generally have a keypad and a display which are used for setting up the
codec. However, the codes for setting up and the display indicating the
stage of setup or the results of the entered code are typically not
intuitive. Therefore, setting up (configuring) a codec for a particular
protocol is, in most cases, a tedious and time consuming task which is
preferably performed by a technician who is familiar with the instruction
and result codes used by that codec. However, codecs have a data port
which can also be used for transferring data as well as for setting up the
codec. This data port is advantageously used in the present invention to
allow a codec 16 to be configured by the controller 10. In the preferred
embodiment, codec 16 is a type Visualink 5000, manufactured by NEC
America, Inc., Hillsboro, Oreg.
Using, for example, the mouse 12 or the control panel 13, the user can
instruct controller 10 to establish the videoconferencing session.
Controller 10 will, via converters 11A and 11F and network 23, instruct
codec 16 to dial up or otherwise access the remote codec (the codec at the
other videoconferencing location). Codec 16 will then attempt to establish
communications with the remote codec. If communications are successfully
established the codecs will negotiate what features will be used and then
the session may begin. However, if communications cannot be established,
such as because the codecs are configured for different protocols, the
local codec 16 will report to controller 10 that codec 16 was able to contact
the remote codec but was unable to establish communications (handshake)
with the remote codec because the remote codec was using a different
protocol. Controller 10 will then, via converters 11A and 11N, instruct
modem 22 to dial up the remote modem (the modem for the videoconferencing
system at the other location). Once controller-to-controller
communications have been established via modem then controller 10 can
instruct the remote controller to configure the remote codec for a
particular protocol. The remote controller will take action, if necessary,
to configure the remote codec to the same protocol. Conversely, controller
10 can receive information from and/or negotiate with the remote
controller as to the protocol(s) supported by, or the current
configuration of, the remote codec and then configure codec 16 to the same
protocol as the remote codec. Then, controller 10 can again instruct codec
16 to establish communications with the remote codec and, as both codecs
have now been configured to the same protocol, the codecs can establish
communications and negotiate features, and the videoconferencing session
can begin.
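By way of illustration only, the fallback sequence described above may be sketched as follows. The sketch models a codec as nothing more than its configured protocol and shows only the variant in which the local codec is reconfigured to match the remote codec; the class, function, and protocol names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Codec:
    """Minimal stand-in for a codec: only its configured protocol is modeled."""

    protocol: str

    def handshake(self, other):
        """Communications can be established only if the protocols match."""
        return self.protocol == other.protocol


def establish_session(local, remote, modem_available=True):
    """Try the codecs first; on a protocol mismatch, align them via modem."""
    if local.handshake(remote):
        return "connected"
    if not modem_available:
        return "failed"
    # Controller-to-controller exchange over the modem link: the local
    # controller learns the remote codec's protocol and reconfigures the
    # local codec to match it.
    local.protocol = remote.protocol
    if local.handshake(remote):
        return "connected after reconfiguration"
    return "failed"
```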
The present invention also provides for local control of remote devices. In
addition to controller 10 being able to communicate with any device 12-18,
20-22 on the local network 23, controller 10 may also communicate with a
similarly situated controller at a remote site (not shown) via the data
port on codec 16. The user, using mouse 12, control panel 13, or joystick
18, may command a particular action to be performed at the remote site,
such as panning the remote camera to the left or right, tilting the remote
camera up or down, etc. The user's actions are converted into network
standard control signals and these signals are sent by converter 11B to
controller 10. Controller 10 determines the action required at the remote
site and sends, via network 23 and codec 16, network standard control
signals corresponding to the action to the remote controller. The remote
controller then sends, via its own network, the network standard signals
to the converter for the remote pan/tilt unit. The remote converter then
generates the appropriate instruction for the remote pan/tilt unit control
node which, in turn, causes the pan/tilt mechanism for the selected remote
camera to perform the action specified by the user at the local site. The
user at the local site can therefore control all of the functions of all
the devices at the remote site that the remote user can control at the
remote site, even if the remote site has devices available which are not
available at the local site. However, in practice, some functions at a
site are preferably controlled only by the user at that particular site,
such as microphone muting, monitor on/off operation, and speaker volume
control settings.
The present invention also provides for system diagnostics. In the
preferred embodiment, camera unit control node 17, in addition to
receiving instructions from controller 10, also reports the results of an
instruction to controller 10. Each pan/tilt unit has a position indicator,
either as part of the unit or as a retrofit device. The position indicator
indicates the current pan position and the current tilt position. The
camera unit control node 17 accepts the position signals from the position
indicator and provides these signals to the controller 10. Controller 10
inspects these signals to determine whether the selected pan/tilt unit is
taking the proper action with respect to the control signals. For example,
assume that controller 10 has instructed a particular pan/tilt unit to pan
in a certain direction at a certain rate but that the pan/tilt unit either
does not pan, or pans at a different rate. The camera unit control node 17
reports the response of the selected pan/tilt unit to controller 10. If
the response of the selected pan/tilt unit is improper then controller 10
will cause a report to be generated which alerts the system operator to
the problem. The report may be provided in a number of ways. For example,
the presence of the report may be indicated by an icon on the screen of a
monitor 21. This alerts the system operator to select the report to
ascertain the nature of the problem. Or, the controller 10 may cause a
report to be printed, either by a printer (not shown) connected to a
printer port on controller 10 or by a printer (not shown) connected as
another device on the network 23. The report may also indicate the
severity of the problem. For example, a slow pan is generally not a
critical item, but indicates that the pan/tilt unit should be serviced in
the near future to prevent the complete failure of and/or damage to the
unit. Conversely, a unit which does not pan at all requires immediate
servicing as continued attempts by the user to cause that pan/tilt unit to
pan could result in gear damage or motor burnout.
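By way of illustration only, the comparison of the commanded pan rate with the rate reported by the position indicator may be sketched as follows. The severity labels and the 20 percent tolerance are assumptions made for the example.

```python
def classify_pan_response(commanded_rate, measured_rate, tolerance=0.2):
    """Classify a pan/tilt unit's response as 'ok', 'warning', or 'critical'.

    Rates are in degrees per second. The tolerance is the allowed fractional
    difference between the commanded and the measured rate.
    """
    if commanded_rate == 0:
        return "ok" if measured_rate == 0 else "warning"
    if measured_rate == 0:
        return "critical"  # unit does not pan at all: immediate servicing
    if abs(measured_rate - commanded_rate) > tolerance * abs(commanded_rate):
        return "warning"  # unit pans at the wrong rate: service soon
    return "ok"
```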
Modem 22 also allows for remote diagnostics and reporting. If the
videoconferencing system is, for example, being serviced by a remote party
then the remote party can, using a personal computer and a modem, call up
modem 22, establish communications with controller 10, and instruct
controller 10 to send, via modem 22, the current system diagnostics.
Furthermore, controller 10 can be programmed to use modem 22 to call up
the remote party, establish communications with the remote computer, and
automatically send the current system diagnostics. The programming may
specify that the call is to be performed at a certain time of day, such as
during off-duty hours, or whenever a serious failure occurs, such as the
complete failure of a pan/tilt unit, or both.
The controller-to-controller communications, via either codecs or modems,
also allow the controller at one site, such as a remote site, to inform
the controller at another site, such as the local site, that a particular
device or function is inoperative at the remote site. Then, when the user
attempts to use that device or function the local controller will
disregard the instructions from the user and inform the user that that
device or function is out of service.
Controller 10, in addition to performing system diagnostics, also attempts
simple system repairs. For example, if the pan/tilt unit will not pan in
one direction, controller 10 will instruct the pan/tilt unit to pan in the
other direction so as to attempt to dislodge any cable which may be
snagged. If this action is successful and the pan/tilt unit is then
operational controller 10 will log the failure and the repair so that the
service technician will know to inspect that unit for loose or snagged
cables and to service that unit. If the action is not successful then
controller 10 will disregard future instructions from the user as to the
desired movement of that pan/tilt unit and will not attempt to send
further instructions with respect to the failed function. That is, pan
instructions will not be sent because the pan function is not operative,
but tilt instructions may be sent because that function still operates
properly. However, as another option, controller 10 may be programmed to
cause operating power to be entirely removed from the failed pan/tilt
unit.
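By way of illustration only, the simple repair attempt described above may be sketched as follows. The FakePanUnit class is a hypothetical stand-in for a pan/tilt unit with a snagged cable, and the log messages are assumptions made for the example.

```python
class FakePanUnit:
    """Stand-in for a pan/tilt unit whose cable may be snagged in one direction."""

    def __init__(self, snagged=None, clears_on_reverse=True):
        self.snagged = snagged
        self.clears_on_reverse = clears_on_reverse

    def pan(self, direction):
        """Return True if the unit actually moved in the given direction."""
        if direction == self.snagged:
            return False
        if self.snagged is not None and self.clears_on_reverse:
            self.snagged = None  # reversing dislodged the cable
        return True


def pan_with_repair(unit, direction, log, disabled):
    """Pan the unit, attempting one reverse-direction repair on failure."""
    if "pan" in disabled:
        return False  # function already marked as failed; send nothing
    if unit.pan(direction):
        return True
    opposite = "right" if direction == "left" else "left"
    unit.pan(opposite)  # try to dislodge a snagged cable
    if unit.pan(direction):
        log.append("pan failure cleared by reversing; inspect cables")
        return True
    disabled.add("pan")  # tilt instructions may still be sent
    log.append("pan function disabled; service required")
    return False
```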
Similar action and reporting may be taken with respect to other functions
and devices. For example, the camera unit control node 17 also controls
the zoom an