| United States Patent | 5586250 |
| Link to this page | http://www.wikipatents.com/5586250.html |
| Inventor(s) | Carbonneau; Guy A. (Winter Springs, FL);
Wu; Bernie (Longwood, FL);
Jones; Tim (Deltona, FL) |
| Abstract | An intelligent status monitoring, reporting and control module is coupled
to a SCSI bus that interconnects a cluster of SCSI-compatible data storage
modules (e.g., magnetic disk drives). The status monitoring, reporting and
control module is otherwise coupled to the cluster of SCSI-compatible data
storage modules and to power maintenance and/or other maintenance
subsystems of the cluster for monitoring and controlling states of the
data storage modules and power maintenance and/or other maintenance
subsystems that are not readily monitored or controlled directly by way of
the SCSI bus. The status monitoring, reporting and control module sends
status reports to a local or remote system supervisor and executes control
commands supplied by the local or remote system supervisor. The status
reports include reports about system temperature and power conditions. The
executable commands include commands for regulating system temperature and
power conditions. |
SCSI-coupled module for monitoring and controlling SCSI-coupled RAID
bank and bank environment |
| Publication Date |
December 17, 1996 |
| Filing Date |
November 12, 1993 |
Claims  |
What is claimed is:
1. A redundant data storage system comprising:
a data exchange bus;
a plurality of data storage means removably supported in a plurality of
physical-support slots, said plurality of data storage means being for
redundantly storing a body of data, each storage means being operatively
coupled to the data exchange bus; and
status defining means also operatively coupled to the data exchange bus,
the status defining means being further operatively coupled to a selected
one or more of the plural data storage means and the physical-support
slots for sensing a local status of the selected one or more of the plural
data storage means and the physical-support slots, said local status being
one that cannot be otherwise determined by way of the data exchange bus
and the status defining means being further for reporting the sensed local
status to the data exchange bus;
wherein said local status includes one or more parameters selected from the
group consisting of:
(a) a local power voltage level inside a specified one of the selected
plural data storage means;
(b) a local power current level inside a specified one of the selected
plural data storage means;
(c) an amount of deviation from a prespecified nominal value for a local
power voltage level inside a specified one of the selected plural data
storage means;
(d) an amount of deviation from a prespecified nominal value for a local
power current level inside a specified one of the selected plural data
storage means;
(e) a temperature level inside a specified one of the selected plural data
storage means;
(f) a temperature condition inside a specified one of the selected plural
data storage means that is outside of a predefined range;
(g) a presence within a specified one of the selected plural
physical-support slots of a corresponding data storage means;
(h) a local removability from a specified one of the selected plural
physical-support slots of a corresponding data storage means; and
(i) an abnormal sound emanating from one of the selected plural data
storage means.
2. The system of claim 1 wherein the data exchange bus is a SCSI (Small
Computer System Interface) bus.
3. The system of claim 2 wherein the plural data storage means define a
RAID system.
4. The system of claim 2 wherein each of the plural data storage means and
the status defining means has a unique SCSI device identification number.
5. The system of claim 2 wherein a SCSI-to-host adaptor device is further
coupled to one terminal end of the SCSI bus and wherein the status
defining means is coupled to an opposed second terminal end of the SCSI
bus.
6. The system of claim 5 wherein the SCSI-to-host adaptor device and the
status defining means cooperate to test the data path integrity of the
SCSI bus portions disposed between them.
7. A redundant data storage system according to claim 1 further comprising
a support cage having a plurality of said physical-support slots each for
supporting a corresponding one of the plural data storage means, the
support cage further supporting the status defining means, wherein the
plural data storage means are each modularly removable from the support
cage.
8. A redundant data storage system according to claim 7 wherein each of the
plural data storage means is modularly removable from the support cage on
a hot-pluggable basis.
9. A redundant data storage system according to claim 7 wherein the
combination of the support cage, the plural data storage means, and the
status defining means, is sized to slip into a standard 5 1/4 inch form
factor, full-height drive bay of an IBM-PC™ compatible computer.
10. The system of claim 7 further comprising:
housing means for securely enclosing the supporting cage and plural data
storage units and the status defining means, said housing means having one
or more access means by which physical access may be obtained to the
components securely enclosed in the housing means;
wherein the status defining means includes means for monitoring the one or
more access means and for determining whether physical access is
immediately obtainable to one or more components enclosed in the housing
means by way of the one or more access means.
11. The system of claim 10 wherein the one or more access means each
includes locking means for preventing immediate physical access to a
corresponding one or more components enclosed in the housing means; and
wherein the status defining means includes means for selectively switching
the locking means between locked and unlocked states.
12. The system of claim 1 further comprising a plurality of redundant power
supplies for supplying continuous power to the plural data storage units
and to the status defining means even in the event where one of the
redundant power supplies fails;
wherein the status defining means includes means for detecting and
reporting degradation in the voltage or current supplying capabilities of
one or more of said plurality of redundant power supplies.
13. The system of claim 1 further comprising a plurality of redundant fans
operatively coupled to each of the plural data storage means for
redundantly providing a flow of cooling air at a desired volumetric
flowrate to each of the plural data storage means even in the event that
one of the redundant cooling fans fails;
wherein the status defining means includes means for detecting and
reporting degradation in the flowrate providing capabilities of one or
more of said plurality of redundant fans.
14. The system of claim 1 further comprising:
a first supporting cage for transportably housing two or more of said data
storage means, the first supporting cage having connectors removably
connected to the data exchange bus so that the two or more data storage
means housed within the first supporting cage can be disconnected from the
data exchange bus and transported away while housed in the first
supporting cage;
wherein the status defining means is also removably connected to the data
exchange bus so that the status defining means can be disconnected from
the data exchange bus and transported away together with the first
supporting cage; and
wherein the status defining means includes information storage means for
storing information about the two or more data storage means housed within
the first supporting cage.
15. The system of claim 14 wherein the stored information defines one or
more of: (a) a usage history describing past usage of the data storage
means housed within the first supporting cage; (b) an error history
describing past operating errors experienced by the data storage means
housed within the first supporting cage; and (c) a repair history
describing past repair operations performed on the data storage means
housed within the first supporting cage.
16. The system of claim 14 further comprising a second support cage for
transportably housing one or more additional ones of said data storage
means, the second supporting cage having connectors removably connected to
the data exchange bus so that the one or more additional data storage
means housed within the second supporting cage can be disconnected from
the data exchange bus and transported away while housed in the second
supporting cage;
wherein the status defining means is adapted for being transported away
from the data exchange bus together with the second supporting cage; and
wherein information storage means of the status defining means includes
means for storing additional information about the one or more additional
data storage means housed within the second supporting cage.
17. The system of claim 16 wherein the status defining means is physically
joined to the first supporting cage and the stored information defines one
or more of:
(a) a usage history describing past usage of the additional data storage
means housed within the second supporting cage;
(b) an error history describing past operating errors experienced by the
data storage means housed within the second supporting cage; and
(c) a repair history describing past repair operations performed on the
data storage means housed within the second supporting cage.
18. A cluster of SCSI modules coupled to one another by a SCSI bus, each
SCSI module having a respective local power status defined by a local
power voltage level and a local power current level delivered to internal
circuitry within the module, wherein at least one of the SCSI modules has
no means for directly reporting to the SCSI bus, the status of local power
delivered to internal circuitry of the at least one SCSI module and
wherein a second of the SCSI modules includes:
status monitoring and reporting means, operatively coupled to the at least
one SCSI module, for monitoring and reporting to the SCSI bus, at least
one of the local power voltage level and the local power current level
being delivered to internal circuitry of the at least one SCSI module.
19. The SCSI cluster of claim 18 wherein:
each SCSI module has a respective local temperature status defined by at
least one temperature level developed within the internal circuitry of the
SCSI module:
said at least one of the SCSI modules has no means for directly reporting
to the SCSI bus, the local temperature status of the internal circuitry of
the at least one SCSI module; and
said status monitoring and reporting means is further for monitoring and
reporting to the SCSI bus, the local temperature status of the at least
one SCSI module.
20. The SCSI cluster of claim 18 wherein:
each SCSI module can be caused to be manually removable from said cluster
and the manual removability of each SCSI module is defined by a local
locking means and;
said status monitoring and reporting means is further for monitoring and
reporting to the SCSI bus, the manual removability status of the at least
one SCSI module.
21. The SCSI cluster of claim 18 wherein the SCSI bus is further coupled to
an externally controllable SCSI module and wherein the status monitoring
and reporting means includes:
SCSI bus integrity testing means for testing, in cooperation with the
externally controllable SCSI module, the integrity of the SCSI data path
between the externally controllable SCSI module and the second SCSI
module.
22. The SCSI cluster of claim 21 wherein the externally controllable SCSI
module and the second SCSI module are positioned at opposed operative ends
of the SCSI bus.
23. The SCSI cluster of claim 18 wherein the at least one of the SCSI
modules is a magnetic disk drive.
24. The SCSI cluster of claim 18 wherein the at least one of the SCSI
modules is part of a RAID bank.
25. The SCSI cluster of claim 18 wherein the at least one of the SCSI
modules is a tape drive.
26. The SCSI cluster of claim 1 wherein the status monitoring and reporting
means comprises:
SCSI interface means, coupled to the SCSI bus, for managing SCSI bus
phases;
status monitoring interface circuitry operatively coupled to monitor the
status of local power delivered to internal circuitry of the at least one
SCSI module; and
a microcontroller, coupled to the SCSI interface means and to the status
monitoring interface circuitry, for receiving non-SCSI power status
reports from the interface circuitry and for layering the power status
reports into a data transfer phase block to be used in a SCSI SEND or
RECEIVE operation, and for causing the SCSI interface means to include the
data transfer phase block having said status report layered therein,
within the data transfer phase of a corresponding SCSI SEND or RECEIVE
operation.
27. The SCSI cluster of claim 9 wherein the microcontroller is responsive
to a predefined opcode layered into a command data block (CDB) portion of
a received SCSI RECEIVE communication, the opcode asking the
microcontroller to report the status of a power-related condition defined
by the opcode or parameters attached to the opcode, and the
microcontroller transferring the requested status into a corresponding
data transfer phase block to-be included in the data return phase of said
SCSI RECEIVE communication, and sending said data transfer phase block to
the SCSI interface means for inclusion in the data return phase of said
SCSI RECEIVE communication.
28. The SCSI cluster of claim 18 wherein a variable power supply delivers
power to the at least one SCSI module and wherein the second of the SCSI
modules further comprises:
power control means, operatively coupled to the variable power supply of
the at least one SCSI module and responsive to commands received over the
SCSI bus, for controlling the level of power delivered to internal
circuitry of the at least one SCSI module.
29. The SCSI cluster of claim 19 wherein a first variable speed fan
supplies a flow of cooling air to the at least one SCSI module and wherein
the second of the SCSI modules further comprises:
fan control means, operatively coupled to the first variable speed fan and
responsive to commands received over the SCSI bus, for varying the speed
level of said first variable speed fan.
30. The SCSI cluster of claim 20 wherein said local locking means is
electrically controllable and wherein the second of the SCSI modules
further comprises:
lock control means, operatively coupled to the local locking means and
responsive to commands received over the SCSI bus, for automatically
locking and unlocking said local locking means.
31. A redundant data storage system comprising:
a data exchange bus for connection to an external host controller;
a plurality of data storage means removably supported in a plurality of
physical-support slots, said plurality of data storage means being for
redundantly storing a body of data, each storage means being operatively
coupled to the data exchange bus; and
status defining means also operatively coupled to the data exchange bus,
the status defining means including programmable control means that is
programmable by way of instructions downloaded from the external host
controller through the data exchange bus, said downloaded instructions
including instructions for causing the status defining means to test the
integrity of the data exchange bus.
32. A redundant data storage system according to claim 31 wherein:
the status defining means is further operatively coupled to a selected one
or more of the plural data storage means and the physical-support slots
for sensing a local status of the selected one or more of the plural data
storage means and the physical-support slots;
the status defining means is further for reporting the sensed local status
to the data exchange bus in accordance with the said downloaded
instructions; and
said local status includes one or more parameters selected from the group
consisting of:
(a) a local power current level inside a specified one of the selected
plural data storage means;
(b) an amount of deviation from a prespecified nominal value for a local
power current level inside a specified one of the selected plural data
storage means;
(c) a temperature level inside a specified one of the selected plural data
storage means;
(d) a temperature condition inside a specified one of the selected plural
data storage means that is outside of a predefined range;
(e) a presence within a specified one of the selected plural
physical-support slots of a corresponding data storage means;
(f) a local removability from a specified one of the selected plural
physical-support slots of a corresponding data storage means; and
(g) an abnormal sound emanating from one of the selected plural data
storage means.
33. A status monitoring and reporting system for use in conjunction with a
SCSI-based array of plural data storage units, the system comprising:
status defining means for monitoring two or more operational attributes of
the plural data storage devices, at least two of the monitored attributes
being selected from the group consisting of:
(a) a local voltage or current condition of each data storage device,
(b) the amount of accumulated active usage time of each data storage
device,
(c) the amount of free storage space available in each data storage device,
(d) the historical error rate of each data storage device,
(e) the volume of data access requests made to each data storage device,
(f) the air flowrate output of one or more cooling fans provided for
cooling each data storage device,
(g) the local temperature of each data storage device, and
(h) the closed/open, locked/unlocked states of one or more access doors
providing physical access to each data storage device; and the system
further comprising:
SCSI interface means, coupled between the status defining means and the
SCSI bus, for transferring status information from the status defining
unit to the SCSI bus, the transferred status information indicating the
state of a monitored one or more of said attributes.
34. A status control system for use in conjunction with an array of
SCSI-based data storage units, the status control system comprising:
a status control unit for controlling two or more operational attributes of
the array of data storage units, at least two of the controlled attributes
being selected from the group consisting of:
(a) the voltage or current of one or more power supplies provided for
supplying power to each data storage device,
(b) the cooling rate of one or more temperature control units provided for
regulating the temperature of each data storage device,
(c) the locked/unlocked state of one or more lockable access doors
providing physical access to each data storage device;
and the system further comprising:
SCSI interface means, coupled between the status control means and the SCSI
bus, for receiving status control commands from the SCSI bus and
transferring the control commands to the status control unit for
execution, the transferred control commands indicating a desired state for
a controllable one or more of said attributes.
35. A method of monitoring and controlling a cluster of data storage
modules interconnected by a data exchange bus wherein operations of the
cluster are supported by power maintenance and other maintenance
subsystems, said method comprising the steps of:
(a) attaching a status defining means to the data exchange bus;
(b) operatively coupling the status defining means to the power maintenance
and other environment maintenance subsystems of the cluster; and
(c) operating the status defining means so that the status defining means
provides one or more of the following functions:
(c.1) providing on-site reports via an on-site indicator means of cluster
status and cluster problems to an on-site observer by way of a frontpanel
messaging module;
(c.2) providing off-site reports via the data exchange bus of cluster
status and cluster problems to a remote system supervisor;
(c.3) testing the data path integrity of the data exchange bus;
(c.4) storing retrievable data providing error history, repair history, and
usage history information about a portable one or more of the cluster of
data storage modules with which the status defining means is associated;
(c.5) supporting inventory/asset management functions in a large network
containing the cluster of data storage modules;
(c.6) monitoring traffic patterns of communications to or from members of
the cluster;
(c.7) switching a configuration of the cluster in response to a sensed
degradation event within the cluster;
(c.8) monitoring and managing background environmental aspects of cluster
operation such as maintaining appropriate temperatures within the cluster,
maintaining predefined power levels within the cluster, and assuring
physical security of cluster members.
36. The SCSI cluster of claim 1 wherein said at least one of the SCSI
modules comprises at least three substantially similar SCSI modules.
37. The SCSI cluster of claim 36 wherein said at least three substantially
similar SCSI modules define a RAID bank.
38. The SCSI cluster of claim 1 wherein said at least one of the SCSI
modules comprises six substantially similar SCSI modules.
39. The SCSI cluster of claim 38 wherein said at least six substantially
similar SCSI modules defines two RAID banks.
40. The SCSI cluster of claim 4 wherein said status monitoring and
reporting means includes nonvolatile writable memory means for storing
test instructions downloaded through said SCSI bus prior to said integrity
testing of the SCSI bus.
41. A data storage and retrieval system comprising:
(a) a host computer including a host-to-SCSI adaptor module for coupling
the host computer to a plurality of independent SCSI buses, each of the
SCSI buses being capable of operatively coupling together a limited,
respective number of SCSI modules at one time, the host-to-SCSI adaptor
module defining a first such SCSI module on each of said plurality of
independent SCSI buses;
(b) a plurality of storage array housing cabinets each operatively coupled
to at least one bus of said plurality of independent SCSI buses, wherein:
(b.1) each storage array housing cabinet houses a respective plurality of
data storage devices,
(b.2) each storage array housing cabinet further houses a respective two or
more modularly-replaceable redundant power supplies that are operatively
coupled to supply operating power to the respective data storage devices
of the cabinet,
(b.3) each storage array housing cabinet further houses a respective two or
more redundant cooling fans each fan being operatively coupled to provide
mutually independent cooling to the redundant power supplies and to the
data storage devices of the cabinet,
(b.4) each storage array housing cabinet further includes:
status defining means also operatively coupled to the respective SCSI bus,
the status defining means being further operatively coupled to a selected
one or more of the plural SCSI modules for sensing a local status of the
selected one or more of the SCSI modules, said local status being one that
cannot be otherwise determined by way of the SCSI bus and the status
defining means being further for reporting the sensed local status to the
data exchange bus;
wherein said local status includes one or more parameters selected from the
group consisting of:
(a) a local power voltage level inside a specified one of the selected
plural SCSI modules;
(b) a local power current level inside a specified one of the selected
plural SCSI modules;
(c) an amount of deviation from a prespecified nominal value for a local
power voltage level inside a specified one of the selected plural SCSI
modules;
(d) an amount of deviation from a prespecified nominal value for a local
power current level inside a specified one of the selected plural SCSI
modules;
(e) a temperature level inside a specified one of the selected plural SCSI
modules; and
(f) a temperature condition inside a specified one of the selected plural
SCSI modules that is outside of a predefined range. |
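Claims 26 and 27 describe a microcontroller that answers an opcode carried
in the command data block (CDB) by layering the requested power status into
the data transfer phase block of a SCSI RECEIVE operation. The Python sketch
below illustrates that request/response layering only. The opcode values,
the two-byte query layout, the five-byte report layout and the names
build_status_block and parse_status_block are all assumptions made for
illustration; the claims specify none of them, and no real SCSI CDB format
is modeled here.

```python
import struct

# Every numeric value below is an illustrative assumption, not patent text.
QUERY_VOLTAGE_MV = 0x01  # hypothetical opcode: report local voltage in millivolts
QUERY_CURRENT_MA = 0x02  # hypothetical opcode: report local current in milliamps

STATUS_OK = 0x00
STATUS_UNKNOWN = 0xFF    # hypothetical marker for "no such reading"


def build_status_block(query: bytes, readings: dict) -> bytes:
    """Pack the requested reading into a 5-byte data-transfer-phase block.

    query    -- 2 bytes (opcode, target module number); a made-up layout
                standing in for the opcode and parameters carried in the CDB
    readings -- maps (opcode, module) to an integer reading in 0..65535
    """
    opcode, module = struct.unpack(">BB", query)
    value = readings.get((opcode, module))
    if value is None:
        return struct.pack(">BBBH", opcode, module, STATUS_UNKNOWN, 0)
    return struct.pack(">BBBH", opcode, module, STATUS_OK, value)


def parse_status_block(block: bytes) -> dict:
    """Host-side decoding of the block produced by build_status_block."""
    opcode, module, status, value = struct.unpack(">BBBH", block)
    return {"opcode": opcode, "module": module,
            "ok": status == STATUS_OK, "value": value}
```

In this sketch a host would send bytes([QUERY_VOLTAGE_MV, 3]) to ask for the
voltage of module 3 and would decode the reply with parse_status_block. An
unknown reading comes back flagged as not ok rather than raising an error,
which keeps the reply the same length in every case.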
Description  |
BACKGROUND
1. Field of the Invention
The invention relates generally to redundant arrays of data storage
devices. The invention relates more specifically to a RAID system that
connects to a host computer by way of a SCSI interface and a
diagnostics/control module that also connects to the SCSI interface.
2a. Cross Reference to Related Applications
The following copending U.S. patent application is assigned to the assignee
of the present application, is related to the present application and its
disclosure is incorporated herein by reference:
(A) Ser. No. 08/124,276 filed Sep. 20, 1993 by Larry Kibler et al. and
entitled FULL-HEIGHT DISK DRIVE SUPPORT STRUCTURE.
2b. Cross Reference to Related Patents
The following U.S. patent is related to the present application and its
disclosure is incorporated herein by reference:
(A) U.S. Pat. No. 5,148,432 issued to Gordon et al. and entitled ARRAYED
DISK DRIVE SYSTEM AND METHOD.
3. Description of the Related Art
The use of RAID data storage systems (Redundant Array of Inexpensive
Disk-drives) is becoming increasingly popular due to economic and
technical reasons.
Data storage strategies are being shifted away from having one large
mainframe computer coupled to an array of a few, large disk units or a
few, bulk tape units, and are instead being shifted in favor of having
many desktop or mini- or micro-computers intercoupled by a network to one
another and to many small, inexpensive and modularly interchangeable data
storage devices (e.g., to an array of small, inexpensive, magnetic storage
disk drives). One of the reasons behind this trend is a desire in the
industry to maintain at least partial system functionality even in the
event of a failure in a particular system component. If one of the
numerous mini/micro-computers fails, the others can continue to function.
If one of the numerous data storage devices fails, the others can continue
to provide data access. Also, increases in data storage capacity can be
economically provided in small increments as the need for increased
capacity develops.
A common configuration includes a so-called "client/server computer"
sandwiched between a local area network (LAN) and a RAID data storage
system. Remote users (clients) send requests for read and/or write access
to data files contained in the RAID system over the network (LAN). The
client/server computer services each request on a time shared basis.
As the client/server computer performs its client servicing tasks, the
client/server computer is burdened at the same time with the overhead of
attending to mundane tasks such as monitoring the operational status of
each disk drive in the RAID system and taking corrective action, or at
least issuing an alarm, when a problem develops.
A difficulty develops when the request-servicing bandwidth and/or storage
capacity of such a RAID-based client/server system needs to be scaled
upwardly. If the number of network users (clients) or request-load per
user increases, the request-servicing burden that is placed on the
client/server computer tends to increase correspondingly. At some point,
the client/server computer bumps against the limits of its data processing
speed and system responsiveness suffers.
System responsiveness is disadvantageously degraded by the burden that
status monitoring overhead places on the client/server computer. In other
words, the status monitoring overhead disadvantageously reduces the
ability of the client/server computer to more quickly respond to the
ever-growing number of service requests that it receives from the network.
In addition, the status-monitoring overhead burden disadvantageously grows
as more data storage drives are added to the RAID system. And accordingly,
even though the addition of more data storage drives beneficially
increases the system's storage capacity, it also tends to degrade system
response speed.
The status monitoring function of the client/server computer is typically
supported by customized hardware that is added to an expandable bus of the
client/server computer. In one configuration, a serial and/or parallel I/O
board is inserted into one of the expansion slots of the client/server
computer and site-customized cables are routed from this I/O board to
status sensors that are mounted on or in various components of the disk
array. Monitoring software is loaded into the client/server computer to
drive the I/O board, to query the various sensors and to receive status
reports back from them. Such an arrangement is disadvantageous in that an
expansion slot of the client/server computer is consumed for carrying out
the disk-array monitoring function. It is also disadvantageous because of
the customized nature of the sensor cables extending from the I/O board.
Each RAID server tends to have its own unique configuration. A network
having many such uniquely-configured servers is difficult to maintain.
Increasingly, there is a need within the industry for arranging the
client/server computer as an off-the-shelf commodity item that can be
quickly and inexpensively replaced in case of failure. There is a long-felt
desire in the industry to avoid customized routings of cables between
a stand-alone computer and peripheral sensors. There is a need in the
industry for disk drive arrays or other data storage arrays that can be
quickly and efficiently serviced in the event of a failure. There is a
growing desire in the industry to be able to control all operations of a
networked RAID system from a remote control console without adversely
affecting normal operations of the network.
SUMMARY OF THE INVENTION
The invention helps to attain the above-mentioned objectives by providing a
SCSI-coupled module for monitoring and for controlling a SCSI-coupled
cluster of devices such as a SCSI-coupled RAID bank.
A structure in accordance with the invention comprises: a cluster of SCSI
modules coupled to one another by a SCSI bus, wherein at least one of the
SCSI modules has no means for directly reporting to the SCSI bus, the
status of power delivered to internal circuitry of the at least one SCSI
module or the status of other conditions (e.g., temperature, open door)
affecting the operability or security of the at least one SCSI module and
wherein a second of the SCSI modules includes status monitoring, reporting
and control means for monitoring and directly reporting to the SCSI bus,
the status of power delivered to internal circuitry of the at least one
SCSI module or the status of other conditions (e.g., temperature)
affecting the operability and/or security of the at least one SCSI module.
The status monitoring, reporting and control means is optionally provided
with control functions so that it can actively control the power delivered
to internal circuitry of the at least one SCSI module or the status of
other conditions (e.g., temperature, door lockings) affecting the
operability and/or security of the at least one SCSI module either in
response to commands received over the SCSI bus or on its own initiative.
A method in accordance with the invention comprises the steps of: (a)
attaching a status monitoring, reporting and control means to a SCSI bus
having a cluster of SCSI modules; (b) operatively coupling the status
monitoring, reporting and control means to a power maintenance and/or
other environment maintenance subsystems of the cluster; and (c) operating
the status monitoring, reporting and control means so that the status
monitoring, reporting and control means provides one or more of the
following functions: (c.1) providing on-site reports via an on-site
indicator means of cluster status and cluster problems to an on-site
observer (e.g., by creating appropriate indication patterns on a
frontpanel messaging module); (c.2) providing off-site reports via the
SCSI bus of cluster status and cluster problems to a remote system
supervisor; (c.3) testing the data path integrity of the SCSI bus; (c.4)
conveying error history, repair history, usage history and other
information about a portable cluster of SCSI modules to which the status
monitoring, reporting and control means is attached; (c.5) supporting
inventory/asset management functions in a large network containing the
SCSI cluster; (c.6) monitoring traffic patterns of SCSI communications to
or from members of the cluster; (c.7) switching a configuration of the
cluster in response to a sensed degradation event within the cluster;
(c.8) monitoring and managing background environmental aspects of cluster
operation such as maintaining appropriate temperatures within the cluster,
maintaining predefined power levels within the cluster, and assuring
system security.
These and other aspects of the invention will be described in more detail
below.
BRIEF DESCRIPTION OF THE DRAWINGS
The detailed description below makes reference to the accompanying
drawings, in which:
FIG. 1A is a generalized block diagram of a non-SCSI to SCSI status
transfer system in accordance with the invention;
FIG. 1B is a block diagram of a SCSI-based data access network system
(DANS) in accordance with the invention;
FIGS. 2A-2B show schematics of cabinet monitor and control (CMAC) boards in
accordance with the invention;
FIG. 3A shows a six drive configuration; and
FIG. 3B shows a bank of drive cabinets each holding eighteen drives.
DETAILED DESCRIPTION
Referring to FIG. 1A, there is first shown a generalized block diagram of a
non-SCSI to SCSI status transfer system in accordance with the invention.
Modules 10, 11, 12, . . . , 15 each include a Small Computer System
Interface (SCSI) for enabling SCSI-based data exchange between these
modules 10, 11, 12, . . . , 15 in accordance with well known industry
standards. Although only four such SCSI modules are shown, it is to be
understood that the SCSI data exchange network (or SCSI "channel") can
have as many as eight such modules and that each module has a unique SCSI
identification number (ID#0 through ID#7). Each module can have within it,
as many as 8 uniquely-addressable, SCSI logical units. Thus the SCSI
channel can support as many as 64 uniquely-addressable, SCSI logical
units.
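The addressing arithmetic above (8 SCSI IDs, each with up to 8 logical
units, for 64 addressable units per channel) can be illustrated with a
minimal sketch. The constant and function names are hypothetical and are
not part of the described system.

```python
# Illustrative only: enumerate the (SCSI ID, logical unit) pairs on one channel.
MAX_SCSI_IDS = 8      # ID#0 through ID#7, per the text
MAX_LUNS_PER_ID = 8   # up to 8 logical units per module, per the text


def channel_addresses():
    """Return every (scsi_id, lun) pair addressable on a single SCSI channel."""
    return [(scsi_id, lun)
            for scsi_id in range(MAX_SCSI_IDS)
            for lun in range(MAX_LUNS_PER_ID)]


addresses = channel_addresses()
print(len(addresses))  # 64
```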
In the illustrated example, module 10 is assigned SCSI ID#0, module 11 is
assigned SCSI ID#1, module 12 is assigned SCSI ID#2, and module 15 is
assigned SCSI ID#7. Four additional SCSI modules (not shown) can be
inserted between modules 12 and 15 and assigned respective SCSI ID's #3 to
#6.
SCSI cables 31-35 interconnect corresponding SCSI modules 10-15 in daisy
chain fashion according to well known industry practice. Modules 11-15 are
spaced relatively close to one another (they are "clustered") while module
10 is located relatively far (roughly 1 to 25 feet away) from the other
modules 11-15. Because of this physical separation, a first
power/environment support unit 16 is used to supply electrical power and
provide other operational necessities (e.g., cooling) to the cluster of
modules 11-15 while a second power/environmental support unit 17 is used
to supply electrical power and provide other operational necessities
(e.g., cooling) to the out-of-cluster module 10. An electrical/mechanical
connection means 36 operatively couples the first power/environmental
support unit 16 to the clustered SCSI modules 11-15 while a separate,
second electrical/mechanical connection means 37 operatively couples the
second power/environmental support unit 17 to separated SCSI module 10.
Module 10 is connected to a system supervisor 2 by means of a communication
network 5. Communication between the system supervisor 2 and the remaining
cluster of modules 11-15 is substantially limited to that which can be
carried over the SCSI network (cables 31-35) to the first module 10, and
from there over the communication network 5 to the system supervisor 2.
SCSI modules 11 and 12 do not include means for reporting, by way of the
SCSI network, (1) the status of power delivered to their internal
circuitry (e.g., is it at nominal voltage and current, and if not what is
the amount of deviation?) or (2) the status of other environmental
conditions affecting their operability, such as temperature build-up, or
(3) the status of yet other environmental conditions affecting their
security, such as their physical removability or actual removal from the
cluster.
With regard to the mentioned report items, SCSI communications do not on
their own provide definitive answers. If a SCSI module is not responding
to SCSI commands, such nonresponsiveness does not specifically indicate
whether the cause is due to failure of the SCSI interface, or loss of
power, or overheating, or physical removal or disconnect of the module, or
some other reason. Because there is no status reporting means in modules
11 and 12, and SCSI communications do not provide definitive answers, the
system supervisor 2 has no way of learning about power or environmental
problems simply from communications carried out with SCSI modules 11 and
12 over SCSI bus 31-35.
To overcome this problem, a Status Monitoring And Reporting means 60 (SMARt
means 60) is provided within SCSI module 15 for monitoring the status of
the first power/environment support unit 16 and the status of nearby
modules 11-12, and even its own status, and for reporting the status of
these monitored devices to the system supervisor 2 by way of the SCSI
network 31-35. Sensors 21, 22, . . . , 25, 26 are attached to respective
units 11, 12, . . . , 15 and 16 for monitoring temperature, electrical
power levels and other aspects of cluster 11-15 that affect the
operability and/or security of SCSI cluster 11-15. Local sensor lines
51, 52, . . . , 55, 56 respectively connect sensors 21, 22, . . . , 25, 26
to the status monitoring and reporting means 60.
An appropriate intelligence means (e.g., a microcontroller or
microcomputer, not shown) is provided within the status monitoring and
reporting means (SMARt) 60 for causing it to periodically monitor the
status of temperature, electrical power levels and other aspects affecting
the operability and security of SCSI cluster 11-15 and to report
worrisome developments to the system supervisor 2 by way of the SCSI
network 31-35.
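One pass of such periodic monitoring might be sketched as follows. This is
only an illustration under stated assumptions: the Reading type, the
worrisome function and all numeric limits are hypothetical, since the
description gives no thresholds and does not specify the firmware of the
intelligence means.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    unit: int             # reference numeral of the monitored unit (e.g., 11, 12, 16)
    temperature_c: float  # sensed temperature in degrees Celsius
    supply_volts: float   # sensed supply voltage


# Hypothetical limits; the source gives no numeric thresholds.
MAX_TEMPERATURE_C = 50.0
NOMINAL_VOLTS = 5.0
VOLT_TOLERANCE = 0.25


def worrisome(readings):
    """Return (unit, message) pairs for readings outside the assumed limits."""
    reports = []
    for r in readings:
        if r.temperature_c > MAX_TEMPERATURE_C:
            reports.append((r.unit, "over-temperature"))
        deviation = r.supply_volts - NOMINAL_VOLTS
        if abs(deviation) > VOLT_TOLERANCE:
            reports.append((r.unit, f"power deviation {deviation:+.2f} V"))
    return reports


# One polling pass over two sensed units; only unit 12 is out of limits.
sample = [Reading(11, 35.0, 5.0), Reading(12, 61.5, 4.5)]
for unit, message in worrisome(sample):
    print(unit, message)
```

In a real module the returned reports would be queued for delivery to the
system supervisor over the SCSI network rather than printed.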
Note that the status monitoring and reporting (SMARt) means 60 is
preferably located in the SCSI module 15 that is most distal along the
SCSI chain of cables 31-35 from the communication network 5 and the system
supervisor 2. The intelligence means (e.g., a microcontroller or
microcomputer) within the status monitoring and reporting (SMARt) means 60
can be advantageously used to test the integrity of the data path between
the system supervisor 2 and end module 15, that data path including the
series of connections made by communication network 5, the SCSI chain of
cables 31-35, and the intervening modules 10-12. Appropriate test patterns
can be sent from the system supervisor 2 to test for shorts, opens,
stuck-at faults and so forth, in the chain of interconnects 5, 31-35. Such
techniques for verifying network integrity are well known in the art.
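As one illustration of such a test, the sketch below sends walking-ones and
walking-zeros patterns through a simulated data path and reports which bit
positions come back wrong. The description does not say which patterns are
used; the pattern choice, the 8-bit width and all names here are
assumptions.

```python
def walking_patterns(width=8):
    """Walking-ones followed by walking-zeros for a path `width` bits wide."""
    mask = (1 << width) - 1
    ones = [1 << bit for bit in range(width)]
    zeros = [mask ^ pattern for pattern in ones]
    return ones + zeros


def faulty_bits(echo, width=8):
    """Send each pattern through `echo`; return bit positions that differ."""
    bad = set()
    for sent in walking_patterns(width):
        diff = sent ^ echo(sent)
        bad.update(bit for bit in range(width) if (diff >> bit) & 1)
    return bad


# Simulated paths: a healthy one, and one with bit 3 stuck at 0.
def healthy(byte):
    return byte


def stuck_at_0(byte):
    return byte & ~(1 << 3) & 0xFF


print(faulty_bits(healthy))
print(faulty_bits(stuck_at_0))
```

Here `echo` stands in for the round trip from the system supervisor to the
end module and back; a stuck-at fault on one line shows up as a single
failing bit position.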
Communications between the status monitoring and reporting (SMARt) means 60
and the system supervisor 2 are carried out using a communications
protocol layered on top of the industry standard SCSI protocol. For
example, a first one or more bytes of data that is sent during the data
transfer phase of a SCSI SEND or RECEIVE operation defines an operation
code field (op code) recognizable to one or both of the SMARt means 60 and
the system supervisor 2. A following one or more bytes of data that is
sent during the data transfer phase of the SCSI SEND or RECEIVE operation
defines parameters of the op code. (The op codes and parameters can be
inserted in the CDB (command data block) of a SCSI RECEIVE or SEND
operation or in a subsequent one or more data blocks.)
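A minimal sketch of such a layered message follows, assuming a one-byte op
code followed by zero or more parameter bytes. The description allows "one
or more" bytes for each field and does not list actual op code values, so
the field widths, the constants and the function names below are all
hypothetical.

```python
# Hypothetical op code values; the source does not list actual codes.
OP_READ_TEMPERATURE = 0x01
OP_SET_LED = 0x02


def encode_message(op_code, params=b""):
    """Build a payload: one op code byte followed by the parameter bytes."""
    if not 0 <= op_code <= 0xFF:
        raise ValueError("op code must fit in one byte")
    return bytes([op_code]) + bytes(params)


def decode_message(payload):
    """Split a payload back into (op_code, params)."""
    if len(payload) < 1:
        raise ValueError("payload is missing the op code byte")
    return payload[0], payload[1:]


# Example: ask the module to switch LED number 3 on (parameter bytes 0x03, 0x01).
payload = encode_message(OP_SET_LED, b"\x03\x01")
op_code, params = decode_message(payload)
print(payload.hex(), op_code, params.hex())
```

The resulting bytes would be carried in the CDB or in the data phase of the
SCSI SEND or RECEIVE operation, as the paragraph above describes.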
More specifically, when the network control console 102 is the initiator of
a data exchange operation and wishes to receive information from the SMARt
means 60, it sends the corresponding op code and parameters to first
module 10 by way of communication network 5. The op code and parameters
sent by the network control console 102 are thereafter embedded by module
10 into the CDB (command data block) of a SCSI RECEIVE command which
module 10 sends to the status monitoring and reporting means 60 of module
15 by way of SCSI network cables 31-35. The SMARt means 60 analyzes the
embedded op code and parameters and responsively returns the desired data
during the data phase of the same SCSI RECEIVE operation. If the network
control console 102 wishes to ask the SMARt means 60 to perform a
particular operation (e.g., to turn on an LED, not shown, that is attached
to cluster 11-15), the network control console 102 sends the corresponding
op code and parameters to first module 10 by way of communication network
5. The op code and parameters sent by the network control console 102 are
thereafter embedded by module 10 into the CDB (command data block) and/or