WikiPatents - Community Patent Review
SCSI-coupled module for monitoring and controlling SCSI-coupled raid bank and bank environment    
United States Patent 5586250
Link to this page: http://www.wikipatents.com/5586250.html
Inventor(s): Carbonneau; Guy A. (Winter Springs, FL); Wu; Bernie (Longwood, FL); Jones; Tim (Deltona, FL)
Abstract: An intelligent status monitoring, reporting and control module is coupled to a SCSI bus that interconnects a cluster of SCSI-compatible data storage modules (e.g., magnetic disk drives). The status monitoring, reporting and control module is otherwise coupled to the cluster of SCSI-compatible data storage modules and to power maintenance and/or other maintenance subsystems of the cluster for monitoring and controlling states of the data storage modules and power maintenance and/or other maintenance subsystems that are not readily monitored or controlled directly by way of the SCSI bus. The status monitoring, reporting and control module sends status reports to a local or remote system supervisor and executes control commands supplied by the local or remote system supervisor. The status reports include reports about system temperature and power conditions. The executable commands include commands for regulating system temperature and power conditions.

Inventor     Carbonneau; Guy A. (Winter Springs, FL); Wu; Bernie (Longwood, FL); Jones; Tim (Deltona, FL)
Owner/Assignee     Conner Peripherals, Inc. (San Jose, CA)
Publication Date     December 17, 1996
Application Number     08/151,525
Filing Date     November 12, 1993
US Classification     714/44 714/6 714/48
Int'l Classification     G06F 011/00
Examiner     Beausoliel Jr.; Robert W.
Assistant Examiner     Fisch; Alan M.
Attorney/Law Firm     Fliesler, Dubb, Meyer & Lovejoy
USPTO Field of Search     395/575 395/182.03 395/182.12 395/182.04 395/183.17 395/183.2 395/182.05 395/185.01
References

U.S. References
Patent No.   Inventor   U.S. Class   Date
5367669      Holland    714/7        Nov. 1994
5337414      Hashemi    710/52       Aug. 1994
5313585      Jeffries   711/201      May 1994
5313626      Jones      714/5        May 1994
5148432      Gordon     714/7        Sep. 1992
3704363      Salmassy   714/704      Nov. 1972

Claims

What is claimed is:

1. A redundant data storage system comprising:

a data exchange bus;

a plurality of data storage means removably supported in a plurality of physical-support slots, said plurality of data storage means being for redundantly storing a body of data, each storage means being operatively coupled to the data exchange bus; and

status defining means also operatively coupled to the data exchange bus, the status defining means being further operatively coupled to a selected one or more of the plural data storage means and the physical-support slots for sensing a local status of the selected one or more of the plural data storage means and the physical-support slots, said local status being one that cannot be otherwise determined by way of the data exchange bus and the status defining means being further for reporting the sensed local status to the data exchange bus;

wherein said local status includes one or more parameters selected from the group consisting of:

(a) a local power voltage level inside a specified one of the selected plural data storage means;

(b) a local power current level inside a specified one of the selected plural data storage means;

(c) an amount of deviation from a prespecified nominal value for a local power voltage level inside a specified one of the selected plural data storage means;

(d) an amount of deviation from a prespecified nominal value for a local power current level inside a specified one of the selected plural data storage means;

(e) a temperature level inside a specified one of the selected plural data storage means;

(f) a temperature condition inside a specified one of the selected plural data storage means that is outside of a predefined range;

(g) a presence within a specified one of the selected plural physical-support slots of a corresponding data storage means;

(h) a local removability from a specified one of the selected plural physical-support slots of a corresponding data storage means; and

(i) an abnormal sound emanating from one of the selected plural data storage means.

2. The system of claim 1 wherein the data exchange bus is a SCSI (Small Computer System Interface) bus.

3. The system of claim 2 wherein the plural data storage means define a RAID system.

4. The system of claim 2 wherein each of the plural data storage means and the status defining means has a unique SCSI device identification number.

5. The system of claim 2 wherein a SCSI-to-host adaptor device is further coupled to one terminal end of the SCSI bus and wherein the status defining means is coupled to an opposed second terminal end of the SCSI bus.

6. The system of claim 5 wherein the SCSI-to-host adaptor device and the status defining means cooperate to test the data path integrity of the SCSI bus portions disposed between them.

7. A redundant data storage system according to claim 1 further comprising a support cage having a plurality of said physical-support slots each for supporting a corresponding one of the plural data storage means, the support cage further supporting the status defining means, wherein the plural data storage means are each modularly removable from the support cage.

8. A redundant data storage system according to claim 7 wherein each of the plural data storage means is modularly removable from the support cage on a hot-pluggable basis.

9. A redundant data storage system according to claim 7 wherein the combination of the support cage, the plural data storage means, and the status defining means, is sized to slip into a standard 5 1/4 inch form factor, full-height drive bay of an IBM-PC™ compatible computer.

10. The system of claim 7 further comprising:

housing means for securely enclosing the supporting cage and plural data storage units and the status defining means, said housing means having one or more access means by which physical access may be obtained to the components securely enclosed in the housing means;

wherein the status defining means includes means for monitoring the one or more access means and for determining whether physical access is immediately obtainable to one or more components enclosed in the housing means by way of the one or more access means.

11. The system of claim 10 wherein the one or more access means each includes locking means for preventing immediate physical access to a corresponding one or more components enclosed in the housing means; and

wherein the status defining means includes means for selectively switching the locking means between locked and unlocked states.

12. The system of claim 1 further comprising a plurality of redundant power supplies for supplying continuous power to the plural data storage units and to the status defining means even in the event where one of the redundant power supplies fails;

wherein the status defining means includes means for detecting and reporting degradation in the voltage or current supplying capabilities of one or more of said plurality of redundant power supplies.

13. The system of claim 1 further comprising a plurality of redundant fans operatively coupled to each of the plural data storage means for redundantly providing a flow of cooling air at a desired volumetric flowrate to each of the plural data storage means even in the event that one of the redundant cooling fans fails;

wherein the status defining means includes means for detecting and reporting degradation in the flowrate providing capabilities of one or more of said plurality of redundant fans.

14. The system of claim 1 further comprising:

a first supporting cage for transportably housing two or more of said data storage means, the first supporting cage having connectors removably connected to the data exchange bus so that the two or more data storage means housed within the first supporting cage can be disconnected from the data exchange bus and transported away while housed in the first supporting cage;

wherein the status defining means is also removably connected to the data exchange bus so that the status defining means can be disconnected from the data exchange bus and transported away together with the first supporting cage; and

wherein the status defining means includes information storage means for storing information about the two or more data storage means housed within the first supporting cage.

15. The system of claim 14 wherein the stored information defines one or more of: (a) a usage history describing past usage of the data storage means housed within the first supporting cage; (b) an error history describing past operating errors experienced by the data storage means housed within the first supporting cage; and (c) a repair history describing past repair operations performed on the data storage means housed within the first supporting cage.

16. The system of claim 14 further comprising a second support cage for transportably housing one or more additional ones of said data storage means, the second supporting cage having connectors removably connected to the data exchange bus so that the one or more additional data storage means housed within the second supporting cage can be disconnected from the data exchange bus and transported away while housed in the second supporting cage;

wherein the status defining means is adapted for being transported away from the data exchange bus together with the second supporting cage; and

wherein information storage means of the status defining means includes means for storing additional information about the one or more additional data storage means housed within the second supporting cage.

17. The system of claim 16 wherein the status defining means is physically joined to the first supporting cage and the stored information defines one or more of:

(a) a usage history describing past usage of the additional data storage means housed within the second supporting cage;

(b) an error history describing past operating errors experienced by the data storage means housed within the second supporting cage; and

(c) a repair history describing past repair operations performed on the data storage means housed within the second supporting cage.

18. A cluster of SCSI modules coupled to one another by a SCSI bus, each SCSI module having a respective local power status defined by a local power voltage level and a local power current level delivered to internal circuitry within the module, wherein at least one of the SCSI modules has no means for directly reporting to the SCSI bus, the status of local power delivered to internal circuitry of the at least one SCSI module and wherein a second of the SCSI modules includes:

status monitoring and reporting means, operatively coupled to the at least one SCSI module, for monitoring and reporting to the SCSI bus, at least one of the local power voltage level and the local power current level being delivered to internal circuitry of the at least one SCSI module.

19. The SCSI cluster of claim 18 wherein:

each SCSI module has a respective local temperature status defined by at least one temperature level developed within the internal circuitry of the SCSI module:

said at least one of the SCSI modules has no means for directly reporting to the SCSI bus, the local temperature status of the internal circuitry of the at least one SCSI module; and

said status monitoring and reporting means is further for monitoring and reporting to the SCSI bus, the local temperature status of the at least one SCSI module.

20. The SCSI cluster of claim 18 wherein:

each SCSI module can be caused to be manually removable from said cluster and the manual removability of each SCSI module is defined by a local locking means and;

said status monitoring and reporting means is further for monitoring and reporting to the SCSI bus, the manual removability status of the at least one SCSI module.

21. The SCSI cluster of claim 18 wherein the SCSI bus is further coupled to an externally controllable SCSI module and wherein the status monitoring and reporting means includes:

SCSI bus integrity testing means for testing, in cooperation with the externally controllable SCSI module, the integrity of the SCSI data path between the externally controllable SCSI module and the second SCSI module.

22. The SCSI cluster of claim 21 wherein the externally controllable SCSI module and the second SCSI module are positioned at opposed operative ends of the SCSI bus.

23. The SCSI cluster of claim 18 wherein the at least one of the SCSI modules is a magnetic disk drive.

24. The SCSI cluster of claim 18 wherein the at least one of the SCSI modules is part of a RAID bank.

25. The SCSI cluster of claim 18 wherein the at least one of the SCSI modules is a tape drive.

26. The SCSI cluster of claim 1 wherein the status monitoring and reporting means comprises:

SCSI interface means, coupled to the SCSI bus, for managing SCSI bus phases;

status monitoring interface circuitry operatively coupled to monitor the status of local power delivered to internal circuitry of the at least one SCSI module; and

a microcontroller, coupled to the SCSI interface means and to the status monitoring interface circuitry, for receiving non-SCSI power status reports from the interface circuitry and for layering the power status reports into a data transfer phase block to be used in a SCSI SEND or RECEIVE operation, and for causing the SCSI interface means to include the data transfer phase block having said status report layered therein, within the data transfer phase of a corresponding SCSI SEND or RECEIVE operation.

27. The SCSI cluster of claim 9 wherein the microcontroller is responsive to a predefined opcode layered into a command data block (CDB) portion of a received SCSI RECEIVE communication, the opcode asking the microcontroller to report the status of a power-related condition defined by the opcode or parameters attached to the opcode, and the microcontroller transferring the requested status into a corresponding data transfer phase block to-be included in the data return phase of said SCSI RECEIVE communication, and sending said data transfer phase block to the SCSI interface means for inclusion in the data return phase of said SCSI RECEIVE communication.

28. The SCSI cluster of claim 18 wherein a variable power supply delivers power to the at least one SCSI module and wherein the second of the SCSI modules further comprises:

power control means, operatively coupled to the variable power supply of the at least one SCSI module and responsive to commands received over the SCSI bus, for controlling the level of power delivered to internal circuitry of the at least one SCSI module.

29. The SCSI cluster of claim 19 wherein a first variable speed fan supplies a flow of cooling air to the at least one SCSI module and wherein the second of the SCSI modules further comprises:

fan control means, operatively coupled to the first variable speed fan and responsive to commands received over the SCSI bus, for varying the speed level of said first variable speed fan.

30. The SCSI cluster of claim 20 wherein said local locking means is electrically controllable and wherein the second of the SCSI modules further comprises:

lock control means, operatively coupled to the local locking means and responsive to commands received over the SCSI bus, for automatically locking and unlocking said local locking means.

31. A redundant data storage system comprising:

a data exchange bus for connection to an external host controller;

a plurality of data storage means removably supported in a plurality of physical-support slots, said plurality of data storage means being for redundantly storing a body of data, each storage means being operatively coupled to the data exchange bus; and

status defining means also operatively coupled to the data exchange bus, the status defining means including programmable control means that is programmable by way of instructions downloaded from the external host controller through the data exchange bus, said downloaded instructions including instructions for causing the status defining means to test the integrity of the data exchange bus.

32. A redundant data storage system according to claim 31 wherein:

the status defining means is further operatively coupled to a selected one or more of the plural data storage means and the physical-support slots for sensing a local status of the selected one or more of the plural data storage means and the physical-support slots;

the status defining means is further for reporting the sensed local status to the data exchange bus in accordance with the said downloaded instructions; and

said local status includes one or more parameters selected from the group consisting of:

(a) a local power current level inside a specified one of the selected plural data storage means;

(b) an amount of deviation from a prespecified nominal value for a local power current level inside a specified one of the selected plural data storage means;

(c) a temperature level inside a specified one of the selected plural data storage means;

(d) a temperature condition inside a specified one of the selected plural data storage means that is outside of a predefined range;

(e) a presence within a specified one of the selected plural physical-support slots of a corresponding data storage means;

(f) a local removability from a specified one of the selected plural physical-support slots of a corresponding data storage means; and

(g) an abnormal sound emanating from one of the selected plural data storage means.

33. A status monitoring and reporting system for use in conjunction with a SCSI-based array of plural data storage units, the system comprising:

status defining means for monitoring two or more operational attributes of the plural data storage devices, at least two of the monitored attributes being selected from the group consisting of:

(a) a local voltage or current condition of each data storage device,

(b) the amount of accumulated active usage time of each data storage device,

(c) the amount of free storage space available in each data storage device,

(d) the historical error rate of each data storage device,

(e) the volume of data access requests made to each data storage device,

(f) the air flowrate output of one or more cooling fans provided for cooling each data storage device,

(g) the local temperature of each data storage device, and

(h) the closed/open, locked/unlocked states of one or more access doors providing physical access to each data storage device; and the system further comprising:

SCSI interface means, coupled between the status defining means and the SCSI bus, for transferring status information from the status defining unit to the SCSI bus, the transferred status information indicating the state of a monitored one or more of said attributes.

34. A status control system for use in conjunction with an array of SCSI-based data storage units, the status control system comprising:

a status control unit for controlling two or more operational attributes of the array of data storage units, at least two of the controlled attributes being selected from the group consisting of:

(a) the voltage or current of one or more power supplies provided for supplying power to each data storage device,

(b) the cooling rate of one or more temperature control units provided for regulating the temperature of each data storage device,

(c) the locked/unlocked state of one or more lockable access doors providing physical access to each data storage device;

and the system further comprising:

SCSI interface means, coupled between the status control means and the SCSI bus, for receiving status control commands from the SCSI bus and transferring the control commands to the status control unit for execution, the transferred control commands indicating a desired state for a controllable one or more of said attributes.

35. A method of monitoring and controlling a cluster of data storage modules interconnected by a data exchange bus wherein operations of the cluster are supported by power maintenance and other maintenance subsystems, said method comprising the steps of:

(a) attaching a status defining means to the data exchange bus;

(b) operatively coupling the status defining means to the power maintenance and other environment maintenance subsystems of the cluster; and

(c) operating the status defining means so that the status defining means provides one or more of the following functions:

(c.1) providing on-site reports via an on-site indicator means of cluster status and cluster problems to an on-site observer by way of a frontpanel messaging module;

(c.2) providing off-site reports via the data exchange bus of cluster status and cluster problems to a remote system supervisor;

(c.3) testing the data path integrity of the data exchange bus;

(c.4) storing retrievable data providing error history, repair history, and usage history information about a portable one or more of the cluster of data storage modules with which the status defining means is associated;

(c.5) supporting inventory/asset management functions in a large network containing the cluster of data storage modules;

(c.6) monitoring traffic patterns of communications to or from members of the cluster;

(c.7) switching a configuration of the cluster in response to a sensed degradation event within the cluster;

(c.8) monitoring and managing background environmental aspects of cluster operation such as maintaining appropriate temperatures within the cluster, maintaining predefined power levels within the cluster, and assuring physical security of cluster members.

36. The SCSI cluster of claim 1 wherein said at least one of the SCSI modules comprises at least three substantially similar SCSI modules.

37. The SCSI cluster of claim 36 wherein said at least three substantially similar SCSI modules define a RAID bank.

38. The SCSI cluster of claim 1 wherein said at least one of the SCSI modules comprises six substantially similar SCSI modules.

39. The SCSI cluster of claim 38 wherein said at least six substantially similar SCSI modules defines two RAID banks.

40. The SCSI cluster of claim 4 wherein said status monitoring and reporting means includes nonvolatile writable memory means for storing test instructions downloaded through said SCSI bus prior to said integrity testing of the SCSI bus.

41. A data storage and retrieval system comprising:

(a) a host computer including a host-to-SCSI adaptor module for coupling the host computer to a plurality of independent SCSI buses, each of the SCSI buses being capable of operatively coupling together a limited, respective number of SCSI modules at one time, the host-to-SCSI adaptor module defining a first such SCSI module on each of said plurality of independent SCSI buses;

(b) a plurality of storage array housing cabinets each operatively coupled to at least one bus of said plurality of independent SCSI buses, wherein:

(b.1) each storage array housing cabinet houses a respective plurality of data storage devices,

(b.2) each storage array housing cabinet further houses a respective two or more modularly-replaceable redundant power supplies that are operatively coupled to supply operating power to the respective data storage devices of the cabinet,

(b.3) each storage array housing cabinet further houses a respective two or more redundant cooling fans each fan being operatively coupled to provide mutually independent cooling to the redundant power supplies and to the data storage devices of the cabinet,

(b.4) each storage array housing cabinet further includes:

status defining means also operatively coupled to the respective SCSI bus, the status defining means being further operatively coupled to a selected one or more of the plural SCSI modules for sensing a local status of the selected one or more of the SCSI modules, said local status being one that cannot be otherwise determined by way of the SCSI bus and the status defining means being further for reporting the sensed local status to the data exchange bus;

wherein said local status includes one or more parameters selected from the group consisting of:

(a) a local power voltage level inside a specified one of the selected plural SCSI modules;

(b) a local power current level inside a specified one of the selected plural SCSI modules;

(c) an amount of deviation from a prespecified nominal value for a local power voltage level inside a specified one of the selected plural SCSI modules;

(d) an amount of deviation from a prespecified nominal value for a local power current level inside a specified one of the selected plural SCSI modules;

(e) a temperature level inside a specified one of the selected plural SCSI modules; and

(f) a temperature condition inside a specified one of the selected plural SCSI modules that is outside of a predefined range.
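
Claims 26 and 27 above describe a microcontroller that responds to a predefined opcode carried in the command data block (CDB) of a SCSI RECEIVE communication by placing the requested power status into the block returned during the data phase. A minimal sketch of that idea follows. The opcode values, the two-byte request layout, the four-byte reply layout, and the names `READINGS` and `build_data_block` are all assumptions made for illustration; the claims do not specify any of them.

```python
import struct

# Hypothetical opcodes; the claims only say "a predefined opcode".
OP_REPORT_VOLTAGE = 0x01
OP_REPORT_CURRENT = 0x02

# Stand-in for readings supplied by the status monitoring interface circuitry,
# keyed by (opcode, target module ID). Units assumed: millivolts / milliamps.
READINGS = {
    (OP_REPORT_VOLTAGE, 3): 4980,
    (OP_REPORT_CURRENT, 3): 850,
}

def build_data_block(cdb: bytes) -> bytes:
    """Build the data-phase block for a status request.

    Assumed request layout (illustrative only):
        cdb[0] = opcode, cdb[1] = ID of the module being asked about.
    Assumed reply layout:
        opcode (1 byte), module ID (1 byte), reading (unsigned 16-bit, big-endian).
    """
    opcode, target = cdb[0], cdb[1]
    reading = READINGS[(opcode, target)]
    return struct.pack(">BBH", opcode, target, reading)

# A host asks for the local voltage of module 3.
block = build_data_block(bytes([OP_REPORT_VOLTAGE, 3]))
print(block.hex())
```

The host side would unpack the same four bytes with `struct.unpack(">BBH", block)` to recover the opcode, module ID and reading.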
Description

BACKGROUND

1. Field of the Invention

The invention relates generally to redundant arrays of data storage devices. The invention relates more specifically to a RAID system that connects to a host computer by way of a SCSI interface and a diagnostics/control module that also connects to the SCSI interface.

2a. Cross Reference to Related Applications

The following copending U.S. patent application is assigned to the assignee of the present application, is related to the present application and its disclosure is incorporated herein by reference:

(A) Ser. No. 08/124,276 filed Sep. 20, 1993 by Larry Kibler et al. and entitled FULL-HEIGHT DISK DRIVE SUPPORT STRUCTURE.

2c. Cross Reference to Related Patents

The following U.S. patent is related to the present application and its disclosure is incorporated herein by reference:

(A) U.S. Pat. No. 5,148,432 issued to Gordon et al. and entitled ARRAYED DISK DRIVE SYSTEM AND METHOD.

3. Description of the Related Art

The use of RAID data storage systems (Redundant Array of Inexpensive Disk-drives) is becoming increasingly popular due to economic and technical reasons.

Data storage strategies are being shifted away from having one large mainframe computer coupled to an array of a few, large disk units or a few, bulk tape units, and are instead being shifted in favor of having many desktop or mini- or micro-computers intercoupled by a network to one another and to many small, inexpensive and modularly interchangeable data storage devices (e.g., to an array of small, inexpensive, magnetic storage disk drives). One of the reasons behind this trend is a desire in the industry to maintain at least partial system functionality even in the event of a failure in a particular system component. If one of the numerous mini/micro-computers fails, the others can continue to function. If one of the numerous data storage devices fails, the others can continue to provide data access. Also, increases in data storage capacity can be economically provided in small increments as the need for increased capacity develops.

A common configuration includes a so-called "client/server computer" sandwiched between a local area network (LAN) and a RAID data storage system. Remote users (clients) send requests for read and/or write access to data files contained in the RAID system over the network (LAN). The client/server computer services each request on a time shared basis.

As the client/server computer performs its client servicing tasks, the client/server computer is burdened at the same time with the overhead of attending to mundane tasks such as monitoring the operational status of each disk drive in the RAID system and taking corrective action, or at least issuing an alarm, when a problem develops.

A difficulty develops when the request-servicing bandwidth and/or storage capacity of such a RAID-based client/server system needs to be scaled upwardly. If the number of network users (clients) or request-load per user increases, the request-servicing burden that is placed on the client/server computer tends to increase correspondingly. At some point, the client/server computer bumps against the limits of its data processing speed and system responsiveness suffers.

System responsiveness is disadvantageously degraded by the burden that status monitoring overhead places on the client/server computer. In other words, the status monitoring overhead disadvantageously reduces the ability of the client/server computer to more quickly respond to the ever-growing number of service requests that it receives from the network. In addition, the status-monitoring overhead burden disadvantageously grows as more data storage drives are added to the RAID system. And accordingly, even though the addition of more data storage drives beneficially increases the system's storage capacity, it also tends to degrade system response speed.
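
The scaling effect described above can be illustrated with a rough back-of-the-envelope model. The sketch below assumes that the host polls every sensor once per interval and that each poll costs a fixed amount of host time; the function name and all numeric values are hypothetical and are not taken from the patent.

```python
def host_monitoring_load(num_drives, sensors_per_drive, poll_cost_s, poll_interval_s):
    """Fraction of host time spent polling status sensors directly (capped at 1.0)."""
    polls_per_cycle = num_drives * sensors_per_drive
    return min(1.0, polls_per_cycle * poll_cost_s / poll_interval_s)

# Assumed figures: 4 sensors per drive, 2 ms of host time per poll, 1 s poll interval.
for drives in (6, 12, 24, 48):
    load = host_monitoring_load(drives, sensors_per_drive=4,
                                poll_cost_s=0.002, poll_interval_s=1.0)
    print(f"{drives:3d} drives -> {load:.1%} of host time spent on status polling")
```

Under these assumptions the monitoring share of host time grows linearly with the number of drives, which is the burden the description attributes to host-side status monitoring.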

The status monitoring function of the client/server computer is typically supported by customized hardware that is added to an expandable bus of the client/server computer. In one configuration, a serial and/or parallel I/O board is inserted into one of the expansion slots of the client/server computer and site-customized cables are routed from this I/O board to status sensors that are mounted on or in various components of the disk array. Monitoring software is loaded into the client/server computer to drive the I/O board, to query the various sensors and to receive status reports back from them. Such an arrangement is disadvantageous in that an expansion slot of the client/server computer is consumed for carrying out the disk-array monitoring function. It is also disadvantageous because of the customized nature of the sensor cables extending from the I/O board. Each RAID server tends to have its own unique configuration. A network having many such uniquely-configured servers is difficult to maintain.

Increasingly, there is a need within the industry for arranging the client/server computer as an off-the-shelf commodity item that can be quickly and inexpensively replaced in case of failure. There is a long felt desire in the industry to avoid customized routings of cables between a stand-alone computer and peripheral sensors. There is a need in the industry for disk drive arrays or other data storage arrays that can be quickly and efficiently serviced in the event of a failure. There is a growing desire in the industry to be able to control all operations of a networked RAID system from a remote control console without adversely affecting normal operations of the network.

SUMMARY OF THE INVENTION

The invention helps to attain the above-mentioned objectives by providing a SCSI-coupled module for monitoring and for controlling a SCSI-coupled cluster of devices such as a SCSI-coupled RAID bank.

A structure in accordance with the invention comprises: a cluster of SCSI modules coupled to one another by a SCSI bus, wherein at least one of the SCSI modules has no means for directly reporting to the SCSI bus, the status of power delivered to internal circuitry of the at least one SCSI module or the status of other conditions (e.g., temperature, open door) affecting the operability or security of the at least one SCSI module and wherein a second of the SCSI modules includes status monitoring, reporting and control means for monitoring and directly reporting to the SCSI bus, the status of power delivered to internal circuitry of the at least one SCSI module or the status of other conditions (e.g., temperature) affecting the operability and/or security of the at least one SCSI module. The status monitoring, reporting and control means is optionally provided with control functions so that it can actively control the power delivered to internal circuitry of the at least one SCSI module or the status of other conditions (e.g., temperature, door lockings) affecting the operability and/or security of the at least one SCSI module either in response to commands received over the SCSI bus or on its own initiative.

A method in accordance with the invention comprises the steps of: (a) attaching a status monitoring, reporting and control means to a SCSI bus having a cluster of SCSI modules; (b) operatively coupling the status monitoring, reporting and control means to a power maintenance and/or other environment maintenance subsystems of the cluster; and (c) operating the status monitoring, reporting and control means so that the status monitoring, reporting and control means provides one or more of the following functions: (c.1) providing on-site reports via an on-site indicator means of cluster status and cluster problems to an on-site observer (e.g., by creating appropriate indication patterns on a frontpanel messaging module); (c.2) providing off-site reports via the SCSI bus of cluster status and cluster problems to a remote system supervisor; (c.3) testing the data path integrity of the SCSI bus; (c.4) conveying error history, repair history, usage history and other information about a portable cluster of SCSI modules to which the status monitoring, reporting and control means is attached; (c.5) supporting inventory/asset management functions in a large network containing the SCSI cluster; (c.6) monitoring traffic patterns of SCSI communications to or from members of the cluster; (c.7) switching a configuration of the cluster in response to a sensed degradation event within the cluster; (c.8) monitoring and managing background environmental aspects of cluster operation such as maintaining appropriate temperatures within the cluster, maintaining predefined power levels within the cluster, and assuring system security.

These and other aspects of the invention will be described in more detail below.

BRIEF DESCRIPTION OF THE DRAWINGS

The detailed description below makes reference to the accompanying drawings, in which:

FIG. 1A is a generalized block diagram of a non-SCSI to SCSI status transfer system in accordance with the invention;

FIG. 1B is a block diagram of a SCSI-based data access network system (DANS) in accordance with the invention;

FIGS. 2A-2B show schematics of cabinet monitor and control (CMAC) boards in accordance with the invention;

FIG. 3A shows a six drive configuration; and

FIG. 3B shows a bank of drive cabinets each holding eighteen drives.

DETAILED DESCRIPTION

Referring to FIG. 1A, there is first shown a generalized block diagram of a non-SCSI to SCSI status transfer system in accordance with the invention. Modules 10, 11, 12, . . . , 15 each include a Small Computer System Interface (SCSI) for enabling SCSI-based data exchange between these modules 10, 11, 12, . . . , 15 in accordance with well known industry standards. Although only four such SCSI modules are shown, it is to be understood that the SCSI data exchange network (or SCSI "channel") can have as many as eight such modules and that each module has a unique SCSI identification number (ID#0 through ID#7). Each module can have within it as many as 8 uniquely-addressable SCSI logical units. Thus the SCSI channel can support as many as 64 uniquely-addressable SCSI logical units.
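The addressing arithmetic described above (eight SCSI IDs, each with up to eight logical units, for 64 addressable units per channel) can be illustrated with a short sketch. The function names and the flat index are illustrative assumptions and do not appear in the patent.

```python
# Illustrative sketch only: names and the flat index are assumptions.
MAX_SCSI_IDS = 8        # ID#0 through ID#7 on one SCSI channel
MAX_LUNS_PER_ID = 8     # logical units addressable within one module


def total_logical_units() -> int:
    """Return the number of uniquely addressable logical units per channel."""
    return MAX_SCSI_IDS * MAX_LUNS_PER_ID


def logical_unit_index(scsi_id: int, lun: int) -> int:
    """Map a (SCSI ID, logical unit number) pair to a flat index 0..63."""
    if not 0 <= scsi_id < MAX_SCSI_IDS:
        raise ValueError("SCSI ID must be in the range 0..7")
    if not 0 <= lun < MAX_LUNS_PER_ID:
        raise ValueError("logical unit number must be in the range 0..7")
    return scsi_id * MAX_LUNS_PER_ID + lun
```

Under these assumptions, the module at ID#7 with logical unit 7 maps to the last of the 64 possible addresses.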

In the illustrated example, module 10 is assigned SCSI ID#0, module 11 is assigned SCSI ID#1, module 12 is assigned SCSI ID#2, and module 15 is assigned SCSI ID#7. Four additional SCSI modules (not shown) can be inserted between modules 12 and 15 and assigned respective SCSI ID's #3 to #6.

SCSI cables 31-35 interconnect corresponding SCSI modules 10-15 in daisy chain fashion according to well known industry practice. Modules 11-15 are spaced relatively close to one another (they are "clustered") while module 10 is located relatively far (roughly 1 to 25 feet away) from the other modules 11-15. Because of this physical separation, a first power/environment support unit 16 is used to supply electrical power and provide other operational necessities (e.g., cooling) to the cluster of modules 11-15 while a second power/environmental support unit 17 is used to supply electrical power and provide other operational necessities (e.g., cooling) to the out-of-cluster module 10. An electrical/mechanical connection means 36 operatively couples the first power/environmental support unit 16 to the clustered SCSI modules 11-15 while a separate, second electrical/mechanical connection means 37 operatively couples the second power/environmental support unit 17 to separated SCSI module 10.

Module 10 is connected to a system supervisor 2 by means of a communication network 5. Communication between the system supervisor 2 and the remaining cluster of modules 11-15 is substantially limited to that which can be carried over the SCSI network (cables 31-35) to the first module 10, and from there over the communication network 5 to the system supervisor 2.

SCSI modules 11 and 12 do not include means for reporting, by way of the SCSI network, (1) the status of power delivered to their internal circuitry (e.g., is it at nominal voltage and current, and if not what is the amount of deviation?) or (2) the status of other environmental conditions affecting their operability, such as temperature build-up, or (3) the status of yet other environmental conditions affecting their security, such as their physical removability or actual removal from the cluster.

With regard to the mentioned report items, SCSI communications do not on their own provide definitive answers. If a SCSI module is not responding to SCSI commands, such nonresponsiveness does not specifically indicate whether the cause is due to failure of the SCSI interface, or loss of power, or overheating, or physical removal or disconnect of the module, or some other reason. Because there is no status reporting means in modules 11 and 12, and SCSI communications do not provide definitive answers, the system supervisor 2 has no way of learning about power or environmental problems simply from communications carried out with SCSI modules 11 and 12 over SCSI bus 31-35.

To overcome this problem, a Status Monitoring And Reporting means 60 (SMARt means 60) is provided within SCSI module 15 for monitoring the status of the first power/environment support unit 16 and the status of nearby modules 11-12, and even its own status, and for reporting the status of these monitored devices to the system supervisor 2 by way of the SCSI network 31-35. Sensors 21, 22, . . . , 25, 26 are attached to respective units 11, 12, . . . , 15 and 16 for monitoring temperature, electrical power levels and other aspects of cluster 11-15 that affect the operability and/or security of SCSI cluster 11-15. Local sensor lines 51, 52, . . . , 55, 56 respectively connect sensors 21, 22, . . . , 25, 26 to the status monitoring and reporting means 60.

An appropriate intelligence means (e.g., a microcontroller or microcomputer, not shown) is provided within the status monitoring and reporting means (SMARt) 60 for causing it to periodically monitor the status of temperature, electrical power levels and other aspects affecting the operability and security of SCSI cluster 11-15 and to report worrisome developments to the system supervisor 2 by way of the SCSI network 31-35.
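One pass of the periodic monitoring described above might be sketched as follows. The quantity names, the numeric limits and the report format are hypothetical and are chosen only for illustration; the text does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    low: float
    high: float


# Hypothetical acceptable ranges; the patent text gives no numeric limits.
LIMITS = {
    "temperature_c": Limit(5.0, 50.0),
    "supply_volts": Limit(4.75, 5.25),
}


def find_worrisome(readings):
    """Return (unit, quantity, value) for every out-of-range reading.

    `readings` maps a unit number (e.g., 11, 12, 16) to a dict that maps
    a quantity name to the value sampled from that unit's sensor.
    """
    reports = []
    for unit, quantities in sorted(readings.items()):
        for quantity, value in sorted(quantities.items()):
            limit = LIMITS.get(quantity)
            if limit is not None and not (limit.low <= value <= limit.high):
                reports.append((unit, quantity, value))
    return reports
```

In this sketch, a microcontroller loop would call `find_worrisome` on each sampling interval and forward any non-empty result toward the system supervisor.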

Note that the status monitoring and reporting (SMARt) means 60 is preferably located in the SCSI module 15 that is most distal along the SCSI chain of cables 31-35 from the communication network 5 and the system supervisor 2. The intelligence means (e.g., a microcontroller or microcomputer) within the status monitoring and reporting (SMARt) means 60 can be advantageously used to test the integrity of the data path between the system supervisor 2 and end module 15, that data path including the series of connections made by communication network 5, the SCSI chain of cables 31-35, and the intervening modules 10-12. Appropriate test patterns can be sent from the system supervisor 2 to test for shorts, opens, stuck-at faults and so forth, in the chain of interconnects 5, 31-35. Such techniques for verifying network integrity are well known in the art.
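As one illustration of such test patterns, the sketch below sends walking-one and walking-zero patterns through a data path and lists the bit positions that come back altered. The choice of patterns, the 8-bit width and the `echo` callable (standing in for a round trip to end module 15 and back) are assumptions; the text does not say which patterns are used.

```python
def walking_ones(width=8):
    """Return patterns with exactly one bit set, lowest bit first."""
    return [1 << bit for bit in range(width)]


def faulty_bits(echo, width=8):
    """Return the set of bit positions that a data path corrupts.

    `echo` is a hypothetical callable that sends one pattern down the
    path and returns what came back. Each walking-one pattern and its
    complement (walking zero) is sent, so a line stuck at 0 or stuck
    at 1 shows up as a difference on that bit.
    """
    mask = (1 << width) - 1
    bad = set()
    for pattern in walking_ones(width):
        for sent in (pattern, ~pattern & mask):
            diff = (echo(sent) ^ sent) & mask
            for bit in range(width):
                if (diff >> bit) & 1:
                    bad.add(bit)
    return bad
```

A fault-free path returns an empty set, while a line held low or high is reported by its bit position.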

Communications between the status monitoring and reporting (SMARt) means 60 and the system supervisor 2 are carried out using a communications protocol layered on top of the industry standard SCSI protocol. For example, a first one or more bytes of data that is sent during the data transfer phase of a SCSI SEND or RECEIVE operation defines an operation code field (op code) recognizable to one or both of the SMARt means 60 and the system supervisor 2. A following one or more bytes of data that is sent during the data transfer phase of the SCSI SEND or RECEIVE operation defines parameters of the op code. (The op codes and parameters can be inserted in the CDB (command data block) of a SCSI RECEIVE or SEND operation or in a subsequent one or more data blocks.)
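The layering described above, an op code followed by its parameters within the bytes carried by a SCSI SEND or RECEIVE operation, can be sketched as below. The one-byte op code width and the specific op code values are hypothetical; the text says only that the op code occupies "a first one or more bytes".

```python
# Hypothetical op code values, chosen for illustration only.
OP_READ_TEMPERATURE = 0x01
OP_SET_LED = 0x02


def build_payload(op_code, params=b""):
    """Pack an op code and its parameter bytes into one payload.

    Assumes a one-byte op code field followed directly by the parameters.
    """
    if not 0 <= op_code <= 0xFF:
        raise ValueError("op code must fit in one byte")
    return bytes([op_code]) + bytes(params)


def parse_payload(payload):
    """Split a received payload into (op_code, parameter bytes)."""
    if len(payload) < 1:
        raise ValueError("payload must contain at least the op code byte")
    return payload[0], bytes(payload[1:])
```

For example, `build_payload(OP_SET_LED, b"\x01")` yields the two bytes `02 01`, which the receiving side can split back into the op code and its single parameter with `parse_payload`.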

More specifically, when the network control console 102 is the initiator of a data exchange operation and wishes to receive information from the SMARt means 60, it sends the corresponding op code and parameters to first module 10 by way of communication network 5. The op code and parameters sent by the network control console 102 are thereafter embedded by module 10 into the CDB (command data block) of a SCSI RECEIVE command which module 10 sends to the status monitoring and reporting means 60 of module 15 by way of SCSI network cables 31-35. The SMARt means 60 analyzes the embedded op code and parameters and responsively returns the desired data during the data phase of the same SCSI RECEIVE operation. If the network control console 102 wishes to ask the SMARt means 60 to perform a particular operation (e.g., to turn on an LED, not shown, that is attached to cluster 11-15), the network control console 102 sends the corresponding op code and parameters to first module 10 by way of communication network 5. The op code and parameters sent by the network control console 102 are thereafter embedded by module 10 into the CDB (command data block) and/or