Magellan Linux

Annotation of /alx-src/tags/kernel26-2.6.12-alx-r9/Documentation/IPMI.txt

Revision 630
Wed Mar 4 11:03:09 2009 UTC by niro
File MIME type: text/plain
File size: 21237 byte(s)
Tag kernel26-2.6.12-alx-r9
1    
2     The Linux IPMI Driver
3     ---------------------
4     Corey Minyard
5     <minyard@mvista.com>
6     <minyard@acm.org>
7    
8     The Intelligent Platform Management Interface, or IPMI, is a
9     standard for controlling intelligent devices that monitor a system.
10     It provides for dynamic discovery of sensors in the system and the
11     ability to monitor the sensors and be informed when the sensor's
12     values change or go outside certain boundaries. It also has a
13     standardized database for field-replaceable units (FRUs) and a watchdog
14     timer.
15    
16     To use this, you need an interface to an IPMI controller in your
17     system (called a Baseboard Management Controller, or BMC) and
18     management software that can use the IPMI system.
19    
20     This document describes how to use the IPMI driver for Linux. If you
21     are not familiar with IPMI itself, see the web site at
22     http://www.intel.com/design/servers/ipmi/index.htm. IPMI is a big
23     subject and I can't cover it all here!
24    
25     Configuration
26     -------------
27    
28     The Linux IPMI driver is modular, which means you have to pick several
29     things to have it work right depending on your hardware. Most of
30     these are available in the 'Character Devices' menu.
31    
32     No matter what, you must pick 'IPMI top-level message handler' to use
33     IPMI. What you do beyond that depends on your needs and hardware.
34    
35     The message handler does not provide any user-level interfaces.
36     Kernel code (like the watchdog) can still use it. For access from
37     userland through a device driver, select 'Device interface for IPMI'.
38     A second userland interface is also available: select 'IPMI sockets'
39     in the 'Networking Support' main menu to get a socket interface to
40     IPMI. You may select both of these at the same time; they will work
41     together.
42    
43     The driver interface depends on your hardware. If you have a board
44     with a standard interface (generally "KCS", "SMIC", or "BT"; consult
45     your hardware manual), choose the 'IPMI SI handler' option. A
46     driver also exists for direct I2C access to the IPMI management
47     controller. Some boards support this, but it is unknown if it will
48     work on every board. For this, choose 'IPMI SMBus handler', but be
49     prepared to experiment to find out whether it works on your
50     hardware.
51    
52     There is also a KCS-only driver interface supplied, but it is
53     deprecated in favor of the SI interface.
54    
55     You should generally enable ACPI on your system, as systems with IPMI
56     should have ACPI tables describing them.
57    
58     If you have a standard interface and the board manufacturer has done
59     their job correctly, the IPMI controller should be automatically
60     detected (via ACPI or SMBIOS tables) and should just work. Sadly, many
61     boards do not have this information. The driver attempts standard
62     defaults, but they may not work. If you fall into this situation, you
63     need to read the section below named 'The SI Driver' on how to
64     hand-configure your system.
65    
66     IPMI defines a standard watchdog timer. You can enable this with the
67     'IPMI Watchdog Timer' config option. If you compile the driver into
68     the kernel, then via a kernel command-line option you can have the
69     watchdog timer start as soon as it initializes. It also has a lot
70     of other options; see the 'Watchdog' section below for more details.
71     Note that you can also have the watchdog continue to run if it is
72     closed (by default it is disabled on close). Go into the 'Watchdog
73     Cards' menu, enable 'Watchdog Timer Support', and enable the option
74     'Disable watchdog shutdown on close'.
75    
76    
77     Basic Design
78     ------------
79    
80     The Linux IPMI driver is designed to be very modular and flexible: you
81     only need to take the pieces you need and you can use it in many
82     different ways. Because of that, it's broken into many chunks of
83     code. These chunks are:
84    
85     ipmi_msghandler - This is the central piece of software for the IPMI
86     system. It handles all messages, message timing, and responses. The
87     IPMI users tie into this, and the IPMI physical interfaces (called
88     System Management Interfaces, or SMIs) also tie in here. This
89     provides the kernelland interface for IPMI, but does not provide an
90     interface for use by application processes.
91    
92     ipmi_devintf - This provides a userland IOCTL interface for the IPMI
93     driver; each open file for this device ties in to the message handler
94     as an IPMI user.
95    
96     ipmi_si - A driver for various system interfaces. This supports
97     KCS, SMIC, and may support BT in the future. Unless you have your own
98     custom interface, you probably need to use this.
99    
100     ipmi_smb - A driver for accessing BMCs on the SMBus. It uses the
101     I2C kernel driver's SMBus interfaces to send and receive IPMI messages
102     over the SMBus.
103    
104     af_ipmi - A network socket interface to IPMI. This doesn't take up
105     a character device in your system.
106    
107     Note that the KCS-only interface has been removed.
108    
109     Much documentation for the interface is in the include files. The
110     IPMI include files are:
111    
112     net/af_ipmi.h - Contains the socket interface.
113    
114     linux/ipmi.h - Contains the user interface and IOCTL interface for IPMI.
115    
116     linux/ipmi_smi.h - Contains the interface for system management interfaces
117     (things that interface to IPMI controllers) to use.
118    
119     linux/ipmi_msgdefs.h - General definitions for base IPMI messaging.
120    
121    
122     Addressing
123     ----------
124    
125     The IPMI addressing works much like IP addresses: you have an overlay
126     to handle the different address types. The overlay is:
127    
128     struct ipmi_addr
129     {
130         int addr_type;
131         short channel;
132         char data[IPMI_MAX_ADDR_SIZE];
133     };
134    
135     The addr_type determines what the address really is. The driver
136     currently understands two different types of addresses.
137    
138     "System Interface" addresses are defined as:
139    
140     struct ipmi_system_interface_addr
141     {
142         int addr_type;
143         short channel;
144     };
145    
146     and the type is IPMI_SYSTEM_INTERFACE_ADDR_TYPE. This is used for talking
147     straight to the BMC on the current card. The channel must be
148     IPMI_BMC_CHANNEL.
149    
150     Messages that are destined to go out on the IPMB bus use the
151     IPMI_IPMB_ADDR_TYPE address type. The format is
152    
153     struct ipmi_ipmb_addr
154     {
155         int addr_type;
156         short channel;
157         unsigned char slave_addr;
158         unsigned char lun;
159     };
160    
161     The "channel" here is generally zero, but some devices support more
162     than one channel. It corresponds to the channel as defined in the IPMI
163     spec.
164    
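The overlay idea can be sketched in plain C as follows. This is only an
illustration: the ex_ names are ours, the structures are local copies of
the layouts printed above, and the two numeric constants are placeholder
assumptions. Real code would include linux/ipmi.h and use its definitions.

```c
#include <string.h>

/* Local copies of the layouts shown above, renamed with an ex_ prefix so
 * they cannot clash with <linux/ipmi.h>.  The two numeric constants are
 * placeholder assumptions; real code takes them from the header. */
#define EX_MAX_ADDR_SIZE  32
#define EX_IPMB_ADDR_TYPE 1

struct ex_ipmi_addr {
    int   addr_type;
    short channel;
    char  data[EX_MAX_ADDR_SIZE];
};

struct ex_ipmi_ipmb_addr {
    int           addr_type;
    short         channel;
    unsigned char slave_addr;
    unsigned char lun;
};

/* Store an IPMB address in the generic overlay.  memcpy is used instead
 * of a pointer cast so the sketch stays clear of aliasing problems. */
static void ex_set_ipmb_addr(struct ex_ipmi_addr *generic, short channel,
                             unsigned char slave_addr, unsigned char lun)
{
    struct ex_ipmi_ipmb_addr ipmb;

    memset(&ipmb, 0, sizeof(ipmb));
    ipmb.addr_type  = EX_IPMB_ADDR_TYPE;
    ipmb.channel    = channel;
    ipmb.slave_addr = slave_addr;
    ipmb.lun        = lun;

    memset(generic, 0, sizeof(*generic));
    memcpy(generic, &ipmb, sizeof(ipmb));
}

/* Read the specific fields back out once addr_type has been checked.
 * Returns 0 on success, -1 if the overlay holds some other address type. */
static int ex_get_ipmb_addr(const struct ex_ipmi_addr *generic,
                            struct ex_ipmi_ipmb_addr *out)
{
    if (generic->addr_type != EX_IPMB_ADDR_TYPE)
        return -1;
    memcpy(out, generic, sizeof(*out));
    return 0;
}
```

The point of the shared addr_type and channel prefix is that code can
inspect addr_type first and only then interpret the rest of the bytes.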
165    
166     Messages
167     --------
168    
169     Messages are defined as:
170    
171     struct ipmi_msg
172     {
173         unsigned char netfn;
174         unsigned char lun;
175         unsigned char cmd;
176         unsigned char *data;
177         int data_len;
178     };
179    
180     The driver takes care of adding/stripping the header information. The
181     data portion is just the data to be sent (do NOT put addressing info
182     here) or the response. Note that the completion code of a response is
183     the first item in "data"; it is not stripped out because that is how
184     all the messages are defined in the spec (and thus makes counting the
185     offsets a little easier :-).
186    
187     When using the IOCTL interface from userland, you must provide a block
188     of data for "data", fill it, and set data_len to the length of the
189     block of data, even when receiving messages. Otherwise the driver
190     will have no place to put the message.
191    
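A small sketch of the two rules above: the caller owns the data block,
and the completion code is the first byte of a response. The structure
is a local copy of the layout printed above and all ex_ names are
hypothetical helpers, not part of the driver.

```c
#include <stddef.h>

/* Local copy of the message layout shown above (the ex_ prefix is ours). */
struct ex_ipmi_msg {
    unsigned char  netfn;
    unsigned char  lun;
    unsigned char  cmd;
    unsigned char *data;
    int            data_len;
};

/* Prepare a message for receiving: point it at caller-owned storage and
 * record how much room there is. */
static void ex_prepare_recv(struct ex_ipmi_msg *msg,
                            unsigned char *buf, int buf_size)
{
    msg->netfn    = 0;
    msg->lun      = 0;
    msg->cmd      = 0;
    msg->data     = buf;
    msg->data_len = buf_size;
}

/* The completion code is the first byte of a response's data.
 * Returns it, or -1 if the response carries no data at all. */
static int ex_completion_code(const struct ex_ipmi_msg *rsp)
{
    if (rsp->data == NULL || rsp->data_len < 1)
        return -1;
    return rsp->data[0];
}

/* Payload that follows the completion code; *len receives its length. */
static const unsigned char *ex_payload(const struct ex_ipmi_msg *rsp, int *len)
{
    if (rsp->data == NULL || rsp->data_len < 1) {
        *len = 0;
        return NULL;
    }
    *len = rsp->data_len - 1;
    return rsp->data + 1;
}
```
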
192     Messages coming up from the message handler in kernelland will come in
193     as:
194    
195     struct ipmi_recv_msg
196     {
197         struct list_head link;
198    
199         /* The type of message as defined in the "Receive Types"
200            defines above. */
201         int recv_type;
202    
203         ipmi_user_t *user;
204         struct ipmi_addr addr;
205         long msgid;
206         struct ipmi_msg msg;
207    
208         /* Call this when done with the message. It will presumably free
209            the message and do any other necessary cleanup. */
210         void (*done)(struct ipmi_recv_msg *msg);
211    
212         /* Place-holder for the data, don't make any assumptions about
213            the size or existence of this, since it may change. */
214         unsigned char msg_data[IPMI_MAX_MSG_LENGTH];
215     };
216    
217     You should look at the receive type and handle the message
218     appropriately.
219    
220    
221     The Upper Layer Interface (Message Handler)
222     -------------------------------------------
223    
224     The upper layer of the interface provides the users with a consistent
225     view of the IPMI interfaces. It allows multiple SMI interfaces to be
226     addressed (because some boards actually have multiple BMCs on them)
227     and the user should not have to care what type of SMI is below them.
228    
229    
230     Creating the User
231    
232     To use the message handler, you must first create a user using
233     ipmi_create_user(). The interface number specifies which SMI you want
234     to connect to, and you must supply callback functions to be called
235     when data comes in. The callback function can run at interrupt level,
236     so be careful using the callbacks. This also allows you to pass in a
237     piece of data, the handler_data, that will be passed back to you on
238     all calls.
239    
240     Once you are done, call ipmi_destroy_user() to get rid of the user.
241    
242     From userland, opening the device automatically creates a user, and
243     closing the device automatically destroys the user.
244    
245    
246     Messaging
247    
248     To send a message from kernel-land, the ipmi_request() call does
249     pretty much all message handling. Most of the parameters are
250     self-explanatory. However, it takes a "msgid" parameter. This is NOT
251     the sequence number of messages. It is simply a long value that is
252     passed back when the response for the message is returned. You may
253     use it for anything you like.
254    
255     Responses come back in the function pointed to by the ipmi_recv_hndl
256     field of the "handler" that you passed in to ipmi_create_user().
257     Remember again, these may be running at interrupt level. Remember to
258     look at the receive type, too.
259    
260     From userland, you fill out an ipmi_req_t structure and use the
261     IPMICTL_SEND_COMMAND ioctl. For incoming stuff, you can use select()
262     or poll() to wait for messages to come in. However, you cannot use
263     read() to get them; you must call the IPMICTL_RECEIVE_MSG ioctl with the
264     ipmi_recv_t structure to actually get the message. Remember that you
265     must supply a pointer to a block of data in the msg.data field, and
266     you must fill in the msg.data_len field with the size of the data.
267     This gives the receiver a place to actually put the message.
268    
269     If the message cannot fit into the data you provide, you will get an
270     EMSGSIZE error and the driver will leave the data in the receive
271     queue. If you want to get it and have it truncate the message, use
272     the IPMICTL_RECEIVE_MSG_TRUNC ioctl.
273    
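The userland send and receive steps above might look roughly like this.
It is a sketch only: the helper name ex_get_device_id, the msgid value
and the 6 second poll timeout are choices made for illustration, and the
device path is whatever your system provides (commonly /dev/ipmi0, but
that is an assumption). The structure and constant names are those of
linux/ipmi.h as far as we know; the header installed on your system is
authoritative, and its struct ipmi_msg may not match the layout printed
in this file field for field.

```c
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/ipmi.h>

/* Send "Get Device ID" to the local BMC and wait for the reply.
 * Returns the completion code (0 means success), or -1 on any error.
 * On success *rsp_len is the number of bytes stored in rsp. */
static int ex_get_device_id(const char *dev_path,
                            unsigned char *rsp, int rsp_size, int *rsp_len)
{
    struct ipmi_system_interface_addr bmc;
    struct ipmi_addr from;
    struct ipmi_req req;
    struct ipmi_recv rcv;
    struct pollfd pfd;
    int fd, rc = -1;

    fd = open(dev_path, O_RDWR);
    if (fd < 0)
        return -1;

    /* Address the BMC directly through the system interface. */
    memset(&bmc, 0, sizeof(bmc));
    bmc.addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
    bmc.channel   = IPMI_BMC_CHANNEL;

    memset(&req, 0, sizeof(req));
    req.addr         = (unsigned char *)&bmc;
    req.addr_len     = sizeof(bmc);
    req.msgid        = 1;               /* opaque value, echoed back */
    req.msg.netfn    = IPMI_NETFN_APP_REQUEST;
    req.msg.cmd      = IPMI_GET_DEVICE_ID_CMD;
    req.msg.data     = NULL;            /* this command has no payload */
    req.msg.data_len = 0;

    if (ioctl(fd, IPMICTL_SEND_COMMAND, &req) < 0)
        goto out;

    /* Wait for something to arrive; read() cannot be used here. */
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, 6000) <= 0)
        goto out;

    /* The caller's buffer is where the driver puts the response. */
    memset(&rcv, 0, sizeof(rcv));
    rcv.addr         = (unsigned char *)&from;
    rcv.addr_len     = sizeof(from);
    rcv.msg.data     = rsp;
    rcv.msg.data_len = rsp_size;

    if (ioctl(fd, IPMICTL_RECEIVE_MSG_TRUNC, &rcv) < 0)
        goto out;
    if (rcv.recv_type != IPMI_RESPONSE_RECV_TYPE || rcv.msgid != req.msgid)
        goto out;
    if (rcv.msg.data_len < 1)
        goto out;

    *rsp_len = rcv.msg.data_len;
    rc = rsp[0];                        /* completion code comes first */
out:
    close(fd);
    return rc;
}
```

A caller would pass a buffer of its own, for example a 64 byte array,
and treat any non-zero return value as a failure.
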
274     When you send a command (which is defined by the lowest-order bit of
275     the netfn per the IPMI spec) on the IPMB bus, the driver will
276     automatically assign the sequence number to the command and save the
277     command. If the response is not received in the IPMI-specified 5
278     seconds, it will generate a response automatically saying the command
279     timed out. If an unsolicited response comes in (if it was after 5
280     seconds, for instance), that response will be ignored.
281    
282     In kernelland, after you receive a message and are done with it, you
283     MUST call ipmi_free_recv_msg() on it, or you will leak messages. Note
284     that you should NEVER mess with the "done" field of a message; that is
285     required to properly clean up the message.
286    
287     Note that when sending, there is an ipmi_request_supply_msgs() call
288     that lets you supply the smi and receive message. This is useful for
289     pieces of code that need to work even if the system is out of buffers
290     (the watchdog timer uses this, for instance). You supply your own
291     buffer and own free routines. This is not recommended for normal use,
292     though, since it is tricky to manage your own buffers.
293    
294    
295     Events and Incoming Commands
296    
297     The driver takes care of polling for IPMI events and receiving
298     commands (commands are messages that are not responses; they are
299     commands that other things on the IPMB bus have sent you). To receive
300     these, you must register for them; they will not automatically be sent
301     to you.
302    
303     To receive events, you must call ipmi_set_gets_events() and set the
304     "val" to non-zero. Any events that have been received by the driver
305     since startup will immediately be delivered to the first user that
306     registers for events. After that, if multiple users are registered
307     for events, they will all receive all events that come in.
308    
309     For receiving commands, you have to individually register commands you
310     want to receive. Call ipmi_register_for_cmd() and supply the netfn
311     and command name for each command you want to receive. Only one user
312     may be registered for each netfn/cmd, but different users may register
313     for different commands.
314    
315     From userland, equivalent IOCTLs are provided to do these functions.
316    
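From userland, the two registrations might be wrapped like this. The
ioctl and structure names are the ones we believe linux/ipmi.h provides
(IPMICTL_SET_GETS_EVENTS_CMD, IPMICTL_REGISTER_FOR_CMD and struct
ipmi_cmdspec); treat them as assumptions and check your installed
header. The ex_ wrapper names are ours, and fd is an already opened
IPMI device.

```c
#include <sys/ioctl.h>
#include <linux/ipmi.h>

/* Turn event delivery on (enable != 0) or off for this open device.
 * Returns the ioctl result: 0 on success, -1 on error. */
static int ex_set_gets_events(int fd, int enable)
{
    int val = enable ? 1 : 0;

    return ioctl(fd, IPMICTL_SET_GETS_EVENTS_CMD, &val);
}

/* Ask to receive one incoming command, identified by netfn and cmd.
 * Only one user may hold a given netfn/cmd pair, so this can fail if
 * another user registered for it first. */
static int ex_register_for_cmd(int fd, unsigned char netfn, unsigned char cmd)
{
    struct ipmi_cmdspec spec;

    spec.netfn = netfn;
    spec.cmd   = cmd;
    return ioctl(fd, IPMICTL_REGISTER_FOR_CMD, &spec);
}
```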
317    
318     The Lower Layer (SMI) Interface
319     -------------------------------
320    
321     As mentioned before, multiple SMI interfaces may be registered to the
322     message handler. Each of these is assigned an interface number when
323     it registers with the message handler. They are generally assigned
324     in the order they register, although if an SMI unregisters and then
325     another one registers, all bets are off.
326    
327     The linux/ipmi_smi.h header defines the interface for management
328     interfaces; see that file for more details.
329    
330    
331     The SI Driver
332     -------------
333    
334     The SI driver allows up to 4 KCS or SMIC interfaces to be configured
335     in the system. By default, the driver scans the ACPI tables for interfaces,
336     and if it doesn't find any it will attempt to register one KCS
337     interface at the spec-specified I/O port 0xca2 without interrupts.
338     You can change this at module load time (for a module) with:
339    
340     modprobe ipmi_si type=<type1>,<type2>...
341     ports=<port1>,<port2>... addrs=<addr1>,<addr2>...
342     irqs=<irq1>,<irq2>... trydefaults=[0|1]
343     regspacings=<sp1>,<sp2>,... regsizes=<size1>,<size2>,...
344     regshifts=<shift1>,<shift2>,...
345     slave_addrs=<addr1>,<addr2>,...
346    
347     Each of these except trydefaults is a list, the first item for the
348     first interface, second item for the second interface, etc.
349    
350     The type may be either "kcs", "smic", or "bt". If you leave it blank, it
351     defaults to "kcs".
352    
353     If you specify addrs as non-zero for an interface, the driver will
354     use the memory address given as the address of the device. This
355     overrides ports.
356    
357     If you specify ports as non-zero for an interface, the driver will
358     use the I/O port given as the device address.
359    
360     If you specify irqs as non-zero for an interface, the driver will
361     attempt to use the given interrupt for the device.
362    
363     trydefaults sets whether the standard IPMI interface at 0xca2 and
364     any interfaces specified by ACPI are tried. By default, the driver
365     tries them; set this value to zero to turn this off.
366    
367     The next three parameters have to do with register layout. The
368     registers used by the interfaces may not appear at successive
369     locations and they may not be in 8-bit registers. These parameters
370     allow the layout of the data in the registers to be more precisely
371     specified.
372    
373     The regspacings parameter gives the number of bytes between successive
374     register start addresses. For instance, if the regspacing is set to 4
375     and the start address is 0xca2, then the address for the second
376     register would be 0xca6. This defaults to 1.
377    
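The spacing arithmetic is simple enough to write down. The helper name
ex_reg_addr is ours and is not part of the driver.

```c
/* Address of interface register number `index` (0 is the first one),
 * given the start address and the regspacing in bytes. */
static unsigned long ex_reg_addr(unsigned long base, unsigned int index,
                                 unsigned int regspacing)
{
    return base + (unsigned long)index * regspacing;
}
```

With a start address of 0xca2 and a spacing of 4, index 1 gives 0xca6,
matching the example in the text; with the default spacing of 1 it
gives 0xca3.
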
378     The regsizes parameter gives the size of a register, in bytes. The
379     data used by IPMI is 8-bits wide, but it may be inside a larger
380     register. This parameter allows the read and write type to be specified.
381     It may be 1, 2, 4, or 8. The default is 1.
382    
383     Since the register size may be larger than 8 bits, the IPMI data may not
384     be in the lower 8 bits. The regshifts parameter gives the amount to shift
385     the data to get to the actual IPMI data.
386    
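As an illustration of what the shift does, the following sketch pulls
the 8-bit IPMI value out of a wider register value and puts it back. It
assumes the shift is counted in bits, which the text above does not
state explicitly, and the ex_ names are ours.

```c
#include <stdint.h>

/* Pull the 8-bit IPMI value out of a wider register read.
 * regshift is assumed to be a bit count below 64. */
static uint8_t ex_reg_to_data(uint64_t raw, unsigned int regshift)
{
    return (uint8_t)((raw >> regshift) & 0xff);
}

/* Place an 8-bit IPMI value where the register expects it. */
static uint64_t ex_data_to_reg(uint8_t data, unsigned int regshift)
{
    return (uint64_t)data << regshift;
}
```
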
387     The slave_addrs specifies the IPMI address of the local BMC. This is
388     usually 0x20 and the driver defaults to that, but in case it's not, it
389     can be specified when the driver starts up.
390    
391     When compiled into the kernel, the addresses can be specified on the
392     kernel command line as:
393    
394     ipmi_si.type=<type1>,<type2>...
395     ipmi_si.ports=<port1>,<port2>... ipmi_si.addrs=<addr1>,<addr2>...
396     ipmi_si.irqs=<irq1>,<irq2>... ipmi_si.trydefaults=[0|1]
397     ipmi_si.regspacings=<sp1>,<sp2>,...
398     ipmi_si.regsizes=<size1>,<size2>,...
399     ipmi_si.regshifts=<shift1>,<shift2>,...
400     ipmi_si.slave_addrs=<addr1>,<addr2>,...
401    
402     It works the same as the module parameters of the same names.
403    
404     By default, the driver will attempt to detect any device specified by
405     ACPI and, if it finds none, a KCS device at the spec-specified
406     0xca2. If you want to turn this off, set the "trydefaults" option to
407     zero.
408    
409     If you have high-res timers compiled into the kernel, the driver will
410     use them to provide much better performance. Note that if you do not
411     have high-res timers enabled in the kernel and you don't have
412     interrupts enabled, the driver will run VERY slowly. Don't blame me,
413     these interfaces suck.
414    
415    
416     The SMBus Driver
417     ----------------
418    
419     The SMBus driver allows up to 4 SMBus devices to be configured in the
420     system. By default, the driver will register any SMBus interfaces it finds
421     in the I2C address range of 0x20 to 0x4f on any adapter. You can change this
422     at module load time (for a module) with:
423    
424     modprobe ipmi_smb
425     addr=<adapter1>,<i2caddr1>[,<adapter2>,<i2caddr2>[,...]]
426     dbg=<flags1>,<flags2>...
427     [defaultprobe=0] [dbg_probe=1]
428    
429     The addresses are specified in pairs: the first is the adapter ID and the
430     second is the I2C address on that adapter.
431    
432     The debug flags are bit flags for each BMC found; they are:
433     IPMI messages: 1, driver state: 2, timing: 4, I2C probe: 8
434    
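Because these are bit flags, several can be combined into one value per
BMC. A sketch, with enum and helper names that are ours rather than the
driver's:

```c
/* Bit values from the list above; the names are hypothetical. */
enum ex_smb_dbg {
    EX_DBG_MSG    = 1,  /* IPMI messages */
    EX_DBG_STATE  = 2,  /* driver state  */
    EX_DBG_TIMING = 4,  /* timing        */
    EX_DBG_PROBE  = 8   /* I2C probe     */
};

/* Non-zero if `bit` is set in a combined flags value. */
static int ex_dbg_has(unsigned int flags, enum ex_smb_dbg bit)
{
    return (flags & (unsigned int)bit) != 0;
}
```

For example, EX_DBG_MSG | EX_DBG_TIMING is 5, which is the number that
would be given for that BMC in the dbg parameter.
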
435     Setting defaultprobe to zero disables the default probing of SMBus
436     interfaces at address range 0x20 to 0x4f. This means that only the
437     BMCs specified in the addr parameter will be detected.
438    
439     Setting dbg_probe to 1 will enable debugging of the probing and
440     detection process for BMCs on the SMBuses.
441    
442     Discovering the IPMI-compliant BMC on the SMBus can cause devices
443     on the I2C bus to fail. The SMBus driver writes a "Get Device ID" IPMI
444     message as a block write to the I2C bus and waits for a response.
445     This action can be detrimental to some I2C devices. It is highly recommended
446     that the known I2C address be given to the SMBus driver in the addr
447     parameter. The default address range will not be used when an addr
448     parameter is provided.
449    
450     When compiled into the kernel, the addresses can be specified on the
451     kernel command line as:
452    
453     ipmi_smb.addr=<adapter1>,<i2caddr1>[,<adapter2>,<i2caddr2>[,...]]
454     ipmi_smb.dbg=<flags1>,<flags2>...
455     ipmi_smb.defaultprobe=0 ipmi_smb.dbg_probe=1
456    
457     These are the same options as on the module command line.
458    
459     Note that you might need some I2C changes if CONFIG_IPMI_PANIC_EVENT
460     is enabled along with this, so the I2C driver knows to run to
461     completion while sending a panic event.
462    
463    
464     Other Pieces
465     ------------
466    
467     Watchdog
468     --------
469    
470     A watchdog timer is provided that implements the Linux-standard
471     watchdog timer interface. It has several module parameters that can be
472     used to control it:
473    
474     modprobe ipmi_watchdog timeout=<t> pretimeout=<t> action=<action type>
475     preaction=<preaction type> preop=<preop type> start_now=x
476     nowayout=x
477    
478     The timeout is the number of seconds to the action, and the pretimeout
479     is the number of seconds before the reset that the pre-timeout panic will
480     occur (if pretimeout is zero, then pretimeout will not be enabled). Note
481     that the pretimeout is the time before the final timeout. So if the
482     timeout is 50 seconds and the pretimeout is 10 seconds, then the pretimeout
483     will occur in 40 seconds (10 seconds before the timeout).
484    
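The relation between the two values can be written as a one-line
calculation. The helper below is hypothetical, and rejecting a
pretimeout that is not smaller than the timeout is our own assumption
rather than documented driver behavior.

```c
/* Seconds after the timer is (re)started at which the pretimeout fires.
 * Returns -1 if the pretimeout is disabled (zero) or, as an assumption
 * of this sketch, if it does not fit before the timeout. */
static int ex_pretimeout_fires_at(int timeout, int pretimeout)
{
    if (pretimeout <= 0 || pretimeout >= timeout)
        return -1;
    return timeout - pretimeout;
}
```

With timeout=50 and pretimeout=10 this gives 40, as in the example
above.
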
485     The action may be "reset", "power_cycle", or "power_off", and
486     specifies what to do when the timer times out, and defaults to
487     "reset".
488    
489     The preaction may be "pre_smi" for an indication through the SMI
490     interface, "pre_int" for an indication through the SMI with an
491     interrupt, and "pre_nmi" for an NMI on a preaction. This is how
492     the driver is informed of the pretimeout.
493    
494     The preop may be set to "preop_none" for no operation on a pretimeout,
495     "preop_panic" to set the preoperation to panic, or "preop_give_data"
496     to provide data to read from the watchdog device when the pretimeout
497     occurs. A "pre_nmi" setting CANNOT be used with "preop_give_data"
498     because you can't do data operations from an NMI.
499    
500     When preop is set to "preop_give_data", one byte becomes ready to read
501     on the device when the pretimeout occurs. Select and fasync work on
502     the device, as well.
503    
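A userland program could wait for that byte with select(), roughly as
follows. The function name and return convention are ours; wd_fd is
assumed to be the watchdog device, opened by the caller in a mode that
allows reading.

```c
#include <sys/select.h>
#include <unistd.h>

/* Wait up to timeout_sec seconds for the pretimeout byte.
 * Returns 1 if a byte was read, 0 on timeout, -1 on error. */
static int ex_wait_pretimeout(int wd_fd, int timeout_sec)
{
    fd_set rfds;
    struct timeval tv;
    unsigned char byte;
    int rc;

    if (wd_fd < 0 || wd_fd >= FD_SETSIZE)
        return -1;

    FD_ZERO(&rfds);
    FD_SET(wd_fd, &rfds);
    tv.tv_sec  = timeout_sec;
    tv.tv_usec = 0;

    rc = select(wd_fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;

    /* Readable: consume the single byte the driver made available. */
    return read(wd_fd, &byte, 1) == 1 ? 1 : -1;
}
```
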
504     If start_now is set to 1, the watchdog timer will start running as
505     soon as the driver is loaded.
506    
507     If nowayout is set to 1, the watchdog timer will not stop when the
508     watchdog device is closed. The default value of nowayout is true
509     if the CONFIG_WATCHDOG_NOWAYOUT option is enabled, or false if not.
510    
511     When compiled into the kernel, the kernel command line is available
512     for configuring the watchdog:
513    
514     ipmi_watchdog.timeout=<t> ipmi_watchdog.pretimeout=<t>
515     ipmi_watchdog.action=<action type>
516     ipmi_watchdog.preaction=<preaction type>
517     ipmi_watchdog.preop=<preop type>
518     ipmi_watchdog.start_now=x
519     ipmi_watchdog.nowayout=x
520    
521     The options are the same as the module parameter options.
522    
523     The watchdog will panic and start a 120 second reset timeout if it
524     gets a pre-action. During a panic or a reboot, the watchdog will
525     start a 120 second timer if it is running to make sure the reboot occurs.
526    
527     Note that if you use the NMI preaction for the watchdog, you MUST
528     NOT use nmi watchdog mode 1. If you use the NMI watchdog, you
529     must use mode 2.
530    
531     Once you open the watchdog timer, you must write a 'V' character to the
532     device to close it, or the timer will not stop. This is a new semantic
533     for the driver, but makes it consistent with the rest of the watchdog
534     drivers in Linux.
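
A minimal sketch of a well-behaved watchdog client, assuming the
Linux-standard interface from linux/watchdog.h. The device path is
supplied by the caller (commonly /dev/watchdog, which is an assumption
about your system), and the function name and the one second interval
are ours. Be careful when trying this on a real machine: opening the
device starts the timer.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/watchdog.h>

/* Open the watchdog, pet it `pets` times one second apart, then write
 * 'V' before closing so the timer stops.
 * Returns 0 on success, -1 on error. */
static int ex_watchdog_session(const char *dev_path, int pets)
{
    int fd, i, rc = 0;

    fd = open(dev_path, O_WRONLY);
    if (fd < 0)
        return -1;

    for (i = 0; i < pets; i++) {
        if (ioctl(fd, WDIOC_KEEPALIVE, 0) < 0) {
            rc = -1;
            break;
        }
        sleep(1);
    }

    /* Without this write the timer keeps running after close. */
    if (write(fd, "V", 1) != 1)
        rc = -1;

    close(fd);
    return rc;
}
```

If nowayout is set, the timer keeps running after close even when 'V'
has been written, as described above.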