Magellan Linux

Annotation of /trunk/kernel26-magellan/patches-2.6.21-r7/0154-2.6.21-suspend2-2.2.9.17.patch


Revision 269
Sat Jul 21 00:37:57 2007 UTC by niro
File size: 471058 byte(s)
2.6.21-magellan-r7

1 niro 269 diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
2     index 12533a9..68e902e 100644
3     --- a/Documentation/kernel-parameters.txt
4     +++ b/Documentation/kernel-parameters.txt
5     @@ -82,6 +82,7 @@ parameter is applicable:
6     SH SuperH architecture is enabled.
7     SMP The kernel is an SMP kernel.
8     SPARC Sparc architecture is enabled.
9     + SUSPEND2 Suspend2 is enabled.
10     SWSUSP Software suspend is enabled.
11     TS Appropriate touchscreen support is enabled.
12     USB USB support is enabled.
13     @@ -1140,6 +1141,8 @@ and is between 256 and 4096 characters. It is defined in the file
14     noresume [SWSUSP] Disables resume and restores original swap
15     space.
16    
17     + noresume2 [SUSPEND2] Disables resuming and restores original swap signature.
18     +
19     no-scroll [VGA] Disables scrollback.
20     This is required for the Braillex ib80-piezo Braille
21     reader made by F.H. Papenmeier (Germany).
22     @@ -1445,6 +1448,11 @@ and is between 256 and 4096 characters. It is defined in the file
23    
24     retain_initrd [RAM] Keep initrd memory after extraction
25    
26     + resume2= [SUSPEND2] Specify the storage device for Suspend2.
27     + Format: <writer>:<writer-parameters>.
28     + See Documentation/power/suspend2.txt for details of the
29     + formats for available image writers.
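
As an illustration only (this is not code from the patch), the
<writer>:<writer-parameters> format can be split at the first colon, so that
any further colons stay inside the writer's own parameters. The function name
split_resume2() and the example values used below are assumptions made for
this sketch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split "<writer>:<writer-parameters>" at the FIRST colon.
 * Returns 0 on success, -1 if there is no colon, the writer name is
 * empty, or the name does not fit in the caller's buffer.
 * On success *params points into the original string.
 */
static int split_resume2(const char *arg, char *writer, size_t writer_len,
                         const char **params)
{
    const char *colon = strchr(arg, ':');
    size_t n;

    if (!colon || colon == arg)
        return -1;
    n = (size_t)(colon - arg);
    if (n >= writer_len)
        return -1;
    memcpy(writer, arg, n);
    writer[n] = '\0';
    *params = colon + 1;
    return 0;
}
```

With a made-up value such as "swap:/dev/hda2", this would yield the writer
name "swap" and the parameter string "/dev/hda2".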
30     +
31     rhash_entries= [KNL,NET]
32     Set number of hash buckets for route cache
33    
34     diff --git a/Documentation/power/suspend2-internals.txt b/Documentation/power/suspend2-internals.txt
35     new file mode 100644
36     index 0000000..ba4e1e5
37     --- /dev/null
38     +++ b/Documentation/power/suspend2-internals.txt
39     @@ -0,0 +1,473 @@
40     + Software Suspend 2.2 Internal Documentation.
41     + Version 1
42     +
43     +1. Introduction.
44     +
45     + Software Suspend 2.2 is an addition to the Linux Kernel, designed to
46     + allow the user to quickly shut down and quickly boot a computer, without
47     + needing to close documents or programs. It is equivalent to the
48     + hibernate facility in some laptops. This implementation, however,
49     + requires no special BIOS or hardware support.
50     +
51     + The code in these files is based upon the original implementation
52     + prepared by Gabor Kuti and additional work by Pavel Machek and a
53     + host of others. This code has been substantially reworked by Nigel
54     + Cunningham, again with the help and testing of many others, not the
55     + least of whom is Michael Frank. At its heart, however, the operation is
56     + essentially the same as Gabor's version.
57     +
58     +2. Overview of operation.
59     +
60     + The basic sequence of operations is as follows:
61     +
62     + a. Quiesce all other activity.
63     + b. Ensure enough memory and storage space are available, and attempt
64     + to free memory/storage if necessary.
65     + c. Allocate the required memory and storage space.
66     + d. Write the image.
67     + e. Power down.
68     +
69     + There are a number of complicating factors which mean that things are
70     + not as simple as the above would imply, however...
71     +
72     + o The activity of each process must be stopped at a point where it will
73     + not be holding locks necessary for saving the image, or unexpectedly
74     + restart operations due to something like a timeout and thereby make
75     + our image inconsistent.
76     +
77     + o It is desirable that we sync outstanding I/O to disk before calculating
78     + image statistics. This reduces corruption if one should suspend but
79     + then not resume, and also makes later parts of the operation safer (see
80     + below).
81     +
82     + o We need to get as close as we can to an atomic copy of the data.
83     + Inconsistencies in the image will result in inconsistent memory contents at
84     + resume time, and thus in instability of the system and/or file system
85     + corruption. This would appear to imply a maximum image size of one half of
86     + the amount of RAM, but we have a solution... (again, below).
87     +
88     + o In 2.6, we choose to play nicely with the other suspend-to-disk
89     + implementations.
90     +
91     +3. Detailed description of internals.
92     +
93     + a. Quiescing activity.
94     +
95     + Safely quiescing the system is achieved using two methods.
96     +
97     + First, we note that the vast majority of processes don't need to run during
98     + suspend. They can be 'frozen'. We therefore implement a refrigerator
99     + routine, which processes enter and in which they remain until the cycle is
100     + complete. Processes enter the refrigerator via try_to_freeze() invocations
101     + at appropriate places. A process cannot be frozen in any old place. It
102     + must not be holding locks that will be needed for writing the image or
103     + freezing other processes. For this reason, userspace processes generally
104     + enter the refrigerator via the signal handling code, and kernel threads at
105     + the place in their event loops where they drop locks and yield to other
106     + processes or sleep.
107     +
108     + The second part of our method for quiescing the system involves freezing
109     + the filesystems. We use the standard freeze_bdev and thaw_bdev functions to
110     + ensure that all of the user's data is synced to disk before we begin to
111     + write the image. This is particularly important with XFS, where without
112     + bdev freezing, activity may still occur after we begin to write the image
113     + (potentially causing in-memory and on-disk corruption later).
114     +
115     + Quiescing the system works most quickly and reliably when we add one more
116     + element to the algorithm: separating the freezing of userspace processes
117     + from the freezing of kernel space processes, and doing the filesystem freeze
118     + in between. The filesystem freeze needs to be done while kernel threads such
119     + as kjournald can still run. At the same time, though, everything will be
120     + less racy and run more quickly if we stop userspace submitting more I/O work
121     + while we're trying to quiesce.
122     +
123     + Quiescing the system is therefore done in three steps:
124     + - Freeze userspace
125     + - Freeze filesystems
126     + - Freeze kernel threads
127     +
128     + If we need to free memory, we thaw kernel threads and filesystems, but not
129     + userspace. We can then free caches without worrying about deadlocks due to
130     + swap files being on frozen filesystems or such like.
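
The ordering described above can be modelled in a few lines of userspace C.
This is a sketch only: the stage flags and function names are hypothetical
and do not correspond to kernel symbols. It shows that a full quiesce freezes
all three stages in order, while the memory-freeing path thaws kernel threads
and filesystems but leaves userspace frozen.

```c
#include <stdbool.h>

/* Hypothetical stage flags; illustrative names, not kernel symbols. */
enum quiesce_stage {
    STAGE_USERSPACE   = 1 << 0,
    STAGE_FILESYSTEMS = 1 << 1,
    STAGE_KTHREADS    = 1 << 2,
};

struct quiesce_state {
    unsigned int frozen;  /* bitmask of stages currently frozen */
};

/* Freeze in the documented order: userspace, filesystems, kernel threads. */
static void quiesce_all(struct quiesce_state *s)
{
    s->frozen |= STAGE_USERSPACE;    /* stop new I/O being submitted */
    s->frozen |= STAGE_FILESYSTEMS;  /* kjournald and friends may still run */
    s->frozen |= STAGE_KTHREADS;     /* finally stop kernel threads */
}

/* Before freeing memory: thaw kernel threads and filesystems only. */
static void thaw_for_memory_freeing(struct quiesce_state *s)
{
    s->frozen &= ~(unsigned int)(STAGE_KTHREADS | STAGE_FILESYSTEMS);
}

static bool is_frozen(const struct quiesce_state *s, enum quiesce_stage st)
{
    return (s->frozen & (unsigned int)st) != 0;
}
```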
131     +
132     + One limitation of this is that FUSE filesystems are incompatible with
133     + suspending to disk. They need to be unmounted prior to suspending, to avoid
134     + potential deadlocks.
135     +
136     + b. Ensure enough memory & storage are available.
137     +
138     + We have a number of constraints to meet in order to be able to successfully
139     + suspend and resume.
140     +
141     + First, the image will be written in two parts, described below. One of these
142     + parts needs to have an atomic copy made, which of course implies a maximum
143     + size of one half of the amount of system memory. The other part ('pageset2')
144     + is not atomically copied, and can therefore be as large or small as desired.
145     +
146     + Second, we have constraints on the amount of storage available. In these
147     + calculations, we may also consider any compression that will be done. The
148     + cryptoapi module allows the user to configure an expected compression ratio.
149     +
150     + Third, the user can specify an arbitrary limit on the image size, in
151     + megabytes. This limit is treated as a soft limit, so that we don't fail the
152     + attempt to suspend if we cannot meet this constraint.
153     +
154     + c. Allocate the required memory and storage space.
155     +
156     + Having done the initial freeze, we determine whether the above constraints
157     + are met, and seek to allocate the metadata for the image. If the constraints
158     + are not met, or we fail to allocate the required space for the metadata, we
159     + seek to free the amount of memory that we calculate is needed and try again.
160     + We allow up to four iterations of this loop before aborting the cycle. If we
161     + do fail, it should only be because of a bug in Suspend's calculations.
162     +
163     + These steps are merged together in the prepare_image function, found in
164     + prepare_image.c. The functions are merged because of the cyclical nature
165     + of the problem of calculating how much memory and storage is needed. Since
166     + the data structures containing the information about the image must
167     + themselves take memory and use storage, the amount of memory and storage
168     + required changes as we prepare the image. Since the changes are not large,
169     + only one or two iterations will be required to achieve a solution.
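
A minimal sketch of the retry loop described above, written as userspace C.
It is not the contents of prepare_image.c: the struct, the field names and
the metadata cost model (one metadata page per 1024 image pages) are
assumptions chosen only to show how the requirement shifts as memory is
freed and why the loop is bounded at four attempts.

```c
#include <stdbool.h>

#define MAX_PREPARE_ATTEMPTS 4  /* "up to four iterations" in the text */

/* Hypothetical model of the system state; field names are illustrative. */
struct image_model {
    unsigned long image_pages;     /* pages we currently need to save */
    unsigned long storage_pages;   /* pages of storage available */
    unsigned long freeable_pages;  /* pages that could still be freed */
};

/* Assumed cost model: one metadata page per 1024 image pages, rounded up. */
static unsigned long metadata_pages(unsigned long image_pages)
{
    return (image_pages + 1023) / 1024;
}

static bool constraints_met(const struct image_model *m)
{
    return m->image_pages + metadata_pages(m->image_pages) <= m->storage_pages;
}

/* Returns the attempt number (1..4) on which the constraints were met,
 * or 0 if the cycle has to be aborted. */
static int prepare_image_sketch(struct image_model *m)
{
    for (int attempt = 1; attempt <= MAX_PREPARE_ATTEMPTS; attempt++) {
        unsigned long need, freed;

        if (constraints_met(m))
            return attempt;

        /* Try to free what we calculate is missing, then go round again. */
        need = m->image_pages + metadata_pages(m->image_pages)
               - m->storage_pages;
        freed = need < m->freeable_pages ? need : m->freeable_pages;
        if (freed > m->image_pages)
            freed = m->image_pages;
        m->image_pages -= freed;
        m->freeable_pages -= freed;
    }
    return 0;
}
```

In this model a system that can free enough memory settles on the second
attempt, which matches the remark that one or two iterations are normally
sufficient.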
170     +
171     + The recursive nature of the algorithm is minimised by keeping user space
172     + frozen while preparing the image, and by the fact that our records of which
173     + pages are to be saved and which pageset they are saved in use bitmaps (so
174     + that changes in number or fragmentation of the pages to be saved don't
175     + feed back via changes in the amount of memory needed for metadata). The
176     + recursiveness is thus limited to any extra slab pages allocated to store the
177     + extents that record storage used, and the effects of seeking to free memory.
178     +
179     + d. Write the image.
180     +
181     + We previously mentioned the need to create an atomic copy of the data, and
182     + the half-of-memory limitation that is implied in this. This limitation is
183     + circumvented by dividing the memory to be saved into two parts, called
184     + pagesets.
185     +
186     + Pageset2 contains the page cache - the pages on the active and inactive
187     + lists. These pages aren't needed or modified while Suspend2 is running, so
188     + they can be safely written without an atomic copy. They are therefore
189     + saved first and reloaded last. While saving these pages, Suspend2 carefully
190     + ensures that the work of writing the pages doesn't make the image
191     + inconsistent.
192     +
193     + Once pageset2 has been saved, we prepare to do the atomic copy of remaining
194     + memory. As part of the preparation, we power down drivers, thereby providing
195     + them with the opportunity to have their state recorded in the image. The
196     + amount of memory allocated by drivers for this is usually negligible, but if
197     + DRI is in use, video drivers may require significant amounts. Ideally we
198     + would be able to query drivers while preparing the image as to the amount of
199     + memory they will need. Unfortunately no such mechanism exists at the time of
200     + writing. For this reason, Suspend2 allows the user to set an
201     + 'extra_pages_allowance', which is used to seek to ensure sufficient memory
202     + is available for drivers at this point. Suspend2 also lets the user set this
203     + value to 0. In this case, a test driver suspend is done while preparing the
204     + image, and the difference (plus a margin) used instead.
205     +
206     + Having suspended the drivers, we save the CPU context before making an
207     + atomic copy of pageset1, resuming the drivers and saving the atomic copy.
208     + After saving the two pagesets, we just need to save our metadata before
209     + powering down.
210     +
211     + As we mentioned earlier, the contents of pageset2 pages aren't needed once
212     + they've been saved. We therefore use them as the destination of our atomic
213     + copy. In the unlikely event that pageset1 is larger, extra pages are
214     + allocated while the image is being prepared. This is normally only a real
215     + possibility when the system has just been booted and the page cache is
216     + small.
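
The arithmetic behind reusing pageset2 as the destination of the atomic copy
can be sketched as follows. The function names are made up for this example;
the only point is that extra pages are needed just when pageset1 is larger
than pageset2.

```c
#include <stdbool.h>

/* Without pagesets: an atomic copy of everything needs as many free
 * pages as there are pages to save. */
static bool single_copy_fits(unsigned long pages_to_save,
                             unsigned long free_pages)
{
    return pages_to_save <= free_pages;
}

/* With pagesets: pageset2 is written out first and its pages are then
 * reused as the destination of the atomic copy of pageset1. Returns
 * the number of extra pages that must be allocated beforehand. */
static unsigned long extra_pages_for_atomic_copy(unsigned long pageset1_pages,
                                                 unsigned long pageset2_pages)
{
    return pageset1_pages > pageset2_pages
        ? pageset1_pages - pageset2_pages
        : 0;
}
```

For example, with 300 pages in pageset1, 600 in pageset2 and only 100 pages
free, a single atomic copy of all 900 pages would not fit, yet no extra
pages are needed once pageset2's pages are reused.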
217     +
218     + This is where we need to be careful about syncing, however. Pageset2 will
219     + probably contain filesystem meta data. If this is overwritten with pageset1
220     + and then a sync occurs, the filesystem will be corrupted - at least until
221     + resume time and another sync of the restored data. Since there is a
222     + possibility that the user might not resume or (may it never be!) that
223     + suspend might oops, we do our utmost to avoid syncing filesystems after
224     + copying pageset1.
225     +
226     + e. Power down.
227     +
228     + Powering down uses standard kernel routines. Suspend2 supports powering down
229     + using the ACPI S3, S4 and S5 methods or the kernel's non-ACPI power-off.
230     + Supporting suspend to ram (S3) as a power off option might sound strange,
231     + but it allows the user to quickly get their system up and running again if
232     + the battery doesn't run out (we just need to re-read the overwritten pages)
233     + and if the battery does run out (or the user removes power), they can still
234     + resume.
235     +
236     +4. Data Structures.
237     +
238     + Suspend2 uses four main structures to store its metadata and configuration
239     + information:
240     +
241     + a) Pageflags bitmaps.
242     +
243     + Suspend records which pages will be in pageset1, pageset2, the destination
244     + of the atomic copy and the source of the atomically restored image using
245     + bitmaps. These bitmaps are created from order zero allocations to maximise
246     + reliability. The individual pages are combined together with pointers to
247     + form per-zone bitmaps, which are in turn combined with another layer of
248     + pointers to construct the overall bitmap.
249     +
250     + The pageset1 bitmap is thus easily stored in the image header for use at
251     + resume time.
252     +
253     + As mentioned above, using bitmaps also means that the amount of memory and
254     + storage required for recording the above information is constant. This
255     + greatly simplifies the work of preparing the image. In earlier versions of
256     + Suspend2, extents were used to record which pages would be stored. In that
257     + case, however, eating memory could result in greater fragmentation of the
258     + lists of pages, which in turn required more memory to store the extents and
259     + more storage in the image header. These could in turn require further
260     + freeing of memory, and another iteration. All of this complexity is removed
261     + by having bitmaps.
262     +
263     + Bitmaps also make a lot of sense because Suspend2 only ever iterates
264     + through the lists. There is therefore no cost to not being able to find the
265     + nth page in O(1) time. We only need to worry about the cost of finding
266     + the n+1th page, given the location of the nth page. Bitwise optimisations
267     + help here.
268     +
269     + The data structure is: unsigned long ***.
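
To make the three levels of the unsigned long *** concrete, here is a
userspace sketch that indexes the structure as bitmap[zone][page][word].
The 4096-byte page size, the use of calloc() in place of order-zero page
allocations and all of the function names are assumptions made for the
example; the patch's own helpers may differ.

```c
#include <limits.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096UL                        /* assumed page size */
#define BITS_PER_WORD    (sizeof(unsigned long) * CHAR_BIT)
#define BITS_PER_PAGE    (SKETCH_PAGE_SIZE * CHAR_BIT) /* bits per bitmap page */

/* Number of page-sized bitmap chunks needed to cover one zone (at least 1). */
static size_t bitmap_pages_for(unsigned long zone_pfns)
{
    size_t pages = (size_t)((zone_pfns + BITS_PER_PAGE - 1) / BITS_PER_PAGE);

    return pages ? pages : 1;
}

/* Free a (possibly partially built) bitmap; tolerates NULL entries. */
static void bitmap_free(unsigned long ***bm, int zones,
                        const unsigned long *zone_pfns)
{
    if (!bm)
        return;
    for (int z = 0; z < zones; z++) {
        if (!bm[z])
            continue;
        for (size_t p = 0; p < bitmap_pages_for(zone_pfns[z]); p++)
            free(bm[z][p]);
        free(bm[z]);
    }
    free(bm);
}

/* Build bitmap[zone][page][word], one page-sized allocation at a time. */
static unsigned long ***bitmap_alloc(int zones, const unsigned long *zone_pfns)
{
    unsigned long ***bm = calloc((size_t)zones, sizeof(*bm));

    if (!bm)
        return NULL;
    for (int z = 0; z < zones; z++) {
        size_t pages = bitmap_pages_for(zone_pfns[z]);

        bm[z] = calloc(pages, sizeof(*bm[z]));
        if (!bm[z])
            goto fail;
        for (size_t p = 0; p < pages; p++) {
            bm[z][p] = calloc(1, SKETCH_PAGE_SIZE);
            if (!bm[z][p])
                goto fail;
        }
    }
    return bm;
fail:
    bitmap_free(bm, zones, zone_pfns);
    return NULL;
}

/* pfn is the page frame number relative to the start of the zone. */
static void bitmap_set(unsigned long ***bm, int zone, unsigned long pfn)
{
    unsigned long bit = pfn % BITS_PER_PAGE;

    bm[zone][pfn / BITS_PER_PAGE][bit / BITS_PER_WORD] |=
        1UL << (bit % BITS_PER_WORD);
}

static int bitmap_test(unsigned long ***bm, int zone, unsigned long pfn)
{
    unsigned long bit = pfn % BITS_PER_PAGE;

    return (int)((bm[zone][pfn / BITS_PER_PAGE][bit / BITS_PER_WORD]
                  >> (bit % BITS_PER_WORD)) & 1UL);
}
```

Because the size of the structure depends only on the number of page frames
in each zone, marking more or fewer pages never changes the memory it needs.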
270     +
271     + b) Extents for block data.
272     +
273     + Suspend2 supports writing the image to multiple block devices. In the case
274     + of swap, multiple partitions and/or files may be in use, and we happily use
275     + them all. This is accomplished as follows:
276     +
277     + Whatever the actual source of the allocated storage, the destination of the
278     + image can be viewed in terms of one or more block devices, and on each
279     + device, a list of sectors. To simplify matters, we only use contiguous,
280     + PAGE_SIZE aligned sectors, like the swap code does.
281     +
282     + Since sector numbers on each bdev may well not start at 0, it makes much
283     + more sense to use extents here. Contiguous ranges of pages can thus be
284     + represented in the extents by contiguous values.
285     +
286     + Variations in block size are taken account of in transforming this data
287     + into the parameters for bio submission.
288     +
289     + We can thus implement a layer of abstraction wherein the core of Suspend2
290     + doesn't have to worry about which device we're currently writing to or
291     + where in the device we are. It simply requests that the next page in the
292     + pageset or header be written, leaving the details to this lower layer.
293     + The lower layer remembers where in the sequence of devices and blocks each
294     + pageset starts. The header always starts at the beginning of the allocated
295     + storage.
296     +
297     + So extents are:
298     +
299     + struct extent {
300     + unsigned long minimum, maximum;
301     + struct extent *next;
302     + };
303     +
304     + These are combined into chains of extents for a device:
305     +
306     + struct extent_chain {
307     + int size; /* size of the extent ie sum (max-min+1) */
308     + int allocs, frees;
309     + char *name;
310     + struct extent *first, *last_touched;
311     + };
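
As a hedged illustration of how contiguous values collapse into extents, the
userspace sketch below appends values to a chain and extends the last extent
whenever the new value follows on directly. chain_append() is a hypothetical
helper, and it assumes that last_touched always points at the tail of the
chain, which need not be how the patch uses that field.

```c
#include <stdlib.h>

struct extent {
    unsigned long minimum, maximum;
    struct extent *next;
};

struct extent_chain {
    int size;        /* sum of (maximum - minimum + 1) over all extents */
    int allocs, frees;
    char *name;
    struct extent *first, *last_touched;
};

/*
 * Append one value to the end of the chain. Assumes values arrive in
 * increasing order and that last_touched is the tail (sketch only).
 * Returns 0 on success, -1 on allocation failure.
 */
static int chain_append(struct extent_chain *chain, unsigned long value)
{
    struct extent *last = chain->last_touched;

    if (last && last->maximum + 1 == value) {
        last->maximum = value;          /* contiguous: grow the extent */
    } else {
        struct extent *e = malloc(sizeof(*e));

        if (!e)
            return -1;
        e->minimum = e->maximum = value;
        e->next = NULL;
        if (last)
            last->next = e;
        else
            chain->first = e;
        chain->last_touched = e;
        chain->allocs++;
    }
    chain->size++;
    return 0;
}
```

Appending 10, 11, 12, 20 and 21 in that order leaves two extents, 10-12 and
20-21, and a chain size of 5.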
312     +
313     + For each bdev, we need to store a little more info:
314     +
315     + struct suspend_bdev_info {
316     + struct block_device *bdev;
317     + dev_t dev_t;
318     + int bmap_shift;
319     + int blocks_per_page;
320     + };
321     +
322     + The dev_t is used to identify the device in the stored image. As a result,
323     + we expect devices at resume time to have the same major and minor numbers
324     + as they had while suspending. This is primarily a concern where the user
325     + utilises LVM for storage, as they will need to dmsetup their partitions in
326     + such a way as to maintain this consistency at resume time.
327     +
328     + bmap_shift and blocks_per_page record the effects of variations in
329     + blocks per page settings for the filesystem and underlying bdev. For most
330     + filesystems, these are the same, but for xfs, they can have independent
331     + values.
332     +
333     + Combining these two structures together, we have everything we need to
334     + record what devices and what blocks on each device are being used to
335     + store the image, and to submit i/o using submit_bio.
336     +
337     + The last elements in the picture are a means of recording how the storage
338     + is being used.
339     +
340     + We do this first and foremost by implementing a layer of abstraction on
341     + top of the devices and extent chains which allows us to view however many
342     + devices there might be as one long storage tape, with a single 'head' that
343     + tracks a 'current position' on the tape:
344     +
345     + struct extent_iterate_state {
346     + struct extent_chain *chains;
347     + int num_chains;
348     + int current_chain;
349     + struct extent *current_extent;
350     + unsigned long current_offset;
351     + };
352     +
353     + That is, *chains points to an array of size num_chains of extent chains.
354     + For the filewriter, this is always a single chain. For the swapwriter, the
355     + array is of size MAX_SWAPFILES.
356     +
357     + current_chain, current_extent and current_offset thus point to the current
358     + index in the chains array (and into a matching array of struct
359     + suspend_bdev_info), the current extent in that chain (to optimise access),
360     + and the current value in the offset.
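
The 'tape head' idea can be sketched in userspace C as below. The structures
are copied from the text; state_enter_chain(), state_rewind() and
state_advance() are hypothetical names for this example and are not claimed
to match the routines in the patch.

```c
#include <stddef.h>

struct extent {
    unsigned long minimum, maximum;
    struct extent *next;
};

struct extent_chain {
    int size;
    int allocs, frees;
    char *name;
    struct extent *first, *last_touched;
};

struct extent_iterate_state {
    struct extent_chain *chains;
    int num_chains;
    int current_chain;
    struct extent *current_extent;
    unsigned long current_offset;
};

/* Move the head to the first value of the first non-empty chain at or
 * after index 'from'. Returns 0, or -1 if no such chain exists. */
static int state_enter_chain(struct extent_iterate_state *s, int from)
{
    for (int i = from; i < s->num_chains; i++) {
        if (s->chains[i].first) {
            s->current_chain = i;
            s->current_extent = s->chains[i].first;
            s->current_offset = s->current_extent->minimum;
            return 0;
        }
    }
    s->current_extent = NULL;
    return -1;
}

static int state_rewind(struct extent_iterate_state *s)
{
    return state_enter_chain(s, 0);
}

/* Advance the head by one value. Returns 0, or -1 at the end of the tape. */
static int state_advance(struct extent_iterate_state *s)
{
    if (!s->current_extent)
        return -1;
    if (s->current_offset < s->current_extent->maximum) {
        s->current_offset++;                       /* same extent */
        return 0;
    }
    if (s->current_extent->next) {                 /* next extent, same chain */
        s->current_extent = s->current_extent->next;
        s->current_offset = s->current_extent->minimum;
        return 0;
    }
    return state_enter_chain(s, s->current_chain + 1);  /* next device */
}
```

The caller only ever asks for the next position; which device and which
extent that position lives in stays hidden inside the iterator.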
361     +
362     + The image is divided into three parts:
363     + - The header
364     + - Pageset 1
365     + - Pageset 2
366     +
367     + The header always starts at the first device and first block. We know its
368     + size before we begin to save the image because we carefully account for
369     + everything that will be stored in it.
370     +
371     + The second pageset (LRU) is stored first. It begins on the next page after
372     + the end of the header.
373     +
374     + The first pageset is stored second. Its start location is only known once
375     + pageset2 has been saved, since pageset2 may be compressed as it is written.
376     + This location is thus recorded at the end of saving pageset2. It is page
377     + aligned also.
378     +
379     + Since this information is needed at resume time, and the location of extents
380     + in memory will differ at resume time, this needs to be stored in a portable
381     + way:
382     +
383     + struct extent_iterate_saved_state {
384     + int chain_num;
385     + int extent_num;
386     + unsigned long offset;
387     + };
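
One way to picture the portable form is to replace the extent pointer with
its index within the chain, then walk the chain again at resume time. The
sketch below does exactly that in userspace C. state_save() and
state_restore() are hypothetical names, and counting extent_num from zero is
an assumption of this example.

```c
#include <stddef.h>

struct extent {
    unsigned long minimum, maximum;
    struct extent *next;
};

struct extent_chain {
    int size;
    int allocs, frees;
    char *name;
    struct extent *first, *last_touched;
};

struct extent_iterate_state {
    struct extent_chain *chains;
    int num_chains;
    int current_chain;
    struct extent *current_extent;
    unsigned long current_offset;
};

struct extent_iterate_saved_state {
    int chain_num;
    int extent_num;
    unsigned long offset;
};

/* Record the head position without using any pointers. */
static void state_save(const struct extent_iterate_state *s,
                       struct extent_iterate_saved_state *out)
{
    int n = 0;

    for (struct extent *e = s->chains[s->current_chain].first;
         e && e != s->current_extent; e = e->next)
        n++;
    out->chain_num = s->current_chain;
    out->extent_num = n;               /* zero-based index (assumption) */
    out->offset = s->current_offset;
}

/* Re-derive the extent pointer from the saved indices.
 * Returns 0 on success, -1 if the saved position does not exist. */
static int state_restore(struct extent_iterate_state *s,
                         const struct extent_iterate_saved_state *in)
{
    struct extent *e;

    if (in->chain_num < 0 || in->chain_num >= s->num_chains)
        return -1;
    e = s->chains[in->chain_num].first;
    for (int n = 0; e && n < in->extent_num; n++)
        e = e->next;
    if (!e || in->offset < e->minimum || in->offset > e->maximum)
        return -1;
    s->current_chain = in->chain_num;
    s->current_extent = e;
    s->current_offset = in->offset;
    return 0;
}
```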
388     +
389     + We can thus implement a layer of abstraction wherein the core of Suspend2
390     + doesn't have to worry about which device we're currently writing to or
391     + where in the device we are. It simply requests that the next page in the
392     + pageset or header be written, leaving the details to this layer, and
393     + invokes the routines to remember and restore the position, without having
394     + to worry about the details of how the data is arranged on disk or such like.
395     +
396     + c) Modules
397     +
398     + One aim in designing Suspend2 was to make it flexible. We wanted to allow
399     + for the implementation of different methods of transforming a page to be
400     + written to disk and different methods of getting the pages stored.
401     +
402     + In early versions (the betas and perhaps Suspend1), compression support was
403     + inlined in the image writing code, and the data structures and code for
404     + managing swap were intertwined with the rest of the code. A number of people
405     + had expressed interest in implementing image encryption, and alternative
406     + methods of storing the image.
407     +
408     + In order to achieve this, Suspend2 was given a modular design.
409     +
410     + A module is a single file which encapsulates the functionality needed
411     + to transform a pageset of data (encryption or compression, for example),
412     + or to write the pageset to a device. The former type of module is called
413     + a 'page-transformer', the latter a 'writer'.
414     +
415     + Modules are linked together in pipeline fashion. There may be zero or more
416     + page transformers in a pipeline, and there is always exactly one writer.
417     + The pipeline follows this pattern:
418     +
419     + ---------------------------------
420     + | Suspend2 Core |
421     + ---------------------------------
422     + |
423     + |
424     + ---------------------------------
425     + | Page transformer 1 |
426     + ---------------------------------
427     + |
428     + |
429     + ---------------------------------
430     + | Page transformer 2 |
431     + ---------------------------------
432     + |
433     + |
434     + ---------------------------------
435     + | Writer |
436     + ---------------------------------
437     +
438     + During the writing of an image, the core code feeds pages one at a time
439     + to the first module. This module performs whatever transformations it
440     + implements on the incoming data, completely consuming the incoming data and
441     + feeding output in a similar manner to the next module. A module may buffer
442     + its output.
443     +
444     + During reading, the pipeline works in the reverse direction. The core code
445     + calls the first module with the address of a buffer which should be filled.
446     + (Note that the buffer size is always PAGE_SIZE at this time). This module
447     + will in turn request data from the next module and so on down until the
448     + writer is made to read from the stored image.
449     +
450     + Part of the definition of the structure of a module thus looks like this:
451     +
452     + int (*rw_init) (int rw, int stream_number);
453     + int (*rw_cleanup) (int rw);
454     + int (*write_chunk) (struct page *buffer_page);
455     + int (*read_chunk) (struct page *buffer_page, int sync);
456     +
457     + It should be noted that the _cleanup routine may be called before the
458     + full stream of data has been read or written. While writing the image,
459     + the user may (depending upon settings) choose to abort suspending, and
460     + if we are in the midst of writing the last portion of the image, a portion
461     + of the second pageset may be reread. This may also happen if an error
462     + occurs and we seek to abort the process of writing the image.
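
The pipeline can be imitated in userspace with one transformer and one
writer. This is a simplified sketch, not the patch's module interface: it
passes byte buffers instead of struct page, adds an explicit 'self'
argument, uses a 16-byte chunk in place of PAGE_SIZE, and uses a trivial XOR
as a stand-in for compression or encryption. All names are invented for the
example.

```c
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE 16   /* stands in for PAGE_SIZE */
#define MAX_CHUNKS 4    /* capacity of the toy in-memory 'device' */

struct module_sketch {
    struct module_sketch *next;  /* next module in the pipeline; NULL for the writer */
    int (*write_chunk)(struct module_sketch *self, const unsigned char *buf);
    int (*read_chunk)(struct module_sketch *self, unsigned char *buf);
    unsigned char key;                            /* transformer state */
    unsigned char store[MAX_CHUNKS][CHUNK_SIZE];  /* writer state */
    size_t written, read_pos;
};

/* Writer: the end of the pipeline, stores chunks in memory. */
static int writer_write(struct module_sketch *self, const unsigned char *buf)
{
    if (self->written >= MAX_CHUNKS)
        return -1;
    memcpy(self->store[self->written++], buf, CHUNK_SIZE);
    return 0;
}

static int writer_read(struct module_sketch *self, unsigned char *buf)
{
    if (self->read_pos >= self->written)
        return -1;
    memcpy(buf, self->store[self->read_pos++], CHUNK_SIZE);
    return 0;
}

/* Transformer: consumes the incoming chunk and feeds its output onward. */
static int xor_write(struct module_sketch *self, const unsigned char *buf)
{
    unsigned char out[CHUNK_SIZE];

    for (size_t i = 0; i < CHUNK_SIZE; i++)
        out[i] = buf[i] ^ self->key;
    return self->next->write_chunk(self->next, out);
}

/* On reading, the request travels down the pipeline and the data is
 * transformed back on the way up. */
static int xor_read(struct module_sketch *self, unsigned char *buf)
{
    int ret = self->next->read_chunk(self->next, buf);

    if (ret)
        return ret;
    for (size_t i = 0; i < CHUNK_SIZE; i++)
        buf[i] ^= self->key;
    return 0;
}
```

The core only ever talks to the first module; swapping the writer or adding
another transformer does not change the calling code.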
463     +
464     + The modular design is also useful in a number of other ways. It provides
465     + a means whereby we can add support for:
466     +
467     + - providing overall initialisation and cleanup routines;
468     + - serialising configuration information in the image header;
469     + - providing debugging information to the user;
470     + - determining memory and image storage requirements;
471     + - dis/enabling components at run-time;
472     + - configuring the module (see below);
473     +
474     + ...and routines for writers specific to their work:
475     + - Parsing a resume2= location;
476     + - Determining whether an image exists;
477     + - Marking a resume as having been attempted;
478     + - Invalidating an image;
479     +
480     + Since some parts of the core - the user interface and storage manager
481     + support - have use for some of these functions, they are registered as
482     + 'miscellaneous' modules as well.
483     +
484     + d) Sysfs data structures.
485     +
486     + This brings us naturally to support for configuring Suspend2. We desired to
487     + provide a way to make Suspend2 as flexible and configurable as possible.
488     + The user shouldn't have to reboot just because they want to now suspend to
489     + a file instead of a partition, for example.
490     +
491     + To accomplish this, Suspend2 implements a very generic means whereby the
492     + core and modules can register new sysfs entries. All Suspend2 entries use
493     + a single _store and _show routine, both of which are found in sysfs.c in
494     + the kernel/power directory. These routines handle the most common operations
495     + - getting and setting the values of bits, integers, longs, unsigned longs
496     + and strings in one place, and allow overrides for customised get and set
497     + options as well as side-effect routines for all reads and writes.
498     +
499     + When combined with some simple macros, a new sysfs entry can then be defined
500     + in just a couple of lines:
501     +
502     + { SUSPEND2_ATTR("progress_granularity", SYSFS_RW),
503     + SYSFS_INT(&progress_granularity, 1, 2048)
504     + },
505     +
506     + This defines a sysfs entry named "progress_granularity" which is rw and
507     + allows the user to access an integer stored at &progress_granularity, giving
508     + it a value between 1 and 2048 inclusive.
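
What a generic, range-checked integer store might look like can be sketched
in userspace C. This is not the code in sysfs.c: the struct and function
names are invented, and rejecting out-of-range input (rather than, say,
clamping it) is an assumption of the example.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical description of one integer entry. */
struct int_attr_sketch {
    const char *name;
    int *value;            /* where the setting lives */
    int minimum, maximum;  /* inclusive bounds */
};

/*
 * Parse 'buf' and store it if it lies within the bounds.
 * Returns 0 on success, -EINVAL for unparsable input and -ERANGE for
 * values outside [minimum, maximum].
 */
static int int_attr_store(const struct int_attr_sketch *attr, const char *buf)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(buf, &end, 10);
    if (end == buf || errno == ERANGE)
        return -EINVAL;
    if (*end != '\0' && *end != '\n')   /* allow the newline that echo adds */
        return -EINVAL;
    if (v < attr->minimum || v > attr->maximum)
        return -ERANGE;
    *attr->value = (int)v;
    return 0;
}
```

With bounds of 1 and 2048, as in the progress_granularity entry above,
writing "2048" would be accepted and writing "0" would be refused.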
509     +
510     + Sysfs entries are registered under /sys/power/suspend2, and entries for
511     + modules are located in a subdirectory named after the module.
512     +
513     diff --git a/Documentation/power/suspend2.txt b/Documentation/power/suspend2.txt
514     new file mode 100644
515     index 0000000..b5a8edb
516     --- /dev/null
517     +++ b/Documentation/power/suspend2.txt
518     @@ -0,0 +1,713 @@
519     + --- Suspend2, version 2.2 ---
520     +
521     +1. What is it?
522     +2. Why would you want it?
523     +3. What do you need to use it?
524     +4. Why not just use the version already in the kernel?
525     +5. How do you use it?
526     +6. What do all those entries in /sys/power/suspend2 do?
527     +7. How do you get support?
528     +8. I think I've found a bug. What should I do?
529     +9. When will XXX be supported?
530     +10. How does it work?
531     +11. Who wrote Suspend2?
532     +
533     +1. What is it?
534     +
535     + Imagine you're sitting at your computer, working away. For some reason, you
536     + need to turn off your computer for a while - perhaps it's time to go home
537     + for the day. When you come back to your computer next, you're going to want
538     + to carry on where you left off. Now imagine that you could push a button and
539     + have your computer store the contents of its memory to disk and power down.
540     + Then, when you next start up your computer, it loads that image back into
541     + memory and you can carry on from where you were, just as if you'd never
542     + turned the computer off. Far less time to start up, no reopening
543     + applications and finding what directory you put that file in yesterday.
544     + That's what Suspend2 does.
545     +
546     + Suspend2 has a long heritage. It began life as work by Gabor Kuti, who,
547     + with some help from Pavel Machek, got an early version going in 1999. The
548     + project was then taken over by Florent Chabaud while still in alpha version
549     + numbers. Nigel Cunningham came on the scene when Florent was unable to
550     + continue, moving the project into betas, then 1.0, 2.0 and so on up to
551     + the present 2.2 series. Pavel Machek's swsusp code, which was merged around
552     + 2.5.17 retains the original name, and was essentially a fork of the beta
553     + code until Rafael Wysocki came on the scene in 2005 and began to improve it
554     + further.
555     +
556     +2. Why would you want it?
557     +
558     + Why wouldn't you want it?
559     +
560     + Being able to save the state of your system and quickly restore it improves
561     + your productivity - you get a useful system in far less time than through
562     + the normal boot process.
563     +
564     +3. What do you need to use it?
565     +
566     + a. Kernel Support.
567     +
568     + i) The Suspend2 patch.
569     +
570     + Suspend2 is kernel code, but this version is not part of Linus's
571     + 2.6 tree at the moment, so you will need to download the kernel source and
572     + apply the latest patch. Having done that, enable the appropriate options in
573     + make [menu|x]config (under Power Management Options), compile and install your
574     + kernel. Suspend2 works with SMP, Highmem, preemption, x86-32, PPC and x86_64.
575     +
576     + Suspend2 patches are available from http://suspend2.net.
577     +
578     + ii) Compression and encryption support.
579     +
580     + Compression and encryption support are implemented via the
581     + cryptoapi. You will therefore want to select any Cryptoapi transforms that
582     + you want to use on your image from the Cryptoapi menu while configuring
583     + your kernel.
584     +
585     + You can also tell Suspend to write its image to an encrypted and/or
586     + compressed filesystem/swap partition. In that case, you don't need to do
587     + anything special for Suspend2 when it comes to kernel configuration.
588     +
589     + iii) Configuring other options.
590     +
591     + While you're configuring your kernel, try to configure as much as possible
592     + to build as modules. We recommend this because there are a number of drivers
593     + that are still in the process of implementing proper power management
594     + support. In those cases, the best way to work around their current lack is
595     + to build them as modules and remove the modules while suspending. You might
596     + also bug the driver authors to get their support up to speed, or even help!
597     +
598     + b. Storage.
599     +
600     + i) Swap.
601     +
602     + Suspend2 can store the suspend image in your swap partition, a swap file or
603     + a combination thereof. Whichever combination you choose, you will probably
604     + want to create enough swap space to store the largest image you could have,
605     + plus the space you'd normally use for swap. A good rule of thumb would be
606     + to calculate the amount of swap you'd want without using Suspend2, and then
607     + add the amount of memory you have. This swapspace can be arranged in any way
608     + you'd like. It can be in one partition or file, or spread over a number. The
609     + only requirement is that they be active when you start a suspend cycle.
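As a rough illustration of the rule of thumb above, the following shell sketch adds your installed memory to the swap you would want anyway. The function names are made up for this example, and how much 'normal' swap you want is your own call.

```shell
# Print the MemTotal value (in kB) from /proc/meminfo-style text on stdin.
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }'
}

# recommended_swap_kb NORMAL_SWAP_KB MEM_TOTAL_KB
# Rule of thumb from the text: swap you'd want anyway + installed memory.
recommended_swap_kb() {
    echo $(( $1 + $2 ))
}
```

For example, recommended_swap_kb 1048576 "$(mem_total_kb < /proc/meminfo)" prints a suggested total in kB, assuming you would normally want 1GB of swap.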
610     +
611     + There is one exception to this requirement. Suspend2 has the ability to turn
612     + on one swap file or partition at the start of suspending and turn it back off
613     + at the end. If you want to ensure you have enough memory to store an image
614     + when your memory is fully used, you might want to make one swap partition or
615     + file for 'normal' use, and another for Suspend2 to activate & deactivate
616     + automatically. (Further details below).
617     +
618     + ii) Normal files.
619     +
620     + Suspend2 includes a 'filewriter'. The filewriter can store your image in a
621     + simple file. Since Linux has the idea of everything being a file, this is
622     + more powerful than it initially sounds. If, for example, you were to set up
623     + a network block device file, you could suspend to a network server. This has
624     + been tested and works to a point, but nbd itself isn't stateless enough for
625     + our purposes.
626     +
627     + Take extra care when setting up the filewriter. If you just type commands
628     + without thinking and then try to suspend, you could cause irreversible
629     + corruption on your filesystems! Make sure you have backups.
630     +
631     + Most people will only want to suspend to a local file. To achieve that, do
632     + something along the lines of:
633     +
634     + echo "Suspend2" > /suspend-file
635     + dd if=/dev/zero bs=1M count=512 >> /suspend-file
636     +
637     + This will create a 512MB file called /suspend-file. To get Suspend2 to use
638     + it:
639     +
640     + echo /suspend-file > /sys/power/suspend2/filewriter/target
641     +
642     + Then
643     +
644     + cat /sys/power/suspend2/resume2
645     +
646     + Put the results of this into your bootloader's configuration (see also step
647     + c, below):
648     +
649     + ---EXAMPLE-ONLY-DON'T-COPY-AND-PASTE---
650     + # cat /sys/power/suspend2/resume2
651     + file:/dev/hda2:0x1e001
652     +
653     + In this example, we would edit the append= line of our lilo.conf|menu.lst
654     + so that it included:
655     +
656     + resume2=file:/dev/hda2:0x1e001
657     + ---EXAMPLE-ONLY-DON'T-COPY-AND-PASTE---
658     +
659     + For those who are thinking 'Could I make the file sparse?', the answer is
660     + 'No!'. At the moment, there is no way for Suspend2 to fill in the holes in
661     + a sparse file while suspending. In the longer term (post merge!), I'd like
662     + to change things so that the file could be dynamically resized as needed.
663     + Right now, however, that's not possible and not a priority.
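The two commands used earlier to create the suspend file can be wrapped up so that the file is always written out in full rather than left sparse. This is only a sketch: make_suspend_file is a hypothetical helper, it overwrites whatever is at the given path, and the resulting file is a few bytes larger than the requested size because of the signature line.

```shell
# make_suspend_file PATH SIZE_MB
# Writes the "Suspend2" signature line, then appends SIZE_MB megabytes of
# real zeroes read from /dev/zero, so the file is written without holes.
# WARNING: overwrites PATH if it already exists.
make_suspend_file() {
    echo "Suspend2" > "$1" || return 1
    dd if=/dev/zero bs=1048576 count="$2" >> "$1" 2>/dev/null
}
```

Running make_suspend_file /suspend-file 512 (as root) would give the same result as the commands shown above. Double check the path before you run it.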
664     +
665     + c. Bootloader configuration.
666     +
667     + Using Suspend2 also requires that you add an extra parameter to
668     + your lilo.conf or equivalent. Here's an example for a swap partition:
669     +
670     + append="resume2=swap:/dev/hda1"
671     +
672     + This would tell Suspend2 that /dev/hda1 is a swap partition you
673     + have. Suspend2 will use the swap signature of this partition as a
674     + pointer to your data when you suspend. This means that (in this example)
675     + /dev/hda1 doesn't need to be _the_ swap partition where all of your data
676     + is actually stored. It just needs to be a swap partition that has a
677     + valid signature.
678     +
679     + You don't need to have a swap partition for this purpose. Suspend2
680     + can also use a swap file, but usage is a little more complex. Having made
681     + your swap file, turn it on and do
682     +
683     + cat /sys/power/suspend2/swapwriter/headerlocations
684     +
685     + (this assumes you've already compiled your kernel with Suspend2
686     + support and booted it). The results of the cat command will tell you
687     + what you need to put in lilo.conf:
688     +
689     + For swap partitions like /dev/hda1, simply use resume2=/dev/hda1.
690     + For swapfile `swapfile`, use resume2=swap:/dev/hda2:0x242d.
691     +
692     + If the swapfile changes for any reason (it is moved to a different
693     + location, it is deleted and recreated, or the filesystem is
694     + defragmented) then you will have to check
695     + /sys/power/suspend2/swapwriter/headerlocations for a new resume_block value.
696     +
697     + Once you've compiled and installed the kernel and adjusted your bootloader
698     + configuration, you should only need to reboot for the most basic part
699     + of Suspend2 to be ready.
700     +
701     + If you only compile in the swapwriter, or only compile in the filewriter,
702     + you don't need to add the "swap:" part of the resume2= parameters above.
703     + resume2=/dev/hda2:0x242d will work just as well.
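If you want to check a resume2= value in a script before putting it in your bootloader configuration, the forms shown above can be pulled apart with a few lines of shell. This is a sketch only: parse_resume2 is a hypothetical name, and it knows only about the "swap:" and "file:" prefixes that appear in this document.

```shell
# parse_resume2 VALUE
# Splits a resume2= value into writer, device and (optional) block.
# The writer prefix may be omitted, as described in the text.
parse_resume2() {
    value=$1
    writer=default
    case $value in
        swap:*|file:*)
            writer=${value%%:*}     # text before the first colon
            value=${value#*:}       # everything after it
            ;;
    esac
    device=${value%%:*}
    block=none
    case $value in
        *:*) block=${value#*:} ;;   # a header location was given
    esac
    echo "writer=$writer device=$device block=$block"
}
```

parse_resume2 swap:/dev/hda2:0x242d prints "writer=swap device=/dev/hda2 block=0x242d", while parse_resume2 /dev/hda1 reports the default writer and no block.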
704     +
705     + d. The hibernate script.
706     +
707     + Since the driver model in 2.6 kernels is still being developed, you may need
708     + to do more, however. Users of Suspend2 usually start the process via a script
709     + which prepares for the suspend, tells the kernel to do its stuff and then
710     + restores things afterwards. This script might involve:
711     +
712     + - Switching to a text console and back if X doesn't like the video card
713     + status on resume.
714     + - Un/reloading PCMCIA support since it doesn't play well with suspend.
715     +
716     + Note that you might not be able to unload some drivers if there are
717     + processes using them. You might have to kill off processes that hold
718     + devices open. Hint: if your X server accesses a USB mouse, doing a
719     + 'chvt' to a text console releases the device and you can unload the
720     + module.
721     +
722     + Check out the latest script (available on suspend2.net).
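To give an idea of what such a script does, here is a deliberately tiny sketch of the steps listed above. It is not the hibernate script and does far less. The module list and console numbers are assumptions you would need to adjust, and setting DRY_RUN makes it print the steps instead of running them.

```shell
SUSPEND2_DIR=${SUSPEND2_DIR:-/sys/power/suspend2}
MODULES=${MODULES:-pcmcia}   # assumption: modules that must be unloaded
TEXT_VT=${TEXT_VT:-1}        # assumption: a text console
X_VT=${X_VT:-7}              # assumption: the console X runs on

# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Writing anything to do_suspend starts the cycle.
start_suspend() {
    echo > "$SUSPEND2_DIR/do_suspend"
}

# Switch to a text console, unload modules, suspend, then undo it all.
suspend_cycle() {
    run chvt "$TEXT_VT"
    for m in $MODULES; do run modprobe -r "$m"; done
    run start_suspend
    for m in $MODULES; do run modprobe "$m"; done
    run chvt "$X_VT"
}
```

Try DRY_RUN=1 first and read the output before letting it loose on a real system. Everything after start_suspend runs only once the machine has resumed (or the cycle was aborted).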
723     +
724     +4. Why not just use the version already in the kernel?
725     +
726     + The version in the vanilla kernel has a number of drawbacks. Among these:
727     + - it has a maximum image size of 1/2 total memory.
728     + - it doesn't allocate storage until after it has snapshotted memory.
729     + This means that you can't be sure suspending will work until you
730     + see it start to write the image.
731     + - it performs all of its I/O synchronously.
732     + - it does not allow you to press escape to cancel a cycle
733     + - it does not allow you to automatically swapon a file when
734     + starting a cycle.
735     + - it does not allow you to use multiple swap partitions.
736     + - it does not allow you to use swapfiles.
737     + - it does not allow you to use ordinary files.
738     + - it just invalidates an image and continues to boot if you
739     + accidentally boot the wrong kernel after suspending.
740     + - it doesn't support any sort of nice display while suspending
741     + - it is moving toward requiring that you have an initrd/initramfs
742     + to ever have a hope of resuming (uswsusp). While uswsusp will
743     + address some of the concerns above, it won't address all, and
744     + will be more complicated to get set up.
745     +
746     +5. How do you use it?
747     +
748     + A suspend cycle can be started directly by doing:
749     +
750     + echo > /sys/power/suspend2/do_suspend
751     +
752     + In practice, though, you'll probably want to use the hibernate script
753     + to unload modules, configure the kernel the way you like it and so on.
754     + In that case, you'd do (as root):
755     +
756     + hibernate
757     +
758     + See the hibernate script's man page for more details on the options it
759     + takes.
760     +
761     + If you're using the text or splash user interface modules, one neat feature
762     + of Suspend2 that you might find useful is that you can press Escape at any
763     + time during suspending, and the process will be aborted.
764     +
765     + Due to the way suspend works, this means you'll have your system back and
766     + perfectly usable almost instantly. The only exception is when it's at the
767     + very end of writing the image. Then it will need to reload a small portion
768     + (usually 4-50MB, depending upon the image characteristics) first.
769     +
770     + If you run into problems with resuming, adding the "noresume2" option to
771     + the kernel command line will let you skip the resume step and recover your
772     + system.
773     +
774     +6. What do all those entries in /sys/power/suspend2 do?
775     +
776     + /sys/power/suspend2 is the directory which contains files you can use to
777     + tune and configure Suspend2 to your liking. The exact contents of
778     + the directory will depend upon the version of Suspend2 you're
779     + running and the options you selected at compile time. In the following
780     + descriptions, names in brackets refer to compile time options.
781     + (Note that they're all dependent upon you having selected CONFIG_SUSPEND2
782     + in the first place!).
783     +
784     + Since the values of these settings can open potential security risks, they
785     + are usually accessible only to the root user. You can, however, enable a
786     + compile time option which makes all of these files world-accessible. This
787     + should only be done if you trust everyone with shell access to this
788     + computer!
789     +
790     + - checksum/enabled
791     +
792     + Use cryptoapi hashing routines to verify that Pageset2 pages don't change
793     + while we're saving the first part of the image, and to get any pages that
794     + do change resaved in the atomic copy. This should normally not be needed,
795     + but if you're seeing issues, please enable this. If your issues stop you
796     + being able to resume, enable this option, suspend and cancel the cycle
797     + after the atomic copy is done. If the debugging info shows a non-zero
798     + number of pages resaved, please report this to Nigel.
799     +
800     + - compression/algorithm
801     +
802     + Set the cryptoapi algorithm used for compressing the image.
803     +
804     + - compression/expected_compression
805     +
806     + This value allows you to set an expected compression ratio, which Suspend2
807     + will use in calculating whether it meets constraints on the image
808     + size. If this expected compression ratio is not attained, the suspend will
809     + abort, so it is wise to allow some spare. You can see what compression
810     + ratio is achieved in the logs after suspending.
811     +
812     + - debug_info:
813     +
814     + This file returns information about your configuration that may be helpful
815     + in diagnosing problems with suspending.
816     +
817     + - do_resume:
818     +
819     + When anything is written to this file suspend will attempt to read and
820     + restore an image. If there is no image, it will return almost immediately.
821     + If an image exists, the echo > will never return. Instead, the original
822     + kernel context will be restored and the original echo > do_suspend will
823     + return.
824     +
825     + - do_suspend:
826     +
827     + When anything is written to this file, the kernel side of Suspend2 will
828     + begin to attempt to write an image to disk and power down. You'll normally
829     + want to run the hibernate script instead, to get modules unloaded first.
830     +
831     + - driver_model_beeping
832     +
833     + Enable beeping when suspending and resuming the drivers. Might help with
834     + determining where a problem in resuming occurs.
835     +
836     + - */enabled
837     +
838     + These options can be used to temporarily disable various parts of suspend.
839     +
840     + - encryption/*
841     +
842     + The iv, key, save_key_and_iv, mode and algorithm values allow you to
843     + select a cryptoapi encryption algorithm, set the iv and key and whether
844     + they are saved in the image header. Saving the iv and key in the image
845     + header is of course less secure than having them on some external device,
846     + such as a USB key. If you want to use a USB key, you'll need to write
847     + some scripting in your initrd/ramfs to retrieve the key & iv from your
848     + USB key and put them into the entries again prior to doing the echo to
849     + do_resume.
850     +
851     + - extra_pages_allowance
852     +
853     + When Suspend2 does its atomic copy, it calls the driver model suspend
854     + and resume methods. If you have DRI enabled with a driver such as fglrx,
855     + this can result in the driver allocating a substantial amount of memory
856     + for storing its state. Extra_pages_allowance tells suspend2 how much
857     + extra memory it should ensure is available for those allocations. If
858     + your attempts at suspending end with a message in dmesg indicating that
859     + insufficient extra pages were allowed, you need to increase this value.
860     +
861     + - filewriter/target:
862     +
863     + Read this value to get the current setting. Write to it to point Suspend
864     + at a new storage location for the filewriter. See above for details of how
865     + to set up the filewriter.
866     +
867     + - freezer_test
868     +
869     + This entry can be used to get Suspend2 to just test the freezer without
870     + actually doing a suspend cycle. It is useful for diagnosing freezing
871     + issues.
872     +
873     + - image_exists:
874     +
875     + Can be used in a script to determine whether a valid image exists at the
876     + location currently pointed to by resume2=. Returns up to three lines.
877     + The first is whether an image exists (-1 for unsure, otherwise 0 or 1).
878     + If an image exists, additional lines will return the machine and version.
879     + Echoing anything to this entry removes any current image.
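A script might interpret the first line of image_exists along these lines. This is a sketch: image_state is a made-up name, and the function reads the entry's contents from standard input so it can be tried without a Suspend2 kernel.

```shell
# image_state: reads image_exists-style output on stdin and reports
# what the first line means (1, 0 or -1 as described above).
image_state() {
    first=""
    IFS= read -r first || :
    case $first in
        1)  echo present ;;
        0)  echo absent ;;
        -1) echo unsure ;;
        *)  echo unknown ;;
    esac
}
```

On a running system you would call it as image_state < /sys/power/suspend2/image_exists.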
880     +
881     + - image_size_limit:
882     +
883     + The maximum size of suspend image written to disk, measured in megabytes
884     + (1024*1024).
885     +
886     + - interface_version:
887     +
888     + The value returned by this file can be used by scripts and configuration
889     + tools to determine what entries should be looked for. The value is
890     + incremented whenever an entry in /sys/power/suspend2 is obsoleted or
891     + added.
892     +
893     + - last_result:
894     +
895     + The result of the last suspend, as defined in
896     + include/linux/suspend-debug.h with the values SUSPEND_ABORTED to
897     + SUSPEND_KEPT_IMAGE. This is a bitmask.
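Because last_result is a bitmask, a script has to test individual bits rather than compare the whole number. The bit meanings live in include/linux/suspend-debug.h and are not repeated here, so this sketch only lists which bit positions are set. It assumes the value has been read as a plain integer, and set_bits is a hypothetical helper name.

```shell
# set_bits VALUE
# Prints the positions (0 = least significant) of the bits set in VALUE.
set_bits() {
    value=$(( $1 ))
    bit=0
    out=""
    while [ "$value" -gt 0 ]; do
        if [ $(( value & 1 )) -eq 1 ]; then
            out="$out${out:+ }$bit"
        fi
        value=$(( value >> 1 ))
        bit=$(( bit + 1 ))
    done
    echo "$out"
}
```

For instance, set_bits 5 prints "0 2". You would then look those positions up in the header for your version.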
898     +
899     + - log_everything (CONFIG_PM_DEBUG):
900     +
901     + Setting this option results in all messages printed being logged. Normally,
902     + only a subset are logged, so as to not slow the process and not clutter the
903     + logs. Useful for debugging. It can be toggled during a cycle by pressing
904     + 'L'.
905     +
906     + - pause_between_steps (CONFIG_PM_DEBUG):
907     +
908     + This option is used during debugging, to make Suspend2 pause between
909     + each step of the process. It is ignored when the nice display is on.
910     +
911     + - powerdown_method:
912     +
913     + Used to select a method by which Suspend2 should powerdown after writing the
914     + image. Currently:
915     +
916     + 0: Don't use ACPI to power off.
917     + 3: Attempt to enter Suspend-to-ram.
918     + 4: Attempt to enter ACPI S4 mode.
919     + 5: Attempt to power down via ACPI S5 mode.
920     +
921     + Note that these options are highly dependent upon your hardware & software:
922     +
923     + 3: When successful, your machine suspends-to-ram instead of powering off.
924     + The advantage of using this mode is that it doesn't matter whether your
925     + battery has enough charge to make it through to your next resume. If it
926     + lasts, you will simply resume from suspend to ram (and the image on disk
927     + will be discarded). If the battery runs out, you will resume from disk
928     + instead. The disadvantage is that it takes longer than a normal
929     + suspend-to-ram to enter the state, since the suspend-to-disk image needs
930     + to be written first.
931     + 4/5: When successful, your machine will be off and consume (almost) no power.
932     + But it might still react to some external events like opening the lid or
933     + traffic on a network or usb device. For the bios, resume is then the same
934     + as warm boot, similar to a situation where you used the command `reboot'
935     + to reboot your machine. If your machine has problems on warm boot or if
936     + you want to protect your machine with the bios password, this is probably
937     + not the right choice. Mode 4 may be necessary on some machines where ACPI
938     + wake up methods need to be run to properly reinitialise hardware after a
939     + suspend-to-disk cycle.
940     + 0: Switch the machine completely off. The only possible wakeup is the power
941     + button. For the bios, resume is then the same as a cold boot, in
942     + particular you would have to provide your bios boot password if your
943     + machine uses that feature for booting.
944     +
945     + - progressbar_granularity_limit:
946     +
947     + This option can be used to limit the granularity of the progress bar
948     + displayed with a bootsplash screen. The value is the maximum number of
949     + steps. That is, 10 will make the progress bar jump in 10% increments.
950     +
951     + - reboot:
952     +
953     + This option causes Suspend2 to reboot rather than powering down
954     + at the end of saving an image. It can be toggled during a cycle by pressing
955     + 'R'.
956     +
957     + - resume_commandline:
958     +
959     + This entry can be read after resuming to see the commandline that was used
960     + when resuming began. You might use this to set up two bootloader entries
961     + that are the same apart from the fact that one includes an extra append=
962     + argument "at_work=1". You could then grep resume_commandline in your
963     + post-resume scripts and configure networking (for example) differently
964     + depending upon whether you're at home or work. resume_commandline can be
965     + set to arbitrary text if you wish to remove sensitive contents.
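The at_work=1 idea could be used from a post-resume script roughly as follows. This is a sketch under the assumptions of the example above: at_work=1 is just the made-up argument from the text, and location_from_cmdline is a hypothetical helper that reads the command line from standard input.

```shell
# location_from_cmdline: prints "work" if the command line on stdin
# contains the argument at_work=1, and "home" otherwise.
location_from_cmdline() {
    if tr ' ' '\n' | grep -qx 'at_work=1'; then
        echo work
    else
        echo home
    fi
}
```

After resuming, location_from_cmdline < /sys/power/suspend2/resume_commandline tells your script which network setup to apply.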
966     +
967     + - swapwriter/swapfilename:
968     +
969     + This entry is used to specify the swapfile or partition that
970     + Suspend2 will attempt to swapon/swapoff automatically. Thus, if
971     + I normally use /dev/hda1 for swap, and want to use /dev/hda2 specifically
972     + for my suspend image, I would
973     +
974     + echo /dev/hda2 > /sys/power/suspend2/swapwriter/swapfilename
975     +
976     + /dev/hda2 would then be automatically swapon'd and swapoff'd. Note that the
977     + swapon and swapoff occur while other processes are frozen (including kswapd)
978     + so this swap file will not be used up when attempting to free memory. The
979     + partition/file is also given the highest priority, so other swapfiles/partitions
980     + will only be used to save the image when this one is filled.
981     +
982     + The value of this file is used by headerlocations along with any currently
983     + activated swapfiles/partitions.
984     +
985     + - swapwriter/headerlocations:
986     +
987     + This option tells you the resume2= options to use for swap devices you
988     + currently have activated. It is particularly useful when you only want to
989     + use a swap file to store your image. See above for further details.
990     +
991     + - toggle_process_nofreeze
992     +
993     + This entry can be used to toggle the NOFREEZE flag on a process, to allow it
994     + to run during Suspending. It should be used with extreme caution. There are
995     + strict limitations on what a process running during suspend can do. This is
996     + really only intended for use by Suspend's helpers (userui in particular).
997     +
998     + - userui_program
999     +
1000     + This entry is used to tell Suspend what userspace program to use for
1001     + providing a user interface while suspending. The program uses a netlink
1002     + socket to pass messages back and forward to the kernel, allowing all of the
1003     + functions formerly implemented in the kernel user interface components.
1004     +
1005     + - user_interface/debug_sections (CONFIG_PM_DEBUG):
1006     +
1007     + This value, together with the console log level, controls what debugging
1008     + information is displayed. The console log level determines the level of
1009     + detail, and this value determines what detail is displayed. This value is
1010     + a bit vector, and the meaning of the bits can be found in the kernel tree
1011     + in include/linux/suspend2.h. It can be overridden using the kernel's
1012     + command line option suspend_dbg.
1013     +
1014     + - user_interface/default_console_level (CONFIG_PM_DEBUG):
1015     +
1016     + This determines the value of the console log level at the start of a
1017     + suspend cycle. If debugging is compiled in, the console log level can be
1018     + changed during a cycle by pressing the digit keys. Meanings are:
1019     +
1020     + 0: Nice display.
1021     + 1: Nice display plus numerical progress.
1022     + 2: Errors only.
1023     + 3: Low level debugging info.
1024     + 4: Medium level debugging info.
1025     + 5: High level debugging info.
1026     + 6: Verbose debugging info.
1027     +
1028     + - user_interface/enable_escape:
1029     +
1030     + Setting this to "1" will enable you to abort a suspend by
1031     + pressing escape; "0" (default) disables this feature. Note that enabling
1032     + this option means that you cannot initiate a suspend and then walk away
1033     + from your computer, expecting it to be secure. With this feature disabled,
1034     + you can validly have this expectation once Suspend begins to write the
1035     + image to disk. (Prior to this point, it is possible that Suspend might
1036     + abort because of failure to freeze all processes or because constraints
1037     + on its ability to save the image are not met).
1038     +
1039     + - version:
1040     +
1041     + The version of suspend you have compiled into the currently running kernel.
1042     +
1043     +7. How do you get support?
1044     +
1045     + Glad you asked. Suspend2 is being actively maintained and supported
1046     + by Nigel (the guy doing most of the kernel coding at the moment), Bernard
1047     + (who maintains the hibernate script and userspace user interface components)
1048     + and its users.
1049     +
1050     + Resources available include HowTos, FAQs and a Wiki, all available via
1051     + suspend2.net. You can find the mailing lists there.
1052     +
1053     +8. I think I've found a bug. What should I do?
1054     +
1055     + By far and away, the most common problems people have with suspend2 are
1056     + related to drivers not having adequate power management support. In this
1057     + case, it is not a bug with suspend2, but we can still help you. As we
1058     + mentioned above, such issues can usually be worked around by building the
1059     + functionality as modules and unloading them while suspending. Please visit
1060     + the Wiki for up-to-date lists of known issues and work arounds.
1061     +
1062     + If this information doesn't help, try running:
1063     +
1064     + hibernate --bug-report
1065     +
1066     + ..and sending the output to the users mailing list.
1067     +
1068     + Good information on how to provide us with useful information from an
1069     + oops is found in the file REPORTING-BUGS, in the top level directory
1070     + of the kernel tree. If you get an oops, please especially note the
1071     + information about running what is printed on the screen through ksymoops.
1072     + The raw information is useless.
1073     +
1074     +9. When will XXX be supported?
1075     +
1076     + If there's a feature missing from Suspend2 that you'd like, feel free to
1077     + ask. We try to be obliging, within reason.
1078     +
1079     + Patches are welcome. Please send to the list.
1080     +
1081     +10. How does it work?
1082     +
1083     + Suspend2 does its work in a number of steps.
1084     +
1085     + a. Freezing system activity.
1086     +
1087     + The first main stage in suspending is to stop all other activity. This is
1088     + achieved in stages. Processes are considered in four groups, which we will
1089     + describe in reverse order for clarity's sake: Threads with the PF_NOFREEZE
1090     + flag, kernel threads without this flag, userspace processes with the
1091     + PF_SYNCTHREAD flag and all other processes. The first set (PF_NOFREEZE) are
1092     + untouched by the refrigerator code. They are allowed to run during suspending
1093     + and resuming, and are used to support user interaction, storage access or the
1094     + like. Other kernel threads (those unneeded while suspending) are frozen last.
1095     + This leaves us with userspace processes that need to be frozen. When a
1096     + process enters one of the *_sync system calls, we set a PF_SYNCTHREAD flag on
1097     + that process for the duration of that call. Processes that have this flag are
1098     + frozen after processes without it, so that we can seek to ensure that dirty
1099     + data is synced to disk as quickly as possible in a situation where other
1100     + processes may be submitting writes at the same time. Freezing the processes
1101     + that are submitting data stops new I/O from being submitted. Syncthreads can
1102     + then cleanly finish their work. So the order is:
1103     +
1104     + - Userspace processes without PF_SYNCTHREAD or PF_NOFREEZE;
1105     + - Userspace processes with PF_SYNCTHREAD (they won't have NOFREEZE);
1106     + - Kernel processes without PF_NOFREEZE.
1107     +
1108     + b. Eating memory.
1109     +
1110     + For a successful suspend, you need to have enough disk space to store the
1111     + image and enough memory for the various limitations of Suspend2's
1112     + algorithm. You can also specify a maximum image size. In order to meet
1113     + those constraints, Suspend2 may 'eat' memory. If, after freezing
1114     + processes, the constraints aren't met, Suspend2 will thaw all the
1115     + other processes and begin to eat memory until its calculations indicate
1116     + the constraints are met. It will then freeze processes again and recheck
1117     + its calculations.
1118     +
1119     + c. Allocation of storage.
1120     +
1121     + Next, Suspend2 allocates the storage that will be used to save
1122     + the image.
1123     +
1124     + The core of Suspend2 knows nothing about how or where pages are stored. We
1125     + therefore request the active writer (remember you might have compiled in
1126     + more than one!) to allocate enough storage for our expected image size. If
1127     + this request cannot be fulfilled, we eat more memory and try again. If it
1128     + is fulfilled, we seek to allocate additional storage, just in case our
1129     + expected compression ratio (if any) isn't achieved. This time, however, we
1130     + just continue if we can't allocate enough storage.
1131     +
1132     + If these calls to our writer change the characteristics of the image such
1133     + that we haven't allocated enough memory, we also loop. (The writer may well
1134     + need to allocate space for its storage information).
1135     +
1136     + d. Write the first part of the image.
1137     +
1138     + Suspend2 stores the image in two sets of pages called 'pagesets'.
1139     + Pageset 2 contains pages on the active and inactive lists; essentially
1140     + the page cache. Pageset 1 contains all other pages, including the kernel.
1141     + We use two pagesets for one important reason: We need to make an atomic copy
1142     + of the kernel to ensure consistency of the image. Without a second pageset,
1143     + that would limit us to an image that was at most half the amount of memory
1144     + available. Using two pagesets allows us to store a full image. Since pageset
1145     + 2 pages won't be needed in saving pageset 1, we first save pageset 2 pages.
1146     + We can then make our atomic copy of the remaining pages using both pageset 2
1147     + pages and any other pages that are free. While saving both pagesets, we are
1148     + careful not to corrupt the image. Among other things, we use lowlevel block
1149     + I/O routines that don't change the pagecache contents.
1150     +
1151     + The next step, then, is writing pageset 2.
1152     +
1153     + e. Suspending drivers and storing processor context.
1154     +
1155     + Having written pageset2, Suspend2 calls the power management functions to
1156     + notify drivers of the suspend, and saves the processor state in preparation
1157     + for the atomic copy of memory we are about to make.
1158     +
1159     + f. Atomic copy.
1160     +
1161     + At this stage, everything else but the Suspend2 code is halted. Processes
1162     + are frozen or idling, drivers are quiesced and have stored (ideally and where
1163     + necessary) their configuration in memory we are about to atomically copy.
1164     + In our lowlevel architecture specific code, we have saved the CPU state.
1165     + We can therefore now do our atomic copy before resuming drivers etc.
1166     +
1167     + g. Save the atomic copy (pageset 1).
1168     +
1169     + Suspend can then write the atomic copy of the remaining pages. Since we
1170     + have copied the pages into other locations, we can continue to use the
1171     + normal block I/O routines without fear of corrupting our image.
1172     +
1173     + h. Save the suspend header.
1174     +
1175     + Nearly there! We save our settings and other parameters needed for
1176     + reloading pageset 1 in a 'suspend header'. We also tell our writer to
1177     + serialise its data at this stage, so that it can reread the image at resume
1178     + time. Note that the writer can write this data in any format - in the case
1179     + of the swapwriter, for example, it splits header pages in 4092 byte blocks,
1180     + using the last four bytes to link pages of data together. This is completely
1181     + transparent to the core.
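As a small worked example of the swapwriter layout just described: if each header page carries 4092 bytes of data, the number of pages needed for a header of a given size is a rounded-up division. The function name is hypothetical, and the arithmetic is only an illustration of the description above, not a copy of the kernel code.

```shell
# header_pages BYTES
# Number of 4092-byte blocks needed to hold BYTES of header data
# (integer division, rounded up).
header_pages() {
    echo $(( ($1 + 4091) / 4092 ))
}
```

So a 10000 byte header would need header_pages 10000, which comes to 3 pages.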
1182     +
1183     + i. Set the image header.
1184     +
1185     + Finally, we edit the header at our resume2= location. The signature is
1186     + changed by the writer to reflect the fact that an image exists, and to point
1187     + to the start of that data if necessary (swapwriter).
1188     +
1189     + j. Power down.
1190     +
1191     + Or reboot if we're debugging and the appropriate option is selected.
1192     +
1193     + Whew!
1194     +
1195     + Reloading the image.
1196     + --------------------
1197     +
1198     + Reloading the image is essentially the reverse of all the above. We load
1199     + our copy of pageset 1, being careful to choose locations that aren't going
1200     + to be overwritten as we copy it back (We start very early in the boot
1201     + process, so there are no other processes to quiesce here). We then copy
1202     + pageset 1 back to its original location in memory and restore the processor
1203     + context. We are now running with the original kernel. Next, we reload the
1204     + pageset 2 pages, free the memory and swap used by Suspend2, restore
1205     + the pageset header and restart processes. Sounds easy in comparison to
1206     + suspending, doesn't it!
1207     +
1208     + There is of course more to Suspend2 than this, but this explanation
1209     + should be a good start. If there's interest, I'll write further
1210     + documentation on range pages and the low level I/O.
1211     +
1212     +11. Who wrote Suspend2?
1213     +
1214     + (Answer based on the writings of Florent Chabaud, credits in files and
1215     + Nigel's limited knowledge; apologies to anyone missed out!)
1216     +
1217     + The main developers of Suspend2 have been...
1218     +
1219     + Gabor Kuti
1220     + Pavel Machek
1221     + Florent Chabaud
1222     + Bernard Blackham
1223     + Nigel Cunningham
1224     +
1225     + They have been aided in their efforts by a host of hundreds, if not thousands
1226     + of testers and people who have submitted bug fixes & suggestions. Of special
1227     + note are the efforts of Michael Frank, who had his computers repetitively
1228     + suspend and resume for literally tens of thousands of cycles and developed
1229     + scripts to stress the system and test Suspend2 far beyond the point
1230     + most of us (Nigel included!) would consider testing. His efforts have
1231     + contributed as much to Suspend2 as any of the names above.
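
The "suspend header" step in the internals document above says that the swapwriter splits header data into 4092-byte blocks and uses the last four bytes of each page to link the pages together. The following userspace sketch illustrates that idea only; it is not code from this patch. The 4096-byte page size, the use of a page index as the link, the UINT32_MAX end marker and every sk_* name are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_PAGE_SIZE 4096u        /* assumed page size */
#define SK_PAYLOAD   4092u        /* page size minus the 4-byte link */
#define SK_END       UINT32_MAX   /* hypothetical "no next page" marker */

/* One header page: 4092 bytes of data followed by a 4-byte link. */
struct sk_page {
	uint8_t  data[SK_PAYLOAD];
	uint32_t next;            /* index of the next page, or SK_END */
};

/*
 * Spread len bytes from buf over pages[0..npages), linking each page
 * to the one after it. Returns the number of pages used, or 0 if the
 * data is empty or does not fit.
 */
static size_t sk_chain_write(struct sk_page *pages, size_t npages,
			     const uint8_t *buf, size_t len)
{
	size_t need = (len + SK_PAYLOAD - 1) / SK_PAYLOAD;
	size_t i;

	assert(sizeof(struct sk_page) == SK_PAGE_SIZE);

	if (need == 0 || need > npages)
		return 0;

	for (i = 0; i < need; i++) {
		size_t chunk = len < SK_PAYLOAD ? len : SK_PAYLOAD;

		memset(pages[i].data, 0, SK_PAYLOAD);   /* pad a short tail */
		memcpy(pages[i].data, buf, chunk);
		pages[i].next = (i + 1 < need) ? (uint32_t)(i + 1) : SK_END;
		buf += chunk;
		len -= chunk;
	}
	return need;
}

/*
 * Follow the links starting at page 'first' and copy up to len bytes
 * into out. Returns the number of bytes actually copied.
 */
static size_t sk_chain_read(const struct sk_page *pages, size_t npages,
			    uint32_t first, uint8_t *out, size_t len)
{
	size_t done = 0;
	uint32_t cur = first;

	while (done < len && cur != SK_END && cur < npages) {
		size_t left = len - done;
		size_t chunk = left < SK_PAYLOAD ? left : SK_PAYLOAD;

		memcpy(out + done, pages[cur].data, chunk);
		done += chunk;
		cur = pages[cur].next;
	}
	return done;
}
```

Because the reader only follows the link stored in each page, the caller never needs to know how many pages the writer used or where they live, which is the sense in which the format is transparent to the core.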
1232     diff --git a/MAINTAINERS b/MAINTAINERS
1233     index 277877a..f1adba1 100644
1234     --- a/MAINTAINERS
1235     +++ b/MAINTAINERS
1236     @@ -3218,6 +3218,13 @@ M: sammy@sammy.net
1237     W: http://sammy.net/sun3/
1238     S: Maintained
1239    
1240     +SUSPEND2
1241     +P: Nigel Cunningham
1242     +M: nigel@suspend2.net
1243     +L: suspend2-devel@suspend2.net
1244     +W: http://suspend2.net
1245     +S: Maintained
1246     +
1247     SVGA HANDLING
1248     P: Martin Mares
1249     M: mj@ucw.cz
1250     diff --git a/arch/i386/mm/fault.c b/arch/i386/mm/fault.c
1251     index b8c4e25..21982d9 100644
1252     --- a/arch/i386/mm/fault.c
1253     +++ b/arch/i386/mm/fault.c
1254     @@ -23,6 +23,7 @@
1255     #include <linux/module.h>
1256     #include <linux/kprobes.h>
1257     #include <linux/uaccess.h>
1258     +#include <linux/suspend.h>
1259    
1260     #include <asm/system.h>
1261     #include <asm/desc.h>
1262     @@ -33,6 +34,9 @@ extern void die(const char *,struct pt_regs *,long);
1263    
1264     static ATOMIC_NOTIFIER_HEAD(notify_page_fault_chain);
1265    
1266     +int suspend2_faulted = 0;
1267     +EXPORT_SYMBOL(suspend2_faulted);
1268     +
1269     int register_page_fault_notifier(struct notifier_block *nb)
1270     {
1271     vmalloc_sync_all();
1272     @@ -311,6 +315,20 @@ fastcall void __kprobes do_page_fault(struct pt_regs *regs,
1273    
1274     si_code = SEGV_MAPERR;
1275    
1276     + /* During a Suspend2 atomic copy, with DEBUG_SLAB, we will
1277     + * get page faults where slab has been unmapped. Map them
1278     + * temporarily and set the variable that tells Suspend2 to
1279     + * unmap afterwards.
1280     + */
1281     +
1282     + if (unlikely(suspend2_running && !suspend2_faulted)) {
1283     + struct page *page = NULL;
1284     + suspend2_faulted = 1;
1285     + page = virt_to_page(address);
1286     + kernel_map_pages(page, 1, 1);
1287     + return;
1288     + }
1289     +
1290     /*
1291     * We fault-in kernel-space virtual memory on-demand. The
1292     * 'reference' page table is init_mm.pgd.
1293     diff --git a/arch/i386/mm/init.c b/arch/i386/mm/init.c
1294     index ae43688..a180d21 100644
1295     --- a/arch/i386/mm/init.c
1296     +++ b/arch/i386/mm/init.c
1297     @@ -387,7 +387,7 @@ static void __init pagetable_init (void)
1298     #endif
1299     }
1300    
1301     -#if defined(CONFIG_SOFTWARE_SUSPEND) || defined(CONFIG_ACPI_SLEEP)
1302     +#if defined(CONFIG_SUSPEND_SHARED) || defined(CONFIG_ACPI_SLEEP)
1303     /*
1304     * Swap suspend & friends need this for resume because things like the intel-agp
1305     * driver might have split up a kernel 4MB mapping.
1306     @@ -774,13 +774,13 @@ void free_init_pages(char *what, unsigned long begin, unsigned long end)
1307     unsigned long addr;
1308    
1309     for (addr = begin; addr < end; addr += PAGE_SIZE) {
1310     - ClearPageReserved(virt_to_page(addr));
1311     - init_page_count(virt_to_page(addr));
1312     + //ClearPageReserved(virt_to_page(addr));
1313     + //init_page_count(virt_to_page(addr));
1314     memset((void *)addr, POISON_FREE_INITMEM, PAGE_SIZE);
1315     - free_page(addr);
1316     - totalram_pages++;
1317     + //free_page(addr);
1318     + //totalram_pages++;
1319     }
1320     - printk(KERN_INFO "Freeing %s: %ldk freed\n", what, (end - begin) >> 10);
1321     + //printk(KERN_INFO "Freeing %s: %ldk freed\n", what, (end - begin) >> 10);
1322     }
1323    
1324     void free_initmem(void)
1325     diff --git a/arch/i386/mm/pageattr.c b/arch/i386/mm/pageattr.c
1326     index 412ebbd..dded2e1 100644
1327     --- a/arch/i386/mm/pageattr.c
1328     +++ b/arch/i386/mm/pageattr.c
1329     @@ -252,7 +252,27 @@ void kernel_map_pages(struct page *page, int numpages, int enable)
1330     */
1331     __flush_tlb_all();
1332     }
1333     +EXPORT_SYMBOL(kernel_map_pages);
1334     #endif
1335    
1336     +int page_is_mapped(struct page *page)
1337     +{
1338     + pte_t *kpte;
1339     + unsigned long address;
1340     + struct page *kpte_page;
1341     +
1342     + if(PageHighMem(page))
1343     + return 0;
1344     +
1345     + address = (unsigned long)page_address(page);
1346     +
1347     + kpte = lookup_address(address);
1348     + if (!kpte)
1349     + return -EINVAL;
1350     + kpte_page = virt_to_page(kpte);
1351     +
1352     + return (pte_val(*kpte) & (__PAGE_KERNEL_EXEC | __PAGE_KERNEL)) ? 1:0;
1353     +}
1354     EXPORT_SYMBOL(change_page_attr);
1355     EXPORT_SYMBOL(global_flush_tlb);
1356     +EXPORT_SYMBOL(page_is_mapped);
1357     diff --git a/arch/i386/power/Makefile b/arch/i386/power/Makefile
1358     index 2de7bbf..72a6169 100644
1359     --- a/arch/i386/power/Makefile
1360     +++ b/arch/i386/power/Makefile
1361     @@ -1,2 +1,2 @@
1362     obj-$(CONFIG_PM) += cpu.o
1363     -obj-$(CONFIG_SOFTWARE_SUSPEND) += swsusp.o suspend.o
1364     +obj-$(CONFIG_SUSPEND_SHARED) += swsusp.o suspend.o
1365     diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
1366     index 8120d42..5d57f0c 100644
1367     --- a/arch/powerpc/kernel/Makefile
1368     +++ b/arch/powerpc/kernel/Makefile
1369     @@ -36,7 +36,7 @@ obj-$(CONFIG_GENERIC_TBSYNC) += smp-tbsync.o
1370     obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
1371     obj-$(CONFIG_6xx) += idle_6xx.o l2cr_6xx.o cpu_setup_6xx.o
1372     obj-$(CONFIG_TAU) += tau_6xx.o
1373     -obj32-$(CONFIG_SOFTWARE_SUSPEND) += swsusp_32.o
1374     +obj32-$(CONFIG_SUSPEND_SHARED) += swsusp_32.o
1375     obj32-$(CONFIG_MODULES) += module_32.o
1376    
1377     ifeq ($(CONFIG_PPC_MERGE),y)
1378     diff --git a/arch/powerpc/platforms/powermac/setup.c b/arch/powerpc/platforms/powermac/setup.c
1379     index 651fa42..243e5f4 100644
1380     --- a/arch/powerpc/platforms/powermac/setup.c
1381     +++ b/arch/powerpc/platforms/powermac/setup.c
1382     @@ -425,7 +425,7 @@ static void __init find_boot_device(void)
1383     * only
1384     */
1385    
1386     -#ifdef CONFIG_SOFTWARE_SUSPEND
1387     +#ifdef CONFIG_SUSPEND_SHARED
1388    
1389     static int pmac_pm_prepare(suspend_state_t state)
1390     {
1391     @@ -480,16 +480,16 @@ static struct pm_ops pmac_pm_ops = {
1392     .valid = pmac_pm_valid,
1393     };
1394    
1395     -#endif /* CONFIG_SOFTWARE_SUSPEND */
1396     +#endif /* CONFIG_SUSPEND_SHARED */
1397    
1398     static int initializing = 1;
1399    
1400     static int pmac_late_init(void)
1401     {
1402     initializing = 0;
1403     -#ifdef CONFIG_SOFTWARE_SUSPEND
1404     +#ifdef CONFIG_SUSPEND_SHARED
1405     pm_set_ops(&pmac_pm_ops);
1406     -#endif /* CONFIG_SOFTWARE_SUSPEND */
1407     +#endif /* CONFIG_SUSPEND_SHARED */
1408     return 0;
1409     }
1410    
1411     diff --git a/arch/x86_64/kernel/Makefile b/arch/x86_64/kernel/Makefile
1412     index bb47e86..9943190 100644
1413     --- a/arch/x86_64/kernel/Makefile
1414     +++ b/arch/x86_64/kernel/Makefile
1415     @@ -26,7 +26,7 @@ obj-y += io_apic.o mpparse.o \
1416     obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o crash.o
1417     obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
1418     obj-$(CONFIG_PM) += suspend.o
1419     -obj-$(CONFIG_SOFTWARE_SUSPEND) += suspend_asm.o
1420     +obj-$(CONFIG_SUSPEND_SHARED) += suspend_asm.o
1421     obj-$(CONFIG_CPU_FREQ) += cpufreq/
1422     obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
1423     obj-$(CONFIG_IOMMU) += pci-gart.o aperture.o
1424     diff --git a/arch/x86_64/kernel/suspend.c b/arch/x86_64/kernel/suspend.c
1425     index 91f7e67..983f16f 100644
1426     --- a/arch/x86_64/kernel/suspend.c
1427     +++ b/arch/x86_64/kernel/suspend.c
1428     @@ -140,7 +140,7 @@ void fix_processor_context(void)
1429    
1430     }
1431    
1432     -#ifdef CONFIG_SOFTWARE_SUSPEND
1433     +#ifdef CONFIG_SUSPEND_SHARED
1434     /* Defined in arch/x86_64/kernel/suspend_asm.S */
1435     extern int restore_image(void);
1436    
1437     @@ -219,4 +219,4 @@ int swsusp_arch_resume(void)
1438     restore_image();
1439     return 0;
1440     }
1441     -#endif /* CONFIG_SOFTWARE_SUSPEND */
1442     +#endif /* CONFIG_SUSPEND_SHARED */
1443     diff --git a/crypto/Kconfig b/crypto/Kconfig
1444     index 086fcec..efde46c 100644
1445     --- a/crypto/Kconfig
1446     +++ b/crypto/Kconfig
1447     @@ -406,6 +406,14 @@ config CRYPTO_DEFLATE
1448    
1449     You will most probably want this if using IPSec.
1450    
1451     +config CRYPTO_LZF
1452     + tristate "LZF compression algorithm"
1453     + default y
1454     + select CRYPTO_ALGAPI
1455     + help
1456     + This is the LZF algorithm. It is especially useful for Suspend2,
1457     + because it achieves good compression quickly.
1458     +
1459     config CRYPTO_MICHAEL_MIC
1460     tristate "Michael MIC keyed digest algorithm"
1461     select CRYPTO_ALGAPI
1462     diff --git a/crypto/Makefile b/crypto/Makefile
1463     index 12f93f5..69a8af3 100644
1464     --- a/crypto/Makefile
1465     +++ b/crypto/Makefile
1466     @@ -46,5 +46,6 @@ obj-$(CONFIG_CRYPTO_ANUBIS) += anubis.o
1467     obj-$(CONFIG_CRYPTO_DEFLATE) += deflate.o
1468     obj-$(CONFIG_CRYPTO_MICHAEL_MIC) += michael_mic.o
1469     obj-$(CONFIG_CRYPTO_CRC32C) += crc32c.o
1470     +obj-$(CONFIG_CRYPTO_LZF) += lzf.o
1471    
1472     obj-$(CONFIG_CRYPTO_TEST) += tcrypt.o
1473     diff --git a/crypto/lzf.c b/crypto/lzf.c
1474     new file mode 100644
1475     index 0000000..7c74784
1476     --- /dev/null
1477     +++ b/crypto/lzf.c
1478     @@ -0,0 +1,325 @@
1479     +/*
1480     + * Cryptoapi LZF compression module.
1481     + *
1482     + * Copyright (c) 2004-2005 Nigel Cunningham <nigel@suspend2.net>
1483     + *
1484     + * based on the deflate.c file:
1485     + *
1486     + * Copyright (c) 2003 James Morris <jmorris@intercode.com.au>
1487     + *
1488     + * and upon the LZF compression module donated to the Suspend2 project with
1489     + * the following copyright:
1490     + *
1491     + * This program is free software; you can redistribute it and/or modify it
1492     + * under the terms of the GNU General Public License as published by the Free
1493     + * Software Foundation; either version 2 of the License, or (at your option)
1494     + * any later version.
1495     + * Copyright (c) 2000-2003 Marc Alexander Lehmann <pcg@goof.com>
1496     + *
1497     + * Redistribution and use in source and binary forms, with or without modifica-
1498     + * tion, are permitted provided that the following conditions are met:
1499     + *
1500     + * 1. Redistributions of source code must retain the above copyright notice,
1501     + * this list of conditions and the following disclaimer.
1502     + *
1503     + * 2. Redistributions in binary form must reproduce the above copyright
1504     + * notice, this list of conditions and the following disclaimer in the
1505     + * documentation and/or other materials provided with the distribution.
1506     + *
1507     + * 3. The name of the author may not be used to endorse or promote products
1508     + * derived from this software without specific prior written permission.
1509     + *
1510     + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED
1511     + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MER-
1512     + * CHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
1513     + * EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPE-
1514     + * CIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
1515     + * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
1516     + * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
1517     + * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTH-
1518     + * ERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
1519     + * OF THE POSSIBILITY OF SUCH DAMAGE.
1520     + *
1521     + * Alternatively, the contents of this file may be used under the terms of
1522     + * the GNU General Public License version 2 (the "GPL"), in which case the
1523     + * provisions of the GPL are applicable instead of the above. If you wish to
1524     + * allow the use of your version of this file only under the terms of the
1525     + * GPL and not to allow others to use your version of this file under the
1526     + * BSD license, indicate your decision by deleting the provisions above and
1527     + * replace them with the notice and other provisions required by the GPL. If
1528     + * you do not delete the provisions above, a recipient may use your version
1529     + * of this file under either the BSD or the GPL.
1530     + */
1531     +
1532     +#include <linux/kernel.h>
1533     +#include <linux/module.h>
1534     +#include <linux/init.h>
1535     +#include <linux/module.h>
1536     +#include <linux/crypto.h>
1537     +#include <linux/err.h>
1538     +#include <linux/vmalloc.h>
1539     +#include <asm/string.h>
1540     +
1541     +struct lzf_ctx {
1542     + void *hbuf;
1543     + unsigned int bufofs;
1544     +};
1545     +
1546     +/*
1547     + * size of hashtable is (1 << hlog) * sizeof (char *)
1548     + * decompression is independent of the hash table size
1549     + * the difference between 15 and 14 is very small
1550     + * for small blocks (and 14 is also faster).
1551     + * For a low-memory configuration, use hlog == 13;
1552     + * For best compression, use 15 or 16.
1553     + */
1554     +static const int hlog = 14;
1555     +
1556     +/*
1557     + * don't play with this unless you benchmark!
1558     + * decompression is not dependent on the hash function
1559     + * the hashing function might seem strange, just believe me
1560     + * it works ;)
1561     + */
1562     +static inline u16 first(const u8 *p)
1563     +{
1564     + return ((p[0]) << 8) + p[1];
1565     +}
1566     +
1567     +static inline u16 next(u8 v, const u8 *p)
1568     +{
1569     + return ((v) << 8) + p[2];
1570     +}
1571     +
1572     +static inline u32 idx(unsigned int h)
1573     +{
1574     + return (((h ^ (h << 5)) >> (3*8 - hlog)) + h*3) & ((1 << hlog) - 1);
1575     +}
1576     +
1577     +/*
1578     + * IDX works because it is very similar to a multiplicative hash, e.g.
1579     + * (h * 57321 >> (3*8 - hlog))
1580     + * the next one is also quite good, albeit slow ;)
1581     + * (int)(cos(h & 0xffffff) * 1e6)
1582     + */
1583     +
1584     +static const int max_lit = (1 << 5);
1585     +static const int max_off = (1 << 13);
1586     +static const int max_ref = ((1 << 8) + (1 << 3));
1587     +
1588     +/*
1589     + * compressed format
1590     + *
1591     + * 000LLLLL <L+1> ; literal
1592     + * LLLOOOOO oooooooo ; backref L
1593     + * 111OOOOO LLLLLLLL oooooooo ; backref L+7
1594     + *
1595     + */
1596     +
1597     +static void lzf_compress_exit(struct crypto_tfm *tfm)
1598     +{
1599     + struct lzf_ctx *ctx = crypto_tfm_ctx(tfm);
1600     +
1601     + if (!ctx->hbuf)
1602     + return;
1603     +
1604     + vfree(ctx->hbuf);
1605     + ctx->hbuf = NULL;
1606     +}
1607     +
1608     +static int lzf_compress_init(struct crypto_tfm *tfm)
1609     +{
1610     + struct lzf_ctx *ctx = crypto_tfm_ctx(tfm);
1611     +
1612     + /* Get LZF ready to go */
1613     + ctx->hbuf = vmalloc_32((1 << hlog) * sizeof(char *));
1614     + if (ctx->hbuf)
1615     + return 0;
1616     +
1617     + printk(KERN_WARNING "Failed to allocate %ld bytes for lzf workspace\n",
1618     + (long) ((1 << hlog) * sizeof(char *)));
1619     + return -ENOMEM;
1620     +}
1621     +
1622     +static int lzf_compress(struct crypto_tfm *tfm, const u8 *in_data,
1623     + unsigned int in_len, u8 *out_data, unsigned int *out_len)
1624     +{
1625     + struct lzf_ctx *ctx = crypto_tfm_ctx(tfm);
1626     + const u8 **htab = ctx->hbuf;
1627     + const u8 **hslot;
1628     + const u8 *ip = in_data;
1629     + u8 *op = out_data;
1630     + const u8 *in_end = ip + in_len;
1631     + u8 *out_end = op + *out_len - 3;
1632     + const u8 *ref;
1633     +
1634     + unsigned int hval = first(ip);
1635     + unsigned long off;
1636     + int lit = 0;
1637     +
1638     + memset(htab, 0, (1 << hlog) * sizeof(char *));
1639     +
1640     + for (;;) {
1641     + if (ip < in_end - 2) {
1642     + hval = next(hval, ip);
1643     + hslot = htab + idx(hval);
1644     + ref = *hslot;
1645     + *hslot = ip;
1646     +
1647     + if ((off = ip - ref - 1) < max_off
1648     + && ip + 4 < in_end && ref > in_data
1649     + && *(u16 *) ref == *(u16 *) ip && ref[2] == ip[2]
1650     + ) {
1651     + /* match found at *ref++ */
1652     + unsigned int len = 2;
1653     + unsigned int maxlen = in_end - ip - len;
1654     + maxlen = maxlen > max_ref ? max_ref : maxlen;
1655     +
1656     + do
1657     + len++;
1658     + while (len < maxlen && ref[len] == ip[len]);
1659     +
1660     + if (op + lit + 1 + 3 >= out_end) {
1661     + *out_len = PAGE_SIZE;
1662     + return 0;
1663     + }
1664     +
1665     + if (lit) {
1666     + *op++ = lit - 1;
1667     + lit = -lit;
1668     + do
1669     + *op++ = ip[lit];
1670     + while (++lit);
1671     + }
1672     +
1673     + len -= 2;
1674     + ip++;
1675     +
1676     + if (len < 7) {
1677     + *op++ = (off >> 8) + (len << 5);
1678     + } else {
1679     + *op++ = (off >> 8) + (7 << 5);
1680     + *op++ = len - 7;
1681     + }
1682     +
1683     + *op++ = off;
1684     +
1685     + ip += len;
1686     + hval = first(ip);
1687     + hval = next(hval, ip);
1688     + htab[idx(hval)] = ip;
1689     + ip++;
1690     + continue;
1691     + }
1692     + } else if (ip == in_end)
1693     + break;
1694     +
1695     + /* one more literal byte we must copy */
1696     + lit++;
1697     + ip++;
1698     +
1699     + if (lit == max_lit) {
1700     + if (op + 1 + max_lit >= out_end) {
1701     + *out_len = PAGE_SIZE;
1702     + return 0;
1703     + }
1704     +
1705     + *op++ = max_lit - 1;
1706     + memcpy(op, ip - max_lit, max_lit);
1707     + op += max_lit;
1708     + lit = 0;
1709     + }
1710     + }
1711     +
1712     + if (lit) {
1713     + if (op + lit + 1 >= out_end) {
1714     + *out_len = PAGE_SIZE;
1715     + return 0;
1716     + }
1717     +
1718     + *op++ = lit - 1;
1719     + lit = -lit;
1720     + do
1721     + *op++ = ip[lit];
1722     + while (++lit);
1723     + }
1724     +
1725     + *out_len = op - out_data;
1726     + return 0;
1727     +}
1728     +
1729     +static int lzf_decompress(struct crypto_tfm *tfm, const u8 *src,
1730     + unsigned int slen, u8 *dst, unsigned int *dlen)
1731     +{
1732     + u8 const *ip = src;
1733     + u8 *op = dst;
1734     + u8 const *const in_end = ip + slen;
1735     + u8 *const out_end = op + *dlen;
1736     +
1737     + *dlen = PAGE_SIZE;
1738     + do {
1739     + unsigned int ctrl = *ip++;
1740     +
1741     + if (ctrl < (1 << 5)) { /* literal run */
1742     + ctrl++;
1743     +
1744     + if (op + ctrl > out_end)
1745     + return 0;
1746     + memcpy(op, ip, ctrl);
1747     + op += ctrl;
1748     + ip += ctrl;
1749     + } else { /* back reference */
1750     +
1751     + unsigned int len = ctrl >> 5;
1752     +
1753     + u8 *ref = op - ((ctrl & 0x1f) << 8) - 1;
1754     +
1755     + if (len == 7)
1756     + len += *ip++;
1757     +
1758     + ref -= *ip++;
1759     + len += 2;
1760     +
1761     + if (op + len > out_end || ref < (u8 *) dst)
1762     + return 0;
1763     +
1764     + do
1765     + *op++ = *ref++;
1766     + while (--len);
1767     + }
1768     + }
1769     + while (op < out_end && ip < in_end);
1770     +
1771     + *dlen = op - (u8 *) dst;
1772     + return 0;
1773     +}
1774     +
1775     +static struct crypto_alg alg = {
1776     + .cra_name = "lzf",
1777     + .cra_flags = CRYPTO_ALG_TYPE_COMPRESS,
1778     + .cra_ctxsize = 0,
1779     + .cra_module = THIS_MODULE,
1780     + .cra_list = LIST_HEAD_INIT(alg.cra_list),
1781     + .cra_init = lzf_compress_init,
1782     + .cra_exit = lzf_compress_exit,
1783     + .cra_u = { .compress = {
1784     + .coa_compress = lzf_compress,
1785     + .coa_decompress = lzf_decompress } }
1786     +};
1787     +
1788     +static int __init init(void)
1789     +{
1790     + return crypto_register_alg(&alg);
1791     +}
1792     +
1793     +static void __exit fini(void)
1794     +{
1795     + crypto_unregister_alg(&alg);
1796     +}
1797     +
1798     +module_init(init);
1799     +module_exit(fini);
1800     +
1801     +MODULE_LICENSE("GPL");
1802     +MODULE_DESCRIPTION("LZF Compression Algorithm");
1803     +MODULE_AUTHOR("Marc Alexander Lehmann & Nigel Cunningham");
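
The "compressed format" comment in crypto/lzf.c above describes three token shapes: a literal run, a short back reference and a long back reference. As an illustration of how those tokens are read, here is a small userspace decoder sketch. It follows the same token layout as lzf_decompress() but is not taken from the patch: the name lzf_decode_sketch, the size_t-based interface and the -1 error return are assumptions made for the example, and it checks input bounds more strictly than the kernel routine does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Decode an LZF token stream (hypothetical helper, userspace only).
 *
 *   000LLLLL <L+1 bytes>          literal run of L+1 bytes
 *   LLLooooo oooooooo             back reference, length L+2
 *   111ooooo LLLLLLLL oooooooo    back reference, length L+7+2
 *
 * Returns 0 and stores the output length in *dlen, or -1 if the
 * stream is truncated, points before the start of the output, or
 * would overflow dst.
 */
static int lzf_decode_sketch(const uint8_t *src, size_t slen,
			     uint8_t *dst, size_t dcap, size_t *dlen)
{
	size_t ip = 0, op = 0;

	assert(src && dst && dlen);

	while (ip < slen) {
		unsigned int ctrl = src[ip++];

		if (ctrl < 32) {                        /* literal run */
			size_t run = ctrl + 1;

			if (run > slen - ip || run > dcap - op)
				return -1;
			memcpy(dst + op, src + ip, run);
			ip += run;
			op += run;
		} else {                                /* back reference */
			size_t len = ctrl >> 5;
			size_t dist = (size_t)(ctrl & 0x1f) << 8;

			if (len == 7) {                 /* long form: extra length byte */
				if (ip >= slen)
					return -1;
				len += src[ip++];
			}
			if (ip >= slen)
				return -1;
			dist += src[ip++];              /* low byte of the offset */
			dist += 1;                      /* stored offset is distance - 1 */
			len += 2;

			if (dist > op || len > dcap - op)
				return -1;

			/* Byte-by-byte copy: source and destination may overlap. */
			while (len--) {
				dst[op] = dst[op - dist];
				op++;
			}
		}
	}

	*dlen = op;
	return 0;
}
```

For instance, the six-byte stream 02 'a' 'b' 'c' 20 02 is one literal run ("abc") followed by a back reference of length 3 at distance 3, so it decodes to "abcabc".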
1804     diff --git a/drivers/base/core.c b/drivers/base/core.c
1805     index d7fcf82..edbbf76 100644
1806     --- a/drivers/base/core.c
1807     +++ b/drivers/base/core.c
1808     @@ -27,6 +27,8 @@
1809     int (*platform_notify)(struct device * dev) = NULL;
1810     int (*platform_notify_remove)(struct device * dev) = NULL;
1811    
1812     +static int do_dump_stack;
1813     +
1814     /*
1815     * sysfs bindings for devices.
1816     */
1817     @@ -638,6 +640,18 @@ int device_add(struct device *dev)
1818     class_intf->add_dev(dev, class_intf);
1819     up(&dev->class->sem);
1820     }
1821     +
1822     +#ifdef CONFIG_PM
1823     + if (!((dev->class && dev->class->resume) ||
1824     + (dev->bus && (dev->bus->resume || dev->bus->resume_early))) &&
1825     + !dev->pm_safe) {
1826     + printk("Device driver %s lacks bus and class support for "
1827     + "being resumed.\n", kobject_name(&dev->kobj));
1828     + if (do_dump_stack)
1829     + dump_stack();
1830     + }
1831     +#endif
1832     +
1833     Done:
1834     kfree(class_name);
1835     put_device(dev);
1836     @@ -975,6 +989,7 @@ struct device *device_create(struct class *class, struct device *parent,
1837     dev->class = class;
1838     dev->parent = parent;
1839     dev->release = device_create_release;
1840     + dev->pm_safe = 1;
1841    
1842     va_start(args, fmt);
1843     vsnprintf(dev->bus_id, BUS_ID_SIZE, fmt, args);
1844     @@ -1183,3 +1198,11 @@ out:
1845     }
1846    
1847     EXPORT_SYMBOL_GPL(device_move);
1848     +
1849     +static int __init pm_debug_dump_stack(char *str)
1850     +{
1851     + do_dump_stack = 1;
1852     + return 1;
1853     +}
1854     +
1855     +__setup("pm_debug_dump_stack", pm_debug_dump_stack);
1856     diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
1857     index b6073bd..32bd423 100644
1858     --- a/drivers/macintosh/via-pmu.c
1859     +++ b/drivers/macintosh/via-pmu.c
1860     @@ -42,7 +42,6 @@
1861     #include <linux/interrupt.h>
1862     #include <linux/device.h>
1863     #include <linux/sysdev.h>
1864     -#include <linux/freezer.h>
1865     #include <linux/syscalls.h>
1866     #include <linux/suspend.h>
1867     #include <linux/cpu.h>
1868     diff --git a/drivers/md/md.c b/drivers/md/md.c
1869     index 509171c..76e5ac8 100644
1870     --- a/drivers/md/md.c
1871     +++ b/drivers/md/md.c
1872     @@ -5309,6 +5309,8 @@ void md_do_sync(mddev_t *mddev)
1873     last_mark = next;
1874     }
1875    
1876     + while(freezer_is_on())
1877     + yield();
1878    
1879     if (kthread_should_stop()) {
1880     /*
1881     diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
1882     index a3c1755..171be82 100644
1883     --- a/drivers/pci/pci-driver.c
1884     +++ b/drivers/pci/pci-driver.c
1885     @@ -451,6 +451,10 @@ int __pci_register_driver(struct pci_driver *drv, struct module *owner,
1886     if (error)
1887     driver_unregister(&drv->driver);
1888    
1889     + if (!drv->resume)
1890     + printk("PCI driver %s lacks driver specific resume support.\n",
1891     + drv->name);
1892     +
1893     return error;
1894     }
1895    
1896     diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
1897     index 9e3e943..e91fdef 100644
1898     --- a/drivers/usb/core/driver.c
1899     +++ b/drivers/usb/core/driver.c
1900     @@ -788,6 +788,9 @@ int usb_register_driver(struct usb_driver *new_driver, struct module *owner,
1901     usbcore_name, new_driver->name);
1902     usbfs_update_special();
1903     usb_create_newid_file(new_driver);
1904     + if (!new_driver->resume)
1905     + printk("USB driver %s lacks resume support.\n",
1906     + new_driver->name);
1907     } else {
1908     printk(KERN_ERR "%s: error %d registering interface "
1909     " driver %s\n",
1910     diff --git a/include/asm-i386/cacheflush.h b/include/asm-i386/cacheflush.h
1911     index 74e03c8..3bb8575 100644
1912     --- a/include/asm-i386/cacheflush.h
1913     +++ b/include/asm-i386/cacheflush.h
1914     @@ -36,4 +36,6 @@ void kernel_map_pages(struct page *page, int numpages, int enable);
1915     void mark_rodata_ro(void);
1916     #endif
1917    
1918     +extern int page_is_mapped(struct page *page);
1919     +
1920     #endif /* _I386_CACHEFLUSH_H */
1921     diff --git a/include/asm-i386/suspend.h b/include/asm-i386/suspend.h
1922     index 8dbaafe..e23fd20 100644
1923     --- a/include/asm-i386/suspend.h
1924     +++ b/include/asm-i386/suspend.h
1925     @@ -8,6 +8,9 @@
1926    
1927     static inline int arch_prepare_suspend(void) { return 0; }
1928    
1929     +extern int suspend2_faulted;
1930     +#define clear_suspend2_fault() do { suspend2_faulted = 0; } while(0)
1931     +
1932     /* image of the saved processor state */
1933     struct saved_context {
1934     u16 es, fs, gs, ss;
1935     diff --git a/include/asm-ppc/suspend.h b/include/asm-ppc/suspend.h
1936     index 3df9f32..9d5db0e 100644
1937     --- a/include/asm-ppc/suspend.h
1938     +++ b/include/asm-ppc/suspend.h
1939     @@ -10,3 +10,6 @@ static inline void save_processor_state(void)
1940     static inline void restore_processor_state(void)
1941     {
1942     }
1943     +
1944     +#define suspend2_faulted (0)
1945     +#define clear_suspend2_fault() do { } while(0)
1946     diff --git a/include/asm-x86_64/cacheflush.h b/include/asm-x86_64/cacheflush.h
1947     index ab1cb5c..b8e7def 100644
1948     --- a/include/asm-x86_64/cacheflush.h
1949     +++ b/include/asm-x86_64/cacheflush.h
1950     @@ -32,4 +32,9 @@ int change_page_attr_addr(unsigned long addr, int numpages, pgprot_t prot);
1951     void mark_rodata_ro(void);
1952     #endif
1953    
1954     +static inline int page_is_mapped(struct page *page)
1955     +{
1956     + return 1;
1957     +}
1958     +
1959     #endif /* _X8664_CACHEFLUSH_H */
1960     diff --git a/include/asm-x86_64/suspend.h b/include/asm-x86_64/suspend.h
1961     index bc7f817..2f18e1b 100644
1962     --- a/include/asm-x86_64/suspend.h
1963     +++ b/include/asm-x86_64/suspend.h
1964     @@ -12,6 +12,9 @@ arch_prepare_suspend(void)
1965     return 0;
1966     }
1967    
1968     +#define suspend2_faulted (0)
1969     +#define clear_suspend2_fault() do { } while(0)
1970     +
1971     /* Image of the saved processor state. If you touch this, fix acpi_wakeup.S. */
1972     struct saved_context {
1973     u16 ds, es, fs, gs, ss;
1974     diff --git a/include/linux/device.h b/include/linux/device.h
1975     index 5cf30e9..69122c1 100644
1976     --- a/include/linux/device.h
1977     +++ b/include/linux/device.h
1978     @@ -402,6 +402,7 @@ struct device {
1979     char bus_id[BUS_ID_SIZE]; /* position on parent bus */
1980     struct device_type *type;
1981     unsigned is_registered:1;
1982     + unsigned pm_safe:1; /* No resume fn is ok? */
1983     struct device_attribute uevent_attr;
1984     struct device_attribute *devt_attr;
1985    
1986     diff --git a/include/linux/dyn_pageflags.h b/include/linux/dyn_pageflags.h
1987     new file mode 100644
1988     index 0000000..23d9127
1989     --- /dev/null
1990     +++ b/include/linux/dyn_pageflags.h
1991     @@ -0,0 +1,68 @@
1992     +/*
1993     + * include/linux/dyn_pageflags.h
1994     + *
1995     + * Copyright (C) 2004-2006 Nigel Cunningham <nigel@suspend2.net>
1996     + *
1997     + * This file is released under the GPLv2.
1998     + *
1999     + * It implements support for dynamically allocated bitmaps that are
2000     + * used for temporary or infrequently used pageflags, in lieu of
2001     + * bits in the struct page flags entry.
2002     + */
2003     +
2004     +#ifndef DYN_PAGEFLAGS_H
2005     +#define DYN_PAGEFLAGS_H
2006     +
2007     +#include <linux/mm.h>
2008     +
2009     +/* [pg_dat][zone][page_num] */
2010     +typedef unsigned long **** dyn_pageflags_t;
2011     +
2012     +#if BITS_PER_LONG == 32
2013     +#define UL_SHIFT 5
2014     +#else
2015     +#if BITS_PER_LONG == 64
2016     +#define UL_SHIFT 6
2017     +#else
2018     +#error Bits per long not 32 or 64?
2019     +#endif
2020     +#endif
2021     +
2022     +#define BIT_NUM_MASK (sizeof(unsigned long) * 8 - 1)
2023     +#define PAGE_NUM_MASK (~((1 << (PAGE_SHIFT + 3)) - 1))
2024     +#define UL_NUM_MASK (~(BIT_NUM_MASK | PAGE_NUM_MASK))
2025     +
2026     +/*
2027     + * PAGENUMBER gives the index of the page within the zone.
2028     + * PAGEINDEX gives the index of the unsigned long within that page.
2029     + * PAGEBIT gives the index of the bit within the unsigned long.
2030     + */
2031     +#define BITS_PER_PAGE (PAGE_SIZE << 3)
2032     +#define PAGENUMBER(zone_offset) ((int) (zone_offset >> (PAGE_SHIFT + 3)))
2033     +#define PAGEINDEX(zone_offset) ((int) ((zone_offset & UL_NUM_MASK) >> UL_SHIFT))
2034     +#define PAGEBIT(zone_offset) ((int) (zone_offset & BIT_NUM_MASK))
2035     +
2036     +#define PAGE_UL_PTR(bitmap, node, zone_num, zone_pfn) \
2037     + ((bitmap[node][zone_num][PAGENUMBER(zone_pfn)])+PAGEINDEX(zone_pfn))
2038     +
2039     +#define BITMAP_FOR_EACH_SET(bitmap, counter) \
2040     + for (counter = get_next_bit_on(bitmap, max_pfn + 1); counter <= max_pfn; \
2041     + counter = get_next_bit_on(bitmap, counter))
2042     +
2043     +extern void clear_dyn_pageflags(dyn_pageflags_t pagemap);
2044     +extern int allocate_dyn_pageflags(dyn_pageflags_t *pagemap);
2045     +extern void free_dyn_pageflags(dyn_pageflags_t *pagemap);
2046     +extern unsigned long get_next_bit_on(dyn_pageflags_t bitmap, unsigned long counter);
2047     +
2048     +extern int test_dynpageflag(dyn_pageflags_t *bitmap, struct page *page);
2049     +extern void set_dynpageflag(dyn_pageflags_t *bitmap, struct page *page);
2050     +extern void clear_dynpageflag(dyn_pageflags_t *bitmap, struct page *page);
2051     +#endif
2052     +
2053     +/*
2054     + * With the above macros defined, you can do...
2055     + * #define PagePageset1(page) (test_dynpageflag(&pageset1_map, page))
2056     + * #define SetPagePageset1(page) (set_dynpageflag(&pageset1_map, page))
2057     + * #define ClearPagePageset1(page) (clear_dynpageflag(&pageset1_map, page))
2058     + */
2059     +
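Note (not part of the patch): the offset-splitting macros added to dyn_pageflags.h above can be tried out in userspace. The following is a minimal sketch, assuming 4 KiB pages (a PAGE_SHIFT of 12); every demo_/DEMO_ name is hypothetical and exists only for illustration. It splits a zone offset into a bitmap page number, the index of an unsigned long within that page, and a bit within that unsigned long, then recombines the three parts.

```c
#include <limits.h>

/* Assumption for illustration: 4 KiB pages. The kernel takes PAGE_SHIFT
 * from the architecture headers instead. */
#define DEMO_PAGE_SHIFT 12
/* log2 of the number of bits in an unsigned long: 5 for 32-bit longs,
 * 6 for 64-bit longs. */
#define DEMO_UL_SHIFT (sizeof(unsigned long) == 8 ? 6 : 5)

#define DEMO_BIT_NUM_MASK (sizeof(unsigned long) * CHAR_BIT - 1)
#define DEMO_PAGE_NUM_MASK (~((1UL << (DEMO_PAGE_SHIFT + 3)) - 1))
#define DEMO_UL_NUM_MASK (~(DEMO_BIT_NUM_MASK | DEMO_PAGE_NUM_MASK))

/* Which bitmap page holds the bit for this zone offset. */
static int demo_pagenumber(unsigned long zone_offset)
{
	return (int) (zone_offset >> (DEMO_PAGE_SHIFT + 3));
}

/* Which unsigned long within that bitmap page. */
static int demo_pageindex(unsigned long zone_offset)
{
	return (int) ((zone_offset & DEMO_UL_NUM_MASK) >> DEMO_UL_SHIFT);
}

/* Which bit within that unsigned long. */
static int demo_pagebit(unsigned long zone_offset)
{
	return (int) (zone_offset & DEMO_BIT_NUM_MASK);
}

/* Recombine the three parts; this should give back the original offset. */
static unsigned long demo_recombine(int page, int index, int bit)
{
	return ((unsigned long) page << (DEMO_PAGE_SHIFT + 3)) |
	       ((unsigned long) index << DEMO_UL_SHIFT) |
	       (unsigned long) bit;
}
```

With 4 KiB pages one bitmap page covers 32768 bits, so offset 32768 is the first bit of the second bitmap page.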
2060     diff --git a/include/linux/freezer.h b/include/linux/freezer.h
2061     index 5e75e26..f49c9be 100644
2062     --- a/include/linux/freezer.h
2063     +++ b/include/linux/freezer.h
2064     @@ -1,7 +1,10 @@
2065     -/* Freezer declarations */
2066     +#ifndef LINUX_FREEZER_H
2067     +#define LINUX_FREEZER_H
2068    
2069     #include <linux/sched.h>
2070    
2071     +/* Freezer declarations */
2072     +
2073     #ifdef CONFIG_PM
2074     /*
2075     * Check if a process has been frozen
2076     @@ -73,6 +76,18 @@ static inline int try_to_freeze(void)
2077    
2078     extern void thaw_some_processes(int all);
2079    
2080     +extern int freezer_state;
2081     +#define FREEZER_OFF 0
2082     +#define FREEZER_USERSPACE_FROZEN 1
2083     +#define FREEZER_FULLY_ON 2
2084     +
2085     +static inline int freezer_is_on(void)
2086     +{
2087     + return (freezer_state == FREEZER_FULLY_ON);
2088     +}
2089     +
2090     +extern void thaw_kernel_threads(void);
2091     +
2092     #else
2093     static inline int frozen(struct task_struct *p) { return 0; }
2094     static inline int freezing(struct task_struct *p) { return 0; }
2095     @@ -85,6 +100,9 @@ static inline int freeze_processes(void) { BUG(); return 0; }
2096     static inline void thaw_processes(void) {}
2097    
2098     static inline int try_to_freeze(void) { return 0; }
2099     +static inline int freezer_is_on(void) { return 0; }
2100     +static inline void thaw_kernel_threads(void) { }
2101    
2102    
2103     #endif
2104     +#endif
2105     diff --git a/include/linux/kernel.h b/include/linux/kernel.h
2106     index 9ddf25c..c18ec95 100644
2107     --- a/include/linux/kernel.h
2108     +++ b/include/linux/kernel.h
2109     @@ -113,6 +113,8 @@ extern int vsprintf(char *buf, const char *, va_list)
2110     __attribute__ ((format (printf, 2, 0)));
2111     extern int snprintf(char * buf, size_t size, const char * fmt, ...)
2112     __attribute__ ((format (printf, 3, 4)));
2113     +extern int snprintf_used(char *buffer, int buffer_size,
2114     + const char *fmt, ...);
2115     extern int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
2116     __attribute__ ((format (printf, 3, 0)));
2117     extern int scnprintf(char * buf, size_t size, const char * fmt, ...)
2118     diff --git a/include/linux/netlink.h b/include/linux/netlink.h
2119     index 2a20f48..b4d9db1 100644
2120     --- a/include/linux/netlink.h
2121     +++ b/include/linux/netlink.h
2122     @@ -24,6 +24,8 @@
2123     /* leave room for NETLINK_DM (DM Events) */
2124     #define NETLINK_SCSITRANSPORT 18 /* SCSI Transports */
2125     #define NETLINK_ECRYPTFS 19
2126     +#define NETLINK_SUSPEND2_USERUI 20 /* For suspend2's userui */
2127     +#define NETLINK_SUSPEND2_USM 21 /* For suspend2's userspace storage manager */
2128    
2129     #define MAX_LINKS 32
2130    
2131     diff --git a/include/linux/suspend.h b/include/linux/suspend.h
2132     index bf99bd4..02c1184 100644
2133     --- a/include/linux/suspend.h
2134     +++ b/include/linux/suspend.h
2135     @@ -27,14 +27,9 @@ extern void mark_free_pages(struct zone *zone);
2136     /* kernel/power/swsusp.c */
2137     extern int software_suspend(void);
2138    
2139     -#if defined(CONFIG_VT) && defined(CONFIG_VT_CONSOLE)
2140     extern int pm_prepare_console(void);
2141     extern void pm_restore_console(void);
2142     #else
2143     -static inline int pm_prepare_console(void) { return 0; }
2144     -static inline void pm_restore_console(void) {}
2145     -#endif /* defined(CONFIG_VT) && defined(CONFIG_VT_CONSOLE) */
2146     -#else
2147     static inline int software_suspend(void)
2148     {
2149     printk("Warning: fake suspend called\n");
2150     @@ -45,8 +40,6 @@ static inline int software_suspend(void)
2151     void save_processor_state(void);
2152     void restore_processor_state(void);
2153     struct saved_context;
2154     -void __save_processor_state(struct saved_context *ctxt);
2155     -void __restore_processor_state(struct saved_context *ctxt);
2156     unsigned long get_safe_page(gfp_t gfp_mask);
2157    
2158     /*
2159     @@ -55,4 +48,81 @@ unsigned long get_safe_page(gfp_t gfp_mask);
2160     */
2161     #define PAGES_FOR_IO 1024
2162    
2163     +enum {
2164     + SUSPEND_CAN_SUSPEND,
2165     + SUSPEND_CAN_RESUME,
2166     + SUSPEND_RUNNING,
2167     + SUSPEND_RESUME_DEVICE_OK,
2168     + SUSPEND_NORESUME_SPECIFIED,
2169     + SUSPEND_SANITY_CHECK_PROMPT,
2170     + SUSPEND_PAGESET2_NOT_LOADED,
2171     + SUSPEND_CONTINUE_REQ,
2172     + SUSPEND_RESUMED_BEFORE,
2173     + SUSPEND_RESUME_NOT_DONE,
2174     + SUSPEND_BOOT_TIME,
2175     + SUSPEND_NOW_RESUMING,
2176     + SUSPEND_IGNORE_LOGLEVEL,
2177     + SUSPEND_TRYING_TO_RESUME,
2178     + SUSPEND_TRY_RESUME_RD,
2179     + SUSPEND_LOADING_ALT_IMAGE,
2180     + SUSPEND_STOP_RESUME,
2181     + SUSPEND_IO_STOPPED,
2182     +};
2183     +
2184     +#ifdef CONFIG_SUSPEND2
2185     +
2186     +/* Used in init dir files */
2187     +extern unsigned long suspend_state;
2188     +#define set_suspend_state(bit) (set_bit(bit, &suspend_state))
2189     +#define clear_suspend_state(bit) (clear_bit(bit, &suspend_state))
2190     +#define test_suspend_state(bit) (test_bit(bit, &suspend_state))
2191     +extern int suspend2_running;
2192     +
2193     +#else /* !CONFIG_SUSPEND2 */
2194     +
2195     +#define suspend_state (0)
2196     +#define set_suspend_state(bit) do { } while(0)
2197     +#define clear_suspend_state(bit) do { } while (0)
2198     +#define test_suspend_state(bit) (0)
2199     +#define suspend2_running (0)
2200     +#endif /* CONFIG_SUSPEND2 */
2201     +
2202     +#ifdef CONFIG_SUSPEND_SHARED
2203     +#ifdef CONFIG_SUSPEND2
2204     +extern void suspend2_try_resume(void);
2205     +#else
2206     +#define suspend2_try_resume() do { } while(0)
2207     +#endif
2208     +
2209     +extern int resume_attempted;
2210     +
2211     +#ifdef CONFIG_SOFTWARE_SUSPEND
2212     +extern int software_resume(void);
2213     +#else
2214     +static inline int software_resume(void)
2215     +{
2216     + resume_attempted = 1;
2217     + suspend2_try_resume();
2218     + return 0;
2219     +}
2220     +#endif
2221     +
2222     +static inline void check_resume_attempted(void)
2223     +{
2224     + if (resume_attempted)
2225     + return;
2226     +
2227     + software_resume();
2228     +}
2229     +#else
2230     +#define check_resume_attempted() do { } while(0)
2231     +#define resume_attempted (0)
2232     +#endif
2233     +
2234     +#ifdef CONFIG_PRINTK_NOSAVE
2235     +#define POSS_NOSAVE __nosavedata
2236     +#else
2237     +#define POSS_NOSAVE
2238     +#endif
2239     +
2240     #endif /* _LINUX_SWSUSP_H */
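Note (not part of the patch): the suspend_state helpers added to suspend.h above treat each enum value as a bit position in a single unsigned long. The sketch below mimics that in userspace under stated assumptions: the demo_ names are hypothetical, only the first three enum values are mirrored, and plain non-atomic bit operations stand in for the kernel's atomic set_bit(), clear_bit() and test_bit().

```c
/* Hypothetical stand-ins for the first three SUSPEND_* values, in the
 * same order as the patch's enum. */
enum {
	DEMO_CAN_SUSPEND,
	DEMO_CAN_RESUME,
	DEMO_RUNNING,
};

/* One bit per enum value, as with suspend_state in the patch. */
static unsigned long demo_state;

/* Non-atomic equivalents of set_bit/clear_bit/test_bit on demo_state. */
static void demo_set_state(int bit)
{
	demo_state |= 1UL << bit;
}

static void demo_clear_state(int bit)
{
	demo_state &= ~(1UL << bit);
}

static int demo_test_state(int bit)
{
	return (int) ((demo_state >> bit) & 1UL);
}
```

The patch's enum has 18 values, so all of the flags fit in one unsigned long even where long is 32 bits wide.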
2241     diff --git a/include/linux/swap.h b/include/linux/swap.h
2242     index 0068688..ad3fead 100644
2243     --- a/include/linux/swap.h
2244     +++ b/include/linux/swap.h
2245     @@ -190,8 +190,9 @@ extern void swap_setup(void);
2246     /* linux/mm/vmscan.c */
2247     extern unsigned long try_to_free_pages(struct zone **, gfp_t,
2248     struct task_struct *p);
2249     extern unsigned long shrink_all_memory(unsigned long nr_pages);
2250     +extern void shrink_one_zone(struct zone *zone, int desired_size);
2251     extern int vm_mapped;
2252     extern int vm_hardmaplimit;
2253     extern int remove_mapping(struct address_space *mapping, struct page *page);
2254     extern long vm_total_pages;
2255     @@ -368,5 +369,10 @@ static inline swp_entry_t get_swap_page(void)
2256     #define disable_swap_token() do { } while(0)
2257    
2258     #endif /* CONFIG_SWAP */
2259     +
2260     +/* For Suspend2 - unlink LRU pages while saving separately */
2261     +void unlink_lru_lists(void);
2262     +void relink_lru_lists(void);
2263     +
2264     #endif /* __KERNEL__*/
2265     #endif /* _LINUX_SWAP_H */
2266     diff --git a/include/linux/time.h b/include/linux/time.h
2267     index 8ea8dea..5c07f00 100644
2268     --- a/include/linux/time.h
2269     +++ b/include/linux/time.h
2270     @@ -224,4 +224,7 @@ struct itimerval {
2271     */
2272     #define TIMER_ABSTIME 0x01
2273    
2274     +extern void save_avenrun(void);
2275     +extern void restore_avenrun(void);
2276     +
2277     #endif
2278     diff --git a/init/do_mounts.c b/init/do_mounts.c
2279     index dc1ec08..3e67387 100644
2280     --- a/init/do_mounts.c
2281     +++ b/init/do_mounts.c
2282     @@ -139,11 +139,16 @@ dev_t name_to_dev_t(char *name)
2283     char s[32];
2284     char *p;
2285     dev_t res = 0;
2286     - int part;
2287     + int part, mount_result;
2288    
2289     #ifdef CONFIG_SYSFS
2290     int mkdir_err = sys_mkdir("/sys", 0700);
2291     - if (sys_mount("sysfs", "/sys", "sysfs", 0, NULL) < 0)
2292     + /*
2293     + * When changing resume2 parameter for Software Suspend, sysfs may
2294     + * already be mounted.
2295     + */
2296     + mount_result = sys_mount("sysfs", "/sys", "sysfs", 0, NULL);
2297     + if (mount_result < 0 && mount_result != -EBUSY)
2298     goto out;
2299     #endif
2300    
2301     @@ -195,7 +200,8 @@ dev_t name_to_dev_t(char *name)
2302     res = try_name(s, part);
2303     done:
2304     #ifdef CONFIG_SYSFS
2305     - sys_umount("/sys", 0);
2306     + if (mount_result >= 0)
2307     + sys_umount("/sys", 0);
2308     out:
2309     if (!mkdir_err)
2310     sys_rmdir("/sys");
2311     @@ -434,12 +440,27 @@ void __init prepare_namespace(void)
2312    
2313     is_floppy = MAJOR(ROOT_DEV) == FLOPPY_MAJOR;
2314    
2315     + /* Suspend2:
2316     + * By this point, suspend_early_init has been called to initialise our
2317     + * sysfs interface. If modules are built in, they have registered (all
2318     + * of the above via initcalls).
2319     + *
2320     + * We have not yet looked to see if an image exists, however. If we
2321     + * have an initrd, it is expected that the user will have set it up
2322     + * to echo > /sys/power/suspend2/do_resume and thus initiate any
2323     + * resume. If they don't do that, we do it immediately after the initrd
2324     + * is finished (major issues if they mount filesystems rw from the
2325     + * initrd! - they are warned). If there's no usable initrd, we do our
2326     + * check next.
2327     + */
2328     if (initrd_load())
2329     goto out;
2330    
2331     if (is_floppy && rd_doload && rd_load_disk(0))
2332     ROOT_DEV = Root_RAM0;
2333    
2334     + check_resume_attempted();
2335     +
2336     mount_root();
2337     out:
2338     sys_mount(".", "/", NULL, MS_MOVE, NULL);
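Note (not part of the patch): the name_to_dev_t() change above turns on two decisions about the sysfs mount result. A minimal sketch of that logic follows, with hypothetical demo_ helpers that take the return value of the mount attempt; it is an illustration of the conditions in the hunks, not kernel code.

```c
#include <errno.h>

/* The lookup proceeds if the mount succeeded, or if it failed only
 * because sysfs was already mounted (-EBUSY). */
static int demo_mount_ok(int mount_result)
{
	return mount_result >= 0 || mount_result == -EBUSY;
}

/* Only undo a mount that this call actually performed; a pre-existing
 * mount (the -EBUSY case) is left in place. */
static int demo_should_umount(int mount_result)
{
	return mount_result >= 0;
}
```

The second helper is the point of the change: unmounting unconditionally would tear down a sysfs mount that some other code had set up.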
2339     diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c
2340     index 2cfd7cb..9c24ec4 100644
2341     --- a/init/do_mounts_initrd.c
2342     +++ b/init/do_mounts_initrd.c
2343     @@ -6,6 +6,7 @@
2344     #include <linux/romfs_fs.h>
2345     #include <linux/initrd.h>
2346     #include <linux/sched.h>
2347     +#include <linux/suspend.h>
2348     #include <linux/freezer.h>
2349    
2350     #include "do_mounts.h"
2351     @@ -58,10 +59,18 @@ static void __init handle_initrd(void)
2352     current->flags |= PF_NOFREEZE;
2353     pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
2354     if (pid > 0) {
2355     - while (pid != sys_wait4(-1, NULL, 0, NULL))
2356     + while (pid != sys_wait4(-1, NULL, 0, NULL)) {
2357     yield();
2358     + try_to_freeze();
2359     + }
2360     }
2361    
2362     + if (!resume_attempted)
2363     + printk(KERN_ERR "Suspend2: No attempt was made to resume from "
2364     + "any image that might exist.\n");
2365     + clear_suspend_state(SUSPEND_BOOT_TIME);
2366     + current->flags &= ~PF_NOFREEZE;
2367     +
2368     /* move initrd to rootfs' /old */
2369     sys_fchdir(old_fd);
2370     sys_mount("/", ".", NULL, MS_MOVE, NULL);
2371     diff --git a/init/main.c b/init/main.c
2372     index a92989e..cd3f60f 100644
2373     --- a/init/main.c
2374     +++ b/init/main.c
2375     @@ -54,6 +54,7 @@
2376     #include <linux/lockdep.h>
2377     #include <linux/pid_namespace.h>
2378     #include <linux/device.h>
2379     +#include <linux/suspend.h>
2380    
2381     #include <asm/io.h>
2382     #include <asm/bugs.h>
2383     @@ -804,7 +805,9 @@ static int __init init(void * unused)
2384    
2385     /*
2386     * check if there is an early userspace init. If yes, let it do all
2387     - * the work
2388     + * the work. For suspend2, we assume that it will do the right thing
2389     + * with regard to trying to resume at the right place. When that
2390     + * happens, the BOOT_TIME flag will be cleared.
2391     */
2392    
2393     if (!ramdisk_execute_command)
2394     diff --git a/kernel/kmod.c b/kernel/kmod.c
2395     index 7962761..f0d6fc1 100644
2396     --- a/kernel/kmod.c
2397     +++ b/kernel/kmod.c
2398     @@ -34,6 +34,7 @@
2399     #include <linux/kernel.h>
2400     #include <linux/init.h>
2401     #include <linux/resource.h>
2402     +#include <linux/freezer.h>
2403     #include <asm/uaccess.h>
2404    
2405     extern int max_threads;
2406     @@ -338,6 +339,11 @@ int call_usermodehelper_pipe(char *path, char **argv, char **envp,
2407     }
2408     sub_info.stdin = f;
2409    
2410     + if (freezer_is_on()) {
2411     + printk(KERN_WARNING "Freezer is on. Refusing to start %s.\n", path);
2412     + return -EBUSY;
2413     + }
2414     +
2415     queue_work(khelper_wq, &sub_info.work);
2416     wait_for_completion(&done);
2417     return sub_info.retval;
2418     diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig
2419     index 51a4dd0..4caac70 100644
2420     --- a/kernel/power/Kconfig
2421     +++ b/kernel/power/Kconfig
2422     @@ -48,6 +48,18 @@ config DISABLE_CONSOLE_SUSPEND
2423     suspend/resume routines, but may itself lead to problems, for example
2424     if netconsole is used.
2425    
2426     +config PRINTK_NOSAVE
2427     + depends on PM && PM_DEBUG
2428     + bool "Preserve printk data from boot kernel when resuming."
2429     + default n
2430     + ---help---
2431     + This option gives printk data and the associated variables the
2432     + attribute __nosavedata, which means that they will not be saved as
2433     + part of the image. The net effect is that after resuming, your
2434     + dmesg will show the messages from prior to the atomic restore,
2435     + instead of the messages from the resumed kernel. This may be
2436     + useful for debugging hibernation.
2437     +
2438     config PM_TRACE
2439     bool "Suspend/resume event tracing"
2440     depends on PM && PM_DEBUG && X86_32 && EXPERIMENTAL
2441     @@ -162,3 +174,174 @@ config APM_EMULATION
2442     random kernel OOPSes or reboots that don't seem to be related to
2443     anything, try disabling/enabling this option (or disabling/enabling
2444     APM in your BIOS).
2445     +
2446     +menuconfig SUSPEND2_CORE
2447     + tristate "Suspend2"
2448     + depends on PM
2449     + select DYN_PAGEFLAGS
2450     + select HOTPLUG_CPU if SMP
2451     + default y
2452     + ---help---
2453     + Suspend2 is the 'new and improved' suspend support.
2454     +
2455     + See the Suspend2 home page (suspend2.net)
2456     + for FAQs, HOWTOs and other documentation.
2457     +
2458     + comment "Image Storage (you need at least one allocator)"
2459     + depends on SUSPEND2_CORE
2460     +
2461     + config SUSPEND2_FILE
2462     + tristate "File Allocator"
2463     + depends on SUSPEND2_CORE
2464     + default y
2465     + ---help---
2466     + This option enables support for storing an image in a
2467     + simple file. This should be possible, but we're still
2468     + testing it.
2469     +
2470     + config SUSPEND2_SWAP
2471     + tristate "Swap Allocator"
2472     + depends on SUSPEND2_CORE
2473     + default y
2474     + select SWAP
2475     + ---help---
2476     + This option enables support for storing an image in your
2477     + swap space.
2478     +
2479     + comment "General Options"
2480     + depends on SUSPEND2_CORE
2481     +
2482     + config SUSPEND2_CRYPTO
2483     + tristate "Compression support"
2484     + depends on SUSPEND2_CORE && CRYPTO
2485     + default y
2486     + ---help---
2487     + This option adds support for using cryptoapi compression
2488     + algorithms. Compression is particularly useful as
2489     + the LZF support that comes with the Suspend2 patch can double
2490     + your suspend and resume speed.
2491     +
2492     + You probably want this, so say Y here.
2493     +
2494     + comment "No compression support available without Cryptoapi support."
2495     + depends on SUSPEND2_CORE && !CRYPTO
2496     +
2497     + config SUSPEND2_USERUI
2498     + tristate "Userspace User Interface support"
2499     + depends on SUSPEND2_CORE && NET
2500     + default y
2501     + ---help---
2502     + This option enables support for a userspace based user interface
2503     + to Suspend2, which allows you to have a nice display while suspending
2504     + and resuming, and also enables features such as pressing escape to
2505     + cancel a cycle or interactive debugging.
2506     +
2507     + config SUSPEND2_DEFAULT_RESUME2
2508     + string "Default resume device name"
2509     + depends on SUSPEND2_CORE
2510     + ---help---
2511     + You normally need to add a resume2= parameter to your lilo.conf or
2512     + equivalent. With this option properly set, the kernel has a value
2513     + to default to. No damage will be done if the value is invalid.
2514     +
2515     + config SUSPEND2_KEEP_IMAGE
2516     + bool "Allow Keep Image Mode"
2517     + depends on SUSPEND2_CORE
2518     + ---help---
2519     + This option allows you to keep an image and reuse it. It is intended
2520     + __ONLY__ for use with systems where all filesystems are mounted read-
2521     + only (kiosks, for example). To use it, compile this option in and boot
2522     + normally. Set the KEEP_IMAGE flag in /sys/power/suspend2 and suspend.
2523     + When you resume, the image will not be removed. You will be unable to turn
2524     + off swap partitions (assuming you are using the swap allocator), but future
2525     + suspends simply do a power-down. The image can be updated using the
2526     + kernel command line parameter suspend_act= to turn off the keep image
2527     + bit. Keep image mode is a little less user friendly on purpose - it
2528     + should not be used without thought!
2529     +
2530     + config SUSPEND2_REPLACE_SWSUSP
2531     + bool "Replace swsusp by default"
2532     + default y
2533     + depends on SUSPEND2_CORE
2534     + ---help---
2535     + Suspend2 can replace swsusp. This option makes that the default state,
2536     + requiring you to echo 0 > /sys/power/suspend2/replace_swsusp if you want
2537     + to use the vanilla kernel functionality. Note that your initrd/ramfs will
2538     + need to do this before trying to resume, too.
2539     + With overriding swsusp enabled, Suspend2 will use both the resume= and
2540     + noresume commandline options _and_ the resume2= and noresume2 ones (for
2541     + compatibility). resume= takes precedence over resume2=. Echoing disk
2542     + to /sys/power/state will start a Suspend2 cycle. If resume= doesn't
2543     + specify an allocator and both the swap and file allocators are compiled in,
2544     + the swap allocator will be used by default.
2545     +
2546     + config SUSPEND2_CLUSTER
2547     + tristate "Cluster support"
2548     + default n
2549     + depends on SUSPEND2_CORE && NET && BROKEN
2550     + ---help---
2551     + Support for linking multiple machines in a cluster so that they suspend
2552     + and resume together.
2553     +
2554     + config SUSPEND2_DEFAULT_CLUSTER_MASTER
2555     + string "Default cluster master address/port"
2556     + depends on SUSPEND2_CLUSTER
2557     + ---help---
2558     + If this machine will be the master, simply enter a port on which to
2559     + listen for slaves.
2560     + If this machine will be a slave, enter the ip address and port on
2561     + which the master listens with a colon separating them.
2562     + If no value is set here, cluster support will be disabled by default.
2563     +
2564     + config SUSPEND2_CHECKSUM
2565     + bool "Checksum pageset2"
2566     + depends on SUSPEND2_CORE
2567     + select CRYPTO
2568     + select CRYPTO_ALGAPI
2569     + select CRYPTO_MD5
2570     + ---help---
2571     + Adds support for checksumming pageset2 pages, to ensure you really get an
2572     + atomic copy. Should not normally be needed, but here for verification and
2573     + diagnostic purposes.
2574     +
2575     +config SUSPEND_SHARED
2576     + bool
2577     + depends on SUSPEND2_CORE || SOFTWARE_SUSPEND
2578     + default y
2579     +
2580     +config SUSPEND2_USERUI_EXPORTS
2581     + bool
2582     + depends on SUSPEND2_USERUI=m
2583     + default y
2584     +
2585     +config SUSPEND2_SWAP_EXPORTS
2586     + bool
2587     + depends on SUSPEND2_SWAP=m
2588     + default y
2589     +
2590     +config SUSPEND2_FILE_EXPORTS
2591     + bool
2592     + depends on SUSPEND2_FILE=m
2593     + default y
2594     +
2595     +config SUSPEND2_CRYPTO_EXPORTS
2596     + bool
2597     + depends on SUSPEND2_CRYPTO=m
2598     + default y
2599     +
2600     +config SUSPEND2_CORE_EXPORTS
2601     + bool
2602     + depends on SUSPEND2_CORE=m
2603     + default y
2604     +
2605     +config SUSPEND2_EXPORTS
2606     + bool
2607     + depends on SUSPEND2_SWAP_EXPORTS || SUSPEND2_FILE_EXPORTS || \
2608     + SUSPEND2_CRYPTO_EXPORTS || SUSPEND2_CLUSTER=m || \
2609     + SUSPEND2_USERUI_EXPORTS
2610     + default y
2611     +
2612     +config SUSPEND2
2613     + bool
2614     + depends on SUSPEND2_CORE!=n
2615     + default y
2616     diff --git a/kernel/power/Makefile b/kernel/power/Makefile
2617     index 38725f5..a4a8c5b 100644
2618     --- a/kernel/power/Makefile
2619     +++ b/kernel/power/Makefile
2620     @@ -5,6 +5,32 @@ endif
2621    
2622     obj-y := main.o process.o console.o
2623     obj-$(CONFIG_PM_LEGACY) += pm.o
2624     -obj-$(CONFIG_SOFTWARE_SUSPEND) += swsusp.o disk.o snapshot.o swap.o user.o
2625     +obj-$(CONFIG_SUSPEND_SHARED) += snapshot.o
2626     +
2627     +suspend_core-objs := modules.o sysfs.o suspend.o \
2628     + io.o pagedir.o prepare_image.o \
2629     + extent.o pageflags.o ui.o \
2630     + power_off.o atomic_copy.o
2631     +
2632     +obj-$(CONFIG_SUSPEND2) += suspend2_builtin.o
2633     +
2634     +ifdef CONFIG_SUSPEND2_CHECKSUM
2635     +suspend_core-objs += checksum.o
2636     +endif
2637     +
2638     +ifdef CONFIG_NET
2639     +suspend_core-objs += storage.o netlink.o
2640     +endif
2641     +
2642     +obj-$(CONFIG_SUSPEND2_CORE) += suspend_core.o
2643     +obj-$(CONFIG_SUSPEND2_CRYPTO) += suspend_compress.o
2644     +
2645     +obj-$(CONFIG_SUSPEND2_SWAP) += suspend_block_io.o suspend_swap.o
2646     +obj-$(CONFIG_SUSPEND2_FILE) += suspend_block_io.o suspend_file.o
2647     +obj-$(CONFIG_SUSPEND2_CLUSTER) += cluster.o
2648     +
2649     +obj-$(CONFIG_SUSPEND2_USERUI) += suspend_userui.o
2650     +
2651     +obj-$(CONFIG_SOFTWARE_SUSPEND) += swsusp.o disk.o swap.o user.o
2652    
2653     obj-$(CONFIG_MAGIC_SYSRQ) += poweroff.o
2654     diff --git a/kernel/power/atomic_copy.c b/kernel/power/atomic_copy.c
2655     new file mode 100644
2656     index 0000000..d2b8cd7
2657     --- /dev/null
2658     +++ b/kernel/power/atomic_copy.c
2659     @@ -0,0 +1,416 @@
2660     +/*
2661     + * kernel/power/atomic_copy.c
2662     + *
2663     + * Copyright 2004-2007 Nigel Cunningham (nigel at suspend2 net)
2664     + * Copyright (C) 2006 Red Hat, inc.
2665     + *
2666     + * Distributed under GPLv2.
2667     + *
2668     + * Routines for doing the atomic save/restore.
2669     + */
2670     +
2671     +#include <linux/suspend.h>
2672     +#include <linux/highmem.h>
2673     +#include <linux/cpu.h>
2674     +#include <linux/freezer.h>
2675     +#include <linux/console.h>
2676     +#include "suspend.h"
2677     +#include "storage.h"
2678     +#include "power_off.h"
2679     +#include "ui.h"
2680     +#include "power.h"
2681     +#include "io.h"
2682     +#include "prepare_image.h"
2683     +#include "pageflags.h"
2684     +#include "checksum.h"
2685     +#include "suspend2_builtin.h"
2686     +
2687     +int extra_pd1_pages_used;
2688     +
2689     +/*
2690     + * Highmem related functions (x86 only).
2691     + */
2692     +
2693     +#ifdef CONFIG_HIGHMEM
2694     +
2695     +/**
2696     + * copyback_high: Restore highmem pages.
2697     + *
2698     + * Highmem data and pbe lists are/can be stored in highmem.
2699     + * The format is slightly different to the lowmem pbe lists
2700     + * used for the assembly code: the last pbe in each page is
2701     + * a struct page * instead of struct pbe *, pointing to the
2702     + * next page where pbes are stored (or NULL if it happens to be
2703     + * the end of the list). Since we don't want to generate
2704     + * unnecessary deltas against swsusp code, we use a cast
2705     + * instead of a union.
2706     + **/
2707     +
2708     +static void copyback_high(void)
2709     +{
2710     + struct page * pbe_page = (struct page *) restore_highmem_pblist;
2711     + struct pbe *this_pbe, *first_pbe;
2712     + unsigned long *origpage, *copypage;
2713     + int pbe_index = 1;
2714     +
2715     + if (!pbe_page)
2716     + return;
2717     +
2718     + this_pbe = (struct pbe *) kmap_atomic(pbe_page, KM_BOUNCE_READ);
2719     + first_pbe = this_pbe;
2720     +
2721     + while (this_pbe) {
2722     + int loop = (PAGE_SIZE / sizeof(unsigned long)) - 1;
2723     +
2724     + origpage = kmap_atomic((struct page *) this_pbe->orig_address,
2725     + KM_BIO_DST_IRQ);
2726     + copypage = kmap_atomic((struct page *) this_pbe->address,
2727     + KM_BIO_SRC_IRQ);
2728     +
2729     + while (loop >= 0) {
2730     + *(origpage + loop) = *(copypage + loop);
2731     + loop--;
2732     + }
2733     +
2734     + kunmap_atomic(origpage, KM_BIO_DST_IRQ);
2735     + kunmap_atomic(copypage, KM_BIO_SRC_IRQ);
2736     +
2737     + if (!this_pbe->next)
2738     + break;
2739     +
2740     + if (pbe_index < PBES_PER_PAGE) {
2741     + this_pbe++;
2742     + pbe_index++;
2743     + } else {
2744     + pbe_page = (struct page *) this_pbe->next;
2745     + kunmap_atomic(first_pbe, KM_BOUNCE_READ);
2746     + if (!pbe_page)
2747     + return;
2748     + this_pbe = (struct pbe *) kmap_atomic(pbe_page,
2749     + KM_BOUNCE_READ);
2750     + first_pbe = this_pbe;
2751     + pbe_index = 1;
2752     + }
2753     + }
2754     + kunmap_atomic(first_pbe, KM_BOUNCE_READ);
2755     +}
2756     +
2757     +#else /* CONFIG_HIGHMEM */
2758     +void copyback_high(void) { }
2759     +#endif
2760     +
2761     +/**
2762     + * free_pbe_list: Free page backup entries used by the atomic copy code.
2763     + *
2764     + * Normally, this function isn't used. If, however, we need to abort before
2765     + * doing the atomic copy, we use this to free the pbes previously allocated.
2766     + **/
2767     +static void free_pbe_list(struct pbe **list, int highmem)
2768     +{
2769     + struct pbe *free_pbe = *list;
2770     + struct page *page = (struct page *) free_pbe;
2771     +
2772     + do {
2773     + int i;
2774     +
2775     + if (highmem)
2776     + free_pbe = (struct pbe *) kmap(page);
2777     +
2778     + for (i = 0; i < PBES_PER_PAGE; i++) {
2779     + if (!free_pbe)
2780     + break;
2781     + __free_page(free_pbe->address);
2782     + free_pbe = free_pbe->next;
2783     + }
2784     +
2785     + if (highmem) {
2786     + struct page *next_page = NULL;
2787     + if (free_pbe)
2788     + next_page = (struct page *) free_pbe->next;
2789     + kunmap(page);
2790     + __free_page(page);
2791     + page = next_page;
2792     + }
2793     +
2794     + } while(page && free_pbe);
2795     +
2796     + *list = NULL;
2797     +}
2798     +
2799     +/**
2800     + * copyback_post: Post atomic-restore actions.
2801     + *
2802     + * After doing the atomic restore, we have a few more things to do:
2803     + * 1) We want to retain some values across the restore, so we now copy
2804     + * these from the nosave variables to the normal ones.
2805     + * 2) Set the status flags.
2806     + * 3) Resume devices.
2807     + * 4) Get userui to redraw.
2808     + * 5) Reread the page cache.
2809     + **/
2810     +
2811     +void copyback_post(void)
2812     +{
2813     + int loop;
2814     +
2815     + suspend_action = suspend2_nosave_state1;
2816     + suspend_debug_state = suspend2_nosave_state2;
2817     + console_loglevel = suspend2_nosave_state3;
2818     +
2819     + for (loop = 0; loop < 4; loop++)
2820     + suspend_io_time[loop/2][loop%2] =
2821     + suspend2_nosave_io_speed[loop/2][loop%2];
2822     +
2823     + set_suspend_state(SUSPEND_NOW_RESUMING);
2824     + set_suspend_state(SUSPEND_PAGESET2_NOT_LOADED);
2825     +
2826     + if (suspend_activate_storage(1))
2827     + panic("Failed to reactivate our storage.");
2828     +
2829     + suspend_ui_redraw();
2830     +
2831     + suspend_cond_pause(1, "About to reload secondary pagedir.");
2832     +
2833     + if (read_pageset2(0))
2834     + panic("Unable to successfully reread the page cache.");
2835     +
2836     + clear_suspend_state(SUSPEND_PAGESET2_NOT_LOADED);
2837     +}
2838     +
2839     +/**
2840     + * suspend_copy_pageset1: Do the atomic copy of pageset1.
2841     + *
2842     + * Make the atomic copy of pageset1. We can't use copy_page (as we once did)
2843     + * because we can't be sure what side effects it has. On my old Duron, with
2844     + * 3DNOW, kernel_fpu_begin increments preempt count, making our preempt
2845     + * count at resume time 4 instead of 3.
2846     + *
2847     + * We don't want to call kmap_atomic unconditionally because it has the side
2848     + * effect of incrementing the preempt count, which will leave it one too high
2849     + * post resume (the page containing the preempt count will be copied after
2850     + * it's incremented). This is essentially the same problem.
2851     + **/
2852     +
2853     +void suspend_copy_pageset1(void)
2854     +{
2855     + int i;
2856     + unsigned long source_index, dest_index;
2857     +
2858     + source_index = get_next_bit_on(pageset1_map, max_pfn + 1);
2859     + dest_index = get_next_bit_on(pageset1_copy_map, max_pfn + 1);
2860     +
2861     + for (i = 0; i < pagedir1.size; i++) {
2862     + unsigned long *origvirt, *copyvirt;
2863     + struct page *origpage, *copypage;
2864     + int loop = (PAGE_SIZE / sizeof(unsigned long)) - 1;
2865     +
2866     + origpage = pfn_to_page(source_index);
2867     + copypage = pfn_to_page(dest_index);
2868     +
2869     + origvirt = PageHighMem(origpage) ?
2870     + kmap_atomic(origpage, KM_USER0) :
2871     + page_address(origpage);
2872     +
2873     + copyvirt = PageHighMem(copypage) ?
2874     + kmap_atomic(copypage, KM_USER1) :
2875     + page_address(copypage);
2876     +
2877     + while (loop >= 0) {
2878     + *(copyvirt + loop) = *(origvirt + loop);
2879     + loop--;
2880     + }
2881     +
2882     + if (PageHighMem(origpage))
2883     + kunmap_atomic(origvirt, KM_USER0);
2884     + else if (suspend2_faulted) {
2885     + printk("%p (%lu) being unmapped after faulting during atomic copy.\n", origpage, source_index);
2886     + kernel_map_pages(origpage, 1, 0);
2887     + clear_suspend2_fault();
2888     + }
2889     +
2890     + if (PageHighMem(copypage))
2891     + kunmap_atomic(copyvirt, KM_USER1);
2892     +
2893     + source_index = get_next_bit_on(pageset1_map, source_index);
2894     + dest_index = get_next_bit_on(pageset1_copy_map, dest_index);
2895     + }
2896     +}
2897     +
2898     +/**
2899     + * __suspend_post_context_save: Steps after saving the cpu context.
2900     + *
2901     + * Steps taken after saving the CPU state to make the actual
2902     + * atomic copy.
2903     + *
2904     + * Called from swsusp_save in snapshot.c via suspend_post_context_save.
2905     + **/
2906     +
2907     +int __suspend_post_context_save(void)
2908     +{
2909     + int old_ps1_size = pagedir1.size;
2910     +
2911     + calculate_check_checksums(1);
2912     +
2913     + free_checksum_pages();
2914     +
2915     + suspend_recalculate_image_contents(1);
2916     +
2917     + extra_pd1_pages_used = pagedir1.size - old_ps1_size;
2918     +
2919     + if (extra_pd1_pages_used > extra_pd1_pages_allowance) {
2920     + printk("Pageset1 has grown by %d pages. "
2921     + "extra_pages_allowance is currently only %d.\n",
2922     + pagedir1.size - old_ps1_size,
2923     + extra_pd1_pages_allowance);
2924     + set_result_state(SUSPEND_ABORTED);
2925     + set_result_state(SUSPEND_EXTRA_PAGES_ALLOW_TOO_SMALL);
2926     + return -1;
2927     + }
2928     +
2929     + if (!test_action_state(SUSPEND_TEST_FILTER_SPEED) &&
2930     + !test_action_state(SUSPEND_TEST_BIO))
2931     + suspend_copy_pageset1();
2932     +
2933     + return 0;
2934     +}
2935     +
2936     +/**
2937     + * suspend2_suspend: High level code for doing the atomic copy.
2938     + *
2939     + * High-level code which prepares to do the atomic copy. Loosely based
2940     + * on the swsusp version, but with the following twists:
2941     + * - We set suspend2_running so the swsusp code uses our code paths.
2942     + * - We give better feedback regarding what goes wrong if there is a problem.
2943     + * - We use an extra function to call the assembly, just in case this code
2944     + * is in a module (return address).
2945     + **/
2946     +
2947     +int suspend2_suspend(void)
2948     +{
2949     + int error;
2950     +
2951     + suspend2_running = 1; /* For the swsusp code we use :< */
2952     +
2953     + if (test_action_state(SUSPEND_PM_PREPARE_CONSOLE))
2954     + pm_prepare_console();
2955     +
2956     + if ((error = arch_prepare_suspend()))
2957     + goto err_out;
2958     +
2959     + local_irq_disable();
2960     +
2961     + /* At this point, device_suspend() has been called, but *not*
2962     + * device_power_down(). We *must* device_power_down() now.
2963     + * Otherwise, drivers for some devices (e.g. interrupt controllers)
2964     + * become desynchronized with the actual state of the hardware
2965     + * at resume time, and evil weirdness ensues.
2966     + */
2967     +
2968     + if ((error = device_power_down(PMSG_FREEZE))) {
2969     + set_result_state(SUSPEND_DEVICE_REFUSED);
2970     + set_result_state(SUSPEND_ABORTED);
2971     + printk(KERN_ERR "Some devices failed to power down, aborting suspend\n");
2972     + goto enable_irqs;
2973     + }
2974     +
2975     + error = suspend2_lowlevel_builtin();
2976     +
2977     + if (!suspend2_in_suspend)
2978     + copyback_high();
2979     +
2980     + device_power_up();
2981     +enable_irqs:
2982     + local_irq_enable();
2983     + if (test_action_state(SUSPEND_PM_PREPARE_CONSOLE))
2984     + pm_restore_console();
2985     +err_out:
2986     + suspend2_running = 0;
2987     + return error;
2988     +}
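suspend2_suspend unwinds with stacked goto labels: a failure jumps to the label that undoes only the steps already taken, and the success path falls through every label. The sketch below reproduces just that control flow in userspace. sketch_suspend, sketch_note and the one-letter step log are hypothetical stand-ins, the console handling is left out, and no real device or interrupt calls are made.

```c
/* Hypothetical step log so the order of unwinding can be inspected. */
static char sketch_log[16];
static int sketch_log_len;

static void sketch_note(char step)
{
	if (sketch_log_len < (int)sizeof(sketch_log) - 1)
		sketch_log[sketch_log_len++] = step;
	sketch_log[sketch_log_len] = '\0';
}

/* fail_at: 0 = no failure, 1 = prepare step fails, 2 = power-down fails. */
static int sketch_suspend(int fail_at)
{
	int error;

	sketch_log_len = 0;
	sketch_log[0] = '\0';

	error = (fail_at == 1) ? -1 : 0;	/* stands in for arch_prepare_suspend() */
	if (error)
		goto err_out;

	sketch_note('d');			/* interrupts disabled */

	error = (fail_at == 2) ? -2 : 0;	/* stands in for device_power_down() */
	if (error)
		goto enable_irqs;
	sketch_note('p');			/* devices powered down */

	sketch_note('L');			/* low-level atomic copy */
	sketch_note('u');			/* devices powered back up */
enable_irqs:
	sketch_note('e');			/* interrupts enabled again */
err_out:
	sketch_note('x');			/* running flag cleared */
	return error;
}
```

With no failure the log reads "dpLuex". A power-down failure gives "dex", and a prepare failure gives just "x".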
2989     +
2990     +/**
2991     + * suspend_atomic_restore: Prepare to do the atomic restore.
2992     + *
2993     + * Get ready to do the atomic restore. This part gets us into the same
2994     + * state we are in prior to calling do_suspend2_lowlevel while
2995     + * suspending: hot-unplugging secondary cpus and freezing processes,
2996     + * before starting the thread that will do the restore.
2997     + **/
2998     +
2999     +int suspend_atomic_restore(void)
3000     +{
3001     + int error, loop;
3002     +
3003     + suspend2_running = 1;
3004     +
3005     + suspend_prepare_status(DONT_CLEAR_BAR, "Prepare console");
3006     +
3007     + if (test_action_state(SUSPEND_PM_PREPARE_CONSOLE))
3008     + pm_prepare_console();
3009     +
3010     + suspend_prepare_status(DONT_CLEAR_BAR, "Device suspend.");
3011     +
3012     + suspend_console();
3013     + if ((error = device_suspend(PMSG_PRETHAW))) {
3014     + printk("Some devices failed to suspend\n");
3015     + goto device_resume;
3016     + }
3017     +
3018     + if (test_action_state(SUSPEND_LATE_CPU_HOTPLUG)) {
3019     + suspend_prepare_status(DONT_CLEAR_BAR, "Disable nonboot cpus.");
3020     + if (disable_nonboot_cpus()) {
3021     + set_result_state(SUSPEND_CPU_HOTPLUG_FAILED);
3022     + set_result_state(SUSPEND_ABORTED);
3023     + goto device_resume;
3024     + }
3025     + }
3026     +
3027     + suspend_prepare_status(DONT_CLEAR_BAR, "Atomic restore preparation");
3028     +
3029     + suspend2_nosave_state1 = suspend_action;
3030     + suspend2_nosave_state2 = suspend_debug_state;
3031     + suspend2_nosave_state3 = console_loglevel;
3032     +
3033     + for (loop = 0; loop < 4; loop++)
3034     + suspend2_nosave_io_speed[loop/2][loop%2] =
3035     + suspend_io_time[loop/2][loop%2];
3036     + memcpy(suspend2_nosave_commandline, saved_command_line, COMMAND_LINE_SIZE);
3037     +
3038     + mb();
3039     +
3040     + local_irq_disable();
3041     +
3042     + if (device_power_down(PMSG_FREEZE)) {
3043     + printk(KERN_ERR "Some devices failed to power down. Very bad.\n");
3044     + goto device_power_up;
3045     + }
3046     +
3047     + /* We'll ignore saved state, but this gets preempt count (etc) right */
3048     + save_processor_state();
3049     +
3050     + error = swsusp_arch_resume();
3051     + /*
3052     + * Code below is only ever reached in case of failure. Otherwise
3053     + * execution continues at the place where swsusp_arch_suspend was called.
3054     + *
3055     + * We don't know whether it's safe to continue (this shouldn't happen),
3056     + * so let's err on the side of caution.
3057     + */
3058     + BUG();
3059     +
3060     +device_power_up:
3061     + device_power_up();
3062     + if (test_action_state(SUSPEND_LATE_CPU_HOTPLUG))
3063     + enable_nonboot_cpus();
3064     +device_resume:
3065     + device_resume();
3066     + resume_console();
3067     + free_pbe_list(&restore_pblist, 0);
3068     +#ifdef CONFIG_HIGHMEM
3069     + free_pbe_list(&restore_highmem_pblist, 1);
3070     +#endif
3071     + if (test_action_state(SUSPEND_PM_PREPARE_CONSOLE))
3072     + pm_restore_console();
3073     + suspend2_running = 0;
3074     + return 1;
3075     +}
3076     diff --git a/kernel/power/block_io.h b/kernel/power/block_io.h
3077     new file mode 100644
3078     index 0000000..49eaa51
3079     --- /dev/null
3080     +++ b/kernel/power/block_io.h
3081     @@ -0,0 +1,55 @@
3082     +/*
3083     + * kernel/power/block_io.h
3084     + *
3085     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
3086     + * Copyright (C) 2006 Red Hat, inc.
3087     + *
3088     + * Distributed under GPLv2.
3089     + *
3090     + * This file contains declarations for functions exported from
3091     + * block_io.c, which contains low level io functions.
3092     + */
3093     +
3094     +#include <linux/buffer_head.h>
3095     +#include "extent.h"
3096     +
3097     +struct suspend_bdev_info {
3098     + struct block_device *bdev;
3099     + dev_t dev_t;
3100     + int bmap_shift;
3101     + int blocks_per_page;
3102     +};
3103     +
3104     +/*
3105     + * Our exported interface so the swapwriter and filewriter don't
3106     + * need these functions duplicated.
3107     + */
3108     +struct suspend_bio_ops {
3109     + int (*bdev_page_io) (int rw, struct block_device *bdev, long pos,
3110     + struct page *page);
3111     + void (*check_io_stats) (void);
3112     + void (*reset_io_stats) (void);
3113     + void (*finish_all_io) (void);
3114     + int (*forward_one_page) (void);
3115     + void (*set_extra_page_forward) (void);
3116     + void (*set_devinfo) (struct suspend_bdev_info *info);
3117     + int (*read_chunk) (unsigned long *index, struct page *buffer_page,
3118     + unsigned int *buf_size, int sync);
3119     + int (*write_chunk) (unsigned long index, struct page *buffer_page,
3120     + unsigned int buf_size);
3121     + void (*read_header_init) (void);
3122     + int (*rw_header_chunk) (int rw, struct suspend_module_ops *owner,
3123     + char *buffer, int buffer_size);
3124     + int (*write_header_chunk_finish) (void);
3125     + int (*rw_init) (int rw, int stream_number);
3126     + int (*rw_cleanup) (int rw);
3127     +};
3128     +
3129     +extern struct suspend_bio_ops suspend_bio_ops;
3130     +
3131     +extern char *suspend_writer_buffer;
3132     +extern int suspend_writer_buffer_posn;
3133     +extern int suspend_read_fd;
3134     +extern struct extent_iterate_saved_state suspend_writer_posn_save[3];
3135     +extern struct extent_iterate_state suspend_writer_posn;
3136     +extern int suspend_header_bytes_used;
3137     diff --git a/kernel/power/checksum.c b/kernel/power/checksum.c
3138     new file mode 100644
3139     index 0000000..356af21
3140     --- /dev/null
3141     +++ b/kernel/power/checksum.c
3142     @@ -0,0 +1,371 @@
3143     +/*
3144     + * kernel/power/checksum.c
3145     + *
3146     + * Copyright (C) 2006-2007 Nigel Cunningham (nigel at suspend2 net)
3147     + * Copyright (C) 2006 Red Hat, inc.
3148     + *
3149     + * This file is released under the GPLv2.
3150     + *
3151     + * This file contains data checksum routines for suspend2,
3152     + * using cryptoapi. They are used to locate any modifications
3153     + * made to pageset 2 while we're saving it.
3154     + */
3155     +
3156     +#include <linux/suspend.h>
3157     +#include <linux/module.h>
3158     +#include <linux/highmem.h>
3159     +#include <linux/vmalloc.h>
3160     +#include <linux/crypto.h>
3161     +#include <linux/scatterlist.h>
3162     +
3163     +#include "suspend.h"
3164     +#include "modules.h"
3165     +#include "sysfs.h"
3166     +#include "io.h"
3167     +#include "pageflags.h"
3168     +#include "checksum.h"
3169     +#include "pagedir.h"
3170     +
3171     +static struct suspend_module_ops suspend_checksum_ops;
3172     +
3173     +/* Constant at the mo, but I might allow tuning later */
3174     +static char suspend_checksum_name[32] = "md5";
3175     +/* Bytes per checksum */
3176     +#define CHECKSUM_SIZE (128 / 8)
3177     +
3178     +#define CHECKSUMS_PER_PAGE ((PAGE_SIZE - sizeof(void *)) / CHECKSUM_SIZE)
3179     +
3180     +static struct crypto_hash *suspend_checksum_transform;
3181     +static struct hash_desc desc;
3182     +static int pages_allocated;
3183     +static unsigned long page_list;
3184     +
3185     +static int suspend_num_resaved = 0;
3186     +
3187     +#if 1
3188     +#define PRINTK(a, b...) do { } while(0)
3189     +#else
3190     +#define PRINTK(a, b...) do { printk(a, ##b); } while(0)
3191     +#endif
3192     +
3193     +/* ---- Local buffer management ---- */
3194     +
3195     +/*
3196     + * suspend_checksum_cleanup
3197     + *
3198     + * Frees memory allocated for our labours.
3199     + */
3200     +static void suspend_checksum_cleanup(int ending_cycle)
3201     +{
3202     + if (ending_cycle && suspend_checksum_transform) {
3203     + crypto_free_hash(suspend_checksum_transform);
3204     + suspend_checksum_transform = NULL;
3205     + desc.tfm = NULL;
3206     + }
3207     +}
3208     +
3209     +/*
3210     + * suspend_checksum_prepare
3211     + *
3212     + * Prepare to do some work by allocating the hash transform.
3213     + * Returns: Int: Zero on success or when there is nothing to do; one if
3214     + * no algorithm name is set or the transform cannot be allocated.
3215     + */
3216     +static int suspend_checksum_prepare(int starting_cycle)
3217     +{
3218     + if (!starting_cycle || !suspend_checksum_ops.enabled)
3219     + return 0;
3220     +
3221     + if (!*suspend_checksum_name) {
3222     + printk("Suspend2: No checksum algorithm name set.\n");
3223     + return 1;
3224     + }
3225     +
3226     + suspend_checksum_transform = crypto_alloc_hash(suspend_checksum_name, 0, 0);
3227     + if (IS_ERR(suspend_checksum_transform)) {
3228     + printk("Suspend2: Failed to initialise the %s checksum algorithm: %ld.\n",
3229     + suspend_checksum_name,
3230     + (long) suspend_checksum_transform);
3231     + suspend_checksum_transform = NULL;
3232     + return 1;
3233     + }
3234     +
3235     + desc.tfm = suspend_checksum_transform;
3236     + desc.flags = 0;
3237     +
3238     + return 0;
3239     +}
3240     +
3241     +static int suspend_print_task_if_using_page(struct task_struct *t, struct page *seeking)
3242     +{
3243     + struct vm_area_struct *vma;
3244     + struct mm_struct *mm;
3245     + int result = 0;
3246     +
3247     + mm = t->active_mm;
3248     +
3249     + if (!mm || !mm->mmap) return 0;
3250     +
3251     + /* Don't try to take the sem when processes are frozen,
3252     + * drivers are suspended and irqs are disabled. We're
3253     + * not racing with anything anyway. */
3254     + if (!irqs_disabled())
3255     + down_read(&mm->mmap_sem);
3256     +
3257     + for (vma = mm->mmap; vma; vma = vma->vm_next) {
3258     + if (vma->vm_flags & VM_PFNMAP)
3259     + continue;
3260     + if (vma->vm_start) {
3261     + unsigned long posn;
3262     + for (posn = vma->vm_start; posn < vma->vm_end;
3263     + posn += PAGE_SIZE) {
3264     + struct page *page =
3265     + follow_page(vma, posn, 0);
3266     + if (page == seeking) {
3267     + printk("%s(%d)", t->comm, t->pid);
3268     + result = 1;
3269     + goto out;
3270     + }
3271     + }
3272     + }
3273     + }
3274     +
3275     +out:
3276     + if (!irqs_disabled())
3277     + up_read(&mm->mmap_sem);
3278     +
3279     + return result;
3280     +}
3281     +
3282     +static void print_tasks_using_page(struct page *seeking)
3283     +{
3284     + struct task_struct *p;
3285     +
3286     + read_lock(&tasklist_lock);
3287     + for_each_process(p) {
3288     + if (suspend_print_task_if_using_page(p, seeking))
3289     + printk(" ");
3290     + }
3291     + read_unlock(&tasklist_lock);
3292     +}
3293     +
3294     +/*
3295     + * suspend_checksum_print_debug_stats
3296     + * @buffer: Pointer to a buffer into which the debug info will be printed.
3297     + * @size: Size of the buffer.
3298     + *
3299     + * Print information to be recorded for debugging purposes into a buffer.
3300     + * Returns: Number of characters written to the buffer.
3301     + */
3302     +
3303     +static int suspend_checksum_print_debug_stats(char *buffer, int size)
3304     +{
3305     + int len;
3306     +
3307     + if (!suspend_checksum_ops.enabled)
3308     + return snprintf_used(buffer, size,
3309     + "- Checksumming disabled.\n");
3310     +
3311     + len = snprintf_used(buffer, size, "- Checksum method is '%s'.\n",
3312     + suspend_checksum_name);
3313     + len+= snprintf_used(buffer + len, size - len,
3314     + " %d pages resaved in atomic copy.\n", suspend_num_resaved);
3315     + return len;
3316     +}
3317     +
3318     +static int suspend_checksum_storage_needed(void)
3319     +{
3320     + if (suspend_checksum_ops.enabled)
3321     + return strlen(suspend_checksum_name) + sizeof(int) + 1;
3322     + else
3323     + return 0;
3324     +}
3325     +
3326     +/*
3327     + * suspend_checksum_save_config_info
3328     + * @buffer: Pointer to a buffer of size PAGE_SIZE.
3329     + *
3330     + * Save information needed when reloading the image at resume time.
3331     + * Returns: Number of bytes used for saving our data.
3332     + */
3333     +static int suspend_checksum_save_config_info(char *buffer)
3334     +{
3335     + int namelen = strlen(suspend_checksum_name) + 1;
3336     + int total_len;
3337     +
3338     + *((unsigned int *) buffer) = namelen;
3339     + strncpy(buffer + sizeof(unsigned int), suspend_checksum_name,
3340     + namelen);
3341     + total_len = sizeof(unsigned int) + namelen;
3342     + return total_len;
3343     +}
3344     +
3345     +/* suspend_checksum_load_config_info
3346     + * @buffer: Pointer to the start of the data.
3347     + * @size: Number of bytes that were saved.
3348     + *
3349     + * Description: Reload the checksum settings saved with the image at
3350     + * resume time.
3351     + */
3352     +static void suspend_checksum_load_config_info(char *buffer, int size)
3353     +{
3354     + int namelen;
3355     +
3356     + namelen = *((unsigned int *) (buffer));
3357     + strncpy(suspend_checksum_name, buffer + sizeof(unsigned int),
3358     + namelen);
3359     + return;
3360     +}
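The save and load helpers above store the algorithm name as a length-prefixed string: an unsigned int holding the length including the terminating NUL, followed by the bytes of the name. A hedged userspace sketch of the same layout is shown below. sketch_save_name and sketch_load_name are hypothetical names, memcpy replaces the pointer cast and strncpy, and the bounds checks on load are an addition that the patch does not have.

```c
#include <string.h>

/* Write <unsigned int length><name bytes incl. NUL>; return bytes used. */
static int sketch_save_name(char *buffer, const char *name)
{
	unsigned int namelen = (unsigned int)strlen(name) + 1;

	memcpy(buffer, &namelen, sizeof(namelen));
	memcpy(buffer + sizeof(namelen), name, namelen);
	return (int)(sizeof(namelen) + namelen);
}

/* Read the record back; return 0 on success, -1 if it does not fit. */
static int sketch_load_name(const char *buffer, int size,
			    char *out, size_t out_size)
{
	unsigned int namelen;

	if (size < (int)sizeof(namelen))
		return -1;
	memcpy(&namelen, buffer, sizeof(namelen));

	/* Added checks: reject lengths that overrun either buffer. */
	if (namelen == 0 || namelen > out_size ||
	    sizeof(namelen) + namelen > (size_t)size)
		return -1;

	memcpy(out, buffer + sizeof(namelen), namelen);
	out[namelen - 1] = '\0';
	return 0;
}
```

For the name "md5" the record occupies sizeof(unsigned int) + 4 bytes, which matches what suspend_checksum_storage_needed reports.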
3361     +
3362     +/*
3363     + * Free Checksum Memory
3364     + */
3365     +
3366     +void free_checksum_pages(void)
3367     +{
3368     + PRINTK("Freeing %d checksum pages.\n", pages_allocated);
3369     + while (pages_allocated) {
3370     + unsigned long next = *((unsigned long *) page_list);
3371     + PRINTK("Page %3d is at %lx and points to %lx.\n", pages_allocated, page_list, next);
3372     + ClearPageNosave(virt_to_page(page_list));
3373     + free_page((unsigned long) page_list);
3374     + page_list = next;
3375     + pages_allocated--;
3376     + }
3377     +}
3378     +
3379     +/*
3380     + * Allocate Checksum Memory
3381     + */
3382     +
3383     +int allocate_checksum_pages(void)
3384     +{
3385     + int pages_needed = DIV_ROUND_UP(pagedir2.size, CHECKSUMS_PER_PAGE);
3386     +
3387     + if (!suspend_checksum_ops.enabled)
3388     + return 0;
3389     +
3390     + PRINTK("Need %d checksum pages for %ld pageset2 pages.\n", pages_needed, pagedir2.size);
3391     + while (pages_allocated < pages_needed) {
3392     + unsigned long *new_page =
3393     + (unsigned long *) get_zeroed_page(GFP_ATOMIC);
3394     + if (!new_page)
3395     + return -ENOMEM;
3396     + SetPageNosave(virt_to_page(new_page));
3397     + (*new_page) = page_list;
3398     + page_list = (unsigned long) new_page;
3399     + pages_allocated++;
3400     + PRINTK("Page %3d is at %lx and points to %lx.\n", pages_allocated, page_list, *((unsigned long *) page_list));
3401     + }
3402     +
3403     + return 0;
3404     +}
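allocate_checksum_pages and free_checksum_pages keep the checksum pages on an intrusive singly linked list: the first pointer-sized word of each page holds the address of the previously allocated page, which is why CHECKSUMS_PER_PAGE subtracts sizeof(void *) from PAGE_SIZE. The following is a minimal userspace sketch of that bookkeeping. calloc stands in for get_zeroed_page, the 4096-byte page size is an assumption, and the sketch_ names are hypothetical.

```c
#include <stdlib.h>

#define SKETCH_PAGE_SIZE	4096		/* assumed page size */
#define SKETCH_CHECKSUM_SIZE	(128 / 8)	/* bytes per checksum, as in the patch */
#define SKETCH_CHECKSUMS_PER_PAGE \
	((SKETCH_PAGE_SIZE - sizeof(void *)) / SKETCH_CHECKSUM_SIZE)

struct sketch_chain {
	void *head;	/* most recently allocated page, or NULL */
	int pages;	/* number of pages on the chain */
};

/* Round up: how many pages are needed to hold nchecksums slots. */
static int sketch_pages_needed(size_t nchecksums)
{
	return (int)((nchecksums + SKETCH_CHECKSUMS_PER_PAGE - 1) /
		     SKETCH_CHECKSUMS_PER_PAGE);
}

/* Push zeroed pages onto the chain until it holds pages_needed pages. */
static int sketch_chain_grow(struct sketch_chain *c, int pages_needed)
{
	while (c->pages < pages_needed) {
		void **new_page = calloc(1, SKETCH_PAGE_SIZE);

		if (!new_page)
			return -1;
		*new_page = c->head;	/* first word links to the old head */
		c->head = new_page;
		c->pages++;
	}
	return 0;
}

/* Pop and free every page, following the link stored in each one. */
static void sketch_chain_free(struct sketch_chain *c)
{
	while (c->pages) {
		void *next = *(void **)c->head;

		free(c->head);
		c->head = next;
		c->pages--;
	}
}
```

With these assumed sizes a page holds 255 checksums, so 256 pageset2 pages already need a second checksum page.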
3405     +
3406     +#if 0
3407     +static void print_checksum(char *buf, int size)
3408     +{
3409     + int index;
3410     +
3411     + for (index = 0; index < size; index++)
3412     + printk("%x ", buf[index]);
3413     +
3414     + printk("\n");
3415     +}
3416     +#endif
3417     +
3418     +/*
3419     + * Calculate checksums
3420     + */
3421     +
3422     +void calculate_check_checksums(int check)
3423     +{
3424     + int pfn, index = 0;
3425     + unsigned long next_page, this_checksum = 0;
3426     + struct scatterlist sg[2];
3427     + char current_checksum[CHECKSUM_SIZE];
3428     +
3429     + if (!suspend_checksum_ops.enabled)
3430     + return;
3431     +
3432     + next_page = (unsigned long) page_list;
3433     +
3434     + if (check)
3435     + suspend_num_resaved = 0;
3436     +
3437     + BITMAP_FOR_EACH_SET(pageset2_map, pfn) {
3438     + int ret;
3439     + if (index % CHECKSUMS_PER_PAGE) {
3440     + this_checksum += CHECKSUM_SIZE;
3441     + } else {
3442     + this_checksum = next_page + sizeof(void *);
3443     + next_page = *((unsigned long *) next_page);
3444     + }
3445     + PRINTK("Put checksum for page %3d %p in %lx.\n", index, page_address(pfn_to_page(pfn)), this_checksum);
3446     + sg_set_buf(&sg[0], page_address(pfn_to_page(pfn)), PAGE_SIZE);
3447     + if (check) {
3448     + ret = crypto_hash_digest(&desc, sg,
3449     + PAGE_SIZE, current_checksum);
3450     + if (memcmp(current_checksum, (char *) this_checksum, CHECKSUM_SIZE)) {
3451     + SetPageResave(pfn_to_page(pfn));
3452     + printk("Page %d changed. Saving in atomic copy. "
3453     + "Processes using it:", pfn);
3454     + print_tasks_using_page(pfn_to_page(pfn));
3455     + printk("\n");
3456     + suspend_num_resaved++;
3457     + if (test_action_state(SUSPEND_ABORT_ON_RESAVE_NEEDED))
3458     + set_result_state(SUSPEND_ABORTED);
3459     + }
3460     + } else
3461     + ret = crypto_hash_digest(&desc, sg,
3462     + PAGE_SIZE, (char *) this_checksum);
3463     + if (ret) {
3464     + printk("Digest failed. Returned %d.\n", ret);
3465     + return;
3466     + }
3467     + index++;
3468     + }
3469     +}
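calculate_check_checksums runs in two modes. With check == 0 it records a digest for every pageset2 page. With check != 0 it recomputes each digest and flags any page whose digest no longer matches, so that the page is saved again in the atomic copy. The sketch below shows that record-then-verify idea only. It deliberately swaps md5 via cryptoapi for a 64-bit FNV-1a hash, uses a flat digest array instead of the page chain, and every sketch_ name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_BYTES 64	/* hypothetical tiny page */

/* FNV-1a, a stand-in for the md5 digest used by the patch. */
static uint64_t sketch_digest(const unsigned char *page)
{
	uint64_t h = 0xcbf29ce484222325ULL;	/* FNV-1a offset basis */
	size_t i;

	for (i = 0; i < SKETCH_PAGE_BYTES; i++) {
		h ^= page[i];
		h *= 0x100000001b3ULL;		/* FNV-1a prime */
	}
	return h;
}

/*
 * check == 0: store a digest per page and return 0.
 * check != 0: compare against the stored digests, set resave[i] for each
 * page that changed, and return how many pages changed.
 */
static int sketch_calculate_check(unsigned char pages[][SKETCH_PAGE_BYTES],
				  size_t npages, uint64_t *stored,
				  unsigned char *resave, int check)
{
	int changed = 0;
	size_t i;

	for (i = 0; i < npages; i++) {
		uint64_t now = sketch_digest(pages[i]);

		if (!check) {
			stored[i] = now;
			continue;
		}
		if (now != stored[i]) {
			resave[i] = 1;
			changed++;
		}
	}
	return changed;
}
```

The return value plays the role of suspend_num_resaved, and the resave array plays the role of the Resave page flag.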
3470     +
3471     +static struct suspend_sysfs_data sysfs_params[] = {
3472     + { SUSPEND2_ATTR("enabled", SYSFS_RW),
3473     + SYSFS_INT(&suspend_checksum_ops.enabled, 0, 1, 0)
3474     + },
3475     +
3476     + { SUSPEND2_ATTR("abort_if_resave_needed", SYSFS_RW),
3477     + SYSFS_BIT(&suspend_action, SUSPEND_ABORT_ON_RESAVE_NEEDED, 0)
3478     + }
3479     +};
3480     +
3481     +/*
3482     + * Ops structure.
3483     + */
3484     +static struct suspend_module_ops suspend_checksum_ops = {
3485     + .type = MISC_MODULE,
3486     + .name = "Checksumming",
3487     + .directory = "checksum",
3488     + .module = THIS_MODULE,
3489     + .initialise = suspend_checksum_prepare,
3490     + .cleanup = suspend_checksum_cleanup,
3491     + .print_debug_info = suspend_checksum_print_debug_stats,
3492     + .save_config_info = suspend_checksum_save_config_info,
3493     + .load_config_info = suspend_checksum_load_config_info,
3494     + .storage_needed = suspend_checksum_storage_needed,
3495     +
3496     + .sysfs_data = sysfs_params,
3497     + .num_sysfs_entries = sizeof(sysfs_params) / sizeof(struct suspend_sysfs_data),
3498     +};
3499     +
3500     +/* ---- Registration ---- */
3501     +int s2_checksum_init(void)
3502     +{
3503     + int result = suspend_register_module(&suspend_checksum_ops);
3504     +
3505     + /* Disabled by default */
3506     + suspend_checksum_ops.enabled = 0;
3507     + return result;
3508     +}
3509     +
3510     +void s2_checksum_exit(void)
3511     +{
3512     + suspend_unregister_module(&suspend_checksum_ops);
3513     +}
3514     diff --git a/kernel/power/checksum.h b/kernel/power/checksum.h
3515     new file mode 100644
3516     index 0000000..9984eec
3517     --- /dev/null
3518     +++ b/kernel/power/checksum.h
3519     @@ -0,0 +1,27 @@
3520     +/*
3521     + * kernel/power/checksum.h
3522     + *
3523     + * Copyright (C) 2006-2007 Nigel Cunningham (nigel at suspend2 net)
3524     + * Copyright (C) 2006 Red Hat, inc.
3525     + *
3526     + * This file is released under the GPLv2.
3527     + *
3528     + * This file contains data checksum routines for suspend2,
3529     + * using cryptoapi. They are used to locate any modifications
3530     + * made to pageset 2 while we're saving it.
3531     + */
3532     +
3533     +#if defined(CONFIG_SUSPEND2_CHECKSUM)
3534     +extern int s2_checksum_init(void);
3535     +extern void s2_checksum_exit(void);
3536     +void calculate_check_checksums(int check);
3537     +int allocate_checksum_pages(void);
3538     +void free_checksum_pages(void);
3539     +#else
3540     +static inline int s2_checksum_init(void) { return 0; }
3541     +static inline void s2_checksum_exit(void) { }
3542     +static inline void calculate_check_checksums(int check) { }
3543     +static inline int allocate_checksum_pages(void) { return 0; }
3544     +static inline void free_checksum_pages(void) { }
3545     +#endif
3546     +
3547     diff --git a/kernel/power/cluster.c b/kernel/power/cluster.c
3548     new file mode 100644
3549     index 0000000..b5ab9ad
3550     --- /dev/null
3551     +++ b/kernel/power/cluster.c
3552     @@ -0,0 +1,152 @@
3553     +/*
3554     + * kernel/power/cluster.c
3555     + *
3556     + * Copyright (C) 2006-2007 Nigel Cunningham (nigel at suspend2 net)
3557     + *
3558     + * This file is released under the GPLv2.
3559     + *
3560     + * This file contains routines for cluster hibernation support.
3561     + *
3562     + */
3563     +
3564     +#include <linux/suspend.h>
3565     +#include <linux/module.h>
3566     +
3567     +#include "suspend.h"
3568     +#include "modules.h"
3569     +#include "sysfs.h"
3570     +#include "io.h"
3571     +
3572     +static char suspend_cluster_master[63] = CONFIG_SUSPEND2_DEFAULT_CLUSTER_MASTER;
3573     +
3574     +static struct suspend_module_ops suspend_cluster_ops;
3575     +
3576     +/* suspend_cluster_print_debug_stats
3577     + *
3578     + * Description: Print information to be recorded for debugging purposes into a
3579     + * buffer.
3580     + * Arguments: buffer: Pointer to a buffer into which the debug info will be
3581     + * printed.
3582     + * size: Size of the buffer.
3583     + * Returns: Number of characters written to the buffer.
3584     + */
3585     +static int suspend_cluster_print_debug_stats(char *buffer, int size)
3586     +{
3587     + int len;
3588     +
3589     + if (strlen(suspend_cluster_master))
3590     + len = snprintf_used(buffer, size, "- Cluster master is '%s'.\n",
3591     + suspend_cluster_master);
3592     + else
3593     + len = snprintf_used(buffer, size, "- Cluster support is disabled.\n");
3594     + return len;
3595     +}
3596     +
3597     +/* suspend_cluster_memory_needed
3598     + *
3599     + * Description: Tell the caller how much memory we need to operate during
3600     + * suspend/resume.
3601     + * Returns: Int. Maximum number of bytes of memory required for
3602     + * operation.
3603     + */
3604     +static int suspend_cluster_memory_needed(void)
3605     +{
3606     + return 0;
3607     +}
3608     +
3609     +static int suspend_cluster_storage_needed(void)
3610     +{
3611     + return 1 + strlen(suspend_cluster_master);
3612     +}
3613     +
3614     +/* suspend_cluster_save_config_info
3615     + *
3616     + * Description: Save information needed when reloading the image at resume time.
3617     + * Arguments: Buffer: Pointer to a buffer of size PAGE_SIZE.
3618     + * Returns: Number of bytes used for saving our data.
3619     + */
3620     +static int suspend_cluster_save_config_info(char *buffer)
3621     +{
3622     + strcpy(buffer, suspend_cluster_master);
3623     + return strlen(suspend_cluster_master) + 1;
3624     +}
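A note on the byte count returned here: strcpy writes the name plus its terminating NUL, and suspend_cluster_storage_needed reports 1 + strlen(name), so the number of bytes used is strlen(name) + 1. Placing the + 1 inside the call measures the string from its second character instead and gives a different number. A small sketch of the two spellings follows; both function names are hypothetical.

```c
#include <string.h>

/* Length of s plus one byte for the terminating NUL. */
static size_t sketch_len_plus_terminator(const char *s)
{
	return strlen(s) + 1;
}

/*
 * Length of s measured from its second character. Only meaningful for a
 * non-empty s: for an empty string, s + 1 points past the terminator.
 */
static size_t sketch_len_skipping_first(const char *s)
{
	return strlen(s + 1);
}
```

For the name "node1" the first form yields 6 and the second yields 4.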
3625     +
3626     +/* suspend_cluster_load_config_info
3627     + *
3628     + * Description: Reload information needed for declustering the image at
3629     + * resume time.
3630     + * Arguments: Buffer: Pointer to the start of the data.
3631     + * Size: Number of bytes that were saved.
3632     + */
3633     +static void suspend_cluster_load_config_info(char *buffer, int size)
3634     +{
3635     + strncpy(suspend_cluster_master, buffer, size);
3636     + return;
3637     +}
3638     +
3639     +/*
3640     + * data for our sysfs entries.
3641     + */
3642     +static struct suspend_sysfs_data sysfs_params[] = {
3643     + {
3644     + SUSPEND2_ATTR("master", SYSFS_RW),
3645     + SYSFS_STRING(suspend_cluster_master, 63, SYSFS_SM_NOT_NEEDED)
3646     + },
3647     +
3648     + {
3649     + SUSPEND2_ATTR("enabled", SYSFS_RW),
3650     + SYSFS_INT(&suspend_cluster_ops.enabled, 0, 1)
3651     + }
3652     +};
3653     +
3654     +/*
3655     + * Ops structure.
3656     + */
3657     +
3658     +static struct suspend_module_ops suspend_cluster_ops = {
3659     + .type = FILTER_MODULE,
3660     + .name = "Cluster",
3661     + .directory = "cluster",
3662     + .module = THIS_MODULE,
3663     + .memory_needed = suspend_cluster_memory_needed,
3664     + .print_debug_info = suspend_cluster_print_debug_stats,
3665     + .save_config_info = suspend_cluster_save_config_info,
3666     + .load_config_info = suspend_cluster_load_config_info,
3667     + .storage_needed = suspend_cluster_storage_needed,
3668     +
3669     + .sysfs_data = sysfs_params,
3670     + .num_sysfs_entries = sizeof(sysfs_params) / sizeof(struct suspend_sysfs_data),
3671     +};
3672     +
3673     +/* ---- Registration ---- */
3674     +
3675     +#ifdef MODULE
3676     +#warning Module set.
3677     +#define INIT static __init
3678     +#define EXIT static __exit
3679     +#else
3680     +#define INIT
3681     +#define EXIT
3682     +#endif
3683     +
3684     +INIT int s2_cluster_init(void)
3685     +{
3686     + int temp = suspend_register_module(&suspend_cluster_ops);
3687     +
3688     + if (!strlen(suspend_cluster_master))
3689     + suspend_cluster_ops.enabled = 0;
3690     + return temp;
3691     +}
3692     +
3693     +EXIT void s2_cluster_exit(void)
3694     +{
3695     + suspend_unregister_module(&suspend_cluster_ops);
3696     +}
3697     +
3698     +#ifdef MODULE
3699     +MODULE_LICENSE("GPL");
3700     +module_init(s2_cluster_init);
3701     +module_exit(s2_cluster_exit);
3702     +MODULE_AUTHOR("Nigel Cunningham");
3703     +MODULE_DESCRIPTION("Cluster Support for Suspend2");
3704     +#endif
3705     diff --git a/kernel/power/cluster.h b/kernel/power/cluster.h
3706     new file mode 100644
3707     index 0000000..d44bbf7
3708     --- /dev/null
3709     +++ b/kernel/power/cluster.h
3710     @@ -0,0 +1,17 @@
3711     +/*
3712     + * kernel/power/cluster.h
3713     + *
3714     + * Copyright (C) 2006-2007 Nigel Cunningham (nigel at suspend2 net)
3715     + * Copyright (C) 2006 Red Hat, inc.
3716     + *
3717     + * This file is released under the GPLv2.
3718     + */
3719     +
3720     +#ifdef CONFIG_SUSPEND2_CLUSTER
3721     +extern int s2_cluster_init(void);
3722     +extern void s2_cluster_exit(void);
3723     +#else
3724     +static inline int s2_cluster_init(void) { return 0; }
3725     +static inline void s2_cluster_exit(void) { }
3726     +#endif
3727     +
3728     diff --git a/kernel/power/disk.c b/kernel/power/disk.c
3729     index aec19b0..5d014f1 100644
3730     --- a/kernel/power/disk.c
3731     +++ b/kernel/power/disk.c
3732     @@ -24,6 +24,8 @@
3733    
3734     #include "power.h"
3735    
3736     +#include "suspend.h"
3737     +#include "suspend2_builtin.h"
3738    
3739     static int noresume = 0;
3740     char resume_file[256] = CONFIG_PM_STD_PARTITION;
3741     @@ -118,6 +120,11 @@ int pm_suspend_disk(void)
3742     {
3743     int error;
3744    
3745     +#ifdef CONFIG_SUSPEND2
3746     + if (test_action_state(SUSPEND_REPLACE_SWSUSP))
3747     + return suspend2_try_suspend(1);
3748     +#endif
3749     +
3750     error = prepare_processes();
3751     if (error)
3752     return error;
3753     @@ -200,10 +207,22 @@ int pm_suspend_disk(void)
3754     *
3755     */
3756    
3757     -static int software_resume(void)
3758     +int software_resume(void)
3759     {
3760     int error;
3761    
3762     + resume_attempted = 1;
3763     +
3764     +#ifdef CONFIG_SUSPEND2
3765     + /*
3766     + * We can't know (until an image header - if any - is loaded), whether
3767     + * we did override swsusp. We therefore ensure that both are tried.
3768     + */
3769     + if (test_action_state(SUSPEND_REPLACE_SWSUSP))
3770     + printk("Replacing swsusp.\n");
3771     + suspend2_try_resume();
3772     +#endif
3773     +
3774     mutex_lock(&pm_mutex);
3775     if (!swsusp_resume_device) {
3776     if (!strlen(resume_file)) {
3777     @@ -274,9 +293,6 @@ static int software_resume(void)
3778     return 0;
3779     }
3780    
3781     -late_initcall(software_resume);
3782     -
3783     -
3784     static const char * const pm_disk_modes[] = {
3785     [PM_DISK_FIRMWARE] = "firmware",
3786     [PM_DISK_PLATFORM] = "platform",
3787     @@ -457,6 +473,7 @@ static int __init resume_offset_setup(char *str)
3788     static int __init noresume_setup(char *str)
3789     {
3790     noresume = 1;
3791     + set_suspend_state(SUSPEND_NORESUME_SPECIFIED);
3792     return 1;
3793     }
3794    
3795     diff --git a/kernel/power/extent.c b/kernel/power/extent.c
3796     new file mode 100644
3797     index 0000000..cd0ff0c
3798     --- /dev/null
3799     +++ b/kernel/power/extent.c
3800     @@ -0,0 +1,305 @@
3801     +/*
3802     + * kernel/power/extent.c
3803     + *
3804     + * Copyright (C) 2003-2007 Nigel Cunningham (nigel at suspend2 net)
3805     + *
3806     + * Distributed under GPLv2.
3807     + *
3808     + * These functions encapsulate the manipulation of storage metadata. For
3809     + * pageflags, we use dynamically allocated bitmaps.
3810     + */
3811     +
3812     +#include <linux/module.h>
3813     +#include <linux/suspend.h>
3814     +#include "modules.h"
3815     +#include "extent.h"
3816     +#include "ui.h"
3817     +#include "suspend.h"
3818     +
3819     +/* suspend_get_extent
3820     + *
3821     + * Returns a free extent. May fail, returning NULL instead.
3822     + */
3823     +static struct extent *suspend_get_extent(void)
3824     +{
3825     + struct extent *result;
3826     +
3827     + if (!(result = kmalloc(sizeof(struct extent), GFP_ATOMIC)))
3828     + return NULL;
3829     +
3830     + result->minimum = result->maximum = 0;
3831     + result->next = NULL;
3832     +
3833     + return result;
3834     +}
3835     +
3836     +/* suspend_put_extent_chain.
3837     + *
3838     + * Frees a whole chain of extents.
3839     + */
3840     +void suspend_put_extent_chain(struct extent_chain *chain)
3841     +{
3842     + struct extent *this;
3843     +
3844     + this = chain->first;
3845     +
3846     + while(this) {
3847     + struct extent *next = this->next;
3848     + kfree(this);
3849     + chain->num_extents--;
3850     + this = next;
3851     + }
3852     +
3853     + chain->first = chain->last_touched = NULL;
3854     + chain->size = 0;
3855     +}
3856     +
3857     +/*
3858     + * suspend_add_to_extent_chain
3859     + *
3860     + * Add an extent to an existing chain.
3861     + */
3862     +int suspend_add_to_extent_chain(struct extent_chain *chain,
3863     + unsigned long minimum, unsigned long maximum)
3864     +{
3865     + struct extent *new_extent = NULL, *start_at;
3866     +
3867     + /* Find the right place in the chain */
3868     + start_at = (chain->last_touched &&
3869     + (chain->last_touched->minimum < minimum)) ?
3870     + chain->last_touched : NULL;
3871     +
3872     + if (!start_at && chain->first && chain->first->minimum < minimum)
3873     + start_at = chain->first;
3874     +
3875     + while (start_at && start_at->next && start_at->next->minimum < minimum)
3876     + start_at = start_at->next;
3877     +
3878     + if (start_at && start_at->maximum == (minimum - 1)) {
3879     + start_at->maximum = maximum;
3880     +
3881     + /* Merge with the following one? */
3882     + if (start_at->next &&
3883     + start_at->maximum + 1 == start_at->next->minimum) {
3884     + struct extent *to_free = start_at->next;
3885     + start_at->maximum = start_at->next->maximum;
3886     + start_at->next = start_at->next->next;
3887     + chain->num_extents--;
3888     + kfree(to_free);
3889     + }
3890     +
3891     + chain->last_touched = start_at;
3892     + chain->size+= (maximum - minimum + 1);
3893     +
3894     + return 0;
3895     + }
3896     +
3897     + new_extent = suspend_get_extent();
3898     + if (!new_extent) {
3899     + printk("Error unable to append a new extent to the chain.\n");
3900     + return 2;
3901     + }
3902     +
3903     + chain->num_extents++;
3904     + chain->size+= (maximum - minimum + 1);
3905     + new_extent->minimum = minimum;
3906     + new_extent->maximum = maximum;
3907     + new_extent->next = NULL;
3908     +
3909     + chain->last_touched = new_extent;
3910     +
3911     + if (start_at) {
3912     + struct extent *next = start_at->next;
3913     + start_at->next = new_extent;
3914     + new_extent->next = next;
3915     + } else {
3916     + if (chain->first)
3917     + new_extent->next = chain->first;
3918     + chain->first = new_extent;
3919     + }
3920     +
3921     + return 0;
3922     +}
3923     +
3924     +/* suspend_serialise_extent_chain
3925     + *
3926     + * Write a chain in the image.
3927     + */
3928     +int suspend_serialise_extent_chain(struct suspend_module_ops *owner,
3929     + struct extent_chain *chain)
3930     +{
3931     + struct extent *this;
3932     + int ret, i = 0;
3933     +
3934     + if ((ret = suspendActiveAllocator->rw_header_chunk(WRITE, owner,
3935     + (char *) chain,
3936     + 2 * sizeof(int))))
3937     + return ret;
3938     +
3939     + this = chain->first;
3940     + while (this) {
3941     + if ((ret = suspendActiveAllocator->rw_header_chunk(WRITE, owner,
3942     + (char *) this,
3943     + 2 * sizeof(unsigned long))))
3944     + return ret;
3945     + this = this->next;
3946     + i++;
3947     + }
3948     +
3949     + if (i != chain->num_extents) {
3950     + printk(KERN_EMERG "Saved %d extents but chain metadata says there "
3951     + "should be %d.\n", i, chain->num_extents);
3952     + return 1;
3953     + }
3954     +
3955     + return ret;
3956     +}
3957     +
3958     +/* suspend_load_extent_chain
3959     + *
3960     + * Read back a chain saved in the image.
3961     + */
3962     +int suspend_load_extent_chain(struct extent_chain *chain)
3963     +{
3964     + struct extent *this, *last = NULL;
3965     + int i, ret;
3966     +
3967     + if ((ret = suspendActiveAllocator->rw_header_chunk(READ, NULL,
3968     + (char *) chain, 2 * sizeof(int)))) {
3969     + printk("Failed to read size of extent chain.\n");
3970     + return 1;
3971     + }
3972     +
3973     + for (i = 0; i < chain->num_extents; i++) {
3974     + this = kmalloc(sizeof(struct extent), GFP_ATOMIC);
3975     + if (!this) {
3976     + printk("Failed to allocate a new extent.\n");
3977     + return -ENOMEM;
3978     + }
3979     + this->next = NULL;
3980     + if ((ret = suspendActiveAllocator->rw_header_chunk(READ, NULL,
3981     + (char *) this, 2 * sizeof(unsigned long)))) {
3982     + printk("Failed to read an extent.\n");
3983     + return 1;
3984     + }
3985     + if (last)
3986     + last->next = this;
3987     + else
3988     + chain->first = this;
3989     + last = this;
3990     + }
3991     + return 0;
3992     +}
3993     +
3994     +/* suspend_extent_state_next
3995     + *
3996     + * Given a state, progress to the next valid entry. We may begin in an
3997     + * invalid state, as we do when invoked after extent_state_goto_start below.
3998     + *
3999     + * When using compression and expected_compression > 0, we let the image size
4000     + * be larger than storage, so we can validly run out of data to return.
4001     + */
4002     +unsigned long suspend_extent_state_next(struct extent_iterate_state *state)
4003     +{
4004     + if (state->current_chain == state->num_chains)
4005     + return 0;
4006     +
4007     + if (state->current_extent) {
4008     + if (state->current_offset == state->current_extent->maximum) {
4009     + if (state->current_extent->next) {
4010     + state->current_extent = state->current_extent->next;
4011     + state->current_offset = state->current_extent->minimum;
4012     + } else {
4013     + state->current_extent = NULL;
4014     + state->current_offset = 0;
4015     + }
4016     + } else
4017     + state->current_offset++;
4018     + }
4019     +
4020     + while(!state->current_extent) {
4021     + int chain_num = ++(state->current_chain);
4022     +
4023     + if (chain_num == state->num_chains)
4024     + return 0;
4025     +
4026     + state->current_extent = (state->chains + chain_num)->first;
4027     +
4028     + if (!state->current_extent)
4029     + continue;
4030     +
4031     + state->current_offset = state->current_extent->minimum;
4032     + }
4033     +
4034     + return state->current_offset;
4035     +}
4036     +
4037     +/* suspend_extent_state_goto_start
4038     + *
4039     + * Find the first valid value in a group of chains.
4040     + */
4041     +void suspend_extent_state_goto_start(struct extent_iterate_state *state)
4042     +{
4043     + state->current_chain = -1;
4044     + state->current_extent = NULL;
4045     + state->current_offset = 0;
4046     +}
4047     +
4048     +/* suspend_extent_state_save
4049     + *
4050     + * Given a state and a struct extent_iterate_saved_state, save the
4051     + * current position in a format that can be used with relocated chains
4052     + * (at resume time).
4053     + */
4054     +void suspend_extent_state_save(struct extent_iterate_state *state,
4055     + struct extent_iterate_saved_state *saved_state)
4056     +{
4057     + struct extent *extent;
4058     +
4059     + saved_state->chain_num = state->current_chain;
4060     + saved_state->extent_num = 0;
4061     + saved_state->offset = state->current_offset;
4062     +
4063     + if (saved_state->chain_num == -1)
4064     + return;
4065     +
4066     + extent = (state->chains + state->current_chain)->first;
4067     +
4068     + while (extent != state->current_extent) {
4069     + saved_state->extent_num++;
4070     + extent = extent->next;
4071     + }
4072     +}
4073     +
4074     +/* suspend_extent_state_restore
4075     + *
4076     + * Restore the position saved by suspend_extent_state_save.
4077     + */
4078     +void suspend_extent_state_restore(struct extent_iterate_state *state,
4079     + struct extent_iterate_saved_state *saved_state)
4080     +{
4081     + int posn = saved_state->extent_num;
4082     +
4083     + if (saved_state->chain_num == -1) {
4084     + suspend_extent_state_goto_start(state);
4085     + return;
4086     + }
4087     +
4088     + state->current_chain = saved_state->chain_num;
4089     + state->current_extent = (state->chains + state->current_chain)->first;
4090     + state->current_offset = saved_state->offset;
4091     +
4092     + while (posn--)
4093     + state->current_extent = state->current_extent->next;
4094     +}
4095     +
4096     +#ifdef CONFIG_SUSPEND2_EXPORTS
4097     +EXPORT_SYMBOL_GPL(suspend_add_to_extent_chain);
4098     +EXPORT_SYMBOL_GPL(suspend_put_extent_chain);
4099     +EXPORT_SYMBOL_GPL(suspend_load_extent_chain);
4100     +EXPORT_SYMBOL_GPL(suspend_serialise_extent_chain);
4101     +EXPORT_SYMBOL_GPL(suspend_extent_state_save);
4102     +EXPORT_SYMBOL_GPL(suspend_extent_state_restore);
4103     +EXPORT_SYMBOL_GPL(suspend_extent_state_goto_start);
4104     +EXPORT_SYMBOL_GPL(suspend_extent_state_next);
4105     +#endif
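The merge-on-add behaviour of suspend_add_to_extent_chain above can be tried outside the kernel. The following is a minimal userspace sketch, not the patch's code: the names `ext`, `ext_chain`, `ext_chain_add` and `ext_chain_free` are hypothetical stand-ins, `malloc`/`free` replace `kmalloc`/`kfree`, and the `last_touched` search hint is left out for brevity.

```c
#include <stdlib.h>

/* Userspace stand-ins for struct extent / struct extent_chain.
 * All names here are illustrative, not part of the patch. */
struct ext {
	unsigned long min, max;
	struct ext *next;
};

struct ext_chain {
	int size;		/* sum of (max - min + 1) over all extents */
	int num_extents;
	struct ext *first;
};

/* Add [min, max] to a sorted chain. If the range starts right after the
 * preceding extent, grow that extent instead of allocating, then absorb
 * the following extent if the two now touch.
 * Returns 0 on success, 1 if allocation fails. */
static int ext_chain_add(struct ext_chain *chain, unsigned long min,
			 unsigned long max)
{
	struct ext *prev = NULL, *cur = chain->first, *new_ext;

	/* Find the last extent that starts below the new range. */
	while (cur && cur->min < min) {
		prev = cur;
		cur = cur->next;
	}

	if (prev && prev->max + 1 == min) {
		prev->max = max;
		if (cur && prev->max + 1 == cur->min) {
			prev->max = cur->max;
			prev->next = cur->next;
			chain->num_extents--;
			free(cur);
		}
		chain->size += (int)(max - min + 1);
		return 0;
	}

	new_ext = malloc(sizeof(*new_ext));
	if (!new_ext)
		return 1;

	new_ext->min = min;
	new_ext->max = max;
	new_ext->next = cur;
	if (prev)
		prev->next = new_ext;
	else
		chain->first = new_ext;
	chain->num_extents++;
	chain->size += (int)(max - min + 1);
	return 0;
}

/* Free every extent and reset the chain's counters. */
static void ext_chain_free(struct ext_chain *chain)
{
	struct ext *cur = chain->first;

	while (cur) {
		struct ext *next = cur->next;
		free(cur);
		cur = next;
	}
	chain->first = NULL;
	chain->size = 0;
	chain->num_extents = 0;
}
```

With this sketch, adding 1-3, then 7-9, then 4-6 to an empty chain leaves a single extent covering 1-9 with a size of 9, because the third range joins its predecessor and then absorbs its successor.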
4106     diff --git a/kernel/power/extent.h b/kernel/power/extent.h
4107     new file mode 100644
4108     index 0000000..c97772b
4109     --- /dev/null
4110     +++ b/kernel/power/extent.h
4111     @@ -0,0 +1,77 @@
4112     +/*
4113     + * kernel/power/extent.h
4114     + *
4115     + * Copyright (C) 2003-2007 Nigel Cunningham (nigel at suspend2 net)
4116     + *
4117     + * This file is released under the GPLv2.
4118     + *
4119     + * It contains declarations related to extents. Extents are
4120     + * suspend's method of storing some of the metadata for the image.
4121     + * See extent.c for more info.
4122     + *
4123     + */
4124     +
4125     +#include "modules.h"
4126     +
4127     +#ifndef EXTENT_H
4128     +#define EXTENT_H
4129     +
4130     +struct extent {
4131     + unsigned long minimum, maximum;
4132     + struct extent *next;
4133     +};
4134     +
4135     +struct extent_chain {
4136     + int size; /* size of the chain, i.e. sum of (max - min + 1) over its extents */
4137     + int num_extents;
4138     + struct extent *first, *last_touched;
4139     +};
4140     +
4141     +struct extent_iterate_state {
4142     + struct extent_chain *chains;
4143     + int num_chains;
4144     + int current_chain;
4145     + struct extent *current_extent;
4146     + unsigned long current_offset;
4147     +};
4148     +
4149     +struct extent_iterate_saved_state {
4150     + int chain_num;
4151     + int extent_num;
4152     + unsigned long offset;
4153     +};
4154     +
4155     +#define suspend_extent_state_eof(state) ((state)->num_chains == (state)->current_chain)
4156     +
4157     +/* Simplify iterating through all the values in an extent chain */
4158     +#define suspend_extent_for_each(extent_chain, extentpointer, value) \
4159     +if ((extent_chain)->first) \
4160     + for ((extentpointer) = (extent_chain)->first, (value) = \
4161     + (extentpointer)->minimum; \
4162     + ((extentpointer) && ((extentpointer)->next || (value) <= \
4163     + (extentpointer)->maximum)); \
4164     + (((value) == (extentpointer)->maximum) ? \
4165     + ((extentpointer) = (extentpointer)->next, (value) = \
4166     + ((extentpointer) ? (extentpointer)->minimum : 0)) : \
4167     + (value)++))
4168     +
4169     +void suspend_put_extent_chain(struct extent_chain *chain);
4170     +int suspend_add_to_extent_chain(struct extent_chain *chain,
4171     + unsigned long minimum, unsigned long maximum);
4172     +int suspend_serialise_extent_chain(struct suspend_module_ops *owner,
4173     + struct extent_chain *chain);
4174     +int suspend_load_extent_chain(struct extent_chain *chain);
4175     +
4176     +/* swap_entry_to_extent_val & extent_val_to_swap_entry:
4177     + * We are putting offset in the low bits so consecutive swap entries
4178     + * make consecutive extent values */
4179     +#define swap_entry_to_extent_val(swp_entry) (swp_entry.val)
4180     +#define extent_val_to_swap_entry(val) (swp_entry_t) { (val) }
4181     +
4182     +void suspend_extent_state_save(struct extent_iterate_state *state,
4183     + struct extent_iterate_saved_state *saved_state);
4184     +void suspend_extent_state_restore(struct extent_iterate_state *state,
4185     + struct extent_iterate_saved_state *saved_state);
4186     +void suspend_extent_state_goto_start(struct extent_iterate_state *state);
4187     +unsigned long suspend_extent_state_next(struct extent_iterate_state *state);
4188     +#endif
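As an illustration of how the iterator declared above walks several chains in sequence, here is a small userspace sketch modelled on suspend_extent_state_goto_start and suspend_extent_state_next. The type and function names (`ext`, `ext_chain`, `ext_iter`, `ext_iter_start`, `ext_iter_next`) are hypothetical and the structures are trimmed to the fields the walk needs. As in the original, a return value of 0 signals the end of the data, so the sketch assumes 0 is never a stored value.

```c
#include <stddef.h>

/* Trimmed userspace stand-ins; names are illustrative only. */
struct ext {
	unsigned long min, max;
	struct ext *next;
};

struct ext_chain {
	struct ext *first;
};

struct ext_iter {
	struct ext_chain *chains;
	int num_chains;
	int cur_chain;		/* -1 until the first ext_iter_next() call */
	struct ext *cur_ext;
	unsigned long cur_off;
};

/* Rewind: the state is deliberately invalid until ext_iter_next() runs. */
static void ext_iter_start(struct ext_iter *it)
{
	it->cur_chain = -1;
	it->cur_ext = NULL;
	it->cur_off = 0;
}

/* Advance to the next value, crossing extent and chain boundaries and
 * skipping empty chains. Returns 0 once every chain is exhausted. */
static unsigned long ext_iter_next(struct ext_iter *it)
{
	if (it->cur_chain == it->num_chains)
		return 0;

	if (it->cur_ext) {
		if (it->cur_off == it->cur_ext->max) {
			/* End of this extent: step to the next one, if any. */
			it->cur_ext = it->cur_ext->next;
			it->cur_off = it->cur_ext ? it->cur_ext->min : 0;
		} else {
			it->cur_off++;
		}
	}

	while (!it->cur_ext) {
		/* Current chain is finished (or not begun): try the next. */
		if (++it->cur_chain == it->num_chains)
			return 0;
		it->cur_ext = it->chains[it->cur_chain].first;
		if (it->cur_ext)
			it->cur_off = it->cur_ext->min;
	}

	return it->cur_off;
}
```

Given three chains holding 2-3 and 7-7, nothing, and 10-11 respectively, successive calls after `ext_iter_start()` yield 2, 3, 7, 10, 11 and then 0.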
4189     diff --git a/kernel/power/io.c b/kernel/power/io.c
4190     new file mode 100644
4191     index 0000000..854d1a7
4192     --- /dev/null
4193     +++ b/kernel/power/io.c
4194     @@ -0,0 +1,1407 @@
4195     +/*
4196     + * kernel/power/io.c
4197     + *
4198     + * Copyright (C) 1998-2001 Gabor Kuti <seasons@fornax.hu>
4199     + * Copyright (C) 1998,2001,2002 Pavel Machek <pavel@suse.cz>
4200     + * Copyright (C) 2002-2003 Florent Chabaud <fchabaud@free.fr>
4201     + * Copyright (C) 2002-2007 Nigel Cunningham (nigel at suspend2 net)
4202     + *
4203     + * This file is released under the GPLv2.
4204     + *
4205     + * It contains high level IO routines for suspending.
4206     + *
4207     + */
4208     +
4209     +#include <linux/suspend.h>
4210     +#include <linux/version.h>
4211     +#include <linux/utsname.h>
4212     +#include <linux/mount.h>
4213     +#include <linux/highmem.h>
4214     +#include <linux/module.h>
4215     +#include <linux/kthread.h>
4216     +#include <asm/tlbflush.h>
4217     +
4218     +#include "suspend.h"
4219     +#include "modules.h"
4220     +#include "pageflags.h"
4221     +#include "io.h"
4222     +#include "ui.h"
4223     +#include "storage.h"
4224     +#include "prepare_image.h"
4225     +#include "extent.h"
4226     +#include "sysfs.h"
4227     +#include "suspend2_builtin.h"
4228     +
4229     +char poweroff_resume2[256];
4230     +
4231     +/* Variables shared between threads and updated under the mutex */
4232     +static int io_write, io_finish_at, io_base, io_barmax, io_pageset, io_result;
4233     +static int io_index, io_nextupdate, io_pc, io_pc_step;
4234     +static unsigned long pfn, other_pfn;
4235     +static DEFINE_MUTEX(io_mutex);
4236     +static DEFINE_PER_CPU(struct page *, last_sought);
4237     +static DEFINE_PER_CPU(struct page *, last_high_page);
4238     +static DEFINE_PER_CPU(struct pbe *, last_low_page);
4239     +static atomic_t worker_thread_count;
4240     +static atomic_t io_count;
4241     +
4242     +/* suspend_attempt_to_parse_resume_device
4243     + *
4244     + * Can we suspend, using the current resume2= parameter?
4245     + */
4246     +int suspend_attempt_to_parse_resume_device(int quiet)
4247     +{
4248     + struct list_head *Allocator;
4249     + struct suspend_module_ops *thisAllocator;
4250     + int result, returning = 0;
4251     +
4252     + if (suspend_activate_storage(0))
4253     + return 0;
4254     +
4255     + suspendActiveAllocator = NULL;
4256     + clear_suspend_state(SUSPEND_RESUME_DEVICE_OK);
4257     + clear_suspend_state(SUSPEND_CAN_RESUME);
4258     + clear_result_state(SUSPEND_ABORTED);
4259     +
4260     + if (!suspendNumAllocators) {
4261     + if (!quiet)
4262     + printk("Suspend2: No storage allocators have been "
4263     + "registered. Suspending will be disabled.\n");
4264     + goto cleanup;
4265     + }
4266     +
4267     + if (!resume2_file[0]) {
4268     + if (!quiet)
4269     + printk("Suspend2: Resume2 parameter is empty."
4270     + " Suspending will be disabled.\n");
4271     + goto cleanup;
4272     + }
4273     +
4274     + list_for_each(Allocator, &suspendAllocators) {
4275     + thisAllocator = list_entry(Allocator, struct suspend_module_ops,
4276     + type_list);
4277     +
4278     + /*
4279     + * Not sure why you'd want to disable an allocator, but
4280     + * we should honour the flag if we're providing it
4281     + */
4282     + if (!thisAllocator->enabled)
4283     + continue;
4284     +
4285     + result = thisAllocator->parse_sig_location(
4286     + resume2_file, (suspendNumAllocators == 1),
4287     + quiet);
4288     +
4289     + switch (result) {
4290     + case -EINVAL:
4291     + /* For this allocator, but not a valid
4292     + * configuration. Error already printed. */
4293     + goto cleanup;
4294     +
4295     + case 0:
4296     + /* For this allocator and valid. */
4297     + suspendActiveAllocator = thisAllocator;
4298     +
4299     + set_suspend_state(SUSPEND_RESUME_DEVICE_OK);
4300     + set_suspend_state(SUSPEND_CAN_RESUME);
4301     + if (!quiet)
4302     + printk("Suspend2: Resuming enabled.\n");
4303     +
4304     + returning = 1;
4305     + goto cleanup;
4306     + }
4307     + }
4308     + if (!quiet)
4309     + printk("Suspend2: No matching enabled allocator found. "
4310     + "Resuming disabled.\n");
4311     +cleanup:
4312     + suspend_deactivate_storage(0);
4313     + return returning;
4314     +}
4315     +
4316     +void attempt_to_parse_resume_device2(void)
4317     +{
4318     + suspend_prepare_usm();
4319     + suspend_attempt_to_parse_resume_device(0);
4320     + suspend_cleanup_usm();
4321     +}
4322     +
4323     +void save_restore_resume2(int replace, int quiet)
4324     +{
4325     + static char resume2_save[255];
4326     + static unsigned long suspend_state_save;
4327     +
4328     + if (replace) {
4329     + suspend_state_save = suspend_state;
4330     + strcpy(resume2_save, resume2_file);
4331     + strcpy(resume2_file, poweroff_resume2);
4332     + } else {
4333     + strcpy(resume2_file, resume2_save);
4334     + suspend_state = suspend_state_save;
4335     + }
4336     + suspend_attempt_to_parse_resume_device(quiet);
4337     +}
4338     +
4339     +void attempt_to_parse_po_resume_device2(void)
4340     +{
4341     + int ok = 0;
4342     +
4343     + /* Temporarily set resume2 to the poweroff value */
4344     + if (!strlen(poweroff_resume2))
4345     + return;
4346     +
4347     + printk("=== Trying Poweroff Resume2 ===\n");
4348     + save_restore_resume2(SAVE, NOQUIET);
4349     + if (test_suspend_state(SUSPEND_CAN_RESUME))
4350     + ok = 1;
4351     +
4352     + printk("=== Done ===\n");
4353     + save_restore_resume2(RESTORE, QUIET);
4354     +
4355     + /* If not ok, clear the string */
4356     + if (ok)
4357     + return;
4358     +
4359     + printk("Can't resume from that location; clearing poweroff_resume2.\n");
4360     + poweroff_resume2[0] = '\0';
4361     +}
4362     +
4363     +/* noresume_reset_modules
4364     + *
4365     + * Description: When we read the start of an image, modules (and especially the
4366     + * active allocator) might need to reset data structures if we
4367     + * decide to invalidate the image rather than resuming from it.
4368     + */
4369     +
4370     +static void noresume_reset_modules(void)
4371     +{
4372     + struct suspend_module_ops *this_filter;
4373     +
4374     + list_for_each_entry(this_filter, &suspend_filters, type_list)
4375     + if (this_filter->noresume_reset)
4376     + this_filter->noresume_reset();
4377     +
4378     + if (suspendActiveAllocator && suspendActiveAllocator->noresume_reset)
4379     + suspendActiveAllocator->noresume_reset();
4380     +}
4381     +
4382     +/* fill_suspend_header()
4383     + *
4384     + * Description: Fill the suspend header structure.
4385     + * Arguments: struct suspend_header: Header data structure to be filled.
4386     + */
4387     +
4388     +static void fill_suspend_header(struct suspend_header *sh)
4389     +{
4390     + int i;
4391     +
4392     + memset((char *)sh, 0, sizeof(*sh));
4393     +
4394     + sh->version_code = LINUX_VERSION_CODE;
4395     + sh->num_physpages = num_physpages;
4396     + memcpy(&sh->uts, init_utsname(), sizeof(struct new_utsname));
4397     + sh->page_size = PAGE_SIZE;
4398     + sh->pagedir = pagedir1;
4399     + sh->pageset_2_size = pagedir2.size;
4400     + sh->param0 = suspend_result;
4401     + sh->param1 = suspend_action;
4402     + sh->param2 = suspend_debug_state;
4403     + sh->param3 = console_loglevel;
4404     + sh->root_fs = current->fs->rootmnt->mnt_sb->s_dev;
4405     + for (i = 0; i < 4; i++)
4406     + sh->io_time[i/2][i%2] = suspend_io_time[i/2][i%2];
4407     +}
4408     +
4409     +/*
4410     + * rw_init_modules
4411     + *
4412     + * Iterate over modules, preparing the ones that will be used to read or write
4413     + * data.
4414     + */
4415     +static int rw_init_modules(int rw, int which)
4416     +{
4417     + struct suspend_module_ops *this_module;
4418     + /* Initialise page transformers */
4419     + list_for_each_entry(this_module, &suspend_filters, type_list) {
4420     + if (!this_module->enabled)
4421     + continue;
4422     + if (this_module->rw_init && this_module->rw_init(rw, which)) {
4423     + abort_suspend(SUSPEND_FAILED_MODULE_INIT,
4424     + "Failed to initialise the %s filter.",
4425     + this_module->name);
4426     + return 1;
4427     + }
4428     + }
4429     +
4430     + /* Initialise allocator */
4431     + if (suspendActiveAllocator->rw_init(rw, which)) {
4432     + abort_suspend(SUSPEND_FAILED_MODULE_INIT,
4433     + "Failed to initialise the allocator.");
4434     + if (!rw)
4435     + suspendActiveAllocator->invalidate_image();
4436     + return 1;
4437     + }
4438     +
4439     + /* Initialise other modules */
4440     + list_for_each_entry(this_module, &suspend_modules, module_list) {
4441     + if (!this_module->enabled ||
4442     + this_module->type == FILTER_MODULE ||
4443     + this_module->type == WRITER_MODULE)
4444     + continue;
4445     + if (this_module->rw_init && this_module->rw_init(rw, which)) {
4446     + set_result_state(SUSPEND_ABORTED);
4447     + printk("Setting aborted flag due to module init failure.\n");
4448     + return 1;
4449     + }
4450     + }
4451     +
4452     + return 0;
4453     +}
4454     +
4455     +/*
4456     + * rw_cleanup_modules
4457     + *
4458     + * Cleanup components after reading or writing a set of pages.
4459     + * Only the allocator may fail.
4460     + */
4461     +static int rw_cleanup_modules(int rw)
4462     +{
4463     + struct suspend_module_ops *this_module;
4464     + int result = 0;
4465     +
4466     + /* Cleanup other modules */
4467     + list_for_each_entry(this_module, &suspend_modules, module_list) {
4468     + if (!this_module->enabled ||
4469     + this_module->type == FILTER_MODULE ||
4470     + this_module->type == WRITER_MODULE)
4471     + continue;
4472     + if (this_module->rw_cleanup)
4473     + result |= this_module->rw_cleanup(rw);
4474     + }
4475     +
4476     + /* Flush data and cleanup */
4477     + list_for_each_entry(this_module, &suspend_filters, type_list) {
4478     + if (!this_module->enabled)
4479     + continue;
4480     + if (this_module->rw_cleanup)
4481     + result |= this_module->rw_cleanup(rw);
4482     + }
4483     +
4484     + result |= suspendActiveAllocator->rw_cleanup(rw);
4485     +
4486     + return result;
4487     +}
4488     +
4489     +static struct page *copy_page_from_orig_page(struct page *orig_page)
4490     +{
4491     + int is_high = PageHighMem(orig_page), index, min, max;
4492     + struct page *high_page = NULL,
4493     + **my_last_high_page = &__get_cpu_var(last_high_page),
4494     + **my_last_sought = &__get_cpu_var(last_sought);
4495     + struct pbe *this, **my_last_low_page = &__get_cpu_var(last_low_page);
4496     + void *compare;
4497     +
4498     + if (is_high) {
4499     + if (*my_last_sought && *my_last_high_page && *my_last_sought < orig_page)
4500     + high_page = *my_last_high_page;
4501     + else
4502     + high_page = (struct page *) restore_highmem_pblist;
4503     + this = (struct pbe *) kmap(high_page);
4504     + compare = orig_page;
4505     + } else {
4506     + if (*my_last_sought && *my_last_low_page && *my_last_sought < orig_page)
4507     + this = *my_last_low_page;
4508     + else
4509     + this = restore_pblist;
4510     + compare = page_address(orig_page);
4511     + }
4512     +
4513     + *my_last_sought = orig_page;
4514     +
4515     + /* Locate page containing pbe */
4516     + while ( this[PBES_PER_PAGE - 1].next &&
4517     + this[PBES_PER_PAGE - 1].orig_address < compare) {
4518     + if (is_high) {
4519     + struct page *next_high_page = (struct page *)
4520     + this[PBES_PER_PAGE - 1].next;
4521     + kunmap(high_page);
4522     + this = kmap(next_high_page);
4523     + high_page = next_high_page;
4524     + } else
4525     + this = this[PBES_PER_PAGE - 1].next;
4526     + }
4527     +
4528     + /* Do a binary search within the page */
4529     + min = 0;
4530     + max = PBES_PER_PAGE;
4531     + index = PBES_PER_PAGE / 2;
4532     + while (max - min) {
4533     + if (!this[index].orig_address ||
4534     + this[index].orig_address > compare)
4535     + max = index;
4536     + else if (this[index].orig_address == compare) {
4537     + if (is_high) {
4538     + struct page *page = this[index].address;
4539     + *my_last_high_page = high_page;
4540     + kunmap(high_page);
4541     + return page;
4542     + }
4543     + *my_last_low_page = this;
4544     + return virt_to_page(this[index].address);
4545     + } else
4546     + min = index;
4547     + index = ((max + min) / 2);
4548     + };
4549     +
4550     + if (is_high)
4551     + kunmap(high_page);
4552     +
4553     + abort_suspend(SUSPEND_FAILED_IO, "Failed to get destination page for"
4554     + " orig page %p. This[index].orig_address=%p.\n", orig_page,
4555     + this[index].orig_address);
4556     + return NULL;
4557     +}
4558     +
4559     +/*
4560     + * worker_rw_loop
4561     + *
4562     + * The per-thread I/O loop for reading or writing pages.
4563     + */
4564     +static int worker_rw_loop(void *data)
4565     +{
4566     + unsigned long orig_pfn, write_pfn;
4567     + int result, my_io_index = 0;
4568     + struct suspend_module_ops *first_filter = suspend_get_next_filter(NULL);
4569     + struct page *buffer = alloc_page(GFP_ATOMIC);
4570     +
4571     + atomic_inc(&worker_thread_count);
4572     +
4573     + mutex_lock(&io_mutex);
4574     +
4575     + do {
4576     + int buf_size;
4577     +
4578     + /*
4579     + * What page to use? If reading, don't know yet which page's
4580     + * data will be read, so always use the buffer. If writing,
4581     + * use the copy (Pageset1) or original page (Pageset2), but
4582     + * always write the pfn of the original page.
4583     + */
4584     + if (io_write) {
4585     + struct page *page;
4586     +
4587     + pfn = get_next_bit_on(io_map, pfn);
4588     +
4589     + /* Another thread could have beaten us to it. */
4590     + if (pfn == max_pfn + 1) {
4591     + if (atomic_read(&io_count)) {
4592     + printk("Ran out of pfns but io_count is still %d.\n", atomic_read(&io_count));
4593     + BUG();
4594     + }
4595     + break;
4596     + }
4597     +
4598     + atomic_dec(&io_count);
4599     +
4600     + orig_pfn = pfn;
4601     + write_pfn = pfn;
4602     +
4603     + /*
4604     + * Other_pfn is updated by all threads, so we're not
4605     + * writing the same page multiple times.
4606     + */
4607     + clear_dynpageflag(&io_map, pfn_to_page(pfn));
4608     + if (io_pageset == 1) {
4609     + other_pfn = get_next_bit_on(pageset1_map, other_pfn);
4610     + write_pfn = other_pfn;
4611     + }
4612     + page = pfn_to_page(pfn);
4613     +
4614     + my_io_index = io_finish_at - atomic_read(&io_count);
4615     +
4616     + mutex_unlock(&io_mutex);
4617     +
4618     + result = first_filter->write_chunk(write_pfn, page,
4619     + PAGE_SIZE);
4620     + } else {
4621     + atomic_dec(&io_count);
4622     + mutex_unlock(&io_mutex);
4623     +
4624     + /*
4625     + * Are we aborting? If so, don't submit any more I/O as
4626     + * resetting the resume_attempted flag (from ui.c) will
4627     + * clear the bdev flags, making this thread oops.
4628     + */
4629     + if (unlikely(test_suspend_state(SUSPEND_STOP_RESUME))) {
4630     + atomic_dec(&worker_thread_count);
4631     + if (!atomic_read(&worker_thread_count))
4632     + set_suspend_state(SUSPEND_IO_STOPPED);
4633     + while (1)
4634     + schedule();
4635     + }
4636     +
4637     + result = first_filter->read_chunk(&write_pfn, buffer,
4638     + &buf_size, SUSPEND_ASYNC);
4639     + if (buf_size != PAGE_SIZE) {
4640     + abort_suspend(SUSPEND_FAILED_IO,
4641     + "I/O pipeline returned %d bytes instead "
4642     + "of %d.\n", buf_size, PAGE_SIZE);
4643     + mutex_lock(&io_mutex);
4644     + break;
4645     + }
4646     + }
4647     +
4648     + if (result) {
4649     + io_result = result;
4650     + if (io_write) {
4651     + printk("Write chunk returned %d.\n", result);
4652     + abort_suspend(SUSPEND_FAILED_IO,
4653     + "Failed to write a chunk of the "
4654     + "image.");
4655     + mutex_lock(&io_mutex);
4656     + break;
4657     + }
4658     + panic("Read chunk returned (%d)", result);
4659     + }
4660     +
4661     + /*
4662     + * Discard reads of resaved pages while reading ps2
4663     + * and unwanted pages while rereading ps2 when aborting.
4664     + */
4665     + if (!io_write && !PageResave(pfn_to_page(write_pfn))) {
4666     + struct page *final_page = pfn_to_page(write_pfn),
4667     + *copy_page = final_page;
4668     + char *virt, *buffer_virt;
4669     +
4670     + if (io_pageset == 1 && !load_direct(final_page)) {
4671     + copy_page = copy_page_from_orig_page(final_page);
4672     + BUG_ON(!copy_page);
4673     + }
4674     +
4675     + if (test_dynpageflag(&io_map, final_page)) {
4676     + virt = kmap(copy_page);
4677     + buffer_virt = kmap(buffer);
4678     + memcpy(virt, buffer_virt, PAGE_SIZE);
4679     + kunmap(copy_page);
4680     + kunmap(buffer);
4681     + clear_dynpageflag(&io_map, final_page);
4682     + mutex_lock(&io_mutex);
4683     + my_io_index = io_finish_at - atomic_read(&io_count);
4684     + mutex_unlock(&io_mutex);
4685     + } else {
4686     + mutex_lock(&io_mutex);
4687     + atomic_inc(&io_count);
4688     + mutex_unlock(&io_mutex);
4689     + }
4690     + }
4691     +
4692     + /* Strictly speaking, this is racy - another thread could
4693     + * output the next percentage before we've done
4694     + * ours. 1/5th of the pageset would have to be done first,
4695     + * though, so I'm not worried. In addition, the only impact
4696     + * would be messed up output, not image corruption. Doing
4697     + * this under the mutex seems an unnecessary slowdown.
4698     + */
4699     + if ((my_io_index + io_base) >= io_nextupdate)
4700     + io_nextupdate = suspend_update_status(my_io_index +
4701     + io_base, io_barmax, " %d/%d MB ",
4702     + MB(io_base+my_io_index+1), MB(io_barmax));
4703     +
4704     + if ((my_io_index + 1) == io_pc) {
4705     + printk("%d%%...", 20 * io_pc_step);
4706     + io_pc_step++;
4707     + io_pc = io_finish_at * io_pc_step / 5;
4708     + }
4709     +
4710     + suspend_cond_pause(0, NULL);
4711     +
4712     + /*
4713     + * Subtle: If there's less I/O still to be done than threads
4714     + * running, quit. This stops us doing I/O beyond the end of
4715     + * the image when reading.
4716     + *
4717     + * Possible race condition. Two threads could do the test at
4718     + * the same time; one should exit and one should continue.
4719     + * Therefore we take the mutex before comparing and exiting.
4720     + */
4721     +
4722     + mutex_lock(&io_mutex);
4723     +
4724     + } while(atomic_read(&io_count) >= atomic_read(&worker_thread_count) &&
4725     + !(io_write && test_result_state(SUSPEND_ABORTED)));
4726     +
4727     + atomic_dec(&worker_thread_count);
4728     + mutex_unlock(&io_mutex);
4729     +
4730     + __free_pages(buffer, 0);
4731     +
4732     + return 0;
4733     +}
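The 20% milestone reporting in the worker loop above can be looked at in isolation. What follows is a minimal sketch rather than the patch's own code: `struct progress`, `progress_init()` and `progress_tick()` are hypothetical names standing in for the `io_finish_at`, `io_pc` and `io_pc_step` globals, and the sketch leaves out locking and the status bar entirely.

```c
#include <assert.h>

/* Hypothetical stand-in for the io_finish_at / io_pc / io_pc_step globals. */
struct progress {
	int finish_at;	/* total number of pages to process */
	int pc;		/* 1-based page count at which the next milestone fires */
	int pc_step;	/* which 20% milestone comes next (1..5) */
};

static void progress_init(struct progress *p, int finish_at)
{
	assert(finish_at > 0);
	p->finish_at = finish_at;
	p->pc = finish_at / 5;
	p->pc_step = 1;
}

/*
 * Call with the 1-based count of pages completed so far. Returns the
 * percentage reached (20, 40, ...) when that count hits the next
 * milestone, or 0 when there is nothing to report.
 */
static int progress_tick(struct progress *p, int pages_done)
{
	int percent;

	if (pages_done != p->pc)
		return 0;

	percent = 20 * p->pc_step;
	p->pc_step++;
	p->pc = p->finish_at * p->pc_step / 5;
	return percent;
}
```

In this sketch a run of fewer than five pages starts with a threshold of zero, which a 1-based page count never equals, so no milestone is reported for such runs.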
4734     +
4735     +void start_other_threads(void)
4736     +{
4737     + int cpu;
4738     + struct task_struct *p;
4739     +
4740     + for_each_online_cpu(cpu) {
4741     + if (cpu == smp_processor_id())
4742     + continue;
4743     +
4744     + p = kthread_create(worker_rw_loop, NULL, "ks2io/%d", cpu);
4745     + if (IS_ERR(p)) {
4746     + printk("ks2io for %i failed\n", cpu);
4747     + continue;
4748     + }
4749     + kthread_bind(p, cpu);
4750     + wake_up_process(p);
4751     + }
4752     +}
4753     +
4754     +/*
4755     + * do_rw_loop
4756     + *
4757     + * The main I/O loop for reading or writing pages.
4758     + */
4759     +static int do_rw_loop(int write, int finish_at, dyn_pageflags_t *pageflags,
4760     + int base, int barmax, int pageset)
4761     +{
4762     + int index = 0, cpu;
4763     +
4764     + if (!finish_at)
4765     + return 0;
4766     +
4767     + io_write = write;
4768     + io_finish_at = finish_at;
4769     + io_base = base;
4770     + io_barmax = barmax;
4771     + io_pageset = pageset;
4772     + io_index = 0;
4773     + io_pc = io_finish_at / 5;
4774     + io_pc_step = 1;
4775     + io_result = 0;
4776     + io_nextupdate = 0;
4777     +
4778     + for_each_online_cpu(cpu) {
4779     + per_cpu(last_sought, cpu) = NULL;
4780     + per_cpu(last_low_page, cpu) = NULL;
4781     + per_cpu(last_high_page, cpu) = NULL;
4782     + }
4783     +
4784     + /* Ensure all bits clear */
4785     + pfn = get_next_bit_on(io_map, max_pfn + 1);
4786     +
4787     + while (pfn < max_pfn + 1) {
4788     + clear_dynpageflag(&io_map, pfn_to_page(pfn));
4789     + pfn = get_next_bit_on(io_map, pfn);
4790     + }
4791     +
4792     + /* Set the bits for the pages to write */
4793     + pfn = get_next_bit_on(*pageflags, max_pfn + 1);
4794     +
4795     + while (pfn < max_pfn + 1 && index < finish_at) {
4796     + set_dynpageflag(&io_map, pfn_to_page(pfn));
4797     + pfn = get_next_bit_on(*pageflags, pfn);
4798     + index++;
4799     + }
4800     +
4801     + BUG_ON(index < finish_at);
4802     +
4803     + atomic_set(&io_count, finish_at);
4804     +
4805     + pfn = max_pfn + 1;
4806     + other_pfn = pfn;
4807     +
4808     + clear_suspend_state(SUSPEND_IO_STOPPED);
4809     +
4810     + if (!test_action_state(SUSPEND_NO_MULTITHREADED_IO))
4811     + start_other_threads();
4812     + worker_rw_loop(NULL);
4813     +
4814     + while (atomic_read(&worker_thread_count))
4815     + schedule();
4816     +
4817     + set_suspend_state(SUSPEND_IO_STOPPED);
4818     + if (unlikely(test_suspend_state(SUSPEND_STOP_RESUME))) {
4819     + while (1)
4820     + schedule();
4821     + }
4822     +
4823     + if (!io_result) {
4824     + printk("done.\n");
4825     +
4826     + suspend_update_status(io_base + io_finish_at, io_barmax, " %d/%d MB ",
4827     + MB(io_base + io_finish_at), MB(io_barmax));
4828     + }
4829     +
4830     + if (io_write && test_result_state(SUSPEND_ABORTED))
4831     + io_result = 1;
4832     + else /* All I/O done? */
4833     + BUG_ON(get_next_bit_on(io_map, max_pfn + 1) != max_pfn + 1);
4834     +
4835     + return io_result;
4836     +}
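The "set the bits for the pages to write" step in do_rw_loop() copies the first finish_at set bits of the source map into io_map and then checks that enough were found. A rough userspace sketch of that idea follows. It assumes a plain byte-array bitmap and a linear scan; `bit_test()`, `bit_set()` and `mark_first_n()` are hypothetical helpers, not the dyn_pageflags API used by the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Test bit i of a byte-array bitmap (hypothetical helper). */
static int bit_test(const unsigned char *map, size_t i)
{
	return (map[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1;
}

/* Set bit i of a byte-array bitmap (hypothetical helper). */
static void bit_set(unsigned char *map, size_t i)
{
	map[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}

/*
 * Copy the first `want` set bits of `src` into `dst`, scanning bits
 * 0..nbits-1 in ascending order. Returns how many bits were marked,
 * which is less than `want` if `src` ran out of set bits.
 */
static size_t mark_first_n(const unsigned char *src, unsigned char *dst,
			   size_t nbits, size_t want)
{
	size_t i, marked = 0;

	assert(src && dst);

	for (i = 0; i < nbits && marked < want; i++) {
		if (bit_test(src, i)) {
			bit_set(dst, i);
			marked++;
		}
	}
	return marked;
}
```

A caller would compare the return value with `want`, much as the patch does with BUG_ON(index < finish_at).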
4837     +
4838     +/* write_pageset()
4839     + *
4840     + * Description: Write a pageset to disk.
4841     + * Arguments: pagedir: Which pagedir to write.
4842     + * Returns: Zero on success or nonzero on failure.
4843     + */
4844     +
4845     +int write_pageset(struct pagedir *pagedir)
4846     +{
4847     + int finish_at, base = 0, start_time, end_time;
4848     + int barmax = pagedir1.size + pagedir2.size;
4849     + long error = 0;
4850     + dyn_pageflags_t *pageflags;
4851     +
4852     + /*
4853     + * Even if there is nothing to read or write, the allocator
4854     + * may need the init/cleanup for its housekeeping. (eg:
4855     + * Pageset1 may start where pageset2 ends when writing).
4856     + */
4857     + finish_at = pagedir->size;
4858     +
4859     + if (pagedir->id == 1) {
4860     + suspend_prepare_status(DONT_CLEAR_BAR,
4861     + "Writing kernel & process data...");
4862     + base = pagedir2.size;
4863     + if (test_action_state(SUSPEND_TEST_FILTER_SPEED) ||
4864     + test_action_state(SUSPEND_TEST_BIO))
4865     + pageflags = &pageset1_map;
4866     + else
4867     + pageflags = &pageset1_copy_map;
4868     + } else {
4869     + suspend_prepare_status(CLEAR_BAR, "Writing caches...");
4870     + pageflags = &pageset2_map;
4871     + }
4872     +
4873     + start_time = jiffies;
4874     +
4875     + if (rw_init_modules(1, pagedir->id)) {
4876     + abort_suspend(SUSPEND_FAILED_MODULE_INIT,
4877     + "Failed to initialise modules for writing.");
4878     + error = 1;
4879     + }
4880     +
4881     + if (!error)
4882     + error = do_rw_loop(1, finish_at, pageflags, base, barmax,
4883     + pagedir->id);
4884     +
4885     + if (rw_cleanup_modules(WRITE) && !error) {
4886     + abort_suspend(SUSPEND_FAILED_MODULE_CLEANUP,
4887     + "Failed to cleanup after writing.");
4888     + error = 1;
4889     + }
4890     +
4891     + end_time = jiffies;
4892     +
4893     + if ((end_time - start_time) && (!test_result_state(SUSPEND_ABORTED))) {
4894     + suspend_io_time[0][0] += finish_at;
4895     + suspend_io_time[0][1] += (end_time - start_time);
4896     + }
4897     +
4898     + return error;
4899     +}
4900     +
4901     +/* read_pageset()
4902     + *
4903     + * Description: Read a pageset from disk.
4904     + * Arguments: pagedir: Which pagedir to read.
4905     + * overwrittenpagesonly: Whether to read the whole pageset or
4906     + * only part.
4907     + * Returns: Zero on success or nonzero on failure.
4908     + */
4909     +
4910     +static int read_pageset(struct pagedir *pagedir, int overwrittenpagesonly)
4911     +{
4912     + int result = 0, base = 0, start_time, end_time;
4913     + int finish_at = pagedir->size;
4914     + int barmax = pagedir1.size + pagedir2.size;
4915     + dyn_pageflags_t *pageflags;
4916     +
4917     + if (pagedir->id == 1) {
4918     + suspend_prepare_status(CLEAR_BAR,
4919     + "Reading kernel & process data...");
4920     + pageflags = &pageset1_map;
4921     + } else {
4922     + suspend_prepare_status(DONT_CLEAR_BAR, "Reading caches...");
4923     + if (overwrittenpagesonly)
4924     + barmax = finish_at = min(pagedir1.size,
4925     + pagedir2.size);
4926     + else {
4927     + base = pagedir1.size;
4928     + }
4929     + pageflags = &pageset2_map;
4930     + }
4931     +
4932     + start_time = jiffies;
4933     +
4934     + if (rw_init_modules(0, pagedir->id)) {
4935     + suspendActiveAllocator->invalidate_image();
4936     + result = 1;
4937     + } else
4938     + result = do_rw_loop(0, finish_at, pageflags, base, barmax,
4939     + pagedir->id);
4940     +
4941     + if (rw_cleanup_modules(READ) && !result) {
4942     + abort_suspend(SUSPEND_FAILED_MODULE_CLEANUP,
4943     + "Failed to cleanup after reading.");
4944     + result = 1;
4945     + }
4946     +
4947     + /* Statistics */
4948     + end_time = jiffies;
4949     +
4950     + if ((end_time - start_time) && (!test_result_state(SUSPEND_ABORTED))) {
4951     + suspend_io_time[1][0] += finish_at;
4952     + suspend_io_time[1][1] += (end_time - start_time);
4953     + }
4954     +
4955     + return result;
4956     +}
4957     +
4958     +/* write_module_configs()
4959     + *
4960     + * Description: Store the configuration for each module in the image header.
4961     + * Returns: Int: Zero on success, Error value otherwise.
4962     + */
4963     +static int write_module_configs(void)
4964     +{
4965     + struct suspend_module_ops *this_module;
4966     + char *buffer = (char *) get_zeroed_page(GFP_ATOMIC);
4967     + int len, index = 1;
4968     + struct suspend_module_header suspend_module_header;
4969     +
4970     + if (!buffer) {
4971     + printk("Failed to allocate a buffer for saving "
4972     + "module configuration info.\n");
4973     + return -ENOMEM;
4974     + }
4975     +
4976     + /*
4977     + * We have to know which data goes with which module, so we at
4978     + * least write a length of zero for a module. Note that we are
4979     + * also assuming every module's config data takes <= PAGE_SIZE.
4980     + */
4981     +
4982     + /* For each module (in registration order) */
4983     + list_for_each_entry(this_module, &suspend_modules, module_list) {
4984     + if (!this_module->enabled || !this_module->storage_needed ||
4985     + (this_module->type == WRITER_MODULE &&
4986     + suspendActiveAllocator != this_module))
4987     + continue;
4988     +
4989     + /* Get the data from the module */
4990     + len = 0;
4991     + if (this_module->save_config_info)
4992     + len = this_module->save_config_info(buffer);
4993     +
4994     + /* Save the details of the module */
4995     + suspend_module_header.enabled = this_module->enabled;
4996     + suspend_module_header.type = this_module->type;
4997     + suspend_module_header.index = index++;
4998     + strncpy(suspend_module_header.name, this_module->name,
4999     + sizeof(suspend_module_header.name));
5000     + suspendActiveAllocator->rw_header_chunk(WRITE,
5001     + this_module,
5002     + (char *) &suspend_module_header,
5003     + sizeof(suspend_module_header));
5004     +
5005     + /* Save the size of the data and any data returned */
5006     + suspendActiveAllocator->rw_header_chunk(WRITE,
5007     + this_module,
5008     + (char *) &len, sizeof(int));
5009     + if (len)
5010     + suspendActiveAllocator->rw_header_chunk(
5011     + WRITE, this_module, buffer, len);
5012     + }
5013     +
5014     + /* Write a blank header to terminate the list */
5015     + suspend_module_header.name[0] = '\0';
5016     + suspendActiveAllocator->rw_header_chunk(WRITE,
5017     + NULL,
5018     + (char *) &suspend_module_header,
5019     + sizeof(suspend_module_header));
5020     +
5021     + free_page((unsigned long) buffer);
5022     + return 0;
5023     +}
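write_module_configs() lays the module data out as a run of records (a header, an int length, then that many bytes of configuration), closed by a header whose name is empty. The sketch below imitates that layout in an in-memory buffer. It is illustrative only: `struct mod_header` is a cut-down guess at struct suspend_module_header (whose definition is not shown here), `NAME_LEN` is an assumed size, and `struct stream`, `put()` and `get()` are hypothetical stand-ins for the allocator's rw_header_chunk().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 30	/* assumed; the real field size is not shown here */

struct mod_header {	/* hypothetical, cut-down module header */
	char name[NAME_LEN];
	int enabled;
};

struct stream {		/* hypothetical in-memory "image header" */
	char buf[512];
	size_t pos;
};

static void put(struct stream *s, const void *data, size_t len)
{
	assert(s->pos + len <= sizeof(s->buf));
	memcpy(s->buf + s->pos, data, len);
	s->pos += len;
}

static void get(struct stream *s, void *data, size_t len)
{
	assert(s->pos + len <= sizeof(s->buf));
	memcpy(data, s->buf + s->pos, len);
	s->pos += len;
}

/* Write one record: header, payload length, then the payload (if any). */
static void write_record(struct stream *s, const char *name, int enabled,
			 const char *cfg, int len)
{
	struct mod_header h;

	memset(&h, 0, sizeof(h));
	strncpy(h.name, name, sizeof(h.name) - 1);
	h.enabled = enabled;
	put(s, &h, sizeof(h));
	put(s, &len, sizeof(len));
	if (len)
		put(s, cfg, (size_t)len);
}

/* A header with an empty name terminates the list. */
static void write_terminator(struct stream *s)
{
	struct mod_header h;

	memset(&h, 0, sizeof(h));
	put(s, &h, sizeof(h));
}

/* Walk the records from the start; return their count and payload total. */
static int count_records(struct stream *s, int *payload_bytes)
{
	struct mod_header h;
	char scratch[128];
	int len, n = 0;

	s->pos = 0;
	*payload_bytes = 0;
	for (get(s, &h, sizeof(h)); h.name[0]; get(s, &h, sizeof(h))) {
		get(s, &len, sizeof(len));
		assert(len >= 0 && (size_t)len <= sizeof(scratch));
		if (len)
			get(s, scratch, (size_t)len);
		*payload_bytes += len;
		n++;
	}
	return n;
}
```

Because a length is written even when it is zero, a reader can skip the data of a module it does not recognise and still find the next header.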
5024     +
5025     +/* read_module_configs()
5026     + *
5027     + * Description: Reload module configurations from the image header.
5028     + * Returns: Int. Zero on success, error value otherwise.
5029     + */
5030     +
5031     +static int read_module_configs(void)
5032     +{
5033     + struct suspend_module_ops *this_module;
5034     + char *buffer = (char *) get_zeroed_page(GFP_ATOMIC);
5035     + int len, result = 0;
5036     + struct suspend_module_header suspend_module_header;
5037     +
5038     + if (!buffer) {
5039     + printk("Failed to allocate a buffer for reloading module "
5040     + "configuration info.\n");
5041     + return -ENOMEM;
5042     + }
5043     +
5044     + /* All modules are initially disabled. That way, if we have a module
5045     + * loaded now that wasn't loaded when we suspended, it won't be used
5046     + * in trying to read the data.
5047     + */
5048     + list_for_each_entry(this_module, &suspend_modules, module_list)
5049     + this_module->enabled = 0;
5050     +
5051     + /* Get the first module header */
5052     + result = suspendActiveAllocator->rw_header_chunk(READ, NULL,
5053     + (char *) &suspend_module_header,
5054     + sizeof(suspend_module_header));
5055     + if (result) {
5056     + printk("Failed to read the next module header.\n");
5057     + free_page((unsigned long) buffer);
5058     + return -EINVAL;
5059     + }
5060     +
5061     + /* For each module (in registration order) */
5062     + while (suspend_module_header.name[0]) {
5063     +
5064     + /* Find the module */
5065     + this_module = suspend_find_module_given_name(suspend_module_header.name);
5066     +
5067     + if (!this_module) {
5068     + /*
5069     + * Is it used? Only need to worry about filters. The active
5070     + * allocator must be loaded!
5071     + */
5072     + if (suspend_module_header.enabled) {
5073     + suspend_early_boot_message(1, SUSPEND_CONTINUE_REQ,
5074     + "It looks like we need module %s for "
5075     + "reading the image but it hasn't been "
5076     + "registered.\n",
5077     + suspend_module_header.name);
5078     + if (!(test_suspend_state(SUSPEND_CONTINUE_REQ))) {
5079     + suspendActiveAllocator->invalidate_image();
5080     + free_page((unsigned long) buffer);
5081     + return -EINVAL;
5082     + }
5083     + } else
5084     + printk("Module %s configuration data found, but"
5085     + " the module hasn't registered. Looks "
5086     + "like it was disabled, so we're "
5087     + "ignoring its data.\n",
5088     + suspend_module_header.name);
5089     + }
5090     +
5091     + /* Get the length of the data (if any) */
5092     + result = suspendActiveAllocator->rw_header_chunk(READ, NULL,
5093     + (char *) &len, sizeof(int));
5094     + if (result) {
5095     + printk("Failed to read the length of the module %s's"
5096     + " configuration data.\n",
5097     + suspend_module_header.name);
5098     + free_page((unsigned long) buffer);
5099     + return -EINVAL;
5100     + }
5101     +
5102     + /* Read any data and pass to the module (if we found one) */
5103     + if (len) {
5104     + suspendActiveAllocator->rw_header_chunk(READ, NULL,
5105     + buffer, len);
5106     + if (this_module) {
5107     + if (!this_module->load_config_info) {
5108     + printk("Huh? Module %s appears to have "
5109     + "a save_config_info, but not a "
5110     + "load_config_info function!\n",
5111     + this_module->name);
5112     + } else
5113     + this_module->load_config_info(buffer, len);
5114     + }
5115     + }
5116     +
5117     + if (this_module) {
5118     + /* Now move this module to the tail of its lists. This
5119     + * will put it in order. Any new modules will end up at
5120     + * the top of the lists. They should have been set to
5121     + * disabled when loaded (people will normally not edit
5122     + * an initrd to load a new module and then suspend
5123     + * without using it!).
5124     + */
5125     +
5126     + suspend_move_module_tail(this_module);
5127     +
5128     + /*
5129     + * We apply the disabled state; modules don't need to
5130     + * save whether they were disabled and if they do, we
5131     + * override them anyway.
5132     + */
5133     + this_module->enabled = suspend_module_header.enabled;
5134     + }
5135     +
5136     + /* Get the next module header */
5137     + result = suspendActiveAllocator->rw_header_chunk(READ, NULL,
5138     + (char *) &suspend_module_header,
5139     + sizeof(suspend_module_header));
5140     +
5141     + if (result) {
5142     + printk("Failed to read the next module header.\n");
5143     + free_page((unsigned long) buffer);
5144     + return -EINVAL;
5145     + }
5146     +
5147     + }
5148     +
5149     + free_page((unsigned long) buffer);
5150     + return 0;
5151     +}
5152     +
5153     +/* write_image_header()
5154     + *
5155     + * Description: Write the image header after writing the image proper.
5156     + * Returns: Int. Zero on success or -1 on failure.
5157     + */
5158     +
5159     +int write_image_header(void)
5160     +{
5161     + int ret;
5162     + int total = pagedir1.size + pagedir2.size + 2;
5163     + char *header_buffer = NULL;
5164     +
5165     + /* Now prepare to write the header */
5166     + if ((ret = suspendActiveAllocator->write_header_init())) {
5167     + abort_suspend(SUSPEND_FAILED_MODULE_INIT,
5168     + "Active allocator's write_header_init"
5169     + " function failed.");
5170     + goto write_image_header_abort;
5171     + }
5172     +
5173     + /* Get a buffer */
5174     + header_buffer = (char *) get_zeroed_page(GFP_ATOMIC);
5175     + if (!header_buffer) {
5176     + abort_suspend(SUSPEND_OUT_OF_MEMORY,
5177     + "Out of memory when trying to get page for header!");
5178     + goto write_image_header_abort;
5179     + }
5180     +
5181     + /* Write suspend header */
5182     + fill_suspend_header((struct suspend_header *) header_buffer);
5183     + suspendActiveAllocator->rw_header_chunk(WRITE, NULL,
5184     + header_buffer, sizeof(struct suspend_header));
5185     +
5186     + free_page((unsigned long) header_buffer);
5187     +
5188     + /* Write module configurations */
5189     + if ((ret = write_module_configs())) {
5190     + abort_suspend(SUSPEND_FAILED_IO,
5191     + "Failed to write module configs.");
5192     + goto write_image_header_abort;
5193     + }
5194     +
5195     + save_dyn_pageflags(pageset1_map);
5196     +
5197     + /* Flush data and let allocator cleanup */
5198     + if (suspendActiveAllocator->write_header_cleanup()) {
5199     + abort_suspend(SUSPEND_FAILED_IO,
5200     + "Failed to cleanup writing header.");
5201     + goto write_image_header_abort_no_cleanup;
5202     + }
5203     +
5204     + if (test_result_state(SUSPEND_ABORTED))
5205     + goto write_image_header_abort_no_cleanup;
5206     +
5207     + suspend_message(SUSPEND_IO, SUSPEND_VERBOSE, 1, "|\n");
5208     + suspend_update_status(total, total, NULL);
5209     +
5210     + return 0;
5211     +
5212     +write_image_header_abort:
5213     + suspendActiveAllocator->write_header_cleanup();
5214     +write_image_header_abort_no_cleanup:
5215     + return -1;
5216     +}
5217     +
5218     +/* sanity_check()
5219     + *
5220     + * Description: Perform a few checks, seeking to ensure that the kernel being
5221     + * booted matches the one suspended. They need to match so we can
5222     + * be _sure_ things will work. It is not absolutely impossible for
5223     + * resuming from a different kernel to work, just not assured.
5224     + * Arguments: Struct suspend_header. The header which was saved at suspend
5225     + * time.
5226     + */
5227     +static char *sanity_check(struct suspend_header *sh)
5228     +{
5229     + if (sh->version_code != LINUX_VERSION_CODE)
5230     + return "Incorrect kernel version.";
5231     +
5232     + if (sh->num_physpages != num_physpages)
5233     + return "Incorrect memory size.";
5234     +
5235     + if (strncmp(sh->uts.sysname, init_utsname()->sysname, 65))
5236     + return "Incorrect system type.";
5237     +
5238     + if (strncmp(sh->uts.release, init_utsname()->release, 65))
5239     + return "Incorrect release.";
5240     +
5241     + if (strncmp(sh->uts.version, init_utsname()->version, 65))
5242     + return "Right kernel version but wrong build number.";
5243     +
5244     + if (strncmp(sh->uts.machine, init_utsname()->machine, 65))
5245     + return "Incorrect machine type.";
5246     +
5247     + if (sh->page_size != PAGE_SIZE)
5248     + return "Incorrect PAGE_SIZE.";
5249     +
5250     + if (!test_action_state(SUSPEND_IGNORE_ROOTFS)) {
5251     + const struct super_block *sb;
5252     + list_for_each_entry(sb, &super_blocks, s_list) {
5253     + if ((!(sb->s_flags & MS_RDONLY)) &&
5254     + (sb->s_type->fs_flags & FS_REQUIRES_DEV))
5255     + return "Device backed fs has been mounted "
5256     + "rw prior to resume or initrd/ramfs "
5257     + "is mounted rw.";
5258     + }
5259     + }
5260     +
5261     + return NULL;
5262     +}
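sanity_check() follows a simple pattern: compare each saved field with the running kernel's value and return a message for the first mismatch, or a null pointer when everything matches. A self-contained sketch of that pattern is given below. `struct image_id` and `image_mismatch()` are hypothetical and cover only a few of the fields tested above; the mounted-filesystem check is left out.

```c
#include <assert.h>
#include <string.h>

#define FIELD_LEN 65	/* same length the patch passes to strncmp() */

/* Hypothetical, cut-down stand-in for struct suspend_header. */
struct image_id {
	unsigned int version_code;
	unsigned long num_physpages;
	char release[FIELD_LEN];
	int page_size;
};

/*
 * Returns NULL if the saved image matches the running system,
 * otherwise a message describing the first mismatch found.
 */
static const char *image_mismatch(const struct image_id *saved,
				  const struct image_id *running)
{
	assert(saved && running);

	if (saved->version_code != running->version_code)
		return "Incorrect kernel version.";

	if (saved->num_physpages != running->num_physpages)
		return "Incorrect memory size.";

	if (strncmp(saved->release, running->release, FIELD_LEN))
		return "Incorrect release.";

	if (saved->page_size != running->page_size)
		return "Incorrect PAGE_SIZE.";

	return NULL;
}
```

The order of the checks decides which message the user sees when several fields differ, so the cheapest and most telling comparisons come first.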
5263     +
5264     +/* __read_pageset1
5265     + *
5266     + * Description: Test for the existence of an image and attempt to load it.
5267     + * Returns: Int. Zero if image found and pageset1 successfully loaded.
5268     + * Error if no image found or loaded.
5269     + */
5270     +static int __read_pageset1(void)
5271     +{
5272     + int i, result = 0;
5273     + char *header_buffer = (char *) get_zeroed_page(GFP_ATOMIC),
5274     + *sanity_error = NULL;
5275     + struct suspend_header *suspend_header;
5276     +
5277     + if (!header_buffer) {
5278     + printk("Unable to allocate a page for reading the signature.\n");
5279     + return -ENOMEM;
5280     + }
5281     +
5282     + /* Check for an image */
5283     + if (!(result = suspendActiveAllocator->image_exists())) {
5284     + result = -ENODATA;
5285     + noresume_reset_modules();
5286     + printk("Suspend2: No image found.\n");
5287     + goto out;
5288     + }
5289     +
5290     + /* Check for noresume command line option */
5291     + if (test_suspend_state(SUSPEND_NORESUME_SPECIFIED)) {
5292     + printk("Suspend2: Noresume: Invalidated image.\n");
5293     + goto out_invalidate_image;
5294     + }
5295     +
5296     + /* Check whether we've resumed before */
5297     + if (test_suspend_state(SUSPEND_RESUMED_BEFORE)) {
5298     + int resumed_before_default = 0;
5299     + if (test_suspend_state(SUSPEND_RETRY_RESUME))
5300     + resumed_before_default = SUSPEND_CONTINUE_REQ;
5301     +
5302     + suspend_early_boot_message(1, resumed_before_default, NULL);
5303     + clear_suspend_state(SUSPEND_RETRY_RESUME);
5304     + if (!(test_suspend_state(SUSPEND_CONTINUE_REQ))) {
5305     + printk("Suspend2: Tried to resume before: "
5306     + "Invalidated image.\n");
5307     + goto out_invalidate_image;
5308     + }
5309     + }
5310     +
5311     + clear_suspend_state(SUSPEND_CONTINUE_REQ);
5312     +
5313     + /*
5314     + * Prepare the active allocator for reading the image header. The
5315     + * active allocator might read its own configuration.
5316     + *
5317     + * NB: This call may never return because there might be a signature
5318     + * for a different image such that we warn the user and they choose
5319     + * to reboot (if the device ids look erroneous (2.4 vs 2.6), or the
5320     + * location of the image is unavailable because it was stored on a
5321     + * network connection).
5322     + */
5323     +
5324     + if ((result = suspendActiveAllocator->read_header_init())) {
5325     + printk("Suspend2: Failed to initialise, reading the image "
5326     + "header.\n");
5327     + goto out_invalidate_image;
5328     + }
5329     +
5330     + /* Read suspend header */
5331     + if ((result = suspendActiveAllocator->rw_header_chunk(READ, NULL,
5332     + header_buffer, sizeof(struct suspend_header))) < 0) {
5333     + printk("Suspend2: Failed to read the image signature.\n");
5334     + goto out_invalidate_image;
5335     + }
5336     +
5337     + suspend_header = (struct suspend_header *) header_buffer;
5338     +
5339     + /*
5340     + * NB: This call may also result in a reboot rather than returning.
5341     + */
5342     +
5343     + if ((sanity_error = sanity_check(suspend_header)) &&
5344     + suspend_early_boot_message(1, SUSPEND_CONTINUE_REQ, sanity_error)) {
5345     + printk("Suspend2: Sanity check failed.\n");
5346     + goto out_invalidate_image;
5347     + }
5348     +
5349     + /*
5350     + * We have an image and it looks like it will load okay.
5351     + *
5352     + * Get metadata from header. Don't override commandline parameters.
5353     + *
5354     + * We don't need to save the image size limit because it's not used
5355     + * during resume and will be restored with the image anyway.
5356     + */
5357     +
5358     + memcpy((char *) &pagedir1,
5359     + (char *) &suspend_header->pagedir, sizeof(pagedir1));
5360     + suspend_result = suspend_header->param0;
5361     + suspend_action = suspend_header->param1;
5362     + suspend_debug_state = suspend_header->param2;
5363     + console_loglevel = suspend_header->param3;
5364     + clear_suspend_state(SUSPEND_IGNORE_LOGLEVEL);
5365     + pagedir2.size = suspend_header->pageset_2_size;
5366     + for (i = 0; i < 4; i++)
5367     + suspend_io_time[i/2][i%2] =
5368     + suspend_header->io_time[i/2][i%2];
5369     +
5370     + /* Read module configurations */
5371     + if ((result = read_module_configs())) {
5372     + pagedir1.size = pagedir2.size = 0;
5373     + printk("Suspend2: Failed to read Suspend module "
5374     + "configurations.\n");
5375     + clear_action_state(SUSPEND_KEEP_IMAGE);
5376     + goto out_invalidate_image;
5377     + }
5378     +
5379     + suspend_prepare_console();
5380     +
5381     + set_suspend_state(SUSPEND_NOW_RESUMING);
5382     +
5383     + if (pre_resume_freeze())
5384     + goto out_reset_console;
5385     +
5386     + suspend_cond_pause(1, "About to read original pageset1 locations.");
5387     +
5388     + /*
5389     + * Read original pageset1 locations. These are the addresses we can't
5390     + * use for the data to be restored.
5391     + */
5392     +
5393     + if (allocate_dyn_pageflags(&pageset1_map) ||
5394     + allocate_dyn_pageflags(&pageset1_copy_map) ||
5395     + allocate_dyn_pageflags(&io_map))
5396     + goto out_reset_console;
5397     +
5398     + if (load_dyn_pageflags(pageset1_map))
5399     + goto out_reset_console;
5400     +
5401     + /* Clean up after reading the header */
5402     + if ((result = suspendActiveAllocator->read_header_cleanup())) {
5403     + printk("Suspend2: Failed to cleanup after reading the image "
5404     + "header.\n");
5405     + goto out_reset_console;
5406     + }
5407     +
5408     + suspend_cond_pause(1, "About to read pagedir.");
5409     +
5410     + /*
5411     + * Get the addresses of pages into which we will load the kernel to
5412     + * be copied back
5413     + */
5414     + if (suspend_get_pageset1_load_addresses()) {
5415     + printk("Suspend2: Failed to get load addresses for pageset1.\n");
5416     + goto out_reset_console;
5417     + }
5418     +
5419     + /* Read the original kernel back */
5420     + suspend_cond_pause(1, "About to read pageset 1.");
5421     +
5422     + if (read_pageset(&pagedir1, 0)) {
5423     + suspend_prepare_status(CLEAR_BAR, "Failed to read pageset 1.");
5424     + result = -EIO;
5425     + printk("Suspend2: Failed to load pageset1.\n");
5426     + goto out_reset_console;
5427     + }
5428     +
5429     + suspend_cond_pause(1, "About to restore original kernel.");
5430     + result = 0;
5431     +
5432     + if (!test_action_state(SUSPEND_KEEP_IMAGE) &&
5433     + suspendActiveAllocator->mark_resume_attempted)
5434     + suspendActiveAllocator->mark_resume_attempted(1);
5435     +
5436     +out:
5437     + free_page((unsigned long) header_buffer);
5438     + return result;
5439     +
5440     +out_reset_console:
5441     + suspend_cleanup_console();
5442     +
5443     +out_invalidate_image:
5444     + free_dyn_pageflags(&pageset1_map);
5445     + free_dyn_pageflags(&pageset1_copy_map);
5446     + free_dyn_pageflags(&io_map);
5447     + result = -EINVAL;
5448     + if (!test_action_state(SUSPEND_KEEP_IMAGE))
5449     + suspendActiveAllocator->invalidate_image();
5450     + suspendActiveAllocator->read_header_cleanup();
5451     + noresume_reset_modules();
5452     + goto out;
5453     +}
5454     +
5455     +/* read_pageset1()
5456     + *
5457     + * Description: Attempt to read the header and pageset1 of a suspend image.
5458     + * Handle the outcome, complaining where appropriate.
5459     + */
5460     +
5461     +int read_pageset1(void)
5462     +{
5463     + int error;
5464     +
5465     + error = __read_pageset1();
5466     +
5467     + switch (error) {
5468     + case 0:
5469     + case -ENODATA:
5470     + case -EINVAL: /* non fatal error */
5471     + break;
5472     + default:
5473     + if (test_result_state(SUSPEND_ABORTED))
5474     + break;
5475     +
5476     + abort_suspend(SUSPEND_IMAGE_ERROR,
5477     + "Suspend2: Error %d resuming\n",
5478     + error);
5479     + }
5480     + return error;
5481     +}
5482     +
5483     +/*
5484     + * get_have_image_data()
5485     + */
5486     +static char *get_have_image_data(void)
5487     +{
5488     + char *output_buffer = (char *) get_zeroed_page(GFP_ATOMIC);
5489     + struct suspend_header *suspend_header;
5490     +
5491     + if (!output_buffer) {
5492     + printk("Output buffer null.\n");
5493     + return NULL;
5494     + }
5495     +
5496     + /* Check for an image */
5497     + if (!suspendActiveAllocator->image_exists() ||
5498     + suspendActiveAllocator->read_header_init() ||
5499     + suspendActiveAllocator->rw_header_chunk(READ, NULL,
5500     + output_buffer, sizeof(struct suspend_header))) {
5501     + sprintf(output_buffer, "0\n");
5502     + goto out;
5503     + }
5504     +
5505     + suspend_header = (struct suspend_header *) output_buffer;
5506     +
5507     + sprintf(output_buffer, "1\n%s\n%s\n",
5508     + suspend_header->uts.machine,
5509     + suspend_header->uts.version);
5510     +
5511     + /* Check whether we've resumed before */
5512     + if (test_suspend_state(SUSPEND_RESUMED_BEFORE))
5513     + strcat(output_buffer, "Resumed before.\n");
5514     +
5515     +out:
5516     + noresume_reset_modules();
5517     + return output_buffer;
5518     +}
5519     +
5520     +/* read_pageset2()
5521     + *
5522     + * Description: Read in part or all of pageset2 of an image, depending upon
5523     + * whether we are suspending and have only overwritten a portion
5524     + * with pageset1 pages, or are resuming and need to read them
5525     + * all.
5526     + * Arguments: Int. Boolean. Read only pages which would have been
5527     + * overwritten by pageset1?
5528     + * Returns: Int. Zero if no error, otherwise the error value.
5529     + */
5530     +int read_pageset2(int overwrittenpagesonly)
5531     +{
5532     + int result = 0;
5533     +
5534     + if (!pagedir2.size)
5535     + return 0;
5536     +
5537     + result = read_pageset(&pagedir2, overwrittenpagesonly);
5538     +
5539     + suspend_update_status(100, 100, NULL);
5540     + suspend_cond_pause(1, "Pagedir 2 read.");
5541     +
5542     + return result;
5543     +}
5544     +
5545     +/* image_exists_read
5546     + *
5547     + * Write "0" or "1" (plus details) to page, depending on whether an
5548     + * image is found, and return the length written. Incoming buffer is
5549     + * PAGE_SIZE and result is guaranteed to be far less than that, so
5550     + * we don't worry about overflow.
5551     + */
5552     +int image_exists_read(const char *page, int count)
5553     +{
5554     + int len = 0;
5555     + char *result;
5556     +
5557     + if (suspend_activate_storage(0))
5558     + return count;
5559     +
5560     + if (!test_suspend_state(SUSPEND_RESUME_DEVICE_OK))
5561     + suspend_attempt_to_parse_resume_device(0);
5562     +
5563     + if (!suspendActiveAllocator) {
5564     + len = sprintf((char *) page, "-1\n");
5565     + } else {
5566     + result = get_have_image_data();
5567     + if (result) {
5568     + len = sprintf((char *) page, "%s", result);
5569     + free_page((unsigned long) result);
5570     + }
5571     + }
5572     +
5573     + suspend_deactivate_storage(0);
5574     +
5575     + return len;
5576     +}
5577     +
5578     +/* image_exists_write
5579     + *
5580     + * Invalidate an image if one exists.
5581     + */
5582     +int image_exists_write(const char *buffer, int count)
5583     +{
5584     + if (suspend_activate_storage(0))
5585     + return count;
5586     +
5587     + if (suspendActiveAllocator && suspendActiveAllocator->image_exists())
5588     + suspendActiveAllocator->invalidate_image();
5589     +
5590     + suspend_deactivate_storage(0);
5591     +
5592     + clear_result_state(SUSPEND_KEPT_IMAGE);
5593     +
5594     + return count;
5595     +}
5596     +
5597     +#ifdef CONFIG_SUSPEND2_EXPORTS
5598     +EXPORT_SYMBOL_GPL(suspend_attempt_to_parse_resume_device);
5599     +EXPORT_SYMBOL_GPL(attempt_to_parse_resume_device2);
5600     +#endif
5601     +
5602     diff --git a/kernel/power/io.h b/kernel/power/io.h
5603     new file mode 100644
5604     index 0000000..527b39e
5605     --- /dev/null
5606     +++ b/kernel/power/io.h
5607     @@ -0,0 +1,56 @@
5608     +/*
5609     + * kernel/power/io.h
5610     + *
5611     + * Copyright (C) 2005-2007 Nigel Cunningham (nigel at suspend2 net)
5612     + *
5613     + * This file is released under the GPLv2.
5614     + *
5615     + * It contains high level IO routines for suspending.
5616     + *
5617     + */
5618     +
5619     +#include <linux/utsname.h>
5620     +#include "pagedir.h"
5621     +
5622     +/* Non-module data saved in our image header */
5623     +struct suspend_header {
5624     + u32 version_code;
5625     + unsigned long num_physpages;
5626     + unsigned long orig_mem_free;
5627     + struct new_utsname uts;
5628     + int num_cpus;
5629     + int page_size;
5630     + int pageset_2_size;
5631     + int param0;
5632     + int param1;
5633     + int param2;
5634     + int param3;
5635     + int progress0;
5636     + int progress1;
5637     + int progress2;
5638     + int progress3;
5639     + int io_time[2][2];
5640     + struct pagedir pagedir;
5641     + dev_t root_fs;
5642     +};
5643     +
5644     +extern int write_pageset(struct pagedir *pagedir);
5645     +extern int write_image_header(void);
5646     +extern int read_pageset1(void);
5647     +extern int read_pageset2(int overwrittenpagesonly);
5648     +
5649     +extern int suspend_attempt_to_parse_resume_device(int quiet);
5650     +extern void attempt_to_parse_resume_device2(void);
5651     +extern void attempt_to_parse_po_resume_device2(void);
5652     +int image_exists_read(const char *page, int count);
5653     +int image_exists_write(const char *buffer, int count);
5654     +extern void save_restore_resume2(int replace, int quiet);
5655     +
5656     +/* Args to save_restore_resume2 */
5657     +#define RESTORE 0
5658     +#define SAVE 1
5659     +
5660     +#define NOQUIET 0
5661     +#define QUIET 1
5662     +
5663     +extern dev_t name_to_dev_t(char *line);
5664     diff --git a/kernel/power/main.c b/kernel/power/main.c
5665     index a064dfd..05b6686 100644
5666     --- a/kernel/power/main.c
5667     +++ b/kernel/power/main.c
5668     @@ -155,7 +155,7 @@ static void suspend_finish(suspend_state_t state)
5669     static const char * const pm_states[PM_SUSPEND_MAX] = {
5670     [PM_SUSPEND_STANDBY] = "standby",
5671     [PM_SUSPEND_MEM] = "mem",
5672     -#ifdef CONFIG_SOFTWARE_SUSPEND
5673     +#if defined(CONFIG_SOFTWARE_SUSPEND) || defined(CONFIG_SUSPEND2)
5674     [PM_SUSPEND_DISK] = "disk",
5675     #endif
5676     };
5677     diff --git a/kernel/power/modules.c b/kernel/power/modules.c
5678     new file mode 100644
5679     index 0000000..a6b574f
5680     --- /dev/null
5681     +++ b/kernel/power/modules.c
5682     @@ -0,0 +1,415 @@
5683     +/*
5684     + * kernel/power/modules.c
5685     + *
5686     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
5687     + *
5688     + */
5689     +
5690     +#include <linux/suspend.h>
5691     +#include <linux/module.h>
5692     +#include "suspend.h"
5693     +#include "modules.h"
5694     +#include "sysfs.h"
5695     +#include "ui.h"
5696     +
5697     +struct list_head suspend_filters, suspendAllocators, suspend_modules;
5698     +struct suspend_module_ops *suspendActiveAllocator;
5699     +int suspend_num_filters;
5700     +int suspendNumAllocators, suspend_num_modules;
5701     +int initialised;
5702     +
5703     +static inline void suspend_initialise_module_lists(void) {
5704     + INIT_LIST_HEAD(&suspend_filters);
5705     + INIT_LIST_HEAD(&suspendAllocators);
5706     + INIT_LIST_HEAD(&suspend_modules);
5707     +}
5708     +
5709     +/*
5710     + * suspend_header_storage_for_modules
5711     + *
5712     + * Returns the amount of space needed to store configuration
5713     + * data needed by the modules prior to copying back the original
5714     + * kernel. We can exclude data for pageset2 because it will be
5715     + * available anyway once the kernel is copied back.
5716     + */
5717     +int suspend_header_storage_for_modules(void)
5718     +{
5719     + struct suspend_module_ops *this_module;
5720     + int bytes = 0;
5721     +
5722     + list_for_each_entry(this_module, &suspend_modules, module_list) {
5723     + if (!this_module->enabled ||
5724     + (this_module->type == WRITER_MODULE &&
5725     + suspendActiveAllocator != this_module))
5726     + continue;
5727     + if (this_module->storage_needed) {
5728     + int this = this_module->storage_needed() +
5729     + sizeof(struct suspend_module_header) +
5730     + sizeof(int);
5731     + this_module->header_requested = this;
5732     + bytes += this;
5733     + }
5734     + }
5735     +
5736     + /* One more for the empty terminator */
5737     + return bytes + sizeof(struct suspend_module_header);
5738     +}
5739     +
5740     +/*
5741     + * suspend_memory_for_modules
5742     + *
5743     + * Returns the amount of memory requested by modules for
5744     + * doing their work during the cycle.
5745     + */
5746     +
5747     +int suspend_memory_for_modules(void)
5748     +{
5749     + int bytes = 0;
5750     + struct suspend_module_ops *this_module;
5751     +
5752     + list_for_each_entry(this_module, &suspend_modules, module_list) {
5753     + if (!this_module->enabled)
5754     + continue;
5755     + if (this_module->memory_needed)
5756     + bytes += this_module->memory_needed();
5757     + }
5758     +
5759     + return ((bytes + PAGE_SIZE - 1) >> PAGE_SHIFT);
5760     +}
5761     +
5762     +/*
5763     + * suspend_expected_compression_ratio
5764     + *
5765     + * Returns the compression ratio expected when saving the image.
5766     + */
5767     +
5768     +int suspend_expected_compression_ratio(void)
5769     +{
5770     + int ratio = 100;
5771     + struct suspend_module_ops *this_module;
5772     +
5773     + list_for_each_entry(this_module, &suspend_modules, module_list) {
5774     + if (!this_module->enabled)
5775     + continue;
5776     + if (this_module->expected_compression)
5777     + ratio = ratio * this_module->expected_compression() / 100;
5778     + }
5779     +
5780     + return ratio;
5781     +}
5782     +
5783     +/* suspend_find_module_given_name
5784     + * Functionality : Return a module (if found), given a pointer
5785     + * to its name
5786     + */
5787     +
5788     +struct suspend_module_ops *suspend_find_module_given_name(char *name)
5789     +{
5790     + struct suspend_module_ops *this_module, *found_module = NULL;
5791     +
5792     + list_for_each_entry(this_module, &suspend_modules, module_list) {
5793     + if (!strcmp(name, this_module->name)) {
5794     + found_module = this_module;
5795     + break;
5796     + }
5797     + }
5798     +
5799     + return found_module;
5800     +}
5801     +
5802     +/*
5803     + * suspend_print_module_debug_info
5804     + * Functionality : Get debugging info from modules into a buffer.
5805     + */
5806     +int suspend_print_module_debug_info(char *buffer, int buffer_size)
5807     +{
5808     + struct suspend_module_ops *this_module;
5809     + int len = 0;
5810     +
5811     + list_for_each_entry(this_module, &suspend_modules, module_list) {
5812     + if (!this_module->enabled)
5813     + continue;
5814     + if (this_module->print_debug_info) {
5815     + int result;
5816     + result = this_module->print_debug_info(buffer + len,
5817     + buffer_size - len);
5818     + len += result;
5819     + }
5820     + }
5821     +
5822     + /* Ensure null terminated */
5823     + buffer[buffer_size] = 0;
5824     +
5825     + return len;
5826     +}
5827     +
5828     +/*
5829     + * suspend_register_module
5830     + *
5831     + * Register a module.
5832     + */
5833     +int suspend_register_module(struct suspend_module_ops *module)
5834     +{
5835     + int i;
5836     + struct kobject *kobj;
5837     +
5838     + if (!initialised) {
5839     + suspend_initialise_module_lists();
5840     + initialised = 1;
5841     + }
5842     +
5843     + module->enabled = 1;
5844     +
5845     + if (suspend_find_module_given_name(module->name)) {
5846     + printk("Suspend2: Trying to load module %s,"
5847     + " which is already registered.\n",
5848     + module->name);
5849     + return -EBUSY;
5850     + }
5851     +
5852     + switch (module->type) {
5853     + case FILTER_MODULE:
5854     + list_add_tail(&module->type_list,
5855     + &suspend_filters);
5856     + suspend_num_filters++;
5857     + break;
5858     +
5859     + case WRITER_MODULE:
5860     + list_add_tail(&module->type_list,
5861     + &suspendAllocators);
5862     + suspendNumAllocators++;
5863     + break;
5864     +
5865     + case MISC_MODULE:
5866     + break;
5867     +
5868     + default:
5869     + printk("Hmmm. Module '%s' has an invalid type."
5870     + " It has been ignored.\n", module->name);
5871     + return -EINVAL;
5872     + }
5873     + list_add_tail(&module->module_list, &suspend_modules);
5874     + suspend_num_modules++;
5875     +
5876     + if (module->directory || module->shared_directory) {
5877     + /*
5878     + * Modules may share a directory, but those with
5879     + * shared_directory set must be loaded (via symbol
5880     + * dependencies) after their parents and unloaded before them.
5881     + */
5882     + if (module->shared_directory) {
5883     + struct suspend_module_ops *shared =
5884     + suspend_find_module_given_name(module->shared_directory);
5885     + if (!shared) {
5886     + printk("Suspend2: Module %s wants to share %s's directory but %s isn't loaded.\n",
5887     + module->name,
5888     + module->shared_directory,
5889     + module->shared_directory);
5890     + suspend_unregister_module(module);
5891     + return -ENODEV;
5892     + }
5893     + kobj = shared->dir_kobj;
5894     + } else
5895     + kobj = make_suspend2_sysdir(module->directory);
5896     + module->dir_kobj = kobj;
5897     + for (i = 0; i < module->num_sysfs_entries; i++) {
5898     + int result = suspend_register_sysfs_file(kobj, &module->sysfs_data[i]);
5899     + if (result)
5900     + return result;
5901     + }
5902     + }
5903     +
5904     + printk("Suspend2 %s support registered.\n", module->name);
5905     + return 0;
5906     +}
5907     +
5908     +/*
5909     + * suspend_unregister_module
5910     + *
5911     + * Remove a module.
5912     + */
5913     +void suspend_unregister_module(struct suspend_module_ops *module)
5914     +{
5915     + int i;
5916     +
5917     + if (module->dir_kobj)
5918     + for (i = 0; i < module->num_sysfs_entries; i++)
5919     + suspend_unregister_sysfs_file(module->dir_kobj, &module->sysfs_data[i]);
5920     +
5921     + if (!module->shared_directory && module->directory)
5922     + remove_suspend2_sysdir(module->dir_kobj);
5923     +
5924     + switch (module->type) {
5925     + case FILTER_MODULE:
5926     + list_del(&module->type_list);
5927     + suspend_num_filters--;
5928     + break;
5929     +
5930     + case WRITER_MODULE:
5931     + list_del(&module->type_list);
5932     + suspendNumAllocators--;
5933     + if (suspendActiveAllocator == module) {
5934     + suspendActiveAllocator = NULL;
5935     + clear_suspend_state(SUSPEND_CAN_RESUME);
5936     + clear_suspend_state(SUSPEND_CAN_SUSPEND);
5937     + }
5938     + break;
5939     +
5940     + case MISC_MODULE:
5941     + break;
5942     +
5943     + default:
5944     + printk("Hmmm. Module '%s' has an invalid type."
5945     + " It has been ignored.\n", module->name);
5946     + return;
5947     + }
5948     + list_del(&module->module_list);
5949     + suspend_num_modules--;
5950     + printk("Suspend2 %s module unloaded.\n", module->name);
5951     +}
5952     +
5953     +/*
5954     + * suspend_move_module_tail
5955     + *
5956     + * Rearrange modules when reloading the config.
5957     + */
5958     +void suspend_move_module_tail(struct suspend_module_ops *module)
5959     +{
5960     + switch (module->type) {
5961     + case FILTER_MODULE:
5962     + if (suspend_num_filters > 1)
5963     + list_move_tail(&module->type_list,
5964     + &suspend_filters);
5965     + break;
5966     +
5967     + case WRITER_MODULE:
5968     + if (suspendNumAllocators > 1)
5969     + list_move_tail(&module->type_list,
5970     + &suspendAllocators);
5971     + break;
5972     +
5973     + case MISC_MODULE:
5974     + break;
5975     + default:
5976     + printk("Hmmm. Module '%s' has an invalid type."
5977     + " It has been ignored.\n", module->name);
5978     + return;
5979     + }
5980     + if ((suspend_num_filters + suspendNumAllocators) > 1)
5981     + list_move_tail(&module->module_list, &suspend_modules);
5982     +}
5983     +
5984     +/*
5985     + * suspend_initialise_modules
5986     + *
5987     + * Get ready to do some work!
5988     + */
5989     +int suspend_initialise_modules(int starting_cycle)
5990     +{
5991     + struct suspend_module_ops *this_module;
5992     + int result;
5993     +
5994     + list_for_each_entry(this_module, &suspend_modules, module_list) {
5995     + this_module->header_requested = 0;
5996     + this_module->header_used = 0;
5997     + if (!this_module->enabled)
5998     + continue;
5999     + if (this_module->initialise) {
6000     + suspend_message(SUSPEND_MEMORY, SUSPEND_MEDIUM, 1,
6001     + "Initialising module %s.\n",
6002     + this_module->name);
6003     + if ((result = this_module->initialise(starting_cycle))) {
6004     + printk("%s didn't initialise okay.\n",
6005     + this_module->name);
6006     + return result;
6007     + }
6008     + }
6009     + }
6010     +
6011     + return 0;
6012     +}
6013     +
6014     +/*
6015     + * suspend_cleanup_modules
6016     + *
6017     + * Tell modules the work is done.
6018     + */
6019     +void suspend_cleanup_modules(int finishing_cycle)
6020     +{
6021     + struct suspend_module_ops *this_module;
6022     +
6023     + list_for_each_entry(this_module, &suspend_modules, module_list) {
6024     + if (!this_module->enabled)
6025     + continue;
6026     + if (this_module->cleanup) {
6027     + suspend_message(SUSPEND_MEMORY, SUSPEND_MEDIUM, 1,
6028     + "Cleaning up module %s.\n",
6029     + this_module->name);
6030     + this_module->cleanup(finishing_cycle);
6031     + }
6032     + }
6033     +}
6034     +
6035     +/*
6036     + * suspend_get_next_filter
6037     + *
6038     + * Get the next filter in the pipeline.
6039     + */
6040     +struct suspend_module_ops *suspend_get_next_filter(struct suspend_module_ops *filter_sought)
6041     +{
6042     + struct suspend_module_ops *last_filter = NULL, *this_filter = NULL;
6043     +
6044     + list_for_each_entry(this_filter, &suspend_filters, type_list) {
6045     + if (!this_filter->enabled)
6046     + continue;
6047     + if ((last_filter == filter_sought) || (!filter_sought))
6048     + return this_filter;
6049     + last_filter = this_filter;
6050     + }
6051     +
6052     + return suspendActiveAllocator;
6053     +}
6054     +
6055     +/* suspend_get_modules
6056     + *
6057     + * Take a reference to modules so they can't go away under us.
6058     + */
6059     +
6060     +int suspend_get_modules(void)
6061     +{
6062     + struct suspend_module_ops *this_module;
6063     +
6064     + list_for_each_entry(this_module, &suspend_modules, module_list) {
6065     + if (!try_module_get(this_module->module)) {
6066     + /* Failed! Reverse gets and return error */
6067     + struct suspend_module_ops *this_module2;
6068     + list_for_each_entry(this_module2, &suspend_modules, module_list) {
6069     + if (this_module == this_module2)
6070     + return -EINVAL;
6071     + module_put(this_module2->module);
6072     + }
6073     + }
6074     + }
6075     +
6076     + return 0;
6077     +}
6078     +
6079     +/* suspend_put_modules
6080     + *
6081     + * Release our references to modules we used.
6082     + */
6083     +
6084     +void suspend_put_modules(void)
6085     +{
6086     + struct suspend_module_ops *this_module;
6087     +
6088     + list_for_each_entry(this_module, &suspend_modules, module_list)
6089     + module_put(this_module->module);
6090     +}
6091     +
6092     +#ifdef CONFIG_SUSPEND2_EXPORTS
6093     +EXPORT_SYMBOL_GPL(suspend_register_module);
6094     +EXPORT_SYMBOL_GPL(suspend_unregister_module);
6095     +EXPORT_SYMBOL_GPL(suspend_get_next_filter);
6096     +EXPORT_SYMBOL_GPL(suspendActiveAllocator);
6097     +#endif
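
The filter lookup in suspend_get_next_filter() above can be hard to follow from the diff alone. What follows is a minimal userspace sketch of the same walk, not the kernel code itself: struct demo_module, demo_filters, demo_active_allocator and both demo_* functions are hypothetical stand-ins, and a plain singly linked list replaces the kernel's list_head. It assumes callers only pass in filters that are enabled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct suspend_module_ops. */
struct demo_module {
	const char *name;
	int enabled;
	struct demo_module *next;	/* next filter in registration order */
};

/* Stand-ins for the suspend_filters list and suspendActiveAllocator. */
static struct demo_module *demo_filters;
static struct demo_module *demo_active_allocator;

/*
 * Return the first enabled filter after 'sought', or the first enabled
 * filter when 'sought' is NULL. Once the filters are exhausted, fall
 * through to the active allocator, which ends the pipeline.
 */
struct demo_module *demo_get_next_filter(struct demo_module *sought)
{
	struct demo_module *last = NULL, *cur;

	/* Assumption of this sketch: disabled filters are never handed in. */
	assert(!sought || sought->enabled);

	for (cur = demo_filters; cur; cur = cur->next) {
		if (!cur->enabled)
			continue;
		if (last == sought || !sought)
			return cur;
		last = cur;
	}

	return demo_active_allocator;
}

/* Count the stages a page passes through, allocator included. */
int demo_pipeline_length(void)
{
	struct demo_module *m = demo_get_next_filter(NULL);
	int stages = 0;

	while (m && m != demo_active_allocator) {
		stages++;
		m = demo_get_next_filter(m);
	}

	return demo_active_allocator ? stages + 1 : stages;
}
```

With three registered filters of which the middle one is disabled, starting from NULL yields the first filter, then the third, then the allocator. Disabled filters are skipped without being unlinked, which matches the way the patch toggles module->enabled instead of re-registering modules.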
6098     diff --git a/kernel/power/modules.h b/kernel/power/modules.h
6099     new file mode 100644
6100     index 0000000..ed94458
6101     --- /dev/null
6102     +++ b/kernel/power/modules.h
6103     @@ -0,0 +1,164 @@
6104     +/*
6105     + * kernel/power/modules.h
6106     + *
6107     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
6108     + *
6109     + * This file is released under the GPLv2.
6110     + *
6111     + * It contains declarations for modules. Modules are additions to
6112     + * suspend2 that provide facilities such as image compression or
6113     + * encryption, backends for storage of the image and user interfaces.
6114     + *
6115     + */
6116     +
6117     +#ifndef SUSPEND_MODULES_H
6118     +#define SUSPEND_MODULES_H
6119     +
6120     +/* This is the maximum size we store in the image header for a module name */
6121     +#define SUSPEND_MAX_MODULE_NAME_LENGTH 30
6122     +
6123     +/* Per-module metadata */
6124     +struct suspend_module_header {
6125     + char name[SUSPEND_MAX_MODULE_NAME_LENGTH];
6126     + int enabled;
6127     + int type;
6128     + int index;
6129     + int data_length;
6130     + unsigned long signature;
6131     +};
6132     +
6133     +enum {
6134     + FILTER_MODULE,
6135     + WRITER_MODULE,
6136     + MISC_MODULE /* Block writer, eg. */
6137     +};
6138     +
6139     +enum {
6140     + SUSPEND_ASYNC,
6141     + SUSPEND_SYNC
6142     +};
6143     +
6144     +struct suspend_module_ops {
6145     + /* Functions common to all modules */
6146     + int type;
6147     + char *name;
6148     + char *directory;
6149     + char *shared_directory;
6150     + struct kobject *dir_kobj;
6151     + struct module *module;
6152     + int enabled;
6153     + struct list_head module_list;
6154     +
6155     + /* List of filters or allocators */
6156     + struct list_head list, type_list;
6157     +
6158     + /*
6159     + * Requirements for memory and storage in
6160     + * the image header.
6161     + */
6162     + int (*memory_needed) (void);
6163     + int (*storage_needed) (void);
6164     +
6165     + int header_requested, header_used;
6166     +
6167     + int (*expected_compression) (void);
6168     +
6169     + /*
6170     + * Debug info
6171     + */
6172     + int (*print_debug_info) (char *buffer, int size);
6173     + int (*save_config_info) (char *buffer);
6174     + void (*load_config_info) (char *buffer, int len);
6175     +
6176     + /*
6177     + * Initialise & cleanup - general routines called
6178     + * at the start and end of a cycle.
6179     + */
6180     + int (*initialise) (int starting_cycle);
6181     + void (*cleanup) (int finishing_cycle);
6182     +
6183     + /*
6184     + * Calls for allocating storage (allocators only).
6185     + *
6186     + * Header space is allocated separately. Note that allocation
6187     + * of space for the header might result in allocated space
6188     + * being stolen from the main pool if there is no unallocated
6189     + * space. We have to be able to allocate enough space for
6190     + * the header. We can eat memory to ensure there is enough
6191     + * for the main pool.
6192     + */
6193     +
6194     + int (*storage_available) (void);
6195     + int (*allocate_header_space) (int space_requested);
6196     + int (*allocate_storage) (int space_requested);
6197     + int (*storage_allocated) (void);
6198     + int (*release_storage) (void);
6199     +
6200     + /*
6201     + * Routines used in image I/O.
6202     + */
6203     + int (*rw_init) (int rw, int stream_number);
6204     + int (*rw_cleanup) (int rw);
6205     + int (*write_chunk) (unsigned long index, struct page *buffer_page,
6206     + unsigned int buf_size);
6207     + int (*read_chunk) (unsigned long *index, struct page *buffer_page,
6208     + unsigned int *buf_size, int sync);
6209     +
6210     + /* Reset module if image exists but reading aborted */
6211     + void (*noresume_reset) (void);
6212     +
6213     + /* Read and write the metadata */
6214     + int (*write_header_init) (void);
6215     + int (*write_header_cleanup) (void);
6216     +
6217     + int (*read_header_init) (void);
6218     + int (*read_header_cleanup) (void);
6219     +
6220     + int (*rw_header_chunk) (int rw, struct suspend_module_ops *owner,
6221     + char *buffer_start, int buffer_size);
6222     +
6223     + /* Attempt to parse an image location */
6224     + int (*parse_sig_location) (char *buffer, int only_writer, int quiet);
6225     +
6226     + /* Determine whether image exists that we can restore */
6227     + int (*image_exists) (void);
6228     +
6229     + /* Mark the image as having tried to resume */
6230     + void (*mark_resume_attempted) (int);
6231     +
6232     + /* Destroy image if one exists */
6233     + int (*invalidate_image) (void);
6234     +
6235     + /* Sysfs Data */
6236     + struct suspend_sysfs_data *sysfs_data;
6237     + int num_sysfs_entries;
6238     +};
6239     +
6240     +extern int suspend_num_modules, suspendNumAllocators;
6241     +
6242     +extern struct suspend_module_ops *suspendActiveAllocator;
6243     +extern struct list_head suspend_filters, suspendAllocators, suspend_modules;
6244     +
6245     +extern void suspend_prepare_console_modules(void);
6246     +extern void suspend_cleanup_console_modules(void);
6247     +
6248     +extern struct suspend_module_ops *suspend_find_module_given_name(char *name);
6249     +extern struct suspend_module_ops *suspend_get_next_filter(struct suspend_module_ops *);
6250     +
6251     +extern int suspend_register_module(struct suspend_module_ops *module);
6252     +extern void suspend_move_module_tail(struct suspend_module_ops *module);
6253     +
6254     +extern int suspend_header_storage_for_modules(void);
6255     +extern int suspend_memory_for_modules(void);
6256     +extern int suspend_expected_compression_ratio(void);
6257     +
6258     +extern int suspend_print_module_debug_info(char *buffer, int buffer_size);
6259     +extern int suspend_register_module(struct suspend_module_ops *module);
6260     +extern void suspend_unregister_module(struct suspend_module_ops *module);
6261     +
6262     +extern int suspend_initialise_modules(int starting_cycle);
6263     +extern void suspend_cleanup_modules(int finishing_cycle);
6264     +
6265     +int suspend_get_modules(void);
6266     +void suspend_put_modules(void);
6267     +#endif
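
The optional memory_needed and expected_compression callbacks declared in struct suspend_module_ops are aggregated by suspend_memory_for_modules() and suspend_expected_compression_ratio() in modules.c. The sketch below reproduces that arithmetic in userspace under stated assumptions: struct demo_ops, the sample callbacks and the demo_modules table are hypothetical, a 4096-byte page is assumed, and the compression value is taken to be a percentage of the original size, as the starting value of 100 in the patch suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page geometry; the kernel uses PAGE_SHIFT and PAGE_SIZE. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE (1UL << DEMO_PAGE_SHIFT)

/* Hypothetical subset of struct suspend_module_ops. */
struct demo_ops {
	int enabled;
	int (*memory_needed)(void);		/* bytes, may be NULL */
	int (*expected_compression)(void);	/* percent, may be NULL */
};

/* Sum the byte requests of enabled modules and round up to whole pages. */
unsigned long demo_memory_pages(const struct demo_ops *mods, size_t n)
{
	unsigned long bytes = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int need;

		if (!mods[i].enabled || !mods[i].memory_needed)
			continue;
		need = mods[i].memory_needed();
		assert(need >= 0);	/* a negative request makes no sense */
		bytes += (unsigned long) need;
	}

	return (bytes + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT;
}

/* Compound the per-module percentages, starting from 100%. */
int demo_compression_ratio(const struct demo_ops *mods, size_t n)
{
	int ratio = 100;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!mods[i].enabled || !mods[i].expected_compression)
			continue;
		ratio = ratio * mods[i].expected_compression() / 100;
	}

	return ratio;
}

/* Sample callbacks with made-up figures, for illustration only. */
static int compress_memory(void) { return 5000; }
static int compress_ratio(void) { return 50; }
static int encrypt_memory(void) { return 4096; }
static int dedupe_ratio(void) { return 80; }
static int disabled_memory(void) { return 100000; }
static int disabled_ratio(void) { return 10; }

const struct demo_ops demo_modules[] = {
	{ 1, compress_memory, compress_ratio },
	{ 1, encrypt_memory, NULL },
	{ 1, NULL, dedupe_ratio },
	{ 0, disabled_memory, disabled_ratio },
};
```

For the sample table, the enabled modules request 9096 bytes, which rounds up to 3 pages, and the ratios compound as 100 * 50 / 100 * 80 / 100 = 40. Because the division is integer division applied after every module, results can drift slightly low when several filters are chained.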
6268     diff --git a/kernel/power/netlink.c b/kernel/power/netlink.c
6269     new file mode 100644
6270     index 0000000..bb1d563
6271     --- /dev/null
6272     +++ b/kernel/power/netlink.c
6273     @@ -0,0 +1,387 @@
6274     +/*
6275     + * kernel/power/netlink.c
6276     + *
6277     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
6278     + *
6279     + * This file is released under the GPLv2.
6280     + *
6281     + * Functions for communicating with a userspace helper via netlink.
6282     + */
6283     +
6284     +
6285     +#include <linux/suspend.h>
6286     +#include "netlink.h"
6287     +#include "suspend.h"
6288     +#include "modules.h"
6289     +
6290     +struct user_helper_data *uhd_list = NULL;
6291     +
6292     +/*
6293     + * Refill our pool of SKBs for use in emergencies (eg, when eating memory and none
6294     + * can be allocated).
6295     + */
6296     +static void suspend_fill_skb_pool(struct user_helper_data *uhd)
6297     +{
6298     + while (uhd->pool_level < uhd->pool_limit) {
6299     + struct sk_buff *new_skb =
6300     + alloc_skb(NLMSG_SPACE(uhd->skb_size), GFP_ATOMIC);
6301     +
6302     + if (!new_skb)
6303     + break;
6304     +
6305     + new_skb->next = uhd->emerg_skbs;
6306     + uhd->emerg_skbs = new_skb;
6307     + uhd->pool_level++;
6308     + }
6309     +}
6310     +
6311     +/*
6312     + * Try to allocate a single skb. If we can't get one, try to use one from
6313     + * our pool.
6314     + */
6315     +static struct sk_buff *suspend_get_skb(struct user_helper_data *uhd)
6316     +{
6317     + struct sk_buff *skb =
6318     + alloc_skb(NLMSG_SPACE(uhd->skb_size), GFP_ATOMIC);
6319     +
6320     + if (skb)
6321     + return skb;
6322     +
6323     + skb = uhd->emerg_skbs;
6324     + if (skb) {
6325     + uhd->pool_level--;
6326     + uhd->emerg_skbs = skb->next;
6327     + skb->next = NULL;
6328     + }
6329     +
6330     + return skb;
6331     +}
6332     +
6333     +static void put_skb(struct user_helper_data *uhd, struct sk_buff *skb)
6334     +{
6335     + if (uhd->pool_level < uhd->pool_limit) {
6336     + skb->next = uhd->emerg_skbs;
6337     + uhd->emerg_skbs = skb;
6338     + } else
6339     + kfree_skb(skb);
6340     +}
6341     +
6342     +void suspend_send_netlink_message(struct user_helper_data *uhd,
6343     + int type, void* params, size_t len)
6344     +{
6345     + struct sk_buff *skb;
6346     + struct nlmsghdr *nlh;
6347     + void *dest;
6348     + struct task_struct *t;
6349     +
6350     + if (uhd->pid == -1)
6351     + return;
6352     +
6353     + skb = suspend_get_skb(uhd);
6354     + if (!skb) {
6355     + printk("suspend_netlink: Can't allocate skb!\n");
6356     + return;
6357     + }
6358     +
6359     + /* NLMSG_PUT contains a hidden goto nlmsg_failure */
6360     + nlh = NLMSG_PUT(skb, 0, uhd->sock_seq, type, len);
6361     + uhd->sock_seq++;
6362     +
6363     + dest = NLMSG_DATA(nlh);
6364     + if (params && len > 0)
6365     + memcpy(dest, params, len);
6366     +
6367     + netlink_unicast(uhd->nl, skb, uhd->pid, 0);
6368     +
6369     + read_lock(&tasklist_lock);
6370     + if ((t = find_task_by_pid(uhd->pid)) == NULL) {
6371     + read_unlock(&tasklist_lock);
6372     + if (uhd->pid > -1)
6373     + printk("Hmm. Can't find the userspace task %d.\n", uhd->pid);
6374     + return;
6375     + }
6376     + wake_up_process(t);
6377     + read_unlock(&tasklist_lock);
6378     +
6379     + yield();
6380     +
6381     + return;
6382     +
6383     +nlmsg_failure:
6384     + if (skb)
6385     + put_skb(uhd, skb);
6386     +}
6387     +
6388     +static void send_whether_debugging(struct user_helper_data *uhd)
6389     +{
6390     + static int is_debugging = 1;
6391     +
6392     + suspend_send_netlink_message(uhd, NETLINK_MSG_IS_DEBUGGING,
6393     + &is_debugging, sizeof(int));
6394     +}
6395     +
6396     +/*
6397     + * Set the PF_NOFREEZE flag on the given process to ensure it can run whilst we
6398     + * are suspending.
6399     + */
6400     +static int nl_set_nofreeze(struct user_helper_data *uhd, int pid)
6401     +{
6402     + struct task_struct *t;
6403     +
6404     + read_lock(&tasklist_lock);
6405     + if ((t = find_task_by_pid(pid)) == NULL) {
6406     + read_unlock(&tasklist_lock);
6407     + printk("Strange. Can't find the userspace task %d.\n", pid);
6408     + return -EINVAL;
6409     + }
6410     +
6411     + t->flags |= PF_NOFREEZE;
6412     +
6413     + read_unlock(&tasklist_lock);
6414     + uhd->pid = pid;
6415     +
6416     + suspend_send_netlink_message(uhd, NETLINK_MSG_NOFREEZE_ACK, NULL, 0);
6417     +
6418     + return 0;
6419     +}
6420     +
6421     +/*
6422     + * Called when the userspace process has informed us that it's ready to roll.
6423     + */
6424     +static int nl_ready(struct user_helper_data *uhd, int version)
6425     +{
6426     + if (version != uhd->interface_version) {
6427     + printk("%s userspace process using invalid interface version."
6428     + " Trying to continue without it.\n",
6429     + uhd->name);
6430     + if (uhd->not_ready)
6431     + uhd->not_ready();
6432     + return 1;
6433     + }
6434     +
6435     + complete(&uhd->wait_for_process);
6436     +
6437     + return 0;
6438     +}
6439     +
6440     +void suspend_netlink_close_complete(struct user_helper_data *uhd)
6441     +{
6442     + if (uhd->nl) {
6443     + sock_release(uhd->nl->sk_socket);
6444     + uhd->nl = NULL;
6445     + }
6446     +
6447     + while (uhd->emerg_skbs) {
6448     + struct sk_buff *next = uhd->emerg_skbs->next;
6449     + kfree_skb(uhd->emerg_skbs);
6450     + uhd->emerg_skbs = next;
6451     + }
6452     +
6453     + uhd->pid = -1;
6454     +
6455     + suspend_put_modules();
6456     +}
6457     +
6458     +static int suspend_nl_gen_rcv_msg(struct user_helper_data *uhd,
6459     + struct sk_buff *skb, struct nlmsghdr *nlh)
6460     +{
6461     + int type;
6462     + int *data;
6463     + int err;
6464     +
6465     + /* Let the more specific handler go first. It returns
6466     + * 1 for valid messages that it doesn't know. */
6467     + if ((err = uhd->rcv_msg(skb, nlh)) != 1)
6468     + return err;
6469     +
6470     + type = nlh->nlmsg_type;
6471     +
6472     + /* Only allow one task to receive NOFREEZE privileges */
6473     + if (type == NETLINK_MSG_NOFREEZE_ME && uhd->pid != -1) {
6474     + printk("Received extra nofreeze me requests.\n");
6475     + return -EBUSY;
6476     + }
6477     +
6478     + data = (int*)NLMSG_DATA(nlh);
6479     +
6480     + switch (type) {
6481     + case NETLINK_MSG_NOFREEZE_ME:
6482     + if ((err = nl_set_nofreeze(uhd, nlh->nlmsg_pid)) != 0)
6483     + return err;
6484     + break;
6485     + case NETLINK_MSG_GET_DEBUGGING:
6486     + send_whether_debugging(uhd);
6487     + break;
6488     + case NETLINK_MSG_READY:
6489     + if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(int))) {
6490     + printk("Invalid ready message.\n");
6491     + return -EINVAL;
6492     + }
6493     + if ((err = nl_ready(uhd, *data)) != 0)
6494     + return err;
6495     + break;
6496     + case NETLINK_MSG_CLEANUP:
6497     + suspend_netlink_close_complete(uhd);
6498     + break;
6499     + }
6500     +
6501     + return 0;
6502     +}
6503     +
6504     +static void suspend_user_rcv_skb(struct user_helper_data *uhd,
6505     + struct sk_buff *skb)
6506     +{
6507     + int err;
6508     + struct nlmsghdr *nlh;
6509     +
6510     + while (skb->len >= NLMSG_SPACE(0)) {
6511     + u32 rlen;
6512     +
6513     + nlh = (struct nlmsghdr *) skb->data;
6514     + if (nlh->nlmsg_len < sizeof(*nlh) || skb->len < nlh->nlmsg_len)
6515     + return;
6516     +
6517     + rlen = NLMSG_ALIGN(nlh->nlmsg_len);
6518     + if (rlen > skb->len)
6519     + rlen = skb->len;
6520     +
6521     + if ((err = suspend_nl_gen_rcv_msg(uhd, skb, nlh)) != 0)
6522     + netlink_ack(skb, nlh, err);
6523     + else if (nlh->nlmsg_flags & NLM_F_ACK)
6524     + netlink_ack(skb, nlh, 0);
6525     + skb_pull(skb, rlen);
6526     + }
6527     +}
6528     +
6529     +static void suspend_netlink_input(struct sock *sk, int len)
6530     +{
6531     + struct user_helper_data *uhd = uhd_list;
6532     +
6533     + while (uhd && uhd->netlink_id != sk->sk_protocol)
6534     + uhd = uhd->next;
6535     +
6536     + do {
6537     + struct sk_buff *skb;
6538     + while ((skb = skb_dequeue(&sk->sk_receive_queue)) != NULL) {
6539     + suspend_user_rcv_skb(uhd, skb);
6540     + put_skb(uhd, skb);
6541     + }
6542     + } while (uhd->nl && uhd->nl->sk_receive_queue.qlen);
6543     +}
6544     +
6545     +static int netlink_prepare(struct user_helper_data *uhd)
6546     +{
6547     + suspend_get_modules();
6548     +
6549     + uhd->next = uhd_list;
6550     + uhd_list = uhd;
6551     +
6552     + uhd->sock_seq = 0x42c0ffee;
6553     + uhd->nl = netlink_kernel_create(uhd->netlink_id, 0,
6554     + suspend_netlink_input, THIS_MODULE);
6555     + if (!uhd->nl) {
6556     + printk("Failed to allocate netlink socket for %s.\n",
6557     + uhd->name);
6558     + return -ENOMEM;
6559     + }
6560     +
6561     + suspend_fill_skb_pool(uhd);
6562     +
6563     + return 0;
6564     +}
6565     +
6566     +void suspend_netlink_close(struct user_helper_data *uhd)
6567     +{
6568     + struct task_struct *t;
6569     +
6570     + read_lock(&tasklist_lock);
6571     + if ((t = find_task_by_pid(uhd->pid)))
6572     + t->flags &= ~PF_NOFREEZE;
6573     + read_unlock(&tasklist_lock);
6574     +
6575     + suspend_send_netlink_message(uhd, NETLINK_MSG_CLEANUP, NULL, 0);
6576     +}
6577     +
6578     +static int suspend2_launch_userspace_program(char *command, int channel_no)
6579     +{
6580     + int retval;
6581     + static char *envp[] = {
6582     + "HOME=/",
6583     + "TERM=linux",
6584     + "PATH=/sbin:/usr/sbin:/bin:/usr/bin",
6585     + NULL };
6586     + static char *argv[] = { NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL };
6587     + char *channel = kmalloc(6, GFP_KERNEL);
6588     + int arg = 0, size;
6589     + char test_read[255];
6590     + char *orig_posn = command;
6591     +
6592     + if (!strlen(orig_posn))
6593     + return 1;
6594     +
6595     + /* Up to 7 args supported */
6596     + while (arg < 7) {
6597     + sscanf(orig_posn, "%s", test_read);
6598     + size = strlen(test_read);
6599     + if (!(size))
6600     + break;
6601     + argv[arg] = kmalloc(size + 1, GFP_ATOMIC);
6602     + strcpy(argv[arg], test_read);
6603     + orig_posn += size + 1;
6604     + *test_read = 0;
6605     + arg++;
6606     + }
6607     +
6608     + if (channel_no) {
6609     + sprintf(channel, "-c%d", channel_no);
6610     + argv[arg] = channel;
6611     + } else
6612     + arg--;
6613     +
6614     + retval = call_usermodehelper(argv[0], argv, envp, 0);
6615     +
6616     + if (retval)
6617     + printk("Failed to launch userspace program '%s': Error %d\n",
6618     + command, retval);
6619     +
6620     + {
6621     + int i;
6622     + for (i = 0; i < arg; i++)
6623     + if (argv[i] && argv[i] != channel)
6624     + kfree(argv[i]);
6625     + }
6626     +
6627     + kfree(channel);
6628     +
6629     + return retval;
6630     +}
6631     +
6632     +int suspend_netlink_setup(struct user_helper_data *uhd)
6633     +{
6634     + if (netlink_prepare(uhd) < 0) {
6635     + printk("Netlink prepare failed.\n");
6636     + return 1;
6637     + }
6638     +
6639     + if (suspend2_launch_userspace_program(uhd->program, uhd->netlink_id) < 0) {
6640     + printk("Launch userspace program failed.\n");
6641     + suspend_netlink_close_complete(uhd);
6642     + return 1;
6643     + }
6644     +
6645     + /* Wait 2 seconds for the userspace process to make contact */
6646     + wait_for_completion_timeout(&uhd->wait_for_process, 2*HZ);
6647     +
6648     + if (uhd->pid == -1) {
6649     + printk("%s: Failed to contact userspace process.\n",
6650     + uhd->name);
6651     + suspend_netlink_close_complete(uhd);
6652     + return 1;
6653     + }
6654     +
6655     + return 0;
6656     +}
6657     +
6658     +EXPORT_SYMBOL_GPL(suspend_netlink_setup);
6659     +EXPORT_SYMBOL_GPL(suspend_netlink_close);
6660     +EXPORT_SYMBOL_GPL(suspend_send_netlink_message);
6661     diff --git a/kernel/power/netlink.h b/kernel/power/netlink.h
6662     new file mode 100644
6663     index 0000000..97647e8
6664     --- /dev/null
6665     +++ b/kernel/power/netlink.h
6666     @@ -0,0 +1,58 @@
6667     +/*
6668     + * kernel/power/netlink.h
6669     + *
6670     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
6671     + *
6672     + * This file is released under the GPLv2.
6673     + *
6674     + * Declarations for functions for communicating with a userspace helper
6675     + * via netlink.
6676     + */
6677     +
6678     +#include <linux/netlink.h>
6679     +#include <net/sock.h>
6680     +
6681     +#define NETLINK_MSG_BASE 0x10
6682     +
6683     +#define NETLINK_MSG_READY 0x10
6684     +#define NETLINK_MSG_NOFREEZE_ME 0x16
6685     +#define NETLINK_MSG_GET_DEBUGGING 0x19
6686     +#define NETLINK_MSG_CLEANUP 0x24
6687     +#define NETLINK_MSG_NOFREEZE_ACK 0x27
6688     +#define NETLINK_MSG_IS_DEBUGGING 0x28
6689     +
6690     +struct user_helper_data {
6691     + int (*rcv_msg) (struct sk_buff *skb, struct nlmsghdr *nlh);
6692     + void (* not_ready) (void);
6693     + struct sock *nl;
6694     + u32 sock_seq;
6695     + pid_t pid;
6696     + char *comm;
6697     + char program[256];
6698     + int pool_level;
6699     + int pool_limit;
6700     + struct sk_buff *emerg_skbs;
6701     + int skb_size;
6702     + int netlink_id;
6703     + char *name;
6704     + struct user_helper_data *next;
6705     + struct completion wait_for_process;
6706     + int interface_version;
6707     + int must_init;
6708     +};
6709     +
6710     +#ifdef CONFIG_NET
6711     +int suspend_netlink_setup(struct user_helper_data *uhd);
6712     +void suspend_netlink_close(struct user_helper_data *uhd);
6713     +void suspend_send_netlink_message(struct user_helper_data *uhd,
6714     + int type, void* params, size_t len);
6715     +#else
6716     +static inline int suspend_netlink_setup(struct user_helper_data *uhd)
6717     +{
6718     + return 0;
6719     +}
6720     +
6721     +static inline void suspend_netlink_close(struct user_helper_data *uhd) { };
6722     +static inline void suspend_send_netlink_message(struct user_helper_data *uhd,
6723     + int type, void* params, size_t len) { };
6724     +#endif
6725     diff --git a/kernel/power/pagedir.c b/kernel/power/pagedir.c
6726     new file mode 100644
6727     index 0000000..4e74689
6728     --- /dev/null
6729     +++ b/kernel/power/pagedir.c
6730     @@ -0,0 +1,480 @@
6731     +/*
6732     + * kernel/power/pagedir.c
6733     + *
6734     + * Copyright (C) 1998-2001 Gabor Kuti <seasons@fornax.hu>
6735     + * Copyright (C) 1998,2001,2002 Pavel Machek <pavel@suse.cz>
6736     + * Copyright (C) 2002-2003 Florent Chabaud <fchabaud@free.fr>
6737     + * Copyright (C) 2006-2007 Nigel Cunningham (nigel at suspend2 net)
6738     + *
6739     + * This file is released under the GPLv2.
6740     + *
6741     + * Routines for handling pagesets.
6742     + * Note that pbes aren't actually stored as such. They're stored as
6743     + * bitmaps and extents.
6744     + */
6745     +
6746     +#include <linux/suspend.h>
6747     +#include <linux/highmem.h>
6748     +#include <linux/bootmem.h>
6749     +#include <linux/hardirq.h>
6750     +#include <linux/sched.h>
6751     +#include <asm/tlbflush.h>
6752     +
6753     +#include "pageflags.h"
6754     +#include "ui.h"
6755     +#include "pagedir.h"
6756     +#include "prepare_image.h"
6757     +#include "suspend.h"
6758     +#include "power.h"
6759     +#include "suspend2_builtin.h"
6760     +
6761     +#define PAGESET1 0
6762     +#define PAGESET2 1
6763     +
6764     +static int ps2_pfn;
6765     +
6766     +/*
6767     + * suspend_mark_task_as_pageset
6768     + * Functionality : Marks all the saveable pages belonging to a given process
6769     + * as belonging to a particular pageset.
6770     + */
6771     +
6772     +static void suspend_mark_task_as_pageset(struct task_struct *t, int pageset2)
6773     +{
6774     + struct vm_area_struct *vma;
6775     + struct mm_struct *mm;
6776     +
6777     + mm = t->active_mm;
6778     +
6779     + if (!mm || !mm->mmap) return;
6780     +
6781     + if (!irqs_disabled())
6782     + down_read(&mm->mmap_sem);
6783     +
6784     + for (vma = mm->mmap; vma; vma = vma->vm_next) {
6785     + unsigned long posn;
6786     +
6787     + if (vma->vm_flags & (VM_PFNMAP | VM_IO | VM_RESERVED)) {
6788     + printk("Skipping vma %p in process %d (%s) which has "
6789     + "VM_PFNMAP | VM_IO | VM_RESERVED (%lx).\n", vma,
6790     + t->pid, t->comm, vma->vm_flags);
6791     + continue;
6792     + }
6793     +
6794     + if (!vma->vm_start)
6795     + continue;
6796     +
6797     + for (posn = vma->vm_start; posn < vma->vm_end;
6798     + posn += PAGE_SIZE) {
6799     + struct page *page = follow_page(vma, posn, 0);
6800     + if (!page)
6801     + continue;
6802     +
6803     + if (pageset2)
6804     + SetPagePageset2(page);
6805     + else {
6806     + ClearPagePageset2(page);
6807     + SetPagePageset1(page);
6808     + }
6809     + }
6810     + }
6811     +
6812     + if (!irqs_disabled())
6813     + up_read(&mm->mmap_sem);
6814     +}
6815     +
6816     +static void pageset2_full(void)
6817     +{
6818     + struct zone *zone;
6819     + unsigned long flags;
6820     +
6821     + for_each_zone(zone) {
6822     + spin_lock_irqsave(&zone->lru_lock, flags);
6823     + if (zone_page_state(zone, NR_INACTIVE)) {
6824     + struct page *page;
6825     + list_for_each_entry(page, &zone->inactive_list, lru)
6826     + SetPagePageset2(page);
6827     + }
6828     + if (zone_page_state(zone, NR_ACTIVE)) {
6829     + struct page *page;
6830     + list_for_each_entry(page, &zone->active_list, lru)
6831     + SetPagePageset2(page);
6832     + }
6833     + spin_unlock_irqrestore(&zone->lru_lock, flags);
6834     + }
6835     +}
6836     +
6837     +/* mark_pages_for_pageset2
6838     + *
6839     + * Description: Mark unshared pages in processes not needed for suspend as
6840     + * being able to be written out in a separate pagedir.
6841     + * HighMem pages are simply marked as pageset2. They won't be
6842     + * needed during suspend.
6843     + */
6844     +
6845     +struct attention_list {
6846     + struct task_struct *task;
6847     + struct attention_list *next;
6848     +};
6849     +
6850     +void suspend_mark_pages_for_pageset2(void)
6851     +{
6852     + struct task_struct *p;
6853     + struct attention_list *attention_list = NULL, *last = NULL, *next;
6854     + int i, task_count = 0;
6855     +
6856     + if (test_action_state(SUSPEND_NO_PAGESET2))
6857     + return;
6858     +
6859     + clear_dyn_pageflags(pageset2_map);
6860     +
6861     + if (test_action_state(SUSPEND_PAGESET2_FULL))
6862     + pageset2_full();
6863     + else {
6864     + read_lock(&tasklist_lock);
6865     + for_each_process(p) {
6866     + if (!p->mm || (p->flags & PF_BORROWED_MM))
6867     + continue;
6868     +
6869     + suspend_mark_task_as_pageset(p, PAGESET2);
6870     + }
6871     + read_unlock(&tasklist_lock);
6872     + }
6873     +
6874     + /*
6875     + * Now we count all userspace processes (with task->mm) marked PF_NOFREEZE.
6876     + */
6877     + read_lock(&tasklist_lock);
6878     + for_each_process(p)
6879     + if ((p->flags & PF_NOFREEZE) || p == current)
6880     + task_count++;
6881     + read_unlock(&tasklist_lock);
6882     +
6883     + /*
6884     + * Allocate attention list structs.
6885     + */
6886     + for (i = 0; i < task_count; i++) {
6887     + struct attention_list *this =
6888     + kmalloc(sizeof(struct attention_list), GFP_ATOMIC);
6889     + if (!this) {
6890     + printk("Failed to allocate slab for attention list.\n");
6891     + set_result_state(SUSPEND_ABORTED);
6892     + goto free_attention_list;
6893     + }
6894     + this->next = NULL;
6895     + if (attention_list) {
6896     + last->next = this;
6897     + last = this;
6898     + } else
6899     + attention_list = last = this;
6900     + }
6901     +
6902     + next = attention_list;
6903     + read_lock(&tasklist_lock);
6904     + for_each_process(p)
6905     + if ((p->flags & PF_NOFREEZE) || p == current) {
6906     + next->task = p;
6907     + next = next->next;
6908     + }
6909     + read_unlock(&tasklist_lock);
6910     +
6911     + /*
6912     + * Because the tasks in attention_list are ones related to suspending,
6913     + * we know that they won't go away under us.
6914     + */
6915     +
6916     +free_attention_list:
6917     + while (attention_list) {
6918     + if (!test_result_state(SUSPEND_ABORTED))
6919     + suspend_mark_task_as_pageset(attention_list->task, PAGESET1);
6920     + last = attention_list;
6921     + attention_list = attention_list->next;
6922     + kfree(last);
6923     + }
6924     +}
6925     +
6926     +void suspend_reset_alt_image_pageset2_pfn(void)
6927     +{
6928     + ps2_pfn = max_pfn + 1;
6929     +}
6930     +
6931     +static struct page *first_conflicting_page;
6932     +
6933     +/*
6934     + * free_conflicting_pages
6935     + */
6936     +
6937     +void free_conflicting_pages(void)
6938     +{
6939     + while (first_conflicting_page) {
6940     + struct page *next = *((struct page **) kmap(first_conflicting_page));
6941     + kunmap(first_conflicting_page);
6942     + __free_page(first_conflicting_page);
6943     + first_conflicting_page = next;
6944     + }
6945     +}
6946     +
6947     +/* __suspend_get_nonconflicting_page
6948     + *
6949     + * Description: Gets order zero pages that won't be overwritten
6950     + * while copying the original pages.
6951     + */
6952     +
6953     +struct page * ___suspend_get_nonconflicting_page(int can_be_highmem)
6954     +{
6955     + struct page *page;
6956     + int flags = GFP_ATOMIC | __GFP_NOWARN | __GFP_ZERO;
6957     + if (can_be_highmem)
6958     + flags |= __GFP_HIGHMEM;
6959     +
6960     +
6961     + if (test_suspend_state(SUSPEND_LOADING_ALT_IMAGE) && pageset2_map &&
6962     + (ps2_pfn < (max_pfn + 2))) {
6963     + /*
6964     + * ps2_pfn = max_pfn + 1 when yet to find first ps2 pfn that can
6965     + * be used.
6966     + * = 0..max_pfn when going through list.
6967     + * = max_pfn + 2 when gone through whole list.
6968     + */
6969     + do {
6970     + ps2_pfn = get_next_bit_on(pageset2_map, ps2_pfn);
6971     + if (ps2_pfn <= max_pfn) {
6972     + page = pfn_to_page(ps2_pfn);
6973     + if (!PagePageset1(page) &&
6974     + (can_be_highmem || !PageHighMem(page)))
6975     + return page;
6976     + } else
6977     + ps2_pfn++;
6978     + } while (ps2_pfn < max_pfn);
6979     + }
6980     +
6981     + do {
6982     + page = alloc_pages(flags, 0);
6983     + if (!page) {
6984     + printk("Failed to get nonconflicting page.\n");
6985     + return 0;
6986     + }
6987     + if (PagePageset1(page)) {
6988     + struct page **next = (struct page **) kmap(page);
6989     + *next = first_conflicting_page;
6990     + first_conflicting_page = page;
6991     + kunmap(page);
6992     + }
6993     + } while(PagePageset1(page));
6994     +
6995     + return page;
6996     +}
6997     +
6998     +unsigned long __suspend_get_nonconflicting_page(void)
6999     +{
7000     + struct page *page = ___suspend_get_nonconflicting_page(0);
7001     + return page ? (unsigned long) page_address(page) : 0;
7002     +}
7003     +
7004     +struct pbe *get_next_pbe(struct page **page_ptr, struct pbe *this_pbe, int highmem)
7005     +{
7006     + if (((((unsigned long) this_pbe) & (PAGE_SIZE - 1))
7007     + + 2 * sizeof(struct pbe)) > PAGE_SIZE) {
7008     + struct page *new_page =
7009     + ___suspend_get_nonconflicting_page(highmem);
7010     + if (!new_page)
7011     + return ERR_PTR(-ENOMEM);
7012     + this_pbe = (struct pbe *) kmap(new_page);
7013     + memset(this_pbe, 0, PAGE_SIZE);
7014     + *page_ptr = new_page;
7015     + } else
7016     + this_pbe++;
7017     +
7018     + return this_pbe;
7019     +}
7020     +
7021     +/* get_pageset1_load_addresses
7022     + *
7023     + * Description: We check here that pagedir & pages it points to won't collide
7024     + * with pages where we're going to restore from the loaded pages
7025     + * later.
7026     + * Returns: Zero on success; one or -ENOMEM if we couldn't find enough
7027     + * pages (shouldn't happen).
7028     + */
7029     +
7030     +int suspend_get_pageset1_load_addresses(void)
7031     +{
7032     + int pfn, highallocd = 0, lowallocd = 0;
7033     + int low_needed = pagedir1.size - get_highmem_size(pagedir1);
7034     + int high_needed = get_highmem_size(pagedir1);
7035     + int low_pages_for_highmem = 0;
7036     + unsigned long flags = GFP_ATOMIC | __GFP_NOWARN | __GFP_HIGHMEM;
7037     + struct page *page, *high_pbe_page = NULL, *last_high_pbe_page = NULL,
7038     + *low_pbe_page;
7039     + struct pbe **last_low_pbe_ptr = &restore_pblist,
7040     + **last_high_pbe_ptr = &restore_highmem_pblist,
7041     + *this_low_pbe = NULL, *this_high_pbe = NULL;
7042     + int orig_low_pfn = max_pfn + 1, orig_high_pfn = max_pfn + 1;
7043     + int high_pbes_done=0, low_pbes_done=0;
7044     + int low_direct = 0, high_direct = 0;
7045     + int high_to_free, low_to_free;
7046     +
7047     + /* First, allocate pages for the start of our pbe lists. */
7048     + if (high_needed) {
7049     + high_pbe_page = ___suspend_get_nonconflicting_page(1);
7050     + if (!high_pbe_page)
7051     + return 1;
7052     + this_high_pbe = (struct pbe *) kmap(high_pbe_page);
7053     + memset(this_high_pbe, 0, PAGE_SIZE);
7054     + }
7055     +
7056     + low_pbe_page = ___suspend_get_nonconflicting_page(0);
7057     + if (!low_pbe_page)
7058     + return 1;
7059     + this_low_pbe = (struct pbe *) page_address(low_pbe_page);
7060     +
7061     + /*
7062     + * Next, allocate all possible memory to find where we can
7063     + * load data directly into destination pages. I'd like to do
7064     + * this in bigger chunks, but then we can't free pages
7065     + * individually later.
7066     + */
7067     +
7068     + do {
7069     + page = alloc_pages(flags, 0);
7070     + if (page)
7071     + SetPagePageset1Copy(page);
7072     + } while (page);
7073     +
7074     + /*
7075     + * Find out how many high- and lowmem pages we allocated above,
7076     + * and how many pages we can reload directly to their original
7077     + * location.
7078     + */
7079     + BITMAP_FOR_EACH_SET(pageset1_copy_map, pfn) {
7080     + int is_high;
7081     + page = pfn_to_page(pfn);
7082     + is_high = PageHighMem(page);
7083     +
7084     + if (PagePageset1(page)) {
7085     + if (test_action_state(SUSPEND_NO_DIRECT_LOAD)) {
7086     + ClearPagePageset1Copy(page);
7087     + __free_page(page);
7088     + continue;
7089     + } else {
7090     + if (is_high)
7091     + high_direct++;
7092     + else
7093     + low_direct++;
7094     + }
7095     + } else {
7096     + if (is_high)
7097     + highallocd++;
7098     + else
7099     + lowallocd++;
7100     + }
7101     + }
7102     +
7103     + high_needed -= high_direct;
7104     + low_needed -= low_direct;
7105     +
7106     + /*
7107     + * Do we need to use some lowmem pages for the copies of highmem
7108     + * pages?
7109     + */
7110     + if (high_needed > highallocd) {
7111     + low_pages_for_highmem = high_needed - highallocd;
7112     + high_needed -= low_pages_for_highmem;
7113     + low_needed += low_pages_for_highmem;
7114     + }
7115     +
7116     + high_to_free = highallocd - high_needed;
7117     + low_to_free = lowallocd - low_needed;
7118     +
7119     + /*
7120     + * Now generate our pbes (which will be used for the atomic restore),
7121     + * and free unneeded pages.
7122     + */
7123     + BITMAP_FOR_EACH_SET(pageset1_copy_map, pfn) {
7124     + int is_high;
7125     + page = pfn_to_page(pfn);
7126     + is_high = PageHighMem(page);
7127     +
7128     + if (PagePageset1(page))
7129     + continue;
7130     +
7131     + /* Free the page? */
7132     + if ((is_high && high_to_free) ||
7133     + (!is_high && low_to_free)) {
7134     + ClearPagePageset1Copy(page);
7135     + __free_page(page);
7136     + if (is_high)
7137     + high_to_free--;
7138     + else
7139     + low_to_free--;
7140     + continue;
7141     + }
7142     +
7143     + /* Nope. We're going to use this page. Add a pbe. */
7144     + if (is_high || low_pages_for_highmem) {
7145     + struct page *orig_page;
7146     + high_pbes_done++;
7147     + if (!is_high)
7148     + low_pages_for_highmem--;
7149     + do {
7150     + orig_high_pfn = get_next_bit_on(pageset1_map,
7151     + orig_high_pfn);
7152     + BUG_ON(orig_high_pfn > max_pfn);
7153     + orig_page = pfn_to_page(orig_high_pfn);
7154     + } while(!PageHighMem(orig_page) || load_direct(orig_page));
7155     +
7156     + this_high_pbe->orig_address = orig_page;
7157     + this_high_pbe->address = page;
7158     + this_high_pbe->next = NULL;
7159     + if (last_high_pbe_page != high_pbe_page) {
7160     + *last_high_pbe_ptr = (struct pbe *) high_pbe_page;
7161     + if (!last_high_pbe_page)
7162     + last_high_pbe_page = high_pbe_page;
7163     + } else
7164     + *last_high_pbe_ptr = this_high_pbe;
7165     + last_high_pbe_ptr = &this_high_pbe->next;
7166     + if (last_high_pbe_page != high_pbe_page) {
7167     + kunmap(last_high_pbe_page);
7168     + last_high_pbe_page = high_pbe_page;
7169     + }
7170     + this_high_pbe = get_next_pbe(&high_pbe_page, this_high_pbe, 1);
7171     + if (IS_ERR(this_high_pbe)) {
7172     + printk("This high pbe is an error.\n");
7173     + return -ENOMEM;
7174     + }
7175     + } else {
7176     + struct page *orig_page;
7177     + low_pbes_done++;
7178     + do {
7179     + orig_low_pfn = get_next_bit_on(pageset1_map,
7180     + orig_low_pfn);
7181     + BUG_ON(orig_low_pfn > max_pfn);
7182     + orig_page = pfn_to_page(orig_low_pfn);
7183     + } while(PageHighMem(orig_page) || load_direct(orig_page));
7184     +
7185     + this_low_pbe->orig_address = page_address(orig_page);
7186     + this_low_pbe->address = page_address(page);
7187     + this_low_pbe->next = NULL;
7188     + *last_low_pbe_ptr = this_low_pbe;
7189     + last_low_pbe_ptr = &this_low_pbe->next;
7190     + this_low_pbe = get_next_pbe(&low_pbe_page, this_low_pbe, 0);
7191     + if (IS_ERR(this_low_pbe)) {
7192     + printk("this_low_pbe is an error.\n");
7193     + return -ENOMEM;
7194     + }
7195     + }
7196     + }
7197     +
7198     + if (high_pbe_page)
7199     + kunmap(high_pbe_page);
7200     +
7201     + if (last_high_pbe_page != high_pbe_page) {
7202     + if (last_high_pbe_page)
7203     + kunmap(last_high_pbe_page);
7204     + __free_page(high_pbe_page);
7205     + }
7206     +
7207     + free_conflicting_pages();
7208     +
7209     + return 0;
7210     +}
7211     diff --git a/kernel/power/pagedir.h b/kernel/power/pagedir.h
7212     new file mode 100644
7213     index 0000000..2ae395f
7214     --- /dev/null
7215     +++ b/kernel/power/pagedir.h
7216     @@ -0,0 +1,51 @@
7217     +/*
7218     + * kernel/power/pagedir.h
7219     + *
7220     + * Copyright (C) 2006-2007 Nigel Cunningham (nigel at suspend2 net)
7221     + *
7222     + * This file is released under the GPLv2.
7223     + *
7224     + * Declarations for routines for handling pagesets.
7225     + */
7226     +
7227     +#ifndef KERNEL_POWER_PAGEDIR_H
7228     +#define KERNEL_POWER_PAGEDIR_H
7229     +
7230     +/* Pagedir
7231     + *
7232     + * Contains the metadata for a set of pages saved in the image.
7233     + */
7234     +
7235     +struct pagedir {
7236     + int id;
7237     + int size;
7238     +#ifdef CONFIG_HIGHMEM
7239     + int size_high;
7240     +#endif
7241     +};
7242     +
7243     +#ifdef CONFIG_HIGHMEM
7244     +#define get_highmem_size(pagedir) (pagedir.size_high)
7245     +#define set_highmem_size(pagedir, sz) do { pagedir.size_high = sz; } while(0)
7246     +#define inc_highmem_size(pagedir) do { pagedir.size_high++; } while(0)
7247     +#define get_lowmem_size(pagedir) (pagedir.size - pagedir.size_high)
7248     +#else
7249     +#define get_highmem_size(pagedir) (0)
7250     +#define set_highmem_size(pagedir, sz) do { } while(0)
7251     +#define inc_highmem_size(pagedir) do { } while(0)
7252     +#define get_lowmem_size(pagedir) (pagedir.size)
7253     +#endif
7254     +
7255     +extern struct pagedir pagedir1, pagedir2;
7256     +
7257     +extern void suspend_copy_pageset1(void);
7258     +
7259     +extern void suspend_mark_pages_for_pageset2(void);
7260     +
7261     +extern int suspend_get_pageset1_load_addresses(void);
7262     +
7263     +extern unsigned long __suspend_get_nonconflicting_page(void);
7264     +struct page * ___suspend_get_nonconflicting_page(int can_be_highmem);
7265     +
7266     +extern void suspend_reset_alt_image_pageset2_pfn(void);
7267     +#endif
7268     diff --git a/kernel/power/pageflags.c b/kernel/power/pageflags.c
7269     new file mode 100644
7270     index 0000000..dccac97
7271     --- /dev/null
7272     +++ b/kernel/power/pageflags.c
7273     @@ -0,0 +1,149 @@
7274     +/*
7275     + * kernel/power/pageflags.c
7276     + *
7277     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
7278     + *
7279     + * This file is released under the GPLv2.
7280     + *
7281     + * Routines for serialising and relocating pageflags in which we
7282     + * store our image metadata.
7283     + */
7284     +
7285     +#include <linux/kernel.h>
7286     +#include <linux/mm.h>
7287     +#include <linux/module.h>
7288     +#include <linux/bitops.h>
7289     +#include <linux/list.h>
7290     +#include <linux/suspend.h>
7291     +#include "pageflags.h"
7292     +#include "modules.h"
7293     +#include "pagedir.h"
7294     +#include "suspend.h"
7295     +
7296     +dyn_pageflags_t pageset2_map;
7297     +dyn_pageflags_t page_resave_map;
7298     +dyn_pageflags_t io_map;
7299     +
7300     +static int pages_for_zone(struct zone *zone)
7301     +{
7302     + return DIV_ROUND_UP(zone->spanned_pages, (PAGE_SIZE << 3));
7303     +}
7304     +
7305     +int suspend_pageflags_space_needed(void)
7306     +{
7307     + int total = 0;
7308     + struct zone *zone;
7309     +
7310     + for_each_zone(zone)
7311     + if (populated_zone(zone))
7312     + total += sizeof(int) * 3 + pages_for_zone(zone) * PAGE_SIZE;
7313     +
7314     + total += sizeof(int);
7315     +
7316     + return total;
7317     +}
7318     +
7319     +/* save_dyn_pageflags
7320     + *
7321     + * Description: Save a set of pageflags.
7322     + * Arguments: dyn_pageflags_t *: Pointer to the bitmap being saved.
7323     + */
7324     +
7325     +void save_dyn_pageflags(dyn_pageflags_t pagemap)
7326     +{
7327     + int i, zone_idx, size, node = 0;
7328     + struct zone *zone;
7329     + struct pglist_data *pgdat;
7330     +
7331     + if (!*pagemap)
7332     + return;
7333     +
7334     + for_each_online_pgdat(pgdat) {
7335     + for (zone_idx = 0; zone_idx < MAX_NR_ZONES; zone_idx++) {
7336     + zone = &pgdat->node_zones[zone_idx];
7337     +
7338     + if (!populated_zone(zone))
7339     + continue;
7340     +
7341     + suspendActiveAllocator->rw_header_chunk(WRITE, NULL,
7342     + (char *) &node, sizeof(int));
7343     + suspendActiveAllocator->rw_header_chunk(WRITE, NULL,
7344     + (char *) &zone_idx, sizeof(int));
7345     + size = pages_for_zone(zone);
7346     + suspendActiveAllocator->rw_header_chunk(WRITE, NULL,
7347     + (char *) &size, sizeof(int));
7348     +
7349     + for (i = 0; i < size; i++)
7350     + suspendActiveAllocator->rw_header_chunk(WRITE,
7351     + NULL, (char *) pagemap[node][zone_idx][i],
7352     + PAGE_SIZE);
7353     + }
7354     + node++;
7355     + }
7356     + node = -1;
7357     + suspendActiveAllocator->rw_header_chunk(WRITE, NULL,
7358     + (char *) &node, sizeof(int));
7359     +}
7360     +
7361     +/* load_dyn_pageflags
7362     + *
7363     + * Description: Load a set of pageflags.
7364     + * Arguments: dyn_pageflags_t *: Pointer to the bitmap being loaded.
7365     + * (It must be allocated before calling this routine).
7366     + */
7367     +
7368     +int load_dyn_pageflags(dyn_pageflags_t pagemap)
7369     +{
7370     + int i, zone_idx, zone_check = 0, size, node = 0;
7371     + struct zone *zone;
7372     + struct pglist_data *pgdat;
7373     +
7374     + if (!pagemap)
7375     + return 1;
7376     +
7377     + for_each_online_pgdat(pgdat) {
7378     + for (zone_idx = 0; zone_idx < MAX_NR_ZONES; zone_idx++) {
7379     + zone = &pgdat->node_zones[zone_idx];
7380     +
7381     + if (!populated_zone(zone))
7382     + continue;
7383     +
7384     + /* Same node? */
7385     + suspendActiveAllocator->rw_header_chunk(READ, NULL,
7386     + (char *) &zone_check, sizeof(int));
7387     + if (zone_check != node) {
7388     + printk("Node read (%d) != node (%d).\n",
7389     + zone_check, node);
7390     + return 1;
7391     + }
7392     +
7393     + /* Same zone? */
7394     + suspendActiveAllocator->rw_header_chunk(READ, NULL,
7395     + (char *) &zone_check, sizeof(int));
7396     + if (zone_check != zone_idx) {
7397     + printk("Zone read (%d) != zone (%d).\n",
7398     + zone_check, zone_idx);
7399     + return 1;
7400     + }
7401     +
7402     +
7403     + suspendActiveAllocator->rw_header_chunk(READ, NULL,
7404     + (char *) &size, sizeof(int));
7405     +
7406     + for (i = 0; i < size; i++)
7407     + suspendActiveAllocator->rw_header_chunk(READ, NULL,
7408     + (char *) pagemap[node][zone_idx][i],
7409     + PAGE_SIZE);
7410     + }
7411     + node++;
7412     + }
7413     + suspendActiveAllocator->rw_header_chunk(READ, NULL, (char *) &zone_check,
7414     + sizeof(int));
7415     + if (zone_check != -1) {
7416     + printk("Didn't read end of dyn pageflag data marker.(%x)\n",
7417     + zone_check);
7418     + return 1;
7419     + }
7420     +
7421     + return 0;
7422     +}
7423     diff --git a/kernel/power/pageflags.h b/kernel/power/pageflags.h
7424     new file mode 100644
7425     index 0000000..405cbce
7426     --- /dev/null
7427     +++ b/kernel/power/pageflags.h
7428     @@ -0,0 +1,49 @@
7429     +/*
7430     + * kernel/power/pageflags.h
7431     + *
7432     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
7433     + *
7434     + * This file is released under the GPLv2.
7435     + *
7436     + * Suspend2 needs a few pageflags while working that aren't otherwise
7437     + * used. To save the struct page pageflags, we dynamically allocate
7438     + * a bitmap and use that. These are the only non order-0 allocations
7439     + * we do.
7440     + *
7441     + * NOTE!!!
7442     + * We assume that PAGE_SIZE - sizeof(void *) is a multiple of
7443     + * sizeof(unsigned long). Is this ever false?
7444     + */
7445     +
7446     +#include <linux/dyn_pageflags.h>
7447     +#include <linux/suspend.h>
7448     +
7449     +extern dyn_pageflags_t pageset1_map;
7450     +extern dyn_pageflags_t pageset1_copy_map;
7451     +extern dyn_pageflags_t pageset2_map;
7452     +extern dyn_pageflags_t page_resave_map;
7453     +extern dyn_pageflags_t io_map;
7454     +
7455     +#define PagePageset1(page) (test_dynpageflag(&pageset1_map, page))
7456     +#define SetPagePageset1(page) (set_dynpageflag(&pageset1_map, page))
7457     +#define ClearPagePageset1(page) (clear_dynpageflag(&pageset1_map, page))
7458     +
7459     +#define PagePageset1Copy(page) (test_dynpageflag(&pageset1_copy_map, page))
7460     +#define SetPagePageset1Copy(page) (set_dynpageflag(&pageset1_copy_map, page))
7461     +#define ClearPagePageset1Copy(page) (clear_dynpageflag(&pageset1_copy_map, page))
7462     +
7463     +#define PagePageset2(page) (test_dynpageflag(&pageset2_map, page))
7464     +#define SetPagePageset2(page) (set_dynpageflag(&pageset2_map, page))
7465     +#define ClearPagePageset2(page) (clear_dynpageflag(&pageset2_map, page))
7466     +
7467     +#define PageWasRW(page) (test_dynpageflag(&pageset2_map, page))
7468     +#define SetPageWasRW(page) (set_dynpageflag(&pageset2_map, page))
7469     +#define ClearPageWasRW(page) (clear_dynpageflag(&pageset2_map, page))
7470     +
7471     +#define PageResave(page) (page_resave_map ? test_dynpageflag(&page_resave_map, page) : 0)
7472     +#define SetPageResave(page) (set_dynpageflag(&page_resave_map, page))
7473     +#define ClearPageResave(page) (clear_dynpageflag(&page_resave_map, page))
7474     +
7475     +extern void save_dyn_pageflags(dyn_pageflags_t pagemap);
7476     +extern int load_dyn_pageflags(dyn_pageflags_t pagemap);
7477     +extern int suspend_pageflags_space_needed(void);
7478     diff --git a/kernel/power/power.h b/kernel/power/power.h
7479     index eb461b8..a4d6550 100644
7480     --- a/kernel/power/power.h
7481     +++ b/kernel/power/power.h
7482     @@ -1,5 +1,11 @@
7483     +/*
7484     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
7485     + */
7486     +
7487     #include <linux/suspend.h>
7488     #include <linux/utsname.h>
7489     +#include "suspend.h"
7490     +#include "suspend2_builtin.h"
7491    
7492     struct swsusp_info {
7493     struct new_utsname uts;
7494     @@ -15,11 +21,15 @@ struct swsusp_info {
7495    
7496     #ifdef CONFIG_SOFTWARE_SUSPEND
7497     extern int pm_suspend_disk(void);
7498     -
7499     +extern char resume_file[256];
7500     #else
7501     static inline int pm_suspend_disk(void)
7502     {
7503     - return -EPERM;
7504     +#ifdef CONFIG_SUSPEND2
7505     + return suspend2_try_suspend(1);
7506     +#else
7507     + return -ENODEV;
7508     +#endif
7509     }
7510     #endif
7511    
7512     @@ -40,6 +50,8 @@ extern struct subsystem power_subsys;
7513     /* References to section boundaries */
7514     extern const void __nosave_begin, __nosave_end;
7515    
7516     +extern struct pbe *restore_pblist;
7517     +
7518     /* Preferred image size in bytes (default 500 MB) */
7519     extern unsigned long image_size;
7520     extern int in_suspend;
7521     @@ -177,3 +189,11 @@ extern int suspend_enter(suspend_state_t state);
7522     struct timeval;
7523     extern void swsusp_show_speed(struct timeval *, struct timeval *,
7524     unsigned int, char *);
7525     +extern struct page *saveable_page(unsigned long pfn);
7526     +#ifdef CONFIG_HIGHMEM
7527     +extern struct page *saveable_highmem_page(unsigned long pfn);
7528     +#else
7529     +static inline void *saveable_highmem_page(unsigned long pfn) { return NULL; }
7530     +#endif
7531     +
7532     +#define PBES_PER_PAGE (PAGE_SIZE / sizeof(struct pbe))
7533     diff --git a/kernel/power/power_off.c b/kernel/power/power_off.c
7534     new file mode 100644
7535     index 0000000..7db6186
7536     --- /dev/null
7537     +++ b/kernel/power/power_off.c
7538     @@ -0,0 +1,109 @@
7539     +/*
7540     + * kernel/power/power_off.c
7541     + *
7542     + * Copyright (C) 2006-2007 Nigel Cunningham (nigel at suspend2 net)
7543     + *
7544     + * This file is released under the GPLv2.
7545     + *
7546     + * Support for powering down.
7547     + */
7548     +
7549     +#include <linux/device.h>
7550     +#include <linux/suspend.h>
7551     +#include <linux/mm.h>
7552     +#include <linux/pm.h>
7553     +#include <linux/reboot.h>
7554     +#include <linux/cpu.h>
7555     +#include <linux/console.h>
7556     +#include "suspend.h"
7557     +#include "ui.h"
7558     +#include "power_off.h"
7559     +#include "power.h"
7560     +
7561     +unsigned long suspend2_poweroff_method = 0; /* 0 - Kernel power off */
7562     +
7563     +/*
7564     + * suspend2_power_down
7565     + * Functionality : Powers down or reboots the computer once the image
7566     + * has been written to disk.
7567     + * Key Assumptions : Able to reboot/power down via code called or that
7568     + * the warning emitted if the calls fail will be visible
7569     + * to the user (ie printk resumes devices).
7570     + * Called From : do_suspend2_suspend_2
7571     + */
7572     +
7573     +void suspend2_power_down(void)
7574     +{
7575     + int result = 0;
7576     +
7577     + if (test_action_state(SUSPEND_REBOOT)) {
7578     + suspend_prepare_status(DONT_CLEAR_BAR, "Ready to reboot.");
7579     + kernel_restart(NULL);
7580     + }
7581     +
7582     + suspend_prepare_status(DONT_CLEAR_BAR, "Powering down.");
7583     +
7584     + switch (suspend2_poweroff_method) {
7585     + case 0:
7586     + break;
7587     + case 3:
7588     + suspend_console();
7589     +
7590     + if (device_suspend(PMSG_SUSPEND)) {
7591     + suspend_prepare_status(DONT_CLEAR_BAR, "Device "
7592     + "suspend failure. Doing poweroff.");
7593     + goto ResumeConsole;
7594     + }
7595     +
7596     + if (!pm_ops ||
7597     + (pm_ops->prepare && pm_ops->prepare(PM_SUSPEND_MEM)))
7598     + goto DeviceResume;
7599     +
7600     + if (test_action_state(SUSPEND_LATE_CPU_HOTPLUG) &&
7601     + disable_nonboot_cpus())
7602     + goto PmOpsFinish;
7603     +
7604     + if (!suspend_enter(PM_SUSPEND_MEM))
7605     + result = 1;
7606     +
7607     + if (test_action_state(SUSPEND_LATE_CPU_HOTPLUG))
7608     + enable_nonboot_cpus();
7609     +
7610     +PmOpsFinish:
7611     + if (pm_ops->finish)
7612     + pm_ops->finish(PM_SUSPEND_MEM);
7613     +
7614     +DeviceResume:
7615     + device_resume();
7616     +
7617     +ResumeConsole:
7618     + resume_console();
7619     +
7620     + /* If suspended to ram and later woke. */
7621     + if (result)
7622     + return;
7623     + break;
7624     + case 4:
7625     + case 5:
7626     + if (!pm_ops ||
7627     + (pm_ops->prepare && pm_ops->prepare(PM_SUSPEND_MAX)))
7628     + break;
7629     +
7630     + kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
7631     + suspend_enter(suspend2_poweroff_method);
7632     +
7633     + /* Failed. Fall back to kernel_power_off etc. */
7634     + if (pm_ops->finish)
7635     + pm_ops->finish(PM_SUSPEND_MAX);
7636     + }
7637     +
7638     + suspend_prepare_status(DONT_CLEAR_BAR, "Falling back to alternate power off method.");
7639     + kernel_power_off();
7640     + kernel_halt();
7641     + suspend_prepare_status(DONT_CLEAR_BAR, "Powerdown failed.");
7642     + while (1)
7643     + cpu_relax();
7644     +}
7645     +
7646     +EXPORT_SYMBOL_GPL(suspend2_poweroff_method);
7647     +EXPORT_SYMBOL_GPL(suspend2_power_down);
7648     diff --git a/kernel/power/power_off.h b/kernel/power/power_off.h
7649     new file mode 100644
7650     index 0000000..40e0c46
7651     --- /dev/null
7652     +++ b/kernel/power/power_off.h
7653     @@ -0,0 +1,13 @@
7654     +/*
7655     + * kernel/power/power_off.h
7656     + *
7657     + * Copyright (C) 2006-2007 Nigel Cunningham (nigel at suspend2 net)
7658     + *
7659     + * This file is released under the GPLv2.
7660     + *
7661     + * Support for powering down.
7662     + */
7663     +
7664     +int suspend_pm_state_finish(void);
7665     +void suspend2_power_down(void);
7666     +extern unsigned long suspend2_poweroff_method;
7667     diff --git a/kernel/power/prepare_image.c b/kernel/power/prepare_image.c
7668     new file mode 100644
7669     index 0000000..779341b
7670     --- /dev/null
7671     +++ b/kernel/power/prepare_image.c
7672     @@ -0,0 +1,798 @@
7673     +/*
7674     + * kernel/power/prepare_image.c
7675     + *
7676     + * Copyright (C) 2003-2007 Nigel Cunningham (nigel at suspend2 net)
7677     + *
7678     + * This file is released under the GPLv2.
7679     + *
7680     + * We need to eat memory until we can:
7681     + * 1. Perform the save without changing anything (RAM_NEEDED < #pages)
7682     + * 2. Fit it all in available space (suspendActiveAllocator->available_space() >=
7683     + * main_storage_needed())
7684     + * 3. Reload the pagedir and pageset1 to places that don't collide with their
7685     + * final destinations, not knowing to what extent the resumed kernel will
7686     + * overlap with the one loaded at boot time. I think the resumed kernel
7687     + * should overlap completely, but I don't want to rely on this as it is
7688     + * an unproven assumption. We therefore assume there will be no overlap at
7689     + * all (worst case).
7690     + * 4. Meet the user's requested limit (if any) on the size of the image.
7691     + * The limit is in MB, so pages/256 (assuming 4K pages).
7692     + *
7693     + */
7694     +
7695     +#include <linux/module.h>
7696     +#include <linux/highmem.h>
7697     +#include <linux/freezer.h>
7698     +#include <linux/hardirq.h>
7699     +#include <linux/mmzone.h>
7700     +#include <linux/console.h>
7701     +
7702     +#include "pageflags.h"
7703     +#include "modules.h"
7704     +#include "io.h"
7705     +#include "ui.h"
7706     +#include "extent.h"
7707     +#include "prepare_image.h"
7708     +#include "block_io.h"
7709     +#include "suspend.h"
7710     +#include "checksum.h"
7711     +#include "sysfs.h"
7712     +
7713     +static int num_nosave = 0;
7714     +static int header_space_allocated = 0;
7715     +static int main_storage_allocated = 0;
7716     +static int storage_available = 0;
7717     +int extra_pd1_pages_allowance = MIN_EXTRA_PAGES_ALLOWANCE;
7718     +int image_size_limit = 0;
7719     +
7720     +/*
7721     + * The atomic copy of pageset1 is stored in pageset2 pages.
7722     + * But if pageset1 is larger (normally only just after boot),
7723     + * we need to allocate extra pages to store the atomic copy.
7724     + * The following data struct and functions are used to handle
7725     + * the allocation and freeing of that memory.
7726     + */
7727     +
7728     +static int extra_pages_allocated;
7729     +
7730     +struct extras {
7731     + struct page *page;
7732     + int order;
7733     + struct extras *next;
7734     +};
7735     +
7736     +static struct extras *extras_list;
7737     +
7738     +/* suspend_free_extra_pagedir_memory
7739     + *
7740     + * Description: Free previously allocated extra pagedir memory.
7741     + */
7742     +void suspend_free_extra_pagedir_memory(void)
7743     +{
7744     + /* Free allocated pages */
7745     + while (extras_list) {
7746     + struct extras *this = extras_list;
7747     + int i;
7748     +
7749     + extras_list = this->next;
7750     +
7751     + for (i = 0; i < (1 << this->order); i++)
7752     + ClearPageNosave(this->page + i);
7753     +
7754     + __free_pages(this->page, this->order);
7755     + kfree(this);
7756     + }
7757     +
7758     + extra_pages_allocated = 0;
7759     +}
7760     +
7761     +/* suspend_allocate_extra_pagedir_memory
7762     + *
7763     + * Description: Allocate memory for making the atomic copy of pagedir1 in the
7764     + * case where it is bigger than pagedir2.
7765     + * Arguments: int extra_pages_needed: Total number of extra pages wanted.
7766     + * Result: int. Number of extra pages we now have allocated.
7767     + */
7768     +static int suspend_allocate_extra_pagedir_memory(int extra_pages_needed)
7769     +{
7770     + int j, order, num_to_alloc = extra_pages_needed - extra_pages_allocated;
7771     + unsigned long flags = GFP_ATOMIC | __GFP_NOWARN;
7772     +
7773     + if (num_to_alloc < 1)
7774     + return 0;
7775     +
7776     + order = fls(num_to_alloc);
7777     + if (order >= MAX_ORDER)
7778     + order = MAX_ORDER - 1;
7779     +
7780     + while (num_to_alloc) {
7781     + struct page *newpage;
7782     + unsigned long virt;
7783     + struct extras *extras_entry;
7784     +
7785     + while ((1 << order) > num_to_alloc)
7786     + order--;
7787     +
7788     + extras_entry = (struct extras *) kmalloc(sizeof(struct extras),
7789     + GFP_ATOMIC);
7790     +
7791     + if (!extras_entry)
7792     + return extra_pages_allocated;
7793     +
7794     + virt = __get_free_pages(flags, order);
7795     + while (!virt && order) {
7796     + order--;
7797     + virt = __get_free_pages(flags, order);
7798     + }
7799     +
7800     + if (!virt) {
7801     + kfree(extras_entry);
7802     + return extra_pages_allocated;
7803     + }
7804     +
7805     + newpage = virt_to_page(virt);
7806     +
7807     + extras_entry->page = newpage;
7808     + extras_entry->order = order;
7809     + extras_entry->next = NULL;
7810     +
7811     + if (extras_list)
7812     + extras_entry->next = extras_list;
7813     +
7814     + extras_list = extras_entry;
7815     +
7816     + for (j = 0; j < (1 << order); j++) {
7817     + SetPageNosave(newpage + j);
7818     + SetPagePageset1Copy(newpage + j);
7819     + }
7820     +
7821     + extra_pages_allocated += (1 << order);
7822     + num_to_alloc -= (1 << order);
7823     + }
7824     +
7825     + return extra_pages_allocated;
7826     +}
7827     +
7828     +/*
7829     + * real_nr_free_pages: Count free pages, including pcp pages, in the zones
7830     + * selected by zone_idx_mask (one bit per zone_idx(); all_zones_mask for all).
7831     + */
7832     +int real_nr_free_pages(unsigned long zone_idx_mask)
7833     +{
7834     + struct zone *zone;
7835     + int result = 0, i = 0, cpu;
7836     +
7837     + /* PCP lists */
7838     + for_each_zone(zone) {
7839     + if (!populated_zone(zone))
7840     + continue;
7841     +
7842     + if (!(zone_idx_mask & (1 << zone_idx(zone))))
7843     + continue;
7844     +
7845     + for_each_online_cpu(cpu) {
7846     + struct per_cpu_pageset *pset = zone_pcp(zone, cpu);
7847     +
7848     + for (i = 0; i < ARRAY_SIZE(pset->pcp); i++) {
7849     + struct per_cpu_pages *pcp;
7850     +
7851     + pcp = &pset->pcp[i];
7852     + result += pcp->count;
7853     + }
7854     + }
7855     +
7856     + result += zone_page_state(zone, NR_FREE_PAGES);
7857     + }
7858     + return result;
7859     +}
7860     +
7861     +/*
7862     + * Discover how much extra memory will be required by the drivers
7863     + * when they're asked to suspend. We can then ensure that amount
7864     + * of memory is available when we really want it.
7865     + */
7866     +static void get_extra_pd1_allowance(void)
7867     +{
7868     + int orig_num_free = real_nr_free_pages(all_zones_mask), final;
7869     +
7870     + suspend_prepare_status(CLEAR_BAR, "Finding allowance for drivers.");
7871     +
7872     + suspend_console();
7873     + device_suspend(PMSG_FREEZE);
7874     + local_irq_disable(); /* irqs might have been re-enabled on us */
7875     + device_power_down(PMSG_FREEZE);
7876     +
7877     + final = real_nr_free_pages(all_zones_mask);
7878     +
7879     + device_power_up();
7880     + local_irq_enable();
7881     + device_resume();
7882     + resume_console();
7883     +
7884     + extra_pd1_pages_allowance = max(
7885     + orig_num_free - final + MIN_EXTRA_PAGES_ALLOWANCE,
7886     + MIN_EXTRA_PAGES_ALLOWANCE);
7887     +}
7888     +
7889     +/*
7890     + * Amount of storage needed, possibly taking into account the
7891     + * expected compression ratio and possibly also ignoring our
7892     + * allowance for extra pages.
7893     + */
7894     +static int main_storage_needed(int use_ecr,
7895     + int ignore_extra_pd1_allow)
7896     +{
7897     + return ((pagedir1.size + pagedir2.size +
7898     + (ignore_extra_pd1_allow ? 0 : extra_pd1_pages_allowance)) *
7899     + (use_ecr ? suspend_expected_compression_ratio() : 100) / 100);
7900     +}
7901     +
7902     +/*
7903     + * Storage needed for the image header: summed in bytes, returned in pages.
7904     + */
7905     +static int header_storage_needed(void)
7906     +{
7907     + int bytes = (int) sizeof(struct suspend_header) +
7908     + suspend_header_storage_for_modules() +
7909     + suspend_pageflags_space_needed();
7910     +
7911     + return DIV_ROUND_UP(bytes, PAGE_SIZE);
7912     +}
7913     +
7914     +/*
7915     + * When freeing memory, pages from either pageset might be freed.
7916     + *
7917     + * When seeking to free memory to be able to suspend, for every ps1 page freed,
7918     + * we need 2 less pages for the atomic copy because there is one less page to
7919     + * copy and one more page into which data can be copied.
7920     + *
7921     + * Freeing ps2 pages saves us nothing directly. No more memory is available
7922     + * for the atomic copy. Indirectly, a ps1 page might be freed (slab?), but
7923     + * that's too much work to figure out.
7924     + *
7925     + * => ps1_to_free functions
7926     + *
7927     + * Of course if we just want to reduce the image size, because of storage
7928     + * limitations or an image size limit either ps will do.
7929     + *
7930     + * => any_to_free function
7931     + */
7932     +
7933     +static int highpages_ps1_to_free(void)
7934     +{
7935     + return max_t(int, 0, DIV_ROUND_UP(get_highmem_size(pagedir1) -
7936     + get_highmem_size(pagedir2), 2) - real_nr_free_high_pages());
7937     +}
7938     +
7939     +static int lowpages_ps1_to_free(void)
7940     +{
7941     + return max_t(int, 0, DIV_ROUND_UP(get_lowmem_size(pagedir1) +
7942     + extra_pd1_pages_allowance + MIN_FREE_RAM +
7943     + suspend_memory_for_modules() - get_lowmem_size(pagedir2) -
7944     + real_nr_free_low_pages() - extra_pages_allocated, 2));
7945     +}
7946     +
7947     +static int current_image_size(void)
7948     +{
7949     + return pagedir1.size + pagedir2.size + header_space_allocated;
7950     +}
7951     +
7952     +static int any_to_free(int use_image_size_limit)
7953     +{
7954     + int user_limit = (use_image_size_limit && image_size_limit > 0) ?
7955     + max_t(int, 0, current_image_size() - (image_size_limit << 8))
7956     + : 0;
7957     +
7958     + int storage_limit = max_t(int, 0,
7959     + main_storage_needed(1, 1) - storage_available);
7960     +
7961     + return max(user_limit, storage_limit);
7962     +}
7963     +
7964     +/* amount_needed
7965     + *
7966     + * Calculates the amount by which the image size needs to be reduced to meet
7967     + * our constraints.
7968     + */
7969     +static int amount_needed(int use_image_size_limit)
7970     +{
7971     + return max(highpages_ps1_to_free() + lowpages_ps1_to_free(),
7972     + any_to_free(use_image_size_limit));
7973     +}
7974     +
7975     +static int image_not_ready(int use_image_size_limit)
7976     +{
7977     + suspend_message(SUSPEND_EAT_MEMORY, SUSPEND_LOW, 1,
7978     + "Amount still needed (%d) > 0:%d. Header: %d < %d: %d,"
7979     + " Storage allocd: %d < %d: %d.\n",
7980     + amount_needed(use_image_size_limit),
7981     + (amount_needed(use_image_size_limit) > 0),
7982     + header_space_allocated, header_storage_needed(),
7983     + header_space_allocated < header_storage_needed(),
7984     + main_storage_allocated,
7985     + main_storage_needed(1, 1),
7986     + main_storage_allocated < main_storage_needed(1, 1));
7987     +
7988     + suspend_cond_pause(0, NULL);
7989     +
7990     + return ((amount_needed(use_image_size_limit) > 0) ||
7991     + header_space_allocated < header_storage_needed() ||
7992     + main_storage_allocated < main_storage_needed(1, 1));
7993     +}
7994     +
7995     +static void display_stats(int always, int sub_extra_pd1_allow)
7996     +{
7997     + char buffer[255];
7998     + snprintf(buffer, 254,
7999     + "Free:%d(%d). Sets:%d(%d),%d(%d). Header:%d/%d. Nosave:%d-%d"
8000     + "=%d. Storage:%u/%u(%u=>%u). Needed:%d,%d,%d(%d,%d,%d,%d)\n",
8001     +
8002     + /* Free */
8003     + real_nr_free_pages(all_zones_mask),
8004     + real_nr_free_low_pages(),
8005     +
8006     + /* Sets */
8007     + pagedir1.size, pagedir1.size - get_highmem_size(pagedir1),
8008     + pagedir2.size, pagedir2.size - get_highmem_size(pagedir2),
8009     +
8010     + /* Header */
8011     + header_space_allocated, header_storage_needed(),
8012     +
8013     + /* Nosave */
8014     + num_nosave, extra_pages_allocated,
8015     + num_nosave - extra_pages_allocated,
8016     +
8017     + /* Storage */
8018     + main_storage_allocated,
8019     + storage_available,
8020     + main_storage_needed(1, sub_extra_pd1_allow),
8021     + main_storage_needed(1, 1),
8022     +
8023     + /* Needed */
8024     + lowpages_ps1_to_free(), highpages_ps1_to_free(),
8025     + any_to_free(1),
8026     + MIN_FREE_RAM, suspend_memory_for_modules(),
8027     + extra_pd1_pages_allowance, image_size_limit << 8);
8028     +
8029     + if (always)
8030     + printk(buffer);
8031     + else
8032     + suspend_message(SUSPEND_EAT_MEMORY, SUSPEND_MEDIUM, 1, buffer);
8033     +}
8034     +
8035     +/* generate_free_page_map
8036     + *
8037     + * Description: This routine generates a bitmap of free pages from the
8038     + * lists used by the memory manager. We then use the bitmap
8039     + * to quickly calculate which pages to save and in which
8040     + * pagesets.
8041     + */
8042     +static void generate_free_page_map(void)
8043     +{
8044     + int order, loop, cpu;
8045     + struct page *page;
8046     + unsigned long flags, i;
8047     + struct zone *zone;
8048     +
8049     + for_each_zone(zone) {
8050     + if (!populated_zone(zone))
8051     + continue;
8052     +
8053     + spin_lock_irqsave(&zone->lock, flags);
8054     +
8055     + for(i=0; i < zone->spanned_pages; i++)
8056     + ClearPageNosaveFree(pfn_to_page(
8057     + zone->zone_start_pfn + i));
8058     +
8059     + for (order = MAX_ORDER - 1; order >= 0; --order)
8060     + list_for_each_entry(page,
8061     + &zone->free_area[order].free_list, lru)
8062     + for(loop=0; loop < (1 << order); loop++)
8063     + SetPageNosaveFree(page+loop);
8064     +
8065     +
8066     + for_each_online_cpu(cpu) {
8067     + struct per_cpu_pageset *pset = zone_pcp(zone, cpu);
8068     +
8069     + for (i = 0; i < ARRAY_SIZE(pset->pcp); i++) {
8070     + struct per_cpu_pages *pcp;
8071     + struct page *page;
8072     +
8073     + pcp = &pset->pcp[i];
8074     + list_for_each_entry(page, &pcp->list, lru)
8075     + SetPageNosaveFree(page);
8076     + }
8077     + }
8078     +
8079     + spin_unlock_irqrestore(&zone->lock, flags);
8080     + }
8081     +}
8082     +
8083     +/* size_of_free_region
8084     + *
8085     + * Description: Return the number of pages that are free, beginning with and
8086     + * including this one.
8087     + */
8088     +static int size_of_free_region(struct page *page)
8089     +{
8090     + struct zone *zone = page_zone(page);
8091     + struct page *posn = page, *last_in_zone =
8092     + pfn_to_page(zone->zone_start_pfn) + zone->spanned_pages - 1;
8093     +
8094     + while (posn <= last_in_zone && PageNosaveFree(posn))
8095     + posn++;
8096     + return (posn - page);
8097     +}
8098     +
8099     +/* flag_image_pages
8100     + *
8101     + * This routine generates our lists of pages to be stored in each
8102     + * pageset. Since we store the data using extents, and adding new
8103     + * extents might allocate a new extent page, this routine may well
8104     + * be called more than once.
8105     + */
8106     +static void flag_image_pages(int atomic_copy)
8107     +{
8108     + int num_free = 0;
8109     + unsigned long loop;
8110     + struct zone *zone;
8111     +
8112     + pagedir1.size = 0;
8113     + pagedir2.size = 0;
8114     +
8115     + set_highmem_size(pagedir1, 0);
8116     + set_highmem_size(pagedir2, 0);
8117     +
8118     + num_nosave = 0;
8119     +
8120     + clear_dyn_pageflags(pageset1_map);
8121     +
8122     + generate_free_page_map();
8123     +
8124     + /*
8125     + * Pages not to be saved are marked Nosave irrespective of being reserved
8126     + */
8127     + for_each_zone(zone) {
8128     + int highmem = is_highmem(zone);
8129     +
8130     + if (!populated_zone(zone))
8131     + continue;
8132     +
8133     + for (loop = 0; loop < zone->spanned_pages; loop++) {
8134     + unsigned long pfn = zone->zone_start_pfn + loop;
8135     + struct page *page;
8136     + int chunk_size;
8137     +
8138     + if (!pfn_valid(pfn))
8139     + continue;
8140     +
8141     + page = pfn_to_page(pfn);
8142     +
8143     + chunk_size = size_of_free_region(page);
8144     + if (chunk_size) {
8145     + num_free += chunk_size;
8146     + loop += chunk_size - 1;
8147     + continue;
8148     + }
8149     +
8150     + if (highmem)
8151     + page = saveable_highmem_page(pfn);
8152     + else
8153     + page = saveable_page(pfn);
8154     +
8155     + if (!page) {
8156     + num_nosave++;
8157     + continue;
8158     + }
8159     +
8160     + if (PagePageset2(page)) {
8161     + pagedir2.size++;
8162     + if (PageHighMem(page))
8163     + inc_highmem_size(pagedir2);
8164     + else
8165     + SetPagePageset1Copy(page);
8166     + if (PageResave(page)) {
8167     + SetPagePageset1(page);
8168     + ClearPagePageset1Copy(page);
8169     + pagedir1.size++;
8170     + if (PageHighMem(page))
8171     + inc_highmem_size(pagedir1);
8172     + }
8173     + } else {
8174     + pagedir1.size++;
8175     + SetPagePageset1(page);
8176     + if (PageHighMem(page))
8177     + inc_highmem_size(pagedir1);
8178     + }
8179     + }
8180     + }
8181     +
8182     + if (atomic_copy)
8183     + return;
8184     +
8185     + suspend_message(SUSPEND_EAT_MEMORY, SUSPEND_MEDIUM, 0,
8186     + "Count data pages: Set1 (%d) + Set2 (%d) + Nosave (%d) + "
8187     + "NumFree (%d) = %d.\n",
8188     + pagedir1.size, pagedir2.size, num_nosave, num_free,
8189     + pagedir1.size + pagedir2.size + num_nosave + num_free);
8190     +}
8191     +
8192     +void suspend_recalculate_image_contents(int atomic_copy)
8193     +{
8194     + clear_dyn_pageflags(pageset1_map);
8195     + if (!atomic_copy) {
8196     + int pfn;
8197     + BITMAP_FOR_EACH_SET(pageset2_map, pfn)
8198     + ClearPagePageset1Copy(pfn_to_page(pfn));
8199     + /* Need to call this before getting pageset1_size! */
8200     + suspend_mark_pages_for_pageset2();
8201     + }
8202     + flag_image_pages(atomic_copy);
8203     +
8204     + if (!atomic_copy) {
8205     + storage_available = suspendActiveAllocator->storage_available();
8206     + display_stats(0, 0);
8207     + }
8208     +}
8209     +
8210     +/* update_image
8211     + *
8212     + * Allocate [more] memory and storage for the image.
8213     + */
8214     +static void update_image(void)
8215     +{
8216     + int result, param_used, wanted, got;
8217     +
8218     + suspend_recalculate_image_contents(0);
8219     +
8220     + /* Include allowance for growth in pagedir1 while writing pagedir 2 */
8221     + wanted = pagedir1.size + extra_pd1_pages_allowance -
8222     + get_lowmem_size(pagedir2);
8223     + if (wanted > extra_pages_allocated) {
8224     + got = suspend_allocate_extra_pagedir_memory(wanted);
8225     + if (got < wanted) {
8226     + suspend_message(SUSPEND_EAT_MEMORY, SUSPEND_LOW, 1,
8227     + "Want %d extra pages for pageset1, got %d.\n",
8228     + wanted, got);
8229     + return;
8230     + }
8231     + }
8232     +
8233     + thaw_kernel_threads();
8234     +
8235     + /*
8236     + * Allocate remaining storage space, if possible, up to the
8237     + * maximum we know we'll need. It's okay to allocate the
8238     + * maximum if the writer is the swapwriter, but
8239     + * we don't want to grab all available space on an NFS share.
8240     + * We therefore ignore the expected compression ratio here,
8241     + * thereby trying to allocate the maximum image size we could
8242     + * need (assuming compression doesn't expand the image), but
8243     + * don't complain if we can't get the full amount we're after.
8244     + */
8245     +
8246     + suspendActiveAllocator->allocate_storage(
8247     + min(storage_available, main_storage_needed(0, 0)));
8248     +
8249     + main_storage_allocated = suspendActiveAllocator->storage_allocated();
8250     +
8251     + param_used = header_storage_needed();
8252     +
8253     + result = suspendActiveAllocator->allocate_header_space(param_used);
8254     +
8255     + if (result)
8256     + suspend_message(SUSPEND_EAT_MEMORY, SUSPEND_LOW, 1,
8257     + "Still need to get more storage space for header.\n");
8258     + else
8259     + header_space_allocated = param_used;
8260     +
8261     + if (freeze_processes()) {
8262     + set_result_state(SUSPEND_FREEZING_FAILED);
8263     + set_result_state(SUSPEND_ABORTED);
8264     + }
8265     +
8266     + allocate_checksum_pages();
8267     +
8268     + suspend_recalculate_image_contents(0);
8269     +}
8270     +
8271     +/* attempt_to_freeze
8272     + *
8273     + * Try to freeze processes.
8274     + */
8275     +
8276     +static int attempt_to_freeze(void)
8277     +{
8278     + int result;
8279     +
8280     + /* Stop processes before checking again */
8281     + thaw_processes();
8282     + suspend_prepare_status(CLEAR_BAR, "Freezing processes & syncing filesystems.");
8283     + result = freeze_processes();
8284     +
8285     + if (result) {
8286     + set_result_state(SUSPEND_ABORTED);
8287     + set_result_state(SUSPEND_FREEZING_FAILED);
8288     + }
8289     +
8290     + return result;
8291     +}
8292     +
8293     +/* eat_memory
8294     + *
8295     + * Try to free some memory, either to meet hard or soft constraints on the image
8296     + * characteristics.
8297     + *
8298     + * Hard constraints:
8299     + * - Pageset1 must be < half of memory;
8300     + * - We must have enough memory free at resume time to have pageset1
8301     + * be able to be loaded in pages that don't conflict with where it has to
8302     + * be restored.
8303     + * Soft constraints
8304     + * - User specified image size limit.
8305     + */
8306     +static void eat_memory(void)
8307     +{
8308     + int amount_wanted = 0;
8309     + int free_flags = 0, did_eat_memory = 0;
8310     +
8311     + /*
8312     + * Note that if we have enough storage space and enough free memory, we
8313     + * may exit without eating anything. We give up when the last 10
8314     + * iterations ate no extra pages because we're not going to get much
8315     + * more anyway, but the few pages we get will take a lot of time.
8316     + *
8317     + * We freeze processes before beginning, and then unfreeze them if we
8318     + * need to eat memory until we think we have enough. If our attempts
8319     + * to freeze fail, we give up and abort.
8320     + */
8321     +
8322     + suspend_recalculate_image_contents(0);
8323     + amount_wanted = amount_needed(1);
8324     +
8325     + switch (image_size_limit) {
8326     + case -1: /* Don't eat any memory */
8327     + if (amount_wanted > 0) {
8328     + set_result_state(SUSPEND_ABORTED);
8329     + set_result_state(SUSPEND_WOULD_EAT_MEMORY);
8330     + return;
8331     + }
8332     + break;
8333     + case -2: /* Free caches only */
8334     + drop_pagecache();
8335     + suspend_recalculate_image_contents(0);
8336     + amount_wanted = amount_needed(1);
8337     + did_eat_memory = 1;
8338     + break;
8339     + default:
8340     + free_flags = GFP_ATOMIC | __GFP_HIGHMEM;
8341     + }
8342     +
8343     + if (amount_wanted > 0 && !test_result_state(SUSPEND_ABORTED) &&
8344     + image_size_limit != -1) {
8345     + struct zone *zone;
8346     + int zone_idx;
8347     +
8348     + suspend_prepare_status(CLEAR_BAR, "Seeking to free %dMB of memory.", MB(amount_wanted));
8349     +
8350     + thaw_kernel_threads();
8351     +
8352     + for (zone_idx = 0; zone_idx < MAX_NR_ZONES; zone_idx++) {
8353     + int zone_type_free = max_t(int, (zone_idx == ZONE_HIGHMEM) ?
8354     + highpages_ps1_to_free() :
8355     + lowpages_ps1_to_free(), amount_wanted);
8356     +
8357     + if (zone_type_free < 0)
8358     + break;
8359     +
8360     + for_each_zone(zone) {
8361     + if (zone_idx(zone) != zone_idx)
8362     + continue;
8363     +
8364     + shrink_one_zone(zone, zone_type_free);
8365     +
8366     + did_eat_memory = 1;
8367     +
8368     + suspend_recalculate_image_contents(0);
8369     +
8370     + amount_wanted = amount_needed(1);
8371     + zone_type_free = max_t(int, (zone_idx == ZONE_HIGHMEM) ?
8372     + highpages_ps1_to_free() :
8373     + lowpages_ps1_to_free(), amount_wanted);
8374     +
8375     + if (zone_type_free < 0)
8376     + break;
8377     + }
8378     + }
8379     +
8380     + suspend_cond_pause(0, NULL);
8381     +
8382     + if (freeze_processes()) {
8383     + set_result_state(SUSPEND_FREEZING_FAILED);
8384     + set_result_state(SUSPEND_ABORTED);
8385     + }
8386     + }
8387     +
8388     + if (did_eat_memory) {
8389     + unsigned long orig_state = get_suspend_state();
8390     + /* Freeze_processes will call sys_sync too */
8391     + restore_suspend_state(orig_state);
8392     + suspend_recalculate_image_contents(0);
8393     + }
8394     +
8395     + /* Blank out image size display */
8396     + suspend_update_status(100, 100, NULL);
8397     +}
8398     +
8399     +/* suspend_prepare_image
8400     + *
8401     + * Entry point to the whole image preparation section.
8402     + *
8403     + * We do four things:
8404     + * - Freeze processes;
8405     + * - Ensure image size constraints are met;
8406     + * - Complete all the preparation for saving the image,
8407     + * including allocation of storage. The only memory
8408     + * that should be needed when we're finished is that
8409     + * for actually storing the image (and we know how
8410     + * much is needed for that because the modules tell
8411     + * us).
8412     + * - Make sure that all dirty buffers are written out.
8413     + */
8414     +#define MAX_TRIES 2
8415     +int suspend_prepare_image(void)
8416     +{
8417     + int result = 1, tries = 1;
8418     +
8419     + header_space_allocated = 0;
8420     + main_storage_allocated = 0;
8421     +
8422     + if (attempt_to_freeze())
8423     + return 1;
8424     +
8425     + if (!extra_pd1_pages_allowance)
8426     + get_extra_pd1_allowance();
8427     +
8428     + storage_available = suspendActiveAllocator->storage_available();
8429     +
8430     + if (!storage_available) {
8431     + printk(KERN_ERR "You need some storage available to be able to suspend.\n");
8432     + set_result_state(SUSPEND_ABORTED);
8433     + set_result_state(SUSPEND_NOSTORAGE_AVAILABLE);
8434     + return 1;
8435     + }
8436     +
8437     + do {
8438     + suspend_prepare_status(CLEAR_BAR, "Preparing Image. Try %d.", tries);
8439     +
8440     + eat_memory();
8441     +
8442     + if (test_result_state(SUSPEND_ABORTED))
8443     + break;
8444     +
8445     + update_image();
8446     +
8447     + tries++;
8448     +
8449     + } while (image_not_ready(1) && tries <= MAX_TRIES &&
8450     + !test_result_state(SUSPEND_ABORTED));
8451     +
8452     + result = image_not_ready(0);
8453     +
8454     + if (!test_result_state(SUSPEND_ABORTED)) {
8455     + if (result) {
8456     + display_stats(1, 0);
8457     + abort_suspend(SUSPEND_UNABLE_TO_PREPARE_IMAGE,
8458     + "Unable to successfully prepare the image.\n");
8459     + } else {
8460     + unlink_lru_lists();
8461     + suspend_cond_pause(1, "Image preparation complete.");
8462     + }
8463     + }
8464     +
8465     + return result;
8466     +}
8467     +
8468     +#ifdef CONFIG_SUSPEND2_EXPORTS
8469     +EXPORT_SYMBOL_GPL(real_nr_free_pages);
8470     +#endif
8471     diff --git a/kernel/power/prepare_image.h b/kernel/power/prepare_image.h
8472     new file mode 100644
8473     index 0000000..8c2e426
8474     --- /dev/null
8475     +++ b/kernel/power/prepare_image.h
8476     @@ -0,0 +1,34 @@
8477     +/*
8478     + * kernel/power/prepare_image.h
8479     + *
8480     + * Copyright (C) 2003-2007 Nigel Cunningham (nigel at suspend2 net)
8481     + *
8482     + * This file is released under the GPLv2.
8483     + *
8484     + */
8485     +
8486     +#include <asm/sections.h>
8487     +
8488     +extern int suspend_prepare_image(void);
8489     +extern void suspend_recalculate_image_contents(int storage_available);
8490     +extern int real_nr_free_pages(unsigned long zone_idx_mask);
8491     +extern int image_size_limit;
8492     +extern void suspend_free_extra_pagedir_memory(void);
8493     +extern int extra_pd1_pages_allowance;
8494     +
8495     +#define MIN_FREE_RAM 100
8496     +#define MIN_EXTRA_PAGES_ALLOWANCE 500
8497     +
8498     +#define all_zones_mask ((unsigned long) ((1 << MAX_NR_ZONES) - 1))
8499     +#ifdef CONFIG_HIGHMEM
8500     +#define real_nr_free_high_pages() (real_nr_free_pages(1 << ZONE_HIGHMEM))
8501     +#define real_nr_free_low_pages() (real_nr_free_pages(all_zones_mask - \
8502     + (1 << ZONE_HIGHMEM)))
8503     +#else
8504     +#define real_nr_free_high_pages() (0)
8505     +#define real_nr_free_low_pages() (real_nr_free_pages(all_zones_mask))
8506     +
8507     +/* For eat_memory function */
8508     +#define ZONE_HIGHMEM (MAX_NR_ZONES + 1)
8509     +#endif
8510     +
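The zone-mask macros in prepare_image.h can be exercised outside the kernel. What follows is a minimal userspace sketch, assuming a hypothetical four-zone layout: every `DEMO_`/`demo_` name and the per-zone page counts are illustrative assumptions, not kernel definitions. It only shows how the low-memory mask is obtained by removing the highmem bit from the all-zones mask, and how a mask could select which zones are summed.

```c
#include <assert.h>

/* Hypothetical zone indices; the real values depend on the kernel config. */
enum {
	DEMO_ZONE_DMA,
	DEMO_ZONE_DMA32,
	DEMO_ZONE_NORMAL,
	DEMO_ZONE_HIGHMEM,
	DEMO_MAX_NR_ZONES
};

/* One bit per zone, in the same spirit as all_zones_mask. */
#define DEMO_ALL_ZONES_MASK ((unsigned long) ((1UL << DEMO_MAX_NR_ZONES) - 1))

static unsigned long demo_highmem_mask(void)
{
	return 1UL << DEMO_ZONE_HIGHMEM;
}

/* Every zone except highmem: the all-zones mask minus the highmem bit. */
static unsigned long demo_lowmem_mask(void)
{
	return DEMO_ALL_ZONES_MASK - demo_highmem_mask();
}

/* Sum the (made-up) free page counts of the zones selected by mask. */
static int demo_free_pages(const int *free_per_zone, unsigned long mask)
{
	int i, total = 0;

	assert(free_per_zone);
	for (i = 0; i < DEMO_MAX_NR_ZONES; i++)
		if (mask & (1UL << i))
			total += free_per_zone[i];
	return total;
}
```

With four zones the all-zones mask is 0xF and the low-memory mask is 0x7. Subtraction is safe here only because the highmem bit is always set in the all-zones mask.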
8511     diff --git a/kernel/power/process.c b/kernel/power/process.c
8512     index 6d566bf..90a4cc5 100644
8513     --- a/kernel/power/process.c
8514     +++ b/kernel/power/process.c
8515     @@ -15,6 +15,8 @@
8516     #include <linux/syscalls.h>
8517     #include <linux/freezer.h>
8518    
8519     +int freezer_state = 0;
8520     +
8521     /*
8522     * Timeout for stopping processes
8523     */
8524     @@ -179,10 +181,11 @@ int freeze_processes(void)
8525     return nr_unfrozen;
8526    
8527     sys_sync();
8528     + freezer_state = FREEZER_USERSPACE_FROZEN;
8529     nr_unfrozen = try_to_freeze_tasks(FREEZER_KERNEL_THREADS);
8530     if (nr_unfrozen)
8531     return nr_unfrozen;
8532     -
8533     + freezer_state = FREEZER_FULLY_ON;
8534     printk("done.\n");
8535     BUG_ON(in_atomic());
8536     return 0;
8537     @@ -200,7 +203,7 @@ static void thaw_tasks(int thaw_user_space)
8538     if (is_user_space(p) == !thaw_user_space)
8539     continue;
8540    
8541     - if (!thaw_process(p))
8542     + if (!thaw_process(p) && p->state != TASK_TRACED)
8543     printk(KERN_WARNING " Strange, %s not stopped\n",
8544     p->comm );
8545     } while_each_thread(g, p);
8546     @@ -209,11 +212,31 @@ static void thaw_tasks(int thaw_user_space)
8547    
8548     void thaw_processes(void)
8549     {
8550     + int old_state = freezer_state;
8551     +
8552     + if (old_state == FREEZER_OFF)
8553     + return;
8554     +
8555     + /*
8556     + * Change state beforehand because thawed tasks might submit I/O
8557     + * immediately.
8558     + */
8559     + freezer_state = FREEZER_OFF;
8560     +
8561     printk("Restarting tasks ... ");
8562     - thaw_tasks(FREEZER_KERNEL_THREADS);
8563     +
8564     + if (old_state == FREEZER_FULLY_ON)
8565     + thaw_tasks(FREEZER_KERNEL_THREADS);
8566     thaw_tasks(FREEZER_USER_SPACE);
8567     schedule();
8568     printk("done.\n");
8569     }
8570    
8571     +void thaw_kernel_threads(void)
8572     +{
8573     + freezer_state = FREEZER_USERSPACE_FROZEN;
8574     + thaw_tasks(FREEZER_KERNEL_THREADS);
8575     +}
8576     +
8577     EXPORT_SYMBOL(refrigerator);
8578     +EXPORT_SYMBOL(freezer_state);
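The freezer_state transitions added to process.c can be modelled in a few lines. This is a rough userspace sketch, not the kernel code: the `demo_` names, the enum values and the thaw counters are assumptions made for illustration, and the real thaw_tasks() work is replaced by counter increments.

```c
#include <assert.h>

/* Hypothetical stand-ins for FREEZER_OFF and friends; values are assumed. */
enum demo_freezer_state {
	DEMO_FREEZER_OFF,
	DEMO_FREEZER_USERSPACE_FROZEN,
	DEMO_FREEZER_FULLY_ON
};

struct demo_freezer {
	enum demo_freezer_state state;
	int kernel_thaws;	/* times kernel threads were thawed */
	int user_thaws;		/* times userspace was thawed */
};

/* Mirrors thaw_processes(): no-op when off, state cleared before thawing. */
static void demo_thaw_processes(struct demo_freezer *f)
{
	enum demo_freezer_state old_state;

	assert(f);
	old_state = f->state;
	if (old_state == DEMO_FREEZER_OFF)
		return;

	f->state = DEMO_FREEZER_OFF;
	if (old_state == DEMO_FREEZER_FULLY_ON)
		f->kernel_thaws++;
	f->user_thaws++;
}

/* Mirrors thaw_kernel_threads(): userspace stays frozen. */
static void demo_thaw_kernel_threads(struct demo_freezer *f)
{
	assert(f);
	f->state = DEMO_FREEZER_USERSPACE_FROZEN;
	f->kernel_thaws++;
}
```

The point of recording the old state is that kernel threads already thawed by thaw_kernel_threads() are not thawed a second time, and a repeated thaw_processes() call does nothing.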
8579     diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
8580     index fc53ad0..319c96a 100644
8581     --- a/kernel/power/snapshot.c
8582     +++ b/kernel/power/snapshot.c
8583     @@ -33,6 +33,7 @@
8584     #include <asm/io.h>
8585    
8586     #include "power.h"
8587     +#include "suspend2_builtin.h"
8588    
8589     /* List of PBEs needed for restoring the pages that were allocated before
8590     * the suspend and included in the suspend image, but have also been
8591     @@ -40,6 +41,13 @@
8592     * directly to their "original" page frames.
8593     */
8594     struct pbe *restore_pblist;
8595     +int resume_attempted;
8596     +EXPORT_SYMBOL_GPL(resume_attempted);
8597     +
8598     +#ifdef CONFIG_SUSPEND2
8599     +#include "pagedir.h"
8600     +int suspend_post_context_save(void);
8601     +#endif
8602    
8603     /* Pointer to an auxiliary buffer (1 page) */
8604     static void *buffer;
8605     @@ -82,6 +90,11 @@ static void *get_image_page(gfp_t gfp_mask, int safe_needed)
8606    
8607     unsigned long get_safe_page(gfp_t gfp_mask)
8608     {
8609     +#ifdef CONFIG_SUSPEND2
8610     + if (suspend2_running)
8611     + return suspend_get_nonconflicting_page();
8612     +#endif
8613     +
8614     return (unsigned long)get_image_page(gfp_mask, PG_SAFE);
8615     }
8616    
8617     @@ -604,7 +617,7 @@ static unsigned int count_free_highmem_pages(void)
8618     * and it isn't a part of a free chunk of pages.
8619     */
8620    
8621     -static struct page *saveable_highmem_page(unsigned long pfn)
8622     +struct page *saveable_highmem_page(unsigned long pfn)
8623     {
8624     struct page *page;
8625    
8626     @@ -646,7 +659,6 @@ unsigned int count_highmem_pages(void)
8627     return n;
8628     }
8629     #else
8630     -static inline void *saveable_highmem_page(unsigned long pfn) { return NULL; }
8631     static inline unsigned int count_highmem_pages(void) { return 0; }
8632     #endif /* CONFIG_HIGHMEM */
8633    
8634     @@ -670,7 +682,7 @@ static inline int pfn_is_nosave(unsigned long pfn)
8635     * a free chunk of pages.
8636     */
8637    
8638     -static struct page *saveable_page(unsigned long pfn)
8639     +struct page *saveable_page(unsigned long pfn)
8640     {
8641     struct page *page;
8642    
8643     @@ -986,6 +998,11 @@ asmlinkage int swsusp_save(void)
8644     {
8645     unsigned int nr_pages, nr_highmem;
8646    
8647     +#ifdef CONFIG_SUSPEND2
8648     + if (suspend2_running)
8649     + return suspend_post_context_save();
8650     +#endif
8651     +
8652     printk("swsusp: critical section: \n");
8653    
8654     drain_local_pages();
8655     diff --git a/kernel/power/storage.c b/kernel/power/storage.c
8656     new file mode 100644
8657     index 0000000..702c1d4
8658     --- /dev/null
8659     +++ b/kernel/power/storage.c
8660     @@ -0,0 +1,288 @@
8661     +/*
8662     + * kernel/power/storage.c
8663     + *
8664     + * Copyright (C) 2005-2007 Nigel Cunningham (nigel at suspend2 net)
8665     + *
8666     + * This file is released under the GPLv2.
8667     + *
8668     + * Routines for talking to a userspace program that manages storage.
8669     + *
8670     + * The kernel side:
8671     + * - starts the userspace program;
8672     + * - sends messages telling it when to open and close the connection;
8673     + * - tells it when to quit;
8674     + *
8675     + * The user space side:
8676     + * - passes messages regarding status;
8677     + *
8678     + */
8679     +
8680     +#include <linux/suspend.h>
8681     +#include <linux/freezer.h>
8682     +
8683     +#include "sysfs.h"
8684     +#include "modules.h"
8685     +#include "netlink.h"
8686     +#include "storage.h"
8687     +#include "ui.h"
8688     +
8689     +static struct user_helper_data usm_helper_data;
8690     +static struct suspend_module_ops usm_ops;
8691     +static int message_received = 0;
8692     +static int usm_prepare_count = 0;
8693     +static int storage_manager_last_action = 0;
8694     +static int storage_manager_action = 0;
8695     +
8696     +static int usm_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
8697     +{
8698     + int type;
8699     + int *data;
8700     +
8701     + type = nlh->nlmsg_type;
8702     +
8703     + /* Control messages: ignore them */
8704     + if (type < NETLINK_MSG_BASE)
8705     + return 0;
8706     +
8707     + /* Unknown message: reply with EINVAL */
8708     + if (type >= USM_MSG_MAX)
8709     + return -EINVAL;
8710     +
8711     + /* All operations require privileges, even GET */
8712     + if (security_netlink_recv(skb, CAP_NET_ADMIN))
8713     + return -EPERM;
8714     +
8715     + /* Only allow one task to receive NOFREEZE privileges */
8716     + if (type == NETLINK_MSG_NOFREEZE_ME && usm_helper_data.pid != -1)
8717     + return -EBUSY;
8718     +
8719     + data = (int*)NLMSG_DATA(nlh);
8720     +
8721     + switch (type) {
8722     + case USM_MSG_SUCCESS:
8723     + case USM_MSG_FAILED:
8724     + message_received = type;
8725     + complete(&usm_helper_data.wait_for_process);
8726     + break;
8727     + default:
8728     + printk("Storage manager doesn't recognise message %d.\n", type);
8729     + }
8730     +
8731     + return 1;
8732     +}
8733     +
8734     +#ifdef CONFIG_NET
8735     +static int activations = 0;
8736     +
8737     +int suspend_activate_storage(int force)
8738     +{
8739     + int tries = 1;
8740     +
8741     + if (usm_helper_data.pid == -1 || !usm_ops.enabled)
8742     + return 0;
8743     +
8744     + message_received = 0;
8745     + activations++;
8746     +
8747     + if (activations > 1 && !force)
8748     + return 0;
8749     +
8750     + while ((!message_received || message_received == USM_MSG_FAILED) && tries < 2) {
8751     + suspend_prepare_status(DONT_CLEAR_BAR, "Activate storage attempt %d.\n", tries);
8752     +
8753     + init_completion(&usm_helper_data.wait_for_process);
8754     +
8755     + suspend_send_netlink_message(&usm_helper_data,
8756     + USM_MSG_CONNECT,
8757     + NULL, 0);
8758     +
8759     + /* Wait 2 seconds for the userspace process to make contact */
8760     + wait_for_completion_timeout(&usm_helper_data.wait_for_process, 2*HZ);
8761     +
8762     + tries++;
8763     + }
8764     +
8765     + return 0;
8766     +}
8767     +
8768     +int suspend_deactivate_storage(int force)
8769     +{
8770     + if (usm_helper_data.pid == -1 || !usm_ops.enabled)
8771     + return 0;
8772     +
8773     + message_received = 0;
8774     + activations--;
8775     +
8776     + if (activations && !force)
8777     + return 0;
8778     +
8779     + init_completion(&usm_helper_data.wait_for_process);
8780     +
8781     + suspend_send_netlink_message(&usm_helper_data,
8782     + USM_MSG_DISCONNECT,
8783     + NULL, 0);
8784     +
8785     + wait_for_completion_timeout(&usm_helper_data.wait_for_process, 2*HZ);
8786     +
8787     + if (!message_received || message_received == USM_MSG_FAILED) {
8788     + printk("Returning failure disconnecting storage.\n");
8789     + return 1;
8790     + }
8791     +
8792     + return 0;
8793     +}
8794     +#endif
8795     +
8796     +static void storage_manager_simulate(void)
8797     +{
8798     + printk("--- Storage manager simulate ---\n");
8799     + suspend_prepare_usm();
8800     + schedule();
8801     + printk("--- Activate storage 1 ---\n");
8802     + suspend_activate_storage(1);
8803     + schedule();
8804     + printk("--- Deactivate storage 1 ---\n");
8805     + suspend_deactivate_storage(1);
8806     + schedule();
8807     + printk("--- Cleanup usm ---\n");
8808     + suspend_cleanup_usm();
8809     + schedule();
8810     + printk("--- Storage manager simulate ends ---\n");
8811     +}
8812     +
8813     +static int usm_storage_needed(void)
8814     +{
8815     + return strlen(usm_helper_data.program);
8816     +}
8817     +
8818     +static int usm_save_config_info(char *buf)
8819     +{
8820     + int len = strlen(usm_helper_data.program);
8821     + memcpy(buf, usm_helper_data.program, len);
8822     + return len;
8823     +}
8824     +
8825     +static void usm_load_config_info(char *buf, int size)
8826     +{
8827     + /* Don't load the saved path if one has already been set */
8828     + if (usm_helper_data.program[0])
8829     + return;
8830     +
8831     + memcpy(usm_helper_data.program, buf, size);
8832     +}
8833     +
8834     +static int usm_memory_needed(void)
8835     +{
8836     + /* ball park figure of 32 pages */
8837     + return (32 * PAGE_SIZE);
8838     +}
8839     +
8840     +/* suspend_prepare_usm
8841     + */
8842     +int suspend_prepare_usm(void)
8843     +{
8844     + usm_prepare_count++;
8845     +
8846     + if (usm_prepare_count > 1 || !usm_ops.enabled)
8847     + return 0;
8848     +
8849     + usm_helper_data.pid = -1;
8850     +
8851     + if (!*usm_helper_data.program)
8852     + return 0;
8853     +
8854     + suspend_netlink_setup(&usm_helper_data);
8855     +
8856     + if (usm_helper_data.pid == -1)
8857     + printk("Suspend2 Storage Manager wanted, but couldn't start it.\n");
8858     +
8859     + suspend_activate_storage(0);
8860     +
8861     + return (usm_helper_data.pid != -1);
8862     +}
8863     +
8864     +void suspend_cleanup_usm(void)
8865     +{
8866     + usm_prepare_count--;
8867     +
8868     + if (usm_helper_data.pid > -1 && !usm_prepare_count) {
8869     + suspend_deactivate_storage(0);
8870     + suspend_netlink_close(&usm_helper_data);
8871     + }
8872     +}
8873     +
8874     +static void storage_manager_activate(void)
8875     +{
8876     + if (storage_manager_action == storage_manager_last_action)
8877     + return;
8878     +
8879     + if (storage_manager_action)
8880     + suspend_prepare_usm();
8881     + else
8882     + suspend_cleanup_usm();
8883     +
8884     + storage_manager_last_action = storage_manager_action;
8885     +}
8886     +
8887     +/*
8888     + * Storage manager specific /sys/power/suspend2 entries.
8889     + */
8890     +
8891     +static struct suspend_sysfs_data sysfs_params[] = {
8892     + { SUSPEND2_ATTR("simulate_atomic_copy", SYSFS_RW),
8893     + .type = SUSPEND_SYSFS_DATA_NONE,
8894     + .write_side_effect = storage_manager_simulate,
8895     + },
8896     +
8897     + { SUSPEND2_ATTR("enabled", SYSFS_RW),
8898     + SYSFS_INT(&usm_ops.enabled, 0, 1, 0)
8899     + },
8900     +
8901     + { SUSPEND2_ATTR("program", SYSFS_RW),
8902     + SYSFS_STRING(usm_helper_data.program, 254, 0)
8903     + },
8904     +
8905     + { SUSPEND2_ATTR("activate_storage", SYSFS_RW),
8906     + SYSFS_INT(&storage_manager_action, 0, 1, 0),
8907     + .write_side_effect = storage_manager_activate,
8908     + }
8909     +};
8910     +
8911     +static struct suspend_module_ops usm_ops = {
8912     + .type = MISC_MODULE,
8913     + .name = "Userspace Storage Manager",
8914     + .directory = "storage_manager",
8915     + .module = THIS_MODULE,
8916     + .storage_needed = usm_storage_needed,
8917     + .save_config_info = usm_save_config_info,
8918     + .load_config_info = usm_load_config_info,
8919     + .memory_needed = usm_memory_needed,
8920     +
8921     + .sysfs_data = sysfs_params,
8922     + .num_sysfs_entries = sizeof(sysfs_params) / sizeof(struct suspend_sysfs_data),
8923     +};
8924     +
8925     +/* s2_usm_init
8926     + * Description: Boot time initialisation for the userspace storage manager.
8927     + */
8928     +int s2_usm_init(void)
8929     +{
8930     + usm_helper_data.nl = NULL;
8931     + usm_helper_data.program[0] = '\0';
8932     + usm_helper_data.pid = -1;
8933     + usm_helper_data.skb_size = 0;
8934     + usm_helper_data.pool_limit = 6;
8935     + usm_helper_data.netlink_id = NETLINK_SUSPEND2_USM;
8936     + usm_helper_data.name = "userspace storage manager";
8937     + usm_helper_data.rcv_msg = usm_user_rcv_msg;
8938     + usm_helper_data.interface_version = 1;
8939     + usm_helper_data.must_init = 0;
8940     + init_completion(&usm_helper_data.wait_for_process);
8941     +
8942     + return suspend_register_module(&usm_ops);
8943     +}
8944     +
8945     +void s2_usm_exit(void)
8946     +{
8947     + suspend_unregister_module(&usm_ops);
8948     +}
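The activation counting in suspend_activate_storage() and suspend_deactivate_storage() behaves like a reference count with a force override. The sketch below is a simplified userspace model under stated assumptions: the `demo_` names are hypothetical, the netlink messages are replaced by counters, and the checks on the helper pid and the enabled flag are left out.

```c
#include <assert.h>

struct demo_storage {
	int activations;	/* nesting depth of activate calls */
	int connects;		/* CONNECT messages that would be sent */
	int disconnects;	/* DISCONNECT messages that would be sent */
};

/* Only the first activation (or a forced one) contacts the helper. */
static void demo_activate(struct demo_storage *s, int force)
{
	assert(s);
	s->activations++;
	if (s->activations > 1 && !force)
		return;
	s->connects++;
}

/* Only the last deactivation (or a forced one) contacts the helper. */
static void demo_deactivate(struct demo_storage *s, int force)
{
	assert(s);
	s->activations--;
	if (s->activations && !force)
		return;
	s->disconnects++;
}
```

Nested activate/deactivate pairs therefore produce a single connect and a single disconnect, while force bypasses the count, as the simulate path in storage.c does.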
8949     diff --git a/kernel/power/storage.h b/kernel/power/storage.h
8950     new file mode 100644
8951     index 0000000..e05eeef
8952     --- /dev/null
8953     +++ b/kernel/power/storage.h
8954     @@ -0,0 +1,53 @@
8955     +/*
8956     + * kernel/power/storage.h
8957     + *
8958     + * Copyright (C) 2005-2007 Nigel Cunningham (nigel at suspend2 net)
8959     + *
8960     + * This file is released under the GPLv2.
8961     + */
8962     +
8963     +#ifdef CONFIG_NET
8964     +int suspend_prepare_usm(void);
8965     +void suspend_cleanup_usm(void);
8966     +
8967     +int suspend_activate_storage(int force);
8968     +int suspend_deactivate_storage(int force);
8969     +extern int s2_usm_init(void);
8970     +extern void s2_usm_exit(void);
8971     +#else
8972     +static inline int s2_usm_init(void) { return 0; }
8973     +static inline void s2_usm_exit(void) { }
8974     +
8975     +static inline int suspend_activate_storage(int force)
8976     +{
8977     + return 0;
8978     +}
8979     +
8980     +static inline int suspend_deactivate_storage(int force)
8981     +{
8982     + return 0;
8983     +}
8984     +
8985     +static inline int suspend_prepare_usm(void) { return 0; }
8986     +static inline void suspend_cleanup_usm(void) { }
8987     +#endif
8988     +
8989     +enum {
8990     + USM_MSG_BASE = 0x10,
8991     +
8992     + /* Kernel -> Userspace */
8993     + USM_MSG_CONNECT = 0x30,
8994     + USM_MSG_DISCONNECT = 0x31,
8995     + USM_MSG_SUCCESS = 0x40, /* Userspace -> Kernel */
8996     + USM_MSG_FAILED = 0x41, /* Userspace -> Kernel */
8997     +
8998     + USM_MSG_MAX,
8999     +};
9000     +
9001     +#ifdef CONFIG_NET
9002     +extern __init int suspend_usm_init(void);
9003     +extern __exit void suspend_usm_cleanup(void);
9004     +#else
9005     +#define suspend_usm_init() do { } while(0)
9006     +#define suspend_usm_cleanup() do { } while(0)
9007     +#endif
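The message numbering in storage.h and the range checks in usm_user_rcv_msg() fit together as sketched below. This is a userspace illustration, not the kernel handler: the `DEMO_`/`demo_` names are hypothetical, the base value of 0x10 is assumed to match NETLINK_MSG_BASE, and the capability and NOFREEZE checks are omitted.

```c
#include <assert.h>
#include <errno.h>

/* Same numbering as the USM_MSG_* enum; DEMO_MSG_MAX follows 0x41. */
enum {
	DEMO_MSG_BASE = 0x10,
	DEMO_MSG_CONNECT = 0x30,
	DEMO_MSG_DISCONNECT = 0x31,
	DEMO_MSG_SUCCESS = 0x40,
	DEMO_MSG_FAILED = 0x41,
	DEMO_MSG_MAX
};

struct demo_usm {
	int message_received;	/* last reply type, 0 if none */
};

/* Returns 0 for ignored control messages, -EINVAL when out of range, else 1. */
static int demo_rcv_msg(struct demo_usm *u, int type)
{
	assert(u);
	if (type < DEMO_MSG_BASE)
		return 0;
	if (type >= DEMO_MSG_MAX)
		return -EINVAL;
	if (type == DEMO_MSG_SUCCESS || type == DEMO_MSG_FAILED)
		u->message_received = type;	/* reply the waiter looks at */
	return 1;
}
```

Because the enumerator after 0x41 gets the next value implicitly, the upper bound is 0x42, so adding a message type before the MAX entry widens the accepted range automatically.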
9008     diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
9009     new file mode 100644
9010     index 0000000..3de56ac
9011     --- /dev/null
9012     +++ b/kernel/power/suspend.c
9013     @@ -0,0 +1,1022 @@
9014     +/*
9015     + * kernel/power/suspend.c
9016     + */
9017     +/** \mainpage Suspend2.
9018     + *
9019     + * Suspend2 provides support for saving and restoring an image of
9020     + * system memory to an arbitrary storage device, either on the local computer,
9021     + * or across some network. The support is entirely OS based, so Suspend2
9022     + * works without requiring BIOS, APM or ACPI support. The vast majority of the
9023     + * code is also architecture independent, so it should be very easy to port
9024     + * the code to new architectures. Suspend includes support for SMP, 4G HighMem
9025     + * and preemption. Initramfses and initrds are also supported.
9026     + *
9027     + * Suspend2 uses a modular design, in which the method of storing the image is
9028     + * completely abstracted from the core code, as are transformations on the data
9029     + * such as compression and/or encryption (multiple 'modules' can be used to
9030     + * provide arbitrary combinations of functionality). The user interface is also
9031     + * modular, so that arbitrarily simple or complex interfaces can be used to
9032     + * provide anything from debugging information through to eye candy.
9033     + *
9034     + * \section Copyright
9035     + *
9036     + * Suspend2 is released under the GPLv2.
9037     + *
9038     + * Copyright (C) 1998-2001 Gabor Kuti <seasons@fornax.hu><BR>
9039     + * Copyright (C) 1998,2001,2002 Pavel Machek <pavel@suse.cz><BR>
9040     + * Copyright (C) 2002-2003 Florent Chabaud <fchabaud@free.fr><BR>
9041     + * Copyright (C) 2002-2007 Nigel Cunningham (nigel at suspend2 net)<BR>
9042     + *
9043     + * \section Credits
9044     + *
9045     + * Nigel would like to thank the following people for their work:
9046     + *
9047     + * Bernard Blackham <bernard@blackham.com.au><BR>
9048     + * Web page & Wiki administration, some coding. A person without whom
9049     + * Suspend would not be where it is.
9050     + *
9051     + * Michael Frank <mhf@linuxmail.org><BR>
9052     + * Extensive testing and help with improving stability. I was constantly
9053     + * amazed by the quality and quantity of Michael's help.
9054     + *
9055     + * Pavel Machek <pavel@ucw.cz><BR>
9056     + * Modifications, defectiveness pointing, being with Gabor at the very beginning,
9057     + * suspend to swap space, stop all tasks. Port to 2.4.18-ac and 2.5.17. Even
9058     + * though Pavel and I disagree on the direction suspend to disk should take, I
9059     + * appreciate the valuable work he did in helping Gabor get the concept working.
9060     + *
9061     + * ..and of course the myriads of Suspend2 users who have helped diagnose
9062     + * and fix bugs, made suggestions on how to improve the code, proofread
9063     + * documentation, and donated time and money.
9064     + *
9065     + * Thanks also to corporate sponsors:
9066     + *
9067     + * <B>Redhat.</B> Sometime employer from May 2006 (my fault, not Redhat's!).
9068     + *
9069     + * <B>Cyclades.com.</B> Nigel's employers from Dec 2004 until May 2006, who
9070     + * allowed him to work on Suspend and PM related issues on company time.
9071     + *
9072     + * <B>LinuxFund.org.</B> Sponsored Nigel's work on Suspend for four months Oct 2003
9073     + * to Jan 2004.
9074     + *
9075     + * <B>LAC Linux.</B> Donated P4 hardware that enabled development and ongoing
9076     + * maintenance of SMP and Highmem support.
9077     + *
9078     + * <B>OSDL.</B> Provided access to various hardware configurations, made occasional
9079     + * small donations to the project.
9080     + */
9081     +
9082     +#include <linux/suspend.h>
9083     +#include <linux/module.h>
9084     +#include <linux/freezer.h>
9085     +#include <linux/utsrelease.h>
9086     +#include <linux/cpu.h>
9087     +#include <linux/console.h>
9088     +#include <asm/uaccess.h>
9089     +
9090     +#include "modules.h"
9091     +#include "sysfs.h"
9092     +#include "prepare_image.h"
9093     +#include "io.h"
9094     +#include "ui.h"
9095     +#include "power_off.h"
9096     +#include "storage.h"
9097     +#include "checksum.h"
9098     +#include "cluster.h"
9099     +#include "suspend2_builtin.h"
9100     +
9101     +/*! Pageset metadata. */
9102     +struct pagedir pagedir2 = {2};
9103     +
9104     +static int get_pmsem = 0, got_pmsem;
9105     +static mm_segment_t oldfs;
9106     +static atomic_t actions_running;
9107     +static int block_dump_save;
9108     +extern int block_dump;
9109     +
9110     +int do_suspend2_step(int step);
9111     +
9112     +/*
9113     + * Basic clean-up routine.
9114     + */
9115     +void suspend_finish_anything(int suspend_or_resume)
9116     +{
9117     + if (!atomic_dec_and_test(&actions_running))
9118     + return;
9119     +
9120     + suspend_cleanup_modules(suspend_or_resume);
9121     + suspend_put_modules();
9122     + clear_suspend_state(SUSPEND_RUNNING);
9123     + set_fs(oldfs);
9124     + if (suspend_or_resume) {
9125     + block_dump = block_dump_save;
9126     + set_cpus_allowed(current, CPU_MASK_ALL);
9127     + }
9128     +}
9129     +
9130     +/*
9131     + * Basic set-up routine.
9132     + */
9133     +int suspend_start_anything(int suspend_or_resume)
9134     +{
9135     + if (atomic_add_return(1, &actions_running) != 1) {
9136     + if (suspend_or_resume) {
9137     + printk("Can't start a cycle when actions are "
9138     + "already running.\n");
9139     + atomic_dec(&actions_running);
9140     + return -EBUSY;
9141     + } else
9142     + return 0;
9143     + }
9144     +
9145     + oldfs = get_fs();
9146     + set_fs(KERNEL_DS);
9147     +
9148     + if (!suspendActiveAllocator) {
9149     + /* Be quiet if we're not trying to suspend or resume */
9150     + if (suspend_or_resume)
9151     + printk("No storage allocator is currently active. "
9152     + "Rechecking whether we can use one.\n");
9153     + suspend_attempt_to_parse_resume_device(!suspend_or_resume);
9154     + }
9155     +
9156     + set_suspend_state(SUSPEND_RUNNING);
9157     +
9158     + if (suspend_get_modules()) {
9159     + printk("Suspend2: Get modules failed!\n");
9160     + goto out_err;
9161     + }
9162     +
9163     + if (suspend_initialise_modules(suspend_or_resume)) {
9164     + printk("Suspend2: Initialise modules failed!\n");
9165     + goto out_err;
9166     + }
9167     +
9168     + if (suspend_or_resume) {
9169     + block_dump_save = block_dump;
9170     + block_dump = 0;
9171     + set_cpus_allowed(current, CPU_MASK_CPU0);
9172     + }
9173     +
9174     + return 0;
9175     +
9176     +out_err:
9177     + if (suspend_or_resume)
9178     + block_dump_save = block_dump;
9179     + suspend_finish_anything(suspend_or_resume);
9180     + return -EBUSY;
9181     +}
9182     +
9183     +/*
9184     + * Allocate & free bitmaps.
9185     + */
9186     +static int allocate_bitmaps(void)
9187     +{
9188     + if (allocate_dyn_pageflags(&pageset1_map) ||
9189     + allocate_dyn_pageflags(&pageset1_copy_map) ||
9190     + allocate_dyn_pageflags(&pageset2_map) ||
9191     + allocate_dyn_pageflags(&io_map) ||
9192     + allocate_dyn_pageflags(&page_resave_map))
9193     + return 1;
9194     +
9195     + return 0;
9196     +}
9197     +
9198     +static void free_bitmaps(void)
9199     +{
9200     + free_dyn_pageflags(&pageset1_map);
9201     + free_dyn_pageflags(&pageset1_copy_map);
9202     + free_dyn_pageflags(&pageset2_map);
9203     + free_dyn_pageflags(&io_map);
9204     + free_dyn_pageflags(&page_resave_map);
9205     +}
9206     +
9207     +static int io_MB_per_second(int read_write)
9208     +{
9209     + return (suspend_io_time[read_write][1]) ?
9210     + MB((unsigned long) suspend_io_time[read_write][0]) * HZ /
9211     + suspend_io_time[read_write][1] : 0;
9212     +}
9213     +
9214     +/* get_suspend_debug_info
9215     + * Functionality: Store debug info in a buffer.
9216     + */
9217     +#define SNPRINTF(a...) len += snprintf_used(((char *)buffer) + len, \
9218     + count - len - 1, ## a)
9219     +static int get_suspend_debug_info(const char *buffer, int count)
9220     +{
9221     + int len = 0;
9222     +
9223     + SNPRINTF("Suspend2 debugging info:\n");
9224     + SNPRINTF("- Suspend core : %s\n", SUSPEND_CORE_VERSION);
9225     + SNPRINTF("- Kernel Version : %s\n", UTS_RELEASE);
9226     + SNPRINTF("- Compiler vers. : %d.%d\n", __GNUC__, __GNUC_MINOR__);
9227     + SNPRINTF("- Attempt number : %d\n", nr_suspends);
9228     + SNPRINTF("- Parameters : %ld %ld %ld %d %d %ld\n",
9229     + suspend_result,
9230     + suspend_action,
9231     + suspend_debug_state,
9232     + suspend_default_console_level,
9233     + image_size_limit,
9234     + suspend2_poweroff_method);
9235     + SNPRINTF("- Overall expected compression percentage: %d.\n",
9236     + 100 - suspend_expected_compression_ratio());
9237     + len+= suspend_print_module_debug_info(((char *) buffer) + len,
9238     + count - len - 1);
9239     + if (suspend_io_time[0][1]) {
9240     + if ((io_MB_per_second(0) < 5) || (io_MB_per_second(1) < 5)) {
9241     + SNPRINTF("- I/O speed: Write %d KB/s",
9242     + (KB((unsigned long) suspend_io_time[0][0]) * HZ /
9243     + suspend_io_time[0][1]));
9244     + if (suspend_io_time[1][1])
9245     + SNPRINTF(", Read %d KB/s",
9246     + (KB((unsigned long) suspend_io_time[1][0]) * HZ /
9247     + suspend_io_time[1][1]));
9248     + } else {
9249     + SNPRINTF("- I/O speed: Write %d MB/s",
9250     + (MB((unsigned long) suspend_io_time[0][0]) * HZ /
9251     + suspend_io_time[0][1]));
9252     + if (suspend_io_time[1][1])
9253     + SNPRINTF(", Read %d MB/s",
9254     + (MB((unsigned long) suspend_io_time[1][0]) * HZ /
9255     + suspend_io_time[1][1]));
9256     + }
9257     + SNPRINTF(".\n");
9258     + }
9259     + else
9260     + SNPRINTF("- No I/O speed stats available.\n");
9261     + SNPRINTF("- Extra pages : %d used/%d.\n",
9262     + extra_pd1_pages_used, extra_pd1_pages_allowance);
9263     +
9264     + return len;
9265     +}
9266     +
9267     +/*
9268     + * do_cleanup
9269     + */
9270     +
9271     +static void do_cleanup(int get_debug_info)
9272     +{
9273     + int i = 0;
9274     + char *buffer = NULL;
9275     +
9276     + if (get_debug_info)
9277     + suspend_prepare_status(DONT_CLEAR_BAR, "Cleaning up...");
9278     + relink_lru_lists();
9279     +
9280     + free_checksum_pages();
9281     +
9282     + if (get_debug_info)
9283     + buffer = (char *) get_zeroed_page(GFP_ATOMIC);
9284     +
9285     + if (buffer)
9286     + i = get_suspend_debug_info(buffer, PAGE_SIZE);
9287     +
9288     + suspend_free_extra_pagedir_memory();
9289     +
9290     + pagedir1.size = pagedir2.size = 0;
9291     + set_highmem_size(pagedir1, 0);
9292     + set_highmem_size(pagedir2, 0);
9293     +
9294     + restore_avenrun();
9295     +
9296     + thaw_processes();
9297     +
9298     +#ifdef CONFIG_SUSPEND2_KEEP_IMAGE
9299     + if (test_action_state(SUSPEND_KEEP_IMAGE) &&
9300     + !test_result_state(SUSPEND_ABORTED)) {
9301     + suspend_message(SUSPEND_ANY_SECTION, SUSPEND_LOW, 1,
9302     + "Suspend2: Not invalidating the image due "
9303     + "to Keep Image being enabled.\n");
9304     + set_result_state(SUSPEND_KEPT_IMAGE);
9305     + } else
9306     +#endif
9307     + if (suspendActiveAllocator)
9308     + suspendActiveAllocator->invalidate_image();
9309     +
9310     + free_bitmaps();
9311     +
9312     + if (buffer && i) {
9313     + /* Printk can only handle 1023 bytes, including
9314     + * its level mangling. */
9315     + for (i = 0; i < 3; i++)
9316     + printk("%s", buffer + (1023 * i));
9317     + free_page((unsigned long) buffer);
9318     + }
9319     +
9320     + if (!test_action_state(SUSPEND_LATE_CPU_HOTPLUG))
9321     + enable_nonboot_cpus();
9322     + suspend_cleanup_console();
9323     +
9324     + suspend_deactivate_storage(0);
9325     +
9326     + clear_suspend_state(SUSPEND_IGNORE_LOGLEVEL);
9327     + clear_suspend_state(SUSPEND_TRYING_TO_RESUME);
9328     + clear_suspend_state(SUSPEND_NOW_RESUMING);
9329     +
9330     + if (got_pmsem) {
9331     + mutex_unlock(&pm_mutex);
9332     + got_pmsem = 0;
9333     + }
9334     +}
9335     +
9336     +static int check_still_keeping_image(void)
9337     +{
9338     + if (test_action_state(SUSPEND_KEEP_IMAGE)) {
9339     + printk("Image already stored: powering down immediately.\n");
9340     + do_suspend2_step(STEP_SUSPEND_POWERDOWN);
9341     + return 1; /* Just in case we're using S3 */
9342     + }
9343     +
9344     + printk("Invalidating previous image.\n");
9345     + suspendActiveAllocator->invalidate_image();
9346     +
9347     + return 0;
9348     +}
9349     +
9350     +static int suspend_init(void)
9351     +{
9352     + suspend_result = 0;
9353     +
9354     + printk(KERN_INFO "Suspend2: Initiating a software suspend cycle.\n");
9355     +
9356     + nr_suspends++;
9357     +
9358     + save_avenrun();
9359     +
9360     + suspend_io_time[0][0] = suspend_io_time[0][1] =
9361     + suspend_io_time[1][0] = suspend_io_time[1][1] = 0;
9362     +
9363     + if (!test_suspend_state(SUSPEND_CAN_SUSPEND) ||
9364     + allocate_bitmaps())
9365     + return 0;
9366     +
9367     + suspend_prepare_console();
9368     + if (test_action_state(SUSPEND_LATE_CPU_HOTPLUG) ||
9369     + !disable_nonboot_cpus())
9370     + return 1;
9371     +
9372     + set_result_state(SUSPEND_CPU_HOTPLUG_FAILED);
9373     + set_result_state(SUSPEND_ABORTED);
9374     + return 0;
9375     +}
9376     +
9377     +static int can_suspend(void)
9378     +{
9379     + if (get_pmsem) {
9380     + if (!mutex_trylock(&pm_mutex)) {
9381     + printk("Suspend2: Failed to obtain pm_mutex.\n");
9382     + dump_stack();
9383     + set_result_state(SUSPEND_ABORTED);
9384     + set_result_state(SUSPEND_PM_SEM);
9385     + return 0;
9386     + }
9387     + got_pmsem = 1;
9388     + }
9389     +
9390     + if (!test_suspend_state(SUSPEND_CAN_SUSPEND))
9391     + suspend_attempt_to_parse_resume_device(0);
9392     +
9393     + if (!test_suspend_state(SUSPEND_CAN_SUSPEND)) {
9394     + printk("Suspend2: Software suspend is disabled.\n"
9395     + "This may be because you haven't put something along "
9396     + "the lines of\n\nresume2=swap:/dev/hda1\n\n"
9397     + "in lilo.conf or equivalent. (Where /dev/hda1 is your "
9398     + "swap partition).\n");
9399     + set_result_state(SUSPEND_ABORTED);
9400     + if (got_pmsem) {
9401     + mutex_unlock(&pm_mutex);
9402     + got_pmsem = 0;
9403     + }
9404     + return 0;
9405     + }
9406     +
9407     + return 1;
9408     +}
9409     +
9410     +static int do_power_down(void)
9411     +{
9412     + /* If switching images fails, do normal powerdown */
9413     + if (poweroff_resume2[0])
9414     + do_suspend2_step(STEP_RESUME_ALT_IMAGE);
9415     +
9416     + suspend_cond_pause(1, "About to power down or reboot.");
9417     + suspend2_power_down();
9418     +
9419     + /* If we return, it's because we suspended to ram */
9420     + if (read_pageset2(1))
9421     + panic("Attempt to reload pagedir 2 failed. Try rebooting.");
9422     +
9423     + barrier();
9424     + mb();
9425     + do_cleanup(1);
9426     + return 0;
9427     +}
9428     +
9429     +/*
9430     + * __save_image
9431     + * Functionality : High level routine which performs the steps necessary
9432     + * to save the image after preparatory steps have been taken.
9433     + * Key Assumptions : Processes frozen, sufficient memory available, drivers
9434     + * suspended.
9435     + */
9436     +static int __save_image(void)
9437     +{
9438     + int temp_result;
9439     +
9440     + suspend_prepare_status(DONT_CLEAR_BAR, "Starting to save the image...");
9441     +
9442     + suspend_message(SUSPEND_ANY_SECTION, SUSPEND_LOW, 1,
9443     + " - Final values: %d and %d.\n",
9444     + pagedir1.size, pagedir2.size);
9445     +
9446     + suspend_cond_pause(1, "About to write pagedir2.");
9447     +
9448     + calculate_check_checksums(0);
9449     +
9450     + temp_result = write_pageset(&pagedir2);
9451     +
9452     + if (temp_result == -1 || test_result_state(SUSPEND_ABORTED))
9453     + return 1;
9454     +
9455     + suspend_cond_pause(1, "About to copy pageset 1.");
9456     +
9457     + if (test_result_state(SUSPEND_ABORTED))
9458     + return 1;
9459     +
9460     + suspend_deactivate_storage(1);
9461     +
9462     + suspend_prepare_status(DONT_CLEAR_BAR, "Doing atomic copy.");
9463     +
9464     + suspend2_in_suspend = 1;
9465     +
9466     + suspend_console();
9467     + if (device_suspend(PMSG_FREEZE)) {
9468     + set_result_state(SUSPEND_DEVICE_REFUSED);
9469     + set_result_state(SUSPEND_ABORTED);
9470     + goto ResumeConsole;
9471     + }
9472     +
9473     + if (test_action_state(SUSPEND_LATE_CPU_HOTPLUG) &&
9474     + disable_nonboot_cpus()) {
9475     + set_result_state(SUSPEND_CPU_HOTPLUG_FAILED);
9476     + set_result_state(SUSPEND_ABORTED);
9477     + } else
9478     + temp_result = suspend2_suspend();
9479     +
9480     + /* We return here at resume time too! */
9481     + if (!suspend2_in_suspend && pm_ops && pm_ops->finish &&
9482     + suspend2_poweroff_method > 3)
9483     + pm_ops->finish(suspend2_poweroff_method);
9484     +
9485     + if (test_action_state(SUSPEND_LATE_CPU_HOTPLUG))
9486     + enable_nonboot_cpus();
9487     +
9488     + device_resume();
9489     +
9490     +ResumeConsole:
9491     + resume_console();
9492     +
9493     + if (suspend_activate_storage(1))
9494     + panic("Failed to reactivate our storage.");
9495     +
9496     + if (temp_result || test_result_state(SUSPEND_ABORTED))
9497     + return 1;
9498     +
9499     + /* Resume time? */
9500     + if (!suspend2_in_suspend) {
9501     + copyback_post();
9502     + return 0;
9503     + }
9504     +
9505     + /* Nope. Suspending. So, see if we can save the image... */
9506     +
9507     + suspend_update_status(pagedir2.size,
9508     + pagedir1.size + pagedir2.size,
9509     + NULL);
9510     +
9511     + if (test_result_state(SUSPEND_ABORTED))
9512     + goto abort_reloading_pagedir_two;
9513     +
9514     + suspend_cond_pause(1, "About to write pageset1.");
9515     +
9516     + suspend_message(SUSPEND_ANY_SECTION, SUSPEND_LOW, 1,
9517     + "-- Writing pageset1\n");
9518     +
9519     + temp_result = write_pageset(&pagedir1);
9520     +
9521     + /* We didn't overwrite any memory, so no reread needs to be done. */
9522     + if (test_action_state(SUSPEND_TEST_FILTER_SPEED))
9523     + return 1;
9524     +
9525     + if (temp_result == 1 || test_result_state(SUSPEND_ABORTED))
9526     + goto abort_reloading_pagedir_two;
9527     +
9528     + suspend_cond_pause(1, "About to write header.");
9529     +
9530     + if (test_result_state(SUSPEND_ABORTED))
9531     + goto abort_reloading_pagedir_two;
9532     +
9533     + temp_result = write_image_header();
9534     +
9535     + if (test_action_state(SUSPEND_TEST_BIO))
9536     + return 1;
9537     +
9538     + if (!temp_result && !test_result_state(SUSPEND_ABORTED))
9539     + return 0;
9540     +
9541     +abort_reloading_pagedir_two:
9542     + temp_result = read_pageset2(1);
9543     +
9544     + /* If that failed, we're sunk. Panic! */
9545     + if (temp_result)
9546     + panic("Attempt to reload pagedir 2 while aborting "
9547     + "a suspend failed.");
9548     +
9549     + return 1;
9550     +}
9551     +
9552     +/*
9553     + * do_save_image
9554     + *
9555     + * Save the prepared image.
9556     + */
9557     +
9558     +static int do_save_image(void)
9559     +{
9560     + int result = __save_image();
9561     + if (!suspend2_in_suspend || result)
9562     + do_cleanup(1);
9563     + return result;
9564     +}
9565     +
9566     +
9567     +/* do_prepare_image
9568     + *
9569     + * Seek to initialise and prepare an image to be saved. On failure,
9570     + * clean up.
9571     + */
9572     +
9573     +static int do_prepare_image(void)
9574     +{
9575     + if (suspend_activate_storage(0))
9576     + return 1;
9577     +
9578     + /*
9579     + * If kept image and still keeping image and suspending to RAM, we will
9580     + * return 1 after suspending and resuming (provided the power doesn't
9581     + * run out).
9582     + */
9583     +
9584     + if (!can_suspend() ||
9585     + (test_result_state(SUSPEND_KEPT_IMAGE) &&
9586     + check_still_keeping_image()))
9587     + goto cleanup;
9588     +
9589     + if (suspend_init() && !suspend_prepare_image() &&
9590     + !test_result_state(SUSPEND_ABORTED))
9591     + return 0;
9592     +
9593     +cleanup:
9594     + do_cleanup(0);
9595     + return 1;
9596     +}
9597     +
9598     +static int do_check_can_resume(void)
9599     +{
9600     + char *buf = (char *) get_zeroed_page(GFP_KERNEL);
9601     + int result = 0;
9602     +
9603     + if (!buf)
9604     + return 0;
9605     +
9606     + /* Only interested in first byte, so throw away return code. */
9607     + image_exists_read(buf, PAGE_SIZE);
9608     +
9609     + if (buf[0] == '1')
9610     + result = 1;
9611     +
9612     + free_page((unsigned long) buf);
9613     + return result;
9614     +}
9615     +
9616     +/*
9617     + * We check if we have an image and if so we try to resume.
9618     + */
9619     +static int do_load_atomic_copy(void)
9620     +{
9621     + int read_image_result = 0;
9622     +
9623     + if (sizeof(swp_entry_t) != sizeof(long)) {
9624     + printk(KERN_WARNING "Suspend2: The size of swp_entry_t != size"
9625     + " of long. Please report this!\n");
9626     + return 1;
9627     + }
9628     +
9629     + if (!resume2_file[0])
9630     + printk(KERN_WARNING "Suspend2: "
9631     + "You need to use a resume2= command line parameter to "
9632     + "tell Suspend2 where to look for an image.\n");
9633     +
9634     + suspend_activate_storage(0);
9635     +
9636     + if (!(test_suspend_state(SUSPEND_RESUME_DEVICE_OK)) &&
9637     + !suspend_attempt_to_parse_resume_device(0)) {
9638     + /*
9639     + * Without a usable storage device we can do nothing -
9640     + * even if noresume is given
9641     + */
9642     +
9643     + if (!suspendNumAllocators)
9644     + printk(KERN_ALERT "Suspend2: "
9645     + "No storage allocators have been registered.\n");
9646     + else
9647     + printk(KERN_ALERT "Suspend2: "
9648     + "Missing or invalid storage location "
9649     + "(resume2= parameter). Please correct and "
9650     + "rerun lilo (or equivalent) before "
9651     + "suspending.\n");
9652     + suspend_deactivate_storage(0);
9653     + return 1;
9654     + }
9655     +
9656     + read_image_result = read_pageset1(); /* non fatal error ignored */
9657     +
9658     + if (test_suspend_state(SUSPEND_NORESUME_SPECIFIED)) {
9659     + printk(KERN_WARNING "Suspend2: Resuming disabled as requested.\n");
9660     + clear_suspend_state(SUSPEND_NORESUME_SPECIFIED);
9661     + }
9662     +
9663     + suspend_deactivate_storage(0);
9664     +
9665     + if (read_image_result)
9666     + return 1;
9667     +
9668     + return 0;
9669     +}
9670     +
9671     +static void prepare_restore_load_alt_image(int prepare)
9672     +{
9673     + static dyn_pageflags_t pageset1_map_save, pageset1_copy_map_save;
9674     +
9675     + if (prepare) {
9676     + pageset1_map_save = pageset1_map;
9677     + pageset1_map = NULL;
9678     + pageset1_copy_map_save = pageset1_copy_map;
9679     + pageset1_copy_map = NULL;
9680     + set_suspend_state(SUSPEND_LOADING_ALT_IMAGE);
9681     + suspend_reset_alt_image_pageset2_pfn();
9682     + } else {
9683     + if (pageset1_map)
9684     + free_dyn_pageflags(&pageset1_map);
9685     + pageset1_map = pageset1_map_save;
9686     + if (pageset1_copy_map)
9687     + free_dyn_pageflags(&pageset1_copy_map);
9688     + pageset1_copy_map = pageset1_copy_map_save;
9689     + clear_suspend_state(SUSPEND_NOW_RESUMING);
9690     + clear_suspend_state(SUSPEND_LOADING_ALT_IMAGE);
9691     + }
9692     +}
9693     +
9694     +int pre_resume_freeze(void)
9695     +{
9696     + if (!test_action_state(SUSPEND_LATE_CPU_HOTPLUG)) {
9697     + suspend_prepare_status(DONT_CLEAR_BAR, "Disable nonboot cpus.");
9698     + if (disable_nonboot_cpus()) {
9699     + set_result_state(SUSPEND_CPU_HOTPLUG_FAILED);
9700     + set_result_state(SUSPEND_ABORTED);
9701     + return 1;
9702     + }
9703     + }
9704     +
9705     + suspend_prepare_status(DONT_CLEAR_BAR, "Freeze processes.");
9706     +
9707     + if (freeze_processes()) {
9708     + printk("Some processes failed to suspend\n");
9709     + return 1;
9710     + }
9711     +
9712     + return 0;
9713     +}
9714     +
9715     +void post_resume_thaw(void)
9716     +{
9717     + thaw_processes();
9718     + if (!test_action_state(SUSPEND_LATE_CPU_HOTPLUG))
9719     + enable_nonboot_cpus();
9720     +}
9721     +
9722     +int do_suspend2_step(int step)
9723     +{
9724     + int result;
9725     +
9726     + switch (step) {
9727     + case STEP_SUSPEND_PREPARE_IMAGE:
9728     + return do_prepare_image();
9729     + case STEP_SUSPEND_SAVE_IMAGE:
9730     + return do_save_image();
9731     + case STEP_SUSPEND_POWERDOWN:
9732     + return do_power_down();
9733     + case STEP_RESUME_CAN_RESUME:
9734     + return do_check_can_resume();
9735     + case STEP_RESUME_LOAD_PS1:
9736     + return do_load_atomic_copy();
9737     + case STEP_RESUME_DO_RESTORE:
9738     + /*
9739     + * If we succeed, this doesn't return.
9740     + * Instead, we return from do_save_image() in the
9741     + * suspended kernel.
9742     + */
9743     + result = suspend_atomic_restore();
9744     + if (result)
9745     + post_resume_thaw();
9746     + return result;
9747     + case STEP_RESUME_ALT_IMAGE:
9748     + printk("Trying to resume alternate image.\n");
9749     + suspend2_in_suspend = 0;
9750     + save_restore_resume2(SAVE, NOQUIET);
9751     + prepare_restore_load_alt_image(1);
9752     + if (!do_check_can_resume()) {
9753     + printk("Nothing to resume from.\n");
9754     + goto out;
9755     + }
9756     + if (!do_load_atomic_copy()) {
9757     + printk("Failed to load image.\n");
9758     + suspend_atomic_restore();
9759     + }
9760     +out:
9761     + prepare_restore_load_alt_image(0);
9762     + save_restore_resume2(RESTORE, NOQUIET);
9763     + break;
9764     + }
9765     +
9766     + return 0;
9767     +}
9768     +
9769     +/* -- Functions for kickstarting a suspend or resume --- */
9770     +
9771     +/*
9772     + * Check if we have an image and if so try to resume.
9773     + */
9774     +void __suspend2_try_resume(void)
9775     +{
9776     + set_suspend_state(SUSPEND_TRYING_TO_RESUME);
9777     + resume_attempted = 1;
9778     +
9779     + if (do_suspend2_step(STEP_RESUME_CAN_RESUME) &&
9780     + !do_suspend2_step(STEP_RESUME_LOAD_PS1))
9781     + do_suspend2_step(STEP_RESUME_DO_RESTORE);
9782     +
9783     + do_cleanup(0);
9784     +
9785     + clear_suspend_state(SUSPEND_IGNORE_LOGLEVEL);
9786     + clear_suspend_state(SUSPEND_TRYING_TO_RESUME);
9787     + clear_suspend_state(SUSPEND_NOW_RESUMING);
9788     +}
9789     +
9790     +/* Wrapper for when called from init/do_mounts.c */
9791     +void _suspend2_try_resume(void)
9792     +{
9793     + resume_attempted = 1;
9794     +
9795     + if (suspend_start_anything(SYSFS_RESUMING))
9796     + return;
9797     +
9798     + /* Unlock will be done in do_cleanup */
9799     + mutex_lock(&pm_mutex);
9800     + got_pmsem = 1;
9801     +
9802     + __suspend2_try_resume();
9803     +
9804     + /*
9805     + * For initramfs, we have to clear the boot time
9806     + * flag after trying to resume
9807     + */
9808     + clear_suspend_state(SUSPEND_BOOT_TIME);
9809     + suspend_finish_anything(SYSFS_RESUMING);
9810     +}
9811     +
9812     +/*
9813     + * _suspend2_try_suspend
9814     + * Functionality : Prepare an image, save it and then power down.
9815     + * Called From : drivers/acpi/sleep/main.c
9816     + * kernel/reboot.c
9817     + */
9818     +int _suspend2_try_suspend(int have_pmsem)
9819     +{
9820     + int result = 0, sys_power_disk = 0;
9821     +
9822     + if (!atomic_read(&actions_running)) {
9823     + /* Came in via /sys/power/disk */
9824     + if (suspend_start_anything(SYSFS_SUSPENDING))
9825     + return -EBUSY;
9826     + sys_power_disk = 1;
9827     + }
9828     +
9829     + get_pmsem = !have_pmsem;
9830     +
9831     + if (strlen(poweroff_resume2)) {
9832     + attempt_to_parse_po_resume_device2();
9833     +
9834     + if (!strlen(poweroff_resume2)) {
9835     + printk("Poweroff resume2 now invalid. Aborting.\n");
9836     + goto out;
9837     + }
9838     + }
9839     +
9840     + if ((result = do_suspend2_step(STEP_SUSPEND_PREPARE_IMAGE)))
9841     + goto out;
9842     +
9843     + if (test_action_state(SUSPEND_FREEZER_TEST)) {
9844     + do_cleanup(0);
9845     + goto out;
9846     + }
9847     +
9848     + if ((result = do_suspend2_step(STEP_SUSPEND_SAVE_IMAGE)))
9849     + goto out;
9850     +
9851     + /* This code runs at resume time too! */
9852     + if (suspend2_in_suspend)
9853     + result = do_suspend2_step(STEP_SUSPEND_POWERDOWN);
9854     +out:
9855     + if (sys_power_disk)
9856     + suspend_finish_anything(SYSFS_SUSPENDING);
9857     + return result;
9858     +}
9859     +
9860     +/*
9861     + * This array contains entries that are automatically registered at
9862     + * boot. Modules and the console code register their own entries separately.
9863     + */
9864     +static struct suspend_sysfs_data sysfs_params[] = {
9865     + { SUSPEND2_ATTR("extra_pages_allowance", SYSFS_RW),
9866     + SYSFS_INT(&extra_pd1_pages_allowance, 0, INT_MAX, 0)
9867     + },
9868     +
9869     + { SUSPEND2_ATTR("image_exists", SYSFS_RW),
9870     + SYSFS_CUSTOM(image_exists_read, image_exists_write,
9871     + SYSFS_NEEDS_SM_FOR_BOTH)
9872     + },
9873     +
9874     + { SUSPEND2_ATTR("resume2", SYSFS_RW),
9875     + SYSFS_STRING(resume2_file, 255, SYSFS_NEEDS_SM_FOR_WRITE),
9876     + .write_side_effect = attempt_to_parse_resume_device2,
9877     + },
9878     +
9879     + { SUSPEND2_ATTR("poweroff_resume2", SYSFS_RW),
9880     + SYSFS_STRING(poweroff_resume2, 255, SYSFS_NEEDS_SM_FOR_WRITE),
9881     + .write_side_effect = attempt_to_parse_po_resume_device2,
9882     + },
9883     + { SUSPEND2_ATTR("debug_info", SYSFS_READONLY),
9884     + SYSFS_CUSTOM(get_suspend_debug_info, NULL, 0)
9885     + },
9886     +
9887     + { SUSPEND2_ATTR("ignore_rootfs", SYSFS_RW),
9888     + SYSFS_BIT(&suspend_action, SUSPEND_IGNORE_ROOTFS, 0)
9889     + },
9890     +
9891     + { SUSPEND2_ATTR("image_size_limit", SYSFS_RW),
9892     + SYSFS_INT(&image_size_limit, -2, INT_MAX, 0)
9893     + },
9894     +
9895     + { SUSPEND2_ATTR("last_result", SYSFS_RW),
9896     + SYSFS_UL(&suspend_result, 0, 0, 0)
9897     + },
9898     +
9899     + { SUSPEND2_ATTR("no_multithreaded_io", SYSFS_RW),
9900     + SYSFS_BIT(&suspend_action, SUSPEND_NO_MULTITHREADED_IO, 0)
9901     + },
9902     +
9903     + { SUSPEND2_ATTR("full_pageset2", SYSFS_RW),
9904     + SYSFS_BIT(&suspend_action, SUSPEND_PAGESET2_FULL, 0)
9905     + },
9906     +
9907     + { SUSPEND2_ATTR("reboot", SYSFS_RW),
9908     + SYSFS_BIT(&suspend_action, SUSPEND_REBOOT, 0)
9909     + },
9910     +
9911     +#ifdef CONFIG_SOFTWARE_SUSPEND
9912     + { SUSPEND2_ATTR("replace_swsusp", SYSFS_RW),
9913     + SYSFS_BIT(&suspend_action, SUSPEND_REPLACE_SWSUSP, 0)
9914     + },
9915     +#endif
9916     +
9917     + { SUSPEND2_ATTR("resume_commandline", SYSFS_RW),
9918     + SYSFS_STRING(suspend2_nosave_commandline, COMMAND_LINE_SIZE, 0)
9919     + },
9920     +
9921     + { SUSPEND2_ATTR("version", SYSFS_READONLY),
9922     + SYSFS_STRING(SUSPEND_CORE_VERSION, 0, 0)
9923     + },
9924     +
9925     + { SUSPEND2_ATTR("no_load_direct", SYSFS_RW),
9926     + SYSFS_BIT(&suspend_action, SUSPEND_NO_DIRECT_LOAD, 0)
9927     + },
9928     +
9929     + { SUSPEND2_ATTR("freezer_test", SYSFS_RW),
9930     + SYSFS_BIT(&suspend_action, SUSPEND_FREEZER_TEST, 0)
9931     + },
9932     +
9933     + { SUSPEND2_ATTR("test_bio", SYSFS_RW),
9934     + SYSFS_BIT(&suspend_action, SUSPEND_TEST_BIO, 0)
9935     + },
9936     +
9937     + { SUSPEND2_ATTR("test_filter_speed", SYSFS_RW),
9938     + SYSFS_BIT(&suspend_action, SUSPEND_TEST_FILTER_SPEED, 0)
9939     + },
9940     +
9941     + { SUSPEND2_ATTR("slow", SYSFS_RW),
9942     + SYSFS_BIT(&suspend_action, SUSPEND_SLOW, 0)
9943     + },
9944     +
9945     + { SUSPEND2_ATTR("no_pageset2", SYSFS_RW),
9946     + SYSFS_BIT(&suspend_action, SUSPEND_NO_PAGESET2, 0)
9947     + },
9948     +
9949     + { SUSPEND2_ATTR("late_cpu_hotplug", SYSFS_RW),
9950     + SYSFS_BIT(&suspend_action, SUSPEND_LATE_CPU_HOTPLUG, 0)
9951     + },
9952     +
9953     +#if defined(CONFIG_ACPI)
9954     + { SUSPEND2_ATTR("powerdown_method", SYSFS_RW),
9955     + SYSFS_UL(&suspend2_poweroff_method, 0, 5, 0)
9956     + },
9957     +#endif
9958     +
9959     +#ifdef CONFIG_SUSPEND2_KEEP_IMAGE
9960     + { SUSPEND2_ATTR("keep_image", SYSFS_RW),
9961     + SYSFS_BIT(&suspend_action, SUSPEND_KEEP_IMAGE, 0)
9962     + },
9963     +#endif
9964     +};
9965     +
9966     +struct suspend2_core_fns my_fns = {
9967     + .get_nonconflicting_page = __suspend_get_nonconflicting_page,
9968     + .post_context_save = __suspend_post_context_save,
9969     + .try_suspend = _suspend2_try_suspend,
9970     + .try_resume = _suspend2_try_resume,
9971     +};
9972     +
9973     +static __init int core_load(void)
9974     +{
9975     + int i,
9976     + numfiles = sizeof(sysfs_params) / sizeof(struct suspend_sysfs_data);
9977     +
9978     + printk("Suspend v" SUSPEND_CORE_VERSION "\n");
9979     +
9980     + if (s2_sysfs_init())
9981     + return 1;
9982     +
9983     + for (i = 0; i < numfiles; i++)
9984     + suspend_register_sysfs_file(&suspend2_subsys.kset.kobj,
9985     + &sysfs_params[i]);
9986     +
9987     + s2_core_fns = &my_fns;
9988     +
9989     + if (s2_checksum_init())
9990     + return 1;
9991     + if (s2_cluster_init())
9992     + return 1;
9993     + if (s2_usm_init())
9994     + return 1;
9995     + if (s2_ui_init())
9996     + return 1;
9997     +
9998     +#ifdef CONFIG_SOFTWARE_SUSPEND
9999     + /* Overriding resume2= with resume=? */
10000     + if (test_action_state(SUSPEND_REPLACE_SWSUSP) && resume_file[0])
10001     + strncpy(resume2_file, resume_file, 256);
10002     +#endif
10003     +
10004     + return 0;
10005     +}
10006     +
10007     +#ifdef MODULE
10008     +static __exit void core_unload(void)
10009     +{
10010     + int i,
10011     + numfiles = sizeof(sysfs_params) / sizeof(struct suspend_sysfs_data);
10012     +
10013     + s2_ui_exit();
10014     + s2_checksum_exit();
10015     + s2_cluster_exit();
10016     + s2_usm_exit();
10017     +
10018     + for (i = 0; i < numfiles; i++)
10019     + suspend_unregister_sysfs_file(&suspend2_subsys.kset.kobj,
10020     + &sysfs_params[i]);
10021     +
10022     + s2_core_fns = NULL;
10023     +
10024     + s2_sysfs_exit();
10025     +}
10026     +MODULE_LICENSE("GPL");
10027     +module_init(core_load);
10028     +module_exit(core_unload);
10029     +#else
10030     +late_initcall(core_load);
10031     +#endif
10032     +
10033     +#ifdef CONFIG_SUSPEND2_EXPORTS
10034     +EXPORT_SYMBOL_GPL(pagedir2);
10035     +#endif
10036     diff --git a/kernel/power/suspend.h b/kernel/power/suspend.h
10037     new file mode 100644
10038     index 0000000..81e752d
10039     --- /dev/null
10040     +++ b/kernel/power/suspend.h
10041     @@ -0,0 +1,182 @@
10042     +/*
10043     + * kernel/power/suspend.h
10044     + *
10045     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
10046     + *
10047     + * This file is released under the GPLv2.
10048     + *
10049     + * It contains declarations used throughout swsusp.
10050     + *
10051     + */
10052     +
10053     +#ifndef KERNEL_POWER_SUSPEND_H
10054     +#define KERNEL_POWER_SUSPEND_H
10055     +
10056     +#include <linux/delay.h>
10057     +#include <linux/bootmem.h>
10058     +#include <linux/suspend.h>
10059     +#include <linux/dyn_pageflags.h>
10060     +#include <asm/setup.h>
10061     +#include "pageflags.h"
10062     +
10063     +#define SUSPEND_CORE_VERSION "2.2.9.17"
10064     +
10065     +/* == Action states == */
10066     +
10067     +enum {
10068     + SUSPEND_REBOOT,
10069     + SUSPEND_PAUSE,
10070     + SUSPEND_SLOW,
10071     + SUSPEND_LOGALL,
10072     + SUSPEND_CAN_CANCEL,
10073     + SUSPEND_KEEP_IMAGE,
10074     + SUSPEND_FREEZER_TEST,
10075     + SUSPEND_SINGLESTEP,
10076     + SUSPEND_PAUSE_NEAR_PAGESET_END,
10077     + SUSPEND_TEST_FILTER_SPEED,
10078     + SUSPEND_TEST_BIO,
10079     + SUSPEND_NO_PAGESET2,
10080     + SUSPEND_PM_PREPARE_CONSOLE,
10081     + SUSPEND_IGNORE_ROOTFS,
10082     + SUSPEND_REPLACE_SWSUSP,
10083     + SUSPEND_RETRY_RESUME,
10084     + SUSPEND_PAGESET2_FULL,
10085     + SUSPEND_ABORT_ON_RESAVE_NEEDED,
10086     + SUSPEND_NO_MULTITHREADED_IO,
10087     + SUSPEND_NO_DIRECT_LOAD,
10088     + SUSPEND_LATE_CPU_HOTPLUG,
10089     +};
10090     +
10091     +extern unsigned long suspend_action;
10092     +
10093     +#define clear_action_state(bit) (test_and_clear_bit(bit, &suspend_action))
10094     +#define test_action_state(bit) (test_bit(bit, &suspend_action))
10095     +
10096     +/* == Result states == */
10097     +
10098     +enum {
10099     + SUSPEND_ABORTED,
10100     + SUSPEND_ABORT_REQUESTED,
10101     + SUSPEND_NOSTORAGE_AVAILABLE,
10102     + SUSPEND_INSUFFICIENT_STORAGE,
10103     + SUSPEND_FREEZING_FAILED,
10104     + SUSPEND_UNEXPECTED_ALLOC,
10105     + SUSPEND_KEPT_IMAGE,
10106     + SUSPEND_WOULD_EAT_MEMORY,
10107     + SUSPEND_UNABLE_TO_FREE_ENOUGH_MEMORY,
10108     + SUSPEND_ENCRYPTION_SETUP_FAILED,
10109     + SUSPEND_PM_SEM,
10110     + SUSPEND_DEVICE_REFUSED,
10111     + SUSPEND_EXTRA_PAGES_ALLOW_TOO_SMALL,
10112     + SUSPEND_UNABLE_TO_PREPARE_IMAGE,
10113     + SUSPEND_FAILED_MODULE_INIT,
10114     + SUSPEND_FAILED_MODULE_CLEANUP,
10115     + SUSPEND_FAILED_IO,
10116     + SUSPEND_OUT_OF_MEMORY,
10117     + SUSPEND_IMAGE_ERROR,
10118     + SUSPEND_PLATFORM_PREP_FAILED,
10119     + SUSPEND_CPU_HOTPLUG_FAILED,
10120     +};
10121     +
10122     +extern unsigned long suspend_result;
10123     +
10124     +#define set_result_state(bit) (test_and_set_bit(bit, &suspend_result))
10125     +#define clear_result_state(bit) (test_and_clear_bit(bit, &suspend_result))
10126     +#define test_result_state(bit) (test_bit(bit, &suspend_result))
10127     +
10128     +/* == Debug sections and levels == */
10129     +
10130     +/* debugging levels. */
10131     +enum {
10132     + SUSPEND_STATUS = 0,
10133     + SUSPEND_ERROR = 2,
10134     + SUSPEND_LOW,
10135     + SUSPEND_MEDIUM,
10136     + SUSPEND_HIGH,
10137     + SUSPEND_VERBOSE,
10138     +};
10139     +
10140     +enum {
10141     + SUSPEND_ANY_SECTION,
10142     + SUSPEND_EAT_MEMORY,
10143     + SUSPEND_IO,
10144     + SUSPEND_HEADER,
10145     + SUSPEND_WRITER,
10146     + SUSPEND_MEMORY,
10147     +};
10148     +
10149     +extern unsigned long suspend_debug_state;
10150     +
10151     +#define set_debug_state(bit) (test_and_set_bit(bit, &suspend_debug_state))
10152     +#define clear_debug_state(bit) (test_and_clear_bit(bit, &suspend_debug_state))
10153     +#define test_debug_state(bit) (test_bit(bit, &suspend_debug_state))
10154     +
10155     +/* == Steps in suspending == */
10156     +
10157     +enum {
10158     + STEP_SUSPEND_PREPARE_IMAGE,
10159     + STEP_SUSPEND_SAVE_IMAGE,
10160     + STEP_SUSPEND_POWERDOWN,
10161     + STEP_RESUME_CAN_RESUME,
10162     + STEP_RESUME_LOAD_PS1,
10163     + STEP_RESUME_DO_RESTORE,
10164     + STEP_RESUME_READ_PS2,
10165     + STEP_RESUME_GO,
10166     + STEP_RESUME_ALT_IMAGE,
10167     +};
10168     +
10169     +/* == Suspend states ==
10170     + (see also include/linux/suspend.h) */
10171     +
10172     +#define get_suspend_state() (suspend_state)
10173     +#define restore_suspend_state(saved_state) \
10174     + do { suspend_state = saved_state; } while (0)
10175     +
10176     +/* == Module support == */
10177     +
10178     +struct suspend2_core_fns {
10179     + int (*post_context_save)(void);
10180     + unsigned long (*get_nonconflicting_page)(void);
10181     + int (*try_suspend)(int have_pmsem);
10182     + void (*try_resume)(void);
10183     +};
10184     +
10185     +extern struct suspend2_core_fns *s2_core_fns;
10186     +
10187     +/* == All else == */
10188     +#define KB(x) ((x) << (PAGE_SHIFT - 10)) /* pages to kilobytes */
10189     +#define MB(x) ((x) >> (20 - PAGE_SHIFT)) /* pages to megabytes */
10190     +
10191     +extern int suspend_start_anything(int suspend_or_resume);
10192     +extern void suspend_finish_anything(int suspend_or_resume);
10193     +
10194     +extern int save_image_part1(void);
10195     +extern int suspend_atomic_restore(void);
10196     +
10197     +extern int _suspend2_try_suspend(int have_pmsem);
10198     +extern void __suspend2_try_resume(void);
10199     +
10200     +extern int __suspend_post_context_save(void);
10201     +
10202     +extern unsigned int nr_suspends;
10203     +extern char resume2_file[256];
10204     +extern char poweroff_resume2[256];
10205     +
10206     +extern void copyback_post(void);
10207     +extern int suspend2_suspend(void);
10208     +extern int extra_pd1_pages_used;
10209     +
10210     +extern int suspend_io_time[2][2];
10211     +
10212     +#define SECTOR_SIZE 512
10213     +
10214     +extern int suspend_early_boot_message
10215     + (int can_erase_image, int default_answer, char *warning_reason, ...);
10216     +
10217     +static inline int load_direct(struct page *page)
10218     +{
10219     + return test_action_state(SUSPEND_NO_DIRECT_LOAD) ? 0 : PagePageset1Copy(page);
10220     +}
10221     +
10222     +extern int pre_resume_freeze(void);
10223     +#endif
10224     diff --git a/kernel/power/suspend2_builtin.c b/kernel/power/suspend2_builtin.c
10225     new file mode 100644
10226     index 0000000..15fe301
10227     --- /dev/null
10228     +++ b/kernel/power/suspend2_builtin.c
10229     @@ -0,0 +1,287 @@
10230     +/*
10231     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
10232     + *
10233     + * This file is released under the GPLv2.
10234     + */
10235     +#include <linux/module.h>
10236     +#include <linux/resume-trace.h>
10237     +#include <linux/syscalls.h>
10238     +#include <linux/kernel.h>
10239     +#include <linux/swap.h>
10240     +#include <linux/syscalls.h>
10241     +#include <linux/bio.h>
10242     +#include <linux/root_dev.h>
10243     +#include <linux/freezer.h>
10244     +#include <linux/reboot.h>
10245     +#include <linux/writeback.h>
10246     +#include <linux/tty.h>
10247     +#include <linux/crypto.h>
10248     +#include <linux/cpu.h>
10249     +#include <linux/dyn_pageflags.h>
10250     +#include "io.h"
10251     +#include "suspend.h"
10252     +#include "extent.h"
10253     +#include "block_io.h"
10254     +#include "netlink.h"
10255     +#include "prepare_image.h"
10256     +#include "ui.h"
10257     +#include "sysfs.h"
10258     +#include "pagedir.h"
10259     +#include "modules.h"
10260     +#include "suspend2_builtin.h"
10261     +
10262     +#ifdef CONFIG_SUSPEND2_CORE_EXPORTS
10263     +#ifdef CONFIG_SOFTWARE_SUSPEND
10264     +EXPORT_SYMBOL_GPL(resume_file);
10265     +#endif
10266     +
10267     +EXPORT_SYMBOL_GPL(max_pfn);
10268     +EXPORT_SYMBOL_GPL(free_dyn_pageflags);
10269     +EXPORT_SYMBOL_GPL(clear_dynpageflag);
10270     +EXPORT_SYMBOL_GPL(test_dynpageflag);
10271     +EXPORT_SYMBOL_GPL(set_dynpageflag);
10272     +EXPORT_SYMBOL_GPL(get_next_bit_on);
10273     +EXPORT_SYMBOL_GPL(allocate_dyn_pageflags);
10274     +EXPORT_SYMBOL_GPL(clear_dyn_pageflags);
10275     +
10276     +#ifdef CONFIG_X86_64
10277     +EXPORT_SYMBOL_GPL(restore_processor_state);
10278     +EXPORT_SYMBOL_GPL(save_processor_state);
10279     +#endif
10280     +
10281     +EXPORT_SYMBOL_GPL(kernel_shutdown_prepare);
10282     +EXPORT_SYMBOL_GPL(drop_pagecache);
10283     +EXPORT_SYMBOL_GPL(restore_pblist);
10284     +EXPORT_SYMBOL_GPL(pm_mutex);
10285     +EXPORT_SYMBOL_GPL(pm_restore_console);
10286     +EXPORT_SYMBOL_GPL(super_blocks);
10287     +EXPORT_SYMBOL_GPL(next_zone);
10288     +
10289     +EXPORT_SYMBOL_GPL(freeze_processes);
10290     +EXPORT_SYMBOL_GPL(thaw_processes);
10291     +EXPORT_SYMBOL_GPL(thaw_kernel_threads);
10292     +EXPORT_SYMBOL_GPL(shrink_all_memory);
10293     +EXPORT_SYMBOL_GPL(shrink_one_zone);
10294     +EXPORT_SYMBOL_GPL(saveable_page);
10295     +EXPORT_SYMBOL_GPL(swsusp_arch_suspend);
10296     +EXPORT_SYMBOL_GPL(swsusp_arch_resume);
10297     +EXPORT_SYMBOL_GPL(pm_ops);
10298     +EXPORT_SYMBOL_GPL(pm_prepare_console);
10299     +EXPORT_SYMBOL_GPL(follow_page);
10300     +EXPORT_SYMBOL_GPL(machine_halt);
10301     +EXPORT_SYMBOL_GPL(block_dump);
10302     +EXPORT_SYMBOL_GPL(unlink_lru_lists);
10303     +EXPORT_SYMBOL_GPL(relink_lru_lists);
10304     +EXPORT_SYMBOL_GPL(power_subsys);
10305     +EXPORT_SYMBOL_GPL(machine_power_off);
10306     +EXPORT_SYMBOL_GPL(suspend_enter);
10307     +EXPORT_SYMBOL_GPL(first_online_pgdat);
10308     +EXPORT_SYMBOL_GPL(next_online_pgdat);
10309     +EXPORT_SYMBOL_GPL(machine_restart);
10310     +EXPORT_SYMBOL_GPL(saved_command_line);
10311     +EXPORT_SYMBOL_GPL(tasklist_lock);
10312     +#ifdef CONFIG_SUSPEND_SMP
10313     +EXPORT_SYMBOL_GPL(disable_nonboot_cpus);
10314     +EXPORT_SYMBOL_GPL(enable_nonboot_cpus);
10315     +#endif
10316     +#endif
10317     +
10318     +#ifdef CONFIG_SUSPEND2_USERUI_EXPORTS
10319     +EXPORT_SYMBOL_GPL(kmsg_redirect);
10320     +EXPORT_SYMBOL_GPL(console_printk);
10321     +#ifndef CONFIG_COMPAT
10322     +EXPORT_SYMBOL_GPL(sys_ioctl);
10323     +#endif
10324     +#endif
10325     +
10326     +#ifdef CONFIG_SUSPEND2_SWAP_EXPORTS /* Suspend swap specific */
10327     +EXPORT_SYMBOL_GPL(sys_swapon);
10328     +EXPORT_SYMBOL_GPL(sys_swapoff);
10329     +EXPORT_SYMBOL_GPL(si_swapinfo);
10330     +EXPORT_SYMBOL_GPL(map_swap_page);
10331     +EXPORT_SYMBOL_GPL(get_swap_page);
10332     +EXPORT_SYMBOL_GPL(swap_free);
10333     +EXPORT_SYMBOL_GPL(get_swap_info_struct);
10334     +#endif
10335     +
10336     +#ifdef CONFIG_SUSPEND2_FILE_EXPORTS
10337     +/* Suspend_file specific */
10338     +extern char * __initdata root_device_name;
10339     +
10340     +EXPORT_SYMBOL_GPL(ROOT_DEV);
10341     +EXPORT_SYMBOL_GPL(root_device_name);
10342     +EXPORT_SYMBOL_GPL(sys_unlink);
10343     +EXPORT_SYMBOL_GPL(sys_mknod);
10344     +#endif
10345     +
10346     +/* Swap or file */
10347     +#if defined(CONFIG_SUSPEND2_FILE_EXPORTS) || defined(CONFIG_SUSPEND2_SWAP_EXPORTS)
10348     +EXPORT_SYMBOL_GPL(bio_set_pages_dirty);
10349     +EXPORT_SYMBOL_GPL(name_to_dev_t);
10350     +#endif
10351     +
10352     +#if defined(CONFIG_SUSPEND2_EXPORTS) || defined(CONFIG_SUSPEND2_CORE_EXPORTS)
10353     +EXPORT_SYMBOL_GPL(snprintf_used);
10354     +#endif
10355     +struct suspend2_core_fns *s2_core_fns;
10356     +EXPORT_SYMBOL_GPL(s2_core_fns);
10357     +
10358     +dyn_pageflags_t pageset1_map;
10359     +dyn_pageflags_t pageset1_copy_map;
10360     +EXPORT_SYMBOL_GPL(pageset1_map);
10361     +EXPORT_SYMBOL_GPL(pageset1_copy_map);
10362     +
10363     +unsigned long suspend_result = 0;
10364     +unsigned long suspend_debug_state = 0;
10365     +int suspend_io_time[2][2];
10366     +struct pagedir pagedir1 = {1};
10367     +
10368     +EXPORT_SYMBOL_GPL(suspend_io_time);
10369     +EXPORT_SYMBOL_GPL(suspend_debug_state);
10370     +EXPORT_SYMBOL_GPL(suspend_result);
10371     +EXPORT_SYMBOL_GPL(pagedir1);
10372     +
10373     +unsigned long suspend_get_nonconflicting_page(void)
10374     +{
10375     + return s2_core_fns->get_nonconflicting_page();
10376     +}
10377     +
10378     +int suspend_post_context_save(void)
10379     +{
10380     + return s2_core_fns->post_context_save();
10381     +}
10382     +
10383     +int suspend2_try_suspend(int have_pmsem)
10384     +{
10385     + if (!s2_core_fns)
10386     + return -ENODEV;
10387     +
10388     + return s2_core_fns->try_suspend(have_pmsem);
10389     +}
10390     +
10391     +void suspend2_try_resume(void)
10392     +{
10393     + if (s2_core_fns)
10394     + s2_core_fns->try_resume();
10395     +}
10396     +
10397     +int suspend2_lowlevel_builtin(void)
10398     +{
10399     + int error = 0;
10400     +
10401     + save_processor_state();
10402     + if ((error = swsusp_arch_suspend()))
10403     + printk(KERN_ERR "Error %d suspending\n", error);
10404     + /* Control flow reappears here on resume */
10405     + restore_processor_state();
10406     +
10407     + return error;
10408     +}
10409     +
10410     +EXPORT_SYMBOL_GPL(suspend2_lowlevel_builtin);
10411     +
10412     +unsigned long suspend_compress_bytes_in, suspend_compress_bytes_out;
10413     +EXPORT_SYMBOL_GPL(suspend_compress_bytes_in);
10414     +EXPORT_SYMBOL_GPL(suspend_compress_bytes_out);
10415     +
10416     +#ifdef CONFIG_SUSPEND2_REPLACE_SWSUSP
10417     +unsigned long suspend_action = (1 << SUSPEND_REPLACE_SWSUSP) | (1 << SUSPEND_PAGESET2_FULL);
10418     +#else
10419     +unsigned long suspend_action = 1 << SUSPEND_PAGESET2_FULL;
10420     +#endif
10421     +EXPORT_SYMBOL_GPL(suspend_action);
10422     +
10423     +unsigned long suspend_state = ((1 << SUSPEND_BOOT_TIME) |
10424     + (1 << SUSPEND_IGNORE_LOGLEVEL) |
10425     + (1 << SUSPEND_IO_STOPPED));
10426     +EXPORT_SYMBOL_GPL(suspend_state);
10427     +
10428     +/* The number of suspends we have started (some may have been cancelled) */
10429     +unsigned int nr_suspends;
10430     +EXPORT_SYMBOL_GPL(nr_suspends);
10431     +
10432     +char resume2_file[256] = CONFIG_SUSPEND2_DEFAULT_RESUME2;
10433     +EXPORT_SYMBOL_GPL(resume2_file);
10434     +
10435     +int suspend2_running = 0;
10436     +EXPORT_SYMBOL_GPL(suspend2_running);
10437     +
10438     +int suspend2_in_suspend __nosavedata;
10439     +EXPORT_SYMBOL_GPL(suspend2_in_suspend);
10440     +
10441     +unsigned long suspend2_nosave_state1 __nosavedata = 0;
10442     +unsigned long suspend2_nosave_state2 __nosavedata = 0;
10443     +int suspend2_nosave_state3 __nosavedata = 0;
10444     +int suspend2_nosave_io_speed[2][2] __nosavedata;
10445     +__nosavedata char suspend2_nosave_commandline[COMMAND_LINE_SIZE];
10446     +
10447     +__nosavedata struct pbe *restore_highmem_pblist;
10448     +
10449     +#ifdef CONFIG_SUSPEND2_CORE_EXPORTS
10450     +#ifdef CONFIG_HIGHMEM
10451     +EXPORT_SYMBOL_GPL(nr_free_highpages);
10452     +EXPORT_SYMBOL_GPL(saveable_highmem_page);
10453     +EXPORT_SYMBOL_GPL(restore_highmem_pblist);
10454     +#endif
10455     +
10456     +EXPORT_SYMBOL_GPL(suspend2_nosave_state1);
10457     +EXPORT_SYMBOL_GPL(suspend2_nosave_state2);
10458     +EXPORT_SYMBOL_GPL(suspend2_nosave_state3);
10459     +EXPORT_SYMBOL_GPL(suspend2_nosave_io_speed);
10460     +EXPORT_SYMBOL_GPL(suspend2_nosave_commandline);
10461     +#endif
10462     +
10463     +/* -- Commandline Parameter Handling ---
10464     + *
10465     + * Resume setup: obtain the storage device.
10466     + */
10467     +static int __init resume2_setup(char *str)
10468     +{
10469     + if (!*str)
10470     + return 0;
10471     +
10472     + strncpy(resume2_file, str, 255);
10473     + return 0;
10474     +}
10475     +
10476     +/*
10477     + * Allow the user to specify that we should ignore any image found and
10478     + * invalidate the image if necessary. This is equivalent to running
10479     + * the task queue and a sync and then turning off the power. The same
10480     + * precautions should be taken: fsck if you're not journalled.
10481     + */
10482     +static int __init noresume2_setup(char *str)
10483     +{
10484     + set_suspend_state(SUSPEND_NORESUME_SPECIFIED);
10485     + return 0;
10486     +}
10487     +
10488     +static int __init suspend_retry_resume_setup(char *str)
10489     +{
10490     + set_suspend_state(SUSPEND_RETRY_RESUME);
10491     + return 0;
10492     +}
10493     +
10494     +#ifndef CONFIG_SOFTWARE_SUSPEND
10495     +static int __init resume_setup(char *str)
10496     +{
10497     + if (!*str)
10498     + return 0;
10499     +
10500     + strncpy(resume2_file, str, 255);
10501     + return 0;
10502     +}
10503     +
10504     +static int __init noresume_setup(char *str)
10505     +{
10506     + set_suspend_state(SUSPEND_NORESUME_SPECIFIED);
10507     + return 0;
10508     +}
10509     +__setup("noresume", noresume_setup);
10510     +__setup("resume=", resume_setup);
10511     +#endif
10512     +
10513     +__setup("noresume2", noresume2_setup);
10514     +__setup("resume2=", resume2_setup);
10515     +__setup("suspend_retry_resume", suspend_retry_resume_setup);
10516     +
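Note on the resume2= handler above: the following is a minimal userspace sketch of the same bounded-copy pattern, not the kernel code itself. The names resume2_file_demo and resume2_setup_demo, the RESUME2_MAX macro, and the default value "swap:/dev/hda2" are assumptions made for illustration only; the kernel version is registered with __setup() and takes its default from CONFIG_SUSPEND2_DEFAULT_RESUME2.

```c
#include <assert.h>
#include <string.h>

#define RESUME2_MAX 256

/* Hypothetical stand-in for resume2_file; the default value is made up. */
static char resume2_file_demo[RESUME2_MAX] = "swap:/dev/hda2";

/*
 * Mirrors the shape of resume2_setup(): an empty value leaves the
 * default untouched, anything else is copied with a hard bound.
 * strncpy() zero-pads short strings, and the explicit terminator
 * covers the case where the input is RESUME2_MAX - 1 bytes or longer.
 */
static int resume2_setup_demo(const char *str)
{
	assert(str != NULL);

	if (!*str)
		return 0;

	strncpy(resume2_file_demo, str, RESUME2_MAX - 1);
	resume2_file_demo[RESUME2_MAX - 1] = '\0';
	return 0;
}
```

Calling resume2_setup_demo("") keeps the default, while an over-long value is silently truncated to 255 characters. Whether truncation should instead be reported is a design choice the patch does not address.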
10517     diff --git a/kernel/power/suspend2_builtin.h b/kernel/power/suspend2_builtin.h
10518     new file mode 100644
10519     index 0000000..968b24b
10520     --- /dev/null
10521     +++ b/kernel/power/suspend2_builtin.h
10522     @@ -0,0 +1,35 @@
10523     +/*
10524     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
10525     + *
10526     + * This file is released under the GPLv2.
10527     + */
10528     +#include <linux/dyn_pageflags.h>
10529     +#include <asm/setup.h>
10530     +
10531     +extern struct suspend2_core_fns *s2_core_fns;
10532     +extern unsigned long suspend_compress_bytes_in, suspend_compress_bytes_out;
10533     +extern unsigned long suspend_action;
10534     +extern unsigned int nr_suspends;
10535     +extern char resume2_file[256];
10536     +extern int suspend2_in_suspend;
10537     +
10538     +extern unsigned long suspend2_nosave_state1 __nosavedata;
10539     +extern unsigned long suspend2_nosave_state2 __nosavedata;
10540     +extern int suspend2_nosave_state3 __nosavedata;
10541     +extern int suspend2_nosave_io_speed[2][2] __nosavedata;
10542     +extern __nosavedata char suspend2_nosave_commandline[COMMAND_LINE_SIZE];
10543     +extern __nosavedata struct pbe *restore_highmem_pblist;
10544     +
10545     +int suspend2_lowlevel_builtin(void);
10546     +
10547     +extern dyn_pageflags_t __nosavedata suspend2_nosave_origmap;
10548     +extern dyn_pageflags_t __nosavedata suspend2_nosave_copymap;
10549     +
10550     +#ifdef CONFIG_HIGHMEM
10551     +extern __nosavedata struct zone_data *suspend2_nosave_zone_list;
10552     +extern __nosavedata unsigned long suspend2_nosave_max_pfn;
10553     +#endif
10554     +
10555     +extern unsigned long suspend_get_nonconflicting_page(void);
10556     +extern int suspend_post_context_save(void);
10557     +extern int suspend2_try_suspend(int have_pmsem);
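Note on s2_core_fns: the built-in wrappers declared above forward to a function table that is only populated once the core is loaded. The sketch below shows that dispatch pattern in plain userspace C. It is an illustration under stated assumptions: struct core_fns_demo is a cut-down, hypothetical version of struct suspend2_core_fns, and register_core_demo, try_suspend_demo, fake_try_suspend and fake_core do not exist in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct suspend2_core_fns. */
struct core_fns_demo {
	int (*try_suspend)(int have_pmsem);
};

/* NULL until a core registers itself, as with s2_core_fns. */
static struct core_fns_demo *core_fns_demo;

/* Install or remove the table; a non-NULL table must be complete. */
static void register_core_demo(struct core_fns_demo *fns)
{
	assert(fns == NULL || fns->try_suspend != NULL);
	core_fns_demo = fns;
}

/* Same shape as suspend2_try_suspend(): fail cleanly with no core. */
static int try_suspend_demo(int have_pmsem)
{
	if (!core_fns_demo)
		return -ENODEV;

	return core_fns_demo->try_suspend(have_pmsem);
}

/* Made-up backend used only to exercise the dispatch. */
static int fake_try_suspend(int have_pmsem)
{
	return have_pmsem ? 0 : 1;
}

static struct core_fns_demo fake_core = { .try_suspend = fake_try_suspend };
```

In the patch, suspend2_try_suspend() and suspend2_try_resume() check the pointer before use, whereas suspend_get_nonconflicting_page() and suspend_post_context_save() dereference it unconditionally, so those two presumably rely on being called only while a core is registered.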
10558     diff --git a/kernel/power/suspend_block_io.c b/kernel/power/suspend_block_io.c
10559     new file mode 100644
10560     index 0000000..533ae5c
10561     --- /dev/null
10562     +++ b/kernel/power/suspend_block_io.c
10563     @@ -0,0 +1,1020 @@
10564     +/*
10565     + * kernel/power/suspend_block_io.c
10566     + *
10567     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
10568     + *
10569     + * Distributed under GPLv2.
10570     + *
10571     + * This file contains block io functions for suspend2. These are
10572     + * used by the swapwriter and it is planned that they will also
10573     + * be used by the NFSwriter.
10574     + *
10575     + */
10576     +
10577     +#include <linux/blkdev.h>
10578     +#include <linux/syscalls.h>
10579     +#include <linux/suspend.h>
10580     +
10581     +#include "suspend.h"
10582     +#include "sysfs.h"
10583     +#include "modules.h"
10584     +#include "prepare_image.h"
10585     +#include "block_io.h"
10586     +#include "ui.h"
10587     +
10588     +static int pr_index;
10589     +
10590     +#if 0
10591     +#define PR_DEBUG(a, b...) do { if (pr_index < 20) printk(a, ##b); } while(0)
10592     +#else
10593     +#define PR_DEBUG(a, b...) do { } while(0)
10594     +#endif
10595     +
10596     +#define MAX_OUTSTANDING_IO 2048
10597     +#define SUBMIT_BATCH_SIZE 128
10598     +
10599     +static int max_outstanding_io = MAX_OUTSTANDING_IO;
10600     +static int submit_batch_size = SUBMIT_BATCH_SIZE;
10601     +
10602     +struct io_info {
10603     + struct bio *sys_struct;
10604     + sector_t first_block;
10605     + struct page *bio_page, *dest_page;
10606     + int writing, readahead_index;
10607     + struct block_device *dev;
10608     + struct list_head list;
10609     +};
10610     +
10611     +static LIST_HEAD(ioinfo_ready_for_cleanup);
10612     +static DEFINE_SPINLOCK(ioinfo_ready_lock);
10613     +
10614     +static LIST_HEAD(ioinfo_submit_batch);
10615     +static DEFINE_SPINLOCK(ioinfo_submit_lock);
10616     +
10617     +static LIST_HEAD(ioinfo_busy);
10618     +static DEFINE_SPINLOCK(ioinfo_busy_lock);
10619     +
10620     +static struct io_info *waiting_on;
10621     +
10622     +static atomic_t submit_batch;
10623     +static int submit_batched(void);
10624     +
10625     +/* [Max] number of I/O operations pending */
10626     +static atomic_t outstanding_io;
10627     +
10628     +static int extra_page_forward = 0;
10629     +
10630     +static volatile unsigned long suspend_readahead_flags[
10631     + DIV_ROUND_UP(MAX_OUTSTANDING_IO, BITS_PER_LONG)];
10632     +static spinlock_t suspend_readahead_flags_lock = SPIN_LOCK_UNLOCKED;
10633     +static struct page *suspend_readahead_pages[MAX_OUTSTANDING_IO];
10634     +static int readahead_index, readahead_submit_index;
10635     +
10636     +static int current_stream;
10637     +/* 0 = Header, 1 = Pageset1, 2 = Pageset2 */
10638     +struct extent_iterate_saved_state suspend_writer_posn_save[3];
10639     +
10640     +/* Pointer to current entry being loaded/saved. */
10641     +struct extent_iterate_state suspend_writer_posn;
10642     +
10643     +/* Not static, so that the allocators can setup and complete
10644     + * writing the header */
10645     +char *suspend_writer_buffer;
10646     +int suspend_writer_buffer_posn;
10647     +
10648     +int suspend_read_fd;
10649     +
10650     +static struct suspend_bdev_info *suspend_devinfo;
10651     +
10652     +int suspend_header_bytes_used = 0;
10653     +
10654     +DEFINE_MUTEX(suspend_bio_mutex);
10655     +
10656     +/*
10657     + * __suspend_bio_cleanup_one
10658     + *
10659     + * Description: Clean up after completing I/O on a page.
10660     + * Arguments: struct io_info: Data for I/O to be completed.
10661     + */
10662     +static void __suspend_bio_cleanup_one(struct io_info *io_info)
10663     +{
10664     + suspend_message(SUSPEND_WRITER, SUSPEND_HIGH, 0,
10665     + "Cleanup IO: [%p]\n", io_info);
10666     +
10667     + if (!io_info->writing && io_info->readahead_index == -1) {
10668     + char *from, *to;
10669     + /*
10670     + * Copy the page we read into the buffer our caller provided.
10671     + */
10672     + to = (char *) kmap(io_info->dest_page);
10673     + from = (char *) kmap(io_info->bio_page);
10674     + memcpy(to, from, PAGE_SIZE);
10675     + kunmap(io_info->dest_page);
10676     + kunmap(io_info->bio_page);
10677     + }
10678     +
10679     + if (io_info->writing || io_info->readahead_index == -1) {
10680     + /* Sanity check */
10681     + if (page_count(io_info->bio_page) != 2)
10682     + printk(KERN_EMERG "Cleanup IO: Page count on page %p"
10683     + " is %d. Not good!\n",
10684     + io_info->bio_page,
10685     + page_count(io_info->bio_page));
10686     + put_page(io_info->bio_page);
10687     + __free_page(io_info->bio_page);
10688     + } else
10689     + put_page(io_info->bio_page);
10690     +
10691     + bio_put(io_info->sys_struct);
10692     + io_info->sys_struct = NULL;
10693     +}
10694     +
10695     +/* suspend_bio_cleanup_one
10696     + */
10697     +
10698     +static void suspend_bio_cleanup_one(void *data)
10699     +{
10700     + struct io_info *io_info = (struct io_info *) data;
10701     + int readahead_index;
10702     + unsigned long flags;
10703     +
10704     + readahead_index = io_info->readahead_index;
10705     + list_del_init(&io_info->list);
10706     + __suspend_bio_cleanup_one(io_info);
10707     +
10708     + if (readahead_index > -1) {
10709     + int index = readahead_index/BITS_PER_LONG;
10710     + int bit = readahead_index - (index * BITS_PER_LONG);
10711     + spin_lock_irqsave(&suspend_readahead_flags_lock, flags);
10712     + set_bit(bit, &suspend_readahead_flags[index]);
10713     + spin_unlock_irqrestore(&suspend_readahead_flags_lock, flags);
10714     + }
10715     +
10716     + if (waiting_on == io_info)
10717     + waiting_on = NULL;
10718     + kfree(io_info);
10719     + atomic_dec(&outstanding_io);
10720     +}
10721     +
10722     +/* suspend_cleanup_some_completed_io
10723     + *
10724     + * NB: This is designed so that multiple callers can be in here simultaneously.
10725     + */
10726     +
10727     +static void suspend_cleanup_some_completed_io(void)
10728     +{
10729     + int num_cleaned = 0;
10730     + struct io_info *first;
10731     + unsigned long flags;
10732     +
10733     + spin_lock_irqsave(&ioinfo_ready_lock, flags);
10734     + while(!list_empty(&ioinfo_ready_for_cleanup)) {
10735     + first = list_entry(ioinfo_ready_for_cleanup.next,
10736     + struct io_info, list);
10737     +
10738     + list_del_init(&first->list);
10739     +
10740     + spin_unlock_irqrestore(&ioinfo_ready_lock, flags);
10741     + suspend_bio_cleanup_one((void *) first);
10742     + spin_lock_irqsave(&ioinfo_ready_lock, flags);
10743     +
10744     + num_cleaned++;
10745     + if (num_cleaned == submit_batch_size)
10746     + break;
10747     + }
10748     + spin_unlock_irqrestore(&ioinfo_ready_lock, flags);
10749     +}
10750     +
10751     +/* do_bio_wait
10752     + *
10753     + * Actions taken when we want some I/O to get run.
10754     + *
10755     + * Submit any I/O that's batched up (if we're not already doing
10756     + * that), unplug queues, schedule and clean up whatever we can.
10757     + */
10758     +static void do_bio_wait(void)
10759     +{
10760     + int num_submitted = 0;
10761     +
10762     + /* Don't want to wait on I/O we haven't submitted! */
10763     + num_submitted = submit_batched();
10764     +
10765     + kblockd_flush();
10766     +
10767     + io_schedule();
10768     +
10769     + suspend_cleanup_some_completed_io();
10770     +}
10771     +
10772     +/*
10773     + * suspend_finish_all_io
10774     + *
10775     + * Description: Finishes all IO and frees all IO info struct pages.
10776     + */
10777     +static void suspend_finish_all_io(void)
10778     +{
10779     + /* Wait for all I/O to complete. */
10780     + while (atomic_read(&outstanding_io))
10781     + do_bio_wait();
10782     +}
10783     +
10784     +/*
10785     + * suspend_wait_on_readahead
10786     + *
10787     + * Wait until a particular readahead is ready.
10788     + */
10789     +static void suspend_wait_on_readahead(int readahead_index)
10790     +{
10791     + int index = readahead_index / BITS_PER_LONG;
10792     + int bit = readahead_index - index * BITS_PER_LONG;
10793     +
10794     + /* readahead_index is the one we want to return */
10795     + while (!test_bit(bit, &suspend_readahead_flags[index]))
10796     + do_bio_wait();
10797     +}
10798     +
10799     +/*
10800     + * suspend_readahead_ready
10801     + *
10802     + * Returns whether the readahead requested is ready.
10803     + */
10804     +
10805     +static int suspend_readahead_ready(int readahead_index)
10806     +{
10807     + int index = readahead_index / BITS_PER_LONG;
10808     + int bit = readahead_index - (index * BITS_PER_LONG);
10809     +
10810     + return test_bit(bit, &suspend_readahead_flags[index]);
10811     +}
10812     +
10813     +/* suspend_prepare_readahead
10814     + * Set up for doing readahead on an image */
10815     +static int suspend_prepare_readahead(int index)
10816     +{
10817     + unsigned long new_page = get_zeroed_page(GFP_ATOMIC | __GFP_NOWARN);
10818     +
10819     + if(!new_page)
10820     + return -ENOMEM;
10821     +
10822     + suspend_readahead_pages[index] = virt_to_page(new_page);
10823     + return 0;
10824     +}
10825     +
10826     +/* suspend_cleanup_readahead
10827     + * Clean up structures used for readahead */
10828     +static void suspend_cleanup_readahead(int page)
10829     +{
10830     + __free_page(suspend_readahead_pages[page]);
10831     + suspend_readahead_pages[page] = 0;
10832     + return;
10833     +}
10834     +
10835     +/*
10836     + * suspend_end_bio
10837     + *
10838     + * Description: Function called by block driver from interrupt context when I/O
10839     + * is completed. This is the reason we use spinlocks in
10840     + * manipulating the io_info lists.
10841     + * Nearly the fs/buffer.c version, but we want to mark the page as
10842     + * done in our own structures too.
10843     + */
10844     +
10845     +static int suspend_end_bio(struct bio *bio, unsigned int num, int err)
10846     +{
10847     + struct io_info *io_info = bio->bi_private;
10848     + unsigned long flags;
10849     +
10850     + spin_lock_irqsave(&ioinfo_busy_lock, flags);
10851     + list_del_init(&io_info->list);
10852     + spin_unlock_irqrestore(&ioinfo_busy_lock, flags);
10853     +
10854     + spin_lock_irqsave(&ioinfo_ready_lock, flags);
10855     + list_add_tail(&io_info->list, &ioinfo_ready_for_cleanup);
10856     + spin_unlock_irqrestore(&ioinfo_ready_lock, flags);
10857     + return 0;
10858     +}
10859     +
10860     +/**
10861     + * submit - submit BIO request.
10862     + * @io_info: IO info structure. io_info->writing selects READ or
10863     + *           WRITE.
10864     + *
10865     + * Based on Patrick's pmdisk code from long ago:
10866     + * "Straight from the textbook - allocate and initialize the bio.
10867     + * If we're writing, make sure the page is marked as dirty.
10868     + * Then submit it and carry on."
10869     + *
10870     + * With a twist, though - we handle block_size != PAGE_SIZE.
10871     + * Caller has already checked that our page is not fragmented.
10872     + */
10873     +
10874     +static int submit(struct io_info *io_info)
10875     +{
10876     + struct bio *bio = NULL;
10877     + unsigned long flags;
10878     +
10879     + while (!bio) {
10880     + bio = bio_alloc(GFP_ATOMIC,1);
10881     + if (!bio)
10882     + do_bio_wait();
10883     + }
10884     +
10885     + bio->bi_bdev = io_info->dev;
10886     + bio->bi_sector = io_info->first_block;
10887     + bio->bi_private = io_info;
10888     + bio->bi_end_io = suspend_end_bio;
10889     + io_info->sys_struct = bio;
10890     +
10891     + if (bio_add_page(bio, io_info->bio_page, PAGE_SIZE, 0) < PAGE_SIZE) {
10892     + printk("ERROR: adding page to bio at %lld\n",
10893     + (unsigned long long) io_info->first_block);
10894     + bio_put(bio);
10895     + return -EFAULT;
10896     + }
10897     +
10898     + if (io_info->writing)
10899     + bio_set_pages_dirty(bio);
10900     +
10901     + spin_lock_irqsave(&ioinfo_busy_lock, flags);
10902     + list_add_tail(&io_info->list, &ioinfo_busy);
10903     + spin_unlock_irqrestore(&ioinfo_busy_lock, flags);
10904     +
10905     + submit_bio(io_info->writing, bio);
10906     +
10907     + return 0;
10908     +}
10909     +
10910     +/*
10911     + * submit a batch. The submit function can wait on I/O, so we have
10912     + * simple locking to avoid infinite recursion.
10913     + */
10914     +static int submit_batched(void)
10915     +{
10916     + static int running_already = 0;
10917     + struct io_info *first;
10918     + unsigned long flags;
10919     + int num_submitted = 0;
10920     +
10921     + if (running_already)
10922     + return 0;
10923     +
10924     + running_already = 1;
10925     + spin_lock_irqsave(&ioinfo_submit_lock, flags);
10926     + while(!list_empty(&ioinfo_submit_batch)) {
10927     + first = list_entry(ioinfo_submit_batch.next, struct io_info,
10928     + list);
10929     + list_del_init(&first->list);
10930     + atomic_dec(&submit_batch);
10931     + spin_unlock_irqrestore(&ioinfo_submit_lock, flags);
10932     + submit(first);
10933     + spin_lock_irqsave(&ioinfo_submit_lock, flags);
10934     + num_submitted++;
10935     + if (num_submitted == submit_batch_size)
10936     + break;
10937     + }
10938     + spin_unlock_irqrestore(&ioinfo_submit_lock, flags);
10939     + running_already = 0;
10940     +
10941     + return num_submitted;
10942     +}
10943     +
10944     +static void add_to_batch(struct io_info *io_info)
10945     +{
10946     + unsigned long flags;
10947     + int waiting;
10948     +
10949     + /* Put our prepared I/O struct on the batch list. */
10950     + spin_lock_irqsave(&ioinfo_submit_lock, flags);
10951     + list_add_tail(&io_info->list, &ioinfo_submit_batch);
10952     + waiting = atomic_add_return(1, &submit_batch);
10953     + spin_unlock_irqrestore(&ioinfo_submit_lock, flags);
10954     +
10955     + if (waiting >= submit_batch_size)
10956     + submit_batched();
10957     +}
10958     +
10959     +/*
10960     + * get_io_info_struct
10961     + *
10962     + * Description: Get an I/O struct.
10963     + * Returns: Pointer to the struct prepared for use.
10964     + */
10965     +static struct io_info *get_io_info_struct(void)
10966     +{
10967     + struct io_info *this = NULL;
10968     +
10969     + do {
10970     + while (atomic_read(&outstanding_io) >= max_outstanding_io)
10971     + do_bio_wait();
10972     +
10973     + this = kmalloc(sizeof(struct io_info), GFP_ATOMIC);
10974     + } while (!this);
10975     +
10976     + INIT_LIST_HEAD(&this->list);
10977     + return this;
10978     +}
10979     +
10980     +/*
10981     + * suspend_do_io
10982     + *
10983     + * Description: Prepare and start a read or write operation.
10984     + * Note that we use our own buffer for reading or writing.
10985     + * This simplifies doing readahead and asynchronous writing.
10986     + * We can begin a read without knowing the location into which
10987     + * the data will eventually be placed, and the buffer passed
10988     + * for a write can be reused immediately (essential for the
10989     + * modules system).
10990     + * Failure? What's that?
10991     + * Returns: Zero. Allocation failures are retried, not reported.
10992     + */
10993     +static int suspend_do_io(int writing, struct block_device *bdev, long block0,
10994     + struct page *page, int readahead_index, int syncio)
10995     +{
10996     + struct io_info *io_info;
10997     + unsigned long buffer_virt = 0;
10998     + char *to, *from;
10999     +
11000     + io_info = get_io_info_struct();
11001     +
11002     + /* Done before submitting to avoid races. */
11003     + if (syncio)
11004     + waiting_on = io_info;
11005     +
11006     + /* Get our local buffer */
11007     + suspend_message(SUSPEND_WRITER, SUSPEND_HIGH, 1,
11008     + "Start_IO: [%p]", io_info);
11009     +
11010     + /* Copy settings to the io_info struct */
11011     + io_info->writing = writing;
11012     + io_info->dev = bdev;
11013     + io_info->first_block = block0;
11014     + io_info->dest_page = page;
11015     + io_info->readahead_index = readahead_index;
11016     +
11017     + if (io_info->readahead_index == -1) {
11018     + while (!(buffer_virt = get_zeroed_page(GFP_ATOMIC | __GFP_NOWARN)))
11019     + do_bio_wait();
11020     +
11021     + suspend_message(SUSPEND_WRITER, SUSPEND_HIGH, 0,
11022     + "[ALLOC BUFFER]->%d",
11023     + real_nr_free_pages(all_zones_mask));
11024     + io_info->bio_page = virt_to_page(buffer_virt);
11025     + } else {
11026     + unsigned long flags;
11027     + int index = io_info->readahead_index / BITS_PER_LONG;
11028     + int bit = io_info->readahead_index - index * BITS_PER_LONG;
11029     +
11030     + spin_lock_irqsave(&suspend_readahead_flags_lock, flags);
11031     + clear_bit(bit, &suspend_readahead_flags[index]);
11032     + spin_unlock_irqrestore(&suspend_readahead_flags_lock, flags);
11033     +
11034     + io_info->bio_page = page;
11035     + }
11036     +
11037     + /* If writing, copy our data. The data is probably in
11038     + * lowmem, but we cannot be certain. If there is no
11039     + * compression/encryption, we might be passed the
11040     + * actual source page's address. */
11041     + if (writing) {
11042     + to = (char *) buffer_virt;
11043     + from = kmap_atomic(page, KM_USER1);
11044     + memcpy(to, from, PAGE_SIZE);
11045     + kunmap_atomic(from, KM_USER1);
11046     + }
11047     +
11048     + /* Submit the page */
11049     + get_page(io_info->bio_page);
11050     +
11051     + suspend_message(SUSPEND_WRITER, SUSPEND_HIGH, 1,
11052     + "-> (PRE BRW) %d\n", real_nr_free_pages(all_zones_mask));
11053     +
11054     + if (syncio)
11055     + submit(io_info);
11056     + else
11057     + add_to_batch(io_info);
11058     +
11059     + atomic_inc(&outstanding_io);
11060     +
11061     + if (syncio)
11062     + do { do_bio_wait(); } while (waiting_on);
11063     +
11064     + return 0;
11065     +}
11066     +
11067     +/* We used to use bread here, but it doesn't correctly handle
11068     + * blocksize != PAGE_SIZE. Now we create an io_info to get the data we
11069     + * want and use our normal routines (synchronously).
11070     + */
11071     +
11072     +static int suspend_bdev_page_io(int writing, struct block_device *bdev,
11073     + long pos, struct page *page)
11074     +{
11075     + return suspend_do_io(writing, bdev, pos, page, -1, 1);
11076     +}
11077     +
11078     +static int suspend_bio_memory_needed(void)
11079     +{
11080     + /* We want to have at least enough memory so as to have
11081     + * max_outstanding_io transactions on the fly at once. If we
11082     + * can do more, fine. */
11083     + return (max_outstanding_io * (PAGE_SIZE + sizeof(struct request) +
11084     + sizeof(struct bio) + sizeof(struct io_info)));
11085     +}
11086     +
11087     +static void suspend_set_devinfo(struct suspend_bdev_info *info)
11088     +{
11089     + suspend_devinfo = info;
11090     +}
11091     +
11092     +static void dump_block_chains(void)
11093     +{
11094     + int i;
11095     +
11096     + for (i = 0; i < suspend_writer_posn.num_chains; i++) {
11097     + struct extent *this;
11098     +
11099     + printk("Chain %d:", i);
11100     +
11101     + this = (suspend_writer_posn.chains + i)->first;
11102     +
11103     + if (!this)
11104     + printk(" (Empty)");
11105     +
11106     + while (this) {
11107     + printk(" [%lu-%lu]%s", this->minimum, this->maximum,
11108     + this->next ? "," : "");
11109     + this = this->next;
11110     + }
11111     +
11112     + printk("\n");
11113     + }
11114     +
11115     + for (i = 0; i < 3; i++)
11116     + printk("Posn %d: Chain %d, extent %d, offset %lu.\n", i,
11117     + suspend_writer_posn_save[i].chain_num,
11118     + suspend_writer_posn_save[i].extent_num,
11119     + suspend_writer_posn_save[i].offset);
11120     +}
11121     +static int forward_extra_blocks(void)
11122     +{
11123     + int i;
11124     +
11125     + for (i = 1; i < suspend_devinfo[suspend_writer_posn.current_chain].
11126     + blocks_per_page; i++)
11127     + suspend_extent_state_next(&suspend_writer_posn);
11128     +
11129     + if (suspend_extent_state_eof(&suspend_writer_posn)) {
11130     + printk("Extent state eof.\n");
11131     + dump_block_chains();
11132     + return -ENODATA;
11133     + }
11134     +
11135     + return 0;
11136     +}
11137     +
11138     +static int forward_one_page(void)
11139     +{
11140     + int at_start = (suspend_writer_posn.current_chain == -1);
11141     +
11142     + /* Have to go forward one to ensure we're on the right chain,
11143     + * before we can know how many more blocks to skip.*/
11144     + suspend_extent_state_next(&suspend_writer_posn);
11145     +
11146     + if (!at_start && forward_extra_blocks())
11147     + return -ENODATA;
11148     +
11149     + if (extra_page_forward) {
11150     + extra_page_forward = 0;
11151     + return forward_one_page();
11152     + }
11153     +
11154     + return 0;
11155     +}
11156     +
11157     +/* Used in reading header, to jump to 2nd page after getting 1st page
11158     + * direct from image header. */
11159     +static void set_extra_page_forward(void)
11160     +{
11161     + extra_page_forward = 1;
11162     +}
11163     +
11164     +static int suspend_bio_rw_page(int writing, struct page *page,
11165     + int readahead_index, int sync)
11166     +{
11167     + struct suspend_bdev_info *dev_info;
11168     +
11169     + if (test_action_state(SUSPEND_TEST_FILTER_SPEED))
11170     + return 0;
11171     +
11172     + if (forward_one_page()) {
11173     + printk("Failed to advance a page in the extent data.\n");
11174     + return -ENODATA;
11175     + }
11176     +
11177     + if (current_stream == 0 && writing &&
11178     + suspend_writer_posn.current_chain == suspend_writer_posn_save[2].chain_num &&
11179     + suspend_writer_posn.current_offset == suspend_writer_posn_save[2].offset) {
11180     + dump_block_chains();
11181     + BUG();
11182     + }
11183     +
11184     + dev_info = &suspend_devinfo[suspend_writer_posn.current_chain];
11185     +
11186     + return suspend_do_io(writing, dev_info->bdev,
11187     + suspend_writer_posn.current_offset <<
11188     + dev_info->bmap_shift,
11189     + page, readahead_index, sync);
11190     +}
11191     +
11192     +static int suspend_rw_init(int writing, int stream_number)
11193     +{
11194     + suspend_header_bytes_used = 0;
11195     +
11196     + suspend_extent_state_restore(&suspend_writer_posn,
11197     + &suspend_writer_posn_save[stream_number]);
11198     +
11199     + suspend_writer_buffer_posn = writing ? 0 : PAGE_SIZE;
11200     +
11201     + current_stream = stream_number;
11202     +
11203     + readahead_index = readahead_submit_index = -1;
11204     +
11205     + pr_index = 0;
11206     +
11207     + return 0;
11208     +}
11209     +
11210     +static void suspend_read_header_init(void)
11211     +{
11212     + readahead_index = readahead_submit_index = -1;
11213     +}
11214     +
11215     +static int suspend_rw_cleanup(int writing)
11216     +{
11217     + if (writing && suspend_bio_rw_page(WRITE,
11218     + virt_to_page(suspend_writer_buffer), -1, 0))
11219     + return -EIO;
11220     +
11221     + if (writing && current_stream == 2)
11222     + suspend_extent_state_save(&suspend_writer_posn,
11223     + &suspend_writer_posn_save[1]);
11224     +
11225     + suspend_finish_all_io();
11226     +
11227     + if (!writing)
11228     + while (readahead_index != readahead_submit_index) {
11229     + suspend_cleanup_readahead(readahead_index);
11230     + readahead_index++;
11231     + if (readahead_index == max_outstanding_io)
11232     + readahead_index = 0;
11233     + }
11234     +
11235     + current_stream = 0;
11236     +
11237     + return 0;
11238     +}
11239     +
11240     +static int suspend_bio_read_page_with_readahead(void)
11241     +{
11242     + static int last_result;
11243     + unsigned long *virt;
11244     +
11245     + if (readahead_index == -1) {
11246     + last_result = 0;
11247     + readahead_index = readahead_submit_index = 0;
11248     + }
11249     +
11250     + /* Start a new readahead? */
11251     + if (last_result) {
11252     + /* We failed to submit a read, and have cleaned up
11253     + * all the readahead previously submitted */
11254     + if (readahead_submit_index == readahead_index) {
11255     + abort_suspend(SUSPEND_FAILED_IO, "Failed to submit"
11256     + " a read and no readahead left.\n");
11257     + return -EIO;
11258     + }
11259     + goto wait;
11260     + }
11261     +
11262     + do {
11263     + if (suspend_prepare_readahead(readahead_submit_index))
11264     + break;
11265     +
11266     + last_result = suspend_bio_rw_page(READ,
11267     + suspend_readahead_pages[readahead_submit_index],
11268     + readahead_submit_index, SUSPEND_ASYNC);
11269     + if (last_result) {
11270     + printk("Begin read chunk for page %d returned %d.\n",
11271     + readahead_submit_index, last_result);
11272     + suspend_cleanup_readahead(readahead_submit_index);
11273     + break;
11274     + }
11275     +
11276     + readahead_submit_index++;
11277     +
11278     + if (readahead_submit_index == max_outstanding_io)
11279     + readahead_submit_index = 0;
11280     +
11281     + } while((!last_result) && (readahead_submit_index != readahead_index) &&
11282     + (!suspend_readahead_ready(readahead_index)));
11283     +
11284     +wait:
11285     + suspend_wait_on_readahead(readahead_index);
11286     +
11287     + virt = kmap_atomic(suspend_readahead_pages[readahead_index], KM_USER1);
11288     + memcpy(suspend_writer_buffer, virt, PAGE_SIZE);
11289     + kunmap_atomic(virt, KM_USER1);
11290     +
11291     + suspend_cleanup_readahead(readahead_index);
11292     +
11293     + readahead_index++;
11294     + if (readahead_index == max_outstanding_io)
11295     + readahead_index = 0;
11296     +
11297     + return 0;
11298     +}
11299     +
11300     +/*
11301     + * suspend_rw_buffer: copy data between a caller's buffer and the page-sized writer buffer.
11302     + */
11303     +
11304     +static int suspend_rw_buffer(int writing, char *buffer, int buffer_size)
11305     +{
11306     + int bytes_left = buffer_size;
11307     +
11308     + /* Read/write a chunk of the header */
11309     + while (bytes_left) {
11310     + char *source_start = buffer + buffer_size - bytes_left;
11311     + char *dest_start = suspend_writer_buffer + suspend_writer_buffer_posn;
11312     + int capacity = PAGE_SIZE - suspend_writer_buffer_posn;
11313     + char *to = writing ? dest_start : source_start;
11314     + char *from = writing ? source_start : dest_start;
11315     +
11316     + if (bytes_left <= capacity) {
11317     + if (test_debug_state(SUSPEND_HEADER))
11318     + printk("Copy %d bytes %d-%d from %p to %p.\n",
11319     + bytes_left,
11320     + suspend_header_bytes_used,
11321     + suspend_header_bytes_used + bytes_left,
11322     + from, to);
11323     + memcpy(to, from, bytes_left);
11324     + suspend_writer_buffer_posn += bytes_left;
11325     + suspend_header_bytes_used += bytes_left;
11326     + return 0;
11327     + }
11328     +
11329     + /* Complete this page and start a new one */
11330     + if (test_debug_state(SUSPEND_HEADER))
11331     + printk("Copy %d bytes (%d-%d) from %p to %p.\n",
11332     + capacity,
11333     + suspend_header_bytes_used,
11334     + suspend_header_bytes_used + capacity,
11335     + from, to);
11336     + memcpy(to, from, capacity);
11337     + bytes_left -= capacity;
11338     + suspend_header_bytes_used += capacity;
11339     +
11340     + if (!writing) {
11341     + if (test_suspend_state(SUSPEND_TRY_RESUME_RD))
11342     + sys_read(suspend_read_fd,
11343     + suspend_writer_buffer, BLOCK_SIZE);
11344     + else
11345     + if (suspend_bio_read_page_with_readahead())
11346     + return -EIO;
11347     + } else if (suspend_bio_rw_page(WRITE,
11348     + virt_to_page(suspend_writer_buffer),
11349     + -1, SUSPEND_ASYNC))
11350     + return -EIO;
11351     +
11352     + suspend_writer_buffer_posn = 0;
11353     + suspend_cond_pause(0, NULL);
11354     + }
11355     +
11356     + return 0;
11357     +}
11358     +
11359     +/*
11360     + * suspend_bio_read_chunk
11361     + *
11362     + * Read a (possibly compressed and/or encrypted) page from the image,
11363     + * into buffer_page, returning its index and the buffer size.
11364     + *
11365     + * If asynchronous I/O is requested, use readahead.
11366     + */
11367     +
11368     +static int suspend_bio_read_chunk(unsigned long *index, struct page *buffer_page,
11369     + unsigned int *buf_size, int sync)
11370     +{
11371     + int result;
11372     + char *buffer_virt = kmap(buffer_page);
11373     +
11374     + pr_index++;
11375     +
11376     + while (!mutex_trylock(&suspend_bio_mutex))
11377     + do_bio_wait();
11378     +
11379     + if ((result = suspend_rw_buffer(READ, (char *) index,
11380     + sizeof(unsigned long)))) {
11381     + abort_suspend(SUSPEND_FAILED_IO,
11382     + "Read of index returned %d.\n", result);
11383     + goto out;
11384     + }
11385     +
11386     + if ((result = suspend_rw_buffer(READ, (char *) buf_size, sizeof(int)))) {
11387     + abort_suspend(SUSPEND_FAILED_IO,
11388     + "Read of buffer size returned %d.\n", result);
11389     + goto out;
11390     + }
11391     +
11392     + result = suspend_rw_buffer(READ, buffer_virt, *buf_size);
11393     + if (result)
11394     + abort_suspend(SUSPEND_FAILED_IO,
11395     + "Read of data returned %d.\n", result);
11396     +
11397     + PR_DEBUG("%d: Index %ld, %d bytes.\n", pr_index, *index, *buf_size);
11398     +out:
11399     + mutex_unlock(&suspend_bio_mutex);
11400     + kunmap(buffer_page);
11401     + if (result)
11402     + abort_suspend(SUSPEND_FAILED_IO,
11403     + "Returning %d from suspend_bio_read_chunk.\n", result);
11404     + return result;
11405     +}
11406     +
11407     +/*
11408     + * suspend_bio_write_chunk
11409     + *
11410     + * Write a (possibly compressed and/or encrypted) page to the image from
11411     + * the buffer, together with its index and buffer size.
11412     + */
11413     +
11414     +static int suspend_bio_write_chunk(unsigned long index, struct page *buffer_page,
11415     + unsigned int buf_size)
11416     +{
11417     + int result;
11418     + char *buffer_virt = kmap(buffer_page);
11419     +
11420     + pr_index++;
11421     +
11422     + while (!mutex_trylock(&suspend_bio_mutex))
11423     + do_bio_wait();
11424     +
11425     + if ((result = suspend_rw_buffer(WRITE, (char *) &index,
11426     + sizeof(unsigned long))))
11427     + goto out;
11428     +
11429     + if ((result = suspend_rw_buffer(WRITE, (char *) &buf_size, sizeof(int))))
11430     + goto out;
11431     +
11432     + result = suspend_rw_buffer(WRITE, buffer_virt, buf_size);
11433     +
11434     + PR_DEBUG("%d: Index %ld, %d bytes.\n", pr_index, index, buf_size);
11435     +out:
11436     + mutex_unlock(&suspend_bio_mutex);
11437     + kunmap(buffer_page);
11438     + return result;
11439     +}
11440     +
11441     +/*
11442     + * suspend_rw_header_chunk
11443     + *
11444     + * Read or write a portion of the header.
11445     + */
11446     +
11447     +static int suspend_rw_header_chunk(int writing,
11448     + struct suspend_module_ops *owner,
11449     + char *buffer, int buffer_size)
11450     +{
11451     + if (owner) {
11452     + owner->header_used += buffer_size;
11453     + if (owner->header_used > owner->header_requested) {
11454     + printk(KERN_EMERG "Suspend2 module %s is using more "
11455     + "header space (%u) than it requested (%u).\n",
11456     + owner->name,
11457     + owner->header_used,
11458     + owner->header_requested);
11459     + return buffer_size;
11460     + }
11461     + }
11462     +
11463     + return suspend_rw_buffer(writing, buffer, buffer_size);
11464     +}
11465     +
11466     +/*
11467     + * write_header_chunk_finish
11468     + *
11469     + * Flush any buffered writes in the section of the image.
11470     + */
11471     +static int write_header_chunk_finish(void)
11472     +{
11473     + return suspend_bio_rw_page(WRITE, virt_to_page(suspend_writer_buffer),
11474     + -1, 0) ? -EIO : 0;
11475     +}
11476     +
11477     +static int suspend_bio_storage_needed(void)
11478     +{
11479     + return 2 * sizeof(int);
11480     +}
11481     +
11482     +static int suspend_bio_save_config_info(char *buf)
11483     +{
11484     + int *ints = (int *) buf;
11485     + ints[0] = max_outstanding_io;
11486     + ints[1] = submit_batch_size;
11487     + return 2 * sizeof(int);
11488     +}
11489     +
11490     +static void suspend_bio_load_config_info(char *buf, int size)
11491     +{
11492     + int *ints = (int *) buf;
11493     + max_outstanding_io = ints[0];
11494     + submit_batch_size = ints[1];
11495     +}
11496     +
11497     +static int suspend_bio_initialise(int starting_cycle)
11498     +{
11499     + suspend_writer_buffer = (char *) get_zeroed_page(GFP_ATOMIC);
11500     +
11501     + return suspend_writer_buffer ? 0 : -ENOMEM;
11502     +}
11503     +
11504     +static void suspend_bio_cleanup(int finishing_cycle)
11505     +{
11506     + if (suspend_writer_buffer) {
11507     + free_page((unsigned long) suspend_writer_buffer);
11508     + suspend_writer_buffer = NULL;
11509     + }
11510     +}
11511     +
11512     +struct suspend_bio_ops suspend_bio_ops = {
11513     + .bdev_page_io = suspend_bdev_page_io,
11514     + .finish_all_io = suspend_finish_all_io,
11515     + .forward_one_page = forward_one_page,
11516     + .set_extra_page_forward = set_extra_page_forward,
11517     + .set_devinfo = suspend_set_devinfo,
11518     + .read_chunk = suspend_bio_read_chunk,
11519     + .write_chunk = suspend_bio_write_chunk,
11520     + .rw_init = suspend_rw_init,
11521     + .rw_cleanup = suspend_rw_cleanup,
11522     + .read_header_init = suspend_read_header_init,
11523     + .rw_header_chunk = suspend_rw_header_chunk,
11524     + .write_header_chunk_finish = write_header_chunk_finish,
11525     +};
11526     +
11527     +static struct suspend_sysfs_data sysfs_params[] = {
11528     + { SUSPEND2_ATTR("max_outstanding_io", SYSFS_RW),
11529     + SYSFS_INT(&max_outstanding_io, 16, MAX_OUTSTANDING_IO, 0),
11530     + },
11531     +
11532     + { SUSPEND2_ATTR("submit_batch_size", SYSFS_RW),
11533     + SYSFS_INT(&submit_batch_size, 16, SUBMIT_BATCH_SIZE, 0),
11534     + }
11535     +};
11536     +
11537     +static struct suspend_module_ops suspend_blockwriter_ops =
11538     +{
11539     + .name = "Block I/O",
11540     + .type = MISC_MODULE,
11541     + .directory = "block_io",
11542     + .module = THIS_MODULE,
11543     + .memory_needed = suspend_bio_memory_needed,
11544     + .storage_needed = suspend_bio_storage_needed,
11545     + .save_config_info = suspend_bio_save_config_info,
11546     + .load_config_info = suspend_bio_load_config_info,
11547     + .initialise = suspend_bio_initialise,
11548     + .cleanup = suspend_bio_cleanup,
11549     +
11550     + .sysfs_data = sysfs_params,
11551     + .num_sysfs_entries = sizeof(sysfs_params) / sizeof(struct suspend_sysfs_data),
11552     +};
11553     +
11554     +static __init int suspend_block_io_load(void)
11555     +{
11556     + return suspend_register_module(&suspend_blockwriter_ops);
11557     +}
11558     +
11559     +#ifdef CONFIG_SUSPEND2_FILE_EXPORTS
11560     +EXPORT_SYMBOL_GPL(suspend_read_fd);
11561     +#endif
11562     +#if defined(CONFIG_SUSPEND2_FILE_EXPORTS) || defined(CONFIG_SUSPEND2_SWAP_EXPORTS)
11563     +EXPORT_SYMBOL_GPL(suspend_writer_posn);
11564     +EXPORT_SYMBOL_GPL(suspend_writer_posn_save);
11565     +EXPORT_SYMBOL_GPL(suspend_writer_buffer);
11566     +EXPORT_SYMBOL_GPL(suspend_writer_buffer_posn);
11567     +EXPORT_SYMBOL_GPL(suspend_header_bytes_used);
11568     +EXPORT_SYMBOL_GPL(suspend_bio_ops);
11569     +#endif
11570     +#ifdef MODULE
11571     +static __exit void suspend_block_io_unload(void)
11572     +{
11573     + suspend_unregister_module(&suspend_blockwriter_ops);
11574     +}
11575     +
11576     +module_init(suspend_block_io_load);
11577     +module_exit(suspend_block_io_unload);
11578     +MODULE_LICENSE("GPL");
11579     +MODULE_AUTHOR("Nigel Cunningham");
11580     +MODULE_DESCRIPTION("Suspend2 block io functions");
11581     +#else
11582     +late_initcall(suspend_block_io_load);
11583     +#endif
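
The header read/write path in suspend_rw_buffer() above stages data in one page-sized buffer and only submits I/O when a chunk does not fit in the space left. The following is a minimal user-space sketch of the write side of that idea, not the patch's code: every name here (sketch_stage, sketch_write, sketch_flush, SKETCH_PAGE_SIZE) is hypothetical, the page size is assumed to be 4096, and the flush merely counts pages instead of submitting a bio.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096 /* assumed page size, for illustration only */

/* Hypothetical stand-in for the module's page-sized staging buffer. */
struct sketch_stage {
	char page[SKETCH_PAGE_SIZE];
	size_t posn;          /* next free byte in page */
	size_t pages_flushed; /* full pages handed to "storage" */
	size_t bytes_used;    /* total bytes accepted so far */
};

static void sketch_stage_init(struct sketch_stage *s)
{
	memset(s, 0, sizeof(*s));
}

/* Stand-in for submitting one page of I/O; here it only counts. */
static void sketch_flush(struct sketch_stage *s)
{
	s->pages_flushed++;
	s->posn = 0;
}

/* Append len bytes, flushing the staging page each time it overflows. */
static void sketch_write(struct sketch_stage *s, const char *buf, size_t len)
{
	while (len) {
		size_t capacity = SKETCH_PAGE_SIZE - s->posn;
		size_t n = len <= capacity ? len : capacity;

		memcpy(s->page + s->posn, buf, n);
		s->posn += n;
		s->bytes_used += n;
		buf += n;
		len -= n;

		/*
		 * Flush only when data is still pending. A chunk that
		 * exactly fills the page leaves it full until the next
		 * call (or a final explicit flush), as in the patch.
		 */
		if (len)
			sketch_flush(s);
	}
}
```

A caller would run sketch_stage_init() once, call sketch_write() for each header chunk, and finish with one explicit sketch_flush(), which corresponds to the role write_header_chunk_finish() plays in the patch.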
11584     diff --git a/kernel/power/suspend_compress.c b/kernel/power/suspend_compress.c
11585     new file mode 100644
11586     index 0000000..6e70e33
11587     --- /dev/null
11588     +++ b/kernel/power/suspend_compress.c
11589     @@ -0,0 +1,436 @@
11590     +/*
11591     + * kernel/power/suspend_compress.c
11592     + *
11593     + * Copyright (C) 2003-2007 Nigel Cunningham (nigel at suspend2 net)
11594     + *
11595     + * This file is released under the GPLv2.
11596     + *
11597     + * This file contains data compression routines for suspend,
11598     + * using cryptoapi.
11599     + */
11600     +
11601     +#include <linux/module.h>
11602     +#include <linux/suspend.h>
11603     +#include <linux/highmem.h>
11604     +#include <linux/vmalloc.h>
11605     +#include <linux/crypto.h>
11606     +
11607     +#include "suspend2_builtin.h"
11608     +#include "suspend.h"
11609     +#include "modules.h"
11610     +#include "sysfs.h"
11611     +#include "io.h"
11612     +#include "ui.h"
11613     +
11614     +static int suspend_expected_compression = 0;
11615     +
11616     +static struct suspend_module_ops suspend_compression_ops;
11617     +static struct suspend_module_ops *next_driver;
11618     +
11619     +static char suspend_compressor_name[32] = "lzf";
11620     +
11621     +static DEFINE_MUTEX(stats_lock);
11622     +
11623     +struct cpu_context {
11624     + u8 * page_buffer;
11625     + struct crypto_comp *transform;
11626     + unsigned int len;
11627     + char *buffer_start;
11628     +};
11629     +
11630     +static DEFINE_PER_CPU(struct cpu_context, contexts);
11631     +
11632     +static int suspend_compress_prepare_result;
11633     +
11634     +/*
11635     + * suspend_compress_cleanup
11636     + *
11637     + * Frees memory allocated for our labours.
11638     + */
11639     +static void suspend_compress_cleanup(int suspend_or_resume)
11640     +{
11641     + int cpu;
11642     +
11643     + if (!suspend_or_resume)
11644     + return;
11645     +
11646     + for_each_online_cpu(cpu) {
11647     + struct cpu_context *this = &per_cpu(contexts, cpu);
11648     + if (this->transform) {
11649     + crypto_free_comp(this->transform);
11650     + this->transform = NULL;
11651     + }
11652     +
11653     + if (this->page_buffer)
11654     + free_page((unsigned long) this->page_buffer);
11655     +
11656     + this->page_buffer = NULL;
11657     + }
11658     +}
11659     +
11660     +/*
11661     + * suspend_compress_crypto_prepare
11662     + *
11663     + * Prepare to do some work by allocating buffers and transforms.
11664     + */
11665     +static int suspend_compress_crypto_prepare(void)
11666     +{
11667     + int cpu;
11668     +
11669     + if (!*suspend_compressor_name) {
11670     + printk("Suspend2: Compression enabled but no compressor name set.\n");
11671     + return 1;
11672     + }
11673     +
11674     + for_each_online_cpu(cpu) {
11675     + struct cpu_context *this = &per_cpu(contexts, cpu);
11676     + this->transform = crypto_alloc_comp(suspend_compressor_name,
11677     + 0, 0);
11678     + if (IS_ERR(this->transform)) {
11679     + printk("Suspend2: Failed to initialise the %s "
11680     + "compression transform.\n",
11681     + suspend_compressor_name);
11682     + this->transform = NULL;
11683     + return 1;
11684     + }
11685     +
11686     + this->page_buffer = (char *) get_zeroed_page(GFP_ATOMIC);
11687     +
11688     + if (!this->page_buffer) {
11689     + printk(KERN_ERR
11690     + "Failed to allocate a page buffer for suspend2 "
11691     + "compression driver.\n");
11692     + return -ENOMEM;
11693     + }
11694     + }
11695     +
11696     + return 0;
11697     +}
11698     +
11699     +/*
11700     + * suspend_compress_init
11701     + */
11702     +
11703     +static int suspend_compress_init(int suspend_or_resume)
11704     +{
11705     + if (!suspend_or_resume)
11706     + return 0;
11707     +
11708     + suspend_compress_bytes_in = suspend_compress_bytes_out = 0;
11709     +
11710     + next_driver = suspend_get_next_filter(&suspend_compression_ops);
11711     +
11712     + if (!next_driver) {
11713     + printk("Compression Driver: Argh! Nothing follows me in"
11714     + " the pipeline!\n");
11715     + return -ECHILD;
11716     + }
11717     +
11718     + suspend_compress_prepare_result = suspend_compress_crypto_prepare();
11719     +
11720     + return 0;
11721     +}
11722     +
11723     +/*
11724     + * suspend_compress_rw_init()
11725     + */
11726     +
11727     +int suspend_compress_rw_init(int rw, int stream_number)
11728     +{
11729     + if (suspend_compress_prepare_result) {
11730     + printk("Failed to initialise compression algorithm.\n");
11731     + if (rw == READ)
11732     + return -ENODEV;
11733     + else
11734     + suspend_compression_ops.enabled = 0;
11735     + }
11736     +
11737     + return 0;
11738     +}
11739     +
11740     +/*
11741     + * suspend_compress_write_chunk()
11742     + *
11743     + * Compress a page of data, buffering output and passing on filled
11744     + * pages to the next module in the pipeline.
11745     + *
11746     + * Buffer_page: Pointer to a buffer of size PAGE_SIZE, containing
11747     + * data to be compressed.
11748     + *
11749     + * Returns: 0 on success. Otherwise the error returned by the
11750     + * compression transform or by later modules in the
11751     + * pipeline.
11752     + */
11753     +static int suspend_compress_write_chunk(unsigned long index,
11754     + struct page *buffer_page, unsigned int buf_size)
11755     +{
11756     + int ret, cpu = smp_processor_id();
11757     + struct cpu_context *ctx = &per_cpu(contexts, cpu);
11758     +
11759     + if (!ctx->transform)
11760     + return next_driver->write_chunk(index, buffer_page, buf_size);
11761     +
11762     + ctx->buffer_start = kmap(buffer_page);
11763     +
11764     + ctx->len = buf_size;
11765     +
11766     + ret = crypto_comp_compress(ctx->transform,
11767     + ctx->buffer_start, buf_size,
11768     + ctx->page_buffer, &ctx->len);
11769     +
11770     + kunmap(buffer_page);
11771     +
11772     + if (ret) {
11773     + printk("Compression failed.\n");
11774     + goto failure;
11775     + }
11776     +
11777     + mutex_lock(&stats_lock);
11778     + suspend_compress_bytes_in += buf_size;
11779     + suspend_compress_bytes_out += ctx->len;
11780     + mutex_unlock(&stats_lock);
11781     +
11782     + if (ctx->len < buf_size) /* some compression */
11783     + ret = next_driver->write_chunk(index,
11784     + virt_to_page(ctx->page_buffer),
11785     + ctx->len);
11786     + else
11787     + ret = next_driver->write_chunk(index, buffer_page, buf_size);
11788     +
11789     +failure:
11790     + return ret;
11791     +}
11792     +
11793     +/*
11794     + * suspend_compress_read_chunk()
11795     + * @buffer_page: struct page *. Pointer to a buffer of size PAGE_SIZE.
11796     + * @sync: int. Whether the previous module (or core) wants its data
11797     + * synchronously.
11798     + *
11799     + * Retrieve data from later modules and decompress it until the input buffer
11800     + * is filled.
11801     + * Returns: Zero if successful, otherwise an error from this module or downstream.
11802     + */
11803     +static int suspend_compress_read_chunk(unsigned long *index,
11804     + struct page *buffer_page, unsigned int *buf_size, int sync)
11805     +{
11806     + int ret, cpu = smp_processor_id();
11807     + unsigned int len;
11808     + unsigned int outlen = PAGE_SIZE;
11809     + char *buffer_start;
11810     + struct cpu_context *ctx = &per_cpu(contexts, cpu);
11811     +
11812     + if (!ctx->transform)
11813     + return next_driver->read_chunk(index, buffer_page, buf_size,
11814     + sync);
11815     +
11816     + /*
11817     + * All our reads must be synchronous - we can't decompress
11818     + * data that hasn't been read yet.
11819     + */
11820     +
11821     + *buf_size = PAGE_SIZE;
11822     +
11823     + ret = next_driver->read_chunk(index, buffer_page, &len, SUSPEND_SYNC);
11824     +
11825     + /* Error or uncompressed data */
11826     + if (ret || len == PAGE_SIZE)
11827     + return ret;
11828     +
11829     + buffer_start = kmap(buffer_page);
11830     + memcpy(ctx->page_buffer, buffer_start, len);
11831     + ret = crypto_comp_decompress(
11832     + ctx->transform,
11833     + ctx->page_buffer,
11834     + len, buffer_start, &outlen);
11835     + if (ret)
11836     + abort_suspend(SUSPEND_FAILED_IO,
11837     + "Compress_read returned %d.\n", ret);
11838     + else if (outlen != PAGE_SIZE) {
11839     + abort_suspend(SUSPEND_FAILED_IO,
11840     + "Decompression yielded %d bytes instead of %ld.\n",
11841     + outlen, PAGE_SIZE);
11842     + ret = -EIO;
11843     + *buf_size = outlen;
11844     + }
11845     + kunmap(buffer_page);
11846     + return ret;
11847     +}
11848     +
11849     +/*
11850     + * suspend_compress_print_debug_stats
11851     + * @buffer: Pointer to a buffer into which the debug info will be printed.
11852     + * @size: Size of the buffer.
11853     + *
11854     + * Print information to be recorded for debugging purposes into a buffer.
11855     + * Returns: Number of characters written to the buffer.
11856     + */
11857     +
11858     +static int suspend_compress_print_debug_stats(char *buffer, int size)
11859     +{
11860     + int pages_in = suspend_compress_bytes_in >> PAGE_SHIFT,
11861     + pages_out = suspend_compress_bytes_out >> PAGE_SHIFT;
11862     + int len;
11863     +
11864     + /* Output the compression ratio achieved. */
11865     + if (*suspend_compressor_name)
11866     + len = snprintf_used(buffer, size, "- Compressor is '%s'.\n",
11867     + suspend_compressor_name);
11868     + else
11869     + len = snprintf_used(buffer, size, "- Compressor is not set.\n");
11870     +
11871     + if (pages_in)
11872     + len+= snprintf_used(buffer+len, size - len,
11873     + " Compressed %ld bytes into %ld (%d percent compression).\n",
11874     + suspend_compress_bytes_in,
11875     + suspend_compress_bytes_out,
11876     + (pages_in - pages_out) * 100 / pages_in);
11877     + return len;
11878     +}
11879     +
11880     +/*
11881     + * suspend_compress_memory_needed
11882     + *
11883     + * Tell the caller how much memory we need to operate during suspend/resume.
11884     + * Returns: Int. Maximum number of bytes of memory required for
11885     + * operation.
11886     + */
11887     +static int suspend_compress_memory_needed(void)
11888     +{
11889     + return 2 * PAGE_SIZE;
11890     +}
11891     +
11892     +static int suspend_compress_storage_needed(void)
11893     +{
11894     + return 4 * sizeof(unsigned long) + strlen(suspend_compressor_name) + 1;
11895     +}
11896     +
11897     +/*
11898     + * suspend_compress_save_config_info
11899     + * @buffer: Pointer to a buffer of size PAGE_SIZE.
11900     + *
11901     + * Save information needed when reloading the image at resume time.
11902     + * Returns: Number of bytes used for saving our data.
11903     + */
11904     +static int suspend_compress_save_config_info(char *buffer)
11905     +{
11906     + int namelen = strlen(suspend_compressor_name) + 1;
11907     + int total_len;
11908     +
11909     + *((unsigned long *) buffer) = suspend_compress_bytes_in;
11910     + *((unsigned long *) (buffer + 1 * sizeof(unsigned long))) =
11911     + suspend_compress_bytes_out;
11912     + *((unsigned long *) (buffer + 2 * sizeof(unsigned long))) =
11913     + suspend_expected_compression;
11914     + *((unsigned long *) (buffer + 3 * sizeof(unsigned long))) = namelen;
11915     + strncpy(buffer + 4 * sizeof(unsigned long), suspend_compressor_name,
11916     + namelen);
11917     + total_len = 4 * sizeof(unsigned long) + namelen;
11918     + return total_len;
11919     +}
11920     +
11921     +/* suspend_compress_load_config_info
11922     + * @buffer: Pointer to the start of the data.
11923     + * @size: Number of bytes that were saved.
11924     + *
11925     + * Description: Reload information needed for decompressing the image at
11926     + * resume time.
11927     + */
11928     +static void suspend_compress_load_config_info(char *buffer, int size)
11929     +{
11930     + int namelen;
11931     +
11932     + suspend_compress_bytes_in = *((unsigned long *) buffer);
11933     + suspend_compress_bytes_out = *((unsigned long *) (buffer + 1 * sizeof(unsigned long)));
11934     + suspend_expected_compression = *((unsigned long *) (buffer + 2 *
11935     + sizeof(unsigned long)));
11936     + namelen = *((unsigned long *) (buffer + 3 * sizeof(unsigned long)));
11937     + strncpy(suspend_compressor_name, buffer + 4 * sizeof(unsigned long),
11938     + namelen);
11939     + return;
11940     +}
11941     +
11942     +/*
11943     + * suspend_compress_expected_ratio
11944     + *
11945     + * Description: Returns the expected ratio between data passed into this module
11946     + * and the amount of data output when writing.
11947     + * Returns: 100 if the module is disabled. Otherwise the value set by the
11948     + * user via our sysfs entry.
11949     + */
11950     +
11951     +static int suspend_compress_expected_ratio(void)
11952     +{
11953     + if (!suspend_compression_ops.enabled)
11954     + return 100;
11955     + else
11956     + return 100 - suspend_expected_compression;
11957     +}
11958     +
11959     +/*
11960     + * Data for our sysfs entries.
11961     + */
11962     +static struct suspend_sysfs_data sysfs_params[] = {
11963     + {
11964     + SUSPEND2_ATTR("expected_compression", SYSFS_RW),
11965     + SYSFS_INT(&suspend_expected_compression, 0, 99, 0)
11966     + },
11967     +
11968     + {
11969     + SUSPEND2_ATTR("enabled", SYSFS_RW),
11970     + SYSFS_INT(&suspend_compression_ops.enabled, 0, 1, 0)
11971     + },
11972     +
11973     + {
11974     + SUSPEND2_ATTR("algorithm", SYSFS_RW),
11975     + SYSFS_STRING(suspend_compressor_name, 31, 0)
11976     + }
11977     +};
11978     +
11979     +/*
11980     + * Ops structure.
11981     + */
11982     +static struct suspend_module_ops suspend_compression_ops = {
11983     + .type = FILTER_MODULE,
11984     + .name = "Compressor",
11985     + .directory = "compression",
11986     + .module = THIS_MODULE,
11987     + .initialise = suspend_compress_init,
11988     + .cleanup = suspend_compress_cleanup,
11989     + .memory_needed = suspend_compress_memory_needed,
11990     + .print_debug_info = suspend_compress_print_debug_stats,
11991     + .save_config_info = suspend_compress_save_config_info,
11992     + .load_config_info = suspend_compress_load_config_info,
11993     + .storage_needed = suspend_compress_storage_needed,
11994     + .expected_compression = suspend_compress_expected_ratio,
11995     +
11996     + .rw_init = suspend_compress_rw_init,
11997     +
11998     + .write_chunk = suspend_compress_write_chunk,
11999     + .read_chunk = suspend_compress_read_chunk,
12000     +
12001     + .sysfs_data = sysfs_params,
12002     + .num_sysfs_entries = sizeof(sysfs_params) / sizeof(struct suspend_sysfs_data),
12003     +};
12004     +
12005     +/* ---- Registration ---- */
12006     +
12007     +static __init int suspend_compress_load(void)
12008     +{
12009     + return suspend_register_module(&suspend_compression_ops);
12010     +}
12011     +
12012     +#ifdef MODULE
12013     +static __exit void suspend_compress_unload(void)
12014     +{
12015     + suspend_unregister_module(&suspend_compression_ops);
12016     +}
12017     +
12018     +module_init(suspend_compress_load);
12019     +module_exit(suspend_compress_unload);
12020     +MODULE_LICENSE("GPL");
12021     +MODULE_AUTHOR("Nigel Cunningham");
12022     +MODULE_DESCRIPTION("Compression Support for Suspend2");
12023     +#else
12024     +late_initcall(suspend_compress_load);
12025     +#endif
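
The compressor's save_config_info/load_config_info pair above stores four unsigned long fields followed by the NUL-terminated algorithm name. The sketch below reproduces that layout in user space to show the round trip. It is an illustration under stated assumptions, not the patch's code: the names (sketch_config, sketch_save, sketch_load, SKETCH_NAME_MAX) are hypothetical, and the length checks in sketch_load are an addition of this sketch that the original load routine does not perform.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_NAME_MAX 32 /* matches the 32-byte name array in the patch */

/* Hypothetical container for the values the compressor saves. */
struct sketch_config {
	unsigned long bytes_in;
	unsigned long bytes_out;
	unsigned long expected_compression;
	char name[SKETCH_NAME_MAX];
};

/*
 * Layout: four unsigned longs, then the NUL-terminated name.
 * Returns the number of bytes written to buf.
 */
static size_t sketch_save(const struct sketch_config *c, char *buf)
{
	unsigned long fields[4];
	size_t namelen = strlen(c->name) + 1;

	fields[0] = c->bytes_in;
	fields[1] = c->bytes_out;
	fields[2] = c->expected_compression;
	fields[3] = namelen;
	memcpy(buf, fields, sizeof(fields));
	memcpy(buf + sizeof(fields), c->name, namelen);
	return sizeof(fields) + namelen;
}

/*
 * Returns 0 on success, -1 if the record is truncated or the stored
 * name would not fit. These checks are this sketch's assumption.
 */
static int sketch_load(struct sketch_config *c, const char *buf, size_t size)
{
	unsigned long fields[4];
	size_t namelen;

	if (size < sizeof(fields))
		return -1;
	memcpy(fields, buf, sizeof(fields));
	namelen = fields[3];
	if (namelen == 0 || namelen > SKETCH_NAME_MAX ||
	    size - sizeof(fields) < namelen)
		return -1;

	c->bytes_in = fields[0];
	c->bytes_out = fields[1];
	c->expected_compression = fields[2];
	memcpy(c->name, buf + sizeof(fields), namelen);
	c->name[namelen - 1] = '\0'; /* force termination of the name */
	return 0;
}
```

Copying through memcpy rather than casting the byte buffer to unsigned long * sidesteps alignment concerns in portable user-space code; whether that matters for the kernel buffer used by the patch is not something this sketch claims.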
12026     diff --git a/kernel/power/suspend_file.c b/kernel/power/suspend_file.c
12027     new file mode 100644
12028     index 0000000..e523f8b
12029     --- /dev/null
12030     +++ b/kernel/power/suspend_file.c
12031     @@ -0,0 +1,1131 @@
12032     +/*
12033     + * kernel/power/suspend_file.c
12034     + *
12035     + * Copyright (C) 2005-2007 Nigel Cunningham (nigel at suspend2 net)
12036     + *
12037     + * Distributed under GPLv2.
12038     + *
12039     + * This file encapsulates functions for usage of a simple file as a
12040     + * backing store. It is based upon the swapallocator, and shares the
12041     + * same basic working. Here, though, we have nothing to do with
12042     + * swapspace, and only one device to worry about.
12043     + *
12044     + * The user can just
12045     + *
12046     + * echo Suspend2 > /path/to/my_file
12047     + *
12048     + * and
12049     + *
12050     + * echo /path/to/my_file > /sys/power/suspend2/suspend_file/target
12051     + *
12052     + * then put what they find in /sys/power/suspend2/resume2
12053     + * as their resume2= parameter in lilo.conf (and rerun lilo if using it).
12054     + *
12055     + * Having done this, they're ready to suspend and resume.
12056     + *
12057     + * TODO:
12058     + * - File resizing.
12059     + */
12060     +
12061     +#include <linux/suspend.h>
12062     +#include <linux/module.h>
12063     +#include <linux/blkdev.h>
12064     +#include <linux/file.h>
12065     +#include <linux/stat.h>
12066     +#include <linux/mount.h>
12067     +#include <linux/statfs.h>
12068     +#include <linux/syscalls.h>
12069     +#include <linux/namei.h>
12070     +#include <linux/fs.h>
12071     +#include <linux/root_dev.h>
12072     +
12073     +#include "suspend.h"
12074     +#include "sysfs.h"
12075     +#include "modules.h"
12076     +#include "ui.h"
12077     +#include "extent.h"
12078     +#include "io.h"
12079     +#include "storage.h"
12080     +#include "block_io.h"
12081     +
12082     +static struct suspend_module_ops suspend_fileops;
12083     +
12084     +/* Details of our target. */
12085     +
12086     +char suspend_file_target[256];
12087     +static struct inode *target_inode;
12088     +static struct file *target_file;
12089     +static struct block_device *suspend_file_target_bdev;
12090     +static dev_t resume_file_dev_t;
12091     +static int used_devt = 0;
12092     +static int setting_suspend_file_target = 0;
12093     +static sector_t target_firstblock = 0, target_header_start = 0;
12094     +static int target_storage_available = 0;
12095     +static int target_claim = 0;
12096     +
12097     +static char HaveImage[] = "HaveImage\n";
12098     +static char NoImage[] = "Suspend2\n";
12099     +#define sig_size (sizeof(HaveImage) + 1)
12100     +
12101     +struct suspend_file_header {
12102     + char sig[sig_size];
12103     + int resumed_before;
12104     + unsigned long first_header_block;
12105     +};
12106     +
12107     +extern char *__initdata root_device_name;
12108     +
12109     +/* Header Page Information */
12110     +static int header_pages_allocated;
12111     +
12112     +/* Main Storage Pages */
12113     +static int main_pages_allocated, main_pages_requested;
12114     +
12115     +#define target_is_normal_file() (S_ISREG(target_inode->i_mode))
12116     +
12117     +static struct suspend_bdev_info devinfo;
12118     +
12119     +/* Extent chain for blocks */
12120     +static struct extent_chain block_chain;
12121     +
12122     +/* Signature operations */
12123     +enum {
12124     + GET_IMAGE_EXISTS,
12125     + INVALIDATE,
12126     + MARK_RESUME_ATTEMPTED,
12127     + UNMARK_RESUME_ATTEMPTED,
12128     +};
12129     +
12130     +static void set_devinfo(struct block_device *bdev, int target_blkbits)
12131     +{
12132     + devinfo.bdev = bdev;
12133     + if (!target_blkbits) {
12134     + devinfo.bmap_shift = devinfo.blocks_per_page = 0;
12135     + } else {
12136     + devinfo.bmap_shift = target_blkbits - 9;
12137     + devinfo.blocks_per_page = (1 << (PAGE_SHIFT - target_blkbits));
12138     + }
12139     +}
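A minimal sketch of the shift arithmetic in set_devinfo(), for readers checking the units. It assumes 4 KiB pages and 512-byte sectors; the names sketch_devinfo, sketch_set_devinfo and sketch_block_to_sector are hypothetical and are not part of the patch.

```c
#define SKETCH_PAGE_SHIFT   12  /* assumption: 4 KiB pages */
#define SKETCH_SECTOR_SHIFT 9   /* 512-byte sectors, the "- 9" in set_devinfo() */

struct sketch_devinfo {
	int bmap_shift;       /* log2(sectors per filesystem block) */
	int blocks_per_page;  /* filesystem blocks needed to hold one page */
};

/* Same arithmetic as set_devinfo(), without the block_device pointer. */
static struct sketch_devinfo sketch_set_devinfo(int blkbits)
{
	struct sketch_devinfo d = { 0, 0 };

	if (blkbits) {
		d.bmap_shift = blkbits - SKETCH_SECTOR_SHIFT;
		d.blocks_per_page = 1 << (SKETCH_PAGE_SHIFT - blkbits);
	}
	return d;
}

/* Convert a filesystem block number into a 512-byte sector number. */
static unsigned long sketch_block_to_sector(struct sketch_devinfo d,
					    unsigned long block)
{
	return block << d.bmap_shift;
}
```

With 1 KiB filesystem blocks (blkbits == 10) this gives a shift of 1 and four blocks per page, so block 100 starts at sector 200.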
12140     +
12141     +static int adjust_for_extra_pages(int unadjusted)
12142     +{
12143     + return (unadjusted << PAGE_SHIFT) / (PAGE_SIZE + sizeof(unsigned long)
12144     + + sizeof(int));
12145     +}
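The adjustment above reserves room for per-page bookkeeping: every stored page costs PAGE_SIZE bytes plus an unsigned long and an int. The sketch below restates that arithmetic with a 64-bit intermediate, since the int expression in the patch appears to overflow once the input reaches 2^19 pages (2 GiB with 4 KiB pages). The constants assume 4 KiB pages, an 8-byte unsigned long and a 4-byte int; sketch_adjust_for_extra_pages is a hypothetical name.

```c
#define SKETCH_PAGE_SHIFT        12
#define SKETCH_PAGE_SIZE         (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_PER_PAGE_OVERHEAD 12UL /* assumption: 8-byte long + 4-byte int */

/*
 * Given raw storage in pages, return how many image pages fit once each
 * page's bookkeeping entry is accounted for. Expects a non-negative input.
 */
static long sketch_adjust_for_extra_pages(long unadjusted)
{
	/* Widen before shifting so large page counts do not overflow. */
	unsigned long long bytes =
		(unsigned long long) unadjusted << SKETCH_PAGE_SHIFT;

	return (long) (bytes / (SKETCH_PAGE_SIZE + SKETCH_PER_PAGE_OVERHEAD));
}
```

Under these assumptions, 1000 raw pages leave room for 997 image pages.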
12146     +
12147     +static int suspend_file_storage_available(void)
12148     +{
12149     + int result = 0;
12150     + struct block_device *bdev=suspend_file_target_bdev;
12151     +
12152     + if (!target_inode)
12153     + return 0;
12154     +
12155     + switch (target_inode->i_mode & S_IFMT) {
12156     + case S_IFSOCK:
12157     + case S_IFCHR:
12158     + case S_IFIFO: /* Socket, Char, Fifo */
12159     + return -1;
12160     + case S_IFREG: /* Regular file: current size - holes + free
12161     + space on part */
12162     + result = target_storage_available;
12163     + break;
12164     + case S_IFBLK: /* Block device */
12165     + if (!bdev->bd_disk) {
12166     + printk("bdev->bd_disk null.\n");
12167     + return 0;
12168     + }
12169     +
12170     + result = (bdev->bd_part ?
12171     + bdev->bd_part->nr_sects :
12172     + bdev->bd_disk->capacity) >> (PAGE_SHIFT - 9);
12173     + }
12174     +
12175     + return adjust_for_extra_pages(result);
12176     +}
12177     +
12178     +static int has_contiguous_blocks(int page_num)
12179     +{
12180     + int j;
12181     + sector_t last = 0;
12182     +
12183     + for (j = 0; j < devinfo.blocks_per_page; j++) {
12184     + sector_t this = bmap(target_inode,
12185     + page_num * devinfo.blocks_per_page + j);
12186     +
12187     + if (!this || (last && (last + 1) != this))
12188     + break;
12189     +
12190     + last = this;
12191     + }
12192     +
12193     + return (j == devinfo.blocks_per_page);
12194     +}
12195     +
12196     +static int size_ignoring_ignored_pages(void)
12197     +{
12198     + int mappable = 0, i;
12199     +
12200     + if (!target_is_normal_file())
12201     + return suspend_file_storage_available();
12202     +
12203     + for (i = 0; i < (target_inode->i_size >> PAGE_SHIFT) ; i++)
12204     + if (has_contiguous_blocks(i))
12205     + mappable++;
12206     +
12207     + return mappable;
12208     +}
12209     +
12210     +static void __populate_block_list(int min, int max)
12211     +{
12212     + if (test_action_state(SUSPEND_TEST_BIO))
12213     + printk("Adding extent %d-%d.\n", min << devinfo.bmap_shift,
12214     + ((max + 1) << devinfo.bmap_shift) - 1);
12215     +
12216     + suspend_add_to_extent_chain(&block_chain, min, max);
12217     +}
12218     +
12219     +static void populate_block_list(void)
12220     +{
12221     + int i;
12222     + int extent_min = -1, extent_max = -1, got_header = 0;
12223     +
12224     + if (block_chain.first)
12225     + suspend_put_extent_chain(&block_chain);
12226     +
12227     + if (!target_is_normal_file()) {
12228     + if (target_storage_available > 0)
12229     + __populate_block_list(devinfo.blocks_per_page,
12230     + (target_storage_available + 1) *
12231     + devinfo.blocks_per_page - 1);
12232     + return;
12233     + }
12234     +
12235     + for (i = 0; i < (target_inode->i_size >> PAGE_SHIFT); i++) {
12236     + sector_t new_sector;
12237     +
12238     + if (!has_contiguous_blocks(i))
12239     + continue;
12240     +
12241     + new_sector = bmap(target_inode,
12242     + (i * devinfo.blocks_per_page));
12243     +
12244     + /*
12245     + * Ignore the first block in the file.
12246     + * It gets the header.
12247     + */
12248     + if (new_sector == target_firstblock >> devinfo.bmap_shift) {
12249     + got_header = 1;
12250     + continue;
12251     + }
12252     +
12253     + /*
12254     + * I'd love to be able to fill in holes and resize
12255     + * files, but not yet...
12256     + */
12257     +
12258     + if (new_sector == extent_max + 1)
12259     + extent_max+= devinfo.blocks_per_page;
12260     + else {
12261     + if (extent_min > -1)
12262     + __populate_block_list(extent_min,
12263     + extent_max);
12264     +
12265     + extent_min = new_sector;
12266     + extent_max = extent_min +
12267     + devinfo.blocks_per_page - 1;
12268     + }
12269     + }
12270     +
12271     + if (extent_min > -1)
12272     + __populate_block_list(extent_min, extent_max);
12273     +}
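The loop in populate_block_list() merges per-page block runs into extents. The following is a sketch of that coalescing step over a plain array instead of bmap(); it is not the patch's code. The names sketch_extent and sketch_build_extents are hypothetical, a zero entry stands for a page that has_contiguous_blocks() would reject, and unlike the original the sketch only extends a run once one has been started.

```c
#include <stddef.h>

struct sketch_extent {
	long min, max;  /* inclusive block range */
};

/*
 * first_block[i] is the first filesystem block of page i, or 0 when the
 * page is a hole or not contiguous on disk. The page whose first block
 * equals header_block is skipped, as it holds the signature header.
 * Returns the number of extents written to out[] (at most out_cap).
 */
static size_t sketch_build_extents(const long *first_block, size_t pages,
				   int blocks_per_page, long header_block,
				   struct sketch_extent *out, size_t out_cap)
{
	long extent_min = -1, extent_max = -1;
	size_t n = 0, i;

	for (i = 0; i < pages; i++) {
		long b = first_block[i];

		if (!b || b == header_block)
			continue;

		if (extent_min > -1 && b == extent_max + 1) {
			/* Page follows directly on disk: grow the current run. */
			extent_max += blocks_per_page;
			continue;
		}

		/* Discontinuity: emit the finished run and start a new one. */
		if (extent_min > -1 && n < out_cap) {
			out[n].min = extent_min;
			out[n].max = extent_max;
			n++;
		}
		extent_min = b;
		extent_max = b + blocks_per_page - 1;
	}

	if (extent_min > -1 && n < out_cap) {
		out[n].min = extent_min;
		out[n].max = extent_max;
		n++;
	}
	return n;
}
```

For example, with four blocks per page and pages starting at blocks 8 (the header), 12, 16, a hole, 40, 44 and 100, the result is the three extents 12-19, 40-47 and 100-103.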
12274     +
12275     +static void suspend_file_cleanup(int finishing_cycle)
12276     +{
12277     + if (suspend_file_target_bdev) {
12278     + if (target_claim) {
12279     + bd_release(suspend_file_target_bdev);
12280     + target_claim = 0;
12281     + }
12282     +
12283     + if (used_devt) {
12284     + blkdev_put(suspend_file_target_bdev);
12285     + used_devt = 0;
12286     + }
12287     + suspend_file_target_bdev = NULL;
12288     + target_inode = NULL;
12289     + set_devinfo(NULL, 0);
12290     + target_storage_available = 0;
12291     + }
12292     +
12293     + if (target_file) {
12294     + filp_close(target_file, NULL);
12295     + target_file = NULL;
12296     + }
12297     +}
12298     +
12299     +/*
12300     + * reopen_resume_devt
12301     + *
12302     + * Having opened resume2= once, we remember the major and
12303     + * minor numbers and use them to reopen the bdev for checking
12304     + * whether an image exists (possibly when starting a resume).
12305     + */
12306     +static void reopen_resume_devt(void)
12307     +{
12308     + suspend_file_target_bdev = open_by_devnum(resume_file_dev_t, FMODE_READ);
12309     + if (IS_ERR(suspend_file_target_bdev)) {
12310     + printk("Got a dev_num (%lx) but failed to open it.\n",
12311     + (unsigned long) resume_file_dev_t);
12312     + return;
12313     + }
12314     + target_inode = suspend_file_target_bdev->bd_inode;
12315     + set_devinfo(suspend_file_target_bdev, target_inode->i_blkbits);
12316     +}
12317     +
12318     +static void suspend_file_get_target_info(char *target, int get_size,
12319     + int resume2)
12320     +{
12321     + if (target_file)
12322     + suspend_file_cleanup(0);
12323     +
12324     + if (!target || !strlen(target))
12325     + return;
12326     +
12327     + target_file = filp_open(target, O_RDWR, 0);
12328     +
12329     + if (IS_ERR(target_file) || !target_file) {
12330     +
12331     + if (!resume2) {
12332     + printk("Open file %s returned %p.\n",
12333     + target, target_file);
12334     + target_file = NULL;
12335     + return;
12336     + }
12337     +
12338     + target_file = NULL;
12339     + resume_file_dev_t = name_to_dev_t(target);
12340     + if (!resume_file_dev_t) {
12341     + struct kstat stat;
12342     + int error = vfs_stat(target, &stat);
12343     + printk("Open file %s returned %p and name_to_dev_t "
12344     + "failed.\n", target, target_file);
12345     + if (error)
12346     + printk("Stating the file also failed."
12347     + " Nothing more we can do.\n");
12348     + else
12349     + resume_file_dev_t = stat.rdev;
12350     + return;
12351     + }
12352     +
12353     + suspend_file_target_bdev = open_by_devnum(resume_file_dev_t,
12354     + FMODE_READ);
12355     + if (IS_ERR(suspend_file_target_bdev)) {
12356     + printk("Got a dev_num (%lx) but failed to open it.\n",
12357     + (unsigned long) resume_file_dev_t);
12358     + return;
12359     + }
12360     + used_devt = 1;
12361     + target_inode = suspend_file_target_bdev->bd_inode;
12362     + } else
12363     + target_inode = target_file->f_mapping->host;
12364     +
12365     + if (S_ISLNK(target_inode->i_mode) || S_ISDIR(target_inode->i_mode) ||
12366     + S_ISSOCK(target_inode->i_mode) || S_ISFIFO(target_inode->i_mode)) {
12367     + printk("File support works with regular files, character "
12368     + "files and block devices.\n");
12369     + goto cleanup;
12370     + }
12371     +
12372     + if (!used_devt) {
12373     + if (S_ISBLK(target_inode->i_mode)) {
12374     + suspend_file_target_bdev = I_BDEV(target_inode);
12375     + if (!bd_claim(suspend_file_target_bdev, &suspend_fileops))
12376     + target_claim = 1;
12377     + } else
12378     + suspend_file_target_bdev = target_inode->i_sb->s_bdev;
12379     + resume_file_dev_t = suspend_file_target_bdev->bd_dev;
12380     + }
12381     +
12382     + set_devinfo(suspend_file_target_bdev, target_inode->i_blkbits);
12383     +
12384     + if (get_size)
12385     + target_storage_available = size_ignoring_ignored_pages();
12386     +
12387     + if (!resume2)
12388     + target_firstblock = bmap(target_inode, 0) << devinfo.bmap_shift;
12389     +
12390     + return;
12391     +cleanup:
12392     + target_inode = NULL;
12393     + if (target_file) {
12394     + filp_close(target_file, NULL);
12395     + target_file = NULL;
12396     + }
12397     + set_devinfo(NULL, 0);
12398     + target_storage_available = 0;
12399     +}
12400     +
12401     +static int parse_signature(struct suspend_file_header *header)
12402     +{
12403     + int have_image = !memcmp(HaveImage, header->sig, sizeof(HaveImage) - 1);
12404     + int no_image_header = !memcmp(NoImage, header->sig, sizeof(NoImage) - 1);
12405     +
12406     + if (no_image_header)
12407     + return 0;
12408     +
12409     + if (!have_image)
12410     + return -1;
12411     +
12412     + if (header->resumed_before)
12413     + set_suspend_state(SUSPEND_RESUMED_BEFORE);
12414     + else
12415     + clear_suspend_state(SUSPEND_RESUMED_BEFORE);
12416     +
12417     + target_header_start = header->first_header_block;
12418     + return 1;
12419     +}
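parse_signature() distinguishes three states by the first bytes of the file. The sketch below shows the same three-way classification in user space; sketch_classify_signature is a hypothetical helper, and the explicit length argument is an addition for safety that the kernel code does not have. Note that "echo Suspend2 > file" writes a trailing newline, which is why the NoImage string ends in "\n".

```c
#include <stddef.h>
#include <string.h>

static const char sketch_have_image[] = "HaveImage\n";
static const char sketch_no_image[]   = "Suspend2\n";

/*
 * Returns  1 if the signature says an image is stored,
 *          0 if the file is prepared but holds no image,
 *         -1 if neither signature is present.
 */
static int sketch_classify_signature(const char *sig, size_t len)
{
	size_t have_len = sizeof(sketch_have_image) - 1;
	size_t none_len = sizeof(sketch_no_image) - 1;

	if (len >= none_len && !memcmp(sig, sketch_no_image, none_len))
		return 0;
	if (len >= have_len && !memcmp(sig, sketch_have_image, have_len))
		return 1;
	return -1;
}
```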
12420     +
12421     +/* prepare_signature */
12422     +
12423     +static int prepare_signature(struct suspend_file_header *current_header,
12424     + unsigned long first_header_block)
12425     +{
12426     + strncpy(current_header->sig, HaveImage, sizeof(HaveImage));
12427     + current_header->resumed_before = 0;
12428     + current_header->first_header_block = first_header_block;
12429     + return 0;
12430     +}
12431     +
12432     +static int suspend_file_storage_allocated(void)
12433     +{
12434     + if (!target_inode)
12435     + return 0;
12436     +
12437     + if (target_is_normal_file())
12438     + return (int) target_storage_available;
12439     + else
12440     + return header_pages_allocated + main_pages_requested;
12441     +}
12442     +
12443     +static int suspend_file_release_storage(void)
12444     +{
12445     + if (test_action_state(SUSPEND_KEEP_IMAGE) &&
12446     + test_suspend_state(SUSPEND_NOW_RESUMING))
12447     + return 0;
12448     +
12449     + suspend_put_extent_chain(&block_chain);
12450     +
12451     + header_pages_allocated = 0;
12452     + main_pages_allocated = 0;
12453     + main_pages_requested = 0;
12454     + return 0;
12455     +}
12456     +
12457     +static int __suspend_file_allocate_storage(int main_storage_requested,
12458     + int header_storage);
12459     +
12460     +static int suspend_file_allocate_header_space(int space_requested)
12461     +{
12462     + int i;
12463     +
12464     + if (!block_chain.first && __suspend_file_allocate_storage(
12465     + main_pages_requested, space_requested)) {
12466     + printk("Failed to allocate space for the header.\n");
12467     + return -ENOSPC;
12468     + }
12469     +
12470     + suspend_extent_state_goto_start(&suspend_writer_posn);
12471     + suspend_bio_ops.forward_one_page(); /* To first page */
12472     +
12473     + for (i = 0; i < space_requested; i++) {
12474     + if (suspend_bio_ops.forward_one_page()) {
12475     + printk("Out of space while seeking to allocate "
12476     + "header pages.\n");
12477     + header_pages_allocated = i;
12478     + return -ENOSPC;
12479     + }
12480     + }
12481     +
12482     + header_pages_allocated = space_requested;
12483     +
12484     + /* The end of header pages will be the start of pageset 2 */
12485     + suspend_extent_state_save(&suspend_writer_posn,
12486     + &suspend_writer_posn_save[2]);
12487     + return 0;
12488     +}
12489     +
12490     +static int suspend_file_allocate_storage(int space_requested)
12491     +{
12492     + if (__suspend_file_allocate_storage(space_requested,
12493     + header_pages_allocated))
12494     + return -ENOSPC;
12495     +
12496     + main_pages_requested = space_requested;
12497     + return 0;
12498     +}
12499     +
12500     +static int __suspend_file_allocate_storage(int main_space_requested,
12501     + int header_space_requested)
12502     +{
12503     + int result = 0;
12504     +
12505     + int extra_pages = DIV_ROUND_UP(main_space_requested *
12506     + (sizeof(unsigned long) + sizeof(int)), PAGE_SIZE);
12507     + int pages_to_get = main_space_requested + extra_pages +
12508     + header_space_requested;
12509     + int blocks_to_get = pages_to_get - block_chain.size;
12510     +
12511     + /* Only release_storage reduces the size */
12512     + if (blocks_to_get < 1)
12513     + return 0;
12514     +
12515     + populate_block_list();
12516     +
12517     + suspend_message(SUSPEND_WRITER, SUSPEND_MEDIUM, 0,
12518     + "Finished with block_chain.size == %d.\n",
12519     + block_chain.size);
12520     +
12521     + if (block_chain.size < pages_to_get) {
12522     + printk("Block chain size (%d) < header pages (%d) + extra pages (%d) + main pages (%d) (=%d pages).\n",
12523     + block_chain.size, header_pages_allocated, extra_pages,
12524     + main_space_requested, pages_to_get);
12525     + result = -ENOSPC;
12526     + }
12527     +
12528     + main_pages_requested = main_space_requested;
12529     + main_pages_allocated = main_space_requested + extra_pages;
12530     +
12531     + suspend_file_allocate_header_space(header_pages_allocated);
12532     + return result;
12533     +}
12534     +
12535     +static int suspend_file_write_header_init(void)
12536     +{
12537     + suspend_extent_state_goto_start(&suspend_writer_posn);
12538     +
12539     + suspend_writer_buffer_posn = suspend_header_bytes_used = 0;
12540     +
12541     + /* Info needed to bootstrap goes at the start of the header.
12542     + * First we save the basic info needed for reading, including the number
12543     + * of header pages. Then we save the structs containing data needed
12544     + * for reading the header pages back.
12545     + * Note that even if header pages take more than one page, when we
12546     + * read back the info, we will have restored the location of the
12547     + * next header page by the time we go to use it.
12548     + */
12549     +
12550     + suspend_bio_ops.rw_header_chunk(WRITE, &suspend_fileops,
12551     + (char *) &suspend_writer_posn_save,
12552     + sizeof(suspend_writer_posn_save));
12553     +
12554     + suspend_bio_ops.rw_header_chunk(WRITE, &suspend_fileops,
12555     + (char *) &devinfo, sizeof(devinfo));
12556     +
12557     + suspend_serialise_extent_chain(&suspend_fileops, &block_chain);
12558     +
12559     + return 0;
12560     +}
12561     +
12562     +static int suspend_file_write_header_cleanup(void)
12563     +{
12564     + struct suspend_file_header *header;
12565     +
12566     + /* Write any unsaved data */
12567     + if (suspend_writer_buffer_posn)
12568     + suspend_bio_ops.write_header_chunk_finish();
12569     +
12570     + suspend_bio_ops.finish_all_io();
12571     +
12572     + suspend_extent_state_goto_start(&suspend_writer_posn);
12573     + suspend_bio_ops.forward_one_page();
12574     +
12575     + /* Adjust image header */
12576     + suspend_bio_ops.bdev_page_io(READ, suspend_file_target_bdev,
12577     + target_firstblock,
12578     + virt_to_page(suspend_writer_buffer));
12579     +
12580     + header = (struct suspend_file_header *) suspend_writer_buffer;
12581     +
12582     + prepare_signature(header,
12583     + suspend_writer_posn.current_offset <<
12584     + devinfo.bmap_shift);
12585     +
12586     + suspend_bio_ops.bdev_page_io(WRITE, suspend_file_target_bdev,
12587     + target_firstblock,
12588     + virt_to_page(suspend_writer_buffer));
12589     +
12590     + suspend_bio_ops.finish_all_io();
12591     +
12592     + return 0;
12593     +}
12594     +
12595     +/* HEADER READING */
12596     +
12597     +#ifdef CONFIG_DEVFS_FS
12598     +int create_dev(char *name, dev_t dev, char *devfs_name);
12599     +#else
12600     +static int create_dev(char *name, dev_t dev, char *devfs_name)
12601     +{
12602     + sys_unlink(name);
12603     + return sys_mknod(name, S_IFBLK|0600, new_encode_dev(dev));
12604     +}
12605     +#endif
12606     +
12607     +static int rd_init(void)
12608     +{
12609     + suspend_writer_buffer_posn = 0;
12610     +
12611     + create_dev("/dev/root", ROOT_DEV, root_device_name);
12612     + create_dev("/dev/ram", MKDEV(RAMDISK_MAJOR, 0), NULL);
12613     +
12614     + suspend_read_fd = sys_open("/dev/root", O_RDONLY, 0);
12615     + if (suspend_read_fd < 0)
12616     + goto out;
12617     +
12618     + sys_read(suspend_read_fd, suspend_writer_buffer, BLOCK_SIZE);
12619     +
12620     + memcpy(&suspend_writer_posn_save,
12621     + suspend_writer_buffer + suspend_writer_buffer_posn,
12622     + sizeof(suspend_writer_posn_save));
12623     +
12624     + suspend_writer_buffer_posn += sizeof(suspend_writer_posn_save);
12625     +
12626     + return 0;
12627     +out:
12628     + sys_unlink("/dev/ram");
12629     + sys_unlink("/dev/root");
12630     + return -EIO;
12631     +}
12632     +
12633     +static int file_init(void)
12634     +{
12635     + suspend_writer_buffer_posn = 0;
12636     +
12637     + /* Read suspend_file configuration */
12638     + suspend_bio_ops.bdev_page_io(READ, suspend_file_target_bdev,
12639     + target_header_start,
12640     + virt_to_page((unsigned long) suspend_writer_buffer));
12641     +
12642     + return 0;
12643     +}
12644     +
12645     +/*
12646     + * read_header_init()
12647     + *
12648     + * Ramdisk support based heavily on init/do_mounts_rd.c
12649     + *
12650     + * Description:
12651     + * 1. Attempt to read the device specified with resume2=.
12652     + * 2. Check the contents of the header for our signature.
12653     + * 3. Warn, ignore, reset and/or continue as appropriate.
12654     + * 4. If continuing, read the suspend_file configuration section
12655     + * of the header and set up block device info so we can read
12656     + * the rest of the header & image.
12657     + *
12658     + * Returns:
12659     + * May not return if the user chooses to reboot at a warning.
12660     + * -EINVAL if cannot resume at this time. Booting should continue
12661     + * normally.
12662     + */
12663     +
12664     +static int suspend_file_read_header_init(void)
12665     +{
12666     + int result;
12667     + struct block_device *tmp;
12668     +
12669     + if (test_suspend_state(SUSPEND_TRY_RESUME_RD))
12670     + result = rd_init();
12671     + else
12672     + result = file_init();
12673     +
12674     + if (result) {
12675     + printk("FileAllocator read header init: Failed to initialise "
12676     + "reading the first page of data.\n");
12677     + return result;
12678     + }
12679     +
12680     + memcpy(&suspend_writer_posn_save,
12681     + suspend_writer_buffer + suspend_writer_buffer_posn,
12682     + sizeof(suspend_writer_posn_save));
12683     +
12684     + suspend_writer_buffer_posn += sizeof(suspend_writer_posn_save);
12685     +
12686     + tmp = devinfo.bdev;
12687     +
12688     + memcpy(&devinfo,
12689     + suspend_writer_buffer + suspend_writer_buffer_posn,
12690     + sizeof(devinfo));
12691     +
12692     + devinfo.bdev = tmp;
12693     + suspend_writer_buffer_posn += sizeof(devinfo);
12694     +
12695     + suspend_bio_ops.read_header_init();
12696     + suspend_extent_state_goto_start(&suspend_writer_posn);
12697     + suspend_bio_ops.set_extra_page_forward();
12698     +
12699     + suspend_header_bytes_used = suspend_writer_buffer_posn;
12700     +
12701     + return suspend_load_extent_chain(&block_chain);
12702     +}
12703     +
12704     +static int suspend_file_read_header_cleanup(void)
12705     +{
12706     + suspend_bio_ops.rw_cleanup(READ);
12707     + return 0;
12708     +}
12709     +
12710     +static int suspend_file_signature_op(int op)
12711     +{
12712     + char *cur;
12713     + int result = 0, changed = 0;
12714     + struct suspend_file_header *header;
12715     +
12716     + if (!suspend_file_target_bdev)
12717     + return -1;
12718     +
12719     + cur = (char *) get_zeroed_page(GFP_ATOMIC);
12720     + if (!cur) {
12721     + printk("Unable to allocate a page for reading the image "
12722     + "signature.\n");
12723     + return -ENOMEM;
12724     + }
12725     +
12726     + suspend_bio_ops.bdev_page_io(READ, suspend_file_target_bdev,
12727     + target_firstblock,
12728     + virt_to_page(cur));
12729     +
12730     + header = (struct suspend_file_header *) cur;
12731     + result = parse_signature(header);
12732     +
12733     + switch (op) {
12734     + case INVALIDATE:
12735     + if (result == -1)
12736     + goto out;
12737     +
12738     + strcpy(header->sig, NoImage);
12739     + header->resumed_before = 0;
12740     + result = changed = 1;
12741     + break;
12742     + case MARK_RESUME_ATTEMPTED:
12743     + if (result == 1) {
12744     + header->resumed_before = 1;
12745     + changed = 1;
12746     + }
12747     + break;
12748     + case UNMARK_RESUME_ATTEMPTED:
12749     + if (result == 1) {
12750     + header->resumed_before = 0;
12751     + changed = 1;
12752     + }
12753     + break;
12754     + }
12755     +
12756     + if (changed)
12757     + suspend_bio_ops.bdev_page_io(WRITE, suspend_file_target_bdev,
12758     + target_firstblock,
12759     + virt_to_page(cur));
12760     +
12761     +out:
12762     + suspend_bio_ops.finish_all_io();
12763     + free_page((unsigned long) cur);
12764     + return result;
12765     +}
12766     +
12767     +/* Print debug info
12768     + *
12769     + * Description:
12770     + */
12771     +
12772     +static int suspend_file_print_debug_stats(char *buffer, int size)
12773     +{
12774     + int len = 0;
12775     +
12776     + if (suspendActiveAllocator != &suspend_fileops) {
12777     + len = snprintf_used(buffer, size, "- FileAllocator inactive.\n");
12778     + return len;
12779     + }
12780     +
12781     + len = snprintf_used(buffer, size, "- FileAllocator active.\n");
12782     +
12783     + len+= snprintf_used(buffer+len, size-len, " Storage available for image: "
12784     + "%d pages.\n",
12785     + suspend_file_storage_allocated());
12786     +
12787     + return len;
12788     +}
12789     +
12790     +/*
12791     + * Storage needed
12792     + *
12793     + * Returns amount of space in the image header required
12794     + * for the suspend_file's data.
12795     + *
12796     + * We ensure the space is allocated, but actually save the
12797     + * data from write_header_init and therefore don't also define a
12798     + * save_config_info routine.
12799     + */
12800     +static int suspend_file_storage_needed(void)
12801     +{
12802     + return sig_size + strlen(suspend_file_target) + 1 +
12803     + 3 * sizeof(struct extent_iterate_saved_state) +
12804     + sizeof(devinfo) +
12805     + sizeof(struct extent_chain) - 2 * sizeof(void *) +
12806     + (2 * sizeof(unsigned long) * block_chain.num_extents);
12807     +}
12808     +
12809     +/*
12810     + * suspend_file_invalidate_image
12811     + *
12812     + */
12813     +static int suspend_file_invalidate_image(void)
12814     +{
12815     + int result;
12816     +
12817     + suspend_file_release_storage();
12818     +
12819     + result = suspend_file_signature_op(INVALIDATE);
12820     + if (result == 1 && !nr_suspends)
12821     + printk(KERN_WARNING "Suspend2: Image invalidated.\n");
12822     +
12823     + return result;
12824     +}
12825     +
12826     +/*
12827     + * Image_exists
12828     + *
12829     + */
12830     +
12831     +static int suspend_file_image_exists(void)
12832     +{
12833     + if (!suspend_file_target_bdev)
12834     + reopen_resume_devt();
12835     +
12836     + return suspend_file_signature_op(GET_IMAGE_EXISTS);
12837     +}
12838     +
12839     +/*
12840     + * Mark resume attempted.
12841     + *
12842     + * Record that we tried to resume from this image.
12843     + */
12844     +
12845     +static void suspend_file_mark_resume_attempted(int mark)
12846     +{
12847     + suspend_file_signature_op(mark ? MARK_RESUME_ATTEMPTED:
12848     + UNMARK_RESUME_ATTEMPTED);
12849     +}
12850     +
12851     +static void suspend_file_set_resume2(void)
12852     +{
12853     + char *buffer = (char *) get_zeroed_page(GFP_ATOMIC);
12854     + char *buffer2 = (char *) get_zeroed_page(GFP_ATOMIC);
12855     + unsigned long sector = bmap(target_inode, 0);
12856     + int offset = 0;
12857     +
12858     + if (suspend_file_target_bdev) {
12859     + set_devinfo(suspend_file_target_bdev, target_inode->i_blkbits);
12860     +
12861     + bdevname(suspend_file_target_bdev, buffer2);
12862     + offset += snprintf(buffer + offset, PAGE_SIZE - offset,
12863     + "/dev/%s", buffer2);
12864     +
12865     + if (sector)
12866     + offset += snprintf(buffer + offset, PAGE_SIZE - offset,
12867     + ":0x%lx", sector << devinfo.bmap_shift);
12868     + } else
12869     + offset += snprintf(buffer + offset, PAGE_SIZE - offset,
12870     + "%s is not a valid target.", suspend_file_target);
12871     +
12872     + sprintf(resume2_file, "file:%s", buffer);
12873     +
12874     + free_page((unsigned long) buffer);
12875     + free_page((unsigned long) buffer2);
12876     +
12877     + suspend_attempt_to_parse_resume_device(1);
12878     +}
12879     +
12880     +static int __test_suspend_file_target(char *target, int resume_time, int quiet)
12881     +{
12882     + suspend_file_get_target_info(target, 0, resume_time);
12883     + if (suspend_file_signature_op(GET_IMAGE_EXISTS) > -1) {
12884     + if (!quiet)
12885     + printk("Suspend2: FileAllocator: File signature found.\n");
12886     + if (!resume_time)
12887     + suspend_file_set_resume2();
12888     +
12889     + suspend_bio_ops.set_devinfo(&devinfo);
12890     + suspend_writer_posn.chains = &block_chain;
12891     + suspend_writer_posn.num_chains = 1;
12892     +
12893     + if (!resume_time)
12894     + set_suspend_state(SUSPEND_CAN_SUSPEND);
12895     + return 0;
12896     + }
12897     +
12898     + clear_suspend_state(SUSPEND_CAN_SUSPEND);
12899     +
12900     + if (quiet)
12901     + return 1;
12902     +
12903     + if (*target)
12904     + printk("Suspend2: FileAllocator: Sorry. No signature found at"
12905     + " %s.\n", target);
12906     + else
12907     + if (!resume_time)
12908     + printk("Suspend2: FileAllocator: Sorry. Target is not"
12909     + " set for suspending.\n");
12910     +
12911     + return 1;
12912     +}
12913     +
12914     +static void test_suspend_file_target(void)
12915     +{
12916     + setting_suspend_file_target = 1;
12917     +
12918     + printk("Suspend2: Suspending %sabled.\n",
12919     + __test_suspend_file_target(suspend_file_target, 0, 1) ?
12920     + "dis" : "en");
12921     +
12922     + setting_suspend_file_target = 0;
12923     +}
12924     +
12925     +/*
12926     + * Parse Image Location
12927     + *
12928     + * Attempt to parse a resume2= parameter.
12929     + * FileAllocator accepts:
12930     + * resume2=file:DEVNAME[:FIRSTBLOCK][@BLOCKSIZE]
12931     + *
12932     + * Where:
12933     + * DEVNAME is convertible to a dev_t by name_to_dev_t
12934     + * FIRSTBLOCK is the location of the first block in the file.
12935     + * BLOCKSIZE is the logical blocksize of the device: a multiple of
12936     + * SECTOR_SIZE that is >= SECTOR_SIZE and <= PAGE_SIZE.
12937     + * Data is validated by attempting to read a header from the
12938     + * location given. Failure will result in suspend_file refusing to
12939     + * save an image, and a reboot with correct parameters will be
12940     + * necessary.
12941     + */
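As an illustration of the parameter format described above, here is a user-space sketch that splits a resume2= value into device name, first block and block size. It is a simplification, not the kernel parser: it does not modify its input, it always requires the "file:" prefix (the patch's only_writer flag relaxes that), and sketch_resume2 and sketch_parse_resume2 are hypothetical names. The 512-byte sector size is an assumption standing in for SECTOR_SIZE.

```c
#include <stdlib.h>
#include <string.h>

struct sketch_resume2 {
	char dev[256];
	unsigned long first_block;
	unsigned long block_size;   /* 0 when no @BLOCKSIZE was given */
};

/* Returns 0 on success, -1 on a malformed value. */
static int sketch_parse_resume2(const char *arg, struct sketch_resume2 *out)
{
	const char *p, *colon, *at;
	size_t dev_len;

	if (strncmp(arg, "file:", 5))
		return -1;
	p = arg + 5;

	/* The device name runs up to the first ':' or '@'. */
	dev_len = strcspn(p, ":@");
	if (dev_len == 0 || dev_len >= sizeof(out->dev))
		return -1;
	memcpy(out->dev, p, dev_len);
	out->dev[dev_len] = '\0';

	colon = (p[dev_len] == ':') ? p + dev_len : NULL;
	at = strchr(p + dev_len, '@');

	/* Base 0 accepts both decimal and 0x-prefixed values. */
	out->first_block = colon ? strtoul(colon + 1, NULL, 0) : 0;
	out->block_size = at ? strtoul(at + 1, NULL, 0) : 0;

	/* Block sizes must be a multiple of the sector size. */
	if (out->block_size & (512 - 1))
		return -1;
	return 0;
}
```

A value such as "file:/dev/hda1:0x1000@4096" would yield the device "/dev/hda1", first block 0x1000 and block size 4096; the device name here is only an example.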
12942     +
12943     +static int suspend_file_parse_sig_location(char *commandline,
12944     + int only_writer, int quiet)
12945     +{
12946     + char *thischar, *devstart = NULL, *colon = NULL, *at_symbol = NULL;
12947     + int result = -EINVAL, target_blocksize = 0;
12948     +
12949     + if (strncmp(commandline, "file:", 5)) {
12950     + if (!only_writer)
12951     + return 1;
12952     + } else
12953     + commandline += 5;
12954     +
12955     + /*
12956     + * Don't check signature again if we're beginning a cycle. If we already
12957     + * did the initialisation successfully, assume we'll be okay when it comes
12958     + * to resuming.
12959     + */
12960     + if (suspend_file_target_bdev)
12961     + return 0;
12962     +
12963     + devstart = thischar = commandline;
12964     + while ((*thischar != ':') && (*thischar != '@') &&
12965     + ((thischar - commandline) < 250) && (*thischar))
12966     + thischar++;
12967     +
12968     + if (*thischar == ':') {
12969     + colon = thischar;
12970     + *colon = 0;
12971     + thischar++;
12972     + }
12973     +
12974     + while ((*thischar != '@') && ((thischar - commandline) < 250) && (*thischar))
12975     + thischar++;
12976     +
12977     + if (*thischar == '@') {
12978     + at_symbol = thischar;
12979     + *at_symbol = 0;
12980     + }
12981     +
12982     + /*
12983     + * For the suspend_file, you may be able to resume, but not suspend,
12984     + * because the resume2= is set correctly, but the suspend_file_target
12985     + * isn't.
12986     + *
12987     + * We may have come here as a result of setting resume2 or
12988     + * suspend_file_target. We only test the suspend_file target in the
12989     + * former case (it's already done in the latter), and we do it before
12990     + * setting the block number ourselves. It will overwrite the values
12991     + * given on the command line if we don't.
12992     + */
12993     +
12994     + if (!setting_suspend_file_target)
12995     + __test_suspend_file_target(suspend_file_target, 1, 0);
12996     +
12997     + if (colon)
12998     + target_firstblock = (int) simple_strtoul(colon + 1, NULL, 0);
12999     + else
13000     + target_firstblock = 0;
13001     +
13002     + if (at_symbol) {
13003     + target_blocksize = (int) simple_strtoul(at_symbol + 1, NULL, 0);
13004     + if (target_blocksize & (SECTOR_SIZE - 1)) {
13005     + printk("FileAllocator: Blocksizes are multiples of %d.\n", SECTOR_SIZE);
13006     + result = -EINVAL;
13007     + goto out;
13008     + }
13009     + }
13010     +
13011     + if (!quiet)
13012     + printk("Suspend2 FileAllocator: Testing whether you can resume:\n");
13013     +
13014     + suspend_file_get_target_info(commandline, 0, 1);
13015     +
13016     + if (!suspend_file_target_bdev || IS_ERR(suspend_file_target_bdev)) {
13017     + suspend_file_target_bdev = NULL;
13018     + result = -1;
13019     + goto out;
13020     + }
13021     +
13022     + if (target_blocksize)
13023     + set_devinfo(suspend_file_target_bdev, ffs(target_blocksize));
13024     +
13025     + result = __test_suspend_file_target(commandline, 1, 0);
13026     +
13027     +out:
13028     + if (result)
13029     + clear_suspend_state(SUSPEND_CAN_SUSPEND);
13030     +
13031     + if (!quiet)
13032     + printk("Resuming %sabled.\n", result ? "dis" : "en");
13033     +
13034     + if (colon)
13035     + *colon = ':';
13036     + if (at_symbol)
13037     + *at_symbol = '@';
13038     +
13039     + return result;
13040     +}
13041     +
13042     +/* suspend_file_save_config_info
13043     + *
13044     + * Description: Save the target's name, not for resume time, but for all_settings.
13045     + * Arguments: Buffer: Pointer to a buffer of size PAGE_SIZE.
13046     + * Returns: Number of bytes used for saving our data.
13047     + */
13048     +
13049     +static int suspend_file_save_config_info(char *buffer)
13050     +{
13051     + strcpy(buffer, suspend_file_target);
13052     + return strlen(suspend_file_target) + 1;
13053     +}
13054     +
13055     +/* suspend_file_load_config_info
13056     + *
13057     + * Description: Reload target's name.
13058     + * Arguments: Buffer: Pointer to the start of the data.
13059     + * Size: Number of bytes that were saved.
13060     + */
13061     +
13062     +static void suspend_file_load_config_info(char *buffer, int size)
13063     +{
13064     + strcpy(suspend_file_target, buffer);
13065     +}
13066     +
13067     +static int suspend_file_initialise(int starting_cycle)
13068     +{
13069     + if (starting_cycle) {
13070     + if (suspendActiveAllocator != &suspend_fileops)
13071     + return 0;
13072     +
13073     + if (starting_cycle & SYSFS_SUSPEND && !*suspend_file_target) {
13074     + printk("FileAllocator is the active writer, "
13075     + "but no filename has been set.\n");
13076     + return 1;
13077     + }
13078     + }
13079     +
13080     + if (suspend_file_target)
13081     + suspend_file_get_target_info(suspend_file_target, starting_cycle, 0);
13082     +
13083     + if (starting_cycle && (suspend_file_image_exists() == -1)) {
13084     + printk("%s does not have a valid signature for suspending.\n",
13085     + suspend_file_target);
13086     + return 1;
13087     + }
13088     +
13089     + return 0;
13090     +}
13091     +
13092     +static struct suspend_sysfs_data sysfs_params[] = {
13093     +
13094     + {
13095     + SUSPEND2_ATTR("target", SYSFS_RW),
13096     + SYSFS_STRING(suspend_file_target, 256, SYSFS_NEEDS_SM_FOR_WRITE),
13097     + .write_side_effect = test_suspend_file_target,
13098     + },
13099     +
13100     + {
13101     + SUSPEND2_ATTR("enabled", SYSFS_RW),
13102     + SYSFS_INT(&suspend_fileops.enabled, 0, 1, 0),
13103     + .write_side_effect = attempt_to_parse_resume_device2,
13104     + }
13105     +};
13106     +
13107     +static struct suspend_module_ops suspend_fileops = {
13108     + .type = WRITER_MODULE,
13109     + .name = "File Allocator",
13110     + .directory = "file",
13111     + .module = THIS_MODULE,
13112     + .print_debug_info = suspend_file_print_debug_stats,
13113     + .save_config_info = suspend_file_save_config_info,
13114     + .load_config_info = suspend_file_load_config_info,
13115     + .storage_needed = suspend_file_storage_needed,
13116     + .initialise = suspend_file_initialise,
13117     + .cleanup = suspend_file_cleanup,
13118     +
13119     + .storage_available = suspend_file_storage_available,
13120     + .storage_allocated = suspend_file_storage_allocated,
13121     + .release_storage = suspend_file_release_storage,
13122     + .allocate_header_space = suspend_file_allocate_header_space,
13123     + .allocate_storage = suspend_file_allocate_storage,
13124     + .image_exists = suspend_file_image_exists,
13125     + .mark_resume_attempted = suspend_file_mark_resume_attempted,
13126     + .write_header_init = suspend_file_write_header_init,
13127     + .write_header_cleanup = suspend_file_write_header_cleanup,
13128     + .read_header_init = suspend_file_read_header_init,
13129     + .read_header_cleanup = suspend_file_read_header_cleanup,
13130     + .invalidate_image = suspend_file_invalidate_image,
13131     + .parse_sig_location = suspend_file_parse_sig_location,
13132     +
13133     + .sysfs_data = sysfs_params,
13134     + .num_sysfs_entries = sizeof(sysfs_params) / sizeof(struct suspend_sysfs_data),
13135     +};
13136     +
13137     +/* ---- Registration ---- */
13138     +static __init int suspend_file_load(void)
13139     +{
13140     + suspend_fileops.rw_init = suspend_bio_ops.rw_init;
13141     + suspend_fileops.rw_cleanup = suspend_bio_ops.rw_cleanup;
13142     + suspend_fileops.read_chunk = suspend_bio_ops.read_chunk;
13143     + suspend_fileops.write_chunk = suspend_bio_ops.write_chunk;
13144     + suspend_fileops.rw_header_chunk = suspend_bio_ops.rw_header_chunk;
13145     +
13146     + return suspend_register_module(&suspend_fileops);
13147     +}
13148     +
13149     +#ifdef MODULE
13150     +static __exit void suspend_file_unload(void)
13151     +{
13152     + suspend_unregister_module(&suspend_fileops);
13153     +}
13154     +
13155     +module_init(suspend_file_load);
13156     +module_exit(suspend_file_unload);
13157     +MODULE_LICENSE("GPL");
13158     +MODULE_AUTHOR("Nigel Cunningham");
13159     +MODULE_DESCRIPTION("Suspend2 FileAllocator");
13160     +#else
13161     +late_initcall(suspend_file_load);
13162     +#endif
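
The resume2=file:DEVNAME[:FIRSTBLOCK][@BLOCKSIZE] format handled by suspend_file_parse_sig_location() above can be illustrated outside the kernel. What follows is only a minimal userspace sketch under stated assumptions, not the patch's implementation: struct file_target and parse_file_target() are hypothetical names, SECTOR_SIZE is assumed to be 512, strtoul() stands in for simple_strtoul(), and the device name is copied as text rather than passed to name_to_dev_t(). Unlike the kernel code, it does not modify the command line in place and does not check that the block size is <= PAGE_SIZE.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512 /* assumption: the usual 512-byte sector */

/* Hypothetical container for the parsed fields; not a kernel type. */
struct file_target {
	char devname[256];
	unsigned long firstblock; /* 0 when :FIRSTBLOCK is absent */
	int blocksize;            /* 0 when @BLOCKSIZE is absent */
};

/*
 * Returns 0 on success, 1 if the "file:" prefix is missing (the
 * argument belongs to another writer), -1 if the input is malformed.
 */
static int parse_file_target(const char *arg, struct file_target *out)
{
	const char *colon, *at;
	size_t len;

	assert(arg && out);

	if (strncmp(arg, "file:", 5))
		return 1;
	arg += 5;

	/* The device name runs up to the first ':' or '@'. */
	len = strcspn(arg, ":@");
	if (len == 0 || len >= sizeof(out->devname))
		return -1;
	memcpy(out->devname, arg, len);
	out->devname[len] = '\0';

	out->firstblock = 0;
	out->blocksize = 0;

	colon = (arg[len] == ':') ? arg + len : NULL;
	at = strchr(arg + len, '@');

	/* Base 0 accepts decimal, octal and 0x-prefixed hex. */
	if (colon)
		out->firstblock = strtoul(colon + 1, NULL, 0);

	if (at) {
		out->blocksize = (int) strtoul(at + 1, NULL, 0);
		/* Block sizes must be multiples of the sector size. */
		if (out->blocksize & (SECTOR_SIZE - 1))
			return -1;
	}

	return 0;
}
```

With this sketch, "file:/dev/hda5:0x1a2b@4096" yields the device name /dev/hda5, first block 0x1a2b and block size 4096, while "file:/dev/hda5@100" is rejected because 100 is not a multiple of the sector size.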
13163     diff --git a/kernel/power/suspend_swap.c b/kernel/power/suspend_swap.c
13164     new file mode 100644
13165     index 0000000..a81dcc1
13166     --- /dev/null
13167     +++ b/kernel/power/suspend_swap.c
13168     @@ -0,0 +1,1262 @@
13169     +/*
13170     + * kernel/power/suspend_swap.c
13171     + *
13172     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
13173     + *
13174     + * Distributed under GPLv2.
13175     + *
13176     + * This file encapsulates functions for usage of swap space as a
13177     + * backing store.
13178     + */
13179     +
13180     +#include <linux/suspend.h>
13181     +#include <linux/module.h>
13182     +#include <linux/blkdev.h>
13183     +#include <linux/swapops.h>
13184     +#include <linux/swap.h>
13185     +#include <linux/syscalls.h>
13186     +
13187     +#include "suspend.h"
13188     +#include "sysfs.h"
13189     +#include "modules.h"
13190     +#include "io.h"
13191     +#include "ui.h"
13192     +#include "extent.h"
13193     +#include "block_io.h"
13194     +
13195     +static struct suspend_module_ops suspend_swapops;
13196     +
13197     +#define SIGNATURE_VER 6
13198     +
13199     +/* --- Struct of pages stored on disk */
13200     +
13201     +union diskpage {
13202     + union swap_header swh; /* swh.magic is the only member used */
13203     +};
13204     +
13205     +union p_diskpage {
13206     + union diskpage *pointer;
13207     + char *ptr;
13208     + unsigned long address;
13209     +};
13210     +
13211     +/* Devices used for swap */
13212     +static struct suspend_bdev_info devinfo[MAX_SWAPFILES];
13213     +
13214     +/* Extent chains for swap & blocks */
13215     +struct extent_chain swapextents;
13216     +struct extent_chain block_chain[MAX_SWAPFILES];
13217     +
13218     +static dev_t header_dev_t;
13219     +static struct block_device *header_block_device;
13220     +static unsigned long headerblock;
13221     +
13222     +/* For swapfile automatically swapon/off'd. */
13223     +static char swapfilename[32] = "";
13224     +static int suspend_swapon_status;
13225     +
13226     +/* Header Page Information */
13227     +static int header_pages_allocated;
13228     +
13229     +/* Swap Pages */
13230     +static int main_pages_allocated, main_pages_requested;
13231     +
13232     +/* User Specified Parameters. */
13233     +
13234     +static unsigned long resume_firstblock;
13235     +static int resume_blocksize;
13236     +static dev_t resume_swap_dev_t;
13237     +static struct block_device *resume_block_device;
13238     +
13239     +struct sysinfo swapinfo;
13240     +static int suspend_swap_invalidate_image(void);
13241     +
13242     +/* Block devices open. */
13243     +struct bdev_opened
13244     +{
13245     + dev_t device;
13246     + struct block_device *bdev;
13247     + int claimed;
13248     +};
13249     +
13250     +/*
13251     + * Entry MAX_SWAPFILES is the resume block device, which may
13252     + * not be a swap device enabled when we suspend.
13253     + * Entry MAX_SWAPFILES + 1 is the header block device, which
13254     + * is needed before we find out which slot it occupies.
13255     + */
13256     +static struct bdev_opened *bdev_info_list[MAX_SWAPFILES + 2];
13257     +
13258     +static void close_bdev(int i)
13259     +{
13260     + struct bdev_opened *this = bdev_info_list[i];
13261     +
13262     + if (this->claimed)
13263     + bd_release(this->bdev);
13264     +
13265     + /* Release our reference. */
13266     + blkdev_put(this->bdev);
13267     +
13268     + /* Free our info. */
13269     + kfree(this);
13270     +
13271     + bdev_info_list[i] = NULL;
13272     +}
13273     +
13274     +static void close_bdevs(void)
13275     +{
13276     + int i;
13277     +
13278     + for (i = 0; i < MAX_SWAPFILES; i++)
13279     + if (bdev_info_list[i])
13280     + close_bdev(i);
13281     +
13282     + resume_block_device = header_block_device = NULL;
13283     +}
13284     +
13285     +static struct block_device *open_bdev(int index, dev_t device, int display_errs)
13286     +{
13287     + struct bdev_opened *this;
13288     + struct block_device *bdev;
13289     +
13290     + if (bdev_info_list[index] && (bdev_info_list[index]->device == device)){
13291     + bdev = bdev_info_list[index]->bdev;
13292     + return bdev;
13293     + }
13294     +
13295     + if (bdev_info_list[index] && bdev_info_list[index]->device != device)
13296     + close_bdev(index);
13297     +
13298     + bdev = open_by_devnum(device, FMODE_READ);
13299     +
13300     + if (IS_ERR(bdev) || !bdev) {
13301     + if (display_errs)
13302     + suspend_early_boot_message(1,SUSPEND_CONTINUE_REQ,
13303     + "Failed to get access to block device "
13304     + "\"%x\" (error %d).\n Maybe you need "
13305     + "to run mknod and/or lvmsetup in an "
13306     + "initrd/ramfs?", device, (int) PTR_ERR(bdev));
13307     + return ERR_PTR(-EINVAL);
13308     + }
13309     +
13310     + this = kmalloc(sizeof(struct bdev_opened), GFP_KERNEL);
13311     + if (!this) {
13312     + printk(KERN_WARNING "Suspend2: Failed to allocate memory for "
13313     + "opening a bdev.\n");
13314     + return ERR_PTR(-ENOMEM);
13315     + }
13316     +
13317     + bdev_info_list[index] = this;
13318     + this->device = device;
13319     + this->bdev = bdev;
13320     +
13321     + if (index < MAX_SWAPFILES)
13322     + devinfo[index].bdev = bdev;
13323     +
13324     + return bdev;
13325     +}
13326     +
13327     +/* Must be silent - might be called from cat /sys/power/suspend2/debug_info
13328     + * Returns 0 if was off, -EBUSY if was on, error value otherwise.
13329     + */
13330     +static int enable_swapfile(void)
13331     +{
13332     + int activateswapresult = -EINVAL;
13333     +
13334     + if (suspend_swapon_status)
13335     + return 0;
13336     +
13337     + if (swapfilename[0]) {
13338     + /* Attempt to swap on with maximum priority */
13339     + activateswapresult = sys_swapon(swapfilename, 0xFFFF);
13340     + if ((activateswapresult) && (activateswapresult != -EBUSY))
13341     + printk("Suspend2: The swapfile/partition specified by "
13342     + "/sys/power/suspend2/suspend_swap/swapfile "
13343     + "(%s) could not be turned on (error %d). "
13344     + "Attempting to continue.\n",
13345     + swapfilename, activateswapresult);
13346     + if (!activateswapresult)
13347     + suspend_swapon_status = 1;
13348     + }
13349     + return activateswapresult;
13350     +}
13351     +
13352     +/* Returns 0 if the swapfile was turned off or was not on, error value otherwise */
13353     +static int disable_swapfile(void)
13354     +{
13355     + int result = -EINVAL;
13356     +
13357     + if (!suspend_swapon_status)
13358     + return 0;
13359     +
13360     + if (swapfilename[0]) {
13361     + result = sys_swapoff(swapfilename);
13362     + if (result == -EINVAL)
13363     + return 0; /* Wasn't on */
13364     + if (!result)
13365     + suspend_swapon_status = 0;
13366     + }
13367     +
13368     + return result;
13369     +}
13370     +
13371     +static int try_to_parse_resume_device(char *commandline, int quiet)
13372     +{
13373     + struct kstat stat;
13374     + int error = 0;
13375     +
13376     + resume_swap_dev_t = name_to_dev_t(commandline);
13377     +
13378     + if (!resume_swap_dev_t) {
13379     + struct file *file = filp_open(commandline, O_RDONLY, 0);
13380     +
13381     + if (!IS_ERR(file) && file) {
13382     + vfs_getattr(file->f_vfsmnt, file->f_dentry, &stat);
13383     + filp_close(file, NULL);
13384     + } else
13385     + error = vfs_stat(commandline, &stat);
13386     + if (!error)
13387     + resume_swap_dev_t = stat.rdev;
13388     + }
13389     +
13390     + if (!resume_swap_dev_t) {
13391     + if (quiet)
13392     + return 1;
13393     +
13394     + if (test_suspend_state(SUSPEND_TRYING_TO_RESUME))
13395     + suspend_early_boot_message(1, SUSPEND_CONTINUE_REQ,
13396     + "Failed to translate \"%s\" into a device id.\n",
13397     + commandline);
13398     + else
13399     + printk("Suspend2: Can't translate \"%s\" into a device "
13400     + "id yet.\n", commandline);
13401     + return 1;
13402     + }
13403     +
13404     + resume_block_device = open_bdev(MAX_SWAPFILES, resume_swap_dev_t, 0);
13405     + if (IS_ERR(resume_block_device)) {
13406     + if (!quiet)
13407     + suspend_early_boot_message(1, SUSPEND_CONTINUE_REQ,
13408     + "Failed to get access to \"%s\", where"
13409     + " the swap header should be found.",
13410     + commandline);
13411     + return 1;
13412     + }
13413     +
13414     + return 0;
13415     +}
13416     +
13417     +/*
13418     + * If we have read part of the image, we might have filled memory with
13419     + * data that should be zeroed out.
13420     + */
13421     +static void suspend_swap_noresume_reset(void)
13422     +{
13423     + memset((char *) &devinfo, 0, sizeof(devinfo));
13424     +}
13425     +
13426     +static int parse_signature(char *header, int restore)
13427     +{
13428     + int type = -1;
13429     +
13430     + if (!memcmp("SWAP-SPACE",header,10))
13431     + return 0;
13432     + else if (!memcmp("SWAPSPACE2",header,10))
13433     + return 1;
13434     +
13435     + else if (!memcmp("S1SUSP",header,6))
13436     + type = 2;
13437     + else if (!memcmp("S2SUSP",header,6))
13438     + type = 3;
13439     + else if (!memcmp("S1SUSPEND",header,9))
13440     + type = 4;
13441     +
13442     + else if (!memcmp("z",header,1))
13443     + type = 12;
13444     + else if (!memcmp("Z",header,1))
13445     + type = 13;
13446     +
13447     + /*
13448     + * Put bdev of suspend header in last byte of swap header
13449     + * (unsigned short)
13450     + */
13451     + if (type > 11) {
13452     + dev_t *header_ptr = (dev_t *) &header[1];
13453     + unsigned char *headerblocksize_ptr =
13454     + (unsigned char *) &header[5];
13455     + u32 *headerblock_ptr = (u32 *) &header[6];
13456     + header_dev_t = *header_ptr;
13457     + /*
13458     + * We are now using the highest bit of the char to indicate
13459     + * whether we have attempted to resume from this image before.
13460     + */
13461     + clear_suspend_state(SUSPEND_RESUMED_BEFORE);
13462     + if (((int) *headerblocksize_ptr) & 0x80)
13463     + set_suspend_state(SUSPEND_RESUMED_BEFORE);
13464     + headerblock = (unsigned long) *headerblock_ptr;
13465     + }
13466     +
13467     + if ((restore) && (type > 5)) {
13468     + /* We only reset our own signatures */
13469     + if (type & 1)
13470     + memcpy(header,"SWAPSPACE2",10);
13471     + else
13472     + memcpy(header,"SWAP-SPACE",10);
13473     + }
13474     +
13475     + return type;
13476     +}
13477     +
13478     +/*
13479     + * prepare_signature
13480     + */
13481     +static int prepare_signature(dev_t bdev, unsigned long block,
13482     + char *current_header)
13483     +{
13484     + int current_type = parse_signature(current_header, 0);
13485     + dev_t *header_ptr = (dev_t *) (&current_header[1]);
13486     + u32 *headerblock_ptr =
13487     + (u32 *) (&current_header[6]);
13488     +
13489     + if ((current_type > 1) && (current_type < 6))
13490     + return 1;
13491     +
13492     + /* At the moment, I don't have a way to handle the block being
13493     + * > 32 bits. Not enough room in the signature and no way to
13494     + * safely put the data elsewhere. */
13495     +
13496     + if (BITS_PER_LONG == 64 && block > 0xFFFFFFFFUL) {
13497     + suspend_prepare_status(DONT_CLEAR_BAR,
13498     + "Header sector requires 33+ bits. "
13499     + "Would not be able to resume.");
13500     + return 1;
13501     + }
13502     +
13503     + if (current_type & 1)
13504     + current_header[0] = 'Z';
13505     + else
13506     + current_header[0] = 'z';
13507     + *header_ptr = bdev;
13508     + /* prev is the first/last swap page of the resume area */
13509     + *headerblock_ptr = (unsigned long) block;
13510     + return 0;
13511     +}
13512     +
13513     +static int __suspend_swap_allocate_storage(int main_storage_requested,
13514     + int header_storage);
13515     +
13516     +static int suspend_swap_allocate_header_space(int space_requested)
13517     +{
13518     + int i;
13519     +
13520     + if (!swapextents.size && __suspend_swap_allocate_storage(
13521     + main_pages_requested, space_requested)) {
13522     + printk("Failed to allocate space for the header.\n");
13523     + return -ENOSPC;
13524     + }
13525     +
13526     + suspend_extent_state_goto_start(&suspend_writer_posn);
13527     + suspend_bio_ops.forward_one_page(); /* To first page */
13528     +
13529     + for (i = 0; i < space_requested; i++) {
13530     + if (suspend_bio_ops.forward_one_page()) {
13531     + printk("Out of space while seeking to allocate "
13532     + "header pages.\n");
13533     + header_pages_allocated = i;
13534     + return -ENOSPC;
13535     + }
13536     +
13537     + }
13538     +
13539     + header_pages_allocated = space_requested;
13540     +
13541     + /* The end of header pages will be the start of pageset 2;
13542     + * we are now sitting on the first pageset2 page. */
13543     + suspend_extent_state_save(&suspend_writer_posn,
13544     + &suspend_writer_posn_save[2]);
13545     + return 0;
13546     +}
13547     +
13548     +static void get_main_pool_phys_params(void)
13549     +{
13550     + struct extent *extentpointer = NULL;
13551     + unsigned long address;
13552     + int i, extent_min = -1, extent_max = -1, last_chain = -1;
13553     +
13554     + for (i = 0; i < MAX_SWAPFILES; i++)
13555     + if (block_chain[i].first)
13556     + suspend_put_extent_chain(&block_chain[i]);
13557     +
13558     + suspend_extent_for_each(&swapextents, extentpointer, address) {
13559     + swp_entry_t swap_address = extent_val_to_swap_entry(address);
13560     + pgoff_t offset = swp_offset(swap_address);
13561     + unsigned swapfilenum = swp_type(swap_address);
13562     + struct swap_info_struct *sis = get_swap_info_struct(swapfilenum);
13563     + sector_t new_sector = map_swap_page(sis, offset);
13564     +
13565     + if ((new_sector == extent_max + 1) &&
13566     + (last_chain == swapfilenum))
13567     + extent_max++;
13568     + else {
13569     + if (extent_min > -1) {
13570     + if (test_action_state(SUSPEND_TEST_BIO))
13571     + printk("Adding extent chain %d %d-%d.\n",
13572     + last_chain,
13573     + extent_min <<
13574     + devinfo[last_chain].bmap_shift,
13575     + extent_max <<
13576     + devinfo[last_chain].bmap_shift);
13577     +
13578     + suspend_add_to_extent_chain(
13579     + &block_chain[last_chain],
13580     + extent_min, extent_max);
13581     + }
13582     + extent_min = extent_max = new_sector;
13583     + last_chain = swapfilenum;
13584     + }
13585     + }
13586     +
13587     + if (extent_min > -1) {
13588     + if (test_action_state(SUSPEND_TEST_BIO))
13589     + printk("Adding extent chain %d %d-%d.\n",
13590     + last_chain,
13591     + extent_min <<
13592     + devinfo[last_chain].bmap_shift,
13593     + extent_max <<
13594     + devinfo[last_chain].bmap_shift);
13595     + suspend_add_to_extent_chain(
13596     + &block_chain[last_chain],
13597     + extent_min, extent_max);
13598     + }
13599     +
13600     + suspend_swap_allocate_header_space(header_pages_allocated);
13601     +}
13602     +
13603     +static int suspend_swap_storage_allocated(void)
13604     +{
13605     + return main_pages_requested + header_pages_allocated;
13606     +}
13607     +
13608     +static int suspend_swap_storage_available(void)
13609     +{
13610     + si_swapinfo(&swapinfo);
13611     + return (((int) swapinfo.freeswap + main_pages_allocated) * PAGE_SIZE /
13612     + (PAGE_SIZE + sizeof(unsigned long) + sizeof(int)));
13613     +}
13614     +
13615     +static int suspend_swap_initialise(int starting_cycle)
13616     +{
13617     + if (!starting_cycle)
13618     + return 0;
13619     +
13620     + enable_swapfile();
13621     +
13622     + if (resume_swap_dev_t && !resume_block_device &&
13623     + IS_ERR(resume_block_device =
13624     + open_bdev(MAX_SWAPFILES, resume_swap_dev_t, 1)))
13625     + return 1;
13626     +
13627     + return 0;
13628     +}
13629     +
13630     +static void suspend_swap_cleanup(int ending_cycle)
13631     +{
13632     + if (ending_cycle)
13633     + disable_swapfile();
13634     +
13635     + close_bdevs();
13636     +}
13637     +
13638     +static int suspend_swap_release_storage(void)
13639     +{
13640     + int i = 0;
13641     +
13642     + if (test_action_state(SUSPEND_KEEP_IMAGE) &&
13643     + test_suspend_state(SUSPEND_NOW_RESUMING))
13644     + return 0;
13645     +
13646     + header_pages_allocated = 0;
13647     + main_pages_allocated = 0;
13648     +
13649     + if (swapextents.first) {
13650     + /* Free swap entries */
13651     + struct extent *extentpointer;
13652     + unsigned long extentvalue;
13653     + suspend_extent_for_each(&swapextents, extentpointer,
13654     + extentvalue)
13655     + swap_free(extent_val_to_swap_entry(extentvalue));
13656     +
13657     + suspend_put_extent_chain(&swapextents);
13658     +
13659     + for (i = 0; i < MAX_SWAPFILES; i++)
13660     + if (block_chain[i].first)
13661     + suspend_put_extent_chain(&block_chain[i]);
13662     + }
13663     +
13664     + return 0;
13665     +}
13666     +
13667     +static int suspend_swap_allocate_storage(int space_requested)
13668     +{
13669     + if (!__suspend_swap_allocate_storage(space_requested,
13670     + header_pages_allocated)) {
13671     + main_pages_requested = space_requested;
13672     + return 0;
13673     + }
13674     +
13675     + return -ENOSPC;
13676     +}
13677     +
13678     +static void free_swap_range(unsigned long min, unsigned long max)
13679     +{
13680     + int j;
13681     +
13682     + for (j = min; j <= max; j++)
13683     + swap_free(extent_val_to_swap_entry(j));
13684     +}
13685     +
13686     +/*
13687     + * Round robin allocation (where swap storage has the same priority)
13688     + * could make this very inefficient, so we track extents allocated on
13689     + * a per-swapfile basis.
13690     + */
13691     +static int __suspend_swap_allocate_storage(int main_space_requested,
13692     + int header_space_requested)
13693     +{
13694     + int i, result = 0, first[MAX_SWAPFILES], pages_to_get, extra_pages, gotten = 0;
13695     + unsigned long extent_min[MAX_SWAPFILES], extent_max[MAX_SWAPFILES];
13696     +
13697     + extra_pages = DIV_ROUND_UP(main_space_requested * (sizeof(unsigned long)
13698     + + sizeof(int)), PAGE_SIZE);
13699     + pages_to_get = main_space_requested + extra_pages +
13700     + header_space_requested - swapextents.size;
13701     +
13702     + if (pages_to_get < 1)
13703     + return 0;
13704     +
13705     + for (i=0; i < MAX_SWAPFILES; i++) {
13706     + struct swap_info_struct *si = get_swap_info_struct(i);
13707     + if ((devinfo[i].bdev = si->bdev))
13708     + devinfo[i].dev_t = si->bdev->bd_dev;
13709     + devinfo[i].bmap_shift = 3;
13710     + devinfo[i].blocks_per_page = 1;
13711     + first[i] = 1;
13712     + }
13713     +
13714     + for(i=0; i < pages_to_get; i++) {
13715     + swp_entry_t entry;
13716     + unsigned long new_value;
13717     + unsigned swapfilenum;
13718     +
13719     + entry = get_swap_page();
13720     + if (!entry.val)
13721     + break;
13722     +
13723     + swapfilenum = swp_type(entry);
13724     + new_value = swap_entry_to_extent_val(entry);
13725     +
13726     + if (first[swapfilenum]) {
13727     + first[swapfilenum] = 0;
13728     + extent_min[swapfilenum] = new_value;
13729     + extent_max[swapfilenum] = new_value;
13730     + gotten++;
13731     + continue;
13732     + }
13733     +
13734     + if (new_value == extent_max[swapfilenum] + 1) {
13735     + extent_max[swapfilenum]++;
13736     + gotten++;
13737     + continue;
13738     + }
13739     +
13740     + if (suspend_add_to_extent_chain(&swapextents,
13741     + extent_min[swapfilenum],
13742     + extent_max[swapfilenum])) {
13743     + free_swap_range(extent_min[swapfilenum],
13744     + extent_max[swapfilenum]);
13745     + swap_free(entry);
13746     + gotten -= (extent_max[swapfilenum] -
13747     + extent_min[swapfilenum]);
13748     + break;
13749     + } else {
13750     + extent_min[swapfilenum] = new_value;
13751     + extent_max[swapfilenum] = new_value;
13752     + gotten++;
13753     + }
13754     + }
13755     +
13756     + for (i = 0; i < MAX_SWAPFILES; i++)
13757     + if (!first[i] && suspend_add_to_extent_chain(&swapextents,
13758     + extent_min[i], extent_max[i])) {
13759     + free_swap_range(extent_min[i], extent_max[i]);
13760     + gotten -= (extent_max[i] - extent_min[i]);
13761     + }
13762     +
13763     + if (gotten < pages_to_get)
13764     + result = -ENOSPC;
13765     +
13766     + main_pages_allocated += gotten;
13767     + get_main_pool_phys_params();
13768     + return result;
13769     +}
13770     +
13771     +static int suspend_swap_write_header_init(void)
13772     +{
13773     + int i, result;
13774     + struct swap_info_struct *si;
13775     +
13776     + suspend_extent_state_goto_start(&suspend_writer_posn);
13777     +
13778     + suspend_writer_buffer_posn = suspend_header_bytes_used = 0;
13779     +
13780     + /* Info needed to bootstrap goes at the start of the header.
13781     + * First we save the positions and devinfo, including the number
13782     + * of header pages. Then we save the structs containing data needed
13783     + * for reading the header pages back.
13784     + * Note that even if header pages take more than one page, when we
13785     + * read back the info, we will have restored the location of the
13786     + * next header page by the time we go to use it.
13787     + */
13788     +
13789     + /* Forward one page will be done prior to the read */
13790     + for (i = 0; i < MAX_SWAPFILES; i++) {
13791     + si = get_swap_info_struct(i);
13792     + if (si->swap_file)
13793     + devinfo[i].dev_t = si->bdev->bd_dev;
13794     + else
13795     + devinfo[i].dev_t = (dev_t) 0;
13796     + }
13797     +
13798     + if ((result = suspend_bio_ops.rw_header_chunk(WRITE,
13799     + &suspend_swapops,
13800     + (char *) &suspend_writer_posn_save,
13801     + sizeof(suspend_writer_posn_save))))
13802     + return result;
13803     +
13804     + if ((result = suspend_bio_ops.rw_header_chunk(WRITE,
13805     + &suspend_swapops,
13806     + (char *) &devinfo, sizeof(devinfo))))
13807     + return result;
13808     +
13809     + for (i=0; i < MAX_SWAPFILES; i++)
13810     + suspend_serialise_extent_chain(&suspend_swapops, &block_chain[i]);
13811     +
13812     + return 0;
13813     +}
13814     +
13815     +static int suspend_swap_write_header_cleanup(void)
13816     +{
13817     + int result;
13818     + struct swap_info_struct *si;
13819     +
13820     + /* Write any unsaved data */
13821     + if (suspend_writer_buffer_posn)
13822     + suspend_bio_ops.write_header_chunk_finish();
13823     +
13824     + suspend_bio_ops.finish_all_io();
13825     +
13826     + suspend_extent_state_goto_start(&suspend_writer_posn);
13827     + suspend_bio_ops.forward_one_page();
13828     +
13829     + /* Adjust swap header */
13830     + suspend_bio_ops.bdev_page_io(READ, resume_block_device,
13831     + resume_firstblock,
13832     + virt_to_page(suspend_writer_buffer));
13833     +
13834     + si = get_swap_info_struct(suspend_writer_posn.current_chain);
13835     + result = prepare_signature(si->bdev->bd_dev,
13836     + suspend_writer_posn.current_offset,
13837     + ((union swap_header *) suspend_writer_buffer)->magic.magic);
13838     +
13839     + if (!result)
13840     + suspend_bio_ops.bdev_page_io(WRITE, resume_block_device,
13841     + resume_firstblock,
13842     + virt_to_page(suspend_writer_buffer));
13843     +
13844     + suspend_bio_ops.finish_all_io();
13845     +
13846     + return result;
13847     +}
13848     +
13849     +/* ------------------------- HEADER READING ------------------------- */
13850     +
13851     +/*
13852     + * read_header_init()
13853     + *
13854     + * Description:
13855     + * 1. Attempt to read the device specified with resume2=.
13856     + * 2. Check the contents of the swap header for our signature.
13857     + * 3. Warn, ignore, reset and/or continue as appropriate.
13858     + * 4. If continuing, read the suspend_swap configuration section
13859     + * of the header and set up block device info so we can read
13860     + * the rest of the header & image.
13861     + *
13862     + * Returns:
13863     + * May not return if the user chooses to reboot at a warning.
13864     + * -EINVAL if cannot resume at this time. Booting should continue
13865     + * normally.
13866     + */
13867     +
13868     +static int suspend_swap_read_header_init(void)
13869     +{
13870     + int i, result = 0;
13871     +
13872     + suspend_header_bytes_used = 0;
13873     +
13874     + if (!header_dev_t) {
13875     + printk("read_header_init called when we haven't "
13876     + "verified there is an image!\n");
13877     + return -EINVAL;
13878     + }
13879     +
13880     + /*
13881     + * If the header is not on the resume_swap_dev_t, get the resume device first.
13882     + */
13883     + if (header_dev_t != resume_swap_dev_t) {
13884     + header_block_device = open_bdev(MAX_SWAPFILES + 1,
13885     + header_dev_t, 1);
13886     +
13887     + if (IS_ERR(header_block_device))
13888     + return PTR_ERR(header_block_device);
13889     + } else
13890     + header_block_device = resume_block_device;
13891     +
13892     + /*
13893     + * Read suspend_swap configuration.
13894     + * Headerblock size taken into account already.
13895     + */
13896     + suspend_bio_ops.bdev_page_io(READ, header_block_device,
13897     + headerblock << 3,
13898     + virt_to_page((unsigned long) suspend_writer_buffer));
13899     +
13900     + memcpy(&suspend_writer_posn_save, suspend_writer_buffer, 3 * sizeof(struct extent_iterate_saved_state));
13901     +
13902     + suspend_writer_buffer_posn = 3 * sizeof(struct extent_iterate_saved_state);
13903     + suspend_header_bytes_used += 3 * sizeof(struct extent_iterate_saved_state);
13904     +
13905     + memcpy(&devinfo, suspend_writer_buffer + suspend_writer_buffer_posn, sizeof(devinfo));
13906     +
13907     + suspend_writer_buffer_posn += sizeof(devinfo);
13908     + suspend_header_bytes_used += sizeof(devinfo);
13909     +
13910     + /* Restore device info */
13911     + for (i = 0; i < MAX_SWAPFILES; i++) {
13912     + dev_t thisdevice = devinfo[i].dev_t;
13913     + struct block_device *result;
13914     +
13915     + devinfo[i].bdev = NULL;
13916     +
13917     + if (!thisdevice)
13918     + continue;
13919     +
13920     + if (thisdevice == resume_swap_dev_t) {
13921     + devinfo[i].bdev = resume_block_device;
13922     + bdev_info_list[i] = bdev_info_list[MAX_SWAPFILES];
13923     + bdev_info_list[MAX_SWAPFILES] = NULL;
13924     + continue;
13925     + }
13926     +
13927     + if (thisdevice == header_dev_t) {
13928     + devinfo[i].bdev = header_block_device;
13929     + bdev_info_list[i] = bdev_info_list[MAX_SWAPFILES + 1];
13930     + bdev_info_list[MAX_SWAPFILES + 1] = NULL;
13931     + continue;
13932     + }
13933     +
13934     + result = open_bdev(i, thisdevice, 1);
13935     + if (IS_ERR(result))
13936     + return PTR_ERR(result);
13937     + }
13938     +
13939     + suspend_bio_ops.read_header_init();
13940     + suspend_extent_state_goto_start(&suspend_writer_posn);
13941     + suspend_bio_ops.set_extra_page_forward();
13942     +
13943     + for (i = 0; i < MAX_SWAPFILES && !result; i++)
13944     + result = suspend_load_extent_chain(&block_chain[i]);
13945     +
13946     + return result;
13947     +}
13948     +
13949     +static int suspend_swap_read_header_cleanup(void)
13950     +{
13951     + suspend_bio_ops.rw_cleanup(READ);
13952     + return 0;
13953     +}
13954     +
13955     +/* suspend_swap_invalidate_image
13956     + *
13957     + */
13958     +static int suspend_swap_invalidate_image(void)
13959     +{
13960     + union p_diskpage cur;
13961     + int result = 0;
13962     + char newsig[11];
13963     +
13964     + cur.address = get_zeroed_page(GFP_ATOMIC);
13965     + if (!cur.address) {
13966     + printk("Unable to allocate a page for restoring the swap signature.\n");
13967     + return -ENOMEM;
13968     + }
13969     +
13970     + /*
13971     + * If nr_suspends == 0, we must be booting, so no swap pages
13972     + * will be recorded as used yet.
13973     + */
13974     +
13975     + if (nr_suspends > 0)
13976     + suspend_swap_release_storage();
13977     +
13978     + /*
13979     + * We don't do a sanity check here: we want to restore the swap
13980     + * whatever version of kernel made the suspend image.
13981     + *
13982     + * We need to write swap, but swap may not be enabled so
13983     + * we write the device directly
13984     + */
13985     +
13986     + suspend_bio_ops.bdev_page_io(READ, resume_block_device,
13987     + resume_firstblock,
13988     + virt_to_page(cur.pointer));
13989     +
13990     + result = parse_signature(cur.pointer->swh.magic.magic, 1);
13991     +
13992     + if (result < 5)
13993     + goto out;
13994     +
13995     + strncpy(newsig, cur.pointer->swh.magic.magic, 10);
13996     + newsig[10] = 0;
13997     +
13998     + suspend_bio_ops.bdev_page_io(WRITE, resume_block_device,
13999     + resume_firstblock,
14000     + virt_to_page(cur.pointer));
14001     +
14002     + if (!nr_suspends)
14003     + printk(KERN_WARNING "Suspend2: Image invalidated.\n");
14004     +out:
14005     + suspend_bio_ops.finish_all_io();
14006     + free_page(cur.address);
14007     + return 0;
14008     +}
14009     +
14010     +/*
14011     + * workspace_size
14012     + *
14013     + * Description:
14014     + * Returns the number of bytes of RAM needed for this
14015     + * code to do its work. (Used when calculating whether
14016     + * we have enough memory to be able to suspend & resume).
14017     + *
14018     + */
14019     +static int suspend_swap_memory_needed(void)
14020     +{
14021     + return 1;
14022     +}
14023     +
14024     +/*
14025     + * Print debug info
14026     + *
14027     + * Description:
14028     + */
14029     +static int suspend_swap_print_debug_stats(char *buffer, int size)
14030     +{
14031     + int len = 0;
14032     + struct sysinfo sysinfo;
14033     +
14034     + if (suspendActiveAllocator != &suspend_swapops) {
14035     + len = snprintf_used(buffer, size, "- SwapAllocator inactive.\n");
14036     + return len;
14037     + }
14038     +
14039     + len = snprintf_used(buffer, size, "- SwapAllocator active.\n");
14040     + if (swapfilename[0])
14041     + len+= snprintf_used(buffer+len, size-len,
14042     + " Attempting to automatically swapon: %s.\n", swapfilename);
14043     +
14044     + si_swapinfo(&sysinfo);
14045     +
14046     + len+= snprintf_used(buffer+len, size-len, " Swap available for image: %ld pages.\n",
14047     + (long) sysinfo.freeswap + suspend_swap_storage_allocated());
14048     +
14049     + return len;
14050     +}
14051     +
14052     +/*
14053     + * Storage needed
14054     + *
14055     + * Returns amount of space in the swap header required
14056     + * for the suspend_swap's data. This ignores the links between
14057     + * pages, which we factor in when allocating the space.
14058     + *
14059     + * We ensure the space is allocated, but actually save the
14060     + * data from write_header_init and therefore don't also define a
14061     + * save_config_info routine.
14062     + */
14063     +static int suspend_swap_storage_needed(void)
14064     +{
14065     + int i, result;
14066     + result = sizeof(suspend_writer_posn_save) + sizeof(devinfo);
14067     +
14068     + for (i = 0; i < MAX_SWAPFILES; i++) {
14069     + result += 3 * sizeof(int);
14070     + result += (2 * sizeof(unsigned long) *
14071     + block_chain[i].num_extents);
14072     + }
14073     +
14074     + return result;
14075     +}
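The size calculation in suspend_swap_storage_needed() can be restated as a small userspace sketch. The function and parameter names below are hypothetical and are not part of the patch. The sketch only mirrors the arithmetic visible above: a fixed part, plus three ints per chain, plus two unsigned longs per extent (presumably the start and end of each extent, which is an assumption).

```c
#include <stddef.h>

/* Hypothetical helper: bytes needed to serialise one extent chain,
 * following the per-chain arithmetic in suspend_swap_storage_needed(). */
static size_t chain_header_bytes(size_t num_extents)
{
    return 3 * sizeof(int) + 2 * sizeof(unsigned long) * num_extents;
}

/* Hypothetical helper: total header bytes for a fixed-size part
 * (saved writer position plus devinfo in the patch) and a set of chains. */
static size_t swap_header_bytes(size_t fixed_bytes,
                                const size_t *extents_per_chain,
                                size_t num_chains)
{
    size_t total = fixed_bytes;
    size_t i;

    for (i = 0; i < num_chains; i++)
        total += chain_header_bytes(extents_per_chain[i]);

    return total;
}
```

Note that an empty chain still costs three ints, so the header grows with MAX_SWAPFILES even when only one swap device is in use.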
14076     +
14077     +/*
14078     + * Image_exists
14079     + */
14080     +static int suspend_swap_image_exists(void)
14081     +{
14082     + int signature_found;
14083     + union p_diskpage diskpage;
14084     +
14085     + if (!resume_swap_dev_t) {
14086     + printk("Not even trying to read header "
14087     + "because resume_swap_dev_t is not set.\n");
14088     + return 0;
14089     + }
14090     +
14091     + if (!resume_block_device &&
14092     + IS_ERR(resume_block_device =
14093     + open_bdev(MAX_SWAPFILES, resume_swap_dev_t, 1))) {
14094     + printk("Failed to open resume dev_t (%x).\n", resume_swap_dev_t);
14095     + return 0;
14096     + }
14097     +
14098     + diskpage.address = get_zeroed_page(GFP_ATOMIC);
14099     +
14100     + suspend_bio_ops.bdev_page_io(READ, resume_block_device,
14101     + resume_firstblock,
14102     + virt_to_page(diskpage.ptr));
14103     + suspend_bio_ops.finish_all_io();
14104     +
14105     + signature_found = parse_signature(diskpage.pointer->swh.magic.magic, 0);
14106     + free_page(diskpage.address);
14107     +
14108     + if (signature_found == -1) {
14109     + printk(KERN_ERR "Suspend2: Unable to find a signature. Could "
14110     + "you have moved a swap file?\n");
14111     + return 0;
14112     + } else if (signature_found < 2) {
14113     + printk("Suspend2: Normal swapspace found.\n");
14114     + return 0; /* Normal swap space */
14115     + } else if (signature_found < 6) {
14116     + printk("Suspend2: Detected another implementation's signature.\n");
14117     + return 0;
14118     + } else if ((signature_found >> 1) != SIGNATURE_VER) {
14119     + if ((!(test_suspend_state(SUSPEND_NORESUME_SPECIFIED))) &&
14120     + suspend_early_boot_message(1, SUSPEND_CONTINUE_REQ,
14121     + "Found a different style suspend image signature.")) {
14122     + set_suspend_state(SUSPEND_NORESUME_SPECIFIED);
14123     + printk("Suspend2: Detected another implementation's signature.\n");
14124     + }
14125     + }
14126     +
14127     + return 1;
14128     +}
14129     +
14130     +/*
14131     + * Mark resume attempted.
14132     + *
14133     + * Record that we tried to resume from this image.
14134     + */
14135     +static void suspend_swap_mark_resume_attempted(int mark)
14136     +{
14137     + union p_diskpage diskpage;
14138     + int signature_found;
14139     +
14140     + if (!resume_swap_dev_t) {
14141     + printk("Not even trying to record attempt at resuming"
14142     + " because resume_swap_dev_t is not set.\n");
14143     + return;
14144     + }
14145     +
14146     + diskpage.address = get_zeroed_page(GFP_ATOMIC);
14147     +
14148     + suspend_bio_ops.bdev_page_io(READ, resume_block_device,
14149     + resume_firstblock,
14150     + virt_to_page(diskpage.ptr));
14151     + signature_found = parse_signature(diskpage.pointer->swh.magic.magic, 0);
14152     +
14153     + switch (signature_found) {
14154     + case 12:
14155     + case 13:
14156     + diskpage.pointer->swh.magic.magic[5] &= ~0x80;
14157     + if (mark)
14158     + diskpage.pointer->swh.magic.magic[5] |= 0x80;
14159     + break;
14160     + }
14161     +
14162     + suspend_bio_ops.bdev_page_io(WRITE, resume_block_device,
14163     + resume_firstblock,
14164     + virt_to_page(diskpage.ptr));
14165     + suspend_bio_ops.finish_all_io();
14166     + free_page(diskpage.address);
14167     + return;
14168     +}
14169     +
14170     +/*
14171     + * Parse Image Location
14172     + *
14173     + * Attempt to parse a resume2= parameter.
14174     + * Swap Writer accepts:
14175     + * resume2=swap:DEVNAME[:FIRSTBLOCK][@BLOCKSIZE]
14176     + *
14177     + * Where:
14178     + * DEVNAME is convertible to a dev_t by name_to_dev_t
14179     + * FIRSTBLOCK is the location of the first block in the swap file
14180     + * (specifying for a swap partition is nonsensical but not prohibited).
14181     + * Data is validated by attempting to read a swap header from the
14182     + * location given. Failure will result in suspend_swap refusing to
14183     + * save an image, and a reboot with correct parameters will be
14184     + * necessary.
14185     + */
14186     +static int suspend_swap_parse_sig_location(char *commandline,
14187     + int only_allocator, int quiet)
14188     +{
14189     + char *thischar, *devstart, *colon = NULL, *at_symbol = NULL;
14190     + union p_diskpage diskpage;
14191     + int signature_found, result = -EINVAL, temp_result;
14192     +
14193     + if (strncmp(commandline, "swap:", 5)) {
14194     + /*
14195     + * Failing swap:, we'll take a simple
14196     + * resume2=/dev/hda2, but fall through to
14197     + * other allocators if /dev/ isn't matched.
14198     + */
14199     + if (strncmp(commandline, "/dev/", 5))
14200     + return 1;
14201     + } else
14202     + commandline += 5;
14203     +
14204     + devstart = thischar = commandline;
14205     + while ((*thischar != ':') && (*thischar != '@') &&
14206     + ((thischar - commandline) < 250) && (*thischar))
14207     + thischar++;
14208     +
14209     + if (*thischar == ':') {
14210     + colon = thischar;
14211     + *colon = 0;
14212     + thischar++;
14213     + }
14214     +
14215     + while ((*thischar != '@') && ((thischar - commandline) < 250) && (*thischar))
14216     + thischar++;
14217     +
14218     + if (*thischar == '@') {
14219     + at_symbol = thischar;
14220     + *at_symbol = 0;
14221     + }
14222     +
14223     + if (colon)
14224     + resume_firstblock = (int) simple_strtoul(colon + 1, NULL, 0);
14225     + else
14226     + resume_firstblock = 0;
14227     +
14228     + clear_suspend_state(SUSPEND_CAN_SUSPEND);
14229     + clear_suspend_state(SUSPEND_CAN_RESUME);
14230     +
14231     + /* Legacy */
14232     + if (at_symbol) {
14233     + resume_blocksize = (int) simple_strtoul(at_symbol + 1, NULL, 0);
14234     + if (resume_blocksize & (SECTOR_SIZE - 1)) {
14235     + if (!quiet)
14236     + printk("SwapAllocator: Blocksizes are multiples "
14237     + "of %d!\n", SECTOR_SIZE);
14238     + return -EINVAL;
14239     + }
14240     + resume_firstblock = resume_firstblock *
14241     + (resume_blocksize / SECTOR_SIZE);
14242     + }
14243     +
14244     + temp_result = try_to_parse_resume_device(devstart, quiet);
14245     +
14246     + if (colon)
14247     + *colon = ':';
14248     + if (at_symbol)
14249     + *at_symbol = '@';
14250     +
14251     + if (temp_result)
14252     + return -EINVAL;
14253     +
14254     + diskpage.address = get_zeroed_page(GFP_ATOMIC);
14255     + if (!diskpage.address) {
14256     + printk(KERN_ERR "Suspend2: SwapAllocator: Failed to allocate "
14257     + "a diskpage for I/O.\n");
14258     + return -ENOMEM;
14259     + }
14260     +
14261     + temp_result = suspend_bio_ops.bdev_page_io(READ,
14262     + resume_block_device,
14263     + resume_firstblock,
14264     + virt_to_page(diskpage.ptr));
14265     +
14266     + suspend_bio_ops.finish_all_io();
14267     +
14268     + if (temp_result) {
14269     + printk(KERN_ERR "Suspend2: SwapAllocator: Failed to submit "
14270     + "I/O.\n");
14271     + goto invalid;
14272     + }
14273     +
14274     + signature_found = parse_signature(diskpage.pointer->swh.magic.magic, 0);
14275     +
14276     + if (signature_found != -1) {
14277     + if (!quiet)
14278     + printk("Suspend2: SwapAllocator: Signature found.\n");
14279     + result = 0;
14280     +
14281     + suspend_bio_ops.set_devinfo(devinfo);
14282     + suspend_writer_posn.chains = &block_chain[0];
14283     + suspend_writer_posn.num_chains = MAX_SWAPFILES;
14284     + set_suspend_state(SUSPEND_CAN_SUSPEND);
14285     + set_suspend_state(SUSPEND_CAN_RESUME);
14286     + } else
14287     + if (!quiet)
14288     + printk(KERN_ERR "Suspend2: SwapAllocator: No swap "
14289     + "signature found at specified location.\n");
14290     +invalid:
14291     + free_page((unsigned long) diskpage.address);
14292     + return result;
14293     +
14294     +}
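The resume2=swap:DEVNAME[:FIRSTBLOCK][@BLOCKSIZE] format handled by suspend_swap_parse_sig_location() can be illustrated with a minimal userspace sketch. It is not the kernel code: struct resume2_spec, parse_resume2_swap() and the return convention are hypothetical, SECTOR_SIZE is assumed to be 512, and the device lookup and swap header read are left out. It follows the same rules as the function above: an optional "swap:" prefix, a bare /dev/ path accepted as a fallback, and FIRSTBLOCK scaled to sectors when a legacy @BLOCKSIZE is given.

```c
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512 /* assumed value, as in the kernel block layer */

/* Hypothetical container for the parsed parameter. */
struct resume2_spec {
    char devname[256];
    unsigned long firstblock; /* in sectors once scaled */
    unsigned long blocksize;  /* 0 if no @BLOCKSIZE was given */
};

/* Returns 0 on success, 1 if the string is not meant for the swap
 * writer, and -1 if it is malformed. */
static int parse_resume2_swap(const char *arg, struct resume2_spec *out)
{
    const char *colon, *at;
    size_t devlen;

    if (strncmp(arg, "swap:", 5) == 0)
        arg += 5;
    else if (strncmp(arg, "/dev/", 5) != 0)
        return 1; /* let another writer try */

    /* DEVNAME runs up to the first ':' or '@'. */
    devlen = strcspn(arg, ":@");
    if (devlen == 0 || devlen >= sizeof(out->devname))
        return -1;
    memcpy(out->devname, arg, devlen);
    out->devname[devlen] = '\0';

    out->firstblock = 0;
    out->blocksize = 0;

    colon = (arg[devlen] == ':') ? arg + devlen : NULL;
    at = strchr(arg + devlen, '@');

    if (colon)
        out->firstblock = strtoul(colon + 1, NULL, 0);

    if (at) {
        out->blocksize = strtoul(at + 1, NULL, 0);
        /* Block sizes must be multiples of the sector size. */
        if (out->blocksize & (SECTOR_SIZE - 1))
            return -1;
        out->firstblock *= out->blocksize / SECTOR_SIZE;
    }

    return 0;
}
```

For example, "swap:/dev/hda1:0x10@4096" yields devname "/dev/hda1" and a first block of 16 * 8 = 128 sectors. Unlike the kernel function, the sketch copies the device name instead of temporarily writing NUL bytes into the command line.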
14295     +
14296     +static int header_locations_read_sysfs(const char *page, int count)
14297     +{
14298     + int i, printedpartitionsmessage = 0, len = 0, haveswap = 0;
14299     + struct inode *swapf = 0;
14300     + int zone;
14301     + char *path_page = (char *) __get_free_page(GFP_KERNEL);
14302     + char *path, *output = (char *) page;
14303     + int path_len;
14304     +
14305     + if (!page)
14306     + return 0;
14307     +
14308     + for (i = 0; i < MAX_SWAPFILES; i++) {
14309     + struct swap_info_struct *si = get_swap_info_struct(i);
14310     +
14311     + if (!si->swap_file)
14312     + continue;
14313     +
14314     + if (S_ISBLK(si->swap_file->f_mapping->host->i_mode)) {
14315     + haveswap = 1;
14316     + if (!printedpartitionsmessage) {
14317     + len += sprintf(output + len,
14318     + "For swap partitions, simply use the "
14319     + "format: resume2=swap:/dev/hda1.\n");
14320     + printedpartitionsmessage = 1;
14321     + }
14322     + } else {
14323     + path_len = 0;
14324     +
14325     + path = d_path(si->swap_file->f_dentry,
14326     + si->swap_file->f_vfsmnt,
14327     + path_page,
14328     + PAGE_SIZE);
14329     + path_len = snprintf(path_page, 31, "%s", path);
14330     +
14331     + haveswap = 1;
14332     + swapf = si->swap_file->f_mapping->host;
14333     + if (!(zone = bmap(swapf,0))) {
14334     + len+= sprintf(output + len,
14335     + "Swapfile %s has been corrupted. Rerun"
14336     + " mkswap on it and try again.\n",
14337     + path_page);
14338     + } else {
14339     + char name_buffer[255];
14340     + len+= sprintf(output + len, "For swapfile `%s`,"
14341     + " use resume2=swap:/dev/%s:0x%x.\n",
14342     + path_page,
14343     + bdevname(si->bdev, name_buffer),
14344     + zone << (swapf->i_blkbits - 9));
14345     + }
14346     +
14347     + }
14348     + }
14349     +
14350     + if (!haveswap)
14351     + len = sprintf(output, "You need to turn on swap partitions "
14352     + "before examining this file.\n");
14353     +
14354     + free_page((unsigned long) path_page);
14355     + return len;
14356     +}
14357     +
14358     +static struct suspend_sysfs_data sysfs_params[] = {
14359     + {
14360     + SUSPEND2_ATTR("swapfilename", SYSFS_RW),
14361     + SYSFS_STRING(swapfilename, 255, 0)
14362     + },
14363     +
14364     + {
14365     + SUSPEND2_ATTR("headerlocations", SYSFS_READONLY),
14366     + SYSFS_CUSTOM(header_locations_read_sysfs, NULL, 0)
14367     + },
14368     +
14369     + { SUSPEND2_ATTR("enabled", SYSFS_RW),
14370     + SYSFS_INT(&suspend_swapops.enabled, 0, 1, 0),
14371     + .write_side_effect = attempt_to_parse_resume_device2,
14372     + }
14373     +};
14374     +
14375     +static struct suspend_module_ops suspend_swapops = {
14376     + .type = WRITER_MODULE,
14377     + .name = "Swap Allocator",
14378     + .directory = "swap",
14379     + .module = THIS_MODULE,
14380     + .memory_needed = suspend_swap_memory_needed,
14381     + .print_debug_info = suspend_swap_print_debug_stats,
14382     + .storage_needed = suspend_swap_storage_needed,
14383     + .initialise = suspend_swap_initialise,
14384     + .cleanup = suspend_swap_cleanup,
14385     +
14386     + .noresume_reset = suspend_swap_noresume_reset,
14387     + .storage_available = suspend_swap_storage_available,
14388     + .storage_allocated = suspend_swap_storage_allocated,
14389     + .release_storage = suspend_swap_release_storage,
14390     + .allocate_header_space = suspend_swap_allocate_header_space,
14391     + .allocate_storage = suspend_swap_allocate_storage,
14392     + .image_exists = suspend_swap_image_exists,
14393     + .mark_resume_attempted = suspend_swap_mark_resume_attempted,
14394     + .write_header_init = suspend_swap_write_header_init,
14395     + .write_header_cleanup = suspend_swap_write_header_cleanup,
14396     + .read_header_init = suspend_swap_read_header_init,
14397     + .read_header_cleanup = suspend_swap_read_header_cleanup,
14398     + .invalidate_image = suspend_swap_invalidate_image,
14399     + .parse_sig_location = suspend_swap_parse_sig_location,
14400     +
14401     + .sysfs_data = sysfs_params,
14402     + .num_sysfs_entries = sizeof(sysfs_params) / sizeof(struct suspend_sysfs_data),
14403     +};
14404     +
14405     +/* ---- Registration ---- */
14406     +static __init int suspend_swap_load(void)
14407     +{
14408     + suspend_swapops.rw_init = suspend_bio_ops.rw_init;
14409     + suspend_swapops.rw_cleanup = suspend_bio_ops.rw_cleanup;
14410     + suspend_swapops.read_chunk = suspend_bio_ops.read_chunk;
14411     + suspend_swapops.write_chunk = suspend_bio_ops.write_chunk;
14412     + suspend_swapops.rw_header_chunk = suspend_bio_ops.rw_header_chunk;
14413     +
14414     + return suspend_register_module(&suspend_swapops);
14415     +}
14416     +
14417     +#ifdef MODULE
14418     +static __exit void suspend_swap_unload(void)
14419     +{
14420     + suspend_unregister_module(&suspend_swapops);
14421     +}
14422     +
14423     +module_init(suspend_swap_load);
14424     +module_exit(suspend_swap_unload);
14425     +MODULE_LICENSE("GPL");
14426     +MODULE_AUTHOR("Nigel Cunningham");
14427     +MODULE_DESCRIPTION("Suspend2 SwapAllocator");
14428     +#else
14429     +late_initcall(suspend_swap_load);
14430     +#endif
14431     diff --git a/kernel/power/suspend_userui.c b/kernel/power/suspend_userui.c
14432     new file mode 100644
14433     index 0000000..a11cd3d
14434     --- /dev/null
14435     +++ b/kernel/power/suspend_userui.c
14436     @@ -0,0 +1,649 @@
14437     +/*
14438     + * kernel/power/suspend_userui.c
14439     + *
14440     + * Copyright (C) 2005-2007 Bernard Blackham
14441     + * Copyright (C) 2002-2007 Nigel Cunningham (nigel at suspend2 net)
14442     + *
14443     + * This file is released under the GPLv2.
14444     + *
14445     + * Routines for Suspend2's user interface.
14446     + *
14447     + * The user interface code talks to a userspace program via a
14448     + * netlink socket.
14449     + *
14450     + * The kernel side:
14451     + * - starts the userui program;
14452     + * - sends text messages and progress bar status;
14453     + *
14454     + * The user space side:
14455     + * - passes messages regarding user requests (abort, toggle reboot etc)
14456     + *
14457     + */
14458     +
14459     +#define __KERNEL_SYSCALLS__
14460     +
14461     +#include <linux/suspend.h>
14462     +#include <linux/freezer.h>
14463     +#include <linux/console.h>
14464     +#include <linux/ctype.h>
14465     +#include <linux/tty.h>
14466     +#include <linux/vt_kern.h>
14467     +#include <linux/module.h>
14468     +#include <linux/reboot.h>
14469     +#include <linux/kmod.h>
14470     +#include <linux/security.h>
14471     +#include <linux/syscalls.h>
14472     +
14473     +#include "sysfs.h"
14474     +#include "modules.h"
14475     +#include "suspend.h"
14476     +#include "ui.h"
14477     +#include "netlink.h"
14478     +#include "power_off.h"
14479     +
14480     +static char local_printf_buf[1024]; /* Same as printk - should be safe */
14481     +
14482     +static struct user_helper_data ui_helper_data;
14483     +static struct suspend_module_ops userui_ops;
14484     +static int orig_kmsg;
14485     +
14486     +static char lastheader[512];
14487     +static int lastheader_message_len;
14488     +static int ui_helper_changed; /* Used at resume time so that we don't overwrite
14489     + a value set from initrd/ramfs. */
14490     +
14491     +/* Number of distinct progress amounts that userspace can display */
14492     +static int progress_granularity = 30;
14493     +
14494     +DECLARE_WAIT_QUEUE_HEAD(userui_wait_for_key);
14495     +
14496     +static void ui_nl_set_state(int n)
14497     +{
14498     + /* Only let them change certain settings */
14499     + static const int suspend_action_mask =
14500     + (1 << SUSPEND_REBOOT) | (1 << SUSPEND_PAUSE) | (1 << SUSPEND_SLOW) |
14501     + (1 << SUSPEND_LOGALL) | (1 << SUSPEND_SINGLESTEP) |
14502     + (1 << SUSPEND_PAUSE_NEAR_PAGESET_END);
14503     +
14504     + suspend_action = (suspend_action & (~suspend_action_mask)) |
14505     + (n & suspend_action_mask);
14506     +
14507     + if (!test_action_state(SUSPEND_PAUSE) &&
14508     + !test_action_state(SUSPEND_SINGLESTEP))
14509     + wake_up_interruptible(&userui_wait_for_key);
14510     +}
14511     +
14512     +static void userui_redraw(void)
14513     +{
14514     + suspend_send_netlink_message(&ui_helper_data,
14515     + USERUI_MSG_REDRAW, NULL, 0);
14516     +}
14517     +
14518     +static int userui_storage_needed(void)
14519     +{
14520     + return sizeof(ui_helper_data.program) + 1 + sizeof(int);
14521     +}
14522     +
14523     +static int userui_save_config_info(char *buf)
14524     +{
14525     + *((int *) buf) = progress_granularity;
14526     + memcpy(buf + sizeof(int), ui_helper_data.program, sizeof(ui_helper_data.program));
14527     + return sizeof(ui_helper_data.program) + sizeof(int) + 1;
14528     +}
14529     +
14530     +static void userui_load_config_info(char *buf, int size)
14531     +{
14532     + progress_granularity = *((int *) buf);
14533     + size -= sizeof(int);
14534     +
14535     + /* Don't load the saved path if one has already been set */
14536     + if (ui_helper_changed)
14537     + return;
14538     +
14539     + if (size > sizeof(ui_helper_data.program))
14540     + size = sizeof(ui_helper_data.program);
14541     +
14542     + memcpy(ui_helper_data.program, buf + sizeof(int), size);
14543     + ui_helper_data.program[sizeof(ui_helper_data.program)-1] = '\0';
14544     +}
14545     +
14546     +static void set_ui_program_set(void)
14547     +{
14548     + ui_helper_changed = 1;
14549     +}
14550     +
14551     +static int userui_memory_needed(void)
14552     +{
14553     + /* Ballpark figure of 128 pages */
14554     + return (128 * PAGE_SIZE);
14555     +}
14556     +
14557     +/* suspend_update_status
14558     + *
14559     + * Description: Update the progress bar and (if on) in-bar message.
14560     + * Arguments: UL value, maximum: Current progress percentage (value/max).
14561     + * const char *fmt, ...: Message to be displayed in the middle
14562     + * of the progress bar.
14563     + * Note that a NULL message does not mean that any previous
14564     + * message is erased! For that, you need suspend_prepare_status with
14565     + * clearbar on.
14566     + * Returns: Unsigned long: The next value where status needs to be updated.
14567     + * This is to reduce unnecessary calls to update_status.
14568     + */
14569     +static unsigned long userui_update_status(unsigned long value,
14570     + unsigned long maximum, const char *fmt, ...)
14571     +{
14572     + static int last_step = -1;
14573     + struct userui_msg_params msg;
14574     + int bitshift;
14575     + int this_step;
14576     + unsigned long next_update;
14577     +
14578     + if (ui_helper_data.pid == -1)
14579     + return 0;
14580     +
14581     + if ((!maximum) || (!progress_granularity))
14582     + return maximum;
14583     +
14584     + if (value < 0)
14585     + value = 0;
14586     +
14587     + if (value > maximum)
14588     + value = maximum;
14589     +
14590     + /* Try to avoid math problems - we can't do 64 bit math here
14591     + * (and shouldn't need it - anyone got screen resolution
14592     + * of 65536 pixels or more?) */
14593     + bitshift = fls(maximum) - 16;
14594     + if (bitshift > 0) {
14595     + unsigned long temp_maximum = maximum >> bitshift;
14596     + unsigned long temp_value = value >> bitshift;
14597     + this_step = (int)
14598     + (temp_value * progress_granularity / temp_maximum);
14599     + next_update = (((this_step + 1) * temp_maximum /
14600     + progress_granularity) + 1) << bitshift;
14601     + } else {
14602     + this_step = (int) (value * progress_granularity / maximum);
14603     + next_update = ((this_step + 1) * maximum /
14604     + progress_granularity) + 1;
14605     + }
14606     +
14607     + if (this_step == last_step)
14608     + return next_update;
14609     +
14610     + memset(&msg, 0, sizeof(msg));
14611     +
14612     + msg.a = this_step;
14613     + msg.b = progress_granularity;
14614     +
14615     + if (fmt) {
14616     + va_list args;
14617     + va_start(args, fmt);
14618     + vsnprintf(msg.text, sizeof(msg.text), fmt, args);
14619     + va_end(args);
14620     + msg.text[sizeof(msg.text)-1] = '\0';
14621     + }
14622     +
14623     + suspend_send_netlink_message(&ui_helper_data, USERUI_MSG_PROGRESS,
14624     + &msg, sizeof(msg));
14625     + last_step = this_step;
14626     +
14627     + return next_update;
14628     +}
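The step arithmetic in userui_update_status() can be tried out in userspace with the sketch below. It is an illustration under stated assumptions, not the kernel code: progress_step() and fls_ul() are hypothetical names, fls_ul() is a portable stand-in for the kernel's fls(), and the netlink message is omitted. The idea is the same as above: when the maximum needs more than 16 bits, both value and maximum are shifted down first so that multiplying by the granularity stays well inside an unsigned long.

```c
/* Portable stand-in for the kernel's fls(): position of the highest
 * set bit, counting from 1, or 0 if no bit is set. */
static int fls_ul(unsigned long x)
{
    int bit = 0;

    while (x) {
        bit++;
        x >>= 1;
    }
    return bit;
}

/* Hypothetical helper: returns the current progress step in the range
 * 0..granularity and stores in *next_update the value at which the
 * step is next expected to change. */
static int progress_step(unsigned long value, unsigned long maximum,
                         int granularity, unsigned long *next_update)
{
    int bitshift, step;

    if (!maximum || granularity <= 0) {
        *next_update = maximum;
        return 0;
    }

    if (value > maximum)
        value = maximum;

    bitshift = fls_ul(maximum) - 16;
    if (bitshift > 0) {
        /* Scale both operands down to at most 16 significant bits. */
        unsigned long m = maximum >> bitshift;
        unsigned long v = value >> bitshift;

        step = (int) (v * granularity / m);
        *next_update = (((step + 1) * m / granularity) + 1) << bitshift;
    } else {
        step = (int) (value * granularity / maximum);
        *next_update = ((step + 1) * maximum / granularity) + 1;
    }

    return step;
}
```

With a granularity of 30, a value of 50 out of 100 gives step 15, and the caller need not call again until the value reaches 54.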
14629     +
14630     +/* userui_message.
14631     + *
14632     + * Description: This function is intended to do the same job as printk, but
14633     + * without normally logging what is printed. The point is to be
14634     + * able to get debugging info on screen without filling the logs
14635     + * with "1/534. ^M 2/534^M. 3/534^M"
14636     + *
14637     + * It may be called from an interrupt context - can't sleep!
14638     + *
14639     + * Arguments: int mask: The debugging section(s) this message belongs to.
14640     + * int level: The level of verbosity of this message.
14641     + * int restartline: Whether to output a \r or \n with this line
14642     + * (\n if we're logging all output).
14643     + * const char *fmt, ...: Message to be displayed a la printk.
14644     + */
14645     +static void userui_message(unsigned long section, unsigned long level,
14646     + int normally_logged, const char *fmt, ...)
14647     +{
14648     + struct userui_msg_params msg;
14649     +
14650     + if ((level) && (level > console_loglevel))
14651     + return;
14652     +
14653     + memset(&msg, 0, sizeof(msg));
14654     +
14655     + msg.a = section;
14656     + msg.b = level;
14657     + msg.c = normally_logged;
14658     +
14659     + if (fmt) {
14660     + va_list args;
14661     + va_start(args, fmt);
14662     + vsnprintf(msg.text, sizeof(msg.text), fmt, args);
14663     + va_end(args);
14664     + msg.text[sizeof(msg.text)-1] = '\0';
14665     + }
14666     +
14667     + if (test_action_state(SUSPEND_LOGALL))
14668     + printk("%s\n", msg.text);
14669     +
14670     + suspend_send_netlink_message(&ui_helper_data, USERUI_MSG_MESSAGE,
14671     + &msg, sizeof(msg));
14672     +}
14673     +
14674     +static void wait_for_key_via_userui(void)
14675     +{
14676     + DECLARE_WAITQUEUE(wait, current);
14677     +
14678     + add_wait_queue(&userui_wait_for_key, &wait);
14679     + set_current_state(TASK_INTERRUPTIBLE);
14680     +
14681     + interruptible_sleep_on(&userui_wait_for_key);
14682     +
14683     + set_current_state(TASK_RUNNING);
14684     + remove_wait_queue(&userui_wait_for_key, &wait);
14685     +}
14686     +
14687     +static char userui_wait_for_keypress(int timeout)
14688     +{
14689     + int fd;
14690     + char key = '\0';
14691     + struct termios t, t_backup;
14692     +
14693     + if (ui_helper_data.pid != -1) {
14694     + wait_for_key_via_userui();
14695     + key = ' ';
14696     + goto out;
14697     + }
14698     +
14699     + /* We should be guaranteed /dev/console exists after populate_rootfs() in
14700     + * init/main.c
14701     + */
14702     + if ((fd = sys_open("/dev/console", O_RDONLY, 0)) < 0) {
14703     + printk("Couldn't open /dev/console.\n");
14704     + goto out;
14705     + }
14706     +
14707     + if (sys_ioctl(fd, TCGETS, (long)&t) < 0)
14708     + goto out_close;
14709     +
14710     + memcpy(&t_backup, &t, sizeof(t));
14711     +
14712     + t.c_lflag &= ~(ISIG|ICANON|ECHO);
14713     + t.c_cc[VMIN] = 0;
14714     + if (timeout)
14715     + t.c_cc[VTIME] = timeout*10;
14716     +
14717     + if (sys_ioctl(fd, TCSETS, (long)&t) < 0)
14718     + goto out_restore;
14719     +
14720     + while (1) {
14721     + if (sys_read(fd, &key, 1) <= 0) {
14722     + key = '\0';
14723     + break;
14724     + }
14725     + key = tolower(key);
14726     + if (test_suspend_state(SUSPEND_SANITY_CHECK_PROMPT)) {
14727     + if (key == 'c') {
14728     + set_suspend_state(SUSPEND_CONTINUE_REQ);
14729     + break;
14730     + } else if (key == ' ')
14731     + break;
14732     + } else
14733     + break;
14734     + }
14735     +
14736     +out_restore:
14737     + sys_ioctl(fd, TCSETS, (long)&t_backup);
14738     +out_close:
14739     + sys_close(fd);
14740     +out:
14741     + return key;
14742     +}
14743     +
14744     +/* userui_prepare_status
14745     + * Description: Prepare the 'nice display', drawing the header and version,
14746     + * along with the current action and perhaps also resetting the
14747     + * progress bar.
14748     + * Arguments:
14749     + * int clearbar: Whether to reset the progress bar.
14750     + * const char *fmt, ...: The action to be displayed.
14751     + */
14752     +static void userui_prepare_status(int clearbar, const char *fmt, ...)
14753     +{
14754     + va_list args;
14755     +
14756     + if (fmt) {
14757     + va_start(args, fmt);
14758     + lastheader_message_len = vsnprintf(lastheader, 512, fmt, args);
14759     + va_end(args);
14760     + }
14761     +
14762     + if (clearbar)
14763     + suspend_update_status(0, 1, NULL);
14764     +
14765     + suspend_message(0, SUSPEND_STATUS, 1, lastheader, NULL);
14766     +
14767     + if (ui_helper_data.pid == -1)
14768     + printk(KERN_EMERG "%s\n", lastheader);
14769     +}
14770     +
14771     +/* userui_abort_suspend
14772     + *
14773     + * Description: Begin to abort a cycle. If this wasn't at the user's request
14774     + * (and we're displaying output), tell the user why and wait for
14775     + * them to acknowledge the message.
14776     + * Arguments: A parameterised string (imagine this is printk) to display,
14777     + * telling the user why we're aborting.
14778     + */
14779     +
14780     +static void userui_abort_suspend(int result_code, const char *fmt, ...)
14781     +{
14782     + va_list args;
14783     + int printed_len = 0;
14784     +
14785     + set_result_state(result_code);
14786     + if (!test_result_state(SUSPEND_ABORTED)) {
14787     + if (!test_result_state(SUSPEND_ABORT_REQUESTED)) {
14788     + va_start(args, fmt);
14789     + printed_len = vsnprintf(local_printf_buf,
14790     + sizeof(local_printf_buf), fmt, args);
14791     + va_end(args);
14792     + if (ui_helper_data.pid != -1)
14793     + printed_len = sprintf(local_printf_buf + printed_len,
14794     + " (Press SPACE to continue)");
14795     + suspend_prepare_status(CLEAR_BAR, local_printf_buf);
14796     +
14797     + if (ui_helper_data.pid != -1)
14798     + suspend_wait_for_keypress(0);
14799     + }
14800     + /* Turn on aborting flag */
14801     + set_result_state(SUSPEND_ABORTED);
14802     + }
14803     +}
14804     +
14805     +/* request_abort_suspend
14806     + *
14807     + * Description: Handle the user requesting the cancellation of a suspend by
14808     + * pressing escape.
14809     + * Callers: Invoked from a netlink packet from userspace when the user presses
14810     + * escape.
14811     + */
14812     +static void request_abort_suspend(void)
14813     +{
14814     + if (test_result_state(SUSPEND_ABORT_REQUESTED))
14815     + return;
14816     +
14817     + if (test_suspend_state(SUSPEND_NOW_RESUMING)) {
14818     + suspend_prepare_status(CLEAR_BAR, "Escape pressed. "
14819     + "Powering down again.");
14820     + set_suspend_state(SUSPEND_STOP_RESUME);
14821     + while (!test_suspend_state(SUSPEND_IO_STOPPED))
14822     + schedule();
14823     + if (suspendActiveAllocator->mark_resume_attempted)
14824     + suspendActiveAllocator->mark_resume_attempted(0);
14825     + suspend2_power_down();
14826     + } else {
14827     + suspend_prepare_status(CLEAR_BAR, "--- ESCAPE PRESSED :"
14828     + " ABORTING SUSPEND ---");
14829     + set_result_state(SUSPEND_ABORTED);
14830     + set_result_state(SUSPEND_ABORT_REQUESTED);
14831     +
14832     + wake_up_interruptible(&userui_wait_for_key);
14833     + }
14834     +}
14835     +
14836     +static int userui_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
14837     +{
14838     + int type;
14839     + int *data;
14840     +
14841     + type = nlh->nlmsg_type;
14842     +
14843     + /* Control messages: ignore them */
14844     + if (type < NETLINK_MSG_BASE)
14845     + return 0;
14846     +
14847     + /* Unknown message: reply with EINVAL */
14848     + if (type >= USERUI_MSG_MAX)
14849     + return -EINVAL;
14850     +
14851     + /* All operations require privileges, even GET */
14852     + if (security_netlink_recv(skb, CAP_NET_ADMIN))
14853     + return -EPERM;
14854     +
14855     + /* Only allow one task to receive NOFREEZE privileges */
14856     + if (type == NETLINK_MSG_NOFREEZE_ME && ui_helper_data.pid != -1) {
14857     + printk("Got NOFREEZE_ME request when ui_helper_data.pid is %d.\n", ui_helper_data.pid);
14858     + return -EBUSY;
14859     + }
14860     +
14861     + data = (int*)NLMSG_DATA(nlh);
14862     +
14863     + switch (type) {
14864     + case USERUI_MSG_ABORT:
14865     + request_abort_suspend();
14866     + break;
14867     + case USERUI_MSG_GET_STATE:
14868     + suspend_send_netlink_message(&ui_helper_data,
14869     + USERUI_MSG_GET_STATE, &suspend_action,
14870     + sizeof(suspend_action));
14871     + break;
14872     + case USERUI_MSG_GET_DEBUG_STATE:
14873     + suspend_send_netlink_message(&ui_helper_data,
14874     + USERUI_MSG_GET_DEBUG_STATE,
14875     + &suspend_debug_state,
14876     + sizeof(suspend_debug_state));
14877     + break;
14878     + case USERUI_MSG_SET_STATE:
14879     + if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(int)))
14880     + return -EINVAL;
14881     + ui_nl_set_state(*data);
14882     + break;
14883     + case USERUI_MSG_SET_DEBUG_STATE:
14884     + if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(int)))
14885     + return -EINVAL;
14886     + suspend_debug_state = (*data);
14887     + break;
14888     + case USERUI_MSG_SPACE:
14889     + wake_up_interruptible(&userui_wait_for_key);
14890     + break;
14891     + case USERUI_MSG_GET_POWERDOWN_METHOD:
14892     + suspend_send_netlink_message(&ui_helper_data,
14893     + USERUI_MSG_GET_POWERDOWN_METHOD,
14894     + &suspend2_poweroff_method,
14895     + sizeof(suspend2_poweroff_method));
14896     + break;
14897     + case USERUI_MSG_SET_POWERDOWN_METHOD:
14898     + if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(int)))
14899     + return -EINVAL;
14900     + suspend2_poweroff_method = (*data);
14901     + break;
14902     + case USERUI_MSG_GET_LOGLEVEL:
14903     + suspend_send_netlink_message(&ui_helper_data,
14904     + USERUI_MSG_GET_LOGLEVEL,
14905     + &suspend_default_console_level,
14906     + sizeof(suspend_default_console_level));
14907     + break;
14908     + case USERUI_MSG_SET_LOGLEVEL:
14909     + if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(int)))
14910     + return -EINVAL;
14911     + suspend_default_console_level = (*data);
14912     + break;
14913     + }
14914     +
14915     + return 1;
14916     +}
14917     +
14918     +/* userui_cond_pause
14919     + *
14920     + * Description: Potentially pause and wait for the user to tell us to continue.
14921     + * We normally only pause when @pause is set.
14922     + * Arguments: int pause: Whether we normally pause.
14923     + * char *message: The message to display. Not parameterised
14924     + * because it's normally a constant.
14925     + */
14926     +
14927     +static void userui_cond_pause(int pause, char *message)
14928     +{
14929     + int displayed_message = 0, last_key = 0;
14930     +
14931     + while (last_key != 32 &&
14932     + ui_helper_data.pid != -1 &&
14933     + (!test_result_state(SUSPEND_ABORTED)) &&
14934     + ((test_action_state(SUSPEND_PAUSE) && pause) ||
14935     + (test_action_state(SUSPEND_SINGLESTEP)))) {
14936     + if (!displayed_message) {
14937     + suspend_prepare_status(DONT_CLEAR_BAR,
14938     + "%s Press SPACE to continue.%s",
14939     + message ? message : "",
14940     + (test_action_state(SUSPEND_SINGLESTEP)) ?
14941     + " Single step on." : "");
14942     + displayed_message = 1;
14943     + }
14944     + last_key = suspend_wait_for_keypress(0);
14945     + }
14946     + schedule();
14947     +}
14948     +
14949     +/* userui_prepare_console
14950     + *
14951     + * Description: Prepare a console for use, save current settings.
14952     + * Returns: Nothing. Errors aren't treated as fatal, but
14953     + * a warning is printed.
14954     + */
14955     +static void userui_prepare_console(void)
14956     +{
14957     + orig_kmsg = kmsg_redirect;
14958     + kmsg_redirect = fg_console + 1;
14959     +
14960     + ui_helper_data.pid = -1;
14961     +
14962     + if (!userui_ops.enabled)
14963     + return;
14964     +
14965     + if (!*ui_helper_data.program) {
14966     + printk("suspend_userui: program not configured. suspend_userui disabled.\n");
14967     + return;
14968     + }
14969     +
14970     + suspend_netlink_setup(&ui_helper_data);
14971     +
14972     + return;
14973     +}
14974     +
14975     +/* userui_cleanup_console
14976     + *
14977     + * Description: Restore the settings we saved above.
14978     + */
14979     +
14980     +static void userui_cleanup_console(void)
14981     +{
14982     + if (ui_helper_data.pid > -1)
14983     + suspend_netlink_close(&ui_helper_data);
14984     +
14985     + kmsg_redirect = orig_kmsg;
14986     +}
14987     +
14988     +/*
14989     + * User interface specific /sys/power/suspend2 entries.
14990     + */
14991     +
14992     +static struct suspend_sysfs_data sysfs_params[] = {
14993     +#if defined(CONFIG_NET) && defined(CONFIG_SYSFS)
14994     + { SUSPEND2_ATTR("enable_escape", SYSFS_RW),
14995     + SYSFS_BIT(&suspend_action, SUSPEND_CAN_CANCEL, 0)
14996     + },
14997     +
14998     + { SUSPEND2_ATTR("pause_between_steps", SYSFS_RW),
14999     + SYSFS_BIT(&suspend_action, SUSPEND_PAUSE, 0)
15000     + },
15001     +
15002     + { SUSPEND2_ATTR("enabled", SYSFS_RW),
15003     + SYSFS_INT(&userui_ops.enabled, 0, 1, 0)
15004     + },
15005     +
15006     + { SUSPEND2_ATTR("progress_granularity", SYSFS_RW),
15007     + SYSFS_INT(&progress_granularity, 1, 2048, 0)
15008     + },
15009     +
15010     + { SUSPEND2_ATTR("program", SYSFS_RW),
15011     + SYSFS_STRING(ui_helper_data.program, 255, 0),
15012     + .write_side_effect = set_ui_program_set,
15013     + },
15014     +#endif
15015     +};
15016     +
15017     +static struct suspend_module_ops userui_ops = {
15018     + .type = MISC_MODULE,
15019     + .name = "Userspace UI",
15020     + .shared_directory = "Basic User Interface",
15021     + .module = THIS_MODULE,
15022     + .storage_needed = userui_storage_needed,
15023     + .save_config_info = userui_save_config_info,
15024     + .load_config_info = userui_load_config_info,
15025     + .memory_needed = userui_memory_needed,
15026     + .sysfs_data = sysfs_params,
15027     + .num_sysfs_entries = sizeof(sysfs_params) / sizeof(struct suspend_sysfs_data),
15028     +};
15029     +
15030     +static struct ui_ops my_ui_ops = {
15031     + .redraw = userui_redraw,
15032     + .update_status = userui_update_status,
15033     + .message = userui_message,
15034     + .prepare_status = userui_prepare_status,
15035     + .abort = userui_abort_suspend,
15036     + .cond_pause = userui_cond_pause,
15037     + .prepare = userui_prepare_console,
15038     + .cleanup = userui_cleanup_console,
15039     + .wait_for_key = userui_wait_for_keypress,
15040     +};
15041     +
15042     +/* s2_user_ui_init
15043     + * Description: Boot time initialisation for user interface.
15044     + */
15045     +
15046     +static __init int s2_user_ui_init(void)
15047     +{
15048     + int result;
15049     +
15050     + ui_helper_data.nl = NULL;
15051     + ui_helper_data.program[0] = '\0';
15052     + ui_helper_data.pid = -1;
15053     + ui_helper_data.skb_size = sizeof(struct userui_msg_params);
15054     + ui_helper_data.pool_limit = 6;
15055     + ui_helper_data.netlink_id = NETLINK_SUSPEND2_USERUI;
15056     + ui_helper_data.name = "userspace ui";
15057     + ui_helper_data.rcv_msg = userui_user_rcv_msg;
15058     + ui_helper_data.interface_version = 7;
15059     + ui_helper_data.must_init = 0;
15060     + ui_helper_data.not_ready = userui_cleanup_console;
15061     + init_completion(&ui_helper_data.wait_for_process);
15062     + result = suspend_register_module(&userui_ops);
15063     + if (!result)
15064     + result = s2_register_ui_ops(&my_ui_ops);
15065     + if (result)
15066     + suspend_unregister_module(&userui_ops);
15067     +
15068     + return result;
15069     +}
15070     +
15071     +#ifdef MODULE
15072     +static __exit void s2_user_ui_exit(void)
15073     +{
15074     + s2_remove_ui_ops(&my_ui_ops);
15075     + suspend_unregister_module(&userui_ops);
15076     +}
15077     +
15078     +module_init(s2_user_ui_init);
15079     +module_exit(s2_user_ui_exit);
15080     +MODULE_AUTHOR("Nigel Cunningham");
15081     +MODULE_DESCRIPTION("Suspend2 Userui Support");
15082     +MODULE_LICENSE("GPL");
15083     +#else
15084     +late_initcall(s2_user_ui_init);
15085     +#endif
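The console fallback in userui_wait_for_keypress() above switches the terminal to a polling, non-canonical mode before reading one key. A minimal userspace sketch of just that termios adjustment follows. The helper name make_keypress_termios and the 25-second cap are assumptions made for illustration (the cap reflects VTIME being a single cc_t counted in tenths of a second); nothing here is part of the patch itself.

```c
#include <assert.h>
#include <termios.h>

/*
 * Hypothetical helper (not in the patch): adjust a termios description
 * the way the console fallback does before reading a single key.
 * timeout_secs == 0 leaves VTIME untouched, mirroring the kernel code.
 */
void make_keypress_termios(struct termios *t, int timeout_secs)
{
	/* VTIME is one cc_t in tenths of a second: 25 s * 10 = 250 fits. */
	assert(timeout_secs >= 0 && timeout_secs <= 25);

	/* No signal keys, no line buffering, no echo. */
	t->c_lflag &= ~(tcflag_t)(ISIG | ICANON | ECHO);

	/* VMIN = 0: read() may return 0 bytes once the timer expires. */
	t->c_cc[VMIN] = 0;
	if (timeout_secs)
		t->c_cc[VTIME] = (cc_t)(timeout_secs * 10);
}
```

On a real terminal one would fetch the current settings with tcgetattr(), keep a backup copy, apply the helper, install the result with tcsetattr(), read one byte, and finally restore the backup, which is the same order the kernel code follows with TCGETS and TCSETS.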
15086     diff --git a/kernel/power/sysfs.c b/kernel/power/sysfs.c
15087     new file mode 100644
15088     index 0000000..47eca3b
15089     --- /dev/null
15090     +++ b/kernel/power/sysfs.c
15091     @@ -0,0 +1,347 @@
15092     +/*
15093     + * kernel/power/sysfs.c
15094     + *
15095     + * Copyright (C) 2002-2007 Nigel Cunningham (nigel at suspend2 net)
15096     + *
15097     + * This file is released under the GPLv2.
15098     + *
15099     + * This file contains support for sysfs entries for tuning Suspend2.
15100     + *
15101     + * We have a generic handler that deals with the most common cases, and
15102     + * hooks for special handlers to use.
15103     + */
15104     +
15105     +#include <linux/suspend.h>
15106     +#include <linux/module.h>
15107     +#include <asm/uaccess.h>
15108     +
15109     +#include "sysfs.h"
15110     +#include "suspend.h"
15111     +#include "storage.h"
15112     +
15113     +static int suspend_sysfs_initialised = 0;
15114     +
15115     +static void suspend_initialise_sysfs(void);
15116     +
15117     +static struct suspend_sysfs_data sysfs_params[];
15118     +
15119     +#define to_sysfs_data(_attr) container_of(_attr, struct suspend_sysfs_data, attr)
15120     +
15121     +static void suspend2_main_wrapper(void)
15122     +{
15123     + _suspend2_try_suspend(0);
15124     +}
15125     +
15126     +static ssize_t suspend2_attr_show(struct kobject *kobj, struct attribute *attr,
15127     + char *page)
15128     +{
15129     + struct suspend_sysfs_data *sysfs_data = to_sysfs_data(attr);
15130     + int len = 0;
15131     +
15132     + if (suspend_start_anything(0))
15133     + return -EBUSY;
15134     +
15135     + if (sysfs_data->flags & SYSFS_NEEDS_SM_FOR_READ)
15136     + suspend_prepare_usm();
15137     +
15138     + switch (sysfs_data->type) {
15139     + case SUSPEND_SYSFS_DATA_CUSTOM:
15140     + len = (sysfs_data->data.special.read_sysfs) ?
15141     + (sysfs_data->data.special.read_sysfs)(page, PAGE_SIZE)
15142     + : 0;
15143     + break;
15144     + case SUSPEND_SYSFS_DATA_BIT:
15145     + len = sprintf(page, "%d\n",
15146     + -test_bit(sysfs_data->data.bit.bit,
15147     + sysfs_data->data.bit.bit_vector));
15148     + break;
15149     + case SUSPEND_SYSFS_DATA_INTEGER:
15150     + len = sprintf(page, "%d\n",
15151     + *(sysfs_data->data.integer.variable));
15152     + break;
15153     + case SUSPEND_SYSFS_DATA_LONG:
15154     + len = sprintf(page, "%ld\n",
15155     + *(sysfs_data->data.a_long.variable));
15156     + break;
15157     + case SUSPEND_SYSFS_DATA_UL:
15158     + len = sprintf(page, "%lu\n",
15159     + *(sysfs_data->data.ul.variable));
15160     + break;
15161     + case SUSPEND_SYSFS_DATA_STRING:
15162     + len = sprintf(page, "%s\n",
15163     + sysfs_data->data.string.variable);
15164     + break;
15165     + }
15166     + /* Side effect routine? */
15167     + if (sysfs_data->read_side_effect)
15168     + sysfs_data->read_side_effect();
15169     +
15170     + if (sysfs_data->flags & SYSFS_NEEDS_SM_FOR_READ)
15171     + suspend_cleanup_usm();
15172     +
15173     + suspend_finish_anything(0);
15174     +
15175     + return len;
15176     +}
15177     +
15178     +#define BOUND(_variable, _type) \
15179     + if (*_variable < sysfs_data->data._type.minimum) \
15180     + *_variable = sysfs_data->data._type.minimum; \
15181     + else if (*_variable > sysfs_data->data._type.maximum) \
15182     + *_variable = sysfs_data->data._type.maximum;
15183     +
15184     +static ssize_t suspend2_attr_store(struct kobject *kobj, struct attribute *attr,
15185     + const char *my_buf, size_t count)
15186     +{
15187     + int assigned_temp_buffer = 0, result = count;
15188     + struct suspend_sysfs_data *sysfs_data = to_sysfs_data(attr);
15189     +
15190     + if (suspend_start_anything((sysfs_data->flags & SYSFS_SUSPEND_OR_RESUME)))
15191     + return -EBUSY;
15192     +
15193     + ((char *) my_buf)[count] = 0;
15194     +
15195     + if (sysfs_data->flags & SYSFS_NEEDS_SM_FOR_WRITE)
15196     + suspend_prepare_usm();
15197     +
15198     + switch (sysfs_data->type) {
15199     + case SUSPEND_SYSFS_DATA_CUSTOM:
15200     + if (sysfs_data->data.special.write_sysfs)
15201     + result = (sysfs_data->data.special.write_sysfs)
15202     + (my_buf, count);
15203     + break;
15204     + case SUSPEND_SYSFS_DATA_BIT:
15205     + {
15206     + int value = simple_strtoul(my_buf, NULL, 0);
15207     + if (value)
15208     + set_bit(sysfs_data->data.bit.bit,
15209     + (sysfs_data->data.bit.bit_vector));
15210     + else
15211     + clear_bit(sysfs_data->data.bit.bit,
15212     + (sysfs_data->data.bit.bit_vector));
15213     + }
15214     + break;
15215     + case SUSPEND_SYSFS_DATA_INTEGER:
15216     + {
15217     + int *variable = sysfs_data->data.integer.variable;
15218     + *variable = simple_strtol(my_buf, NULL, 0);
15219     + BOUND(variable, integer);
15220     + break;
15221     + }
15222     + case SUSPEND_SYSFS_DATA_LONG:
15223     + {
15224     + long *variable = sysfs_data->data.a_long.variable;
15225     + *variable = simple_strtol(my_buf, NULL, 0);
15226     + BOUND(variable, a_long);
15227     + break;
15228     + }
15229     + case SUSPEND_SYSFS_DATA_UL:
15230     + {
15231     + unsigned long *variable = sysfs_data->data.ul.variable;
15232     + *variable = simple_strtoul(my_buf, NULL, 0);
15233     + BOUND(variable, ul);
15234     + break;
15235     + }
15236     + break;
15237     + case SUSPEND_SYSFS_DATA_STRING:
15238     + {
15239     + int copy_len = count;
15240     + char *variable =
15241     + sysfs_data->data.string.variable;
15242     +
15243     + if (sysfs_data->data.string.max_length &&
15244     + (copy_len > sysfs_data->data.string.max_length))
15245     + copy_len = sysfs_data->data.string.max_length;
15246     +
15247     + if (!variable) {
15248     + sysfs_data->data.string.variable =
15249     + variable = (char *) get_zeroed_page(GFP_ATOMIC);
15250     + assigned_temp_buffer = 1;
15251     + }
15252     + strncpy(variable, my_buf, copy_len);
15253     + if ((copy_len) &&
15254     + (my_buf[copy_len - 1] == '\n'))
15255     + variable[copy_len - 1] = 0;
15256     + variable[copy_len] = 0;
15257     + }
15258     + break;
15259     + }
15260     +
15261     + /* Side effect routine? */
15262     + if (sysfs_data->write_side_effect)
15263     + sysfs_data->write_side_effect();
15264     +
15265     + /* Free temporary buffers */
15266     + if (assigned_temp_buffer) {
15267     + free_page((unsigned long) sysfs_data->data.string.variable);
15268     + sysfs_data->data.string.variable = NULL;
15269     + }
15270     +
15271     + if (sysfs_data->flags & SYSFS_NEEDS_SM_FOR_WRITE)
15272     + suspend_cleanup_usm();
15273     +
15274     + suspend_finish_anything(sysfs_data->flags & SYSFS_SUSPEND_OR_RESUME);
15275     +
15276     + return result;
15277     +}
15278     +
15279     +static struct sysfs_ops suspend2_sysfs_ops = {
15280     + .show = &suspend2_attr_show,
15281     + .store = &suspend2_attr_store,
15282     +};
15283     +
15284     +static struct kobj_type suspend2_ktype = {
15285     + .sysfs_ops = &suspend2_sysfs_ops,
15286     +};
15287     +
15288     +decl_subsys(suspend2, &suspend2_ktype, NULL);
15289     +
15290     +/* Non-module sysfs entries.
15291     + *
15292     + * This array contains entries that are automatically registered at
15293     + * boot. Modules and the console code register their own entries separately.
15294     + *
15295     + * NB: If you move do_suspend, change suspend_write_sysfs's test so that
15296     + * suspend_start_anything still gets a 1 when the user echoes > do_suspend!
15297     + */
15298     +
15299     +static struct suspend_sysfs_data sysfs_params[] = {
15300     + { SUSPEND2_ATTR("do_suspend", SYSFS_WRITEONLY),
15301     + SYSFS_CUSTOM(NULL, NULL, SYSFS_SUSPENDING),
15302     + .write_side_effect = suspend2_main_wrapper
15303     + },
15304     +
15305     + { SUSPEND2_ATTR("do_resume", SYSFS_WRITEONLY),
15306     + SYSFS_CUSTOM(NULL, NULL, SYSFS_RESUMING),
15307     + .write_side_effect = __suspend2_try_resume
15308     + },
15309     +
15310     +};
15311     +
15312     +void remove_suspend2_sysdir(struct kobject *kobj)
15313     +{
15314     + if (!kobj)
15315     + return;
15316     +
15317     + kobject_unregister(kobj);
15318     +
15319     + kfree(kobj);
15320     +}
15321     +
15322     +struct kobject *make_suspend2_sysdir(char *name)
15323     +{
15324     + struct kobject *kobj = kzalloc(sizeof(struct kobject), GFP_KERNEL);
15325     + int err;
15326     +
15327     + if(!kobj) {
15328     + printk("Suspend2: Can't allocate kobject for sysfs dir!\n");
15329     + return NULL;
15330     + }
15331     +
15332     + err = kobject_set_name(kobj, "%s", name);
15333     +
15334     + if (err) {
15335     + kfree(kobj);
15336     + return NULL;
15337     + }
15338     +
15339     + kobj->kset = &suspend2_subsys.kset;
15340     +
15341     + err = kobject_register(kobj);
15342     +
15343     + if (err)
15344     + kfree(kobj);
15345     +
15346     + return err ? NULL : kobj;
15347     +}
15348     +
15349     +/* suspend_register_sysfs_file
15350     + *
15351     + * Helper for registering a new /sys/power/suspend2 entry.
15352     + */
15353     +
15354     +int suspend_register_sysfs_file(
15355     + struct kobject *kobj,
15356     + struct suspend_sysfs_data *suspend_sysfs_data)
15357     +{
15358     + int result;
15359     +
15360     + if (!suspend_sysfs_initialised)
15361     + suspend_initialise_sysfs();
15362     +
15363     + if ((result = sysfs_create_file(kobj, &suspend_sysfs_data->attr)))
15364     + printk("Suspend2: sysfs_create_file for %s returned %d.\n",
15365     + suspend_sysfs_data->attr.name, result);
15366     +
15367     + return result;
15368     +}
15369     +
15370     +/* suspend_unregister_sysfs_file
15371     + *
15372     + * Helper for removing unwanted /sys/power/suspend2 entries.
15373     + *
15374     + */
15375     +void suspend_unregister_sysfs_file(struct kobject *kobj,
15376     + struct suspend_sysfs_data *suspend_sysfs_data)
15377     +{
15378     + sysfs_remove_file(kobj, &suspend_sysfs_data->attr);
15379     +}
15380     +
15381     +void suspend_cleanup_sysfs(void)
15382     +{
15383     + int i,
15384     + numfiles = sizeof(sysfs_params) / sizeof(struct suspend_sysfs_data);
15385     +
15386     + if (!suspend_sysfs_initialised)
15387     + return;
15388     +
15389     + for (i=0; i< numfiles; i++)
15390     + suspend_unregister_sysfs_file(&suspend2_subsys.kset.kobj,
15391     + &sysfs_params[i]);
15392     +
15393     + kobj_set_kset_s(&suspend2_subsys.kset, power_subsys);
15394     + subsystem_unregister(&suspend2_subsys);
15395     +
15396     + suspend_sysfs_initialised = 0;
15397     +}
15398     +
15399     +/* suspend_initialise_sysfs
15400     + *
15401     + * Initialise the /sys/power/suspend2 directory.
15402     + */
15403     +
15404     +static void suspend_initialise_sysfs(void)
15405     +{
15406     + int i, error;
15407     + int numfiles = sizeof(sysfs_params) / sizeof(struct suspend_sysfs_data);
15408     +
15409     + if (suspend_sysfs_initialised)
15410     + return;
15411     +
15412     + /* Make our suspend2 directory a child of /sys/power */
15413     + kobj_set_kset_s(&suspend2_subsys.kset, power_subsys);
15414     + error = subsystem_register(&suspend2_subsys);
15415     +
15416     + if (error)
15417     + return;
15418     +
15419     + /* Make it use the .store and .show routines above */
15420     + kobj_set_kset_s(&suspend2_subsys.kset, suspend2_subsys);
15421     +
15422     + suspend_sysfs_initialised = 1;
15423     +
15424     + for (i=0; i< numfiles; i++)
15425     + suspend_register_sysfs_file(&suspend2_subsys.kset.kobj,
15426     + &sysfs_params[i]);
15427     +}
15428     +
15429     +int s2_sysfs_init(void)
15430     +{
15431     + suspend_initialise_sysfs();
15432     + return 0;
15433     +}
15434     +
15435     +void s2_sysfs_exit(void)
15436     +{
15437     + suspend_cleanup_sysfs();
15438     +}
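The generic store handler in kernel/power/sysfs.c above parses the written text, clamps numeric values to the entry's minimum and maximum, and copies strings up to a maximum length. The following is a userspace sketch of those two paths, assuming strtol() as a stand-in for the kernel's simple_strtol(). The function names parse_bounded and store_bounded_string are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical analogue of the INTEGER store path followed by BOUND(). */
long parse_bounded(const char *buf, long min, long max)
{
	long value;

	assert(min <= max);
	/* Base 0 accepts decimal, 0x-prefixed hex and 0-prefixed octal. */
	value = strtol(buf, NULL, 0);
	if (value < min)
		value = min;
	else if (value > max)
		value = max;
	return value;
}

/*
 * Hypothetical analogue of the STRING store path. dst must have room
 * for max_len + 1 bytes. Returns the length actually stored.
 */
size_t store_bounded_string(char *dst, size_t max_len,
			    const char *src, size_t count)
{
	size_t len = count < max_len ? count : max_len;

	memcpy(dst, src, len);
	/* Drop one trailing newline, as left by `echo value > file`. */
	if (len && dst[len - 1] == '\n')
		len--;
	dst[len] = '\0';
	return len;
}
```

In this sketch every index into dst is derived from the clamped length rather than from the caller's count, so an over-long write cannot place the terminator outside the buffer.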
15439     diff --git a/kernel/power/sysfs.h b/kernel/power/sysfs.h
15440     new file mode 100644
15441     index 0000000..aad5e26
15442     --- /dev/null
15443     +++ b/kernel/power/sysfs.h
15444     @@ -0,0 +1,132 @@
15445     +/*
15446     + * kernel/power/sysfs.h
15447     + *
15448     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
15449     + *
15450     + * This file is released under the GPLv2.
15451     + *
15452     + * It provides declarations for suspend to use in managing
15453     + * /sys/power/suspend2. When we switch to kobjects,
15454     + * this will become redundant.
15455     + *
15456     + */
15457     +
15458     +#include <linux/sysfs.h>
15459     +#include "power.h"
15460     +
15461     +struct suspend_sysfs_data {
15462     + struct attribute attr;
15463     + int type;
15464     + int flags;
15465     + union {
15466     + struct {
15467     + unsigned long *bit_vector;
15468     + int bit;
15469     + } bit;
15470     + struct {
15471     + int *variable;
15472     + int minimum;
15473     + int maximum;
15474     + } integer;
15475     + struct {
15476     + long *variable;
15477     + long minimum;
15478     + long maximum;
15479     + } a_long;
15480     + struct {
15481     + unsigned long *variable;
15482     + unsigned long minimum;
15483     + unsigned long maximum;
15484     + } ul;
15485     + struct {
15486     + char *variable;
15487     + int max_length;
15488     + } string;
15489     + struct {
15490     + int (*read_sysfs) (const char *buffer, int count);
15491     + int (*write_sysfs) (const char *buffer, int count);
15492     + void *data;
15493     + } special;
15494     + } data;
15495     +
15496     + /* Side effects routines. Used, eg, for reparsing the
15497     + * resume2 entry when it changes */
15498     + void (*read_side_effect) (void);
15499     + void (*write_side_effect) (void);
15500     + struct list_head sysfs_data_list;
15501     +};
15502     +
15503     +enum {
15504     + SUSPEND_SYSFS_DATA_NONE = 1,
15505     + SUSPEND_SYSFS_DATA_CUSTOM,
15506     + SUSPEND_SYSFS_DATA_BIT,
15507     + SUSPEND_SYSFS_DATA_INTEGER,
15508     + SUSPEND_SYSFS_DATA_UL,
15509     + SUSPEND_SYSFS_DATA_LONG,
15510     + SUSPEND_SYSFS_DATA_STRING
15511     +};
15512     +
15513     +#define SUSPEND2_ATTR(_name, _mode) \
15514     + .attr = {.name = _name , .mode = _mode }
15515     +
15516     +#define SYSFS_BIT(_ul, _bit, _flags) \
15517     + .type = SUSPEND_SYSFS_DATA_BIT, \
15518     + .flags = _flags, \
15519     + .data = { .bit = { .bit_vector = _ul, .bit = _bit } }
15520     +
15521     +#define SYSFS_INT(_int, _min, _max, _flags) \
15522     + .type = SUSPEND_SYSFS_DATA_INTEGER, \
15523     + .flags = _flags, \
15524     + .data = { .integer = { .variable = _int, .minimum = _min, \
15525     + .maximum = _max } }
15526     +
15527     +#define SYSFS_UL(_ul, _min, _max, _flags) \
15528     + .type = SUSPEND_SYSFS_DATA_UL, \
15529     + .flags = _flags, \
15530     + .data = { .ul = { .variable = _ul, .minimum = _min, \
15531     + .maximum = _max } }
15532     +
15533     +#define SYSFS_LONG(_long, _min, _max, _flags) \
15534     + .type = SUSPEND_SYSFS_DATA_LONG, \
15535     + .flags = _flags, \
15536     + .data = { .a_long = { .variable = _long, .minimum = _min, \
15537     + .maximum = _max } }
15538     +
15539     +#define SYSFS_STRING(_string, _max_len, _flags) \
15540     + .type = SUSPEND_SYSFS_DATA_STRING, \
15541     + .flags = _flags, \
15542     + .data = { .string = { .variable = _string, .max_length = _max_len } }
15543     +
15544     +#define SYSFS_CUSTOM(_read, _write, _flags) \
15545     + .type = SUSPEND_SYSFS_DATA_CUSTOM, \
15546     + .flags = _flags, \
15547     + .data = { .special = { .read_sysfs = _read, .write_sysfs = _write } }
15548     +
15549     +#define SYSFS_WRITEONLY 0200
15550     +#define SYSFS_READONLY 0444
15551     +#define SYSFS_RW 0644
15552     +
15553     +/* Flags */
15554     +#define SYSFS_NEEDS_SM_FOR_READ 1
15555     +#define SYSFS_NEEDS_SM_FOR_WRITE 2
15556     +#define SYSFS_SUSPEND 4
15557     +#define SYSFS_RESUME 8
15558     +#define SYSFS_SUSPEND_OR_RESUME (SYSFS_SUSPEND | SYSFS_RESUME)
15559     +#define SYSFS_SUSPENDING (SYSFS_SUSPEND | SYSFS_NEEDS_SM_FOR_WRITE)
15560     +#define SYSFS_RESUMING (SYSFS_RESUME | SYSFS_NEEDS_SM_FOR_WRITE)
15561     +#define SYSFS_NEEDS_SM_FOR_BOTH \
15562     + (SYSFS_NEEDS_SM_FOR_READ | SYSFS_NEEDS_SM_FOR_WRITE)
15563     +
15564     +int suspend_register_sysfs_file(struct kobject *kobj,
15565     + struct suspend_sysfs_data *suspend_sysfs_data);
15566     +void suspend_unregister_sysfs_file(struct kobject *kobj,
15567     + struct suspend_sysfs_data *suspend_sysfs_data);
15568     +
15569     +extern struct subsystem suspend2_subsys;
15570     +
15571     +struct kobject *make_suspend2_sysdir(char *name);
15572     +void remove_suspend2_sysdir(struct kobject *obj);
15573     +extern void suspend_cleanup_sysfs(void);
15574     +
15575     +extern int s2_sysfs_init(void);
15576     +extern void s2_sysfs_exit(void);
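The header above describes each sysfs entry as a type tag plus a union, filled in through designated-initializer macros such as SYSFS_INT and SYSFS_STRING. A reduced userspace sketch of the same pattern is shown below. All names in it (attr_desc, ATTR_INT_ENTRY, ATTR_STRING_ENTRY, show_attr, the placeholder program path) are invented for illustration and only loosely mirror the patch's structures.

```c
#include <assert.h>
#include <stdio.h>

enum attr_type { ATTR_INT, ATTR_STRING };

/* Hypothetical, cut-down counterpart of struct suspend_sysfs_data. */
struct attr_desc {
	const char *name;
	enum attr_type type;
	union {
		struct { int *variable; int minimum; int maximum; } integer;
		struct { char *variable; int max_length; } string;
	} data;
};

/* Designated initializers keep each table row to a single line. */
#define ATTR_INT_ENTRY(_name, _var, _min, _max) \
	{ .name = _name, .type = ATTR_INT, \
	  .data = { .integer = { .variable = _var, .minimum = _min, \
				 .maximum = _max } } }

#define ATTR_STRING_ENTRY(_name, _var, _max_len) \
	{ .name = _name, .type = ATTR_STRING, \
	  .data = { .string = { .variable = _var, .max_length = _max_len } } }

int granularity = 30;
char program[256] = "/usr/sbin/example-ui";	/* placeholder path */

struct attr_desc attrs[] = {
	ATTR_INT_ENTRY("progress_granularity", &granularity, 1, 2048),
	ATTR_STRING_ENTRY("program", program, 255),
};

/* One generic "show" routine dispatches on the type tag. */
int show_attr(const struct attr_desc *a, char *page, size_t len)
{
	assert(len > 0);
	switch (a->type) {
	case ATTR_INT:
		return snprintf(page, len, "%d\n", *a->data.integer.variable);
	case ATTR_STRING:
		return snprintf(page, len, "%s\n", a->data.string.variable);
	}
	return 0;
}
```

The design choice illustrated here is that adding a tunable costs one table row, while the read logic stays in a single switch on the type tag.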
15577     diff --git a/kernel/power/ui.c b/kernel/power/ui.c
15578     new file mode 100644
15579     index 0000000..5b6789f
15580     --- /dev/null
15581     +++ b/kernel/power/ui.c
15582     @@ -0,0 +1,235 @@
15583     +/*
15584     + * kernel/power/ui.c
15585     + *
15586     + * Copyright (C) 1998-2001 Gabor Kuti <seasons@fornax.hu>
15587     + * Copyright (C) 1998,2001,2002 Pavel Machek <pavel@suse.cz>
15588     + * Copyright (C) 2002-2003 Florent Chabaud <fchabaud@free.fr>
15589     + * Copyright (C) 2002-2007 Nigel Cunningham (nigel at suspend2 net)
15590     + *
15591     + * This file is released under the GPLv2.
15592     + *
15593     + * Routines for Suspend2's user interface.
15594     + *
15595     + * The user interface code talks to a userspace program via a
15596     + * netlink socket.
15597     + *
15598     + * The kernel side:
15599     + * - starts the userui program;
15600     + * - sends text messages and progress bar status;
15601     + *
15602     + * The user space side:
15603     + * - passes messages regarding user requests (abort, toggle reboot etc)
15604     + *
15605     + */
15606     +
15607     +#define __KERNEL_SYSCALLS__
15608     +
15609     +#include <linux/suspend.h>
15610     +#include <linux/freezer.h>
15611     +#include <linux/console.h>
15612     +#include <linux/ctype.h>
15613     +#include <linux/tty.h>
15614     +#include <linux/vt_kern.h>
15615     +#include <linux/module.h>
15616     +#include <linux/reboot.h>
15617     +#include <linux/kmod.h>
15618     +#include <linux/security.h>
15619     +#include <linux/syscalls.h>
15620     +
15621     +#include "sysfs.h"
15622     +#include "modules.h"
15623     +#include "suspend.h"
15624     +#include "ui.h"
15625     +#include "netlink.h"
15626     +#include "power_off.h"
15627     +
15628     +static char local_printf_buf[1024]; /* Same as printk - should be safe */
15629     +struct ui_ops *s2_current_ui;
15630     +
15631     +/*! The console log level we default to. */
15632     +int suspend_default_console_level = 0;
15633     +
15634     +/* suspend_early_boot_message()
15635     + * Description: Handle errors early in the process of booting.
15636     + * The user may press C to continue booting, perhaps
15637     + * invalidating the image, or space to reboot.
15638     + * This works from either the serial console or normally
15639     + * attached keyboard.
15640     + *
15641     + * Note that we come in here from init, while the kernel is
15642     + * locked. If we want to get events from the serial console,
15643     + * we need to temporarily unlock the kernel.
15644     + *
15645     + * suspend_early_boot_message may also be called post-boot.
15646     + * In this case, it simply printks the message and returns.
15647     + *
15648     + * Arguments: int message_detail. Whether we are able to erase the image.
15649     + * int default_answer. What to do when we time out. This
15650     + * will normally be continue, but the user might
15651     + * provide command line options (__setup) to override
15652     + * particular cases.
15653     + * char *warning_reason. Format string explaining why we're moaning.
15654     + */
15655     +
15656     +#define say(message, a...) printk(KERN_EMERG message, ##a)
15657     +#define message_timeout 25 /* message_timeout * 10 must fit in 8 bits */
15658     +
15659     +int suspend_early_boot_message(int message_detail, int default_answer, char *warning_reason, ...)
15660     +{
15661     + unsigned long orig_state = get_suspend_state(), continue_req = 0;
15662     + unsigned long orig_loglevel = console_loglevel;
15663     + va_list args;
15664     + int printed_len;
15665     +
15666     + if (warning_reason) {
15667     + va_start(args, warning_reason);
15668     + printed_len = vsnprintf(local_printf_buf,
15669     + sizeof(local_printf_buf),
15670     + warning_reason,
15671     + args);
15672     + va_end(args);
15673     + }
15674     +
15675     + if (!test_suspend_state(SUSPEND_BOOT_TIME)) {
15676     + printk("Suspend2: %s\n", local_printf_buf);
15677     + return default_answer;
15678     + }
15679     +
15680     + /* We might be called directly from do_mounts_initrd if the
15681     + * user fails to set up their initrd properly. We need to
15682     + * enable the keyboard handler by setting the running flag */
15683     + set_suspend_state(SUSPEND_RUNNING);
15684     +
15685     +#if defined(CONFIG_VT) || defined(CONFIG_SERIAL_CONSOLE)
15686     + console_loglevel = 7;
15687     +
15688     + say("=== Suspend2 ===\n\n");
15689     + if (warning_reason) {
15690     + say("BIG FAT WARNING!! %s\n\n", local_printf_buf);
15691     + switch (message_detail) {
15692     + case 0:
15693     + say("If you continue booting, note that any image WILL NOT BE REMOVED.\n");
15694     + say("Suspend is unable to do so because the appropriate modules aren't\n");
15695     + say("loaded. You should manually remove the image to avoid any\n");
15696     + say("possibility of corrupting your filesystem(s) later.\n");
15697     + break;
15698     + case 1:
15699     + say("If you want to use the current suspend image, reboot and try\n");
15700     + say("again with the same kernel that you suspended from. If you want\n");
15701     + say("to forget that image, continue and the image will be erased.\n");
15702     + break;
15703     + }
15704     + say("Press SPACE to reboot or C to continue booting with this kernel\n\n");
15705     + say("Default action if you don't select one in %d seconds is: %s.\n",
15706     + message_timeout,
15707     + default_answer == SUSPEND_CONTINUE_REQ ?
15708     + "continue booting" : "reboot");
15709     + } else {
15710     + say("BIG FAT WARNING!!\n\n");
15711     + say("You have tried to resume from this image before.\n");
15712     + say("If it failed once, it may well fail again.\n");
15713     + say("Would you like to remove the image and boot normally?\n");
15714     + say("This will be equivalent to entering noresume2 on the\n");
15715     + say("kernel command line.\n\n");
15716     + say("Press SPACE to remove the image or C to continue resuming.\n\n");
15717     + say("Default action if you don't select one in %d seconds is: %s.\n",
15718     + message_timeout,
15719     + !!default_answer ?
15720     + "continue resuming" : "remove the image");
15721     + }
15722     + console_loglevel = orig_loglevel;
15723     +
15724     + set_suspend_state(SUSPEND_SANITY_CHECK_PROMPT);
15725     + clear_suspend_state(SUSPEND_CONTINUE_REQ);
15726     +
15727     + if (suspend_wait_for_keypress(message_timeout) == 0) /* We timed out */
15728     + continue_req = !!default_answer;
15729     + else
15730     + continue_req = test_suspend_state(SUSPEND_CONTINUE_REQ);
15731     +
15732     + if ((warning_reason) && (!continue_req))
15733     + machine_restart(NULL);
15734     +
15735     + restore_suspend_state(orig_state);
15736     + if (continue_req)
15737     + set_suspend_state(SUSPEND_CONTINUE_REQ);
15738     +
15739     +#endif /* CONFIG_VT or CONFIG_SERIAL_CONSOLE */
15740     + return -EIO;
15741     +}
15742     +#undef say
15743     +
15744     +/*
15745     + * User interface specific /sys/power/suspend2 entries.
15746     + */
15747     +
15748     +static struct suspend_sysfs_data sysfs_params[] = {
15749     +#if defined(CONFIG_NET) && defined(CONFIG_SYSFS)
15750     + { SUSPEND2_ATTR("default_console_level", SYSFS_RW),
15751     + SYSFS_INT(&suspend_default_console_level, 0, 7, 0)
15752     + },
15753     +
15754     + { SUSPEND2_ATTR("debug_sections", SYSFS_RW),
15755     + SYSFS_UL(&suspend_debug_state, 0, 1 << 30, 0)
15756     + },
15757     +
15758     + { SUSPEND2_ATTR("log_everything", SYSFS_RW),
15759     + SYSFS_BIT(&suspend_action, SUSPEND_LOGALL, 0)
15760     + },
15761     +#endif
15762     + { SUSPEND2_ATTR("pm_prepare_console", SYSFS_RW),
15763     + SYSFS_BIT(&suspend_action, SUSPEND_PM_PREPARE_CONSOLE, 0)
15764     + }
15765     +};
15766     +
15767     +static struct suspend_module_ops userui_ops = {
15768     + .type = MISC_MODULE,
15769     + .name = "Basic User Interface",
15770     + .directory = "user_interface",
15771     + .module = THIS_MODULE,
15772     + .sysfs_data = sysfs_params,
15773     + .num_sysfs_entries = sizeof(sysfs_params) / sizeof(struct suspend_sysfs_data),
15774     +};
15775     +
15776     +int s2_register_ui_ops(struct ui_ops *this_ui)
15777     +{
15778     + if (s2_current_ui) {
15779     + printk("Only one Suspend2 user interface module can be loaded"
15780     + " at a time.\n");
15781     + return -EBUSY;
15782     + }
15783     +
15784     + s2_current_ui = this_ui;
15785     +
15786     + return 0;
15787     +}
15788     +
15789     +void s2_remove_ui_ops(struct ui_ops *this_ui)
15790     +{
15791     + if (s2_current_ui != this_ui)
15792     + return;
15793     +
15794     + s2_current_ui = NULL;
15795     +}
15796     +
15797     +/* s2_ui_init
15798     + * Description: Boot time initialisation for user interface.
15799     + */
15800     +
15801     +int s2_ui_init(void)
15802     +{
15803     + return suspend_register_module(&userui_ops);
15804     +}
15805     +
15806     +void s2_ui_exit(void)
15807     +{
15808     + suspend_unregister_module(&userui_ops);
15809     +}
15810     +
15811     +#ifdef CONFIG_SUSPEND2_EXPORTS
15812     +EXPORT_SYMBOL_GPL(s2_current_ui);
15813     +EXPORT_SYMBOL_GPL(suspend_early_boot_message);
15814     +EXPORT_SYMBOL_GPL(s2_register_ui_ops);
15815     +EXPORT_SYMBOL_GPL(s2_remove_ui_ops);
15816     +EXPORT_SYMBOL_GPL(suspend_default_console_level);
15817     +#endif
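The single-slot registration done by s2_register_ui_ops() and s2_remove_ui_ops() in ui.c can be tried outside the kernel. What follows is a minimal userspace sketch, not the patch's code: `struct ui_ops_sketch`, `register_ui_sketch()` and `remove_ui_sketch()` are hypothetical names, and only the behaviour visible in the hunk is modelled (a second registration is refused with -EBUSY, and removal by a non-owner is ignored).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ui_ops; only its identity matters here. */
struct ui_ops_sketch {
	const char *name;
};

/* Single slot, playing the role of s2_current_ui in the patch. */
static struct ui_ops_sketch *current_ui_sketch;

/* Returns 0 on success, -EBUSY if another UI already holds the slot. */
static int register_ui_sketch(struct ui_ops_sketch *ui)
{
	assert(ui != NULL);	/* assumption: callers never pass NULL */

	if (current_ui_sketch)
		return -EBUSY;

	current_ui_sketch = ui;
	return 0;
}

/* Clears the slot only when called by the UI that owns it. */
static void remove_ui_sketch(struct ui_ops_sketch *ui)
{
	if (current_ui_sketch != ui)
		return;

	current_ui_sketch = NULL;
}
```

In this sketch, registering a second UI while the first is still installed yields -EBUSY, and the slot becomes free again only after the owning UI removes itself. Like the hunk, the sketch takes no lock around the slot, so it assumes registration and removal are never concurrent.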
15818     diff --git a/kernel/power/ui.h b/kernel/power/ui.h
15819     new file mode 100644
15820     index 0000000..2d1034c
15821     --- /dev/null
15822     +++ b/kernel/power/ui.h
15823     @@ -0,0 +1,108 @@
15824     +/*
15825     + * kernel/power/ui.h
15826     + *
15827     + * Copyright (C) 2004-2007 Nigel Cunningham (nigel at suspend2 net)
15828     + */
15829     +
15830     +enum {
15831     + DONT_CLEAR_BAR,
15832     + CLEAR_BAR
15833     +};
15834     +
15835     +enum {
15836     + /* Userspace -> Kernel */
15837     + USERUI_MSG_ABORT = 0x11,
15838     + USERUI_MSG_SET_STATE = 0x12,
15839     + USERUI_MSG_GET_STATE = 0x13,
15840     + USERUI_MSG_GET_DEBUG_STATE = 0x14,
15841     + USERUI_MSG_SET_DEBUG_STATE = 0x15,
15842     + USERUI_MSG_SPACE = 0x18,
15843     + USERUI_MSG_GET_POWERDOWN_METHOD = 0x1A,
15844     + USERUI_MSG_SET_POWERDOWN_METHOD = 0x1B,
15845     + USERUI_MSG_GET_LOGLEVEL = 0x1C,
15846     + USERUI_MSG_SET_LOGLEVEL = 0x1D,
15847     +
15848     + /* Kernel -> Userspace */
15849     + USERUI_MSG_MESSAGE = 0x21,
15850     + USERUI_MSG_PROGRESS = 0x22,
15851     + USERUI_MSG_REDRAW = 0x25,
15852     +
15853     + USERUI_MSG_MAX,
15854     +};
15855     +
15856     +struct userui_msg_params {
15857     + unsigned long a, b, c, d;
15858     + char text[255];
15859     +};
15860     +
15861     +struct ui_ops {
15862     + char (*wait_for_key) (int timeout);
15863     + unsigned long (*update_status) (unsigned long value,
15864     + unsigned long maximum, const char *fmt, ...);
15865     + void (*prepare_status) (int clearbar, const char *fmt, ...);
15866     + void (*cond_pause) (int pause, char *message);
15867     + void (*abort)(int result_code, const char *fmt, ...);
15868     + void (*prepare)(void);
15869     + void (*cleanup)(void);
15870     + void (*redraw)(void);
15871     + void (*message)(unsigned long section, unsigned long level,
15872     + int normally_logged, const char *fmt, ...);
15873     +};
15874     +
15875     +extern struct ui_ops *s2_current_ui;
15876     +
15877     +#define suspend_update_status(val, max, fmt, args...) \
15878     + (s2_current_ui ? (s2_current_ui->update_status) (val, max, fmt, ##args) : max)
15879     +
15880     +#define suspend_wait_for_keypress(timeout) \
15881     + (s2_current_ui ? (s2_current_ui->wait_for_key) (timeout) : 0)
15882     +
15883     +#define suspend_ui_redraw(void) \
15884     + do { if (s2_current_ui) \
15885     + (s2_current_ui->redraw)(); \
15886     + } while(0)
15887     +
15888     +#define suspend_prepare_console(void) \
15889     + do { if (s2_current_ui) \
15890     + (s2_current_ui->prepare)(); \
15891     + } while(0)
15892     +
15893     +#define suspend_cleanup_console(void) \
15894     + do { if (s2_current_ui) \
15895     + (s2_current_ui->cleanup)(); \
15896     + } while(0)
15897     +
15898     +#define abort_suspend(result, fmt, args...) \
15899     + do { if (s2_current_ui) \
15900     + (s2_current_ui->abort)(result, fmt, ##args); \
15901     + else { \
15902     + set_result_state(SUSPEND_ABORTED); \
15903     + set_result_state(result); \
15904     + } \
15905     + } while(0)
15906     +
15907     +#define suspend_cond_pause(pause, message) \
15908     + do { if (s2_current_ui) \
15909     + (s2_current_ui->cond_pause)(pause, message); \
15910     + } while(0)
15911     +
15912     +#define suspend_prepare_status(clear, fmt, args...) \
15913     + do { if (s2_current_ui) \
15914     + (s2_current_ui->prepare_status)(clear, fmt, ##args); \
15915     + else \
15916     + printk(fmt, ##args); \
15917     + } while(0)
15918     +
15919     +extern int suspend_default_console_level;
15920     +
15921     +#define suspend_message(sn, lev, log, fmt, a...) \
15922     +do { \
15923     + if (s2_current_ui && (!sn || test_debug_state(sn))) \
15924     + s2_current_ui->message(sn, lev, log, fmt, ##a); \
15925     +} while(0)
15926     +
15927     +__exit void suspend_ui_cleanup(void);
15928     +extern int s2_ui_init(void);
15929     +extern void s2_ui_exit(void);
15930     +extern int s2_register_ui_ops(struct ui_ops *this_ui);
15931     +extern void s2_remove_ui_ops(struct ui_ops *this_ui);
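The wrapper macros in ui.h all follow one pattern: call through s2_current_ui when a user interface is registered, otherwise fall back to a fixed result. The sketch below shows that pattern in plain userspace C under simplifying assumptions: the ops table is cut down to two hooks, the variadic format arguments are dropped, and every name ending in `_sketch` or starting with `demo_` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced ops table: two of the hooks from struct ui_ops. */
struct ui_ops_sketch {
	char (*wait_for_key)(int timeout);
	unsigned long (*update_status)(unsigned long value,
				       unsigned long maximum);
};

static struct ui_ops_sketch *current_ui_sketch;

/* With no UI registered, report the maximum, as suspend_update_status does. */
#define update_status_sketch(val, max) \
	(current_ui_sketch ? current_ui_sketch->update_status((val), (max)) : (max))

/* With no UI registered, behave as if the wait timed out (result 0). */
#define wait_for_keypress_sketch(timeout) \
	(current_ui_sketch ? current_ui_sketch->wait_for_key(timeout) : 0)

/*
 * The macros test only the table pointer, never the individual hooks,
 * so this sketch insists that a backend fills in every hook it exposes.
 */
static void install_ui_sketch(struct ui_ops_sketch *ui)
{
	assert(ui && ui->wait_for_key && ui->update_status);
	current_ui_sketch = ui;
}

/* Example backend, used only for illustration. */
static char demo_wait_for_key(int timeout)
{
	(void)timeout;
	return 'c';
}

static unsigned long demo_update_status(unsigned long value,
					unsigned long maximum)
{
	(void)maximum;
	return value + 1;	/* pretend the next update is due one unit later */
}

static struct ui_ops_sketch demo_ui = {
	.wait_for_key = demo_wait_for_key,
	.update_status = demo_update_status,
};
```

Before `install_ui_sketch(&demo_ui)` is called, `update_status_sketch(3UL, 10UL)` evaluates to 10 and `wait_for_keypress_sketch(25)` to 0; afterwards the same calls reach the demo backend. The zero fallback is what lets suspend_early_boot_message() treat a missing UI the same way as a timeout.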
15932     diff --git a/kernel/printk.c b/kernel/printk.c
15933     index 4b47e59..6cafd3b 100644
15934     --- a/kernel/printk.c
15935     +++ b/kernel/printk.c
15936     @@ -32,6 +32,7 @@
15937     #include <linux/bootmem.h>
15938     #include <linux/syscalls.h>
15939     #include <linux/jiffies.h>
15940     +#include <linux/suspend.h>
15941    
15942     #include <asm/uaccess.h>
15943    
15944     @@ -92,9 +93,9 @@ static DEFINE_SPINLOCK(logbuf_lock);
15945     * The indices into log_buf are not constrained to log_buf_len - they
15946     * must be masked before subscripting
15947     */
15948     -static unsigned long log_start; /* Index into log_buf: next char to be read by syslog() */
15949     -static unsigned long con_start; /* Index into log_buf: next char to be sent to consoles */
15950     -static unsigned long log_end; /* Index into log_buf: most-recently-written-char + 1 */
15951     +static unsigned long POSS_NOSAVE log_start; /* Index into log_buf: next char to be read by syslog() */
15952     +static unsigned long POSS_NOSAVE con_start; /* Index into log_buf: next char to be sent to consoles */
15953     +static unsigned long POSS_NOSAVE log_end; /* Index into log_buf: most-recently-written-char + 1 */
15954    
15955     /*
15956     * Array of consoles built from command line options (console=)
15957     @@ -117,10 +118,10 @@ static int console_may_schedule;
15958    
15959     #ifdef CONFIG_PRINTK
15960    
15961     -static char __log_buf[__LOG_BUF_LEN];
15962     -static char *log_buf = __log_buf;
15963     -static int log_buf_len = __LOG_BUF_LEN;
15964     -static unsigned long logged_chars; /* Number of chars produced since last read+clear operation */
15965     +static char POSS_NOSAVE __log_buf[__LOG_BUF_LEN];
15966     +static char POSS_NOSAVE *log_buf = __log_buf;
15967     +static int POSS_NOSAVE log_buf_len = __LOG_BUF_LEN;
15968     +static unsigned long POSS_NOSAVE logged_chars; /* Number of chars produced since last read+clear operation */
15969    
15970     static int __init log_buf_len_setup(char *str)
15971     {
15972     @@ -739,12 +740,14 @@ void suspend_console(void)
15973     acquire_console_sem();
15974     console_suspended = 1;
15975     }
15976     +EXPORT_SYMBOL(suspend_console);
15977    
15978     void resume_console(void)
15979     {
15980     console_suspended = 0;
15981     release_console_sem();
15982     }
15983     +EXPORT_SYMBOL(resume_console);
15984     #endif /* CONFIG_DISABLE_CONSOLE_SUSPEND */
15985    
15986     /**
15987     diff --git a/kernel/timer.c b/kernel/timer.c
15988     index dd6c2c1..aded05d 100644
15989     --- a/kernel/timer.c
15990     +++ b/kernel/timer.c
15991     @@ -1240,6 +1240,38 @@ unsigned long avenrun[3];
15992    
15993     EXPORT_SYMBOL(avenrun);
15994    
15995     +static unsigned long avenrun_save[3];
15996     +/*
15997     + * save_avenrun - Record the values prior to starting a hibernation cycle.
15998     + * We do this to make the work done in hibernation invisible to userspace
15999     + * post-suspend. Some programs, including some MTAs, watch the load average
16000     + * and stop work until it lowers. Without this, they would stop working for
16001     + * a while post-resume, unnecessarily.
16002     + */
16003     +
16004     +void save_avenrun(void)
16005     +{
16006     + avenrun_save[0] = avenrun[0];
16007     + avenrun_save[1] = avenrun[1];
16008     + avenrun_save[2] = avenrun[2];
16009     +}
16010     +
16011     +EXPORT_SYMBOL_GPL(save_avenrun);
16012     +
16013     +void restore_avenrun(void)
16014     +{
16015     + if (!avenrun_save[0])
16016     + return;
16017     +
16018     + avenrun[0] = avenrun_save[0];
16019     + avenrun[1] = avenrun_save[1];
16020     + avenrun[2] = avenrun_save[2];
16021     +
16022     + avenrun_save[0] = 0;
16023     +}
16024     +
16025     +EXPORT_SYMBOL_GPL(restore_avenrun);
16026     +
16027     /*
16028     * calc_load - given tick count, update the avenrun load estimates.
16029     * This is called while holding a write_lock on xtime_lock.
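The save_avenrun()/restore_avenrun() pair added to kernel/timer.c can be modelled in userspace to see what a hibernation cycle does to the load averages. This is a sketch only: the `_sketch` arrays stand in for the kernel's avenrun[] and the patch's avenrun_save[], and `hibernate_cycle_sketch()` is a hypothetical helper that fakes the load generated while the image is written.

```c
#include <assert.h>

/* Stand-ins for the kernel's avenrun[] and the patch's avenrun_save[]. */
static unsigned long avenrun_sketch[3];
static unsigned long avenrun_save_sketch[3];

static void save_avenrun_sketch(void)
{
	int i;

	for (i = 0; i < 3; i++)
		avenrun_save_sketch[i] = avenrun_sketch[i];
}

static void restore_avenrun_sketch(void)
{
	int i;

	/* A zero first slot is treated as "nothing saved", as in the patch. */
	if (!avenrun_save_sketch[0])
		return;

	for (i = 0; i < 3; i++)
		avenrun_sketch[i] = avenrun_save_sketch[i];

	avenrun_save_sketch[0] = 0;
}

/* Simulates one cycle: the load spike caused by hibernating is discarded. */
static void hibernate_cycle_sketch(unsigned long spike)
{
	save_avenrun_sketch();
	avenrun_sketch[0] += spike;	/* work done while writing the image */
	restore_avenrun_sketch();

	/* Holds whether or not anything was actually restored. */
	assert(avenrun_save_sketch[0] == 0);
}
```

One consequence of using the first slot as the "saved" marker, visible in the sketch, is that a one-minute load average of exactly 0 at save time cannot be told apart from "nothing saved", so the restore is skipped in that case.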
16030     diff --git a/lib/Kconfig b/lib/Kconfig
16031     index 3842499..758a928 100644
16032     --- a/lib/Kconfig
16033     +++ b/lib/Kconfig
16034     @@ -47,6 +47,9 @@ config AUDIT_GENERIC
16035     depends on AUDIT && !AUDIT_ARCH
16036     default y
16037    
16038     +config DYN_PAGEFLAGS
16039     + bool
16040     +
16041     #
16042     # compression support is select'ed if needed
16043     #
16044     diff --git a/lib/Makefile b/lib/Makefile
16045     index 992a39e..b974998 100644
16046     --- a/lib/Makefile
16047     +++ b/lib/Makefile
16048     @@ -38,6 +38,9 @@ ifneq ($(CONFIG_HAVE_DEC_LOCK),y)
16049     endif
16050    
16051     obj-$(CONFIG_BITREVERSE) += bitrev.o
16052     +
16053     +obj-$(CONFIG_DYN_PAGEFLAGS) += dyn_pageflags.o
16054     +
16055     obj-$(CONFIG_CRC_CCITT) += crc-ccitt.o
16056     obj-$(CONFIG_CRC16) += crc16.o
16057     obj-$(CONFIG_CRC32) += crc32.o
16058     diff --git a/lib/dyn_pageflags.c b/lib/dyn_pageflags.c
16059     new file mode 100644
16060     index 0000000..963ac5c
16061     --- /dev/null
16062     +++ b/lib/dyn_pageflags.c
16063     @@ -0,0 +1,312 @@
16064     +/*
16065     + * lib/dyn_pageflags.c
16066     + *
16067     + * Copyright (C) 2004-2006 Nigel Cunningham <nigel@suspend2.net>
16068     + *
16069     + * This file is released under the GPLv2.
16070     + *
16071     + * Routines for dynamically allocating and releasing bitmaps
16072     + * used as pseudo-pageflags.
16073     + */
16074     +
16075     +#include <linux/module.h>
16076     +#include <linux/dyn_pageflags.h>
16077     +#include <linux/bootmem.h>
16078     +#include <linux/mm.h>
16079     +
16080     +#if 0
16081     +#define PR_DEBUG(a, b...) do { printk(a, ##b); } while(0)
16082     +#else
16083     +#define PR_DEBUG(a, b...) do { } while(0)
16084     +#endif
16085     +
16086     +#define pages_for_zone(zone) \
16087     + (DIV_ROUND_UP((zone)->spanned_pages, (PAGE_SIZE << 3)))
16088     +
16089     +/*
16090     + * clear_dyn_pageflags(dyn_pageflags_t pagemap)
16091     + *
16092     + * Clear an array used to store local page flags.
16093     + *
16094     + */
16095     +
16096     +void clear_dyn_pageflags(dyn_pageflags_t pagemap)
16097     +{
16098     + int i = 0, zone_idx, node_id = 0;
16099     + struct zone *zone;
16100     + struct pglist_data *pgdat;
16101     +
16102     + BUG_ON(!pagemap);
16103     +
16104     + for_each_online_pgdat(pgdat) {
16105     + for (zone_idx = 0; zone_idx < MAX_NR_ZONES; zone_idx++) {
16106     + zone = &pgdat->node_zones[zone_idx];
16107     +
16108     + if (!populated_zone(zone))
16109     + continue;
16110     +
16111     + for (i = 0; i < pages_for_zone(zone); i++)
16112     + memset((pagemap[node_id][zone_idx][i]), 0,
16113     + PAGE_SIZE);
16114     + }
16115     + node_id++;
16116     + }
16117     +}
16118     +
16119     +/*
16120     + * free_dyn_pageflags(dyn_pageflags_t *pagemap)
16121     + *
16122     + * Free a dynamically allocated pageflags bitmap. For Suspend2 usage, we
16123     + * support data being relocated from slab to pages that don't conflict
16124     + * with the image that will be copied back. This is the reason for the
16125     + * PageSlab tests below.
16126     + *
16127     + */
16128     +void free_dyn_pageflags(dyn_pageflags_t *pagemap)
16129     +{
16130     + int i = 0, zone_pages, node_id = -1, zone_idx;
16131     + struct zone *zone;
16132     + struct pglist_data *pgdat;
16133     +
16134     + if (!*pagemap)
16135     + return;
16136     +
16137     + PR_DEBUG("Seeking to free dyn_pageflags %p.\n", pagemap);
16138     +
16139     + for_each_online_pgdat(pgdat) {
16140     + node_id++;
16141     +
16142     + PR_DEBUG("Node id %d.\n", node_id);
16143     +
16144     + if (!(*pagemap)[node_id]) {
16145     + PR_DEBUG("Node %d unallocated.\n", node_id);
16146     + continue;
16147     + }
16148     +
16149     + for (zone_idx = 0; zone_idx < MAX_NR_ZONES; zone_idx++) {
16150     + zone = &pgdat->node_zones[zone_idx];
16151     + if (!populated_zone(zone)) {
16152     + PR_DEBUG("Node %d zone %d unpopulated.\n", node_id, zone_idx);
16153     + continue;
16154     + }
16155     +
16156     + if (!(*pagemap)[node_id][zone_idx]) {
16157     + PR_DEBUG("Node %d zone %d unallocated.\n", node_id, zone_idx);
16158     + continue;
16159     + }
16160     +
16161     + PR_DEBUG("Node id %d. Zone %d.\n", node_id, zone_idx);
16162     +
16163     + zone_pages = pages_for_zone(zone);
16164     +
16165     + for (i = 0; i < zone_pages; i++) {
16166     + PR_DEBUG("Node id %d. Zone %d. Page %d.\n", node_id, zone_idx, i);
16167     + free_page((unsigned long)(*pagemap)[node_id][zone_idx][i]);
16168     + }
16169     +
16170     + kfree((*pagemap)[node_id][zone_idx]);
16171     + }
16172     + PR_DEBUG("Free node %d (%p).\n", node_id, (*pagemap)[node_id]);
16173     + kfree((*pagemap)[node_id]);
16174     + }
16175     +
16176     + PR_DEBUG("Free map pgdat list at %p.\n", pagemap);
16177     + kfree(*pagemap);
16178     +
16179     + *pagemap = NULL;
16180     + PR_DEBUG("Done.\n");
16181     + return;
16182     +}
16183     +
16184     +static int try_alloc_dyn_pageflag_part(int nr_ptrs, void **ptr)
16185     +{
16186     + *ptr = kzalloc(sizeof(void *) * nr_ptrs, GFP_ATOMIC);
16187     + PR_DEBUG("Got %p. Putting it in %p.\n", *ptr, ptr);
16188     +
16189     + if (*ptr)
16190     + return 0;
16191     +
16192     + printk("Error. Unable to allocate memory for dynamic pageflags.\n");
16193     + return -ENOMEM;
16194     +}
16195     +
16196     +/*
16197     + * allocate_dyn_pageflags
16198     + *
16199     + * Allocate a bitmap for dynamic page flags.
16200     + *
16201     + */
16202     +int allocate_dyn_pageflags(dyn_pageflags_t *pagemap)
16203     +{
16204     + int i, zone_idx, zone_pages, node_id = 0;
16205     + struct zone *zone;
16206     + struct pglist_data *pgdat;
16207     +
16208     + if (*pagemap) {
16209     + PR_DEBUG("Pagemap %p already allocated.\n", pagemap);
16210     + return 0;
16211     + }
16212     +
16213     + PR_DEBUG("Seeking to allocate dyn_pageflags %p.\n", pagemap);
16214     +
16215     + for_each_online_pgdat(pgdat)
16216     + node_id++;
16217     +
16218     + if (try_alloc_dyn_pageflag_part(node_id, (void **) pagemap))
16219     + return -ENOMEM;
16220     +
16221     + node_id = 0;
16222     +
16223     + for_each_online_pgdat(pgdat) {
16224     + PR_DEBUG("Node %d.\n", node_id);
16225     +
16226     + if (try_alloc_dyn_pageflag_part(MAX_NR_ZONES,
16227     + (void **) &(*pagemap)[node_id]))
16228     + return -ENOMEM;
16229     +
16230     + for (zone_idx = 0; zone_idx < MAX_NR_ZONES; zone_idx++) {
16231     + PR_DEBUG("Zone %d of %d.\n", zone_idx, MAX_NR_ZONES);
16232     +
16233     + zone = &pgdat->node_zones[zone_idx];
16234     +
16235     + if (!populated_zone(zone)) {
16236     + PR_DEBUG("Node %d zone %d unpopulated - won't allocate.\n", node_id, zone_idx);
16237     + continue;
16238     + }
16239     +
16240     + zone_pages = pages_for_zone(zone);
16241     +
16242     + PR_DEBUG("Node %d zone %d (needs %d pages).\n", node_id, zone_idx, zone_pages);
16243     +
16244     + if (try_alloc_dyn_pageflag_part(zone_pages,
16245     + (void **) &(*pagemap)[node_id][zone_idx]))
16246     + return -ENOMEM;
16247     +
16248     + for (i = 0; i < zone_pages; i++) {
16249     + unsigned long address = get_zeroed_page(GFP_ATOMIC);
16250     + if (!address) {
16251     + PR_DEBUG("Error. Unable to allocate memory for "
16252     + "dynamic pageflags.");
16253     + free_dyn_pageflags(pagemap);
16254     + return -ENOMEM;
16255     + }
16256     + PR_DEBUG("Node %d zone %d. Page %d.\n", node_id, zone_idx, i);
16257     + (*pagemap)[node_id][zone_idx][i] =
16258     + (unsigned long *) address;
16259     + }
16260     + }
16261     + node_id++;
16262     + }
16263     +
16264     + PR_DEBUG("Done.\n");
16265     + return 0;
16266     +}
16267     +
16268     +#define GET_BIT_AND_UL(bitmap, page) \
16269     + struct zone *zone = page_zone(page); \
16270     + unsigned long zone_pfn = page_to_pfn(page) - zone->zone_start_pfn; \
16271     + int node = page_to_nid(page); \
16272     + int zone_num = zone_idx(zone); \
16273     + int pagenum = PAGENUMBER(zone_pfn); \
16274     + int page_offset = PAGEINDEX(zone_pfn); \
16275     + unsigned long *ul = ((*bitmap)[node][zone_num][pagenum]) + page_offset; \
16276     + int bit = PAGEBIT(zone_pfn);
16277     +
16278     +/*
16279     + * test_dynpageflag(dyn_pageflags_t *bitmap, struct page *page)
16280     + *
16281     + * Is the page flagged in the given bitmap?
16282     + *
16283     + */
16284     +
16285     +int test_dynpageflag(dyn_pageflags_t *bitmap, struct page *page)
16286     +{
16287     + GET_BIT_AND_UL(bitmap, page);
16288     + return test_bit(bit, ul);
16289     +}
16290     +
16291     +/*
16292     + * set_dynpageflag(dyn_pageflags_t *bitmap, struct page *page)
16293     + *
16294     + * Set the flag for the page in the given bitmap.
16295     + *
16296     + */
16297     +
16298     +void set_dynpageflag(dyn_pageflags_t *bitmap, struct page *page)
16299     +{
16300     + GET_BIT_AND_UL(bitmap, page);
16301     + set_bit(bit, ul);
16302     +}
16303     +
16304     +/*
16305     + * clear_dynpageflag(dyn_pageflags_t *bitmap, struct page *page)
16306     + *
16307     + * Clear the flag for the page in the given bitmap.
16308     + *
16309     + */
16310     +
16311     +void clear_dynpageflag(dyn_pageflags_t *bitmap, struct page *page)
16312     +{
16313     + GET_BIT_AND_UL(bitmap, page);
16314     + clear_bit(bit, ul);
16315     +}
16316     +
16317     +/*
16318     + * get_next_bit_on(dyn_pageflags_t bitmap, unsigned long counter)
16319     + *
16320     + * Given a pfn (or max_pfn + 1 to start afresh), find the next pfn in the
16321     + * bitmap that is set. If there are no more flags set, return max_pfn + 1.
16322     + *
16323     + */
16324     +
16325     +unsigned long get_next_bit_on(dyn_pageflags_t bitmap, unsigned long counter)
16326     +{
16327     + struct page *page;
16328     + struct zone *zone;
16329     + unsigned long *ul = NULL;
16330     + unsigned long zone_offset;
16331     + int pagebit, zone_num, first = (counter == (max_pfn + 1)), node;
16332     +
16333     + if (first)
16334     + counter = first_online_pgdat()->node_zones->zone_start_pfn;
16335     +
16336     + page = pfn_to_page(counter);
16337     + zone = page_zone(page);
16338     + node = zone->zone_pgdat->node_id;
16339     + zone_num = zone_idx(zone);
16340     + zone_offset = counter - zone->zone_start_pfn;
16341     +
16342     + if (first)
16343     + goto test;
16344     +
16345     + do {
16346     + zone_offset++;
16347     +
16348     + if (zone_offset >= zone->spanned_pages) {
16349     + do {
16350     + zone = next_zone(zone);
16351     + if (!zone)
16352     + return max_pfn + 1;
16353     + } while(!zone->spanned_pages);
16354     +
16355     + zone_num = zone_idx(zone);
16356     + node = zone->zone_pgdat->node_id;
16357     + zone_offset = 0;
16358     + }
16359     +test:
16360     + pagebit = PAGEBIT(zone_offset);
16361     +
16362     + if (!pagebit || !ul)
16363     + ul = (bitmap[node][zone_num][PAGENUMBER(zone_offset)])
16364     + + PAGEINDEX(zone_offset);
16365     +
16366     + if (!(*ul & ~((1UL << pagebit) - 1))) {
16367     + zone_offset += BITS_PER_LONG - pagebit - 1;
16368     + continue;
16369     + }
16370     +
16371     + } while(!test_bit(pagebit, ul));
16372     +
16373     + return zone->zone_start_pfn + zone_offset;
16374     +}
16375     +
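The bitmap in lib/dyn_pageflags.c is indexed per node and per zone, and within a zone by bitmap page, word and bit. The sketch below reproduces that addressing for one zone in userspace. It rests on stated assumptions: the PAGENUMBER(), PAGEINDEX() and PAGEBIT() macros are defined outside this hunk, so the `SK_` macros are hypothetical equivalents; a 4096-byte page is assumed; and the node and zone levels are left out. `flag_next()` imitates the sentinel convention of get_next_bit_on(), using the zone size where the kernel code uses max_pfn + 1.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define SK_PAGE_SIZE		4096UL	/* assumed page size in bytes */
#define SK_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SK_BITS_PER_PAGE	(SK_PAGE_SIZE * CHAR_BIT)
#define SK_LONGS_PER_PAGE	(SK_PAGE_SIZE / sizeof(unsigned long))

/* Hypothetical equivalents of PAGENUMBER(), PAGEINDEX() and PAGEBIT(). */
#define SK_PAGENUMBER(off)	((off) / SK_BITS_PER_PAGE)
#define SK_PAGEINDEX(off)	(((off) % SK_BITS_PER_PAGE) / SK_BITS_PER_LONG)
#define SK_PAGEBIT(off)		((off) % SK_BITS_PER_LONG)

/* One zone's flags: an array of page-sized blocks of unsigned longs. */
struct zone_flags_sketch {
	unsigned long nr_pfns;		/* pages covered by the zone */
	unsigned long nr_blocks;	/* bitmap pages allocated */
	unsigned long **blocks;
};

static void zone_flags_free(struct zone_flags_sketch *z)
{
	unsigned long i;

	if (!z->blocks)
		return;
	for (i = 0; i < z->nr_blocks; i++)
		free(z->blocks[i]);
	free(z->blocks);
	z->blocks = NULL;
}

static int zone_flags_alloc(struct zone_flags_sketch *z, unsigned long nr_pfns)
{
	unsigned long i;

	z->nr_pfns = nr_pfns;
	z->nr_blocks = (nr_pfns + SK_BITS_PER_PAGE - 1) / SK_BITS_PER_PAGE;
	z->blocks = calloc(z->nr_blocks, sizeof(*z->blocks));
	if (!z->blocks)
		return -1;
	for (i = 0; i < z->nr_blocks; i++) {
		z->blocks[i] = calloc(SK_LONGS_PER_PAGE, sizeof(unsigned long));
		if (!z->blocks[i]) {
			zone_flags_free(z);	/* undo the partial allocation */
			return -1;
		}
	}
	return 0;
}

/* Locate the word holding the flag for zone-relative offset 'off'. */
static unsigned long *flag_word(struct zone_flags_sketch *z, unsigned long off)
{
	assert(off < z->nr_pfns);
	return z->blocks[SK_PAGENUMBER(off)] + SK_PAGEINDEX(off);
}

static void flag_set(struct zone_flags_sketch *z, unsigned long off)
{
	*flag_word(z, off) |= 1UL << SK_PAGEBIT(off);
}

static void flag_clear(struct zone_flags_sketch *z, unsigned long off)
{
	*flag_word(z, off) &= ~(1UL << SK_PAGEBIT(off));
}

static int flag_test(struct zone_flags_sketch *z, unsigned long off)
{
	return (int)((*flag_word(z, off) >> SK_PAGEBIT(off)) & 1UL);
}

/*
 * Next set offset after 'off'. Passing nr_pfns starts the scan at offset 0
 * (inclusive); nr_pfns is also returned when no further flag is set.
 */
static unsigned long flag_next(struct zone_flags_sketch *z, unsigned long off)
{
	off = (off >= z->nr_pfns) ? 0 : off + 1;
	for (; off < z->nr_pfns; off++)
		if (flag_test(z, off))
			return off;
	return z->nr_pfns;
}
```

A caller would walk all flagged pages with `for (off = flag_next(&z, z.nr_pfns); off < z.nr_pfns; off = flag_next(&z, off))`. The sketch scans bit by bit for clarity; the kernel routine additionally skips a whole word when no bit at or above the current position is set.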
16376     diff --git a/lib/vsprintf.c b/lib/vsprintf.c
16377     index b025864..2138c47 100644
16378     --- a/lib/vsprintf.c
16379     +++ b/lib/vsprintf.c
16380     @@ -236,6 +236,29 @@ static char * number(char * buf, char * end, unsigned long long num, int base, i
16381     return buf;
16382     }
16383    
16384     +/*
16385     + * snprintf_used
16386     + *
16387     + * Functionality : Print a string with parameters to a buffer of a
16388     + * limited size. Unlike vsnprintf, we return the number
16389     + * of bytes actually put in the buffer, not the number
16390     + * that would have been put in if it was big enough.
16391     + */
16392     +int snprintf_used(char *buffer, int buffer_size, const char *fmt, ...)
16393     +{
16394     + int result;
16395     + va_list args;
16396     +
16397     + if (!buffer_size)
16398     + return 0;
16399     +
16400     + va_start(args, fmt);
16401     + result = vsnprintf(buffer, buffer_size, fmt, args);
16402     + va_end(args);
16403     +
16404     + return result > buffer_size ? buffer_size : result;
16405     +}
16406     +
16407     /**
16408     * vsnprintf - Format a string and place it in a buffer
16409     * @buf: The buffer to place the result into
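The difference between snprintf_used() and plain vsnprintf() shows up only when the output is truncated. The userspace sketch below restates the same clamp on top of the C library's vsnprintf(); `snprintf_used_sketch()` is a hypothetical name, and the precondition assert is an addition of the sketch, not something the patch checks.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Cap vsnprintf's "would have written" result at the buffer size. */
static int snprintf_used_sketch(char *buffer, int buffer_size,
				const char *fmt, ...)
{
	int result;
	va_list args;

	assert(buffer != NULL && buffer_size >= 0);	/* sketch-only check */

	if (!buffer_size)
		return 0;

	va_start(args, fmt);
	result = vsnprintf(buffer, (size_t)buffer_size, fmt, args);
	va_end(args);

	return result > buffer_size ? buffer_size : result;
}
```

With an 8-byte buffer, formatting "abc" returns 3, while formatting a 10-character string returns 8 even though only 7 characters and the terminating NUL were stored. In the truncated case the clamped value therefore counts the NUL, which a caller advancing a write pointer by the return value may want to keep in mind.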
16410     diff --git a/mm/vmscan.c b/mm/vmscan.c
16411     index db023e2..54acc2c 100644
16412     --- a/mm/vmscan.c
16413     +++ b/mm/vmscan.c
16414     @@ -654,6 +654,28 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
16415     return nr_taken;
16416     }
16417    
16418     +/* return_lru_pages puts a list of pages back on a zone's lru lists. */
16419     +
16420     +static void return_lru_pages(struct list_head *page_list, struct zone *zone,
16421     + struct pagevec *pvec)
16422     +{
16423     + while (!list_empty(page_list)) {
16424     + struct page *page = lru_to_page(page_list);
16425     + VM_BUG_ON(PageLRU(page));
16426     + SetPageLRU(page);
16427     + list_del(&page->lru);
16428     + if (PageActive(page))
16429     + add_page_to_active_list(zone, page);
16430     + else
16431     + add_page_to_inactive_list(zone, page);
16432     + if (!pagevec_add(pvec, page)) {
16433     + spin_unlock_irq(&zone->lru_lock);
16434     + __pagevec_release(pvec);
16435     + spin_lock_irq(&zone->lru_lock);
16436     + }
16437     + }
16438     +}
16439     +
16440     /*
16441     * shrink_inactive_list() is a helper for shrink_zone(). It returns the number
16442     * of reclaimed pages
16443     @@ -671,7 +693,6 @@ static unsigned long shrink_inactive_list(unsigned long max_scan,
16444     lru_add_drain();
16445     spin_lock_irq(&zone->lru_lock);
16446     do {
16447     - struct page *page;
16448     unsigned long nr_taken;
16449     unsigned long nr_scan;
16450     unsigned long nr_freed;
16451     @@ -701,21 +722,7 @@ static unsigned long shrink_inactive_list(unsigned long max_scan,
16452     /*
16453     * Put back any unfreeable pages.
16454     */
16455     - while (!list_empty(&page_list)) {
16456     - page = lru_to_page(&page_list);
16457     - VM_BUG_ON(PageLRU(page));
16458     - SetPageLRU(page);
16459     - list_del(&page->lru);
16460     - if (PageActive(page))
16461     - add_page_to_active_list(zone, page);
16462     - else
16463     - add_page_to_inactive_list(zone, page);
16464     - if (!pagevec_add(&pvec, page)) {
16465     - spin_unlock_irq(&zone->lru_lock);
16466     - __pagevec_release(&pvec);
16467     - spin_lock_irq(&zone->lru_lock);
16468     - }
16469     - }
16470     + return_lru_pages(&page_list, zone, &pvec);
16471     } while (nr_scanned < max_scan);
16472     spin_unlock(&zone->lru_lock);
16473     done:
16474     @@ -1276,6 +1283,72 @@ out:
16475     return nr_reclaimed;
16476     }
16477    
16478     +struct lru_save {
16479     + struct zone *zone;
16480     + struct list_head active_list;
16481     + struct list_head inactive_list;
16482     + struct lru_save *next;
16483     +};
16484     +
16485     +struct lru_save *lru_save_list;
16486     +
16487     +void unlink_lru_lists(void)
16488     +{
16489     + struct zone *zone;
16490     +
16491     + for_each_zone(zone) {
16492     + struct lru_save *this;
16493     + unsigned long moved, scanned;
16494     +
16495     + if (!zone->spanned_pages)
16496     + continue;
16497     +
16498     + this = (struct lru_save *)
16499     + kzalloc(sizeof(struct lru_save), GFP_ATOMIC);
16500     +
16501     + BUG_ON(!this);
16502     +
16503     + this->next = lru_save_list;
16504     + lru_save_list = this;
16505     +
16506     + this->zone = zone;
16507     +
16508     + spin_lock_irq(&zone->lru_lock);
16509     + INIT_LIST_HEAD(&this->active_list);
16510     + INIT_LIST_HEAD(&this->inactive_list);
16511     + moved = isolate_lru_pages(zone_page_state(zone, NR_ACTIVE),
16512     + &zone->active_list, &this->active_list,
16513     + &scanned);
16514     + __mod_zone_page_state(zone, NR_ACTIVE, -moved);
16515     + moved = isolate_lru_pages(zone_page_state(zone, NR_INACTIVE),
16516     + &zone->inactive_list, &this->inactive_list,
16517     + &scanned);
16518     + __mod_zone_page_state(zone, NR_INACTIVE, -moved);
16519     + spin_unlock_irq(&zone->lru_lock);
16520     + }
16521     +}
16522     +
16523     +void relink_lru_lists(void)
16524     +{
16525     + while (lru_save_list) {
16526     + struct lru_save *this = lru_save_list;
16527     + struct zone *zone = this->zone;
16528     + struct pagevec pvec;
16529     +
16530     + pagevec_init(&pvec, 1);
16531     +
16532     + lru_save_list = this->next;
16533     +
16534     + spin_lock_irq(&zone->lru_lock);
16535     + return_lru_pages(&this->active_list, zone, &pvec);
16536     + return_lru_pages(&this->inactive_list, zone, &pvec);
16537     + spin_unlock_irq(&zone->lru_lock);
16538     + pagevec_release(&pvec);
16539     +
16540     + kfree(this);
16541     + }
16542     +}
16543     +
16544     /*
16545     * The background pageout daemon, started as a kernel thread
16546     * from the init process.
16547     @@ -1323,8 +1396,6 @@ static int kswapd(void *p)
16548     for ( ; ; ) {
16549     unsigned long new_order;
16550    
16551     - try_to_freeze();
16552     -
16553     /* kswapd has been busy so delay watermark_timer */
16554     mod_timer(&pgdat->watermark_timer, jiffies + WT_EXPIRY);
16555     prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);
16556     @@ -1335,13 +1406,20 @@ static int kswapd(void *p)
16557     */
16558     order = new_order;
16559     } else {
16560     set_user_nice(tsk, 0);
16561     - schedule();
16562     + if (!freezing(current))
16563     + schedule();
16564     +
16565     order = pgdat->kswapd_max_order;
16566     }
16567     finish_wait(&pgdat->kswapd_wait, &wait);
16568    
16569     - balance_pgdat(pgdat, order);
16570     + if (!try_to_freeze()) {
16571     + /* We can speed up thawing tasks if we don't call
16572     + * balance_pgdat after returning from the refrigerator
16573     + */
16574     + balance_pgdat(pgdat, order);
16575     + }
16576     }
16577     return 0;
16578     }
16579     @@ -1355,6 +1433,9 @@ void wakeup_kswapd(struct zone *zone, int order)
16580     if (!populated_zone(zone))
16581     return;
16582    
16583     + if (freezer_is_on())
16584     + return;
16585     +
16586     pgdat = zone->zone_pgdat;
16587     if (zone_watermark_ok(zone, order, zone->pages_low, 0, 0))
16588     return;
16589     @@ -1368,6 +1449,91 @@ void wakeup_kswapd(struct zone *zone, int order)
16590     }
16591    
16592     #ifdef CONFIG_PM
16593     +void shrink_one_zone(struct zone *zone, int total_to_free)
16594     +{
16595     + int prio;
16596     + unsigned long still_to_free = total_to_free;
16597     + struct scan_control sc = {
16598     + .gfp_mask = GFP_KERNEL,
16599     + .may_swap = 0,
16600     + .may_writepage = 1,
16601     + .mapped = vm_mapped,
16602     + };
16603     +
16604     + if (!populated_zone(zone) || zone->all_unreclaimable)
16605     + return;
16606     +
16607     + if (total_to_free <= 0)
16608     + return;
16609     +
16610     + if (is_highmem(zone))
16611     + sc.gfp_mask |= __GFP_HIGHMEM;
16612     +
16613     + for (prio = DEF_PRIORITY; prio >= 0; prio--) {
16614     + unsigned long to_free, just_freed, orig_size;
16615     + unsigned long old_nr_active;
16616     +
16617     + to_free = min(zone_page_state(zone, NR_ACTIVE) +
16618     + zone_page_state(zone, NR_INACTIVE),
16619     + still_to_free);
16620     +
16621     + if (to_free <= 0)
16622     + return;
16623     +
16624     + sc.swap_cluster_max = to_free -
16625     + zone_page_state(zone, NR_INACTIVE);
16626     +
16627     + do {
16628     + old_nr_active = zone_page_state(zone, NR_ACTIVE);
16629     + zone->nr_scan_active = sc.swap_cluster_max - 1;
16630     + shrink_active_list(sc.swap_cluster_max, zone, &sc,
16631     + prio);
16632     + zone->nr_scan_active = 0;
16633     +
16634     + sc.swap_cluster_max = to_free - zone_page_state(zone,
16635     + NR_INACTIVE);
16636     +
16637     + } while (sc.swap_cluster_max > 0 &&
16638     + zone_page_state(zone, NR_ACTIVE) > old_nr_active);
16639     +
16640     + to_free = min(zone_page_state(zone, NR_ACTIVE) +
16641     + zone_page_state(zone, NR_INACTIVE),
16642     + still_to_free);
16643     +
16644     + do {
16645     + orig_size = zone_page_state(zone, NR_ACTIVE) +
16646     + zone_page_state(zone, NR_INACTIVE);
16647     + zone->nr_scan_inactive = to_free;
16648     + sc.swap_cluster_max = to_free;
16649     + shrink_inactive_list(to_free, zone, &sc);
16650     + just_freed = (orig_size -
16651     + (zone_page_state(zone, NR_ACTIVE) +
16652     + zone_page_state(zone, NR_INACTIVE)));
16653     + zone->nr_scan_inactive = 0;
16654     + still_to_free -= just_freed;
16655     + to_free -= just_freed;
16656     + } while (just_freed > 0 && still_to_free > 0);
16657     + }
16658     +
16659     + while (still_to_free > 0) {
16660     + unsigned long nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
16661     + struct reclaim_state reclaim_state = { .reclaimed_slab = 0 };
16662     +
16663     + if (nr_slab > still_to_free)
16664     + nr_slab = still_to_free;
16665     + /* Slab frees are only counted via current->reclaim_state. */
16666     + current->reclaim_state = &reclaim_state;
16667     + shrink_slab(nr_slab, sc.gfp_mask, nr_slab);
16668     + current->reclaim_state = NULL;
16669     + if (!reclaim_state.reclaimed_slab)
16670     + break;
16671     + still_to_free -= reclaim_state.reclaimed_slab;
16672     + }
16673     +
16674     + return;
16675     +}
16676     +
16677     +
16678     /*
16679     * Helper function for shrink_all_memory(). Tries to reclaim 'nr_pages' pages
16680     * from LRU lists system-wide, for given pass and priority, and returns the