Coimpressed caching is the introduction of a new level into the virtual memory heirarchy. Specifically, RAM is used to store both an uncompressed cache of pages in their 'natural' encoding, and a compressed cache of pages in some compressed format. By using RAM to store some number of compressed pages, the effective size of RAM is increased, and so the number of page faults that must be serviced by very slow hard disks is decreased.
Our paper on this topic will appear in the USENIX Annual Technical Conference '99, and you can grab a copy from our page of papers.
The programs in the test suite were each linked with VMTrace, our tracing utility. This utility works on SPARC-v8/Solaris and i386/Linux machines. (Note that new version of VMTrace are being developed with greater capabilities, but this version, under Linux, was the one used for our experiments in the paper.) For each page referenced, VMTrace outputs the page number, whether or not the page was dirtied, and a copy of the page image itself. The output of VMTrace was piped into our trace reduction utility. This utility creates a sequence that is the LRU behavior for a chosen reduction memory size, such that the larger the reduction memory, the smaller the resulting sequence. However, simulations must be performed on sequences (traces) reduced using a sufficiently small reduction memory to ensure accuracy. We have developed better reduction methods (SAD and OLR), but have not implemented them for page image traces. The LRU behavior trace created by the utility above contains page images, whose compressibility can be tested with our compressibility trace utility. Each page image in the source trace is measured, for a number of different compression algorithms, for compressibility, compression time, and decompression time. The result is an LRU behavior trace like the source trace, where the page images are replaced by compressibility measurements. Our size-adaptive compressed cache simulator accepts a compressibility trace as input, and simulates the management of a compression cache. It outputs the total paging time that would be required for a traditional VM system without compressed caching, and the total paging time required of a compressed caching VM system.
WK4x4 is the variant of our WK compression family that achieves the tighest compression by itself by using a 4x4 set-associative dictionary of recently seen words. WKdm compresses nearly as tightly as WK4x4, but is much faster because of the simple, direct-mapped dictionary of recently seen words. LZO is a very fast Lempel-Ziv implementation by Markus Oberhumer. LZRW1 was, until LZO, the reigning Lempel-Ziv speed champion, and was used in past studies on compressed caching.
espresso is a circuit simulator gcc-2.7.2 is the GNU C/C++ compiler gnuplot is the GNU plotting utility grobner calculated Grobner basis functions gs3.33 is GhostScript, a software PostScript interpreter lindsay is a hypercube simulator p2c is a Pascal->C transofrmer rscheme is our implementation of Scheme
From the programs listed in the test suite above, we gathered page image traces and reduced them on a memory size that yeilded, roughly, 1 Gbyte traces. Note that these are very large, even though they have been reduced with our LRU behavior reduction utility and gzipped as tightly as possible. If these are too large to grab, please send email to Scott Kaplan <sfkaplan@cs.utexas.edu>, and we can try to re-reduce the traces on a larger memory size to provide a smaller trace.
From each of the page image trace above, we used our compressibility trace utility to generate compressibility traces for each of the four compression algorithms listed above, running on three different processors. Thus, there are 8 source page image traces, 4 compression algorithms, and 3 platforms, resulting in 96 compressibility traces. They are not overly large, so we have grouped them together by processor. Each file here is a tarball of the 32 compressibility traces for a particular architecture.espresso ( 794,207,195 bytes) gcc-2.7.2 (1,154,083,515 bytes) gnuplot ( 138,740,724 bytes) grobner ( 743,662,768 bytes) gs3.33 ( 882,016,979 bytes) lindsay ( 821,088,665 bytes) p2c ( 906,746,319 bytes) rscheme ( 892,110,863 bytes)
Pentium Pro 180 MHz (p-pro-180-compressibility.tgz) UltraSPARC 168 MHz (sparc-168-compressibility.tgz) UltraSPARC 300 MHz (sparc-300-compressibility.tgz)
Page last updated on April 13,
1999
Send feedback to Scott
Kaplan <sfkaplan@cs.utexas.edu>