Programmatically Compress and Decompress Files

هذه المقالة متوفرة أيضا باللغة العربية، اقرأها هنا.

This lesson is very easy. This lesson focuses on how to compress and decompress files programmatically using .NET Framework and C# -or any language that supports .NET of course.

Currently .NET Framework supports two types of compression algorithms:

  • Deflate
    This is a very simple algorithm, used for simple compression and decompression operations. Don’t use this algorithm except if you intend to use compressed data only in your application because this algorithm is not compatible with any compression tool.
  • GZIP
    This algorithm uses internally the Deflate algorithm. This algorithm includes a cyclic redundancy check value for detecting data corruption. Also it’s compatible with most compression tools because it writes headers to the compressed file, so compression tools -like WinZip and WinRAR- can easily access the compressed file and decompress it as well. This algorithm also can be extended to use other algorithms internally other than Deflate.

For a complete GZIP reference see RFC 1952 (GZIP File Format Specification).

The most powerful compression tool now is WinRAR.

Fortunately, whether using Deflate or GZIP in .NET, code is the same; the only thing that needs to change is the class name.

Deflate and GZIP can be found in the namespace System.IO.Compression that resides on assembly System.dll. This namespace contains only three types, the two algorithms implementation classes DeflateStream and GZipStream -both inherit directly from System.IO.Stream class-, and an enumeration CompressionMode that defines the operation (compression or decompression).

Compressing a file

The code for compression is very simple:

Code snippets in this lesson rely on the fact that you added two using (Imports in VB) statements, System.IO and System.IO.Compression.

// C# Code

string fileToBeCompressed = "D:\My Great Word Document.doc";
string zipFilename = "D:\CompressedGreatDocument.zip";

using (FileStream target = new FileStream(zipFilename, FileMode.Create, FileAccess.Write))
using (GZipStream alg = new GZipStream(target, CompressionMode.Compress))
{
    byte[] data = File.ReadAllBytes(fileToBeCompressed);
    alg.Write(data, 0, data.Length);
    alg.Flush();
}

Code explanation:
If you are going to compress a file then you must specify the CompressionMode.Compress option, and also you must specify a Stream that will the data be written to.
After creating the class you can compress the data by using its Write() method.

If you intended to use the Deflate algorithm, just change the class name to DeflateStream.

Decompressing a file

The code that decompresses a compressed file is very similar:

// C# Code

string compressedFile = "D:\CompressedGreatDocument.zip";
string originalFileName = "D:\My Great Word Document.doc";

using (FileStream zipFile = new FileStream(compressedFile, FileMode.Open, FileAccess.Read))
using (FileStream originalFile = new FileStream(originalFileName, FileMode.Create, FileAccess.Write))
using (GZipStream alg = new GZipStream(zipFile, CompressionMode.Decompress))
{
    while (true)
    {
        // Reading 100bytes by 100bytes
        byte[] buffer = new byte[100];
        // The Read() method returns the number of bytes read
        int bytesRead = alg.Read(buffer, 0, buffer.Length);

        originalFile.Write(buffer, 0, returnedBytes);

        if (bytesRead != buffer.Length)
            break;
    }
}
' VB.NET Code

Dim compressedFile As String = "D:CompressedGreatDocument.zip"
Dim originalFileName As String = "D:My Great Word Document.doc"

Dim zipFile As New FileStream(compressedFile, FileMode.Open, FileAccess.Read)
Dim originalFile As New FileStream(originalFileName, FileMode.Create, FileAccess.Write)
Dim alg As New GZipStream(zipFile, CompressionMode.Decompress)

While (True)
    ' Reading 100bytes by 100bytes
    Dim buffer(100) As Byte

    ' The Read() method returns the number of bytes read
    Dim bytesRead As Integer = alg.Read(buffer, 0, buffer.Length)

    originalFile.Write(buffer, 0, bytesRead)

    If (bytesRead  buffer.Length) Then
        Exit While
    End If
End While

Code explanation:

First, we create a file stream to read the ZIP file, and then we created our algorithm class that encapsulates the file stream for decompressing it.

Then we created a file stream for the original file, for writing the decompressed data.
After that, we read the data compressed 100bytes after each other -you may prefer more- and write this data to the original file stream.

By encapsulating disposable objects into using statements you become assured that every object will be disposed in a certain point -end of the using statement- and no object will retain in memory.

Try it yourself

Develop a simple application that can be used to compress and decompress files.

This application can compress several files into one file and decompress them as well.

Besides algorithms classes you may use System.IO.BinaryWriter and System.IO.BinaryReader to get full control of how the file contents will be.

Windows Vista File and Registry Virtualization

Enabling UAC (User Access Control) feature in Windows Vista, Administrator users in Windows Vista, by default, don’t have administrative privileges. Every Windows process has two security tokens associated with it, one with normal user privileges and one with admin privileges. With applications that require administrative privileges, the user can elevate the application to run with Administrator rights. And that process called Elevation.

As you expect, it’s the least-privilege principle well-recognized for security pros and people who use Linux.

User can elevate an application either by clicking “Run as Administrator” from the context menu of the application icon, or even by editing the Compatibility tab in the properties of the application file.
Also, while an application running it can ask the user to provide administrative permission to complete a specific operation (a good example is switching to the All Users mode in Task Manager).

Compatibility Options

Windows Vista keeps track of the compatibility options edited for an application by adding a compatibility flag to the registry at HKCUSoftwareMicrosoftWindows NTCurrentVersionAppCompatFlagsLayers.

Try changing any of the compatibility options for an application and see how Windows tracks that.

Because UAC feature of Windows Vista, it doesn’t allow users to access some folders like Program Files and Windows folder. Also it doesn’t allow them to access the registry without administrative permission.

But, there’re lots of applications that write lots of data to the Program Files folder for instance. And Windows Vista must keep them away from doing such these operations without administrative permission -you can imagine the amount of applications that require administrative privileges-. So to handle this dilemma, Windows Vista has a new technique called Virtualization.

When a program tries to write to the Program Files folder for instance, Windows Vista redirects it to a special virtual store so that the application can read/write data without generating errors (because of course it doesn’t have the permission).

As we would see in the next example Windows Vista uses this technique with registry too.

For folders, Virtualization called File Virtualization. For registry, it’s called Registry Virtualization.

File Virtualization

To see virtualization in action let’s try this example:

string programFiles =
    Environment.GetFolderPath(Environment.SpecialFolder.ProgramFiles);
string appDir = Path.Combine(programFiles, "MyApplication");

if (Directory.Exists(appDir) == false)
    Directory.CreateDirectory(appDir);

string file = Path.Combine(appDir, "SampleFile.txt");

File.WriteAllText(file, "Hello, World!");

When you run the example it doesn’t write to C:Program FilesMyApplication. Instead it writes to the Program Files virtual store in C:UsersAppDataLocalVirtualStoreProgram FilesMyApplication

Note that if you are running your Visual Studio instance in elevated mode and run your application it gets the elevated mode from Visual Studio. So you need to run it manually from its icon.

Try changing the application so it writes to Windows folder. And check the virtual store folder.

Registry Virtualization

Virtualization is not only done with folders but also with registry entries. If the application tries to write to the registry key Software in HKEY_LOCAL_MACHINE hive, it is redirected to the HKEY_CURRENT_USER hive. Instead of writing to HKLMSoftware{Manufacturer}, it writes to the registry Virtual Store HKCUSoftwareClassesVirtualStoreMACHINESOFTWARE{Manufacturer}.

File and registry virtualization is available only for 32-bit applications. This feature is not available for 64-bit applications on Windows Vista.

Don’t use virtualization as a feature of your application. It is better to fix your application than to write to Program Files folder and the HKLM hive without elevated user privileges. Redirection is only a temporary means to fix broken applications.