1. Computing & Technology

Discuss in my forum

Programming Contest 45 - Defrag a Disk

Deadline March 31

By , About.com Guide

This is not actually defragging a real disk but a simulated one. I've created a virtual disk made up of 100 tracks (numbered 0-99) with a varying number of sectors in each track; fewer in the center, most at the outside.

Real disks are marvels of engineering that store data in concentric circles (called a track), with a number of sectors in each track. Files are spread across tracks and over time, the sectors that make up a file may get moved about and no longer be contiguous; they are fragmented. To read the file into memory the computer must move the read head to a track, wait until the sector comes under it and reads the data.

Defragmentation (aka defragging) rewrites files so they once again occupy contiguous sectors in the same track. That means the computer doesn't have to move the head between tracks so it eases wear and tear and is faster.

In this contest you are given a "drive" with fragmented files and must write software to defrag it. The disk is actually a text file with a representation of a 199 x 199 array, each element in the array being a file sector. There are approx 250 files on disk, ranging in size from 5-300 sectors each and they are scattered far and wide. The disk is about 90% full.

Disk Structure

To simplify things, in this contest the disk is square! Track 0 is a single character, track one is 3 x 3, track 2 is 5 x 5 etc like this for tracks 0-4, except it extends out to track 99 and tracks 0-9 are not used.

444444444
433333334
432222234
432111234
432101234
432111234
432222234
433333334
444444444

Imagine the full 199 x 199 square with a smaller 19 x 19 square in the center. To further complicate things, there are approximately 250 possible values in each location. To keep the file size down, I've converted these into a 2 character hexadecimal representation in base 16 so for example file a016 is 16010.

The file is 199 lines by 199 positions per line at 2 characters per position.

The file will be in the same location as your exe and is organized in pairs of chars with 398 chars per line making 199 pairs. There are 199 lines of text. If you see () then it's an empty sector. Where there are [], that is in the first 10 tracks that cannot be used.

The Contest

Your program must rearrange all "files" so that the number of reads of each file is kept to a minimum. In this contest a "file" is just a number of locations with the same value. If there are 5 locations with the value a2 then it's a 5 sector file.

This means moving the sectors that make up one file so they occupy one or two tracks. A track is a concentric square. So the first and last rows in the input file plus the first and last character on each line makes up track 99. The formula 8n is how many sectors there are in track n.

As there are files longer than many tracks, it makes sense to put these files on the outer tracks so you can fit them in one track. There is no ordering to sectors in a file so you can move them anywhere.

Output File

I'll be marking these with a program that reads your output file. This must be a file called defrag.txt and in the same format. 398 chars of text output on each of 199 lines and using the same hex notation for sectors, () for empty sectors and [] for the unusable sectors.

Winning Criteria

The scoring will be as follows (lowest score wins btw!). One point per sector in a file. If a sector in a file is not in the next location to the last sector then it gets 25 points. If it is in a different track then it gets 100 points. I.e. you want to keep sectors contiguous in the same track to get the lowest scores.

Timing Code

This is not a speed contest but in the case of a tie, fastest wins...

Please time from the start of the run, after reading in the input file until just before the output file is opened for writing. This code below will do high speed timing for Windows (first three) and the fourth one for Linux.

Final Results

Congrats again to Ivan Dimov whose C++ defragger stormed to a very fast 0.000375 of a second to do the defragging. James Borden came a credible second in 0.002344 seconds. Last but not least and worthy of a mention for displaying the defragged disk was Jeremy Hornick

  1. Ivan Dimov - C++ 0.000375
  2. James Borden - C# 0.0023442
  3. Paul Jones - C# 0.0152514
  4. Paul Goldenberg - C# 0.4101473
  5. Jeremy Hornick - C# 9.111

Thanks to everyone who entered.

Rules

This is for glory only. About.com does not permit prizes to be given.

Please submit your source code and the output file to the cplus.guide@about.com?subject=Programming Contest 45 email address with the subject line Programming Contest 45.

It must compile with Open Watcom, Microsoft Visual C++ 2008/2010 Express Edition/Microsoft Visual Studio 2008/2010 or Borland Turbo C++ Explorer, Microsoft Visual C# 2008/2010 Express Edition or GCC/G++. If it doesn't compile, it can't be run so is automatically disqualified.

Please include your name, age (optional), blog/website url (optional) and country. Your email address will not be kept, used or displayed except to acknowledge your challenge entry. You can submit as many entries as you like before the deadline which is March 31st 2011.

The top ten entries will be listed, judged purely on points. A condition of entry is that you allow your source code to be published on this website, with full credits to you as the author.

©2012 About.com. All rights reserved.

A part of The New York Times Company.