
Programming Challenge 71 - Unweave Tags!

Now Completed


This month's challenge is an exercise in identifying structure in an HTML document. Several tags are used to split web pages into sections (I'm referring only to the older <div> tags, not the new HTML5 <section> tags).

I recently spent a less than joyful two hours trying to find out why a web page was wonky: the footer was not positioned correctly. In the end it turned out to be two problems. A </form> tag had been introduced in the wrong place and a <div> was missing.

Along the way I discovered a lovely free web tool for displaying <div> structure at Div tool checker and that gave me the idea for this programming contest.

Challenge Definition

Create a program that reads a supplied .html file and either verifies it as OK or displays a chart like the Tormus one. However, rather than handling just <div> tags, it should also handle <span>, <p>, <b>, <i> and <u> tags (and their closing tags).


All of these tags can be nested inside each other but must not overlap. The tags in question are limited to <div>, <span>, <p>, <b>, <i> and <u> (and their closing tags).
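As a rough illustration, the nesting rule can be checked with a stack of open tags: push on an opening tag, pop on a matching closing tag, and report anything that does not line up. The following is only a sketch in C++, not a reference solution. The names Fault, kTracked and checkNesting are made up for this example, and the scanner assumes that '<' and '>' never appear inside attribute values or comments.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical record for one problem found while scanning.
struct Fault {
    std::string tag;      // tag name, e.g. "div"
    std::string message;  // human-readable description
};

// The six tags named in the challenge; every other tag is ignored.
static const std::set<std::string> kTracked = {"div", "span", "p", "b", "i", "u"};

// Scan html and return every nesting fault found (an empty result means OK).
// Simplification: '<' and '>' are assumed never to occur inside attribute
// values or comments.
std::vector<Fault> checkNesting(const std::string& html) {
    std::vector<Fault> faults;
    std::vector<std::string> open;  // stack of currently open tracked tags
    std::size_t pos = 0;
    while ((pos = html.find('<', pos)) != std::string::npos) {
        std::size_t end = html.find('>', pos);
        if (end == std::string::npos) break;  // unterminated tag: stop scanning
        assert(end > pos);
        std::size_t i = pos + 1;
        bool closing = (i < end && html[i] == '/');
        if (closing) ++i;
        // Read the tag name (letters and digits), lowercased.
        std::string name;
        while (i < end && std::isalnum(static_cast<unsigned char>(html[i]))) {
            name += static_cast<char>(std::tolower(static_cast<unsigned char>(html[i])));
            ++i;
        }
        pos = end + 1;
        if (kTracked.count(name) == 0) continue;
        if (!closing) {
            open.push_back(name);
            continue;
        }
        // Closing tag: find the nearest open tag with the same name.
        // match is the 1-based stack position, or 0 if there is none.
        std::size_t match = open.size();
        while (match > 0 && open[match - 1] != name) --match;
        if (match == 0) {
            faults.push_back({name, "closing tag has no opening tag"});
            continue;
        }
        // Anything opened after the match is missing its close or overlaps.
        while (open.size() > match) {
            faults.push_back({open.back(), "not closed before </" + name + ">"});
            open.pop_back();
        }
        open.pop_back();  // the matched opener itself
    }
    // Whatever is still open at the end was never closed.
    while (!open.empty()) {
        faults.push_back({open.back(), "never closed"});
        open.pop_back();
    }
    return faults;
}
```

With this sketch, <b><i>x</b></i> yields two faults: the <i> that is still open when </b> arrives, and the stray </i> that follows. A real entry would probably also want to record line numbers.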

Verifying correctness is one part; the second part is displaying a tag tree like the one on the Tormus website, but showing all the tags. If there is a missing or overlapping tag then it should be identified in the tree.

The input

Your program will be supplied with five smallish html files called input1.html to input5.html in the same location as your executable.

These html files will have body and html tags as per normal html. There will be no header tags and no <head> section.

Use this input.zip for testing.

The Output

Please write all your output to results.txt in the same location as your exe. It should have five sections, one per file. Each section should start by saying whether the html file is correct or has faults. A file may have more than one fault.
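One possible way to lay out results.txt is sketched below in C++. The helper names (inputFileName, statusLine, readFile, writeSection) and the exact wording of the status line are assumptions made for this example; the challenge only asks that each section say whether the file is correct or has faults.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>

// Name of the n-th supplied file: input1.html .. input5.html.
std::string inputFileName(int n) {
    assert(n >= 1 && n <= 5);
    return "input" + std::to_string(n) + ".html";
}

// First line of a section. The wording here is an assumption.
std::string statusLine(const std::string& file, std::size_t faultCount) {
    if (faultCount == 0) return file + " is correct";
    return file + " has faults (" + std::to_string(faultCount) + ")";
}

// Read a whole file into text; returns false if it cannot be opened.
bool readFile(const std::string& path, std::string& text) {
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return false;
    std::ostringstream buf;
    buf << in.rdbuf();
    text = buf.str();
    return true;
}

// Write one section: status line, a blank line, the diagram, then a
// blank line to separate it from the next section.
void writeSection(std::ostream& out, const std::string& file,
                  std::size_t faultCount, const std::string& diagram) {
    out << statusLine(file, faultCount) << "\n\n" << diagram << "\n";
}
```

A main function would then open std::ofstream("results.txt"), loop n from 1 to 5, read inputFileName(n) with readFile, run whatever checker and diagram code the entry uses, and pass the results to writeSection.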

After the line of text, leave a blank line and then output the diagram for the relevant input file. It should look something like this, but include the other specified tags. Feel free to use other text characters to make it look pretty if you want.

<div id="abw">
      |<div id="abh">
      | |<div id="adL">
      | |</div>
      | |<div class="ma">
      | | |<div class="lg">
      | | |</div>
      | |</div>
      |<div id="abb" >
      | |<div id="abm" class="clear">
      | | |<div id="abc" class="clear">
      | | | |<div id="lp-side">
      | | | | |<div id="mr">
      | | | | | |<div class="h3">
      | | | | | |</div>
      | | | | |</div>
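A diagram in this style can be produced with the same kind of open-tag stack used for checking: the stack depth gives the indentation, and any tag that has to be force-closed is flagged on its own line. The sketch below is in C++ and is only one way to do it. The names indent and renderTree, and the "<--" marker text, are invented for this example, and the scanner again assumes '<' and '>' never appear inside attribute values or comments.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// The six tags named in the challenge; every other tag is ignored.
static const std::set<std::string> kTreeTags = {"div", "span", "p", "b", "i", "u"};

// Prefix for a nesting depth: "", "|", "| |", "| | |", ...
std::string indent(std::size_t depth) {
    std::string s;
    for (std::size_t k = 0; k < depth; ++k) s += (k == 0 ? "|" : " |");
    return s;
}

// Return the tag tree for html, one tracked tag per line.
// Faults are marked with "<--" (a made-up convention for this sketch).
std::string renderTree(const std::string& html) {
    std::string out;
    std::vector<std::string> open;  // stack of currently open tracked tags
    std::size_t pos = 0;
    while ((pos = html.find('<', pos)) != std::string::npos) {
        std::size_t end = html.find('>', pos);
        if (end == std::string::npos) break;  // unterminated tag: stop scanning
        std::string raw = html.substr(pos, end - pos + 1);  // e.g. <div id="a">
        assert(raw.size() >= 2 && raw[0] == '<');
        pos = end + 1;
        std::size_t i = 1;
        bool closing = (raw[1] == '/');
        if (closing) ++i;
        // Read the tag name (letters and digits), lowercased.
        std::string name;
        while (i < raw.size() && std::isalnum(static_cast<unsigned char>(raw[i]))) {
            name += static_cast<char>(std::tolower(static_cast<unsigned char>(raw[i])));
            ++i;
        }
        if (kTreeTags.count(name) == 0) continue;
        if (!closing) {
            out += indent(open.size()) + raw + "\n";
            open.push_back(name);
            continue;
        }
        // Closing tag: find the nearest open tag with the same name.
        // match is the 1-based stack position, or 0 if there is none.
        std::size_t match = open.size();
        while (match > 0 && open[match - 1] != name) --match;
        if (match == 0) {
            out += indent(open.size()) + raw + "  <-- no matching opening tag\n";
            continue;
        }
        // Flag every tag that should have been closed before this one.
        while (open.size() > match) {
            out += indent(open.size() - 1) + "</" + open.back() +
                   ">  <-- missing before " + raw + "\n";
            open.pop_back();
        }
        open.pop_back();
        out += indent(open.size()) + raw + "\n";
    }
    // Tags still open at the end of the file were never closed.
    while (!open.empty()) {
        out += indent(open.size() - 1) + "</" + open.back() + ">  <-- missing\n";
        open.pop_back();
    }
    return out;
}
```

Closing tags are printed at the same depth as their openers, as in the sample above. The sample's leading spaces are not reproduced, and an overlapping pair such as <b><i></b></i> shows up as one "missing" line plus one "no matching opening tag" line.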


Winning Criteria

You get a point for each correctly verified (or not) html file. I'll decide tie breakers by the quality of the output file diagrams.

Final Results

It was very close. I assessed it as 100 points for Stephen Burris and 95 each for Zeljko Peric and Ron Spain. Zeljko did a visual version that was neat but I ruled Stephen's as best because of the outputs, with the tree, with errors and with line numbers. But really all three are so close.

Zeljko and Ron are equal second. Thanks to all three who entered.

  1. Stephen Burris (C#) Result=100, (USA)
  2. Zeljko Peric (C#) Result=95, (Serbia)
  3. Ron Spain (C) Result=95, (USA,34)

General Tips on Entering

This is a single page article with tips on things to do and not do when entering challenges. Please read it!

These tips contain code for C, C++ and C# (not Go, yet) that can do very high precision timing.


This is for glory only. About.com does not permit prizes to be given.

Please submit your source code and the output file to cplus@aboutguide.com with the subject line Programming Contest 71.

It must compile with Open Watcom, Microsoft Visual C++ 2010/2012 Express Edition/Microsoft Visual Studio 2010/2012, CC386 or Borland Turbo C++ Explorer, Microsoft Visual C# 2008/2010 Express Edition, GCC/G++ and any Google Go compiler. If it doesn't compile, it can't be run so is automatically disqualified.

Please include your name, age (optional), blog/website url (optional) and country. Your email address will not be kept, used or displayed except to acknowledge your challenge entry. You can submit as many entries as you like before the deadline, which is the first Sunday on or after the last day in the challenge month.

The top ten entries in each challenge will be listed, judged on highest score and in the case of a draw, the fastest time. A condition of entry is that you allow your source code to be published on this website, with full credits to you as the author.


©2014 About.com. All rights reserved.