THTMLParser v1.02

THTMLParser is a Delphi class for parsing an HTML file. The file is split into tag and text objects, which is useful for validating tags or for automatic code corrections. A sample application is included (a very simple web browser!). The parser supports HTML 3.2 entities and the Western Latin-1 character set.

How to use the HTMLParser
Create an instance of THTMLParser with

  HTMLParser:=THTMLParser.Create;

Then load an HTML file, e.g.

  HTMLParser.Lines.LoadFromFile(filename);

where Lines is a standard TStringList.
With

  HTMLParser.Execute;

the file is parsed into HTMLParser.Parsed.
This TList contains objects of two classes:

type THTMLText = class
 property Line:string;  // text with HTML 3.2 entities and Western Latin-1 characters converted
 property Raw:string;   // raw text line as read from input file

type THTMLTag = class
 Params:TList; // see below
 property Name:string; // uppercased TAG 
 property Raw:string;  // raw TAG (parameters included) as read from input file

 Params is a list of all parameters for the TAG (if any):

   type THTMLParam = class
    property Key:string;            // Key name
    property Value:string;          // Value
    property Raw:string read fRaw;  // raw parameter line
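
Because Parsed is a plain TList, each item has to be cast before use. The following is a minimal sketch of walking the list, not the author's own code. It assumes the unit that declares THTMLParser is called HtmlPars (substitute the unit name shipped with the component) and that freeing the parser also releases the parsed objects.

```pascal
program DumpParsed;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes,
  HtmlPars; // ASSUMED unit name; use the unit that declares THTMLParser

var
  Parser: THTMLParser;
  Obj: TObject;
  Tag: THTMLTag;
  Param: THTMLParam;
  i, j: Integer;
begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: DumpParsed <file.html>');
    Halt(1);
  end;

  Parser := THTMLParser.Create;
  try
    Parser.Lines.LoadFromFile(ParamStr(1));
    Parser.Execute;

    for i := 0 to Parser.Parsed.Count - 1 do
    begin
      // TList stores untyped pointers, so cast to TObject before testing the class
      Obj := TObject(Parser.Parsed[i]);
      if Obj is THTMLTag then
      begin
        Tag := THTMLTag(Obj);
        WriteLn('TAG  ', Tag.Name);
        // Params holds one THTMLParam per parameter of the tag
        for j := 0 to Tag.Params.Count - 1 do
        begin
          Param := THTMLParam(Tag.Params[j]);
          WriteLn('       ', Param.Key, ' = ', Param.Value);
        end;
      end
      else if Obj is THTMLText then
        WriteLn('TEXT ', THTMLText(Obj).Line);
    end;
  finally
    Parser.Free; // assumption: the destructor also frees the parsed objects
  end;
end.
```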

Example
The HTML file

<html>
<BODY LINK="#FF00FF" border=0>
Hello You &amp; Co!
</html>

will result in 4 objects (HTMLParser.Parsed.Count=4):

[0] HTMLTag.Name = "HTML"
           .Params.count = 0

[1] HTMLTag.Name = "BODY"
           .Params.count = 2

                  [0]  HTMLParam.Key   = "LINK"
                                .Value = "#FF00FF"
                  [1]  HTMLParam.Key   = "BORDER"
                                .Value = "0"

[2] HTMLText.Line = "Hello You & Co!"

[3] HTMLTag.Name = "/HTML"
           .Params.count = 0
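
The example above can be reproduced in code by filling Lines directly instead of loading a file. The sketch below does that and looks up a parameter by its key. ParamValue is a hypothetical helper written for this illustration, not part of THTMLParser, and the unit name HtmlPars is again an assumption.

```pascal
program ExampleCheck;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes,
  HtmlPars; // ASSUMED unit name; use the unit that declares THTMLParser

// Hypothetical helper: returns the value of the parameter named Key,
// or an empty string if the tag has no such parameter.
function ParamValue(Tag: THTMLTag; const Key: string): string;
var
  i: Integer;
  P: THTMLParam;
begin
  Result := '';
  for i := 0 to Tag.Params.Count - 1 do
  begin
    P := THTMLParam(Tag.Params[i]);
    if CompareText(P.Key, Key) = 0 then // case-insensitive comparison
    begin
      Result := P.Value;
      Exit;
    end;
  end;
end;

var
  Parser: THTMLParser;
  Obj: TObject;
  i: Integer;
begin
  Parser := THTMLParser.Create;
  try
    // Same input as the example HTML file above
    Parser.Lines.Add('<html>');
    Parser.Lines.Add('<BODY LINK="#FF00FF" border=0>');
    Parser.Lines.Add('Hello You &amp; Co!');
    Parser.Lines.Add('</html>');
    Parser.Execute;

    WriteLn('objects: ', Parser.Parsed.Count);
    for i := 0 to Parser.Parsed.Count - 1 do
    begin
      Obj := TObject(Parser.Parsed[i]);
      if (Obj is THTMLTag) and (THTMLTag(Obj).Name = 'BODY') then
        WriteLn('LINK = ', ParamValue(THTMLTag(Obj), 'LINK'));
    end;
  finally
    Parser.Free;
  end;
end.
```

If the parser behaves as in the listing above, this should report 4 objects and the LINK value #FF00FF.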

Comments and Bugs
Please send any comments or bugs to dennis@spreendigital.de.

Known Bugs and Problems
There are some problems with characters given as hexadecimal references, e.g. &#x67;.

Important!

Please do NOT report any bugs concerning the WebBrowser sample!
The sample is not meant to be a fully HTML-compatible browser; it is written only to display this help file.
©1999 Dennis D. Spreen