wok-6.x annotate perl-html-parser/description.txt @ rev 24715

updated lfs-book (8.4 -> 11.1)
author Hans-Günter Theisgen
date Tue Mar 15 06:57:03 2022 +0100 (2022-03-15)
rev   line source
Hans-Günter@24217 1 Objects of the HTML::Parser class recognize markup and
Hans-Günter@24217 2 separate it from plain text (also called data content) in HTML
Hans-Günter@24217 3 documents.
Hans-Günter@24217 4 As the different kinds of markup and text are recognized, the
Hans-Günter@24217 5 corresponding event handlers are invoked.
Hans-Günter@24217 6
Hans-Günter@24217 7 HTML::Parser is not a generic SGML parser. We have tried to
Hans-Günter@24217 8 make it able to deal with the HTML that is actually "out there",
Hans-Günter@24217 9 and it normally parses as closely as possible to the way the
Hans-Günter@24217 10 popular web browsers do it instead of strictly following one
Hans-Günter@24217 11 of the many HTML specifications from W3C.
Hans-Günter@24217 12 Where there is disagreement, there is often an option that
Hans-Günter@24217 13 you can enable to get the official behaviour.
Hans-Günter@24217 14
Hans-Günter@24217 15 The document to be parsed may be supplied in arbitrary chunks.
Hans-Günter@24217 16 This makes it possible to parse documents on the fly as they
Hans-Günter@24217 17 are received from the network.
Hans-Günter@24217 18
Hans-Günter@24217 19 If event-driven parsing does not feel right for your application,
Hans-Günter@24217 20 you might want to use HTML::PullParser.
Hans-Günter@24217 21 This is an HTML::Parser subclass that allows a more conventional
Hans-Günter@24217 22 program structure.
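
The event-driven, chunk-wise parsing described above can be sketched
roughly as follows. This is a minimal illustration, not part of the
package description: the chunk strings and the variable names @tags and
$text are made up for the example, and it assumes the version 3 handler
API of HTML::Parser.

```perl
use strict;
use warnings;
use HTML::Parser ();

my @tags;        # names of start tags, in document order
my $text = '';   # decoded text content, concatenated across events

my $parser = HTML::Parser->new(
    api_version => 3,
    # Each handler is a code reference plus an argspec string that
    # selects which values the parser passes to it.
    start_h => [ sub { my ($tagname) = @_; push @tags, $tagname }, 'tagname' ],
    text_h  => [ sub { my ($dtext)   = @_; $text .= $dtext },      'dtext'   ],
);

# Hypothetical chunks, split mid-word the way network reads might be.
for my $chunk ('<p>Hello, <b>wor', 'ld</b>!</p>') {
    $parser->parse($chunk);
}
$parser->eof;    # signal end of document so any buffered text is reported

print "tags: @tags\n";
print "text: $text\n";
```

The text handler appends rather than assigns because a single run of
text may be reported in more than one event when the input arrives in
pieces.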