HtmlZap sample program
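This little code snippet opens all the files that match the filespec page*.htm in a source directory, strips the <font>, </font> and <p> tags, converts each </p> to <p>, adds width and height attributes to every <img> tag, and writes the result to a copy of each file.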
Anybody who wonders why I would be doing things like this has never worked with HTML files created by Microsoft Word's "Save as HTML" command... <g>
Fetching web pages
HtmlZap users frequently ask me how to use the component to parse pages retrieved directly from the Internet. HtmlZap itself doesn't have this functionality: it can only parse HTML contained in a local disk file or a string.
However, there are many components (commercial and otherwise) that can
fetch web pages and return them in a form HtmlZap can use. The simplest
approach may be to use the Microsoft Internet Transfer Control, which is
included with Visual Studio. If you were to place an Internet Transfer
Control named ITC and an HtmlZap control named HZ on a form, you could
load a web page with a single line of code:
HZ.LoadBuffer ITC.OpenURL("http://www.google.com/", icByteArray)
This will cause the Internet Transfer Control to connect to the target
URL (Google's main page in the example), retrieve its contents as a Byte
Array, then pass the array to the HtmlZap component named HZ.
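As a rough illustration of where that one-liner might sit in a complete routine, here is a minimal sketch that fetches a page and prints the target of every link it contains. It assumes a form holding an Internet Transfer Control named ITC and an HtmlZap control named HZ, and it uses only the HtmlZap members that appear in the sample program on this page (LoadBuffer, EOF, IsTag, TagName, Param, Next, Reset). The routine name ListLinks is hypothetical, and the lowercase comparison against "a" assumes TagName reports tag names the way the sample program's Select Case expects.

```vb
' Hypothetical routine; ITC and HZ are assumed control names.
Private Sub ListLinks()
    Dim page As Variant

    On Error GoTo FetchFailed
    ' OpenURL is synchronous: it returns once the page has arrived
    ' or raises a run-time error if the transfer fails
    page = ITC.OpenURL("http://www.google.com/", icByteArray)
    On Error GoTo 0

    HZ.LoadBuffer page              ' Hand the byte array to HtmlZap
    While Not HZ.EOF                ' Walk every slice of the page
        If HZ.IsTag Then
            If HZ.TagName = "a" Then
                ' Assumption: Param returns the attribute value as a string
                Debug.Print HZ.Param("href")
            End If
        End If
        HZ.Next                     ' Get the next slice
    Wend
    HZ.Reset                        ' Reset the Html Zapper
    Exit Sub

FetchFailed:
    MsgBox "Could not fetch page: " & Err.Description, vbOKOnly, "ListLinks"
End Sub
```

Keeping the fetch in its own statement, rather than nesting it inside the LoadBuffer call, makes it easier to trap network errors separately from parsing.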
If you'd rather not use the Internet Transfer Control, which doesn't work from scripting languages, you can try a freeware component such as AspTear, though I've never used it myself. Several commercial IP component suites also include tools that can retrieve pages over HTTP; I've had really good luck with the IP*Works suite from /n software.
Private Sub Demo()
    '
    ' Parse the intro
    '
    Dim of As Integer
    Dim sf As String, pfn As String
    Dim pict As String
    Const SrcDir = "j:\intro\"              ' Source directory

    HZ.CompressWS = False                   ' Don't compress whitespace
    sf = Dir(SrcDir + "page*.htm")          ' Get all page*.htm files
    While sf <> ""
        HZ.Load SrcDir + sf                 ' Load a file
        of = FreeFile                       ' Open the copy
        Open SrcDir + "copy\" + sf For Output As #of
        While Not HZ.EOF                    ' Loop through the entire source file
            If HZ.IsTag Then                ' This is a tag
                Select Case HZ.TagName      ' Which one?
                    Case "font", "/font", "p"
                        ' Do nothing... just remove these
                    Case "/p"
                        Print #of, "<p>";   ' Convert </p> to <p>
                    Case "img"
                        pfn = LCase$(HZ.Param("src"))
                        FileCopy SrcDir + pfn, "j:\intro\send\" + pfn
                        pict = SrcDir + pfn
                        ' Use an invisible picture control to get picture sizes
                        PicSizer.Picture = LoadPicture(pict)
                        Print #of, "<img src="""; pfn; """ width="; _
                            Format$(PicSizer.ScaleWidth); " height="; _
                            Format$(PicSizer.ScaleHeight); ">";
                        DoEvents
                    Case Else               ' All other tags
                        Print #of, "<"; HZ.ToString; ">";
                End Select
            Else
                Print #of, HZ.Text;         ' Just transcribe text unchanged
            End If
            HZ.Next                         ' Get the next slice
        Wend
        Close #of                           ' Close the copy
        HZ.Reset                            ' Reset the Html Zapper
        DoEvents
        sf = Dir                            ' Get next filename
    Wend
    MsgBox "Done", vbOKOnly, "Demo Copy"
End Sub
Last revised: 24 September 2002