Class StringExtractor
java.lang.Object
    org.htmlparser.parserapplications.StringExtractor
public class StringExtractor extends java.lang.Object
Extract plain-text strings from a web page. An illustrative program that gathers the textual contents of a web page. Uses a StringBean
to accumulate the user-visible text (what a browser would display) into a single string.
Constructor Summary
Constructor                                   Description
StringExtractor(java.lang.String resource)    Construct a StringExtractor to read from the given resource.
-
Method Summary
Modifier and Type    Method                           Description
java.lang.String     extractStrings(boolean links)    Extract the text from a page.
static void          main(java.lang.String[] args)    Mainline.
Method Detail
-
extractStrings
public java.lang.String extractStrings(boolean links) throws ParserException
Extract the text from a page.
Parameters:
links - if true, include hyperlinks in output.
Returns:
The textual contents of the page.
Throws:
ParserException - If a parse error occurs.
-
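The constructor and extractStrings method documented above might be combined as in the following minimal sketch. It assumes the HTML Parser library is on the classpath and that ParserException lives in the org.htmlparser.util package (this page does not state the package). The class name StringExtractorExample and the default resource URL are hypothetical and used only for illustration.

```java
import org.htmlparser.parserapplications.StringExtractor;
import org.htmlparser.util.ParserException; // assumed package, not stated on this page

public class StringExtractorExample {
    public static void main(String[] args) {
        // Hypothetical default resource; pass a URL or file you can reach as the first argument.
        String resource = args.length > 0 ? args[0] : "http://example.com/";

        // Construct a StringExtractor to read from the given resource.
        StringExtractor extractor = new StringExtractor(resource);
        try {
            // false: text only. Passing true would include hyperlinks in the output.
            String text = extractor.extractStrings(false);
            System.out.println(text);
        } catch (ParserException e) {
            // Documented as thrown if a parse error occurs.
            System.err.println("Could not extract text: " + e.getMessage());
        }
    }
}
```

Because extractStrings declares a checked ParserException, the caller has to catch it or declare it. The sketch catches it and reports the message rather than letting the program terminate with a stack trace.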
main
public static void main(java.lang.String[] args)
Mainline.
Parameters:
args - The command line arguments.