HTML to Text
A common question when working with web data is “What can I do about all these tags?” Cleaning HTML by hand can be a daunting task. A simple workaround is to let a text-based web browser render the page and dump the result as plain text.
Text-based browsing
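Here is an example of how to use the Linux text browser “links”. The -dump option renders the page and prints the formatted text to standard output instead of opening an interactive session.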
$ links -dump http://www.salsadev.com/
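Since the dump goes to standard output, it can be redirected to a file for later processing. The following is a minimal sketch, assuming links is installed; the function name dump_page and the output file name are hypothetical and not part of the original post.

```shell
# Hypothetical helper (not from the original post): dump one page to a text file.
# Usage: dump_page URL OUTPUT_FILE
dump_page() {
    url=$1
    out=$2
    # -dump renders the page and writes the formatted text to stdout,
    # which is redirected into the output file here.
    links -dump "$url" > "$out"
}

# Example call (requires links and network access):
#   dump_page http://www.salsadev.com/ salsadev.txt
```

Quoting "$url" keeps addresses containing characters such as & or ? from being interpreted by the shell.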
This command can be wrapped in a PHP file and used as a service. Because the page address comes from the request, it has to be escaped before it reaches the shell:
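<?php
// escapeshellarg() keeps a crafted 'page' value from injecting shell commands.
// system() prints the command output itself, so no echo is needed.
if (isset($_REQUEST['page']) && $_REQUEST['page'] != "") {
    system("links -dump " . escapeshellarg($_REQUEST['page']));
}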
links.php
This script is available online for those who do not have access to a Linux shell. Point your browser to http://www.salsadev.com/tools/links.php and start working with text-based data. The script takes a single URL parameter called ‘page’, whose value is the address of the page to convert.
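The service can also be called from a script rather than a browser. Here is a rough sketch using curl, assuming curl is installed and the service is still reachable at the address above; the function name fetch_text is hypothetical. The page address is percent-encoded so that any characters such as & or spaces inside it do not break the query string.

```shell
# Hypothetical helper: fetch the text version of a page through links.php.
# Usage: fetch_text PAGE_URL
fetch_text() {
    # -s silences the progress meter.
    # -G sends the data as a GET query string appended to the URL.
    # --data-urlencode percent-encodes the value of the 'page' parameter.
    curl -sG --data-urlencode "page=$1" "http://www.salsadev.com/tools/links.php"
}

# Example call (requires network access):
#   fetch_text http://www.salsadev.com/ > salsadev.txt
```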

