Lead Image © Danila Krylov, 123RF.com
HTML to database with a Perl script
Clean Start
At work, I was asked to solve a problem for a web developer who needed the HTML markup stripped from some 1,200 static HTML files, leaving only the text. Keeping 1,200 files on a web server is cumbersome and adds needless complexity, so a database-driven approach was necessary, with only a few Perl scripts needed to serve the pages. Although a directory of 1,200 static files has some appeal as a low-tech solution, its sheer bulk is unwieldy. Editing the text as database entries would also be much simpler, requiring only a few more easily implemented Perl scripts.
In the solution described in this article, a Perl script strips the HTML markup with a simple regular expression and creates the text files that are put into a database.
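As a rough illustration of the idea, the following is a minimal sketch and not the script from this article. The helper name strip_markup, the exact regular expression, and the convention of writing a .txt file next to each .html file are assumptions made for the example. A single substitution of this kind does not handle comments, script or style blocks, character entities, or a > inside an attribute value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the article): remove anything that
# looks like a tag, then tidy up the whitespace left behind.
sub strip_markup {
    my ($html) = @_;
    (my $text = $html) =~ s/<[^>]*>//g;   # drop every <...> sequence
    $text =~ s/[ \t]+/ /g;                # collapse runs of spaces and tabs
    $text =~ s/^\s+|\s+$//g;              # trim leading and trailing whitespace
    return $text;
}

# For each .htm/.html file named on the command line, write the
# stripped text to a file with the same name and a .txt extension.
for my $in (@ARGV) {
    (my $out = $in) =~ s/\.html?$/.txt/i;
    next if $out eq $in;                  # skip files without an HTML extension

    open my $ifh, '<', $in or die "Cannot read $in: $!";
    my $html = do { local $/; <$ifh> };   # slurp the whole file
    close $ifh;

    open my $ofh, '>', $out or die "Cannot write $out: $!";
    print {$ofh} strip_markup($html), "\n";
    close $ofh;
}
```

Saved under a name such as strip.pl (also hypothetical), the sketch could be run as perl strip.pl *.html, producing one text file per page that is ready to be loaded into a database.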
The Solution
Initially, an HTML file employing three iframes is created (Listing 1) [1]. Each window is given a name with the name attribute (line 7). A simple target attribute within an anchor element sends the data to the specified iframe. The first iframe, designated menu, is a list of reference categories. The links from the menu iframe are targeted to the iframe below it, list (line 11), which contains the commands for the categories representing the language that was identified in the menu iframe.
Listing 1
HTML iframes
01 <table border="0" cellpadding="0" cellspacing="0" align="center">
02 <tr>
03 <td align="left" valign="top">
04 <table border="0" cellpadding="0"