source: pmb4.2/trunk/fuentes/pmb/classes/font/ttf2ufm/ttf2ufm-src/scripts/unhtml @ 815

Last change on this file since 815 was 815, checked in by jrpelegrina, 4 years ago

Initial release of pmb 4.2

  • Property svn:executable set to *
File size: 785 bytes
#!/bin/sh
#
# This script removes the HTML formatting from a file. If the file was designed
# with such use in mind and was properly formatted besides HTML (such as the README
# file for ttf2pt1) it will look good as a plain text file.
#
# This script supports a very limited set of HTML formatting. Everything up to
# and including the line with <BODY> is removed. Each <LI> is replaced with "-".
# Any lines that contain only HTML formatting, start with "<!", or contain
# only ">" are completely removed. Then all the in-line formatting is removed.
# Finally "&nbsp;", "&lt;", "&gt;", "&amp;" are changed to " ", "<", ">", "&".
# Input is read from standard input; the result goes to standard output.

sed '1,/<[bB][oO][dD][yY]>/d;
/^<!/d;
s/<[lL][iI]>/-/g;
s/^</< </;
s/> *$/>>/;
s/<[^<>]*>//g;
/^< *>$/d;
/^>>$/d;s/^< //;
s/>$//;
s/&[nN][bB][sS][pP];/ /g;s/&[lL][tT];/</g;s/&[gG][tT];/>/g;s/&[aA][mM][pP];/\&/g;'
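As a rough illustration of the rules described in the header comment, the sketch below wraps the same sed program in a shell function and runs it on a small made-up HTML fragment. The function name `unhtml_demo` and the sample input are assumptions for demonstration only; they are not part of ttf2pt1 or pmb.

```shell
#!/bin/sh
# Hypothetical wrapper around the same sed program as the script above.
# It reads HTML on stdin and writes plain text to stdout.
unhtml_demo() {
    sed '1,/<[bB][oO][dD][yY]>/d;
/^<!/d;
s/<[lL][iI]>/-/g;
s/^</< </;
s/> *$/>>/;
s/<[^<>]*>//g;
/^< *>$/d;
/^>>$/d;s/^< //;
s/>$//;
s/&[nN][bB][sS][pP];/ /g;s/&[lL][tT];/</g;s/&[gG][tT];/>/g;s/&[aA][mM][pP];/\&/g;'
}

# Made-up sample input: a header, tag-only lines, a comment, a list item,
# entities, and in-line formatting.
out=$(printf '%s\n' \
    '<HTML>' \
    '<HEAD><TITLE>Demo</TITLE></HEAD>' \
    '<BODY>' \
    '<H1>' \
    'Title' \
    '</H1>' \
    '<!-- generated -->' \
    '<P>Intro</P>' \
    '<UL>' \
    '<LI> a &lt; b &amp; c&nbsp;d' \
    '</UL>' \
    'Plain <B>bold</B> text' \
    '</BODY>' \
    '</HTML>' | unhtml_demo)

printf '%s\n' "$out"
```

With this input the output should be the four lines `Title`, `Intro`, `- a < b & c d` and `Plain bold text`. The `< ` and `>>` markers added at the line edges let the script tell a line that consisted only of tags (deleted) from a line that merely starts or ends with one (kept, with the markers stripped). Because the script passes no file name to sed, it would be run with redirection, for example `sh unhtml < README.html > README` (file names here are placeholders).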