Table of Contents


WWW-wide translation of HTML pages

The most popular application of this project is Guhgel - a German->Saxon translator applied to the search engine Google.

The same technique was independently developed for the German->Swabian translation in UNiMUT's Schwobifying Proxy.

"parallelweb", when installed as a CGI script, translates an HTML document and rewrites all of its links so that following a link translates the referenced document as well.


To get this nice effect, three problems have to be solved (and they are all solved to a certain degree):

  • Parse HTML and separate markup from text

  • On markup: Rewrite hyperlinks so that referenced pages are translated automatically.

  • On text: Translate it.

Although the last point was the original motivation, the first two points consumed most of the development time. The reasons are the crappy design of HTML, including its extensions, and the mistakes made by 99% of HTML designers, who either use ugly WYSIWYG HTML editors or at least avoid any HTML validation.

The heart of the project is the module that provides the ugly technical part; many modules (see the example directory) can benefit from it.
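The three steps listed above can be sketched with Python's standard library. This is a minimal illustration, not the project's actual code: the class name `TranslatingRewriter`, the function `translate_page`, the `?url=` query parameter and the proxy path are all assumptions made for this example, and `translate` stands for any function that maps text to text.

```python
import html
from html.parser import HTMLParser
from urllib.parse import quote, urljoin


class TranslatingRewriter(HTMLParser):
    """Copy HTML through, rewriting <a href> links and translating text."""

    def __init__(self, base_url, proxy_url, translate):
        # Keep entities such as &auml; untouched instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.base_url = base_url      # URL of the page being translated
        self.proxy_url = proxy_url    # hypothetical URL of the CGI script
        self.translate = translate    # function: text -> translated text
        self.raw_depth = 0            # > 0 while inside <script>/<style>
        self.out = []

    def _render(self, tag, attrs, end=">"):
        parts = [tag]
        for name, value in attrs:
            if value is None:         # bare attribute such as "disabled"
                parts.append(name)
                continue
            if tag == "a" and name == "href":
                # Absolutize the link, then route it through the proxy.
                # The "?url=" parameter is an assumption of this sketch.
                absolute = urljoin(self.base_url, value)
                value = self.proxy_url + "?url=" + quote(absolute, safe="")
            parts.append('%s="%s"' % (name, html.escape(value, quote=True)))
        return "<" + " ".join(parts) + end

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.raw_depth += 1
        self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._render(tag, attrs, " />"))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.raw_depth:
            self.raw_depth -= 1
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Translate ordinary text only; leave script and style bodies alone.
        self.out.append(data if self.raw_depth else self.translate(data))

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def translate_page(source, base_url, proxy_url, translate):
    parser = TranslatingRewriter(base_url, proxy_url, translate)
    parser.feed(source)
    parser.close()
    return "".join(parser.out)
```

Calling `translate_page(page, "http://example.org/a/", "/cgi-bin/parallelweb", str.upper)` would upper-case all text and point every `<a href>` back at the script. The sketch ignores frames, forms, image maps, fragment-only links and malformed markup, which is where most of the real effort goes.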


Look into the "Makefile"! :-]


  • RelinkHTML seems to be a never-ending story; it still does not work with many HTML pages.

  • contains some bugs as far as I can see and is inconsistent with respect to numeric character entities

  • Adaptation of the modules to work with the Apache module mod_python, which is much more efficient than running the Python interpreter again and again.

  • Is there someone who is able and willing to install these scripts for general usage on his machine?

  • Can someone tell me why happydoc creates the whole home/user/parallelweb/doc path into my doc directory, again? I can't make the links work generally this way!

  • Today I feel that a statically typed language like Haskell would have been the better choice for this project, although Python is much better suited than Perl or PHP. In Python the data structures are built dynamically, which lets you easily lose track of them and makes restructuring anything a horror.

Modules and Packages



Replaces vowels following the scheme of the German children's song about three Chinese with a double bass ("Drei Chinesen mit dem Kontrabass").


Makes German text sound a little cuter.


Applies an echo effect to German HTML files :-)


Adds some tongues to German HTML files so that the text sounds as if spoken by a famous Bavarian politician :-)


Replaces vowels following the scheme of the German children's song about three Chinese with a double bass ("Drei Chinesen mit dem Kontrabass").


Applies a so-called spoon effect to German HTML files :-)


Translates German HTML files into the Saxon dialect :-)


Adds some wise words to German HTML files :-)



Extends UserDict with the methods transpose() and keyregexp().


Output of special markup


Helps with the processing of German words.


Parses HTML pages, absolutizes links and invokes a translation on the text.
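To give an idea of the kind of text transformation the example modules in this list perform, the vowel-replacement scheme of the "three Chinese with a double bass" song could be sketched as follows. The function name `replace_vowels`, the treatment of umlauts as vowels and the collapsing of vowel runs such as "ei" into a single vowel are assumptions of this sketch, not necessarily how the module itself behaves.

```python
import re

# One or more consecutive vowels, including German umlauts (assumption:
# a diphthong like "ei" is replaced as a whole, as when singing the song).
VOWELS = re.compile(r"[aeiouäöü]+", re.IGNORECASE)


def replace_vowels(text, vowel):
    """Replace every run of vowels in text by the single given vowel."""

    def substitute(match):
        # Preserve capitalisation at the start of the vowel run.
        if match.group(0)[0].isupper():
            return vowel.upper()
        return vowel

    return VOWELS.sub(substitute, text)
```

For instance, `replace_vowels("Drei Chinesen mit dem Kontrabass", "a")` yields "Dra Chanasan mat dam Kantrabass". A function like this only works on plain text; the HTML parsing module is what would feed it the text portions of a page.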


This document was automatically generated on Mon Oct 6 18:11:15 2003 by HappyDoc version 2.1