task #8555

Updated by Andreas Kohlbecker over 4 years ago

1. research on suitable document formats 
 1. documentation within the taxeditor would be preferable (see [[Meeting_2019-09-07]]): 
     * In-app documentation in the Taxeditor (see #8555): 
         * via tooltips 
         * F1 --> Eclipse RCP help system 
         * Export as PDF 

 ---- 

 The Eclipse help system supports multiple document formats, of which [DocBook](https://docbook.org/) is the most versatile. 

 Documentation written in DocBook can be modularized (http://www.sagehill.net/docbookxsl/ModularDoc.html). Books can be assembled from individual chapter files, and chapters can be nested hierarchically to arbitrary depth. Specific parts of a chapter can be included on their own, and the same content can be reused in several places. So-called XML catalog files map virtual file locations to real ones. This could be useful, for example, to provide the location of release/develop files. 
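
The assembly of a book from chapter files can be illustrated with XInclude, the mechanism commonly used for modular DocBook. The following is a minimal sketch in Python using only the standard library. The file names, the book title, and the in-memory `FILES` dictionary are hypothetical and stand in for real files on disk; the elements are written without a DocBook namespace for brevity.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical file contents, kept in memory so the sketch is self-contained.
FILES = {
    "book.xml": (
        '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
        "<title>Taxeditor Manual</title>"
        '<xi:include href="chapter-intro.xml"/>'
        '<xi:include href="chapter-editing.xml"/>'
        "</book>"
    ),
    "chapter-intro.xml": "<chapter><title>Introduction</title></chapter>",
    "chapter-editing.xml": "<chapter><title>Editing</title></chapter>",
}


def load_from_memory(href, parse, encoding=None):
    """Loader for ElementInclude that reads from FILES instead of disk."""
    text = FILES[href]
    if parse == "xml":
        return ET.fromstring(text)
    return text


def assemble_book(master="book.xml"):
    """Parse the master file and resolve its xi:include elements in place."""
    root = ET.fromstring(FILES[master])
    ElementInclude.include(root, loader=load_from_memory)
    return root


book = assemble_book()
chapter_titles = [c.findtext("title") for c in book.findall("chapter")]
print(chapter_titles)
```

Because each chapter lives in its own file, the same chapter can be referenced from more than one master file without copying its content.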

 Although DocBook files are XML files, some editors allow editing them in a word-processor-like style: 

 * [Vex](https://www.eclipse.org/vex/) 
 * ~~OpenOffice:~~ The ability to work with DocBook files goes back to Apache OpenOffice 4.1.7 (https://www.openoffice.org/xml/xmerge/docbook/), where it was implemented as an experimental feature. There are known problems with this functionality, see https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=93593&p=445548 

 The excellent site [DocBook-Publishing](http://www.stefan-rinke.de/articles/publish/index.html) provides a framework for creating and publishing DocBook documents. This framework includes the option to convert odt files into DocBook xml files. The tool for this transformation, [OOo2sDbk](http://ebellot.chez.com/ooo2sdbk/), is being developed by Eric Bellot. 

 **ooo2dbk** is also available as a package in Debian and its derivatives. The source package can be downloaded from https://launchpad.net/ubuntu/+source/ooo2dbk/2.1.0-1.1. 
 Processing OpenOffice 2 files (odt) requires `ooo2dbk.odf.xsl`, which can be extracted from the source package. Copy this file to `/usr/share/xml/ooo2dbk/`, or change the file location configured in `/etc/ooo2dbk.xml`. 

 

 ### Potential workflow to convert docx files into modularized DocBook xml files 

 Tools: 

 * LibreOffice Writer (Version: 6.0.7.3) 
     * Extract embedded images and replace embedded images with linked images: [PicExtract - 1.0](https://extensions.libreoffice.org/extensions/extract-embedded-images-and-replace-embedded-images-with-linked-images-picextract) 

 Process: 

 1. Open the docx file in Writer and save as odt 
 2. Open the odt file 
 3. Stop "Track Changes", disable "Show track changes", and save 
 4. In the toolbar click "PicExtract" ![](picture566-1.png), which is available once the PicExtract extension is installed. 
 5. The dialog will offer to create a `Pictures/` directory at the document location. Accept this setting and check both options below: 
    * [x] use image/object names ... 
    * [x] replace embedded 
 6. The extraction process may fail at some point with the notification that the document must be saved first. If this happens, save the odt and try again. If this does not help, save the odt as DocBook xml and search the xml file for `<inlinegraphic fileref="embedded:`. The image named after that string needs to be exported manually: save it and link to it via the Image Properties dialog. Then run the extraction process again. Repeat these steps until the extraction completes without warning. 
 7. The final XML file needs to be post-processed: 
     * replace all `fileref="#Pictures` by `fileref="Pictures` to fix the broken image links 
     * .... 
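
The search in step 6 and the replacement in step 7 can be scripted. The following is a minimal sketch in Python, assuming the exported DocBook xml has already been read into a string. The function names and the sample fragment, including the image name `Image 3` and the file `figure1.png`, are hypothetical.

```python
import re

# Matches images that are still embedded in the exported DocBook XML (step 6).
EMBEDDED_REF = re.compile(r'<inlinegraphic fileref="embedded:([^"]*)"')


def find_embedded_images(xml_text):
    """Return the names of images that still need to be exported manually."""
    return EMBEDDED_REF.findall(xml_text)


def fix_picture_links(xml_text):
    """Remove the stray '#' in front of the Pictures directory (step 7)."""
    return xml_text.replace('fileref="#Pictures', 'fileref="Pictures')


# Hypothetical fragment of an exported file, for illustration only.
sample = (
    '<para><inlinegraphic fileref="embedded:Image 3"/></para>'
    '<para><inlinegraphic fileref="#Pictures/figure1.png"/></para>'
)

remaining = find_embedded_images(sample)
fixed = fix_picture_links(sample)
print(remaining)
print(fixed)
```

An empty result from `find_embedded_images` would indicate that no embedded images are left, so only the link fix of step 7 remains to be applied.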
