task #8555: Evaluate document format and management for the documentation - EDIT Platform Etablierung - EDIT Project Management

Updated by Andreas Kohlbecker over 4 years ago

1. research on suitable document formats 
 1. documentation within the taxeditor would be preferable (see [[Meeting_2019-09-07]]): 
     * In-app documentation in the Taxeditor (see #8555): 
         * via tooltips 
         * F1 --> Eclipse RCP help system 
         * export as PDF 

 ---- 

 ## Eclipse Help System (InfoCenter) 

 Resources: 

 * [Platform/InfoCenter](https://wiki.eclipse.org/Platform/InfoCenter) 
 * [Adding Help Support to a Rich Client Platform (RCP) Application](http://www.eclipse.org/articles/article.php?file=Article-AddingHelpToRCP/index.html) 
 * [Authoring with Eclipse - Technical Documentations with DocBook and DITA](http://www.eclipse.org/articles/Article-Authoring-With-Eclipse/index.html#eclipse-transformation) 
     * [Authoring Eclipse Help Using DITA](https://wiki.eclipse.org/Authoring_Eclipse_Help_Using_DITA) 
     * [Authoring Eclipse Help Using DocBook](http://wiki.eclipse.org/Authoring_Eclipse_Help_Using_DocBook) [*2 September 2009*] 
     * [vogella - EclipseRCPHelpSystem](https://www.vogella.com/tutorials/EclipseRCPHelpSystem/article.html) 

 The content for the Eclipse help system consists of **TOC XML** files, which define the structure of the documentation. The TOC XML files reference the actual content, which is written in **HTML**. 
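
To make the TOC structure described above more concrete, here is a minimal sketch in Python that builds a `toc` element with `topic` children pointing at HTML pages. The manual title, topic labels and `html/...` paths are hypothetical placeholders, and the helper names `build_toc` and `toc_to_string` are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def build_toc(label, topics):
    """Build an Eclipse help TOC element from (label, href) pairs."""
    toc = ET.Element("toc", {"label": label})
    for topic_label, href in topics:
        # Each topic references one HTML content page.
        ET.SubElement(toc, "topic", {"label": topic_label, "href": href})
    return toc


def toc_to_string(toc):
    """Serialize the TOC element to an XML string."""
    return ET.tostring(toc, encoding="unicode")


# Hypothetical topic titles and HTML paths, for illustration only.
example_toc = build_toc(
    "Taxeditor Manual",
    [
        ("Getting started", "html/getting_started.html"),
        ("Editing names", "html/editing_names.html"),
    ],
)
print(toc_to_string(example_toc))
```

Topics can also be nested inside other topics to form a hierarchy; the sketch keeps a flat list for brevity.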

 The help system (since 3.3) has extension points for **content producers** that programmatically generate help content, including a table of contents, keyword index, and content pages. This can be useful when converting documentation from some other format into HTML (source: [Adding Help Support to a Rich Client Platform (RCP) Application](http://www.eclipse.org/articles/article.php?file=Article-AddingHelpToRCP/index.html)). --> **TODO**: *which formats are supported?* 

 The Eclipse wiki describes techniques for writing the documentation in DocBook or DITA and transforming it into the Eclipse help structure of TOC XML and HTML files. --> **TODO**: are there Maven plugins supporting the transformation? 

 ## DocBook & DITA 
 
 Documentation created in DocBook or DITA can be modularized (http://www.sagehill.net/docbookxsl/ModularDoc.html). Books can be assembled from individual chapter files, and chapters can be nested hierarchically to arbitrary depth. DocBook 5.0 and DITA also support the concept of topics.  

 **WYSIWYG editors for DocBook and DITA:** 

 * [Vex](https://www.eclipse.org/vex/) - Eclipse plugin  
     * DocBook, DITA? 
     * OpenSource & free 
     * visual XML Editor, not really WYSIWYG, limited user experience 
 * [DocBook Editing and Processing for Eclipse (DEP4E)](http://dep4e.sourceforge.net/) DEP4E integrates DocBook XML and DocBook XSL into Eclipse IDE to create, edit and process DocBook projects. 
     * DocBook 
     * OpenSource & free 
     * no visual mode, only plain XML editing 
 * [XMLmind XML Editor Personal Edition](https://www.xmlmind.com/xmleditor/download.shtml) 
     * DocBook 
     * DITA 
     * Can import DOCX and splits it up into topics - excellent! 
     * free but closed source 
     * WYSIWYG 
 * [Codex](http://codex.ca/download.html) WYSIWYG DITA Editor 
     * DITA 
     * Free License, closed source 
     * WYSIWYG 
 * [Oxygen](https://www.oxygenxml.com/xml_editor/docbook_editor.html) 
     * DocBook, DITA 
     * Academic license: $99 
     * WYSIWYG (excellent!) 
 * OpenOffice  
     * OpenSource & free 
     * real WYSIWYG 
     * DocBook: The ability to work with DocBook files was added to Apache OpenOffice 3.4.1 as an experimental feature (Dec 2015) (https://www.openoffice.org/xml/xmerge/docbook/) (https://web.archive.org/web/20121226165205/http://www.openoffice.org/xml/xmerge/docbook). There are known problems with this functionality, see https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=93593&p=445548 
         * Sources for the xsl files are at https://github.com/LibreOffice/core/tree/master/filter/source/docbook 
     * DITA: Can open DITA files, but support is limited and the full DITA spec does not seem to be supported (images missing, text formatting missing); no support for `*.ditamap` files 
 * [Adobe RoboHelp](https://www.adobe.com/products/robohelp.html) can open/import DITA, operates internally with another format, export of eclipse help possible. 

 See also: 

 * https://www.ditawriter.com/dita-related-tools/ 

 ### DocBook 

 Specific parts of a chapter can be included elsewhere, so the same content can be reused in several places. So-called XML catalog files map virtual file locations to real ones. This could be useful, for example, to switch between the locations of release and develop files.   
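
As an illustration of the mapping from virtual to real locations, the following Python sketch applies the `rewriteURI` entries of an OASIS XML catalog to a URI, preferring the longest matching prefix. The catalog content, the `example.org` prefix and the local path are invented for this example, and `resolve` is a hypothetical helper that covers only this one entry type, not the full catalog specification.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical catalog: the virtual prefix and the local path are made up.
EXAMPLE_CATALOG = """\
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <rewriteURI uriStartString="http://example.org/docs/current/"
              rewritePrefix="file:///home/user/docs/develop/"/>
</catalog>
"""


def resolve(catalog_xml, uri):
    """Return the rewritten URI, or the input unchanged if no rule matches."""
    root = ET.fromstring(catalog_xml)
    best = None
    for rule in root.iter("{%s}rewriteURI" % CATALOG_NS):
        start = rule.get("uriStartString", "")
        # Keep the rule with the longest matching prefix.
        if uri.startswith(start) and (best is None or len(start) > len(best[0])):
            best = (start, rule.get("rewritePrefix", ""))
    if best is None:
        return uri
    return best[1] + uri[len(best[0]):]


print(resolve(EXAMPLE_CATALOG, "http://example.org/docs/current/chapter1.xml"))
```

Pointing the same virtual prefix at a release directory instead of a develop directory would then only require a different catalog file, while the DocBook sources stay unchanged.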


 The excellent site [DocBook-Publishing](http://www.stefan-rinke.de/articles/publish/index.html) provides a framework for creating and publishing DocBook documents. This framework includes the option to convert odt files into DocBook xml files. The tool for this transformation, [OOo2sDbk](http://ebellot.chez.com/ooo2sdbk/), is being developed by Eric Bellot. However, it only supports conversion in one direction, from odt to DocBook. 

 ---- 

 ## Experimental workflows 

 **Converting Word (docx) to DocBook** 

 https://github.com/docbook/wiki/wiki/ConvertOtherFormatsToDocBook 

 **ooo2dbk** is also available as a package in Debian and its derivatives. The source package can be downloaded from https://launchpad.net/ubuntu/+source/ooo2dbk/2.1.0-1.1. 
 Processing of OpenOffice 2 files (odt) requires the file `ooo2dbk.odf.xsl`, which can be extracted from the source package. Copy this file to `/usr/share/xml/ooo2dbk/` or change the file location which is configured in `/etc/ooo2dbk.xml`.  

 **Potential workflow to convert docx files into modularized DocBook xml files** 

 Tools: 

 * LibreOffice Writer (Version: 6.0.7.3) 
     * Extract embedded images and replace embedded images with linked images: [PicExtract - 1.0](https://extensions.libreoffice.org/extensions/extract-embedded-images-and-replace-embedded-images-with-linked-images-picextract) 

 Process: 

 1. Open the docx file in Writer and save as odt 
 2. Open the odt file 
 3. Stop "Track Changes" and Disable "Show track changes", save 
 4. In the toolbar click "PicExtract" ![](picture566-1.png) which is available once the PicExtract module is installed. 
 5. The dialog will offer to create a `Pictures/` folder at the document location. We accept this setting and check both options below: 
    * [x] use image/object names ... 
    * [x] replace embedded 
 6. The extraction process may fail at some point with the notification that the document must be saved first. If this happens, save the odt and try again. If this does not help, save the odt as DocBook xml and search the xml file for `<inlinegraphic fileref="embedded:`. The image named after that string needs to be exported manually by saving it and linking to it via the Image Properties dialog. Then run the extraction process again. Repeat these steps until the extraction completes without warning. 
 7. The final XML file needs to be post processed: 
     * replace all `fileref="#Pictures` by `fileref="Pictures` to fix the broken image links 
     * ....
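
The search from step 6 and the replacement from step 7 can be sketched in Python as follows. This is only an illustration under the assumption that the exported DocBook file is small enough to be handled as one string; the function names and the sample fragment are hypothetical.

```python
import re

# Matches image references that are still embedded (see step 6).
EMBEDDED_RE = re.compile(r'<inlinegraphic\s+fileref="embedded:([^"]*)"')


def fix_picture_links(xml_text):
    """Replace fileref="#Pictures by fileref="Pictures (see step 7)."""
    return xml_text.replace('fileref="#Pictures', 'fileref="Pictures')


def find_embedded_images(xml_text):
    """Return the names of images still referenced as embedded objects."""
    return EMBEDDED_RE.findall(xml_text)


# Hypothetical fragment of an exported DocBook file.
sample = (
    '<para><inlinegraphic fileref="#Pictures/fig1.png"/>'
    '<inlinegraphic fileref="embedded:Object 3"/></para>'
)
print(fix_picture_links(sample))
print(find_embedded_images(sample))
```

An empty result from `find_embedded_images` would indicate that no image is left to be exported manually.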
