Migrating LoopBack Docs to Markdown for use with Jekyll

Rand McKinney edited this page Aug 6, 2016 · 16 revisions

To create an open-source site similar to Express docs:

  1. Export the content of the APIC space to HTML. (This space now contains the source documentation for LoopBack, which is duplicated in pages with the same title in the LB space.)
  2. Convert the HTML to markdown, stripping unneeded markup, using a script.
  3. Get the image content from Confluence.

NOTE: Although the long-term plan is to have both LoopBack 2.x and 3.0 docs, initially we should focus on 2.x, since 3.0 is not released yet. As the 3.0 release approaches, we can "clone" the 2.x docs into /docs/lb3, and then add/modify the content as needed.

Conversion from HTML to markdown

Title

The article title is in <span id="title-text">...</span>. Use the contents of this tag as the value for the title property in the article front-matter.
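As a rough illustration of the title rule, here is a minimal Python sketch that pulls the text out of the title span using only the standard library. The names `TitleExtractor` and `extract_title` are hypothetical, and the sketch assumes the span may contain nested inline tags (such as links) whose text should be kept.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collect the text inside <span id="title-text">...</span>."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # span nesting depth inside the title span; 0 means outside
        self.parts = []  # text fragments found inside the title span

    def handle_starttag(self, tag, attrs):
        if self.depth:
            if tag == "span":
                self.depth += 1  # nested span inside the title
        elif tag == "span" and dict(attrs).get("id") == "title-text":
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == "span":
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_title(html):
    """Return the article title, or an empty string if the span is missing."""
    parser = TitleExtractor()
    parser.feed(html)
    parser.close()
    # Collapse the whitespace left over from the export's indentation.
    return " ".join("".join(parser.parts).split())
```

The returned string can then be used as the value of the title property in the front-matter.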

Front matter

Every markdown file must start with some Jekyll front-matter that looks like this:

---
title: The article title goes here
layout: page
keywords: LoopBack
tags:
sidebar: lb2_sidebar
permalink: /doc/lb2/The-file-name-goes-here.html
summary:
---

NOTE: The three dashes before and after front-matter are required.

In general, we don't have a consistent summary for every article, so we'll leave the summary property blank. Confluence export apparently does not include "labels" data, so we'll also leave the tags property blank. This seems pretty lame on the part of Confluence (Atlassian).
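The front-matter block could be generated along these lines. This is only a sketch: `front_matter` and `yaml_scalar` are hypothetical helper names, the file name is assumed to be passed without its extension, and quoting titles that contain YAML-sensitive characters is an added precaution that the template above does not show.

```python
import json


def yaml_scalar(text):
    """Return text as a YAML scalar, quoting it only when that looks necessary."""
    # Assumption: titles containing ':', '#', or quote characters are safer
    # when double-quoted. A JSON string is also a valid YAML double-quoted string.
    if any(ch in text for ch in ':#"\'') or text != text.strip():
        return json.dumps(text, ensure_ascii=False)
    return text


def front_matter(title, file_name, sidebar="lb2_sidebar"):
    """Build the Jekyll front-matter for one converted article."""
    lines = [
        "---",
        f"title: {yaml_scalar(title)}",
        "layout: page",
        "keywords: LoopBack",
        "tags:",      # left blank: the export has no label data
        f"sidebar: {sidebar}",
        f"permalink: /doc/lb2/{file_name}.html",
        "summary:",   # left blank: no consistent summary per article
        "---",
        "",           # ensures the block ends with a newline
    ]
    return "\n".join(lines)
```

Prepending the result of `front_matter(...)` to the converted markdown body gives a file in the layout shown above.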

Article content

The actual article content is in

<div id="main-content" class="wiki-content group">
...
</div>

Everything above and below this, i.e. outside of this tag, can be discarded.
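One way to isolate the article content is to find the opening tag and then count nested div tags until the matching close. The sketch below takes that approach with regular expressions. The function name `extract_main_content` is hypothetical, and the code assumes the id attribute is double-quoted as shown above and that no div tags appear inside comments or scripts.

```python
import re

# Opening tag of the content container, as it appears in the export.
OPEN_MAIN = re.compile(r'<div\b[^>]*\bid="main-content"[^>]*>', re.IGNORECASE)
# Any opening or closing div tag; group 1 is "/" for a closing tag.
DIV_TAG = re.compile(r"<(/?)div\b[^>]*>", re.IGNORECASE)


def extract_main_content(html):
    """Return the inner HTML of <div id="main-content">, or None if absent."""
    start = OPEN_MAIN.search(html)
    if start is None:
        return None
    depth = 1  # we are inside the main-content div
    for tag in DIV_TAG.finditer(html, start.end()):
        depth += -1 if tag.group(1) else 1
        if depth == 0:
            return html[start.end():tag.start()].strip()
    return None  # unbalanced markup: no matching </div> found
```

A full HTML parser would be more robust against unusual markup; the depth counter is used here only to keep the sketch short.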

### Other stuff that should be discarded

Some pages contain the following elements, which should just be discarded.

#### Injected CSS

Discard injected CSS: `<style type='text/css'>/*<![CDATA[*/ .... /*]]>*/</style>`
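Removing the injected CSS can be sketched with a single regular expression. The name `strip_injected_css` is hypothetical, and the pattern removes every style element, which assumes the article content has no style blocks worth keeping.

```python
import re

# Matches a whole <style ...>...</style> element, including the CDATA wrapper.
STYLE_BLOCK = re.compile(r"<style\b[^>]*>.*?</style\s*>", re.IGNORECASE | re.DOTALL)


def strip_injected_css(html):
    """Remove all <style> elements from the exported HTML."""
    return STYLE_BLOCK.sub("", html)
```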

#### Confluence-generated TOC

Since our Jekyll theme has its own [automatically generated TOCs](http://idratherbewriting.com/documentation-theme-jekyll/mydoc_pages.html#automatic-mini-tocs), we should discard this HTML (which occurs only in some pages):
```
...
```

The class selector rbtoc1470354523244 varies by page.
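Because the class name changes from page to page, the TOC has to be matched by pattern rather than by a fixed string. The following sketch rests on two assumptions that are not confirmed here: that the TOC is wrapped in a div, and that its class always has the form `rbtoc` followed by digits. The name `strip_confluence_toc` is hypothetical.

```python
import re

# Assumption: the TOC container is a <div> whose class list includes a token
# such as rbtoc1470354523244 (the digits differ on each page).
TOC_OPEN = re.compile(r"""<div\b[^>]*\bclass=["'][^"']*\brbtoc\d+\b[^>]*>""", re.IGNORECASE)
# Any opening or closing div tag; group 1 is "/" for a closing tag.
DIV_TAG = re.compile(r"<(/?)div\b[^>]*>", re.IGNORECASE)


def strip_confluence_toc(html):
    """Remove the first Confluence-generated TOC div, if the page has one."""
    start = TOC_OPEN.search(html)
    if start is None:
        return html  # most pages have no TOC
    depth = 1
    for tag in DIV_TAG.finditer(html, start.end()):
        depth += -1 if tag.group(1) else 1
        if depth == 0:
            return html[:start.start()] + html[tag.end():]
    return html  # unbalanced markup: leave the page untouched
```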

Links

Images

Macros


Note

I'm assuming we can convert the HTML to markdown without too much trouble, but I'm keeping this here for reference in case we need it.

If it turns out to be easier to export to Word and then convert the Word files to markdown, see "How can doc/docx files be converted to markdown or structured text?".
