How To: Create Cool PDFs with Code with Pandoc

Note: This is my first English post, so sorry for my bad English. I hope this guide is comprehensible. It has only been tested on macOS. If you have suggestions for improvements, please write them in the comments below. Thanks.

Purpose

We will create a Markdown file that contains LaTeX, C code (or code in any other programming language), YAML and some Pandoc-specific syntax. In the end we will get a great-looking PDF file with highlighted code, some sweet equations, etc.

Why?

  • Markdown is awesome.
  • LaTeX is awesome.
  • Pandoc is awesome.

What we’re going to do:

  • Explain our writing environment
  • Install Pandoc & LaTeX
  • Create a Markdown file with C code & LaTeX inside

What we can do (optional)

Create an include.hs script for Pandoc that includes code from files, so the conversion is easier to run from the shell.

Writing Environment

We will use a little bit of YAML, more Markdown and sometimes LaTeX. Using LaTeX in our document is optional, but Pandoc converts the whole document to LaTeX on its way to the PDF anyway.

YAML

For YAML you don’t need to know anything. Just follow the tutorial. The YAML only appears at the beginning of the document, in a block that starts and ends with a line of

---

Between the two lines are a few list entries. Well, that’s all there is to it.

Markdown

Markdown is essential for writing this document. I won’t go into detail, because there are so many great tutorials on the internet.

For headings you write

# Heading 1

## Heading 2

### Heading 3

Links can be added with [name of link](actual url), so you have for example [Pandoc](http://pandoc.org).

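In some cases Pandoc has its own syntax.

Inline images, e.g. ![alt text](path to file)\ (the trailing backslash keeps Pandoc from turning the image into a captioned figure).

More information is in the cheatsheet from Adam Pritchard for ‘normal’ Markdown, and in the Pandoc manual for Pandoc’s Markdown.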

Editor

My preferred editor for basically everything is Atom. You can get it at atom.io. But you can use any editor that supports plain-text editing. No Word, RTF, …

The file extension isn’t really important; I would use .md. Only Pandoc needs to know which format your document is in.

Install Pandoc & LaTeX

The latest installation guide is always available at pandoc.org. There are instructions for other operating systems too.

macOS

  1. The easiest way to install Pandoc on macOS is to use Homebrew: brew install pandoc.
  2. As the LaTeX distribution, BasicTeX is recommended. To install it, type in your preferred terminal: brew install --cask basictex (older Homebrew versions used brew cask install basictex).

Linux

  1. Try to install the package with your package manager. On Ubuntu, Debian & elementary use apt-get install pandoc. For detailed & more install options visit https://pandoc.org/installing.html#linux
  2. For LaTeX we can use TeX Live. On Ubuntu, Debian & elementary type apt-get install texlive (the xelatex engine used later in this guide comes in the texlive-xetex package).

Windows

  1. Install the latest Pandoc package from https://github.com/jgm/pandoc/releases.
  2. Install MiKTeX from https://miktex.org/download.
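
After installing, it can help to check that both tools are actually on your PATH. The following is a minimal sketch for a POSIX shell (macOS, Linux, or a Unix-like shell on Windows); check_tool is a hypothetical helper name, not something that ships with Pandoc or LaTeX.

```shell
#!/bin/sh
# check_tool is a hypothetical helper, not part of Pandoc or LaTeX.
# It reports whether a command can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# xelatex is the LaTeX engine used for the conversion later in this guide.
for tool in pandoc xelatex; do
    check_tool "$tool" || true
done
```

If one of them is reported as missing, open a new terminal first, because a fresh install often only shows up on the PATH in a new session.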

Create a Markdown file

At the beginning of the file we insert some YAML. This YAML block holds metadata about the document. It will not only show up in the document, but will also be added to the PDF file as metadata. It’s optional, but the result is a lot shinier than without it. To separate the YAML metadata from the rest of the content, we create a block that starts and ends with a line of ---. Inside it, each option goes on its own line. We can use title for the document’s title and author for the document’s author. It could look like this:

---
title: Exercise 3 - Computer Science Class
author: Markus Haller
---

As an advanced option we can also add multiple authors:

author:
- Konrad Zuze
- Bill Kates
- Steve Hobs
- Linus Porvalds

and to change the paper margins, just write something like

geometry: margin=1in

So in a complex case it could look like

---
title: Exercise 3 - Computer Science Class
author:
- Konrad Zuze
- Bill Kates
- Steve Hobs
- Linus Porvalds
geometry: margin=1in
---

There are more options like this; just search for them in combination with LaTeX. Now you can write as much Markdown as you want to.
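
If you prefer to start from the shell, the header can be written into a new file in one go. This is only a sketch: exercise.md is a hypothetical file name, and the heading and text below the header are placeholders.

```shell
#!/bin/sh
# exercise.md is a hypothetical file name; pick any name you like.
# The quoted 'EOF' stops the shell from expanding anything inside the block.
cat > exercise.md <<'EOF'
---
title: Exercise 3 - Computer Science Class
author:
- Konrad Zuze
- Bill Kates
geometry: margin=1in
---

# Task 1

Some Markdown text.
EOF

# Print the first line to confirm the file starts with the YAML delimiter.
head -n 1 exercise.md
```

The important detail is that the first --- is the very first line of the file, otherwise Pandoc does not treat the block as the document’s metadata header.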

Insert Programming Code

Markdown supports code, but it only formats it differently from normal text. To let Pandoc highlight code for you, you need to declare it a little differently than in plain Markdown: you write a fenced code block.

For generic code you write ~~~ /* Code */ ~~~, with the opening and the closing ~~~ each on a line of their own. For C code you write ~~~ {.c} /* Code */ ~~~. If you want numbered lines, just add .numberLines, so for C code you have ~~~ {.c .numberLines} /* Code */ ~~~. Nice, isn’t it? To see all the languages Pandoc can highlight, type pandoc --list-highlight-languages in your shell. More code attributes are described at pandoc.org (search for fenced_code_attributes).
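
To see the fence layout written out line by line, here is a small sketch that writes a numbered C block into a file. The file name code-demo.md and the C program are made up for illustration.

```shell
#!/bin/sh
# code-demo.md is a hypothetical file name.
# The quoted 'EOF' keeps the backslash in "\n" exactly as typed.
cat > code-demo.md <<'EOF'
~~~ {.c .numberLines}
#include <stdio.h>

int main(void) {
    printf("Hello, Pandoc!\n");
    return 0;
}
~~~
EOF

# If Pandoc is installed, check that "c" is among its highlight languages.
if command -v pandoc >/dev/null 2>&1; then
    if pandoc --list-highlight-languages | grep -qx 'c'; then
        echo "Pandoc can highlight C"
    fi
fi
```

The opening fence carries the attributes in curly braces, and the closing fence is a bare ~~~ on its own line.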

Insert LaTeX in the Document (optional)

For inline LaTeX write one $ at the beginning and one at the end of the expression. For display math on its own line write two ($$). Now you can write something like $a_{1}, ..., a_{n} \in \mathbb{N}$ to get a₁, …, aₙ ∈ ℕ.
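
As a small sketch of both forms side by side in a Markdown file (the sum is just an illustration, not taken from a real exercise):

```latex
The numbers $a_{1}, \dots, a_{n} \in \mathbb{N}$ have the sum

$$ s = \sum_{i=1}^{n} a_{i} $$
```

The first line stays inside the running text, while the formula between the double dollar signs is set centered on a line of its own.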

Conversion to PDF

In your preferred shell type pandoc -o output.pdf input.md --pdf-engine=xelatex (Pandoc versions before 2.0 call this option --latex-engine). That’s it. If you run into any problems, DuckDuckGo knows some answers, and pandoc.org is really helpful too. For mathematical expressions in LaTeX (also sweet arrows, subscripted characters, etc.) visit https://en.wikibooks.org/wiki/LaTeX/Mathematics
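
Pandoc 2.0 renamed the --latex-engine option to --pdf-engine, so which spelling works depends on the installed version. The sketch below picks the option name from the version number; engine_flag_for_major is a hypothetical helper, and input.md and output.pdf are the placeholder names from the command above.

```shell
#!/bin/sh
# engine_flag_for_major is a hypothetical helper: it maps Pandoc's major
# version number to the name of the option that selects the LaTeX engine.
engine_flag_for_major() {
    if [ "$1" -ge 2 ]; then
        echo "--pdf-engine"      # Pandoc 2.0 and newer
    else
        echo "--latex-engine"    # Pandoc 1.x
    fi
}

input="input.md"
output="output.pdf"

# Only convert when Pandoc is installed and the input file exists.
if command -v pandoc >/dev/null 2>&1 && [ -f "$input" ]; then
    # The first line of `pandoc --version` has the form "pandoc <version>".
    major=$(pandoc --version | head -n 1 | awk '{print $2}' | cut -d. -f1)
    pandoc -o "$output" "$input" "$(engine_flag_for_major "$major")=xelatex" \
        || echo "conversion failed"
else
    echo "pandoc or $input not found, nothing converted"
fi
```

If you only ever use one Pandoc version, typing the matching option directly is of course simpler.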

How to Include Code from Files (optional)

Install Haskell

We can extend Pandoc with filters, small scripts that modify the document during the conversion. You can write them in different languages, but we use Haskell. We need the whole Haskell Platform. For proper instructions for your operating system, have a look at the Haskell website: https://www.haskell.org/platform.

Install Haskell Core if you have the choice.

Then execute cabal update and then cabal install pandoc.

Windows

Download the latest minimal installer & install it.

macOS

My preferred way to do this on a Mac is to use Homebrew Cask: brew cask install haskell-platform (newer Homebrew versions spell this brew install --cask haskell-platform).

Linux

Use your package manager to install it. For more information head to the Haskell website.

Create Include Script

We use the script from the pandoc.org website to include files in the document.

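#!/usr/bin/env runhaskell
-- include.hs: replace the body of a code block with the contents of a file
{-# LANGUAGE OverloadedStrings #-}
import Text.Pandoc.JSON
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- A code block with an include="path" attribute gets the file's contents
-- as its body; every other block passes through unchanged.
-- (pandoc-types 1.20 and newer use Text, not String, for attributes.)
doInclude :: Block -> IO Block
doInclude cb@(CodeBlock (id, classes, namevals) _) =
  case lookup "include" namevals of
       Just f     -> CodeBlock (id, classes, namevals) <$> TIO.readFile (T.unpack f)
       Nothing    -> return cb
doInclude x = return x

main :: IO ()
main = toJSONFilter doInclude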

Save this script as a file in the same folder as your document and name it something like include.hs.

To insert a code file in your document, write in your document:

~~~ {include="calcRoot.c"}

~~~

Create Document with Code as Files

Now you can run the script when using Pandoc, like this: pandoc -o myDocument.pdf myDocument.md --filter include.hs.

You may have to add the parameter --pdf-engine=xelatex (--latex-engine=xelatex before Pandoc 2.0).

If there is an error like Failed to load interface for 'Text.Pandoc.JSON', you have to execute cabal update and then cabal install pandoc.
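
To avoid retyping the whole command, the conversion can be wrapped in a small shell script. This is a sketch under a few assumptions: pdf_name_for and build_pdf are hypothetical names, include.hs sits in the current folder, and your Pandoc is version 2.0 or newer (so the option is called --pdf-engine).

```shell
#!/bin/sh
# pdf_name_for and build_pdf are hypothetical helper names.

# Derive the output name by swapping a trailing .md for .pdf.
pdf_name_for() {
    printf '%s\n' "${1%.md}.pdf"
}

# Convert one Markdown file, running the include.hs filter on the way.
# Assumes Pandoc 2.0 or newer; older versions need --latex-engine instead.
build_pdf() {
    src="$1"
    out=$(pdf_name_for "$src")
    pandoc -o "$out" "$src" --filter include.hs --pdf-engine=xelatex
}

# Only convert when a file name was passed on the command line.
if [ "$#" -gt 0 ]; then
    build_pdf "$1"
fi
```

Saved as makepdf.sh (again a made-up name), it would be called as sh makepdf.sh myDocument.md and write myDocument.pdf next to the source file.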