{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## A brief introduction to `pandoc`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`pandoc` is a handy tool for bioinformaticians because it converts between many document formats. For example, you can turn a plain text file into a Word document or a PDF file without leaving the command line. This is especially useful in work settings where you need to generate documents or reports on the fly as part of a bioinformatics workflow or pipeline.\n", "\n", "In assignment 1, I asked you to generate a PDF document from a Markdown text file. First, you need to install `pandoc` through `conda`. The command below lists the versions that are available. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading channels: ...working... done\n", "# Name Version Build Channel \n", "pandoc 1.16.0.2 0 conda-forge \n", "pandoc 1.17.0.1 0 conda-forge \n", "pandoc 1.17.0.2 0 conda-forge \n", "pandoc 1.17.1 0 conda-forge \n", "pandoc 1.17.2 0 conda-forge \n", "pandoc 1.18 0 conda-forge \n", "pandoc 1.19 0 conda-forge \n", "pandoc 1.19.1 0 conda-forge \n", "pandoc 1.19.2 0 conda-forge \n", "pandoc 1.19.2.1 ha5e8f32_1 pkgs/main \n", "pandoc 2.0.0.1 0 conda-forge \n", "pandoc 2.0.0.1 1 conda-forge \n", "pandoc 2.0.3 0 conda-forge \n", "pandoc 2.0.4 0 conda-forge \n", "pandoc 2.0.5 0 conda-forge \n", "pandoc 2.1 0 conda-forge \n", "pandoc 2.1.1 0 conda-forge \n", "pandoc 2.1.2 0 conda-forge \n", "pandoc 2.1.3 0 conda-forge \n", "pandoc 2.2 hde52d81_0 conda-forge \n", "pandoc 2.2.1 h1a437c5_0 pkgs/main \n", "pandoc 2.2.1 hde52d81_0 conda-forge \n", "pandoc 2.2.2 hde52d81_0 conda-forge \n", "pandoc 2.2.2 hde52d81_1 conda-forge \n", "pandoc 2.2.3.2 0 pkgs/main \n", "pandoc 2.3 0 conda-forge \n", "pandoc 2.3.1 0 conda-forge \n", "pandoc 2.4 0 conda-forge \n", "pandoc 2.5 0 
conda-forge \n", "pandoc 2.5 1 conda-forge \n", "pandoc 2.6 0 conda-forge \n", "pandoc 2.6 1 conda-forge \n", "pandoc 2.7.1 0 conda-forge \n", "pandoc 2.7.2 0 conda-forge \n", "pandoc 2.7.3 0 conda-forge \n", "pandoc 2.8 0 conda-forge \n", "pandoc 2.8.0.1 0 conda-forge \n", "pandoc 2.8.1 0 conda-forge \n", "pandoc 2.9 0 conda-forge \n", "pandoc 2.9.1 0 conda-forge \n", "pandoc 2.9.1.1 0 conda-forge \n", "pandoc 2.9.2 0 conda-forge \n", "pandoc 2.9.2.1 0 conda-forge \n", "pandoc 2.9.2.1 0 pkgs/main \n", "pandoc 2.10 0 conda-forge \n", "pandoc 2.10 0 pkgs/main \n", "pandoc 2.10 h1de35cc_0 conda-forge \n", "pandoc 2.10.1 0 pkgs/main \n", "pandoc 2.10.1 haf1e3a3_0 conda-forge \n" ] } ], "source": [ "%%bash\n", "conda search pandoc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To install `pandoc`, type `conda install pandoc`, which installs the latest available version." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To convert the assignment's Markdown text file into a PDF file, I gave the following example commands to type in your terminal:\n", "\n", "```bash\n", "## For Mac\n", "pandoc assignment_01.md \\\n", " -V geometry:margin=1in \\\n", " -V fontsize:11pt \\\n", " --variable mainfont=\"PT Serif\" \\\n", " --variable sansfont=\"Arial\" \\\n", " --variable monofont=\"Menlo\" \\\n", " --pdf-engine=xelatex \\\n", " -o assignment_01.pdf\n", " \n", "## For Windows/WSL\n", "pandoc assignment_01.md \\\n", " -V geometry:margin=1in \\\n", " -V fontsize:11pt \\\n", " --variable mainfont=\"Liberation Serif\" \\\n", " --variable sansfont=\"Liberation Sans\" \\\n", " --variable monofont=\"Liberation Mono\" \\\n", " --pdf-engine=xelatex \\\n", " -o assignment_01.pdf\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Commands split across several lines like this may look confusing at first, but don't be afraid! All of the parameters can also be written on a single line. 
However, a single line is usually very long and may be hard to fit in your terminal window. Therefore, I type \"\\\\\" (backslash) at the end of each line to tell the shell (the program that interprets your commands) that the command is not over yet and continues on the next line. The backslash must be the very last character on the line for this to work. The backslash is a special character in the Unix environment and is the mirror image of \"/\" (slash). Programmers use it as an \"escape\" character that changes the meaning of the character that follows it. In this example, it escapes the end of the line, so the command continues on the next line.\n", "\n", "If you type \"\\\\n\", you are specifying a line break. For example, running `echo -e \"My\\nName\"` in your terminal prints:\n", "```bash\n", "My\n", "Name\n", "```\n", "\n", "If you type \"\\\\t\" in an `awk` command, you are referring to a \"tab\" character (the one produced by the Tab key, above Caps Lock on your keyboard).\n", "\n", "Coming back to the `pandoc` command examples, I specified font variables such as `--variable sansfont=\"Arial\"` to tell `pandoc` which font to use for sans-serif text in the generated document. This confuses many people, and it may not work if these fonts are not installed on your computer. You can omit the lines specifying these fonts, and `pandoc` will fall back to the default fonts of its LaTeX template. \n", "\n", "However, the parameter `--pdf-engine=xelatex` will only work if `LaTeX` is installed on your computer. If it is not installed, you will need to install the required packages through `conda`.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading channels: ...working... 
done\n", "# Name Version Build Channel \n", "texlive-core 20160520 pl5.20.3.1_1 conda-forge \n", "texlive-core 20160523b pl5.20.3.1_0 conda-forge \n", "texlive-core 20160523b pl5.20.3.1_1 conda-forge \n", "texlive-core 20160523b pl5.20.3_3 conda-forge \n", "texlive-core 20170520 pl5.22.2.1_0 conda-forge \n", "texlive-core 20170520 pl5.22.2.1_1 conda-forge \n", "texlive-core 20170520 pl5.22.2.1_2 conda-forge \n", "texlive-core 20170520 pl526h2f74ec9_2 pkgs/main \n", "texlive-core 20170520 pl526h47ed19a_1 pkgs/main \n", "texlive-core 20170520 pl526ha3510ec_1 pkgs/main \n", "texlive-core 20170520 pl526hc2f8f47_1 pkgs/main \n", "texlive-core 20180414 ha09c46f_0 pkgs/main \n", "texlive-core 20180414 pl526h0778769_1 conda-forge \n", "texlive-core 20180414 pl526h6632d02_1 conda-forge \n", "texlive-core 20180414 pl526hd51217d_2 conda-forge \n", "texlive-core 20180414 pl526hd51217d_3 conda-forge \n", "texlive-core 20180414 pl526hfbb4d6c_0 conda-forge \n" ] } ], "source": [ "%%bash\n", "conda search texlive-core" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This package should be available through `conda` on both Ubuntu and Mac. Go ahead and install it by typing `conda install texlive-core`, then try the `pandoc` commands again. Try this first:\n", "\n", "```bash\n", "pandoc assignment_01.md \\\n", " -V geometry:margin=1in \\\n", " -V fontsize:11pt \\\n", " --pdf-engine=xelatex \\\n", " -o assignment_01.pdf\n", "```\n", "\n", "If this fails to produce a PDF file or you get error messages, try the `pdflatex` engine instead:\n", "\n", "```bash\n", "pandoc assignment_01.md \\\n", " -V geometry:margin=1in \\\n", " -V fontsize:11pt \\\n", " --pdf-engine=pdflatex \\\n", " -o assignment_01.pdf\n", "```\n", "\n", "Hopefully, you will get a PDF file after this command. Now, I will show you another example that generates a Word document using `pandoc`. 
Type\n", "\n", "```bash\n", "pandoc assignment_01.md \\\n", " -V geometry:margin=1in \\\n", " -V fontsize:11pt \\\n", " -o assignment_01.docx\n", "```\n", "\n", "Now, you have converted your assignment Markdown file into a Word document that can be opened with Microsoft Word. Note that the `geometry` and `fontsize` variables target LaTeX-based PDF output, so they should have no effect on the Word document. These are just two examples of what you can do with `pandoc`. To see what else it can do, visit the `pandoc` website: \n", "\n", "https://pandoc.org/\n", "\n", "The possibilities are enormous. You can even convert your Jupyter notebook into other formats." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Alternative ways to convert Markdown files to PDF" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you run into problems installing `LaTeX` and the related tools that `pandoc` needs, you can still use a different tool to generate a PDF file. See here: https://superuser.com/questions/689056/how-can-i-convert-github-flavored-markdown-to-a-pdf\n", "\n", "Basically, you install a tool known as `grip`, which renders your Markdown file in your web browser; you then print the page and save it as a PDF file (Chrome works best for this). To install `grip`, type:\n", "\n", "```bash\n", "pip install grip\n", "```\n", "\n", "Then, in your terminal, type: \n", "\n", "```bash\n", "grip assignment_01.md\n", "```\n", "\n", "This will print a URL in your terminal. In my case, it's http://localhost:6419/\n", "\n", "Copy and paste this address into your Chrome browser and you will see the rendered page. Next, print it (choosing to save it as a PDF) and you get a PDF file. I will accept this as an alternative way to generate a PDF file for your assignments. To stop the `grip` tool, press Control (CTRL key) + C together."
] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.6" } }, "nbformat": 4, "nbformat_minor": 4 }