TLDR: I made a tutorial: Just enough Python for R users to write advanced Snakemake workflows. The original intended audience is the Schloss Lab, but anyone with basic Snakemake experience may find it useful.
For almost everyone in the Schloss Lab, R is their primary programming language. I can’t speak for Pat, but my understanding of the reason is:
- Pat knows R.
- If you write your code in R, Pat and other lab members can help you with bugs in your code.2
When I first joined the lab, Python was my main programming language. I had also taken some computer science classes in C++ and dabbled in R for statistics classes, but I didn’t know R well and hadn’t even heard of the tidyverse. Fortunately, picking up R & the tidyverse packages was a breeze thanks to Pat’s learning resources, other lab members leading code clubs that showcased their cool tips & tricks, and a vibrant #analysis-questions
channel in the lab Slack workspace.
So I started using the tidyverse in R for all my data analysis code, but I kept on using Snakemake, a Python-based tool for writing reproducible workflows. Fortunately, Pat either didn’t mind, or didn’t mind enough to tell me to use Make instead (phew 😅).
Fast-forward several years: not only does everyone in the lab use Snakemake, Pat even made a video teaching intro to Snakemake!
Pat has joined the snake people! 🎉 https://t.co/7HgaWHrMTp
— Kelly Sovacool (@kelly_sovacool) September 16, 2022
It’s possible to write good, simple Snakemake workflows with very little knowledge of Python. However, once you begin to take your workflows to an intermediate or advanced level, learning some Python is very helpful. I would even say learning Python is required if you want to understand advanced Snakemake tricks.
So it was high time to make the code club I’ve been secretly dying to make since day 0: Just enough Python for R users to write advanced Snakemake workflows. We covered the content in two 1-hour lab meetings over the last couple weeks. The tutorial covers enough Python to grok these Snakemake concepts:
- f-Strings for human-readable string formatting
- Config files as dictionaries
- Snakemake’s expand() function
- Lambda functions to define input files based on wildcards
Although the original intended audience is the Schloss Lab, anyone with basic Snakemake experience but limited Python knowledge may find this useful.
It was hard to stop myself from turning this code club into a brain-dump of everything I know and like about Python. There were a lot of topics I wished I had time to cover. I may eventually make a follow-up tutorial, “Just a little more Python”. Stay tuned!
Footnotes
The Python logo by the Python Software Foundation is licensed under GNU GPL v2. The Snakemake logo by Johannes Köster is licensed under CC BY-SA 4.0. The R logo by Hadley Wickham and others at RStudio is licensed under CC BY-SA 4.0.↩︎
Pat on “Why R?”: https://riffomonas.org/minimalR/01_introduction.html#why-r↩︎