Rχiv-Maker: an automated template engine for streamlined scientific publications

Bruno M. Saraiva1, Guillaume Jacquemet2,3,4, and Ricardo Henriques1,5

1 Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Oeiras, Portugal
2 Faculty of Science and Engineering, Cell Biology, Åbo Akademi University, Turku, Finland
3 InFLAMES Research Flagship Center, University of Turku, Turku, Finland
4 Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
5 UCL Laboratory for Molecular Cell Biology, University College London, London, United Kingdom

arXiv:2508.00836v3 [cs.DL] 13 Aug 2025

Preprint servers have become central to research communication, but authors still struggle with manuscript preparation and typesetting. Rxiv-Maker converts Markdown documents to publication-ready PDFs through automated LaTeX processing. Researchers can focus on content while the system handles formatting and typesetting without requiring LaTeX knowledge. The tool supports version control and collaborative editing workflows common in modern research teams. Python and R scripts execute during compilation to generate figures directly from data, keeping visualisations synchronised with analyses. Docker containerisation and automated build systems provide consistent results across different computing environments. Mathematical notation, citations, and cross-references are processed automatically during conversion. This manuscript was prepared using Rxiv-Maker.

article template | scientific publishing | preprints

Correspondence: (B. M. Saraiva) bsaraiva@itqb.unl.pt; (G. Jacquemet) guillaume.jacquemet@abo.fi; (R. Henriques) ricardo.henriques@itqb.unl.pt

Main

Preprint servers like arXiv, bioRxiv, and medRxiv have become central to research communication (1–3). As submission rates climb (Fig. S1, Fig. S2), researchers now handle tasks once managed by journal production teams (4, 5).
Most manuscript preparation workflows use proprietary formats that work poorly with version control systems, making collaborative research more difficult (6). Computational research faces particular challenges because algorithms, analysis methods, and processing pipelines change frequently. In computational biology, researchers struggle to keep manuscripts synchronised with evolving analysis code, leading to publications that do not accurately describe the methods used. Bioimage analysis shows these problems clearly: collaborative frameworks (7) and containerised analysis environments (8) highlight how important reproducible computational workflows are for scientific publishing.

Rxiv-Maker helps address these challenges by providing a developer-centric framework for reproducible preprint preparation. It generates publication-standard PDFs through automated LaTeX processing and works directly with Git workflows and continuous integration practices. Built-in reproducibility features ensure manuscripts build consistently across different systems and over time. Manuscript preparation becomes a transparent process that gives researchers access to professional typesetting tools. A Visual Studio Code extension provides syntax highlighting and automated citation management. Researchers can leverage familiar development environments while maintaining rigorous version control and reproducibility guarantees. This bridges traditional authoring workflows with contemporary best practices in computational research.

Fig. 1. The Rxiv-Maker System Diagram. The system integrates Markdown content, YAML metadata, Python and R scripts, and bibliography files through a processing engine. This engine leverages GitHub Actions, virtual environments, and LaTeX to produce a publication-ready scientific article, demonstrating a fully automated and reproducible pipeline.
The framework enables programmatic generation of figures and tables using Python and R scripting with visualisation libraries including Matplotlib (9) and Seaborn (10). Figures can be generated directly from source datasets during compilation, establishing transparent connections between raw data, processing pipelines, and final visualisations. This executable manuscript approach eliminates the manual copy-and-paste workflow that traditionally introduces errors when transferring results between analysis and documentation (11). When datasets are updated or algorithms refined, affected figures are automatically regenerated, ensuring consistency and eliminating outdated visualisations. The system integrates Mermaid.js (12) for generating technical diagrams from text-based syntax, with the complete range of supported methods detailed in Table S1. This approach reframes manuscripts as executable outputs of the research process rather than static documentation. Built upon the HenriquesLab bioRxiv template (13), Rxiv-Maker extends capabilities through automated processing pipelines. The architecture, detailed in Fig. 1 and Fig. 2, provides automated build processes through GitHub Actions and virtual environments, with technical details described in Supp. Note 1.

rxiv-maker | Saraiva et al. | August 14, 2025 | 1–12

Fig. 2. Rxiv-Maker Workflow: User Input vs. Automated Processing. The framework clearly separates user responsibilities (content creation and configuration) from automated processes (parsing, conversion, compilation, and output generation). Users only need to write content and set preferences, while the system handles all technical aspects of manuscript preparation automatically, ensuring a streamlined workflow from Markdown input to publication-ready PDF output.

Academic authors use various tools depending on their research needs and technical requirements.
Traditional LaTeX environments like Overleaf democratise professional typesetting through accessible web interfaces, but struggle with version control and computational content integration. Multi-format publishing platforms including Quarto, R Markdown, and Bookdown excel at producing multiple output formats with statistical integration, though they introduce complexity for simple documents and variable LaTeX typesetting quality. Collaborative writing frameworks such as Manubot enable transparent, version-controlled scholarly communication with automated citation management (14), yet offer limited computational reproducibility features. Web-first computational systems like MyST and Jupyter Book prioritise interactive content and browser-native experiences, but compromise PDF output quality and offline accessibility. Modern typesetting engines like Typst provide cleaner syntax and faster compilation, though ecosystem maturity and adoption remain barriers. Rxiv-Maker occupies a specialised niche at the intersection of developer workflows, academic publishing, and computational reproducibility. This developer-centric approach requires technical setup but delivers automated, reproducible PDF preprint generation particularly suited to computational research where datasets evolve and algorithmic documentation is essential. The framework trades initial complexity for long-term automation benefits, enabling deeper specialisation for manuscripts involving dynamic content and processing pipelines. A comprehensive comparison is provided in Table S2. Rxiv-Maker simplifies manuscript creation by building reproducibility directly into the writing process. Writers work in familiar Markdown, which the system converts to LaTeX and compiles into publication-ready PDFs with proper formatting, pagination, and high-quality figures. 
Docker containerisation addresses computational reproducibility by encapsulating the complete environment (LaTeX distributions, Python libraries, R packages, and system dependencies) within immutable container images. GitHub Actions workflows leverage pre-compiled Docker images for standardised compilation processes, reducing build times from 8–10 minutes to approximately 2 minutes. The Docker engine mode enables researchers to generate PDFs with only Docker and Python as prerequisites. This is valuable for collaborative research across platforms or institutional settings with software restrictions (15). The system automatically saves all generated files, creating a complete record from source materials to the finished document.

For users who want immediate feedback, we provide Google Colab notebook deployment that compiles documents in real time while preserving reproducibility. We have also developed a Docker-based version using udocker (16) that cuts setup time from about 20 minutes to about 4 minutes. It runs in pre-configured containers with all dependencies already installed, eliminating manual setup and ensuring consistency between Colab sessions. Available deployment strategies are compared in Table S3.

When working with figures, the system handles both static images and dynamic content. Authors place Python or R scripts in designated folders, and Rxiv-Maker executes them during compilation, pulling in data, running analyses, and generating visualisations that appear in the final PDF (17). It also renders Mermaid.js diagrams from Markdown into SVG images. This approach makes manuscripts complete, verifiable records of research where readers can trace every figure and result back to its source code and data.

The Visual Studio Code extension provides editing features including real-time syntax highlighting, autocompletion for bibliographic citations from BibTeX files, and cross-reference management.
The extension reduces cognitive load and minimises syntax errors while maintaining consistent formatting.

Rxiv-Maker combines plain-text authoring with automated build environments to address consistency and reproducibility challenges in scientific publishing. Following literate programming principles (18), it creates documents that blend narrative text with executable code while hiding typesetting complexity. Git integration provides transparent attribution, conflict-free merging, and complete revision histories (19, 20), supporting collaborative practices needed for open science.

Preprint servers have transferred quality control and typesetting responsibilities from journals to individual authors, creating both opportunities and challenges for scientific communication. Rxiv-Maker provides automated safeguards that help researchers produce publication-quality work without extensive typesetting knowledge, making professional publishing tools available through GitHub-based infrastructure. The focus on PDF output via LaTeX optimises preprint workflows for scientific publishing requirements. We plan to extend format support by integrating universal converters such as Pandoc (21), while preserving typographic control and reproducibility standards. The Visual Studio Code extension addresses adoption barriers by providing familiar development environments that bridge text editing with version control workflows. Future development will prioritise deeper integration with computational environments and quality assessment tools, building upon established collaborative frameworks (7) and containerised approaches that enhance reproducibility (8).

The system supports scientific publishing through organised project structure separating content, configuration, and computational elements. All manuscript content, metadata, and bibliographic references are version-controlled, ensuring transparency.
The Markdown-to-LaTeX conversion pipeline handles complex academic syntax including figures, tables, citations, and mathematical expressions while preserving semantic meaning and typographical quality. The system uses a multi-pass approach that protects literal content during transformation, ensuring intricate scientific expressions render accurately. The framework supports subscript and superscript notation essential for chemical formulas, allowing expressions such as H$_2$O, CO$_2$, Ca$^{2+}$, SO$_4^{2-}$, and $E = mc^2$, as well as temperature notation like 25°C.

The system’s mathematical typesetting capabilities extend to numbered equations, which are essential for scientific manuscripts. For instance, the fundamental equation relating mass and energy can be expressed as:

$$E = mc^2 \tag{1}$$

The framework also supports more complex mathematical formulations, such as the standard deviation calculation commonly used in data analysis:

$$\sigma = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2} \tag{2}$$

Additionally, the system handles chemical equilibrium expressions, which are crucial in biochemical and chemical research:

$$K_{eq} = \frac{[\text{Products}]}{[\text{Reactants}]} = \frac{[\text{Ca}^{2+}][\text{SO}_4^{2-}]}{[\text{CaSO}_4]} \tag{3}$$

These numbered equations (Eq. (1), Eq. (2), and Eq. (3)) demonstrate the framework’s capability to handle diverse mathematical notation while maintaining proper cross-referencing throughout the manuscript. This functionality ensures that complex scientific concepts can be presented with the precision and clarity required for academic publication.

Rxiv-Maker is optimised for reproducible PDF preprint generation within the scientific authoring ecosystem. While platforms such as Overleaf and Quarto offer multi-format capabilities, Rxiv-Maker provides focused, developer-centric workflows that integrate with version control and automated build environments.
The framework provides practical training in version control, automated workflows, and computational reproducibility, which are skills fundamental to modern scientific practice. Researchers learn technical skills including Git proficiency, Markdown authoring, continuous integration, and containerised environments. The system is designed to be accessible without extensive programming backgrounds, featuring comprehensive documentation and intuitive workflows that reduce barriers and foster skill development.

The technical architecture addresses computational constraints of cloud-based build systems through intelligent caching and selective content regeneration. The framework supports high-resolution graphics and advanced figure layouts while maintaining optimal document organisation and cross-referencing functionality.

Computational research faces a growing disconnect between advanced analytical methods and traditional publishing workflows. Rxiv-Maker addresses this by treating manuscripts as executable code rather than static documents, bringing collaborative development practices from software engineering to scientific communication. This enables transparent, verifiable publications suitable for both immediate sharing and long-term preservation. The framework’s impact extends beyond technical capabilities to foster a culture of computational literacy and transparent science. As preprint servers continue to reshape academic publishing, tools like Rxiv-Maker become essential infrastructure for maintaining quality and reproducibility in researcher-led publication processes. The framework serves as both a practical solution for immediate publishing needs and a foundation for advancing open science principles across diverse research domains.

ABOUT THIS MANUSCRIPT
This work is licensed under CC BY 4.0.

DATA AVAILABILITY
arXiv monthly submission data used in this article is available at https://arxiv.org/stats/monthly_submissions.
Preprint submissions data across different hosting platforms is available at https://github.com/esperr/pubmed-by-year. The source code and data for the figures in this article are available at https://github.com/HenriquesLab/rxiv-maker.

CODE AVAILABILITY
The Rxiv-Maker computational framework is available at https://github.com/HenriquesLab/rxiv-maker. The framework includes comprehensive documentation, example manuscripts, and automated testing suites to ensure reliability across different deployment environments. Additionally, the Visual Studio Code extension for Rxiv-Maker is available at https://github.com/HenriquesLab/vscode-rxiv-maker, providing researchers with an integrated development environment that includes syntax highlighting, intelligent autocompletion for citations and cross-references, schema validation for configuration files, and seamless integration with the main framework’s build processes. All source code is under an MIT License, enabling free use, modification, and distribution for both academic and commercial applications.

AUTHOR CONTRIBUTIONS
Bruno M. Saraiva, Guillaume Jacquemet, and Ricardo Henriques conceived the project and designed the framework. All authors contributed to writing and reviewing the manuscript.

ACKNOWLEDGEMENTS
The authors thank Jeffrey Perkel for feedback that helped improve the manuscript. B.S. and R.H. acknowledge support from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 101001332) (to R.H.) and funding from the European Union through the Horizon Europe program (AI4LIFE project with grant agreement 101057970-AI4LIFE and RT-SuperES project with grant agreement 101099654-RTSuperES to R.H.). Funded by the European Union. However, the views and opinions expressed are those of the authors only and do not necessarily reflect those of the European Union.
Neither the European Union nor the granting authority can be held responsible for them. This work was also supported by a European Molecular Biology Organization (EMBO) installation grant (EMBO-2020-IG-4734 to R.H.), a Chan Zuckerberg Initiative Visual Proteomics Grant (vpi-0000000044 with https://doi.org/10.37921/743590vtudfp to R.H.), and a Chan Zuckerberg Initiative Essential Open Source Software for Science (EOSS6-0000000260). This study was supported by the Academy of Finland (no. 338537 to G.J.), the Sigrid Juselius Foundation (to G.J.), the Cancer Society of Finland (Syöpäjärjestöt, to G.J.), and the Solutions for Health strategic funding to Åbo Akademi University (to G.J.). This research was supported by the InFLAMES Flagship Program of the Academy of Finland (decision no. 337531).

EXTENDED AUTHOR INFORMATION
• Bruno M. Saraiva: ORCID 0000-0002-9151-5477; Twitter/X: Bruno_MSaraiva; LinkedIn: bruno-saraiva
• Guillaume Jacquemet: ORCID 0000-0002-9286-920X; Twitter/X: guijacquemet; Bluesky: guijacquemet.bsky.social
• Ricardo Henriques: ORCID 0000-0002-2043-5234; LinkedIn: ricardo-henriques; Twitter/X: HenriquesLab; Bluesky: henriqueslab.bsky.social

Bibliography
1. Jeffrey Beck, Christine A Ferguson, Kathryn Funk, Brooks Hanson, Melissa Harrison, Michele Ide-Smith, Rachael Lammey, Maria Levchenko, Alex Mendonca, Michael Parkin, Naomi Penfold, Nicole Pfeiffer, Jessica Polka, Iratxe Puebla, Oya Y Rieger, Martyn Rittman, Richard Sever, and Sowmya Swaminathan. Building trust in preprints: recommendations for servers and other stakeholders, 2020.
2. Mariia Levchenko, Michael Parkin, Johanna McEntyre, and Melissa Harrison. Enabling preprint discovery, evaluation, and analysis with europe pmc, 2024.
3. Nicholas Fraser, Fakhri Momeni, Philipp Mayr, and Isabella Peters. The relationship between biorxiv preprints, citations and altmetrics. Quantitative Science Studies, 1(2):618–638, 2020. doi: 10.1162/qss_a_00043.
4. Ronald D Vale. Accelerating scientific publication in biology.
Proceedings of the National Academy of Sciences, 112(44):13439–13446, 2015. doi: 10.1073/pnas.1511912112.
5. Jonathan P Tennant, Francois Waldner, Damien C Jacques, Paola Masuzzo, Lauren B Collister, and Chris HJ Hartgerink. The academic, economic and societal impacts of open access: an evidence-based review. F1000Research, 5:632, 2016. doi: 10.12688/f1000research.8460.3.
6. Jialiang Lin, Yao Yu, Yu Zhou, Zhiyang Zhou, and Xiaodong Shi. How many preprints have actually been printed and why: a case study of computer science preprints on arxiv. Scientometrics, 124(1):555–574, 2020. doi: 10.1007/s11192-020-03430-8.
7. Ulysse Rubens, Romain Mormont, Lassi Paavolainen, Volker Bäcker, Benjamin Pavie, et al. Biaflows: A collaborative framework to reproducibly deploy and benchmark bioimage analysis workflows. Patterns, 1(3):100040, 2020. doi: 10.1016/j.patter.2020.100040.
8. Ivan Hidalgo-Cenalmor, Joanna W Pylvänäinen, Mariana G Ferreira, et al. Dl4miceverywhere: deep learning for microscopy made flexible, shareable and reproducible. Nature Methods, 21(9):1645–1656, 2024. doi: 10.1038/s41592-024-02295-6.
9. John D Hunter. Matplotlib: A 2d graphics environment. Computing in Science & Engineering, 9(3):90–95, 2007. doi: 10.1109/MCSE.2007.55.
10. Michael L Waskom. seaborn: statistical data visualization. Journal of Open Source Software, 6(60):3021, 2021. doi: 10.21105/joss.03021.
11. Jeffrey M. Perkel. Cut the tyranny of copy-and-paste with these coding tools. Nature, 603(7899):191–192, 2022. doi: 10.1038/d41586-022-00563-z.
12. Mermaid Team. Mermaid: Generation of diagrams and flowcharts from text in a similar manner as markdown, 2023. Accessed: 2024-12-01.
13. Ricardo Henriques. Henriques biorxiv template, 2015. Overleaf LaTeX template. Accessed: 2025-06-16.
14. Daniel S. Himmelstein, Vincent Rubinetti, David R. Slochower, Dongbo Hu, Venkat S. Malladi, Casey S. Greene, and Anthony Gitter. Open collaborative writing with manubot.
PLOS Computational Biology, 15(6):e1007128, 2019. doi: 10.1371/journal.pcbi.1007128.
15. Carl Boettiger. An introduction to docker for reproducible research. ACM SIGOPS Operating Systems Review, 49(1):71–79, 2015. doi: 10.1145/2723872.2723882.
16. Jorge Gomes, Emanuele Bagnaschi, Isabel Campos, Mario David, Luís Alves, João Martins, João Pina, Alvaro López-García, and Pablo Orviz. Enabling rootless linux containers in multi-user environments: The udocker tool. Computer Physics Communications, 232:84–97, 2018. doi: 10.1016/j.cpc.2018.05.021.
17. Thomas Kluyver, Benjamin Ragan-Kelley, Fernando Pérez, Brian E Granger, Matthias Bussonnier, Jonathan Frederic, Kyle Kelley, Jessica B Hamrick, Jason Grout, Sylvain Corlay, et al. Jupyter notebooks–a publishing format for reproducible computational workflows. In Positioning and Power in Academic Publishing: Players, Agents and Agendas, pages 87–90. IOS Press, 2016. doi: 10.3233/978-1-61499-649-1-87.
18. Donald E Knuth. Literate programming. The Computer Journal, 27(2):97–111, 1984. doi: 10.1093/comjnl/27.2.97.
19. Karthik Ram. Git can facilitate greater reproducibility and increased transparency in science. Source Code for Biology and Medicine, 8(1):7, 2013. doi: 10.1186/1751-0473-8-7.
20. Yasset Perez-Riverol, Laurent Gatto, Rui Wang, Timo Sachsenberg, Julian Uszkoreit, Felipe da Veiga Leprevost, Christian Fufezan, Tobias Ternent, Stephen J Eglen, Daniel S Katz, et al. Ten simple rules for taking advantage of git and github. PLoS Computational Biology, 12(7):e1004947, 2016. doi: 10.1371/journal.pcbi.1004947.
21. Pandoc: The universal markup converter, 2020. Accessed: 2025-06-16.
22. Overleaf. Overleaf: Real-time collaborative writing and publishing tools with integrated pdf preview, 2024. Cloud-based LaTeX editor.
23. Posit PBC. Quarto: An open-source scientific and technical publishing system, 2024. Multi-language scientific publishing system.
24. Typst GmbH. Typst: A new markup-based typesetting system, 2024.
Modern typesetting system designed for scientific documents.
25. Yihui Xie. bookdown: Authoring Books and Technical Documents with R Markdown. Chapman and Hall/CRC, 2016. ISBN 9781138700109.
26. Ed Sperr. Pubmed by year: A dataset of pubmed publication counts by year, 2025. Accessed: 2025-07-01.

Methods

This section describes the Rxiv-Maker framework technically, showing how the system generates structured documentation from source code and plain text. System architecture is detailed in Fig. S3.

Processing Pipeline. Rxiv-Maker processes manuscripts through a five-stage pipeline controlled by a central Makefile that converts source files into publication-ready PDFs. The pipeline ensures computational reproducibility through these stages:

1. Environment Setup: Automated dependency resolution with containerised environments using Docker or local virtual environments with pinned package versions
2. Content Generation: Conditional execution of Python/R scripts and Mermaid diagram compilation based on modification timestamps
3. Markdown Processing: Multi-pass conversion with intelligent content protection preserving mathematical expressions, code blocks, and LaTeX commands
4. Asset Aggregation: Systematic collection and validation of figures, tables, and bibliographic references with integrity checking
5. LaTeX Compilation: Optimised pdflatex sequences with automatic cross-reference and citation resolution

For users without local LaTeX installations, the framework provides identical build capabilities through cloud-based GitHub Actions, making professional publishing workflows accessible while maintaining reproducibility guarantees.

Markdown-to-LaTeX Conversion. Manuscript conversion is handled by a Python processing engine that manages complex academic syntax requirements through "rxiv-markdown". This multi-pass conversion system uses content protection strategies to preserve computational elements such as code blocks and mathematical notation.
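The multi-pass idea described above can be illustrated with a minimal sketch. It is not Rxiv-Maker's actual implementation: the function names (`protect`, `convert`, `restore`, `markdown_to_latex`) and the placeholder format are hypothetical, and only a handful of the mappings from Table S4 are covered. The sketch assumes that protection works by swapping literal spans (code and math) for opaque placeholders before any Markdown rule runs, then restoring them verbatim afterwards.

```python
import re

# Hypothetical placeholder format; NUL bytes keep it out of reach of later rules.
_PLACEHOLDER = "\x00PROTECTED{}\x00"

# Spans that must pass through conversion untouched.
_PROTECT = re.compile(
    r"```.*?```"      # fenced code block
    r"|`[^`\n]+`"     # inline code
    r"|\$\$.*?\$\$"   # display math
    r"|\$[^$\n]+\$",  # inline math
    re.DOTALL,
)


def protect(text):
    """Pass 1: replace literal spans with numbered placeholders."""
    stash = []

    def _stash(match):
        stash.append(match.group(0))
        return _PLACEHOLDER.format(len(stash) - 1)

    return _PROTECT.sub(_stash, text), stash


def _multi_cite(match):
    # [@a;@b] becomes a single cite command with comma-separated keys.
    keys = [key.strip().lstrip("@") for key in match.group(1).split(";")]
    return r"\cite{" + ",".join(keys) + "}"


def convert(text):
    """Pass 2: apply a small subset of the Table S4 mappings."""
    text = re.sub(r"\[(@[^\]]+)\]", _multi_cite, text)
    # Cross-references are handled before bare citations so @fig:x is not cited.
    text = re.sub(r"@eq:([\w-]+)", r"\\eqref{eq:\1}", text)
    text = re.sub(r"@(s?fig|s?table):([\w-]+)", r"\\ref{\1:\2}", text)
    text = re.sub(r"@([\w-]+)", r"\\cite{\1}", text)
    # Bold must run before italic because both use asterisks.
    text = re.sub(r"\*\*(.+?)\*\*", r"\\textbf{\1}", text)
    text = re.sub(r"\*(.+?)\*", r"\\textit{\1}", text)
    text = re.sub(r"~([^~\s]+)~", r"\\textsubscript{\1}", text)
    text = re.sub(r"\^([^^\s]+)\^", r"\\textsuperscript{\1}", text)
    return text


def restore(text, stash):
    """Pass 3: put the protected spans back exactly as written."""
    for index, original in enumerate(stash):
        text = text.replace(_PLACEHOLDER.format(index), original)
    return text


def markdown_to_latex(text):
    protected, stash = protect(text)
    return restore(convert(protected), stash)
```

Without the protection pass, the asterisks and carets inside a math span such as `$x_i^2 * y * z$` would be picked up by the italic and superscript rules; with it, the span reaches the LaTeX output unchanged while the surrounding citations and cross-references are still converted.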
The engine converts specialised academic elements, including dynamic citations (@smith2023), programmatic figures, statistical tables, and supplementary notes, before applying standard Markdown formatting. The system supports notation essential for scientific disciplines: subscript and superscript syntax for chemical formulas such as H$_2$O and CO$_2$, mathematical expressions including Einstein’s mass-energy equivalence (Eq. (1)), chemical notation such as Ca$^{2+}$ and SO$_4^{2-}$ (Eq. (3)), temperature specifications like 25°C, and statistical calculations including standard deviation (Eq. (2)). Supported syntax is detailed in Table S4. The framework supports complex mathematical expressions typical of computational workflows:

$$\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u} = -\frac{1}{\rho}\nabla p + \nu \nabla^2 \mathbf{u} \tag{4}$$

This approach provides accessible alternatives for common formulas while ensuring complex equations like the Navier–Stokes equation (Eq. (4)) are rendered with professional quality. Mathematical formula support is detailed in Supp. Note 2.

Programmatic Content and Environments. The framework generates figures, statistical analyses, and algorithmic diagrams as reproducible outputs linked to source data and processing pipelines. The build pipeline executes Python, R, and Mermaid scripts with caching to avoid redundant computation while maintaining traceability between datasets, algorithms, and visualisations (Supp. Note 1). Rxiv-Maker implements multi-layered environment management to address complex dependency requirements. Dependencies are rigorously pinned, isolated virtual environments support development workflows, and containerised environments ensure consistent execution across computing platforms. Cloud-based GitHub Actions provide controlled, auditable build environments that guarantee identical computational outcomes across systems.

Visual Studio Code Extension. Rxiv-Maker includes a Visual Studio Code extension providing an integrated development environment for collaborative manuscript preparation. The extension uses the Language Server Protocol to deliver real-time syntax highlighting for academic Markdown syntax, intelligent autocompletion for bibliographic citations from BibTeX files, and context-aware suggestions for cross-references to figures, tables, equations, and supplementary materials. The extension integrates with the main framework through file system monitoring and automated workspace detection, recognising rxiv-maker project structures and providing appropriate editing features. Schema validation for YAML configuration files ensures project metadata adheres to reproducibility specifications, while integrated terminal access enables direct execution of framework commands. This provides researchers with an accessible, feature-rich editing experience that maintains reproducibility guarantees while reducing technical barriers.

Quality Assurance. Framework reliability is ensured through multi-level validation protocols. Unit tests validate individual components, integration tests verify end-to-end pipelines, and platform tests validate deployment environment behaviour. Pre-commit pipelines enforce code formatting, linting, and type checking, ensuring code quality.

Deployment Architecture and Platform Considerations. The framework provides flexible deployment strategies for diverse research environments. Local installation offers optimal performance and universal architecture compatibility, supporting AMD64 and ARM64 systems with direct access to native resources required for diagram generation. This approach enables faster iteration cycles and comprehensive debugging capabilities. Containerised execution through Docker Engine Mode eliminates local dependency management by providing preconfigured environments containing LaTeX distributions, Python libraries, R packages, and Node.js tooling. Docker deployment uses AMD64 base images because Google Chrome has limitations on ARM64 Linux. These run via Rosetta emulation on Apple Silicon systems. For optimal performance on ARM64 systems, local installation provides full capabilities without emulation overhead. Cloud-based deployment through GitHub Actions provides architecture-agnostic automated builds for continuous integration workflows. The modular architecture enables researchers to select deployment strategies appropriate to technical constraints while maintaining reproducibility guarantees.

Supplementary Information

Rχiv-Maker: an automated template engine for streamlined scientific publications

| Format | Input Extension | Processing Method | Output Formats | Quality | Use Case |
|---|---|---|---|---|---|
| Mermaid Diagrams | .mmd | Mermaid CLI | SVG, PNG, PDF | Vector/Raster | Flowcharts, architectures |
| Python and R Figures | .py, .R | Script execution | PNG, PDF, SVG | Publication | Data visualisation |
| Static Images | .png, .jpg, .svg | Direct inclusion | Same format | Original | Photographs, logos |
| LaTeX Graphics | .tex, .tikz | LaTeX compilation | PDF | Vector | Mathematical diagrams |
| Data Files | .csv, .json, .xlsx | Python and R processing | Via scripts | Computed | Raw data integration |

Sup. Table S1. Supported Figure Generation Methods. Comprehensive overview of the framework’s figure processing capabilities, demonstrating support for both static and dynamic content generation with emphasis on reproducible computational graphics.
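The script-execution and Mermaid rows of Table S1, together with the timestamp-based conditional execution described in Methods, can be sketched as a staleness check. This is an illustrative assumption about how such a check might look, not the framework's code: the function names, the default output formats, and the convention that outputs sit beside the script and share its file stem are all hypothetical.

```python
from pathlib import Path

# Source types taken from Table S1 (Mermaid diagrams, Python and R figures).
SOURCE_SUFFIXES = {".py", ".R", ".mmd"}


def needs_rebuild(source, outputs):
    """Return True when any output is missing or older than its source."""
    source_mtime = source.stat().st_mtime
    for output in outputs:
        if not output.exists() or output.stat().st_mtime < source_mtime:
            return True
    return False


def stale_sources(figures_dir, formats=("png", "pdf")):
    """List figure sources in figures_dir whose outputs need regenerating.

    Assumes (hypothetically) that Figure1.py produces Figure1.png and
    Figure1.pdf in the same directory.
    """
    stale = []
    for source in sorted(Path(figures_dir).iterdir()):
        if source.suffix not in SOURCE_SUFFIXES:
            continue
        outputs = [source.with_suffix("." + ext) for ext in formats]
        if needs_rebuild(source, outputs):
            stale.append(source)
    return stale
```

A build step could then execute only the sources returned by `stale_sources`, leaving up-to-date figures untouched. Comparing modification times is cheap but coarse: it does not notice when a script's input data changes, so a fuller implementation would also need to track data dependencies.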
| Tool | Type | Markdown | Primary Use Case | Key Strengths | Open Source |
|---|---|---|---|---|---|
| Rxiv-Maker | Pipeline | Excellent | Preprint servers | GitHub Actions integration, automated workflows | Yes |
| Overleaf (22) | Web Editor | Limited | Academic publishing | Real-time collaboration, rich templates | Freemium |
| Quarto (23) | Publisher | Native | Multi-format publishing | Polyglot support, multiple outputs | Yes |
| Manubot (14) | Collaborative | Native | Version-controlled writing | Automated citations, transparent collaboration | Yes |
| Pandoc (21) | Converter | Excellent | Format conversion | Universal format support, extensible | Yes |
| Typst (24) | Typesetter | Good | Modern typesetting | Fast compilation, modern syntax | Yes |
| Bookdown (25) | Publisher | R Markdown | Academic books | Cross-references, multiple formats | Yes |
| Direct LaTeX | Typesetter | Limited | Traditional publishing | Ultimate control, established workflows | Yes |

Sup. Table S2. Comprehensive Comparison of Manuscript Preparation Tools. This comparison provides an exhaustive overview of available tools for scientific manuscript preparation, positioning each within the broader ecosystem of academic publishing workflows. Rxiv-Maker is designed as a specialised solution optimising for preprint server submissions, complementing rather than replacing established tools like Overleaf for general LaTeX collaboration or Quarto for multi-format publishing. The comparison highlights that different tools excel in distinct contexts: Overleaf dominates collaborative LaTeX editing, Quarto excels at multi-format computational publishing, and Rxiv-Maker streamlines the specific workflow of preparing reproducible preprints for submission to arXiv, bioRxiv, and medRxiv.

| Deployment Method | Environment | Dependencies | Collaboration | Ease of Use | Reproducibility |
|---|---|---|---|---|---|
| GitHub Actions | Cloud CI/CD | None (cloud) | Automatic | Very High | Perfect |
| Google Colab | Web browser | None (cloud) | Shared notebooks | Very High | High |
| Local Python | Local machine | Python + LaTeX | Git-based | Medium | Good |
| Manual LaTeX | Local machine | Full LaTeX suite | Git-based | Low | Variable |

Sup. Table S3. Rxiv-Maker Deployment Strategies. Comparison of available compilation methods, highlighting the flexibility of the framework in accommodating different user preferences and technical environments whilst maintaining consistent output quality.

| Markdown Element | LaTeX Equivalent | Description |
|---|---|---|
| Basic Text Formatting | | |
| **bold text** | \textbf{bold text} | Bold formatting for emphasis |
| *italic text* | \textit{italic text} | Italic formatting for emphasis |
| ~subscript~ | \textsubscript{subscript} | Subscript formatting (H~2~O, CO~2~) |
| ^superscript^ | \textsuperscript{superscript} | Superscript formatting (E=mc^2^, x^n^) |
| Document Structure | | |
| # Header 1 | \section{Header 1} | Top-level section heading |
| ## Header 2 | \subsection{Header 2} | Second-level section heading |
| ### Header 3 | \subsubsection{Header 3} | Third-level section heading |
| Lists | | |
| - list item | \begin{itemize}\item...\end{itemize} | Unordered list |
| 1. list item | \begin{enumerate}\item...\end{enumerate} | Ordered list |
| Links and URLs | | |
| [link text](url) | \href{url}{link text} | Hyperlink with custom text |
| https://example.com | \url{https://example.com} | Bare URL |
| Citations | | |
| @citation | \cite{citation} | Single citation reference |
| [@cite1;@cite2] | \cite{cite1,cite2} | Multiple citation references |
| Cross-References | | |
| @fig:label | \ref{fig:label} | Figure cross-reference |
| @sfig:label | \ref{sfig:label} | Supplementary figure cross-reference |
| @table:label | \ref{table:label} | Table cross-reference |
| @stable:label | \ref{stable:label} | Supplementary table cross-reference |
| @eq:label | \eqref{eq:label} | Equation cross-reference |
| @snote:label | \sidenote{label} | Supplement note cross-reference |
| Tables and Figures | | |
| Markdown table | \begin{table}...\end{table} | Table with automatic formatting |
| Image with caption | \begin{figure}...\end{figure} | Figure with separate caption |
| Document Control | | |
| | % comment | Comments (converted to LaTeX style) |
| | \newpage | Manual page break control |
| | \clearpage | Page break with float clearing |

Sup. Table S4. Rxiv-Maker Markdown Syntax Overview.
Comprehensive mapping of markdown elements to their LaTeX equivalents, demonstrating the automated translation system that enables researchers to write in familiar markdown syntax whilst producing professional LaTeX output.

Supp. Note 1: Programmatic Figure Generation and Computational Reproducibility. Rxiv-Maker’s figure generation relies on automated processing pipelines that maintain transparent connections between source data and final visualisations whilst ensuring computational reproducibility. The system supports two primary methodologies: Mermaid diagram processing and Python/R-based data visualisation, each addressing distinct requirements within scientific publishing workflows.

Mermaid diagram processing leverages the Mermaid CLI to convert text-based specifications into publication-ready graphics. This approach enables version-controlled diagram creation where complex flowcharts, system architectures, and conceptual models are specified using intuitive syntax and automatically rendered into multiple output formats. The system generates SVG, PNG, and PDF variants to accommodate different compilation requirements, maintaining vector quality in the SVG and PDF outputs. This automation removes the manual effort of creating and updating diagrams, so modifications are immediately reflected in the final document.

Script-based figure generation supports computational reproducibility: analytical scripts execute during compilation to generate figures directly from source data. This integration keeps visualisations synchronised with the underlying datasets and analytical methods, eliminating outdated or inconsistent graphics. The system executes image generation scripts within the compilation environment, automatically detecting generated files and incorporating them into the document structure. This approach transforms figures from static illustrations into dynamic, reproducible computational artefacts, enhancing scientific rigour.

Supp. Note 2: Mathematical Formula Support and LaTeX Integration. Rxiv-Maker integrates mathematical notation by translating markdown-style expressions into publication-ready LaTeX mathematics. This enables researchers to author complex mathematical content using familiar syntax whilst benefiting from LaTeX’s superior typesetting capabilities.

Inline mathematical expressions use dollar sign delimiters ($...$), enabling formulas such as $E = mc^2$ or $\alpha = \beta\gamma$ to be embedded within text. The conversion system preserves expressions during markdown-to-LaTeX transformation, ensuring mathematical notation maintains proper formatting and spacing.

Display equations utilise double dollar delimiters ($$...$$) for prominent mathematical expressions requiring centred presentation. Complex equations such as the Schrödinger equation:

$$i\hbar \frac{\partial}{\partial t}\Psi(\mathbf{r},t) = \hat{H}\Psi(\mathbf{r},t)$$

or the Navier-Stokes equations:

$$\rho\left(\frac{\partial \mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right) = -\nabla p + \mu\nabla^2\mathbf{v} + \mathbf{f}$$

demonstrate the framework’s capability to handle sophisticated mathematical typography, including Greek letters, partial derivatives, vector notation, and complex fraction structures.

The system supports LaTeX’s mathematical environments by directly including LaTeX code blocks. This hybrid approach enables simple markdown syntax for straightforward expressions whilst retaining access to LaTeX’s full capabilities for complex multi-line derivations. Mathematical expressions within figure captions, table entries, and cross-references are automatically processed, ensuring consistent typography throughout documents. The framework’s content protection system preserves mathematical expressions during multi-stage conversion, preventing unwanted modifications.

Statistical notation commonly required in manuscripts is supported, including confidence intervals $\mu \pm \sigma$, probability distributions $P(X \leq x)$, and significance levels $p < 0.05$.
Complex expressions involving summations $\sum_{i=1}^{n} x_i$, integrals $\int_{-\infty}^{\infty} f(x)\,dx$, and matrix operations $\mathbf{A}^{-1}\mathbf{b} = \mathbf{x}$ are rendered with appropriate spacing.

Sup. Fig. S1. The growth of preprint submissions on the arXiv server from 1991 to 2025. The data, sourced from arXiv’s public statistics, is plotted using a Python script integrated into our Rxiv-Maker pipeline. This demonstrates the system’s capacity for reproducible, data-driven figure generation directly within the publication workflow.

Sup. Fig. S2. Preprint Submission Trends Across Multiple Servers (2018-2025). The figure displays the annual number of preprint submissions to major repositories, including arXiv, bioRxiv, and medRxiv. Data was collected from publicly available sources (26) and visualised using a reproducible R script within the Rxiv-Maker pipeline. This approach ensures that the figure remains synchronised with the latest available data and supports transparent, data-driven scientific reporting.

Sup. Fig. S3. Detailed System Architecture and Processing Layers. Comprehensive technical diagram showing the complete Rxiv-Maker architecture, including input layer organisation, processing engine components (parsers, converters, generators), compilation infrastructure, output generation, and deployment methodology integration with Docker containerisation support. This figure illustrates the modular design that enables independent development and testing of system components across both local and containerised environments.
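To make the mappings of Sup. Table S4 and the content protection described in Supp. Note 2 more concrete, the following sketch converts a few Markdown elements to LaTeX while leaving dollar-delimited mathematics untouched. It is an illustration under stated assumptions, not Rxiv-Maker’s converter: the function name `markdown_to_latex`, the placeholder scheme, and the regular expressions are all hypothetical, and only bold, italic, citation, and figure, table, and equation cross-reference rows are covered.

```python
import re

# Hypothetical pattern: display math ($$...$$) is tried before inline math ($...$).
MATH_PATTERN = re.compile(r"\$\$.+?\$\$|\$[^$\n]+?\$", re.DOTALL)


def markdown_to_latex(text: str) -> str:
    """Convert a small subset of the Sup. Table S4 mappings, protecting math."""
    protected: list[str] = []

    def stash(match: re.Match) -> str:
        # Swap each math span for a placeholder that later rules cannot match.
        protected.append(match.group(0))
        return f"\x00MATH{len(protected) - 1}\x00"

    # 1. Protect mathematical expressions before any other rewriting.
    text = MATH_PATTERN.sub(stash, text)

    # 2. Basic text formatting: bold first, so "**" is not read as italic.
    text = re.sub(r"\*\*(.+?)\*\*", r"\\textbf{\1}", text)
    text = re.sub(r"\*(.+?)\*", r"\\textit{\1}", text)

    # 3. Multiple citations: [@cite1;@cite2] -> \cite{cite1,cite2}
    def multi_cite(match: re.Match) -> str:
        keys = [key.strip().lstrip("@") for key in match.group(1).split(";")]
        return "\\cite{" + ",".join(keys) + "}"

    text = re.sub(r"\[(@[\w:-]+(?:;\s*@[\w:-]+)*)\]", multi_cite, text)

    # 4. Cross-references, handled before single citations to avoid clashes.
    text = re.sub(r"@eq:([\w-]+)", r"\\eqref{eq:\1}", text)
    text = re.sub(r"@(fig|sfig|table|stable):([\w-]+)", r"\\ref{\1:\2}", text)

    # 5. Single citations: @citation -> \cite{citation}
    #    The lookbehind skips e-mail addresses such as name@example.org.
    text = re.sub(r"(?<![\w{,])@([\w-]+)", r"\\cite{\1}", text)

    # 6. Restore the protected math spans verbatim.
    return re.sub(r"\x00MATH(\d+)\x00", lambda m: protected[int(m.group(1))], text)


if __name__ == "__main__":
    source = "**Bold** and *italic*; see @fig:arch and [@a1;@b2]. Product: $a*b*c$."
    print(markdown_to_latex(source))
```

Without the protection step, the asterisks inside `$a*b*c$` would be rewritten as italic markup, which is the kind of unwanted modification that a content protection stage is meant to prevent.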