genomic-medicine-sweden / nallo
An analysis pipeline for long reads from both PacBio and Oxford Nanopore Technologies (ONT), written in Nextflow.
Repository Overview (README excerpt)
## Introduction

**genomic-medicine-sweden/nallo** is a bioinformatics analysis pipeline for long reads from both PacBio and (targeted) ONT data, focused on rare disease. It is heavily influenced by best-practice pipelines such as nf-core/sarek, nf-core/raredisease, nf-core/nanoseq, PacBio Human WGS Workflow, epi2me-labs/wf-human-variation and brentp/rare-disease-wf.

### QC

- Read QC with FastQC, cramino, mosdepth and peddy

### Alignment & assembly

- Assemble genomes with hifiasm
- Align reads and assemblies to reference with minimap2

### Variant calling

- Call SNVs and perform joint genotyping with deepvariant and GLNexus
- Call SVs with Severus, Sniffles or Sawfish (PacBio only)
- Call CNVs with HiFiCNV
- Call tandem repeats with TRGT (PacBio only) or STRdust
- Call paralogous genes with Paraphase (PacBio only)

### Phasing and methylation

- Phase and haplotag reads with LongPhase, whatshap or HiPhase
- Create methylation pileups with modkit or pbcpgtools (PacBio only)
- Rare methylation analysis with methbat profile (PacBio only)

### Annotation

- Annotate SNVs and INDELs with databases of choice (e.g. gnomAD, ClinVar, CADD) using echtvar and VEP
- Annotate repeat expansions with strdrop and stranger (TRGT only)
- Annotate SVs with SVDB and VEP

### Ranking

- Rank SNVs, INDELs, SVs and CNVs with GENMOD

### Filtering

- Filter SNVs, INDELs, SVs and CNVs with filter_vep and bcftools

## Usage

> [!NOTE]
> If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with before running the workflow on actual data.

Prepare a samplesheet with input data:

Supply a reference genome with and choose a matching for your data ( , , ).

Now, you can run the pipeline using:

However, to run most parts of the pipeline you will need to supply additional reference files.

For more details and further functionality, please refer to the documentation.

## Credits

genomic-medicine-sweden/nallo was originally written by Felix Lenner.

We thank the following people for their extensive assistance in the development of this pipeline: Anders Jemt, Annick Renevey, Daniel Schmitz, Lucía Peña-Pérez, Peter Pruisscher, Ramprasad Neethiraj & Alexander Koc.

## Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

## Citations

If you use genomic-medicine-sweden/nallo for your analysis, please cite it using the following doi: 10.5281/zenodo.13748210.

This pipeline uses code and infrastructure developed and maintained by the nf-core community, reused here under the MIT license.

> **The nf-core framework for community-curated bioinformatics pipelines.**
>
> Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
>
> _Nat Biotechnol._ 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

An extensive list of references for the tools used by the pipeline can be found in the file.