Open
107 commits
b62c0ac
changing filtering to ensembl canonical intial commit
kkuret Mar 24, 2025
a5974d2
changes to imports etc
iraiosub Mar 24, 2025
3a3b824
canonical filtering testing ok
iraiosub Mar 24, 2025
5773bdf
more changes to filtering by canonical
iraiosub Mar 24, 2025
43c2b9e
add new ref filtering test for human
iraiosub Mar 24, 2025
546c348
comment on output name
kkuret Mar 24, 2025
07bbfa2
added human test data
iraiosub Mar 24, 2025
042f16b
use premade bowtie index for testing
iraiosub Mar 24, 2025
01ec151
comment out unnecessary gtf channels
iraiosub Mar 24, 2025
64bcd5a
Modify resolve unannotated to remove genic other option
kkuret Mar 24, 2025
3f05a38
modify main.nf inputs and prepare genome subworkflow
kkuret Mar 24, 2025
1519c6b
renamed CLIPSEQ_RESOLVE_UNANNOTATED inputs
iraiosub Mar 24, 2025
47e0a9c
Merge remote-tracking branch 'origin/feat-2-0' into feat-2-0-canonical
iraiosub Mar 24, 2025
fee5f80
Merge remote-tracking branch 'origin/feat-2-0' into feat-2-0-canonical
iraiosub Mar 24, 2025
b81ffd3
Merge remote-tracking branch 'origin/feat-2-0' into feat-2-0-canonical
iraiosub Mar 24, 2025
d984ac2
fixes cadinality issue CLIPSEQ_RESOLVE_UNANNOTATED
iraiosub Mar 24, 2025
0049a4b
filtergtf outputting tx gene pairs
kkuret Mar 25, 2025
afe7534
Edited find longest transcript to also filter gtf and output all long…
kkuret Mar 25, 2025
42be21b
changes to longest tx selection
iraiosub Mar 25, 2025
7740e49
Fix file saving options
kkuret Mar 25, 2025
6115fa4
fixed newline
iraiosub Mar 25, 2025
fa2b3cb
correct excessive quoting and edit attributes
kkuret Mar 25, 2025
484ad07
Edit the longest transcript script to accept user provided transcripts
kkuret Mar 25, 2025
412942d
incorporated new tx selection into prep genome subworkflow
iraiosub Mar 25, 2025
f4f5f01
some more ref test data
iraiosub Mar 25, 2025
a33d0d4
added ensembl 103 test ref
iraiosub Mar 25, 2025
0e1c646
added logging to the transcript selection and filtering script
iraiosub Mar 26, 2025
4b81518
Merge remote-tracking branch 'origin/feat-2-0' into feat-2-0-canonical
iraiosub Mar 26, 2025
fdcb29c
raising error when not exactly 1 tx per gene
iraiosub Mar 26, 2025
267adbf
Addditional checkes and added unspliced transcript length into transc…
kkuret Mar 26, 2025
61fb7fa
simplifying prep_genome: closes #141 #140 #119 #156 #157
iraiosub Mar 26, 2025
213d36b
renamed longest_ with representative_ in params
iraiosub Mar 26, 2025
42e396f
fix naming of channels emitted by FILTER_GTF_BY_TRANSCRIPTS
iraiosub Mar 26, 2025
578c9b5
fixed more channel names..
iraiosub Mar 26, 2025
f703e97
more longest_ renaming with representative_
iraiosub Mar 26, 2025
c300db1
removed tmp files used ofr testing like ref and sampleshet
iraiosub Mar 26, 2025
e8197e4
added a comment to explain channel logic for regions.gtf to use
iraiosub Mar 26, 2025
3e0abc9
fix assignment of gtf used depending on --filter_gtf_by_transcripts …
iraiosub Mar 26, 2025
68e6823
improved ch logic so filtering is skipped when all files provided
iraiosub Mar 30, 2025
843c5a2
Update test.config for testing
iraiosub Mar 30, 2025
ef7e326
Update test.config for testing
iraiosub Mar 30, 2025
ad2d65f
renamed filtering param to use_filtered_gtf
iraiosub Mar 30, 2025
711c4b5
Update prepare_genome.nf
iraiosub Mar 31, 2025
b3058b5
indentation prepare_genome.nf
iraiosub Mar 31, 2025
4ee3b76
further adjustments to logging
iraiosub Apr 2, 2025
55597cb
refactor
iraiosub Apr 2, 2025
e2cd485
fixed load_and_validate_transcripts function
iraiosub Apr 2, 2025
6a4f933
updates in logic flow
iraiosub Apr 2, 2025
8888c0b
clarifications
iraiosub Apr 2, 2025
a626511
add conditions later in genome prep subworkflow
iraiosub Apr 2, 2025
756ffea
added skip_transcriptome to prep genome subworkflow
iraiosub Apr 3, 2025
7845455
updated meta.yml FILTER_GTF_BY_TRANSCRIPT
iraiosub Apr 3, 2025
f5bb1ea
renamed FILTER_GTF_BY_TRANSCRIPT module folder
iraiosub Apr 3, 2025
7a95e29
deleted unused config
iraiosub Apr 3, 2025
ed6ed3a
started on docs #163
iraiosub Apr 3, 2025
495e1eb
Merge remote-tracking branch 'origin/feat-2-0' into feat-2-0-canonical
iraiosub Apr 3, 2025
1588167
removed extra input ch
iraiosub Apr 3, 2025
2f9092e
improved logic comment clarity
iraiosub Apr 3, 2025
859ca1c
shortened ch definitions bc they are initialised in the main workflow
iraiosub Apr 3, 2025
0c4f5dc
changed warning to info
iraiosub Apr 3, 2025
9c41ee4
fixed filtering GTF condition
iraiosub Apr 3, 2025
a97890b
added some logging to resolve GTF
iraiosub Apr 3, 2025
10ff0cd
updated umi pattern param in test full config
iraiosub Apr 4, 2025
cfb6fe8
replaced igenomes refs in full test with ensembl ftp
iraiosub Apr 4, 2025
bcd26ea
changed the umi separator pram name in full_test so it matches nextfl…
iraiosub Apr 4, 2025
834bf7a
enabled UMI extraction in test_full config, and saving of refs
iraiosub Apr 4, 2025
4ed4f28
indentation
iraiosub Apr 4, 2025
052dde8
Updated dump_versions to report also pandas and pr versions, removed …
kkuret Apr 5, 2025
65d601c
Annotated fai2bed functions
kkuret Apr 5, 2025
9f705f7
versions dump fix
kkuret Apr 5, 2025
8dd8abd
Code annotation and unified naming chr -> chrom. Export a set of chro…
kkuret Apr 5, 2025
f483966
Added checks for input files, that unfiltered regions GTF is really c…
kkuret Apr 5, 2025
7fdbb30
Changed variable name, removed unused code
kkuret Apr 5, 2025
02e3d5a
added function doc string
kkuret Apr 5, 2025
2e845c4
tests for validation functions - perhaps needs to be removed before m…
kkuret Apr 5, 2025
00323b1
deleted unittests as they break execution
kkuret Apr 6, 2025
c41f5dd
added more details to docs
kkuret Apr 6, 2025
5024213
removed redundant variable assignment, typo fix, sorted transcript id…
kkuret Apr 6, 2025
3b69169
sorted transcript id list generation simplified by using .loc 1x inst…
kkuret Apr 6, 2025
351bc6a
change log name
kkuret Apr 6, 2025
c9fe631
Added conditional resolving, saving BED files of unnanotated regions …
kkuret Apr 7, 2025
b052423
removed comments
kkuret Apr 7, 2025
d4425f8
removed a redundant bed save, added code comments
kkuret Apr 7, 2025
7265b74
fixed nonexisting output file name (var referenced before assignment)
kkuret Apr 7, 2025
b22915a
Docs update
kkuret Apr 7, 2025
40cbf0d
reduce ref files published when filtering GTF, increase CONSENSUS_CRO…
iraiosub Apr 7, 2025
3708dca
undo resource changes for bedtools sort
iraiosub Apr 7, 2025
5721d99
added missing negation
iraiosub Apr 7, 2025
6c4cba8
removed unnecessary input files from test profile
iraiosub Apr 7, 2025
7084539
Reverted test config to original
kkuret Apr 8, 2025
b313822
Additional validation that no unexpected region types formed during r…
kkuret Apr 8, 2025
9d81833
unified logging
kkuret Apr 8, 2025
c084bbc
trailing whitespaces removed
kkuret Apr 8, 2025
62d744e
Fixed logging, printing of a set
kkuret Apr 8, 2025
9cc5de3
Fixed logging
kkuret Apr 8, 2025
93d5f85
Merge pull request #1 from iraiosub/feat-2-0-klara
iraiosub Apr 8, 2025
80193c1
small changes to docs
iraiosub Apr 8, 2025
c96d71f
docs restructureing
iraiosub Apr 8, 2025
11ae020
typos
iraiosub Apr 8, 2025
a0e4fb5
added outputs documentation
iraiosub Apr 8, 2025
9c1c7e2
docs update
iraiosub Apr 9, 2025
b53ef83
more renaming
iraiosub Apr 9, 2025
106d3b2
Use ternary operator to create empty ch when optional inputs are unde…
iraiosub Apr 10, 2025
f8b1d6b
added collect to deal with queue ref channels
iraiosub Jun 9, 2025
f40f034
Merge pull request #2 from iraiosub/feat-collect-fix
iraiosub Jun 10, 2025
f699bb3
fix merging of iCount summaries with premapped crosslinks
kkuret Jun 12, 2025
59ef35c
Merge pull request #3 from iraiosub/fix-premap-summary
iraiosub Jun 12, 2025
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,7 +1,7 @@
.nextflow*
work/
data/
results/
results*/
.DS_Store
testing/
testing*
70 changes: 70 additions & 0 deletions README.md
@@ -99,6 +99,76 @@ For more details about the output files and reports, please refer to the

The pipeline currently does not support paired-end reads, as in our experience alignment using both reads when available doesn't improve analysis of CLIP data. When receiving CLIP data sequenced paired-end, we recommend running the pipeline with the read containing the crosslink and ensuring the crosslink_position parameter is set appropriately. If you have evidence to the contrary, please get in touch and let us know; if you are working on a new variant protocol where paired-end alignment is important, please do reach out.

## A note on annotation

In the current implementation, certain tools for peak-calling, motif discovery and analysis of crosslink distribution around landmarks or transcript regions (Clippy, PEKA, iCount summary and iCount RNA-maps) rely on GTF files generated by the [iCount-Mini](https://github.com/ulelab/iCount-Mini) segment [script](https://github.com/ulelab/iCount-Mini/blob/main/iCount/genomes/segment.py).

Segmentation divides the genome into regions such as CDS, UTR5, UTR3, ncRNA, intron, and intergenic, at:
- the **transcript level** (`*seg.gtf`): each transcript is divided into non-overlapping segments (e.g., CDS, UTRs, introns). Segments can overlap across transcripts or genes.
- the **genome level** (`*regions.gtf`): the genome is partitioned into non-overlapping regions. Each position is assigned to exactly one region based on iCount's priority: `CDS > UTR3 > UTR5 > ncRNA > intron > intergenic`

See the [iCount segment documentation](https://icount.readthedocs.io/en/latest/_modules/iCount/genomes/segment.html) for details.
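
The genome-level priority rule can be illustrated with a minimal Python sketch. This is not iCount's implementation: only the priority order is taken from the text above, while `resolve_region` and the fallback to `intergenic` for uncovered positions are assumptions made for illustration.

```python
# Priority order quoted from the README text above; everything else is illustrative.
PRIORITY = ["CDS", "UTR3", "UTR5", "ncRNA", "intron", "intergenic"]
RANK = {name: i for i, name in enumerate(PRIORITY)}


def resolve_region(overlapping_types):
    """Return the highest-priority region type among those covering a position.

    An empty collection means no annotated feature overlaps the position,
    which this sketch treats as intergenic (an assumption).
    """
    if not overlapping_types:
        return "intergenic"
    # A lower rank means a higher priority, so take the minimum by rank.
    return min(overlapping_types, key=RANK.__getitem__)
```

For example, a position covered by an intron of one isoform and a 3' UTR of another would be assigned `UTR3` under this rule.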

> **Warning:**
> iCount-Mini only supports **Ensembl** or **GENCODE-style** annotations.

### GTF filtering for iCount segmentation

Pre-filtering the annotation can improve iCount genome-level segmentation by:
- Prioritizing one representative transcript per gene
- Reducing conflicts in genomic region assignments caused by overlapping isoforms

This can improve the biological interpretability of region assignments, especially at the genome level.

GTF filtering is enabled by default: it runs when `--skip_gtf_filter` is omitted or set to `false`. To disable it, set `--skip_gtf_filter true`.

> **Warning:**
> Your GTF must contain valid transcript and exon features for all genes.
> If your annotation does not meet these standards you may want to consider disabling filtering with `--skip_gtf_filter true`.

When enabled, the GTF is filtered prior to segmentation to include **one transcript per gene**.
These representative transcripts can be either a user-defined set of transcripts (`--representative_transcript`) or automatically selected by the pipeline as the longest transcript per gene.

#### Transcript selection:
- If `--representative_transcript` is provided:
  - Must be a `.txt` file with **one transcript ID per line**
  - Must include **exactly one transcript for each gene** in the input GTF (`--gtf`)
  - Only these transcripts and their associated features will be retained
- If not provided, the pipeline auto-selects one representative transcript per gene using the hierarchy:
  1. **CDS length**
  2. **Exon length**
  3. **Unspliced (transcript) length**
  4. Tie-breaker: transcript ID
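
The automatic selection hierarchy can be sketched as a tuple comparison. This is a simplified illustration rather than the pipeline's script: the `Transcript` record and `select_representatives` are hypothetical names, lengths are assumed to be precomputed, and the tie-breaker is assumed to pick the alphabetically first transcript ID.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    # Hypothetical record; lengths are assumed to be precomputed from the GTF.
    gene_id: str
    transcript_id: str
    cds_length: int
    exon_length: int
    unspliced_length: int


def select_representatives(transcripts):
    """Pick one transcript per gene: CDS, exon, then unspliced length, then ID."""
    best = {}
    for tx in transcripts:
        # Negate lengths so that a smaller key means a longer transcript;
        # remaining ties fall through to the transcript ID (ascending, an assumption).
        key = (-tx.cds_length, -tx.exon_length, -tx.unspliced_length, tx.transcript_id)
        if tx.gene_id not in best or key < best[tx.gene_id][0]:
            best[tx.gene_id] = (key, tx)
    return {gene: tx.transcript_id for gene, (_, tx) in best.items()}
```

Comparing tuples keeps the hierarchy explicit: a later criterion is only consulted when all earlier ones are equal.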

#### How segmentation uses the filtered GTF:

When filtering is **enabled** (`--skip_gtf_filter false`), the genome is segmented **twice**, once with each of the following:

1. **filtered GTF**: to prioritize representative transcripts. Some regions may remain unannotated because the gene-level annotation can extend beyond the boundaries of the representative transcripts
2. **unfiltered GTF** (the original GTF provided via `--gtf`): ensures full coverage of genes

Any regions left unannotated after segmentation on the filtered GTF (**1**) are filled in using the unfiltered GTF regions (**2**) during the `RESOLVE_UNANNOTATED` step. This ensures full genome coverage while still prioritizing the set of representative transcripts.

> This way, the final genomic regions (CDS, UTRs, introns, etc.) mostly reflect a single representative transcript per gene, while still ensuring no regions are left unannotated
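
The gap-filling idea can be sketched on a single chromosome as follows. This is a simplified illustration, not the `RESOLVE_UNANNOTATED` implementation: `fill_unannotated` is a hypothetical name, and intervals are assumed to be half-open `(start, end, region_type)` tuples, sorted by start and non-overlapping within each input.

```python
def fill_unannotated(filtered, unfiltered):
    """Keep every filtered interval and cover the gaps with unfiltered intervals.

    Both inputs are lists of (start, end, region_type) tuples with half-open
    coordinates, sorted by start and non-overlapping within each list.
    """
    if not unfiltered:
        return sorted(filtered)

    # Find the gaps: stretches spanned by the unfiltered set
    # that no filtered interval covers.
    gaps = []
    cursor = unfiltered[0][0]
    span_end = unfiltered[-1][1]
    for start, end, _ in filtered:
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < span_end:
        gaps.append((cursor, span_end))

    resolved = list(filtered)
    for gap_start, gap_end in gaps:
        for start, end, region_type in unfiltered:
            # Clip each unfiltered interval to the gap it overlaps.
            lo, hi = max(start, gap_start), min(end, gap_end)
            if lo < hi:
                resolved.append((lo, hi, region_type))
    return sorted(resolved)
```

Filtered intervals are never modified in this sketch, so the representative transcripts keep priority wherever they provide an annotation.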

Key outputs (filtering enabled):
- `*_representative_transcript_filtered.gtf`: Filtered GTF containing only features for the selected representative transcripts.
- `*_seg.gtf`: Transcript-wise segmentation (segments) based on the unfiltered GTF.
- `*_representative_transcript_filtered_regions.resolved.gtf`: resolved regions file with genome-wise segmentation (regions) based on the filtered GTF, with unannotated parts filled in from the unfiltered regions.

#### When GTF filtering is disabled:

If `--skip_gtf_filter true` is set:
- Segmentation is run **once**, using the original GTF provided via `--gtf`
- All transcripts per gene are included
- Regions (e.g., UTRs, CDS, introns) are assigned by collapsing annotations across all transcripts
- iCount’s internal rules resolve overlapping features
- This may result in more complex region annotations for genes with many transcripts

Key outputs (filtering disabled):
- `*_seg.gtf`: Transcript-wise segmentation (segments) using the unfiltered GTF.
- `*_regions.gtf`: Genome-wise segmentation (regions) from all transcripts in the unfiltered GTF.

## Credits

nf-core/clipseq was originally written by Charlotte West ([@charlotte-west](https://github.com/charlotte-west)) and Anob Chakrabarti ([@amchakra](https://github.com/amchakra)) from [Luscombe Lab](https://www.crick.ac.uk/research/labs/nicholas-luscombe) at [The Francis Crick Institute](https://www.crick.ac.uk/), London, UK. It started life in April 2020 as a Nextflow DSL2 Luscombe Lab ([@luslab](https://github.com/luslab)) lockdown hackathon day and we thank all the lab members for their early contributions.
50 changes: 11 additions & 39 deletions conf/modules.config
@@ -111,16 +111,7 @@ if(params.run_genome_prep) {
]
}

withName: 'NFCORE_CLIPSEQ:CLIPSEQ:PREPARE_GENOME:FIND_LONGEST_TRANSCRIPT' {
publishDir = [
path: { "${params.outdir}/00_genome" },
mode: "${params.publish_dir_mode}",
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
enabled: params.save_reference
]
}

withName: 'NFCORE_CLIPSEQ:CLIPSEQ:PREPARE_GENOME:CLIPSEQ_FILTER_GTF' {
withName: 'NFCORE_CLIPSEQ:CLIPSEQ:PREPARE_GENOME:FILTER_GTF_BY_TRANSCRIPT' {
publishDir = [
path: { "${params.outdir}/00_genome" },
mode: "${params.publish_dir_mode}",
@@ -133,7 +124,15 @@ if(params.run_genome_prep) {
publishDir = [
path: { "${params.outdir}/00_genome" },
mode: "${params.publish_dir_mode}",
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
saveAs: { filename ->
if (filename.equals('versions.yml')) {
return null
}
if (!params.skip_filter_gtf && filename.endsWith('regions.gtf.gz')) {
return null
}
return filename
},
enabled: params.save_reference
]
}
@@ -143,16 +142,7 @@ if(params.run_genome_prep) {
path: { "${params.outdir}/00_genome" },
mode: "${params.publish_dir_mode}",
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
enabled: params.save_reference
]
}

withName: 'NFCORE_CLIPSEQ:CLIPSEQ:PREPARE_GENOME:RESOLVE_UNANNOTATED' {
publishDir = [
path: { "${params.outdir}/00_genome" },
mode: "${params.publish_dir_mode}",
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
enabled: params.save_reference
enabled: false
]
}

@@ -164,24 +154,6 @@ if(params.run_genome_prep) {
enabled: params.save_reference
]
}

withName: 'NFCORE_CLIPSEQ:CLIPSEQ:PREPARE_GENOME:RESOLVE_UNANNOTATED_GENIC_OTHER' {
publishDir = [
path: { "${params.outdir}/00_genome" },
mode: "${params.publish_dir_mode}",
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
enabled: params.save_reference
]
}

withName: 'NFCORE_CLIPSEQ:CLIPSEQ:PREPARE_GENOME:RESOLVE_UNANNOTATED_GENIC_OTHER_REGIONS' {
publishDir = [
path: { "${params.outdir}/00_genome" },
mode: "${params.publish_dir_mode}",
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
enabled: params.save_reference
]
}
}
}

11 changes: 4 additions & 7 deletions conf/test.config
@@ -31,18 +31,14 @@ params {
ncrna_genome_index = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/bowtie.tar.gz"
genome_chrom_sizes = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV.fa.sizes"
ncrna_chrom_sizes = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/homosapiens_smallRNA.fa.sizes"
longest_transcript = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/longest_transcript.txt"
longest_transcript_fai = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/longest_transcript.fai"
longest_transcript_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/longest_transcript.gtf"
representative_transcript = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/longest_transcript.txt"
representative_transcript_fai = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/longest_transcript.fai"
representative_transcript_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/longest_transcript.gtf"
filtered_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered.gtf"
seg_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_seg.gtf"
seg_filt_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered_seg.gtf"
regions_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_regions.gtf.gz"
regions_filt_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered_regions.gtf.gz"
seg_resolved_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered_seg_genicOtherfalse.resolved.gtf"
regions_resolved_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered_regions_genicOtherfalse.resolved.gtf"
seg_resolved_gtf_genic = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered_seg_genicOthertrue.resolved.gtf"
regions_resolved_gtf_genic = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered_regions_genicOthertrue.resolved.gtf"

// Logic
debug = true
@@ -53,6 +49,7 @@ params {
save_unaligned_output = true
save_align_intermed = true
skip_transcriptome = true
skip_filter_gtf = false

// Pipeline params
umitools_bc_pattern = 'NNNNNNNNN'
10 changes: 3 additions & 7 deletions conf/test_bam.config
@@ -31,18 +31,14 @@ params {
ncrna_genome_index = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/bowtie.tar.gz"
genome_chrom_sizes = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV.fa.sizes"
ncrna_chrom_sizes = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/homosapiens_smallRNA.fa.sizes"
longest_transcript = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/longest_transcript.txt"
longest_transcript_fai = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/longest_transcript.fai"
longest_transcript_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/longest_transcript.gtf"
representative_transcript = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/longest_transcript.txt"
representative_transcript_fai = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/longest_transcript.fai"
representative_transcript_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/longest_transcript.gtf"
filtered_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered.gtf"
seg_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_seg.gtf"
seg_filt_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered_seg.gtf"
regions_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_regions.gtf.gz"
regions_filt_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered_regions.gtf.gz"
seg_resolved_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered_seg_genicOtherfalse.resolved.gtf"
regions_resolved_gtf = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered_regions_genicOtherfalse.resolved.gtf"
seg_resolved_gtf_genic = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered_seg_genicOthertrue.resolved.gtf"
regions_resolved_gtf_genic = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/yeast_MitoV_filtered_regions_genicOthertrue.resolved.gtf"

// Logic
debug = true
16 changes: 10 additions & 6 deletions conf/test_full.config
@@ -17,13 +17,17 @@ params {
config_profile_description = 'Full test dataset to check pipeline function'

// Input data for full size test
input = 'https://raw.githubusercontent.com/nf-core/clipseq/refs/heads/feat-2-0/tests/test_new_samplesheet_FASTQ_full.csv'
input = 'https://raw.githubusercontent.com/nf-core/clipseq/refs/heads/feat-2-0/tests/test_new_samplesheet_FASTQ_full.csv'
source = "fastq"

// Genome references
ncrna_fasta = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/homosapiens_smallRNA.fa.gz"
fasta = 's3://ngi-igenomes/test-data/clipseq/input_data/reference/GRCh38.primary_assembly.genome.fa.gz'
gtf = 's3://ngi-igenomes/test-data/clipseq/input_data/reference/gencode.v37.primary_assembly.annotation.gtf.gz'
move_umi = 'NNNNNNNNN'
umi_separator = '_'
ncrna_fasta = "https://raw.githubusercontent.com/nf-core/test-datasets/clipseq/v_2_0/genome/homosapiens_smallRNA.fa.gz"
fasta = 'https://ftp.ensembl.org/pub/release-111/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz'
gtf = 'https://ftp.ensembl.org/pub/release-111/gtf/homo_sapiens/Homo_sapiens.GRCh38.111.gtf.gz'
save_reference = true

// UMI options
umitools_bc_pattern = 'NNNNNNNNN'
umitools_umi_separator = '_'
skip_umi_extract = false
}
9 changes: 0 additions & 9 deletions modules/local/filter_gtf/README.md

This file was deleted.

24 changes: 0 additions & 24 deletions modules/local/filter_gtf/main.nf

This file was deleted.

26 changes: 0 additions & 26 deletions modules/local/filter_gtf/meta.yml

This file was deleted.
