genecovr

genecovr is an R package that provides plotting functions for summarizing gene transcript to genome alignments. Its main purpose is to assess the effect that polishing and scaffolding operations have on the quality of a genome assembly. The gene transcript set is a large sequence set consisting of transcripts assembled from RNA-seq data generated in relation to a genome assembly project. genecovr therefore serves as a complement to software such as BUSCO, which evaluates genome assembly quality using a smaller set of well-defined single-copy orthologs.

Installation

You can install the released version of genecovr from NBIS GitHub with:

# If necessary, uncomment to install devtools
# install.packages("devtools")
devtools::install_github("NBISweden/genecovr")

Usage

genecovr script quick start

There is a helper script for generating basic plots located in PACKAGE_DIR/bin/genecovr. Create a comma-separated (csv) input file with the following columns:

1. data label
2. mapping file (supported formats: psl)
3. assembly file (fasta or fasta index)
4. transcript file (fasta or fasta index)

Columns 3 and 4 can be set to the missing value NA, in which case sequence sizes are inferred from the alignment files. Then run the script to generate plots:

PACKAGE_DIR/bin/genecovr indata.csv
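
As a sketch of the input format described above, the following shell snippet writes a two-row input file in which columns 3 and 4 are set to NA, then checks that every row has exactly four comma-separated fields. The labels and psl file names (asm_v1, asm_v2 and so on) are hypothetical placeholders, and the snippet does not call genecovr itself.

```shell
# Write a minimal input file; labels and psl names are hypothetical placeholders.
# Columns 3 and 4 are NA, so sequence sizes would be inferred from the psl files.
cat > indata.csv <<'EOF'
asm_v1,transcripts2asm_v1.psl,NA,NA
asm_v2,transcripts2asm_v2.psl,NA,NA
EOF

# Count rows that do not have exactly four comma-separated fields.
bad_rows=$(awk -F',' 'NF != 4 {n++} END {print n+0}' indata.csv)
echo "rows with wrong column count: ${bad_rows}"
```

A count other than zero points to a missing or extra comma in one of the rows.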

Example

There are example files located in PACKAGE_DIR/inst/extdata, consisting of two psl files with gmap alignments, a fasta index for the transcript sequences, and fasta indices for two different assembly versions:

nonpolished.fai - fasta index for raw assembly
polished.fai - fasta index for polished assembly
transcripts.fai - fasta index for transcript sequences
transcripts2nonpolished.psl - gmap alignments, transcripts to raw assembly
transcripts2polished.psl - gmap alignments, transcripts to polished assembly

Using these files and the labels nonpol and pol for the different assemblies, a genecovr input file (called e.g., assemblies.csv) would look as follows:

nonpol,transcripts2nonpolished.psl,nonpolished.fai,transcripts.fai
pol,transcripts2polished.psl,polished.fai,transcripts.fai

and the command to run would be:

genecovr assemblies.csv
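
The input file above uses bare file names, so it presumably expects to be run from the directory that holds the example files. A minimal sketch for working from another directory, assuming the script also accepts paths that include a directory component (not confirmed here), is to write the rows with a path prefix. PACKAGE_DIR is the same placeholder used above, and EXTDATA is a variable introduced only for this illustration.

```shell
# PACKAGE_DIR is a placeholder for the genecovr source directory; set it for your system.
PACKAGE_DIR="${PACKAGE_DIR:-PACKAGE_DIR}"
EXTDATA="${PACKAGE_DIR}/inst/extdata"

# One row per assembly: label, psl alignments, assembly index, transcript index.
cat > assemblies.csv <<EOF
nonpol,${EXTDATA}/transcripts2nonpolished.psl,${EXTDATA}/nonpolished.fai,${EXTDATA}/transcripts.fai
pol,${EXTDATA}/transcripts2polished.psl,${EXTDATA}/polished.fai,${EXTDATA}/transcripts.fai
EOF

# Show the generated input file.
cat assemblies.csv
```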

genecovr options

To list genecovr script options, type 'genecovr -h':

usage: genecovr [-h] [-v] [-p number]
                             [-d OUTPUT_DIRECTORY] [--height HEIGHT]
                             [--width WIDTH]
                             csvfile

positional arguments:
  csvfile               csv-delimited file with columns
                            1. data label
                            2. mapping file (supported formats: psl)
                            3. assembly file (fasta or fasta index)
                            4. transcript file (fasta or fasta index)

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         print extra output
  -p number, --cpus number
                        number of cpus [default 1]
  -d OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
                        output directory
  --height HEIGHT       figure height in inches [default 6.0]
  --width WIDTH         figure width in inches [default 6.0]
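
As an illustration of how the options listed above combine, the following sketch runs the script with four cpus, a dedicated output directory, and a wider figure. The option values and the directory name results are arbitrary choices for this example. Creating the output directory beforehand is a precaution, since it is not stated whether the script creates it.

```shell
# Output directory name is an arbitrary choice for this example.
OUTDIR="results"
mkdir -p "$OUTDIR"

# Only run when the helper script is reachable; otherwise print a hint.
if command -v genecovr >/dev/null 2>&1; then
  genecovr -v -p 4 -d "$OUTDIR" --width 8 --height 5 assemblies.csv
else
  echo "genecovr not found on PATH; call PACKAGE_DIR/bin/genecovr directly or add that directory to PATH" >&2
fi
```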

R package vignette

Alternatively, load the package in an R script and use the package functions directly. See Get started or run vignette("genecovr") for a minimal working example.

