Commit 20dec8d

change the "CS" to "cs" #25
1 parent 11ff1a5 commit 20dec8d

9 files changed

+50
-48
lines changed

README.md

+17-17
Original file line number | Diff line number | Diff line change
@@ -7,25 +7,25 @@
77

88
# cstag
99

10-
`cstag` is a Python library tailored for the manipulation and handling of [minimap2's CS tags](https://github.com/lh3/minimap2#cs).
10+
`cstag` is a Python library tailored for manipulating and visualizing [minimap2's cs tags](https://github.com/lh3/minimap2#cs).
1111

1212

1313
## 🌟 Features
1414

15-
- `cstag.call()`: Generate a CS tag
16-
- `cstag.shorten()`: Convert a CS tag from its long to short format
17-
- `cstag.lengthen()`: Convert a CS tag from its short to long format
18-
- `cstag.consensus()`: Create a consensus CS tag from multiple CS tags
19-
- `cstag.mask()`: Mask low-quality bases within a CS tag
20-
- `cstag.split()`: Break down a CS tag into its constituent parts
21-
- `cstag.revcomp()`: Convert a CS tag to its reverse complement
15+
- `cstag.call()`: Generate a cs tag
16+
- `cstag.shorten()`: Convert a cs tag from its long to short format
17+
- `cstag.lengthen()`: Convert a cs tag from its short to long format
18+
- `cstag.consensus()`: Create a consensus cs tag from multiple cs tags
19+
- `cstag.mask()`: Mask low-quality bases within a cs tag
20+
- `cstag.split()`: Break down a cs tag into its constituent parts
21+
- `cstag.revcomp()`: Convert a cs tag to its reverse complement
2222
- `cstag.to_sequence()`: Reconstruct a reference subsequence from the alignment
2323
- `cstag.to_vcf()`: Generate a VCF representation
2424
- `cstag.to_html()`: Generate an HTML representation
2525
- `cstag.to_pdf()`: Produce a PDF file
2626

2727
For comprehensive documentation, please visit [our docs](https://akikuno.github.io/cstag/cstag/).
28-
To add CS tags to SAM/BAM files, check out [`cstag-cli`](https://github.com/akikuno/cstag-cli).
28+
To add cs tags to SAM/BAM files, check out [`cstag-cli`](https://github.com/akikuno/cstag-cli).
2929

3030

3131
## 🛠 Installation
@@ -44,7 +44,7 @@ conda install -c bioconda cstag
4444

4545
## 💡 Usage
4646

47-
### Generating CS Tags
47+
### Generating cs Tags
4848

4949
```python
5050
import cstag
@@ -60,19 +60,19 @@ print(cstag.call(cigar, md, seq, long=True))
6060
# =AC*ag=TACGT-ag=ACGT+ac~nn3nn=G
6161
```
6262

63-
### Shortening or Lengthening CS Tags
63+
### Shortening or Lengthening cs Tags
6464

6565
```python
6666
import cstag
6767

68-
# Convert a CS tag from long to short
68+
# Convert a cs tag from long to short
6969
cs_tag = "=ACGT*ag=CGT"
7070

7171
print(cstag.shorten(cs_tag))
7272
# :4*ag:3
7373

7474

75-
# Convert a CS tag from short to long
75+
# Convert a cs tag from short to long
7676
cs_tag = ":4*ag:3"
7777
cigar = "8M"
7878
seq = "ACGTACGT"
@@ -106,7 +106,7 @@ print(cstag.mask(cs_tag, cigar, qual, phred_threshold))
106106
# =ACNN*an+ng-cc=T
107107
```
108108

109-
### Splitting a CS Tag
109+
### Splitting a cs Tag
110110

111111
```python
112112
import cstag
@@ -116,7 +116,7 @@ print(cstag.split(cs_tag))
116116
# ['=ACGT', '*ac', '+gg', '-cc', '=T']
117117
```
118118

119-
### Reverse Complement of a CS Tag
119+
### Reverse Complement of a cs Tag
120120

121121
```python
122122
import cstag
@@ -152,7 +152,7 @@ chr1 5 . C CTT . . .
152152
"""
153153
```
154154

155-
The multiple CS tags enable reporting of the variant allele frequency (VAF).
155+
The multiple cs tags enable reporting of the variant allele frequency (VAF).
156156

157157
```python
158158
import cstag
@@ -186,7 +186,7 @@ Path("report.html").write_text(cs_tag_html)
186186
# Output "report.html"
187187
```
188188

189-
You can visualize mutations indicated by the CS tag using the generated `report.html` file as shown below:
189+
You can visualize mutations indicated by the cs tag using the generated `report.html` file as shown below:
190190

191191
<img width="511" alt="image" src="https://user-images.githubusercontent.com/15861316/265405607-a3cc1b76-f6a2-441d-b282-6f2dc06fc03d.png">
192192
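
Reviewer note on the two formats touched by the README hunks above: the long format spells out matched bases (`=ACGT`), while the short format records only the run length (`:4`). The following is a minimal standard-library sketch of the long-to-short direction. `shorten_sketch` is a hypothetical helper written for illustration; it is not `cstag.shorten()` and may differ from the library's behavior on edge cases.

```python
import re


def shorten_sketch(cs_tag: str) -> str:
    """Replace each long-format match run (e.g. =ACGT) with its length (e.g. :4).

    Hypothetical illustration only; not the cstag implementation.
    """
    # Every other operation (*, +, -, ~) is left untouched.
    return re.sub(r"=([ACGTN]+)", lambda m: f":{len(m.group(1))}", cs_tag)


print(shorten_sketch("=ACGT*ag=CGT"))
# :4*ag:3
```

The printed result matches the `:4*ag:3` output shown in the README example for the same input.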

src/cstag/call.py

+3-1
Original file line number | Diff line number | Diff line change
@@ -74,8 +74,10 @@ def trim_clips(cigar: str, seq: str) -> tuple[str, str]:
7474

7575

7676
###########################################################
77-
# Generate CS long
77+
# Generate cs tag in long format
7878
###########################################################
79+
80+
7981
def expand_cigar_operations(cigar: str) -> list[str]:
8082
parsed_cigar = parse_cigar(cigar)
8183
expanded_list = []

src/cstag/consensus.py

+10-10
Original file line number | Diff line number | Diff line change
@@ -25,20 +25,20 @@ def expand_deletion_tags(tags_combined: list[str]) -> list[str]:
2525

2626
def split_cs_tags(cs_tags: list[str]) -> list[list[str]]:
2727
"""
28-
Split and process each CS tag in cs_tags.
28+
Split and process each cs tag in cs_tags.
2929
3030
Args:
31-
cs_tags (list[str]): list of CS tags in the long format.
31+
cs_tags (list[str]): list of cs tags in the long format.
3232
3333
Returns:
34-
list[list[str]]: list of processed CS tags.
34+
list[list[str]]: list of processed cs tags.
3535
"""
3636
cs_tags_splitted = []
3737
for cs_tag in cs_tags:
3838
# Remove the prefix "cs:Z:" if present
3939
cs_tag = cs_tag.replace("cs:Z:", "")
4040

41-
# Split the CS tag using special symbols (-, *, ~, =)
41+
# Split the cs tag using special symbols (-, *, ~, =)
4242
# insertion symbol (+) is ignored because it is not observed in reference sequence
4343
tags_splitted = re.split(r"([-*~=])", cs_tag)[1:]
4444
# Combine the symbol with the corresponding sequence
@@ -70,7 +70,7 @@ def normalize_read_lengths(cs_tags: list[str], positions: list[int]) -> list[lis
7070
Normalize the lengths of each read in cs_tags based on their start positions. If the length is insufficient, fill in with `None`.
7171
7272
Args:
73-
cs_tags (list[str]): list of CS tags.
73+
cs_tags (list[str]): list of cs tags.
7474
positions (list[int]): Starting positions of each read.
7575
7676
Returns:
@@ -109,7 +109,7 @@ def get_consensus(cs_tags: list[list[str]]) -> str:
109109
for cs in zip(*cs_tags):
110110
# Remove the None that is compensating for the insufficient read length.
111111
cs = [c for c in cs if c]
112-
# Get the most common CS tag(s)
112+
# Get the most common cs tag(s)
113113
most_common_tags = Counter(cs).most_common()
114114

115115
# If there's a unique most common tag, return it
@@ -134,13 +134,13 @@ def get_consensus(cs_tags: list[list[str]]) -> str:
134134

135135

136136
def consensus(cs_tags: list[str], positions: list[int], prefix: bool = False) -> str:
137-
"""generate consensus of CS tags
137+
"""generate consensus of cs tags
138138
Args:
139-
cs_tags (list): CS tags in the **long** format
139+
cs_tags (list): cs tags in the **long** format
140140
positions (list): 1-based leftmost mapping position (4th column in SAM file)
141-
prefix (bool, optional): Whether to add the prefix 'cs:Z:' to the CS tag. Defaults to False
141+
prefix (bool, optional): Whether to add the prefix 'cs:Z:' to the cs tag. Defaults to False
142142
Return:
143-
str: a consensus of CS tag in the **long** format
143+
str: a consensus of cs tag in the **long** format
144144
Example:
145145
>>> import cstag
146146
>>> cs_tags = ["=ACGT", "=AC*gt=T", "=C*gt=T", "=C*gt=T", "=ACT+ccc=T"]
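
Reviewer note on the `re.split(r"([-*~=])", cs_tag)[1:]` line in `split_cs_tags` above: because the pattern uses a capturing group, the operator symbols are kept in the result, alternating with the sequences that follow them. The sketch below shows that behavior in isolation, under the assumption that each symbol is simply re-joined with its sequence. `split_on_reference_ops` is a hypothetical name, and the rest of the real function is not visible in this hunk.

```python
import re


def split_on_reference_ops(cs_tag: str) -> list[str]:
    """Split a long-format cs tag on the operators that consume reference bases.

    Hypothetical illustration of the re.split call in the hunk above.
    """
    cs_tag = cs_tag.replace("cs:Z:", "")
    # The capturing group keeps the operators; [1:] drops the empty string
    # that precedes the first operator.
    parts = re.split(r"([-*~=])", cs_tag)[1:]
    # parts alternates operator, sequence, operator, sequence, ...
    return [op + seq for op, seq in zip(parts[0::2], parts[1::2])]


print(split_on_reference_ops("cs:Z:=AC*gt=T+ccc=T"))
# ['=AC', '*gt', '=T+ccc', '=T']
```

Since `+` is not in the character class, an insertion stays attached to the preceding element, which is consistent with the comment that insertions are not observed in the reference sequence.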

src/cstag/to_html.py

+2-2
Original file line number | Diff line number | Diff line change
@@ -83,7 +83,7 @@
8383

8484

8585
def append_mark_to_n(cs_tag: str) -> str:
86-
"""Process each CS tag by adding specific markers `@` to `N`."""
86+
"""Process each cs tag by adding specific markers `@` to `N`."""
8787

8888
def append_mark(cs: str) -> str:
8989
if cs.startswith("N"):
@@ -138,7 +138,7 @@ def process_cs_tag(cs_tag: str) -> str:
138138
def to_html(cs_tag: str, description: str = "") -> str:
139139
"""Output HTML string showing a sequence with mutations colored
140140
Args:
141-
cs_tag (str): CS tag in the **long** format
141+
cs_tag (str): cs tag in the **long** format
142142
description (str): (optional) header information in the output string
143143
Return:
144144
HTML string

src/cstag/to_pdf.py

+4-4
Original file line number | Diff line number | Diff line change
@@ -7,15 +7,15 @@
77

88
def to_pdf(cs_tag: str, description: str, path_out: str | Path) -> None:
99
"""
10-
Convert a CS tag and its description to a PDF file.
10+
Convert a cs tag and its description to a PDF file.
1111
12-
This function takes a CS (custom string) tag and its description, converts
12+
This function takes a cs (custom string) tag and its description, converts
1313
it to HTML using the `to_html` function, and then writes it to a PDF file
1414
using WeasyPrint.
1515
1616
Args:
17-
cs_tag (str): The CS tag to be converted.
18-
description (str): The description associated with the CS tag.
17+
cs_tag (str): The cs tag to be converted.
18+
description (str): The description associated with the cs tag.
1919
path_out (str | Path): The path where the output PDF file will be saved.
2020
2121
Returns:

src/cstag/to_sequence.py

+2-2
Original file line number | Diff line number | Diff line change
@@ -8,10 +8,10 @@ def to_sequence(cs_tag: str) -> str:
88
"""Reconstruct the reference subsequence in the alignment
99
1010
Args:
11-
cs_tag (str): CS tag in the **long** format
11+
cs_tag (str): cs tag in the **long** format
1212
1313
Returns:
14-
str: The sequence string derived from the CS tag.
14+
str: The sequence string derived from the cs tag.
1515
1616
Example:
1717
>>> import cstag

src/cstag/to_vcf.py

+6-6
Original file line number | Diff line number | Diff line change
@@ -96,7 +96,7 @@ def get_variant_annotations(cs_tag_split: list[str], position: int) -> list[Vcf]
9696

9797

9898
###########################################################
99-
# Format the CS tags
99+
# Format the cs tags
100100
###########################################################
101101

102102

@@ -146,7 +146,7 @@ def format_cs_tags(cs_tags: list[str], chroms: list[str] | list[int], positions:
146146

147147

148148
def group_by_chrom(cs_tags_formatted: list[tuple]) -> dict[str, tuple]:
149-
"""Group CS tags by chromosomes"""
149+
"""Group cs tags by chromosomes"""
150150
cs_tags_grouped = defaultdict(list)
151151
for cs in cs_tags_formatted:
152152
cs_tags_grouped[cs.chrom].append(
@@ -234,7 +234,7 @@ def add_vcf_fields(
234234

235235

236236
###########################################################
237-
# Process CS tag (One)
237+
# Process cs tag (One)
238238
###########################################################
239239

240240

@@ -259,7 +259,7 @@ def process_cs_tag(cs_tag: str, chrom: str | int, pos: int) -> str:
259259

260260

261261
###########################################################
262-
# Process CS tags (Many)
262+
# Process cs tags (Many)
263263
###########################################################
264264

265265

@@ -319,10 +319,10 @@ def process_cs_tags(cs_tags: list[str], chroms: list[str], positions: list[int])
319319

320320
def to_vcf(cs_tags: str | list[str], chroms: str | int | list[str] | list[int], positions: int | list[int]) -> str:
321321
"""
322-
Convert CS tag(s) to VCF (Variant Call Format) string.
322+
Convert cs tag(s) to VCF (Variant Call Format) string.
323323
324324
Args:
325-
cs_tag (str | list[str]): The CS tag representing the sequence alignment.
325+
cs_tag (str | list[str]): The cs tag representing the sequence alignment.
326326
chrom (str | list[str]): The chromosome name.
327327
pos (int | list[int]): The starting position for the sequence.
328328

src/cstag/utils/validator.py

+3-3
Original file line number | Diff line number | Diff line change
@@ -9,17 +9,17 @@ def validate_cs_tag(cs_tag: str) -> None:
99
)
1010

1111
if not pattern.fullmatch(cs_tag.replace("cs:Z:", "")):
12-
raise ValueError(f"Invalid CS tag: {cs_tag}")
12+
raise ValueError(f"Invalid cs tag: {cs_tag}")
1313

1414

1515
def validate_short_format(cs_tag: str) -> None:
1616
if re.search(r"=[ACGTN]+", cs_tag):
17-
raise ValueError("CS tag must be in short format")
17+
raise ValueError("cs tag must be in short format")
1818

1919

2020
def validate_long_format(cs_tag: str) -> None:
2121
if re.search(r":[0-9]+", cs_tag):
22-
raise ValueError("CS tag must be in long format")
22+
raise ValueError("cs tag must be in long format")
2323

2424

2525
def validate_threshold(threshold: int) -> None:
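
Reviewer note on the two format validators above: each one rejects a tag by searching for the marker of the *other* format (`=` followed by bases for long, `:` followed by digits for short). The sketch below reuses those same two regular expressions to classify a tag instead of raising. `detect_cs_format` and its return labels are hypothetical and do not exist in `cstag`.

```python
import re


def detect_cs_format(cs_tag: str) -> str:
    """Classify a cs tag using the same patterns as the validators above.

    Hypothetical helper for illustration; not part of cstag.
    """
    has_long_match = bool(re.search(r"=[ACGTN]+", cs_tag))
    has_short_match = bool(re.search(r":[0-9]+", cs_tag))
    if has_long_match and has_short_match:
        return "mixed"
    if has_long_match:
        return "long"
    if has_short_match:
        return "short"
    # A tag without any match run (e.g. only substitutions) passes both validators.
    return "either"


print(detect_cs_format("=ACGT*ag=CGT"))
print(detect_cs_format(":4*ag:3"))
```

Note that the `cs:Z:` prefix does not trigger the short-format pattern, because its colons are not followed by digits.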

tests/test_to_vcf.py

+3-3
Original file line number | Diff line number | Diff line change
@@ -79,7 +79,7 @@ def test_get_variant_annotations():
7979

8080

8181
###########################################################
82-
# Format the CS tags
82+
# Format the cs tags
8383
###########################################################
8484

8585

@@ -202,7 +202,7 @@ def test_add_vcf_fields():
202202

203203

204204
###########################################################
205-
# process_cs_tag: Single CS tag
205+
# process_cs_tag: Single cs tag
206206
###########################################################
207207

208208

@@ -227,7 +227,7 @@ def test_process_cs_tag():
227227

228228

229229
###########################################################
230-
# process_cs_tags: Multuple CS tags
230+
# process_cs_tags: Multuple cs tags
231231
###########################################################
232232

233233
