seqwalk.filtering
Module Contents
Functions
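| rc(seq) | reverse complement of DNA sequence |
| filter_rc_3letter(library, k) | filter library to be RC free |
| rc_hash_filtering(library, k) | filter any library to be RC free, using simple hash approach |
| filter_gc(library, gc_min, gc_max) | filters library for sequences that have desired GC content |
| filter_pattern(library, pattern) | filters library to remove specific patterns |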
- seqwalk.filtering.rc(seq)
reverse complement of DNA sequence
- Parameters:
seq – string with letters in {A, C, G, T}
- Returns:
string corresponding to reverse complement
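For illustration, a reverse complement over the {A, C, G, T} alphabet can be sketched as below. The name `rc_sketch` is hypothetical and this is not the seqwalk source, only a minimal stand-in showing the expected input and output.

```python
# Hypothetical stand-in for illustration; not the seqwalk implementation.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rc_sketch(seq: str) -> str:
    """Complement each base, then reverse the resulting string."""
    return seq.translate(_COMPLEMENT)[::-1]


print(rc_sketch("AACG"))  # CGTT
```

Applying the operation twice returns the original sequence.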
- seqwalk.filtering.filter_rc_3letter(library, k)
filter library to be RC free (Supplementary note X)
- Parameters:
library – list of sequences
k – SSM k value
- Returns:
- filtered_library
list of sequences without reverse complementary k-mers
- Return type:
list of strings
- seqwalk.filtering.rc_hash_filtering(library, k)
filter any library to be RC free using a simple hash approach; could be slow for large libraries
- Parameters:
library – list of sequences
k – SSM k value
- Returns:
- filtered_library
list of sequences without reverse complementary k-mers
- Return type:
list of strings
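One way a hash-based RC filter could work is sketched below. This is an assumption about the approach, not the seqwalk code: it keeps sequences greedily in input order and drops any sequence whose k-mers have a reverse complement among the k-mers already kept, or within the sequence itself. The names `rc_free_sketch` and `_rc` are hypothetical.

```python
# Hypothetical sketch of a set-based RC filter; the actual seqwalk
# function may select sequences differently.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _rc(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def rc_free_sketch(library, k):
    seen = set()  # k-mers of all sequences kept so far
    kept = []
    for seq in library:
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        rc_kmers = {_rc(kmer) for kmer in kmers}
        # Conflict with an earlier kept sequence, or within this sequence
        # (which also covers k-mers equal to their own reverse complement).
        if rc_kmers & seen or rc_kmers & kmers:
            continue
        seen |= kmers
        kept.append(seq)
    return kept


print(rc_free_sketch(["AAAC", "GTTT", "CCCA"], 3))  # ['AAAC', 'CCCA']
```

Every sequence contributes all of its k-mers to the set, so memory grows with the total number of k-mers in the library, which is consistent with the note that the approach could be slow for large libraries.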
- seqwalk.filtering.filter_gc(library, gc_min, gc_max)
filters library for sequences that have desired GC content
- Parameters:
library – list of sequences in string representation
gc_min – minimum number of GC bases (int)
gc_max – maximum number of GC bases (int)
- Returns:
- filtered_library
list of sequences in string representation
- Return type:
list of strings
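A minimal sketch of a GC-count filter follows. Note that the bounds are base counts, not fractions. Whether the real function treats `gc_min` and `gc_max` as inclusive is an assumption here, and `filter_gc_sketch` is a hypothetical name.

```python
# Hypothetical sketch; assumes both bounds are inclusive base counts.
def filter_gc_sketch(library, gc_min, gc_max):
    filtered = []
    for seq in library:
        gc_count = seq.count("G") + seq.count("C")
        if gc_min <= gc_count <= gc_max:
            filtered.append(seq)
    return filtered


print(filter_gc_sketch(["AAAA", "ACGT", "GGCC"], 1, 2))  # ['ACGT']
```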
- seqwalk.filtering.filter_pattern(library, pattern)
filters library to remove specific patterns
- Parameters:
library – list of sequences in string representation
pattern – sequence pattern to be prevented
- Returns:
- filtered_library
list of sequences in string representation
- Return type:
list of strings
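As a rough illustration, pattern filtering can be sketched as a plain substring check. Treating `pattern` as a literal substring (rather than, say, a regular expression) is an assumption, and `filter_pattern_sketch` is a hypothetical name.

```python
# Hypothetical sketch; assumes `pattern` is matched as a literal substring.
def filter_pattern_sketch(library, pattern):
    return [seq for seq in library if pattern not in seq]


print(filter_pattern_sketch(["ACGGGGT", "ACGTACG"], "GGGG"))  # ['ACGTACG']
```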