Questions tagged [fasta]
To be used for questions specific to the sequence file format `.fasta`. Please minimise usage if the question is more generally about sequence formats.
247 questions
2
votes
0
answers
5
views
Error "no rows to aggregate" using makeFunctionalPrediction() Tax4fun2 (R)
I'm running Tax4Fun2 (v1.1.5) on Ubuntu via WSL2. The runRefBlast() step completed successfully, but when I call makeFunctionalPrediction(), I get the following error:
...
0
votes
1
answer
81
views
FASTA Files in the old style
Some time ago, I worked with NCBI FASTA files--specifically human refseq files--that had record headers in a style and with information that is no longer used. For example, headers of this style
...
0
votes
0
answers
36
views
Databases for mtDNA with location/age
I'm doing some experiments on mtDNA, trying to infer samples' location/age based on the similarity among sequences. I was able to find good samples (with geographical coordinates and age range) for my ...
1
vote
2
answers
237
views
How to align two FASTAs and extract the aligned part?
I have two FASTA files and I'd like to align them and extract the aligned part. I've only found examples (several of them) of aligning FASTQs against FASTA references, so I'm not sure if what I want ...
3
votes
2
answers
108
views
Multi-pattern search in aligned sequences
I am currently working on a bioinformatics problem where I need to look up the locations and count the occurrences of roughly 4,000 five-character patterns in each sequence of a 700 GB FASTA file....
4
votes
1
answer
107
views
Different line length in sequence 'chrY'
I just downloaded a reference genome (using wget), and attempted to use it with samtools view. However, I received the following error:
...
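For readers who hit this message: samtools' FASTA indexing expects every sequence line within a record, except the last, to have the same length. The cause in this particular question is elided, so the following is only a minimal sketch of one common workaround, rewrapping each record to a fixed width. The function name `rewrap_fasta` and the default width of 60 are assumptions made for illustration, not part of samtools or of the original question.

```python
import io


def rewrap_fasta(in_handle, out_handle, width=60):
    """Rewrite records so every sequence line except the last has `width` characters.

    Hypothetical helper: it buffers one whole record in memory, which may be
    heavy for very large chromosomes.
    """
    chunks = []

    def flush():
        # Emit the buffered sequence in fixed-width slices, then reset the buffer.
        seq = "".join(chunks)
        for i in range(0, len(seq), width):
            out_handle.write(seq[i:i + width] + "\n")
        chunks.clear()

    for line in in_handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            flush()                      # finish the previous record first
            out_handle.write(line + "\n")
        elif line:
            chunks.append(line.strip())  # blank lines are dropped
    flush()


# Small in-memory demonstration with uneven line lengths (6, 2 and 8 bases).
source = io.StringIO(">chrY\nACGTAC\nGT\nACGTACGT\n")
target = io.StringIO()
rewrap_fasta(source, target, width=4)
rewrapped = target.getvalue()
print(rewrapped, end="")
```

With real files, the two `StringIO` objects would be replaced by handles from `open(...)`, and the rewrapped file would then need to be re-indexed.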
1
vote
0
answers
78
views
Need to find an intersect of sequences inside fasta files (no standardized sequence names)
I have several multifasta files (each containing between 3,000 and 3,600 sequences). The sequence names were derived from the genome where the sequence was extracted and they bear no significance or ...
3
votes
1
answer
46
views
Compare FASTQ reconstructions to mtDNA
I have some doubts about how to compare ancient mtDNA with modern mtDNA. A little more context:
The problem:
I have some FASTQs associated with a given FASTA. They contain mtDNA. My goal is to ...
1
vote
2
answers
129
views
How to know if FASTQ/BAM is from reference genome (FASTA)?
I'm new to bioinformatics. I have a problem in which I have a FASTA reference genome and lots of reads in FASTQ files. Some of them could be contaminants, so I'd like to filter them out and get only ...
0
votes
1
answer
70
views
Dealing with X Residues in FASTA files prior to folding analysis
We are working with a large dataset of proteins for folding/docking tests across various tools (AlphaFold, ESM2, RoseTTAFold). In some of the FASTA files, there is an X for a non-standard residue. An ...
1
vote
4
answers
1k
views
Remove sequences from a fasta file with IDs from a text file using Python
A Python beginner here.
I have a fasta file with 2500+ sequences, and after doing some analysis I want to remove around 200+ sequences based on the matching IDs. Now, I have one fasta file (as sample....
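The task described above can be sketched with the standard library alone. This is a minimal illustration, assuming the text file holds one ID per line and that a record's ID is the first whitespace-delimited word after `>`; the names `read_ids` and `filter_fasta` are hypothetical, and the sample data is made up for the demonstration.

```python
import io


def read_ids(handle):
    """Collect one ID per non-empty line into a set for fast lookup."""
    return {line.strip() for line in handle if line.strip()}


def filter_fasta(fasta_handle, ids_to_remove, out_handle):
    """Copy every record whose ID is not listed in ids_to_remove."""
    keep = True
    for line in fasta_handle:
        if line.startswith(">"):
            # Assumption: the ID is the first word of the header line.
            header = line[1:].split()
            record_id = header[0] if header else ""
            keep = record_id not in ids_to_remove
        if keep:
            out_handle.write(line)


# In-memory demonstration; real use would pass handles from open(...).
fasta = io.StringIO(">seq1 sample A\nACGT\nACGT\n>seq2\nGGCC\n>seq3 sample C\nTTAA\n")
ids = io.StringIO("seq2\n\n")
out = io.StringIO()
filter_fasta(fasta, read_ids(ids), out)
filtered = out.getvalue()
print(filtered, end="")
```

Streaming line by line keeps memory use flat regardless of file size. If the IDs in the text file include the full header rather than just the first word, the matching rule would need to change accordingly.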
1
vote
1
answer
205
views
How to remove third codon positions from a charset in iqtree?
I need to build a phylogenetic tree using IQ-TREE, starting from a sequence alignment in CODON format of several invertebrate mitochondrial genes. These are my charsets:
...
0
votes
0
answers
75
views
How to troubleshoot this error: conversion of .SRA to FASTA file on command prompt?
I am getting this error message after using the following code:
C:\sratoolkit.3.0.7-win64\sratoolkit.3.0.7-win64\bin>fastq-dump --fasta SRR1658345
Error:
...
1
vote
1
answer
190
views
Get a certain gene sequence from bam/vcf and reference
I need to get a fasta sequence of a certain gene for a certain worm strain that is different from reference. I have a reference genome, BAM for the strain of interest, and coordinates of the gene. I ...
3
votes
0
answers
38
views
How to identify TF binding motifs?
I have DNA sequences from a single gecko species for a set of genes (SULF1, SOX9, SAL1). I want to compare the putative promoter sequences for each gene, and compare them to the promoter regions of ...