Questions tagged [fasta]

To be used for questions specific to the sequence file format `.fasta`. Please minimise usage if the question is more generally about sequence formats.

2 votes
0 answers
5 views

Error "no rows to aggregate" using makeFunctionalPrediction() in Tax4Fun2 (R)

I’m running Tax4Fun2 (v1.1.5) on Ubuntu via WSL2. The runRefBlast() step completed successfully, but when I call makeFunctionalPrediction(), I get the following error: ...
Marine St
0 votes
1 answer
81 views

FASTA Files in the old style

Some time ago, I worked with NCBI FASTA files--specifically human RefSeq files--that had record headers in a style and with information that is no longer used. For example, headers of this style ...
Mark Pauley
0 votes
0 answers
36 views

Databases for mtDNA with location/age

I'm doing some experiments on mtDNA, trying to infer samples' location/age based on the similarity among sequences. I was able to find good samples (with geographical coordinates and age range) for my ...
dyxcvi
  • 151
1 vote
2 answers
237 views

How to align two FASTAs and extract the aligned part?

I have two FASTA files and I'd like to align them and extract the aligned part. I've only found examples (several of them) of aligning FASTQs to FASTA references, so I'm not sure if what I want ...
dyxcvi
  • 151
3 votes
2 answers
108 views

Multi-pattern search in aligned sequences

I am currently working on a bioinformatics problem where I need to look up the locations and count the occurrences of roughly 4,000 five-character patterns in each sequence of a 700 GB FASTA file. ...
Swathi Subramanyan
4 votes
1 answer
107 views

Different line length in sequence 'chrY'

I just downloaded a reference genome (using wget), and attempted to use it with samtools view. However, I received the following error: ...
Wouter De Coster
1 vote
0 answers
78 views

Need to find the intersection of sequences across FASTA files (no standardized sequence names)

I have several multi-FASTA files (each containing between 3,000 and 3,600 sequences). The sequence names were derived from the genome from which each sequence was extracted, and they bear no significance or ...
Miguel Prieto
3 votes
1 answer
46 views

Compare FASTQ reconstructions to mtDNA

I have some doubts about how to compare ancient mtDNA with modern mtDNA. A little more context: The problem: I have some FASTQs associated with a given FASTA. They contain mtDNA. My goal is to ...
dyxcvi
  • 151
1 vote
2 answers
129 views

How to know if FASTQ/BAM is from reference genome (FASTA)?

I'm new to bioinformatics. I have a problem in which I have a FASTA reference genome and lots of reads in FASTQ files. Some of them could be contaminants, so I'd like to filter them out and get only ...
dyxcvi
  • 151
0 votes
1 answer
70 views

Dealing with X Residues in FASTA files prior to folding analysis

We are working with a large dataset of proteins for folding/docking tests across various tools (AlphaFold, ESM2, RoseTTAFold). In some of the FASTA files, there is an X for a non-standard residue. An ...
Brian Root
1 vote
4 answers
1k views

Remove sequences from a fasta file with IDs from a text file using Python

A Python beginner here. I have a FASTA file with 2500+ sequences, and after doing some analysis I want to remove around 200+ sequences based on matching IDs. Now, I have one FASTA file (as sample....
Irfan
  • 81
1 vote
1 answer
205 views

How to remove third codon positions from a charset in IQ-TREE?

I need to build a phylogenetic tree using IQ-TREE, starting from a sequence alignment in CODON format of several invertebrate mitochondrial genes. These are my charsets: ...
Francesco De Giglio
0 votes
0 answers
75 views

How do I troubleshoot this error when converting .SRA to FASTA at the command prompt?

I am getting this error message after using the following code: C:\sratoolkit.3.0.7-win64\sratoolkit.3.0.7-win64\bin>fastq-dump --fasta SRR1658345 Error: ...
Sanjukta Ghosh
1 vote
1 answer
190 views

Get a certain gene sequence from bam/vcf and reference

I need to get a fasta sequence of a certain gene for a certain worm strain that is different from reference. I have a reference genome, BAM for the strain of interest, and coordinates of the gene. I ...
user98747
3 votes
0 answers
38 views

How to identify TF binding motifs?

I have DNA sequences from a single gecko species for a set of genes (SULF1, SOX9, SAL1). I wanted to compare the putative promoter sequences for each gene, and compare them to the promoter regions of ...
JohnDoe23
  • 101
