1. Introduction

Phylogeny.fr is designed to provide a high-performance platform that transparently chains programs relevant to phylogenetic analysis into a comprehensive and flexible pipeline. Although phylogeny aficionados will find most of their favorite tools and can run sophisticated analyses, the primary goal of Phylogeny.fr is to help biologists with no experience in phylogeny analyze their data in a robust way.

Note: The Phylogeny.fr platform offers a phylogeny pipeline that can be run in three main modes ("One Click", Advanced, and "A la carte"), so you can pick the one that fits your expertise level and analysis requirements.

Alternatively, users can run each program independently in the Tools section.

For each mode and tool, an optional email address can be submitted to receive a notification when the analysis completes. This address is retained only for the duration of the job and deleted as soon as the job completion confirmation email has been sent. No sensitive data is therefore stored on the server.

2. Phylogeny Analysis

2.1 "One Click" mode

This is the default mode. It offers a preconfigured pipeline that chains programs recognized for their accuracy and speed to reconstruct a robust phylogenetic tree from a set of sequences:

  • MAFFT has been chosen for rapid and robust multiple sequence alignment.
  • ClipKit can optionally be used for alignment curation (trimming unreliable regions).
  • FastTree (Fast mode) or IQ-TREE (Accurate mode) can be selected for phylogeny.

You only have to upload a FASTA file or paste your sequences, toggle alignment curation on if needed, and select an execution mode. The system handles everything else, automatically formatting data between steps and applying appropriate parameters.

When enabled, the curation step (ClipKit) removes poorly aligned positions and highly divergent regions before tree inference. At the end of the analysis, an interactive, exportable rendering of the generated tree is available.

Default Parameters

For transparency, below is the exhaustive list of the parameters used by the tools behind the scenes during a One Click run:

Parameter                 Default Value
Mafft (Alignment)
  Alignment strategy      Default (FFT-NS-2)
ClipKit (Alignment Curation)
  Mode                    Gappy (removes columns with high gap frequency)
  Gaps threshold          0.4 (40%)
FastTree (Fast Phylogeny)
  Evolutionary Model      Default (JTT+CAT for proteins, JC+CAT for DNA)
  Branch Support          SH-like local supports
IQ-TREE (Accurate Phylogeny)
  Model Selection         Auto (ModelFinder)
  Branch Support          Ultrafast Bootstrap (5000 replicates)
  Burn-in iterations      200
PhyloTree Tool (Tree Utilities)
  Rooting                 Midpoint Rooting
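
As a rough illustration, the defaults above could translate into command lines like the ones built below. This is a sketch under assumptions, not the platform's actual invocation: the helper `one_click_commands` and the file names are hypothetical, and the flags are the standard command-line options of each tool. The "Burn-in iterations" setting and midpoint rooting are left out because their command-line equivalents are not stated here.

```python
def one_click_commands(fasta, mode="fast", curate=True, nucleotide=False):
    """Build argv lists for a One Click-style run. Nothing is executed."""
    aligned = fasta + ".aln"  # hypothetical name for the MAFFT output file
    # MAFFT prints the alignment to stdout; these options select FFT-NS-2.
    commands = [["mafft", "--retree", "2", "--maxiterate", "0", fasta]]
    tree_input = aligned
    if curate:
        tree_input = aligned + ".clipkit"
        # ClipKit in gappy mode with a 0.4 gaps threshold.
        commands.append(
            ["clipkit", aligned, "-m", "gappy", "-g", "0.4", "-o", tree_input]
        )
    if mode == "fast":
        # FastTree expects proteins unless -nt is given.
        tree_cmd = ["FastTree"] + (["-nt"] if nucleotide else []) + [tree_input]
    elif mode == "accurate":
        # -m MFP runs ModelFinder; -B sets the ultrafast bootstrap replicates.
        tree_cmd = ["iqtree2", "-s", tree_input, "-m", "MFP", "-B", "5000"]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    commands.append(tree_cmd)
    return commands
```

Each returned list could be passed to `subprocess.run`, redirecting the standard output of MAFFT and FastTree to files.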

2.2 Advanced mode

The Advanced mode follows the same robust pipeline philosophy as "One Click" mode, utilizing a predefined succession of programs (MAFFT, ClipKit, and IQ-TREE). However, it provides you with complete control over the parameters for each step of the analysis.

Tip: If you are unsure about a specific parameter, the Advanced mode forms provide descriptive tooltips for each option to help you make scientifically sound and appropriate choices.

2.3 "A la carte" mode

The "A la carte" mode provides the ultimate flexibility for building your phylogenetic analysis pipeline. This mode allows you to handpick a tool for each step of the analysis, giving complete control over the methodology.

Ideal if: you want to compare different tools, test specific methodologies, or have particular requirements that the standard pipelines do not cover.

Two-Phase Workflow

The A la carte mode operates in two distinct phases:

  1. Pipeline Design: Select the tools to include in the analysis. A visual pipeline at the top of the page shows the current selection in real-time.
  2. Configuration & Launch: After confirming the tool selection, configure the settings for each tool individually, upload the input FASTA file, and launch the analysis.

Flexible Selection: It is not required to select a tool for every step. For example, you can run only an alignment tool, or skip the curation step entirely. However, at least one tool must be selected to proceed.
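
The selection rule can be sketched as a small validation function. This is only an illustration: the step names and the `build_pipeline` helper are assumptions, not the platform's code.

```python
STEPS = ("alignment", "curation", "phylogeny")  # pipeline order


def build_pipeline(selection):
    """Return the chosen (step, tool) pairs in pipeline order.

    `selection` maps a step name to a tool name; steps may be omitted.
    """
    unknown = set(selection) - set(STEPS)
    if unknown:
        raise ValueError(f"unknown step(s): {sorted(unknown)}")
    pipeline = [(step, selection[step]) for step in STEPS if selection.get(step)]
    if not pipeline:
        raise ValueError("select at least one tool")
    return pipeline
```

For example, `build_pipeline({"alignment": "Mafft"})` describes an alignment-only run, while an empty selection is rejected.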

How It Works

  1. Navigate to the A la carte page from the main menu.
  2. Click on the tool tiles to select your preferred tool for each step (Alignment, Curation, Phylogeny). Selected tools are highlighted with a checkmark.
  3. Click "Continue to Configuration" to proceed.
  4. Upload your FASTA file or paste your sequences directly.
  5. Configure each selected tool's parameters using the provided forms.
  6. Optionally enter your email to receive a notification when the analysis completes.
  7. Click "Submit Analysis" to start the pipeline.

3. BLAST Explorer

The BLAST Explorer is used to search for sequences similar to a given query within major biological databases. Beyond standard BLAST searches, our platform automatically builds a phylogenetic tree from the top hits, enabling you to explore the evolutionary relationships among homologous sequences interactively.

Unique Feature: Unlike traditional BLAST services, Phylogeny.fr automatically aligns your query with up to 100 of the best hits using MAFFT and reconstructs a phylogenetic tree with FastTree, giving you immediate evolutionary context.
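
A minimal sketch of the hit selection step, assuming that "best" means highest bitscore (the ranking criterion is not stated here) and that hits arrive as `(sequence_id, bitscore)` pairs; `top_hits` is a hypothetical helper.

```python
def top_hits(hits, limit=100):
    """Return the IDs of the `limit` highest-scoring hits, best first."""
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    return [seq_id for seq_id, _ in ranked[:limit]]
```

The selected sequences, together with the query, would then go through alignment and tree inference.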

3.1 Submitting a query

To run a BLAST search, provide a single sequence in FASTA format, either as a file or by pasting it directly into the text box.
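
Before submitting, the single-sequence requirement can be checked locally. The function below is a minimal sketch with a hypothetical name; the platform's own validation may differ.

```python
def read_single_fasta(text):
    """Return (header, sequence) if `text` holds exactly one FASTA record."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA input must start with a '>' header line")
    headers = [line for line in lines if line.startswith(">")]
    if len(headers) != 1:
        raise ValueError(f"expected 1 sequence, found {len(headers)}")
    sequence = "".join(lines[1:])
    if not sequence:
        raise ValueError("the sequence is empty")
    return lines[0][1:], sequence
```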

BLAST Programs

Select the appropriate program based on the query and target database types:

Program    Query         Database      Use case
blastp     Protein       Protein       Find homologous proteins
blastn     Nucleotide    Nucleotide    Find similar DNA/RNA sequences
blastx     Nucleotide    Protein       Translate DNA and search proteins
tblastn    Protein       Nucleotide    Search protein against translated DNA
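
The program choice above amounts to a lookup on the (query, database) pair, sketched here with a hypothetical `choose_program` helper.

```python
BLAST_PROGRAMS = {
    # (query type, database type): program
    ("protein", "protein"): "blastp",
    ("nucleotide", "nucleotide"): "blastn",
    ("nucleotide", "protein"): "blastx",
    ("protein", "nucleotide"): "tblastn",
}


def choose_program(query_type, database_type):
    """Return the BLAST program matching the query and database types."""
    key = (query_type.lower(), database_type.lower())
    if key not in BLAST_PROGRAMS:
        raise ValueError(f"unsupported combination: {key}")
    return BLAST_PROGRAMS[key]
```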

Available Databases

A variety of sequence databases are available:

Database                  Type          Description
Protein Databases
  nr_cluster              Protein       NCBI non-redundant protein sequences (clustered)
  SwissProt               Protein       Curated, high-quality UniProt entries
  UniRef90                Protein       UniProt clustered at 90% identity
  PDB                     Protein       Protein Data Bank sequences with known 3D structures
  RefSeq Viral Protein    Protein       NCBI curated viral protein sequences
Nucleotide Databases
  Core Nucleotide         Nucleotide    NCBI core nucleotide collection
  PDB                     Nucleotide    PDB nucleotide sequences
  RefSeq Viral Genomic    Nucleotide    NCBI curated viral genomic sequences

E-value Threshold

The E-value (Expect value) is the number of hits with an equal or better score that would be expected by chance in a database of the same size. A lower E-value indicates a more significant match:

  • 1e-5 (default): Good balance between sensitivity and specificity.
  • 1e-10 to 1e-30: More stringent, returns only highly confident hits.
  • 0.1 to 1: More permissive, may include distant homologs.

Tip: If your initial search returns no hits, try increasing the E-value threshold or switching to a broader database like nr_cluster.
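
To see why a larger database calls for a higher score to reach the same significance, recall the standard BLAST relation between the E-value and the bit score S': E = m × n × 2^(−S'), where m and n are the effective lengths of the query and the database. The sketch below applies it directly; the function name and the lengths in the usage note are illustrative assumptions, and BLAST itself applies length corrections not shown here.

```python
def expected_hits(bitscore, query_length, database_length):
    """E-value from a bit score: E = m * n * 2**(-S')."""
    return query_length * database_length * 2.0 ** (-bitscore)
```

With these assumptions, each additional bit of score halves the E-value, and doubling the database length doubles it. Compare, for instance, `expected_hits(50, 300, 1e9)` with `expected_hits(60, 300, 1e9)`.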

Optional Clustering

MMseqs2 can optionally be used to cluster the results. Because the number of returned sequences is limited, clustering helps avoid over-representing one kind of organism among the hits.
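
For reference, clustering a FASTA file of hits with the MMseqs2 command line typically uses the `easy-cluster` workflow. The sketch below only builds the command; the file names, the helper name, and the 0.9 identity threshold are assumptions, since the platform's clustering settings are not stated here.

```python
def mmseqs_cluster_command(fasta, prefix="clusters", tmp_dir="tmp", min_seq_id=0.9):
    """Build an `mmseqs easy-cluster` argv list. Nothing is executed."""
    return [
        "mmseqs", "easy-cluster", fasta, prefix, tmp_dir,
        "--min-seq-id", str(min_seq_id),  # minimum identity within a cluster
    ]
```

Keeping one representative sequence per cluster is what reduces redundancy among closely related hits.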

3.2 Interactive results

Once the BLAST job completes, an interactive visualization is displayed, combining a phylogenetic tree with a floating Toolbox panel dedicated to filtering and exporting results.

Phylogenetic Tree Display

The tree displays the query sequence alongside the top BLAST hits, making it possible to visualize evolutionary relationships at a glance:

  • Query highlighting: The input sequence is visually emphasized in the tree.
  • Taxonomy coloring: Sequences are colored according to taxonomic groups (with adjustable depth).
  • Interactive selection: Tree leaves can be clicked directly to select or deselect sequences.
  • External links: Sequence IDs provide access to the corresponding original database entries.

The Toolbox Panel

The floating Toolbox serves as the main control center for exploring BLAST results. It can be moved freely on the screen and provides two distinct modes accessible through tabs.

Tree View Tab

This tab focuses on the top hits displayed in the phylogenetic tree (typically the highest-scoring sequences). It is intended for precise, small-scale selection:

  • Selection counter: Displays the number of currently selected sequences relative to the total number of tree leaves.
  • Quick select buttons: Allows all tree sequences to be selected or deselected instantly.
  • Bitscore filter pills: Filters sequences according to score ranges (<40, 40–50, 50–80, 80–200, >200) using color-coded filters.
  • Taxonomy depth slider: Adjusts the displayed taxonomic classification level.
  • Gap-free indicator: Displays the percentage of alignment positions without gaps for the current selection, providing an indicator of alignment quality.
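
Two of these controls are simple computations, sketched below under stated assumptions: gaps are written as "-", the selected sequences come from the same alignment, and the boundaries of the bitscore ranges are treated as lower-inclusive except for 200, which is placed in the 80-200 range. The function names are hypothetical.

```python
def gap_free_percentage(aligned_sequences):
    """Percentage of alignment columns in which no sequence has a gap."""
    if not aligned_sequences:
        return 0.0
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("aligned sequences must have equal length")
    if length == 0:
        return 0.0
    gap_free = sum(1 for column in zip(*aligned_sequences) if "-" not in column)
    return 100.0 * gap_free / length


def bitscore_bin(score):
    """Return the label of the bitscore range that `score` falls into."""
    if score < 40:
        return "<40"
    if score < 50:
        return "40-50"
    if score < 80:
        return "50-80"
    if score <= 200:
        return "80-200"
    return ">200"
```

Deselecting a sequence that carries many gaps usually raises the gap-free percentage of the remaining selection.
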

All Hits Tab

This tab operates on all BLAST hits (up to 500), not only those displayed in the tree. It is intended for large-scale filtering across the complete result set:

  • E-value threshold: Dynamically filters hits according to E-value (from 1e-5 to 1e-100).
  • Histogram filters: Interactive histograms allow sequence selection based on:
    • Score (bitscore): Filtering by alignment score strength.
    • Similarity (% identity): Filtering by sequence similarity percentage.
    • Coverage (% query coverage): Filtering by query coverage percentage.
  • Taxonomy filter: A hierarchical taxonomy tree enables selection of sequences belonging to specific taxonomic groups (for example, only Bacteria or only Vertebrates).
  • Select in table: A selection tool based on the BLAST table results.
  • Reset selection: Clears all filters and restores the initial state.

Tip: Use the All Hits tab when you need to work with sequences not included in the tree visualization. The histogram filters let you identify and select sequences based on precise score ranges.
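
Combined, these filters behave like a conjunction of thresholds over the hit table. The sketch below assumes a hypothetical `Hit` record with percentages on a 0 to 100 scale; it is not the platform's data model.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    seq_id: str
    evalue: float
    bitscore: float
    identity: float  # % identity
    coverage: float  # % query coverage


def filter_hits(hits, max_evalue=1e-5, min_bitscore=0.0,
                min_identity=0.0, min_coverage=0.0):
    """Keep the hits that satisfy every threshold."""
    return [
        hit for hit in hits
        if hit.evalue <= max_evalue
        and hit.bitscore >= min_bitscore
        and hit.identity >= min_identity
        and hit.coverage >= min_coverage
    ]
```

For example, `filter_hits(hits, max_evalue=1e-30, min_coverage=80)` keeps only confident hits that cover most of the query.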

Exporting & Pipeline Integration

Both tabs provide export options for the selected sequences:

  • Download FASTA: Export selected sequences as a FASTA file for external analysis.
  • Send to Pipeline: Directly transfer selected sequences to the One Click or Advanced phylogeny pipelines. The sequences are automatically loaded into the analysis form.
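
A downloaded selection is an ordinary FASTA file. As a minimal sketch of the format (the helper name and the 60-character line width are assumptions), selected sequences could be serialized like this:

```python
def to_fasta(records, width=60):
    """Format (identifier, sequence) pairs as FASTA text."""
    lines = []
    for identifier, sequence in records:
        lines.append(f">{identifier}")
        # Wrap the sequence into fixed-width lines.
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "".join(line + "\n" for line in lines)
```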

Taxonomy Legend

Below the tree, a dynamic legend displays the taxonomic groups present in the results with their associated colors. The groups shown depend on the taxonomy depth setting in the Toolbox.
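
The effect of the depth setting on the legend can be sketched as follows, assuming each sequence carries a lineage listed from the root downwards and that depth 1 is the top rank. The palette, the function names, and the fallback to the deepest available rank are illustrative assumptions.

```python
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"]


def taxon_at_depth(lineage, depth):
    """Return the taxon at `depth` (1 = top rank), or the deepest one available."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    if not lineage:
        return "Unclassified"
    return lineage[min(depth, len(lineage)) - 1]


def build_legend(lineages, depth):
    """Map each taxonomic group present at `depth` to a color."""
    groups = sorted({taxon_at_depth(lineage, depth) for lineage in lineages.values()})
    return {group: PALETTE[i % len(PALETTE)] for i, group in enumerate(groups)}
```

Increasing the depth splits broad groups into finer ones, so the legend typically grows as the slider moves toward deeper ranks.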

4. Tools & Versions

Below is the list of all bioinformatics tools integrated in the platform, along with their current version.

Tool           Description                                                               Version           Max Sequences
Blast          Basic Local Alignment Search Tool                                         2.16.0            1
MMseqs2        Ultra-fast and sensitive sequence search and clustering                   sse2
Mafft          Multiple sequence alignment program for large datasets                    7.526             200
Muscle         Fast and accurate multiple sequence alignment                             5.3               200
Clustal Omega  Fast multiple sequence alignment                                          1.2.4             200
TCoffee        Advanced multiple sequence alignment program                              13.46.0.919e8c6b  200
GBlocks        Alignment curation: eliminates poorly aligned and divergent regions       0.91b             200
ClipKit        Alignment curation using smart-gap trimming                               2.11.4            200
BMGE           Alignment curation using gap trimming                                     2.0               200
FastTree       Approximately-maximum-likelihood phylogenetic tree inference              2.2.0             200
IQ-Tree        Maximum-likelihood phylogenetic inference with automatic model selection  2.4.0             200
RAxML-NG       Randomized Axelerated Maximum Likelihood phylogenetic inference           2.0.0             200
MrBayes        Bayesian inference of phylogenetic trees                                  3.2.7a            50
PhyML          Maximum-likelihood phylogenetic inference                                 3.3.20241207      50
Fastphylo      Fast tools for phylogenetics: distance computation and neighbor-joining   1.0.1
BioNJ          Neighbor-joining phylogenetic inference                                   1.0               50
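
When planning a run, the sequence limits can be checked up front. The sketch below copies the limits listed above into a dictionary; tools with no listed limit are omitted, and the helper name is hypothetical.

```python
MAX_SEQUENCES = {
    "Blast": 1,
    "Mafft": 200, "Muscle": 200, "Clustal Omega": 200, "TCoffee": 200,
    "GBlocks": 200, "ClipKit": 200, "BMGE": 200,
    "FastTree": 200, "IQ-Tree": 200, "RAxML-NG": 200,
    "MrBayes": 50, "PhyML": 50, "BioNJ": 50,
}


def tools_over_limit(tools, n_sequences):
    """Return the selected tools whose sequence limit would be exceeded."""
    return [
        tool for tool in tools
        if tool in MAX_SEQUENCES and n_sequences > MAX_SEQUENCES[tool]
    ]
```

For a dataset of 120 sequences, for instance, Mafft and ClipKit are within their limits while PhyML is not.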
