Scikit Bio

Author: davila7

Scikit Bio analyzes biological sequences (DNA, RNA, protein) with the scikit-bio Python library, covering sequence handling, alignment, phylogenetics, and diversity analysis for bioinformatics research.

Install on your platform

Run in terminal (recommended)

claude mcp add scikit-bio npx -- -y @trustedskills/scikit-bio

Or manually add to ~/.claude/settings.json

{
  "mcpServers": {
    "scikit-bio": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/scikit-bio"
      ]
    }
  }
}

Requires Claude Code (claude CLI). Run claude --version to verify your install.

About This Skill

What it does

The Scikit Bio skill lets AI agents analyze biological data with the scikit-bio Python library. It covers manipulating DNA, RNA, and protein sequences; reading and writing biological file formats (FASTA, FASTQ, GenBank); performing sequence alignments; constructing phylogenetic trees; and calculating diversity metrics for microbiome or community ecology data. It is aimed at bioinformatics research workflows built from these operations.

When to use it

You need to read and process biological sequences from files in formats like FASTA or FASTQ.
You're performing sequence alignment or searching for specific motifs within DNA, RNA, or protein data.
You want to construct or analyze phylogenetic trees representing evolutionary relationships.
You are calculating diversity metrics (alpha/beta diversity, UniFrac distances) from microbiome data.

Key capabilities

Sequence Manipulation: Reading, writing, slicing, concatenating, and searching biological sequences (DNA, RNA, protein).
File Format Support: Reads and writes biological data in FASTA, FASTQ, GenBank, Newick, and BIOM formats.
Sequence Transformations: Performs reverse complement, transcription (DNA→RNA), and translation (RNA→protein) operations.
Motif Finding: Identifies specific patterns within sequences using regular expressions.
Distance Calculations: Calculates distances between sequences (e.g., Hamming distance, k-mer based).
Metadata Handling: Manages sequence quality scores and associated metadata.

Example prompts

"Read the DNA sequence from 'my_sequence.fasta' and find all occurrences of the motif 'ATG'."
"Translate the RNA sequence in 'rna_data.txt' into a protein sequence."
"Calculate the Hamming distance between these two DNA sequences: [Sequence 1] and [Sequence 2]."

Tips & gotchas

Use the DNA, RNA, or Protein classes when working with sequences that require specific alphabet validation.
The Sequence class is suitable for generic sequences without alphabet restrictions.
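Quality scores from FASTQ files are automatically loaded into positional metadata (the "quality" column), so later analyses can use them. Reading FASTQ requires stating the quality encoding through the variant or phred_offset parameter.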

TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.

Security Audits

Gen Agent Trust Hub	Pass
Socket	Pass
Snyk	Pass

Details

Version: latest
Installs: 138

🌐 Community

Passed automated security scans.