Scikit Bio

🌐 Community
by davila7 · vlatest · Repository

Scikit Bio analyzes biological sequences (DNA, RNA, protein) with the scikit-bio Python library, covering sequence manipulation, alignment, phylogenetics, and diversity analysis for bioinformatics research.

Install on your platform

1. Run in terminal (recommended)

terminal
claude mcp add scikit-bio npx -- -y @trustedskills/scikit-bio

2. Or manually add to ~/.claude/settings.json

~/.claude/settings.json
{
  "mcpServers": {
    "scikit-bio": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/scikit-bio"
      ]
    }
  }
}
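
For scripted setups, the manual step can be sketched in Python. This is a minimal sketch, assuming the settings file is plain JSON with an optional top-level mcpServers object as in the snippet above. The helper name add_mcp_server and the "theme" key are hypothetical, and the demo writes to a temporary directory rather than the real ~/.claude/settings.json.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(settings_path, name, command, args):
    """Merge one MCP server entry into a JSON settings file (hypothetical helper)."""
    path = Path(settings_path)
    # Keep any existing settings; start from an empty object if the file is missing.
    settings = json.loads(path.read_text()) if path.exists() else {}
    settings.setdefault("mcpServers", {})[name] = {
        "command": command,
        "args": list(args),
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings


# Demo against a throwaway file so real settings are untouched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "settings.json"
    demo_path.write_text(json.dumps({"theme": "dark"}))  # hypothetical existing key
    merged = add_mcp_server(
        demo_path, "scikit-bio", "npx", ["-y", "@trustedskills/scikit-bio"]
    )
    written = json.loads(demo_path.read_text())
```

Merging into the existing object, instead of overwriting the file, keeps any other settings and server entries in place.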

Requires Claude Code (claude CLI). Run claude --version to verify your install.

About This Skill

What it does

The Scikit Bio skill enables AI agents to analyze biological data using the scikit-bio Python library. It covers manipulating DNA, RNA, and protein sequences; reading and writing biological file formats (FASTA, FASTQ, GenBank); performing sequence alignments; constructing phylogenetic trees; and calculating diversity metrics for microbiome or community ecology data. It is aimed at bioinformatics research workflows built on these operations.

When to use it

  • You need to read and process biological sequences from files in formats like FASTA or FASTQ.
  • You're performing sequence alignment or searching for specific motifs within DNA, RNA, or protein data.
  • You want to construct or analyze phylogenetic trees representing evolutionary relationships.
  • You are calculating diversity metrics (alpha/beta diversity, UniFrac distances) from microbiome data.

Key capabilities

  • Sequence Manipulation: Reading, writing, slicing, concatenating, and searching biological sequences (DNA, RNA, protein).
  • File Format Support: Handles FASTA, FASTQ, GenBank, Newick, BIOM formats for reading and writing biological data.
  • Sequence Transformations: Performs reverse complement, transcription (DNA→RNA), and translation (RNA→protein) operations.
  • Motif Finding: Identifies specific patterns within sequences using regular expressions.
  • Distance Calculations: Calculates distances between sequences (e.g., Hamming distance, k-mer based).
  • Metadata Handling: Manages sequence quality scores and associated metadata.

Example prompts

  • "Read the DNA sequence from 'my_sequence.fasta' and find all occurrences of the motif 'ATG'."
  • "Translate the RNA sequence in 'rna_data.txt' into a protein sequence."
  • "Calculate the Hamming distance between these two DNA sequences: [Sequence 1] and [Sequence 2]."

Tips & gotchas

  • Use the DNA, RNA, or Protein classes when working with sequences that require specific alphabet validation.
  • The Sequence class is suitable for generic sequences without alphabet restrictions.
  • Quality scores from FASTQ files are loaded automatically into the sequence's positional metadata (one score per position), where later steps such as filtering or trimming can use them.

🛡️ TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Pass

Details

Version: vlatest
License: (not listed)
Author: davila7
Installs: 138

🌐 Community

Passed automated security scans.