Scikit Bio
Scikit Bio analyzes biological sequences (DNA, RNA, protein) with the scikit-bio Python library, which makes it useful for bioinformatics research.
Install on your platform
Run in terminal (recommended)
claude mcp add scikit-bio npx -- -y @trustedskills/scikit-bio
Or manually add to ~/.claude/settings.json
{
"mcpServers": {
"scikit-bio": {
"command": "npx",
"args": [
"-y",
"@trustedskills/scikit-bio"
]
}
}
}
Requires Claude Code (claude CLI). Run claude --version to verify your install.
About This Skill
What it does
The Scikit Bio skill enables AI agents to analyze biological data using the scikit-bio Python library. It facilitates tasks such as manipulating DNA, RNA, and protein sequences; reading and writing various biological file formats (FASTA, FASTQ, GenBank); performing sequence alignments; constructing phylogenetic trees; and calculating diversity metrics for microbiome or community ecology data. This skill is particularly useful for bioinformatics research requiring pattern recognition and prediction within biological datasets.
When to use it
- You need to read and process biological sequences from files in formats like FASTA or FASTQ.
- You're performing sequence alignment or searching for specific motifs within DNA, RNA, or protein data.
- You want to construct or analyze phylogenetic trees representing evolutionary relationships.
- You are calculating diversity metrics (alpha/beta diversity, UniFrac distances) from microbiome data.
Key capabilities
- Sequence Manipulation: Reading, writing, slicing, concatenating, and searching biological sequences (DNA, RNA, protein).
- File Format Support: Handles FASTA, FASTQ, GenBank, Newick, BIOM formats for reading and writing biological data.
- Sequence Transformations: Performs reverse complement, transcription (DNA→RNA), and translation (RNA→protein) operations.
- Motif Finding: Identifies specific patterns within sequences using regular expressions.
- Distance Calculations: Calculates distances between sequences (e.g., Hamming distance, k-mer based).
- Metadata Handling: Manages sequence quality scores and associated metadata.
Example prompts
- "Read the DNA sequence from 'my_sequence.fasta' and find all occurrences of the motif 'ATG'."
- "Translate the RNA sequence in 'rna_data.txt' into a protein sequence."
- "Calculate the Hamming distance between these two DNA sequences: [Sequence 1] and [Sequence 2]."
Tips & gotchas
- Use the DNA, RNA, or Protein classes when working with sequences that require specific alphabet validation.
- The Sequence class is suitable for generic sequences without alphabet restrictions.
- Quality scores from FASTQ files are automatically loaded into positional metadata, which can be leveraged in subsequent analyses.
TrustedSkills Verification
Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.
Security Audits
| Scanner | Result |
| --- | --- |
| Gen Agent Trust Hub | Pass |
| Socket | Pass |
| Snyk | Pass |
🌐 Community
Passed automated security scans.