Bioinformatics Toolbox Functions

High-Throughput Sequencing

Data Import and Management

fastainfo Return information about FASTA file
fastaread Read data from FASTA file
fastawrite Write to file using FASTA format
fastqinfo Return information about FASTQ file
fastqread Read data from FASTQ file
fastqwrite Write to file using FASTQ format
saminfo Return information about Sequence Alignment/Map (SAM) file
samread Read data from Sequence Alignment/Map (SAM) file
baminfo Return information about Binary Sequence Alignment/Map (BAM) file
bamread Read data from Binary Sequence Alignment/Map (BAM) file
bamindexread Read Binary Sequence Alignment/Map Index (BAI) file
goannotread Read annotations from Gene Ontology annotated file
soapread Read data from Short Oligonucleotide Analysis Package (SOAP) file
BioIndexedFile Allow quick and efficient access to large text file with nonuniform-size entries
BioMap Contain sequence, quality, alignment, and mapping data
BioRead Contain sequence and quality data
BioReadQualityStatistics Quality statistics from a short-read sequence file
GFFAnnotation Represent General Feature Format (GFF) annotations
GTFAnnotation Represent Gene Transfer Format (GTF) annotations

Preprocessing

BioMap Contain sequence, quality, alignment, and mapping data
BioReadQualityStatistics Quality statistics from a short-read sequence file
GFFAnnotation Represent General Feature Format (GFF) annotations
GTFAnnotation Represent Gene Transfer Format (GTF) annotations

RNA

ngsbrowser Open NGS Browser to visualize and explore short-read sequence alignments
align2cigar Convert aligned sequences to corresponding Compact Idiosyncratic Gapped Alignment Report (CIGAR) format strings
cigar2align Convert unaligned sequences to aligned sequences using Compact Idiosyncratic Gapped Alignment Report (CIGAR) format strings
bowtie Map short reads to reference sequence using Burrows-Wheeler transform
bowtiebuild Generate index using Burrows-Wheeler transform
mattest Perform two-sample t-test to evaluate differential expression of genes from two experimental conditions or phenotypes
mafdr Estimate false discovery rate (FDR) for multiple hypothesis testing
mavolcanoplot Create significance versus gene expression ratio (fold change) scatter plot of microarray data
mairplot Create intensity versus ratio scatter plot of microarray data
maboxplot Create box plot for microarray data
nbintest Unpaired hypothesis test for short-read count data with small sample sizes
clustergram Compute hierarchical clustering, display dendrogram and heat map, and create clustergram object
redbluecmap Create red and blue colormap
redgreencmap Create red and green colormap
metafeatures Attractor metagene algorithm for feature engineering using mutual information-based learning
rankfeatures Rank key features by class separability criteria
randfeatures Generate randomized subset of features
fitcknn Fit k-nearest neighbor classifier
knnimpute Impute missing data using nearest-neighbor method
classperf Evaluate performance of classifier
crossvalind Generate cross-validation indices
BioRead Contain sequence and quality data
BioMap Contain sequence, quality, alignment, and mapping data
BioReadQualityStatistics Quality statistics from a short-read sequence file
GFFAnnotation Represent General Feature Format (GFF) annotations
GTFAnnotation Represent Gene Transfer Format (GTF) annotations
NegativeBinomialTest Unpaired hypothesis test result
HeatMap Display heat map of matrix data and create HeatMap object
HeatMap object Object containing matrix and heat map display properties
clustergram Compute hierarchical clustering, display dendrogram and heat map, and create clustergram object
clustergram object Object containing hierarchical clustering analysis data
DataMatrix Create DataMatrix object
DataMatrix object Data structure encapsulating data and metadata from microarray experiment so that it can be indexed by gene or probe identifiers and by sample identifiers

DNA

ngsbrowser Open NGS Browser to visualize and explore short-read sequence alignments
mafdr Estimate false discovery rate (FDR) for multiple hypothesis testing
mle Maximum likelihood estimates
nbinpdf Negative binomial probability density function
chromosomeplot Plot chromosome ideogram with G-banding pattern
cpgisland Locate CpG islands in DNA sequence
mspeaks Convert raw peak data to peak list (centroided data)
align2cigar Convert aligned sequences to corresponding Compact Idiosyncratic Gapped Alignment Report (CIGAR) format strings
cigar2align Convert unaligned sequences to aligned sequences using Compact Idiosyncratic Gapped Alignment Report (CIGAR) format strings
bowtie Map short reads to reference sequence using Burrows-Wheeler transform
bowtiebuild Generate index using Burrows-Wheeler transform
BioRead Contain sequence and quality data
BioMap Contain sequence, quality, alignment, and mapping data
BioReadQualityStatistics Quality statistics from a short-read sequence file
GFFAnnotation Represent General Feature Format (GFF) annotations
GTFAnnotation Represent Gene Transfer Format (GTF) annotations
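
As a rough illustration of what the CIGAR conversion entries above refer to, the following Python sketch turns a gapped pairwise alignment into a CIGAR string. It is not the toolbox's align2cigar and does not mirror its signature; the function name aligned_pair_to_cigar, its arguments, and the restriction to the M, I, and D operations are assumptions made for this example.

```python
from itertools import groupby


def aligned_pair_to_cigar(ref_aln: str, read_aln: str) -> str:
    """Build a CIGAR string from two gapped, equal-length alignment rows.

    Hypothetical helper: '-' marks a gap. Only M (aligned column),
    I (insertion relative to the reference) and D (deletion relative
    to the reference) are emitted.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned rows must have the same length")

    ops = []
    for ref_base, read_base in zip(ref_aln, read_aln):
        if ref_base == "-" and read_base == "-":
            continue  # a column of two gaps carries no information
        if ref_base == "-":
            ops.append("I")  # base present in the read only
        elif read_base == "-":
            ops.append("D")  # base present in the reference only
        else:
            ops.append("M")  # match or mismatch, both are 'M'

    # Run-length encode consecutive identical operations.
    return "".join(f"{len(list(run))}{op}" for op, run in groupby(ops))


print(aligned_pair_to_cigar("ACGT-ACGT", "AC-TTACGT"))  # 2M1D1M1I4M
```

The reverse direction (expanding a CIGAR string back into a gapped read) walks the same operations and inserts gap characters wherever a D run occurs.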

Gene Ontology

goannotread Read annotations from Gene Ontology annotated file
num2goid Convert numbers to Gene Ontology IDs
geneont Data structure containing Gene Ontology (GO) information

Network Analysis and Visualization

graphallshortestpaths Find all shortest paths in graph
graphconncomp Find strongly or weakly connected components in graph
graphisdag Test for cycles in directed graph
graphisomorphism Find isomorphism between two graphs
graphisspantree Determine if tree is spanning tree
graphmaxflow Calculate maximum flow in directed graph
graphminspantree Find minimal spanning tree in graph
graphpred2path Convert predecessor indices to paths
graphshortestpath Solve shortest path problem in graph
graphtopoorder Perform topological sort of directed acyclic graph
graphtraverse Traverse graph by following adjacent nodes
biograph Create biograph object
biograph object Data structure containing generic interconnected data used to implement directed graph

Microarray Analysis

Data Import and Management

affyread Read microarray data from Affymetrix GeneChip file
affysnpannotread Read Affymetrix Mapping DNA array data from CSV-format annotation file
affyprobeseqread Read data file containing probe sequence information for Affymetrix GeneChip array
celintensityread Read probe intensities from Affymetrix CEL files
getgeodata Retrieve Gene Expression Omnibus (GEO) format data
geoseriesread Read Gene Expression Omnibus (GEO) Series (GSE) format data
geosoftread Read Gene Expression Omnibus (GEO) SOFT format data
galread Read microarray data from GenePix array list file
gprread Read microarray data from GenePix Results (GPR) file
agferead Read Agilent Feature Extraction Software file
ilmnbsread Read gene expression data exported from Illumina BeadStudio software
imageneread Read microarray data from ImaGene Results file
sptread Read data from SPOT file
goannotread Read annotations from Gene Ontology annotated file
cytobandread Read cytogenetic banding information
bioma.ExpressionSet Contain data from microarray gene expression experiment
bioma.data.ExptData Contain data values from microarray experiment
bioma.data.MetaData Contain metadata from microarray experiment
bioma.data.MIAME Contain experiment information from microarray gene expression experiment

Preprocessing

affyrma Perform Robust Multi-array Average (RMA) procedure on Affymetrix microarray probe-level data
affygcrma Perform GC Robust Multi-array Average (GCRMA) procedure on Affymetrix microarray probe-level data
affyinvarsetnorm Perform rank invariant set normalization on probe intensities from multiple Affymetrix CEL or DAT files
affyprobeaffinities Compute Affymetrix probe affinities from their sequences and MM probe intensities
affysnpintensitysplit Split Affymetrix SNP probe intensity information for alleles A and B
affysnpquartets Create table of SNP probe quartet results for Affymetrix probe set
probesetvalues Create table of Affymetrix probe set intensity values
probesetlookup Look up information for Affymetrix probe set
probesetplot Plot Affymetrix probe set intensity values
probesetlink Display probe set information on NetAffx Web site
probelibraryinfo Create table of probe set library information
rmabackadj Perform background adjustment on Affymetrix microarray probe-level data using Robust Multi-array Average (RMA) procedure
rmasummary Calculate gene expression values from Affymetrix microarray probe-level data using Robust Multi-array Average (RMA) procedure
gcrma Perform GC Robust Multi-array Average (GCRMA) background adjustment, quantile normalization, and median-polish summarization on Affymetrix microarray probe-level data
gcrmabackadj Perform GC Robust Multi-array Average (GCRMA) background adjustment on Affymetrix microarray probe-level data using sequence information
zonebackadj Perform background adjustment on Affymetrix microarray probe-level data using zone-based method
manorm Normalize microarray data
quantilenorm Quantile normalization over multiple arrays
mainvarsetnorm Perform rank invariant set normalization on gene expression values from two experimental conditions or phenotypes
malowess Smooth microarray data using Lowess method
exprprofrange Calculate range of gene expression profiles
exprprofvar Calculate variance of gene expression profiles
geneentropyfilter Remove genes with low entropy expression values
genelowvalfilter Remove gene profiles with low absolute values
generangefilter Remove gene profiles with small profile ranges
genevarfilter Filter genes with small profile variance
maimage Spatial image for microarray data
microplateplot Display visualization of microtiter plate
ilmnbslookup Look up Illumina BeadStudio target (probe) sequence and annotation information
magetfield Extract data from microarray structure

Expression Analysis

mattest Perform two-sample t-test to evaluate differential expression of genes from two experimental conditions or phenotypes
mafdr Estimate false discovery rate (FDR) for multiple hypothesis testing
mavolcanoplot Create significance versus gene expression ratio (fold change) scatter plot of microarray data
mairplot Create intensity versus ratio scatter plot of microarray data
maboxplot Create box plot for microarray data
maloglog Create loglog plot of microarray data
mapcaplot Create Principal Component Analysis (PCA) plot of microarray data
nbintest Unpaired hypothesis test for short-read count data with small sample sizes
clustergram Compute hierarchical clustering, display dendrogram and heat map, and create clustergram object
redbluecmap Create red and blue colormap
redgreencmap Create red and green colormap
probesetplot Plot Affymetrix probe set intensity values
metafeatures Attractor metagene algorithm for feature engineering using mutual information-based learning
rankfeatures Rank key features by class separability criteria
randfeatures Generate randomized subset of features
fitcknn Fit k-nearest neighbor classifier
knnimpute Impute missing data using nearest-neighbor method
classperf Evaluate performance of classifier
crossvalind Generate cross-validation indices
DataMatrix Create DataMatrix object
DataMatrix object Data structure encapsulating data and metadata from microarray experiment so that it can be indexed by gene or probe identifiers and by sample identifiers
bioma.ExpressionSet Contain data from microarray gene expression experiment
bioma.data.ExptData Contain data values from microarray experiment
bioma.data.MetaData Contain metadata from microarray experiment
bioma.data.MIAME Contain experiment information from microarray gene expression experiment
NegativeBinomialTest Unpaired hypothesis test result
HeatMap Display heat map of matrix data and create HeatMap object
HeatMap object Object containing matrix and heat map display properties
clustergram Compute hierarchical clustering, display dendrogram and heat map, and create clustergram object
clustergram object Object containing hierarchical clustering analysis data

Genetic Variant Analysis

cghcbs Perform circular binary segmentation (CBS) on array-based comparative genomic hybridization (aCGH) data
cghfreqplot Display frequency of DNA copy number alterations across multiple samples
chromosomeplot Plot chromosome ideogram with G-banding pattern
gcrma Perform GC Robust Multi-array Average (GCRMA) background adjustment, quantile normalization, and median-polish summarization on Affymetrix microarray probe-level data
gcrmabackadj Perform GC Robust Multi-array Average (GCRMA) background adjustment on Affymetrix microarray probe-level data using sequence information
mattest Perform two-sample t-test to evaluate differential expression of genes from two experimental conditions or phenotypes
mafdr Estimate false discovery rate (FDR) for multiple hypothesis testing
mavolcanoplot Create significance versus gene expression ratio (fold change) scatter plot of microarray data
mairplot Create intensity versus ratio scatter plot of microarray data
maboxplot Create box plot for microarray data
maloglog Create loglog plot of microarray data
mapcaplot Create Principal Component Analysis (PCA) plot of microarray data
nbintest Unpaired hypothesis test for short-read count data with small sample sizes
DataMatrix Create DataMatrix object
DataMatrix object Data structure encapsulating data and metadata from microarray experiment so that it can be indexed by gene or probe identifiers and by sample identifiers
NegativeBinomialTest Unpaired hypothesis test result

Gene Ontology

goannotread Read annotations from Gene Ontology annotated file
num2goid Convert numbers to Gene Ontology IDs
geneont Data structure containing Gene Ontology (GO) information

Network Analysis and Visualization

graphallshortestpaths Find all shortest paths in graph
graphconncomp Find strongly or weakly connected components in graph
graphisdag Test for cycles in directed graph
graphisomorphism Find isomorphism between two graphs
graphisspantree Determine if tree is spanning tree
graphmaxflow Calculate maximum flow in directed graph
graphminspantree Find minimal spanning tree in graph
graphpred2path Convert predecessor indices to paths
graphshortestpath Solve shortest path problem in graph
graphtopoorder Perform topological sort of directed acyclic graph
graphtraverse Traverse graph by following adjacent nodes
biograph Create biograph object
biograph object Data structure containing generic interconnected data used to implement directed graph

Sequence Analysis

Data Import and Export

fastainfo Return information about FASTA file
fastaread Read data from FASTA file
fastawrite Write to file using FASTA format
genbankread Read data from GenBank file
getgenbank Retrieve sequence information from GenBank database
genpeptread Read data from GenPept file
getgenpept Retrieve sequence information from GenPept database
emblread Read data from EMBL file
getembl Retrieve sequence information from EMBL database
pdbread Read data from Protein Data Bank (PDB) file
pdbwrite Write to file using Protein Data Bank (PDB) format
getpdb Retrieve protein structure data from Protein Data Bank (PDB) database
fastqinfo Return information about FASTQ file
fastqread Read data from FASTQ file
fastqwrite Write to file using FASTQ format
blastread Read data from NCBI BLAST report file
blastreadlocal Read data from local BLAST report
blastformat Create local BLAST database
getblast Retrieve BLAST report from NCBI Web site
multialignread Read multiple sequence alignment file
multialignwrite Write multiple alignment to file
pfamhmmread Read data from PFAM HMM-formatted file
gethmmprof Retrieve hidden Markov model (HMM) profile from PFAM database
gethmmtree Retrieve phylogenetic tree data from PFAM database
gethmmalignment Retrieve multiple sequence alignment associated with hidden Markov model (HMM) profile from PFAM database
phytreeread Read phylogenetic tree file
phytreewrite Write phylogenetic tree object to Newick-formatted file

Nucleotide Sequence Analysis

basecount Count nucleotides in sequence
codonbias Calculate codon frequency for each amino acid coded for in nucleotide sequence
codoncount Count codons in nucleotide sequence
dimercount Count dimers in nucleotide sequence
ntdensity Plot density of nucleotides along sequence
seqwordcount Count number of occurrences of word in sequence
nt2aa Convert nucleotide sequence to amino acid sequence
aa2nt Convert amino acid sequence to nucleotide sequence
nt2int Convert nucleotide sequence from letter to integer representation
int2nt Convert nucleotide sequence from integer to letter representation
dna2rna Convert DNA sequence to RNA sequence
rna2dna Convert RNA sequence to DNA sequence
revgeneticcode Return reverse mapping (amino acid to nucleotide codon) for genetic code
seqcomplement Calculate complementary strand of nucleotide sequence
seqrcomplement Calculate reverse complementary strand of nucleotide sequence
seqreverse Calculate reverse strand of nucleotide sequence
baselookup Find nucleotide codes, integers, names, and complements
geneticcode Return nucleotide codon to amino acid mapping for genetic code
oligoprop Calculate sequence properties of DNA oligonucleotide
cpgisland Locate CpG islands in DNA sequence
joinseq Join two sequences to produce shortest supersequence
palindromes Find palindromes in sequence
randseq Generate random sequence from finite alphabet
seqmatch Find matches for every string in library
seqviewer Visualize and interactively explore biological sequences
seqdisp Format long sequence output for easy viewing
seqshoworfs Display open reading frames in sequence
seqshowwords Graphically display words in sequence
featuresmap Draw linear or circular map of features from GenBank structure
featuresparse Parse features from GenBank, GenPept, or EMBL data
seqconsensus Calculate consensus sequence
seqdotplot Create dot plot of two sequences
seqlogo Display sequence logo for nucleotide or amino acid sequences
seqprofile Calculate sequence profile from set of multiply aligned sequences
seqinsertgaps Insert gaps into nucleotide or amino acid sequence
rebasecuts Find restriction enzymes that cut nucleotide sequence
restrict Split nucleotide sequence at restriction site
seq2regexp Convert sequence with ambiguous characters to regular expression
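
To make two of the operations listed above concrete, here is a minimal Python sketch of counting bases and taking a reverse complement. It only illustrates the concepts behind entries such as basecount and seqrcomplement; it is not the toolbox code, and the names reverse_complement and base_count are hypothetical. It assumes unambiguous DNA letters (A, C, G, T) only.

```python
from collections import Counter

# Watson-Crick pairing for the four unambiguous DNA bases
# (assumption: ambiguity codes such as N or R are not handled).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the strand."""
    return seq.translate(_COMPLEMENT)[::-1]


def base_count(seq: str) -> dict:
    """Count A, C, G and T occurrences, ignoring case."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


print(reverse_complement("ATGCC"))  # GGCAT
print(base_count("ATGCC"))          # {'A': 1, 'C': 2, 'G': 1, 'T': 1}
```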

Protein and Amino Acid Sequence Analysis

aacount Count amino acids in sequence
codonbias Calculate codon frequency for each amino acid coded for in nucleotide sequence
codoncount Count codons in nucleotide sequence
nmercount Count n-mers in nucleotide or amino acid sequence
seqwordcount Count number of occurrences of word in sequence
aa2nt Convert amino acid sequence to nucleotide sequence
nt2aa Convert nucleotide sequence to amino acid sequence
aa2int Convert amino acid sequence from letter to integer representation
int2aa Convert amino acid sequence from integer to letter representation
revgeneticcode Return reverse mapping (amino acid to nucleotide codon) for genetic code
aminolookup Find amino acid codes, integers, abbreviations, names, and codons
geneticcode Return nucleotide codon to amino acid mapping for genetic code
atomiccomp Calculate atomic composition of protein
isoelectric Estimate isoelectric point for amino acid sequence
isotopicdist Calculate high-resolution isotope mass distribution and density function
molweight Calculate molecular weight of amino acid sequence
seqviewer Visualize and interactively explore biological sequences
seqdisp Format long sequence output for easy viewing
seqshoworfs Display open reading frames in sequence
seqshowwords Graphically display words in sequence
proteinplot Open Protein Plot window to investigate properties of amino acid sequence
proteinpropplot Plot properties of amino acid sequence
ramachandran Draw Ramachandran plot for Protein Data Bank (PDB) data
seqconsensus Calculate consensus sequence
seqdotplot Create dot plot of two sequences
seqlogo Display sequence logo for nucleotide or amino acid sequences
seqprofile Calculate sequence profile from set of multiply aligned sequences
seqinsertgaps Insert gaps into nucleotide or amino acid sequence
cleave Cleave amino acid sequence with enzyme
cleavelookup Find cleavage rule for enzyme or compound
seq2regexp Convert sequence with ambiguous characters to regular expression
featuresparse Parse features from GenBank, GenPept, or EMBL data
randseq Generate random sequence from finite alphabet
seqmatch Find matches for every string in library

Sequence Alignment

localalign Return local optimal and suboptimal alignments between two sequences
nwalign Globally align two sequences using Needleman-Wunsch algorithm
swalign Locally align two sequences using Smith-Waterman algorithm
seqdotplot Create dot plot of two sequences
seqpdist Calculate pairwise distance between sequences
seqalignviewer Visualize and edit multiple sequence alignment
multialign Align multiple sequences using progressive method
profalign Align two profiles using Needleman-Wunsch global alignment
seqconsensus Calculate consensus sequence
seqprofile Calculate sequence profile from set of multiply aligned sequences
seqlogo Display sequence logo for nucleotide or amino acid sequences
showalignment Display color-coded sequence alignment
hmmprofalign Align query sequence to profile using hidden Markov model alignment
hmmprofestimate Estimate profile hidden Markov model (HMM) parameters using pseudocounts
hmmprofgenerate Generate random sequence drawn from profile hidden Markov model (HMM)
hmmprofmerge Concatenate prealigned strings of several sequences to profile hidden Markov model (HMM)
hmmprofstruct Create or edit hidden Markov model (HMM) profile structure
showhmmprof Plot hidden Markov model (HMM) profile
blastlocal Perform search on local BLAST database to create BLAST report
blastncbi Create remote NCBI BLAST report request ID or link to NCBI BLAST report
blosum Return BLOSUM scoring matrix
dayhoff Return Dayhoff scoring matrix
gonnet Return Gonnet scoring matrix
nuc44 Return NUC44 scoring matrix for nucleotide sequences
pam Return Point Accepted Mutation (PAM) scoring matrix
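
The global alignment entry above names the Needleman-Wunsch algorithm. The Python sketch below shows the dynamic-programming recurrence behind it, returning only the optimal score. It is an illustration under simplifying assumptions, not the toolbox's nwalign: the function name nw_score is hypothetical, and it uses a flat match/mismatch score with a linear gap penalty instead of a scoring matrix such as BLOSUM or PAM.

```python
def nw_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Optimal global alignment score of a and b (Needleman-Wunsch).

    Assumptions for this sketch: flat match/mismatch scores and a
    linear gap penalty; no traceback is performed.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
    return score[-1][-1]


print(nw_score("GATTACA", "GATTACA"))  # 7
print(nw_score("ACGT", "AGT"))         # 1
```

Local alignment (Smith-Waterman) uses the same table but clamps each cell at zero and takes the maximum over the whole table instead of the bottom-right cell.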

Phylogenetic Analysis

seqlinkage Construct phylogenetic tree from pairwise distances
seqneighjoin Construct phylogenetic tree using neighbor-joining method
seqpdist Calculate pairwise distance between sequences
phytreeviewer Visualize, edit, and explore phylogenetic tree data
dnds Estimate synonymous and nonsynonymous substitution rates
dndsml Estimate synonymous and nonsynonymous substitution rates using maximum likelihood method
gethmmtree Retrieve phylogenetic tree data from PFAM database
seqinsertgaps Insert gaps into nucleotide or amino acid sequence
phytree Create phytree object
phytree object Data structure containing phylogenetic tree

Structural Analysis

rnaconvert Convert secondary structure of RNA sequence between bracket and matrix notations
rnafold Predict minimum free-energy secondary structure of RNA sequence
rnaplot Draw secondary structure of RNA sequence
pdbtransform Apply linear transformation to 3-D structure of molecule
pdbsuperpose Superpose 3-D structures of two proteins
pdbread Read data from Protein Data Bank (PDB) file
pdbwrite Write to file using Protein Data Bank (PDB) format
molviewer Display and manipulate 3-D molecule structure
pdbdistplot Visualize intermolecular distances in Protein Data Bank (PDB) file
ramachandran Draw Ramachandran plot for Protein Data Bank (PDB) data
evalrasmolscript Send RasMol script commands to Molecule Viewer window

Mass Spectrometry and Bioanalytics

Data Import

mzcdfinfo Return information about netCDF file containing mass spectrometry data
mzcdfread Read mass spectrometry data from netCDF file
mzcdf2peaks Convert mzCDF structure to peak list
mzxmlinfo Return information about mzXML file
mzxmlread Read data from mzXML file
mzxml2peaks Convert mzXML structure to peak list
sffinfo Return information about SFF file
sffread Read data from SFF file
tgspcinfo Return information about SPC file
tgspcread Read data from SPC file
jcampread Read JCAMP-DX-formatted files
scfread Read trace data from SCF file

Preprocessing

msresample Resample signal with peaks
msbackadj Correct baseline of signal with peaks
msnorm Normalize set of signals with peaks
mslowess Smooth signal with peaks using nonparametric method
mssgolay Smooth signal with peaks using least-squares polynomial
msppresample Resample signal with peaks while preserving peaks
msheatmap Create pseudocolor image of set of mass spectra
msdotplot Plot set of peak lists from LC/MS or GC/MS data set
msviewer Explore mass spectrum or set of mass spectra

Spectrum and Signal Analysis

mspeaks Convert raw peak data to peak list (centroided data)
mspalign Align mass spectra from multiple peak lists from LC/MS or GC/MS data set
msalign Align peaks in signal to reference peaks
samplealign Align two data sets containing sequential observations by introducing gaps
isotopicdist Calculate high-resolution isotope mass distribution and density function
msheatmap Create pseudocolor image of set of mass spectra
msdotplot Plot set of peak lists from LC/MS or GC/MS data set
msviewer Explore mass spectrum or set of mass spectra
traceplot Draw nucleotide trace plots
metafeatures Attractor metagene algorithm for feature engineering using mutual information-based learning
rankfeatures Rank key features by class separability criteria
randfeatures Generate randomized subset of features
fitcknn Fit k-nearest neighbor classifier
classperf Evaluate performance of classifier
crossvalind Generate cross-validation indices