getTranscripts

Return table of unique transcripts in GTFAnnotation object

Description

transcriptsTable = getTranscripts(AnnotObj) returns transcriptsTable, a table of transcripts referenced by exons in AnnotObj.

transcriptsTable = getTranscripts(AnnotObj,"Reference",R) returns one or more transcripts that belong to the references specified by R.

transcriptsTable = getTranscripts(AnnotObj,"Gene",G) returns one or more transcripts that belong to the genes specified by G.

transcriptsTable = getTranscripts(AnnotObj,"Transcript",T) returns one or more transcripts specified by T.
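
The filtering syntaxes above can be sketched side by side as follows. This is a minimal illustration, not a definitive recipe: it assumes the sample file hum37_2_1M.gtf (the one used in the example on this page) is on the MATLAB path, and the variable names (refs, genes, tNames, byRef, byGene, byTranscript) are chosen here for illustration only.

```matlab
% Assumption: hum37_2_1M.gtf is available on the MATLAB path.
obj = GTFAnnotation('hum37_2_1M.gtf');

% Collect the names that the object knows about.
refs   = getReferenceNames(obj);    % reference sequence names
genes  = getGeneNames(obj);         % gene names
tNames = getTranscriptNames(obj);   % transcript names

% All transcripts referenced by exons in the object.
allTranscripts = getTranscripts(obj);

% Transcripts restricted to one reference, two genes, or one transcript.
byRef        = getTranscripts(obj,"Reference",refs{1});
byGene       = getTranscripts(obj,"Gene",genes(1:2));
byTranscript = getTranscripts(obj,"Transcript",tNames{1});
```

Querying the names first with getReferenceNames, getGeneNames, or getTranscriptNames avoids passing names that are not present in the object.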

Examples

Create a GTFAnnotation object from a GTF-formatted file.

obj = GTFAnnotation('hum37_2_1M.gtf');

Get the gene names listed in the object.

gNames = getGeneNames(obj)
gNames = 28x1 cell
    {'uc002qvu.2'}
    {'uc002qvv.2'}
    {'uc002qvw.2'}
    {'uc002qvx.2'}
    {'uc002qvy.2'}
    {'uc002qvz.2'}
    {'uc002qwa.2'}
    {'uc002qwb.2'}
    {'uc002qwc.1'}
    {'uc002qwd.2'}
    {'uc002qwe.3'}
    {'uc002qwf.2'}
    {'uc002qwg.2'}
    {'uc002qwh.2'}
    {'uc002qwi.3'}
    {'uc002qwk.2'}
    {'uc002qwl.2'}
    {'uc002qwm.1'}
    {'uc002qwn.1'}
    {'uc002qwo.1'}
    {'uc002qwp.2'}
    {'uc002qwq.2'}
    {'uc010ewe.2'}
    {'uc010ewf.1'}
    {'uc010ewg.2'}
    {'uc010ewh.1'}
    {'uc010ewi.2'}
    {'uc010yim.1'}

Get a table of the transcripts that belong to the first gene, uc002qvu.2.

transcripts = getTranscripts(obj,"Gene",gNames{1})
transcripts = 1×7 table
      Transcript       GeneName         GeneID        Reference    Start      Stop     Strand
    ______________    __________    ______________    _________    ______    ______    ______

    {'uc002qvu.2'}    {0x0 char}    {'uc002qvu.2'}      chr2       218138    249852      -   

Input Arguments

AnnotObj: GTF annotation, specified as a GTFAnnotation object.

R: Names of reference sequences, specified as a character vector, string, string vector, cell array of character vectors, or categorical array.

The names must come from the Reference field of AnnotObj. If a name does not exist, the function issues a warning and ignores that name.

Data Types: char | string | cell | categorical

G: Names of genes, specified as a character vector, string, string vector, cell array of character vectors, or categorical array.

The names must come from the Gene field of AnnotObj. If a name does not exist, the function issues a warning and ignores that name.

Data Types: char | string | cell | categorical

T: Names of transcripts, specified as a character vector, string, string vector, cell array of character vectors, or categorical array.

The names must come from the Transcript field of AnnotObj. If a name does not exist, the function issues a warning and ignores that name.

Data Types: char | string | cell | categorical
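
As a hedged illustration of the accepted input types and the missing-name behavior described above, the sketch below passes a string vector that mixes a valid gene name with a made-up one. The name "notARealGene" is hypothetical and exists only for this illustration; the sample file hum37_2_1M.gtf is assumed to be on the MATLAB path.

```matlab
% Assumption: hum37_2_1M.gtf is available on the MATLAB path.
obj = GTFAnnotation('hum37_2_1M.gtf');
gNames = getGeneNames(obj);

% String vector with one valid name and one hypothetical, nonexistent name.
query = [string(gNames{1}), "notARealGene"];

% Per the argument description, the nonexistent name should trigger
% a warning and be ignored, leaving only transcripts of the valid gene.
t = getTranscripts(obj,"Gene",query);

% The same valid name passed as a cell array of character vectors.
tCell = getTranscripts(obj,"Gene",gNames(1));
```

If the warning is unwanted in a script, one option is to filter the query against getGeneNames(obj) with ismember before calling getTranscripts.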

Output Arguments

transcriptsTable: Transcripts, returned as a table. The table contains the following variables for each transcript.

Variable Name    Description
_____________    ___________

Transcript       Cell array of character vectors containing transcript IDs, obtained from the Transcript field of AnnotObj.
GeneName         Cell array of character vectors containing the names of expressed genes, obtained from the Attributes field of AnnotObj. This cell array can contain empty character vectors if the corresponding gene names are not found in Attributes.
GeneID           Cell array of character vectors containing the expressed gene IDs, obtained from the Gene field of AnnotObj.
Reference        Categorical array representing the names of reference sequences to which the expressed genes belong. The reference names are from the Reference field of AnnotObj.
Start            Start location of the first exon in each transcript.
Stop             Stop location of the last exon in each transcript.
Strand           Categorical array containing the strand of each expressed gene.
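
Because the output is an ordinary MATLAB table, its variables can be used with standard table operations. The following is a minimal sketch under stated assumptions: the sample file hum37_2_1M.gtf is on the MATLAB path, and Span, minusStrand, and sorted are names introduced here for illustration. Span is the genomic extent from the first exon start to the last exon stop (introns included), assuming the inclusive coordinates used by the GTF format.

```matlab
% Assumption: hum37_2_1M.gtf is available on the MATLAB path.
obj = GTFAnnotation('hum37_2_1M.gtf');
t = getTranscripts(obj);

% Genomic span of each transcript (inclusive coordinates, introns included).
t.Span = t.Stop - t.Start + 1;

% Keep only transcripts on the minus strand (Strand is categorical).
minusStrand = t(t.Strand == '-', :);

% Order transcripts by their start position.
sorted = sortrows(t,"Start");
```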

Version History

Introduced in R2014b