TIGR Rice Genome Annotation Data Download
 


There are several options for downloading TIGR rice genome annotation data from the osa1 database. The TIGR FTP site is available for those who wish to download the annotation data as an entire set or by chromosome. The FTP site also provides the annotation data in XML and GFF3 formats.

The XML files use an external Document Type Definition (DTD), which describes the TIGR gene annotation XML structure and its valid XML elements and attributes. A UML diagram describing the associations between the elements of the TIGR gene annotation XML model is also available from the TIGR FTP site. GFF3 files contain a subset of the annotation data from osa1, stored hierarchically: Reference Sequence (Pseudomolecule), Gene or Transposable Element, mRNA, 5' UTR, CDS, 3' UTR, Exon. The descriptions of the features contain information on the parent of the feature, locus, feat_name, common name, and GO slim. The XML files contain the information in the GFF3 files along with assembly, gene model attributes (transmembrane domain, cell localization), gene model evidence, and cDNA support.
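As a rough illustration of how the parent/child hierarchy in a GFF3 file can be followed, the sketch below reads the nine tab-separated GFF3 columns and groups features under the IDs named in their Parent attribute. The sample lines, the source name "example", and the IDs gene1, mRNA1, and cds1 are hypothetical and are not taken from the TIGR files; the attribute keys actually used in the osa1 GFF3 files may differ.

```python
from urllib.parse import unquote


def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict; return None for comments and blank lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(cols))
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    attributes = {}
    if attrs != ".":
        for pair in attrs.split(";"):
            if not pair:
                continue
            key, _, value = pair.partition("=")
            # Values may be comma-separated lists; split first, then URL-decode each part.
            attributes[unquote(key)] = [unquote(v) for v in value.split(",")]
    return {
        "seqid": seqid,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


def index_children(records):
    """Map each parent ID to the list of features that name it in their Parent attribute."""
    children = {}
    for rec in records:
        for parent_id in rec["attributes"].get("Parent", []):
            children.setdefault(parent_id, []).append(rec)
    return children


# Hypothetical sample lines (not real TIGR data): gene -> mRNA -> CDS.
SAMPLE = [
    "##gff-version 3",
    "Chr1\texample\tgene\t1000\t5000\t.\t+\t.\tID=gene1;Name=hypothetical%20protein",
    "Chr1\texample\tmRNA\t1000\t5000\t.\t+\t.\tID=mRNA1;Parent=gene1",
    "Chr1\texample\tCDS\t1200\t4800\t.\t+\t0\tID=cds1;Parent=mRNA1",
]

records = [r for r in map(parse_gff3_line, SAMPLE) if r is not None]
children = index_children(records)
print([c["type"] for c in children["gene1"]])
```

Keeping every attribute value as a list is a design choice made here so that features with more than one parent are handled the same way as features with a single parent.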

Users who wish to extract and download subsets of rice genome annotation data for individual chromosomes or regions within chromosomes can use the TIGR Rice Genome Data Extractor below. The data extractor prepares the requested files and compresses them into a zip file, which can then be downloaded.

NEW: A batch data download tool is now available on the TIGR Rice Genome Annotation website. This tool may be more appropriate if you have a list of TIGR gene locus identifiers and/or feat_names and wish to download locus genomic sequence, gene model sequence, gene model protein sequence, putative function assignments, or GO slim assignments.



Select data to export
Chromosome:
Bases from: ___ to: ___
e.g., type "123", "5K", or "13.2M"
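For scripts that need to interpret the same shorthand before calling the extractor, the following is a minimal sketch of one way to turn values such as "123", "5K", or "13.2M" into base positions. It assumes that K means 1,000 bases and M means 1,000,000 bases, which the page does not state explicitly; the function name parse_position is hypothetical and is not part of any TIGR tool.

```python
from decimal import Decimal, InvalidOperation

# Assumed meaning of the suffixes; the page itself does not define them.
_MULTIPLIERS = {"K": 1_000, "M": 1_000_000}


def parse_position(text):
    """Convert a shorthand such as '123', '5K', or '13.2M' to an integer base position."""
    s = text.strip().upper()
    if not s:
        raise ValueError("empty position")
    factor = 1
    if s[-1] in _MULTIPLIERS:
        factor = _MULTIPLIERS[s[-1]]
        s = s[:-1]
    try:
        # Decimal avoids binary floating-point rounding for inputs like 13.2.
        value = Decimal(s) * factor
    except InvalidOperation:
        raise ValueError("not a valid position: %r" % text)
    if not value.is_finite() or value < 0 or value != value.to_integral_value():
        raise ValueError("position must be a non-negative whole number of bases: %r" % text)
    return int(value)


for example in ("123", "5K", "13.2M"):
    print(example, parse_position(example))
```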
Select export options
   Export the following files:
     Protein sequences (.pep)
     Protein-coding nucleotide sequences (.cds)
     Intron sequences (.intron)
     Gene sequences (.seq)
     UTR sequences (.UTR)
     1000 bp upstream genomic sequences (lower case) from the translational start codons (upper case) (.1kUpstream)
     Intergenic sequences (.intergenic)
     Genomic sequences (.con)
     Brief information about the gene models (.TU_model.brief_info)
     Gene models with Pfam domain matches (.models_with_Pfam)
     Transcript sequences (.cDNA)
     Gene models with Tos17 or other insertions nearby (.models_near_insertion_sites)
 
For comments or questions about rice, send mail to the TIGR rice team.
 
Photographs courtesy of Robin Buell (TIGR), Jiming Jiang (University of Wisconsin), and the USDA Agricultural Research Service