Filtering and clipping sequences by gene region
Each sequence in the Los Alamos database is assigned standard start
and stop coordinates based on the HXB2 HIV-1 complete reference
genome. Filtering is accomplished by retrieving these coordinates as
part of the initial query and comparing them to a reference table.
Clipping involves performing a short alignment at the 5' and 3' ends
against the reference sequence.
Filtering by gene
After running the query, the "Downloads" control gives options for
selecting those returned sequences that contain a desired gene
region. In the pulldown menu, select the gene region desired,
or Any to retrieve all returned sequences.
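The coordinate comparison behind the filter can be sketched as follows.
This is a minimal illustration, not the tool's actual code: the Hit
record type and the function names are hypothetical, and it assumes a
sequence "contains" a region when its HXB2 start is at or before the
region start and its HXB2 stop is at or after the region stop. Only
three rows of the table of regions are copied in.

```python
from typing import Dict, List, NamedTuple, Tuple

# A few rows copied from the table of regions on this page (HXB2 coordinates).
REGIONS: Dict[str, Tuple[int, int]] = {
    "Gag": (790, 2292),
    "Pol CDS": (2085, 5096),
    "Env CDS": (6225, 8795),
}


class Hit(NamedTuple):
    """Hypothetical record for one returned sequence."""
    accession: str
    start: int  # HXB2 start coordinate reported by the query
    stop: int   # HXB2 stop coordinate reported by the query


def contains_region(hit: Hit, region: str) -> bool:
    """True if the hit spans the whole region; "Any" accepts every hit."""
    if region == "Any":
        return True
    region_start, region_stop = REGIONS[region]
    return hit.start <= region_start and hit.stop >= region_stop


def filter_by_region(hits: List[Hit], region: str) -> List[Hit]:
    """Keep only the hits that fully contain the requested region."""
    return [hit for hit in hits if contains_region(hit, region)]
```

A partial-overlap rule (keeping sequences that cover only part of the
gene) would be a different design choice; the containment test above
follows the wording "contain a desired gene region".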
Clipping sequences to gene region
Check the box labelled "Clip sequences to region" to retrieve
only the portion of each sequence corresponding to the desired gene.
Precision clipping is not currently guaranteed. Contact the developer
with any issues you may have with this feature.
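To illustrate the idea of anchoring the clip at both ends, here is a
rough sketch. It is not the tool's implementation: instead of a true
alignment it uses a simple ungapped sliding comparison (fewest
mismatches) of the first and last few bases of the reference region
against the query sequence. The function names and the anchor_len
parameter are hypothetical. Because it ignores insertions and
deletions, it can misplace the clip points, which is one way precision
can be lost.

```python
def best_ungapped_match(anchor: str, seq: str) -> int:
    """Return the offset in seq where anchor fits with the fewest mismatches."""
    if not anchor or len(anchor) > len(seq):
        raise ValueError("anchor must be non-empty and no longer than seq")
    best_offset = 0
    best_mismatches = len(anchor) + 1
    for offset in range(len(seq) - len(anchor) + 1):
        window = seq[offset:offset + len(anchor)]
        mismatches = sum(a != b for a, b in zip(anchor, window))
        if mismatches < best_mismatches:  # ties keep the earliest offset
            best_offset, best_mismatches = offset, mismatches
    return best_offset


def clip_to_region(seq: str, ref_region: str, anchor_len: int = 12) -> str:
    """Clip seq between the best matches of the reference region's two ends."""
    head = ref_region[:anchor_len]    # 5' anchor taken from the reference
    tail = ref_region[-anchor_len:]   # 3' anchor taken from the reference
    start = best_ungapped_match(head, seq)
    # Look for the 3' anchor only downstream of the 5' clip point.
    end = start + best_ungapped_match(tail, seq[start:]) + len(tail)
    return seq[start:end]
```

For example, calling clip_to_region(query, reference_gag) with a query
that extends past Gag on both sides would return the stretch between
the two anchor matches, tolerating a few point mismatches at either end.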
Table of regions
The table is based on one found on the Los Alamos site, hand-corrected
using http://www.hiv.lanl.gov/content/sequence/HIV/MAP/hxb2.xls
as a guide.
region | start | stop |
5' LTR | 1 | 633 |
5' LTR R | 456 | 551 |
5' LTR U3 | 1 | 455 |
5' LTR U5 | 552 | 633 |
TAR | 453 | 513 |
Gag-Pol | 790 | 5096 |
Gag | 790 | 2292 |
p17 (matrix) | 790 | 1185 |
p24 (capsid) | 1186 | 1878 |
p2 | 1879 | 1920 |
p7 (nucleocapsid) | 1921 | 2085 |
p1 | 2086 | 2133 |
p6 | 2134 | 2292 |
Pol CDS | 2085 | 5096 |
p51 (RT) | 2550 | 3869 |
p15 (RNAse H) | 3870 | 4229 |
p31 (integrase) | 4230 | 5096 |
protease | 2293 | 2549 |
Vif CDS | 5041 | 5619 |
Vpr CDS | 5560 | 5850 |
Tat CDS (plus intron) | 5831 | 8469 |
Tat exon 1 | 5831 | 6045 |
Tat exon 2 | 8379 | 8470 |
Rev CDS (plus intron) | 5970 | 8653 |
Rev exon 1 | 5970 | 6045 |
Rev exon 2 | 8379 | 8653 |
Vpu CDS | 6062 | 6310 |
Env CDS | 6225 | 8795 |
V1 | 6615 | 6692 |
V2 | 6693 | 6812 |
V3 | 7110 | 7217 |
V4 | 7377 | 7475 |
V5 | 7602 | 7631 |
RRE | 7710 | 8061 |
gp41 | 7758 | 8795 |
gp120 | 6315 | 7757 |
gp160 | 6315 | 8795 |
Nef CDS | 8797 | 9417 |
3' LTR | 9086 | 9719 |
3' LTR R | 9541 | 9636 |
3' LTR U3 | 9086 | 9540 |
3' LTR U5 | 9637 | 9719 |
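
For scripted use, the table above can be read into a lookup structure.
The sketch below is only a suggestion: the function names are
hypothetical, and it assumes the coordinates are 1-based and inclusive
at both ends (consistent with the 5' LTR starting at 1), so they have
to be shifted to index a 0-based Python string.

```python
from typing import Dict, Tuple


def parse_region_table(text: str) -> Dict[str, Tuple[int, int]]:
    """Parse 'region | start | stop |' rows into {region: (start, stop)}."""
    regions: Dict[str, Tuple[int, int]] = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Skip the header row, blank lines, and anything without two integers.
        if len(cells) < 3 or not (cells[1].isdigit() and cells[2].isdigit()):
            continue
        regions[cells[0]] = (int(cells[1]), int(cells[2]))
    return regions


def region_slice(start: int, stop: int) -> slice:
    """Convert 1-based inclusive coordinates (assumed) to a Python slice."""
    return slice(start - 1, stop)
```

With a full-length string numbered like HXB2, a call such as
genome[region_slice(*regions["p2"])] would then yield the 42 bases of
p2 (1879 to 1920 inclusive).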