HIV Structure and Genome


This article describes the structure and genome of HIV. HIV is a retrovirus and the causative agent of HIV infection, which can progress to a lethal condition known as acquired immunodeficiency syndrome (AIDS) (Weiss, 1993). In AIDS, the human immune system gradually becomes so weak that the body falls prey to many deadly diseases, called opportunistic infections, as well as to cancers. AIDS-related illnesses have caused the deaths of millions of people worldwide. The most dangerous feature of the infection is that it can remain unnoticed for a long time. The infected person may notice nothing, or may experience only a mild influenza-like illness. As the infection progresses, however, the person may develop infections such as tuberculosis, and tumors, that a normal immune system would usually keep in check (Dalla Pria & Bower, 2018).

Collectively, the symptoms and illnesses that follow advanced HIV infection are known as AIDS. In a healthy immune system, when a pathogen enters the body, immune cells eliminate it and retain memory cells in case the same pathogen attacks again. When the immune system becomes very weak, the body is highly susceptible to many infections. The infections that occur during AIDS are known as opportunistic infections or opportunistic diseases. Strictly speaking, AIDS is not a single disease but a condition in which many other diseases can take hold and lead to the death of the patient.

Effective treatments for HIV infection have now been developed, chiefly antiretroviral therapy. During antiretroviral therapy, the patient is given several drugs that target the HIV life cycle. As a result, new virus particles are not produced and the death of the patient can be avoided.

Despite effective medication, it is not certain that patients will never develop AIDS, because medication does not remove the integrated HIV genome from the T cells of infected individuals (Meintjes et al., 2017). Gene therapies, which aim to remove the latent HIV genome, have shown promising experimental results. They include several techniques, of which the CRISPR/Cas9 system appears to be the most effective. Other approaches, such as transcription activator-like effector nucleases or Tre-recombinases, can also remove the latent HIV genome, but they are laborious even at laboratory scale and are therefore unlikely to be cost-effective at commercial scale. CRISPR/Cas9, by contrast, needs only a guide RNA (gRNA) to function, which makes it less time-consuming at laboratory scale than the other gene therapies and a better candidate for commercial use. The gRNA helps the Cas9 enzyme find the provirus and remove it (Roggenkamp et al., 2018).


HIV, discovered in 1983, is a lentivirus, a subgroup of the retroviruses. Retroviruses are viruses that possess a special enzyme called reverse transcriptase. With the help of this enzyme, a retrovirus converts its single-stranded RNA into double-stranded DNA after entering the host cell. The newly formed DNA is then integrated into the DNA of the host cell by the integrase enzyme and other co-factors; at this stage, the integrated viral DNA is called a provirus. A provirus may remain latent for an indefinite period, during which it is not detected by the immune system. On activation, the information in the viral genes is transcribed into mRNAs, which are translated into proteins such as the envelope (env) proteins, group-specific antigen (gag) proteins, pol proteins, protease and other proteins needed to produce new viruses. For retroviruses, the central dogma of life thus becomes RNA to DNA to RNA to polypeptide (Benjamin et al., 2005).

There are two species of HIV, HIV-1 and HIV-2. They differ in inferred origin, virulence, infectivity and prevalence. HIV-1 is more virulent and more infective and is prevalent globally, whereas HIV-2 appears to be less virulent and less infective and is found mainly in West Africa. The two species also call for different designs of anti-retroviral therapy (ART), and they differ genetically: HIV-1 carries the vpu gene, which is replaced by the vpx gene in HIV-2, and the aspartic proteases of the two viruses share only 50% similarity (Levy, 1993).
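The RNA to DNA to RNA to polypeptide flow described earlier in this section can be illustrated with a toy Python sketch. The 12-nucleotide sequence and the function names are invented for illustration and are not real HIV sequence, and the codon table contains only the four codons the example needs.

```python
# Toy illustration of the retroviral information flow:
# viral RNA -> (reverse transcription) -> DNA -> (transcription) -> mRNA -> peptide.
# The sequence below is made up; it is NOT real HIV sequence.

RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}  # RNA base -> complementary DNA base
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}  # DNA base -> complementary RNA base

# Minimal codon table: only the codons used by the toy sequence.
CODONS = {"AUG": "M", "GGU": "G", "GCG": "A", "UAA": "*"}


def reverse_transcribe(rna: str) -> str:
    """Return the complementary (minus-strand) DNA, written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


def transcribe(minus_dna: str) -> str:
    """Transcribe mRNA from the minus-strand DNA template, written 5'->3'."""
    return "".join(DNA_TO_RNA[base] for base in reversed(minus_dna))


def translate(mrna: str) -> str:
    """Translate codon by codon until a stop codon ('*') is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


viral_rna = "AUGGGUGCGUAA"
cdna = reverse_transcribe(viral_rna)
mrna = transcribe(cdna)
peptide = translate(mrna)
print(viral_rna, "->", cdna, "->", mrna, "->", peptide)
```

The sketch ignores everything that makes the real process complex (second-strand synthesis, integration, splicing); it only shows that the message recovered from the DNA copy matches the original viral RNA.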

Structure of HIV

HIV-1 contains two single-stranded ribonucleic acid (ssRNA) strands. Both are unspliced, positive-sense strands, and they are enclosed by the viral capsid protein p24, a characteristic protein of retroviruses (Lu et al., 2011). Each RNA strand is 9749 nucleotides long, with a 5’ cap, a 3’ poly(A) tail and many open reading frames (ORFs) (Ratner et al., n.d.). These ORFs are divided into long and small ORFs. Long ORFs encode the viral structural proteins, while small ORFs encode regulators of the viral life cycle, e.g. proteins for assembly of the virus, replication of the viral RNA and attachment to the host cell surface.
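To make the idea of an open reading frame concrete, the following minimal sketch scans the three forward reading frames of an RNA string for a start codon (AUG) followed in-frame by a stop codon. The sequence, the `find_orfs` name and the `min_codons` threshold are assumptions chosen for illustration; real ORF prediction on the HIV genome is considerably more involved.

```python
# Scan the three forward reading frames of an RNA sequence for open reading
# frames (ORFs): a start codon (AUG) followed in-frame by a stop codon.
# The toy sequence and the length threshold are invented for illustration.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(rna: str, min_codons: int = 2):
    """Return (start, end_exclusive, frame) for each forward-strand ORF.

    min_codons is the minimum number of codons before the stop codon,
    counting the start codon itself.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == "AUG":
                start = i  # open a candidate ORF at the first AUG
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and keep scanning this frame
    return orfs


toy_rna = "CCAUGGCUUAAGAUGAAACCCUGA"
orfs = find_orfs(toy_rna)
for start, end, frame in orfs:
    print(f"frame {frame}: {toy_rna[start:end]}")
```

Raising `min_codons` is the simplest way to separate long ORFs from small ones in such a scan.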

The single-stranded RNA of HIV is bound to several proteins inside the virus particle: the nucleocapsid protein p7, the late assembly protein p6, and the enzymes reverse transcriptase and integrase, which are essential for forming the provirus. The primer for reverse transcriptase is a lysine tRNA. The nucleocapsid proteins protect the viral RNA from digestion by host nucleases. Vif, Vpr, Nef and the viral protease are also enclosed within the virus particle. Viral infectivity factor (Vif) counteracts the antiviral activity of certain human cytidine deaminases, which mutate viral nucleic acid, by targeting them for ubiquitination and cellular degradation. Viral Protein R (Vpr) is responsible for importing the HIV-1 pre-integration complex into the nucleus and for replication of HIV in mature non-dividing cells. Negative Regulatory Factor (Nef) alters the host cell machinery, allowing the infection, survival and replication of HIV. The capsid is surrounded by p17 matrix proteins that maintain the integrity of the virus when it is outside the host cell. This matrix is in turn surrounded by the envelope, which forms as the virus buds from the host cell (Fig. 1). The envelope closely resembles the host cell plasma membrane and contains glycoproteins needed for attaching to and entering the cell, e.g. gp120 and gp41 (Trono, 1995).

The structure of HIV has been visualized using structural biology techniques such as X-ray crystallography and cryo-electron microscopy, which became possible after stable recombinant forms of the viral spikes were created (Benjamin et al., 2005). These stable forms were produced by mutating an isoleucine to proline in gp41 and by introducing an intersubunit disulphide bridge. The viral spikes are the best vaccine targets because they display very few non-neutralizing epitopes, in contrast to gp120, which suppresses the immune response to the target epitopes (Fig. 1) (Sanders et al., 2013).


Fig. 1: Structure of HIV

Genome of HIV

The HIV genome encodes structural proteins, which are found in almost every retrovirus, and non-structural or accessory proteins, which exist only in HIV. It contains nine genes that encode fifteen different proteins, both structural and non-structural, and seven landmarks: the Long Terminal Repeats (LTR), Trans-Activation Response element (TAR), Rev Response Element (RRE), PE, SLIP, CRS and INS. HIV uses a highly sophisticated splicing system to produce fifteen proteins from a genome of less than 10 kb. The unspliced 9.5 kb RNA encodes the structural proteins Gag and Pol; the singly spliced 4.5 kb RNAs produce Vif, Vpr, Vpu and Env (G. Li et al., 2015); and the multiply spliced 2 kb RNAs encode Tat, Rev and Nef. Tat and Rev are the regulatory proteins (Table 1). All these proteins are part of either the interior of the viral particle or the envelope of the virus. They include enzymes such as reverse transcriptase (a polymerase) and integrase. These genes may be mutated in different variants of HIV, and such mutations produce the large genetic variability of HIV. One gene, tev, is reported to remain unaltered in all variants; tev is a fusion of parts of three genes, i.e. tat, env and rev (G. Li et al., 2015).
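The relationship between transcript size classes and protein products described above can be summarized as a small lookup structure. This is only an illustrative sketch: the dictionary layout and the `transcript_class_for` helper are hypothetical, and the sizes are the approximate values quoted in the text.

```python
# Map the three size classes of HIV-1 transcripts to the primary proteins
# they yield. Sizes (in kb) are the approximate values given in the text;
# the data layout and helper name are illustrative assumptions.

TRANSCRIPT_CLASSES = {
    "unspliced": {"approx_kb": 9.5, "products": ["Gag", "Pol"]},
    "singly spliced": {"approx_kb": 4.5, "products": ["Vif", "Vpr", "Vpu", "Env"]},
    "multiply spliced": {"approx_kb": 2.0, "products": ["Tat", "Rev", "Nef"]},
}


def transcript_class_for(protein: str) -> str:
    """Return the transcript class that encodes the given primary product."""
    for class_name, info in TRANSCRIPT_CLASSES.items():
        if protein in info["products"]:
            return class_name
    raise KeyError(f"unknown protein: {protein}")


# Collect every primary product across the three classes.
primary_products = sorted(
    product
    for info in TRANSCRIPT_CLASSES.values()
    for product in info["products"]
)

print(transcript_class_for("Tat"))
print(len(primary_products), "primary products:", ", ".join(primary_products))
```

The nine primary products listed here correspond to the nine genes; the count of fifteen proteins arises once the Gag, Pol and Env precursors are processed into their mature products.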

Table 1: HIV genes and their corresponding protein products.

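| Class | Gene name | Primary protein products |
| --- | --- | --- |
| Viral structural proteins | gag | Gag polyprotein |
| Viral structural proteins | pol | Pol polyprotein |
| Viral structural proteins | env | gp160 |
| Essential regulatory elements | tat | Tat |
| Essential regulatory elements | rev | Rev |
| Accessory regulatory proteins | nef | Nef |
| Accessory regulatory proteins | vpr | Vpr |
| Accessory regulatory proteins | vif | Vif |
| Accessory regulatory proteins | vpu | Vpu |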

The group-specific antigen (gag) gene encodes the Gag polyprotein, which the viral protease processes during maturation into MA (matrix protein, p17), CA (capsid protein, p24), SP1 (spacer peptide 1, p2), NC (nucleocapsid protein, p7), SP2 (spacer peptide 2, p1) and the p6 protein (Henrich et al., 2017). The pol gene encodes reverse transcriptase (RT), integrase (IN), RNase H and protease. RT is the typical enzyme of retroviruses and produces double-stranded DNA (dsDNA) from the viral RNA. IN integrates this dsDNA into the host genome. The protease processes the Gag polyprotein during maturation of the viral particle.

The env gene encodes a unique protein, gp160. gp160 is synthesized in the endoplasmic reticulum and is then cleaved post-translationally into gp120 and gp41 by the host protease furin in the Golgi apparatus. The former is responsible for attachment of the viral particle to CD4 receptors on lymphocytes, and the latter is embedded in the outermost viral envelope and enables the virus to fuse with the target cell (Fig. 2) (King et al., 2016).
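As a compact summary of the Gag processing described above, the sketch below lists the mature products in their order along the polyprotein and lets you look one up by its "p" alias. The `Product` type and `find_by_alias` helper are hypothetical names introduced only for this example.

```python
from typing import NamedTuple


class Product(NamedTuple):
    abbreviation: str
    name: str
    alias: str


# Mature products of the Gag polyprotein, in the order given in the text
# (N-terminus to C-terminus). The data structure is an illustrative assumption.
GAG_PRODUCTS = [
    Product("MA", "matrix protein", "p17"),
    Product("CA", "capsid protein", "p24"),
    Product("SP1", "spacer peptide 1", "p2"),
    Product("NC", "nucleocapsid protein", "p7"),
    Product("SP2", "spacer peptide 2", "p1"),
    Product("p6", "p6 protein", "p6"),
]

# Products of gp160 cleavage and the roles described in the text.
ENV_PRODUCTS = {
    "gp120": "attachment to CD4 receptors",
    "gp41": "embedded in the envelope; enables fusion with the target cell",
}


def find_by_alias(alias: str) -> Product:
    """Look up a Gag product by its 'p' alias, e.g. 'p24'."""
    for product in GAG_PRODUCTS:
        if product.alias == alias:
            return product
    raise KeyError(f"no Gag product with alias {alias}")


capsid = find_by_alias("p24")
print(capsid.abbreviation, "=", capsid.name)
```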

Fig. 2: HIV Genome
