Gene List Format, Update and Management
1. Introduction
Gene/protein lists are managed by DAVID's centralized List Manager Panel. User-submitted lists are stored in this panel and can be accessed by all DAVID tools, so users do not need to re-submit their lists for different DAVID tools.
2. Upload Tab
You can upload a gene/protein list either by copying and pasting the identifiers into the text box (2.1) or by uploading a file (2.2). DAVID is case insensitive for all accessions/IDs. Since the DAVID list manager is centralized, the format requirements for submitting a gene list are the same for ALL DAVID tools.

In addition, at step 3 in the following two figures (Figures 1, 2), you choose how the uploaded list is used: as the "Gene List" for the analysis, or as a "Background" population, e.g. all the genes in the analysis space, such as all the genes on the arrays or in annotation files. The "Gene List" is usually a user's genes/proteins of interest selected from an experiment, e.g. genes with a significant fold change in a high-throughput analysis such as RNA-Seq or microarray analysis. A list submitted as "Gene List" will show up in the List Tab.

"Background" means the submitted identifiers will be used as a customized background population for enrichment analysis. This feature is useful when none of the pre-built backgrounds in DAVID is satisfactory for the user's particular purpose. For more detail, see "What are the choices of population background in DAVID enrichment analysis?" A list submitted as "Background" will show up in the Background Tab.

A submission was successful if the corresponding gene list appears in the List tab or Background tab, with the expected number of genes shown next to it.
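
Before submitting, it can help to tidy the identifiers locally. The following is a minimal Python sketch, not part of DAVID: because DAVID is case insensitive for accessions/IDs, it de-duplicates identifiers without regard to case, and because the background is described above as all the genes in the analysis space, it reports any gene-list identifiers that are absent from a custom background. The function names and the identifiers (geneA, geneB, geneC) are made-up placeholders, and how DAVID itself handles duplicates or unmatched identifiers is not covered here.

```python
def normalize_ids(raw_ids):
    """Strip whitespace, drop blank entries, and de-duplicate ignoring case.

    The first spelling seen for each identifier is kept, in input order.
    """
    seen = set()
    cleaned = []
    for raw in raw_ids:
        ident = raw.strip()
        key = ident.upper()
        if ident and key not in seen:
            seen.add(key)
            cleaned.append(ident)
    return cleaned


def missing_from_background(gene_list, background):
    """Return gene-list identifiers not found in the background (ignoring case)."""
    background_keys = {ident.upper() for ident in normalize_ids(background)}
    return [g for g in normalize_ids(gene_list) if g.upper() not in background_keys]


# Placeholder identifiers, for illustration only.
genes = normalize_ids(["geneA", " geneB", "GENEA", ""])
missing = missing_from_background(["geneA", "geneC"], ["GENEA", "geneB"])
print(genes)
print(missing)
```

An empty result from missing_from_background suggests the custom background covers the whole gene list; a non-empty one is worth reviewing before both lists are uploaded.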

2.1. Upload gene/protein identifier list by copy and paste
You may upload a gene list by copying and pasting the identifiers into the text box as depicted in Figure 1. If the gene identifiers are official gene symbols, you will need to enter the species name or Taxon ID (step 2a; this option is not available for other identifiers). At step 3, choose the usage of the uploaded gene list ("Gene List" or "Background"), then submit in step 4.

Figure 1. Upload gene/protein identifier list by copy and paste.

An example:

Copy/paste the following IDs to "box A" → Select Identifier as "Affymetrix_3PRIME_IVT_ID" → List Type as "Gene List" → Click the "Submit" button.


2.2. Upload gene/protein identifier list from a tab-delimited text file
You may upload an identifier list from a tab-delimited text file as shown in Figure 2, starting with step 1. For a single-list file upload, DAVID expects the identifiers to start in the first row, without a header. The list must contain one gene/protein identifier per row, and only the first column is considered in the analysis. Subsequent columns are treated as metadata (e.g. fold change). For a multi-list file upload, each column is considered a gene list, with the first row (header) used as the list name. A multi-list file upload requires the additional step 1 to mark the uploaded file as such. Please note that if the identifiers are official gene symbols, you will need to enter the species name or Taxon ID (step 2a). At step 3, choose the usage of the uploaded gene list ("Gene List" or "Background"), then submit in step 4.

Figure 2. Upload gene/protein identifier list from a tab-delimited text file.
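
The two file layouts described in this section can be illustrated with a short Python sketch that reads each of them. This is only a local illustration of the layouts, not DAVID's own parser. The function names, list names (list1, list2) and identifiers are hypothetical placeholders, and skipping empty cells in a multi-list file is an assumption made here for lists of unequal length.

```python
import csv
import io


def read_single_list(handle):
    """Single-list layout: no header, identifier in column 1, metadata after it."""
    ids, metadata = [], []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or not row[0].strip():
            continue  # skip blank lines
        ids.append(row[0].strip())
        metadata.append(row[1:])  # e.g. fold change; not used in the analysis
    return ids, metadata


def read_multi_list(handle):
    """Multi-list layout: header row holds list names, each column is one list."""
    rows = list(csv.reader(handle, delimiter="\t"))
    if not rows:
        return {}
    header, body = rows[0], rows[1:]
    lists = {name.strip(): [] for name in header}
    for row in body:
        for name, cell in zip(header, row):
            if cell.strip():  # assumption: empty cells pad shorter lists
                lists[name.strip()].append(cell.strip())
    return lists


# Placeholder file contents, for illustration only.
single_text = "geneA\t2.5\ngeneB\t-1.8\n"
multi_text = "list1\tlist2\ngeneA\tgeneC\ngeneB\t\n"

ids, metadata = read_single_list(io.StringIO(single_text))
lists = read_multi_list(io.StringIO(multi_text))
print(ids, metadata)
print(lists)
```

With a real file, open it with open(path, newline="") and pass the file object in place of the io.StringIO wrapper.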

2.3. What are demo list 1 and demo list 2?
DAVID provides links to two pre-built demo lists for users who do not have a gene list and would like to explore DAVID. Simply click the demolist1 or demolist2 link above the submission box to start the analysis. Information about the two demo lists follows:

demolist1: One hundred sixty-four genes found to be upregulated in CD4+/CD62L- T cells relative to CD4+/CD62L+ T cells.

Cutting edge: L-selectin (CD62L) expression distinguishes small resting memory CD4+ T cells that preferentially respond to recall antigen.
Hengel RL, Thaker V, Pavlick MV, Metcalf JA, Dennis G Jr, Yang J, Lempicki RA, Sereti I, Lane HC. J Immunol 2003 Jan 1;170(1):28-32.

Naive CD4+ T cells use L-selectin (CD62L) expression to facilitate immune surveillance. However, the reasons for its expression on a subset of memory CD4+ T cells are unknown. We show that memory CD4+ T cells expressing CD62L were smaller, proliferated well in response to tetanus toxoid, had longer telomeres, and expressed genes and proteins consistent with immune surveillance function. Conversely, memory CD4+ T cells lacking CD62L expression were larger, proliferated poorly in response to tetanus toxoid, had shorter telomeres, and expressed genes and proteins consistent with effector function. These findings suggest that CD62L expression facilitates immune surveillance by programming CD4+ T cell blood and lymph node recirculation, irrespective of naive or memory CD4+ T cell phenotype.

demolist2: Four hundred three genes found to be induced in peripheral blood mononuclear cells incubated with purified HIV envelope proteins.

HIV envelope induces a cascade of cell signals in non-proliferating target cells that favor virus replication.
Cicala C, Arthos J, Selig SM, Dennis G Jr, Hosack DA, Van Ryk D, Spangler ML, Steenbeke TD, Khazanie P, Gupta N, Yang J, Daucher M, Lempicki RA, Fauci AS. Proc Natl Acad Sci U S A 2002 Jul 9;99(14):9380-9385.

Certain HIV-encoded proteins modify host-cell gene expression in a manner that facilitates viral replication. These activities may contribute to low-level viral replication in nonproliferating cells. Through the use of oligonucleotide microarrays and high-throughput Western blotting we demonstrate that one of these proteins, gp120, induces the expression of cytokines, chemokines, kinases, and transcription factors associated with antigen-specific T cell activation in the absence of cellular proliferation. Examination of transcriptional changes induced by gp120 in freshly isolated peripheral blood mononuclear cells and monocyte-derived-macrophages reveals a broad and complex transcriptional program conducive to productive infection with HIV. Observations include the induction of nuclear factor of activated T cells, components of the RNA polymerase II complex including TFII D, proteins localized to the plasma membrane, including several syntaxins, and members of the Rho protein family, including Cdc 42. These observations provide evidence that envelope-mediated signaling contributes to the productive infection of HIV in suboptimally activated T cells.
3. List Tab

Gene-Species Mapping Manager (top box):

After selecting a gene list from List Manager (bottom), this box specifically generates a summary of gene-species mapping. If the gene list contains multiple species, users can define one or multiple gene-species groups to analyze together or separately.

List Manager (bottom box):

Users can submit multiple gene lists in one web session. Each list is remembered and listed in this box. By default, the most recently submitted gene list is selected as the current list to be analyzed.

Special Note:

You must click the "Use" and "Select Species" buttons for a switch to take effect.
4. Background Tab

The Background Tab allows you to choose a different background for enrichment analysis.
Last edited on December 22, 2020