Van Eyk, Dunn - Proteomic and Genomic Analysis of Cardiovascular Disease - 2003 (522919), page 24
The fidelity of this mRNA amplification method was assessed using microarray technology [23]. The combination of powerful microarray technology with precise amplification techniques promises to be especially important for small samples such as heart biopsies. However, each laboratory must assess the yield of labeled mRNA, the representation of the various transcripts after amplification (fidelity), the linearity of amplification, and finally the sensitivity and reproducibility of the method. Again, the relative efficiency of in vitro transcription for a specific size of mRNA may later correlate with the starting levels of mRNA.

4.10 Probe Labeling

Two principal types of arrays, spotted arrays (robotic deposition of nucleic acids) and in situ synthesized arrays (using photolithography), are used in gene expression monitoring [24, 25]. Labeled material can be prepared by the “one color” or the “two color” system. The “one color” system is the method used for in situ synthesized chips.
The RNA can be labeled directly with a psoralen-biotin derivative or with a biotin-carrying molecule. Labeled nucleotides are incorporated into cDNA during reverse transcription of poly(A) RNA [24, 26]. Alternatively, cDNA with a T7 promoter at its 5' end can be generated to serve as the template for a subsequent step in which the labeled nucleotides are incorporated into cRNA. Commonly used labels are the fluorescent cyanine-based dyes Cy3 and Cy5 and nonfluorescent biotin (Amersham).

The second method is the “two color” system of probe labeling, which is often used with cDNA chips. Equal amounts of cDNA from two different conditions are labeled with different fluorescent dyes, usually Cy3 and Cy5, then mixed and hybridized to a chip [25].
This yields the ratio (relative concentration) of mRNA between the two samples. There are direct and indirect methods of incorporating the dyes into cDNA. In the direct method, the labeled nucleotide is incorporated into the cDNA, whereas in the indirect method an amino-allyl-modified nucleotide analogue such as amino-allyl-dCTP is incorporated into the cDNA, and the dyes are subsequently coupled to it chemically. In addition to systematic variations in direct dual color labeling, Cy3 and Cy5 exhibit different quantum yields. Thus an additional chip with exchanged dyes (commonly called Fluor-Flip) is required to obtain reliable data (Fig. 4.3). After hybridization and washing, the array is scanned at two different wavelengths to determine the relative transcript abundance for each condition for data analysis.

Fig. 4.3 To ensure a high degree of reproducibility, the cDNA microarray here is doubly spotted, and only spots that show concordant up- or down-regulation are included in the final analysis. Fluor-Flip is another technique to ensure that the differences observed are due to true expression differences and not to artifact.

4.11 Data Analysis and Bioinformatics

The basic techniques in microarray experiments, from cDNA synthesis to hybridization and washing, are conventional methods that have been used in the laboratory for years.
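The Fluor-Flip (dye-swap) design described with Fig. 4.3 can be illustrated with a small worked example. The sketch below is not taken from the chapter: it assumes the dye effect is a constant additive bias on the log2 ratio and is the same on both chips, and the function names are hypothetical.

```python
import math

def log2_ratio(cy5, cy3):
    """log2(Cy5/Cy3) for one spot; both intensities must be positive."""
    return math.log2(cy5 / cy3)

def dye_swap_estimate(forward, swapped):
    """Combine a forward chip and a dye-swapped chip for one gene.

    forward: (cy5, cy3) with test labeled Cy5 and control labeled Cy3
    swapped: (cy5, cy3) with test labeled Cy3 and control labeled Cy5
    Assumption: each measured log ratio = +/- true effect + dye bias.
    Returns (estimated log2(test/control), estimated dye bias).
    """
    m_forward = log2_ratio(*forward)   # true effect + bias
    m_swapped = log2_ratio(*swapped)   # -(true effect) + bias
    true_effect = (m_forward - m_swapped) / 2  # bias cancels
    dye_bias = (m_forward + m_swapped) / 2     # effect cancels
    return true_effect, dye_bias

# Illustrative numbers: a 4-fold induction where Cy5 reads twice as bright.
effect, bias = dye_swap_estimate(forward=(800, 100), swapped=(200, 400))
print(effect, bias)  # 2.0 1.0
```

Under these assumptions, a gene whose apparent change does not reverse sign on the swapped chip shows up with a small estimated effect and a large estimated bias, which is the kind of artifact the dye exchange is meant to expose.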
Data analysis is the most demanding part of using this extraordinary tool because it involves an unprecedented volume of data. For this most challenging part of the technology, an increasing number of software tools are available [27]. The two basic steps in microarray data analysis are:
· Data collection (collecting raw data from images, correction for the background, and normalization), target (differentially expressed genes) detection, and target intensity extraction.
· Analysis and bioinformatics, with multiple image analysis and data visualization (e.g. clustering methods to identify unique patterns of gene expression).
Care in assuring accurate reproducibility of the data is of paramount importance (Fig. 4.4).

4 cDNA Microarrays in Heart Failure Research

Fig. 4.4 The reliability of microarray experiments depends on the reproducibility of the data set not only within the same subject, but also between subjects in an experiment exposed to the same conditions. Here we illustrate normalized gene expression changes between the first and fifth subject of a single experiment of aortic banding in a mouse model, demonstrating a high degree of correlation and reproducibility between these two hearts subjected to the same stress.

Data collection: differential gene expression is assessed by scanning the hybridized arrays using either a confocal laser scanner (GSI Scan array) producing 16-bit TIFF images, or a photomultiplier tube (PMT) laser scanner (Axon Scanner) capable of interrogating both the Cy3- and Cy5-labeled probes and producing a 24-bit composite RGB (red-green-blue) ratio image, or capable of detecting additional dyes at up to four wavelengths simultaneously.
The ratio image typically represents the levels of the two cDNAs (control and test) hybridized to the array in a “two color” system. A great advantage of this approach is its capacity to demonstrate a dynamic pattern of gene expression. These images must then be processed and converted to numerical representations in order to calculate the relative expression level of each gene and to identify differentially expressed genes.

In image processing, the spots representing the arrayed genes are first identified and distinguished from nonspecific contamination (such as dust) or artifacts.
The second step in image analysis is background calculation and subtraction to reduce the effect of nonspecific fluorescence. Various software tools employ different data analysis algorithms to quantify the images. For all ratio calculations that require background subtraction, the median background value is usually used (in GenePix Pro, Axon).

Because of the multivariate nature of microarray experiments, it is not easy to compare data from different experiments. To improve the comparison across many microarrays, data normalization is required.
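The background-subtraction step described above can be sketched as follows. This is a minimal illustration, not the algorithm of GenePix Pro or any other package: the function names, the per-spot list of background pixel values, and the `floor` clamp that keeps corrected intensities positive are all assumptions made for the example.

```python
from statistics import median

def background_corrected(signal, background_pixels, floor=1.0):
    """Subtract the median of the local background pixels from a spot signal.

    The result is clamped at `floor` (an assumed convention) so that a
    later ratio or logarithm never sees a zero or negative intensity.
    """
    return max(signal - median(background_pixels), floor)

def spot_ratio(cy5_signal, cy5_background, cy3_signal, cy3_background):
    """Background-corrected Cy5/Cy3 ratio for a single spot."""
    cy5 = background_corrected(cy5_signal, cy5_background)
    cy3 = background_corrected(cy3_signal, cy3_background)
    return cy5 / cy3

# One bright background pixel (400, e.g. a dust speck) barely moves the
# median, whereas it would inflate a mean-based background estimate.
ratio = spot_ratio(
    cy5_signal=1200, cy5_background=[90, 100, 100, 110, 400],
    cy3_signal=600, cy3_background=[40, 50, 60],
)
print(ratio)  # 2.0
```

The median is used here because it is robust to a few contaminated pixels, which fits the earlier remark about distinguishing spots from dust and other artifacts.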
Different software packages offer various methods for normalization (commercial software: GenePix, Axon; GeneSpring, Silicon Genetics; Affymetrix Microarray Suite and Data Mining Tool, Affymetrix, Inc.; Spotfire, Spotfire Inc.; and free software: DNA-Chip Analyzer; SAM, Stanford; TreeView, Eisen; etc. most of these can be found). Increasing numbers of researchers prefer scaling to normalization. The difference between scaling and normalization relates to the method used to pick the target intensity. For scaling, a number that represents the average signal from a large set of arrays is used. For normalization, the target intensity is defined as the average signal on the baseline array, and all experimental arrays are then adjusted to that value.
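The distinction between the two target intensities can be made concrete with a short sketch. It assumes that "adjusting" an array means multiplying every intensity by a single factor so that the array mean equals the target, and that the scaling target is the mean of the per-array means; the function names are hypothetical and do not correspond to any of the packages listed above.

```python
from statistics import mean

def rescale(array, target):
    """Multiply all intensities so that the array mean equals `target`."""
    factor = target / mean(array)
    return [x * factor for x in array]

def normalize_to_baseline(arrays, baseline_index=0):
    """Normalization: the target is the average signal of the baseline array."""
    target = mean(arrays[baseline_index])
    return [rescale(a, target) for a in arrays]

def scale_to_common_target(arrays):
    """Scaling: the target is an average signal over the whole set of arrays."""
    target = mean(mean(a) for a in arrays)
    return [rescale(a, target) for a in arrays]

arrays = [[100, 200, 300], [200, 400, 600]]  # second chip reads twice as bright
print(normalize_to_baseline(arrays))   # both arrays end with mean 200
print(scale_to_common_target(arrays))  # both arrays end with mean 300
```

With normalization the result depends on which chip is chosen as baseline, whereas the scaling target is shared by every array in the set, which may be one reason for the preference noted above.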
In addition to per chip normalization or scaling, there must be a per gene normalization in order to bring the data to a relative scale.

Normalized or scaled data are typically analyzed to identify genes that are differentially expressed. Most published studies have used a cutoff of two-fold up- or down-regulation to define differential expression; however, this cannot hold for all genes, because different genes may have different levels of sensitivity.

Multiple statistical methods can be used, first to filter the most statistically significant data and then to perform further analysis, data mining, and bioinformatics in order to extract the most reliable information from microarray data. Different software packages offer various statistical methods for data filtering, such as parametric tests that assume equal variance (Student's t-test/ANOVA), parametric tests that do not assume equal variance (Welch t-test/Welch ANOVA), or nonparametric tests (Wilcoxon-Mann-Whitney test).
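A two-fold cutoff and a Welch t-test can be combined for a single gene roughly as sketched below. The code is an illustration under stated assumptions: replicate intensities are already normalized, fold change is the ratio of group means, and the helper names are hypothetical. It returns the Welch t statistic and the Welch-Satterthwaite degrees of freedom only; turning these into a p-value requires the t distribution from a statistics library.

```python
import math
from statistics import mean, variance

def log2_fold_change(test, control):
    """log2 of the ratio of mean test to mean control intensity."""
    return math.log2(mean(test) / mean(control))

def passes_fold_cutoff(test, control, fold=2.0):
    """True if the gene is up- or down-regulated by at least `fold`."""
    return abs(log2_fold_change(test, control)) >= math.log2(fold)

def welch_t(a, b):
    """Welch t statistic and degrees of freedom (variances not assumed equal)."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

test = [400, 420, 380]
control = [100, 110, 90]
print(passes_fold_cutoff(test, control))  # True: a 4-fold increase
print(welch_t(test, control))
```

Using both criteria together reflects the point made above: a fixed fold cutoff ignores replicate variability, while the t statistic weighs the difference against that variability.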
In addition to filtering by standard deviation, p-values, etc., multiple testing corrections can be added to the above methods to increase the accuracy of the filtered data.

Sophisticated bioinformatics tools are required to extract accurate information from the avalanche of data and to draw logical and reliable conclusions from the massive volume of information generated by microarray experiments.
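The text does not name a particular multiple testing correction. As one common choice, the Benjamini-Hochberg false discovery rate procedure can be sketched as follows; the selection of this method and the function name are assumptions made for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay
    # monotone: each is the minimum of p * m / rank over equal or higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

p = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(p)
# Keep genes whose adjusted p-value is below a chosen false discovery rate.
significant = [i for i, value in enumerate(q) if value <= 0.05]
print(significant)  # [0, 1, 2, 3]
```

With thousands of genes tested at once, an unadjusted threshold of 0.05 would be expected to pass many genes by chance alone, which is why a correction of this kind is applied after the per-gene tests.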
The objective is to reduce complexity and to extract, or mine, as much useful and relevant information as possible. For microarray data analysis, both data mining and bioinformatics are required. Data mining has been defined as “the extraction of implicit, previously unknown, and potentially useful information from data”, whereas bioinformatics is used for sequence-based extraction of specific patterns or motifs with the ability of specific pattern matching.
Currently they exist as separate approaches, but eventually data mining and bioinformatics will be indistinguishable. Most data analysis software is equipped with bioinformatics rather than data mining tools. When the size of the data set is reduced to a manageable volume of statistically significant data, it is possible for the scientist to identify emerging patterns.

There are several popular methods to analyze and visualize gene expression data:
· Hierarchical Clustering is used to visualize a set of samples or genes by organizing them into a phylogenetic tree, often referred to as a dendrogram.
One way of analyzing microarray data is to look at clusters (groups) of genes with a similar pattern of expression across many experiments. The co-regulated genes within such groups are often found to have related functions. The distance between two branches of a tree is a measure of the correlation between any two genes in the two branches. This is an exceedingly powerful method and is the most widely used. It allows a researcher to find experimental conditions (e.g. various drug treatments, classification of disease states) that have similar effects.
· K-means Clustering divides genes into distinct groups based on their expression patterns.