Minimum information about a microarray experiment - MIAME
Draft
March 21, 2001 (based on November 17, 2000) updated 30 July 2001.
MIAME compliant example experiment
1. Experimental design: the set of the hybridisation experiments as a whole
This section gives information describing the experiment, which may consist of one or more hybridisations, as a whole.
Normally 'experiment' should include a set of hybridisations which are
inter-related and performed in a limited period of time. For instance, it may
be all the hybridisations related to research published in a single paper.
a) author (submitter), laboratory, contact information, links (URL), citation
T. Preiss, E. Wabek, A. Richter, J. Zimmermann, M. Wagner, C. Schwager, M. W. Hentze, W. Ansorge
EMBL, Meyerhofstr. 1, D-69117 Heidelberg, Germany
b) type of the experiment - maximum one line
treated vs. untreated comparison
c) experimental variables, i.e. parameters or conditions tested (e.g., time, dose, genetic variation, response to a treatment or compound)
response to a compound - rapamycin
d) single or multiple hybridisations
For multiple hybridisations:
* serial (yes/no)
no
* type (e.g., time course, dose response)
dose response
* grouping (yes/no)
no
* type (e.g., normal vs. diseased, multiple tissue comparison)
pre and post compound addition
Relationships between all the samples, arrays and hybridisations in the experiment: each sample, each array, and each hybridisation should be given a unique ID or number, and all the relationships should be listed, possibly with appropriate comments.
Samples: treated, untreated
Arrays: 20, 27, 28, 29, nn
Hybridisations:
H1: Array 20, treated Cy5 vs. untreated Cy3
H2: Array 27, treated Cy5 vs. untreated Cy3
H3: Array 28, treated Cy5 vs. untreated Cy3
H4: Array 29, treated Cy5 vs. untreated Cy3
H5: Array nn, treated Cy5 vs. untreated Cy3
H1 and H3 are replicates.
All are replicates of each other; the same RNA prep was used for all hybridisations.
cDNA labelling was done individually for each experiment.
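The relationships listed above can also be written down in a machine-checkable form. The following is only a minimal sketch: the class name `Hybridisation`, its field names and the helper `check_relationships` are illustrative assumptions and not part of MIAME; only the IDs and pairings are taken from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hybridisation:
    """One two-colour hybridisation (hypothetical record layout)."""
    hyb_id: str
    array_id: str
    cy5_sample: str
    cy3_sample: str


# IDs exactly as given in the example experiment.
SAMPLES = {"treated", "untreated"}
ARRAYS = {"20", "27", "28", "29", "nn"}
HYBRIDISATIONS = [
    Hybridisation("H1", "20", "treated", "untreated"),
    Hybridisation("H2", "27", "treated", "untreated"),
    Hybridisation("H3", "28", "treated", "untreated"),
    Hybridisation("H4", "29", "treated", "untreated"),
    Hybridisation("H5", "nn", "treated", "untreated"),
]


def check_relationships(samples, arrays, hybridisations):
    """Return a list of problems; an empty list means the IDs are consistent."""
    problems = []
    seen_ids = set()
    for hyb in hybridisations:
        if hyb.hyb_id in seen_ids:
            problems.append(f"duplicate hybridisation ID {hyb.hyb_id}")
        seen_ids.add(hyb.hyb_id)
        if hyb.array_id not in arrays:
            problems.append(f"{hyb.hyb_id}: unknown array {hyb.array_id}")
        for sample in (hyb.cy5_sample, hyb.cy3_sample):
            if sample not in samples:
                problems.append(f"{hyb.hyb_id}: unknown sample {sample}")
    return problems


problems = check_relationships(SAMPLES, ARRAYS, HYBRIDISATIONS)
array_by_hyb = {hyb.hyb_id: hyb.array_id for hyb in HYBRIDISATIONS}
```

A check of this kind catches a hybridisation that refers to an array or sample ID that was never declared, which is the consistency this section asks the submitter to maintain.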
e) quality related indicators: quality control steps taken
* biological replicates
* technical replicates (replicate spots or hybs)
5
* polyA tails
* low complexity regions
* unspecific binding
* other
Empty wells, labeled primers, spiked-in controls from Arabidopsis and bacterial genes, positive as well as negative control sequences
f) optional user defined "qualifier, value, source" list
g) a free text description of the experiment set or a link to a publication
The diploid yeast (Saccharomyces cerevisiae) strain FY1679 (Mata/α ura3-52/ura3-52 trp1Δ63/+ leu2Δ/+ his3Δ200/+) was grown at 30 °C in YPD medium. At an OD(600 nm) of 0.6 rapamycin was added to a final concentration of 0.2 µg/ml and incubation was continued for 90 min. A control culture was treated in parallel with an equal amount of DMSO vehicle. Cells were harvested at an OD(600 nm) of 0.9 (+rapamycin) and 1.1 (control).
Unpublished data.
There are three published papers describing the transcriptional programme of yeast in response to rapamycin. It should be possible to compare our datasets with those described in these papers:
Curr. Biol. 2000 10:1574-1581;
PNAS USA 1999 96:14866-14870;
Genes Dev. 1999 13:3271-3279
2. Array design: each array used and each element (spot) on the array
There are two parts of this section:
2.1 describes the list of physical arrays themselves, each of these referring to specific array design types described in 2.2. We expect that the array design type descriptions will be given by the array providers and manufacturers, in which case the users will simply need to reference them.
2.2 Array design. This section consists of three parts:
a) description of the array as a whole,
b) description of each type of element (spot) used (properties that are typically common to many elements, e.g., 'synthesized oligo-nucleotides' or 'PCR products from cDNA clones'),
c) description of the specific properties of each element, such as the DNA sequence.
(In practice, the last part will be provided as a spreadsheet or tab-delimited file.)
2.1 Array copy: each array used and each element (spot) on the array.
* unique id as used in part 1
20, 27, 28, 29
* array design name (e.g., Stanford human 10K set) (for commercial or standard arrays a unique ID given by the provider may be used)
EMBL Yeast 12K Ver. 1
2.2 a) array related information
* array design name (e.g., "Stanford Human 10K set")
EMBL Yeast 12K Ver. 1
* platform type: in situ synthesized or spotted
spotted
* array provider (source)
in-house (EMBL)
* surface type: glass, membrane, other
glass
* surface type name
in-house coated EMBL slides
* physical dimensions of array support (e.g. slide)
75 x 25 mm
* number of elements on the array
14 000 (approximately)
* a reference system allowing each element (spot) on the array to be located (in the simplest case the number of columns and rows is sufficient)
by coordinate, referencing an external data table; 32 subgrids ordered 4 x 8, counted left to right, top down; 21 rows, 20 columns per subgrid, counted left to right, top down, for array EMBL Yeast 12K Ver. 1
* production date
* production protocol (obligatory if applicable)
Yeast cDNA microarrays were spotted using PCR products obtained with the Research Genetics GenePair primer set.
Spotters: GeneMachine Omnigrid, Telechem split pins SMP3
Double stranded, preferentially only the forward strand is attached via a 5' aminolink moiety. (The slide chemistry will preferentially bind this strand but it will also retain the other.)
* optional "qualifier, value, source" list (see Introduction)
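The reference system given above (32 subgrids, each with 21 rows and 20 columns) implies 13 440 spot positions, which is consistent with the stated "14 000 (approximately)". The sketch below shows one possible way to turn a (subgrid, row, column) coordinate into a single running number and back. The numbering scheme and the function names are assumptions made for illustration; the authoritative mapping is the external data table mentioned above.

```python
# Layout values taken from the array description above.
SUBGRIDS = 32            # ordered 4 x 8, counted left to right, top down
ROWS_PER_SUBGRID = 21
COLS_PER_SUBGRID = 20
SPOTS_PER_SUBGRID = ROWS_PER_SUBGRID * COLS_PER_SUBGRID


def total_elements():
    """Number of spot positions implied by the layout."""
    return SUBGRIDS * SPOTS_PER_SUBGRID


def spot_index(subgrid, row, col):
    """1-based running number of a spot (assumed numbering, not from the source).

    Subgrids, rows and columns are all 1-based; within a subgrid, spots are
    counted left to right, then top down.
    """
    if not (1 <= subgrid <= SUBGRIDS
            and 1 <= row <= ROWS_PER_SUBGRID
            and 1 <= col <= COLS_PER_SUBGRID):
        raise ValueError("coordinate outside the array layout")
    return ((subgrid - 1) * SPOTS_PER_SUBGRID
            + (row - 1) * COLS_PER_SUBGRID
            + col)


def spot_position(index):
    """Inverse of spot_index: running number -> (subgrid, row, col)."""
    if not 1 <= index <= total_elements():
        raise ValueError("index outside the array layout")
    subgrid, rest = divmod(index - 1, SPOTS_PER_SUBGRID)
    row, col = divmod(rest, COLS_PER_SUBGRID)
    return subgrid + 1, row + 1, col + 1
```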
b) properties of each group of elements (spots) on the array; elements may be simple, i.e., containing only identical molecules, or composite, i.e., containing different oligonucleotides obtained from the same reference molecule;
* element type id
* simple or composite
simple
* element type: synthesized oligo-nucleotides, PCR products, plasmids, colonies, other
PCR products
* single or double stranded
double
* element (spot) dimensions (approximate diameter)
90-120 µm
* element generation protocol that includes sufficient information to reproduce the element
using the ResGen primer set and genomic DNA (some PCR products encompass intron sequences)
* attachment (covalent/ionic/other)
covalent
* optional "qualifier, value, source" list (see Introduction)
c) specific properties of each spot on the array:
* element type ID from 2.2b
PCR products
* position on the array allowing spot identification in the image (see 5a below)
spreadsheet attached
* clone information, obligatory for elements obtained from clones: (clone ID, clone provider, date, availability)
* sequence or PCR primer information: (sequence accession number in DDBJ/EMBL/GenBank if known, sequence itself (if databases do not contain it), primer pair information, if relevant)
The PCR primers that are used for generating the elements are identified by MIPS ORF names given in the attached spreadsheet
* for composite oligonucleotide elements: (oligonucleotide sequences, if given; otherwise the number of oligonucleotides and the reference sequence (or accession number)). One of the above should unambiguously identify the element.
* approximate lengths if exact sequence not known
* gene name and links to appropriate databases (e.g., SWISS-PROT, or organism specific databases), if known and relevant
(Normally this information will be provided in one or more spreadsheets or tab-delimited files.)
3. Samples: samples used, extract preparation and labeling
By a 'sample' we understand the biological material from which the RNA gene products (or DNA) have been extracted for subsequent labeling, hybridisation and measuring. This section describes the source of the sample (e.g., organism, cell type or line), its treatment, as well as the preparation of the extract and its labeling, i.e., all steps that precede the contact with an array (i.e., hybridisation). Each sample used in the experiment has a separate section 3. In practice, if the treatments are similar, differing only slightly, the descriptions can be given together, clearly pointing out the differences.
a) sample source and treatment (this section describes the biological treatment which happens before the extract preparation and labelling, i.e., the biological sample in which we intend to measure the gene expression; for each sample only some of the qualifiers given below may be relevant):
* ID as used in section 1
treated, untreated
* organism (NCBI taxonomy)
Saccharomyces cerevisiae
additional "qualifier, value, source" list; each qualifier in the list is obligatory if applicable; the list includes:
* cell source and type (if derived from primary source(s))
strain FY1679
* development stage
diploid
* genetic variation (e.g., gene knockout, transgenic variation)
Mata/α ura3-52/ura3-52 trp1Δ63/+ leu2Δ/+ his3Δ200/+
* in vivo treatments (organism or individual treatments)
* in vitro treatments (cell culture conditions)
cells were grown at 30 °C in YPD medium
* treatment type (e.g., small molecule, heat shock, cold shock, food deprivation)
small molecule
* compound
rapamycin, drug vehicle
* separation technique (e.g., none, trimming, microdissection, FACS)
none
* laboratory protocol for sample treatment
At an OD(600 nm) of 0.6 rapamycin (20 µl of a 1 µg/µl solution in DMSO) was added to a final concentration of 0.2 µg/ml and incubation was continued for 90 min. A control culture was treated in parallel with 20 µl of DMSO vehicle. Cells were harvested at an OD(600 nm) of 0.9 (+rapamycin) and 1.1 (control).
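The stated final concentration can be checked with a short dilution calculation. The treatment protocol does not give the culture volume; the sketch below assumes 100 ml, the volume mentioned in the extract preparation protocol of this section, and the function name is purely illustrative.

```python
def final_concentration_ug_per_ml(stock_ug_per_ul, added_ul, culture_ml):
    """Final concentration after adding a small stock volume to a culture."""
    mass_ug = stock_ug_per_ul * added_ul          # total mass of compound added
    total_ml = culture_ml + added_ul / 1000.0     # culture plus added stock volume
    return mass_ug / total_ml


# 20 µl of a 1 µg/µl rapamycin stock, as stated in the protocol.
# ASSUMPTION: 100 ml culture volume (not stated in the treatment protocol).
rapamycin = final_concentration_ug_per_ml(1.0, 20.0, 100.0)

# DMSO vehicle as a volume percentage of the culture, same assumption.
dmso_percent = 100.0 * (20.0 / 1000.0) / (100.0 + 20.0 / 1000.0)
```

Under this assumption the result rounds to the 0.2 µg/ml given above, and the DMSO vehicle amounts to roughly 0.02 % (v/v) in both the treated and the control culture.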
b) hybridisation extract preparation
laboratory protocol for extract preparation, including:
* protocol description
Total RNA preparation from yeast cultures
Grow 100 ml suspension culture in YPD to an OD @600nm of approx. 1
Harvest cells by centrifugation for 10' @5000rpm/4°C in the SS34 rotor
Resuspend cells in breaking buffer and transfer to 2 x 2ml eppies per culture
Pellet cells 2' @ top speed in a cooled microfuge
Freeze pellet in liquid nitrogen or proceed directly to cell lysis
To cell pellet add 400 µl phenol/chloroform, 400 µl breaking buffer, 5 µl 20% SDS, and approx. 400 µl glass beads
Agitate 2x for 45'' in a carbon dioxide-cooled bead mill (GeneExProgramme) taking care not to freeze the mixture
Spin 10' @ top speed in a cooled microfuge
Remove SN to a new eppie
Perform 2x regular phenol/chloroform extractions
Precipitate with ethanol/sodium acetate as usual
Materials:
Breaking buffer: 20 mM Tris/Cl pH 7.4, 100 mM KCl, 2 mM MgCl2, 2 mM DTT
glass beads: Sigma # G-8772
* extraction method
bead beating lysis
* whether total RNA, mRNA, or genomic DNA is extracted
total RNA
* amplification (RNA polymerases, PCR)
none
* optional "qualifier, value, source" list (see Introduction)
c) labelling: laboratory protocol for labelling, including:
* protocol
Clontech Atlas Glass Fluorescent labeling kit was used (with some modifications). cDNA synthesis was primed with oligo(dT)20 primer.
* amount of nucleic acids labeled
20 µg total RNA
* label used (e.g., Cy3, Cy5, 33P)
Cy3 (untreated), Cy5 (treated)
* optional "qualifier, value, source" list (see Introduction)
4. Hybridisations: procedures and parameters
This section describes details of each hybridisation in the experiment. Each hybridisation has a separate section 4, though if they are similar they may be described together.
* ID as given in section 1
H1, H2, H3, H4, H5
* laboratory protocol for hybridisation, including:
* the solution (e.g., concentration of solutes)
Hybridisation buffer: 50% formamide, 6x SSC, 0.5% SDS, 5x Denhardt's solution.
* blocking agent
Slide blocking: 6x SSC, 0.5% SDS, 1% BSA. Prehybridisation for 45 min at 42°C.
Probe blocking: 0.1 µg/µl salmon sperm DNA, 0.5 µg/µl polydA during hybridisation.
* wash procedure
1x 10 min wash with 0.1x SSC, 0.1% SDS; 2x 5 min in 0.1x SSC
* quantity of labelled target used
all material generated from 20 µg total RNA
* time, concentration, volume, temperature
overnight (16h), 40 µl, at 42°C
* description of the hybridisation instruments
Gene Machine Hybridisation chamber
* optional "qualifier, value, source" list (see Introduction)
5. Measurements: images, quantitation, specifications
This section describes the data obtained from each scan and their combinations.
a) hybridisation scan raw data:
a1) the scanner image file (e.g., TIFF) from the hybridised microarray scanning
attached files:
AXON_27_Treated_Cy5.tif
AXON_27_Untreated_Cy3.tif
AXON_28_Treated_Cy5.tif
AXON_28_Untreated_Cy3.tif
AXON_29_Treated_Cy5.tif
AXON_29_Untreated_Cy3.tif
GMS_27_Treated_Cy5.tif
GMS_27_Untreated_Cy3.tif
GMS_20_Treated_Cy5.tif
GMS_20_Untreated_Cy3.tif
GMS_29_Treated_Cy5.tif
GMS_29_Untreated_Cy3.tif
GSI_27_Treated_Cy5.tif
GSI_27_Untreated_Cy3.tif
GSI_20_Treated_Cy5.tif
GSI_20_Untreated_Cy3.tif
GSI_nn_Treated_Cy5.tif
GSI_nn_Untreated_Cy3.tif
Grid Files:
==========
AXON_27_***.grid
AXON_28_***.grid
AXON_29_***.grid
GMS_20_***.grid
GMS_27_***.grid
GMS_28_***.grid
GSI_20_***.grid
GSI_27_***.grid
GSI_nn_***.grid
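The image file names listed above appear to follow the pattern SCANNER_ARRAY_Sample_Dye.tif. Assuming that convention holds (it is inferred from the names, not stated by the authors), a small parser can link each image back to its array and channel and flag scans where one channel is missing. The function names are illustrative only.

```python
import re

# ASSUMED naming convention, inferred from the listed files:
#   <scanner>_<array id>_<Treated|Untreated>_<Cy3|Cy5>.tif
IMAGE_NAME = re.compile(
    r"^(?P<scanner>[A-Z]+)_(?P<array>[0-9a-z]+)"
    r"_(?P<sample>Treated|Untreated)_(?P<dye>Cy3|Cy5)\.tif$"
)


def parse_image_name(name):
    """Split an image file name into scanner, array, sample and dye."""
    match = IMAGE_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected image file name: {name}")
    return match.groupdict()


def group_by_scan(names):
    """Group image files by (scanner, array) so both channels sit together."""
    scans = {}
    for name in names:
        info = parse_image_name(name)
        scans.setdefault((info["scanner"], info["array"]), {})[info["dye"]] = name
    return scans


def incomplete_scans(scans):
    """Return the (scanner, array) keys that lack a Cy3 or a Cy5 image."""
    return sorted(key for key, channels in scans.items()
                  if set(channels) != {"Cy3", "Cy5"})


example = group_by_scan([
    "AXON_27_Treated_Cy5.tif",
    "AXON_27_Untreated_Cy3.tif",
    "GSI_nn_Treated_Cy5.tif",
    "GSI_nn_Untreated_Cy3.tif",
])
```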
a2) scanning information:
* parsed header of the TIFF file, including laser power, spatial resolution, pixel space, PMT voltage;
see attached files
* laboratory protocol for scanning, including:
* hybridisation ID as in Section 1
H1, H2, H3, H4, H5
* image unique id
* scanning parameters (including laser power, spatial resolution, pixel space, PMT voltage)
* lab protocol for scanning (including scanning hardware and software)
* scanning hardware
Genetic Microsystems GMS 418 (H1, H2, H4)
Axon GenePix 4000A (H2, H3, H4)
GSI Lumonics ScanArray 5000 (H1, H2, H5)
* scanning software
b) image analysis and quantitation
bi) the complete image analysis output (of the particular image analysis software) for each element (or composite element - see 2.2 b)), for each channel -
see attached files:
Table Data Files:
=============
AXON_27.txt
AXON_28.txt
AXON_29.txt
GMS_20.txt
GMS_27.txt
GMS_28.txt
GSI_20.txt
GSI_27.txt
GSI_nn.txt
bii) image analysis information:
* input image id
* quantitation unique id
* image analysis software specification and version, availability, and the description of the algorithm
ChipSkipper, EMBL, Christian Schwager (schwager@embl-heidelberg.de)
* all parameters
Expected spot diameter: 100 micrometer (11 pixel)
Area reject = 10
c) summarized information from possible replicates
ci) derived measurement value summarizing related elements as used by the author (this may constitute replicates of the element on the same or different arrays or hybridisations, as well as different elements related to the same entity, e.g. gene)
attached files
Average Files (Scanner specific averages)
===============================
AXON_Average.txt
AXON_Average.xls
GMS_Average.txt
GMS_Average.xls
GSI_Average.txt
GSI_Average.xls
cii) reliability indicator for the value of ci) as used by the author (e.g. standard deviation); may be "unknown"
see files above
ciii) specification of how ci) and cii) are calculated; the specification should be based on bi)
see files above
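As an illustration of what a derived value and its reliability indicator might look like, the sketch below averages the log2 ratios of one element across replicate hybridisations and reports their standard deviation. This is an assumed, generic summary: the method actually used for the attached average files is not described here, and the input ratios shown are invented for the example.

```python
import math
import statistics


def summarise_replicates(ratios):
    """Summarise replicate Cy5/Cy3 ratios of one element.

    Returns (mean log2 ratio, sample standard deviation of the log2 ratios).
    With a single replicate the standard deviation is returned as None,
    corresponding to a reliability indicator of "unknown".
    """
    logs = [math.log2(ratio) for ratio in ratios]
    mean_log = statistics.mean(logs)
    sd_log = statistics.stdev(logs) if len(logs) > 1 else None
    return mean_log, sd_log


# Invented ratios for one element across two replicate hybridisations.
mean_log, sd_log = summarise_replicates([1.0, 4.0])
```

Working on log ratios treats a twofold increase and a twofold decrease symmetrically, which is a common reason for summarising ratios this way.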
6. Normalisation controls, values, specifications for hybridisations
a) Normalisation strategy (spiking, housekeeping genes, total array)
spiking
b) Normalisation algorithm (linear regression, log-linear regression, ratio statistics, log(ratio) mean median centering)
linear regression
c) Control array elements
* position (the abstract coordinate on the array)
attached spreadsheet
* control type (spiking, normalization, negative, positive)
spiking
* control qualifier (endogenous, exogenous)
exogenous
d) Hybridisation extract preparation
* spike type
polyadenylated spiking mRNA added prior to reverse transcription
* target element
* optional user defined quality value
(to be added as section 7 in the next MIAME version)
List of the control elements used on the EMBL yeast chips:
All fragments have been amplified by PCR using Research Genetics style primers, i.e. they all have the same short invariant regions added by the primers as the yeast ORFs.
The sequences shown below show only the "gene"-specific part. The specific parts of the primers that were used to amplify these fragments are the 20 or so nucleotides at the extremes of the listed sequences.
The following sequences were employed as spiking controls:
P450, FAD6, 6i18, 15b8, 8h10, CAT, LUC
All other control spots were assumed to be negative controls.
Linear regression to compensate between the two channels was done using all control spots (negative and positive).
The chips also contained repeated spots of fluorescently labelled oligonucleotides named "Cy3" or "Cy5". These spots were not included in any data evaluation but are also listed in the data tables.
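The channel compensation described above can be sketched as an ordinary least-squares fit on the control spots. The text states only that linear regression over all control spots was used; the direction of the fit (Cy5 as a function of Cy3), the choice to rescale the Cy5 channel, the function names and the intensity values below are all assumptions made for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def normalise_cy5(cy5_values, slope, intercept):
    """Map Cy5 intensities onto the Cy3 scale by inverting the fitted line."""
    return [(value - intercept) / slope for value in cy5_values]


# Invented control-spot intensities (positive and negative controls together).
control_cy3 = [100.0, 200.0, 400.0, 800.0]
control_cy5 = [250.0, 450.0, 850.0, 1650.0]

slope, intercept = fit_line(control_cy3, control_cy5)

# Apply the correction to (invented) Cy5 values of ordinary gene spots.
corrected = normalise_cy5([450.0, 1050.0], slope, intercept)
```

After such a correction, the control spots have equal intensity in both channels on average, so remaining Cy5/Cy3 differences in the gene spots are easier to attribute to the treatment.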