Minimum information about a microarray experiment - MIAME
Endorsed by the MGED steering committee meeting, November 17, 2000. For archive purposes.
The goal of MIAME is to specify the minimum information that must be reported about a microarray (or any DNA array) based gene expression monitoring experiment in order to ensure the interpretability of the results, as well as their potential verification by third parties. The background aim is to facilitate the establishment of public repositories and a data exchange format for microarray based gene expression data. The MGED group will encourage scientific journals and funding agencies to adopt policies requiring data submission to repositories, once MIAME compliant repositories are established.
Introduction:
The definition of the minimum information is aimed at co-operative data providers; it is not intended to close every loophole by which information might be withheld.
Among the concepts in the definition is a list of 'qualifier, value, source' triplets, where 'source' is either user defined, or a reference to an externally defined ontology or controlled vocabulary, such as the species taxonomy database at NCBI. Where necessary, the authors are encouraged to define their own qualifiers and provide the appropriate values so that the list as a whole gives sufficient information to interpret that particular part of the experiment. The judgement regarding the necessary level of detail is left to the data providers. In the future these 'voluntary' qualifier lists may be gradually substituted by required fields, as the respective ontologies are developed.
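The 'qualifier, value, source' triplet described above can be pictured as a small record. The following is a minimal sketch in Python, not part of the MIAME definition: the class name `Qualifier`, the marker string `USER_DEFINED`, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical marker for qualifiers defined by the data provider
# rather than taken from an external ontology or controlled vocabulary.
USER_DEFINED = "user defined"


@dataclass(frozen=True)
class Qualifier:
    """One 'qualifier, value, source' triplet."""
    qualifier: str
    value: str
    source: str = USER_DEFINED

    def is_user_defined(self) -> bool:
        # True when the source is the provider's own definition.
        return self.source == USER_DEFINED


# Illustrative values only; none of them come from the MIAME text.
annotations = [
    Qualifier("species", "Saccharomyces cerevisiae", "NCBI Taxonomy"),
    Qualifier("growth medium", "YPD"),
]
```

Keeping the source alongside each value makes it possible to see later which 'voluntary' qualifiers already point to an external vocabulary and which are still user defined.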
Parts of the MIAME can be provided as a reference or link to an externally existing description. For instance, for commercial or other standard arrays, all the required information should normally be provided only once by the array provider and referenced by the users. Standard protocols should also normally be provided only once. For every experiment set, either a valid reference or the information itself must be provided.
Definition:
The minimum information about a published microarray based gene expression experiment should include the description of
1. Experimental design: the set of the hybridisation experiments as a whole
2. Array design: each array used and each element (spot) on the array
3. Samples: samples used, the extract preparation and labeling
4. Hybridisations: procedures and parameters
5. Measurements: images, quantitation, specifications
6. Controls: types, values, specifications
The following details should be provided for each array, each sample, hybridisation and measurement in the experiment set:
1. Experimental design: the set of the hybridisation experiments as a whole
a)author (submitter), laboratory, contact information, links (URL)
b)type of the experiment - maximum one line, for instance:
- normal vs. diseased comparison
- treated vs. untreated comparison
- time course
- dose response
- effect of gene knock-out
- effect of gene knock-in (transgenics)
- shock
(multiple types possible)
c)experimental factors (e.g., time, dose, genetic variation)
d)the list of platforms used
e)single or multiple hybridisations
For multiple hybridisations:
- ordered/unordered
- serial (yes/no)
- type (e.g., time course, dose response)
- grouping (yes/no)
- type (e.g., normal vs. diseased, multiple tissue comparison)
- list of the samples and arrays used in the experiment and description of the relationship between them: each sample and each array should be assigned a unique id in the experiment set and all the relationships should be listed with appropriate comments
- which hybridisations are replicates
f)quality related indicators
- does a related peer-reviewed publication exist
- number of replicate hybridisations
- any other quality control steps taken (polyA, non-specific binding etc.)
g)optional user defined "qualifier, value, source" list (see Introduction)
h)a free text description of the experiment set or a link to a publication
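The requirement in 1.e) that each sample and each array be assigned a unique id, and that all relationships between them be listed, lends itself to a mechanical check. The following is a minimal sketch in Python under stated assumptions: the function name `check_experiment_set` and the tuple layout used for a hybridisation are hypothetical and are not prescribed by MIAME.

```python
from collections import Counter


def check_experiment_set(sample_ids, array_ids, hybridisations):
    """Return a list of problems; an empty list means the ids are consistent.

    hybridisations is a list of tuples
    (hybridisation_id, sample_ids_used, array_id, comment),
    a layout assumed here for illustration only.
    """
    problems = []

    # Every sample and array id must be unique within the experiment set.
    for kind, ids in (("sample", sample_ids), ("array", array_ids)):
        for id_, count in Counter(ids).items():
            if count > 1:
                problems.append(f"duplicate {kind} id: {id_}")

    # Every relationship must refer to a declared sample and array.
    known_samples, known_arrays = set(sample_ids), set(array_ids)
    for hyb_id, used_samples, array_id, _comment in hybridisations:
        for sample_id in used_samples:
            if sample_id not in known_samples:
                problems.append(f"{hyb_id}: unknown sample {sample_id}")
        if array_id not in known_arrays:
            problems.append(f"{hyb_id}: unknown array {array_id}")

    return problems
```

For example, `check_experiment_set(["S1", "S2"], ["A1"], [("H1", ("S1", "S2"), "A1", "two-channel comparison")])` returns an empty list, while a repeated sample id or a reference to an undeclared array is reported as a problem.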
2. Array design: each array used and each element (spot) on the array.
a)array
- array design name (e.g., "Stanford Human 10K set")
- platform type: in situ synthesized or spotted
- provider (source)
- surface type: adsorptive/ non-adsorptive
- surface type name
- array dimensions (long, short)
- number of elements on the array
- a reference system allowing each element (spot) to be located on the array (in the simplest case the number of columns and rows is sufficient)
- unique ID from the provider
- production protocol (obligatory if applicable)
- optional "qualifier, value, source" list (see Introduction)
b)element (spot) on the array - elements may be simple, i.e., containing only identical molecules, or composite, i.e., containing different oligonucleotides obtained from the same reference molecule; for each element the following must be given:
- position on the array allowing the spot to be identified in the image (see 5. a) below)
- element type: synthesized oligonucleotides, PCR products, plasmids, colonies, other
- clone information, obligatory for elements obtained from clones:
- clone ID, clone provider, date, availability
- sequence information, obligatory for simple synthetic elements:
- sequence accession number in DDBJ/EMBL/GenBank if known
- sequence itself (if databases do not contain it)
- for composite oligonucleotide elements:
- oligonucleotide sequences, if given
- number of oligonucleotides and the reference sequence (or accession number), otherwise
- approximate lengths if exact sequence not known
- single or double stranded
- element (spot) dimensions (approximate diameter)
- element generation protocol that includes sufficient information to reproduce the element
- attachment (covalent/ionic/other)
- gene name and links to appropriate databases (e.g., SWISS-PROT, or organism specific databases), if applicable
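Section 2.b) makes some element details obligatory only under certain conditions: clone information for elements obtained from clones, and sequence information (an accession number or the sequence itself) for simple synthetic elements. The following is a minimal sketch in Python of how such conditional requirements might be checked. The `Element` class, its field names, and the `missing_fields` helper are assumptions made for illustration, and the record covers only a few of the items listed above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    """A reduced, hypothetical record for one element (spot)."""
    row: int                      # position in the array's reference system
    column: int
    element_type: str             # e.g. "synthesized oligonucleotide", "PCR product"
    composite: bool = False       # True for composite oligonucleotide elements
    from_clone: bool = False
    clone_id: Optional[str] = None
    accession: Optional[str] = None   # DDBJ/EMBL/GenBank accession, if known
    sequence: Optional[str] = None    # the sequence itself, if not in the databases


def missing_fields(e: Element) -> List[str]:
    """List the conditionally obligatory items that are absent."""
    missing = []
    # Clone information is obligatory for elements obtained from clones.
    if e.from_clone and not e.clone_id:
        missing.append("clone information")
    # Sequence information is obligatory for simple synthetic elements.
    simple_synthetic = (
        e.element_type == "synthesized oligonucleotide" and not e.composite
    )
    if simple_synthetic and not (e.accession or e.sequence):
        missing.append("sequence information")
    return missing
```

A synthesized oligonucleotide with neither accession number nor sequence would be reported as missing "sequence information"; supplying either one satisfies the check.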
3. Samples: samples used, extract preparation and labeling
4. Hybridisations: procedures and parameters
laboratory protocol for hybridisation, including:
- the solution (e.g., concentration of solutes)
- blocking agent
- wash procedure
- quantity of labelled target used
- time, concentration, volume, temperature
- description of the hybridisation instruments
optional "qualifier, value, source" list (see Introduction)
5. Measurements: images, quantitation, specifications
a)hybridisation scan raw data:
a1)the scanner image file (e.g., TIFF) from the hybridised microarray scanning
a2)scanning information:
- parsed header of the TIFF file, including laser power, spatial resolution, pixel space, PMT voltage;
- laboratory protocol for scanning, including:
- scanning hardware
- scanning software
b)image analysis and quantitation
b1)the complete image analysis output (of the particular image analysis software) for each element (or composite element - see 2.b), for each channel
b2)image analysis information:
- image analysis software specification and version, availability, and the description of the algorithm
- all parameters
c)summarized information from possible replicates
c1)derived measurement value summarizing related elements as used by the author (this may constitute replicates of the element on the same or different arrays or hybridisations, as well as different elements related to the same entity e.g., gene)
c2)reliability indicator for the value of c1) as used by the author (e.g., standard deviation); may be "unknown"
c3)specification of how c1 and c2 are calculated; the specification should be based on b1
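As an illustration of items c1) to c3), the following minimal Python sketch derives a summary value and a reliability indicator from replicate measurements. MIAME leaves the choice of summary to the author; the mean and the sample standard deviation used here are assumptions chosen only because standard deviation is the example named in c2), and the function name `summarize` is hypothetical.

```python
import statistics


def summarize(values):
    """Return (c1, c2) for replicate measurements taken from the b1 output.

    c1: arithmetic mean of the replicate values (an assumed choice).
    c2: sample standard deviation, or "unknown" when only one
        replicate is available and no deviation can be computed.
    """
    if not values:
        raise ValueError("at least one measurement is required")
    c1 = statistics.fmean(values)
    c2 = statistics.stdev(values) if len(values) > 1 else "unknown"
    return c1, c2
```

The docstring plays the role of c3): it states how c1 and c2 are calculated from the per-element values of b1.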
6. Normalisation controls, values, specifications for hybridisations