Genome Wide Association Studies (GWAS) in Ischemic Stroke Workshop
Sponsored by the National Institute of Neurological Disorders and Stroke
April 30, 2008
The workshop “Genome Wide Association Studies (GWAS) in Ischemic Stroke” was organized by the NINDS (co-chaired by Steve Rich, University of Virginia, and Katrina Gwinn, NINDS) to discuss the shift in science and technology from the study of single-gene stroke disorders to genome-wide, population-based genetic studies of ischemic stroke. This was a “roll up the sleeves” meeting that included representatives from all of the major groups studying the genetics of ischemic stroke, as well as individuals working on other disorders, who shared lessons learned (successes and failures) from their own experiences. Each group studying the genetics of ischemic stroke discussed 1) the number of samples available for a GWAS in its individual collections, 2) the options and limitations for sharing within and outside the group of investigators present, and 3) the extent of the phenotypic data available for endophenotypic analyses. These discussions took place in the context of existing NIH GWAS policy and existing NIH-wide and NINDS-specific resources. From them, a strategy for a concerted gene discovery effort in ischemic stroke emerged, with specific recommendations as outlined below.
Policy Governing and Resources for Genome Wide Association Studies
Dr. Laura Rodriguez (NHGRI) presented the trans-NIH policy for GWAS as a representative of the NIH-wide committee that developed it. As stewards of the Federal investment in biomedical research, we need not only to leverage our investments in disease areas but also to ensure that community resources are created and sharing is promoted, so that questions can be asked and answered in ways that achieve the fullest possible public benefit. Because genotype/phenotype data sets are rich, they contain information suitable for analysis by more than one investigator or team of investigators. Toward the goal of leveraging resources and maximizing public benefit, the NIH-wide GWAS policy sets the expectation that genotype/phenotype information will be shared via a public, centralized repository. For this purpose, dbGaP (database of genotype and phenotype, http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap) has been created to allow standardized sharing of data in a way that protects subjects' and investigators' rights and privacy. Details regarding the policy can be found at http://grants.nih.gov/grants/gwas/.
Dr. Steve Sherry (NCBI), with the assistance of Dr. Mike Feolo, further reviewed the database of Genotype and Phenotype (dbGaP) (http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap). The database has two levels of access: open access, which permits broad release of non-sensitive data, and controlled access, which provides oversight and investigator accountability for sensitive data sets involving personal health information. Summaries of all study documents and data dictionaries are included along with the genotyping and phenotyping data and are made available to investigators via dbGaP. Controlled-access data can be obtained only by a user who has been authorized by the appropriate Data Access Committee (DAC). Each study has a DAC, typically run by the Institute that funded the study, which is responsible for adjudicating requests. To request data, a user submits a Data Use Certification (DUC) to the DAC of the particular study. The DUC must be completed and co-signed by a designated official representing the applicant's institution, and it must include a list of the controlled data set(s) requested and a brief description of the proposed research use of the data.
Dr. Rod Corriveau (Coriell) described the NINDS Human Genetics DNA and Cell Line Repository (http://ccr.coriell.org/Sections/Collections/NINDS/?SsId=10). This resource has banked a large number of samples from investigators studying stroke as well as other neurological disorders.
A lengthy discussion also ensued regarding the total number of samples available for study within and outside the NINDS Repository. The total is over 9,000 cases of ischemic stroke with DNA collected and over 38,000 potential controls. Of note, however, at least one of the studies discussed is a longitudinal cohort study, in which cases and controls would be defined differently than in other study designs.
Experiences of Groups in Other Fields and Outside of the NINDS Portfolio
Drs. Bob Karp (Extramural Program Director, NIDDK) and Gerry Shellenberg (University of Washington) discussed the Inflammatory Bowel Disease (IBD) consortium and the NIH-funded Alzheimer GWAS Consortium, respectively. Dr. Bruce Psaty (University of Washington) discussed efforts in the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium (and its working group on stroke), and Dr. Hugh Markus discussed the Wellcome Trust GWAS effort in ischemic stroke.
Dr. Karp discussed the structure and function of the IBD consortium in particular. He noted that the RFA for that effort did not specify the structure of the steering committee or the publications policy; instead, the consortium developed these itself as a means of self-government once the grants were awarded, an approach that apparently worked well. He also described the value of a data sharing committee. The importance of limiting the phenotype and population was emphasized (in the case of IBD, the phenotype was limited to ileal Crohn’s disease, and Ashkenazi and non-Jewish cases were not pooled).
Dr. Shellenberg discussed the value of post-mortem diagnoses and phenotyping of control subjects, despite the expense, because they allow refinement of the phenotype, which is more likely to lead to true gene discovery. Having brain samples also makes cell lines less necessary because of the amount of DNA that a brain sample yields. For their consortium, a neuropathological group, a biomarker group, a clinical sample group, and an epidemiology group were formed to design specific aspects of the GWAS. He noted that using existing sample collections allowed leveraging of resources, as no additional funding was needed for subject recruitment.
Dr. John Hardy (Queen Square, London) discussed GWAS in Parkinson’s disease and ALS, and his and others' experiences in banking the samples and posting the data publicly. Sharing has been beneficial, not detrimental, to those who have shared samples and data without restrictions, and others were encouraged to share broadly where possible in order to advance science. He and others noted that only through broad sharing will sample sizes reach the numbers needed to make meaningful findings via GWAS, especially in disorders as complex as ischemic stroke.
Dr. Psaty discussed a cohort study that has functioned via a consortium for about two decades. Its organization included committees such as research coordination and analysis committees. He felt that the efforts of the “young mavericks” on the team, each of whom took ownership of a subproject within the study, were of particular value.
Dr. Markus discussed the plans for the GWAS in ischemic stroke that he will lead (funded by the Wellcome Trust). The study will include ischemic stroke only (no hemorrhagic stroke), with an initial screen of 4000 cases and 4000-6000 controls (all Caucasian). Phenotyping will use brain imaging to confirm the diagnosis. Exclusions will include hemorrhage, apparent monogenic disorders, iatrogenic strokes, and postoperative stroke. The study will use TOAST subtyping (Gordon et al 1993, others). The controls are already genotyped and include subjects from Munich as well as Sweden and Krakow. These controls will likely serve as control subjects for other gene discovery projects as well, although the issue of controls is under ongoing discussion as part of this project and at the Wellcome Trust generally. The genotyping will all be done by one Institution in the US (at the Broad Institute). Controls will be genotyped on both platforms, and while an investigator can request a given platform for case genotyping, the request may not be granted. Because these samples are archival, not all will have proper consent for broad sharing; however, sharing is part of the grant agreement, so the data will be made as freely available as possible. A replication plan will be formulated once the initial results are known. Dr. Markus also discussed his interactions with the grassroots “International Stroke Consortium” and underscored his interest in collaboration.
All speakers and participants underscored that trust and transparency are critical to the success of their efforts.
Discussion Highlights (All Participants)
An afternoon was devoted to open discussion by all participants, covering many issues related to moving forward with a GWAS in ischemic stroke. Design considerations and pitfalls, the minimum versus ideal phenotypic variables to be included, and replication planning were all discussed. Also mentioned were the future role of the NINDS repository and that of the NINDS itself in moving the field forward. The general view was that, even with the current and planned GWAS efforts in ischemic stroke, further action is needed and timely, and a coordinated plan for it is essential. The NINDS can play a vital role in this ongoing and future area of scientific endeavor.
SUMMARY AND RECOMMENDATIONS
It has become clear that the NINDS needs a strategy for dealing with genome-wide association studies (GWAS), especially in the more common neurological disorders such as ischemic stroke. These studies carry a large price tag, and while technology costs are falling, these applications will continue to have high costs related to sample and data management and analysis. Perhaps even more importantly, the larger the sample size, the more likely a GWAS is to succeed in identifying true risk genes in common, complex diseases. In ischemic stroke, multiple investigators have private sample and data collections, as well as banked collections in the NINDS repository. If all of these investigators worked together, then, first, the power of a given study would be dramatically increased and, second, the Institute would save costs through the sharing of genotype resources. Finally, this could allow data sharing and pooling, including via dbGaP (the NCBI/NLM database for public sharing of genotype/phenotype SNP data), which would enable secondary analyses and be hypothesis generating.
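The sample-size argument above can be illustrated with a rough power calculation. The following is a minimal sketch only, not a method presented at the workshop: it assumes a simple allelic (two-proportion) test with a normal approximation, and the control allele frequency (0.3), per-allele odds ratio (1.2), and genome-wide significance threshold (5e-8) are illustrative assumptions. Only the pooled case and control counts echo the approximate totals mentioned earlier in this summary.

```python
from statistics import NormalDist


def case_allele_freq(control_freq: float, odds_ratio: float) -> float:
    """Risk-allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_test_power(n_cases: int, n_controls: int, control_freq: float,
                       odds_ratio: float, alpha: float = 5e-8) -> float:
    """Approximate power of a two-sided allelic test (normal approximation).

    Assumptions of this sketch: each subject contributes two independent
    alleles, and the negligible opposite tail of the test is ignored.
    """
    normal = NormalDist()
    p_case = case_allele_freq(control_freq, odds_ratio)
    # Standard error of the difference in allele frequencies (2 alleles per subject).
    se = (p_case * (1.0 - p_case) / (2 * n_cases)
          + control_freq * (1.0 - control_freq) / (2 * n_controls)) ** 0.5
    z_effect = abs(p_case - control_freq) / se
    z_crit = -normal.inv_cdf(alpha / 2.0)  # two-sided critical value
    return normal.cdf(z_effect - z_crit)


if __name__ == "__main__":
    # Hypothetical single collection versus a pooled collection.
    for n_cases, n_controls in [(1000, 1000), (9000, 38000)]:
        power = allelic_test_power(n_cases, n_controls,
                                   control_freq=0.3, odds_ratio=1.2)
        print(f"{n_cases} cases / {n_controls} controls: power ~ {power:.3f}")
```

Under these assumed parameters, a single collection of about a thousand cases has very little power at a genome-wide threshold, while a pooled collection of the size discussed at the workshop has high power. Real studies would also need to account for phenotypic heterogeneity, population structure, and differences in study design, none of which this sketch models.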
Some general comments and recommendations that were made included:
NIH GWAS Policy and Information, http://grants.nih.gov/grants/gwas/
dbGaP (database of genotype and phenotype), http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap
NINDS Human Genetics Repository, http://ccr.coriell.org/Sections/Collections/NINDS/?SsId=10
Last updated July 21, 2008