Leveraging cloud and big data technologies in large genomics projects

Speaker Name: 
Brian O’Connor
Speaker Organization: 
Ontario Institute for Cancer Research
Start Time: 
Thursday, September 3, 2015 - 3:30pm
End Time: 
Thursday, September 3, 2015 - 4:30pm
Location: 
599 Engineering 2
Organizer: 
Linda Rosewood

Advances in sequencing technologies have spurred the ever-increasing production of valuable, large-scale omics datasets. The ICGC project’s latest release, for example, consists of data from over 12,979 cancer genomes, and the project is on track to analyze over 25,000 tumour genomes by 2018. Likewise, technical advances in so-called Big Data frameworks and Cloud infrastructures have been instrumental in streamlining the analysis of large volumes of sequence data. In this talk I explore the technologies built by OICR and our partners to enable the creation of computational workflows, cloud-based systems for processing large numbers of genomes, and tools for visualizing and searching analytical results. These systems have been used by several projects, including the ICGC/TCGA PanCancer project, a collaborative effort to consistently analyze over 2,800 cancer genomes. We have released the tools built in this process (SeqWare, CloudBindle, and Consonance) as open source projects for use by others in the community.