876 days ago
Unfiled. Edited by Nadia Kovalevskaya 876 days ago
  • Open sharing of, and easy access to, DNA sequencing data from clinical samples are limited by privacy concerns and interoperability difficulties. This lack of sharing impedes progress in genomics research, affecting all genetic disease research from cancer to rare diseases.
  • DNAdigest was founded in 2013 to promote and enable open access, interoperability and secure sharing of genomics data for research. We are developing a portal for custom querying of genomics data repositories, reducing the time and effort researchers spend on discovering, accessing and processing genomics data.
1205 days ago
Unfiled. Edited by Matthew Cockerill 1205 days ago
Discussion before lunch
1323 days ago
Unfiled. Edited by Fiona Nielsen 1323 days ago
The topic of data sharing in genomics has been brought up at multiple conferences and meetings. For a great summary, have a look at this 2012 report from the Workshop on Establishing a Central Resource of Data from Genome Sequencing Projects (PDF) 
1329 days ago
Unfiled. Edited by Adrian Alexa 1329 days ago
- Unique identifier (ID) for results, based on:
-- dataset (the ID changes whenever new data is added)
-- query
-- caching results. Important: do not delete cached results after [x] timeframe, but stop returning them after [x] timeframe unless the search query states otherwise.
-- if there are multiple versions of the same individual, use the most recent entry, determined by timestamp/last modification.
- Include the top three contributors (both as a citation source and as an incentive rank)
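The ID and caching notes above can be sketched roughly as follows. This is only an illustration under assumptions, not something agreed at the hack day: the function and class names (`result_id`, `ResultCache`, `latest_per_individual`), the use of SHA-256, and the record fields `individual_id` and `last_modified` are all hypothetical.

```python
import hashlib
import json
import time


def result_id(dataset_version: str, query: dict) -> str:
    """Derive a result ID from the dataset version and the query.

    Assumption: the dataset exposes a version string that changes
    whenever new data is added, so the ID changes with it.
    """
    # Sort keys so the same query always serializes identically.
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    payload = f"{dataset_version}\n{canonical}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


class ResultCache:
    """Keeps every cached result, but hides stale ones by default."""

    def __init__(self, max_age_seconds: float):
        self.max_age_seconds = max_age_seconds
        self._entries = {}  # result ID -> (stored_at, result)

    def put(self, rid, result, now=None):
        stored_at = time.time() if now is None else now
        self._entries[rid] = (stored_at, result)

    def get(self, rid, allow_stale=False, now=None):
        entry = self._entries.get(rid)
        if entry is None:
            return None
        stored_at, result = entry
        now = time.time() if now is None else now
        if now - stored_at > self.max_age_seconds and not allow_stale:
            return None  # still stored, just not returned
        return result


def latest_per_individual(records):
    """Keep only the most recently modified record per individual."""
    latest = {}
    for rec in records:
        key = rec["individual_id"]
        if key not in latest or rec["last_modified"] > latest[key]["last_modified"]:
            latest[key] = rec
    return list(latest.values())
```

Here `allow_stale=True` stands in for "unless otherwise stated within the search query": an expired result is never deleted, only withheld unless the caller asks for it.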
1335 days ago
Unfiled. Edited by Dev Kumar 1335 days ago
API Design
  • Identify and create relevant endpoints
  • Determine format for data response (e.g. design the JSON)
  • Encapsulate query functionality so it can be generalized across institutions
  • Process aggregated data, or aggregate processed data (depending on design choice)
  • Although the data may change, can we cache query results? Can they then be re-run occasionally?
  • How can we authenticate queries at all levels (data security)?
  • Will you restrict the number of open queries per user?
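To make a few of the API questions above concrete, here is a minimal sketch of one possible shape. Everything in it is an assumption for illustration: the `InstitutionBackend` interface, the JSON fields (`query`, `total`, `per_institution`), and the `OpenQueryLimiter` are hypothetical names, and the design choice shown (each institution returns only an aggregate count, which is then summed) is just one of the two options listed.

```python
import json
from dataclasses import dataclass, asdict
from typing import Protocol


class InstitutionBackend(Protocol):
    """Hypothetical interface each institution would implement."""
    name: str

    def count_matches(self, query: dict) -> int: ...


@dataclass
class InMemoryBackend:
    """Toy backend holding records in memory, for illustration only."""
    name: str
    records: list

    def count_matches(self, query: dict) -> int:
        # A record matches when every query field equals the record's value.
        return sum(
            all(rec.get(k) == v for k, v in query.items())
            for rec in self.records
        )


@dataclass
class QueryResponse:
    query: dict
    total: int
    per_institution: dict


def run_query(query: dict, backends) -> QueryResponse:
    """Ask every institution for an aggregate count and combine them."""
    counts = {b.name: b.count_matches(query) for b in backends}
    return QueryResponse(query=query, total=sum(counts.values()),
                         per_institution=counts)


def to_json(response: QueryResponse) -> str:
    return json.dumps(asdict(response), sort_keys=True)


class OpenQueryLimiter:
    """Caps the number of open queries per user."""

    def __init__(self, max_open: int):
        self.max_open = max_open
        self._open = {}  # user -> number of open queries

    def acquire(self, user: str) -> bool:
        if self._open.get(user, 0) >= self.max_open:
            return False
        self._open[user] = self._open.get(user, 0) + 1
        return True

    def release(self, user: str) -> None:
        if self._open.get(user, 0) > 0:
            self._open[user] -= 1
```

Because only counts leave each backend in this sketch, no individual-level records cross institutional boundaries. Authentication is deliberately left out, since the notes do not settle how it should work.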
User Interface/Experience
  • How do we evidence impact of a dataset for both data contributors and institutions?
1336 days ago
Unfiled. Edited by Sherif Maktabi 1336 days ago
We started by thinking through some of the challenges to getting good data and who the end users might be. One of the big issues is that long-term data entry and storage is a secondary concern to just doing your work. Other major concerns come from vetting the data: do you trust the sources you're looking at? How do we build trust by evidencing the people and institutions that are using a dataset?
There are too many user types to cover in a one-day workshop, so we focused on academic researchers and broke down the types of metadata they would want to search against and would need to see in order to vet the data. Search criteria brought up an interesting point: as there are several types of sequencers and alignment tools, the metadata and the UI need to allow researchers to define the pipeline(s) that they're using to derive their answers.
We started sketching out (roughly) some of the types of things that would need to be included in a search (output) UI. That led to a variety of questions about how the results might be visualized, and how we would understand the source of the data.

Hack Day November 2013 Feed
