# Chapter 5: Software and Cloud Architecture

1. **Intended audience(s):**

   **A. CTSA hub leaders** (strategic recommendations for the use of cloud computing and reusable software resources at the hub and network levels)

   **B. Clinical and translational scientists** (project-level recommendations for the use of cloud computing and reusable software resources to meet individual needs and enhance the reproducibility, rigor, and shareability of research products)

   **C. Informatics and technology solution providers** (technical recommendations on how to access and use CD2H-provisioned cloud computing resources and reusable software components)

2. **Current version / status:**

   **A. Last revision:** 12/18/2019

   **B. Status:** draft, outline

3. **Lessons learned / summary:**

   **A. Mission and purpose of the CD2H Tool and Cloud Community Core:** Computational technologies and tools are vital to clinical and translational research; however, CTSA hubs currently develop, deploy, and manage these key resources independently. As a result, these processes are tedious, costly, and heterogeneous. This core will address these issues by establishing a common tool and cloud computing architecture, and will provide CTSA hubs with an affordable, easy-to-use, and scalable deployment paradigm. Such an approach will support a robust ecosystem that demonstrates the use of shared tools and platforms for the collaborative analysis of clinical data. Hubs can easily promote and deploy their own products as well as adopt others, thereby transcending long-standing "boundaries" and solving common and recurring information needs.

   **B. Value and vision:**

   **C. Dimensions of tool and cloud architecture and capabilities:**

   ![cloud_diagram](../_static/img/chapter_5_cloud_diagram.png)

      i. **Cloud hosting** for software applications and platforms, leveraging the Amazon Web Services (AWS) environment managed by NCATS and provisioned by CD2H

      ii. **Tool registry** to assist in the sharing and quality assurance of shared software components developed by CTSA hubs **\<LINK TO SLIDES RE: TOOL REGISTRY PROJECT\>**

      iii. **Build and test framework** for collaborative software development projects

      iv. **Sandboxes** to provide spaces for informatics-focused workgroups seeking solutions to shared data analytic and management challenges

      v. **Benchmarking** of algorithms and predictive models using the Challenge framework

4. **Status and feedback mechanisms:**

   A. **CD2H cloud hosting architecture** (v1.0) currently available for community feedback and comments:

      1. [CD2H-NCATS Cloud Architecture proposal](https://docs.google.com/presentation/d/1O8C0Kj5AtX-69C0eY79zaftAQFPYAWAELAZ2Y7-vnnA/edit#slide=id.g5e2ce0d5ce_5_0)
      2. [Architecture Response Form](https://docs.google.com/forms/d/e/1FAIpQLScVXPr_wPDVDdbxn4NXCOPVVXnN2rzfMjtrPle6DZjr2jPlIw/viewform?vc=0&c=0&w=1&usp=mail_form_link)

   B. **CD2H cloud resource request "intake" form** (process for requesting access to CD2H-provisioned cloud infrastructure)

      i. [Cloud resource request intake form](https://forms.gle/YdZHUSR9NT2ktt1EA)

      ii. Cloud deployment projects dashboard (under development)

   C. **Prototype shared tools** deployed using NCATS/CD2H cloud resources or other Tool and Cloud Community Core capabilities:

      i. [Competitions](http://competitions.cd2h.org/) (peer review and competitive application management)

      ii. [Leaf](http://rit.uw.edu/leaf) (platform-agnostic clinical data browser)

   D. Program-wide **CD2H tool registry**

      i. [CD2H Labs](http://labs.cd2h.org/labs/)

   E. **Benchmarking projects** leveraging the Challenge framework:

      i. [Metadata Challenge](http://synapse.org/metadatachallenge) (sharing of cancer-focused datasets)

      ii. [EHR Challenge](http://synapse.org/ehr_dream_challenge_mortality) (mortality prediction)

5. **Takeaway list:**

   A. Create a common cloud computing architecture that can enable the rapid deployment and sharing of reusable software components by CTSA hubs

   B. Demonstrate the use of shared tools and platforms for the collaborative analysis of clinical data in a manner that transcends individual CTSA hub "boundaries"

   C. Disseminate a common set of tools that can be employed for both the local and collaborative query of common data warehousing platforms and underlying data models

   D. Pilot the "cloudification" of software artifacts that can be shared across CTSA hubs to address common and recurring information needs

6. **Deep dive into takeaways:**

   A. [CD2H-NCATS Cloud Deployment Checklist](https://docs.google.com/presentation/d/1rVAgHFmiKszxF-_VJLvY9JK91Lg3IjwAV8kM78qzuX4/edit?usp=sharing)

   B. [CD2H-NCATS Cloud Deployment Process Workflow](https://docs.google.com/presentation/d/1GYGgSbglIuHxAd0qkYRXbcWL4g1jmB-N-gMlQoYQMIc/edit?usp=sharing)

   C. [CD2H-NCATS Architecture Design Proposal](http://bit.ly/cd2h-cloud-rfc)

   D. [CD2H-NCATS Architecture Request for Feedback Form](https://docs.google.com/forms/d/e/1FAIpQLScVXPr_wPDVDdbxn4NXCOPVVXnN2rzfMjtrPle6DZjr2jPlIw/viewform?vc=0&c=0&w=1&usp=mail_form_link)

   E. [CD2H-NCATS Federated Authentication (UNA) Overview](https://drive.google.com/open?id=1DclEZEwvEasCX0QfBeJZOTlRB0VYCoOQ)

   F. Code and documentation repositories for ongoing Tool and Cloud Community Core projects:

      1. [Tool-Cloud-Infrastructure Core GitHub repo](https://github.com/data2health/tools-cloud-infrastructure)
      2. [Cloud-Tool-Architecture project GitHub repo](https://github.com/data2health/Cloud-Tool-Architecture)
      3. [Competitions project GitHub repo](https://github.com/data2health/competitions-project)
      4. [EHR Dream Challenge project GitHub repo](https://github.com/data2health/DREAM-Challenge)

7. **Acknowledgements**

   1. \<LIST CLOUD CORE PARTICIPANTS\>

## Cloud collaboration software

## Cloud architecture

## Software best practices