
DATA SCIENCE COURSE

CATALOG
DATA SCIENCE @NLM TRAINING PROGRAM

This document is confidential and intended solely for the client to whom it is addressed.

Updated March 2019


DATA SCIENCE @NLM TRAINING PROGRAM

Welcome Letter from Dr. Brennan


I am delighted to present NLM staff with this Data Science
Course Catalog – a product of the Data Science @NLM
Training Program. This Course Catalog is a testament to my
promise that I will provide NLM staff with the resources and
opportunities to engage with our data science initiatives and
to support the future of our service-minded organization.
Our NLM-wide commitment to become the platform for
data-powered health starts with our workforce, our most
important asset.

Specifically, this Data Science Course Catalog is designed to
enhance our collective understanding of data science and
expand our ability to apply new ways of thinking across a
variety of data science skillsets (described here as
competencies). Each staff member should explore the wide
variety of resources in this Course Catalog, which will enable
growth in the data science knowledge, skills, and abilities
that are needed to foster and sustain a successful data
science ecosystem at NLM.

Courses included here are accessible to all NLM staff and
available across various modes of delivery (i.e., self-paced,
instructor-led, in-person, and virtual). A majority are
available at no cost, but some trainings do require payment. I
encourage all staff to engage their immediate supervisor(s)
in determining the most appropriate path forward.

With this, I challenge each of you to join me on this exciting
journey, a journey focused on personal development and
growth, as we work towards a greater understanding of how
we all play a role in NLM’s data science ecosystem.

Thank you.

Dr. Patricia Brennan
Director, NLM

DATA SCIENCE @NLM TRAINING PROGRAM
DATA SCIENCE COURSE CATALOG
Disclaimers
The Data Science @NLM Training Program Data Science Course
Catalog serves as a reference for:

1. Courses that are currently offered, or have previously been
offered (as of March 2019), and made accessible to
staff through NLM’s network of training resources

2. Courses that exist outside of NLM’s network but have
been reviewed and recommended to staff as
supplemental opportunities, which complement and
expand upon NLM’s current offerings

While Booz Allen Hamilton makes every effort to present
accurate and reliable information, Booz Allen does not
endorse, approve, or certify such information, nor does it
guarantee the accuracy, completeness, efficiency, timeliness,
or correct sequencing of such information, including pricing,
which is subject to change. Furthermore, Booz Allen has not
evaluated the effectiveness of these trainings and courses.

Please reach out to the Data Science @NLM Training Program
team with any corrections, additions, or questions.

This Data Science Course Catalog is up-to-date as of March 2019.

Table of Contents
DATA SCIENCE @NLM TRAINING PROGRAM
    Welcome Letter from Dr. Brennan

DATA SCIENCE @NLM TRAINING PROGRAM DATA SCIENCE COURSE CATALOG
    Disclaimers
    Table of Contents

HOW TO USE THIS DATA SCIENCE COURSE CATALOG
    Administrative Guidelines
    Competencies and Proficiencies
    Data Science Course Gap Analysis

TECHNICAL DATA SCIENCE COURSES
    Advanced Mathematics
    Computer Science
    Data Mining & Integration
    Data Visualization
    Database Science
    Machine Learning
    Operations Research
    Programming and Scripting
    Research Design
    Statistical Modeling

ADDITIONAL DATA SCIENCE TRAININGS
    Advanced Learning Options
    Professional Development Conferences
    Advanced Data Science Webinars

APPENDIX
    Resources
    Notes
HOW TO USE THIS DATA SCIENCE COURSE CATALOG
Administrative Guidelines
Data Science Course Catalog Layout
The Course Catalog comprises two sections: (1) Technical Data Science Courses and (2) Other
Courses, which may cover multiple data science competencies. The Technical Data Science Courses are
organized by primary data science competency, and the competencies are arranged alphabetically (i.e.,
Advanced Mathematics, Computer Science, Data Mining & Integration, etc.). Each course listing has
several sections that provide you with specific information:

Course Example
• Title - Some courses are labeled with an icon indicating that
the course exists outside of HHS / NIH / NLM and NLM’s
LinkedIn Learning subscription service
• Description - A synopsis of the learning objectives and topics
covered in the course
• Competency - The main data science skillset(s) covered in
the course (see pg. 6 for competency definitions)
• Proficiency Level – An indication of how a competency is
applied at a specific level
• Training Provider - An identification of the host
organization with links to the platform’s main landing page
and to the specific course
• Course Delivery Format - The method of delivery or
instruction for the training course
• Course Prerequisites or Preferred Prior Knowledge -
Some courses recommend or require prior knowledge,
experience, or coursework completed
• Cost – Some courses require a fee to register for the course.
o For example, Coursera courses are free to audit, but there is
a fee to receive a certificate of completion

The Other Courses included in this Catalog are not assigned to any
one particular competency because they traverse multiple
competencies or address data science leadership practices that are
necessary in building data science teams and a data-driven workforce.

Data Science Individual Training Plans
Your Individual Training Plan contains suggested data science training courses, grouped by technical
competency and proficiency level, to help you achieve your data science skill development goals. All
courses listed in your Individual Training Plan can be found in this Course Catalog and are organized
by competency.

Time Commitment
The time commitment will depend on your chosen Skill Development Profile, your results from the
Data Science Readiness Survey, and the specific trainings listed in your Individual Training Plan, but
each of the courses featured in this Course Catalog takes at least 30 minutes to complete. Please work
with your immediate supervisor(s) to allot your time accordingly.

HHS Courses: Courses included from the Department of Health and Human Services Learning
Portal can be accessed by manually searching for the course title in the search bar.


Competencies and Proficiencies
COMPETENCY | DEFINITION
Advanced Mathematics | Understands and applies mathematical techniques, concepts, and theory (e.g., discrete math, matrix computation) to address data science problems.
Computer Science | Demonstrates the relationship between the scientific and practical approaches to computation and its applications. Creates technical environments in which data-driven hypotheses are tested and applied.
Data Mining & Integration | Employs an interdisciplinary approach to break down data sets into usable information and to discover new patterns and behaviors (e.g., cluster analysis, anomaly detection, dependencies).
Data Visualization | Designs and showcases visual representations of data findings via visualization tools (e.g., Flare, Google Visualization API) to ensure that core business users understand the findings and to facilitate corporate decision-making.
Database Science | Applies knowledge of special-purpose programming languages such as Structured Query Language (SQL) to design and manage relational and non-relational database systems.
Machine Learning | Demonstrates the ability to use machines (i.e., computers) to develop and improve algorithms without being explicitly programmed to improve their own performance through artificial intelligence.
Operations Research | Employs operations research (OR) theory, mindset, and techniques, such as LP, IP, GP, logistics, inventory control and advanced modelling, to arrive at optimal or near-optimal solutions to complex decision-making problems.
Programming and Scripting | Creates, modifies, and tests computer code, forms, and script to ensure operability of applications. Analyzes user needs to recommend software solutions and designs. Uses programming language to develop and write computer programs to store, locate, and retrieve specific documents, data, and information.
Research Design | Creates, modifies, and tests computer code, forms, and script to ensure operability of applications. Analyzes user needs to recommend software solutions and designs. Uses programming language to develop and write computer programs to store, locate, and retrieve specific documents, data, and information.
Statistical Modeling | Understands and applies mathematical techniques, concepts, and theory (e.g., discrete math, matrix computation) to address data science problems.

PROFICIENCY | DEFINITION
Comprehension | You understand the fundamental concepts, activities, or processes associated with this competency.
Basic | You have begun to apply knowledge of fundamental concepts, activities, or processes associated with this competency to work activities. You can demonstrate some parts of this competency after being given specific instructions or guidance.
Foundational | You can perform work in activities requiring this competency, often under the supervision of others. You can demonstrate this competency after being given specific instructions and guidance, and you can engage in general conversation about this competency.
Full Performance | You can perform work in this competency independently. You can demonstrate this competency in straightforward and routine situations and can contribute new ideas in applying this competency.
Expert | You are regarded as having mastered this competency and can lead or teach others in this area. Others view you as a role model and may consult with you for assistance or guidance with work requiring this competency.



Data Science Course Gap Analysis

Gap Analysis Process


1. Reviewed 138 courses offered to NLM employees and assigned each course proficiency levels
across data science competencies. These courses were then placed in the Data Science Course
Catalog.

2. Added 82 courses to the Catalog to ensure that courses covering all proficiency levels
across competencies were included.

3. 220 total courses are featured in the Data Science Course Catalog.

*These training courses are not meant to be exhaustive.



TECHNICAL DATA SCIENCE COURSES

Advanced Mathematics
Data Science Math Skills

Data science courses contain math; there is no avoiding that! This course is designed to teach
learners the basic math you will need to be successful in almost any data science math course
and was created for learners who have basic math skills but may not have taken algebra or
pre-calculus. Data Science Math Skills introduces the core math that data science is built upon,
with no extra complexity, introducing unfamiliar ideas and math symbols one-at-a-time.

Learners who complete this course will master the vocabulary, notation, concepts, and algebra
rules that all data scientists must know before moving on to more advanced material.

Competency
Advanced Mathematics

Proficiency Level
Comprehension

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
In-person, instructor-led, & virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to Audit

Introduction to Ordinary Differential Equations

In this introductory course on Ordinary Differential Equations, we first provide basic
terminology for the theory of differential equations and then proceed to methods of solving
various types of ordinary differential equations. We handle first order differential equations
and then second order linear differential equations. We also discuss some related concrete
mathematical modeling problems, which can be handled by the methods introduced in this course.

Competency
Advanced Mathematics

Proficiency Level
Comprehension

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
In-person, instructor-led, & virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to Audit

Mathematics for Computer Science

Welcome to Introduction to Numerical Mathematics. This is designed to give you part of the
mathematical foundations needed to work in computer science in any of its strands, from business
to visual digital arts, music, games. At any stage of problem solving and modeling you will
require numerical and computational tools. We get you started in binary and other number bases,
some tools to make sense of
sequences of numbers, how to represent space numerically using coordinates, how to study
variations of quantities via functions and their graphs. For this we prepared computing and
everyday life problems for you to solve using these tools, from sending secret messages to
designing computer graphics.

If you wish to take it further, you can join the BSc Computer Science degree and complete the
full module “Numerical Mathematics.”

Competency
Advanced Mathematics

Proficiency Level
Comprehension

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
In-person, instructor-led, & virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to Audit

Bayesian Statistics: From Concept to Data Analysis

This course introduces the Bayesian approach to statistics, starting with the concept of
probability and moving to the analysis of data. We will learn about the philosophy of the
Bayesian approach as well as how to implement it for common types of data. We will compare the
Bayesian approach to the more commonly-taught Frequentist approach and see some of the benefits
of the Bayesian approach. The Bayesian approach allows for better accounting of uncertainty,
results that have more intuitive and interpretable meaning, and more explicit statements of
assumptions. This course combines lecture videos, computer demonstrations, readings, exercises,
and discussion boards to create an active learning experience. For computing, you have the
choice of using Microsoft Excel or the open-source, freely available statistical package R, with
equivalent content for both options. The lectures provide some of the basic mathematical
development as well as explanations of philosophy and interpretation. Completion of this course
will give you an understanding of the concepts of the Bayesian approach, understanding the key
differences between Bayesian and Frequentist approaches, and the ability to do basic data
analyses.

Competency
Advanced Mathematics

Proficiency Level
Basic

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
In-person, instructor-led, & virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to Audit

Business Analytics Foundations: Descriptive, Exploratory, and Explanatory Analytics

Business analytics allows us to learn from the past and make better predictions for the future.
There are three types of analytics used for learning from the past. Descriptive analytics
summarizes historical data; exploratory analytics uncovers hidden patterns; and explanatory
analytics reveals the reasons for business results. Each type encompasses a different set of
tools, technologies, processes, and best practices to derive insights from data. This course by
Kumaran Ponnambalam explains why they matter and how and when to use them.

Competency
Advanced Mathematics

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
In-person, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Exploratory Data Analysis

This course covers the essential exploratory techniques for summarizing data. These techniques
are typically applied before formal modeling commences and can help inform the development of
more complex statistical models. Exploratory techniques are also important for eliminating or
sharpening potential hypotheses about the world that can be addressed by the data. We will cover
in detail the plotting systems in R as well as some of the basic principles of constructing data
graphics. We will also cover some of the common multivariate statistical techniques used to
visualize high-dimensional data.

Competency
Advanced Mathematics

Proficiency Level
Basic

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

Introduction to Linear Models and Matrix Algebra

Matrix Algebra underlies many of the current tools for experimental design and the analysis of
high-dimensional data. In this introductory data analysis course, we will use matrix algebra to
represent the linear models that are commonly used to model differences between experimental
units. We perform statistical inference on these differences. Throughout the course we will use
the R programming language. Given the diversity in educational background of our students we
have divided the series into seven parts. You can take the entire series or individual courses
that interest you. If you are a statistician you should consider skipping the first two or three
courses; similarly, if you are a biologist you should consider skipping some of the introductory
biology lectures. Note that the statistics and programming aspects of the class ramp up in
difficulty relatively quickly across the first three courses. By the third course we will be
teaching advanced statistical concepts such as hierarchical models, and by the fourth, advanced
software engineering skills, such as parallel computing and reproducible research concepts.

Competency
Advanced Mathematics

Proficiency Level
Basic

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led
Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit, $49 for certificate

Advanced Linear Models for Data Science 1: Least Squares

Welcome to the Advanced Linear Models for Data Science Class 1: Least Squares. This class is an
introduction to least squares from a linear algebraic and mathematical perspective.

Competency
Advanced Mathematics

Proficiency Level
Foundational

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

Advanced Linear Models for Data Science 2: Statistical Linear Models

Welcome to the Advanced Linear Models for Data Science Class 2: Statistical Linear Models. This
class is an introduction to least squares from a linear algebraic and mathematical perspective.
Before beginning the class, make sure you have the following:
• A basic understanding of linear algebra and multivariate calculus
• A basic understanding of statistics and regression models
• At least a little familiarity with proof-based mathematics
• Basic knowledge of the R programming language

After taking this course, students will have a firm foundation in a linear algebraic treatment
of regression modeling. This will greatly augment applied data scientists' general understanding
of regression models.

Competency
Advanced Mathematics

Proficiency Level
Foundational

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

Elementary Calculus I

This course is an introduction to calculus and is aimed at students who have not taken calculus
in their previous education. This course will begin with a review of pre-calculus topics,
including functions and algebra, which are then used as the groundwork for exploring the core
topics of limits, continuity, differentiation, and integration. Where possible, problems
considered in this class will be of a biological nature, and problem sets will be available to
promote understanding.

Competency
Advanced Mathematics
Proficiency Level
Foundational

Training Provider
The Foundation for Advanced Education in the Sciences (FAES) (main training landing page)
The Foundation for Advanced Education in the Sciences (FAES) (specific course description)

Course Delivery Format
In-person, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
$504

Geometric Algorithms

In many areas of computer science such as robotics, computer graphics, virtual reality, and
geographic information systems, it is necessary to store, analyze, and create or manipulate
spatial data. This course deals with the algorithmic aspects of these tasks: we study techniques
and concepts needed for the design and analysis of geometric algorithms and data structures.
Each technique and concept will be illustrated on the basis of a problem arising in one of the
application areas mentioned above.

Competency
Advanced Mathematics

Proficiency Level
Foundational

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
In-person, instructor-led, & virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

Mathematics for Machine Learning

For a lot of higher-level courses in Machine Learning and Data Science, you find you need to
freshen up on the basics in mathematics, stuff you may have studied before in school or
university, but which was taught in another context, or not very intuitively, such that you
struggle to relate it to how it’s used in Computer Science. This specialization aims to bridge
that gap, getting you up to speed in the underlying mathematics, building an intuitive
understanding, and relating it to Machine Learning and Data Science.

In the first course on Linear Algebra, we look at what linear algebra is and how it relates to
data. Then we look through what vectors and matrices are and how to work with them.

The second course, Multivariate Calculus, builds on this to look at how to optimize fitting
functions to get good fits to data. It starts from introductory calculus and then uses the
matrices and vectors from the first course to look at data fitting.

The third course, Dimensionality Reduction with Principal Component Analysis, uses the
mathematics from the first two courses to compress high-dimensional data. This course is of
intermediate difficulty and will require basic Python and NumPy knowledge.

At the end of this specialization, you will have gained the prerequisite mathematical knowledge
to continue your journey and take more advanced courses in machine learning.

Competency
Advanced Mathematics
Machine Learning

Proficiency Level
Foundational
Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
In-person, instructor-led, & virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

Mathematics for Machine Learning: Multivariate Calculus

This course offers a brief introduction to the multivariate calculus required to build many
common machine learning techniques. We start at the very beginning with a refresher on the “rise
over run” formulation of a slope, before converting this to the formal definition of the
gradient of a function. We then start to build up a set of tools for making calculus easier and
faster. Next, we learn how to calculate vectors that point uphill on multidimensional surfaces
and even put this into action using an interactive game. We look at how we can use calculus to
build approximations to functions, as well as helping us to quantify how accurate we should
expect those approximations to be. We also spend some time talking about where calculus comes up
in the training of neural networks, before finally showing you how it is applied in linear
regression models. This course is intended to offer an intuitive understanding of calculus, as
well as the language necessary to look concepts up yourself when you get stuck. Hopefully,
without going into too much detail, you’ll still come away with the confidence to dive into some
more focused machine learning courses in future.

Competency
Advanced Mathematics

Proficiency Level
Foundational

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
In-person, instructor-led, & virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to Audit

Probability: Basic Concepts & Discrete Random Variables

Our capacity to collect and store data has exponentially increased, but deriving information
from data from a scientific perspective requires a foundational knowledge of probability.

Are you interested in a career in the emerging data science field, or as an actuarial scientist?
Or do you want to better understand statistical theory and mathematical modeling?

In this statistics and data analysis course, we will introduce mathematical probability to help
meet your career goals in the exciting new areas becoming known as information science.

In this course, we will first introduce basic probability concepts and rules, including Bayes’
theorem, probability mass functions and CDFs, joint distributions and expected values.

Then we will discuss a few important probability distribution models with discrete random
variables, including Bernoulli and Binomial distributions, Geometric distribution, Negative
Binomial distribution, Poisson distribution, Hypergeometric distribution and discrete uniform
distribution.

Competency
Advanced Mathematics
Proficiency Level consider skipping the first two or three courses,
Foundational similarly, if you are biologists you should consider
skipping some of the introductory biology lectures.
Training Provider Note that the statistics and programming aspects of
edX (main training landing page) the class ramp up in difficulty relatively quickly
edX (specific course description) across the first three courses. By the third course will
be teaching advanced statistical concepts such as
Course Delivery Format hierarchical models and by the fourth advanced
Virtual, instructor-led software engineering skills, such as parallel
computing and reproducible research concepts.
Course Prerequisites or Preferred Prior
Knowledge Competency
N/A Advanced Mathematics

Cost Proficiency Level


Free to audit, $49 for Verified Certificate Full Performance

Training Provider
High-Dimensional Data Analysis edX (main training landing page)
edX (specific course description)
If you’re interested in data analysis and
interpretation, then this is the data science course for Course Delivery Format
you. We start by learning the mathematical Virtual, self-paced
definition of distance and use this to motivate the use
of the singular value decomposition (SVD) for Course Prerequisites or Preferred Prior
dimension reduction and multi-dimensional scaling Knowledge
and its connection to principle component analysis. Experiences with Statistical Inference and Modeling
We will learn about the batch effect: the most for High-throughput or basic programming and
challenging data analytical problem in genomics introduction to linear algebra
today and describe how the techniques can be used
to detect and adjust for batch effects. Specifically, we Cost
will describe the principal component analysis and Free to audit, $49 for Verified Certificate
factor analysis and demonstrate how these concepts
are applied to data visualization and data analysis of
high-throughput experimental data.
Modeling and Simulation using
MATLAB
Finally, we give a brief introduction to machine
learning and apply it to high-throughput data. We Modelling and simulation make a part of the world
describe the general idea behind clustering analysis easier to define, visualize and understand. Both
and describe K-means and hierarchical clustering require the identification of relevant aspects of a
and demonstrate how these are used in genomics situation in the real world and then the use of
and describe prediction algorithms such as k-nearest different types of models for different objectives and
neighbors along with the concepts of training sets, the definition of the most suitable model parameters.
test sets, error rates and cross-validation. This course teaches you to simulate models for a
wide range of applications using MATLAB – a high-
Given the diversity in educational background of our level programming language and an environment for
students we have divided the series into seven parts. numerical computation and visualization.
You can take the entire series or individual courses
that interest you. If you are a statistician you should

Competency Training Provider
Advanced Mathematics edX (main training landing page)
edX (specific course description)
Proficiency Level
Full Performance Course Delivery Format
Virtual, instructor-led
Training Provider
Iversity (main training landing page) Course Prerequisites or Preferred Prior
Iversity (specific course description) Knowledge
Probability: Basic Concepts & Discrete Random
Course Delivery Format Course provided by edX
Virtual, instructor-led
Cost
Course Prerequisites or Preferred Prior Free to audit, $49 for Verified Certificate
Knowledge
College-level calculus (single-variable and
multivariable). Although this is not a mathematics
Probabilistic Graphical Models
course, it does rely on the language and some tools Specialization
from mathematics. It requires a level of comfort with
mathematical reasoning, familiarity with sequences, Probabilistic graphical models (PGMs) are a rich
limits, infinite series, the chain rule, as well as the framework for encoding probability distributions
ability to work with ordinary or multiple integrals. over complex domains: joint (multivariate)
distributions over large numbers of random variables
Cost
Free that interact with each other. These representations
sit at the intersection of statistics and computer
science, relying on concepts from probability theory,
Probability: Distribution Models & graph algorithms, machine learning, and more. They
Continuous Random Variables are the basis for the state-of-the-art methods in a
wide variety of applications, such as medical
In this statistics and data analysis course, you will
learn about continuous random variables and some diagnosis, image understanding, speech recognition,
of the most frequently used probability distribution natural language processing, and many, many more.
models, including exponential distribution, Gamma They are also a foundational tool in formulating
distribution, Beta distribution, and most many machine learning problems.
importantly, normal distribution.
Competency
You will learn how these distributions can relate to Advanced Mathematics
the normal distribution by the central limit theorem Machine Learning
(CLT). We will discuss Markov and Chebyshev
inequalities, order statistics, moment generating Proficiency Level
functions and transformation of random variables. Full Performance

Competency Training Provider


Advanced Mathematics Coursera (main training landing page)
Coursera (specific course description)
Proficiency Level
Full Performance

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior


Knowledge
N/A

Cost
Free to audit

Computer Science

The Bits and Bytes of Computer encoded, stored, and communicated between
Networking computers.

This course is designed to provide a full overview of Competency


computer networking. We’ll cover everything from Computer Science
the fundamentals of modern networking Programming & Scripting
technologies and protocols to an overview of the
cloud to practical applications and network Proficiency Level
troubleshooting. We’ll wrap up by covering how this Comprehension
information might show up in a job interview and
giving you a few tips for troubleshooting on the spot. Training Provider
LinkedIn Learning (main training landing page)
Competency LinkedIn Learning (specific course description)
Computer Science
Course Delivery Format
Proficiency Level Virtual, self-paced
Comprehension
Course Prerequisites or Preferred Prior
Training Provider Knowledge
Coursera (main training landing page) N/A
Coursera (specific course description)
Cost
Course Delivery Format Free
Virtual, instructor-led
Computer Science Principles: The
Course Prerequisites or Preferred Prior Internet
Knowledge
N/A This course is the second in our Computer Science
Principles series, designed around the AP Computer
Cost Science Principles (CSP) curriculum. It is a great
Free to audit foundation for anyone, at any age, to prepare for
careers in technology and computer science.
Computer Science Principles: Digital Understanding basics like the Internet will help you
Information understand the interplay between hardware,
software, data, networks, and the people that use
Computers, at their most basic level, store them.
information in bits—a series of on and off states
represented by ones and zeroes. Using this binary Competency
language, the information in images, audio, video, Computer Science
text, and other files can be saved and shared. This
principle is the basis of all computing, including Proficiency Level
programming. Here Doug Winnie explains the basics Comprehension
of binary: how digital information is represented,

Training Provider Understanding the foundations of networking is
LinkedIn Learning (main training landing page) paramount for any IT professional. Once you have a
LinkedIn Learning (specific course description) grasp of the basics, IP addressing is the next step. In
this course, Timothy Pintello walks through the
Course Delivery Format essential concepts, including common numbering
Virtual, self-paced systems and logical vs. physical IP addressing. He
also discusses the IPv4 and IPv6 addressing
Course Prerequisites or Preferred Prior schemes, and various IP addressing resolution
Knowledge techniques (such as DHCP and DNS).
N/A
Competency
Cost Computer Science
Free
Proficiency Level
Learning Networking Comprehension

Discover the fundamentals of networking. Mark Training Provider


Jacob takes you on an exploration of various network LinkedIn Learning (main training landing page)
topologies, different cable types, and the LinkedIn Learning (specific course description)
functionality of network devices, and helps you
understand collision domains, the ways in which Course Delivery Format
switches move traffic, and message types. Plus, learn Virtual, self-paced
how to troubleshoot basic switch issues.
Course Prerequisites or Preferred Prior
Competency Knowledge
Computer Science N/A

Proficiency Level Cost


Comprehension Free

Training Provider Computer Science Principles:


LinkedIn Learning (main training landing page) Programming
LinkedIn Learning (specific course description)
Join Doug Winnie as he explains the principles of
Course Delivery Format programming and helps you connect to core concepts
Virtual, self-paced by exploring three ways that programmers perform
their jobs. Doug starts by sharing the history of
Course Prerequisites or Preferred Prior coding and then dives into functions, values,
Knowledge variables, and parameters used to define actions. He
N/A covers capturing input from users, creating
conditional tests, using loops with arrays, and object-
Cost oriented programming basics. He also takes you
Free beyond programming, into processes like debugging,
refactoring, and building iteratively.
Networking Foundations: IP
Addressing Competency
Computer Science
Programming & Scripting

Proficiency Level Cost
Basic Free

Training Provider Learning Software Version Control


LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description) This course is a gateway to learning software version
control (SVC), process management, and
Course Delivery Format collaboration techniques. Author Michael Lehman
Virtual, self-paced reviews the history of version control and
demonstrates the fundamental concepts: check-
Course Prerequisites or Preferred Prior in/checkout, forking, merging, commits, and
Knowledge distribution. The choice of an SVC system is critical
N/A to effectively managing and versioning the assets in a
software development project (from source code,
Cost images, and compiled binaries to installation
Free packages), so the course also surveys the solutions
available. Michael examines Git, Perforce,
Developing Secure Software Subversion, Mercurial, and Microsoft Team
Foundation Server (TFS) in particular, describing the
Jungwoo Ryoo is a faculty member teaching appropriate use, features, benefits, and optimal
cybersecurity and information technology at Penn group size for each one.
State. In this course, he'll introduce secure software
development tools and frameworks and teach secure Competency
coding practices such as input validation, separation Computer Science
of concerns, and single access point. He'll also show Programming & Scripting
how to recognize different kinds of security threats
and fortify your code. Plus, he'll help you put a Proficiency Level
system in place to test your software for any Basic
overlooked vulnerabilities.
Training Provider
Competency LinkedIn Learning (main training landing page)
Computer Science LinkedIn Learning (specific course description)
Programming & Scripting
Course Delivery Format
Proficiency Level Virtual, self-paced
Basic
Course Prerequisites or Preferred Prior
Training Provider Knowledge
LinkedIn Learning (main training landing page) N/A
LinkedIn Learning (specific course description)
Cost
Course Delivery Format Free
Virtual, self-paced
Networking Foundations: Networking
Course Prerequisites or Preferred Prior Basics
Knowledge
N/A Understanding the foundations of networking is
paramount for any IT professional. This course

covers the very basics. Professor of computer science Competency
Tim Pintello introduces the core networking Computer Science
topologies and implementation examples. He will
also explain and compare the OSI and TCP/IP Proficiency Level
models, and introduce viewers to commonly used Basic
network devices, such as NICs, hubs, switches, and
routers. Training Provider
LinkedIn Learning (main training landing page)
Competency LinkedIn Learning (specific course description)
Computer Science
Course Delivery Format
Proficiency Level Virtual, self-paced
Basic
Course Prerequisites or Preferred Prior
Training Provider Knowledge
LinkedIn Learning (main training landing page) N/A
LinkedIn Learning (specific course description)
Cost
Course Delivery Format Free
In-Person, instructor-led
Fundamentals of Network
Course Prerequisites or Preferred Prior Communication
Knowledge
N/A In this course, we trace the evolution of networks and
identify the key concepts and functions that form the
Cost basis for layered architecture. We introduce
Free examples of protocols and services that are familiar
to the students, and we explain how these services
Programming Foundations: Secure are supported by networks. Further, we explain
Coding fundamental concepts in digital communication, and
focus on error control techniques that include parity
Learn how to incorporate security into the software check, polynomial code, and Internet checksum.
development life cycle. Move security into your
design and build phases by identifying common Competency
insecure code issues and embracing the mindset of a Computer Science
security professional. In this course, security
architect Frank Moley provides a basic Proficiency Level
understanding of secure coding practices. Learn how Foundational
to understand your attackers and risks and mitigate
issues at critical junctures in your code, including Training Provider
thick app, client, and server interactions. Plus, Coursera (main training landing page)
explore how to prevent unauthorized access and data Coursera (specific course description)
leaks with authentication and cryptography. Frank
closes with an overview of security in each phase of Course Delivery Format
the software development life cycle, and next steps Virtual, self-paced
for strengthening the security posture of your
applications.

Course Prerequisites or Preferred Prior Software Development Processes and
Knowledge
Required to have some prior programming Methodologies
experience in C-programming (C++/Java), some
fundamental knowledge of computer organization In this course, we trace the evolution of networks and
identify the key concepts and functions that form the
and IT architecture and a background in computer basis for layered architecture. We introduce
science is a plus. examples of protocols and services that are familiar
to the students, and we explain how these services
Cost are supported by networks. Further, we explain
Free to audit fundamental concepts in digital communication, and
focus on error control techniques that include parity
Introduction to Computer Science and check, polynomial code, and Internet checksum.
Programming Specialization
Competency
This specialization covers topics ranging from basic Computer Science
computing principles to the mathematical
foundations required for computer science. You will Proficiency Level
learn fundamental concepts of how computers work, Foundational
which can be applied to any software or computer
system. You will also gain the practical skillset Training Provider
needed to write interactive, graphical programs at an Coursera (main training landing page)
introductory level. The numerical mathematics Coursera (specific course description)
component will provide you with numerical and
computational tools that are essential for the Course Delivery Format
problem solving and modelling stages of computer Virtual, self-paced
science.
Course Prerequisites or Preferred Prior
Competency Knowledge
Computer Science N/A
Programming & Scripting
Cost
Proficiency Level Free
Foundational
Software Design and Architecture
Training Provider Specialization
Coursera (main training landing page)
Coursera (specific course description) In the Software Design and Architecture
Specialization, you will learn how to apply design
Course Delivery Format principles, patterns, and architectures to create
Virtual, self-paced reusable and flexible software applications and
systems. You will learn how to express and document
Course Prerequisites or Preferred Prior the design and architecture of a software system
Knowledge using a visual notation. Practical examples and
N/A opportunities to apply your knowledge will help you
develop employable skills and relevant expertise in
Cost the software industry.
Free

Competency Cost
Computer Science Free to audit

Proficiency Level Software Testing Fundamentals


Full Performance
Want to gain software testing skills to start a career
Training Provider or are you a software developer looking to improve
Coursera (main training landing page) your unit testing skills? This course, part of the
Coursera (specific course description) Software Testing and Verification MicroMasters
program, will provide the essential skills you need for
Course Delivery Format success. Learn the techniques Software Testers and
Virtual, self-paced Quality Assurance Engineers use every day, which
can be applied to any programming language and
Course Prerequisites or Preferred Prior testing software.
Knowledge
N/A Competency
Computer Science
Cost
Free to audit Proficiency Level
Full Performance
Software Design as an Element of the
Training Provider
Software Development Lifecycle edX (main training landing page)
edX (specific course description)
This course talks about software development
lifecycles: a description/prescription for how we write
Course Delivery Format
software. Design is a step in this life cycle, and the
Virtual, self-paced
course explores the implications of this. Design has a
role in the life cycle; it is always there, regardless of
Course Prerequisites or Preferred Prior
the kind of life cycle we’re talking about. Why is
Knowledge
that? Why was design considered as a step in this life
N/A
cycle?

Competency Cost
Computer Science Free to audit

Proficiency Level User Interface Design Specialization


Full Performance
In this Specialization, you will learn industry-
Training Provider standard theory and methods for developing
Coursera (main training landing page) successful user interfaces (UIs). Upon completing
Coursera (specific course description) this Specialization, you will have fluency with the
user research, prototyping and evaluation techniques
Course Delivery Format necessary for creating intuitive interfaces that
Virtual, self-paced facilitate good user experiences. You will also have
demonstrated this fluency through an in-depth
Course Prerequisites or Preferred Prior Capstone Project that can be shown to prospective
Knowledge employers in the fast-growing field of UI design.
N/A

Competency
Computer Science

Proficiency Level
Full Performance

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit

Fundamentals of Computer Network
Security Specialization

This specialization is intended for IT professionals,
computer programmers, managers, and IT security
professionals who would like to move up the ladder and
are seeking to develop network system security skills.
Through four courses, we will cover the Design and
Analyze Secure Networked Systems, Develop Secure
Programs with Basic Cryptography and Crypto API,
Hacking and Patching Web Applications, Perform
Penetration Testing, and Secure Networked Systems
with Firewall and IDS, which will prepare you to
perform tasks as Cyber Security Engineer, IT
Security Analyst, and Cyber Security Analyst.

Competency
Computer Science

Proficiency Level
Expert

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit

System Validation (4): Modelling
Software, Protocols, and Other
Behavior

System Validation is the field that studies the
fundamentals of system communication and
information processing. It allows automated analysis
based on behavioral models of a system to see if a
system works correctly. We want to guarantee that
a system does exactly what it is supposed to do. The
techniques put forward in system validation make it
possible to prove the absence of errors. They also make
it possible to design embedded system behavior that is
structurally sound and, as a side effect, force you to
make the behavior simple and insightful. This means
that the systems not only behave correctly but are also
much easier to maintain and adapt. "Modelling
Software, Protocols, and Other Behavior"
demonstrates the power of formal methods in
software modelling, communication protocols, and
other examples.

Competency
Computer Science

Proficiency Level
Expert

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost defense-in-depth approach, and User Account
Free to audit Control (UAC).

Linux: System Maintenance Competency


Computer Science
It can be tempting to skimp on regular system
maintenance once your computer is up and running. Proficiency Level
That said, it's essential to keep tabs on your machine, Expert
and check to see if there are any security problems
you should address, and whether your computer is Training Provider
keeping up with its tasks. In this course, dive into the LinkedIn Learning (main training landing page)
basics of Linux system maintenance. Instructor Scott LinkedIn Learning (specific course description)
Simpson explains how to approach any Linux
system, to help you get your bearings if a system Course Delivery Format
looks unfamiliar. He also covers system and security Virtual, self-paced
logs, troubleshooting the boot process, upgrading
software, freeing disk space, and automating reports Course Prerequisites or Preferred Prior
with scripting. Knowledge
N/A
Competency
Computer Science Cost
Free
Proficiency Level
Expert

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format


Virtual, self-paced

Course Prerequisites or Preferred Prior


Knowledge
N/A

Cost
Free

Windows 8: Network and Security


Learn about Windows 8 networking and security.
Author Steve Fullmer explores the Open Systems
Interconnection (OSI) model, IPv4 and IPv6
networking, Domain Name System (DNS) resolution,
and wireless networking. Additional topics include
public key infrastructure, Windows 8 Defender, the

Data Mining & Integration

Data Curation 101: How to Manage This introductory course will discuss: its involvement
Research Data by the Digital Curation in the 9-step KDD process, which data can be mined
and used to enhance businesses, data patterns which
Centre can be visualized to understand the data better, the
process, tools, and its future by modern standards. It
A comprehensive collection of reports, presentations, will also talk about the increasing importance of
and briefs covering all aspects of the data lifecycle, transforming unprecedented quantities of digital
including ingest, appraisal and selection, metadata, data into business intelligence giving users an
preservation, and reuse. Elements are categorized by informational advantage.
topic and can be accessed and completed at your own
pace.
Competency
Data Mining & Integration
Competency
Data Mining & Integration
Proficiency Level
Database Science
Comprehension
Proficiency Level
Training Provider
Comprehension
Udemy (main training landing page)
Udemy (specific course description)
Training Provider
National Network of Libraries of Medicine (main
Course Delivery Format
training landing page)
Virtual, self-paced
Digital Curation Centre (specific course description)
Course Prerequisites or Preferred Prior
Course Delivery Format
Knowledge
Virtual, self-paced
N/A
Course Prerequisites or Preferred Prior
Knowledge Cost
N/A Free to audit, $50 for Verified Certificate

Cost Data Science Foundations:


Free
Fundamentals
Data Mining Introduction to Data Science provides a
comprehensive overview of modern data science: the
Uncover the essential tool for information practice of obtaining, exploring, modeling, and
management professionals known as Data Mining. interpreting data. While most only think of the "big
Data mining is the process of extracting patterns subject," big data, there are many more fields and
from large data sets by connecting methods from concepts to explore. Here Barton Poulson explores
statistics and artificial intelligence with database disciplines such as programming, statistics,
management. Although a relatively young and mathematics, machine learning, data analysis,
interdisciplinary field of computer science, data visualization, and (yes) big data. He explains why
mining involves analysis of large masses of data and data scientists are now in such demand, and the
conversion into useful information. skills required to succeed in different jobs. He shows



how to obtain data from legitimate open-source Competency
repositories via web APIs and page scraping, and Data Mining & Integration
introduces specific technologies (R, Python, and
SQL) and techniques (support vector machines and Proficiency Level
random forests) for analysis. By the end of the Comprehension
course, you should better understand data science's
role in making meaningful insights from the complex Training Provider
and large sets of data all around us. Coursera (main training landing page)
Coursera (specific course description)
Competency
Data Mining & Integration Course Delivery Format
Data Visualization Virtual, self-paced

Proficiency Level Course Prerequisites or Preferred Prior


Comprehension Knowledge
N/A
Training Provider
LinkedIn Learning (main training landing page) Cost
LinkedIn Learning (specific course description) Free to audit, $50 for Verified Certificate
Course Delivery Format
Virtual, self-paced Cleaning Up Your Excel 2013 Data
Course Prerequisites or Preferred Prior Need to get data from a business-management
Knowledge system file, database software, text file, or poorly
Experience with R, Python, and SQL recommended designed Excel worksheet into optimal shape for
Excel 2013? This course can help. Dennis Taylor
Cost explores the functions, commands, and techniques in
Free Excel that restructure data, remove unwanted
characters, convert data into the desired format, and
Pattern Discovery in Data Mining prepare data for efficient analysis. He'll cover
adjusting row and column placement; transposing
Learn the general concepts of data mining along with data with Replace and Substitute functions, the Text
basic methodologies and applications. Then dive into to Columns command, and the new Flash Fill; and
one subfield in data mining: pattern discovery. Learn formatting and converting text, numbers, and other
values.
in-depth concepts, methods, and applications of
pattern discovery in data mining. We will also
introduce methods for data-driven phrase mining Competency
and some interesting applications of pattern Data Mining & Integration
discovery. This course provides you the opportunity
to learn skills and content to practice and engage in Proficiency Level
scalable pattern discovery methods on massive Basic
transactional data, discuss pattern evaluation
measures, and study methods for mining diverse Training Provider
kinds of patterns, sequential patterns, and sub-graph LinkedIn Learning (main training landing page)
patterns. LinkedIn Learning (specific course description)



Course Delivery Format ranges of data. Dave also shows how to filter with
Virtual, self-paced color coding, and how to use advanced filtering
options such as slicers and shortcuts. Advance your
Course Prerequisites or Preferred Prior Excel skills in minutes with this mini course.
Knowledge
N/A Competency
Data Mining & Integration
Cost
Free Proficiency Level
Basic
Data Science Foundations: Data
Training Provider
Mining LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)
This beginner-level course includes topics such as
prerequisites for data mining, data mining using R,
Course Delivery Format
Python, Orange, and RapidMiner, data reduction, Virtual, self-paced
data clustering, anomaly detection, association
analysis, regression analysis, sequence mining, and text
Course Prerequisites or Preferred Prior
mining.
Knowledge
N/A
Competency
Data Mining & Integration
Cost
Free
Proficiency Level
Basic
Excel 2016: Working with Dates and
Training Provider Times
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description) Getting dates and times to show up the way you want
in an Excel spreadsheet can be tricky. In this concise
Course Delivery Format course, Excel expert Dennis Taylor shares easy
Virtual, self-paced solutions for formatting and calculating dates and
times in Excel 2016. Dennis explains what's going on
Course Prerequisites or Preferred Prior behind the scenes when Excel stores dates and times
Knowledge and offers tips for entering and formatting data.
N/A Next, he demonstrates how to work with dates and
times in common Excel functions, and how to
Cost calculate data with dates and times. Finally, Dennis
Free explains how to use dates and times with Excel
commands, including working with data filters.
Excel: Filtering for Beginners
Competency
Did you know Excel enables you to zero in on exactly Data Mining & Integration
the data you need with the click of a mouse? In this
short course, Microsoft Excel content publisher Dave Proficiency Level
Ludwig shows the ins and outs of filtering in Excel. Basic
He shows how to filter for text, numeric values, and



Training Provider Big Data Foundations: Techniques
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)
and Concepts
Big data is big news. But what is big data, and how do
Course Delivery Format
we use it? Simply put, big data is data that, by its
Virtual, self-paced
velocity, volume, or variety (the three Vs), cannot be
easily stored or analyzed with traditional methods.
Course Prerequisites or Preferred Prior
Spreadsheets and relational databases just don't cut
Knowledge
it with big data. In this course, Barton Poulson tells
N/A
you the methods that do work, introducing all the
techniques and concepts involved in capturing,
Cost
storing, manipulating, and analyzing big data,
Free
including data mining and predictive analytics. He
explains big data's relationship to data science,
Excel: Introduction to Formatting statistics, and programming; its uses in marketing,
scientific research, and tools like Amazon's
Get a quick crash course in Excel formatting to make recommendation engine; and the ethical issues that
your spreadsheets more readable and compelling. lie behind its use.
Dennis Taylor shows the basics of formatting in
Excel, including how to emphasize data with fonts, Competency
borders, and colors. Learn how to adjust the Data Mining & Integration
formatting of numbers, dates, and times; create
value-based conditional formatting; adjust rows and Proficiency Level
columns without commands; and easily copy and Foundational
even paint formatting across multiple cells.
Training Provider
Competency LinkedIn Learning (main training landing page)
Data Mining & Integration LinkedIn Learning (specific course description)

Proficiency Level Course Delivery Format


Basic Virtual, self-paced

Training Provider Course Prerequisites or Preferred Prior


LinkedIn Learning (main training landing page) Knowledge
LinkedIn Learning (specific course description) N/A

Course Delivery Format Cost


Virtual, self-paced Free

Course Prerequisites or Preferred Prior


Knowledge
Excel 2016: Managing and Analyzing
N/A Data

Cost Large amounts of data can become unmanageable


Free fast. But with the data management and analysis
features in Excel 2016, you can keep the largest
spreadsheets under control. In this course, Dennis
Taylor shares easy-to-use commands, features, and



functions for maintaining large lists of data in Excel.
He covers sorting, adding subtotals, filtering,
eliminating duplicate data, and using Excel's
Advanced Filter feature and specialized database
functions to isolate and analyze data. With these
techniques, you'll be able to extract the most
important information from your data, in the
shortest amount of time.

Competency
Data Mining & Integration

Proficiency Level
Foundational

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Text Retrieval and Search Engines

Recent years have seen a dramatic growth of natural
language text data, including web pages, news
articles, scientific literature, emails, enterprise
documents, and social media such as blog articles,
forum posts, product reviews, and tweets. Text data
are unique in that they are usually generated directly
by humans rather than a computer system or sensors
and are thus especially valuable for discovering
knowledge about people’s opinions and preferences,
in addition to many other kinds of knowledge that we
encode in text.

Competency
Data Mining & Integration

Proficiency Level
Foundational

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

The Essential Elements of Predictive Analytics and Data Mining

A proper predictive analytics and data-mining
project can involve many people and many weeks.
There are also many potential errors to avoid. A "big
picture" perspective is necessary to keep the project
on track. This course provides that perspective
through the lens of a veteran practitioner who has
completed dozens of real-world projects. Keith
McCormick is an independent data miner and author
who specializes in predictive models and
segmentation analysis, including classification trees,
cluster analysis, and association rules. Here he
shares his knowledge with you. Walk through each
step of a typical project, from defining the problem
and gathering the data and resources, to putting the
solution into practice. Keith also provides an
overview of CRISP-DM (the de facto data-mining
methodology) and the nine laws of data mining,
which will keep you focused on strategy and business
value.

Competency
Data Mining & Integration
Statistical Modeling

Proficiency Level
Foundational

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)



Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Data Wrangling in R

The R series is a comprehensive collection of training
sessions designed to teach non-programmers how to
write modular code and to introduce best practices
for using R for data analysis and data visualization.
Each class uses both evidence-based best practices
for programming and practical hands-on lessons. In
this two-hour class, participants will be provided a
basic overview of manipulating, analyzing and
exporting data using the R tidyverse. Participants
will leave the course with a better understanding of
how to manage data for more efficient and
effective analysis.

Competency
Data Mining & Integration
Programming and Scripting

Proficiency Level
Full Performance

Training Provider
NIH Library (main training landing page)
NIH Library (specific course description)

Course Delivery Format
In-person, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Programming with Python for Data Science

This practical course, developed in partnership with
Coding Dojo, targets individuals who have
introductory-level Python programming experience.
The course teaches students how to start looking at
data with the lens of a data scientist by applying
efficient, well-known mining models in order to
unearth useful intelligence, using Python, one of the
popular languages for Data Scientists. Topics include
data visualization, feature importance and selection,
dimensionality reduction, clustering, classification
and more! All the data sets used in this course are
gathered live-data or inspired by real-world domains
that can benefit from machine learning.

Competency
Data Mining & Integration
Operations Research

Proficiency Level
Full Performance

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit, $150 for Verified Certificate

The Analytics Edge

In the last decade, the amount of data available to
organizations has reached unprecedented levels.
Data is transforming business, social interactions,
and the future of our society. In this course, you will
learn how to use data and analytics to give an edge to
your career and your life. We will examine real world
examples of how analytics have been used to



significantly improve a business or industry. These
examples include Moneyball, eHarmony, the
Framingham Heart Study, Twitter, IBM Watson, and
Netflix. Through these examples and many more, we
will teach you the following analytics methods: linear
regression, logistic regression, trees, text analytics,
clustering, visualization, and optimization. We will
be using the statistical software R to build models
and work with data. The contents of this course are
essentially the same as those of the corresponding
MIT class (The Analytics Edge). It is a challenging
class, but it will enable you to apply analytics to
real-world applications.

The class will consist of lecture videos, which are
broken into small pieces, usually between 4 and 8
minutes each. After each lecture piece, we will ask
you a “quick question” to assess your understanding
of the material. There will also be a recitation, in
which one of the teaching assistants will go over the
methods introduced with a new example and data
set. Each week will have a homework assignment
that involves working in R or LibreOffice with
various data sets. (R is a free statistical and
computing software environment we’ll use in the
course. See the Software FAQ below for more info).
At the end of the class there will be a final exam,
which will be like the homework assignments.

Competency
Data Mining & Integration

Proficiency Level
Full Performance

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
Familiar with concepts like mean, standard
deviation, and scatterplots. Mathematical maturity
and prior experience with programming will
decrease the estimated effort required for the class
but are not necessary to succeed.

Cost
Free to audit, $150 for Verified Certificate

Big Data, Genes, and Medicine

This course distills for you expert knowledge and
skills mastered by professionals in Health Big Data
Science and Bioinformatics. You will learn exciting
facts about the human body biology and chemistry,
genetics, and medicine that will be intertwined with
the science of Big Data and skills to harness the
avalanche of data openly available at your fingertips
and which we are just starting to make sense of. We’ll
investigate the different steps required to master Big
Data analytics on real datasets, including Next
Generation Sequencing data, in a healthcare and
biological context, from preparing data for analysis
to completing the analysis, interpreting the results,
visualizing them, and sharing the results.

When you master these high-demand skills, you will
be well positioned to apply for or move to positions
in biomedical data analytics and bioinformatics. No
matter what your skill levels are in biomedical or
technical areas, you will gain highly valuable new or
sharpened skills that will make you stand out as a
professional and want to dive even deeper in
biomedical Big Data. It is my hope that this course
will spark your interest in the vast possibilities
offered by publicly available Big Data to better
understand, prevent, and treat diseases.

Competency
Data Mining & Integration

Proficiency Level
Expert

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led



Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

Business Analytics: Data Reduction Techniques Using Excel and R

With businesses having to grapple with increasing
amounts of data, the need for data reduction has
intensified in recent years. To make sense of an
overabundance of information, you can use cluster
analysis (which allows you to develop inferences
about a handful of groups instead of an entire
population of individuals) as well as principal
components analysis, which exposes latent variables.

In this course, Conrad Carlberg explains how to carry
out cluster analysis and principal components
analysis using Microsoft Excel, which tends to show
more clearly what's going on in the analysis. Then he
explains how to carry out the same analysis using R,
the open-source statistical computing software,
which is faster and richer in analysis options than
Excel. Plus, he walks through how to merge the
results of cluster analysis and factor analysis to help
you break down a few underlying factors according to
individuals' membership in just a few clusters.

Competency
Data Mining & Integration

Proficiency Level
Expert

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

Introduction to Analytics Modeling

Analytical models are key to understanding data,
generating predictions, and making business
decisions. Without models it’s nearly impossible to
gain insights from data. In modeling, it’s essential to
understand how to choose the right data sets,
algorithms, techniques and formats to solve a
business problem.

In this course, part of the Analytics: Essential Tools
and Methods MicroMasters program, you’ll gain an
intuitive understanding of fundamental models and
methods of analytics and practice how to implement
them using common industry tools like R.
You’ll learn about analytics modeling and how to
choose the right approach from among the wide
range of options in your toolbox.
You will learn how to use statistical models and
machine learning as well as models for:
• classification;
• clustering;
• change detection;
• data smoothing;
• validation;
• prediction;
• optimization;
• experimentation;
• decision making.

Competency
Data Mining & Integration

Proficiency Level
Expert

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led



Course Prerequisites or Preferred Prior Knowledge
• Probability and statistics
• Basic programming proficiency
• Linear algebra
• Basic calculus

Cost
Free to audit, $150 for Verified Certificate

Knowledge Inference and Structure Discovery for Education

In this course, you will learn key methods for
discovering how content can be divided into skills
and concepts and how to measure student knowledge
while it is changing – i.e. the student is learning.

This course will also cover related methods for
discovering structure in unlabeled data, such as
factor analysis and clustering. It will also cover
related methods for relationship mining including
how to validly conduct correlation mining and how to
automatically discover association rules and
sequential rules.

Competency
Data Mining & Integration

Proficiency Level
Expert

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
This mini-course does not assume prior
programming knowledge beyond what you will
already have learned in other courses in this
MicroMasters, although advanced tools will be
discussed for interested students.

Cost
Free to audit, $99 for Verified Certificate



Data Visualization
Data Visualization: A Practical Approach for Absolute Beginners

Data Visualization literacy is fundamental for
consuming any analysis in any industry or media –
from the news to working in healthcare or
not-for-profit budgets and more. As well, the ability to create
data visualizations that are compelling, accurate, and
tell a story, is becoming a core skill of any job in the
21st Century. This course provides a practical
approach to learning the theories and techniques of
data visualization for data analysis.

Competency
Data Visualization

Proficiency Level
Comprehension

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit, $99 for Verified Certificate

Data Visualization for Data Analysts

This beginner-level course includes topics such as
channeling your audience, understanding your data,
determining the information hierarchy, sketching
and wireframing your ideas, defining your narrative,
using typography, color, contrast, and shape to
convey meaning, and making your visualization
interactive.

Competency
Data Visualization

Proficiency Level
Comprehension

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Picking the Right Chart for Your Data

When selecting charts to showcase data, many
people simply pick from the few choices available in
Excel or other such software. This can be highly
limiting and can result in the selection of charts that
fail to effectively convey whatever you're trying to
communicate. This course can help you think more
strategically about your data and provide you with
the tools you need to pick the best visual display for
the type of data you're working with (and your
ultimate communication goals). Main topics include
getting to the key idea you're trying to communicate;
finding the right standard chart for your data type;
and brainstorming and experimenting to come up
with alternatives to the standards.

Competency
Data Visualization

Proficiency Level
Comprehension

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Data Visualization | 34
Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Data Visualization: Storytelling

Join data visualization expert Bill Shander as he
guides you through the process of turning "facts and
figures" into "story" to engage and fulfill our human
expectation for information. This course is intended
for anyone who works with data and must
communicate it to others, whether a researcher, a
data analyst, a consultant, a marketer, or a journalist.
Bill shows you how to think about, and craft, stories
from data by examining many compelling stories in
detail.

Competency
Data Visualization

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Hands-On Data Visualization Workshop

This workshop is for individuals who want to better
visualize their data using simple strategies in MS
Excel. The audience for this workshop includes NIH
employees, team leaders, and supervisors who use
data to influence decisions. Attendees will walk
through Dr. Evergreen's "Four Step Visualization
Process" to structure their visualization efforts.
Attendees will identify which chart types uncover the
best story in their dataset and explore design skills to
highlight main points. Finally, attendees will explore
how to customize graphs in MS Excel so that they
have a more powerful impact. New skills learned can
be immediately implemented on the job to clarify
participants' future data presentations and support
clearer decision-making. This is described as an
intermediate/advanced class.

Competency
Data Visualization

Proficiency Level
Basic

Training Provider
HHS Learning Portal (main training landing page)
HHS Learning Portal (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
Microsoft Excel 365 Level 1 - NIHTC7005
Microsoft Excel 365 Level 2 - NIHTC7006

Cost
$825

Infographics: Communicating Information Visually

Learn to make your raw data more appealing and
consumable with infographics. In this class, you'll
learn what infographics are and how they can make
data easier to understand and share. Finally, we’ll
wrap up by demonstrating an online resource
(Piktochart) so you’ll be ready to create your own
infographics.

Competency
Data Visualization

Proficiency Level
Basic

Training Provider
National Network of Libraries of Medicine (main training landing page)
National Network of Libraries of Medicine (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Learning Data Visualization

This beginner-level course includes topics such as
channeling your audience, understanding your data,
determining the information hierarchy, sketching
and wireframing your ideas, defining your narrative,
using typography, color, contrast, and shape to
convey meaning, and making your visualization
interactive.

Competency
Data Visualization

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Principles of Effective Data Visualization

Data visualization is becoming an increasingly
common method of presenting large and complex
data sets, but the principles of visual communication
are not widely understood or practiced. Like other
forms of communication, visualization has its own
grammar and syntax that, if used, can increase the
aesthetics and effectiveness of any visualization. This
session will provide an overview of how data
visualizations are constructed, how people tend to
understand visual cues like shape and color, and how
to use those cues to create visualizations that are
both attractive and informative.

Competency
Data Visualization

Proficiency Level
Basic

Training Provider
NIH Library (main training landing page)
NIH Library (specific course description)

Course Delivery Format
In-person and virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Data Visualization and Communication with Tableau

One of the skills that characterizes great business
data analysts is the ability to communicate practical
implications of quantitative analyses to any kind of
audience member. Even the most sophisticated
statistical analyses are not useful to a business if they

do not lead to actionable advice, or if the answers to
those business questions are not conveyed in a way
that non-technical people can understand.

In this course you will learn how to become a master
at communicating business-relevant implications of
data analyses. By the end, you will know how to
structure your data analysis projects to ensure the
fruits of your hard labor yield results for your
stakeholders. You will also know how to streamline
your analyses and highlight their implications
efficiently using visualizations in Tableau, the most
popular visualization program in the business world.
Using other Tableau features, you will be able to
make effective visualizations that harness the human
brain’s innate perceptual and cognitive tendencies to
convey conclusions directly and clearly. Finally, you
will be practiced in designing and persuasively
presenting business “data stories” that use these
visualizations, capitalizing on business-tested
methods and design principles.

Competency
Data Visualization

Proficiency Level
Foundational

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

Graph Analytics for Big Data

Want to understand your data network structure and
how it changes under different conditions? Curious
to know how to identify closely interacting clusters
within a graph? Have you heard of the fast-growing
area of graph analytics and want to learn more? This
course gives you a broad overview of the field of
graph analytics, so you can learn new ways to model,
store, retrieve and analyze graph-structured data.

After completing this course, you will be able to
model a problem into a graph database and perform
analytical tasks over the graph in a scalable manner.
Better yet, you will be able to apply these techniques
to understand the significance of your data sets for
your own projects.

Competency
Data Visualization
Statistical Modeling

Proficiency Level
Foundational

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

Introduction to Data Analysis using Excel

The ability to analyze data is a powerful skill that
helps you make better decisions. Microsoft Excel is
one of the top tools for data analysis and the built-in
pivot tables are arguably the most popular analytic
tool.

In this course, you will learn how to perform data
analysis using Excel’s most popular features. You will
learn how to create pivot tables from a range with
rows and columns in Excel. You will see the power of
Excel pivots in action and their ability to summarize

data in flexible ways, enabling quick exploration of
data and producing valuable insights from the
accumulated data.

Pivots are used in many different industries by
millions of users who share the goal of reporting the
performance of companies and organizations. In
addition, Excel formulas can be used to aggregate
data to create meaningful reports. To complement,
pivot charts and slicers can be used together to
visualize data and create easy-to-use dashboards.

Competency
Data Visualization

Proficiency Level
Foundational

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
A basic understanding of creating formulas and of how
cells are referenced by rows and columns within
Excel is needed to take this course.

Cost
Free to audit, $99 for Verified Certificate

Introductory Statistics: Analyzing Data Using Graphs and Statistics

Why do we study statistics? The field of statistics
provides professionals and scientists with conceptual
foundations and useful techniques for evaluating
ideas, testing theories, and - ultimately - uncovering
the truth in any situation.

This course will familiarize you with data and basic
statistical concepts, enabling you to analyze data
using graphs and statistics. We'll start with types of
data, controlled experiments, and observational
study. You'll learn to use a histogram, a
representation of the distribution of numerical data,
to easily arrange data.

You will learn about basic concepts of statistics, such
as average and standard deviation. Methods of using
the normal approximation to solve a problem will be
covered in this course. In addition, we'll discuss the
correlation coefficient and the regression method in
order to represent the relationship between two
variables.

Competency
Data Visualization
Advanced Mathematics

Proficiency Level
Foundational

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
Python Basics for Data Science
Analyzing Data with Python

Cost
Free to audit, $39 for Verified Certificate

Tableau Essential Training

In this course, learn what you need to know to
analyze and display data using Tableau Desktop,
and make better, more data-driven decisions for your
company. Discover how to install Tableau, connect
to data sources, and sort and filter your data.
Instructor Curt Frye also demonstrates how to create
and manipulate data visualizations (including
highlight tables, charts, scatter plots, histograms,
maps, and dashboards) and shows how to share your
visualizations via the web.

Competency
Data Visualization

Proficiency Level
Foundational

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Analyzing and Visualizing Data with Excel

Excel is one of the most widely used solutions for
analyzing and visualizing data. It now includes tools
that enable the analysis of more data, with improved
visualizations and more sophisticated business
logic. In this data science course, you will get an
introduction to the latest versions of these new tools
in Excel 2016 from an expert on the Excel Product
Team at Microsoft.

Learn how to import data from different sources,
create mashups between data sources, and prepare
data for analysis. After preparing the data, find out
how business calculations can be expressed using the
DAX calculation engine. See how the data can be
visualized and shared to the Power BI cloud service,
after which it can be used in dashboards, queried
using plain English sentences, and even consumed
on mobile devices.

Do you feel that the contents of this course are a bit
too advanced for you and you need to fill some gaps
in your Excel knowledge? Do you need a better
understanding of how pivot tables, pivot charts and
slicers work together, and help in creating
dashboards? If so, check out DAT205x: Introduction
to Data Analysis using Excel.
This course is also a part of the Microsoft Excel for
the Data Analyst X-Series.

Competency
Data Visualization

Proficiency Level
Full Performance

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
Understanding of Excel analytic tools such as tables,
pivot tables and pivot charts. Also, some experience
in working with data from databases and from text
files will be helpful.

Cost
Free to audit, $99 for Verified Certificate

Data Analysis and Presentation Skills: the PwC Approach Specialization

This Specialization will help you get practical with
data analysis, turning business intelligence into
real-world outcomes. We'll explore how a combination of
better understanding, filtering, and application of
data can help you solve problems faster - leading to
smarter and more effective decision-making. You’ll
learn how to use Microsoft Excel, PowerPoint, and
other common data analysis and communication
tools, and perhaps most importantly, we'll help you
to present data to others in a way that gets them
engaged in your story and motivated to act.

Competency
Data Visualization

Proficiency Level
Full Performance

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

Data Visualization with Python

In this seminar, attendees cover data visualization
guidelines and apply them to the analysis of a dataset
using the Python programming language. Attendees
examine a few data visualization packages available
in Python for static and dynamic data visualization.
Attendees will work within a Jupyter notebook and
have access to all materials for continued practice.
Students will receive a Jupyter Notebook for further
study and practice. This seminar will be webcast for
those who cannot attend in person.

Competency
Data Visualization
Programming & Scripting

Proficiency Level
Full Performance

Training Provider
HHS Learning Portal (main training landing page)
HHS Learning Portal (specific course description)

Course Delivery Format
In-person, instructor-led

Course Prerequisites or Preferred Prior Knowledge
▪ Introduction to the Command Line Webinar
▪ Introduction to Jupyter Notebook
▪ Introduction to Programming (with Python)

Cost
Free

Communicating Business Analytics Results

The analytical process does not end with models that
can predict with accuracy or prescribe the best
solution to business problems. Developing these
models and gaining insights from data do not
necessarily lead to successful implementations. This
depends on the ability to communicate results to
those who make decisions. Presenting findings to
decision makers who are not familiar with the
language of analytics presents a challenge. In this
course you will learn how to communicate analytics
results to stakeholders who do not understand the
details of analytics but want evidence of analysis and
data. You will be able to choose the right vehicles to
present quantitative information, including those
based on principles of data visualization. You will
also learn how to develop and deliver data-analytics
stories that provide context, insight, and
interpretation.

Competency
Data Visualization

Proficiency Level
Foundational

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

Data Analysis: Visualization and Dashboard Design

Struggling with data at work? Wasting valuable time
working in multiple spreadsheets to gain an overview
of your business? Find it hard to gain sharp insights
from piles of data on your desktop?

If you are looking to enhance your efficiency in the
office and improve your performance by making
sense of data faster and smarter, then this advanced
data analysis course is for you.

You will learn advanced techniques for robust data
analysis in a business environment. This course
covers the main tasks required from data analysts
today, including importing, summarizing,
interpreting, analyzing and visualizing data. It aims
to equip you with the tools that will enable you to be
an independent data analyst. Most techniques will be
taught in Excel with add-ons and free tools available
online. We encourage you to use your own data in
this course, but if none is available, the course team can
provide some.

Competency
Data Visualization

Proficiency Level
Expert

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
Experience with spreadsheets

Cost
Free to audit, $99 for Verified Certificate

Data Science Essentials

In this data science course, you will learn key
concepts in data acquisition, preparation,
exploration, and visualization taught alongside
practical application-oriented examples such as how
to build a cloud data science solution using the Microsoft
Azure Machine Learning platform, or with R and
Python on the Azure stack.

Competency
Data Visualization
Machine Learning

Proficiency Level
Expert

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
Introductory level knowledge of either R or Python

Cost
Free to audit, $99 for Verified Certificate

Data Science and Machine Learning Capstone Project

Create a project that you can use to showcase your
Data Science skills to prospective employers. Apply
various data science and machine learning
techniques to analyze and visualize a data set
involving a real-life business scenario and build a
predictive model. Employers really care about how
well you can apply your knowledge and skills to solve
real-world problems. Now that you've taken several
courses on Data Science and Machine Learning, it’s
time to put your learning into practice and work on a
data problem involving a real-life scenario.

New Yorkers use the 311 system to report complaints for
the non-emergency problems they face. Various
agencies in New York get assigned to these problems.
The data related to these complaints are available in

New York City Open Dataset. On investigation, one
can see that over the last few years the number of 311
complaints coming to the Department of Housing
Preservation and Development in New York City has
increased significantly.

In this Capstone project, your task is to answer
questions that will help the Department of Housing
Preservation and Development in New York City to
effectively tackle the 311 complaints coming to them.
You will need to use Python and Data Science and
Machine Learning techniques such as Data Ingestion,
Data Exploration, Data Visualization, Feature
Engineering, Probabilistic Modeling, Model Validation, etc.

By the end of this course you will have used real-world
Data Science tools to create a showcase project
and demonstrate to employers that you are job ready
and a worthy candidate in the field of Data Science.
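
As a rough illustration of the data ingestion and data exploration steps this
capstone names, the sketch below counts complaints by type and by year using
only the Python standard library. The sample rows and the column names
(complaint_type, borough, created_year) are hypothetical stand-ins for
illustration, not the actual schema or contents of the New York City 311 data.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows for illustration only; a real 311 export
# would be read from a file and has many more columns.
SAMPLE_CSV = """complaint_type,borough,created_year
HEAT/HOT WATER,BRONX,2017
PLUMBING,BROOKLYN,2017
HEAT/HOT WATER,BROOKLYN,2018
PAINT/PLASTER,QUEENS,2018
HEAT/HOT WATER,BRONX,2018
"""


def count_by(rows, column):
    """Count how many rows share each value of the given column."""
    return Counter(row[column] for row in rows)


# Data ingestion: parse the CSV text into a list of dictionaries.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Data exploration: which complaint types dominate, and how do yearly totals move?
by_type = count_by(rows, "complaint_type")
by_year = count_by(rows, "created_year")

print(by_type.most_common(1))
print(sorted(by_year.items()))
```

With a real export, the same counting step could be pointed at a file opened
with open(...) instead of the in-memory sample.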

Competency
Data Visualization
Machine Learning

Proficiency Level
Expert

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit, $99 for Verified Certificate

Database Science

Database Foundations: Core Concepts

Understand the core concepts every IT professional should know to start
working with databases. This course, the first in a four-part series with
database consultant Adam Wilbert, is designed to provide a solid foundation
that will serve you throughout your IT career. Learn about the different data
storage models and find out how to build your first database with SQL Server
(the Express edition, which requires no hardware or special connections for
setup). Then discover how to create database objects with the data definition
language (DDL) and edit data in your tables with data manipulation language
(DML). Adam also covers critical relational database concepts, such as
relationships, indexes, and schemas.

Competency
Database Science

Proficiency Level
Comprehension

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Learning VBA in Excel 2010

In Up and Running with VBA in Excel, Excel and VBA expert Curt Frye
introduces object-oriented programming and shows how to automate routine
tasks and provide custom functionality to enhance Excel performance and
efficiency. This course introduces the Visual Basic for Applications
programming language, covers creating subroutines and functions to hold code,
and provides a solid grounding in the Excel 2007 object model. Programming
techniques are demonstrated through real-world examples. Exercise files
accompany the course.

Competency
Database Science

Proficiency Level
Comprehension

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Python for Everybody

This Specialization builds on the success of the Python for Everybody course
and will introduce fundamental programming concepts including data
structures, networked application program interfaces, and databases, using
the Python programming language. In the Capstone Project, you’ll use the
technologies learned throughout the Specialization to design and create your
own applications for data retrieval, processing, and visualization.

Competency
Database Science

Proficiency Level
Comprehension

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

Database Clinic: MS Excel

The Database Clinic series shows how to plan, build, and optimize databases
using different software. This course focuses on Microsoft Excel. While Excel
doesn't offer traditional relational database management features, its
table-based sheets, functions links, and powerful search and reporting
features make it a great tool for learning the basics of database design.
Join Curt Frye as he shows how to create a simple database, join data sets,
search for records, and perform CRUD (create, read, update, and delete)
operations. Plus, learn how to use Excel's calculations, PivotTables,
functions, and formulas to gain deeper insights into your data.

Competency
Database Science

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Database Clinic: MySQL

The Database Clinic series pits experts and their databases of choice against
a series of the same challenges, to highlight the unique capabilities of each
database. In this installment of the series, Brad Wheeler demonstrates how
MySQL, one of the most widely-used database programs, would rise to meet each
task. After going over the strengths and weaknesses of MySQL, Brad shows how
to create a database, and quickly load data into MySQL. He also explains how
to join data sets; how to search, transform, and perform calculations with
your data; and how to use external programming tools to interact with MySQL.

Competency
Database Science

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

DataONE Data Management Education Modules

This education program is geared towards earth scientists, but it contains
much useful information
for librarians to help them to be a resource to their faculty and
researchers. Topics include metadata, data sharing, citation, and the data
lifecycle.

Competency
Database Science
Data Mining & Integration

Proficiency Level
Basic

Training Provider
DataONE (main training landing page)
DataONE (specific course description)

Course Delivery Format
In-Person, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Database Foundations: Creating and Manipulating Data

Brand new to database administration? Take it one step at a time with
Database Fundamentals, a series designed to support a new career or lifelong
journey in IT. This installment is devoted to data: getting it into and out
of tables and databases. Adam Wilbert shows how to get the most out of each
data type, including numbers, characters, and specialized types like spatial
data. Next are queries. Learn how to write commands and invoke functions in
the SQL Editor to select just the records you want. Finally, get comfortable
inserting, updating, and deleting data with the SQL Server Management Studio
(SSMS), a suite of graphical tools and rich editors that make working with
databases much more intuitive. Do you want to test your knowledge? Watch the
challenge videos to practice what you've learned along the way.

Competency
Database Science

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Excel Business Intelligence Part 1: Power Query

Part 1 of this intermediate-level course kicks off with a summary of the
Power Excel landscape, explains when to use tools like Power Query and Power
Pivot, and then focuses on connecting and transforming data using Excel's
query editing tool. This course also covers basic table transformations, date
and text editing tools, and pro tips like creating custom rolling calendars,
pivoting and unpivoting data, and merging and appending queries. This is Part
1 of a three-part series, which includes Part 2: Data Modeling 101, and Part
3: Power Pivot and DAX.

Competency
Database Science

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Training Provider
HHS Learning Portal (main training landing page)
HHS Learning Portal (specific course description)

Course Delivery Format
In-person, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Java Essentials: Syntax and Structure

Get started with Java, the popular object-oriented programming language. In
this course, the first installment in the Java Essential Training series,
start exploring this essential language and learn about basic Java syntax and
the Java platform's fundamental architecture. Instructor David Gassner goes
over the history of the language, providing coverage of its principles,
components, and syntax. David explains how to install Java on Windows and
macOS, and how to create a project in IntelliJ IDEA. He also demonstrates how
to work with primitive variables, create and parse String values, and manage
program flow, including how to create reusable code.

Competency
Database Science

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Learning SQL Programming

This course introduces SQL, a core programming language. Learn how to request
data from a server, limit and sort the responses, aggregate data from
multiple tables with joins, and edit and delete data. Instructor Scott
Simpson also shows how to perform simple math operations and transform data
into different formats. Course topics include what SQL is, asking for data
with SELECT, limiting database responses, organizing responses, asking for
data from two or more tables, understanding join types and data types,
transforming data, performing math, and adding and modifying data in a table.

Competency
Database Science

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Master SQL for Data Science

SQL database skills are consistently among the most sought after in the world
of data science. Master the skills needed for data science project
specialists and develop a solid foundation of knowledge for working
with SQL engineers. Discover how your SQL knowledge applies to data science
projects. Construct an expanded skill set optimized for the data science
world. Expand your knowledge into related tool sets and platforms.

The following courses (found in Lynda) are included in these trainings:
▪ SQL Essential Training
▪ Learning SQL Programming
▪ SQL Data Reporting and Analysis
▪ Advanced NoSQL for Data Science
▪ SQL Tips, Tricks, & Techniques
▪ NoSQL for SQL Professionals
▪ Presto Essentials Data Science

Competency
Database Science

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Access 2016: Queries

Learn how to find and translate complex raw data into information you can use
to make better decisions, with Access queries. Access expert Adam Wilbert
explains how to create real-world queries to filter and sort data and perform
calculations, as well as refine query results with built-in functions, all
while offering challenges that help you master the material. Find out how to
identify top performers, automate repetitive analysis tasks, make queries
more flexible with parameter requests, and increase accuracy and consistency
in your database using program flow functions. Adam closes with an assortment
of useful query tricks. Take the challenges posed along the way to test and
practice your new Access skills.

Competency
Database Science

Proficiency Level
Foundational

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Programming Foundations: Databases

This course covers the essential exploratory techniques for summarizing data.
These techniques are typically applied before formal modeling commences and
can help inform the development of more complex statistical models.
Exploratory techniques are also important for eliminating or sharpening
potential hypotheses about the world that can be addressed by the data. We
will cover in detail the plotting systems in R as well as some of the basic
principles of constructing data graphics. We will also cover some of the
common multivariate statistical techniques used to visualize high-dimensional
data.

Competency
Database Science

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Access 2019: Queries

Learn how to find and translate complex raw data into information you can use
to make better decisions. Access expert Adam Wilbert explains how to create
real-world queries to filter and sort data and perform calculations, as well
as refine query results with built-in functions, all while offering
challenges that help you master the material. Find out how to identify top
performers, automate repetitive analysis tasks, make queries more flexible
with parameter requests, and increase accuracy and consistency in your
database using program flow functions. Adam closes with an assortment of
useful query tricks. Take the challenges posed along the way to test and
practice your new Access skills.

Competency
Database Science

Proficiency Level
Foundational

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Biomedical Data Storage and Retrieval (with SQL and NoSQL Databases)

In the seminar, attendees will examine data storage and retrieval strategies,
use cases, and practical examples using databases (including MySQL and
MongoDB) as well as programming languages to access them (SQL and NoSQL).
Programming will be done using the Python programming language. Attendees
will receive a Jupyter Notebook with all steps taught in the class for
further study and practice.

Competency
Database Science

Proficiency Level
Foundational

Training Provider
HHS Learning Portal (main training landing page)
HHS Learning Portal (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
Familiarity with Python Programming Language

Cost
Free

Database Management Essentials

Database Management Essentials provides the foundation you need for a career
in database development, data warehousing, or business intelligence, as well
as for the entire Data Warehousing for Business Intelligence specialization.
In this course, you will create relational databases, write SQL statements to
extract information to satisfy business reporting requests, create entity
relationship diagrams (ERDs) to design databases, and analyze table designs
for excessive redundancy. As you develop these skills, you will use either
Oracle or MySQL to execute SQL statements and a database
diagramming tool such as the ER Assistant or Visual Paradigm to create ERDs.
We’ve designed this course to ensure a common foundation for specialization
learners. Everyone taking the course can jump right in with writing SQL
statements in Oracle or MySQL.

Competency
Database Science

Proficiency Level
Foundational

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

NoSQL for SQL Professionals

NoSQL databases can store non-relational data on a super large scale and can
solve problems regular databases can't handle: indexing the entire Internet,
predicting subscriber behavior, or targeting ads on a platform as large as
Facebook. But with over 150 NoSQL database types, it can be hard for a SQL
professional to know where to start. In this course, Lynn Langit breaks these
types down into five main categories and shows how to get your own NoSQL
database up and running with easy-to-configure cloud solutions. You'll learn
how to add and query data and examine case studies where NoSQL was used to
solve real-world data storage management issues. The final chapter contains
tips just for startup businesses that are considering their first NoSQL
solution. Course topics include what is NoSQL, what is Hadoop, exploring
Redis, HBase, MongoDB, and Neo4j, exploring NoSQL features in Microsoft SQL
Server, working with NoSQL data in the cloud, applying NoSQL choices to
business scenarios, and considering how data will be input and output at your
business.

Competency
Database Science

Proficiency Level
Foundational

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
Experience with software

Cost
Free

SQL Data Reporting and Analysis

Join Emma Saunders as she shows you how to design and write simple SQL
queries for data reporting and analysis. Review the different types of SQL,
and then learn how to filter, group, and sort data, using built-in SQL
functions to format or calculate results. Learn a bit about data types and
database design. Discover how to perform more complex queries, such as
joining data together from different database tables. Finally, Emma shows how
to save your queries as views, so you can run them again and again. Topics
include using different versions of SQL, retrieving data with SELECT
statements, filtering and sorting your results, transforming results with
built-in SQL functions, grouping SQL results, merging data from multiple
tables, identifying data types, making sense of your database design, and
saving SQL queries.

Competency
Database Science

Proficiency Level
Foundational

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Data Warehouse Concepts, Design, and Data Integration

In this course, you will learn exciting concepts and skills for designing
data warehouses and creating data integration workflows.

These are fundamental skills for data warehouse developers and
administrators. You will have hands-on experience for data warehouse design
and use open source products for manipulating pivot tables and creating data
integration workflows. You will also gain conceptual background about
maturity models, architectures, multidimensional models, and management
practices, providing an organizational perspective about data warehouse
development. If you are currently a business or information technology
professional and want to become a data warehouse designer or administrator,
this course will give you the knowledge and skills to do that. By the end of
the course, you will have the design experience, software background, and
organizational context that prepares you to succeed with data warehouse
development projects.

Competency
Database Science

Proficiency Level
Expert

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

Database Testing Using SQL Queries / MS Access

This course demonstrates all the basic and advanced level concepts including
SQL introduction, MS Access overview, creating a small database in MS Access,
writing SQL queries, different SQL clauses, updating/deleting databases, SQL
joins and joining multiple tables. This is a total package to make you an
expert database tester and SQL query writer. Just like our other courses,
most of the videos in this course are free and available for preview. SQL has
become essential for QA professionals in today’s market. In this course, our
team tried to make SQL look easy so that you can master the key concepts.
This course demonstrates hands-on use of SQL instead of going over theories.
This course should make you an expert in writing SQL queries within any type
of database. Whether you are new to the database industry or a software
tester, understanding SQL has become essential in today's competitive market.
Also, today's software testing industry requires extensive use of SQL as the
demand for database testing is booming every day. Moreover, this can be your
starting point to become a database developer or administrator.

Competency
Database Science

Proficiency Level
Full Performance

Training Provider
Udemy (main training landing page)
Udemy (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit, $12 for Verified Certificate

SQL Server: Developer and DBA Collaboration

Developers and database administrators (DBAs) sometimes have competing
priorities, but they need to meet in the middle to ensure a successful user
experience. In this one-hour class, you can learn why developers and DBAs
approach application design differently, and how you can achieve compromise
without sacrificing quality. Learn how to use Entity Framework and optimized
queries, leverage stored procedures, and set up continuous development and
deployment pipelines for your databases, while gaining some understanding
about how to work together to build more robust, scalable applications.

Competency
Database Science

Proficiency Level
Foundational

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Intermediate Access 2016

This instructor-led course is designed to help participants become
comfortable with advanced concepts and processes of databases (particularly
Microsoft Access). This course emphasizes specific procedures in database
development, database programming and database applications. The course will
concentrate on refining a database.

Competency
Database Science

Proficiency Level
Full Performance

Training Provider
HHS Learning Portal (main training landing page)
HHS Learning Portal (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
Introduction to Microsoft Access 2016 or equivalent

Cost
Free

Learning SQL Server Development on Linux

The release of Microsoft SQL Server 2017 introduces a wide range of new
features for developers, including the ability to leverage the power of the
SQL Server engine on Linux operating systems. In this course, get up to speed
with all of the exciting new features available in this platform, and learn
about SQL Server on Linux. Instructor Joey D'Antoni also covers adaptive
query processing, discussing batch mode adaptive joins, interleaved
execution, and automatic tuning. He wraps up the course with a discussion of
SQL Operations Studio, an open-source tool that you can use to manage SQL
Server and Azure SQL Database from any platform.

Competency
Database Science
Programming and Scripting

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

DevOps for Databases

This course examines the challenges and solutions of incorporating your
database into a DevOps software development process. This course will help
you understand the challenges of working with various data stores while
developing and changing your software at a rapid pace.

The course will cover committing database code to a version control system
(VCS), continuous integration, and unit testing database code.

Competency
Database Science

Proficiency Level
Expert

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
• Understand basic database development
• Understand relational database development
• Knowledge and experience using basic version control
• Knowledge of how to execute scripts against a database platform

Cost
Free to audit, $99 for Verified Certificate

Optimizing Performance for SQL Based Applications

Maximizing performance of your SQL based applications ranges from
optimization of your database, to various tools and techniques for monitoring
and tuning your environments. In this course you’ll learn techniques to run
highly performant applications that use SQL Server.

Competency
Database Science

Proficiency Level
Expert

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
• Knowledge of writing T-SQL queries
• Knowledge of basic relational database concepts

Cost
Free to audit, $99 for Verified Certificate

Relational Database Support for Data Warehouses

Relational Database Support for Data Warehouses is the third course in the
Data Warehousing for Business Intelligence specialization. In this course,
you'll use analytical elements of SQL for answering business intelligence
questions. You'll learn features of relational database management systems
for managing summary data commonly used in business intelligence reporting.
Because of the importance and difficulty of managing implementations of data
warehouses, we'll also delve into storage architectures, scalable parallel
processing, data governance, and big data impacts.

Competency
Database Science

Proficiency Level
Expert

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

SQL Server: Security for Developers

Learn how to protect databases and preserve the integrity of an
organization's data by configuring the security settings in SQL Server. This
course covers how to use built-in options on Microsoft platforms, including
Azure AD, to secure database and network infrastructure. Learn about
establishing users, assigning roles, and granting privileges. Find out how to
protect SQL Server from malicious injection by addressing vulnerabilities.
Discover how to encrypt data at rest and in transit. See how to build a
robust security model for your applications. Additionally, learn how to
encrypt connections and secure a network.

Competency
Database Science

Proficiency Level
Expert

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Machine Learning

Introduction to Machine Learning for Data Science

This introductory course will introduce you to the fundamentals that you need
before you start getting "hands on." This course will help students genuinely
understand what computer science, algorithms, programming, data, big data,
artificial intelligence, machine learning, and data science are; the impacts
machine learning and data science are having on society; what problems
machine learning can solve; how the machine learning process works; and how
to avoid problems with machine learning, so that you can successfully
implement it without losing your mind!

Competency
Machine Learning

Proficiency Level
Comprehension

Training Provider
Udemy (main training landing page)
Udemy (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit, $200 for Verified Certificate

Learning from Data Introductory Machine Learning

This introductory computer science course in machine learning will cover
basic theory, algorithms, and applications. Machine learning is a key
technology in Big Data, and in many financial, medical, commercial, and
scientific applications. It enables computational systems to automatically
learn how to perform a desired task based on information extracted from the
data. Machine learning has become one of the hottest fields of study today
and the demand for jobs is only expected to increase. Gaining skills in this
field will get you one step closer to becoming a data scientist or
quantitative analyst.

Competency
Machine Learning

Proficiency Level
Comprehension

Training Provider
California Institute of Technology (main training landing page)
California Institute of Technology (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Machine Learning

The first part of the course covers Supervised Learning, a machine learning
task that makes it possible for your phone to recognize your voice, your
email to filter spam, and for computers to learn a bunch of other cool stuff.

In part two, you will learn about Unsupervised Learning. Ever wonder how
Netflix can predict what movies you'll like? Or how Amazon knows what you
want to buy before you do? Such answers can be found in this section!

Finally, can we program machines to learn like humans? This Reinforcement
Learning section will teach you the algorithms for designing self-learning
agents like us!

Competency
Machine Learning

Proficiency Level
Comprehension

Training Provider
Udacity (main training landing page)
Udacity (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Introduction to Data Structures and Algorithms in Java

Enhance your programming skill set by learning about some of the most
commonly-used data structures and algorithms. In this course, instructor
Raghavendra Dixit walks through how to use Java to write code to implement
data structures and algorithms. After explaining why it's advantageous to
study these topics, he goes over the analysis of algorithms and discusses
arrays, a data structure found in most programming languages. He also
explains how to implement linked lists in Java, and covers stacks, queues,
recursion, binary search trees, heaps, and more.

Competency
Machine Learning

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Programming Fundamentals

Programming is an increasingly important skill, whether you aspire to a
career in software development, or in other fields. This course is the first
in the specialization Introduction to Programming in C, but its lessons
extend to any language you might want to learn. This is because programming
is fundamentally about figuring out how to solve a class of problems and
writing the algorithm, a clear set of steps to solve any problem in its
class. This course will introduce you to a powerful problem-solving process,
the Seven Steps, which you can use to solve any programming problem. In this
course, you will learn how to develop an algorithm, then progress to reading
code and understanding how programming concepts relate to algorithms.

Competency
Machine Learning

Proficiency Level
Comprehension

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit

Programming Foundations:
Algorithms

Algorithms are the universal building blocks of
programming. They power the software you use every
day, whether it's a spreadsheet, a social network, or a
driving assistant. Algorithms offer a way to think
about programming challenges in plain English,
before they are translated into a specific language
like C# or JavaScript. In this course, author and
developer Joe Marini explains some of the most
popular and useful algorithms for searching and
sorting information, working with techniques like
recursion, and understanding common data
structures. He also discusses the performance
implications of different algorithms and how to
evaluate the performance of a given algorithm. Each
algorithm is shown in practice in Python, but the
lessons can be applied to any programming language.

Competency
Machine Learning

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free

SQL for Exploratory Data Analysis
Essential Training

Learn how to use SQL to understand the
characteristics of data sets destined for data science
and machine learning. The course begins with an
introduction to exploratory data analysis and how it
differs from hypothesis-driven statistical analysis.
Instructor Dan Sullivan explains how SQL queries,
statistical calculations, and visualization tools
like Excel and R can help you verify data quality and
avoid incorrect assumptions. Next, find out how to
perform data-quality checks, reveal and recover
missing values, and check business logic. Discover
how to use box plots to understand non-normal
distribution of data and use histograms to
understand the frequency of data values attributes.
Dan also explains how to use the chi square test to
understand dependencies and measure correlations
between attributes. The course concludes with a
collection of tips and best practices for exploratory
data analysis.

Competency
Machine Learning

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free

Supervised Learning

Data science is opening exciting new opportunities,
especially for Microsoft developers who want to take
advantage of the artificial intelligence and machine
learning capabilities of Azure. This introductory
course provides an overview of the basic concepts
underlying Azure Machine Learning. Learn the
difference between supervised, unsupervised, and
reinforcement learning and important factors that
impact the success of any data science project: the
quality of the data you use, the questions you ask,
and the predictions you make. Instructor Sahil Malik
also reviews some of the machine learning
algorithms—clustering, anomaly detection,
classification, and regression—that are most relevant
to Azure.

Competency
Machine Learning
Statistical Modeling

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free

Artificial Intelligence Foundations:
Thinking Machines

This course will introduce you to some of the key
concepts behind artificial intelligence, including the
differences between "strong" and "weak" AI. You'll
see how AI has created questions around what it
means to be intelligent and how much trust we
should put in machines. Instructor Doug Rose
explains the different approaches to AI, including
machine learning and deep learning, and the
practical uses for new AI-enhanced technologies.
Plus, learn how to integrate AI with other technology,
such as big data, and avoid some common pitfalls
associated with programming AI.

Competency
Machine Learning

Proficiency Level
Foundational

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free

Machine Learning and AI Foundations:
Linear Regression

This intermediate-level course includes topics such
as building effective scatter plots in Chart Builder,
challenges and assumptions of multiple regression,
checking assumptions visually, creating dummy
codes, creating and testing interaction terms,
understanding partial and part correlations, spotting
problems and taking corrective action, and dealing with
multicollinearity.

Competency
Machine Learning
Statistical Modeling

Proficiency Level
Foundational

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free

Machine Learning and AI: Advanced
Decision Trees

This course shows how to use leading machine-learning
techniques—cluster analysis, anomaly
detection, and association rules—to get accurate,
meaningful results from big data. Instructor Keith
McCormick reviews the most common clustering
algorithms: hierarchical, k-means, BIRCH, and
self-organizing maps (SOM). He uses the same
algorithms for anomaly detection, with additional
specialized functions available in IBM SPSS Modeler.
This course also covers some basic data mining
activities. This course concludes with a review of
association rules and sequence detection and
provides some resources for learning more.

Competency
Machine Learning
Data Mining & Integration

Proficiency Level
Foundational

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior
Knowledge
Experience gathering and organizing data.
Understanding of creating formulas in Excel and
modifying charts.

Cost
Free

Machine Learning with Big Data

Want to make sense of the volumes of data you have
collected? Need to incorporate data-driven decisions
into your process? This course provides an overview
of machine learning techniques to explore, analyze,
and leverage data. You will be introduced to tools
and algorithms you can use to create machine
learning models that learn from data, and to scale
those models up to big data problems.

Competency
Machine Learning

Proficiency Level
Foundational

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit

Introduction to Deep Learning

The goal of this course is to give learners a basic
understanding of modern neural networks and their
applications in computer vision and natural language
understanding. The course starts with a recap of
linear models and discussion of stochastic
optimization methods that are crucial for training
deep neural networks. Learners will study all popular
building blocks of neural networks including fully
connected layers, convolutional and recurrent layers.
Learners will use these building blocks to define
complex modern architectures in TensorFlow and
Keras frameworks. In the course project, learners will
implement a deep neural network for the task of image
captioning, which solves the problem of giving a text
description for an input image.

Competency
Machine Learning
Database Science

Proficiency Level
Full Performance

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
• Basic knowledge of Python
• Basic linear algebra and probability
• Linear regression: mean squared error,
analytical solution
• Logistic regression: model, cross-entropy
loss, class probability estimation
• Gradient descent for linear models.
Derivatives of MSE and cross-entropy loss
functions
• The problem of overfitting
• Regularization for linear models

Cost
Free to audit

Machine Learning for Data Analysis

Are you interested in predicting future outcomes
using your data? This course helps you do just that!
Machine learning is the process of developing,
testing, and applying predictive algorithms to
achieve this goal. Make sure to familiarize yourself
with course 3 of this specialization before diving into
these machine learning concepts. Building on Course
3, which introduces students to integral supervised
machine learning concepts, this course will provide
an overview of many additional concepts, techniques,
and algorithms in machine learning, from basic
classification to decision trees and clustering. By
completing this course, you will learn how to apply,
test, and interpret machine learning algorithms as
alternative methods for addressing your research
questions.

Competency
Machine Learning
Database Science

Proficiency Level
Full Performance

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit

Quantum Machine Learning

The pace of development in quantum computing
mirrors the rapid advances made in machine
learning and artificial intelligence. It is natural to ask
whether quantum technologies could boost learning
algorithms: this field of inquiry is called
quantum-enhanced machine learning. The goal of this course
is to show what benefits current and future quantum
technologies can provide to machine learning,
focusing on algorithms that are challenging with
classical digital computers. We put a strong emphasis
on implementing the protocols, using open source
frameworks in Python. Prominent researchers in the
field will give guest lectures to provide extra depth to
each major topic. These guest lecturers include Alán
Aspuru-Guzik, Seth Lloyd, Roger Melko, and Maria
Schuld.

Competency
Machine Learning
Database Science

Proficiency Level
Full Performance

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
Experience with machine learning is recommended

Cost
Free to audit

Machine Learning

Machine Learning is the basis for the most exciting
careers in data analysis today. You’ll learn the
models and methods and apply them to real world
situations ranging from identifying trending news
topics, to building recommendation engines, ranking
sports teams and plotting the path of movie zombies.

Major perspectives covered include:
• probabilistic versus non-probabilistic modeling
• supervised versus unsupervised learning

Topics include: classification and regression,
clustering methods, sequential models, matrix
factorization, topic modeling and model selection.

Methods include: linear and logistic regression,
support vector machines, tree classifiers, boosting,
maximum likelihood and MAP inference, EM
algorithm, hidden Markov models, Kalman filters,
k-means, Gaussian mixture models, among others.

In the first half of the course we will cover supervised
learning techniques for regression and classification.
In this framework, we possess an output or response
that we wish to predict based on a set of inputs. We
will discuss several fundamental methods for
performing this task and algorithms for their
optimization. Our approach will be more practically
motivated, meaning we will fully develop a
mathematical understanding of the respective
algorithms, but we will only briefly touch on abstract
learning theory.

In the second half of the course we shift to
unsupervised learning techniques. In these problems
the end goal is less clear-cut than predicting an output
based on a corresponding input. We will cover three
fundamental problems of unsupervised learning:
data clustering, matrix factorization, and sequential
models for order-dependent data. Some applications
of these models include object recommendation and
topic modeling.

Competency
Machine Learning

Proficiency Level
Expert

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
Experience with machine learning is recommended

Cost
Free to audit, $300 for Verified Certificate

Operations Research

CS202: Discrete Structures

This course has been designed to provide you with a
clear, accessible introduction to discrete
mathematics. Discrete mathematics describes
processes that consist of a sequence of individual
steps (as compared to calculus, which describes
processes that change in a continuous manner). The
principal topics presented in this course are logic and
proof, induction and recursion, discrete probability,
and finite state machines. As you progress through
the units of this course, you will develop the
mathematical foundations necessary for more
specialized subjects in computer science, including
data structures, algorithms, and compiler design.
Upon completion of this course, you will have the
mathematical know-how required for an in-depth
study of the science and technology of the computer
age.

Competency
Operations Research

Proficiency Level
Comprehension

Training Provider
saylor.com (main training landing page)
saylor.com (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free

Data-Driven Decision-Making

Welcome to Data-Driven Decision Making. In this
course you'll get an introduction to Data Analytics
and its role in business decisions. You'll learn why
data is important and how it has evolved. You'll be
introduced to “Big Data” and how it is used. You'll
also be introduced to a framework for conducting
Data Analysis and what tools and techniques are
commonly used. Finally, you'll have a chance to put
your knowledge to work in a simulated business
setting.

Competency
Operations Research
Statistical Modeling

Proficiency Level
Comprehension

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit

Exploring and Producing Data for
Business Decision Making

This course provides an analytical framework to help
you evaluate key problems in a structured fashion
and will equip you with tools to better manage the
uncertainties that pervade and complicate business
processes. Specifically, you will be introduced to
statistics and how to summarize data and learn
concepts of frequency, normal distribution, statistical
studies, sampling, and confidence intervals.

Competency
Operations Research

Proficiency Level
Comprehension

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit

Introduction to Computational
Thinking

Computational thinking is critical for solving
problems and using data effectively in modern
society, but what is computational thinking anyway?
Computational thinking is really a way to solve
problems by specifying detailed, step-by-step
solutions to those problems; collecting, representing,
and analyzing data to support drawing conclusions
or making decisions; and using a variety of
techniques to improve the efficiency of our problem
solutions.

This course is designed to help you learn key
computational thinking topics and develop your
skills in those areas.

Competency
Operations Research

Proficiency Level
Comprehension

Training Provider
Udemy (main training landing page)
Udemy (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit, $20 for Verified Certificate

Basic Statistics

Understanding statistics is essential to understand
research in the social and behavioral sciences. In this
course you will learn the basics of statistics; not just
how to calculate them, but also how to evaluate
them. This course will also prepare you for the next
course in the specialization - the course Inferential
Statistics.

Competency
Operations Research

Proficiency Level
Basic

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit

Introduction to Statistics and Data
Analysis in Public Health

This course will teach you the core building blocks of
statistical analysis - types of variables, common
distributions, hypothesis testing - but, more than
that, it will enable you to take a data set you've never
seen before, describe its key features, get to know its
strengths and quirks, run some vital basic analyses
and then formulate and test hypotheses based on
means and proportions. You'll then have a solid
grounding to move on to more sophisticated analysis
and take the other courses in the series. You'll learn
the popular, flexible and completely free software R,
used by statistics and machine learning practitioners
everywhere. It's hands-on, so you'll first learn
how to phrase a testable hypothesis via examples of
medical research as reported by the media. Then
you'll work through a data set on fruit and vegetable
eating habits: data that are realistically messy,
because that's what public health data sets are like in
reality. There will be mini-quizzes with feedback
along the way to check your understanding. The
course will sharpen your ability to think critically and
not take things for granted: in this age of
uncontrolled algorithms and fake news, these skills
are more important than ever.

Competency
Operations Research

Proficiency Level
Basic

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit

Learning Excel Data Analysis

Microsoft Excel is an important tool for information
workers who design and perform data analysis. This
course provides an overview of the fundamentals,
from performing common calculations to conducting
Bayesian analysis with Excel. Author Curt Frye starts
with the foundational concepts, including an
introduction to the central limit theorem, and then
shows how to visualize data, relationships, and
future results with Excel's histograms, graphs, and
charts. He also covers testing hypotheses, modeling
different data distributions, and calculating the
covariance and correlation between data sets. The
course closes with a look at calculating Bayesian
probabilities in Excel.

Competency
Operations Research

Proficiency Level
Basic

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior
Knowledge
Experience gathering and organizing data.
Understanding of creating formulas in Excel and
modifying charts.

Cost
Free

Big Data Foundations: Program
Management

Big data is a game changer. It's brought an entirely
new era of data-driven insights to companies in all
industries. But the greatest value in this new era
comes from a world-class big data program: one that
is meticulously planned and then actively managed
for success. Without it, your company might not be
capturing all the insights big data can offer.

Competency
Operations Research

Proficiency Level
Foundational

Training Provider Course Prerequisites or Preferred Prior
LinkedIn Learning (main training landing page) Knowledge
LinkedIn Learning (specific course description) N/A

Course Delivery Format Cost


Virtual, self-paced Free

Course Prerequisites or Preferred Prior Excel Data Analysis: Forecasting


Knowledge
N/A Professor Wayne Winston has taught advanced
forecasting techniques to Fortune 500 companies for
Cost
more than twenty years. In this course, he shows how
Free
to use Excel's data-analysis tools—including charts,
formulas, and functions—to create accurate and
Business Analytics: Forecasting with insightful forecasts. Learn how to display time-series
Exponential Smoothing data visually; make sure your forecasts are accurate,
by computing for errors and bias; use trendlines to
Exponential smoothing is a term for a set of identify trends and outlier data; model growth;
straightforward forecasting procedures that apply account for seasonality; and identify unknown
self-correction. Each forecast comprises two variables, with multiple regression analysis. A series
components. It's a weighted average of the prior of practice challenges along the way helps you test
forecast, plus an adjustment that would have made your skills and compare your work to Wayne's
the prior forecast more accurate. Smoothing—like solutions.
most credible approaches to forecasting—requires a
baseline of observations, in sequence, to work Competency
properly. Weekly revenues and daily hospital Operations Research
admissions are typical examples. Several versions of
exponential smoothing exist, each corresponding to Proficiency Level
a type of baseline. In this course, Conrad Carlberg Foundational
introduces simple exponential smoothing, diving into
the basic idea behind it, and explaining how to Training Provider
assemble the forecast equation and optimize LinkedIn Learning (main training landing page)
forecasts. LinkedIn Learning (specific course description)
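The simple exponential smoothing idea described in this entry can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not material from the course: it assumes the common textbook form in which each new forecast equals the prior forecast plus a fraction (alpha) of the prior forecast's error. The function name, the choice to seed the first forecast with the first observation, and the sample revenue figures are all hypothetical.

```python
def simple_exponential_smoothing(observations, alpha):
    """Return one-step-ahead forecasts for a baseline of observations.

    forecasts[t] is the forecast for period t; the final element is the
    forecast for the period after the last observation.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if not observations:
        return []

    # Assumption: seed the first forecast with the first observation.
    forecasts = [float(observations[0])]
    for actual in observations:
        prior = forecasts[-1]
        # New forecast = prior forecast + alpha * (prior forecast error),
        # which equals alpha * actual + (1 - alpha) * prior.
        forecasts.append(prior + alpha * (actual - prior))
    return forecasts


# Hypothetical weekly revenues, for illustration only.
weekly_revenue = [100.0, 110.0, 105.0]
print(simple_exponential_smoothing(weekly_revenue, alpha=0.5))
```

A larger alpha makes the forecast react faster to recent observations; a smaller alpha smooths more heavily. Choosing alpha to minimize forecast error is the optimization step the course description mentions.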

Competency Course Delivery Format


Operations Research Virtual, self-paced

Proficiency Level Course Prerequisites or Preferred Prior


Foundational Knowledge
N/A
Training Provider
LinkedIn Learning (main training landing page) Cost
LinkedIn Learning (specific course description) Free

Course Delivery Format Financial Forecasting with Big Data


Virtual, self-paced
Big data is transforming the world of business. Yet
many people don't understand what big data and

business intelligence are, or how to apply the
techniques to their day-to-day jobs. This course
addresses that knowledge gap, giving businesspeople
practical methods to create quick and relevant
business forecasts using big data.

Competency
Operations Research

Proficiency Level
Foundational

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free

NIH Data Analysis Essentials

This class targets budget analysts,
program/management analysts, and others who seek
to enhance their data analysis skills. This 2-day
course is designed to go beyond the qualitative side
of data analysis. Course topics include:
• Interpreting and translating data into decisions
• Using data and statistics effectively in business
• Understanding the consequences of improper data
manipulation
• Becoming familiar with quantitative data collection
methods
• Improving analysis success by using software
effectively
• Understanding regression, trend lines, and
scenarios in Excel
• Finding and analyzing data patterns, trends, and
fluctuations

Competency
Operations Research

Proficiency Level
Foundational

Training Provider
HHS Learning Portal (main training landing page)
HHS Learning Portal (specific course description)

Course Delivery Format
In-person, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
$799

Prescriptive Analytics: Business
Analytics

Everyone is talking about big data these days, but
that's just the starting point for drawing high-value,
actionable insights from your organization's data.
This course takes viewers through the entire
analytics lifecycle and workflow—beyond today's
hype and buzzwords—and describes how any
organization can turn their investments in big data
into the actionable insights they really need. Author
Alan Simon introduces today's relevant technologies
and shows how best to apply them to specific
business problems and opportunities within their
organization. By the end of the course, viewers will
understand how the different classes of analytics—
descriptive, predictive, and discovery—can lead to
prescriptive action.

Competency
Operations Research

Proficiency Level
Foundational

Training Provider
LinkedIn Learning (main training landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free

Algorithms Specialization

Algorithms are the heart of computer science, and
the subject has countless practical applications as
well as intellectual depth. This specialization is an
introduction to algorithms for learners with at least a
little programming experience. The specialization is
rigorous but emphasizes the big picture and
conceptual understanding over low-level
implementation and mathematical details. After
completing this specialization, you will be
well-positioned to ace your technical interviews and speak
fluently about algorithms with other programmers
and computer scientists.

Competency
Operations Research

Proficiency Level
Full Performance

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit

Analysis of Algorithms

This course teaches a calculus that enables precise
quantitative predictions of large combinatorial
structures. In addition, this course covers generating
functions and real asymptotics and then introduces
the symbolic method in the context of applications in
the analysis of algorithms and basic structures such
as permutations, trees, strings, words, and
mappings.

Competency
Operations Research

Proficiency Level
Full Performance

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit

Introduction to Data Modeling

The role of the data modeler has become even more
critical to the ongoing lifecycle of development and
maintenance, especially in this age of digital
transformation. Analysts, developers, DBAs, and BI
professionals need to develop their skills in analyzing
and modeling data. Whether working with new or
legacy data, you must define rules for quality,
retention, and protection. And you need a good
foundation of data and data design concepts before
you begin sourcing, preparing, and manipulating
data.

In this introductory course, learn how logical and
physical data modeling can give you a better
understanding of your organization's data, business
rules, and information architecture decisions.
Examine how data models are critical to your data
security, privacy, and compliance posture. And get
hands-on with real-world data—analyze it,
implement business requirements, develop data
models, and forward and reverse-engineer SQL
Server databases.

Note: To complete the hands-on requirements, you’ll
work with Office 365, Visual Studio, and Azure SQL
Database. Free or limited-time trials are available for
these products. You will require an Azure
subscription. You can sign up for a free Azure trial
subscription (a valid credit card is required for
verification, but you will not be charged for Azure
services). Note that the free trial is not available in all
regions. It is possible to complete the course and
earn a certificate without completing the hands-on
practices.

Competency
Operations Research

Proficiency Level
Full Performance

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit, $99 for Verified Certificate

Probabilistic Graphical Models 1:
Representation

This course is the first in a sequence of three. It
describes the two basic PGM representations:
Bayesian Networks, which rely on a directed graph;
and Markov networks, which use an undirected
graph. The course discusses both the theoretical
properties of these representations as well as their
use in practice. The (highly recommended) honors
track contains several hands-on assignments on how
to represent some real-world problems. The course
also presents some important extensions beyond the
basic PGM representation, which allow more
complex models to be encoded compactly.

Competency
Operations Research

Proficiency Level
Full Performance

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free to audit

Computing for Data Analysis

The modern data analysis pipeline involves
collection, preprocessing, storage, analysis, and
interactive visualization of data.
The goal of this course, part of the Analytics:
Essential Tools and Methods Micro-Masters
program, is for you to learn how to build these
components and connect them using modern tools
and techniques.

In the course, you’ll see how computing and
mathematics come together. For instance, “under the
hood” of modern data analysis lie numerical linear
algebra, numerical optimization, and elementary
data processing algorithms and data structures.
Together, they form the foundations of numerical
and data-intensive computing.

The hands-on component of this course will develop
your proficiency with modern analytical tools. You
will learn how to mash up Python, R, and SQL
through Jupyter notebooks, among other tools.
Furthermore, you will apply these tools to a variety of
real-world datasets, thereby strengthening your Competency
ability to translate principles into practice. Operations Research

Competency Proficiency Level


Operations Research Expert
Advanced Mathematics
Training Provider
Proficiency Level Coursera (main training landing page)
Expert Coursera (specific course description)

Training Provider Course Delivery Format


edX (main training landing page) Virtual, instructor-led
edX (specific course description)
Course Prerequisites or Preferred Prior
Course Delivery Format Knowledge
Virtual, instructor-led N/A

Course Prerequisites or Preferred Prior Cost


Knowledge Free to audit
N/A

Cost
Free to audit, $500 for Verified Certificate

Data Structures and Algorithms Specialization
You've learned the basic algorithms now and are
ready to step into the area of more complex problems
and algorithms to solve them. Advanced algorithms
build upon basic ones and use new ideas. We will
start with network flows, which are used in more
typical applications such as optimal matchings,
finding disjoint paths and flight scheduling as well as
more surprising ones like image segmentation in
computer vision. We then proceed to linear
programming with applications in optimizing budget
allocation, portfolio optimization, finding the
cheapest diet satisfying all requirements and many
others. Next, we discuss inherently hard problems
for which no exact good solutions are known (and
not likely to be found) and how to solve them in
practice. We finish with a soft introduction to
streaming algorithms that are heavily used in Big
Data processing. Such algorithms are usually
designed to be able to process huge datasets without
being able even to store a dataset.

Programming and Scripting
Computer Science 101: Computers An entry-level course taught by David J. Malan,
and Programming for Beginners CS50x teaches students how to think algorithmically
and solve problems efficiently. Topics include
Computer Science is a topic that is becoming more abstraction, algorithms, data structures,
and more relevant. Whether in college, school or at encapsulation, resource management, security,
work. Computers and computer programs are software engineering, and web development.
everywhere in our everyday lives. We use software in Languages include C, Python, SQL, and JavaScript
smartphones, ATMs and even household appliances plus CSS and HTML. Problem sets inspired by real-
that we are able to control with our smartphones. A world domains of biology, cryptography, finance,
lot of people are confused and don't really know how forensics, and gaming. The on-campus version of
all of this actually works. CS50x, CS50, is Harvard's largest course.

This compact course will take you from zero Competency


knowledge to having a solid understanding of the Programming and Scripting
basic concepts of computer science and Computer Science
programming languages.
Proficiency Level
Competency Comprehension
Programming and Scripting
Training Provider
Proficiency Level edX (main training landing page)
Comprehension edX (specific course description)

Training Provider Course Delivery Format


Udemy (main training landing page) Virtual, instructor-led
Udemy (specific course description)
Course Prerequisites or Preferred Prior
Course Delivery Format Knowledge
Virtual, instructor-led N/A

Course Prerequisites or Preferred Prior Cost


Knowledge Free to audit, $90 for Verified Certificate
N/A
Inferential Statistics
Cost
$12 This course covers commonly used statistical
inference methods for numerical and categorical
CS50's Introduction to Computer data. You will learn how to set up and perform
Science hypothesis tests, interpret p-values, and report the
results of your analysis in a way that is interpretable
This is CS50x, Harvard University's introduction to for clients or the public. Using numerous data
the intellectual enterprises of computer science and examples, you will learn to report estimates of
the art of programming for majors and non-majors quantities in a way that expresses the uncertainty of
alike, with or without prior programming experience. the quantity of interest. You will be guided through
installing and using R and RStudio (free statistical



software) and will use this software for lab exercises Course Delivery Format
and a final project. The course introduces practical Virtual, self-paced
tools for performing data analysis and explores the
fundamental concepts necessary to interpret and Course Prerequisites or Preferred Prior
report results for both categorical and numerical Knowledge
data. It is highly recommended that you take the
Computer Science Principles: Programming Course
Competency first or you complete it simultaneously while
Programming and Scripting completing this course.

Proficiency Level Cost


Comprehension Free

Training Provider Data Science Bootcamp CoLab


Coursera (main training landing page)
Coursera (specific course description) This course would be right for you if: you have ever
wondered how to improve processes and analyze
Course Delivery Format large amounts of data. CoLab is a problem-solving
Virtual, instructor-led laboratory experience housed in BARDA's
Visualization Hub. Students spend 2 months
Course Prerequisites or Preferred Prior committing sixteen hours per week to applying
Knowledge transferable data science techniques. This course
N/A brings in outside experts to provide high-quality
instruction. This course covers data wrangling
Cost (mapping large amounts of data on an automated
Free to audit scale into a more easily readable format), predictive
analytics (gathering data across the public internet
Computer Science Principles Lab: and several servers to calculate probabilities of
JavaScript certain events occurring), visualization (displaying a
3D or 2D file format that allows viewers to easily
This beginner-level computer science course features infer results), machine learning (teaching computer
topics such as the history of JavaScript, setting up servers to use previous results from data to inform
the development environment, working with future computer-led tasks, often in the form of
variables and values, using customizing functions, algorithms rather than explicit code). With the
creating conditional tests, using loops, and creating guidance of industry-leading experts, participants will
and changing arrays. complete objective-based projects that will improve
their skillsets and communities. This course
Competency concludes with a capstone project, which aims to
Programming and Scripting optimize a task within the participant's agency using
data science. The benefits of participating in the Data
Proficiency Science CoLab include developing new skillsets,
Basic gaining practical solutions, reducing error, creating
community (community of HHS employees as skilled
Training Provider data scientists who continuously learn from experts
LinkedIn Learning (training main landing page) and from each other).
LinkedIn Learning (specific course description)
Competency
Programming and Scripting



Proficiency types. Participants will receive an overview of the
Basic major R data types, R data frames, and the primary
methods for importing data into RStudio. This one-
Training Provider and-a-half-hour session will additionally cover
Confluence (main training landing page) strategies for dealing with missing data in R.
Confluence (specific course description)
Competency
Course Delivery Format Programming and Scripting
In-person, instructor-led
Proficiency
Course Prerequisites or Preferred Prior Basic
Knowledge
Having experience with a college level statistics class Training Provider
is preferred. NIH Library (main training landing page)
NIH Library (specific course description)
Cost
Free Course Delivery Format
In-person, instructor-led
Introduction to R and RStudio
Course Prerequisites or Preferred Prior
The class is part of our Introduction to R Series. Knowledge
These classes are designed to teach non- N/A
programmers to write modular code and to introduce
best practices for using R for data analysis and data Cost
visualization. Free

Competency Introduction to R: Data Wrangling


Programming and Scripting
This course, designed especially for non-
Proficiency programmers, will help you get started with the
Basic basics of using R. Using RStudio, a user-friendly
interface for R, this session will cover key
Training Provider terminology and concepts, essentials of data
NIH Library (main training landing page) processing and wrangling, along with
NIH Library (specific course description) troubleshooting and getting help.

Course Delivery Format Competency


In-person, instructor-led Programming and Scripting
Data Mining & Integration
Course Prerequisites or Preferred Prior
Knowledge Proficiency
N/A Basic

Introduction to R Data Types Training Provider


NIH Library (main training landing page)
This hands-on introductory class is the second class NIH Library (specific course description)
in the introductory R series which provides non-
programmers with a painless introduction to R data



Course Delivery Format Course topics include: enabling Firebug and web
Virtual, instructor-led inspectors, using a text editor, declaring and
assigning a variable, Booleans and the quest for
Course Prerequisites or Preferred Prior truth, working with objects and arrays, using
Knowledge operators and control structures, iterating with
N/A loops, objects, references and functions, and
understanding variable scope.
Cost
Free Competency
Programming and Scripting
Introduction to Text Mining in R
Proficiency
Many applications in bioinformatics and portfolio Basic
analysis involve extracting meaning from large
volumes of natural language text. This course will Training Provider
introduce some of the tools available for text mining LinkedIn Learning (training main landing page)
in the R programming language. By the end of the LinkedIn Learning (specific course description)
course, you will be able to obtain text data via API,
process text for analysis, extract frequently occurring Course Delivery Format
terms and term associations, and cluster documents Virtual, self-paced
by topic. Note: this course assumes that you already
have a basic understanding of R and RStudio. Course Prerequisites or Preferred Prior
Knowledge
Competency N/A
Programming and Scripting
Data Mining & Integration Cost
Free
Proficiency
Basic Pandas Essential Training
Training Provider This intermediate-level course dives into topics such
NIH Library (main training landing page) as Data Frames, basic plotting, indexing, and group
NIH Library (specific course description) by. To help students learn how to work with data
more effectively, this course takes students through a
Course Delivery Format series of exercises that are based on the same large,
Virtual, instructor-led public data set: the Olympic medal winners from
1896 to 2008. Students will learn how to use the
Course Prerequisites or Preferred Prior pandas library and tools for data analysis and data
Knowledge structuring.
N/A
Competency
Cost Programming and Scripting
Free Statistical Modeling

Learning the JavaScript Language Proficiency


Basic
This beginner-level computer science course will
walk the student through JavaScript syntax basics.



Training Provider Course Delivery Format
LinkedIn Learning (training main landing page) Virtual, self-paced
LinkedIn Learning (specific course description)
Course Prerequisites or Preferred Prior
Course Delivery Format Knowledge
Virtual, self-paced N/A

Course Prerequisites or Preferred Prior Cost


Knowledge Free
N/A
Cost
Free

Python for Data Science Essential Training
In this beginner-level course, students learn how to
Programming Foundations: use Python for data preparation, data munging, data
Fundamentals visualization, and predictive analytics. Instructor
Lillian Pierson, P.E. covers the essential Python
This beginner-level course provides the core methods for preparing, cleaning, reformatting, and
knowledge to begin programming in any language. visualizing your data for use in analytics and data
The course instructor uses JavaScript to explore the science. Students gain a working understanding of
core syntax of a programming language and shows how machine learning, as well as outlier analysis, cluster
to write and execute your first application. The analysis, and network analysis. Students learn how
course covers creating small programs to explore to create web-based data visualizations with Plot.ly,
conditions, loops, variables, and expressions. Finally, and how to use Python to scrape the web and capture
the course compares how code is written in several their own data sets. Course topics include getting
different languages, the libraries and frameworks started with Jupyter Notebooks; visualizing data:
that have grown around them, and the reasons to basic charts, time series, and statistical plots;
choose each one. Course topics include writing preparing for analysis: treating missing values and
source code, understanding compiled and data transformation; data analysis basics: arithmetic,
interpreted languages, requesting input, working summary statistics, and correlation analysis; outlier
with numbers, characters, strings, and operators, analysis: univariate, multivariate, and linear
writing conditional code, making the code modular, projection methods; introduction to machine learning;
writing loops, finding patterns in strings, working basic machine learning methods: linear and logistic
with arrays and collections, adopting a programming regression, Naïve Bayes; reducing dataset
style, reading and writing to various locations, dimensionality with PCA; clustering and
debugging, managing memory usage, and learning about classification: k-means, hierarchical, and k-NN;
other languages. simulating a social network with NetworkX; creating
Plot.ly charts; and scraping the web with Beautiful Soup.
Competency
Programming and Scripting Competency
Programming and Scripting
Proficiency Data Visualization
Basic
Proficiency
Training Provider Basic
LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)



Training Provider
LinkedIn Learning (training main landing page) Cost
LinkedIn Learning (specific course description) Free

Course Delivery Format Becoming a Reproducible Scientist


Virtual, self-paced
(with Jupyter Notebooks, Git, Python) -
Course Prerequisites or Preferred Prior Part 1
Knowledge
N/A Attendees will explore reasons for reproducible
science and delve into practical exercises that will
Cost allow you to enhance your data analysis with good,
Free better, and best practices. Topics include
introduction to Jupyter Notebook, basic intro to the
Software Carpentry (Two-Day python programming language, data and project
organization, data exploration, automation,
Workshop) publishing, and sharing. This day-long workshop will
condense the Data Carpentry Reproducible Science
The NIH Library is pleased to host a two-day Jupyter workshop held in Berkeley, CA in 2017 &
Software Carpentry workshop that will cover basic 2018 (https://github.com/Reproducible-Science-
software concepts and tools, including program Curriculum). Students will receive a Jupyter
design, version control, data management, and task Notebook for further study and practice.
automation. Software Carpentry
aims to help researchers get their work done in less Competency
time and with less pain by teaching them basic Programming and Scripting
research computing skills. This hands-on workshop
will cover basic concepts and tools, including Proficiency Level
program design, version control, data management, Foundational
and task automation. Participants will be encouraged
to help one another and to apply what they have Training Provider
learned to their own research problems. HHS Learning Portal (main training landing page)
HHS Learning Portal (specific course description)
Competency
Programming and Scripting Course Delivery Format
Computer Science Virtual, instructor-led
Proficiency Course Prerequisites or Preferred Prior
Basic Knowledge
N/A
Training Provider
NIH Library (main training landing page) Cost
NIH Library (specific course description) Free
Course Delivery Format
In-person, instructor-led
Bioinformatics Programming with
Python
Course Prerequisites or Preferred Prior
Knowledge In this seminar, we will explore the bioinformatics
N/A capabilities of the python programming language.



We will examine some of the many capabilities of the Training Provider
biopython package as well as delve into the bioconda HHS Learning Portal (main training landing page)
project for easy bioinformatics software installation. HHS Learning Portal (specific course description)
Students will receive a Jupyter Notebook with all
steps taught in the class for further study and Course Delivery Format
practice. Virtual, self-paced

Competency Course Prerequisites or Preferred Prior


Programming and Scripting Knowledge
▪ Introduction to the Command Line webinar-
Proficiency Level http://bioinformatics.niaid.nih.gov
Foundational ▪ Introduction to Jupyter notebook
▪ Introduction to Programming (with Python)
Training Provider or prior programming experience required
HHS Learning Portal (main training landing page)
HHS Learning Portal (specific course description) Cost
Free
Course Delivery Format
Virtual, instructor-led Cleaning Bad Data in R
Course Prerequisites or Preferred Prior Data integrity is the new focal point of the data
Knowledge science revolution. Now that everybody is onboard
▪ Introduction to the Command Line webinar- with the role of data in people's lives and business,
http://bioinformatics.niaid.nih.gov it's not an unfair question to ask, "Can you prove that
▪ Introduction to Jupyter notebook your data is accurate?" In this course, you can learn
▪ Introduction to Programming (with Python) how to identify and address many of the data
or prior programming experience required integrity issues facing modern data scientists, using
R and the tidyverse. Discover how to handle missing
Cost values and duplicated data. Find out how to convert
Free data between different units and tackle poorly
formatted text. Plus, learn how to detect outliers,
Building Workflows in Python address structural issues, and identify red flags that
indicate potential data quality issues.
In this seminar, we will assemble a working pipeline
based upon the steps we have learned in the previous Competency
seminars while surveying a few methods for building Programming and Scripting
workflows, including the Snakemake package. Data Mining and Integration
Students will receive a Jupyter Notebook with all
steps taught in the class for further study and Proficiency
practice. Foundational

Competency Training Provider


Programming and Scripting LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)
Proficiency
Foundational Course Delivery Format
Virtual, self-paced



Course Prerequisites or Preferred Prior
Knowledge
It is assumed that you already have a basic
knowledge of data analytics. Familiarity with the R
programming language, RStudio IDE, and the
Tidyverse data wrangling packages is also preferred
to be prepared for this course.

Cost
Free

Intermediate Python Programming and
Best Practices
Now that you know the basics of the python
programming language, let's take the language to the
next level. In the seminar, we will explore some
advanced features of the python programming
language and we will also explore some best practices
to make the best use of the python programming
language. Students will receive a Jupyter Notebook
with all steps taught in the class for further study and
practice.

Data Visualization in R in ggplot
The R series is a comprehensive collection of training Competency
sessions designed to teach non-programmers how to Programming and Scripting
write modular code and to introduce best practices
for using R for data analysis and data visualization. Proficiency
Each class uses both evidence-based best practices Foundational
for programming and practical hands-on lessons.
This class provides a basic overview of using R to Training Provider
create data visualizations. Participants will become HHS Learning Portal (main training landing page)
familiar with using R to produce scatter plots, HHS Learning Portal (specific course description)
boxplots, and time series plots using ggplot.
Course Delivery Format
Competency In-person, instructor-led
Programming and Scripting
Data Visualization Course Prerequisites or Preferred Prior
Knowledge
Proficiency Completion of Introduction to R for Non-
Foundational programmers or basic proficiency with R; general
knowledge of statistical methods.
Training Provider
NIH Library (main training landing page) Cost
NIH Library (specific course description) Free
Course Delivery Format
In-person, instructor-led Introduction to the Command Line (and
UNIX)
Course Prerequisites or Preferred Prior
Knowledge The command line is a quick, powerful, text-based
N/A interface scientists and programmers use to
effectively and efficiently communicate with
Cost computers to accomplish a wide set of tasks.
Free Learning how to use it will enable you to effectively
write programs as you discover all that your
computer is capable of! You will learn about the
terminal, (bash) shell, prompt, how to get to the
command line and how to get out! Being able to use



the command line is important for navigating Competency
directories and working with files. We will also cover Programming and Scripting
some very useful UNIX utilities. This course is a Data Visualization
prerequisite for the remainder of the Python
Programming for Scientific seminars. Proficiency
Foundational
Competency
Programming and Scripting Training Provider
NIH Library (main training landing page)
Proficiency Level NIH Library (specific course description)
Foundational
Course Delivery Format
Training Provider In-person, instructor-led
HHS Learning Portal (main training landing page)
HHS Learning Portal (specific course description) Course Prerequisites or Preferred Prior
Knowledge
Course Delivery Format N/A
Virtual, self-paced
Cost
Course Prerequisites or Preferred Prior Free
Knowledge
N/A Introduction to Data Visualization in R:
Cost
ggplot (Part 2)
Free
This hands-on advanced class is presented in two
parts and introduces participants to data
Introduction to Data Visualization in R: visualization in R using the ggplot package.
ggplot (Part 1) Participants will learn how to use the multifunctional
tool ggplot to create charts, graphs and other
The goal of this session is to teach non-programmers visualization modes of their data. The first one-and-
to write modular code and to introduce best practices a-half-hour session provides a basic overview of the
for using R for data analysis. The introduction to R basic graphing and formatting functions in ggplot.
series is divided into three topic areas: (1) The second session builds on the previous session to
Introduction to R, (2) Introduction to Data create more complex modes to effectively present
Wrangling in R, and (3) Data Visualization in R. data in a visual form (this is the second session).
Each topic area is divided into two separate classes. By
the end of this class students should be able to Competency
organize their folders and files using RStudio Programming and Scripting
projects, discuss options for plotting in R, work with Data Visualization
color in R, create a histogram in base R, explain the
various ways that R handles colors, add multiple Proficiency
colors to a plot in R, use color palettes in R, use the Foundational
RColorBrewer package, understand the basics of plotting
in ggplot, demonstrate how to add a layer to a plot Training Provider
using ggplot, define plot aesthetics, add a geometric NIH Library (main training landing page)
function to a plot, modify a basic plot. NIH Library (specific course description)



Course Delivery Format SciPy, building a classifier, clustering data, working
In-person, instructor-led with big data and Spark, using MLlib, beginning with
Spark.
Course Prerequisites or Preferred Prior
Knowledge Competency
N/A Programming and Scripting

Cost Proficiency
Free Foundational

Learning Python Training Provider


LinkedIn Learning (training main landing page)
This introductory 2-hour course features topics such LinkedIn Learning (specific course description)
as installing Python, choosing an editor or IDE,
working with variables and expressions, writing Course Delivery Format
loops, using the date, time, and datetime classes, Virtual, self-paced
reading and writing files, fetching internet data,
and parsing and processing HTML. Course Prerequisites or Preferred Prior
Knowledge
Competency N/A
Programming and Scripting
Data Mining & Integration Cost
Free
Proficiency
Foundational R for Data Science: Lunchbreak
Lessons
Training Provider
LinkedIn Learning (training main landing page) Topics for this intermediate-level course include
LinkedIn Learning (specific course description) working with R built-in datasets, vector math,
subsetting, vectors, lists, matrices and more.
Course Delivery Format
Virtual, self-paced Competency
Programming and Scripting
Course Prerequisites or Preferred Prior
Knowledge Proficiency
N/A Foundational
Cost Training Provider
Free LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)
Learning Python for Data Science,
with Tim Fox and Elephant Scale Course Delivery Format
Virtual, self-paced
This beginner-level course shows the participant how
to use and derive information from datasets using Course Prerequisites or Preferred Prior
Python. Course topics include configuring your Knowledge
system, setting up labs, using pandas, NumPy, and N/A



Cost performing a wide variety of tasks, but sometimes a
Free dataset or analysis requires a function that does not
yet exist in R. This hands-on class will provide an
Unit Testing and Test-Driven overview of creating and using custom functions in
R. Participants will learn how to write their own R
Development in Python functions, use these functions in R sessions, and save
them for future use. This is an intermediate course
Every software developer wants to ship high-quality that assumes participants are already comfortable
applications. Test-driven development (TDD) is a key using R and RStudio for basic tasks.
discipline that can help you enhance your
development process—and, in turn, your code base— Competency
by ensuring that crashes and bugs are addressed Programming and Scripting
early on. In this course, join Richard Wells as he
covers unit testing and TDD for Python projects. Proficiency
Richard provides an overview of both unit testing Foundational
and TDD, explaining why both are crucial for
developers. He also shows how to set up your
development environment for TDD and goes over the
pytest unit-testing framework. Throughout the
course, he shares best practices and provides
examples and test cases that can help you gain a
practical understanding of TDD in Python.

Competency
Programming and Scripting

Proficiency
Foundational

Training Provider
LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)

Training Provider
NIH Library (main training landing page)
NIH Library (specific course description)

Course Delivery Format
In-person and virtual, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
This course assumes that participants are already
comfortable with using R and RStudio for basic
tasks.

Cost
Free

Course Delivery Format Advanced Java Programming


Virtual, self-paced
Java Advanced Training shows developers how to
Course Prerequisites or Preferred Prior expand their programming skills and get more out of
Knowledge Java. This course offers platform- and framework-
N/A neutral tutorials that can be used to build web,
mobile, and desktop applications. Starting with
Cost advanced methods of defining Java classes and
Free programmatic flow, author David Gassner goes on to
describe the Java Reflection API and the Collections
Framework; management of files and directories;
Writing Custom Functions in R test-driven development with advanced exception
handling and reporting; and how to work with
R is a programming language and open source multiple threads.
environment for statistical computing and graphics.
It has tens of thousands of functions available for



Competency Training Provider
Programming and Scripting HHS Learning Portal (main training landing page)
HHS Learning Portal (specific course description)
Proficiency
Full Performance Course Delivery Format
Virtual, self-paced
Training Provider
LinkedIn Learning (training main landing page) Course Prerequisites or Preferred Prior
LinkedIn Learning (specific course description) Knowledge
N/A
Course Delivery Format
Virtual, self-paced Cost
Free
Course Prerequisites or Preferred Prior
Knowledge Data Analysis with Python and Pandas
N/A
Participants will continue with their Python training
Cost as they delve into the world of data analysis and data
Free science with Python. Participants will utilize the
Pandas Python package and apply it to the analysis of
Becoming a Reproducible Scientist our dataset. Participants will take a first look at
(with Jupyter Notebooks, Git, Python) - plotting in Python. Participants will work within a
Jupyter notebook and have access to all materials for
Part 2 continued practice. Students will receive a Jupyter
Notebook for further study and practice.
Attendees will explore reasons for reproducible
science and delve into practical exercises that will Competency
allow you to enhance your data analysis with good, Programming and Scripting
better, and best practices. Topics include
introduction to Jupyter Notebook, basic intro to the
python programming language, data and project
organization, data exploration, automation,
publishing, and sharing. This day-long workshop will
condense the Data Carpentry Reproducible Science
Jupyter workshop held in Berkeley, CA in 2017 &
2018 (https://github.com/Reproducible-Science-Curriculum).
Students will receive a Jupyter
Notebook with all steps taught in the class for further
study and practice.

Competency
Programming and Scripting

Proficiency Level
Full Performance

Proficiency
Full Performance

Training Provider
HHS Learning Portal (main training landing page)
HHS Learning Portal (specific course description)

Course Delivery Format
In-person, instructor-led

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free



Deploying and Using Jasmine Proficiency
Full Performance
This course will cover the major areas of interest
within the Jasmine JavaScript testing environment Training Provider
and aims to acquaint the learner with the NIH Library (main training landing page)
fundamental knowledge to support further study of NIH Library (specific course description)
JavaScript testing. The course covers the deployment
and configuration of the Jasmine environment, the Course Delivery Format
architecture of the testing engine, and the syntax of In-person, instructor-led
the Jasmine test Functions and Methods. In
addition, the course covers Jasmine Spies and Course Prerequisites or Preferred Prior
Functions call stats, and includes a section on Knowledge
deploying and using Jasmine with the Node.js • Introduction to the Command Line webinar-
environment. http://bioinformatics.niaid.nih.gov
• Introduction to Jupyter notebook
Competency • Introduction to Programming (with Python)
Programming and Scripting or prior programming experience required

Proficiency Level Cost


Full Performance Free

Training Provider
HHS Learning Portal (main training landing page)
HHS Learning Portal (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior
Knowledge
N/A

Cost
Free

Learning Python Web Penetration
Testing
Stop using automated testing tools. Customize and
write your own tests with Python! While there are an
increasing number of sophisticated ready-made tools
to scan systems for vulnerabilities, Python allows
testers to write system-specific scripts (or alter and
extend existing testing tools) to find, exploit, and
record as many security weaknesses as possible. This
course will give you the necessary skills to write
custom tools for different scenarios and modify
existing Python tools to suit your application's needs.
Intermediate R: Statistical Analysis Christian Martorella starts off by providing an
overview of the web application penetration testing
This session will introduce participants to process and the tools the professionals use to
conducting statistical analysis with Studio, a free perform these tests. Next, he shows how to interact
program for R. The course will focus on how to use R with web applications using Python, HTTP, and the
to conduct these analyses. Topics to be covered Requests library. Then follow the web application
include: descriptive and summary statistics, penetration testing methodology. Each section
hypothesis testing, regression and general linear contains practical Python examples. To finish off,
models. Christian shows how to use the tools against a
vulnerable web application created specifically for
Competency this course.
Programming and Scripting
Statistical Modeling



Competency
Programming and Scripting

Proficiency
Full Performance

Training Provider
LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Programming Foundations: Real-World Examples

Understanding core programming concepts and why they are used is
just as important as knowing how to write code. New programmers need
to learn to bridge the gap: to connect the theory to practice. This
series of training videos explains basic programming concepts by
relating them to real-life objects, actions, and scenarios. Each
video will focus on a different analogy, mixing live action with
segments that demonstrate the concepts in code. For example, Barron
Stone connects functions to recipes, lists to parking spaces, and
loops to that perpetual chore: dishwashing. He illustrates most of
the examples using Python, but you can follow along in any language
you choose.

Competency
Programming and Scripting

Proficiency
Full Performance

Training Provider
LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Python Programming for Experienced Programmers

Do you already know how to program but want to learn the Python
programming language? In this seminar we will perform a simple
analysis of a dataset as we learn the ins and outs of the Python
programming language. Students will work within a Jupyter notebook
and have access to all materials for continued practice. Students
will receive a Jupyter notebook for further study and practice.

Competency
Programming and Scripting
Data Visualization

Proficiency
Full Performance

Training Provider
HHS Learning Portal (main training landing page)
HHS Learning Portal (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
• Introduction to the Command Line
• Introduction to Jupyter Notebook
• Introduction to Programming (with Python) or prior programming
  experience required

Cost
Free



SQL for Statistics Essential Training

This intermediate-level course provides an overview of basic
descriptive statistics and the SQL commands students need to know to
summarize data sets, find averages, and calculate variance and
standard deviation. Students are introduced to more detailed
analysis techniques using discrete and continuous percentiles to
help segment data, and correlations between variables to identify
relationships. This course concludes with an introduction to linear
regression, a widely used predictive analytics technique.

Competency
Programming and Scripting
Statistical Modeling

Proficiency
Full Performance

Training Provider
LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
It is assumed that those taking this course have some familiarity
with relational databases. If there are further questions about
relational databases, it is recommended that one looks at the
Learning Relational Databases course and the SQL Essential Training
course (both on LinkedIn Learning).

Cost
Free

Introduction to Bioconductor

We will cover some common uses of the software packages within the
Bioconductor project. You will get to decide if you learn methods
for next generation sequencing, microarrays or both. We will cover a
number of normalizations, batch correction, and testing methods for
high throughput data.

Topics:
• Intro to Biology
• Next Generation Sequencing
• GRanges, Rsamtools
• Statistical modeling of counts
• Differential expression of counts
• Microarrays
• eSets
• Background correction
• Differential expression of arrays
• Normalization
• Advanced differential expression
• Batch effects
• Gene set testing

Competency
Programming and Scripting
Data Mining and Integration

Proficiency Level
Expert

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit, $50 for Verified Certificate

Master Parallel & Concurrent Programming in Python: 2 in 1

Are you looking forward to getting well versed with Parallel &
Concurrent Programming Using Python? Then this is the perfect course
for you!
The terms concurrency and parallelism are often used in relation to
multithreaded programs. Parallel programming is not a walk in the
park and



sometimes confuses even some of the most experienced developers.

This comprehensive 2-in-1 course will take you smoothly through this
difficult journey of concurrent programming in Python, including
common thread programming techniques and approaches to parallel
processing. Similarly, with parallel programming techniques you
explore the ways in which you can write code that allows more than
one process to happen at once.

After taking this course you will have gained an in-depth knowledge
of using threads and processes with the help of real-world examples,
along with hands-on experience in GPU programming with Python using
the PyCUDA module, and you will be able to evaluate performance
limitations.

Competency
Programming and Scripting

Proficiency Level
Expert

Training Provider
Udemy (main training landing page)
Udemy (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit, $200 for Verified Certificate

Optimizing Python Services

Sometimes, the difference between a good Python application and a
great one isn't found in the code; it's in the services that support
your software. In this course, instructor Miki Tebeka introduces
Python optimization tips and techniques to develop and run more
efficient sites and applications. Learn how to find bottlenecks,
stress test your code, use caching algorithms, "cheat" effectively,
distribute work on one or more machines, and pick the right
transport and encoding methods. He also introduces load balancers
and powerful server frameworks for HTTP and TCP and shows how to
serve static content. Plus, learn how to monitor performance of your
projects and set up alerts, so you'll know when a system or service
fails.

Competency
Programming and Scripting

Proficiency
Expert

Training Provider
LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Python Advanced Design Patterns

This advanced Python course introduces some practical design
patterns described by the Gang of Four, including Command,
Interpreter, and Memento. Students also discover how these patterns
work at the code level by walking through sample scripts. Course
topics include architectural vs. design patterns, why to use design
patterns, design best practices, domain-specific and security
patterns, and Gang of Four design patterns such as Command,
Mediator, State, and Template Method.

Competency
Programming and Scripting



Proficiency
Expert

Training Provider
LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)

Course Delivery Format


Virtual, self-paced

Course Prerequisites or Preferred Prior


Knowledge
Some prior knowledge of Python is recommended
before starting this course.

Cost
Free



Research Design

Biomedical and Health Research Data Management Training for
Librarians

Health science professionals are invited to participate in this
8-week online class with engaging lessons and practical activities.
This course provides basic knowledge and skills for librarians
interested in helping patrons manage their research data. Attending
the course will improve one’s ability to initiate or extend research
data management (RDM) services at their institution. The major goal
of this course is to introduce data issues and policies in support
of developing and implementing or enhancing research data management
training services at your institution. The major topics of this
course include an overview of data management, choosing the
appropriate metadata descriptors or taxonomies for a dataset,
addressing privacy and security issues with data, and creating data
management plans. Students will be paired with mentors to complete
their capstone projects.

Competency
Research Design
Data Mining and Integration

Proficiency
Comprehension

Training Provider
National Network of Libraries of Medicine (main training landing page)
National Network of Libraries of Medicine (specific course description)

Course Delivery Format
Hybrid course. This course features video lectures, hands-on
exercises, and peer discussions. The course culminates with a
capstone summit at NIH. Students will experience 8 weeks of online
material for this course before participating in the capstone.

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Best Practices for Biomedical Research and Data Management

This MOOC course is meant for a broad audience including librarians,
biomedical researchers, undergraduate and graduate biomedical
students, and all others interested. Module 1 covers an introduction
and overview of research data management and best practices for
biomedical research data. Module 2 details the research lifecycle.
Module 3 covers the contextual details needed to make data
meaningful. Module 4 covers data storage and security. Module 5
covers data management policy. Module 6 covers biomedical ethics.
Module 7 covers data sharing and reuse. Module 8 covers curation and
preservation of data. Module 9 covers scientific research teams.

Competency
Research Design
Data Mining and Integration

Proficiency
Comprehension

Training Provider
National Network of Libraries of Medicine (main training landing page)
National Network of Libraries of Medicine (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Data Science Basics

A follow-on to the Data Science @NLM Training Program Kick-Off, this
course, a joint effort between NLM and Booz Allen Hamilton, is
geared towards raising staff awareness of fundamental data science
processes and concepts across ten technical data science
competencies. This course builds upon the systematic data science
process introduced in the Data Science @NLM Training Program
Kick-Off and digs deeper into the how’s and why’s of the process.
Just like the Data Science @NLM Training Program Kick-Off, this Data
Science Basics course aims to provide all NLM staff members with a
solid conceptual understanding of data science and a common lexicon
for our diverse workforce of data-savvy professionals.

Competency
Research Design
*This course is relevant to ALL competencies

Proficiency
Comprehension

Training Provider
National Library of Medicine

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
None

Cost
Free

Data Science Methodology

Despite the recent increase in computing power and access to data
over the last couple of decades, our ability to use the data within
the decision-making process is either lost or not maximized. All too
often, we don't have a solid understanding of the questions being
asked and how to apply the data correctly to the problem at hand.

This course has one purpose, and that is to share a methodology that
can be used within data science, to ensure that the data used in
problem solving is relevant and properly manipulated to address the
question at hand.

Competency
Research Design

Proficiency
Comprehension

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Data Science @NLM Training Program Kick-Off

This course, a joint effort between NLM and Booz Allen Hamilton, is
geared towards all NLM staff members and begins by breaking down
NIH’s definition of data science into digestible pieces. Catherine
Ordun, Chief Data Scientist at Booz Allen, then relates those pieces
to NLM’s unique workforce and, more broadly, to modern biomedical,
health, and social sciences. This course walks through the
systematic data science process at a high level and highlights
real-world examples that draw heavily on the scientific method. The
Data Science @NLM Training Program Kick-Off aims to provide all NLM
staff members with a solid conceptual understanding of data science
and a common lexicon for our diverse workforce of data-savvy
professionals.

Competency
Research Design

Proficiency
Comprehension

Training Provider
NLM Wiki (main training landing page)
NLM Wiki (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

RDM for Librarians

An online training course that delivers concise and comprehensive
instruction on research data management specifically for librarians.
Course topics include research data management, data management
planning, and data sharing. The aim of this course is to raise
awareness of research data management and build confidence amongst
librarians, to support researchers in this area.

Competency
Research Design
Data Mining and Integration

Proficiency
Comprehension

Training Provider
National Network of Libraries of Medicine (main training landing page)
National Network of Libraries of Medicine (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Understanding Research Methods

This MOOC is about demystifying research and research methods. It
will outline the fundamentals of doing research, aimed primarily,
but not exclusively, at the postgraduate level. It places the
student experience at the center of our endeavors by engaging
learners in a range of robust and challenging discussions and
exercises befitting SOAS, University of London's status as a
research-intensive university and its rich research heritage.

Competency
Research Design

Proficiency
Comprehension

Training Provider
Coursera (training main landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Academic Research Foundations: Quantitative

Quantitative research is a crucial part of academic study and a
fundamental scholarly research methodology. In this course, educator
Rolin Moe explores the foundations of this methodology to help you
confidently tackle your own quantitative research study. Rolin
covers the characteristics of quantitative research and explains how
to approach different parts of the research process, such as
creating a solid research question and developing a literature
review. He goes over the elements of a

study, explains how to collect and analyze data, and shows how to
present your data in written and numeric form. Once you wrap up this
course, you'll be familiar with the framework of a quantitative
research study and prepared to start drafting your own.

Competency
Research Design

Proficiency
Basic

Training Provider
LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Information Literacy

Information literacy is the ability to discover and use various
types of information. It's an essential skill for navigating the
information age. Watch this course to learn about strategies for
finding information (from a library, archive, database, or the
Internet) and the ethics of using it. Librarian Elsa Loftis
discusses different types of resources and explains how to evaluate
their usefulness and trustworthiness. She also shows how to avoid
plagiarism and copyright infringement, and accurately cite sources.

Competency
Research Design
Statistical Modeling

Proficiency
Basic

Training Provider
LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Introduction to Hypothesis Testing

This class covers the fundamentals of hypothesis testing, e.g. Type
I and Type II errors, statistical power, p-values, and confidence
intervals, and the connection between these terms. Multiplicity
adjustment for multiple comparisons, as well as common mistakes and
misconceptions, are discussed. The Bayesian approach is briefly
described. Information is presented in non-technical terms, and
emphasis is on understanding the concepts rather than theory and
formulas.

Competency
Research Design

Proficiency
Basic

Training Provider
NIH Library (main training landing page)
NIH Library (specific course description)

Course Delivery Format
In-person, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Learning Data Science: Ask Great Questions

This intermediate-level course addresses the following topics:
harnessing the power of questions, testing one’s reasoning,
identifying question types, organizing questions, rooting out
assumptions, finding errors, highlighting missing data, and
overcoming question bias.

Competency
Research Design

Proficiency
Basic

Training Provider
LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Overview of Common Statistical Tests

This lecture will briefly review the steps involved in data analysis
and how study design, hypothesis, and type of data and their
distributions contribute to the choice of statistical tests.
Statistical tests are used to determine the presence and strength of
a relationship between independent and outcome variables. The basic
concepts around the use and interpretation of the following
statistical tests will be covered: chi-square, paired and two-sample
t-tests, ANOVA, correlations, simple and multiple regression,
logistic regression, and non-parametric tests.

Competency
Research Design
Data Mining and Integration

Proficiency
Basic

Training Provider
NIH Library (main training landing page)
NIH Library (specific course description)

Course Delivery Format
In-person, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

SurveyMonkey Essential Training

SurveyMonkey is a hugely popular survey platform that you can use
for free, or with additional tools that you can purchase. In this
course, discover how to generate surveys on SurveyMonkey. Instructor
David Rivers begins by outlining how SurveyMonkey works, and how
online surveys can help your business. Next, he walks through the
steps of creating an online survey, explaining how to choose a
template, add questions, and customize your design. He also offers
tips that can help to improve response rates. Then, David explains
how to collect survey results, including choosing a collection
method, using the Email Invitation Collector, and collecting
responses via LinkedIn and other websites. Finally, David covers how
to analyze your survey results by viewing individual responses,
exploring question summaries, and exporting your question data. He
also shows how to share your data with colleagues.

Competency
Research Design
Statistical Modeling

Proficiency
Basic

Training Provider
LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Systematic Reviews - Types of Reviews and Literature Searches

This one-hour webinar will take a deep dive into the preliminary
considerations of completing a systematic review. At the completion
of the class, participants will have a clear understanding of
strategies to formulate a workable systematic review question, the
different types of systematic reviews and their approaches, and how
to create a “roadmap” for the review. At the end of the class,
participants should have a good enough grasp of the fundamentals of
hypothesis testing to understand basic statistical results, and to
communicate with biostatisticians more efficiently.

Competency
Research Design

Proficiency
Basic

Training Provider
NIH Library (main training landing page)
NIH Library (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Data Analysis for Social Scientists

This statistics and data analysis course will introduce you to the
essential notions of probability and statistics. We will cover
techniques in modern data analysis: estimation, regression and
econometrics, prediction, experimental design, randomized control
trials (and A/B testing), machine learning, and data visualization.
We will illustrate these concepts with applications drawn from real
world examples and frontier research. Finally, we will provide
instruction for how to use the statistical package R and
opportunities for students to perform self-directed empirical
analyses.

This course is designed for anyone who wants to learn how to work
with data and communicate data-driven findings effectively.

Competency
Research Design
Advanced Mathematics

Proficiency Level
Foundational

Training Provider
edX (main training landing page)
edX (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Screening Best Practices and Managing Your Data for Systematic
Reviews

In this one-hour webinar, a deeper dive into inclusion and exclusion
criteria will be discussed as well as tips and resources to assist
participants in screening and collecting data from research studies.

Competency
Research Design

Proficiency
Foundational

Training Provider
NIH Library (main training landing page)
NIH Library (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

WikiData, Librarians, and Research Data Management

Join us as we host Wikidata expert and librarian Katie Mika, from
the University of Colorado Boulder. This webinar introduces the
WikiCite initiative to build a database of open citations to support
free and computational access to bibliographic metadata and will
identify simple, high-impact ways to get involved. As experts in the
intersection of bibliographic metadata, information discovery, and
interdisciplinary research, librarians are a tremendous resource for
this community. Currently the WikiCite citation database is being
developed in Wikidata, which has also become a viable linked data
hub for library collections and authority data. Citations are vital
to Wikipedia’s foundation of “verifiability, not truth,” and
academic libraries are uniquely positioned to connect researchers
and their outputs with the wider information ecosystem. As the
backbone of scholarly knowledge, citations are also vital components
of Open Science RDM strategies. Open, structured, separable
citations encourage data reuse and remixing to reproduce, verify,
and build on results reported in scholarly literature. In addition
to access, Wikidata adds these citations to the linked data
knowledge graph, supporting applications like Scholia
(https://tools.wmflabs.org/scholia/).

Competency
Research Design
Data Mining and Integration

Proficiency
Foundational

Training Provider
National Network of Libraries of Medicine (main training landing page)
National Network of Libraries of Medicine (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Biostatistics in Public Health

Biostatistics is an essential skill for every public health
researcher because it provides a set of precise methods for
extracting meaningful conclusions from data. In this second course
of the Biostatistics in Public Health Specialization, you'll learn
to evaluate sample variability and apply statistical hypothesis
testing methods. Along the way, you'll perform calculations and
interpret real-world data from the published scientific literature.
Topics include sample statistics, the central limit theorem,
confidence intervals, hypothesis testing, and p values.

Competency
Research Design

Proficiency Level
Full Performance

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit

New England Collaborative Data Management Curriculum

The New England Collaborative Data Management Curriculum (NECDMC)
project is led by the Lamar Soutter Library at the University of
Massachusetts Medical School in partnership with several libraries
in the New England region.

NECDMC is an instructional tool for teaching data management best
practices to undergraduates, graduate students, and researchers in
the health sciences, sciences, and engineering disciplines. Each of
the curriculum's seven online instructional modules aligns with the
National Science Foundation's Data Management Plan recommendations
and addresses universal data management challenges. Included in the
curriculum is a collection of actual research cases that provides a
discipline-specific context to the content of the instructional
modules. These cases come from a range of research settings such as
clinical research, biomedical labs, an engineering project, and a
qualitative behavioral health study. Additional research cases will
be added to the collection on an ongoing basis. Each of the modules
can be taught as a stand-alone class or as part of a series of
classes. Instructors are welcome to customize the content of the
instructional modules to meet the learning needs of their students
and the policies and resources at their institutions.

Built upon the Frameworks for a Data Management Curriculum developed
by the Lamar Soutter Library and the George C. Gordon Library at
Worcester Polytechnic Institute, the NECDMC is designed to address
present and future researchers’ data management learning needs.

Competency
Research Design
Data Mining & Integration

Proficiency Level
Basic

Training Provider
National Network of Libraries of Medicine (main training landing page)
NECDMC (specific course description)

Course Delivery Format
In-person, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free

Qualitative Research

In this course, the second in the Market Research Specialization,
you will go in-depth with qualitative market research methods, from
design to implementation to analysis.

Competency
Research Design

Proficiency Level
Full Performance

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format


Virtual, instructor-led

Course Prerequisites or Preferred Prior


Knowledge
N/A

Cost
Free to audit

Qualitative Research Methods


In this course you will be introduced to the basic
ideas behind qualitative research in social
science. You will learn about data collection,
description, analysis and interpretation in qualitative
research. Qualitative research often involves an
iterative process. We will focus on the ingredients
required for this process: data collection and
analysis.

Competency
Research Design

Proficiency Level
Expert

Training Provider
Coursera (main training landing page)
Coursera (specific course description)

Course Delivery Format


Virtual, instructor-led

Course Prerequisites or Preferred Prior


Knowledge
N/A

Cost
Free to audit

Statistical Modeling

Inferential Statistics: Sampling and Hypothesis Testing

This course is designed for students who are complete beginners in
statistics. Sections 1 and 2: These two sections cover the concepts
that are crucial to understanding the basics of hypothesis testing:
Normal Distribution, Standard Normal Distribution, Sampling,
Sampling Distribution, and Central Limit Theorem. (Before you start
hypothesis testing, make sure you are absolutely clear on these
concepts.) Section 3: This section covers the basics of hypothesis
testing with three methods: Critical Value Method, Z-Score Method,
and p-value method.

Competency
Statistical Modeling

Proficiency Level
Comprehension

Training Provider
Udemy (main training landing page)
Udemy (specific course description)

Course Delivery Format
Virtual, instructor-led

Course Prerequisites or Preferred Prior Knowledge
N/A

Cost
Free to audit, $100 for Verified Certificate

Statistical Foundations 1

This beginner-level course covers statistics basics, like
calculating averages, medians, modes, and standard deviations.
Students learn how to use probability and distribution curves to
inform decisions, and how to detect false positives and misleading
data. Each concept is covered in simple language, with detailed
examples that show how statistics are used in real-world scenarios
from the worlds of business, sports, education, entertainment, and
more. These techniques will help students understand their data,
prove theories, and save time, money, and other valuable resources,
all by understanding the numbers.

Competency
Statistical Modeling

Proficiency
Comprehension

Training Provider
LinkedIn Learning (training main landing page)
LinkedIn Learning (specific course description)

Course Delivery Format
Virtual, self-paced

Course Prerequisites or Preferred Prior Knowledge
A knowledge of basic math and mathematical concepts (such as
addition, subtraction, square roots and fractions) is preferred.

Cost
Free

Statistical Foundations 2

This intermediate-level course, which serves as part 2 to
"Statistical Foundations", moves into the topics of sampling, random
samples, sample sizes, sampling error and trustworthiness, the
central limit theorem, t-distribution, confidence intervals
(including explaining unexpected outcomes), and hypothesis testing.
This course is a must for those working in data science, business,
and business analytics, or anyone else who wants to go beyond means
and medians and gain a deeper understanding of how statistics work
in the real world. Eddie Davila first provides a bridge from Part 1,
reviewing introductory concepts such as data and probability, and
then moves into the topics of sampling, random

samples, sample sizes, sampling error and of errors and learn how you can use these to make
trustworthiness, the central limit theorem, t- predictions relatively well and also provide an
distribution, confidence intervals (including estimate of the precision of your forecast.
explaining unexpected outcomes), and hypothesis
testing. This course is a must for those working in Once you learn this you will be able to understand
data science, business, and business analytics—or two concepts that are ubiquitous in data science:
anyone else who wants to go beyond means and confidence intervals, and p-values. Then, to
medians and gain a deeper understanding of how understand statements about the probability of a
statistics work in the real world. candidate winning, you will learn about Bayesian
modeling. Finally, at the end of the course, we will
Competency put it all together to recreate a simplified version of
Statistical Modeling an election forecast model and apply it to the 2016
election.
Proficiency
Comprehension Competency
Statistical Modeling
Training Provider
LinkedIn Learning (training main landing page) Proficiency Level
LinkedIn Learning (specific course description) Basic

Course Delivery Format Training Provider


Virtual, self-paced edX (main training landing page)
edX (specific course description)
Course Prerequisites or Preferred Prior
Knowledge Course Delivery Format
Knowledge of basic math and mathematical concepts Virtual, instructor-led
(such as addition, subtraction, square roots and
fractions) is preferred. Familiarity with basic Course Prerequisites or Preferred Prior
statistical concepts such as normal distribution Knowledge
curves and z scores is also preferred. N/A

Cost Cost
Free Free to audit, $49 for Verified Certificate

Data Science: Inference and Modeling SPSS: ANOVA


This 2-day class is an introduction to using statistics
Statistical inference and modeling are indispensable in SPSS. The class is targeted to those with some
for analyzing data affected by chance, and thus familiarity with SPSS who need to know how to find
essential for data scientists. In this course, you will and run various statistics. The class will start with a
learn these key concepts through a motivating case few data checking techniques, review t-tests, then
study on election forecasting. emphasize ANOVA. Various aspects of the software
useful in manipulating data to get the most accurate
This course will show you how inference and statistics may be reviewed. Students will use the
modeling can be applied to develop the statistical SPSS software to work through practical examples.
approaches that make polls an effective tool and we'll
show you how to do this using R. You will learn Competency
concepts necessary to define estimates and margins Statistical Modeling

Proficiency hierarchical models and by the fourth advanced
Basic software engineering skills, such as parallel
computing and reproducible research concepts.
Training Provider
HHS Learning Portal (main training landing page) Competency
HHS Learning Portal (specific course description) Statistical Modeling

Course Delivery Format Proficiency Level


In-person, instructor-led Basic

Course Prerequisites or Preferred Prior Training Provider


Knowledge edX (main training landing page)
N/A edX (specific course description)

Cost Course Delivery Format


Free Virtual, instructor-led

Statistical Inference and Modeling for Course Prerequisites or Preferred Prior


Knowledge
High-Throughput Experiments
• Introduction to Bioconductor
Introduction to Linear Models and Matrix
In this course you’ll learn various statistics topics
Algebra or
including the multiple testing problem, error rates, error
rate controlling procedures, false discovery rates, q- • Basic programming, intro to statistics, intro
values and exploratory data analysis. We then to linear algebra
introduce statistical modeling and how it is applied
to high-throughput data. We will discuss parametric Cost
distributions, including binomial, exponential, and Free to audit, $49 for Verified Certificate
gamma, and describe maximum likelihood
estimation. We provide several examples of how Statistical Modeling and Regression
these concepts are applied in next generation Analysis
sequencing and microarray data. Finally, we will
discuss hierarchical models and empirical Bayes along Regression Analysis is the most common statistical
with some examples of how these are used in modeling approach used in data analysis and it is the
practice. We provide R programming examples in a basis for more advanced statistical and machine
way that will help make the connection between learning modeling.
concepts and implementation.
In this course, you will be given fundamental
Given the diversity in educational background of our grounding in the use of widely used tools in
students we have divided the series into seven parts. regression analysis. You will learn the basics of
You can take the entire series or individual courses regression analysis such as linear regression, logistic
that interest you. If you are a statistician you should regression, Poisson regression, generalized linear
consider skipping the first two or three courses, regression and model selection.
similarly, if you are a biologist you should consider
skipping some of the introductory biology lectures. Throughout this course, you will be exposed to not
Note that the statistics and programming aspects of only fundamental concepts of regression analysis but
the class ramp up in difficulty relatively quickly also many data examples using the R statistical
across the first three courses. By the third course we will software. Thus, by the end of this course, you will
be teaching advanced statistical concepts such as also be familiar with the implementation of

regression models using the R statistical software Competency
along with interpretation for the results derived from Statistical Modeling
such implementations.
Proficiency
This course is more about the opportunity for Foundational
individual discovery than it is about mastering a
fixed set of techniques. Training Provider
LinkedIn Learning (training main landing page)
Competency LinkedIn Learning (specific course description)
Statistical Modeling
Course Delivery Format
Proficiency Level Virtual, self-paced
Basic
Course Prerequisites or Preferred Prior
Training Provider Knowledge
edX (main training landing page) A basic knowledge of algebra is preferred.
edX (specific course description)
Cost
Course Delivery Format Free
Virtual, instructor-led
Excel Statistics Essential Training 1
Course Prerequisites or Preferred Prior
Knowledge In this course, part one of a series, Joseph Schmuller
Understanding of statistics and probability but also teaches the fundamental concepts of descriptive and
basic programming proficiency, linear algebra and inferential statistics and shows you how to apply
basic calculus. them using Microsoft Excel. He explains how to
organize and present data and how to draw
Cost conclusions from data using Excel's functions,
Free to audit, $99 for Verified Certificate calculations, and charts, as well as the free and
powerful Excel Analysis ToolPak. The objective is for
Business Analytics: Prescriptive the learner to fully understand and apply statistical
Analytics concepts—not to just blindly use a specific statistical
test for a dataset. Joseph uses Excel as a teaching
Everyone is talking about big data these days, but tool to illustrate the concepts and increase
that's just the starting point for drawing high-value, understanding, but all you need is a basic
actionable insights from your organization's data. understanding of algebra to follow along.
This course takes viewers through the entire
analytics lifecycle and workflow—beyond today's Competency
hype and buzzwords—and describes how any Statistical Modeling
organization can turn their investments in big data
into the actionable insights they really need. Author Proficiency
Alan Simon introduces today's relevant technologies Foundational
and shows how best to apply them to specific
business problems and opportunities within their Training Provider
organization. By the end of the course, viewers will LinkedIn Learning (training main landing page)
understand how the different classes of analytics— LinkedIn Learning (specific course description)
descriptive, predictive, and discovery—can lead to
prescriptive action.

Course Delivery Format Yash Patel dives into SPSS, focusing on how to run
Virtual, self-paced and interpret data for the most common types of
quantitative tests. Topics include t-tests, analysis of
Course Prerequisites or Preferred Prior variance (ANOVA), and understanding the statistical
Knowledge measurements behind academic research. Review
A basic knowledge of algebra is preferred. the tenets of qualitative testing, including the
central limit theorem, P values, and confidence intervals,
Cost and specific use cases for tests in SPSS. For each
Free type, Yash provides some general guidelines and
assumptions, along with a challenge and solution
SPSS: Regression exercise to practice what you've learned.

This 2-day class is an introduction to using statistics Competency


in SPSS. The class is targeted to those with some Statistical Modeling
familiarity with SPSS who need to know how to find
and run various statistics. The class will start with a Proficiency
few data checking techniques, review correlations, Foundational
then emphasize regression. Various aspects of the
software useful in manipulating data to get the most Training Provider
accurate statistics may be reviewed. Attendees will LinkedIn Learning (training main landing page)
use the software to work through practical examples. LinkedIn Learning (specific course description)

Competency Course Delivery Format


Statistical Modeling Virtual, self-paced

Proficiency Course Prerequisites or Preferred Prior


Foundational Knowledge
A basic knowledge of algebra is preferred.
Training Provider
HHS Learning Portal (main training landing page) Cost
HHS Learning Portal (specific course description) Free

Course Delivery Format Statistical Methods for Complex


In-person, instructor-led Sample Survey Data Analysis
Course Prerequisites or Preferred Prior Participants in this two-hour intermediate level class
Knowledge will learn the valid methods of analysis for complex
Before attending this course, it is recommended that sample survey data. Specifically, participants will
you have either attended SPSS Basics or have related gain knowledge in variance estimation methods and
experience. contrast results between model-based and design-
based statistical approaches. This class will provide
Cost participants with an overview of complex survey
Free design features and the data analysis process for
these surveys, from hypothesis formulation to
SPSS for Academic Research statistical inference, including design effects and
weighting, exploratory data analysis, variables
Explore how to run tests for academic research with selection, variance estimation methods, and model
SPSS, the leading statistical software. In this course,

selection. This hands-on experience uses real survey Proficiency
data in SAS to demonstrate the steps and techniques. Full Performance

Competency Training Provider


Statistical Modeling LinkedIn Learning (training main landing page)
Operations Research LinkedIn Learning (specific course description)

Proficiency Course Delivery Format


Foundational Virtual, self-paced

Training Provider Course Prerequisites or Preferred Prior


NIH Library (main training landing page) Knowledge
NIH Library (specific course description) It is recommended that students complete Excel
Statistics Essential Training 1 or at least have some
Course Delivery Format knowledge of statistics as this course is geared
In-person, instructor-led towards intermediate-level students with a bit of
statistics under their belt.
Course Prerequisites or Preferred Prior
Knowledge Cost
N/A Free

Cost Logistic Regression in R and Excel


Free
Learn how to use R and Excel to analyze data in this
Excel Statistics Essential Training 2 course with Conrad Carlberg. He takes you through
advanced logistic regression, starting with odds and
Understanding statistics is more important than logarithms and then moving on into binomial
ever. Statistical analysis is the basis for decision distribution and converting predicted odds back to
making in many fields, including business and probabilities. After this foundation is established, he
academia. In this course, part two of a series, shifts the focus to inferential statistics, likelihood
Professor Joseph Schmuller teaches you how to use ratios, and multinomial regression. Conrad's
statistics concepts and tools to perform analysis in comprehensive coverage of how to perform logistic
Microsoft Excel. He explains how to organize and regression includes tackling common problems,
present data and how to draw conclusions using explaining.
Excel's functions, charts, and 3D maps and the
Solver and Analysis ToolPak add-ons. Learn to Competency
calculate mean, variance, standard deviation, and Statistical Modeling
correlation; visualize sampling distributions; and test
differences with analysis of variance (ANOVA). Then Proficiency
find out how to use linear, multiple, and nonlinear Full Performance
regression testing to analyze relationships between
variables and make predictions. Joseph also shows Training Provider
how to perform advanced correlations, variable LinkedIn Learning (training main landing page)
frequency testing, and simulations. LinkedIn Learning (specific course description)

Competency Course Delivery Format


Statistical Modeling Virtual, self-paced



Course Prerequisites or Preferred Prior Statistical Foundations 3
Knowledge
You should know how to install and use R and its Complete your mastery in this course, part 3 of our
packages. You should also have some background in Statistics Fundamentals series. Eddie Davila covers
least squares regression analysis. concepts such as small sample sizes, t-distribution,
degrees of freedom, chi-square testing, and more.
Cost This advanced skills training moves learners into the
Free practical study and application of experimental
design, analysis of variance, population comparison,
NIH NIAID OCICB CSB Advanced and regression analysis. Use these lessons to go
Excel: The new Power Query, Power beyond the basics and dive deeper into the specific
factors that influence your own calculations and
Pivot and DAX calculation engine is results. Course topics include: working with small
here! Create data models! sample sizes, using t-statistic vs. z-statistic,
calculating confidence intervals with t-scores,
This course introduces you to a data revolution: now comparing two populations (proportions),
you can use add-ins in Excel that were previously comparing two population means, chi-square testing,
fee-based or unavailable. Get a tour of Power Query and ANOVA testing, regression testing.
then create your own queries in a detailed 35-step
exercise. Absorb Power Pivot and set up your first Competency
data model while learning DAX, the data analysis Statistical Modeling
programming language in Excel 2016. Learn data
modeling by adding more than one table to create Proficiency
relationships, thereby joining data into powerful Full Performance
pivot reports.
Training Provider
Competency LinkedIn Learning (training main landing page)
Statistical Modeling LinkedIn Learning (specific course description)

Proficiency Course Delivery Format


Full Performance Virtual, self-paced

Training Provider Course Prerequisites or Preferred Prior


HHS Learning Portal (main training landing page) Knowledge
HHS Learning Portal (specific course description) Knowledge of basic math and mathematical concepts
(such as addition, subtraction, square roots and
Course Delivery Format fractions) is preferred. Familiarity with basic
In-person, instructor-led statistical concepts such as normal distribution
curves and z scores is also preferred.
Course Prerequisites or Preferred Prior
Knowledge Cost
N/A Free

Cost
Free



NIH NIAID OCICB CSB Excel Throughout this course, students will be exposed to
not only fundamental concepts of time series analysis
Advanced: From Macros to What-If- but also many data examples using the R statistical
/Forecasting to Advanced Data software. Thus, by the end of this course, students
Analysis will also be familiar with the implementation of time
series models using the R statistical software along
This course is designed to teach you macros and with interpretation for the results derived from such
advanced Excel 2016 new features. This course implementations. This class is more about the
covers creating macros from start to finish, and then opportunity for individual discovery than it is about
shows you the new advanced data analysis group with mastering a fixed set of techniques.
What-If analyses and forecasting. Participants will
also use the Outline feature to group and subtotal Competency
large subsets of data, and finish by brushing up on Statistical Modeling
other popular advanced formulas with special
emphasis on new Excel 2016 formulas like TEXTJOIN, Proficiency Level
CONCAT, and the enhanced Flash Fill. Expert

Competency Training Provider


Statistical Modeling edX (main training landing page)
edX (specific course description)
Proficiency
Expert Course Delivery Format
Virtual, instructor-led
Training Provider
HHS Learning Portal (main training landing page) Course Prerequisites or Preferred Prior
HHS Learning Portal (specific course description) Knowledge
A sound familiarity with under/graduate statistics
Course Delivery Format and probability but also basic programming
In-person, instructor-led proficiency, linear algebra and basic calculus. A
sound familiarity with linear regression modeling.
Course Prerequisites or Preferred Prior Introduction to statistics, intro to linear algebra
Knowledge
N/A Cost
Free to audit, $99 for Verified Certificate
Cost
Free

Time Series Analysis


In this course, students will learn standard time
series analysis topics such as modeling time series
using regression analysis, univariate ARMA/ARIMA
modeling, (G)ARCH modeling, Vector
Autoregressive (VAR) model along with forecasting,
model identification and diagnostics. Students will
be given fundamental grounding in the use of such
widely used tools in modeling time series.



ADDITIONAL DATA SCIENCE TRAININGS
Advanced Learning Options
NIH/NLM has a plethora of training courses available, but no program is perfect. Where there are gaps, we
reviewed available online content to identify additional courses, all of which are listed in this Catalog.
Beyond that, we recognize that advanced-level skill development doesn’t often happen in a
classroom, so we have provided a series of conferences, symposiums, and other events that may be applicable
to staff in the advanced data science learning paths.

Agile Leadership Principles Foundations of Fundamental Project


Planning and Management
Agile can often challenge project managers in the
realm of leadership. Old styles of command-control Projects are all around us. Virtually every
are now a thing of the past, except for the most organization runs projects, either formally or
conservative organizations. But Agile takes self- informally. We are engaged in projects at home and
empowerment to new levels and challenges at work. Across settings, planning principles and
traditional beliefs in what leadership means. execution methodologies can offer ways in which
projects can be run more effectively and efficiently.
In this course, you will learn how this new style of Project management provides organizations (and
leadership redefines and redistributes team roles by: individuals) with the language and the frameworks
for scoping projects, sequencing activities, utilizing
• Motivating through empowerment to gain resources, and minimizing risks.
better decisions
• Facilitating the creativity and inclusivity of a This is an introductory course on the key concepts of
high-functioning team planning and executing projects. We will identify
• Identifying and managing decision making factors that lead to project success, and learn how to
biases plan, analyze, and manage projects. Learners will be
• Negotiating conflicts across individuals, exposed to state-of-the-art methodologies and to
teams, and organizations considering the challenges of various types of
• Ensuring success through delegation and projects.
powerful constraint-based metrics
Training Provider
You’ll learn to turn one internally motivated and Coursera (main training landing page)
critically thinking mind into many, driving speed Coursera (specific course description)
and innovation through leveraging all talents on the
team. Course Delivery Format
Virtual, instructor-led
Training Provider
edX (main training landing page) Cost
edX (specific course description) Free to audit

Course Delivery Format Instructional Methods in Health


Virtual, instructor-led
Professions Education
Cost
This course provides those involved in educating
Free to audit, $125 for Verified Certificate
members of the health professions an asynchronous,



interdisciplinary, and interactive way to obtain, software development. This course focuses on the
expand, and improve their teaching skills. These day-to-day jobs of running a software development
skills can then be applied within their own program and how leading agile methodologies
professional context, with a variety of learners, (Scrum, XP, kanban) can help you do them better.
extending across many stages.
From transitioning a team to agile to running sprints
Training Provider to managing stakeholders, this course gives you the
Coursera (main training landing page) skills you need to manage an agile team in your
Coursera (specific course description) specific operating environment.

Course Delivery Format We'll show you how to:


Virtual, instructor-led • Think through and focus on the most
important aspects of your projects and
Cost sprints
Free to audit • Facilitate your team’s initial and ongoing
adoption of the specific agile practices that
Leading Teams work for you
• Anchor your outcomes and success criteria in
In this course, you will learn how to build your team, durable ideas about what makes for valuable
improve teamwork and collaboration, and sustain products
team performance through continuous learning and • Support your team's transition from
improvement. Specifically, you will learn best traditional approaches to agile
practices for composing a team and aligning • Create an agile-friendly environment across
individual and team goals. You will also learn how to functional disciplines
establish roles, build structures, and manage • Identify and manage outside stakeholder
decision making so that your team excels. This needs
course will also help you manage critical team
processes such as conflict resolution and building Training Provider
trust that have a profound impact on your team’s Coursera (main training landing page)
performance. You will discuss some of the best ways Coursera (specific course description)
to harness the productive potential of teams while
mitigating the risks and traps of teamwork. Course Delivery Format
Virtual, instructor-led
Training Provider
Coursera (main training landing page) Cost
Coursera (specific course description) Free to audit

Course Delivery Format Mentor for Impact: Start Mentoring


Virtual, instructor-led
This course provides essential wisdom & tools that
Cost help you start and succeed as a mentor. In this
Free to audit course, you will find:
1. An overview of main areas for mentoring
Managing an Agile Team opportunities
2. Essential qualities of a good mentor
Traditional development processes often lead to 3. A structure for GROW model conversations
team frustration and poor results. Agile offers a 4. A framework for sharing your experience
different approach to managing the complexity of through storytelling



5. Tips to motivate your mentee to follow
through and get results

In the bonus section, you will find:


• A printable GROW model structure
worksheet
• A goal-setting worksheet
• Powerful questions for mentoring
conversations

You might want to take this course if you are


planning to become a volunteer mentor in your
organization or a youth program, and you are looking
for information on how to structure and hold
mentoring meetings.

Training Provider
Udemy (main training landing page)
Udemy (specific course description)

Course Delivery Format


Virtual, instructor-led

Cost
Free



Professional Development Conferences

Artificial Intelligence Conference as rising startups. GTC showcases the latest


breakthroughs in AI training and inference, industry-
The AI Conference delivers an unsurpassed depth changing technologies, and successful
and breadth in technical content—with a laser-sharp implementations from research to production.
focus on the most important AI developments for
business. From apps and reinforcement learning to Training Provider
conversational interfaces and executive briefings, NVIDIA Conference (specific conference description)
learn how to implement AI in real-world projects
using machine learning, NLP, TensorFlow, and more. Conference Dates
Delve into the latest research and explore what the March 17-21, 2019 (Silicon Valley, CA)
future holds for applied artificial intelligence November 4-6, 2019 (Washington, D.C.)
engineering.
Rev Data Science Leaders’ Summit
Training Provider
Artificial Intelligence Conference (specific conference
Rev is a summit for data science leaders, by data
description)
science leaders. People who run data science teams
require a space to learn, collaborate, and discuss how
Conference Dates
to elevate data science’s role in organizations.
April 15-18, 2019 (New York City)
Attendees include data science leaders and their
teams, within organizations of all sizes and
Machine Learning Innovation Summit industries, from high-tech to non-profit.
The Machine Learning Innovation Summit brings
together the data science leaders of today to present Training Provider
the technical solutions of tomorrow. This conference Data Science Leaders Summit (specific conference
is full of insightful, practical content and description)
masterclasses delivered by engineers and data
scientists passionate about leveraging AI and Conference Dates
machine learning for real-world business impact. May 23-24, 2019 (New York City)

Training Provider
Machine Learning Summit (specific conference Strata Data Conference
description)
Strata Data Conference helps you put big data, cutting-edge
Conference Dates data science, and new business fundamentals to
April 15-18, 2019 (San Francisco) work. Strata Data Conference is where thousands of
innovators, leaders, and practitioners gather to
develop new skills, share best practices, and discover
NVIDIA GPU Technology Conference how tools and technologies are evolving to meet new
in Washington, DC challenges. Find out how big data, machine learning,
and analytics are changing how we do business at
NVIDIA’s GPU Technology Conference (GTC) is the Strata Data Conference.
premier AI conference, offering hundreds of
workshops, sessions, and keynotes hosted by
organizations like Google, Amazon, Facebook as well



Training Provider
Strata Data Conference (specific conference
description)

Conference Dates
March 23-24, 2019 (San Francisco)
September 23-26, 2019 (New York City)

Advanced Data Science Webinars

Association for Computing Machinery extension to accepted OO techniques including


Design by Contract.
Webinar Series: Explainable Machine
Learning Models for Healthcare AI Training Provider
Association for Computing Machinery (specific
This tutorial extensively covers the definitions, webinar description)
nuances, challenges, and requirements for the design
of interpretable and explainable machine learning Data & Analytics Leadership and
models and systems in healthcare. We discuss many
uses in which interpretable machine learning models Vision for 2019
are needed in healthcare and how they should be
deployed. Additionally, we explore the landscape of CEOs now view data and the analytic insight derived
recent advances to address the challenges of model from it as the key to success. This gives data and
interpretability in healthcare and describe how one analytics leaders a massive opportunity to deliver
would go about choosing the right interpretable transformational value to the enterprise. In this
machine learning algorithm for a given problem in complimentary data and analytics webinar, Gartner
healthcare. expert Debra Logan explores the current state of data
and analytics programs and what you should be
Training Provider doing in 2019. You will find out how you can advance
Association for Computing Machinery (specific your data and analytics programs.
webinar description)
Training Provider
Gartner (specific webinar description)
Association for Computing Machinery
Webinar Series: Concurrent Object- How to Build Reliable Data Pipelines
Oriented Programming with Bertrand Using AI and DataOps
Meyer
In this webcast, data analytics guru Wayne Eckerson
The future of programming is parallel. We have run will discuss the rise of modern data applications and
out of ways to make processors faster, and instead we the processes and tools (i.e. DataOps) required to
must use many of them. Programming techniques manage them. He will present a DataOps Framework
and languages have not followed. In the words of a and show the importance of using testing and
report by the National Research Council, parallel monitoring to build, automate, and manage robust
programming still requires “heroic programmers.” data pipelines. In addition, Eric Chu, VP Data
This webinar is focused on making parallel and Insights from Unravel Data will explain how to apply
concurrent programming accessible, even to the non- performance monitoring software and artificial
heroes among us. In fact, programming of any kind intelligence to your data pipelines and supporting big
used to be the preserve of heroes, but progress in data systems to keep your applications running
software technology made it possible to produce reliably.
quality programs on a routine basis using well-
documented engineering techniques such as (among Training Provider
others) OO principles. We will see how to extend this The Bloor Group (specific webinar description)
modern engineering approach to concurrent systems
through the SCOOP mechanism (Simple Concurrent
Object-Oriented Programming), an incremental



Leading a Successful Data Science Training Provider
Transforming Data with Intelligence (specific
Initiative webinar description)
Data Analytics is a hot topic, and deservedly so. It
powers exponential growth in modern behemoths
like Google and Facebook but also drives positive
transformation in ancient businesses and agencies -
helping to cut costs, uncover fraud, discover new
markets, etc. Still, many analytic initiatives are never
implemented, though they are complete technical
successes; they are proven to work but never given
the chance. What is going wrong?

This talk explores the heretical thought that


leadership is getting in the way - that leaders often
inadvertently nurture organizational inertia that
diminishes, or eliminates, the chance for success
with analytics. Learn instead how to harness its
counter-intuitive insights, as illustrated by tales from
the front lines of this emerging field.

Training Provider
Elder Research (specific webinar description)

What's Ahead in Data Management in


2019
Data itself is evolving into larger volumes from new
sources in a broadening array of structures,
containers, interfaces, and latencies. New data-
driven business use cases are rising in prominence,
especially those for analytics, self-service, and agile
operations. In response to user demands, the
software vendor and open source communities are
supplying many new data platforms, tools, and
capabilities—all purpose-built for modern data and
its use cases.

This webinar is a “must attend” for technical users


and business managers who are facing these changes.
The expert panel on this webinar will help attendees
understand what’s ahead in 2019 and beyond for
data management. Attendees can then apply that
information to prioritize the data management
changes they must address and how they will prepare
via hiring, training, budgeting, making a business
case, and adopting the right data platforms and tools.



APPENDIX

Resources

All courses detailed in this Course Catalog were
identified from these sources (see table below).

RESOURCE DESCRIPTION
Confluence Confluence is a collaboration software program that allows sharing across an organization.
Use cases for Confluence include basic enterprise communication, collaboration workspaces
for knowledge exchange, and social networking. Training materials are among the many
resources that can be shared on Confluence.

Coursera Coursera is an open online course website featuring post-secondary courses from top global
academic institutions. Most courses are free to take, and consist of watching lecture videos
and presentations, doing readings, holding discussions with other students, and completing
assignments and quizzes.

edX Founded by Harvard University and MIT in 2012, edX is an online learning destination and
MOOC provider, offering high-quality courses from the world’s best universities and
institutions to learners everywhere.

iversity iversity is a Berlin-based online education platform. Since October 2013, iversity has
specialized in providing online courses and lectures in higher education, specifically MOOCs.
Courses are free and open for anyone to enroll and participate. Many are conducted in
English or German, and some in other languages. iversity cooperates with individual professors
as well as different European universities.

LinkedIn Learning LinkedIn Learning combines industry-leading content with LinkedIn’s professional
data and network. On average, 35 new courses are added each week. Currently, LinkedIn
Learning features over 10,000 courses, which cover business, technology, and creative
skills. With more than 450 million member profiles and billions of
engagements, LinkedIn Learning has a unique view of how jobs, industries, organizations, and
skills evolve over time. From this information, LinkedIn Learning discovers the skills staff
need and delivers expert-led courses to help staff obtain those skills.

National Network of Libraries of Medicine (NNLM) Training Courses and Workshops
The goal of the National Network of Libraries of Medicine (NNLM) is to advance the
progress of medicine and improve public health by providing U.S. health professionals with
equal access to biomedical information and improving individuals' access to information to
enable them to make informed decisions about their health. NNLM offers on-demand classes
for health and research professionals, such as bioinformatics tutorials and recordings.
The NNLM training website includes updated training schedules and calendars for staff to
use when identifying their next training opportunities.

NIH Library (Office of Research Services)
The NIH Library is part of the Office of Research Services (ORS), which is in the Office of
the Director (OD). This site includes a training calendar that lists all upcoming data
science skills trainings. The NIH Library offers classes covering a variety of data-related
topics including data management, data visualization, data analysis, R, and R Studio.
Classes are free, hands-on, and open to NIH and HHS staff. They are usually held in person
in the NIH Library training rooms (Building 10, Clinical Center, near the South Entrance)
or virtually. In addition to classes, self-paced online tutorials are available through a
variety of vendors and our library staff.

Saylor Academy Saylor Academy is a nonprofit initiative working since 2008 to offer free and open online
courses to all who want to learn. It offers nearly 100 full-length courses at the college and
professional levels, each of which is available right now, at your pace, on your schedule, and
free of cost.

Udacity Udacity began as an experiment in online learning, when Stanford instructors Sebastian
Thrun and Peter Norvig elected to offer their "Introduction to Artificial Intelligence" course
online to anyone, for free. Over 160,000 students in more than 190 countries enrolled. The
potential to educate at a global scale was awe-inspiring, and Udacity was founded to pursue a
mission to democratize education. It took several years of intensive iteration and
experimentation to clarify its focus on career advancement through mastery of in-demand
skills, but today Udacity offers aspiring learners across the globe the opportunity to
participate in, and contribute to, some of the most exciting and innovative fields in the
world.

Udemy Udemy.com is an online learning platform aimed at professional adults. Unlike
academic massive open online course programs, which are driven by traditional collegiate
coursework, Udemy sells courses built by online content creators for profit. Udemy
provides tools that enable users to create a course, promote it, and earn money from student
tuition charges.

U.S. Department of Health & Human Services Learning Management System
The U.S. Department of Health and Human Services Learning Management System is also known
as the HHS Learning Portal. The HHS Learning Portal is used to track course registrations,
complete mandatory and online trainings, and view training history, certifications, curricula,
and more. The HHS Learning Portal also allows users to develop their own learning plans using
the courses featured within the Portal.

The Foundation for Advanced Education in the Sciences (FAES)
The Foundation for Advanced Education in the Sciences (FAES) is a non-profit foundation
committed to promoting the productivity and attractiveness of professional life on the
National Institutes of Health (NIH) campuses by providing advanced educational programs
and supporting biomedical research within the NIH intramural program. Located at NIH’s
main campus in Bethesda, Maryland, FAES programs complement the work of the NIH in
accomplishing its mission of research and training in the biomedical sciences. Courses are
open to all qualified persons, both government and non-government.

Notes
