
Data Science and Visualization Workshop

February 3 to March 4, 2016

University of California, Riverside


NASA Jet Propulsion Laboratory

This free workshop introduces Big Data tools and applications that will provide
a solid foundation in the analysis of large data sets.

Location: Room 3035, Physics Building on UCR campus


Time: Monday, Wednesday and Friday 9:00am to 10:30am (check for
changes)

Schedule

Week 1.
- Welcome / Intro (Prof. Bahram Mobasher - UCR)
- Crash course on Python (Mr. Steve Jacobs - UCR)
- Crash course on R (Mr. Steve Jacobs - UCR)

At the end of this week the students will have a general overview of Big Data
and will be familiar with the Python and R programming languages.

Week 2.
- Introduction to Machine Learning (Prof. Eamonn Keogh - UCR)
- Machine learning techniques for Big Data (Prof. Eamonn Keogh - UCR)
- Statistics for Data Scientists (Prof. James Flegal - UCR)

At the end of this week the students will have a general familiarity with machine
learning techniques, data transformations and statistics (Monte Carlo
simulation, Bayesian statistics, Markov chains) and how to apply them to large
datasets.

Week 3.
(Dr. Miguel Calvo - UCR)
- Introduction to distributed file systems
- Introduction to Hadoop and Hadoop Distributed File System (HDFS)
- Common HDFS tasks
- Interfacing programs to HDFS, FUSE

Apache - Information Retrieval and Content Detection/Analysis (Dr. Chris Mattmann - JPL)
(Dr. Nathaniel Stickley - UCR/Caltech)
- MapReduce example using Hadoop Streaming and Python
- Analysis of distributed text-based datasets
- Introduction to Hadoop Hive
- Hive examples, data analysis

During this week the students will learn how to perform analysis on large
distributed datasets using Hadoop and other currently used big data tools.
Students will also learn about detecting and analyzing features from files.

Week 4.
- Data visualization techniques and tools (Dr. Miguel Aragon - UCR)
- Data visualization in NASA (Dr. Scott Davidoff - JPL)

During this week the students will learn how to display and visualize results of
their analysis in a scientifically meaningful way.

Week 5.
- Application of big data techniques in physical sciences (Prof. Asantha
Cooray - UCI)
- Challenges and solutions in big data genomics (Prof. Thomas Girke -
UCR)
- Need for big data analytics in NASA (Richard Doyle and Daniel Crichton -
JPL)

During this week the students will learn about the applications of Big Data by
working through several real-world problems.

Participating Departments and Institutions

UCR Departments of Computer Science, Genomics, Physics and Astronomy, and Statistics
UCR Research and Economic Development Office
NASA Jet Propulsion Laboratory
University of California, Irvine
FIELDS (Fellowships and Internships in Extremely Large Data Sets)

This project is being funded by a grant from the National Aeronautics and Space
Administration. As part of grant expectations, an evaluation of project usefulness and impact
will be conducted. All participants are requested to participate in the project evaluation.
