/lib

Learn, Imagine, Build
Geoff Messier's Projects & Ideas

Background Material

Introductory Material

This section gives students who are brand new to our group a starting point for getting up to speed. I’m trying to strike a balance between covering the fundamentals that everyone should know and not spending too much time on techniques that you might not use in your specific project.

Additional Resources

Everything in this section is really good, but whether it is useful to you depends on the focus of your project.

Neural Networks and Deep Learning

Andrew Ng

Exploratory Data Analysis (EDA)

EDA (introduced in Chapter 4 of Seltman’s book) is a very important and often overlooked aspect of machine learning and data analysis. For your algorithm to produce good results, you must first understand the nature of your data and check whether it contains any obvious inconsistencies or errors. EDA essentially uses relatively straightforward plots and summary statistics to do this.
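As a rough illustration of the kind of check EDA involves, here is a minimal sketch using only the Python standard library. The `summarize` helper, the `ages` column, and the 1.5 × IQR outlier rule are all assumptions chosen for the example; they are not taken from Seltman’s book or from any specific project in our group.

```python
import statistics

def summarize(values):
    """Return basic EDA quantities for one numeric column.

    A value of None marks a missing entry (an assumption for this sketch).
    """
    present = sorted(v for v in values if v is not None)
    # Quartiles using the statistics module's default (exclusive) method.
    q1, median, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    # Flag points more than 1.5 * IQR outside the quartiles (a common rule of thumb).
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": median,
        "stdev": statistics.stdev(present),
        "outliers": [v for v in present if v < low or v > high],
    }

# Hypothetical column: ages with one missing entry and one likely data-entry error.
ages = [34, 41, 29, None, 38, 45, 31, 420, 36]
summary = summarize(ages)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the value 420 is flagged as an outlier and the missing entry is counted, which is exactly the sort of inconsistency you want to catch before training anything. In practice you would pair numbers like these with plots such as histograms or box plots.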

Performance Metrics

There are a variety of metrics used to evaluate the performance of a classification algorithm. I have some notes here that expand on this topic.
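To make this concrete, the following sketch computes a few of the most common binary classification metrics from scratch. The function name `classification_metrics` and the label lists are hypothetical, and the choice of accuracy, precision, recall and F1 is my own selection of widely used metrics rather than a summary of the notes linked above.

```python
def classification_metrics(y_true, y_pred):
    """Compute common binary classification metrics; labels are 0 or 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical ground-truth labels and classifier predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can be misleading when the adverse outcome is rare, which is why precision and recall are usually reported alongside it.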

Survival Analysis

Much of our work uses machine learning algorithms to predict adverse future outcomes from features that have accumulated in an individual’s data record. A related technique commonly used by medical researchers and biostatisticians is survival analysis. Survival analysis looks for features in the data that increase the risk of an adverse outcome. It makes a series of very specific assumptions about the time to the occurrence of these outcomes and whether they are censored by the end date of a study.
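One way to see how censoring enters the picture is the Kaplan-Meier estimator, a standard survival analysis tool that I am using here purely as an illustration. The sketch below is a minimal pure-Python version; the function name `kaplan_meier` and the `durations` and `observed` lists are hypothetical, and real work would normally rely on an established statistics package instead.

```python
def kaplan_meier(durations, observed):
    """Estimate the survival curve with the Kaplan-Meier product-limit method.

    durations: time each individual was followed.
    observed:  True if the adverse outcome occurred at that time,
               False if the record was censored (e.g. the study ended first).
    Returns a list of (time, survival probability) at each observed event time.
    """
    event_times = sorted({d for d, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Individuals still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Outcomes that actually occurred at time t (censored records excluded).
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical follow-up times; two records are censored.
durations = [2, 3, 3, 5, 6, 8]
observed = [True, True, False, True, False, True]
curve = kaplan_meier(durations, observed)
for time, prob in curve:
    print(f"t={time}: S(t)={prob:.3f}")
```

Censored individuals still count toward the at-risk group up to the time they leave the study, but they never count as events. That is the key difference from simply treating them as people for whom the outcome did not occur.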

Some background reading: