I finished the Johns Hopkins University Exploratory Data Analysis course on Coursera earlier this year. It was a great course for getting into data and using R to get a good idea of what you are looking at. Here are some of the main subjects of the course and a few quick thoughts on each.
Exploratory Graphs - You can use these to take a quick and dirty look at the data you have and see what it might tell you. Then you can put together a more polished set of charts once you have a direction.
Plotting Systems - Here are the ones that are covered:
- base - The basic plotting system. You create a plot and then annotate and add things as you go. Easy to use, but recreating a plot means rerunning the whole set of commands.
- lattice - Everything is created in one function call, and there is no way to add anything after that.
- ggplot2 - A combination of the two. A lot of the basics are taken care of up front intuitively, and you can then add to the plot after creation if needed.
Clustering
- Hierarchical or K-Means - Use distances (Euclidean or Manhattan) and dendrograms to create clusters of data to analyze.
- Dimension Reduction - Use principal component analysis (PCA) or singular value decomposition (SVD) to trim down the data and find meaningful relationships.
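To make the distance idea a bit more concrete, here is a minimal sketch of the two distance measures and the first step of hierarchical clustering, which merges the two closest points. It is written in plain Python rather than R, and the function names (`euclidean`, `manhattan`, `closest_pair`) and the sample points are my own made-up illustration, not anything from the course.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Straight-line distance between two points."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute differences along each coordinate."""
    return sum(abs(x - y) for x, y in zip(a, b))


def closest_pair(points, dist):
    """Indices of the two closest points under the given distance.

    Agglomerative hierarchical clustering starts by merging this pair.
    """
    return min(
        combinations(range(len(points)), 2),
        key=lambda ij: dist(points[ij[0]], points[ij[1]]),
    )


# Hypothetical sample data: four points in two dimensions.
points = [(0.0, 0.0), (3.0, 4.0), (3.5, 4.0), (10.0, 0.0)]

d_euc = euclidean(points[0], points[1])        # 5.0
d_man = manhattan(points[0], points[1])        # 7.0
first_merge = closest_pair(points, euclidean)  # (1, 2)
```

The same pair of points can look closer or farther apart depending on the distance you pick, which is why the choice of measure can change the clusters you end up with.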