1 Simple Example Data
2 Load {dplyr} package
3 Function 1: inner_join
4 Function 2: left_join
5 Function 3: right_join
6 Function 4: full_join
7 Function 5: semi_join
8 Function 6: anti_join
9 Complex Example 1: Join Multiple Data Frames
10 Complex Example 2: Join by Multiple Columns
11 Complex Example 3: Join Data & Delete ID

I have always wanted to write a blog post summarizing the join functions.
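As a rough illustration of the six join functions listed in the outline above, here is a minimal sketch. The two data frames (`students`, `scores`) and their columns are hypothetical and are not taken from the post; only the `dplyr` join functions themselves are real.

```r
library(dplyr)

# Hypothetical example data (an assumption, not the post's own data)
students <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cam"))
scores   <- data.frame(id = c(2, 3, 4), score = c(85, 90, 70))

# Mutating joins: add columns from `scores` to `students`
inner <- inner_join(students, scores, by = "id")  # only ids found in both tables
left  <- left_join(students, scores, by = "id")   # every row of students; score is NA when unmatched
right <- right_join(students, scores, by = "id")  # every row of scores; name is NA when unmatched
full  <- full_join(students, scores, by = "id")   # every id from either table

# Filtering joins: keep or drop rows of `students`, no new columns
semi <- semi_join(students, scores, by = "id")    # students that have a match in scores
anti <- anti_join(students, scores, by = "id")    # students that have no match in scores
```

The mutating joins differ only in which unmatched rows they keep, while the filtering joins never add columns from the second table.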
1 Types of Unusual Observations
1.1 Regression Outliers
1.2 Leverage
1.3 Influential Observations
1.4 Good vs. Bad Leverage
2 Detecting Influential Observations
2.1 Graphic diagnostics
2.1.1 A scatter plot with Confidence Ellipse
2.1.2 Quantile Comparison Plots (QQ-Plot)
2.1.2.1 Rule of Thumb
2.1.3 Added-variable plots
2.2 Numerical diagnostics
2.2.1 Hat Matrix
2.2.1.1 Rule of Thumb
2.2.2 Standardized Residuals
2.2.2.1 Rule of Thumb
2.2.3 Studentized Residuals
2.
1 What are Outliers?
2 Causes for Outliers
3 Types of Outliers
4 Philosophy about Finding Outliers
5 General Rules

Figure 0.1: Outliers

Four years ago (yes, back in 2016), a director of the data science department at a very famous IT company asked me about outliers. Basically, she asked two questions:
What are outliers? How do we detect them? Also, in my daily research life, I encounter noisy data all the time.
1 Data
2 Income, Balance & Default
3 Model Selection
4 Diagnosis
5 Interesting Points
6 Model Cross-Validation
7 Parameter Selection
8 Conclusion

The logistic regression model is widely used for group classification. In education and the social sciences, it has been used to classify students or individuals into different groups.
In the finance industry, the logistic regression model is also quite useful for identifying or classifying an individual’s group status (i.e., Y) according to his or her other features (i.
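To make the classification idea concrete, here is a minimal sketch using base R's `glm()`. The data are simulated: the column names `balance` and `default` are assumptions loosely borrowed from the outline headings above, and the coefficients used to generate the data are arbitrary, not the post's results.

```r
set.seed(1)

# Simulated data (an assumption, not the post's data set)
n <- 500
balance <- runif(n, min = 0, max = 2500)
true_prob <- plogis(-6 + 0.004 * balance)   # arbitrary generating model
default <- rbinom(n, size = 1, prob = true_prob)
dat <- data.frame(default = default, balance = balance)

# Fit a logistic regression: group status Y (default) given a feature (balance)
fit <- glm(default ~ balance, data = dat, family = binomial)

# Predicted probability of Y = 1 for two hypothetical individuals
new_people <- data.frame(balance = c(500, 2000))
prob <- predict(fit, newdata = new_people, type = "response")

# Classify with a 0.5 cutoff (the cutoff itself is a modelling choice)
pred_class <- ifelse(prob > 0.5, 1, 0)
```

With `type = "response"`, `predict()` returns probabilities rather than log-odds, so the cutoff can be applied to them directly.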
Figure 0.1: Pipe Operator

Instead of introducing the tidyr and dplyr packages, the two most essential R packages for data wrangling, I would like to insert a side topic that I think is worth mentioning for R programming efficiency as my 2nd Tidyverse blog post 1. To me, this important programming command has completely changed my view of programming and reshaped my programming habits since I started using it. This magic command is %>%, a.
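As a small illustration of what `%>%` does, the sketch below writes the same computation twice, once as nested calls and once as a pipeline. The vector `x` and the chosen functions are hypothetical, picked only to show the difference in reading order.

```r
library(magrittr)

# Hypothetical input (an assumption for illustration)
x <- c(1, 4, 9, 16)

# Nested calls: read from the inside out
nested <- round(mean(sqrt(x)), 1)

# Piped calls: read left to right; each result becomes
# the first argument of the next function
piped <- x %>% sqrt() %>% mean() %>% round(1)

# Both forms compute the same value
same <- identical(nested, piped)
```

The pipeline does not change what is computed. It only changes the order in which the steps are written, which tends to matter more as the chain gets longer.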