R

RStudio Conference 2019 Notes at Austin

Notes from RStudio Conference 2019 in Austin

Outlier Detection

Outlier detection tutorial and code

Parallel Analysis

Parallel analysis tutorial and code

9 Join Function Example with the R {dplyr} Package

1 Simple Example Data
2 Load {dplyr} package
3 Function 1: inner_join
4 Function 2: left_join
5 Function 3: right_join
6 Function 4: full_join
7 Function 5: semi_join
8 Function 6: anti_join
9 Complex Example 1: Join Multiple Data Frames
10 Complex Example 2: Join by Multiple Columns
11 Complex Example 3: Join Data & Delete ID

I always wanted to write a blog post summarizing the join function.
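As a rough illustration of the six join functions named in this outline, here is a minimal sketch. The two data frames (`customers`, `orders`) are hypothetical toy data invented for this example, not the data used in the post.

```r
library(dplyr)

# Hypothetical toy data (an assumption for illustration only)
customers <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))
orders    <- data.frame(id = c(2, 3, 4), amount = c(10, 20, 30))

# Mutating joins: combine columns from both tables
inner <- inner_join(customers, orders, by = "id")  # only ids present in both
left  <- left_join(customers, orders, by = "id")   # all customers, NA where no order
right <- right_join(customers, orders, by = "id")  # all orders, NA where no customer
full  <- full_join(customers, orders, by = "id")   # every id from either table

# Filtering joins: keep or drop rows of `customers`, never add columns
semi <- semi_join(customers, orders, by = "id")    # customers with a matching order
anti <- anti_join(customers, orders, by = "id")    # customers without a matching order
```

The mutating joins differ only in which unmatched rows they keep, while `semi_join` and `anti_join` return a subset of the first table with its original columns.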

Outliers-Part 4: Finding Outliers in a multivariate way

1 Data Source
1.1 Variables in Data
2 Model-specific methods
2.1 Cook’s Distance
2.2 Pareto
3 Multivariate methods
3.1 Mahalanobis Distance
3.1.1 Details about Mahalanobis Distance
3.2 Robust Mahalanobis Distance
3.3 Minimum Covariance Determinant (MCD)
3.3.1 Robust tolerance ellipsoid (RTE)
3.4 Invariant Coordinate Selection (ICS)
3.5 OPTICS
3.6 Isolation Forest
3.7 Local Outlier Factor
4 ‘check_outliers’ function in {performance} R package
4.0.1 Threshold specification
5 Reference

Figure 0.
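For the Mahalanobis distance entry in this outline, a minimal base-R sketch might look like the following. The choice of the built-in `mtcars` data, the three columns, and the 0.975 chi-square cutoff are all assumptions made for illustration; the post's own data and threshold may differ.

```r
# Assumption: built-in mtcars data stands in for the post's data set
x <- mtcars[, c("mpg", "hp", "wt")]

# stats::mahalanobis() returns SQUARED distances from the given center
d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))

# Under multivariate normality, squared distances are approximately
# chi-square with p degrees of freedom (p = number of variables).
# The 0.975 quantile is a common but arbitrary cutoff.
cutoff  <- qchisq(0.975, df = ncol(x))
flagged <- rownames(x)[d2 > cutoff]
```

Because the classical mean and covariance are themselves pulled by outliers, robust variants such as the MCD-based distance listed in the outline replace `colMeans(x)` and `cov(x)` with robust estimates.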

Outliers-Part 3:Outliers in Regression

1 Types of Unusual Observations
1.1 Regression Outliers
1.2 Leverage
1.3 Influential Observations
1.4 Good vs. Bad Leverage
2 Detecting Influential Observations
2.1 Graphic diagnostics
2.1.1 A scatter plot with Confidence Ellipse
2.1.2 Quantile Comparison Plots (QQ-Plot)
2.1.2.1 Rule of Thumb
2.1.3 Added-variable plots
2.2 Numerical diagnostics
2.2.1 Hat Matrix
2.2.1.1 Rule of Thumb
2.2.2 Standardized Residuals
2.2.2.1 Rule of Thumb
2.2.3 Studentized Residuals
2.

Outliers-Part 2: Finding Outliers in a univariate way

1 Method 1: Sorting Your Datasheet to Find Outliers
2 Method 2: Graphing Your Data to Identify Outliers
2.1 Histogram
2.2 Boxplot
2.2.1 Adjusted boxplot (Hubert and Vandervieren, 2008)
3 Method 3: Using Z-scores to Detect Outliers
3.1 Z-Score pros:
3.2 Z-Score cons:
4 Method 4: Using the Interquartile Range (IQR) to Create Outlier Fences
5 Method 5: Percentiles
5.1 scores function from {outliers} package
6 Method 6: Hampel filter
7 Method 7: Finding Outliers with Hypothesis Tests
7.
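To make Methods 3 and 4 from this outline concrete, here is a small base-R sketch comparing z-scores with IQR fences. The vector `x`, the |z| > 3 rule, and the 1.5 multiplier are assumptions chosen for illustration, not values taken from the post.

```r
# Hypothetical sample with one obviously extreme value
x <- c(10, 12, 11, 13, 12, 11, 14, 12, 13, 50)

# Method 3 (sketch): z-scores, flagging |z| > 3
z     <- (x - mean(x)) / sd(x)
z_out <- x[abs(z) > 3]

# Method 4 (sketch): IQR fences at 1.5 * IQR beyond the quartiles
q       <- unname(quantile(x, c(0.25, 0.75)))
fence   <- 1.5 * IQR(x)
lower   <- q[1] - fence
upper   <- q[2] + fence
iqr_out <- x[x < lower | x > upper]
```

In this toy vector the IQR fences flag 50 while the |z| > 3 rule flags nothing, because the extreme value inflates the mean and standard deviation that the z-score relies on. Quartiles are far less sensitive to a single extreme point.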