
ehrlinger / ggRandomForests

Graphical analysis of random forests with the randomForestSRC, randomForest and ggplot2 packages.

152 stars · 32 forks · 21 issues · Language: R

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing ehrlinger/ggRandomForests in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.

Source files are only loaded when you start an analysis to optimize performance.

Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/ehrlinger/ggRandomForests)

Repository Overview (README excerpt)


ggRandomForests: Visually Exploring Random Forests
========================================================

ggRandomForests helps uncover variable associations in random forests models. The package is designed for use with the randomForest package (Liaw and Wiener 2002) or the randomForestSRC package (Ishwaran et al. 2014, 2008, 2007) for survival, regression and classification random forests, and uses the ggplot2 package (Wickham 2009) for plotting diagnostic and variable association results. ggRandomForests is structured to extract data objects from randomForestSRC or randomForest objects and provides S3 functions for printing and plotting these objects.

The randomForestSRC package provides a unified treatment of Breiman's (2001) random forests for a variety of data settings. Regression and classification forests are grown when the response is numeric or categorical (factor), while survival and competing-risk forests (Ishwaran et al. 2008, 2012) are grown for right-censored survival data. Recently, support for the randomForest package (Liaw and Wiener 2002) for regression and classification forests has also been added.

Many of the figures created by the package are also available directly from within the randomForestSRC or randomForest packages. However, ggRandomForests offers the following advantages:

• Separation of data and figures: ggRandomForests contains functions that operate either on the forest object directly or on the output of randomForestSRC post-processing functions to generate intermediate data objects. S3 functions are provided to further process these objects and plot the results with the ggplot2 graphics package. Alternatively, users can use these data objects for additional custom plotting or analysis operations.
• Each data object/figure is a single, self-contained object. This allows simple modification and manipulation of the data or figures to meet users' specific needs and requirements.
• The use of ggplot2 for plotting.
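The extract-then-plot workflow described above can be sketched as follows. This is a minimal illustration assuming the CRAN packages randomForestSRC, ggRandomForests and ggplot2 are installed; the dataset and settings are illustrative choices, not taken from this README:

```r
# Minimal sketch of the ggRandomForests data/figure separation,
# assuming randomForestSRC, ggRandomForests and ggplot2 from CRAN.
library(randomForestSRC)
library(ggRandomForests)
library(ggplot2)

# Grow a regression forest on the built-in airquality data
# (na.action = "na.impute" handles the missing Ozone values).
rf <- rfsrc(Ozone ~ ., data = airquality, na.action = "na.impute")

# Step 1: extract an intermediate data object from the forest
# (here, the OOB error rate as a function of the number of trees).
gg_dta <- gg_error(rf)

# Step 2: the S3 plot method turns the data object into a ggplot ...
p <- plot(gg_dta)

# ... which can then be customized with ordinary ggplot2 layers.
p + theme_bw() + labs(x = "Number of trees", y = "OOB error rate")
```

Because step 1 and step 2 are separate, the same `gg_dta` object can also be inspected directly or fed into custom plotting code instead of the packaged S3 method.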
We chose the ggplot2 package for our figures to give users flexibility in modifying them to their liking. Each S3 plot function returns either a single ggplot object or a list of ggplot objects, so users can apply additional ggplot2 functions or themes to further customize the figures. Check out the "Exploring Random Forests with ggRandomForests" vignette for a walk-through of these objects.

The package has recently been extended, where possible, to cover Breiman and Cutler's Random Forests for Classification and Regression package (randomForest). Though methods have been provided for all functions, the unsupported functions return an error message indicating where support is still lacking.

Recent improvements

• Error-rate data objects can now optionally include in-bag training error trajectories, making it easy to compare OOB and training curves in a single data object.
• The original training data are rebuilt from the forest's stored call, so marginal dependence plots work even when models are trained inside helper functions.
• Conditioning-interval generation is now fully quantile based, providing balanced intervals that can be dropped directly into coplots.
• Forests trained without importance metrics are handled by issuing a warning and returning placeholders, ensuring downstream plotting code continues to run.

References

Breiman, L. (2001). Random forests. Machine Learning 45:5-32.
Ishwaran, H. and Kogalur, U.B. (2014). Random Forests for Survival, Regression and Classification (RF-SRC). R package version 1.5.5.
Ishwaran, H. and Kogalur, U.B. (2007). Random survival forests for R. R News 7(2), 25-31.
Ishwaran, H., Kogalur, U.B., Blackstone, E.H. and Lauer, M.S. (2008). Random survival forests. Ann. Appl. Statist. 2(3), 841-860.
Liaw, A. and Wiener, M. (2002). Classification and regression by randomForest. R News 2(3), 18-22.
Wickham, H. (2009). ggplot2: Elegant Graphics for Data Analysis. Springer, New York.