precise_graph.Rd
This function is the set-up function for precise_viz. It creates and clusters a sparsified t-SNE graph through the following general steps.
See details for more specific information.
1. Run several iterations of t-SNE on the input data.
2. Compute the random forest similarity for each t-SNE iteration output.
3. Fuse the random forest similarities using distatis.
4. Sparsify the fused matrix using Sparsify.matrix(k = perplexity).
5. Cluster the graph using cluster_louvain.
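The steps above can be sketched roughly as follows. This is an illustrative approximation, not the package's implementation: it assumes the Rtsne, randomForest and igraph packages are installed, replaces the distatis fusion with a plain average of the similarity matrices, and uses a hypothetical helper, sparsify_knn, as a stand-in for Sparsify.matrix. The data, the number of runs and the tree count are arbitrary.

```r
library(Rtsne)
library(randomForest)
library(igraph)

set.seed(1)
x <- matrix(rnorm(50 * 10), nrow = 50)   # toy data: 50 samples
d <- as.matrix(dist(x))                  # square distance matrix as input

n_runs <- 3
perplexity <- 5

# Steps 1-2: run t-SNE several times, then compute an unsupervised
# random forest proximity (similarity) matrix for each embedding.
prox_list <- lapply(seq_len(n_runs), function(i) {
  emb <- Rtsne(d, dims = 2, perplexity = perplexity, theta = 0.5,
               max_iter = 500, is_distance = TRUE)$Y
  randomForest(x = emb, ntree = 200, proximity = TRUE)$proximity
})

# Step 3 (swapped-in technique): a simple average instead of distatis.
fused <- Reduce(`+`, prox_list) / n_runs

# Step 4 (hypothetical stand-in for Sparsify.matrix): keep the k largest
# similarities in each row, then symmetrize.
sparsify_knn <- function(sim, k) {
  diag(sim) <- 0
  keep <- t(apply(sim, 1, function(row) rank(-row, ties.method = "first") <= k))
  sparse <- sim * keep
  pmax(sparse, t(sparse))
}
sparse <- sparsify_knn(fused, k = perplexity)

# Step 5: build a weighted graph and cluster it with Louvain.
g <- graph_from_adjacency_matrix(sparse, mode = "max", weighted = TRUE, diag = FALSE)
clusters <- membership(cluster_louvain(g))
table(clusters)
```

Because t-SNE and random forests are both stochastic, cluster counts will vary between seeds; averaging over several runs is what the fusion step is meant to stabilize.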
precise_graph(data, method = 1, distance = TRUE, n_neighbors = 15, spread = 1, min_dist = 0.01, bandwidth = 1, parallel = FALSE, verbose = FALSE)
Argument | Description |
---|---|
data | A square, numeric dataframe, matrix or tibble. |
verbose | TRUE or FALSE. Should the function tell you what is happening internally? |
perplexity | A positive integer that loosely equates to the number of nearest neighbors. See details. |
theta | A numeric value between 0 and 1 that toggles between exact t-SNE and Barnes-Hut t-SNE. See details. |
max_iter | A positive integer for the number of iterations. See details. |
cores | An integer value equal to 1 or greater for the number of computer cores to use. |
A list with four objects: tsne_dist = a matrix of the distatis fusion of the random forest similarities computed from the t-SNE results, tsne_d2 = a tibble with the 2D t-SNE results, tsne_d3 = a tibble with the 3D t-SNE results, precise_clusters = a tibble of the Louvain clustering.
This function has a lot going on under the hood, and some parameters that may seem familiar to many users are used in ways one might not expect:
perplexity sets both the perplexity parameter for Rtsne and k for Sparsify.matrix.
theta toggles between two versions of t-SNE. If theta = 0.0, a modified version of tsne is run where, instead of using the perplexity parameter, Sparsify.matrix(k = perplexity) is run on the input matrix. If theta > 0.0, Rtsne is run.
max_iter sets both the number of iterations for t-SNE and the number of trees for randomForest.
cores has a maximum useful setting of 4.
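As a reading aid, the dual roles described above can be written out as a small lookup. The helper below, resolve_settings, is hypothetical and not part of PreciseDist; it only restates the rules in this section, and capping cores at 4 is one interpretation of "maximum useful setting".

```r
# Hypothetical helper (not a PreciseDist function): shows where each
# user-facing argument ends up, according to the rules described above.
resolve_settings <- function(perplexity, theta, max_iter, cores) {
  stopifnot(perplexity > 0, theta >= 0, theta <= 1, max_iter > 0, cores >= 1)
  list(
    # theta = 0 selects the modified exact tsne; theta > 0 selects Rtsne
    tsne_engine     = if (theta == 0) "tsne" else "Rtsne",
    # perplexity is passed to Rtsne only; the tsne path sparsifies instead
    tsne_perplexity = if (theta == 0) NA else perplexity,
    # perplexity doubles as k for the sparsification step
    sparsify_k      = perplexity,
    # max_iter doubles as the number of random forest trees
    tsne_iterations = max_iter,
    forest_trees    = max_iter,
    # assumption: cores beyond 4 bring no benefit, so cap the value
    cores_used      = min(cores, 4)
  )
}

str(resolve_settings(perplexity = 5, theta = 0.5, max_iter = 1000, cores = 8))
```

One practical consequence is that raising max_iter to improve t-SNE convergence also grows the random forest, so run time increases on both fronts.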
Muchmore, B., Muchmore P. and Alarcón-Riquelme ME. (2018). Optimal Distance Matrix Construction with PreciseDist and PreciseGraph.
Jesse H. Krijthe (2015). Rtsne: T-Distributed Stochastic Neighbor Embedding using a Barnes-Hut Implementation, URL: https://github.com/jkrijthe/Rtsne
Justin Donaldson (2016). tsne: T-Distributed Stochastic Neighbor Embedding for R (t-SNE). R package version 0.1-3. https://CRAN.R-project.org/package=tsne
Csardi G, Nepusz T: The igraph software package for complex network research, InterJournal, Complex Systems 1695. 2006. http://igraph.org
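# NOT RUN {
library(PreciseDist)
library(magrittr)  # provides %>% in case PreciseDist does not re-export it

# Toy data: 50 observations of 10 variables
test_matrix <- replicate(10, rnorm(50))

# Compute two distance matrices
test_distances <- test_matrix %>%
  precise_dist(dists = c("euclidean", "manhattan"))

# Fuse the distance matrices with distatis
test_fusion <- test_distances %>%
  precise_fusion(fusion = "distatis", verbose = TRUE)

# Build and cluster the sparsified t-SNE graph
test_graph <- test_fusion %>%
  precise_graph(perplexity = 5, theta = 0.5, max_iter = 1000, cores = 1, verbose = TRUE)
# }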