This helper function lists the distances that can be passed to precise_dist. If this list of distances is not sufficient, see precise_func_fact.

precise_dist_list(dists)

Arguments

dists

A character vector naming the group(s) of dists to view. Options include "all_dists", "static_dists", "time_dists", "binary_dists", "nominal_dists", "minkowski_dists", "l1_dists", "intersection_dists", "inner_product_dists", "squared_dists", "correlation_dists", "other_dists", "rf_dists", "kernel_dists", "kodama_dists" and "tsne_dists".

Value

A list of the names of the distances in the chosen group(s).

Details

Some distances take additional arguments, and the values used are generally reflected in the name:

  • Any distance with a trailing "_phil" comes from the philentropy package. This specific designation is included because the results of these distances sometimes differ from those of their counterparts in the proxy package.

  • minkowski_0.5 refers to the Minkowski distance with a power of 0.5.

  • random_forest_sqrt refers to a randomForest similarity where mtry = sqrt(ncol(data)) and ntree = 1501

  • laplace_1 and rbf_1 refer to the kernlab kernel functions (documented under kernlab::dots) where sigma = kernlab::sigest(as.matrix(x), scale = FALSE, frac = 1)[[1]], which is the 0.1 quantile of the estimate of the sigma parameter.

  • kodama_knn refers to KODAMA where FUN = "KNN" and kodama_pls refers to KODAMA where FUN = "PLS-DA".

  • tsne_5 refers to x2p run with perplexity = 5. This result is then directly fed to p2sp.

  • For a detailed look at the default options for each distance, run PreciseDist:::pd_setup (no parentheses) in your console.
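
To make the naming convention concrete, the sketch below recovers the power from the Minkowski names shown in the Examples and computes the corresponding distances. It uses base R's stats::dist as a stand-in, so it is only an approximation of what PreciseDist does internally; the variable names (mink_names, powers, mink_dists) and the choice of the iris data are assumptions made for illustration.

```r
# Illustrative sketch only: base R stand-in, not PreciseDist's own code path.

# Names as printed by precise_dist_list("minkowski_dists") in the Examples
mink_names <- c("minkowski_0.5", "minkowski_1.5", "minkowski_2.5")

# The trailing number in each name is the power of the Minkowski distance
powers <- as.numeric(sub("^minkowski_", "", mink_names))

# Small numeric example matrix (first five rows of the built-in iris data)
x <- as.matrix(iris[1:5, 1:4])

# One distance object per power, labelled with the PreciseDist-style name
mink_dists <- lapply(powers, function(p) {
  stats::dist(x, method = "minkowski", p = p)
})
names(mink_dists) <- mink_names

# Sanity check: a power of 1 reproduces the Manhattan distance
d_p1 <- stats::dist(x, method = "minkowski", p = 1)
d_manhattan <- stats::dist(x, method = "manhattan")
same_as_manhattan <- isTRUE(all.equal(as.vector(d_p1), as.vector(d_manhattan)))
```

The same check with a power of 2 would reproduce the Euclidean distance, which is why "euclidean" and "manhattan" appear alongside the minkowski_* entries in the "minkowski_dists" group.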

References

Muchmore, B., Muchmore, P. and Alarcón-Riquelme, M.E. (2018). Optimal Distance Matrix Construction with PreciseDist and PreciseGraph.

David Meyer and Christian Buchta (2018). proxy: Distance and Similarity Measures. R package version 0.4-22. https://CRAN.R-project.org/package=proxy

Hajk-Georg Drost (2018). philentropy: Similarity and Distance Quantification Between Probability Functions. R package version 0.1.0. https://CRAN.R-project.org/package=philentropy

Stefano Cacciatore, Leonardo Tenori, Claudio Luchinat, Phillip R. Bennett and David A. MacIntyre (2017). KODAMA: Knowledge Discovery by Accuracy Maximization. R package version 1.4. https://CRAN.R-project.org/package=KODAMA

Alexandros Karatzoglou, Alex Smola, Kurt Hornik, Achim Zeileis (2004). kernlab - An S4 Package for Kernel Methods in R. Journal of Statistical Software 11(9), 1-20. URL http://www.jstatsoft.org/v11/i09/

A. Liaw and M. Wiener (2002). Classification and Regression by randomForest. R News 2(3), 18--22.

M. G. Baydogan and G. Runger (2013). Time Series Similarity Based on a Pattern-based Representation. submitted for publication

Pablo Montero, José A. Vilar (2014). TSclust: An R Package for Time Series Clustering. Journal of Statistical Software, 62(1), 1-43. URL http://www.jstatsoft.org/v62/i01/.

Usue Mori, Alexander Mendiburu, Jose A. Lozano (2016). Distance Measures for Time Series in R: The TSdist Package. The R Journal, 8(2), 451--459. URL https://journal.r-project.org/archive/2016/RJ-2016-058/index.html

Benjamin J. Radford (2017). mmtsne: Multiple Maps t-SNE. R package version 0.1.0. https://CRAN.R-project.org/package=mmtsne

Examples

precise_dist_list("all_dists") %>% as.matrix()
#>        [,1]
#>   [1,] "euclidean"
#>   [2,] "supremum"
#>   [3,] "manhattan"
#>   [4,] "minkowski_0.5"
#>   [5,] "minkowski_1.5"
#>   [6,] "minkowski_2.5"
#>   [7,] "sorensen"
#>   [8,] "gower"
#>   [9,] "gower_phil"
#>  [10,] "soergel"
#>  [11,] "soergel_phil"
#>  [12,] "kulczynski_d"
#>  [13,] "canberra"
#>  [14,] "canberra_phil"
#>  [15,] "lorentzian"
#>  [16,] "wave"
#>  [17,] "wavehedges"
#>  [18,] "intersection"
#>  [19,] "non_intersection"
#>  [20,] "czekanowski"
#>  [21,] "motyka"
#>  [22,] "tanimoto"
#>  [23,] "ruzicka"
#>  [24,] "kulczynski_s"
#>  [25,] "inner_product"
#>  [26,] "harmonic_mean"
#>  [27,] "cosine"
#>  [28,] "fjaccard"
#>  [29,] "ejaccard"
#>  [30,] "edice"
#>  [31,] "dice"
#>  [32,] "squared_euclidean"
#>  [33,] "neyman"
#>  [34,] "squared_chi"
#>  [35,] "divergence"
#>  [36,] "clark"
#>  [37,] "additive_symm"
#>  [38,] "correlation"
#>  [39,] "pearson2"
#>  [40,] "sq_pearson"
#>  [41,] "mahalanobis"
#>  [42,] "bray"
#>  [43,] "chord"
#>  [44,] "geodesic"
#>  [45,] "whittaker"
#>  [46,] "avg"
#>  [47,] "random_forest_sqrt"
#>  [48,] "random_forest_two"
#>  [49,] "random_forest_half"
#>  [50,] "laplace_1"
#>  [51,] "laplace_2"
#>  [52,] "laplace_3"
#>  [53,] "rbf_1"
#>  [54,] "rbf_2"
#>  [55,] "rbf_3"
#>  [56,] "kodama_knn"
#>  [57,] "kodama_pls"
#>  [58,] "tsne_5"
#>  [59,] "tsne_30"
#>  [60,] "tsne_50"
#>  [61,] "acf"
#>  [62,] "ar.lpc.ceps"
#>  [63,] "ar.mah"
#>  [64,] "ar.pic"
#>  [65,] "cid"
#>  [66,] "cor"
#>  [67,] "cort"
#>  [68,] "dtwarp"
#>  [69,] "dwt"
#>  [70,] "int.per"
#>  [71,] "pacf"
#>  [72,] "per"
#>  [73,] "mindist.sax"
#>  [74,] "spec.llr"
#>  [75,] "spec.glk"
#>  [76,] "dtw"
#>  [77,] "infnorm"
#>  [78,] "ccor"
#>  [79,] "sts"
#>  [80,] "lb.keogh"
#>  [81,] "edr"
#>  [82,] "erp"
#>  [83,] "lcss"
#>  [84,] "fourier"
#>  [85,] "dissim"
#>  [86,] "pdc"
#>  [87,] "wav"
#>  [88,] "lps"
#>  [89,] "jaccard"
#>  [90,] "kulczynski1"
#>  [91,] "kulczynski2"
#>  [92,] "mountford"
#>  [93,] "fager"
#>  [94,] "russel"
#>  [95,] "simple_matching"
#>  [96,] "hamman"
#>  [97,] "faith"
#>  [98,] "tanimoto"
#>  [99,] "dice"
#> [100,] "phi"
#> [101,] "stiles"
#> [102,] "michael"
#> [103,] "mozley"
#> [104,] "yule"
#> [105,] "yule2"
#> [106,] "ochiai"
#> [107,] "simpson"
#> [108,] "braun_blanquet"
#> [109,] "chi_squared"
#> [110,] "phi_squared"
#> [111,] "tschuprow"
#> [112,] "cramer"
#> [113,] "pearson"
precise_dist_list(c("kodama_dists", "minkowski_dists")) %>% as.matrix()
#>      [,1]
#> [1,] "kodama_knn"
#> [2,] "kodama_pls"
#> [3,] "euclidean"
#> [4,] "supremum"
#> [5,] "manhattan"
#> [6,] "minkowski_0.5"
#> [7,] "minkowski_1.5"
#> [8,] "minkowski_2.5"