TY - GEN
T1 - Inducing controlled error over variable length ranked lists
AU - Park, Laurence A. F.
AU - Stone, Glenn
PY - 2014
Y1 - 2014
N2 - ![CDATA[When examining the robustness of systems that take ranked lists as input, we can induce noise, measured in terms of Kendall's tau rank correlation, by applying a set number of random adjacent transpositions. The set number of random transpositions ensures that any ranked lists, induced with this noise, has a specific expected Kendall's tau. However, if we have ranked lists of varying length, it is not clear how many random transpositions we must apply to each list to ensure that we obtain a consistent expected Kendall's tau across the collection. In this article we investigate how to compute the number of random adjacent transpositions required to obtain an expected Kendall's tau for a given list length, and find that it is infeasible to compute for lists of length more than 9. We also investigate an alternate and more efficient method of inducing noise in ranked lists called Gaussian Perturbation. We show that using this method, we can compute the parameters required to induce a consistent level of noise for lists of length 107 in just over six minutes. We also provide an approximate solution to provide results in less than 10 -5 seconds.]]
AB - ![CDATA[When examining the robustness of systems that take ranked lists as input, we can induce noise, measured in terms of Kendall's tau rank correlation, by applying a set number of random adjacent transpositions. The set number of random transpositions ensures that any ranked lists, induced with this noise, has a specific expected Kendall's tau. However, if we have ranked lists of varying length, it is not clear how many random transpositions we must apply to each list to ensure that we obtain a consistent expected Kendall's tau across the collection. In this article we investigate how to compute the number of random adjacent transpositions required to obtain an expected Kendall's tau for a given list length, and find that it is infeasible to compute for lists of length more than 9. We also investigate an alternate and more efficient method of inducing noise in ranked lists called Gaussian Perturbation. We show that using this method, we can compute the parameters required to induce a consistent level of noise for lists of length 107 in just over six minutes. We also provide an approximate solution to provide results in less than 10 -5 seconds.]]
UR - http://handle.uws.edu.au:8081/1959.7/548038
U2 - 10.1007/978-3-319-06605-9_22
DO - 10.1007/978-3-319-06605-9_22
M3 - Conference Paper
SN - 9783319066042
SP - 259
EP - 270
BT - Advances in Knowledge Discovery and Data Mining: 18th Pacific-Asia Conference, PAKDD 2014, Tainan, Taiwan, May 13-16, 2014. Proceedings, Part II
PB - Springer
T2 - Pacific-Asia Conference on Knowledge Discovery and Data Mining
Y2 - 13 May 2013
ER -