Inducing controlled error over variable length ranked lists

Laurence A. F. Park, Glenn Stone

    Research output: Chapter in Book / Conference PaperConference Paperpeer-review

    2 Citations (Scopus)

    Abstract

    ![CDATA[When examining the robustness of systems that take ranked lists as input, we can induce noise, measured in terms of Kendall's tau rank correlation, by applying a set number of random adjacent transpositions. The set number of random transpositions ensures that any ranked lists, induced with this noise, has a specific expected Kendall's tau. However, if we have ranked lists of varying length, it is not clear how many random transpositions we must apply to each list to ensure that we obtain a consistent expected Kendall's tau across the collection. In this article we investigate how to compute the number of random adjacent transpositions required to obtain an expected Kendall's tau for a given list length, and find that it is infeasible to compute for lists of length more than 9. We also investigate an alternate and more efficient method of inducing noise in ranked lists called Gaussian Perturbation. We show that using this method, we can compute the parameters required to induce a consistent level of noise for lists of length 107 in just over six minutes. We also provide an approximate solution to provide results in less than 10 -5 seconds.]]
    Original languageEnglish
    Title of host publicationAdvances in Knowledge Discovery and Data Mining: 18th Pacific-Asia Conference, PAKDD 2014, Tainan, Taiwan, May 13-16, 2014. Proceedings, Part II
    PublisherSpringer
    Pages259-270
    Number of pages12
    ISBN (Print)9783319066042
    DOIs
    Publication statusPublished - 2014
    EventPacific-Asia Conference on Knowledge Discovery and Data Mining -
    Duration: 13 May 2013 → …

    Publication series

    Name
    ISSN (Print)0302-9743

    Conference

    ConferencePacific-Asia Conference on Knowledge Discovery and Data Mining
    Period13/05/13 → …

    Fingerprint

    Dive into the research topics of 'Inducing controlled error over variable length ranked lists'. Together they form a unique fingerprint.

    Cite this