By the power of Grayskull : small sample statistical power in information retrieval evaluation

Laurence A. F. Park, Glenn Stone

    Research output: Chapter in Book / Conference Paper › Conference Paper › peer-review

    Abstract

    Information retrieval evaluation is typically performed using a sample of queries, and a statistical hypothesis test is used to make inferences about the systems' accuracy on the population of queries. Research has shown that, when evaluating with a large sample of queries, the t test is one of a set of tests that provide the greatest statistical power while maintaining acceptable type I error rates. In this article, we investigate the effect of using a small query sample, for which the hypothesis tests may not satisfy Central Limit Theorem conditions, on the control of the type I error rate and the change in type II error rate of a given set of hypothesis tests. We found that all tests performed similarly for unpaired tests. We also found that the bootstrap test provided greater power for the paired test, but violated the desired type I error rate for the smallest sample size (5 queries).
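    The kind of experiment the abstract describes can be sketched as a small Monte Carlo simulation: draw many samples of 5 per-query score differences, apply a paired test to each, and count how often the null hypothesis is rejected. The sketch below is illustrative only and rests on several assumptions that are not taken from the paper: the differences are drawn from a unit-variance normal distribution, the bootstrap test is a simple mean-shift variant, and all function names (`t_test`, `bootstrap_test`, `rejection_rate`) are hypothetical. With `effect=0.0` the rejection rate estimates the type I error rate; with a nonzero `effect` it estimates power (one minus the type II error rate).

```python
import math
import random
import statistics

# Two-sided 5% critical value of Student's t with 4 degrees of freedom,
# which is the right value only for samples of exactly 5 paired differences.
T_CRIT_DF4 = 2.776


def paired_t_rejects(diffs, crit=T_CRIT_DF4):
    """Two-sided paired t test on per-query score differences."""
    n = len(diffs)
    mean = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0.0:
        # Degenerate sample: reject only if the constant difference is nonzero.
        return mean != 0.0
    return abs(mean / (sd / math.sqrt(n))) > crit


def bootstrap_p_value(diffs, rng, n_boot=500):
    """Two-sided bootstrap p-value for the mean difference (mean-shift variant)."""
    n = len(diffs)
    observed = statistics.fmean(diffs)
    # Shift the sample so that its mean is zero, i.e. impose the null hypothesis.
    shifted = [d - observed for d in diffs]
    extreme = 0
    for _ in range(n_boot):
        resample = rng.choices(shifted, k=n)  # sample with replacement
        if abs(sum(resample) / n) >= abs(observed):
            extreme += 1
    return extreme / n_boot


def t_test(diffs, rng):
    """Adapter with the same signature as bootstrap_test; rng is unused."""
    return paired_t_rejects(diffs)


def bootstrap_test(diffs, rng, alpha=0.05, n_boot=500):
    return bootstrap_p_value(diffs, rng, n_boot) < alpha


def rejection_rate(test, n_queries=5, trials=1000, effect=0.0, seed=0):
    """Fraction of simulated query samples on which `test` rejects the null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Assumed data model: normally distributed paired differences.
        diffs = [rng.gauss(effect, 1.0) for _ in range(n_queries)]
        if test(diffs, rng):
            rejections += 1
    return rejections / trials
```

    For example, comparing `rejection_rate(t_test)` with `rejection_rate(bootstrap_test)` at `effect=0.0` shows how closely each test holds the nominal 5% level in this toy setting, and repeating the comparison with, say, `effect=1.0` compares their power. The numbers depend on the assumed data model and should not be read as a reproduction of the paper's results.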
    Original language: English
    Title of host publication: Proceedings of the 19th Australasian Document Computing Symposium, Melbourne, Australia, November 27-28, 2014
    Publisher: ACM
    Pages: 101-104
    Number of pages: 4
    ISBN (Print): 9781450330008
    Publication status: Published - 2014
    Event: Australasian Document Computing Symposium
    Duration: 27 Nov 2014 → …

    Conference

    Conference: Australasian Document Computing Symposium
    Period: 27/11/14 → …

    Keywords

    • information storage and retrieval systems
    • statistical hypothesis testing
    • statistical power analysis
    • t-test (statistics)
