Weighted kernel model for text categorization

Lei Zhang, Debbie Zhang, Simeon J. Simoff, John K. Debenham, Peter Christen

    Research output: Chapter in Book / Conference Paper › Conference Paper

    Abstract

    The traditional bag-of-words model and the more recent word-sequence kernel are two well-known techniques in text categorization. The bag-of-words representation ignores word order, which can reduce accuracy for some types of documents. The word-sequence kernel takes word order into account but does not capture all of the word-frequency information. The authors proposed a weighted kernel model that combines these two models. This paper focuses on optimizing the weighting parameters, which are functions of word frequency. Experiments on the Reuters database show that the new weighted kernel achieves better classification accuracy.
    Original language: English
    Title of host publication: Data Mining and Analytics 2006: Proceedings of the Fifth Australasian Data Mining Conference (AusDM2006), Sydney, Australia, 29-30 November, 2006
    Publisher: Australian Computer Society
    Number of pages: 4
    ISBN (Print): 1920682422
    Publication status: Published - 2006
    Event: Australasian Data Mining Conference
    Duration: 2 Dec 2019 → …

    Conference

    Conference: Australasian Data Mining Conference
    Period: 2/12/19 → …

    Keywords

    • text categorization
    • kernel methods
    • algorithms
    • parameters
