Using Negative Voting to Diversify Answers in Non-factoid Question Answering

We propose a ranking model to diversify answers to non-factoid questions based on an inverse notion of graph connectivity. By representing a collection of candidate answers as a graph, we posit that novelty, a measure of diversity, is inversely …

AskDragon: A redundancy-based factoid question answering system with lightweight local context analysis

We introduce our QA system, AskDragon, which employs a novel lightweight local context analysis technique for handling two broad classes of factoid questions: entity and numeric questions. The local context analysis module dramatically improves the …

Addressing the variability of natural language Expression in sentence similarity with semantic structure of the sentences

In this paper, we present a new approach that incorporates the semantic structure of sentences, in the form of verb-argument structure, to measure semantic similarity between sentences. The variability of natural language expression makes it difficult for …

Utilizing Semantic, Syntactic, and Question category Information for Automated Digital Reference Services

Digital reference services normally rely on human experts to provide quality answers to user requests via online communication tools. As the services gain popularity, more experts are needed to keep up with growing demand. Alternatively, …

The Evaluation of Sentence Similarity Measures

The ability to accurately judge the similarity between natural language sentences is critical to the performance of several applications such as text mining, question answering, and text summarization. Given two sentences, an effective similarity …

Semantic Representation in Text Classification using Topic Signature Mapping

Document representation is one of the crucial components that determine the effectiveness of text classification tasks. Traditional document representation approaches typically adopt the popular bag-of-words method as the underlying document …

Supporting student collaboration for image indexing

We describe the Image Tagger system, a web-based tool for supporting collaborative image indexing by students. The tool has been used in three successive graduate-level classes on content representation. To fully satisfy the class' requirements and …

Utilization of Global Ranking Information in Graph-based Biomedical Literature Clustering

In this paper, we explore how a global ranking method, in conjunction with a local density method, helps identify meaningful term clusters from an ontology-enriched graph representation of a biomedical literature corpus. One big problem with document clustering …

A framework for text processing and supporting access to collections of digitized historical newspapers

Large quantities of historical newspapers are being digitized and OCR'd. We describe a framework for processing the OCR'd text to identify articles and extract metadata for them. We describe the article schema and provide examples of features that …

Semantically Enhanced User Modeling

Content-based implicit user modeling techniques usually employ a traditional term vector as a representation of the user's interest. However, due to the problem of dimensionality in the vector space model, a simple term vector is not a sufficient …