Addressing the variability of natural language expression in sentence similarity with semantic structure of the sentences

In this paper, we present a new approach that incorporates the semantic structure of sentences, in the form of verb-argument structure, to measure semantic similarity between sentences. The variability of natural language expression makes it difficult for …

The Evaluation of Sentence Similarity Measures

The ability to accurately judge the similarity between natural language sentences is critical to the performance of several applications such as text mining, question answering, and text summarization. Given two sentences, an effective similarity …

Semantic Representation in Text Classification using Topic Signature Mapping

Document representation is one of the crucial components that determine the effectiveness of text classification tasks. Traditional document representation approaches typically adopt a popular bag-of-words method as the underlying document …

Utilization of Global Ranking Information in Graph-based Biomedical Literature Clustering

In this paper, we explore how a global ranking method, in conjunction with a local density method, helps identify meaningful term clusters from an ontology-enriched graph representation of a biomedical literature corpus. One big problem with document clustering …

A framework for text processing and supporting access to collections of digitized historical newspapers

Large quantities of historical newspapers are being digitized and OCR'd. We describe a framework for processing the OCR'd text to identify articles and extract metadata for them. We describe the article schema and provide examples of features that …

Semantically Enhanced User Modeling

Content-based implicit user modeling techniques usually employ a traditional term vector as a representation of the user's interest. However, due to the problem of dimensionality in the vector space model, a simple term vector is not a sufficient …

Algorithms for different approximations in incomplete information systems with maximal compatible classes as primitive granules

This paper proposes some expanded rough set models with maximal compatible classes as primitive granules, introduces two new granules for extending the rough set model, and designs algorithms to compute maximal compatible classes, to find the lower and …

A tool for teaching principles of image metadata generation

We developed a simple web-based prototype to familiarize students with digital library tools. To assist the students with the indexing task, the prototype provided basic functionality, including a metadata input form and a photo search interface. The …

Voting and political information gathering on paper and online

Electronic voting is slowly making its way into American politics. At the same time, more voters and potential voters are using online news and political information sources to help them make voting choices. We conducted a mock-voting study, using …