Papers by Dr. Muhammad Tanvir Afzal

Journal of IT in Asia, Apr 20, 2016
Knowledge diffusion is of prime importance for the generation of new knowledge: new knowledge cannot be created without referring to or consulting past work. Two important flows related to knowledge diffusion can be observed in researchers' common practice. First, in scientific research, knowledge diffusion is generally estimated using citation counts to establish the value of knowledge, a practice that inflates citations. Second, researchers use cited work to search for connected and related resources. Recently, the social and collaborative phenomenon termed Web 2.0 has spurred a new era of knowledge and information flow on the web, but its potential for the growth and diffusion of scientific knowledge has not been well explored. Emerging social and collaborative applications, such as tagging and bookmarking, are transforming the ways scientists and researchers organize their personal and collaborative information spaces. These bookmarking and tagging applications provide open data and rich metadata resources such as tags. Past research shows that bookmarking and tagging can be used as supplementary indicators for measuring research popularity and knowledge diffusion. The current work exploits the author keywords of scientific publications to link these resources with relevant tags extracted from a social bookmarking application, CiteULike. For a focus resource, this work compares the tags extracted from CiteULike based on author keywords with the resource's corresponding CiteULike tag cloud. The results show that the system extends the author keyword set with social tags, providing links to rich and focused resources in CiteULike and enhancing the serendipitous discovery of emerging concepts related to the focus resource. Such a system may improve the discovery of related and popular resources for researchers. The dataset has been made publicly available to the scientific community.
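A minimal sketch of the keyword-to-tag linking idea: extend a paper's author keywords with social tags that co-occur with them in bookmarking data. The data layout, normalization, and overlap rule below are illustrative assumptions; the abstract does not specify the actual extraction mechanics.

```python
# Sketch: extend a paper's author keywords with social tags that
# co-occur with those keywords in bookmarking data. Data layout and
# matching rule are illustrative assumptions, not the paper's method.

def normalize(term: str) -> str:
    """Lowercase and treat punctuation as spaces, so 'Social-Tagging'
    matches 'social tagging'."""
    cleaned = "".join(c if c.isalnum() else " " for c in term.lower())
    return " ".join(cleaned.split())

def extend_keywords(author_keywords, bookmark_tag_sets):
    """author_keywords: keywords the authors gave the paper.
    bookmark_tag_sets: one set of tags per bookmark of the paper.
    Returns the keywords plus co-occurring social tags, most frequent first."""
    keys = {normalize(k) for k in author_keywords}
    counts = {}
    for tags in bookmark_tag_sets:
        norm = {normalize(t) for t in tags}
        if norm & keys:                   # bookmark mentions an author keyword
            for extra in norm - keys:     # collect the *additional* tags
                counts[extra] = counts.get(extra, 0) + 1
    return sorted(keys) + sorted(counts, key=counts.get, reverse=True)

bookmarks = [{"knowledge-diffusion", "citation analysis", "citeulike"},
             {"Knowledge Diffusion", "social tagging"}]
print(extend_keywords(["knowledge diffusion"], bookmarks))
# keywords first, then co-occurring tags such as 'citation analysis'
```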

J. Univers. Comput. Sci., 2011
Finding experts in academia as well as in enterprises is an important practical problem. Both manual and automated approaches are employed, and each has its own pros and cons. On one hand, manual approaches need extensive human effort but yield good data quality; on the other hand, automated approaches normally need no human effort but offer a quality of service below that of manual approaches. Furthermore, automated approaches normally use only one metric to measure an individual's expertise; for example, to find experts in academia, an individual's number of publications is used to discover and rank experts. This paper illustrates both manual and automated approaches for finding experts and subsequently proposes and implements an automated approach for measuring expertise profiles in academia. The proposed approach incorporates multiple metrics for measuring an overall expertise level. To visualize a ranked list of experts, an extended hyperbolic …
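Since the abstract names neither the metrics nor the combination rule, the sketch below illustrates the general multi-metric idea as a weighted sum over hypothetical metrics (publication count, citation count, years active):

```python
# Sketch: combine several expertise metrics into one score per author.
# The metrics, normalizations, and weights are illustrative assumptions,
# not the paper's actual choices.

from dataclasses import dataclass

@dataclass
class AuthorStats:
    name: str
    publications: int
    citations: int
    years_active: int

def expertise_score(a: AuthorStats, weights=(0.4, 0.4, 0.2)) -> float:
    # Crudely normalize each metric before weighting so no single
    # raw count dominates the combined score.
    w_pub, w_cit, w_act = weights
    return (w_pub * a.publications / 100
            + w_cit * a.citations / 1000
            + w_act * a.years_active / 40)

authors = [AuthorStats("A", 80, 900, 15), AuthorStats("B", 120, 400, 30)]
for a in sorted(authors, key=expertise_score, reverse=True):
    print(a.name, round(expertise_score(a), 3))
```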

J. Univers. Comput. Sci., 2012
The Linked Open Data project provides a new publishing paradigm for creating machine-readable, structured data on the Web. Currently, the significant presence of datasets describing scholarly publications in the Linked Data cloud underpins the importance of Linked Data for the scientific community and for the open access movement. However, these semantically rich datasets need to be exploited and linked with real-time applications. In the project reported on here, we have exploited numerous scholarly datasets and created semantic links to papers in an online journal, the Journal of Universal Computer Science (J.UCS). J.UCS plays an important part in the computer science publishing community and provides a number of innovative features and datasets to its web users. However, the legacy HTML format in which these features are made available makes them difficult for machines to understand and query. Keeping in mind the impressive benefits of the Linked Open Data p…
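As a rough illustration of what exposing one J.UCS paper as Linked Data could look like, here is a sketch using the rdflib package with Dublin Core and FOAF terms; the base URI and property choices are assumptions for illustration, not the project's actual vocabulary:

```python
# Sketch: expose one paper's legacy metadata as RDF triples.
# The URIs and vocabulary choices below are assumptions.

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, FOAF, RDF

JUCS = Namespace("http://example.org/jucs/")   # hypothetical base URI

g = Graph()
paper = URIRef(JUCS["paper/1234"])
author = URIRef(JUCS["author/afzal"])

g.add((paper, RDF.type, FOAF.Document))
g.add((paper, DCTERMS.title, Literal("Turning keywords into URIs")))
g.add((paper, DCTERMS.creator, author))
g.add((author, RDF.type, FOAF.Person))
g.add((author, FOAF.name, Literal("Muhammad Tanvir Afzal")))

print(g.serialize(format="turtle"))
```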


Proceedings of the 7th International Conference on Frontiers of Information Technology, 2009
In numerous contexts and environments, it is necessary to identify and assign (potential) experts to subject fields. In the context of an academic journal for computer science (J.UCS), papers and reviewers are classified using the ACM classification scheme. This paper describes a system to identify and present potential reviewers for each category, drawn from the entire body of paper authors. The topical classification hierarchy is visualized as a hyperbolic tree, and currently assigned reviewers are listed for a selected node (computer science category). In addition, a spiral visualization is used to overlay a ranked list of further potential reviewers (high-profile authors) around the currently selected category. This new interface eases the task of journal editors in finding and assigning reviewers. The system is also useful for users who want to find research collaborators in specific research areas.
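The abstract does not state how the ranked list of potential reviewers is computed; a simple illustrative assumption is to rank authors by their number of papers in the selected ACM category:

```python
# Sketch: rank potential reviewers per ACM category by how many of
# their papers fall into that category. The ranking criterion is an
# assumption; the paper does not state the exact scoring used.

from collections import Counter, defaultdict

papers = [  # (authors, ACM categories) -- toy data
    (["Afzal", "Maurer"], ["H.3.3"]),
    (["Afzal"], ["H.3.3", "H.5.4"]),
    (["Smith"], ["H.5.4"]),
]

by_category = defaultdict(Counter)
for authors, categories in papers:
    for cat in categories:
        by_category[cat].update(authors)

def potential_reviewers(category, top_n=5):
    """Authors with the most papers in `category`, best first."""
    return by_category[category].most_common(top_n)

print(potential_reviewers("H.3.3"))   # [('Afzal', 2), ('Maurer', 1)]
```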

Journal of Digital Information Management, 2010
Linked Open Data (LOD) is becoming an essential part of the Semantic Web. Although LOD has amassed large quantities of structured data from diverse, openly available data sources, there is still a lack of user-friendly interfaces and mechanisms for exploring this huge resource. In this paper, we describe a methodology for harvesting relevant information from the gigantic LOD cloud. The methodology is based on a combination of information identification, extraction, integration, and presentation. Relevant information is identified using a set of heuristics. The identified information resource is extracted by employing an intelligent URI discovery technique. The extracted information is further integrated with the help of a Concept Aggregation Framework, and the result is presented to end users organized into logical informational aspects. Thereby, the proposed system is capable of hiding the complex underlying semantic mechanics from end users and reducing their cognitive load in locating relevant information. In this paper, we describe the methodology and its implementation in the CAF-SIAL system, and compare it with the state of the art.
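One simple URI-discovery heuristic in the spirit described above is to look up a concept label in DBpedia via SPARQL and take an exactly matching resource. The actual CAF-SIAL heuristics are not detailed in the abstract, so this sketch (using the SPARQLWrapper package) is an illustrative stand-in:

```python
# Sketch: discover a candidate LOD URI for a plain-text concept name
# by matching rdfs:label in DBpedia. One illustrative heuristic, not
# the actual CAF-SIAL heuristic set.

from SPARQLWrapper import SPARQLWrapper, JSON

def discover_uri(concept: str, lang: str = "en"):
    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setQuery(f"""
        SELECT ?s WHERE {{
            ?s rdfs:label "{concept}"@{lang} .
        }} LIMIT 1
    """)
    sparql.setReturnFormat(JSON)
    bindings = sparql.query().convert()["results"]["bindings"]
    return bindings[0]["s"]["value"] if bindings else None

print(discover_uri("Tim Berners-Lee"))
# -> a DBpedia resource URI, if the label matches exactly
```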

Medieval manuscripts and other written documents from that period contain valuable information about the people, religion, and politics of the medieval period, making the study of medieval documents a necessary prerequisite to gaining in-depth knowledge of medieval history. Although tool-less study of such documents is possible and has been ongoing for centuries, much subtle information remains locked in such manuscripts unless it is revealed by effective means of computational analysis. Automatic analysis of medieval manuscripts is a non-trivial task, mainly due to non-conforming styles, spelling peculiarities, and the lack of relational structures (hyperlinks) that could be used to answer meaningful queries. Natural Language Processing (NLP) tools and algorithms are used to carry out computational analysis of text data; however, due to the high percentage of spelling variations in medieval manuscripts, they cannot be applied directly. If the spelling variations are mapped to standard dictionary words, applying standard NLP tools and algorithms becomes possible. In this paper we describe a web-based software tool, CAMM (Computational Analysis of Medieval Manuscripts), that maps medieval spelling variations to a modern German dictionary. We describe the steps taken to acquire, reformat, and analyze the data and to produce putative mappings, as well as the steps taken to evaluate the findings. At the time of writing, CAMM provides access to 11,275 manuscripts organized into 54 collections containing a total of 242,446 distinctly spelled words. CAMM accurately corrects the spelling of 55% of the verifiable words. CAMM is freely available at http://researchworks.cs.athabascau.ca/
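A minimal sketch of the general variant-mapping idea: map a medieval spelling to the closest modern dictionary word. Plain Levenshtein edit distance and the distance threshold are illustrative assumptions standing in for CAMM's actual mapping rules:

```python
# Sketch: map a medieval spelling variant to the closest modern
# dictionary word by edit distance. Plain Levenshtein distance is an
# illustrative stand-in for CAMM's actual mapping procedure.

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def map_variant(variant: str, dictionary, max_dist: int = 2):
    """Return the closest dictionary word within max_dist, else None."""
    best = min(dictionary, key=lambda w: levenshtein(variant, w))
    return best if levenshtein(variant, best) <= max_dist else None

modern = ["und", "haus", "herz", "jahr"]   # toy German dictionary
print(map_variant("hertz", modern))        # -> 'herz'
```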

Journal of Universal Computer Science, 2007
We are approaching an era where research materials will be stored more and more as digital resources on the World Wide Web. This will, of course, enable easier access to online publications. As the number of electronic publications expands, however, it will become a challenge for individuals to find related or relevant papers. Related papers could be papers written by the same team of authors or by one of the authors, or even papers that deal with the same topic but were written by other authors. This raises the issue of linking to papers forward in time, or as we call it, "links into the future". To be concrete: while reading a paper written in the year 1980, it would be nice to know whether the same author has written another related paper in the 1990s, or had written a paper earlier, all without making an explicit search. Based on the ascertained interest of a person reading a particular paper from a digital repository, an auto-suggestion facility could be useful to indicate papers in the same area, category, and subject that might potentially be of interest to the reader. One is typically interested in finding related papers by the same author or by one of the authors of a paper. This feature can be implemented in two ways. The first is by creating links from the paper to all relevant papers and updating them periodically as new papers appear on the World Wide Web. The other is by going through the references of all papers appearing on the WWW: based on the references, one can create mutual links between the papers that are referred to. In this paper, we focus on offering personalised services beyond standard global access and explore means of identifying the relevance (or relatedness) of papers. A related paper can mean different things to different people, as explained above. Ideally, related papers are found and made accessible using links into the future that can be customised to suit the needs of individual users. Here we focus on a subset of the problem: we explore links into the future in the context of a particular journal that has existed for the past 13 years, with over 1500 published papers. We discuss problems that arise in this restricted context while providing details of partial implementations. We plan to pursue our ideas in a more general setting in future implementations.
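A minimal sketch of the first flavor of "links into the future" described above: for a given paper, list later papers that share at least one author. The author-overlap rule is an illustrative assumption:

```python
# Sketch: "links into the future" by author overlap -- for a given
# paper, list later papers sharing at least one author. The matching
# rule is an illustrative assumption, not the paper's full method.

from dataclasses import dataclass

@dataclass(frozen=True)
class Paper:
    title: str
    year: int
    authors: frozenset

def links_into_the_future(paper: Paper, corpus):
    """Later papers sharing an author with `paper`, oldest first."""
    return sorted(
        (p for p in corpus
         if p.year > paper.year and p.authors & paper.authors),
        key=lambda p: p.year)

old = Paper("Hypertext systems", 1980, frozenset({"Maurer"}))
corpus = [
    Paper("Links into the future", 2007, frozenset({"Afzal", "Maurer"})),
    Paper("Unrelated work", 1995, frozenset({"Smith"})),
]
for p in links_into_the_future(old, corpus):
    print(p.year, p.title)
```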

Journal of Digital Information Management, 2010
Citation management is an important task in managing digital libraries. Citations provide valuable information used, for example, in evaluating an author's influence or scholarly quality (the impact factor of research journals). But although reliable and effective autonomous citation management is essential, manual citation management can be extremely costly. Automatic citation mining, on the other hand, is a non-trivial task, mainly due to non-conforming citation styles, spelling errors, and the difficulty of reliably extracting text from PDF documents. In this paper we propose a novel rule-based autonomous citation mining technique to address this important task. We define a set of common heuristics that together improve on the state of the art in automatic citation mining. Moreover, by first disambiguating citations based on venues, our technique significantly enhances the correct discovery of citations. Our experiments show that the proposed approach is indeed able to overcome limitations of current leading citation indexes such as ISI Web of Knowledge, CiteSeer, and Google Scholar.
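One illustrative rule in the spirit of the paper's heuristics: anchor on a known venue string first, then split the reference around it. The actual rule set is not given in the abstract, so the pattern below is an assumption:

```python
# Sketch: venue-anchored citation parsing. Locate a known venue name
# in a raw reference string, then treat the text before it as the
# authors/title part. Illustrative heuristic, not the paper's rules.

import re

KNOWN_VENUES = [
    "Journal of Universal Computer Science",
    "Journal of Digital Information Management",
]

def parse_citation(raw: str):
    for venue in KNOWN_VENUES:
        # Case-insensitive exact match of the known venue name.
        m = re.search(re.escape(venue), raw, flags=re.IGNORECASE)
        if m:
            head = raw[:m.start()].rstrip(" ,.")
            year = re.search(r"(19|20)\d{2}", raw[m.end():])
            return {"authors_title": head,
                    "venue": venue,
                    "year": year.group() if year else None}
    return None

print(parse_citation(
    "M. T. Afzal, Links into the future, "
    "Journal of Universal Computer Science, 2007."))
```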

iicm.edu, 2002
Content analysis has been a tradition of many electronic and printed journals, in order to ensure quality and the journal's standing. Traditionally, researchers have tried to analyze patterns in scholarly publications using plain tables and statistical charts. In this paper we present an interactive visualization system that supports deeper analysis of the trend patterns hidden in the scholarly publications of a digital journal, and we apply it to the Journal of Universal Computer Science (J.UCS). The proposed visualization system is an easy-to-use web application based on animated 2D bubble and pie charts that handles geographical, temporal, and large categorical data. The paper gives a brief overview of the state-of-the-art visualization techniques available for understanding the knowledge structure of any given academic discipline, and discusses the design and technical aspects of the proposed visualization tool along with various interesting results drawn from it.
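A minimal static sketch of the bubble-chart encoding (year on one axis, category on the other, bubble size for counts) using matplotlib; the data are made up, and the actual system is an animated web application:

```python
# Sketch: 2D bubble chart of publication counts -- year on x, country
# on y, bubble area proportional to papers published. Toy data; the
# real system is animated and web-based.

import matplotlib.pyplot as plt

data = {  # (year, country) -> number of papers, illustrative only
    (2005, "Austria"): 12, (2006, "Austria"): 15,
    (2005, "Pakistan"): 4, (2006, "Pakistan"): 9,
}

countries = sorted({c for _, c in data})
xs = [year for year, _ in data]
ys = [countries.index(c) for _, c in data]
sizes = [40 * n for n in data.values()]  # scale counts to marker area

plt.scatter(xs, ys, s=sizes, alpha=0.5)
plt.yticks(range(len(countries)), countries)
plt.xticks(sorted({y for y, _ in data}))
plt.xlabel("Year")
plt.title("Papers per country per year (toy data)")
plt.show()
```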

2009 First International Conference on Networked Digital Technologies, 2009
In recent years, the number of citations a paper receives is seen more and more (maybe too much so) as an important indicator of the quality of a paper, the quality of researchers, the quality of journals, and so on. Various measures have been introduced based on the number of citations a scholar has received over his lifetime or over the last few years. The number of citations (often without counting self-citations or citations from "minor" sources, however this may be defined), or some measurement based on it (like the h- or the g-factor), is used to evaluate scholars; the citation index of a journal (again with a variety of parameters) is seen as measuring the impact of the journal, and hence the importance one assigns to publications there, etc. The number of measurements based on citation counts is steadily increasing, and their definition has become a science in itself. However, they all rest on finding all relevant citations. Thus, citation mining tools, as used for the ISI Web of Knowledge, the CiteSeer citation index, Google Scholar, or software such as "publishorperish.com" based on Google Scholar, are the critical starting points for all measurement efforts. In this paper we show that current citation mining techniques do not discover all relevant citations. We propose a technique that increases accuracy substantially and show numeric evaluations for one typical journal. It is clear that, in the absence of very reliable citation mining tools, all current measurements based on citation counting should be taken with a grain of salt.
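For reference, since the abstract leans on citation-count measures such as the h-factor: the h-index is the largest h such that an author has at least h papers with at least h citations each. The definition is standard; the citation counts below are toy data.

```python
# The h-index: the largest h such that the author has at least h
# papers with at least h citations each. Standard definition; toy data.

def h_index(citations):
    counts = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(counts, start=1):
        if c >= i:
            h = i        # the i-th best paper still has >= i citations
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))   # -> 4
print(h_index([25, 8, 5, 3, 3]))   # -> 3
```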
Information Supply of Related Papers from the Web for Scholarly e-Community
Current search engines require the explicit specification of queries to retrieve related material. Based on personalized information acquired over time, such retrieval systems aggregate or approximate the intent of users. In this case, an aggregated user profile is often constructed, with minimal application of context-specific information. This paper describes the design and realization of the idea of "Links into the Future" …

2009 First International Conference on Networked Digital Technologies, 2009
Linked Open Data (LOD) is becoming an essential part of the Semantic Web. Although LOD has amassed large quantities of structured data from diverse, openly available data sources, there is still a lack of user-friendly interfaces and mechanisms for exploring this huge resource. In this paper we highlight two critical issues related to the exploration of the semantic LOD pool by end users, and we introduce a proof-of-concept application that helps users search for information about a concept without having to know the mechanics of the Semantic Web or Linked Data. We expect that this kind of application may help bridge the gap between semantic search and end users. With this application, we concentrated on two aspects: 1) a novel Concept Aggregation Framework to present the most relevant information of LOD resources in an easy-to-understand way.
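A minimal sketch of the aggregation idea: group raw RDF predicate/value pairs under a few human-readable aspects. The aspect names and the predicate-to-aspect mapping are invented for illustration; the actual Concept Aggregation Framework is not specified in the abstract:

```python
# Sketch: aggregate raw RDF predicate/value pairs into a few
# human-readable "aspects". Aspect names and the predicate-to-aspect
# mapping are hypothetical, not the actual framework.

from collections import defaultdict

ASPECT_OF = {  # hypothetical mapping
    "http://dbpedia.org/ontology/birthDate": "Personal",
    "http://dbpedia.org/ontology/birthPlace": "Personal",
    "http://dbpedia.org/ontology/knownFor": "Career",
    "http://dbpedia.org/ontology/award": "Career",
}

def aggregate(triples):
    """Group (predicate, value) pairs under coarse aspects."""
    aspects = defaultdict(list)
    for predicate, value in triples:
        aspects[ASPECT_OF.get(predicate, "Other")].append(value)
    return dict(aspects)

triples = [
    ("http://dbpedia.org/ontology/birthPlace", "London"),
    ("http://dbpedia.org/ontology/knownFor", "World Wide Web"),
]
print(aggregate(triples))
# {'Personal': ['London'], 'Career': ['World Wide Web']}
```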
Turning keywords into URIs
The Semantic Web strives to add structure and meaning to the Web, thereby providing better results and easier interfaces for its users. One important foundation of the Semantic Web is Linked Data, the concept of interconnected data, describing resources by use of RDF …
Harvesting Pertinent Resources from Linked Open Data
Journal of Digital Information Management, 2010