
Extracting Information Science Concepts

This document discusses extracting information science concepts using natural language programming and JAPE regular expressions in GATE. It provides a brief overview of information extraction tools and compares them, noting that each has advantages and disadvantages. The paper then uses CREOLE plugins in GATE to extract concepts from the field of information science to help speed up the ontology building process in a semi-automatic manner.


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/228448906

Extracting Information Science concepts based on Jape Regular Expression

Article · January 2011


2 authors: Ahlam Sawsaa (University of Huddersfield) and Joan Lu (University of Huddersfield)

All content following this page was uploaded by Ahlam Sawsaa on 29 October 2014.


EXTRACTING INFORMATION SCIENCE CONCEPTS
BASED ON JAPE REGULAR EXPRESSION

Ahlam Sawsaa and Joan Lu

Department of Informatics, School of Computing & Engineering, Huddersfield HD1 3DH, United Kingdom
<[email protected]> <[email protected]>

Abstract

Recently, unstructured data on the World Wide Web has generated considerable interest in extracting text, e-mail, web pages, reports and research papers in their raw form. More interesting still, extracting information from a specific domain using distributed corpora from the World Wide Web is a vital step towards creating corpus annotation. This paper describes a method for annotating concepts of Information Science (IS) in order to build a domain ontology. It uses Natural Language Processing (NLP) technology to speed up ontology development, which is time consuming and demands time and effort that domain experts can rarely spare. Using NLP reduces the domain experts' workload, and the experts can then evaluate the results.

Keywords: ontology, regular expression, information extraction, Natural Language Processing.

1 Introduction

Recently, Information Extraction (IE) has attracted great interest because of the growing number of web pages on the Internet that contain unstructured data. This volume of information needs a tool that extracts it and makes it available for use at the right time. Many specialists in information extraction have worked to find suitable tools, such as wrappers, that classify interesting data and map them to an appropriate format such as XML or a relational database. Furthermore, some HTML-aware tools rely on the inherent structural features of documents to carry out the extraction. Natural language processing (NLP), on the other hand, is a technique used by many tools to extract data that exists in natural language documents. These tools, such as GATE, apply part-of-speech tagging, filtering and lexical semantic tagging to link relevant information and to identify the relationships among phrases and sentence elements within a text. Each of these tools has advantages and disadvantages, so a comparative analysis of existing data extraction tools is required to recognise their capabilities. To adopt the most appropriate tool, we compared the information extraction tools [1,8], as illustrated in Table 1.

In this paper we first give a brief overview of information extraction tools to justify the use of NLP techniques. To speed up the building of the ontology of Information Science (OIS) and to extract concepts in the field, CREOLE plug-ins in GATE have been deployed in this IE system.

Table 1: Information extraction tools

| Tools | Type | Degree of automation | Based on | Ease of use | Written language | Adv. & Dis. |
|---|---|---|---|---|---|---|
| SHOE | Knowledge annotation | Automatic | | + | Java | Allows users to mark up pages in SHOE guided by ontologies or URL |
| Annota | Annotation schema (W3C) | Automatic | RDF mark-up, XML, XHTML, CSS & XPointer | + | C; available for Windows, Unix & Mac | Doesn't support IE; it is linked to an ontology server. Makes annotations publicly available |
| Annozilla | Email annotation | Automatic | Mozilla | ++ | | - |
| MnM | Ontology editor | Semi-automatic & automatic | HTML | + | | Close to Melita |
| Ontomat | | Automatic | OWL | ++ | | Used to create & maintain ontologies. Uses OntoBroker as server |
| COHSE | Integration of text processing components | Automatic | DAML+OIL | + | RDF | Uses an ontology server to mark up pages in DAML+OIL & reuse them as RDF |
| Melita | Annotation interface | Semi-automatic | Extensible mark-up language, Java, HTML | ++ | | Retrieves structured & semi-structured annotations |
| KIM (Ontotext) | Semantic annotation platform | Automatic | RDF | ++ | | Semantic annotation, indexing, and retrieval of unstructured and semi-structured content |
| GATE | Annotation tool | Semi-automatic & automatic | XML, HTML, XHTML, e-mails | +++ | | Comprises an architecture and a framework. Based in an NLP group |

2 Background

The annotation of IS concepts is based on GATE Developer, a tool built on an architecture for text engineering. It is free, open-source software developed by a team at Sheffield University, where work started in the early 1990s. The first version appeared in 1995, the second in 2002, and the newest version in the 2010s. GATE runs on any platform and supports Java 5.0. It has also been developed and tested on Linux, Windows and Mac OS X. It has a user interface that enables editing, visualisation and quick application development. Furthermore, it supports manual, semi-automatic and semantic annotation as well as ontology management. Moreover, GATE uses CREOLE plug-ins as objects for language engineering. All of these are packaged as Java archives with XML configuration data [4].

GATE is an Information Extraction (IE) system: it takes unseen texts as input and produces output in a fixed format such as XML or HTML. These data can be displayed to users or stored in a database for analysis. Before discussing GATE in more detail, we should clarify the difference between Information Retrieval (IR) and Information Extraction (IE) [3]. IE extracts information from huge amounts of text for the purpose of fact analysis, whereas IR just pulls the documents that contain relevant information according to the keyword search. IE handles the query in a structured way and provides knowledge at a deep level, while IR uses an ordinary query engine, from which it is hard to obtain an accurate answer, and provides knowledge only at a typical level.

For instance, you may ask when something happened, as in "Which airports are currently closed due to severe weather conditions in the UK?", or ask where and who did something, as in "Where did Gordon Brown last visit before he left?" [7].

IR gives just the web page containing the relevant information, and you need to search it using terms or concepts to meet your needs and to analyse this information. IE provides specific information about your enquiry; even if the information is not accurate, you can go back to the text. IE is used for many applications, such as text mining, semantic annotation, question answering, opinion mining, decision support, and rich information retrieval and exploration.

GATE has many features for automatic and semi-automatic semantic annotation, and also for manual annotation, which helps you to create your own annotations. For this purpose GATE Developer is used as the tool to extract terms and concepts from a specific text effectively and efficiently. For this work we annotate text belonging to members of Ontocop, a virtual community of practice of Information Science. This helps to speed up the ontology process of building a conceptual model as part of the life cycle of the IS ontology.

Additionally, GATE is modular and has a comprehensive set of plug-ins, such as: Alignment, ANNIE, Annotation_Merging, Copy_Annots_Between_Docs, Gazetteer_LKB, Gazetteer_Ontology_Based, Information_Retrieval, Keyphrase_Extraction_Algorithm, Language_Identification, Ontology_Tools, WordNet.

GATE is based on ANNIE, a new IE system that provides the core processing resources. ANNIE relies on finite state algorithms and the JAPE grammar. The application combines a Tokeniser, Sentence Splitter, POS tagger, Gazetteer, named entity tagger (JAPE transducer), OrthoMatcher (co-reference), and NP and VP chunker. Among these modules we used the Tokeniser, Sentence Splitter, Gazetteer and JAPE transducer [5].

3 Methods

The method is based on creating a document corpus and a Gazetteer of Information Science, and on JAPE rules that extract IS concepts. GATE provides facilities for loading corpora for annotation from a URL or by uploading a file. The process starts by uploading the corpus to the application framework, together with a JAPE grammar and a Gazetteer that enable the concepts in the corpus to be annotated. Figure 1 illustrates the process of corpus annotation.

Documents of Information Science -> Analysis process -> Corpus (PDF documents converted to XML) -> Upload to GATE framework -> Running ANNIE -> Annotate concepts & evaluation

Figure 1: Annotation workflow
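The workflow above can be sketched outside GATE as two small stages applied in sequence: a tokeniser that records character offsets, and a gazetteer lookup that marks known terms. This is only an illustrative Python sketch and not GATE's implementation: the `Annotation` class, the function names and the longest-match strategy are assumptions made for the example, and the three gazetteer entries are the sample terms the paper lists for the IS Gazetteer.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    type: str                     # "Token" or "Lookup" (names borrowed from GATE for readability)
    start: int                    # character offset where the span begins
    end: int                      # character offset just past the span
    features: dict = field(default_factory=dict)


def tokenise(text):
    """Stage 1: one Token annotation per run of word characters."""
    return [Annotation("Token", m.start(), m.end(), {"string": m.group()})
            for m in re.finditer(r"\w+", text)]


def gazetteer_lookup(tokens, gazetteer):
    """Stage 2: longest-match lookup of gazetteer terms over the token sequence."""
    max_len = max(len(term.split()) for term in gazetteer)
    lookups, i = [], 0
    while i < len(tokens):
        # Try the longest candidate first so multi-word terms win over shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = " ".join(t.features["string"].lower() for t in tokens[i:i + n])
            if key in gazetteer:
                lookups.append(Annotation("Lookup", tokens[i].start,
                                          tokens[i + n - 1].end, dict(gazetteer[key])))
                i += n
                break
        else:
            i += 1                # no term starts at this token
    return lookups


# Hypothetical in-memory gazetteer (a real GATE gazetteer is a set of list files).
GAZETTEER = {
    "acquisition policy": {"majorType": "concept"},
    "computer aided design": {"minorType": "term"},
    "data analysis": {"majorType": "concept"},
}

text = "Data analysis informs the acquisition policy of a library."
tokens = tokenise(text)
lookups = gazetteer_lookup(tokens, GAZETTEER)
for a in lookups:
    print(text[a.start:a.end], a.start, a.end, a.features)
```

Each stage only adds annotations and never changes the text, which is what lets a later rule-based stage combine Token and Lookup annotations by their offsets.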
Corpus: The collected corpus contains 300 documents, all relevant to the field of Information Science.

Gazetteer: The IS list is included in the Gazetteer, which contains the terms. Each term carries features by which it is identified, such as majorType and minorType, e.g.

    Acquisition policy: majorType = concept
    Computer aided design: minorType = term
    Data analysis: majorType = concept

Figure 2: Screenshot of the IS Gazetteer

JAPE rule: JAPE rules extract concepts by identifying Tokens that contain the concepts in the correct order and by looking the concepts up in the Gazetteer list.

A JAPE (Java Annotation Patterns Engine) grammar is organised into phases of rules. Each JAPE rule consists of a left-hand side (LHS), which contains the patterns to match, and a right-hand side (RHS), which details the annotations to be created [4].

We used the JAPE grammar because it supports regular expression matching and is GATE's own way of annotating. Annotations can also be made with other CREOLE plug-ins such as the Gazetteer, which requires creating one's own list of the concepts to be annotated.

4 NLP technique for extracting IS concepts

We present an automatic extraction method based on ANNIE: we created a JAPE grammar that extracts concepts from XML and HTML text, and a corpus of 300 documents in XML format.

Our JAPE rules for extracting concepts are shown below. The first entity detected is "information service" {Type=Token, start=867, end= 837, id= 4210, majorType=concept}, labelled as information service.concept.

    Phase: one
    Input: Lookup Token
    Options: control = appelt

    Rule: concept1
    (
      {Token.string == "information"}
      {Token.string == "service"}
      ({Lookup.minorType == region}):regionName
    ):service
    -->
    :regionName.Location = {},
    :service.concept = {}

    Phase: Two
    Input: Lookup Token
    Options: control = all

    Rule: concept2
    Priority: 20
    (
      {Token.string == "information"}
      {Token.string == "service"}
      {Lookup.majorType == "concept"}
    ):information
    -->
    :information.Concept = {Rule = "concept2"}

For more precise matching we apply regular expressions to strings of text, e.g.

    Phase: Concept
    Input: Lookup Token
    Options: control = appelt

    Rule: Glossary
    (
      {Token.string ==~ "catalog?e"}
    ):concept
    -->
    :concept.concept = {Rule = "Glossary"}

In this rule the constraint on Token.string specifies the string of text to be matched. The operator "==~" requires the regular expression to match the whole string, whereas "==" tests plain string equality. The following examples use regular expression metacharacters (dot, *, [ ], |) to annotate groups of related concepts, starting with those related to "abstract":

1. {Token.string ==~ "abstract(ing)"}
   This is intended to cover abstract, abstracting and abstractor.

2. {Token.string ==~ "acquisition. number"}
   This is for annotating the acquisition concept followed by another word; it could annotate, for example, acquisition policy and acquisition service.

3. {Token.string ==~ "archival * "}
   It will annotate archival library, archival journal, archival processing, archival software, and archival studies.

All these rules are stored in the INFCO.jape file.

5 Experiment & Evaluation

Extracting IS concepts with the JAPE grammar and regular expressions in GATE Developer provides significant output for automated information extraction. The main idea of using JAPE and regular expressions is to identify IS terminology as tokens, for example Computing, Libraries and Information Technology, in a large text where the terms are found. Term identification relies on lookup in the IS Gazetteer list, which may match, for instance, book art, book card, book guidance or book catalogue, as well as concepts such as computer application, computer science, computer experts, computer file or computer image.

The corpus we used to extract Information Science concepts contains 300 documents, all of which were analysed. The ANNIE application was run with its resources organised in the following order: Document Reset, Tokeniser, Sentence Splitter, Gazetteer, POS tagger, JAPE transducer and OrthoMatcher. The annotation sets appear in the display pane and the concepts are highlighted in the default annotation set, as shown in Figure 3.
Figure 3: Annotated concepts in GATE
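The regular-expression matching used by the rules of Section 4 depends on whether a pattern must cover the whole token or may match only part of it. In JAPE these two behaviours correspond to the operators `==~` (whole string) and `=~` (partial match). The sketch below illustrates the difference with Python's `re` module standing in for the Java regular expressions that JAPE uses. The two patterns are hypothetical ones written for this example so that they cover the variants named in the text; they are not the patterns from the INFCO.jape file.

```python
import re


def whole_match(pattern, token):
    """True only when the pattern covers the entire token (the behaviour of JAPE's ==~)."""
    return re.fullmatch(pattern, token) is not None


def partial_match(pattern, token):
    """True when the pattern occurs anywhere inside the token (the behaviour of JAPE's =~)."""
    return re.search(pattern, token) is not None


# Hypothetical patterns, chosen for illustration only.
ABSTRACT = r"abstract(ing|or)?"   # abstract, abstracting, abstractor
CATALOGUE = r"catalog(ue)?"       # catalog, catalogue

tokens = ["abstract", "abstracting", "abstractor", "abstraction",
          "catalogue", "catalog", "cataloguing"]

abstract_hits = [t for t in tokens if whole_match(ABSTRACT, t)]
catalogue_hits = [t for t in tokens if whole_match(CATALOGUE, t)]

print(abstract_hits)    # ['abstract', 'abstracting', 'abstractor']
print(catalogue_hits)   # ['catalogue', 'catalog']
print(partial_match(ABSTRACT, "abstraction"))   # True: the prefix "abstract" is enough
```

Whole-string matching keeps a rule from firing on longer words that merely contain the term, which matters when the annotated concepts are later used as ontology classes.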

The results show that our approach annotated the concepts (see Figure 4). One annotation runs from offset 896 to offset 905, while the concept "computer science" is annotated from offset 2008 to offset 2024 with the feature {majorType=concept}. Each annotation starts at a specific offset and ends at another, depending on how many tokens it contains, and every occurrence is listed separately.
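The start and end values are character offsets into the document, so the length of an annotated term is simply the difference between them: 2024 - 2008 = 16, which is the number of characters in "computer science". A minimal Python sketch of how such offsets arise is given below; the sample sentence and the helper `span_of` are invented for illustration and assume that the end offset points just past the last character.

```python
def span_of(text, term):
    """Return (start, end) character offsets of the first occurrence of term.

    The end offset is exclusive, so end - start equals len(term).
    """
    start = text.find(term)
    if start < 0:
        raise ValueError(f"{term!r} does not occur in the text")
    return start, start + len(term)


sentence = "Courses in computer science cover information retrieval."
start, end = span_of(sentence, "computer science")
print(start, end, sentence[start:end])   # 11 27 computer science

# The offsets reported in the paper are consistent with the same convention.
print(2024 - 2008 == len("computer science"))   # True
```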

Figure 4: Result of the annotation IS domain
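Agreement between automatic annotations and the experts' manual annotations is commonly summarised with precision, recall and F1, computed from the numbers of correct, partially correct, missing and spurious (false positive) annotations. The sketch below uses a strict scoring convention in which partially correct annotations earn no credit. It is an assumption about how such figures can be computed, not the authors' evaluation procedure, and it reads the 402 reported in this section as the number of correctly matched concepts.

```python
def strict_scores(correct, partial, missing, spurious):
    """Strict precision, recall and F1: partially correct annotations count as errors."""
    response = correct + partial + spurious   # annotations produced by the system
    key = correct + partial + missing         # annotations produced by the experts
    precision = correct / response if response else 0.0
    recall = correct / key if key else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Counts as read from this section (assumption: 402 correct, nothing else).
print(strict_scores(correct=402, partial=0, missing=0, spurious=0))   # (1.0, 1.0, 1.0)

# A hypothetical less perfect run, for comparison.
p, r, f = strict_scores(correct=8, partial=1, missing=1, spurious=1)
print(round(p, 2), round(r, 2), round(f, 2))   # 0.8 0.8 0.8
```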

We conducted this experiment to achieve accuracy rates equal to the manual output of IS experts for the annotated concepts. Statistics of the corpus show the pattern matching of IS concepts based on the lookup IS list (402). Correct concepts and accuracy were generally high, whereas there were no partially correct annotations (0) and no missing annotations or false positives (0).

Figure 5: The result accuracy

However, we use GATE because it is open source and contains multi-language NLP models that can be reused for developing other resources.

6 Conclusion

6.1 Achievement

This paper described a method that uses NLP techniques to extract concepts in order to speed up the development of the IS ontology. Furthermore, the IE system saved the effort of domain experts by labelling the most common concepts. In total we extracted 664 concepts, which are the classes of the Information Science ontology, and 650 subclasses, which are the main components of the ontology skeleton. The IE technique can be applied to many different formats, such as XML and HTML documents, and even to URLs or e-mails.

6.2 Future work

Ontology is at the heart of the semantic web. It defines concepts and their relations to make global interoperability possible. In future work we plan to enhance these concepts to develop the IS ontology and to create a taxonomy of IS as a domain. The next step is to code it using Protégé as the ontology editor. Additionally, such a generic model of the IS ontology will be evaluated.

Acknowledgment

The authors wish to thank the Libyan government for its support, and everyone who provided feedback on this work.

References

[1] ALBERTO, H. F., BERTHIER, A. L.-. & RIBEIRO-NETO (2002) A brief survey of web data extraction tools. SIGMOD Record. http://annotation.semanticweb.org/tools/.

[2] CHANG, C.-H., KAYED, M., GIRGIS, M. R. & SHAALA, K. (2000) A Survey of Web Information Extraction Systems. IEEE Transactions on Knowledge and Data Engineering, 13.

[3] CRESCENZI, V. & MECCA, G. (2004) Automatic Information Extraction from Large Websites. Journal of the ACM, 51, pp. 731–779.

[4] GATE (2010) Developing Language Processing Components with GATE Version 6 (a User Guide). http://gate.ac.uk/sale/tao/splitch13.html#x18-32300013.2.

[5] HANDSCHUH, S. & STAAB, S. (2002) Authoring and Annotation of Web Pages in CREAM. Honolulu, Hawaii, USA.

[6] MOENS, M.-F. (2006) Information Extraction: algorithms and prospects in a retrieval context, Springer.

[7] SRIHARI, R. & LI, W. (2002) Information Extraction Supported Question Answering. In Proceedings of the Eighth Text Retrieval Conference (TREC-8).

[8] TURMO, J., AGENO, A. & CATALÀ, N. (2006) Adaptive Information Extraction. ACM Computing Surveys, 38.
