The concept of Webpage visibility is usually linked to search engine optimization (SEO), and it is based on a global in-link metric [1]. SEO is the process of designing Webpages to optimize their potential to rank high on search engines,... more
In this paper we present a spaceship game which allows us to evaluate human behavior with respect to maintenance and the repair of malfunctions. We ran an experiment in which subjects played the spaceship game twice. In one of the games, they... more
Purpose-This exploratory research examined law enforcement officers' attitudes toward the public-private partnerships (PPPs) in policing cyberspace. Particularly, by investigating the predictors of police officers' support for the PPPs,... more
Society's growing dependence on computers and information technologies has been matched by an escalation of the frequency and sophistication of cyber attacks committed by criminals operating from the Darknet. As a result, security... more
with the exponential increase in data storage. Ranking models are used in search engines to locate relevant pages and rank them in decreasing order of relevance. They are an integral component of a search engine. The offline gathering... more
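A minimal sketch of what "rank them in decreasing order of relevance" means in practice, assuming a toy term-frequency scorer and a hypothetical three-document corpus (neither is from the paper):

```python
# Toy ranking model: score documents against a query by term-frequency
# overlap and return them in decreasing order of relevance. The corpus
# and scorer are illustrative assumptions, not any engine's ranker.
from collections import Counter

def score(query, doc):
    """Count how many query-term occurrences appear in the document."""
    terms = Counter(doc.lower().split())
    return sum(terms[t] for t in query.lower().split())

def rank(query, docs):
    """Return documents sorted by descending relevance score."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)

docs = [
    "web crawling and indexing",
    "search engine ranking models rank pages by relevance",
    "cooking recipes",
]
print(rank("ranking relevance", docs))
```

A production ranker would combine many such signals (link structure, freshness, click data) rather than raw term frequency alone.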
Imagine that all the information in the entire world written in every known language, and every graphic image, video clip, or photograph copied digitally was available at your fingertips. This vast amount of data could then be reduced to... more
In this paper we present and compare two methodologies for rapidly inducing multiple subject-specific taxonomies from crawled data. The first method involves a sentence-level word co-occurrence frequency method for building the taxonomy,... more
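The sentence-level word co-occurrence step can be sketched roughly as follows; the three-sentence corpus and the "more frequent term becomes the parent" heuristic are illustrative assumptions, not the paper's exact method:

```python
# Count how often term pairs co-occur in the same sentence, then attach
# each term under the more frequent term it co-occurs with most. Corpus
# and parent heuristic are illustrative assumptions.
from collections import Counter
from itertools import combinations

sentences = [
    "the crawler fetches pages",
    "the crawler parses pages",
    "pages contain links",
]

freq = Counter(w for s in sentences for w in set(s.split()))
cooc = Counter()
for s in sentences:
    for a, b in combinations(sorted(set(s.split())), 2):
        cooc[(a, b)] += 1

def parent(term):
    """Most frequent co-occurring term that is itself more frequent."""
    cands = []
    for (a, b), c in cooc.items():
        other = b if a == term else a if b == term else None
        if other and freq[other] > freq[term]:
            cands.append((c, freq[other], other))
    return max(cands)[2] if cands else None

print(parent("crawler"))
```

Terms with no more-frequent co-occurring partner become taxonomy roots in this toy reading.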
Computer ethicists have long been intrigued by the possibility that computers, computer programs, and robots might develop to a point at which they could be considered moral agents. In such a future, computers might be considered... more
With the increasing number of individuals accessing online child sexual exploitation material (CSEM), there is an urgent need for primary prevention strategies to supplement the traditional focus on arrest and prosecution. We examined... more
A Web crawler is an important component of the Web search engine. It demands a large amount of hardware resources (CPU and memory) to crawl data from the rapidly growing and changing Web, so the crawling process should be a continuous... more
The Web poses itself as the largest data repository ever available in the history of humankind. Major efforts have been made in order to provide efficient access to relevant information within this huge repository of data. Although... more
Most search engines use only the search keywords for searching. Due to the ambiguity of the semantics and usages of the search keywords, the results are noisy and many of them do not match the user's search goals. In general the search... more
Agents that interact with humans are known to benefit from integrating behavioral science and exploiting the fact that humans are irrational. Therefore, when designing agents for interacting with automated agents, it is crucial to know... more
The project of the Ontology Web Search Engine is presented in this paper. The main purpose of this paper is to present a project that can be easily implemented. The Ontology Web Search Engine is software to look for and index ontologies... more
The increasing importance of search engines to commercial web sites has given rise to a phenomenon we call "web spam", that is, web pages that exist only to mislead search engines into (mis)leading users to certain web sites. Web spam is... more
Making use of search engines is the most popular Internet task apart from email. Currently, all major search engines employ web crawlers because effective web crawling is a key to the success of modern search engines. Web crawlers can give... more
This is a Diamond Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License, which permits unrestricted non-commercial use,... more
Using web crawler technology to support design-related web information collection in idea generation
Effective information gathering in problem and task related fields with which designers or design teams may not be familiar is a key part of the design process. Designers usually consult with subject experts to access expert information.... more
This article looks at the controversial decision taken by Yahoo CEO Marissa Mayer to ban telework in early 2013. It analyses the pros and cons of teleworking and searches for the underlying assumption that may have led to this ruling. A... more
Many Web IR and Digital Library applications require a crawling process to collect pages with the ultimate goal of taking advantage of useful information available on Web sites. For some of these applications the criteria to determine... more
Finding useful information from the Web, which has a huge and widely distributed structure, requires efficient search techniques. The distributed and varying nature of Web resources is always a major issue for search engines in maintaining the latest... more
Extracting information from the web is becoming increasingly important and popular. To find Web pages one typically uses search engines that are based on the web crawling framework. A web crawler is a software module that fetches data from... more
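The fetch-parse-enqueue cycle such a software module runs can be sketched as below; the in-memory link graph stands in for real HTTP fetching (an assumption) so the sketch is self-contained:

```python
# Minimal crawler skeleton: a frontier of URLs, a fetch step, link
# extraction, and a visited set. A real crawler would replace the
# PAGES lookup with an HTTP client and an HTML link parser.
from collections import deque

PAGES = {  # hypothetical site: url -> outgoing links
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": [],
    "/c": ["/"],
}

def crawl(seed):
    frontier, visited, order = deque([seed]), set(), []
    while frontier:
        url = frontier.popleft()
        if url in visited:
            continue
        visited.add(url)
        order.append(url)
        for link in PAGES.get(url, []):  # "fetch" + link extraction
            if link not in visited:
                frontier.append(link)
    return order

print(crawl("/"))
```

Using a deque gives breadth-first order; swapping it for a priority queue turns this into the focused/topical crawlers discussed elsewhere in these abstracts.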
The anonymous marketplaces ecosystem represents a new channel for black-market goods and services, offering a huge variety of illegal items. For many darknet marketplaces, the overall sales incidence is not (yet) comparable with the... more
Darknet markets have been studied to varying degrees of success for several years (since the original Silk Road was launched in 2011), but many obstacles are involved which prevent a complete and systematic survey. The Australian National... more
The World Wide Web is the largest collection of data today, and it continues increasing day by day. A web crawler is a program for the bulk downloading of web pages from the World Wide Web, and this process is called Web crawling. To collect... more
Topical crawlers are becoming important tools to support applications such as specialized Web portals, online searching, and competitive intelligence. As the Web mining field matures, the disparate crawling strategies proposed in the... more
The analysis of open sources requires tools capable of crawling websites in order to better categorize them and to facilitate their analysis, notably in cartographic form. Based on the analysis of the... more
Web log mining provides tremendous information about user traffic and search engine behavior at web sites. The behavior of search engines could be used in analyzing server load, quality of search engines, dynamics of search engine... more
With the continuous growth and rapid advancement of web based services, the traffic generated by web servers have drastically increased. Analyzing such data, which is normally known as click stream data, could reveal a lot of information... more
We introduce a new, fully automated online media monitoring system, MNSight. We explain how the system works, for whom it is intended, and its architecture and scalability. We show that it is an easily accessible media monitoring system for the... more
A large amount of data on the WWW remains inaccessible to crawlers of Web search engines because it can only be exposed on demand as users fill out and submit forms. The Hidden web refers to the collection of Web data which can be... more
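Filling out and submitting a form on demand can be sketched as follows; the search form and fill vocabulary are hypothetical, and real Hidden-web crawlers parse forms out of HTML and choose fill values from a learned domain vocabulary:

```python
# Model a search form, fill its fields, and build the requests a
# Hidden-web crawler would submit. Form fields and values are
# illustrative assumptions.
from urllib.parse import urlencode

form = {"action": "/search", "method": "get",
        "fields": {"category": ["books", "music"], "query": None}}

def submissions(form, query_terms):
    """Enumerate one GET request per (category, query) combination."""
    reqs = []
    for cat in form["fields"]["category"]:
        for q in query_terms:
            reqs.append(form["action"] + "?" +
                        urlencode({"category": cat, "query": q}))
    return reqs

print(submissions(form, ["crawler"]))
```

Each generated URL exposes one slice of the database that no static hyperlink points to, which is exactly the content ordinary crawlers miss.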
Finding meaningful information among the billions of information resources on the web is a tedious task as the popularity of the Internet grows rapidly. The future of the web is a structured semantic web in place of unstructured information... more
The main purpose of this paper is to present an algorithm of OWL (Web Ontology Language) ontology transformation to concept map for subsequent generation of rules and also to evaluate the efficiency of this algorithm. These generated... more
The number of people with disabilities is continuously increasing. Providing patients who have disabilities with the rehabilitation and care necessary to allow them good quality of life creates overwhelming demands for health and... more
This paper presents a potential seed selection algorithm for web crawlers using a gain-share scoring approach. Initially we consider a set of arbitrarily chosen tourism queries. Each query is given to the selected N commercial Search... more
Many national and international heritage institutes realize the importance of archiving the web for future culture heritage. Web archiving is currently performed either by harvesting a national domain, or by crawling a pre-defined list... more
Detection of malicious and non-malicious website visitors using unsupervised neural network learning
Distributed denials of service (DDoS) attacks are recognized as one of the most damaging attacks on the Internet security today. Recently, malicious web crawlers have been used to execute automated DDoS attacks on web sites across the... more
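Separating crawler-like sessions from human ones without labels can be sketched with a tiny two-cluster k-means over session features; the features, data, and k-means stand-in are illustrative assumptions, not the paper's unsupervised neural network:

```python
# Represent each visitor session as a feature vector and split the
# sessions into two unlabeled clusters. Features (request rate,
# error fraction) and the toy k-means are illustrative assumptions.
def kmeans2(points, iters=10):
    """Two-cluster k-means on 2-D points, seeded with the extremes."""
    c = [min(points), max(points)]
    for _ in range(iters):
        groups = [[], []]
        for p in points:
            d = [sum((a - b) ** 2 for a, b in zip(p, ci)) for ci in c]
            groups[d.index(min(d))].append(p)
        c = [tuple(sum(x) / len(g) for x in zip(*g)) for g in groups if g]
    return c

# (requests/min, error fraction): humans slow/low-error, crawlers fast
sessions = [(2, 0.0), (3, 0.1), (50, 0.4), (60, 0.5)]
print(sorted(kmeans2(sessions)))
```

The cluster with the high-rate centroid is then flagged as machine traffic; a self-organizing map or other unsupervised network plays the same role at scale.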
We address the problem of identifying the domain of online databases. More precisely, given a set F of Web forms automatically gathered by a focused crawler and an online database domain D, our goal is to select from F only the forms that... more
We present a system for taxonomy construction that reached the first place in all subtasks of the SemEval 2016 challenge on Taxonomy Extraction Evaluation. Our simple yet effective approach harvests hypernyms with substring inclusion and... more
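The substring-inclusion heuristic can be sketched as follows: a term contained as the head of a longer term is proposed as its hypernym ("juice" for "apple juice"). The term list is illustrative, and the full system combines this signal with others:

```python
# Harvest hypernym candidates by substring (head-word) inclusion:
# if one term ends with " <other term>", the shorter term is proposed
# as the hypernym. Terms are illustrative assumptions.
def hypernym_candidates(terms):
    pairs = []
    for hypo in terms:
        for hyper in terms:
            if hyper != hypo and hypo.endswith(" " + hyper):
                pairs.append((hypo, hyper))
    return pairs

terms = ["juice", "apple juice", "orange juice", "apple"]
print(hypernym_candidates(terms))
```

Note that "apple" is not linked to "apple juice": only the head position counts, which keeps modifiers from being mistaken for hypernyms.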