Cosmos: Cluster of Systems of Metadata For Official Statistics
Prepared by Prof. H. Papageorgiou and M. Vardaki, University of Athens, Department of Mathematics, Athens, Greece
Contents
Preface
Chapter 1 - Projects presentation: general information
  1.1 FASTER
    1.1.1 Main objectives
    1.1.2 Development of a metadata specification in XML
    1.1.3 System architecture
    1.1.4 The Data Web technology
    1.1.5 User environment
    1.1.6 Access control
  1.2 IQML
    1.2.1 Main objectives
    1.2.2 Metadata specification
    1.2.3 Architecture
  1.3 IPIS
    1.3.1 Main objectives
    1.3.2 Architecture
  1.4 METAWARE
    1.4.1 Main objectives
    1.4.2 Architecture
    1.4.3 Metadata support
  1.5 MISSION
    1.5.1 Main objective
    1.5.2 Architecture
Chapter 2 - Projects comparison
  2.1 Objectives
  2.2 Comparative analysis of COSMOS cluster projects
    2.2.1 Data capture
    2.2.2 Data dissemination
    2.2.3 Metadata repository
    2.2.4 Metadata categories and modelling
  2.3 Metadata model comparisons
  2.4 Architecture comparisons
References
Annex 1 - Template
Preface
This is the COSMOS deliverable for the topic "COSMOS Projects profile", prepared by the UoA/Dept of Mathematics. This document provides a comparative analysis of the five projects that participate in COSMOS (FASTER, IQML, IPIS, METAWARE, MISSION). It intends to illustrate similarities and differences in the following areas:
- Objectives
- Areas of application
- User requirements
- User services provided
- System architecture
- Other significant issues
To obtain the required information, the following steps were taken. An initial table covering all five projects' similarities and differences in certain domains was prepared by the UoA/Dept of Mathematics and presented and discussed in detail at the COSMOS kick-off meeting in Essex; a number of alterations and additional differences and relationships were suggested by the partners. To achieve the best understanding of each project's specificities, a template was prepared by H. Papageorgiou and M. Vardaki, finalized with the help of Hilary Beedham, and then sent to all partners for completion. This template is provided in Annex 1. The responses collected are presented in two tables in Chapter 2. In some cases questions were not completed by the corresponding partners, and N.A. (not available) is indicated in the corresponding cell of the table; there are also cases where the answers provided were not clear enough, and this is also noted. In order to give an overview of each project and to be able to proceed to comparisons, we examined a number of documents and websites for each project, which are given as references. The first chapter of this deliverable is entirely extracted from the relevant documentation, and the comparisons in Chapter 2 concerning the metadata model and the possible relationships among the COSMOS cluster of projects were also obtained from the study of these documents.
Therefore, this deliverable not only attempts a comparison of the projects, but also presents their interrelations and some possible areas of project interaction within the COSMOS cluster framework.
Chapter 1 - Projects presentation: general information

1.1 FASTER
1.1.1 Main Objectives
FASTER is a dissemination project that aims to develop a flexible and intelligent platform for accessing various types of statistical data and electronic resources on the Internet. It focuses on extending the technology pioneered in the NESSTAR project by adding functionality for access control, statistical disclosure control and the management of hierarchical and aggregate data files, and by improving the usability of the software. The FASTER project revolves around the metadata repository and will use it to offer a rich, user-configurable environment that at the same time enforces access control rules on the underlying data. This metadata repository will be XML-based and standards-compliant to facilitate information interchange with other systems. FASTER updates approaches followed in NESSTAR in accordance with technological advances in the area of information interchange. Publicly available results indicate that, although the same general architecture will be followed, emphasis will now be given to standards compliance (the metadata repository will evolve in an RDF/XML direction while keeping the base DDI orientation), to an enhanced metadata role (metadata will be responsible not only for data conformance details, but will also refer to user requirements as users browse, i.e. personalization), and to metadata applicability to a wider array of data sources (time-variant, multidimensional, etc.). Finally, access control will be metadata-supported at all levels of the final system. Architecturally, greater emphasis is placed on XML support and access control issues, while some of the NESSTAR
approaches seem to have been deprecated (such as CORBA messaging and Cheshire). Relevant approaches that have been examined in this phase include the Cheshire project (a Z39.50 approach), the CBS Cristal model, XML approaches, RDF and the Data Documentation Initiative (DDI) standard. It has to be noted that NESSTAR is built upon a DDI-based metadata schema. It follows that FASTER can draw upon the experience accumulated in NESSTAR in using the DDI DTD and its relevant shortcomings. It should be noted that NESSTAR has produced a working system for both the front-end (a Java-based client for the user's browser) and the back-end (a collection of software tools that enable publishers to incorporate their data into a common repository). Some more information on how the project can meet its goals follows.
The focus will be on the definition of a flexible architecture in which new resources, ranging from electronic journals to multimedia objects, can be included. The main building blocks are:
- A Data Browser (Client) that will provide a user-friendly graphical user interface to the system, its role being similar to that played by the web browser in the WWW.
- An abstract protocol for statistical data access that will be mapped to one or more existing or emerging communication standards (e.g. XML over HTTP, CORBA or DCOM).
- A flexible Server that will be able to host a set of different services and gateways to useful external resources and services.
The Data Web will be seamlessly integrated with the WWW. It will be possible to create links from the WWW to Data Web resources and operations and, inversely, from statistical metadata to WWW sites. This integration will allow the creation of a new kind of data-rich, WWW-accessible document that blends text, images and live data. The interface represents a major step forward, allowing producers to build advanced systems that take data right through from collection to dissemination, and giving users seamless access to a wide variety of data resources. The key element is the interface between these resources. Full advantage will be taken of existing tools and systems in the design of this virtual data environment, and the design will be firmly based on the partners' practical and wide-ranging experience in disseminating and publishing data.
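As a purely illustrative aside, the abstract protocol idea above (statistical data access mapped to, e.g., XML over HTTP) can be sketched as follows. The element names and the endpoint URL are invented for this illustration and are not part of the FASTER specification.

```python
# Minimal sketch of an "XML over HTTP" statistical data request.
# All element names and the endpoint URL are hypothetical, not part
# of the FASTER specification.
import urllib.request
from xml.etree import ElementTree as ET

def build_request(dataset_id: str, variables: list[str]) -> bytes:
    """Encode a data-access request as a small XML document."""
    root = ET.Element("dataRequest")
    ET.SubElement(root, "dataset").text = dataset_id
    for name in variables:
        ET.SubElement(root, "variable").text = name
    return ET.tostring(root, encoding="utf-8")

def send_request(endpoint: str, payload: bytes) -> str:
    """POST the XML payload over plain HTTP and return the raw reply."""
    req = urllib.request.Request(
        endpoint, data=payload,
        headers={"Content-Type": "application/xml"})
    with urllib.request.urlopen(req) as reply:
        return reply.read().decode("utf-8")

# Example usage (hypothetical endpoint):
# xml = build_request("survey-1999-hbs", ["age", "income"])
# print(send_request("http://example.org/dataweb", xml))
```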
1.2 IQML
The main goal of IQML, according to the answer gathered from the completed template, is to improve the accuracy and timeliness of statistical data collection from enterprises and individuals whilst at the same time reducing the burden of statistical reporting on enterprises.
1.2.1 Main objectives
- To examine the realities of metadata interchange and object standards in order to facilitate an active contribution to the metadata interchange standards, by implementing in software chosen aspects of the international standards for metadata interchange (e.g. CWM from the Object Management Group, and Registry and Repository from ebXML), and by carrying out trials in the area of intelligent questionnaires.
- To exploit the emerging technologies to facilitate the automation, user-friendliness and application integration of the raw data collection demands of collection agencies.
- To assist raw data collection agencies to build collection instruments in a variety of forms (e.g. CATI, CAPI) using a common metadata model, which will facilitate the development of, and access to, a common metadata repository.
- To ensure that the metadata interchange and database access standards being elaborated at the international level by software vendors (e.g. the Open Information Model (OIM) from the Meta Data Coalition, the Common Warehouse Metamodel (CWM) from the Object Management Group) take into account the needs of the intelligent questionnaire, by participating in the standards process and by developing products, and re-engineering existing products, that use these standards in live data collection scenarios.
1.2.3 Architecture
The overall concept of the project's architecture is illustrated in the completed template for IQML by Chris Nelson.
The system consists of five modules:
- The Metadata Maintenance and Repository will support the definition of metadata objects, from fine-grained objects such as codes to coarser-grained objects such as tables, that can be used in a questionnaire. APIs will be developed to store and access these metadata objects. The product will allow questionnaire design systems and other software to access the metadata without needing to know the underlying structure or source of the metadata, by implementing object interfaces that follow international standards.
- The Questionnaire Designer package will enable the user to design and manage questionnaires which can be deployed using the other software modules of the suite. The tool will allow the user to define questionnaires at a number of levels: conceptual, logical and formal. Attention will be paid to the requirements of different types of respondent (business and individual), and to the different types of surveys (e.g. economic or social) that may be addressed. The questionnaire design tool will capture all relevant metadata and store it in the metadata repository.
- The Questionnaire Presentation tool will render the questionnaire for use with PCs and in particular with web browsers. XML support in the CWM for presentation, validation, navigation and calculation will be implemented by the tool. This will allow users to fill in the data and for it to be validated as appropriate.
- The Database Interrogation tool will support the extraction of data from popular databases and the mapping of these data to XML. It will also allow data to be extracted from the XML and loaded into a database. Once configured, this will support the automated loading and extraction of data to and from databases and the electronic questionnaire.
- The Survey Administration package will allow the questionnaires to be integrated with registers and sample frames. It will track the despatch and receipt of questionnaires and software to individuals and organisations.
Sources: the completed template for IQML (Chris Nelson); IQML: Registry and Repository Interface Specification, by Chris Nelson and Andy Jenkins, Dimension EDI.
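To make the flow from metadata object to rendered questionnaire more tangible, the following is a minimal Python sketch of how a question held as a metadata object might be rendered for a web browser. The class and field names are invented for illustration and do not reflect the actual IQML object model.

```python
# Hypothetical sketch: a question stored as a metadata object and
# rendered as HTML for web presentation. Names are illustrative only,
# not the actual IQML object model.
from dataclasses import dataclass, field

@dataclass
class Question:
    qid: str                      # identifier in the metadata repository
    text: str                     # question wording
    allowed_codes: dict = field(default_factory=dict)  # code -> label

    def to_html(self) -> str:
        """Render the question as an HTML select element."""
        options = "".join(
            f'<option value="{code}">{label}</option>'
            for code, label in self.allowed_codes.items())
        return (f'<label for="{self.qid}">{self.text}</label>'
                f'<select name="{self.qid}">{options}</select>')

q = Question("q_sex", "Sex of respondent", {"1": "Male", "2": "Female"})
print(q.to_html())
```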
1.3 IPIS
1.3.1 Main objectives
The IPIS project aims to develop and apply advanced methods and technologies required for maximum international compatibility, for efficient use of public information and for increasing the efficiency of administration. In particular, the ability to access, organise and disseminate relevant information is a prerequisite for any decision-making activity at all stages of the workflow process. The general objective is to develop new tools and services to enable Public Administrations (P.A.s) to design, organise, develop and disseminate Public Information Systems (PIS) in a pre-harmonised and standardised way. The objective concerning software development is the introduction of a new public information system which would allow for:
- Full exploitation of integrated information systems that combine in meaningful ways social and economic data of mutual relevance, for policy analysis and comparative research.
- Resolution of data and metadata storage problems.
- Adaptation to different types of data (labour-market indicators, industrial, trading and financial flows, balance sheets, etc.).
- Universal access to distributed public information systems for analysing a large variety of statistical data and their accompanying metadata.
The project will also develop a user-friendly, metadata-based software application tool that will assist data providers with on-time availability of a low-cost, high-performance and high-reliability integrated information infrastructure, thus increasing productivity and decreasing costs. Furthermore, the simultaneous provision of meta-information will increase the value of the statistical results for the end-users. The usefulness of such an information infrastructure will be demonstrated in two main specific applications, namely cross-border trading and vocational training.
Areas of application: Labour Market (pilot in Household Budget Surveys), Cross-Border Trading (pilot in External Trade), Vocational Education and Training.
1.3.2 Architecture
I. SYSTEM FUNCTIONAL REQUIREMENTS
i. The use of meta-information, in the form of metadata, for the proper processing and presentation of the unprocessed data is of major importance. The metadata will be used as a means of data identification, as a knowledge base and as a means for the automatic transformation of these data. A proper metadata model should be developed and used for the automation of the whole system and as a means to identify and exploit the full potential of all the resources of the system.
ii. The system shall be able to manage the effective storage and identification of statistical data and metadata and to provide a mechanism for their retrieval and processing. The data that will be stored on the system are expected to be in the form of surveys, tables referring to a level of aggregation, official data in the form of tables, data in registries and probably data from reports produced under special circumstances.
iii. A facility to search and retrieve external data described by some generally accepted format (like DDI or RDF) shall be taken into consideration.
iv. The most important requirement for the system is to process the statistical data and meta-information. The system needs the ability to select and present the required data at the level of analysis asked for, with all the required meta-information that must accompany the data in order for users to understand their meaning and how to handle them.
v. The system is required to provide some statistical processing procedures. These procedures may be implemented either as built-in functionality or as additional functionality offered to the user as a set of library functions, so that the final user will be able to use them for the processing required.
vi. The system shall have facilities to present data for spatial and temporal comparison, either within the same country or between different member countries of the EU. The system may also have the ability to present information at different levels of aggregation.
vii. A set of standard graphics facilities is required. The system must contain a business graphics facility able to present the typical set of graphics (xy-lines, bar graphs (simple or stacked), pies, scatter plots). These are required to be presented on the display, to be printed, or to be included in common packages.

II. DATA STORAGE
The system will have the ability to store and handle data sets from statistical surveys along with their specification schemes. Also the
system must have the ability to store data in the form of aggregates of these variables. The system will store, or refer to, a set of separate databases containing official data as they are collected by the appropriate Public Administrations. The design must be abstract enough to be able to accept data types not known at the time of its development. Data from registries will be stored in the IPIS repository, taking into consideration the principle of non-disclosure of personal information in the cases where that principle is applicable.
III. METADATA PRESENTATION AND STORAGE
The system is required to have facilities for the storage and presentation of the available metadata for the selected data sets. In addition, the presentation of the set of underlying classifications used for the data processing is required. Functionality for the automatic transformation of the presented data from one classification to another needs to be taken into consideration.

IV. DATA EXPORT
The requirements for export procedures include the following:
- A facility for exporting data and metadata in the most commonly used formats.
- A facility to export tables and graphs as embeddable objects to other software programs.
- A facility to export database data in the widely used formats.

V. DATA PROCESSING
Data Selection
The system must have the ability to select data across different data sets, either from the same country for different time periods or from different countries for the same time period. The selected data must contain the whole set of information required for their proper handling, as well as a full explanation of them. For that reason a kind of enhanced SQL query scheme must be supported; the enhancements needed concern the maintenance of the names of the variables and the explanatory texts associated with them.
Tabulations
The selected and manipulated data are required to be presented in tables containing the proper labels, as well as the proper names of variables along with the notes required for the appropriate comprehension of the tables. For presentation on the display, handlers are required in order to give additional information concerning explanatory issues or other statistical metadata issues. The proper format of the tables must be preserved. The system must also provide facilities for printing.
The system must have advanced facilities concerning tabulations, either for presentation purposes on the display or for printing. It must be possible to use the data contained in the tabulated tables for statistical processing and for graphics purposes. The statistical processing required involves facilities for standard statistical measures (descriptive statistics), statistical inference tools and regression models; some of these need to be offered as predefined selections and others as a set of library functions available to the user. The tables presented need to have the following capabilities:
- Presentation of the required data with the labels describing them and the meta-information required for their understanding. The meta-information will be presented as notes, as 'balloons', or with some similar facility indicating that meta-information exists for some element of the data.
- The variables presented need to be coded with a widely accepted coding scheme in order to prevent misuse of the data. A possibility is to hide the coding scheme from the final user, providing him with a kind of front-end GUI.
- The tables may provide either numeric values for the variables or percentages.
- Some kind of processing functionality is required for the elaboration of indicators defined by the end user.
- The tables defined may be single rows or columns, or cross-tabulated tables.
- The system needs to have the ability to present data at various levels of aggregation.
- The system needs to have the ability to present spatial and temporal data in tables.
- The presented data need to be exportable with their metadata in a set of formats for use with other software packages (statistical or general) for further processing.
- The tabulation facilities of the software need to be advanced enough to cover the needs of various categories of users.
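The "enhanced SQL" requirement described under Data Selection above can be illustrated with a small sketch: a query wrapper that returns variable labels and explanatory notes together with the values, so that self-describing tables can be built. The schema (tables named hbs and variable_metadata) and helper names are assumptions made for this illustration only.

```python
# Sketch of a metadata-preserving query: results carry variable labels
# and explanatory notes alongside the values. The schema is hypothetical.
import sqlite3

def labelled_query(conn, sql, params=()):
    """Run a query and attach label/note metadata to each result column."""
    cur = conn.execute(sql, params)
    columns = [d[0] for d in cur.description]
    meta = {}
    for name in columns:
        row = conn.execute(
            "SELECT label, note FROM variable_metadata WHERE name = ?",
            (name,)).fetchone()
        meta[name] = {"label": row[0], "note": row[1]} if row else {}
    return {"columns": columns, "metadata": meta, "rows": cur.fetchall()}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hbs (age INTEGER, income REAL)")
conn.execute("INSERT INTO hbs VALUES (34, 18000.0)")
conn.execute("CREATE TABLE variable_metadata (name TEXT, label TEXT, note TEXT)")
conn.execute("INSERT INTO variable_metadata VALUES "
             "('income', 'Net annual income', 'In euro, 1999 prices')")

result = labelled_query(conn, "SELECT age, income FROM hbs")
print(result["metadata"]["income"])   # label and note travel with the data
```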
Automatic Transformations of Data
The system must be able to automatically transform the underlying data between classifications or between monetary units, or to aggregate the data into higher classification categories. The user needs to have the ability to choose between the different transformation methods available at the stage of processing he is working on.
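As a concrete illustration of this transformation requirement, the sketch below re-aggregates data from one classification into another using a correspondence table. The codes and scheme are invented for the example.

```python
# Sketch: transform aggregate data between classifications using a
# correspondence table (hypothetical detailed codes -> higher-level
# categories). Purely illustrative.
from collections import defaultdict

correspondence = {          # source code -> target category
    "011": "A", "012": "A",
    "101": "C", "102": "C",
}

def transform(data: dict[str, float]) -> dict[str, float]:
    """Aggregate source-classified values into the target scheme."""
    out = defaultdict(float)
    for code, value in data.items():
        out[correspondence[code]] += value
    return dict(out)

print(transform({"011": 10.0, "012": 5.0, "101": 7.5}))
# {'A': 15.0, 'C': 7.5}
```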
Spatial Comparisons
The system is required to have the ability to select data and present spatial comparisons of statistical data from regions in a country or across countries, either in tabular form or graphically. These data are required to be usable in statistical manipulations.
Temporal Comparisons
The system is required to have the ability to select data from data sets of different years and to create time series which can be used for comparisons. These data have to be presented in tables or graphs, or to be manipulated statistically.
1.4 METAWARE

1.4.1 Main objectives
The objective of METAWARE is the "development of a standard metadata repository for data warehouses (DWH) and standard interfaces and functions to exchange metadata between the basic statistical production system and data warehouses. The aim is to make statistical data warehouse technologies more user-friendly for user access by the public sector. This will support the application of official statistics in society and broaden the scope of users. The system will operate both in the traditional client/server environment and in the Internet world. The aim is also to support and enhance standardisation both at the national and the international level." It should be considered a kind of continuation of the IMIM project. The main idea is the support of data warehouse applications with statistical metadata. The basic structure of the technical project solution, published in Annex 1, is illustrated in the figure below. (Sources: METAWARE Deliverables D1 and D2; the template completed for METAWARE by L. Planque; the IMIM project, http://imim.scb.se.)
[Figure: external MD targets connected through an MD exchange function and a metadata interface to the data.]
The main idea is to support the data warehouse tools used in the DW environment with metadata, and to support data access with relevant metadata. It has been decided to use an object-oriented approach to define and specify the metadata system necessary for a data warehouse approach. An object-oriented approach means primarily the definition and specification of a number of object types relevant to the DW application. It has also been decided to use the metadata specifications produced by the Neuchâtel Group (statistical classifications) and those that are available and useful from other sources. A close co-operation with the work done in the Metanet project is envisaged; the Metaware project expects both feedback and input from that project. The metadata specification and the logical system design will be done independently of any particular software approach. This means that the system approach will be adaptable to different technical software solutions. Within the Metaware project it is planned to develop a prototype based on the Bridge software, developed during and after the lifetime of the IMIM project.
1.4.2 Architecture
For the prototype development it is planned to use MS Analysis Services and Oracle Express as two existing data warehouse engines. Both systems have OLAP functions that do not sufficiently support statistical metadata. The project has to explore how the engines can be extended in their metadata functions via APIs. A special application layer has to be designed and implemented as a prototype development. The object types specified by the project have to be implemented in a common metadata interface. The project will use the ComeIn interface tool; the functionality of ComeIn has to be extended by implementing new object types, and the project will also use object types already defined in ComeIn.
[Figure: the metadata repository linked, via metadata and data import/export, to the DW application and to the data warehouse engines (MS Analysis Services, Oracle Express).]
1.4.3 Metadata support
The project will define object types relevant to data warehouse problems and introduce them also to the Metanet project for adoption. It could be the task of the Metanet project to define the reference object type, while the Metaware project will use the same object type but only with the attributes useful for its task. On the other hand, the Metaware project will use object types developed and defined in other projects. The reference model of the project is the general repository for the description of metadata object types; the Metaware project will use a number of these object types with a suitable subset of attributes. A general agreement about such a reference model for metadata object types will permit the development of common interfaces for the exchange of metadata. XML could, for instance, be one solution, but the ComeIn interface is also based on the same philosophy and will support the exchange of metadata between different software packages or different components of a system. The metadata model does not reflect versioning features and multilingual support, since those are not part of the conceptual model. Multilingual and version support is provided by the appropriate implementation of a ComeIn interface. All textual metadata objects should support any number of languages. Moreover, all metadata objects are assumed to support versions to reflect minor changes in a metadata object. This is not reflected in the data model because it is not of conceptual interest and can be implemented in many different ways. The technical level just defines the way in which input relations (record types) are processed by operation implementations in order to create one or more output relations (record types). The conceptual part is
divided into the variable definition part and the process definition part. In order to support retrieval functions, keywords (thesaurus) and statistical activities (surveys and products) have been introduced. The conceptual part is more complex, since the concept of statistical variables is itself rather complex. On the other hand, variable and process definitions provide the meta-information which is required for retrieval processes and for providing conceptual information about the data. Moreover, conceptual information can be used for generating 50% or more of the technical metadata.
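The last point can be made concrete with a small sketch: given a conceptual variable definition (variable, value set, datatype), a technical record-type description follows almost mechanically. All names below are invented for illustration; this is not the Metaware reference model.

```python
# Sketch: deriving a technical record-type description from conceptual
# variable definitions, as suggested above. Names are illustrative.

conceptual = [
    # (variable, value set / classification, datatype)
    ("region", "NUTS level 2 codes", "char(4)"),
    ("turnover", "euro amounts", "decimal(12,2)"),
]

def record_type(name: str, variables) -> dict:
    """Generate a simple technical record layout from variable definitions."""
    return {
        "record": name,
        "fields": [{"name": v, "type": t, "values": c}
                   for v, c, t in variables],
    }

print(record_type("enterprise_stats", conceptual))
```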
1.5 MISSION

1.5.1 Main objective
The main goal of this project is to utilise the World Wide Web and emerging agent-based technologies to provide a modular system of software which will enable providers of official statistics to publish their data in a unified framework, and to allow consumers of statistics to access these data in an informed manner with minimum effort, as indicated in the completed template for MISSION sent by Yaxin Bi.
In all, it is evident that MISSION is set up to be able to incorporate any set of statistical tables, and thus includes among its examples of use output data from Eurostat, Statistics Finland and the ONS. It is paying considerable attention to data (including large amounts of microdata and administrative data) from Education and Health. It is not paying special attention to extracting indicators, but considers the production and maintenance of relevant ones a requirement.
1.5.2 Architecture
The MISSION system relies on agents (specific software modules) for both the communication between various modules and the coordination of their actions.
Components of the Architecture
The architecture comprises five basic logical, or conceptual, units:
- The Client component is a web-based user interface which connects a user to all sites participating in the architecture. The Client obtains a request from the user and sends an agent to search for a Library that can satisfy the request. It is expected that the Client will take the form of a Java applet or a true web interface using HTML and JavaScript; of course, other approaches may prove suitable, e.g. plug-ins.
- The Compute server is a statistical processing engine which stores no information of its own. Based on the query it receives, it obtains the necessary data from various data servers, performs the request, and returns the result to the Library which made the request. It may also make requests to third-party statistical packages. A primary objective of the compute server unit is to integrate a distributed declarative querying facility and a distributed statistical aggregation system, using distributed database and web technology. The compute server architecture will be designed to incorporate intelligent agent techniques: query agents will facilitate interaction with library server units concerning locational and other operational metadata; query agents will also facilitate interaction with data server units concerning macrodata; mediation agents will enable the user-specified merger of heterogeneous macrodata and accompanying metadata. The compute server will be specified and designed to efficiently implement the statistical macrodata operators and associated metadata operators necessary to accomplish the required data merger through a Java-embedded query language. The principal task of the compute server unit is therefore to receive
and interpret queries from library units and to return macrodata and metadata results, along with action-plan metadata useful for future query optimisation.
- The Library is a repository for statistical metadata. It holds three different kinds of metadata. When a Library receives a request, it decomposes it and, if necessary, it can send requests to other Libraries in the system for any metadata it requires. Once it has built up an operation, it submits it to a compute server. On receiving the reply to the request, it returns the answer to the Client.
- The Data server is the unit which gives access to the data. The data server holds the data itself, management tools for registering and maintaining the system, and a gateway module. The gateways hold the minimum amount of metadata necessary for the safe use of the data. This includes registration details to allow the Provider to control access to the data, and information about the physical structure of the datastore. Other metadata is made available to be uploaded to Libraries that request it.
In brief, the Library is a server in the application layer that:
- serves as a statistical metadata repository. Metadata in the context of the MISSION project are: access metadata (machine-readable, containing the physical and logical information needed to access the actual data), methodological metadata (machine-readable, required to process data for statistical analysis) and, finally, contextual metadata (human-readable, providing extra information for the user in the form of notes, footers, survey details, etc.);
- performs front-end pre-processing of user requests (perhaps performing such tasks as syntax checking, validation, conformance to metadata requirements, etc.) before dispatching them to other modules of the system;
- holds no actual data, only the relevant metadata.
Agents form the dynamic part of the system. Agents perform intermediate processing and navigate the Internet to access the appropriate building blocks of the system. Once these are located and accessed, agents are responsible for invoking the appropriate computations on the engines or for retrieving the appropriate data and metadata according to the user request/goal. Specifically, it is specified that suitable agents, to be developed in the course of the project, will:
- perform intermediate processing and navigate to the appropriate part of the system to obtain resources (data, computations, etc.);
- invoke computational routines (as needed) and provide data to other agents when requested.
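To fix ideas, the following toy sketch mirrors the request flow just described (Client to Library to compute server to data server). All classes and data are hypothetical stand-ins: in MISSION itself these are distributed, agent-mediated components rather than local objects.

```python
# Toy sketch of the MISSION request flow (Client -> Library ->
# Compute server -> Data server). All classes are hypothetical
# stand-ins for distributed, agent-mediated components.

class DataServer:
    def fetch(self, table):            # holds the actual data
        return {("GR", 1999): 42.0} if table == "unemployment" else {}

class ComputeServer:
    def run(self, operation, data):    # stateless statistical engine
        if operation == "total":
            return sum(data.values())

class Library:
    def __init__(self, compute, data_server):
        self.metadata = {"unemployment": "locational metadata"}  # no data
        self.compute, self.data_server = compute, data_server

    def handle(self, request):
        """Decompose the request, gather data, delegate computation."""
        table = request["dataset"]
        assert table in self.metadata      # the Library holds only metadata
        data = self.data_server.fetch(table)
        return self.compute.run(request["operation"], data)

library = Library(ComputeServer(), DataServer())
print(library.handle({"dataset": "unemployment", "operation": "total"}))
```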
Chapter 2 - Projects comparison

2.1 Objectives

It seems that all five projects aim, directly or indirectly, at the following:
- Development of a standard metadata repository
- Use of the web for data dissemination
- Metadata collection and manipulation
- Use of a metadata model (with different classes, attributes and structure, although overlapping with some of the COSMOS projects, particularly for the description of data storage and processing)
- Support and manipulation of micro- and macrodata (current or planned)
- Use of current state-of-the-art technologies
2.2 Comparative analysis of COSMOS cluster projects

Remark: In the following tables, some information has not been specified, because we received only general answers from the corresponding partners; these answers are added as footnotes. In cases where N.A. (Not Available) is indicated, the responsible partners did not complete this question.
COMPARATIVE ANALYSIS OF THE COSMOS PROJECTS: USER TYPES AND SERVICES

Areas of application
- FASTER: General areas of social science - health, income, voting patterns, labour market, housing tenure, employment, etc.
- IQML: Not specified (1)
- IPIS: Labour Market (mainly the area of the Household Budget Survey), Vocational Education and Training, and Cross-Border Trading (External Trade).
- METAWARE: Not specified (2)
- MISSION: Not specified (3) (We can assume from the response to the question about classifications that some of the areas are: Labour Market, Health and Environmental statistics.)

User types
- FASTER: End users (academic community, policy makers, mass media community, commercial community, management consultants); data disseminators (data archives, local authorities, central government departments); data producers (government departments, survey organisations, research institutes).
- IQML: Respondents (either as individuals or companies); questionnaire designer (i.e. statistician or data collector that designs the questionnaire); survey administrator (person responsible for administering the survey in terms of population, survey sample, follow-up of responses/non-responses, etc.); collecting agency (i.e. organization responsible for collecting the completed questionnaires).
- IPIS: Policy makers (international organisations, international and national statistical offices, public administrations and institutions (ministries, etc.), European Commission, etc.); end-users (universities, research institutes, service providers, enterprises, general public (citizens, consumers, workers, etc.)).
- METAWARE: Data producers and publishers (e.g. users of commercial data warehousing packages) for the main objective; other statistics users for sub-objectives.

Data collection (does the project's software facilitate data collection?)
- FASTER: NO; IQML: YES; IPIS: NO; METAWARE: NO.

Data collections offered, data types supported and manipulated, metadata manipulation, data dictionaries
- IPIS: Data and metadata from all kinds of data providers; micro and macro data supported and manipulated; metadata manipulation: YES; data dictionaries: YES (by statistical domain, statistical product, data producer, year), time-series dictionaries, specialized collections of indicators, user-defined collections of indicators, etc.; data import: YES.
- METAWARE: NO; the Metaware model does not support data in itself but only metadata (including links to datasets, records and physical data elements).

Search facilities
- IQML: Searches on the repository to assist the designer or survey administrator to find metadata relating to the questionnaire or survey.
- IPIS: Wizard-like facility; search by variables, variable id, metadata, keywords, codes, etc.

Languages supported
- FASTER: Native language of the survey for the majority of DDI fields; agreement amongst archives that certain fields of the DDI will be translated into English allows searching across archives in English for these fields. Collaboration with the LIMBER project (which is developing a multilingual thesaurus) means that a test version demonstrating natural language searches across catalogues in English, German, French and Spanish is expected by the end of the project.
- IQML: All European languages, perhaps other languages.
- IPIS: All.
- METAWARE: English only.

Collection of indicators
- FASTER: Not in the standard interface, but this is provided in additional interfaces.

Collection of classifications
- FASTER: NO. These could be included as part of the standard coding information held in the DDI, but they could not be mapped to other classifications.
- IQML: YES; all repository objects can be classified by many different categories (a category = one classification node from one classification scheme). Allowable responses to questions may be taken from a classification. No mapping between classifications.
- MISSION: May store international classifications in the system and allow mappings to be set up automatically or manually.

Data presentation
- IQML: Questionnaires can be represented in many media; current external formats are HTML and XML. HTML for the respondents; Java-based GUI for the survey administrator or questionnaire designer.

Transformations supported
- FASTER: NO, but the software includes facilities to tabulate, produce descriptive statistics, scatterplots and regression.
- IQML: YES; responses to questions can be generated from calculations.
- MISSION: YES (transformation is used for metadata harmonisation and to support heterogeneous queries).

Harmonisation of results
- MISSION: YES (a heterogeneous query facility is required) in Labour Market, Health and Environmental statistics.

GUI specification
- FASTER: Java or HTML.
- IQML: The Questionnaire Designer Tool (QDT), the Survey Administration Tool (SAT) and the repository each have a GUI; the Questionnaire Presentation Tool (QPT) GUI is HTML.
- IPIS: HTML.
- METAWARE: User-friendly Visual Basic interface to enter and process meta-information on data warehouses and cubes, and to pilot commercial data warehousing packages.

User action histories
- IQML: YES; limited audit trail on repository items. No history is kept of respondent navigation of the questionnaire.

Footnotes:
(1) The answer to the relevant question of the template was "raw data collection, questionnaire design, survey administration, registry and repository".
(2) The answer to the relevant question of the template was "the whole metadata domain, but particularly metadata on data warehouses and cubes".
(3) The answer to the relevant question of the template was "allowing statistical data providers to share expertise and publish data on the Web".
COMPARATIVE ANALYSIS OF THE COSMOS PROJECTS: ARCHITECTURE

Type of architecture / data repository
- FASTER: 3-tier; distributed.
- IQML: 3-tier; centralized (the repository can be located on any HTTP server, but is at present a single, shared resource).
- IPIS: 3-tier; centralized.
- METAWARE: 3-tier; distributed.
- MISSION: 3-tier; distributed.

Metadata model
- FASTER: DDI.
- IQML: IQML model (question bank objects, survey administration objects); the repository model is based on the ebXML registry/repository model.
- IPIS: Uniquely developed for the IPIS project, following the OECD MEI. The classes hold information on: statistical populations, survey variables, indicators, classifications and other standards, data quality issues, source agencies and collection information, logistic metadata, process metadata.
- METAWARE: Process definitions (cubes, registers, statistical processes, etc.); variable definitions (variables, classifications, measure units, etc.); technical level (record types, process implementations, etc.); thesaurus (keywords, terms, synonyms, etc.).
- MISSION: MIMAMED.

Interchange protocol
- FASTER: XML.
- IQML: HTTP, SOAP.
- IPIS: XML, DDI.
- METAWARE: XML.
- MISSION: XML.

Use of agents
- FASTER: NO.
- IQML: NO.
- IPIS: NO.
- METAWARE: NO.
- MISSION: YES.
The interaction of all five projects is presented in a figure in COSMOS Annex 1.
However, the intensive analysis performed showed that some more concrete converging points can be found, which have to be discussed. Therefore, following that diagram, we would like to mention the following:

2.2.1 Data capture

IQML is a data capture system that aims at improving the accuracy and timeliness of statistical data. No specific data providers are considered, meaning that no specific areas of application are assumed; we can deduce that the questionnaires designed will fully support various areas of application. IPIS is a dissemination system. However, it incorporates not only the output from statistical agencies but also that from administrative sources such as Customs and Vocational Education and Training (VET) institutions, as well as organisations related to Customs operations. To that extent it can be regarded as an input system as well.
2.2.2 Data dissemination

Apart from IQML, which is clearly an input system, and METAWARE, which is concerned mainly with metadata and standards, the other three projects (MISSION, FASTER and IPIS) are clearly data dissemination projects. All of them provide a web-based mechanism that allows users to access the existing data and perform certain manipulations from anywhere in the world.
2.2.3 Metadata repository

The key link between all COSMOS projects is the use of repositories. All projects have followed the standards in this area closely. Linking metadata repositories, or accessing multiple metadata repositories from one application, can be achieved by having a common understanding of the domain model for the chosen business area and a common way of accessing registries and repositories.
2.2.4 Metadata categories and modelling

FASTER
Three models seem to have links to FASTER's metadata needs: GESMES (GEneric Statistical MESsage), CWM (Common Warehouse Metamodel, http://www.cwmforum.org/) and Registries and Repositories. The GESMES model is suitable for cubes and time series; in that context, it could play a role in the FASTER environment. The CWM is a system of combining models on several levels of abstraction. Some parts of the model seem to have relations with FASTER: OLAP is a system for describing cubes, very much like what is needed in FASTER; the description of the records could also be performed in the CWM; classifications and taxonomies might be described using the Business Nomenclature. Registries and Repositories could form the interface between web applications and the data archives; OASIS and ebXML provide tools for such registries. If an extension to questionnaires is needed, IQML could be a good candidate. The FASTER model is composed of a set of sub-models:
- Dataset Model
- Datatypes Model
- Cube Model
- Classification Model
- Transformation Model
- Documentation Model
The following also have to be mentioned:
- the Object Model is an extension of the OO model defined by the W3C RDF and RDF Schema standards;
- RPCs (remote procedure calls) are performed as normal HTML FORM calls;
- the properties are the attributes of an object; they can be either literals (such as a String, an Integer, or any other basic type such as the ones defined by XML Schema) or a reference to another object.
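A brief sketch may clarify the distinction between literal and reference properties in such an RDF-style object model. The class and property names below are illustrative and are not the FASTER API.

```python
# Sketch of RDF-style objects whose properties are either literals or
# references to other objects. Illustrative only, not the FASTER API.

class MetaObject:
    def __init__(self, uri):
        self.uri = uri
        self.properties = {}

    def set(self, name, value):
        """Value may be a literal (str, int, ...) or another MetaObject."""
        self.properties[name] = value

dataset = MetaObject("urn:example:dataset/hbs1999")
classification = MetaObject("urn:example:classification/nace")
dataset.set("title", "Household Budget Survey 1999")   # literal property
dataset.set("usesClassification", classification)      # reference property

ref = dataset.properties["usesClassification"]
print(ref.uri)   # a reference can be followed, like a link in the Data Web
```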
Especially for Statistical Disclosure Control, two steps can be identified. The first step is that the data are checked for statistical confidentiality. After that, data found to be unsafe are made safe in a second step. In the case of multidimensional tables, a third step can follow, to secure data that, in combination with the secured data, can still reveal individuals. This implies that there is a difference between Statistical Disclosure Control for rectangular record data and for multidimensional table data. In analogy to the Argus program, record data will be referred to as microdata, while multidimensional data will be referred to as tabular data. The two different kinds of data also require different approaches to disclosure control; therefore the two streams will be described separately.
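The first step, checking tabular data for confidentiality, is in practice often a simple frequency rule. The sketch below marks cells as unsafe when they have fewer than a threshold number of contributors; it is a generic illustration, not the Argus algorithm.

```python
# Generic illustration of a primary confidentiality check on tabular
# data: cells with fewer than `threshold` contributors are unsafe and
# would be suppressed in the second step. Not the Argus algorithm.

def unsafe_cells(table: dict, threshold: int = 3):
    """Return the cells whose contributor count falls below the rule."""
    return [cell for cell, count in table.items() if count < threshold]

frequency_table = {("region A", "sector 1"): 12,
                   ("region A", "sector 2"): 2,   # unsafe: 2 < 3
                   ("region B", "sector 1"): 7}
print(unsafe_cells(frequency_table))
```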
IQML:
This project has developed its own metadata model, including information about questions, which represent variables, and about survey administration (e.g. sampling and collection method). Additionally, its repository model is based on the ebXML registry/repository model. According to this, a registry is assigned to each repository and contains metadata about the objects in the repository, so that the system can find out whether it contains data relevant to the user's question. The metadata model of IQML can be divided into the following parts:
- Question Bank, holding information on questionnaire design and structure;
- Navigation, Calculation and Validation, describing processes and the use of the questionnaire;
- Survey Administration, containing information about the survey under consideration, the sample, etc.
Of course, there are interrelations between these parts, and we should also stress the concept of the Node class, as any content or expression can be linked to any Node. The major classes of the Question Bank are subclasses of Node. The Content is dependent upon the Context: different Content can be present for the same Node depending on the Context. Examples of Context are age range, language, etc.
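The Node/Content/Context idea can be sketched briefly: the same Node yields different Content depending on the Context in which it is requested. The class design below is invented for illustration and is not the actual IQML model.

```python
# Sketch of context-dependent content: one Node, several Contents,
# selected by Context (e.g. language). Invented names, not the actual
# IQML classes.

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self._content = {}              # context -> content

    def add_content(self, context, content):
        self._content[context] = content

    def content(self, context):
        return self._content[context]

question = Node("q_age")
question.add_content(("lang", "en"), "What is your age?")
question.add_content(("lang", "fr"), "Quel est votre âge ?")
print(question.content(("lang", "fr")))
```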
IPIS
The project has started from scratch, with no predecessor. The metadata types supported fall into the following four categories:
- Semantic metadata: the metadata that give the meaning of the data. Examples of semantic metadata are the sampling population used, the variables measured, the nomenclatures used, etc.
- Documentation metadata: mainly text-based meta-information, for example labels, used in the presentation of the data. Documentation metadata are useful for creating user-friendly interfaces, since semantic metadata are usually too complex to be presented to the user. Usually, an overlap between the semantic and documentation metadata occurs since, many times, we have to store metadata in both structured (i.e. semantic metadata, used mainly by machines) and verbal-text (i.e. documentation metadata, used by humans) form.
- Logistic metadata: miscellaneous metadata used for manipulating the data sets. Examples of logistic metadata are the data's URL, the type of RDBMS used, the format and version of the files used, etc.
- Process metadata: the metadata used by information systems to support metadata-guided statistical processing. These metadata are transparent to the data consumer and are used in data and metadata transformations. In the rest of the document we will focus our attention on this type of metadata.
A figure in the project documentation describes the overlap of the above-mentioned metadata categories.
In general, the classes of the IPIS statistical metadata model hold information on the following:
- Statistical populations and survey variables
- Indicators, classifications and other standards
- Data quality issues
- Source agencies and collection information
- Logistic metadata
- Process metadata
METAWARE:
The project follows the recommendations of the IMIM project, and it is planned to develop a prototype based on the Bridge software. The technical basic structure of the metadata support for a data warehouse approach has been demonstrated in the initial description of the METAWARE project (Chapter 1, relevant part). The project holds three types of metadata, classified according to purpose: physical metadata, operational metadata and conceptual metadata. Conceptual metadata address external users and statisticians, while operational and physical metadata address systems and applications.
Communication between users, or between systems and users, is frequently based on conceptual metadata, while systems refer to operational and physical metadata to provide the required information to the user. These three different aspects of metadata are not clearly distinguished from each other; operational and physical metadata can be derived from conceptual metadata in many cases. The ComeIn metadata model will be used. ComeIn 3.0, to be released in January 2002, will fully support the DDI Codebook, the Dublin Core and the ISO 11179 standard. Thus, it will be possible to generate from a ComeIn-compliant metadata system (e.g. Bridge NA) DDI Codebooks as well as ISO 11179-compliant registry entries. Moreover, a ComeIn SOAP server (Simple Object Access Protocol, a W3C standard) will be provided with
ComeIn 3.0, allowing direct exchange of metadata over the Internet via SOAP, an XML-based object access protocol.
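For orientation, the following shows roughly the shape of a SOAP message that could carry a metadata object request. The body elements are invented for illustration and do not reflect the actual ComeIn interface specification.

```python
# Rough shape of a SOAP 1.1 envelope carrying a metadata object request.
# The body elements are invented for illustration; they do not reflect
# the ComeIn interface specification.
soap_message = """<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getMetadataObject xmlns="urn:example:metadata">
      <objectType>StatisticalClassification</objectType>
      <objectId>nace-rev1</objectId>
      <language>en</language>
      <version>latest</version>
    </getMetadataObject>
  </soap:Body>
</soap:Envelope>"""
print(soap_message)
```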
MISSION
This project is the successor of the ADDSIA project, which used the MAMED model supporting macrodata and metadata. The MISSION project has now extended the existing metadata model into the MIMAMED model, which supports microdata in addition to macrodata and metadata. Three kinds of metadata are distinguished:
a. technical metadata, which refer to the physical storage means and location of data;
b. active metadata, which actually define the manipulations that a user can perform;
c. passive metadata, which describe certain features of the data in free text, e.g. quality.
Furthermore, the MISSION project supports metadata capture using additional methods and standards, such as the PC-Axis standard. Finally, it must be noted that there is a special package, called the Standard Metadata Package, which contains the items of MISSION that can be mapped onto the standards being developed for metadata, for example the CWM model by the OMG, the Guidelines for Statistical Metadata by the United Nations, the Corporate Metadata Repository (CMR) model, etc.
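A compact sketch shows how the three kinds of MIMAMED metadata might sit side by side in a dataset description; the record layout is invented for illustration.

```python
# Sketch of a MISSION-style metadata record distinguishing technical,
# active and passive metadata. The layout is invented for illustration.
dataset_metadata = {
    "technical": {                 # physical storage means and location
        "url": "http://example.org/data/lfs1999",
        "format": "relational table",
    },
    "active": {                    # manipulations a user may perform
        "operations": ["aggregate", "tabulate", "merge"],
        "aggregation_levels": ["NUTS1", "NUTS2"],
    },
    "passive": {                   # free-text descriptions, e.g. quality
        "quality_note": "Response rate 78%; regional cells may be sparse.",
    },
}
print(dataset_metadata["active"]["operations"])
```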
2.3 Metadata model comparisons
There are clear correspondences between, for example, the Survey Administration part of the IQML model and the part of the IPIS model holding information on statistical populations. In addition, in METAWARE, there are parts of the model on process implementation, process definition (for table manipulation) and variable definition (treating measure units, statistical objects, value sets, classification items, etc.) that can be linked with the metadata models of almost all COSMOS projects; for IQML and IPIS in particular, they are essential. Besides, the Thesaurus part of the METAWARE model can be of value for the MISSION project, and the Library part of MISSION, whose main function is to hold the metadata that allows the user to search for and query data, can serve the purposes of METAWARE. Furthermore, MISSION's Standard Metadata Package, which holds the items of MISSION that can be mapped onto the standards being developed for metadata, can be connected well with the other projects' parts on standards, classifications and indicators (with a reservation as to whether it can serve the purpose of the FASTER project, where some of this information is provided by additional interfaces, not the standard one). In addition, since all projects will provide data dictionaries, the part of the MISSION project serving this purpose is deemed necessary for inclusion in the COSMOS metadata model. Last but not least, the FASTER part of the model on space, variables and populations can be considered one of the most analytical ones compared with the other projects of the COSMOS cluster in this specific area, and can serve as a guideline for the related procedures.
2.4 Architecture comparisons
Basically, all projects employ an n-tier architecture (3-tier, as illustrated in the completed templates), used either in a distributed system (MISSION, METAWARE, FASTER) or in a centralized one (IPIS, IQML), together with some kind of communication glue to tie the various modules together (XML, Z39.50, CORBA), and they mostly use standards-based technologies. XML is used as the interchange protocol by four COSMOS projects (the exception is IQML, which uses HTTP and SOAP), and the DDI is used in parallel by IPIS and FASTER. Only MISSION uses agents at the moment; FASTER also plans to develop them. Main technologies:
- XML, HTML (DDI: FASTER and IPIS; SOAP: IQML)
- UML (most of the projects use Rational Rose)
- Java, C++
- Generic registry and repository
- MS Analysis Services and Oracle Express
- Oracle 8i
References
Papers and presentations
De Vaney, Chris (1997), Common Application Platform Architecture of the Distributed Application Server, Working Paper, WSEL/WP003/Rev.001, 27/12/1996.
Karge, Reinhard (1997), BRIDGE, IMIM Workshop on Output Databases, Stockholm, WSIG/WP/017/Rev.000.
Karge, Reinhard, Bridge Functionality, http://imim.scb.se.
Musgrave, S. and Ryssevik, J. (2000), Beyond NESSTAR: FASTER access to data, IASSIST.

Project deliverables and documents
IPIS: Deliverable D5 (by the UoA/Dept of Mathematics team), Deliverables D6 and D7.1.
IQML: Registry and Repository Interface Specification, by Chris Nelson and Andy Jenkins, Dimension EDI.
MISSION: Deliverables D4, D6.
METAWARE: Deliverables D1, D2.
COSMOS: Annex 1, projects' interrelations.
EPROS publication documents for all five projects.
Web pages
The NESSTAR project: http://www.nesstar.org/
The FASTER project: http://www.faster-data.org/
The IQML project: http://www.epros.ed.ac.uk/iqml/
The MISSION project: http://www.epros.ed.ac.uk/mission/
The IPIS project: http://www.instore.gr/ipis/
The METANET project: http://www.epros.ed.ac.uk/metanet/
The CWM metamodel: http://www.cwmforum.org/
The IMIM project: http://imim.scb.se
The LIMBER project: http://www.venus.cis.rl.ac.uk/limber/
The Cheshire project: http://cheshire.lib.berkeley.edu/
Bridge software: http://imim.scb.se/software/bridge.htm
The CBS Cristal model: http://neon.vb.cbs.nl/sos_cubes
DDI DTD beta testers results: http://www.icpsr.umich.edu/DDI/codebook/testers.html
CORBA: http://www.corba.org/
The OMG: http://www.omg.org/
Annex 1

Template (TO BE COMPLETED BY EACH COSMOS PROJECT)

General Information:

Project objectives
Main objective:
Sub-objectives:
Areas of Application:
User types: (please provide an example for each user category in order to avoid errors due to differences in terminology)

User Services provided
Data Collection (does the project's software facilitate data collection?): Yes / No (Macro)
Data Collections (does the project offer collections of data, i.e. collections of survey data, indicators, etc.?): Yes / No (Macro)
Metadata Manipulation (i.e. manipulation in the metadata model, harmonisation of metadata, transformations of metadata with pre- and post-conditions, etc.): Yes / No (if yes, what?)
Data Dictionaries: Yes / No
Data Import: Yes / No (if yes, in what format?)
Data Export: Yes / No (if yes, in what format?)
Pre-selected indicator groups (i.e. does the system store and use groups of pre-selected indicators, e.g. in the area of Labour Market, Household Budget Survey, etc.?): Yes / No
Pivot classifications (i.e. do you store some pivot classifications (e.g. international ones) to allow for mapping of other classifications onto them?): Yes / No
E-mail facilities: Yes / No
Data publishing: Yes / No
Data presentation: Yes / No (if yes, in what format? e.g. HTML tables)
Data visualization: Yes / No (if yes, in what format?)
Data Browsing: Yes / No
Harmonisation of results: Yes / No
Transformations supported: Yes / No (if yes, which ones are supported?)
GUI specification: Yes / No
Access control/security functions: Yes / No
Statistical disclosure control: Yes / No
User action histories: Yes / No

Metadata model used: (please provide the main metadata categories supported)
Interchange protocol:
Use of Agents? Yes / No

Other
Other characteristics that may have been omitted and are essential for a better understanding of the project's framework.

Please provide any common features between the project you are involved in and any other COSMOS project that cannot be derived from the previous questions.
The FASTER project employs a three-tier distributed architecture that builds on the developments of the NESSTAR and LIMBER projects. It combines a distributed search facility, using XML syntax for seamless remote database searching, with a consistent XML-based interface for accessing multiple data repositories. This architecture makes statistical data more accessible through a Data Web technology that offers universal information access similar to the WWW, integrating statistical data with text and live data.
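As a concrete illustration of the kind of XML message such a distributed search facility might exchange, the following minimal Python sketch builds a hypothetical search request. The element names (searchRequest, keyword, scope, repository) are illustrative assumptions and are not taken from the FASTER specification:

    import xml.etree.ElementTree as ET

    # Hypothetical XML search request of the kind a Data Web client could
    # send to remote statistical repositories. All element names here are
    # invented for illustration, not drawn from the FASTER documents.
    request = ET.Element("searchRequest")
    ET.SubElement(request, "keyword").text = "unemployment"
    scope = ET.SubElement(request, "scope")
    ET.SubElement(scope, "repository").text = "http://www.nesstar.org/"

    print(ET.tostring(request, encoding="unicode"))

Each participating server would parse such a request and answer with an XML result set, which is what allows a single search to span heterogeneous databases behind a consistent interface.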
FASTER addresses metadata management by developing an XML-based, standards-compliant metadata repository that facilitates data interchange and usability. It enhances metadata's role by making it responsible not only for data conformance but also for personalization and access control. The project leverages the Data Documentation Initiative (DDI) standard and other XML approaches for metadata specification, ensuring a flexible and extensible structure for metadata and interfaces.
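To make the DDI connection concrete, the sketch below parses a heavily abridged DDI-Codebook-style fragment with Python's standard library. The element names shown (codeBook, stdyDscr, dataDscr, var, labl) follow the DDI Codebook structure, but a real DDI instance carries many more elements and attributes:

    import xml.etree.ElementTree as ET

    # Abridged, illustrative DDI-Codebook-style fragment: a study title
    # plus one variable description. Real DDI instances are far richer.
    ddi = """
    <codeBook>
      <stdyDscr>
        <citation><titlStmt><titl>Household Budget Survey</titl></titlStmt></citation>
      </stdyDscr>
      <dataDscr>
        <var name="income"><labl>Net household income</labl></var>
      </dataDscr>
    </codeBook>
    """

    root = ET.fromstring(ddi)
    for var in root.iter("var"):
        print(var.get("name"), "-", var.findtext("labl"))

Because the same DTD-governed structure is used by every data provider, a client can extract study and variable descriptions this way regardless of which repository supplied the codebook.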
FASTER addresses privacy and security concerns by implementing access control mechanisms that are supported by metadata at all levels of the system. This ensures that sensitive statistical data is accessed only by authorized users, maintaining confidentiality and compliance with data protection standards. The emphasis on XML-based interfaces also contributes to secure data handling across different server environments.
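The following minimal Python sketch shows, in principle, how an access rule carried in metadata could be enforced; the field names and roles are invented for illustration and do not come from the FASTER design documents:

    # Hypothetical access rule travelling alongside a dataset's metadata
    # and checked before any data is served. All names are illustrative.
    metadata = {
        "dataset": "labour_force_microdata",
        "access": {"allowed_roles": ["researcher", "nsi_staff"]},
    }

    def may_access(user_roles, meta):
        """Return True if any of the user's roles is permitted by the metadata."""
        allowed = set(meta["access"]["allowed_roles"])
        return bool(allowed.intersection(user_roles))

    print(may_access({"researcher"}, metadata))  # True
    print(may_access({"anonymous"}, metadata))   # False

Because the rule travels with the metadata rather than living in application code, every tier of the system can enforce the same policy.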
The interrelation between FASTER and the other COSMOS cluster projects is significant, as it facilitates collaborative approaches to shared challenges in data dissemination and usage. These projects interact by sharing metadata repositories and processes, revealing converging points in their architectures and goals, such as data capture and dissemination. Such interactions enhance resource sharing and technology transfer, and provide a united framework for advancing statistical data management.
FASTER introduces several methodological advancements over NESSTAR, including improved functionality with standards compliance, such as XML and RDF, for broader metadata applicability. It abandons some NESSTAR design choices, such as CORBA messaging, in favor of more robust XML approaches, enhancing metadata's role in usability and user personalization. By incorporating access control within metadata and expanding to multi-dimensional data sources, FASTER builds on NESSTAR's groundwork to offer a more flexible and comprehensive data management solution.
FASTER's reliance on XML and RDF standards implies a commitment to interoperability, flexibility, and semantic precision in metadata management. Using these standards facilitates information interchange with other systems and extends the applicability of metadata to diverse data sources, including time-variant and multi-dimensional ones. This reliance keeps the metadata repository standards-compliant, supporting seamless integration and future scalability.
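As an illustration of what RDF adds, the sketch below describes a dataset with Dublin Core properties in RDF/XML and reads the title back using Python's standard library. The dataset URI is a made-up example, while the rdf: and dc: namespaces are the standard ones:

    import xml.etree.ElementTree as ET

    # Illustrative RDF/XML description of a statistical dataset using
    # Dublin Core properties. The dataset URI is invented for the example.
    rdf = """
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://example.org/dataset/lfs-2000">
        <dc:title>Labour Force Survey 2000</dc:title>
        <dc:date>2000</dc:date>
      </rdf:Description>
    </rdf:RDF>
    """

    root = ET.fromstring(rdf)
    ns = {"dc": "http://purl.org/dc/elements/1.1/"}
    print(root[0].findtext("dc:title", namespaces=ns))

The value of the RDF layer is that the properties are globally identified by namespace URIs, so any standards-aware system can interpret the description without prior coordination.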
The Data Web technology in FASTER aims to transform statistical data handling by providing universal information access akin to the WWW's impact on text publishing. It involves creating a Data Browser (client) that offers a user-friendly interface to standardized services for data access and processing. This allows seamless integration with the WWW, enabling the creation of data-rich documents that blend text, images, and live data, thereby simplifying data identification and dissemination.
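A tiny Python sketch of the data-rich document idea: a piece of text with a placeholder that is filled from a live statistical source when the document is viewed. The field name and value below are invented; in a real Data Web client they would come from a remote server:

    # Minimal sketch of a data-rich document: static text plus a live value.
    template = "<p>Unemployment rate this quarter: {rate}%</p>"

    def render(live_data):
        """Merge live statistical values into the document's text."""
        return template.format(rate=live_data["unemployment_rate"])

    # Illustrative stand-in for a value fetched from a statistical server.
    print(render({"unemployment_rate": 7.4}))

The point is that the document stays current without republication: each viewing pulls the latest value from the statistical source.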
The user-friendly graphical user interface in FASTER enhances data dissemination by offering seamless navigation and interaction with statistical data. The Data Browser component simplifies data access and processing with a Web browser-like experience, easing user engagement with complex datasets. This GUI design allows users to intuitively manage data, automate routine data tasks, and obtain real-time data views, thereby making statistical data more accessible and actionable.
Workshops and multidisciplinary teams are crucial for FASTER in developing metadata specifications and flexible user environments, as they allow for collaboration and consensus building on metadata semantics and structures. These expert workshops facilitate discussions with interested parties, leveraging their expertise to refine the architecture and data accessibility. Multidisciplinary teams bring together leaders in metadata management and statistical control to address FASTER's goals at a European level.
FASTER plans to enhance user interaction with statistical data by developing a configurable Web-based client application that allows users to personalize their environment for immediate interaction with statistical data. Users will be able to explore and interact with data using tools such as visualizations and bookmarks. This setup is expected to facilitate easier data access and management, help users combine various data sources, and improve engagement with a wide range of statistical data.