DSpace Manual

DSpace 7.x Documentation

Exported on 08/03/2021

DSpace 7.x Documentation – DSpace 7.x Documentation

Table of Contents
1 Introduction ......................................................................................................20
1.1 Release Notes......................................................................................................................... 21
1.1.1 7.0 Release Notes ................................................................................................................................................. 22
1.1.2 7.0 Configurations Removed ............................................................................................................................... 24
1.1.3 7.0 Acknowledgments.......................................................................................................................................... 26
1.1.3.1 Major Contributing Institutions........................................................................................................................... 26
1.1.3.2 Financial Contributors ......................................................................................................................................... 26
1.1.3.3 Frontend / User Interface Acknowledgments..................................................................................................... 27
1.1.3.4 Backend / REST API Acknowledgments .............................................................................................................. 27
1.1.3.5 Additional Thanks ................................................................................................................................................ 28
1.1.4 7.0 Beta 1-5 Release Notes................................................................................................................................... 28
1.1.4.1 7.0 Beta 5 Release Notes...................................................................................................................................... 28
1.1.4.2 7.0 Beta 4 Release Notes...................................................................................................................................... 30
1.1.4.3 7.0 Beta 3 Release Notes...................................................................................................................................... 31
1.1.4.4 7.0 Beta 2 Release Notes...................................................................................................................................... 33
1.1.4.5 7.0 Beta 1 Release Notes...................................................................................................................................... 33
1.2 Functional Overview .............................................................................................................. 35
1.2.1 Online access to your digital assets .................................................................................................................... 36
1.2.1.1 Full-text search..................................................................................................................................................... 36
1.2.1.2 Navigation ............................................................................................................................................................ 36
1.2.1.3 Supported file types............................................................................................................................................. 37
1.2.1.4 Optimized for Google Indexing............................................................................................................................ 37
1.2.1.5 OpenURL Support ................................................................................................................................................ 37
1.2.1.6 Support for modern browsers ............................................................................................................................. 37
1.2.2 Metadata Management........................................................................................................................................ 38
1.2.2.1 Metadata............................................................................................................................................................... 38
1.2.2.2 Choice Management and Authority Control ....................................................................................................... 38
1.2.3 Licensing............................................................................................................................................................... 39
1.2.3.1 Collection and Community Licenses................................................................................................................... 40
1.2.3.2 License granted by the submitter to the repository........................................................................................... 40
1.2.3.3 Creative Commons Support for DSpace Items................................................................................................... 40
1.2.4 Persistent URLs and Identifiers ........................................................................................................................... 40
1.2.4.1 Handles................................................................................................................................................................. 40

– 2
DSpace 7.x Documentation – DSpace 7.x Documentation

1.2.4.2 Bitstream 'Persistent' Identifiers ........................................................................................................................ 41

1.2.5 Getting content into DSpace ............................................................................................................................... 42
1.2.5.1 The Manual DSpace Submission and Workflow System .................................................................................... 42
1.2.5.2 Command line import facilities........................................................................................................................... 44
1.2.5.3 Registration for externally hosted files ............................................................................................................... 44
1.2.5.4 SWORD Support ................................................................................................................................................... 44
1.2.6 Getting content out of DSpace ............................................................................................................................ 44
1.2.6.1 OAI Support .......................................................................................................................................................... 44
1.2.6.2 Command Line Export Facilities ......................................................................................................................... 45
1.2.6.3 Packager Plugins.................................................................................................................................................. 45
1.2.6.4 Crosswalk Plugins ................................................................................................................................................ 45
1.2.6.5 Supervision and Collaboration............................................................................................................................ 46
1.2.7 User Management ................................................................................................................................................ 46
1.2.7.1 User Accounts (E-Person) .................................................................................................................................... 46
1.2.7.2 Subscriptions ....................................................................................................................................................... 46
1.2.7.3 Groups .................................................................................................................................................................. 47
1.2.8 Access Control...................................................................................................................................................... 47
1.2.8.1 Authentication ..................................................................................................................................................... 47
1.2.8.2 Authorization........................................................................................................................................................ 47
1.2.9 Usage Metrics ....................................................................................................................................................... 48
1.2.9.1 Item, Collection and Community Usage Statistics............................................................................................. 48
1.2.9.2 System Statistics .................................................................................................................................................. 49
1.2.10 Digital Preservation ............................................................................................................................................. 50
1.2.10.1 Checksum Checker............................................................................................................................................... 50
1.2.11 System Design ...................................................................................................................................................... 50
1.2.11.1 Data Model ........................................................................................................................................................... 50
1.2.11.2 Amazon S3 Support ............................................................................................................................................. 52

2 Installing DSpace ..............................................................................................53

2.1 Installation Overview ............................................................................................................. 53
2.2 Installing the Backend (Server API)....................................................................................... 53
2.2.1 Backend Requirements ....................................................................................................................................... 53
2.2.1.1 UNIX-like OS or Microsoft Windows .................................................................................................................... 54
2.2.1.2 Java JDK 11 (OpenJDK or Oracle JDK)................................................................................................................ 54
2.2.1.3 Apache Maven 3.3.x or above (Java build tool) .................................................................................................. 55
2.2.1.4 Apache Ant 1.10.x or later (Java build tool) ........................................................................................................ 55
2.2.1.5 Relational Database (PostgreSQL or Oracle)...................................................................................................... 56
2.2.1.6 Apache Solr 8.x (full-text index/search service) ................................................................................................. 57
2.2.1.7 Servlet Engine (Apache Tomcat 9, Jetty, Caucho Resin or equivalent) ............................................................ 57
2.2.1.8 (Optional) IP to City Database for Location-based Statistics ............................................................................ 58
2.2.1.9 Git (code version control) .................................................................................................................................... 59
2.2.2 Backend Installation ............................................................................................................................................ 59
2.3 Installing the Frontend (User Interface)................................................................................ 66
2.3.1 Frontend Requirements....................................................................................................................................... 66
2.3.1.1 UNIX-like OS or Microsoft Windows .................................................................................................................... 66
2.3.1.2 Node.js (v12.x or v14.x) ........................................................................................................................................ 67
2.3.1.3 Yarn (v1.x) ............................................................................................................................................................. 67
2.3.1.4 PM2 (or another Process Manager for Node.js apps) (optional, but recommended for Production).............. 67
2.3.1.5 DSpace 7.x Backend (see above)......................................................................................................................... 67
2.3.2 Frontend Installation ........................................................................................................................................... 67
2.4 What Next?.............................................................................................................................. 70
2.5 Common Installation Issues .................................................................................................. 71
2.5.1 Troubleshoot an error or find detailed error messages..................................................................................... 71
2.5.2 "CORS error" or "Invalid CORS request"............................................................................................................. 71
2.5.3 "403 Forbidden" error with a message that says "Access is denied. Invalid CSRF Token" .............................. 71
2.5.4 Using a Self-Signed SSL Certificate causes the Frontend to not be able to access the Backend .................... 72
2.5.5 My REST API is running under HTTPS, but some of its "link" URLs are switching to HTTP? ............................ 73
2.5.6 Database errors occur when you run ant fresh_install ...................................................................................... 73

3 Upgrading DSpace ............................................................................................75

3.1 Release Notes / Significant Changes..................................................................................... 76
3.2 Upgrading the Backend (Server API)..................................................................................... 77
3.2.1 Backup your DSpace Backend............................................................................................................................. 77
3.2.2 Update Backend Prerequisite Software ............................................................................................................. 78
3.2.3 Upgrading the Backend Steps............................................................................................................................. 78
3.3 Upgrading the Frontend (User Interface) ............................................................................. 85
3.4 Troubleshooting Upgrade Issues .......................................................................................... 86
3.4.1 "Ignored" Flyway Migrations ............................................................................................................................... 86
3.4.2 Manually updating the Metadata Registries ....................................................................................................... 86

4 Using DSpace.....................................................................................................87
4.1 Authentication and Authorization......................................................................................... 87
4.1.1 Authentication Plugins ........................................................................................................................................ 87
4.1.1.1 Stackable Authentication Method(s) .................................................................................................................. 87
4.1.2 Embargo ............................................................................................................................................................. 113
4.1.2.1 What is an Embargo? ......................................................................................................................................... 114
4.1.2.2 DSpace Embargo Functionality......................................................................................................................... 114
4.1.2.3 Configuring and using Embargo in DSpace Submission User Interface ......................................................... 115
4.1.2.4 Technical Specifications.................................................................................................................................... 115
4.1.2.5 Creating Embargoes via Metadata .................................................................................................................... 117
4.1.2.6 Pre-3.0 Embargo Lifter Commands................................................................................................................... 120
4.1.3 Managing User Accounts ................................................................................................................................... 121
4.1.3.1 From the browser............................................................................................................................................... 122
4.1.3.2 From the command line .................................................................................................................................... 122
4.1.3.3 Email Subscriptions ........................................................................................................................................... 125
4.1.4 Request a Copy................................................................................................................................................... 126
4.1.4.1 Introduction ....................................................................................................................................................... 126
4.1.4.2 Requesting a copy using the User Interface ..................................................................................................... 126
4.1.4.3 (Optional) Requesting a copy with Help Desk workflow.................................................................................. 128
4.1.4.4 Email templates ................................................................................................................................................. 131
4.1.4.5 Configuration parameters ................................................................................................................................. 132
4.1.4.6 Selecting Request a Copy strategy via Spring Configuration .......................................................................... 133
4.2 Configurable Entities ........................................................................................................... 134
4.2.1 Introduction ....................................................................................................................................................... 135
4.2.2 Default Entity Models......................................................................................................................................... 135
4.2.2.1 Research Entities................................................................................................................................................ 135
4.2.2.2 Journals.............................................................................................................................................................. 136
4.2.3 Enabling Entities ................................................................................................................................................ 136
4.2.3.1 1. Configure your entity model (optionally) ..................................................................................................... 137
4.2.3.2 2. Import entity model into the database......................................................................................................... 137
4.2.3.3 3. Configuration of community/collection list for Entity types ....................................................................... 137
4.2.3.4 4. Configure Submission Forms for each Entity type ....................................................................................... 139
4.2.3.5 5. Configure Workflow for each Entity type (optionally).................................................................................. 139
4.2.3.6 6. Configure Virtual Metadata to display for related Entities (optionally) ...................................................... 139
4.2.4 Designing your own Entity model ..................................................................................................................... 140
4.2.4.1 Thinking about the object model...................................................................................................................... 141
4.2.4.2 Configuring the object model............................................................................................................................ 141
4.2.4.3 Configuring the metadata fields ....................................................................................................................... 142
4.2.4.4 Configuring the item display pages .................................................................................................................. 142
4.2.4.5 Configuring virtual metadata ............................................................................................................................ 142
4.2.4.6 Configuring discovery ........................................................................................................................................ 142
4.2.4.7 Additional Technical Details.............................................................................................................................. 142
4.3 Curation System................................................................................................................... 143
4.3.1 Tasks ................................................................................................................................................................... 143
4.3.2 Activation............................................................................................................................................................ 143
4.3.3 Task Invocation .................................................................................................................................................. 144
4.3.3.1 On the command line ........................................................................................................................................ 144
4.3.3.2 In the admin UI................................................................................................................................................... 145
4.3.3.3 In workflow......................................................................................................................................................... 146
4.3.3.4 In arbitrary user code......................................................................................................................................... 147
4.3.4 Asynchronous (Deferred) Operation ................................................................................................................. 148
4.3.5 Task Output and Reporting ............................................................................................................................... 148
4.3.5.1 Status Code ........................................................................................................................................................ 148
4.3.5.2 Result String ....................................................................................................................................................... 149
4.3.5.3 Reporting Stream............................................................................................................................................... 149
4.3.6 Task Properties .................................................................................................................................................. 149
4.3.7 Task Parameters ................................................................................................................................................ 150
4.3.8 Scripted Tasks .................................................................................................................................................... 151
4.3.9 Bundled Tasks .................................................................................................................................................... 151
4.3.9.1 Bitstream Format Profiler Task ......................................................................................................................... 152
4.3.9.2 Link Checker Tasks............................................................................................................................................. 152
4.3.9.3 MetadataWebService Task ................................................................................................................................ 153
4.3.9.4 MicrosoftTranslator Task................................................................................................................................... 156
4.3.9.5 NoOp Task .......................................................................................................................................................... 157
4.3.9.6 Required Metadata Task.................................................................................................................................... 157
4.3.9.7 Virus Scan Task................................................................................................................................................... 158
4.4 Exporting Content and Metadata........................................................................................ 160
4.4.1 Linked (Open) Data ............................................................................................................................................ 160
4.4.1.1 Introduction ....................................................................................................................................................... 161
4.4.1.2 Linked (Open) Data Support within DSpace..................................................................................................... 162
4.4.2 SWORDv1 Client ................................................................................................................................................. 176
4.4.2.1 Enabling the SWORD Client ............................................................................................................................... 177
4.4.2.2 Configuring the SWORD Client .......................................................................................................................... 177
4.4.3 Exchanging Content Between Repositories ..................................................................................................... 178
4.4.3.1 Transferring Content via Export and Import .................................................................................................... 179
4.4.3.2 Transferring Items using Simple Archive Format ............................................................................................. 179
4.4.3.3 Transferring Items using OAI-ORE/OAI-PMH Harvester ................................................................................... 179
4.4.4 OAI....................................................................................................................................................................... 179
4.4.4.1 OAI Interfaces ..................................................................................................................................................... 179
4.4.4.2 OAI-PMH Data Provider 2.0 (Internals).............................................................................................................. 186
4.4.4.3 OAI 2.0 Server ..................................................................................................................................................... 189
4.4.5 OpenAIRE4 Guidelines Compliancy .................................................................................................................. 199
4.4.5.1 Loading of Entities and Fields ........................................................................................................................... 199
4.4.5.2 OAI interface....................................................................................................................................................... 200
4.5 Ingesting Content and Metadata......................................................................................... 200
4.5.1 Ingesting HTML Archives.................................................................................................................................... 201
4.5.2 SWORDv2 Server ................................................................................................................................................ 202
4.5.2.1 Enabling SWORD v2 Server ................................................................................................................................ 202
4.5.2.2 Configuring SWORD v2 Server ........................................................................................................................... 202
4.5.2.3 Deposit to SWORDv2 Server .............................................................................................................................. 215
4.5.2.4 Troubleshooting................................................................................................................................................. 216
4.5.3 SWORDv1 Server ................................................................................................................................................ 216
4.5.3.1 Enabling SWORD Server..................................................................................................................................... 217
4.5.3.2 Configuring SWORD Server................................................................................................................................ 217
4.5.3.3 Deposit to SWORD Server .................................................................................................................................. 226
4.5.4 Exporting and Importing Community and Collection Hierarchy..................................................................... 226
4.5.4.1 Community and Collection Structure Importer ............................................................................................... 227
4.5.4.2 Community and Collection Structure Exporter................................................................................................ 229
4.5.5 Importing Items via basic bibliographic formats (Endnote, BibTex, RIS, TSV, CSV) and online services (OAI,
arXiv, PubMed, CrossRef, CiNii) ......................................................................................................................... 229
4.5.5.1 Introduction ....................................................................................................................................................... 230
4.5.5.2 Features.............................................................................................................................................................. 230
4.5.5.3 Submitting starting from external sources....................................................................................................... 230
4.5.5.4 Submitting starting from bibliographic file ...................................................................................................... 230
4.5.5.5 More Information ............................................................................................................................................... 230
4.5.6 Registering Bitstreams via Simple Archive Format .......................................................................................... 231
4.5.6.1 Overview............................................................................................................................................................. 231

4.5.7 Importing and Exporting Items via Simple Archive Format............................................................................. 233
4.5.7.1 Item Importer and Exporter............................................................................................................................... 233
4.5.8 Importing and Exporting Content via Packages............................................................................................... 244
4.5.8.1 Package Importer and Exporter ........................................................................................................................ 244
4.5.9 Configurable Workflow ...................................................................................................................................... 250
4.5.9.1 Introduction ....................................................................................................................................................... 250
4.5.9.2 Data Migration.................................................................................................................................................... 251
4.5.9.3 Configuration ..................................................................................................................................................... 252
4.5.9.4 Authorizations .................................................................................................................................................... 257
4.5.9.5 Database............................................................................................................................................................. 257
4.5.9.6 Additional workflow steps/actions and features ............................................................................................. 259
4.5.10 Submission User Interface................................................................................................................................. 260
4.5.10.1 Default Submission Process .............................................................................................................................. 261
4.5.10.2 Understanding the Submission Configuration Files ........................................................................................ 262
4.5.10.3 Reordering/Removing/Adding Submission Steps............................................................................................ 264
4.5.10.4 Assigning a custom Submission Process to a Collection ................................................................................. 265
4.5.10.5 Custom Metadata-entry Steps for Submission................................................................................................. 265
4.5.10.6 Configuring the File Upload step....................................................................................................................... 271
4.5.10.7 Creating new Submission Steps Programmatically......................................................................................... 274
4.5.10.8 Live Import from external sources .................................................................................................................... 274
4.5.10.9 Simple HTML Fragment Markup........................................................................................................................ 283
4.6 Items and Metadata ............................................................................................................. 284
4.6.1 Authority Control of Metadata Values............................................................................................................... 284
4.6.1.1 work in progress................................................................................................................................................. 284
4.6.1.2 Introduction ....................................................................................................................................................... 284
4.6.1.3 Simple choice management for DSpace submission forms ............................................................................ 285
4.6.1.4 Hierarchical Taxonomies and Controlled Vocabularies................................................................................... 286
4.6.1.5 Authority Control: Enhancing DSpace metadata fields with Authority Keys .................................................. 287
4.6.2 Batch Metadata Editing ..................................................................................................................................... 287
4.6.2.1 Batch Metadata Editing Tool ............................................................................................................................. 288
4.6.2.2 Batch Metadata Editing Configuration ............................................................................................................. 294
4.6.3 DOI Digital Object Identifier............................................................................................................................... 296
4.6.3.1 Persistent Identifier ........................................................................................................................................... 297
4.6.3.2 DOI Registration Agencies ................................................................................................................................. 297
4.6.3.3 Adding support for other Registration Agencies .............................................................................................. 306

4.6.4 Item Level Versioning......................................................................................................................................... 307
4.6.4.1 What is Item Level Versioning? .......................................................................................................................... 307
4.6.4.2 Important warnings - read before enabling ..................................................................................................... 307
4.6.4.3 Enabling Item Level Versioning ......................................................................................................................... 308
4.6.4.4 Initial Requirements .......................................................................................................................................... 308
4.6.4.5 User Interface ..................................................................................................................................................... 309
4.6.4.6 Architecture........................................................................................................................................................ 310
4.6.4.7 Configuration ..................................................................................................................................................... 311
4.6.4.8 Identified Challenges & Known Issues .............................................................................................................. 314
4.6.5 Mapping/Linking Items to multiple Collections ............................................................................................... 315
4.6.5.1 Introduction ....................................................................................................................................................... 315
4.6.5.2 Using the Item Mapper....................................................................................................................................... 315
4.6.5.3 Implications........................................................................................................................................................ 316
4.6.6 Metadata Recommendations ............................................................................................................................ 316
4.6.6.1 Recommended Metadata Fields ....................................................................................................................... 316
4.6.6.2 Local Fields......................................................................................................................................................... 317
4.6.7 Moving Items ...................................................................................................................................................... 317
4.6.7.1 Moving Items via Web UI.................................................................................................................................... 317
4.6.7.2 Moving Items via the Batch Metadata Editor.................................................................................................... 318
4.6.8 ORCID Integration .............................................................................................................................................. 318
4.6.8.1 Introduction ....................................................................................................................................................... 318
4.6.8.2 Use case and high level benefits ....................................................................................................................... 319
4.6.8.3 Enabling the ORCID authority control .............................................................................................................. 319
4.6.8.4 Importing existing authors & keeping the index up to date ............................................................................ 320
4.6.8.5 Configuration ..................................................................................................................................................... 326
4.6.8.6 Adding additional fields under ORCID .............................................................................................................. 327
4.6.8.7 Integration with other systems besides ORCID................................................................................................. 329
4.6.8.8 FAQ...................................................................................................................................................................... 329
4.6.9 PDF Citation Cover Page .................................................................................................................................... 330
4.6.9.1 Configuration settings for Citation Cover Page................................................................................................ 331
4.6.10 Updating Items via Simple Archive Format ...................................................................................................... 333
4.6.10.1 Item Update Tool ............................................................................................................................................... 334
4.7 Managing Community Hierarchy ........................................................................................ 336
4.7.1 Sub-Community Management .......................................................................................................................... 336
4.8 Statistics and Metrics........................................................................................................... 338

– 9
DSpace 7.x Documentation – DSpace 7.x Documentation

4.8.1 SOLR Statistics ................................................................................................................................................... 338

4.8.1.1 What is exactly being logged ?........................................................................................................................... 339
4.8.1.2 Web User Interface Elements ............................................................................................................................ 341
4.8.1.3 Architecture........................................................................................................................................................ 343
4.8.1.4 Configuration settings for Statistics ................................................................................................................. 343
4.8.1.5 Statistics Administration ................................................................................................................................... 349
4.8.1.6 Custom Reporting - Querying SOLR Directly .................................................................................................... 349
4.8.1.7 Managing the City Database File ....................................................................................................................... 350
4.8.1.8 SOLR Statistics Maintenance............................................................................................................................. 351
4.8.2 DSpace Google Analytics Statistics ................................................................................................................... 363
4.8.2.1 Google Analytics Recording............................................................................................................................... 363
4.8.2.2 Google Analytics Reporting ............................................................................................................................... 363
4.8.2.3 Configuration settings for Google Analytics Statistics..................................................................................... 364
4.8.3 Exchange usage statistics with IRUS................................................................................................................. 365
4.8.3.1 Introduction ....................................................................................................................................................... 366
4.8.3.2 Prerequisite ........................................................................................................................................................ 366
4.8.3.3 Configuration ..................................................................................................................................................... 366
4.9 User Interface ....................................................................................................................... 367
4.9.1 User Interface Configuration ............................................................................................................................. 367
4.9.1.1 Overview............................................................................................................................................................. 368
4.9.1.2 Configuration Override ...................................................................................................................................... 368
4.9.1.3 Configuration Reference.................................................................................................................................... 369
4.9.2 User Interface Customization............................................................................................................................ 378
4.9.2.1 Angular Overview ............................................................................................................................................... 378
4.9.2.2 Theme Technologies.......................................................................................................................................... 379
4.9.2.3 Creating a Custom Theme ................................................................................................................................. 379
4.9.2.4 Additional Theming Resources ......................................................................................................................... 386
4.9.3 Discovery ............................................................................................................................................................ 386
4.9.3.1 What is DSpace Discovery.................................................................................................................................. 387
4.9.3.2 Configuration files.............................................................................................................................................. 389
4.9.3.3 General Discovery settings (config/modules/discovery.cfg) ........................................................................... 389
4.9.3.4 Modifying the Discovery User Interface (config/spring/api/discovery.xml) ................................................... 391
4.9.3.5 Discovery Solr Index Maintenance .................................................................................................................... 405
4.9.3.6 Advanced Solr Configuration ............................................................................................................................ 406
4.9.4 Multilingual Support .......................................................................................................................................... 407

4.9.4.1 Multilingual Support on the Backend (REST API)............................................................................................. 407
4.9.4.2 Multilingual Support on the Frontend (UI) ....................................................................................................... 408

5 System Administration ...................................................................................410

5.1 Introduction to DSpace System Administration ................................................................ 410
5.2 AIP Backup and Restore....................................................................................................... 411
5.2.1 Background & Overview .................................................................................................................................... 412
5.2.1.1 How does this differ from traditional DSpace Backups? Which Backup route is better?............................... 412
5.2.1.2 How does this help backup your DSpace to remote storage or cloud services (like DuraCloud)? ................ 415
5.2.1.3 AIPs are Archival Information Packages ........................................................................................................... 415
5.2.1.4 AIP Structure / Format ....................................................................................................................................... 416
5.2.2 Running the Code............................................................................................................................................... 416
5.2.2.1 Exporting AIPs .................................................................................................................................................... 416
5.2.2.2 Ingesting / Restoring AIPs.................................................................................................................................. 418
5.2.2.3 Cleaning up from a failed import ...................................................................................................................... 426
5.2.2.4 Performance considerations ............................................................................................................................. 426
5.2.2.5 Disable User Interaction for Cron...................................................................................................................... 427
5.2.3 Command Line Reference ................................................................................................................................. 427
5.2.3.1 Additional Packager Options............................................................................................................................. 429
5.2.4 Configuration in 'dspace.cfg'............................................................................................................................. 435
5.2.4.1 AIP Metadata Dissemination Configurations.................................................................................................... 435
5.2.4.2 AIP Ingestion Metadata Crosswalk Configurations .......................................................................................... 436
5.2.4.3 AIP Ingestion EPerson Configurations .............................................................................................................. 437
5.2.4.4 AIP Configurations To Improve Ingestion Speed while Validating.................................................................. 437
5.2.5 Common Issues or Error Messages ................................................................................................................... 438
5.2.6 DSpace AIP Format ............................................................................................................................................ 439
5.2.6.1 Makeup and Definition of AIPs .......................................................................................................................... 440
5.2.6.2 AIP Details: METS Structure............................................................................................................................... 442
5.2.6.3 Metadata in METS .............................................................................................................................................. 445
5.3 Ant targets and options ....................................................................................................... 458
5.3.1 Options ............................................................................................................................................................... 458
5.3.2 Targets................................................................................................................................................................ 459
5.4 Command Line Operations ................................................................................................. 460
5.4.1 Executing command line operations ................................................................................................................ 460
5.4.2 Available operations .......................................................................................................................................... 460

5.4.2.1 General use......................................................................................................................................................... 460
5.4.2.2 Legacy statistics ................................................................................................................................................. 462
5.4.2.3 SOLR Statistics ................................................................................................................................................... 462
5.4.3 Database Utilities ............................................................................................................................................... 462
5.4.4 Executing streams of commands ...................................................................................................................... 464
5.5 Handle.Net Registry Support............................................................................................... 464
5.5.1 To install your Handle resolver on the host where DSpace runs..................................................................... 465
5.5.2 To install a Handle resolver on a separate machine ........................................................................................ 466
5.5.3 To install a Handle resolver on a separate machine using template handles ................................................ 468
5.5.4 Updating Existing Handle Prefixes .................................................................................................................... 468
5.6 MediaFilters for Transforming DSpace Content.................................................................. 469
5.6.1 MediaFilters: Transforming DSpace Content.................................................................................................... 469
5.6.1.1 Overview............................................................................................................................................................. 469
5.6.1.2 Available Media Filters ....................................................................................................................................... 469
5.6.1.3 Enabling/Disabling MediaFilters ....................................................................................................................... 471
5.6.1.4 Executing (via Command Line).......................................................................................................................... 471
5.6.1.5 Creating Custom MediaFilters ........................................................................................................................... 472
5.6.1.6 Configuration parameters ................................................................................................................................. 474
5.6.2 ImageMagick Media Filters ................................................................................................................................ 474
5.6.2.1 ImageMagick Media Filters ................................................................................................................................. 474
5.7 Performance Tuning DSpace............................................................................................... 477
5.7.1 Performance Tuning the Backend (REST API) .................................................................................................. 478
5.7.1.1 Give Tomcat More Memory................................................................................................................................ 478
5.7.1.2 Give the Command Line Tools More Memory................................................................................................... 480
5.7.2 Give PostgreSQL Database More Memory ........................................................................................................ 481
5.8 Scheduled Tasks via Cron.................................................................................................... 481
5.8.1 Recommended Cron Settings............................................................................................................................ 482
5.9 Search Engine Optimization................................................................................................ 485
5.9.1 Ensuring your DSpace is indexed ...................................................................................................................... 485
5.9.1.1 Keep your DSpace up to date ............................................................................................................................ 485
5.9.1.2 Ensure your DSpace is visible to search engines.............................................................................................. 486
5.9.1.3 Ensure the sitemaps feature is enabled............................................................................................................ 486
5.9.1.4 Ensure Server-side rendering is enabled in the UI ........................................................................................... 488
5.9.1.5 Create a good robots.txt .................................................................................................................................... 488

5.9.1.6 Ensure Item Metadata appears in the HTML HEAD .......................................................................................... 490
5.9.1.7 Avoid redirecting file downloads to Item landing pages ................................................................................. 491
5.9.1.8 Turn OFF any generation of PDF cover pages................................................................................................... 491
5.9.1.9 In general, OAI-PMH is not useful to Search Engines ....................................................................................... 491
5.9.2 Google Scholar Metadata Mappings ................................................................................................................. 492
5.10 Troubleshooting Information.............................................................................................. 492
5.11 Validating CheckSums of Bitstreams .................................................................................. 493
5.11.1 Checksum Checker............................................................................................................................................. 493
5.11.1.1 Checker Execution Mode ................................................................................................................................... 494
5.11.1.2 Checker Results Pruning.................................................................................................................................... 495
5.11.1.3 Checker Reporting ............................................................................................................................................. 495
5.11.1.4 Cron or Automatic Execution of Checksum Checker ....................................................................................... 496
5.11.1.5 Automated Checksum Checkers' Results ......................................................................................................... 496
5.11.1.6 Database Query.................................................................................................................................................. 497

6 DSpace Development .....................................................................................499

6.1 Advanced Customisation..................................................................................................... 499
6.1.1 Additions module............................................................................................................................................... 499
6.1.2 Server Webapp Overlay ..................................................................................................................................... 499
6.1.3 Rest (Deprecated) Webapp Overlay .................................................................................................................. 500
6.1.4 DSpace Service Manager ................................................................................................................................... 500
6.1.4.1 Introduction ....................................................................................................................................................... 500
6.1.4.2 Configuration ..................................................................................................................................................... 500
6.1.4.3 Architectural Overview ...................................................................................................................................... 502
6.1.4.4 Tutorials.............................................................................................................................................................. 502
6.2 REST API ............................................................................................................................... 502
6.2.1 Overview............................................................................................................................................................. 503
6.2.2 REST Contract / Documentation ....................................................................................................................... 503
6.2.3 REST Configuration............................................................................................................................................ 503
6.2.4 Technical Design ................................................................................................................................................ 505
6.3 REST API v6 (deprecated) .................................................................................................... 506
6.3.1 What is DSpace REST API (v4-v6)....................................................................................................................... 506
6.3.1.1 Installing the REST API (v4-v6)........................................................................................................................... 506
6.3.1.2 REST Endpoints.................................................................................................................................................. 507
6.3.1.3 Model - Object data types.................................................................................................................................. 516
6.3.2 Introduction to Jersey for developers .............................................................................................................. 517
6.3.3 Configuration for DSpace REST......................................................................................................................... 518
6.3.4 Recording Proxy Access by Tools ...................................................................................................................... 518
6.3.5 Additional Information ...................................................................................................................................... 518
6.3.6 REST Based Quality Control Reports ................................................................................................................ 518
6.3.6.1 Tutorial ............................................................................................................................................................... 519
6.3.6.2 Summary ............................................................................................................................................................ 519
6.3.6.3 API Calls Used in these Reports ......................................................................................................................... 519
6.3.6.4 Report Screen Shots .......................................................................................................................................... 520
6.3.6.5 Installation and Configuration .......................................................................................................................... 521
6.3.6.6 REST Reports - Collection Report Screenshots with Annotated API Calls ...................................................... 525
6.3.6.7 REST Reports - Metadata Query Screenshots with Annotated API Calls......................................................... 532
6.3.6.8 REST Reports - Summary of API Calls ............................................................................................................... 537
6.4 Curation Tasks...................................................................................................................... 539
6.4.1 Writing your own tasks ...................................................................................................................................... 539
6.4.2 Task Output and Reporting ............................................................................................................................... 540
6.4.2.1 Status Code ........................................................................................................................................................ 540
6.4.2.2 Result String ....................................................................................................................................................... 541
6.4.2.3 Reporting Stream............................................................................................................................................... 541
6.4.2.4 Accessing task output in calling code ............................................................................................................... 541
6.4.3 Task Properties .................................................................................................................................................. 541
6.4.4 Task Annotations ............................................................................................................................................... 542
6.4.5 Scripted Tasks .................................................................................................................................................... 542
6.4.5.1 Interface ............................................................................................................................................................. 542
6.4.6 Curation tasks in Jython.................................................................................................................................... 543
6.4.6.1 Setting up scripted tasks in Jython................................................................................................................... 543
6.4.6.2 See also............................................................................................................................................................... 545
6.5 Development Tools Provided by DSpace ........................................................................... 545
6.5.1 Date parser tester............................................................................................................................................... 545
6.6 Services to support Alternative Identifiers ......................................................................... 545
6.6.1 Versioning and Identifier Service ...................................................................................................................... 545
6.6.1.1 Versioning Service.............................................................................................................................................. 546
6.6.1.2 Identifier Service ................................................................................................................................................ 547
6.7 Batch Processing.................................................................................................................. 550

7 DSpace Reference ...........................................................................................552

7.1 Configuration Reference...................................................................................................... 552
7.1.1 General Configuration ....................................................................................................................................... 554
7.1.1.1 Configuration File Syntax .................................................................................................................................. 554
7.1.1.2 Configuration Scheme for Reloading and Overriding ...................................................................................... 557
7.1.1.3 Why are there multiple copies of some config files? ........................................................................................ 559
7.1.2 The local.cfg Configuration Properties File ...................................................................................................... 560
7.1.3 The dspace.cfg Configuration Properties File .................................................................................................. 563
7.1.3.1 Main DSpace Configurations ............................................................................................................................. 563
7.1.3.2 DSpace Database Configuration ....................................................................................................................... 564
7.1.3.3 DSpace Email Settings ....................................................................................................................................... 566
7.1.3.4 File Storage......................................................................................................................................................... 571
7.1.3.5 Logging Configuration ....................................................................................................................................... 572
7.1.3.6 General Plugin Configuration ............................................................................................................................ 573
7.1.3.7 Configuring the Search Engine.......................................................................................................................... 573
7.1.3.8 Handle Server Configuration............................................................................................................................. 574
7.1.3.9 Delegation Administration: Authorization System Configuration .................................................................. 575
7.1.3.10 Login as feature.................................................................................................................................................. 583
7.1.3.11 Restricted Item Visibility Settings ..................................................................................................................... 584
7.1.3.12 Proxy Settings .................................................................................................................................................... 584
7.1.3.13 Configuring Media Filters................................................................................................................................... 586
7.1.3.14 Crosswalk and Packager Plugin Settings.......................................................................................................... 588
7.1.3.15 Event System Configuration.............................................................................................................................. 593
7.1.3.16 Embargo ............................................................................................................................................................. 596
7.1.3.17 Checksum Checker Settings .............................................................................................................................. 597
7.1.3.18 Item Export and Download Settings ................................................................................................................. 598
7.1.3.19 Subscription Emails ........................................................................................................................................... 599
7.1.3.20 Hiding Metadata................................................................................................................................................. 599
7.1.3.21 Settings for the Submission Process................................................................................................................. 600
7.1.3.22 Configuring the Sherpa/RoMEO Integration..................................................................................................... 600
7.1.3.23 Configuring Creative Commons License........................................................................................................... 601
7.1.3.24 WEB User Interface Configurations ................................................................................................................... 604
7.1.3.25 Browse Index Configuration .............................................................................................................................. 606
7.1.3.26 Links to Other Browse Contexts ........................................................................................................................ 614
7.1.3.27 Submission License Substitution Variables...................................................................................................... 615
7.1.3.28 Syndication Feed (RSS) Settings ....................................................................................................................... 616
7.1.3.29 OpenSearch Support ......................................................................................................................................... 620
7.1.3.30 Content Inline Disposition Threshold ............................................................................................................... 623
7.1.3.31 Multi-file HTML Document/Site Settings .......................................................................................................... 623
7.1.3.32 Sitemap Settings................................................................................................................................................ 624
7.1.3.33 Authority Control Settings................................................................................................................................. 625
7.1.3.34 Configuring Multilingual Support...................................................................................................................... 626
7.1.3.35 Upload File Settings........................................................................................................................................... 628
7.1.3.36 SFX Server (OpenURL)........................................................................................................................................ 628
7.1.3.37 Controlled Vocabulary Settings ........................................................................................................................ 630
7.1.4 Optional or Advanced Configuration Settings.................................................................................................. 631
7.1.4.1 The Metadata Format and Bitstream Format Registries.................................................................................. 631
7.1.4.2 Configuring Usage Instrumentation Plugins .................................................................................................... 632
7.1.4.3 Behavior of the workflow system...................................................................................................................... 633
7.1.4.4 Recognizing Web Spiders (Bots, Crawlers, etc.) ............................................................................................... 633
7.1.5 Command-line Access to Configuration Properties......................................................................................... 634
7.2 DSpace Item State Definitions............................................................................................. 634
7.3 Directories and Files............................................................................................................. 636
7.3.1 Overview............................................................................................................................................................. 636
7.3.2 Source Directory Layout .................................................................................................................................... 637
7.3.3 Installed Directory Layout ................................................................................................................................. 638
7.3.4 Contents of Server Web Application ................................................................................................................. 638
7.3.5 Log Files.............................................................................................................................................................. 639
7.3.5.1 log4j2.xml File. ................................................................................................................................................... 640
7.4 Metadata and Bitstream Format Registries........................................................................ 640
7.4.1 Default Dublin Core Metadata Registry (DC)..................................................................................................... 640
7.4.2 Dublin Core Terms Registry (DCTERMS) ........................................................................................................... 645
7.4.3 Local Metadata Registry (local) ......................................................................................................................... 648
7.4.4 Default Bitstream Format Registry.................................................................................................................... 649
7.5 Architecture.......................................................................................................................... 652
7.5.1 Overview............................................................................................................................................................. 652
7.5.2 Application Layer ............................................................................................................................................... 653
7.5.2.1 Web User Interface............................................................................................................................................. 654
7.5.2.2 REST API ............................................................................................................................................................. 654
7.5.2.3 OAI-PMH Data Provider...................................................................................................................................... 654

– 16
DSpace 7.x Documentation – DSpace 7.x Documentation

7.5.2.4 RDF / Linked Data Provider................................................................................................................................ 654

7.5.2.5 SWORD v1 Service / Server ................................................................................................................................ 654
7.5.2.6 SWORD v2 Service / Server ................................................................................................................................ 654
7.5.2.7 DSpace Command Line Launcher ..................................................................................................................... 654
7.5.3 Business Logic Layer .......................................................................................................................................... 655
7.5.3.1 Core Classes ....................................................................................................................................................... 656
7.5.3.2 Content Management API.................................................................................................................................. 659
7.5.3.3 Plugin Service..................................................................................................................................................... 664
7.5.3.4 Workflow System ............................................................................................................................................... 670
7.5.3.5 Administration Toolkit....................................................................................................................................... 671
7.5.3.6 E-person/Group Manager .................................................................................................................................. 672
7.5.3.7 Authorization...................................................................................................................................................... 672
7.5.3.8 Handle Manager/Handle Plugin ........................................................................................................................ 674
7.5.3.9 Search ................................................................................................................................................................. 675
7.5.3.10 Browse API.......................................................................................................................................................... 675
7.5.3.11 Checksum checker ............................................................................................................................................. 677
7.5.3.12 OpenSearch Support ......................................................................................................................................... 678
7.5.3.13 Embargo Support............................................................................................................................................... 679
7.5.4 DSpace Services Framework ............................................................................................................................. 681
7.5.4.1 Architectural Overview ...................................................................................................................................... 681
7.5.4.2 Basic Usage ........................................................................................................................................................ 683
7.5.4.3 Providers and Plugins ........................................................................................................................................ 684
7.5.4.4 Core Services ...................................................................................................................................................... 684
7.5.4.5 Examples ............................................................................................................................................................ 685
7.5.4.6 Tutorials.............................................................................................................................................................. 686
7.5.5 Storage Layer ..................................................................................................................................................... 686
7.5.5.1 RDBMS / Database Structure............................................................................................................................. 687
7.5.5.2 Bitstream Store .................................................................................................................................................. 690
7.6 History .................................................................................................................................. 696
7.6.1 Changes in 7.x..................................................................................................................................................... 697
7.6.1.1 Changes in DSpace 7.0....................................................................................................................................... 697
7.6.2 Changes in 6.x..................................................................................................................................................... 697
7.6.2.1 Changes in DSpace 6.3....................................................................................................................................... 698
7.6.2.2 Changes in DSpace 6.2....................................................................................................................................... 702
7.6.2.3 Changes in DSpace 6.1....................................................................................................................................... 704

– 17
DSpace 7.x Documentation – DSpace 7.x Documentation

7.6.2.4 Changes in DSpace 6.0....................................................................................................................................... 708

7.6.3 Changes in 5.x..................................................................................................................................................... 716
7.6.3.1 Changes in DSpace 5.9....................................................................................................................................... 716
7.6.3.2 Changes in DSpace 5.8....................................................................................................................................... 719
7.6.3.3 Changes in DSpace 5.7....................................................................................................................................... 719
7.6.3.4 Changes in DSpace 5.6....................................................................................................................................... 722
7.6.3.5 Changes in DSpace 5.5....................................................................................................................................... 725
7.6.3.6 Changes in DSpace 5.4....................................................................................................................................... 726
7.6.3.7 Changes in DSpace 5.3....................................................................................................................................... 729
7.6.3.8 Changes in DSpace 5.2....................................................................................................................................... 732
7.6.3.9 Changes in DSpace 5.1....................................................................................................................................... 736
7.6.3.10 Changes in DSpace 5.0....................................................................................................................................... 740
7.6.4 Changes in 4.x..................................................................................................................................................... 741
7.6.4.1 Changes in DSpace 4.9....................................................................................................................................... 741
7.6.4.2 Changes in DSpace 4.8....................................................................................................................................... 742
7.6.4.3 Changes in DSpace 4.7....................................................................................................................................... 743
7.6.4.4 Changes in DSpace 4.6....................................................................................................................................... 743
7.6.4.5 Changes in DSpace 4.5....................................................................................................................................... 744
7.6.4.6 Changes in DSpace 4.4....................................................................................................................................... 745
7.6.4.7 Changes in DSpace 4.3....................................................................................................................................... 746
7.6.4.8 Changes in DSpace 4.2....................................................................................................................................... 747
7.6.4.9 Changes in DSpace 4.1....................................................................................................................................... 751
7.6.4.10 Changes in DSpace 4.0....................................................................................................................................... 755
7.6.5 Changes in 3.x..................................................................................................................................................... 755
7.6.5.1 Changes in DSpace 3.6....................................................................................................................................... 756
7.6.5.2 Changes in DSpace 3.5....................................................................................................................................... 756
7.6.5.3 Changes in DSpace 3.4....................................................................................................................................... 756
7.6.5.4 Changes in DSpace 3.3....................................................................................................................................... 757
7.6.5.5 Changes in DSpace 3.2....................................................................................................................................... 758
7.6.5.6 Changes in DSpace 3.1....................................................................................................................................... 759
7.6.5.7 Changes in DSpace 3.0....................................................................................................................................... 759
7.6.6 Changes in 1.8.x.................................................................................................................................................. 760
7.6.6.1 Changes in DSpace 1.8.3.................................................................................................................................... 760
7.6.6.2 Changes in DSpace 1.8.2.................................................................................................................................... 761
7.6.6.3 Changes in DSpace 1.8.1.................................................................................................................................... 761
7.6.6.4 Changes in DSpace 1.8.0.................................................................................................................................... 762
7.6.7 Changes in 1.7.x.................................................................................................................................................. 763
7.6.7.1 Changes in DSpace 1.7.3.................................................................................................................................... 763
7.6.7.2 Changes in DSpace 1.7.2.................................................................................................................................... 763
7.6.7.3 Changes in DSpace 1.7.1.................................................................................................................................... 764
7.6.7.4 Changes in DSpace 1.7.0.................................................................................................................................... 764
7.6.8 Changes in 1.6.x.................................................................................................................................................. 765
7.6.8.1 Changes in DSpace 1.6.2.................................................................................................................................... 765
7.6.8.2 Changes in DSpace 1.6.1.................................................................................................................................... 765
7.6.8.3 Changes in DSpace 1.6.0.................................................................................................................................... 766
7.6.9 Changes in 1.5.x.................................................................................................................................................. 767
7.6.9.1 Changes in DSpace 1.5.2.................................................................................................................................... 767
7.6.9.2 Changes in DSpace 1.5.1.................................................................................................................................... 768
7.6.9.3 Changes in DSpace 1.5.0.................................................................................................................................... 769
7.6.10 Changes in 1.4.x.................................................................................................................................................. 770
7.6.10.1 Changes in DSpace 1.4.1.................................................................................................................................... 770
7.6.10.2 Changes in DSpace 1.4.0.................................................................................................................................... 772
7.6.11 Changes in 1.3.x.................................................................................................................................................. 773
7.6.11.1 Changes in DSpace 1.3.2.................................................................................................................................... 773
7.6.11.2 Changes in DSpace 1.3.1.................................................................................................................................... 773
7.6.11.3 Changes in DSpace 1.3.0.................................................................................................................................... 773
7.6.12 Changes in 1.2.x.................................................................................................................................................. 774
7.6.12.1 Changes in DSpace 1.2.2.................................................................................................................................... 774
7.6.12.2 Changes in DSpace 1.2.1.................................................................................................................................... 775
7.6.12.3 Changes in DSpace 1.2.0.................................................................................................................................... 776
7.6.13 Changes in 1.1.x.................................................................................................................................................. 779
7.6.13.1 Changes in DSpace 1.1.1.................................................................................................................................... 779
7.6.13.2 Changes in DSpace 1.1....................................................................................................................................... 779

1 Introduction
DSpace is an open source software platform that enables organisations to:
• capture and describe digital material using a submission workflow module, or a variety of programmatic
ingest options
• distribute an organisation's digital assets over the web through a search and retrieval system
• preserve digital assets over the long term
This system documentation includes a functional overview of the system (see page 35), which is a good introduction to
the capabilities of the system, and should be readable by non-technical folk. Everyone should read this section first
because it introduces some terminology used throughout the rest of the documentation.
For people actually running a DSpace service, there is an installation guide (see page 53), and sections on
configuration (see page 552) and the directory structure (see page 636). Support options are available in the DSpace
Support Guide [1].
For those interested in the details of how DSpace works, and those potentially interested in modifying the code for
their own purposes, there is a detailed architecture section (see page 652).
Other good sources of information are:
• The DSpace Support Guide [2] lists various places to ask for help, report bugs or security issues, etc.
• The DSpace REST API contract [3] documents the REST API behavior. If you want source code docs,
we also provide JavaDocs for the Java API layer, which can be built by running mvn javadoc:javadoc
• The DSpace Wiki [4] contains stacks of useful information about the DSpace platform and the work people are
doing with it. You are strongly encouraged to visit this site and add information about your own work. Useful
Wiki areas are:
• A list of DSpace resources [5] (Web sites, mailing lists etc.)
• Technical FAQ [6]
• Registry of projects using DSpace [7]
• Guidelines for contributing back to DSpace [8]
• www.dspace.org [9] has announcements and contains useful information about bringing up an instance of
DSpace at your organization.
• The DSpace Community List [10]. Join DSpace-Community to ask questions or join discussions about
non-technical aspects of building and running a DSpace service. It is open to all DSpace users. Ask questions,
share news, and spark discussion about DSpace with people managing other DSpace sites. Watch
DSpace-Community for news of software releases, user conferences, and announcements about DSpace.
• The DSpace Technical List [11]. DSpace developers & fellow community members help answer installation and
technology questions, share information and help each other solve technical problems through the
DSpace-Tech mailing list. Post questions or contribute your expertise to other developers working with the system.
• The DSpace Development List [12]. Join discussions among DSpace developers. The DSpace-Dev listserv is for
DSpace developers working on the DSpace platform to share ideas and discuss code changes to the open
source platform. Join other developers to shape the evolution of the DSpace software. The DSpace
community depends on its members to frame functional requirements and high-level architecture, and to
facilitate programming, testing, and documentation for the project.

1 https://wiki.lyrasis.org/display/DSPACE/Support
2 https://wiki.lyrasis.org/display/DSPACE/Support
3 https://github.com/DSpace/Rest7Contract/blob/main/README.md
4 http://wiki.dspace.org/
5 https://wiki.lyrasis.org/display/DSPACE/DSpaceResources
6 https://wiki.lyrasis.org/display/DSPACE/TechnicalFAQ
7 http://registry.duraspace.org/registry/dspace
8 https://wiki.lyrasis.org/display/DSPACE/Code+Contribution+Guidelines
9 http://www.dspace.org/
10 https://groups.google.com/d/forum/dspace-community
11 https://groups.google.com/d/forum/dspace-tech
12 https://groups.google.com/d/forum/dspace-devel

1.1 Release Notes

Try out DSpace 7.0!

To try out DSpace 7.0 immediately, see Try out DSpace 7 [13]. This includes instructions for a quick-install
via Docker, as well as information on our sandbox/demo site for DSpace 7 [14].
DSpace 7 includes a separate backend (REST API) & frontend (User Interface). Full installation instructions
are available at Installing DSpace (see page 53).
• Download DSpace 7.0 Backend: https://github.com/DSpace/DSpace/releases/tag/dspace-7.0
• Download DSpace 7.0 User Interface: https://github.com/DSpace/dspace-angular/releases/tag/dspace-7.0

Upgrade from any past version of DSpace!

Installing DSpace (see page 53) provides an overview of the DSpace 7 installation process and all prerequisite
software. You should review this before attempting an upgrade, in order to ensure you are running the
required versions of Java, Node, etc.
Upgrading DSpace (see page 75) provides a guide for upgrading from any old version of DSpace to v7. As in
the past, your data migrates automatically, no matter which older version you are running. However, as
the old XMLUI and JSPUI user interfaces are no longer supported, you must switch to using the new User
Interface.

• 7.0 Release Notes (see page 22)
• 7.0 Configurations Removed (see page 24)
• 7.0 Acknowledgments (see page 26)
• Major Contributing Institutions (see page 26)
• Financial Contributors (see page 26)
• Frontend / User Interface Acknowledgments (see page 27)
• Backend / REST API Acknowledgments (see page 27)
• Additional Thanks (see page 28)
• 7.0 Beta 1-5 Release Notes (see page 28)
• 7.0 Beta 5 Release Notes (see page 28)
• 7.0 Beta 4 Release Notes (see page 30)
• 7.0 Beta 3 Release Notes (see page 31)
• 7.0 Beta 2 Release Notes (see page 33)
• 7.0 Beta 1 Release Notes (see page 33)

13 https://wiki.lyrasis.org/display/DSPACE/Try+out+DSpace+7
14 https://demo7.dspace.org/

1.1.1 7.0 Release Notes

Brief videos of some DSpace 7 features are available at https://duraspace.org/dspace/dspace-7/

DSpace 7.0 is the largest release in the history of DSpace software. While retaining the "out-of-the-box" aspects
DSpace is known for, it represents a major evolution of the platform, including:
• A completely new User Interface (demo site [15]). This is the new JavaScript-based frontend, built on
Angular.io [16] (with support for SEO provided by Angular Universal). This new interface is also customizable
via HTML and CSS (Sass) and Bootstrap. For early theme building tips, see User Interface Customization (see
page 378).
• A completely new, fully featured REST API (demo site [17]), provided via a single "server" webapp backend.
This new backend is not only a REST API, but also still supports OAI-PMH, SWORD (v1 or v2) and RDF.
Anything you can do from the User Interface is now also possible in our REST API. See REST API (see page 502)
documentation for more details.
• A newly designed search box. Search from the header of any page (click the magnifying glass). The search
results page now features automatic search highlight, expandable & searchable filters, and optional
thumbnail-based results (click on the “grid” view).
• A new MyDSpace area to manage your submissions & reviews. MyDSpace includes a new drag & drop
area to start a new submission, and lets you easily search your workflow tasks or in-progress submissions to find
what you were working on. (Log in, click on your user profile icon, click “MyDSpace”.) Find workflow tasks to
claim by selecting “All tasks” in the “Show” dropdown.
• A new configurable submission user interface, featuring a one-page, drag & drop submission form. This
form is completely configurable and can be prepopulated by dragging & dropping a metadata file (e.g. ArXiv,
CSV/TSV, Endnote, PubMed, or RIS) or by importing via external APIs (e.g. ORCID, PubMed, Sherpa
Journals or Sherpa Publishers) (video [18]). Local controlled vocabularies are also still supported (video [19]).
See Submission User Interface (see page 260) for more details.
• Optional, new Configurable Entities (see page 134) feature. DSpace now supports “entities”, which are DSpace
Items of a specific ‘type’ which may have relationships to other entities. These entity types and relationships
are configurable, with two examples coming out-of-the-box: a set of Journal hierarchy entities (Journal,
Volume, Issue, Publication) and a set of Research entities (Publication, Project, Person, OrgUnit). For more
information see Configurable Entities (see page 134).
• Dynamic user interface translations (click the globe, and select a language). Interested in adding more
translations? See DSpace 7 Translation - Internationalization (i18n) - Localization (l10n) [20].
• A new Admin sidebar. Log in as an Administrator, and an administrative sidebar appears. Features available
include:
• Quickly create or edit objects from anywhere in the system. Either browse to the object first, or
search for it using the Admin sidebar.
• Processes UI (video [21]) allows Administrators to run backend scripts/processes while monitoring their
progress & completion. (Log in as an Admin, select "Processes" in sidebar.)
• Administrative Search (video [22]) combines retrieval of withdrawn items and private items, together
with a series of quick action buttons.

15 https://demo7.dspace.org/
16 http://Angular.io
17 https://api7.dspace.org/server/
18 https://www.youtube.com/watch?v=mGRDl0khzrQ
19 https://youtu.be/OfEIoxOJK-8
20 https://wiki.lyrasis.org/pages/viewpage.action?pageId=117735441
21 https://www.youtube.com/watch?v=vcsWkWQONkY
22 https://www.youtube.com/watch?v=JV8Rb-9cByo&t=1s

• Administer Active Workflows (video [23]) allows Administrators to see every submission that is currently
in the workflow approval process.
• Bitstream Editing (video [24]) has a drag-and-drop interface for re-ordering bitstreams and makes
adding and editing bitstreams more intuitive.
• Metadata Editing (video [25]) introduces suggest-as-you-type for field name selection of new metadata.
• Login As (Impersonate) another account allows Administrators to debug issues that a specific user is
seeing, or do some work on behalf of that user. (Log in as an Admin, click "Access Control" in sidebar,
click "People". Search for the user account & edit it. Click the "Impersonate EPerson" button. You
will be authenticated as that user until you click "Stop Impersonating EPerson" in the upper right.)
• Improved GDPR alignment (video [26])
• User Agreement, which all authenticated users are required to read and agree to. (Log in for the first time, and
a sample user agreement will display. Once you agree to it, it will not appear again.)
• Cookie Preferences are now available for all users (anonymous or authenticated). A cookie
preference popup appears when first accessing the site. Users are given information on which cookies
are added by DSpace, including a Privacy Statement which can be used to describe how their data is
used.
• User Accounts can be deleted even if they've submitted content in the past.
• Support for OpenAIREv4 Guidelines for Literature Repositories [27] in OAI-PMH (see the new “openaire4”
context in OAI-PMH).
• Search Engine Optimization: Tested and approved by the Google Scholar team, DSpace still includes all the
SEO features you require: a robots.txt, Sitemaps and Google Scholar "citation" tags.
• Video/Image Content Streaming (kindly donated by Zoltán Kanász-Nagy [28] and Dániel Péter Sipos [29] of
Qulto): When enabled, DSpace can now stream videos & view images full screen, using an embedded viewer.
(See the "mediaViewer" settings in the environment.common.ts [30] to enable.)
• Basic Usage Statistics (video [31]) are available for the entire site (see "Statistics" menu at top of homepage),
or specific Communities, Collections or Items (click on that same "Statistics" menu after browsing to a
specific object).
• Additional features are listed in the Beta release notes below. Also, give it a try on our demo site [32] & see
what you discover!

DSpace 7 does not yet include all the features of DSpace 6.x
DSpace 7.0 represents a major evolution of the platform into a new, modern web architecture. This means
there are tons of new and redesigned features in 7.0. However, in order to get this release in your hands
sooner, DSpace Steering decided to delay some 6.x features for later 7.x releases. So, if you don't see a 6.x
feature yet in 7.0, it'll likely be coming soon in a later 7.x release. For a prioritized list of upcoming features
see "What features are coming in a later 7.x release?" on our DSpace Release 7.0 Status [33] page.
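
As a rough illustration of the single "server" webapp described in the feature list above, the following Python sketch builds client-side URLs for the REST API and for the new "openaire4" OAI-PMH context. The server URL is the REST demo site named in these notes. The path layout is an assumption made for illustration (REST API rooted at <server>/api, OAI-PMH contexts at <server>/oai/<context>, and a core/items endpoint); check it against the REST API documentation and your own configuration before relying on it.

```python
from urllib.parse import urlencode

# REST demo backend named in these release notes; substitute your own server URL.
DEMO_SERVER = "https://api7.dspace.org/server"


def rest_url(server: str, path: str = "") -> str:
    """Build a REST API URL. Assumes the API is rooted at <server>/api."""
    base = server.rstrip("/") + "/api"
    return base if not path else base + "/" + path.strip("/")


def oai_url(server: str, context: str, **params: str) -> str:
    """Build an OAI-PMH URL. Assumes contexts live at <server>/oai/<context>."""
    base = f"{server.rstrip('/')}/oai/{context}"
    return base + ("?" + urlencode(params) if params else "")


if __name__ == "__main__":
    print(rest_url(DEMO_SERVER))                # assumed REST API root
    print(rest_url(DEMO_SERVER, "core/items"))  # assumed items endpoint
    print(oai_url(DEMO_SERVER, "openaire4", verb="Identify"))
```

The helpers only assemble strings, so they can be paired with any HTTP client (for example urllib.request from the standard library) to actually issue the requests.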

Additional major changes to be aware of in the 7.x platform (not an exhaustive list):

23 https://www.youtube.com/watch?v=CjH8VS2WDjE
24 https://youtu.be/s1msEKK0f68
25 https://youtu.be/6KVB2ugUgjI
26 https://www.youtube.com/watch?v=bKqFmb6Ywng
27 https://guidelines.openaire.eu/en/latest/literature/index.html
28 https://github.com/kanasznagyzoltan
29 https://github.com/dsipos-dev
30 https://github.com/DSpace/dspace-angular/blob/main/src/environments/environment.common.ts#L273-L276
31 https://youtu.be/T2g74zs_wmM
32 https://demo7.dspace.org/
33 https://wiki.lyrasis.org/display/DSPACE/DSpace+Release+7.0+Status

• XMLUI and JSPUI are no longer supported or distributed with DSpace. All users should immediately
migrate to and utilize the new Angular User Interface [34]. There is no migration path from either the XMLUI or
JSPUI to the new User Interface. However, the new User Interface can be themed via HTML and CSS (SCSS).
• The old REST API ("rest" webapp from DSpace v4.x-6.x) is deprecated and will be removed in v8.x. The
new REST API (see page 502) (provided in the "server" webapp) replaces all functionality available in the older
REST API. If you have tools that rely on the old REST API, you can still (optionally) build & deploy it alongside
the "server" webapp via the "-Pdspace-rest" Maven flag. See REST API v6 (deprecated) (see page 506).
• The Submission Form configuration has changed. The "item-submission.xml" file has changed its
structure, and the "input-forms.xml" has been replaced by a "submission-forms.xml". See Submission User
Interface (see page 260).
• ElasticSearch Usage Statistics have been removed. Please use SOLR Statistics (see page 338) or DSpace
Google Analytics Statistics (see page 363).
• The traditional, 3-step Workflow system has been removed in favor of the Configurable Workflow
System (see page 250). Most users should see no effect or difference. The default setup for this
Configurable Workflow System is identical to the traditional, 3-step workflow ("Approve/Reject",
"Approve/Reject/Edit Metadata", "Edit Metadata").
• The old BTE import framework has been removed in favor of the Live Import Framework (see page 274)
(features of BTE have been ported to Live Import).
• Apache Solr is no longer embedded within the DSpace installer. Solr now MUST be installed as a
separate dependency alongside the DSpace backend. See Installing DSpace (see page 53).
• A large number of old/obsolete configurations were removed. See the "7.0 Configurations Removed" section
below.
• See Upgrading DSpace (see page 75) for more hints on the upgrade from any old version of DSpace to 7.x.
Additional Resources
• Video presentations / Workshops from OR2021 (June 2021) showing off many of the new features &
configurations of DSpace 7: DSpace 7 at OR2021 [35]

1.1.2 7.0 Configurations Removed

With the removal of the JSPUI and XMLUI, a large number of server-side (backend) configurations were made
obsolete and were therefore removed between the 6.x and 7.0 releases. The removed configurations include:
• Within the [dspace]/config/ directory, these are the configuration files which were deleted:
• dc2mods.cfg
• input-forms.xml / dtd (REPLACED BY submission-forms.xml, see Submission User Interface (see
page 260))
• log4j.properties (REPLACED BY log4j2.xml)
• log4j-console.properties (REPLACED BY log4j-console.xml)
• log4j-solr.properties (no replacement as Solr now must be installed separately)
• news-side.html
• news-top.html
• news-xmlui.xml
• workflow.xml (REPLACED BY ./spring/api/workflow.xml)
• xmlui.xconf / dtd
• emails/bte_* (BTE import framework was removed in favor of Live Import from external
sources (see page 274))
• modules/controlpanel.cfg
• modules/elastic-search-statistics.cfg (Elastic Search support was removed in favor of Solr)
• modules/fetchccdata.cfg

34 https://github.com/DSpace/dspace-angular/
35 https://wiki.lyrasis.org/display/DSPACE/DSpace+7+at+OR2021

• modules/publication-lookup.cfg
• spring/api/bte.xml (BTE import framework was removed in favor of Live Import from external
sources (see page 274))
• spring/oai/* (OAI is now part of the backend "server webapp" and needs no separate
configurations)
• spring/xmlui/*
• Within the dspace.cfg main configuration file, the following settings were removed:
• log.init.config (replaced by log4j2.xml)
• webui.submit.blocktheses
• webui.submit.upload.html5
• webui.submission.restrictstep.enableAdvancedForm
• webui.submission.restrictstep.groups
• webui.submit.enable-cc
• webui.browse.thumbnail.*
• webui.item.thumbnail.*
• webui.preview.enabled
• webui.strengths.show
• webui.browse.author-field
• webui.browse.author-limit
• webui.browse.render-scientific-formulas
• recent.submissions.*
• webui.collectionhome.*
• plugin.sequence.org.dspace.plugin.SiteHomeProcessor
• plugin.sequence.org.dspace.plugin.CommunityHomeProcessor
• plugin.sequence.org.dspace.plugin.CollectionHomeProcessor
• plugin.sequence.org.dspace.plugin.ItemHomeProcessor
• plugin.single.org.dspace.app.webui.search.SearchRequestProcessor
• plugin.single.org.dspace.app.xmlui.aspect.administrative.mapper.SearchRequestProcessor
• plugin.named.org.dspace.app.webui.json.JSONRequest
• plugin.single.org.dspace.app.webui.util.StyleSelection
• webui.bitstream.order.*
• webui.itemdisplay.*
• webui.resolver.*
• webui.preferred.identifier
• webui.identifier.*
• webui.mydspace.*
• webui.suggest.*
• webui.controlledvocabulary.enable
• webui.session.invalidate
• itemmap.*
• jspui.*
• xmlui.*
• mirage2.*
A full list of all changes / bug fixes in 7.x is available in the Changes in 7.x (see page 697) section.
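
When upgrading, it can help to check whether a local configuration still sets any of the dspace.cfg settings listed above. The following Python sketch is one possible way to do that, not an official DSpace tool: it takes the text of a properties-style file and reports keys that match a removed setting, treating a trailing ".*" as a wildcard. The pattern list is only a subset of the list above, the function name is hypothetical, and the sample fragment uses made-up values. Only simple "key = value" lines are handled.

```python
from fnmatch import fnmatchcase

# A subset of the removed dspace.cfg settings listed above; extend as needed.
REMOVED_SETTINGS = [
    "log.init.config",
    "webui.submit.blocktheses",
    "webui.browse.thumbnail.*",
    "webui.item.thumbnail.*",
    "recent.submissions.*",
    "webui.mydspace.*",
    "itemmap.*",
    "jspui.*",
    "xmlui.*",
    "mirage2.*",
]


def find_removed_settings(cfg_text, patterns=REMOVED_SETTINGS):
    """Return (line_number, key) pairs for keys that match a removed setting."""
    hits = []
    for lineno, line in enumerate(cfg_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comments ('#' or '!' in properties-style files).
        if not stripped or stripped[0] in "#!":
            continue
        # Simplification: the key is everything before the first '='.
        key = stripped.split("=", 1)[0].strip()
        if any(fnmatchcase(key, pattern) for pattern in patterns):
            hits.append((lineno, key))
    return hits


if __name__ == "__main__":
    # Hypothetical configuration fragment, for illustration only.
    sample = (
        "# sample configuration fragment\n"
        "dspace.name = My Repository\n"
        "webui.browse.thumbnail.show = true\n"
        "xmlui.theme.allowoverrides = false\n"
    )
    for lineno, key in find_removed_settings(sample):
        print(f"line {lineno}: {key} was removed in 7.0")
```

To check a real file, read its contents (for example with pathlib.Path(...).read_text()) and pass the text to find_removed_settings.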

1.1.3 7.0 Acknowledgments

1.1.3.1 Major Contributing Institutions

The following institutions have been major code contributors to the DSpace 7 release (in general):
• Atmire [36] - also hosts/maintains DSpace 7 UI demo at https://demo7.dspace.org
• 4Science [37] - also hosts/maintains DSpace 7 REST demo at https://api7.dspace.org/server/
• FCT [38] / RCAAP [39]

1.1.3.2 Financial Contributors

We gratefully recognize the following institutions who together have generously contributed financially to support
the DSpace 7 staged release program (see DSpace 7 Release Goals [40]), and individuals who devoted time to
fundraising:
• Auburn University
• Cornell University
• Pascal Becker
• Dalhousie University
• Duke University
• ETH Zurich, ETH Library
• Fraunhofer Gesellschaft
• Imperial College London
• Indiana University–Purdue University, Indianapolis
• LYRASIS
• National Library of Finland
• Beate Rajski
• Staats- und Universitätsbibliothek Hamburg – Carl von Ossietzky
• Technische Universität Berlin
• Technische Universität Hamburg (TUHH)
• The DSpace-Konsortium Deutschland
• The Helmut-Schmidt-Universität/Universität der Bundeswehr Hamburg
• The Library Code GmbH
• The Ohio State University
• Texas Digital Library
• University of Arizona
• University of Edinburgh
• University of Kansas
• University of Minnesota
• University of Missouri
• University of Toronto
• World Bank
• ZHAW

36 https://www.atmire.com/
37 https://www.4science.it/
38 https://www.fct.pt/
39 https://www.rcaap.pt/
40 https://wiki.lyrasis.org/display/DSPACE/DSpace+7+Release+Goals

1.1.3.3 Frontend / User Interface Acknowledgments

The following 55 individuals have contributed directly to the new DSpace (Angular) User Interface in this release
(ordered by number of GitHub commits): Giuseppe Digilio (atarix83), Kristof De Langhe (Atmire-Kristof), Lotte
Hofstede (LotteHofstede), Art Lowel (artlowel), Marie Verdonck (MarieVerdonck), Julius Gruber (Flusspferd123),
Yura Bondarenko (ybnd), William Welling (wwelling and wellingWilliam), Yana De Pauw (YanaDePauw), Tim
Donohue (tdonohue), Alessandro Martelli (alemarte), Michael Spalti (mspalti), Jonas Van Goolen (jonas-atmire),
Laura Henze (lhenze), Dániel Péter Sipos (dsipos-dev), Samuel Cambien (samuelcambien), Bruno Roemers (bruno-
atmire), Matteo Perelli (sourcedump), Bram Luyten (bram-atmire), Ben Bosman (benbosman), Terry Brady
(terrywbrady), Raf Ponsaerts (Raf-atmire), Danilo Di Nuzzo (ddinuzzo), Andrea Chiapparelli (andreachiapparelli),
Antoine Snyers (antoine-atmire), Corrado Lombardi (corrad82-4s), Courtney Pattison (courtneypattison), Àlex
Magaz Graça (rivaldi8), Chris Wilper (cwilper), Christian Scheible (christian-scheible), Andrew Wood
(AndrewZWood), Reeta Kuuskoski (reetagithub), Vítor Silvério Rodrigues (vitorsilverio), Alexander Sulfrian
(AlexanderS), muiltje, José Carvalho (josekarvalho), Claudia Jürgen (cjuergen), fernandaruizm, Ivan Masar (helix84),
Paulo Graça (paulo-graca), Philip Vissenaekens (PhilipVis), Nagy Akos (akoscomp), Kevin Van de Velde (KevinVdV),
Sascha Szott (saschaszott), Mohamed Mohideen Abdul Rasheed (mohideen), David Cavrenne (davidatmire), Hardy
Pottinger (hardyoyo), Luca Giamminonni (LucaGiamminonni), Mateus Mercer (MatMercer), Denijs Balodis (Denijsb),
Pascal-Nicolas Becker (pnbecker), Mikus Zarins (MixonZ), marciofoz, Andrea Bollini (abollini), Martin Walk
(MW3000).
Out of the above list, the following individuals contributed a translation of the new interface (ordered
alphabetically by language): Ivan Masar (Czech), Marina Muilwijk (Dutch), Reeta Kuuskoski (Finnish), David
Cavrenne (French), Claudia Jürgen and Sascha Szott (German), Nagy Akos and Transylvanian Museum Society
(Hungarian), Mikus Zarins (Latvian), Vítor Silvério Rodrigues and marciofoz (Brazilian Portuguese), José Carvalho
(Portuguese) and Maria Fernanda Ruiz (Spanish).
The above contributor lists were determined based on historical contributions to the "dspace-angular" project in
GitHub until 7.0: https://github.com/DSpace/dspace-angular/graphs/contributors?from=2016-11-27&to=2021-07-29&type=c

1.1.3.4 Backend / REST API Acknowledgments

The following 55 individuals have contributed directly to the DSpace backend (REST API, Java API, OAI-PMH, etc) in
this release (ordered by number of GitHub commits): Raf Ponsaerts (Raf-atmire), Tim Donohue (tdonohue), Andrea
Bollini (abollini), Michele Boychuk (Micheleboychuk), Mark Wood (mwoodiupui), Marie Verdonck (MarieVerdonck),
Ben Bosman (benbosman), Luigi Andrea Pascarelli (lap82), Terry Brady (terrywbrady), Tom Desair (tomdesair), Yana
De Pauw (YanaDePauw), Chris Wilper (cwilper), Peter Nijs (peter-atmire), Kevin Van de Velde (KevinVdV), Bruno
Roemers (bruno-atmire), Giuseppe Digilio (atarix83), Pasquale Cavallo (pasqualecvl), Jelle Pelgrims (jpelgrims-
atmire), Andrew Wood (AndrewZWood), Samuel Cambien (samuelcambien), Antoine Snyers (antoine-atmire), Kim
Shepherd (kshepherd), Yura Bondarenko (ybnd), Michael Spalti (mspalti), Alessandro Martelli (alemarte), Oliver
Goldschmidt (olli-gold), Jonas Van Goolen (jonas-atmire), Kristof De Langhe (Atmire-Kristof), Alexander Sulfrian
(AlexanderS), Patrick Trottier (PTrottier), Pablo Prieto (ppmdo), Hardy Pottinger (hardyoyo), Pascal-Nicolas Becker
(pnbecker), William Tantzen (tantz001), Paulo Graça (paulo-graca), Luca Giamminonni (LucaGiamminonni), Ivan
Masar (helix84), Hrafn Malmquist (J4bbi), Ian Little (ilittle-cnri), Anis Moubarik (anis-moubarik), Claudia Jürgen
(cjuergen), Alan Orth (alanorth), xuejiangtao, Danilo Di Nuzzo (ddinuzzo), James Creel (jcreel), Marsa Haoua
(marsaoua), Philip Vissenaekens (PhilipVis), Miika Nurminen (minurmin), Bram Luyten (bram-atmire), Christian
Scheible (christian-scheible), Nicholas Woodward (nwoodward), József Marton (jmarton), Mohamed Mohideen
Abdul Rasheed (mohideen), Saiful Amin (saiful-semantic), Àlex Magaz Graça (rivaldi8)
The above contributor list was determined based on contributions to the "DSpace" project in GitHub between 6.0
(after Oct 24, 2016) and 7.0: https://github.com/DSpace/DSpace/graphs/contributors?from=2016-10-24&to=2021-07-29&type=c
Therefore, this list may include individuals who contributed to later 6.x releases, but only if their bug fix was
also applied to 7.0.

1.1.3.5 Additional Thanks

Additional thanks to our DSpace Leadership Group [41] and DSpace Steering Group [42] for their ongoing DSpace
support and advice. Thanks also to LYRASIS [43] for your leadership, collaboration & support in helping to speed up
the development process of DSpace 7.
Thanks also to the various developer & community Working Groups who have worked diligently to help make
DSpace 7 a reality. These include:
• DSpace 7 Working Group [44] - This is the team behind the code
• DSpace 7 Entities Working Group [45] - This team designed & implemented Configurable Entities (see page 134)
• DSpace 7 Marketing Working Group [46] - This team did all our DSpace 7 marketing, press releases &
announcements.
• DSpace Community Advisory Team [47] (DCAT) - This team helped organize/lead the DSpace 7.0 Testathon [48]
(to bang on the system to find any last bugs), and they also provided us with advice on features, etc.
We apologize to any contributor accidentally left off this list. DSpace has such a large, active development
community that we sometimes lose track of all our contributors. Acknowledgments to those left off will be made in
future releases.

1.1.4 7.0 Beta 1-5 Release Notes

DSpace 7.0 was developed via a series of Beta releases from 2020-21. The release notes for each Beta are retained
here for reference.

1.1.4.1 7.0 Beta 5 Release Notes

Released April 2021
Included in Beta 5
• Support for custom theme(s) in UI & accessibility cleanup of base theme. See early information at
DSpace UI Design principles and guidelines [49] and the "themes" section of the environment.common.ts [50]
• Updated the "base" theme (default Bootstrap look & feel) for consistency and better accessibility.
(Additional accessibility analysis will be performed during Testathon)
• Added a simple "dspace" theme (this is the new default theme, and primarily shows an example of
customizing color scheme & homepage)
• Added a "custom" theme folder with all necessary files. These files can be directly modified to create
a completely custom theme.
• Major performance improvements to UI by making better use of caching & smart reloading

41 https://duraspace.org/dspace/community/leadership-group/
42 https://duraspace.org/dspace/community/dspace-steering-group/
43 https://www.lyrasis.org/
44 https://wiki.lyrasis.org/display/DSPACE/DSpace+7+Working+Group
45 https://wiki.lyrasis.org/display/DSPACE/DSpace+7+Entities+Working+Group
46 https://wiki.lyrasis.org/display/DSPACE/DSpace+7+Marketing+Working+Group
47 https://wiki.lyrasis.org/display/cmtygp/DSpace+Community+Advisory+Team
48 https://wiki.lyrasis.org/display/DSPACE/DSpace+Release+7.0+Testathon+Page
49 https://wiki.lyrasis.org/display/DSPACE/DSpace+UI+Design+principles+and+guidelines
50 https://github.com/DSpace/dspace-angular/blob/main/src/environments/environment.common.ts#L230-L267


• Video/Image Content Streaming (Kindly donated by Zoltán Kanász-Nagy51 and Dániel Péter Sipos52 of
Qulto): When enabled, DSpace can now stream videos & view images full screen, using an embedded viewer.
• See the new "mediaViewer" settings in the environment.common.ts53 to enable. Sample54
screenshots of the feature can also be found at https://github.com/DSpace/dspace-angular/issues/885
• New Administrative Features
• Add ability to modify Community/Collection resource policies (i.e. permissions). Edit a Community or
Collection and look at the "Authorizations" tab.
• Add ability to edit/delete user Groups.
• Add private/withdrawn item badges for Administrators to quickly see which Items are private or
withdrawn. These are viewable throughout the browse/search when logged in as an Administrative
user.
• Configurable Entities Improvements
• Entities now report their Entity type in the URL path (e.g. Person entities use URL path /entities/
person/[uuid] and Publication entities use the URL path /entities/publication/[uuid])
• Each Entity type now has a custom Submission form.
• These can be most easily seen in the Demo site. Submitting to the "People" collection55 uses
the "Person" Entity Form. Submitting to the "Articles" collection56 uses the "Publication"
Entity Form. The full list of Entity-specific Collection submission mappings can be found in
the example in item-submission.xml57 (this example is enabled on our Demo Site)
• General performance improvements for Entities. Introduction of "tilted" relationships58 for
Configurable Entities that may have hundreds or thousands of relationships.
• Improvements to Upgrade process
• Added a new Submission form migration script59 to help DSpace 5/6 institutions migrate their old
Submission configuration files to the new/updated format for v7.
• Security fixes
• Added CSRF (Cross-Site Request Forgery)60 protection to REST API. UI (and any other clients) now
must be trusted to log in to the REST API.
• Improved permissions checks/validation in UI for Administrator, Community/Collection
Administrator and Submitter roles.
• Fixed several other security issues auto-reported by LGTM61
• Many bug fixes
• Fixed issue where mapped items were not appearing
• Fixed issue where Handles were not redirecting
• Fixed issues with Sherpa and ORCID integrations
• Fixed several small issues with OpenAIRE v4 support in OAI-PMH
• Fixed many bugs in MyDSpace and Submission UI
• Fixed several bugs in CSV import/export process.
• Fixes to search/browse pagination & breadcrumb trail
• Improved performance of Browse by Community/Collection hierarchy
• LDAP Authentication support is working again
• Many dependency upgrades

51 https://github.com/kanasznagyzoltan
52 https://github.com/dsipos-dev
53 https://github.com/DSpace/dspace-angular/blob/main/src/environments/environment.common.ts#L273-L276
54 https://github.com/DSpace/dspace-angular/pull/888
55 https://demo7.dspace.org/collections/9398affe-a977-4992-9a1d-6f00908a259f
56 https://demo7.dspace.org/collections/282164f5-d325-4740-8dd1-fa4d6d3e7200
57 https://github.com/DSpace/DSpace/blob/main/dspace/config/item-submission.xml#L23-L49
58 https://github.com/DSpace/DSpace/pull/3134
59 https://github.com/DSpace/DSpace/pull/3076
60 https://owasp.org/www-community/attacks/csrf
61 https://lgtm.com/


• Upgraded UI to Angular v10
• Upgraded UI to support Node v12 or v14
• Upgraded Backend to support Solr v8
• Upgraded to support ORCID v3
• Upgraded to support SHERPA v2
• Removal of obsolete features
• Removal of the old BTE framework in favor of Live Import Framework(see page 274) (features of BTE
have been ported to Live Import)
• Removal of Traditional/Basic workflow in favor of Configurable Workflow(see page 250) (default
workflow is still the same as in DSpace 6)
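The Entity URL paths noted under "Configurable Entities Improvements" above follow a simple pattern. A minimal Python sketch of that pattern is shown here; the function name is hypothetical, and lower-casing the type name is an assumption inferred from the two examples given (person, publication), not a documented rule.

```python
import uuid


def entity_path(entity_type, item_uuid):
    """Build a UI path of the form /entities/<type>/<uuid>."""
    # Parsing the UUID rejects malformed identifiers early.
    parsed = uuid.UUID(item_uuid)
    # Assumption: the path segment is the lower-cased Entity type name.
    return f"/entities/{entity_type.lower()}/{parsed}"


# The UUID below is a made-up example value, not a real Item.
print(entity_path("Person", "123e4567-e89b-12d3-a456-426614174000"))
print(entity_path("Publication", "123e4567-e89b-12d3-a456-426614174000"))
```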
Changelog
• All User Interface changes: https://github.com/DSpace/dspace-angular/issues?q=is%3Aclosed+milestone%3A7.0beta5
• All Backend changes: https://github.com/DSpace/DSpace/issues?q=is%3Aclosed+milestone%3A7.0beta5

1.1.4.2 7.0 Beta 4 Release Notes

Released October 2020
Included in Beta 4
• Live Import framework (video62) support has been added to the Submission Form (and the REST API
/api/integration/externalsources endpoint)
• Search an external site for works to import (From your MyDSpace page, click the "Import metadata
from external source" button in upper right). Currently supports Library of Congress Names, ORCID,
PubMed, Sherpa Journals or Sherpa Publishers.
• Drag and drop a bibliographic file into Submission form or MyDSpace page to prepopulate metadata.
Supported formats include ArXiv, CSV (or TSV), Endnote, PubMed, or RIS.
• Controlled Vocabulary support (video63) in Submission Form. Depending on the field configuration, this
can include autocomplete of known terms (see default "Subject Keywords" field), dropdown support (see
default "Type" field) and hierarchical tree views
• Includes support for Controlled Vocabs, Authority Control and "Value-Pairs" (from submission
configs)
• Curation Tasks are now supported via the Admin UI and the Processes UI. (Login as an Admin, select
"Curation Tasks")
• Import / Export metadata from/to CSV (i.e. Batch Metadata Editing64) is now available from the Admin UI.
(Login as an Admin, select "Export" > "Metadata" or "Import" > "Metadata")
• Basic Usage Statistics (video65) are available for the entire site (See "Statistics" menu at top of homepage),
or specific Communities, Collections or Items (Click on that same "Statistics" menu after browsing to a
specific object).
• Support for exchanging usage data with IRUS66 was added. See new "irus-statistics.cfg" and DS-62667
• Improved GDPR Alignment (video68)
• A User Agreement is now required: all authenticated users must read and agree to it. (Login for the
first time, and a sample user agreement will display. After agreeing to it, it will not appear again.)

62 https://youtu.be/irL7RO1HhFU
63 https://youtu.be/OfEIoxOJK-8
64 https://wiki.lyrasis.org/display/DSDOC6x/Batch+Metadata+Editing
65 https://youtu.be/T2g74zs_wmM
66 https://irus.jisc.ac.uk/
67 https://jira.lyrasis.org/browse/DS-626
68 https://www.youtube.com/watch?v=bKqFmb6Ywng


• Cookie Preferences are now available for all users (anonymous or authenticated). A cookie
preference popup appears when first accessing the site. Users are given information on which cookies
are added by DSpace, including a Privacy Statement which can be used to describe how their data is
used.
• User Accounts can be deleted even if they've submitted content in the past.
• When a user is deleted, their past submissions are kept but the submitter field is set to empty
(null).
• Users cannot be deleted if they are the only member of a workflow approval group. Admins
must either delete that group first, or assign another member to the group. This ensures
Workflows are kept even if a user account needs to be deleted.
• Language preferences are now kept for all users (anonymous or logged in). By default, DSpace will try to
use your browser's preferred language (if found in Accept-Language header and a translation in that
language exists). Users can override it by either saving a preferred language in their user profile, or by
manually selecting a different language from the globe icon (upper right).
• IP-based authorization lets you restrict (or provide access to) objects based on the user's IP address. This
uses the same "authentication-ip.cfg" configuration as DSpace 6, allowing you to map IP ranges to specific
DSpace Groups. Users within that IP range are added to the mapped DSpace Group for the remainder of
their session.
• Search Engine Optimization: Addition of robots.txt, Sitemaps and Google Scholar "citation" tags. These
optimizations are being tested by the Google Scholar team and may be improved further in the upcoming
beta 5 release.
• For improved SEO, Sitemaps are now enabled by default and automatically update once per day.
• Security Fixes and Dependency upgrades
• Enhancements to new /api/authz/features endpoint in REST API to provide additional feature-
specific permission checks
• Flyway69 database engine was upgraded to version 6.5.5
• Indexing enhancements (some objects were being indexed twice, see PR#296070)
• Fixes to Shibboleth login
• Additional bug fixes to both UI and REST API
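The IP-based authorization item above describes mapping IP ranges to DSpace Groups. The following Python sketch illustrates that idea only; the group names, ranges and function name are hypothetical, and it does not reproduce the syntax of "authentication-ip.cfg" or the backend's actual implementation.

```python
import ipaddress

# Hypothetical mapping of DSpace Group names to IP ranges (documentation-only addresses).
GROUP_RANGES = {
    "Campus Network": ["192.0.2.0/24", "198.51.100.0/25"],
    "Library Terminals": ["203.0.113.16/28"],
}


def groups_for_ip(address, group_ranges=GROUP_RANGES):
    """Return the names of all groups whose ranges contain the given address."""
    ip = ipaddress.ip_address(address)
    matched = []
    for group, ranges in group_ranges.items():
        # A user joins the group if any one of its ranges contains the address.
        if any(ip in ipaddress.ip_network(r) for r in ranges):
            matched.append(group)
    return matched


print(groups_for_ip("192.0.2.17"))      # inside the first Campus Network range
print(groups_for_ip("198.51.100.200"))  # outside every configured range
```

In this sketch the returned groups would be added to the user's session, matching the "for the remainder of their session" behaviour described above.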
Changelog
• All User Interface changes: https://github.com/DSpace/dspace-angular/issues?q=is%3Aclosed+milestone%3A7.0beta4
• All Backend changes: https://github.com/DSpace/DSpace/issues?q=is%3Aclosed+milestone%3A7.0beta4

1.1.4.3 7.0 Beta 3 Release Notes

Released July 2020
Included in Beta 3
• Processes Admin UI (video71) allows Administrators to run backend scripts/processes while monitoring
their progress & completion. (Login as an Admin, select "Processes" in sidebar)
• Currently supported processes include "index-discovery" (reindex site), "metadata-export" (batch
metadata editing CSV export), and "metadata-import" (batch metadata editing CSV import).
• Manage Account Profile allows logged in users to update their name, language or password. (Login, click on
the account icon, and select "Profile")
• New User Registration (video72) and password reset on the Login Screen

69 https://flywaydb.org/
70 https://github.com/DSpace/DSpace/pull/2960
71 https://www.youtube.com/watch?v=vcsWkWQONkY
72 https://www.youtube.com/watch?v=C2k8vn1rWNE


• Login As (Impersonate) another account allows Administrators to debug issues that a specific user is
seeing, or do some work on behalf of that user. (Login as an Admin, Click "Access Control" in sidebar, Click
"People". Search for the user account & edit it. Click the "Impersonate EPerson" button. You will be
authenticated as that user until you click "Stop Impersonating EPerson" in the upper right.)
• Requires "webui.user.assumelogin=true" to be set in your local.cfg on backend. Also be aware that
you can only "impersonate" a user who is not a member of the Administrator group.
• Manage Authorization Policies of an Item allows Administrators to directly change/update the access
policies of an Item, its Bundles or Bitstreams. (Login as an Admin, Click "Edit" → "Item" in sidebar, and
search for the Item. Click the "Authorization.." button on its "Status" tab.)
• Manage Item Templates of a Collection allows Administrators to create/manage template metadata that all
new Items will start with when submitted to that Collection. (Login as an Admin, Click "Edit" → "Collection"
in sidebar and search for the Collection. Click the "Add" button under "Template Item" to get started.)
• NOTE: unfortunately there's a known bug that while you can create these templates, the submission
process is not yet using them. See https://github.com/DSpace/dspace-angular/issues/748
• Administer Active Workflows (video73) allows Administrators to see every submission that is currently in
the workflow approval process. From there, they have the option to delete Items (if they are no longer
needed), or send them back to the workflow pool (to allow another user to review them). (Login as an
Admin, Click "Administer Workflow" in sidebar)
• CC License step allows your users to select a Creative Commons License as part of their submission. Once
enabled in the "item-submission.xml" (on the backend) it appears as part of the submission form.
• Angular CLI compatibility was added to the User Interface. This allows developers to easily update the User
Interface using standard Angular command-line tools. More information (including tutorials) is available
at https://cli.angular.io/
• English, Latvian, Dutch, German, French, Portuguese, Spanish and Finnish language catalogs
• Numerous bugs were fixed based on early user testing. (Thanks to all who've tested Beta 1 or Beta 2 and
reported your feedback!) Some bugs fixed include:
• Login/Logout session fixes (including compatibility with Firefox and Safari browsers)
• Improved Community/Collection tree browsing performance
• Fixes to editing Communities, Collections and Items. This includes improved drag & drop reordering
of bitstreams in an Item.
• Improved performance of Collection dropdown in submission
• Ability to download restricted bitstreams (previously these would error out)
• Authorization & security improvements in both REST API and UI
• Upgraded all REST API dependencies (Spring, Spring Boot, HAL Browser) and enhanced our automated
testing via additional Integration Tests.
• All features previously mentioned in 7.0 Beta 2 Release Notes(see page 33) and 7.0 Beta 1 Release Notes(see page 33)
below
Learn More: New videos are available highlighting features of the MyDSpace area:
• Manage Submissions in MyDSpace (video74)
• Manage Tasks in MyDSpace (video75)
Changelog
• All User Interface changes: https://github.com/DSpace/dspace-angular/issues?q=is%3Aclosed+milestone%3A7.0beta3
• All Backend changes: https://github.com/DSpace/DSpace/issues?q=is%3Aclosed+milestone%3A7.0beta3

73 https://www.youtube.com/watch?v=CjH8VS2WDjE
74 https://youtu.be/uM1q6W6k6lY
75 https://youtu.be/R0v1WNFDbmI


1.1.4.4 7.0 Beta 2 Release Notes

Released April 2020
Included in Beta 2
• Administrative Search (video76) combines retrieval of withdrawn items and private items, together with a
series of quick action buttons.
• EPeople, Groups and Roles can now be viewed, created and updated.
• Manage Groups (Login as an Admin → Access Control → Groups)
• Manage EPeople (Login as an Admin → Access Control → EPeople)
• Manage Community/Collection Roles (Login as an Admin → Edit Community/Collection → Assign
Roles). Note: this feature is Admin-only in beta 2, but will be extended to Community/Collection
Admins in beta 3.
• Bitstream Editing (video77) has a drag-and-drop interface for re-ordering bitstreams and makes adding and
editing bitstreams more intuitive.
• Metadata Editing (video78) introduces suggest-as-you-type for field name selection of new metadata.
• Update Profile / Change Password (Login → Select user menu in upper right → Profile)
• Shibboleth Authentication79
• Viewing Item Version History (requires upgrading from a 6.x site that includes Item Versioning80)
• Collection and Community (video81) creation and edit pages.
• English, Latvian, Dutch, German, French, Portuguese and Spanish language catalogs
• Security and authorization improvements, including REST API support for hiding specific metadata fields
(metadata.hide property) and upgrades of different software packages on which DSpace 7 depends.
• All features previously mentioned in 7.0 Beta 1 Release Notes(see page 33) below
A full list of all changes / bug fixes in 7.x is available in the Changes in 7.x(see page 697) section.

1.1.4.5 7.0 Beta 1 Release Notes

Released March 2020
New features to look for
• A completely new User Interface (demo site82). This is the new Javascript-based frontend, built on
Angular.io83 (with support for SEO provided by Angular Universal). This new interface is also themeable via HTML and
CSS (SCSS). For early theme building training, see the “Getting Started with DSpace 7 Workshop” from the
North American User Group meeting: slides84 or video recording85.
• A completely new, fully featured REST API (demo site86), provided via a single "server" webapp backend.
This new backend is not only a REST API, but also still supports OAI-PMH, SWORD (v1 or v2) and RDF. See the
REST API's documentation / contract at https://github.com/DSpace/Rest7Contract/blob/master/README.md

76 https://youtu.be/XoYStblYZWY
77 https://youtu.be/s1msEKK0f68
78 https://youtu.be/6KVB2ugUgjI
79 https://wiki.lyrasis.org/display/DSPACE/DSpace+7+Shibboleth+Configuration
80 https://wiki.lyrasis.org/display/DSDOC6x/Item+Level+Versioning
81 https://youtu.be/wnrUOHRS5WA
82 https://dspace7-demo.atmire.com/
83 http://Angular.io
84 https://tinyurl.com/na-dsug2019-dspace7
85 https://umn.zoom.us/recording/play/Mk3gWE1AGEErGg6fLaZ_u-rg_8pEC7MWu_2uWa5l8Q03fKM7ra9sm1-MntSRtNti?startTime=1569247043000
86 https://dspace7.4science.cloud/server/


• A newly designed search box. Search from the header of any page (click the magnifying glass). The search
results page now features automatic search highlight, expandable & searchable filters, and optional
thumbnail-based results (click on the “grid” view).
• A new MyDSpace area, including a new, one-page, drag & drop submission form, a new workflow approval
process, and searchable past submissions. (Login, click on your user profile icon, click “MyDSpace”). Find
workflow tasks to claim by selecting “All tasks” in the “Show” dropdown.
• Dynamic user interface translations (Click the globe, and select a language). Anyone interested in adding
more translations? See DSpace 7 Translation - Internationalization (i18n) - Localization (l10n)87.
• A new Admin sidebar. Login as an Administrator, and an administrative sidebar appears. Use this to create
a new Community/Collection/Item, edit existing ones, and manage registries. (NOTE: A number of
Administrative tools are still missing or greyed out. They will be coming in future Beta releases.)
• Optional, new Configurable Entities(see page 134) feature. DSpace now supports “entities”, which are DSpace
Items of a specific ‘type’ which may have relationships to other entities. These entity types and relationships
are configurable, with two examples coming out-of-the-box: a set of Journal hierarchy entities (Journal,
Volume, Issue, Publication) and a set of Research entities (Publication, Project, Person, OrgUnit). For more
information see “The Power of Configurable Entities” from OR2019: slides88 or video recording89.
Additionally, a test data set featuring both out-of-the-box examples can be used when trying out DSpace 7
via Docker90. Early documentation is available at Configurable Entities(see page 134).
• Support for OpenAIREv4 Guidelines for Literature Repositories91 in OAI-PMH (See the new “openaire4”
context in OAI-PMH).
Additional major changes to be aware of in the 7.x platform (not an exhaustive list):
• XMLUI and JSPUI are no longer supported or distributed with DSpace. All users should immediately
migrate to and utilize the new Angular User Interface92. There is no migration path from either the XMLUI or
JSPUI to the new User interface. However, the new user interface can be themed via HTML and CSS (SCSS).
• The old REST API ("rest" webapp from DSpace v4.x-6.x) is deprecated and will be removed in v8.x. The
new REST API (provided in the "server" webapp) replaces all functionality available in the older REST API. If
you have tools that rely on the old REST API, you can still (optionally) build & deploy it alongside the "server"
webapp via the "-Pdspace-rest" Maven flag.
• The Submission Form configuration has changed. The "item-submission.xml" file has changed its
structure, and the "input-forms.xml" has been replaced by a "submission-forms.xml". For early
documentation see Configuration changes in the submission process93
• ElasticSearch Usage Statistics have been removed. Please use SOLR Statistics(see page 338) or DSpace
Google Analytics Statistics(see page 363).
• The traditional, 3-step Workflow system has been removed in favor of the Configurable Workflow
System(see page 250). For most users, you should see no effect or difference. The default setup for this
Configurable Workflow System is identical to the traditional, 3-step workflow ("Approve/Reject",
"Approve/Reject/Edit Metadata", "Edit Metadata")
• Apache Solr is no longer embedded within the DSpace installer (and has been upgraded to Solr v7).
Solr now MUST be installed as a separate dependency alongside the DSpace backend. See Installing
DSpace(see page 53).
• Some command-line tools/scripts are enabled in the new REST API (e.g. index-discovery): See new
Scripts endpoint: https://github.com/DSpace/Rest7Contract/blob/master/scripts-endpoint.md
• DSpace now has a single, backend "server" webapp to deploy in Tomcat (or similar). In DSpace 6.x and
below, different machine interfaces (OAI-PMH, SWORD v1 or v2, RDF, REST API) were provided via separate
deployable webapps. Now, all those interfaces along with the new REST API are in a single, "server" webapp

87 https://wiki.lyrasis.org/pages/viewpage.action?pageId=117735441
88 https://www.slideshare.net/Atmire/dspace-7-the-power-of-configurable-entities
89 https://lecture2go.uni-hamburg.de/l2go/-/get/v/24831
90 https://wiki.lyrasis.org/display/DSPACE/Try+out+DSpace+7#TryoutDSpace7-InstallviaDocker
91 https://guidelines.openaire.eu/en/latest/literature/index.html
92 https://github.com/DSpace/dspace-angular/
93 https://wiki.lyrasis.org/display/DSPACE/Configuration+changes+in+the+submission+process


built on Spring Boot94. You can now control which interfaces are enabled, and what path they appear on via
configuration (e.g. "oai.enabled=true" and "oai.path=oai"). See https://jira.lyrasis.org/browse/DS-4257
• Configuration(see page 552) has been upgraded to Apache Commons Configuration version 2. For most
users, you should see no effect or difference. No DSpace configuration files were modified during this upgrade
and no configurations or settings were renamed or changed. However, if you locally modified or customized
the [dspace]/config/config-definition.xml (DSpace's Apache Commons Configuration settings),
you will need to ensure those modifications are compatible with Apache Commons Configuration version 2.
See the Apache Commons Configuration's configuration definition file reference95 for more details.
• Handle Server has been upgraded to version 9.x : https://jira.lyrasis.org/browse/DS-4205
• DSpace now has sample Docker images (configurations) which can be used to try out DSpace quickly.
See Try out DSpace 796 ("Install via Docker" section).
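As a rough illustration of the single "server" webapp item above, the Python sketch below reads key=value settings and reports which interfaces are enabled and on which path. Only "oai.enabled" and "oai.path" come from the text above; the "rdf.*" key, the helper names and the fallback-to-name behaviour are assumptions made for the example, and this is not how the backend itself loads its configuration.

```python
def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def enabled_interfaces(props, names):
    """Map each enabled interface name to the path it would be served on."""
    result = {}
    for name in names:
        if props.get(f"{name}.enabled", "false").lower() == "true":
            # Assumption: fall back to the interface name when no path is set.
            result[name] = props.get(f"{name}.path", name)
    return result


SAMPLE = """
# oai.* keys are quoted in the release notes; rdf.enabled is a hypothetical key.
oai.enabled=true
oai.path=oai
rdf.enabled=false
"""

print(enabled_interfaces(parse_properties(SAMPLE), ["oai", "rdf"]))
```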

1.2 Functional Overview

The following sections describe the various functional aspects of the DSpace system.

• Online access to your digital assets(see page 36)

• Full-text search(see page 36)
• Navigation(see page 36)
• Supported file types(see page 37)
• Optimized for Google Indexing(see page 37)
• OpenURL Support(see page 37)
• Support for modern browsers(see page 37)
• Metadata Management(see page 38)
• Metadata(see page 38)
• Choice Management and Authority Control(see page 38)
• Licensing(see page 39)
• Collection and Community Licenses(see page 40)
• License granted by the submitter to the repository(see page 40)
• Creative Commons Support for DSpace Items(see page 40)
• Persistent URLs and Identifiers(see page 40)
• Handles(see page 40)
• Bitstream 'Persistent' Identifiers(see page 41)
• Getting content into DSpace(see page 42)
• The Manual DSpace Submission and Workflow System(see page 42)
• Workflow Steps(see page 43)
• Submission Workflow in DSpace(see page 43)
• Command line import facilities(see page 44)
• Registration for externally hosted files(see page 44)
• SWORD Support(see page 44)
• Getting content out of DSpace(see page 44)
• OAI Support(see page 44)
• Command Line Export Facilities(see page 45)
• Packager Plugins(see page 45)
• Crosswalk Plugins(see page 45)
• Supervision and Collaboration(see page 46)
• User Management(see page 46)

94 https://spring.io/projects/spring-boot
95 https://commons.apache.org/proper/commons-configuration/userguide/howto_combinedbuilder.html#Configuration_definition_file_reference
96 https://wiki.lyrasis.org/display/DSPACE/Try+out+DSpace+7


• User Accounts (E-Person)(see page 46)

• Subscriptions(see page 46)
• Groups(see page 47)
• Access Control(see page 47)
• Authentication(see page 47)
• Authorization(see page 47)
• Usage Metrics(see page 48)
• Item, Collection and Community Usage Statistics(see page 48)
• System Statistics(see page 49)
• Digital Preservation(see page 50)
• Checksum Checker(see page 50)
• System Design(see page 50)
• Data Model(see page 50)
• Amazon S3 Support(see page 52)

1.2.1 Online access to your digital assets

The online presentation of your content in an organized tree of Communities and Collections is a main feature of
DSpace. Users can access pages for individual items, which present metadata descriptions together with files available
for download. The structure is summarised in this diagram.

1.2.1.1 Full-text search

DSpace can process uploaded text-based content for full-text searching. This means that not only is the metadata
you provide for a given file searchable, but all of the file's contents are indexed as well. This allows users to
search for specific keywords that only appear in the actual content and not in the provided description.
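The effect of indexing file contents alongside metadata can be shown with a toy inverted index. This Python sketch is only an illustration of the concept, with hypothetical item data and function names; it says nothing about how DSpace's search index is actually built.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lower-case alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(items):
    """Index both the description and the extracted full text of each item."""
    index = defaultdict(set)
    for item_id, item in items.items():
        for term in tokenize(item["description"]) + tokenize(item["fulltext"]):
            index[term].add(item_id)
    return index


def search(index, keyword):
    """Return the ids of items containing the keyword, in sorted order."""
    return sorted(index.get(keyword.lower(), set()))


ITEMS = {
    "item-1": {"description": "Annual report 2020",
               "fulltext": "Revenue grew because of photovoltaic installations."},
    "item-2": {"description": "Annual report 2021",
               "fulltext": "Revenue was flat."},
}

index = build_index(ITEMS)
print(search(index, "photovoltaic"))  # ['item-1']: found only in the file contents
print(search(index, "annual"))        # ['item-1', 'item-2']: found in the descriptions
```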

1.2.1.2 Navigation
DSpace allows users to find their way to relevant content in a number of ways, including:
• Searching for one or more keywords in metadata or extracted full-text
• Faceted browsing through any field provided in the item description.
• Through external reference, such as a Handle


• By clicking on Community and Collection titles to explore their contents

Another important mechanism for discovery in DSpace is the browse. This is the process whereby the user views a
particular index, such as the title index, and navigates around it in search of interesting items. The browse
subsystem provides a simple API for achieving this by allowing a caller to specify an index, and a subsection of that
index. The browse subsystem then discloses the portion of the index of interest. Indices that may be browsed are
item title, item issue date, item author, and subject terms. Additionally, the browse can be limited to items within a
particular collection or community.
For more information on Search/Browse functionality in DSpace, see Discovery(see page 386).
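The browse behaviour described above (pick an index, request a subsection of it, optionally limit it to a collection) can be sketched in a few lines of Python. The function, its parameters and the sample items are hypothetical and stand in for the idea only; they are not the browse subsystem's real API.

```python
def browse(items, index, offset=0, limit=2, scope=None):
    """Return one slice of a browse index, optionally limited to a collection."""
    if scope is not None:
        items = [i for i in items if i["collection"] == scope]
    # Order the whole index case-insensitively, then disclose only the requested portion.
    ordered = sorted(items, key=lambda i: i[index].lower())
    return ordered[offset:offset + limit]


ITEMS = [
    {"title": "Zebra Stripes", "author": "Adams, A.", "issued": "2019-05-01", "collection": "theses"},
    {"title": "apple Orchards", "author": "Baker, B.", "issued": "2021-01-15", "collection": "articles"},
    {"title": "Marine Biology", "author": "Clark, C.", "issued": "2020-09-30", "collection": "theses"},
]

# First two entries of the title index across all collections.
print([i["title"] for i in browse(ITEMS, "title")])
# Second entry of the issue-date index, limited to the "theses" collection.
print([i["title"] for i in browse(ITEMS, "issued", offset=1, scope="theses")])
```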

1.2.1.3 Supported file types

DSpace can accommodate any type of uploaded file. While DSpace is best known for hosting text-based materials
including scholarly communication and electronic theses and dissertations (ETDs), there are many stakeholders in
the community who use DSpace for multimedia, data and learning objects. While some restrictions apply, DSpace
can even serve as a store for HTML Archives(see page 201).
Files that have been uploaded to DSpace are often referred to as "Bitstreams". The reason for this is mainly historical
and traces back to the technical implementation. After ingestion, files in DSpace are stored on the file system as a
stream of bits without the file extension.
By default, DSpace only recognizes specific file types, as defined in its Bitstream Format Registry. The default
Bitstream Format Registry(see page 640) recognizes many common file formats, but it can be enhanced at your local
institution via the Admin User Interface.
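To make the registry idea concrete, here is a minimal Python sketch of a format lookup with a fallback for unrecognized files. The two entries, the function names and the assumption that formats are matched by file extension are illustrative only; the real default registry is much larger and is managed through the Admin User Interface rather than in code.

```python
import os

# Illustrative entries only: extension -> (description, MIME type).
DEFAULT_REGISTRY = {
    "pdf": ("Adobe PDF", "application/pdf"),
    "png": ("PNG", "image/png"),
}
UNKNOWN = ("Unknown", "application/octet-stream")


def identify_format(filename, registry=DEFAULT_REGISTRY):
    """Look up a file's format by extension; unrecognized files fall back to UNKNOWN."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return registry.get(extension, UNKNOWN)


def register_format(registry, extension, description, mime_type):
    """Add a locally defined format, as an institution might when enhancing its registry."""
    registry[extension.lower()] = (description, mime_type)


local_registry = dict(DEFAULT_REGISTRY)
print(identify_format("results.csv", local_registry))  # falls back to UNKNOWN
register_format(local_registry, "csv", "Comma-Separated Values", "text/csv")
print(identify_format("results.csv", local_registry))  # now recognized
```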

1.2.1.4 Optimized for Google Indexing

The Duraspace community maintains a close relationship with Google to ensure optimal indexing of DSpace content,
primarily in the Google Search and Google Scholar products. For the purpose of Google Scholar indexing, DSpace
adds specific metadata to the page head tags to facilitate indexing in Scholar. More information can be found
on the Google Scholar Metadata Mappings page(see page 492). Popular DSpace repositories often generate over 60%
of their visits from Google pages.

1.2.1.5 OpenURL Support

DSpace supports the OpenURL protocol97 in a rather simple fashion. If your institution has an SFX server98, DSpace
will display an OpenURL link on every item page, automatically using the Dublin Core metadata. Additionally,
DSpace can respond to incoming OpenURLs. Presently it simply passes the information in the OpenURL to the
search subsystem. A list of results is then displayed, which usually gives the relevant item (if it is in DSpace) at the
top of the list.
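Both directions described above can be sketched briefly in Python: building an outgoing OpenURL from Dublin Core values, and reducing an incoming OpenURL to a keyword query. The resolver address and function names are hypothetical, and the choice of the OpenURL 1.0 key/encoded-value Dublin Core format is an assumption for the example; the parameters DSpace actually emits may differ.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical resolver address; a real SFX server URL would come from local configuration.
RESOLVER_BASE = "https://sfx.example.edu/sfx_local"


def build_openurl(dc, base=RESOLVER_BASE):
    """Build an OpenURL 1.0 (key/encoded-value) link from a flat dict of Dublin Core values."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:dc",
        "rft.title": dc.get("dc.title", ""),
        "rft.creator": dc.get("dc.contributor.author", ""),
        "rft.date": dc.get("dc.date.issued", ""),
    }
    # Drop empty values so the link only carries metadata that is actually known.
    return base + "?" + urlencode({k: v for k, v in params.items() if v})


def openurl_to_query(url):
    """Reduce an incoming OpenURL to a plain keyword query for a search subsystem."""
    values = parse_qs(urlparse(url).query)
    return " ".join(v[0] for k, v in values.items() if k.startswith("rft."))


link = build_openurl({"dc.title": "Marine Biology", "dc.date.issued": "2020"})
print(link)
print(openurl_to_query(link))  # Marine Biology 2020
```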

1.2.1.6 Support for modern browsers

The DSpace developer community aims to rely on modern web standards and well tested libraries where possible.
As a rule of thumb, users can expect that the DSpace web interfaces work on modern web browsers. DSpace
developers routinely test new interface developments on recent versions of Firefox, Safari, Chrome and Microsoft
Edge. Because of fast moving, automatic, incremental updates to these browsers, support is no longer targeted at
specific versions of these browsers. (Please note that we do not recommend or support using Internet Explorer as it
is considered "end of life" by Microsoft.)

97 https://en.wikipedia.org/wiki/OpenURL
98 http://www.exlibrisgroup.com/category/SFXOverview


1.2.2 Metadata Management

1.2.2.1 Metadata
Broadly speaking, DSpace holds three sorts of metadata about archived content:
• Descriptive Metadata: DSpace can support multiple flat metadata schemas for describing an item. A
qualified Dublin Core metadata schema loosely based on the Library Application Profile99 set of elements
and qualifiers is provided by default. This default schema is described in more detail in Metadata and
Bitstream Format Registries(see page 640). However, you can configure multiple schemas and select metadata
fields from a mix of configured schemas to describe your items. Other descriptive metadata about items (e.g.
metadata described in a hierarchical schema) may be held in serialized bitstreams.
• Administrative Metadata: This includes preservation metadata, provenance and authorization policy data.
Most of this is held within DSpace's relational DBMS schema. Provenance metadata (prose) is stored in
Dublin Core records. Additionally, some other administrative metadata (for example, bitstream byte sizes
and MIME types) is replicated in Dublin Core records so that it is easily accessible outside of DSpace.
• Structural Metadata: This includes information about how to present an item, or bitstreams within an item,
to an end-user, and the relationships between constituent parts of the item. As an example, consider a thesis
consisting of a number of TIFF images, each depicting a single page of the thesis. Structural metadata would
include the fact that each image is a single page, and the ordering of the TIFF images/pages. Structural
metadata in DSpace is currently fairly basic; within an item, bitstreams can be arranged into separate
bundles as described above. A bundle may also optionally have a primary bitstream. This is currently used by
the HTML support to indicate which bitstream in the bundle is the first HTML file to send to a browser. In
addition to some basic technical metadata, a bitstream also has a 'sequence ID' that uniquely identifies it
within an item. This is used to produce a 'persistent' bitstream identifier for each bitstream. Additional
structural metadata can be stored in serialized bitstreams, but DSpace does not currently understand this
natively.
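As an illustration of the flat, qualified schemas described above, the sketch below splits a metadata field name into its schema, element and optional qualifier. It assumes the dotted `schema.element.qualifier` notation (for example `dc.contributor.author`); the `MetadataField` type and `parse_field` helper are hypothetical names used only for this example and are not part of the DSpace API.

```python
from typing import NamedTuple, Optional


class MetadataField(NamedTuple):
    """One field of a flat schema: schema, element, optional qualifier."""
    schema: str
    element: str
    qualifier: Optional[str]


def parse_field(name: str) -> MetadataField:
    """Split 'schema.element[.qualifier]' into its parts."""
    parts = name.split(".")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"not a valid field name: {name!r}")
    qualifier = parts[2] if len(parts) == 3 else None
    return MetadataField(parts[0], parts[1], qualifier)


print(parse_field("dc.contributor.author"))
print(parse_field("dc.title"))
```

Because the schema name is part of the field name, fields from several configured schemas can be mixed on one item without clashing.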

1.2.2.2 Choice Management and Authority Control

This is a configurable framework that lets you define plug-in classes to control the choice of values for specified
DSpace metadata fields. It also lets you configure fields to include "authority" values along with the textual
metadata value. The choice-control system includes a user interface in both the Configurable Submission UI and
the Admin UI (edit Item pages) that assists the user in choosing metadata values.
Introduction and Motivation
Definitions
Choice Management
This is a mechanism that generates a list of choices for a value to be entered in a given metadata field. Depending
on your implementation, the exact choice list might be determined by a proposed value or query, or it could be a
fixed list that is the same for every query. It may also be closed (limited to choices produced internally) or open,
allowing the user-supplied query to be included as a choice.
Authority Control
This works in addition to choice management to supply an authority key along with the chosen value, which is also
assigned to the Item's metadata field entry. Any authority-controlled field is also inherently choice-controlled.
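The difference between a closed and an open choice list can be sketched in a few lines. This is only an illustration of the definition above, not the DSpace plug-in interface: the `choices` function, its substring matching rule and the sample names are all assumptions.

```python
from typing import List


def choices(query: str, vocabulary: List[str], closed: bool = True) -> List[str]:
    """Return vocabulary entries that contain the query (case-insensitive).

    A closed list offers only internally produced choices. An open list
    also offers the user-supplied query itself as a choice.
    """
    text = query.strip()
    matches = [v for v in vocabulary if text.lower() in v.lower()]
    if not closed and text and text not in matches:
        matches.append(text)
    return matches


names = ["Smith, John", "Smith, Jane", "Jones, Ann"]
print(choices("smith", names))                    # closed list
print(choices("Smithers", names, closed=False))   # open list
```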
About Authority Control

99 http://www.dublincore.org/documents/library-application-profile/


The advantages we seek from an authority controlled metadata field are:

1. There is a simple and positive way to test whether two values are identical, by comparing authority
keys.
• Comparing plain text values can give false positive results e.g. when two different people have a
name that is written the same.
• It can also give false negative results when the same name is written different ways, e.g. "J. Smith"
vs. "John Smith".
2. Help in entering correct metadata values. The submission and admin UIs may call on the authority to
check a proposed value and list possible matches to help the user select one.
3. Improved interoperability. By sharing a name authority with another application, your DSpace can
interoperate more cleanly with other applications.
• For example, a DSpace institutional repository sharing a naming authority with the campus social
network would let the social network construct a list of all DSpace Items matching the shared author
identifier, rather than by error-prone name matching.
• When the name authority is shared with a campus directory, DSpace can look up the email address of
an author to send automatic email about works of theirs submitted by a third party. That author does
not have to be an EPerson.
4. Authority keys are normally invisible in the public web UIs. They are only seen by administrators editing
metadata. The value of an authority key is not expected to be meaningful to an end-user or site visitor.
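The first advantage, testing identity by key rather than by text, can be sketched as follows. The `MetadataValue` type, the `same_entity` helper and the key strings are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Optional


class MetadataValue(NamedTuple):
    """A textual metadata value with an optional authority key."""
    text: str
    authority: Optional[str] = None


def same_entity(a: MetadataValue, b: MetadataValue) -> Optional[bool]:
    """Compare by authority key; return None when a key is missing."""
    if a.authority is None or b.authority is None:
        return None  # plain text alone cannot decide reliably
    return a.authority == b.authority


short = MetadataValue("J. Smith", "key-001")
full = MetadataValue("John Smith", "key-001")
namesake = MetadataValue("John Smith", "key-002")

print(same_entity(short, full))      # same key, different spelling
print(same_entity(full, namesake))   # same spelling, different key
```

Returning `None` for values without a key is a design choice of this sketch: it keeps "unknown" distinct from "different", which plain text comparison cannot do.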
Authority control is different from the controlled vocabulary of keywords already implemented in the submission
UI:
1. Authorities are external to DSpace. The source of authority control is typically an external database or
network resource.
• Plug-in architecture makes it easy to integrate new authorities without modifying any core code.
2. This authority proposal impacts all phases of metadata management.
• The keyword vocabularies are only for the submission UI.
• Authority control is asserted everywhere metadata values are changed, including unattended/batch
submission, SWORD package submission, and the administrative UI.
Some Terminology

Authority          An authority is a source of fixed values for a given domain, each unique
                   value identified by a key. For example, the OCLC LC Name Authority Service.

Authority Record   The information associated with one of the values in an authority; may
                   include alternate spellings and equivalent forms of the value, etc.

Authority Key      An opaque, hopefully persistent, identifier corresponding to exactly
                   one record in the authority.

1.2.3 Licensing
DSpace offers support for licenses at different levels.


1.2.3.1 Collection and Community Licenses

Each community and collection in the hierarchy of a DSpace repository can contain its own license terms. This
allows an institution to use the repository both for collections where certain rights are reserved and others from
which the content may be accessed and distributed more freely.

1.2.3.2 License granted by the submitter to the repository

At the end of the manual submission process, the submitter is asked to grant the repository service an appropriate
distribution license. This license can be easily customized on a per collection basis. In its most common form, the
submitter grants to the repository service a non-exclusive distribution license, meaning that they officially give the
repository service the right to share their work with the world.

1.2.3.3 Creative Commons Support for DSpace Items

DSpace provides support for Creative Commons licenses to be attached to items in the repository. They represent
an alternative to traditional copyright. To learn more about Creative Commons, visit their website100. Support for
license selection is controlled by a site-wide configuration option, and since license selection involves interaction
with the Creative Commons website, additional parameters may be configured to work with a proxy server. If the
option is enabled, users may select a Creative Commons license during the submission process, or choose not to
assign a Creative Commons license at all. If a selection is made, metadata and a copy of the license in RDF
format are stored along with the item in the repository. There is also an indication (text and a Creative Commons
icon) on the item display page of the web user interface when an item is licensed under Creative Commons. The RDF
license is embedded in the HTML page of the item to allow machine understanding of the licensing terms. For
specifics of how to configure and use Creative Commons licenses, see the configuration section(see page 601).

1.2.4 Persistent URLs and Identifiers

1.2.4.1 Handles
Researchers require a stable point of reference for their works. The simple evolution from sharing of citations to
emailing of URLs broke when Web users learned that sites can disappear or be reconfigured without notice, and
that their bookmark files containing critical links to research results couldn't be trusted in the long term. To help
solve this problem, a core DSpace feature is the creation of a persistent identifier for every item, collection and
community stored in DSpace. To persist identifiers, DSpace requires a storage- and location-independent
mechanism for creating and maintaining identifiers. DSpace uses the CNRI Handle System101 for creating these
identifiers. The rest of this section assumes a basic familiarity with the Handle system.
DSpace uses Handles primarily as a means of assigning globally unique identifiers to objects. Each site running
DSpace needs to obtain a unique Handle 'prefix' from CNRI, so we know that if we create identifiers with that prefix,
they won't clash with identifiers created elsewhere.
Presently, Handles are assigned to communities, collections, and items. Bundles and bitstreams are not assigned
Handles, since over time, the way in which an item is encoded as bits may change, in order to allow access with
future technologies and devices. Older versions may be moved to off-line storage as a new standard becomes de
facto. Since it's usually the item that is being preserved, rather than the particular bit encoding, it only makes sense

100 http://creativecommons.org/
101 http://www.handle.net/


to persistently identify and allow access to the item, and allow users to access the appropriate bit encoding from
there.
Of course, it may be that a particular bit encoding of a file is explicitly being preserved; in this case, the bitstream
could be the only one in the item, and the item's Handle would then essentially refer just to that bitstream. The
same bitstream can also be included in other items, and thus would be citable as part of a greater item, or
individually.
The Handle system also features a global resolution infrastructure; that is, an end-user can enter a Handle into any
service (e.g. Web page) that can resolve Handles, and the end-user will be directed to the object (in the case of
DSpace, community, collection or item) identified by that Handle. In order to take advantage of this feature of the
Handle system, a DSpace site must also run a 'Handle server' that can accept and resolve incoming resolution
requests. All the code for this is included in the DSpace source code bundle.
Handles can be written in two forms:

hdl:1721.123/4567
http://hdl.handle.net/1721.123/4567

The above represent the same Handle. The first is possibly more convenient to use only as an identifier; however, by
using the second form, any Web browser becomes capable of resolving Handles. An end-user need only access this
form of the Handle as they would any other URL. It is possible to enable some browsers to resolve the first form of
Handle as if they were standard URLs using CNRI's Handle Resolver plug-in102, but since the first form can always be
simply derived from the second, DSpace displays Handles in the second form, so that it is more useful for end-users.
It is important to note that DSpace uses the CNRI Handle infrastructure only at the 'site' level. For example, in the
above example, the DSpace site has been assigned the prefix '1721.123'. It is still the responsibility of the DSpace
site to maintain the association between a full Handle (including the '4567' local part) and the community,
collection or item in question.
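The two written forms of a Handle, and the split into site prefix and local part, can be illustrated with a short sketch. The helper names (`to_url`, `to_hdl`, `split_handle`) are hypothetical; only the `hdl:` scheme and the `http://hdl.handle.net/` resolver address come from the text above.

```python
from typing import Tuple

RESOLVER = "http://hdl.handle.net/"


def to_url(handle: str) -> str:
    """Turn 'hdl:prefix/local' (or a bare 'prefix/local') into the URL form."""
    if handle.startswith("hdl:"):
        handle = handle[len("hdl:"):]
    return RESOLVER + handle


def to_hdl(url: str) -> str:
    """Derive the 'hdl:' form from the resolver URL form."""
    if not url.startswith(RESOLVER):
        raise ValueError(f"not a resolver URL: {url!r}")
    return "hdl:" + url[len(RESOLVER):]


def split_handle(handle: str) -> Tuple[str, str]:
    """Return (site prefix, local part) for either written form."""
    bare = handle
    for lead in ("hdl:", RESOLVER):
        if bare.startswith(lead):
            bare = bare[len(lead):]
    prefix, _, local = bare.partition("/")
    if not prefix or not local:
        raise ValueError(f"not a Handle: {handle!r}")
    return prefix, local


print(to_url("hdl:1721.123/4567"))
print(split_handle("http://hdl.handle.net/1721.123/4567"))
```

The prefix is the part assigned to the site; mapping the local part to a community, collection or item remains the job of the DSpace site itself.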

1.2.4.2 Bitstream 'Persistent' Identifiers

Similar to Handles for DSpace items, bitstreams also have 'persistent' identifiers. They are more volatile than
Handles, since if the content is moved to a different server or organization, they will no longer work (hence the
quotes around 'persistent'). However, they are more easily persisted than the simple URLs based on database
primary keys that were previously used. This means that external systems can more reliably refer to specific bitstreams stored
in a DSpace instance.
Each bitstream has a sequence ID, unique within an item. This sequence ID is used to create a persistent ID, of the
form:
<dspace url>/bitstream/<handle>/<sequence ID>/<filename>
For example:

https://dspace.myu.edu/bitstream/123.456/789/24/foo.html

The above refers to the bitstream with sequence ID 24 in the item with the Handle hdl:123.456/789103.
The foo.html is really just there as a hint to browsers: Although DSpace will provide the appropriate MIME type,
some browsers only function correctly if the file has an expected extension.
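A minimal sketch of how such an identifier is assembled from its parts is given below. The `bitstream_url` helper is a hypothetical name, and percent-encoding the filename is an assumption made here so that names containing spaces still yield a valid URL.

```python
from urllib.parse import quote


def bitstream_url(dspace_url: str, handle: str, sequence_id: int, filename: str) -> str:
    """Build <dspace url>/bitstream/<handle>/<sequence ID>/<filename>."""
    base = dspace_url.rstrip("/")
    # The filename is only a hint for browsers; it is percent-encoded here
    # (an assumption of this sketch) to keep the URL well formed.
    return f"{base}/bitstream/{handle}/{sequence_id}/{quote(filename)}"


print(bitstream_url("https://dspace.myu.edu", "123.456/789", 24, "foo.html"))
```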

102 http://www.handle.net/resolver/index.html
103 http://hdl:123.456


1.2.5 Getting content into DSpace

1.2.5.1 The Manual DSpace Submission and Workflow System

Rather than being a single subsystem, ingesting is a process that spans several. Below is a simple illustration of the
current ingesting process in DSpace.

DSpace Ingest Process

The batch item importer is an application that turns an external SIP (an XML metadata document with some
content files) into an "in progress submission" object. The Web submission UI is similarly used by an end-user to
assemble an "in progress submission" object.
Depending on the policy of the collection to which the submission is targeted, a workflow process may be started.
This typically allows one or more human reviewers or 'gatekeepers' to check over the submission and ensure it is
suitable for inclusion in the collection.
When the Batch Ingester or Submission UI completes the InProgressSubmission object, and invokes the next stage
of ingest (be that workflow or item installation), a provenance message is added to the Dublin Core which includes
the filenames and checksums of the content of the submission. Likewise, each time a workflow changes state (e.g. a
reviewer accepts the submission), a similar provenance statement is added. This allows us to track how the item
has changed since a user submitted it.
Once any workflow process is successfully and positively completed, the InProgressSubmission object is consumed
by an "item installer" that converts the InProgressSubmission into a fully fledged archived item in DSpace. The item
installer:
• Assigns an accession date
• Adds a "date.available" value to the Dublin Core metadata record of the item
• Adds an issue date if none already present
• Adds a provenance message (including bitstream checksums)
• Assigns a Handle persistent identifier
• Adds the item to the target collection, and adds appropriate authorization policies
• Adds the new item to the search and browse index


Workflow Steps
By default, a collection's workflow may have up to three steps. Each collection may have an associated e-person
group for performing each step; if no group is associated with a certain step, that step is skipped. If a collection has
no e-person groups associated with any step, submissions to that collection are installed straight into the main
archive. Keep in mind, however, that this is only the default behavior, and the workflow process can be
configured/customized easily, see Configurable Workflow(see page 250).
In other words, the default sequence is this: The collection receives a submission. If the collection has a group
assigned for workflow step 1, that step is invoked, and the group is notified. Otherwise, workflow step 1 is skipped.
Likewise, workflow steps 2 and 3 are performed if and only if the collection has a group assigned to those steps.
When a step is invoked, the submission is put into the 'task pool' of the step's associated group. One member of
that group takes the task from the pool, and it is then removed from the task pool, to avoid the situation where
several people in the group may be performing the same task without realizing it.
The actions available to the member of the group who has taken the task from the pool depend on the workflow step:

Workflow Step   Possible actions

1               Can accept submission for inclusion, or reject submission.

2               Can edit metadata provided by the user with the submission, but cannot
                change the submitted files. Can accept submission for inclusion, or reject
                submission.

3               Can edit metadata provided by the user with the submission, but cannot
                change the submitted files. Must then commit to archive; may not reject
                submission.

Submission Workflow in DSpace

If a submission is rejected, the reason (entered by the workflow participant) is e-mailed to the submitter, and it is
returned to the submitter's 'My DSpace' page. The submitter can then make any necessary modifications and re-
submit, whereupon the process starts again.


If a submission is 'accepted', it is passed to the next step in the workflow. If there are no more workflow steps with
associated groups, the submission is installed in the main archive.
One last possibility is that a workflow can be 'aborted' by a DSpace site administrator. This is accomplished using
the Administration UI.
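The default three-step sequence described in this section can be modelled as a small function. This is a simplified sketch, not the DSpace workflow code: the function names, the dictionary-based inputs and the result strings are assumptions, and task pools, notifications and aborting are left out.

```python
from typing import Dict, List, Optional


def active_steps(groups: Dict[int, Optional[str]]) -> List[int]:
    """Steps 1 to 3 that have a group assigned; all other steps are skipped."""
    return [step for step in (1, 2, 3) if groups.get(step)]


def run_workflow(groups: Dict[int, Optional[str]], decisions: Dict[int, str]) -> str:
    """Walk a submission through the default workflow.

    `decisions` maps a step number to 'accept' or 'reject'; a missing
    entry is treated as 'accept'.
    """
    for step in active_steps(groups):
        if decisions.get(step, "accept") == "reject":
            if step == 3:
                raise ValueError("step 3 may not reject a submission")
            return "returned to submitter"
    # No (more) steps with an associated group: install in the main archive.
    return "archived"


print(run_workflow({}, {}))                                    # no groups at all
print(run_workflow({1: "reviewers", 3: "editors"}, {}))        # step 2 skipped
print(run_workflow({2: "curators"}, {2: "reject"}))            # rejected at step 2
```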

1.2.5.2 Command line import facilities

DSpace includes batch tools to import items in a simple directory structure, where the Dublin Core metadata is
stored in an XML file. This may be used as the basis for moving content between DSpace and other systems. For
more information see Item Importer and Exporter.
DSpace also includes various package importer tools, which support many common content packaging formats like
METS. For more information see Package Importer and Exporter. Additionally, DSpace can import/export
Archival Information Packages (AIPs); see AIP Backup and Restore (see page 411).

1.2.5.3 Registration for externally hosted files

Registration is an alternate means of incorporating items, their metadata, and their bitstreams into DSpace by
taking advantage of the bitstreams already being in accessible computer storage. An example might be that there is
a repository for existing digital assets. Rather than using the normal interactive ingest process or the batch import
to furnish DSpace the metadata and to upload bitstreams, registration provides DSpace the metadata and the
location of the bitstreams. DSpace uses a variation of the import tool to accomplish registration.

1.2.5.4 SWORD Support

SWORD (Simple Web-service Offering Repository Deposit) is a protocol that allows the remote deposit of items into
repositories. SWORD was further developed in SWORD version 2 to add the ability to retrieve, update, or delete
deposits. DSpace supports the SWORD protocol via the 'sword' web application and SWORD v2 via the 'swordv2' web
application. The specification and further information can be found at http://swordapp.org104. See also SWORDv1
Server(see page 216) and SWORDv2 Server(see page 202).

1.2.6 Getting content out of DSpace

1.2.6.1 OAI Support

The Open Archives Initiative105 has developed a protocol for metadata harvesting106. This allows sites to
programmatically retrieve or 'harvest' the metadata from several sources, and offer services using that metadata,
such as indexing or linking services. Such a service could allow users to access information from a large number of
sites from one place.
DSpace exposes the Dublin Core metadata for items that are publicly (anonymously) accessible. Additionally, the
collection structure is also exposed via the OAI protocol's 'sets' mechanism. OCLC's open
source OAICat107 framework is used to provide this functionality.

104 http://swordapp.org/
105 http://www.openarchives.org/
106 http://www.openarchives.org/OAI/openarchivesprotocol.html
107 http://www.oclc.org/research/software/oai/cat.shtm


You can also configure the OAI service to make use of any crosswalk plugin to offer additional metadata formats,
such as MODS.
DSpace's OAI service does support the exposing of deletion information for withdrawn items, but not for items that
are 'expunged' (see above). DSpace also supports OAI-PMH resumption tokens. See OAI(see page 179) for more
information.
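A harvester talks to the OAI service with plain HTTP requests whose parameters are defined by the OAI-PMH specification (`verb`, `metadataPrefix`, `set`, `resumptionToken`). The sketch below only builds such request URLs; the base URL and the set name are placeholders, not values taken from a real DSpace installation.

```python
from typing import Optional
from urllib.parse import urlencode


def oai_request(base_url: str, verb: str, **arguments: Optional[str]) -> str:
    """Build an OAI-PMH request URL, dropping arguments that are None."""
    params = {"verb": verb}
    params.update({k: v for k, v in arguments.items() if v is not None})
    return base_url + "?" + urlencode(params)


BASE = "https://dspace.example.edu/oai/request"   # hypothetical endpoint

# Harvest Dublin Core records from one collection exposed as a set.
print(oai_request(BASE, "ListRecords", metadataPrefix="oai_dc", set="example_set"))

# Continue a partial list using the token returned by the previous response.
print(oai_request(BASE, "ListRecords", resumptionToken="token-from-server"))
```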

1.2.6.2 Command Line Export Facilities

DSpace includes batch tools to export items in a simple directory structure, where the Dublin Core metadata is
stored in an XML file. This may be used as the basis for moving content between DSpace and other systems. For
more information see Item Importer and Exporter.
DSpace also includes various package exporter tools, which support many common content packaging formats like
METS. For more information see Package Importer and Exporter. Additionally, DSpace can import/export
Archival Information Packages (AIPs); see AIP Backup and Restore (see page 411).

1.2.6.3 Packager Plugins

Packagers are software modules that translate between DSpace Item objects and a self-contained external
representation, or "package". A Package Ingester interprets, or ingests, the package and creates an Item. A Package
Disseminator writes out the contents of an Item in the package format.
A package is typically an archive file such as a Zip or "tar" file, including a manifest document which contains
metadata and a description of the package contents. The IMS Content Package108 is a typical packaging standard. A
package might also be a single document or media file that contains its own metadata, such as a PDF document
with embedded descriptive metadata.
Package ingesters and package disseminators are each a type of named plugin (see Plugin Manager), so it
is easy to add new packagers specific to the needs of your site. You do not have to supply both an ingester and
disseminator for each format; it is perfectly acceptable to just implement one of them.
Most packager plugins call upon Crosswalk Plugins(see page 45) to translate the metadata between DSpace's object
model and the package format.
More information about calling Packagers to ingest or disseminate content can be found in the Package Importer
and Exporter section of the System Administration documentation.

1.2.6.4 Crosswalk Plugins

Crosswalks are software modules that translate between DSpace object metadata and a specific external
representation. An Ingestion Crosswalk interprets the external format and crosswalks it to DSpace's internal data
structure, while a Dissemination Crosswalk does the opposite.
For example, a MODS ingestion crosswalk translates descriptive metadata from the MODS format to the metadata
fields on a DSpace Item. A MODS dissemination crosswalk generates a MODS document from the metadata on a
DSpace Item.
Crosswalk plugins are named plugins (see Plugin Manager), so it is easy to add new crosswalks. You do not
have to supply both an ingester and disseminator for each format; it is perfectly acceptable to just implement one
of them.

108 http://www.imsglobal.org/content/packaging/


There is also a special pair of crosswalk plugins which use XSL stylesheets to translate the external metadata to or
from an internal DSpace format. You can add and modify XSLT crosswalks simply by editing the DSpace
configuration and the stylesheets, which are stored in files in the DSpace installation directory.
The Packager plugins and OAI-PMH server make use of crosswalk plugins.
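The idea of a paired ingestion and dissemination crosswalk can be shown with a toy mapping between a simple external record and item metadata fields. Everything here is illustrative: the external field names, the field mapping and the function names are assumptions, and real crosswalk plugins are Java classes registered with the plugin manager.

```python
from typing import Dict, List

Record = Dict[str, List[str]]

# Hypothetical mapping from an external format to item metadata fields.
FIELD_MAP = {
    "title": "dc.title",
    "creator": "dc.contributor.author",
    "date": "dc.date.issued",
}
REVERSE_MAP = {internal: external for external, internal in FIELD_MAP.items()}


def ingest(external: Record) -> Record:
    """Ingestion crosswalk: external representation to item metadata."""
    return {FIELD_MAP[k]: list(v) for k, v in external.items() if k in FIELD_MAP}


def disseminate(item: Record) -> Record:
    """Dissemination crosswalk: item metadata to external representation."""
    return {REVERSE_MAP[k]: list(v) for k, v in item.items() if k in REVERSE_MAP}


record = {"title": ["A Thesis"], "creator": ["Smith, John"], "unknown": ["x"]}
item = ingest(record)
print(item)
print(disseminate(item))
```

Fields without a mapping are silently dropped in this sketch; a site would decide for itself whether to drop, log or preserve them.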

1.2.6.5 Supervision and Collaboration

The supervision order system exists primarily so that thesis authors can be supervised while preparing their
e-theses. It binds groups of other users (thesis supervisors) to an item in someone's pre-submission workspace.
The bound group can have system policies associated with it that allow different levels of interaction with the
student's item; a small set of default policy groups is provided:
• Full editorial control
• View item contents
• No policies
Once the default set has been applied, a system administrator may modify it as they would any other
policy set in DSpace.
This functionality could also be used in situations where researchers wish to collaborate on a particular submission,
although there is no particular collaborative workspace functionality.

1.2.7 User Management

Although many of DSpace's functions such as document discovery and retrieval can be used anonymously, some
features (and perhaps some documents) are only available to certain "privileged" users. E-People and Groups are
the way DSpace identifies application users for the purpose of granting privileges. This identity is bound to a
session of a DSpace application such as the Web UI or one of the command-line batch programs. Both E-People and
Groups are granted privileges by the authorization system described below.

1.2.7.1 User Accounts (E-Person)

DSpace holds the following information about each e-person:
• E-mail address
• First and last names
• Whether the user is able to log in to the system via the Web UI, and whether they must use an X.509 certificate
to do so
• A password (encrypted), if appropriate
• A list of collections for which the e-person wishes to be notified of new items
• Whether the e-person 'self-registered' with the system; that is, whether the system created the e-person
record automatically as a result of the end-user independently registering with the system, as opposed to
the e-person record being generated from the institution's personnel database, for example.
• The network ID for the corresponding LDAP record, if LDAP authentication is used for this E-Person.

1.2.7.2 Subscriptions
As noted above, end-users (e-people) may 'subscribe' to collections in order to be alerted when new items appear
in those collections. Each day, end-users who are subscribed to one or more collections will receive an e-mail giving
brief details of all new items that appeared in any of those collections the previous day. If no new items appeared in
any of the subscribed collections, no e-mail is sent. Users can unsubscribe themselves at any time. RSS feeds of new
items are also available for collections and communities.


1.2.7.3 Groups
Groups are another kind of entity that can be granted permissions in the authorization system. A group is usually an
explicit list of E-People; anyone identified as one of those E-People also gains the privileges granted to the group.
However, an application session can be assigned membership in a group without being identified as an E-Person.
For example, some sites use this feature to identify users of a local network so they can read restricted materials not
open to the whole world. Sessions originating from the local network are given membership in the "LocalUsers"
group and gain the corresponding privileges.
Administrators can also use groups as "roles" to manage the granting of privileges more efficiently.

1.2.8 Access Control

1.2.8.1 Authentication
Authentication is when an application session positively identifies itself as belonging to an E-Person and/or Group.
In DSpace, it is implemented by a mechanism called Stackable Authentication: the DSpace configuration declares a
"stack" of authentication methods. An application (like the Web UI) calls on the Authentication Manager, which tries
each of these methods in turn to identify the E-Person to which the session belongs, as well as any extra Groups.
The E-Person authentication methods are tried in turn until one succeeds. Every authenticator in the stack is given
a chance to assign extra Groups. This mechanism offers the following advantages:
• Separates authentication from the Web user interface so the same authentication methods are used for
other applications such as non-interactive Web Services
• Improved modularity: The authentication methods are all independent of each other. Custom
authentication methods can be "stacked" on top of the default DSpace username/password method.
• Cleaner support for "implicit" authentication where username is found in the environment of a Web request,
e.g. in an X.509 client certificate.
For more information see Authentication Plugins (see page 87).
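The stacking behaviour can be sketched as a loop over authentication methods: identification stops at the first method that succeeds, while every method still gets the chance to contribute extra groups. The class and function names are hypothetical, requests are modelled as plain dictionaries, and passwords are compared in clear text purely to keep the example short.

```python
from typing import Dict, List, Optional, Set, Tuple

Request = Dict[str, str]


class AuthMethod:
    """Base class: identifies nobody and assigns no groups."""

    def authenticate(self, request: Request) -> Optional[str]:
        return None

    def special_groups(self, request: Request) -> Set[str]:
        return set()


class PasswordAuth(AuthMethod):
    """Explicit method: e-mail address plus password."""

    def __init__(self, passwords: Dict[str, str]) -> None:
        self.passwords = passwords

    def authenticate(self, request: Request) -> Optional[str]:
        email, password = request.get("email"), request.get("password")
        if email in self.passwords and password is not None \
                and self.passwords[email] == password:
            return email
        return None


class NetworkAuth(AuthMethod):
    """Implicit method: grants a group to sessions from a local network."""

    def __init__(self, prefix: str, group: str) -> None:
        self.prefix, self.group = prefix, group

    def special_groups(self, request: Request) -> Set[str]:
        if request.get("ip", "").startswith(self.prefix):
            return {self.group}
        return set()


def authenticate(stack: List[AuthMethod], request: Request) -> Tuple[Optional[str], Set[str]]:
    """Try each method in turn; collect extra groups from all of them."""
    eperson: Optional[str] = None
    groups: Set[str] = set()
    for method in stack:
        if eperson is None:
            eperson = method.authenticate(request)
        groups |= method.special_groups(request)
    return eperson, groups


stack = [NetworkAuth("10.", "LocalUsers"), PasswordAuth({"a@example.edu": "secret"})]
print(authenticate(stack, {"ip": "10.1.2.3"}))
print(authenticate(stack, {"ip": "192.0.2.1", "email": "a@example.edu", "password": "secret"}))
```

The first call shows a session that gains a group without being identified as an E-Person; the second shows an identified E-Person with no extra groups.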

1.2.8.2 Authorization
DSpace's authorization system is based on associating actions with objects and the lists of EPeople who can
perform them. The associations are called Resource Policies, and the lists of EPeople are called Groups. There are
two built-in groups: 'Administrators', who can do anything in a site, and 'Anonymous', which is a list that contains
all users. Assigning a policy for an action on an object to anonymous means giving everyone permission to do that
action. (For example, most objects in DSpace sites have a policy of 'anonymous' READ.) Permissions must be
explicit - lack of an explicit permission results in the default policy of 'deny'. Permissions also do not 'commute'; for
example, if an e-person has READ permission on an item, they might not necessarily have READ permission on the
bundles and bitstreams in that item. Currently Collections, Communities and Items are discoverable in the browse
and search systems regardless of READ authorization.
The following actions are possible:
Collection

ADD/REMOVE                add or remove items (ADD = permission to submit items)

DEFAULT_ITEM_READ         inherited as READ by all submitted items

DEFAULT_BITSTREAM_READ    inherited as READ by Bitstreams of all submitted items. Note: only
                          affects Bitstreams of an item at the time it is initially submitted. If a
                          Bitstream is added later, it does not get the same default read policy.

COLLECTION_ADMIN          collection admins can edit items in a collection, withdraw items, map
                          other items into this collection.

Item

ADD/REMOVE                add or remove bundles

READ                      can view item (item metadata is always viewable)

WRITE                     can modify item

Bundle

ADD/REMOVE                add or remove bitstreams to a bundle

Bitstream

READ                      view bitstream

WRITE                     modify bitstream

Note that there is no 'DELETE' action. In order to 'delete' an object (e.g. an item) from the archive, one must have
REMOVE permission on all objects (in this case, collection) that contain it. The 'orphaned' item is automatically
deleted.
Policies can apply to individual e-people or groups of e-people.
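The "default deny" rule and the fact that permissions are not passed down from an item to its bitstreams can be illustrated with a small check. This is a sketch under stated assumptions: `ResourcePolicy` here is a hypothetical tuple of object, action and group, object identifiers are plain strings, and policies for individual e-people are omitted.

```python
from typing import Iterable, NamedTuple, Set


class ResourcePolicy(NamedTuple):
    """Associates an action on an object with a group allowed to perform it."""
    object_id: str
    action: str
    group: str


def is_authorized(policies: Iterable[ResourcePolicy], user_groups: Set[str],
                  object_id: str, action: str) -> bool:
    """Allow only when an explicit policy matches; otherwise deny."""
    if "Administrators" in user_groups:
        return True                              # can do anything in a site
    groups = set(user_groups) | {"Anonymous"}    # Anonymous contains all users
    return any(
        p.object_id == object_id and p.action == action and p.group in groups
        for p in policies
    )


policies = [
    ResourcePolicy("item-1", "READ", "Anonymous"),
    ResourcePolicy("bitstream-1", "READ", "Staff"),
]
print(is_authorized(policies, set(), "item-1", "READ"))
print(is_authorized(policies, set(), "bitstream-1", "READ"))
print(is_authorized(policies, {"Staff"}, "bitstream-1", "READ"))
```

The second check is denied even though the item itself is readable, because READ on the item says nothing about READ on its bitstreams.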

1.2.9 Usage Metrics

DSpace is equipped with Solr-based infrastructure to log and display page views and file downloads.

1.2.9.1 Item, Collection and Community Usage Statistics

Usage statistics can be retrieved from individual item, collection and community pages. These Usage Statistics
pages show:
• Total page visits (all time)
• Total Visits per Month
• File Downloads (all time)*
• Top Country Views (all time)
• Top City Views (all time)
*File Downloads information is only displayed for item-level statistics. Note that downloads from separate
bitstreams are also recorded and represented separately. DSpace is able to capture and store File Download
information, even when the bitstream was downloaded from a direct link on an external website.

1.2.9.2 System Statistics

Various statistical reports about the contents and use of your system can be automatically generated by the system.
These are generated by analyzing DSpace's log files. Statistics can be broken down monthly.
The report includes the following sections:
• A customizable general overview of activities in the archive, by default including:
    • Number of items archived
    • Number of bitstream views
    • Number of item page views
    • Number of collection page views
    • Number of community page views
    • Number of user logins
    • Number of searches performed
    • Number of license rejections
    • Number of OAI Requests
• Customizable summary of archive contents
• Broken-down list of item viewings
• A full breakdown of all performed actions
• User logins
• Most popular searches
• Log Level Information
• Processing information
The results of statistical analysis can be presented in by-month and in-total reports, and are available
via the user interface. The reports can also either be made public or restricted to administrator access only.


1.2.10 Digital Preservation

1.2.10.1 Checksum Checker

The purpose of the checker is to verify that the content in a DSpace repository has not become corrupted or been
tampered with. The functionality can be invoked on an ad-hoc basis from the command line, or configured via cron
or similar. Options exist to support large repositories that cannot be entirely checked in one run of the tool. The tool
is extensible to new reporting and checking priority approaches.
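The core of such a check, recomputing a file's checksum and comparing it with the value recorded earlier, can be sketched with the Python standard library. This is not the DSpace checker itself: the function names are hypothetical and the choice of MD5 as the default algorithm is an assumption of the example.

```python
import hashlib
import os
import tempfile


def checksum(path: str, algorithm: str = "md5", chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large bitstreams fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: str, recorded: str, algorithm: str = "md5") -> bool:
    """Compare the current checksum with the one recorded at ingest."""
    return checksum(path, algorithm) == recorded.lower()


# Demonstration on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example bitstream content")
recorded = checksum(tmp.name)
print(is_intact(tmp.name, recorded))    # content unchanged
with open(tmp.name, "ab") as handle:
    handle.write(b"!")                  # simulate corruption
print(is_intact(tmp.name, recorded))    # content changed
os.remove(tmp.name)
```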

1.2.11 System Design

1.2.11.1 Data Model


Data Model Diagram

The way data is organized in DSpace is intended to reflect the structure of the organization using the DSpace
system. Each DSpace site is divided into communities, which can be further divided into sub-communities reflecting
the typical university structure of college, department, research center, or laboratory.
Communities contain collections, which are groupings of related content. A collection may appear in more than one
community.
Each collection is composed of items, which are the basic archival elements of the archive. Each item is owned by
one collection. Additionally, an item may appear in additional collections; however every item has one and only one
owning collection.
Items are further subdivided into named bundles of bitstreams. Bitstreams are, as the name suggests, streams of
bits, usually ordinary computer files. Bitstreams that are somehow closely related, for example HTML files and
images that compose a single HTML document, are organized into bundles.
In practice, most items tend to have these named bundles:
• ORIGINAL – the bundle with the original, deposited bitstreams
• THUMBNAILS – thumbnails of any image bitstreams
• TEXT – extracted full-text from bitstreams in ORIGINAL, for indexing
• LICENSE – contains the deposit license that the submitter granted the host organization; in other words,
specifies the rights that the hosting organization has
• CC_LICENSE – contains the distribution license, if any (a Creative Commons109 license) associated with the item.
This license specifies what end users downloading the content can do with the content
Each bitstream is associated with one Bitstream Format. Because preservation services may be an important aspect
of the DSpace service, it is important to capture the specific formats of files that users submit. In DSpace, a
bitstream format is a unique and consistent way to refer to a particular file format. An integral part of a bitstream
format is an either implicit or explicit notion of how material in that format can be interpreted. For example, the
interpretation for bitstreams encoded in the JPEG standard for still image compression is defined explicitly in the
Standard ISO/IEC 10918-1. The interpretation of bitstreams in Microsoft Word 2000 format is defined implicitly,
through reference to the Microsoft Word 2000 application. Bitstream formats can be more specific than MIME types
or file suffixes. For example, application/ms-word and .doc span multiple versions of the Microsoft Word application,
each of which produces bitstreams with presumably different characteristics.
Each bitstream format additionally has a support level, indicating how well the hosting institution is likely to be able
to preserve content in the format in the future. There are three possible support levels that bitstream formats may
be assigned by the hosting institution. The host institution should determine the exact meaning of each support
level, after careful consideration of costs and requirements. MIT Libraries' interpretation is shown below:

Supported The format is recognized, and the hosting institution is confident it can make
bitstreams of this format usable in the future, using whatever combination of
techniques (such as migration, emulation, etc.) is appropriate given the
context of need.

Known The format is recognized, and the hosting institution will promise to preserve
the bitstream as-is, and allow it to be retrieved. The hosting institution will
attempt to obtain enough information to enable the format to be upgraded to
the 'supported' level.

109 http://www.creativecommons.org/

Unsupported The format is unrecognized, but the hosting institution will undertake to
preserve the bitstream as-is and allow it to be retrieved.

Each item has one qualified Dublin Core metadata record. Other metadata might be stored in an item as a serialized
bitstream, but we store Dublin Core for every item for interoperability and ease of discovery. The Dublin Core may
be entered by end-users as they submit content, or it might be derived from other metadata as part of an ingest
process.
Items can be removed from DSpace in one of two ways. They may be 'withdrawn', which means they remain in the
archive but are completely hidden from view; an end-user who attempts to access a withdrawn item is presented
with a 'tombstone' indicating that the item has been removed. Alternatively, an item may be 'expunged', in which
case all traces of it are removed from the archive.
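
The relationships described in this section can be summarized in a small Python sketch. It is only an illustration of the data model, not DSpace's actual classes: every class, field and method name below is an assumption chosen for the example. It shows that an item always has exactly one owning collection, may additionally appear in other collections, and groups its bitstreams into named bundles.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Bitstream:
    name: str
    bitstream_format: str  # e.g. "JPEG encoded image format"


@dataclass
class Bundle:
    name: str  # e.g. "ORIGINAL", "THUMBNAILS", "TEXT", "LICENSE"
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass(eq=False)
class Collection:
    name: str
    items: List["Item"] = field(default_factory=list, repr=False)


@dataclass(eq=False)
class Item:
    title: str
    owning_collection: Collection
    bundles: List[Bundle] = field(default_factory=list)
    withdrawn: bool = False  # withdrawn items stay in the archive but are hidden

    def __post_init__(self):
        # An item always appears in its owning collection.
        self.owning_collection.items.append(self)

    def map_to(self, collection: Collection) -> None:
        """Make the item appear in another collection; ownership is unchanged."""
        if self not in collection.items:
            collection.items.append(self)

    def bundle(self, name: str) -> Optional[Bundle]:
        """Return the bundle with the given name, or None if absent."""
        for candidate in self.bundles:
            if candidate.name == name:
                return candidate
        return None


@dataclass(eq=False)
class Community:
    name: str
    subcommunities: List["Community"] = field(default_factory=list)
    collections: List[Collection] = field(default_factory=list)
```

For example, an item created with owning_collection set to one collection and then passed to map_to for a second collection is listed in both, while its owning_collection still refers to the first.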

Object Example

Community Laboratory of Computer Science; Oceanographic Research Center

Collection LCS Technical Reports; ORC Statistical Data Sets

Item A technical report; a data set with accompanying description; a video recording of a lecture

Bundle A group of HTML and image bitstreams making up an HTML document

Bitstream A single HTML file; a single image file; a source code file

Bitstream Format Microsoft Word version 6.0; JPEG encoded image format

1.2.11.2 Amazon S3 Support

DSpace offers two means of storing bitstreams: in the file system on the server, or in Amazon S3. For more
information, see Storage Layer.

2 Installing DSpace
• Installation Overview
• Installing the Backend (Server API)
• Backend Requirements
• Backend Installation
• Installing the Frontend (User Interface)
• Frontend Requirements
• Frontend Installation
• What Next?
• Common Installation Issues
• Troubleshoot an error or find detailed error messages
• "CORS error" or "Invalid CORS request"
• "403 Forbidden" error with a message that says "Access is denied. Invalid CSRF Token"
• Using a Self-Signed SSL Certificate causes the Frontend to not be able to access the Backend
• My REST API is running under HTTPS, but some of its "link" URLs are switching to HTTP?
• Database errors occur when you run ant fresh_install

2.1 Installation Overview

Try out DSpace 7 before you install

If you'd like to quickly try out DSpace 7 before a full installation, see "Try out DSpace 7"110 for instructions on
a quick install via Docker.

As of version 7 (and above), the DSpace application is split into a "frontend" (User Interface) and a
"backend" (Server API). Most institutions will want to install BOTH. However, you can decide whether to run them
on the same machine or separate machines.
• The DSpace Frontend consists of a User Interface built on Angular.io111. It cannot be run alone, as
it requires a valid DSpace Backend to function. The frontend provides all user-facing functionality.
• The DSpace Backend consists of a Server API ("server" webapp), built on Spring Boot112. It can be run
standalone, but it has no user interface. The backend provides all machine-based interfaces, including
the REST API, OAI-PMH, SWORD (v1 and v2) and RDF.
We recommend installing the Backend first, as the Frontend requires a valid Backend to run properly.

2.2 Installing the Backend (Server API)

2.2.1 Backend Requirements

• UNIX-like OS or Microsoft Windows
• Java JDK 11 (OpenJDK or Oracle JDK)

110 https://wiki.lyrasis.org/display/DSPACE/Try+out+DSpace+7
111 https://angular.io/
112 https://spring.io/projects/spring-boot

• Apache Maven 3.3.x or above (Java build tool)
• Configuring a Maven Proxy
• Apache Ant 1.10.x or later (Java build tool)
• Relational Database (PostgreSQL or Oracle)
• PostgreSQL 11.x, 12.x or 13.x (with pgcrypto installed)
• Oracle 10g or later
• Apache Solr 8.x (full-text index/search service)
• Servlet Engine (Apache Tomcat 9, Jetty, Caucho Resin or equivalent)
• (Optional) IP to City Database for Location-based Statistics
• Git (code version control)

2.2.1.1 UNIX-like OS or Microsoft Windows

• UNIX-like operating system (Linux, HP/UX, Mac OSX, etc.): Many distributions of Linux/Unix come with some
of the dependencies below pre-installed or easily installed via updates. You should consult your particular
distribution's documentation or local system administrators to determine what is already available.
• Microsoft Windows: While DSpace can be run on Windows servers, most institutions tend to run it on a UNIX-
like operating system.

2.2.1.2 Java JDK 11 (OpenJDK or Oracle JDK)

• OpenJDK download and installation instructions can be found at http://openjdk.java.net/install/. Most
operating systems provide an easy path to install OpenJDK. Just be sure to install the full JDK (development
kit), and not the JRE (which is often the default).
• Oracle's Java can be downloaded from http://www.oracle.com/technetwork/java/javase/downloads/index.html.
Make sure to download the appropriate version of the Java SE JDK.

Make sure to install the JDK and not just the JRE
At this time, DSpace requires the full JDK (Java Development Kit) be installed, rather than just the JRE
(Java Runtime Environment). So, please be sure that you are installing the full JDK and not just the JRE.
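
One quick way to tell a JRE-only system from a full JDK is to look for the Java compiler, which ships only with the JDK. The Python sketch below is a hedged illustration: the function name and the returned dictionary are assumptions for this example, and it only checks what is on the PATH, not which Java version is installed.

```python
import shutil


def find_jdk_tools(which=shutil.which):
    """Report whether the Java runtime and compiler are on the PATH.

    `java` ships with both the JRE and the JDK, whereas `javac` (the
    compiler) ships only with the full JDK, so a missing `javac`
    suggests that only a JRE is installed. The `which` parameter is
    injectable so the logic can be exercised without a real Java install.
    """
    java_path = which("java")
    javac_path = which("javac")
    if javac_path:
        status = "jdk"
    elif java_path:
        status = "jre-only"
    else:
        status = "missing"
    return {"java": java_path, "javac": javac_path, "status": status}
```

Calling find_jdk_tools() with no arguments inspects the current PATH; a status of "jre-only" indicates that the full JDK still needs to be installed.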

Only JDK11 is fully supported

Older versions of Java are unsupported. This includes JDK v7-10.
Newer versions of Java may work (e.g. JDK v12-16), but we do not recommend running them in
Production. We highly recommend running only Java LTS (Long Term Support) releases in Production, as
non-LTS releases may not receive ongoing security fixes. As of this DSpace release, JDK11 is the most
recent Java LTS release, with the next one (JDK17) being due sometime around September 2021. As soon
as the next Java LTS release is available, we will analyze it for compatibility with this release of DSpace.
For more information on Java releases, see the Java roadmaps for Oracle113 and/or OpenJDK114.

113 https://www.oracle.com/technetwork/java/java-se-support-roadmap.html
114 https://adoptopenjdk.net/support.html#roadmap

2.2.1.3 Apache Maven 3.3.x or above (Java build tool)

Maven is necessary in the first stage of the build process to assemble the installation package for your DSpace
instance. It gives you the flexibility to customize DSpace using the existing Maven projects found in the [dspace-
source]/dspace/modules directory or by adding in your own Maven project to build the installation package for
DSpace, and apply any custom interface "overlay" changes.
Maven can be downloaded from http://maven.apache.org/download.html. It is also provided via many operating
system package managers.

Configuring a Maven Proxy

You can configure a proxy to use for some or all of your HTTP requests in Maven. The username and password are
only required if your proxy requires basic authentication. Later releases may support storing your passwords in a
secured keystore; in the meantime, please ensure your settings.xml file (usually ${user.home}/.m2/settings.xml)
is secured with permissions appropriate for your operating system.
Example:

<settings>
  <!-- ... other settings ... -->
  <proxies>
    <proxy>
      <active>true</active>
      <protocol>http</protocol>
      <host>proxy.somewhere.com</host>
      <port>8080</port>
      <username>proxyuser</username>
      <password>somepassword</password>
      <nonProxyHosts>www.google.com|*.somewhere.com</nonProxyHosts>
    </proxy>
  </proxies>
  <!-- ... other settings ... -->
</settings>

2.2.1.4 Apache Ant 1.10.x or later (Java build tool)

While Apache Ant recommends using v1.10.x for Java 11, we've also had some success with recent versions
of 1.9.x (specifically v1.9.15 seems to work fine with Java 11). That said, earlier versions of v1.9.x are not
compatible with Java 11.

Apache Ant is required for the second stage of the build process (deploying/installing the application). First, Maven
is used to construct the installer ([dspace-source]/dspace/target/dspace-installer), after which Ant is
used to install/deploy DSpace to the installation directory.

Ant can be downloaded from the Apache Ant website115 (http://ant.apache.org). It is also provided via many operating
system package managers.
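
The two-stage build described in this section can be sketched in Python as a plan of commands and working directories. This is an illustration only: the function name is an assumption, the sketch does not run anything, and it assumes that the Ant target is the fresh_install target mentioned under Common Installation Issues and that it is run from the installer directory that Maven produces.

```python
from pathlib import PurePosixPath


def build_plan(dspace_source):
    """Return the two build stages as (working directory, command) pairs."""
    source = PurePosixPath(dspace_source)
    installer = source / "dspace" / "target" / "dspace-installer"
    return [
        # Stage 1: Maven assembles the installation package.
        (str(source), ["mvn", "package"]),
        # Stage 2: Ant installs/deploys from the assembled installer.
        (str(installer), ["ant", "fresh_install"]),
    ]
```

Each pair could then be passed to subprocess.run(command, cwd=directory, check=True) by a script that is meant to stop at the first failing stage.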

2.2.1.5 Relational Database (PostgreSQL or Oracle)

PostgreSQL 11.x, 12.x or 13.x (with pgcrypto installed)

PostgreSQL v9.4 to v10.x may work, but those versions are less well tested.
Active development/testing on DSpace 7 has occurred on PostgreSQL v11.x, v12.x and v13.x. However, it is
likely that the backend would also function on PostgreSQL v9.4 - v10.x. At this time we have not
performed sufficient testing on these earlier versions to add them to the prerequisites listing.

DSpace 7 will definitely not function on versions below 9.4 as DSpace requires installing and running
the pgcrypto extension116 (see below) v1.1, which was not available until PostgreSQL v9.4.

• PostgreSQL can be downloaded from http://www.postgresql.org/. It is also provided via many operating
system package managers.
• If the version of Postgres provided by your package manager is outdated, you may wish to use one of
the official PostgreSQL-provided repositories:
• Linux users can select their OS of choice for detailed instructions on using the official
PostgreSQL apt or yum repository: http://www.postgresql.org/download/linux/
• Windows users will need to use the Windows installer: http://www.postgresql.org/download/windows/
• Mac OSX users can choose their preferred installation method: http://www.postgresql.org/download/macosx/
• Install the pgcrypto extension.117 It will also need to be enabled on your DSpace Database (see Installation
instructions below for more info). The pgcrypto extension allows DSpace to create UUIDs (universally unique
identifiers) for all objects in DSpace, which means that (internal) object identifiers are now globally unique
and no longer tied to database sequences.
• On most Linux operating systems (Ubuntu, Debian, RedHat), this extension is provided in the
"postgresql-contrib" package in your package manager. So, ensure you've installed "postgresql-contrib".
• On Windows, this extension should be provided automatically by the installer (check your
"[PostgreSQL]/share/extension" folder for files starting with "pgcrypto").
• Unicode (specifically UTF-8) support must be enabled (but this is enabled by default).
• Once installed, you need to enable TCP/IP connections (DSpace uses JDBC):
• In postgresql.conf: uncomment the line starting: listen_addresses = 'localhost'. This is
the default, in recent PostgreSQL releases, but you should at least check it.
• Then tighten up security a bit by editing pg_hba.conf and adding this line:

host dspace dspace 127.0.0.1 255.255.255.255 md5

115 http://ant.apache.org/
116 http://www.postgresql.org/docs/9.4/static/pgcrypto.html
117 http://www.postgresql.org/docs/9.4/static/pgcrypto.html

This should appear before any lines matching all databases, because the first matching rule
governs.
• Then restart PostgreSQL.
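
The remark that "the first matching rule governs" can be illustrated with a simplified Python model of how pg_hba.conf entries are evaluated. This is a sketch under stated assumptions, not PostgreSQL's implementation: rules are reduced to (database, user, network, method) tuples, only the "all" keyword is handled, and the netmask pair 127.0.0.1 255.255.255.255 is written in the equivalent CIDR form 127.0.0.1/32.

```python
import ipaddress


def first_matching_rule(rules, database, user, client_ip):
    """Return the first rule that matches the connection, or None.

    Each rule is a (database, user, network, method) tuple; "all"
    matches any database or user. Rules are tried in file order and
    evaluation stops at the first match, so a broad rule placed first
    hides any more specific rule below it.
    """
    address = ipaddress.ip_address(client_ip)
    for rule in rules:
        rule_db, rule_user, network, _method = rule
        if rule_db not in ("all", database):
            continue
        if rule_user not in ("all", user):
            continue
        if address in ipaddress.ip_network(network):
            return rule
    return None
```

With the dspace rule listed first, a local connection by the dspace user to the dspace database is governed by its md5 method; if a catch-all rule for all databases were listed first, that rule would win instead.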

Oracle 10g or later

Please be aware that all active development occurs on PostgreSQL at this time. However, we provide
Oracle as a secondary option if you are less comfortable with PostgreSQL.

• Details on acquiring Oracle can be found at http://www.oracle.com/database/. You will need to create a
database for DSpace. Make sure that the character set is one of the Unicode character sets. DSpace uses UTF-8
natively, and it is suggested that the Oracle database use the same character set. You will also need to create
a user account for DSpace (e.g. dspace) and ensure that it has permissions to add and remove tables in the
database. Refer to the Quick Installation for more details.
• NOTE: If the database server is not on the same machine as DSpace, you must install the Oracle client
on the DSpace server and point the tnsnames.ora and listener.ora files to the Oracle database server.

2.2.1.6 Apache Solr 8.x (full-text index/search service)

Make sure to install Solr with Authentication disabled (which is the default). DSpace does not yet support
authentication to Solr (see https://github.com/DSpace/DSpace/issues/3169). Instead, we recommend
placing Solr behind a firewall and/or ensuring port 8983 (which Solr runs on) is not available for public/
anonymous access on the web. Solr only needs to be accessible to requests from the DSpace backend.

Solr can be obtained at the Apache Software Foundation site for Lucene and Solr118. You may wish to read portions
of the quick-start tutorial119 to make yourself familiar with Solr's layout and operation. Unpack a Solr .tgz or .zip
archive in a place where you keep software that is not handled by your operating system's package management
tools, and arrange to have it running whenever DSpace is running. You should ensure that Solr's index directories
will have plenty of room to grow. You should also ensure that port 8983 is not in use by something else, or configure
Solr to use a different port.
If you are looking for a good place to put Solr, consider /opt or /usr/local. You can simply unpack Solr in one
place and use it. Or you can configure Solr to keep its indexes elsewhere, if you need to – see the Solr
documentation for how to do this.
It is not necessary to dedicate a Solr instance to DSpace, if you already have one and want to use it. Simply copy
DSpace's cores to a place where they will be discovered by Solr. See below.

2.2.1.7 Servlet Engine (Apache Tomcat 9, Jetty, Caucho Resin or equivalent)

• Apache Tomcat 9. Tomcat can be downloaded from the following location: http://tomcat.apache.org120. It
is also provided via many operating system package managers.
• The Tomcat owner (i.e. the user that Tomcat runs as) must have read/write access to the DSpace
installation directory (i.e. [dspace]). There are a few common ways this may be achieved:

118 https://lucene.apache.org/solr
119 http://lucene.apache.org/solr/guide/7_7/solr-tutorial.html
120 http://tomcat.apache.org/whichversion.html

• One option is to specifically give the Tomcat user (often named "tomcat") ownership of the
[dspace] directories, for example:

# Change [dspace] and all subfolders to be owned by "tomcat"

chown -R tomcat:tomcat [dspace]

• Another option is to have Tomcat itself run as a new user named "dspace" (see installation
instructions below). Some operating systems make the Tomcat "run as" user easy to change
via an environment variable named TOMCAT_USER. This option may be more
desirable if you have multiple Tomcat instances running, and you do not want all of them to
run under the same Tomcat owner.
• You need to ensure that Tomcat a) has enough memory to run DSpace, and b) uses UTF-8 as its
default file encoding for international character support. So ensure in your startup scripts (etc.) that
the following environment variable is set: JAVA_OPTS="-Xmx512M -Xms64M -Dfile.encoding=UTF-8"
• Modifications in [tomcat]/conf/server.xml: You also need to alter Tomcat's default configuration to
support searching and browsing of multi-byte UTF-8 correctly. To do so, add the configuration
option URIEncoding="UTF-8" to the <Connector> element in [tomcat]/conf/server.xml.
You may change the port from 8080 by editing it in the file above, and by setting the variable
CONNECTOR_PORT in server.xml. You should set the URIEncoding even if you are running Tomcat
behind a proxy (Apache HTTPD, Nginx, etc.) via AJP.
• Jetty or Caucho Resin
• DSpace 7 has not been tested with Jetty or Caucho Resin since the switch to Java 11.
• Older versions of DSpace were able to run on a Tomcat-equivalent servlet engine, such as Jetty
(https://www.eclipse.org/jetty/) or Caucho Resin (http://www.caucho.com/). If you choose to use a
different servlet container, please ensure that it supports Servlet Spec 3.1 (or above).
• Jetty and Resin are configured for correct handling of UTF-8 by default.
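
For Tomcat, the URIEncoding requirement described above can be checked with a few lines of Python. The sketch below is an illustration rather than an official tool: the function name is an assumption, and it simply parses the text of a server.xml file and lists the ports of any <Connector> elements that do not declare URIEncoding="UTF-8".

```python
import xml.etree.ElementTree as ET


def connectors_missing_uri_encoding(server_xml_text, encoding="UTF-8"):
    """Return the ports of <Connector> elements lacking URIEncoding=encoding."""
    root = ET.fromstring(server_xml_text)
    missing = []
    for connector in root.iter("Connector"):
        # Compare case-insensitively; an absent attribute counts as missing.
        if connector.get("URIEncoding", "").upper() != encoding.upper():
            missing.append(connector.get("port"))
    return missing
```

Reading [tomcat]/conf/server.xml into a string and passing it to this function returns an empty list when every connector declares the attribute.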

2.2.1.8 (Optional) IP to City Database for Location-based Statistics

Optionally, if you wish to record the geographic locations of clients in DSpace usage statistics records, you will need
to install (and regularly update) one of the following:
• Either, a copy of MaxMind's GeoLite2 City database121 (in MMDB format)
• NOTE: Installing MaxMind GeoLite2 is free. However, you must sign up for a (free) MaxMind account
in order to obtain a license key to use the GeoLite2 database.
• You may download GeoLite2 directly from MaxMind, or many Linux distributions provide the
geoipupdate tool directly via their package manager. You will still need to configure your license
key prior to usage.

121 https://dev.maxmind.com/geoip/geoip2/geolite2/

• Once the "GeoLite2-City.mmdb" database file is installed on your system, you will need to configure
its location as the value of usage-statistics.dbfile in your local.cfg configuration file.
• See the "Managing the City Database File" section of SOLR Statistics for more information
about using a City Database with DSpace.
• Or, you can alternatively use/install DB-IP's City Lite database122 (in MMDB format)
• This database is also free to use, but does not require an account to download.
• Once the "dbip-city-lite.mmdb" database file is installed on your system, you will need to configure
its location as the value of usage-statistics.dbfile in your local.cfg configuration file.
• See the "Managing the City Database File" section of SOLR Statistics for more information
about using a City Database with DSpace.

2.2.1.9 Git (code version control)

Currently, there is a known bug in DSpace where a third-party Maven Module expects git to be available (in order
to support the ./dspace version command-line tool). We are working on a solution within this ticket:
"DS-3418"123 - DSpace Build Error When Git is Not Present
For the time being, you can work around this problem by installing Git locally: https://git-scm.com/downloads

2.2.2 Backend Installation

1. Install all the Backend Requirements listed above.
2. Create a DSpace operating system user (optional). As noted in the prerequisites above, Tomcat (or Jetty,
etc) must run as an operating system user account that has full read/write access to the DSpace installation
directory (i.e. [dspace]). Either you must ensure the Tomcat owner also owns [dspace], OR you can
create a new "dspace" user account, and ensure that Tomcat also runs as that account:

useradd -m dspace

The choice that makes the most sense for you will probably depend on how you installed your servlet
container (Tomcat/Jetty/etc). If you installed it from source, you will need to create a user account to run it,
and that account can be named anything, e.g. 'dspace'. If you used your operating system's package
manager to install the container, then a user account should have been created as part of that process and it
will be much easier to use that account than to try to change it.
3. Download the latest DSpace release124 from the DSpace GitHub Repository. You can choose to either
download the zip or tar.gz file provided by GitHub, or you can use "git" to checkout the appropriate tag (e.g.
dspace-7.0) or branch.
4. Unpack the DSpace software. After downloading the software, based on the compression file format,
choose one of the following methods to unpack your software:
a. Zip file. If you downloaded dspace-7.0.zip do the following:

unzip dspace-7.0.zip

b. .gz file. If you downloaded dspace-7.0.tar.gz do the following:

122 https://db-ip.com/db/download/ip-to-city-lite
123 https://jira.lyrasis.org/browse/DS-3418?src=confmacro
124 https://github.com/DSpace/DSpace/releases

gunzip -c dspace-7.0.tar.gz | tar -xf -

For ease of reference, we will refer to the location of this unzipped version of the DSpace release as
[dspace-source] in the remainder of these instructions. After unpacking the file, the user may wish to
change the ownership of the dspace-7.x folder to the "dspace" user. (And you may need to change
the group).
5. Database Setup
• PostgreSQL:
• Create a dspace database user (this user can have any name, but we'll assume you name it
"dspace"). This is entirely separate from the dspace operating-system user created above:

createuser --username=postgres --no-superuser --pwprompt dspace

You will be prompted (twice) for a password for the new dspace user. Then you'll be
prompted for the password of the PostgreSQL superuser (postgres).
• Create a dspace database, owned by the dspace PostgreSQL user. Similar to the previous
step, this can only be done by a "superuser" account in PostgreSQL (e.g. postgres):

createdb --username=postgres --owner=dspace --encoding=UNICODE dspace

You will be prompted for the password of the PostgreSQL superuser (postgres).
• Finally, you MUST enable the pgcrypto extension125 on your new dspace database. Again, this
can only be enabled by a "superuser" account (e.g. postgres)

# Login to the database as a superuser, and enable the pgcrypto extension on this database
psql --username=postgres dspace -c "CREATE EXTENSION pgcrypto;"

The "CREATE EXTENSION" command should return with no result if it succeeds. If it fails or
throws an error, it is likely you are missing the required pgcrypto extension (see Database
Prerequisites126 above).
• Alternative method: How to enable pgcrypto via a separate database schema. While
the above method of enabling pgcrypto is perfectly fine for the majority of users, there
may be some scenarios where a database administrator would prefer to install
extensions into a database schema that is separate from the DSpace tables.
Developers also may wish to install pgcrypto into a separate schema if they plan to
"clean" (recreate) their development database frequently. Keeping extensions in a
separate schema from the DSpace tables ensures that developers do NOT have to
re-enable the extension each time they run "./dspace database clean". If you wish
to install pgcrypto in a separate schema, here's how to do that:

125 http://www.postgresql.org/docs/9.4/static/pgcrypto.html
126 https://wiki.duraspace.org/display/DSDOC6x/Installing+DSpace#InstallingDSpace-RelationalDatabase:(PostgreSQLorOracle)

# Login to the database as a superuser (this opens an interactive psql session)
psql --username=postgres dspace

-- The remaining statements are entered at the psql prompt
-- Create a new schema in this database named "extensions" (or whatever you want to name it)
CREATE SCHEMA extensions;
-- Enable this extension in this new schema
CREATE EXTENSION pgcrypto SCHEMA extensions;
-- Grant rights to call functions in the extensions schema to your dspace user
GRANT USAGE ON SCHEMA extensions TO dspace;

-- Append "extensions" on the current session's "search_path" (if it doesn't already exist in search_path)
-- The "search_path" config is the list of schemas that Postgres will use
SELECT set_config('search_path',current_setting('search_path') || ',extensions',false) WHERE current_setting('search_path') !~ '(^|,)extensions(,|$)';
-- Verify the current session's "search_path" and make sure it's correct
SHOW search_path;
-- Now, update the "dspace" Database to use the same "search_path" (for all future sessions)
-- as we've set for this current session (i.e. via set_config() above)
ALTER DATABASE dspace SET search_path FROM CURRENT;
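
The set_config() statement above appends "extensions" to the search_path only when it is not already listed. The following Python sketch mirrors that logic so the behaviour is easier to see; the function name is an assumption for this example, and the pattern is the same (^|,)extensions(,|$) expression used in the SQL, which only recognises an entry delimited by commas without surrounding spaces.

```python
import re


def append_schema(search_path, schema="extensions"):
    """Append `schema` to a search_path string unless it is already listed."""
    pattern = r"(^|,)" + re.escape(schema) + r"(,|$)"
    if re.search(pattern, search_path):
        # Already present: leave the search_path unchanged.
        return search_path
    return search_path + "," + schema
```

Applied to a search_path of "$user", public, the function returns the same value with ,extensions appended; applying it a second time leaves the value unchanged.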

• Oracle:
• Setting up DSpace to use Oracle is a bit different now. You will still need to get a copy of
the Oracle JDBC driver, but instead of copying it into a lib directory you will need to install it
into your local Maven repository. (You'll need to download it first from this location:
http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html.) Run
the following command (all on one line):

mvn install:install-file -Dfile=ojdbc6.jar -DgroupId=com.oracle -DartifactId=ojdbc6 -Dversion=11.2.0.4.0 -Dpackaging=jar -DgeneratePom=true

• You need to compile DSpace with an Oracle driver (ojdbc6.jar) corresponding to your Oracle
version - update the version in [dspace-source]/pom.xml. E.g.:

<dependency>
  <groupId>com.oracle</groupId>
  <artifactId>ojdbc6</artifactId>
  <version>11.2.0.4.0</version>
</dependency>

• Create a database for DSpace. Make sure that the character set is one of the Unicode
character sets. DSpace uses UTF-8 natively, and it is required that the Oracle database use the
same character set. Create a user account for DSpace (e.g. dspace) and ensure that it has
permissions to add and remove tables in the database.

• NOTE: You will need to ensure the proper db.* settings are specified in your local.cfg file
(see next step), as the defaults for all of these settings assume a PostgreSQL database
backend.

db.url = jdbc:oracle:thin:@host:port/SID
# e.g. db.url = jdbc:oracle:thin:@//localhost:1521/xe
# NOTE: in db.url, SID is the SID of your database defined in tnsnames.ora
# the default Oracle port is 1521
# You may also use a full SID definition, e.g.
# db.url = jdbc:oracle:thin:@(description=(address_list=(address=(protocol=TCP)(host=localhost)(port=1521)))(connect_data=(service_name=DSPACE)))

# Oracle driver and dialect
db.driver = oracle.jdbc.OracleDriver
db.dialect = org.hibernate.dialect.Oracle10gDialect

# Specify DB username, password and schema to use
db.username =
db.password =
db.schema = ${db.username}
# For Oracle, schema is equivalent to the username of your database account,
# so this may be set to ${db.username} in most scenarios

• Later, during the Maven build step, don't forget to specify mvn -Ddb.name=oracle package
6. Initial Configuration (local.cfg): Create your own [dspace-source]/dspace/config/local.cfg
configuration file. You may wish to simply copy the provided
[dspace-source]/dspace/config/local.cfg.EXAMPLE. This local.cfg file can be used to store any configuration
changes that you wish to make which are local to your installation (see the local.cfg configuration file
documentation). ANY setting may be copied into this local.cfg file from the dspace.cfg or any other *.cfg file
in order to override the default setting (see note below). For the initial installation of DSpace, there are
some key settings you'll likely want to override. Those are provided in the
[dspace-source]/dspace/config/local.cfg.EXAMPLE. (NOTE: Settings followed by an asterisk (*) are highly
recommended, while all others are optional during initial installation and may be customized at a later time.)
• dspace.dir* - must be set to the [dspace] (installation) directory (NOTE: On Windows be sure to use
forward slashes for the directory path! For example: "C:/dspace" is a valid path for Windows.)
• dspace.server.url* - complete URL of this DSpace backend (including port and any subpath). Do
not end with '/'. For example: http://localhost:8080/server
• dspace.ui.url* - complete URL of the DSpace frontend (including port and any subpath).
REQUIRED for the REST API to fully trust requests from the DSpace frontend. Do not end with '/'. For
example: http://localhost:4000
• dspace.name - Human-readable, "proper" name of your server, e.g. "My Digital Library".
• solr.server* - complete URL of the Solr server. DSpace makes use of Solr128 for indexing
purposes. Use http://localhost:8983/solr unless you changed the port or installed Solr on some other
host.
• default.language - Default language for all metadata values (defaults to "en_US")
• db.url* - The full JDBC URL to your database (examples are provided in the local.cfg.EXAMPLE)

128 http://lucene.apache.org/solr/

• db.driver* - Which database driver to use, based on whether you are using PostgreSQL or Oracle

• db.dialect* - Which database dialect to use, based on whether you are using PostgreSQL or
Oracle
• db.username* - the database username used in the previous step.
• db.password* - the database password used in the previous step.
• db.schema* - the database schema to use (examples are provided in the local.cfg.EXAMPLE)
• mail.server - fully-qualified domain name of your outgoing mail server.
• mail.from.address - the "From:" address to put on email sent by DSpace.
• mail.feedback.recipient - mailbox for feedback mail.
• mail.admin - mailbox for DSpace site administrator.
• alert.recipient - mailbox for server errors/alerts (not essential but very useful!)
• registration.notify - mailbox for emails when new users register (optional)

 Your local.cfg file can override ANY settings from other *.cfg files in DSpace
The provided local.cfg.EXAMPLE only includes a small subset of the configuration
settings available with DSpace. It provides a good starting point for your own local.cfg
file.
However, you should be aware that ANY configuration can now be copied into your
local.cfg to override the default settings. This includes ANY of the settings/
configurations in:
• The primary dspace.cfg file ([dspace]/config/dspace.cfg)
• Any of the module configuration files ([dspace]/config/modules/*.cfg files)
• Any of the Spring Boot settings ([dspace-src]/dspace-server-webapp/src/
main/resources/application.properties)
Individual settings may also be commented out or removed in your local.cfg, in order to
re-enable default settings.
See the Configuration Reference(see page 552) section for more details.
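Because dspace.server.url and dspace.ui.url must not end with '/', a quick check of local.cfg before building can catch a common mistake. The following is a minimal sketch and is not part of DSpace: the function name check_no_trailing_slash and the sample file are assumptions made for illustration, while the property names and example values are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with DSpace): report any dspace.server.url
# or dspace.ui.url setting in a local.cfg-style file whose value ends with '/'.
check_no_trailing_slash() {
    cfg_file="$1"
    # Matches e.g. "dspace.ui.url = http://host:4000/" (commented-out lines are ignored).
    if grep -E '^[[:space:]]*dspace\.(server|ui)\.url[[:space:]]*=.*/[[:space:]]*$' "$cfg_file"; then
        echo "ERROR: the URL(s) above must not end with '/'" >&2
        return 1
    fi
    echo "OK: no trailing slashes found"
}

# Illustrative sample file using the example values from this section.
sample_cfg="$(mktemp)"
cat > "$sample_cfg" <<'EOF'
dspace.dir = /dspace
dspace.server.url = http://localhost:8080/server
dspace.ui.url = http://localhost:4000
solr.server = http://localhost:8983/solr
EOF

check_no_trailing_slash "$sample_cfg"
```

To check a real installation, the same function could be pointed at [dspace-source]/dspace/config/local.cfg instead of the sample file.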

7. DSpace Directory: Create the directory for the DSpace backend installation (i.e. [dspace]). As root (or a
user with appropriate permissions), run:

mkdir [dspace]
chown dspace [dspace]

(Assuming the dspace UNIX username.)

8. Build the Installation Package: As the dspace UNIX user, generate the DSpace installation package.

cd [dspace-source]
mvn package


 Building with Oracle Database Support

Without any extra arguments, the DSpace installation package is initialized for PostgreSQL. If you
want to use Oracle instead, you should build the DSpace installation package as follows:
mvn -Ddb.name=oracle package

9. Install DSpace Backend: As the dspace UNIX user, install DSpace to [dspace]:

cd [dspace-source]/dspace/target/dspace-installer
ant fresh_install

 To see a complete list of build targets, run: ant help The most likely thing to go wrong here is the
test of your database connection. See the Common Installation Issues(see page 71) Section below for
more details.

10. Initialize your Database: While this step is optional (as the DSpace database should auto-initialize itself on
first startup), it's always good to verify one last time that your database connection is working properly. To
initialize the database run:

[dspace]/bin/dspace database migrate

11. Deploy Server web application: The DSpace backend consists of a single "server" webapp (in [dspace]/
webapps/server). You need to deploy this webapp into your Servlet Container (e.g. Tomcat). Generally,
there are two options (or techniques) which you could use...either configure Tomcat to find the DSpace
"server" webapp, or copy the "server" webapp into Tomcat's own webapps folder.
• Technique A. Tell your Tomcat/Jetty/Resin installation where to find your DSpace web application(s).
As an example, in the directory [tomcat]/conf/Catalina/localhost you could add files similar
to the following (but replace [dspace] with your installation location):

DEFINE A CONTEXT PATH FOR DSpace Server webapp: server.xml

<?xml version='1.0'?>
<Context
docBase="[dspace]/webapps/server"/>

The name of the file (not including the suffix ".xml") will be the name of the context, so for
example server.xml defines the context at http://host:8080/server. To define the root
context (http://host:8080/), name that context's file ROOT.xml. Optionally, you can also choose
to install the old, deprecated "rest" webapp (see step 12).
• Technique B. Simple and complete. Copy the DSpace web application(s) you wish to use (one or
all of them) from the [dspace]/webapps directory to the appropriate directory in your Tomcat/Jetty/Resin
installation. For example:
cp -R [dspace]/webapps/* [tomcat]/webapps (This will copy all the web applications to
Tomcat.)
cp -R [dspace]/webapps/server [tomcat]/webapps (This will copy only the Server web
application to Tomcat.)


To define the root context (http://host:8080/), name that context's directory ROOT.
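Technique A above can be scripted. The following is a minimal sketch under stated assumptions: the function name write_server_context is hypothetical, the demonstration writes into a throwaway directory rather than a real Tomcat installation, and /dspace stands in for your [dspace] directory. The file contents are the ones shown in the server.xml example above.

```shell
#!/bin/sh
# Hypothetical helper: write a Tomcat context file for the DSpace "server" webapp.
#   $1 = DSpace installation directory ([dspace])
#   $2 = Tomcat context directory (e.g. [tomcat]/conf/Catalina/localhost)
write_server_context() {
    dspace_dir="$1"
    context_dir="$2"
    mkdir -p "$context_dir" || return 1
    # The file name (minus ".xml") becomes the context path, i.e. /server
    cat > "$context_dir/server.xml" <<EOF
<?xml version='1.0'?>
<Context
    docBase="$dspace_dir/webapps/server"/>
EOF
}

# Demonstration against a temporary directory, not a real Tomcat.
demo_dir="$(mktemp -d)"
write_server_context "/dspace" "$demo_dir/conf/Catalina/localhost"
cat "$demo_dir/conf/Catalina/localhost/server.xml"
```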
12. Optionally, also install the deprecated DSpace 6.x REST API web application. If you previously used the
DSpace 6.x REST API, for backwards compatibility the old, deprecated "rest" webapp is still available to
install (in [dspace]/webapps/rest). It is NOT used by the DSpace frontend. So, most users should skip
this step.
13. Copy Solr cores: DSpace installation creates a set of four empty Solr cores already configured.
a. Copy them from [dspace]/solr to the place where your Solr instance will discover them. For
example:

cp -R [dspace]/solr/* [solr]/server/solr/configsets
chown -R solr:solr [solr]/server/solr/configsets

b. Start (or re-start) Solr. For example:

[solr]/bin/solr restart

c. You can check the status of Solr and your new DSpace cores by using its administrative web
interface. Browse to http://localhost:8983/ to see if Solr is running well, then look at the cores
by selecting (on the left) Core Admin or using the Core Selector drop list.
14. Create an Administrator Account: Create an initial administrator account from the command line:

[dspace]/bin/dspace create-administrator

15. Initial Startup! Now the moment of truth! Start up (or restart) Tomcat/Jetty/Resin.
a. REST API Interface - (e.g.) http://dspace.myu.edu:8080/server/
b. OAI-PMH Interface - (e.g.) http://dspace.myu.edu:8080/server/oai/request?verb=Identify
c. For an example of what the default backend looks like, visit the Demo Backend: https://api7.dspace.org/server/
16. Production Installation (adding HTTPS support): Running the DSpace Backend on HTTP & port 8080 is
only usable for testing/demo environments. If you want to run DSpace in Production, you MUST run the
backend with HTTPS support (otherwise logins will not work outside of your local domain).
a. For HTTPS support, we recommend installing either Apache HTTPD130 or Nginx131, configuring SSL at
that level, and proxying all requests to your Tomcat installation. Keep in mind, if you want to host
both the DSpace Backend and Frontend on the same server, you can use one installation of Apache
HTTPD or Nginx to manage HTTPS/SSL and proxy to both.
b. These instructions are specific to Apache HTTPD, but a similar setup can be achieved with Nginx.
i. Install Apache HTTPD132, e.g. sudo apt install apache2
ii. Install the mod_proxy133 and mod_proxy_ajp134 modules, e.g. sudo a2enmod proxy; sudo
a2enmod proxy_ajp
1. Alternatively, you can choose to use mod_proxy_http135 to create an http proxy. A
separate example is commented out below.
iii. Restart Apache to enable the modules.

130 https://httpd.apache.org/
131 https://www.nginx.com/
132 https://httpd.apache.org/
133 https://httpd.apache.org/docs/current/mod/mod_proxy.html
134 https://httpd.apache.org/docs/current/mod/mod_proxy_ajp.html
135 https://httpd.apache.org/docs/current/mod/mod_proxy_http.html


iv. For mod_proxy_ajp to communicate with Tomcat, you'll need to enable Tomcat's AJP
connector in your Tomcat's server.xml:

<Connector protocol="AJP/1.3" port="8009" redirectPort="8443" URIEncoding="UTF-8" />

v. Now, setup a new VirtualHost for your site (using HTTPS / port 443) which proxies all requests
to Tomcat's AJP connector (running on port 8009)

<VirtualHost _default_:443>
# ... set up your host how you want, including log settings ...

SSLEngine on
SSLCertificateFile [full-path-to-PEM-cert]
SSLCertificateKeyFile [full-path-to-cert-KEY]

# Proxy all HTTPS requests to "/server" from Apache to Tomcat via AJP connector
ProxyPass /server ajp://localhost:8009/server
ProxyPassReverse /server ajp://localhost:8009/server

# If you would rather use mod_proxy_http as an http proxy to port 8080
# then use these settings instead
#ProxyPass /server http://localhost:8080/server
#ProxyPassReverse /server http://localhost:8080/server
# When using mod_proxy_http, you need to also ensure the X-Forwarded-Proto header is sent
# to tell DSpace it is behind HTTPS, otherwise some URLs may continue to use HTTP
# (requires installing/enabling mod_headers)
#RequestHeader set X-Forwarded-Proto https
</VirtualHost>

c. After switching to HTTPS, make sure to go back and update the URLs (e.g. dspace.server.url) in
your local.cfg to match the new URL of your backend. This will require briefly restarting Tomcat.

2.3 Installing the Frontend (User Interface)

2.3.1 Frontend Requirements

• UNIX-like OS or Microsoft Windows(see page 66)
• Node.js (v12.x or v14.x)(see page 67)
• Yarn (v1.x)(see page 67)
• PM2 (or another Process Manager for Node.js apps) (optional, but recommended for
Production)(see page 67)
• DSpace 7.x Backend (see above)(see page 67)

2.3.1.1 UNIX-like OS or Microsoft Windows


• Microsoft Windows: While DSpace can be run on Windows servers, most institutions tend to run it on a UNIX-
like operating system.

2.3.1.2 Node.js (v12.x or v14.x)

• Node.js can be found at https://nodejs.org/. It may be available through your Linux distribution's package
manager. We recommend running a Long Term Support (LTS) version136 (even numbered releases). Non-
LTS versions (odd numbered releases) are not recommended.
• Node.js is a JavaScript runtime that also provides npm137 (Node Package Manager). It is used to both build
and run the frontend.
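Before installing the frontend, it can help to confirm that the Node.js on your PATH is one of the versions named above. This is a minimal sketch: the function name node_version_supported is an assumption for illustration, and it only inspects the major version in the string that "node --version" prints (e.g. "v14.17.0").

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a Node.js version string (as printed by
# "node --version") is one of the major versions listed above (v12.x or v14.x).
node_version_supported() {
    case "$1" in
        v12.*|v14.*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Check the locally installed Node.js, if there is one.
if command -v node >/dev/null 2>&1; then
    installed="$(node --version)"
    if node_version_supported "$installed"; then
        echo "Node.js $installed is supported"
    else
        echo "Node.js $installed is not v12.x or v14.x" >&2
    fi
else
    echo "Node.js is not installed (or not on the PATH)" >&2
fi
```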

2.3.1.3 Yarn (v1.x)

• Yarn v1.x is available at https://classic.yarnpkg.com/. It can usually be installed via npm (or through your Linux
distribution's package manager). We do NOT currently support Yarn v2.

# You may need to run this command using "sudo" if you don't have proper privileges
npm install --global yarn

• Yarn is used to build/install the frontend.

2.3.1.4 PM2 (or another Process Manager for Node.js apps) (optional, but recommended for
Production)
• In Production scenarios, we highly recommend starting/stopping the User Interface using a Node.js process
manager. There are several available, but our current favorite is PM2138. The rest of this installation guide
assumes you are using PM2.
• PM2139 is very easily installed via NPM

# You may need to run this command using "sudo" if you don't have proper privileges
npm install --global pm2

2.3.1.5 DSpace 7.x Backend (see above)

• The DSpace User Interface (Frontend) cannot function without an installed DSpace Backend. Follow the
instructions above.
• The Frontend and Backend do not need to be installed on the same machine/server. They may be installed on
separate machines as long as the two machines can connect to one another via HTTP or HTTPS.

2.3.2 Frontend Installation

1. First, install all the Frontend Requirements(see page 66) listed above & verify the backend/REST API is publicly
accessible.

136 https://nodejs.org/en/about/releases/
137 https://www.npmjs.com/
138 https://pm2.keymetrics.io/
139 https://pm2.keymetrics.io/


2. Download the latest dspace-angular release140 from the DSpace GitHub repository. You can choose to either
download the zip or tar.gz file provided by GitHub, or you can use "git" to checkout the appropriate tag (e.g.
dspace-7.0) or branch.
3. Install all necessary local dependencies by running the following from within the unzipped "dspace-angular"
directory

# change directory to our repo

cd dspace-angular

# install the local dependencies
yarn install

4. Create a Production Configuration file at [dspace-angular]/src/environments/environment.prod.ts.
You may wish to use the environment.template.ts as a starting point. This
environment.prod.ts file can be used to override any of the default configurations specified in the
environment.common.ts (in that same directory). At a minimum this file MUST include the "ui" and "rest"
sections similar to the following (keep in mind, you only need to include settings that you need to modify):

export const environment = {

// The "ui" section defines where you want Node.js to run/respond. It may correspond to your
// primary URL, but it also may not (if you are running behind a proxy).
// In this example, we are setting up our UI to just use localhost, port 4000.
// This is a common setup for when you want to use Apache or Nginx to handle HTTPS and proxy
// requests to Node on port 4000
ui: {
ssl: false,
host: 'localhost',
port: 4000,
// NOTE: Space is capitalized because 'namespace' is a reserved string in TypeScript
nameSpace: '/'
},
// This example is valid if your Backend is publicly available at https://api.mydspace.edu/server/
// The REST settings MUST correspond to the primary URL of the backend. Usually, this means they
// must be kept in sync with the value of "dspace.server.url" in the backend's local.cfg
rest: {
ssl: true,
host: 'api.mydspace.edu',
port: 443,
// NOTE: Space is capitalized because 'namespace' is a reserved string in TypeScript
nameSpace: '/server'
}
};

a. HINT #1: In the "ui" section above, you may wish to start with "ssl: false" and "port: 4000" just to be
certain that everything else is working properly. With those settings, you can quickly test your UI by
running "yarn start" and trying to access it via http://[mydspace.edu]:4000/ from your web
browser. KEEP IN MIND, we highly recommend always using HTTPS for Production.
b. HINT #2: If Node throws an error saying "listen EADDRNOTAVAIL: address not available", try setting
the "host" to "0.0.0.0" or "localhost". Usually that error is a sign that the "host" is not recognized.

140 https://github.com/DSpace/dspace-angular/releases


c. If there are other settings you know you need to modify in the default environment.common.ts
configuration file you can also copy them into this same file.
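Since the "rest" section must stay in sync with dspace.server.url, it can be useful to spell out the URL that a given set of ssl/host/port/nameSpace values implies and compare it with the backend setting. The following is a sketch only: build_url is a hypothetical helper, it illustrates the intended correspondence rather than how dspace-angular assembles URLs internally, and the server_url value is an assumed local.cfg setting matching the example above.

```shell
#!/bin/sh
# Hypothetical helper: assemble the URL implied by a "rest" (or "ui") section.
#   $1 = ssl ("true" or "false"), $2 = host, $3 = port, $4 = nameSpace
# The default port for the scheme (443 for https, 80 for http) is omitted.
build_url() {
    if [ "$1" = "true" ]; then scheme="https"; default_port="443"
    else scheme="http"; default_port="80"; fi
    if [ "$3" = "$default_port" ]; then authority="$2"
    else authority="$2:$3"; fi
    # Drop a trailing slash so that a nameSpace of "/" yields no path at all.
    printf '%s://%s%s\n' "$scheme" "$authority" "${4%/}"
}

# Values from the example "rest" section above.
rest_url="$(build_url true api.mydspace.edu 443 /server)"
# Assumed value of dspace.server.url in the backend's local.cfg.
server_url="https://api.mydspace.edu/server"

if [ "$rest_url" = "$server_url" ]; then
    echo "rest settings match dspace.server.url"
else
    echo "MISMATCH: $rest_url vs $server_url" >&2
fi
```

The same helper applied to the example "ui" section (false, localhost, 4000, /) yields http://localhost:4000, which is the form expected for dspace.ui.url when no proxy sits in front of Node.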
5. Build the User Interface for Production. This uses your environment.prod.ts and the source code to
create a compiled version of the UI in the [dspace-angular]/dist folder

yarn run build:prod

a. HINT: if you change/update your environment.prod.ts, then you will need to rebuild the UI
application (i.e. rerun this command).
6. Assuming you are using PM2141, create a JSON configuration file describing how to run our UI application.
This need NOT be in the same directory as the dspace-angular codebase itself (in fact you may want to put
it in the parent directory or another location). Keep in mind the "cwd" setting (on line 5) must be the full path to
your [dspace-angular] folder.

dspace-angular.json

{
"apps": [
{
"name": "dspace-angular",
"cwd": "/home/dspace/dspace-angular",
"script": "yarn",
"args": "run serve:ssr",
"interpreter": "none"
}
]
}

7. Now, start the application using PM2 using the configuration file you created in the previous step

# In this example, we are assuming the config is named "dspace-angular.json"

pm2 start dspace-angular.json

# To see the logs, you'd run
# pm2 logs

# To stop it, you'd run
# pm2 stop dspace-angular.json

a. For more PM2 commands see https://pm2.keymetrics.io/docs/usage/quick-start/

b. HINT: You may also want to install/configure pm2-logrotate142 to ensure that PM2's log folder doesn't
fill up over time.
8. At this point, the User Interface should be available at the URL you configured in your
environment.prod.ts
a. For an example of what the default frontend looks like, visit the Demo Frontend: https://demo7.dspace.org/
9. For HTTPS (port 443) support, you have two options

141 https://pm2.keymetrics.io/
142 https://github.com/keymetrics/pm2-logrotate


a. (Recommended) You can install either Apache HTTPD143 or Nginx144 , configuring SSL at that level,
and proxy requests to PM2 (on port 4000). This is our current recommended approach. Plus, as a
bonus, if you want to host the UI and Backend on the same server, you can use just one Apache
HTTPD (or Nginx) to proxy to both. These instructions are specific to Apache.
i. Install Apache HTTPD145, e.g. sudo apt install apache2
ii. Install the mod_proxy146 and mod_proxy_http147 modules, e.g. sudo a2enmod proxy; sudo
a2enmod proxy_http
iii. Restart Apache to enable the modules.
iv. Now, setup a new VirtualHost for your site (preferably using HTTPS / port 443) which proxies
all requests to PM2 running on port 4000.

<VirtualHost _default_:443>
# ... set up your host how you want, including log settings ...

SSLEngine on
SSLCertificateFile [full-path-to-PEM-cert]
SSLCertificateKeyFile [full-path-to-cert-KEY]

# Proxy all HTTPS requests from Apache to PM2 on port 4000
# NOTE that this proxy URL must match the "ui" settings in your environment.*.ts
ProxyPass / http://localhost:4000/
ProxyPassReverse / http://localhost:4000/
</VirtualHost>

b. (Alternatively) You can use the basic HTTPS support built into dspace-angular node server. (This may
currently be better for non-Production environments as it has not been well tested)
i. Create a [dspace-angular]/config/ssl/ folder and add a key.pem and cert.pem to that
folder (they must have those exact names)
ii. Enable "ui.ssl" (set to true)
iii. Update your "ui.port" to be 443
1. In order to run Node/PM2 on port 443, you also will likely need to provide node with
special permissions, like in this example148.
iv. Rebuild and then restart the app in PM2
v. Keep in mind, while this setup is simple, you may not have the same level of detailed,
Production logs as you would with Apache HTTPD or Nginx
10. Additional UI configurations are described in the environment.common.ts149 and at
https://github.com/DSpace/dspace-angular/blob/main/docs/Configuration.md (More documentation will be coming soon)

2.4 What Next?

After a successful installation, you may want to take a closer look at
• Scheduled Tasks via Cron(see page 481) : Several DSpace features require that a command-line script is run
regularly via cron.
• Configuration Reference(see page 552) : Details on the configuration options available to the Backend

143 https://httpd.apache.org/
144 https://www.nginx.com/
145 https://httpd.apache.org/
146 https://httpd.apache.org/docs/current/mod/mod_proxy.html
147 https://httpd.apache.org/docs/current/mod/mod_proxy_http.html
148 https://levelup.gitconnected.com/tws-004-how-to-configure-nodejs-to-use-port-443-86f1ca801c5f
149 https://github.com/DSpace/dspace-angular/blob/main/src/environments/environment.common.ts


• Handle Server installation(see page 464): Optionally, you may wish to enable persistent URLs for your DSpace
site using CNRI's Handle.Net Registry
• Statistics and Metrics(see page 338): Optionally, you may wish to configure one (or more) Statistics options
within DSpace, including Google Analytics(see page 363) and (internal) Solr Statistics(see page 338)
• Multilingual Support(see page 407): Optionally, you may wish to enable multilingual support in your DSpace
site.
• Using DSpace(see page 87) : Various other pages which describe usage and additional configurations related to
other DSpace features.
• System Administration(see page 410): Various other pages which describe additional backend installation
options/configurations.
If you've run into installation problems, you may want to...
• Review Common Installation Issues(see page 71) (see below)
• Ask for Support150 via one of the support options documented on that page

2.5 Common Installation Issues

2.5.1 Troubleshoot an error or find detailed error messages

See the Troubleshoot an error151 page and look for the section on "DSpace 7.x". This will provide you with hints on locating
error messages both in the User Interface (frontend) and in the REST API (backend).

2.5.2 "CORS error" or "Invalid CORS request"

If you are seeing a CORS error in your browser, this means that you are accessing the REST API via an "untrusted"
client application. To fix this error, you must change your REST API / Backend configuration to trust the application.
• By default, the DSpace REST API / Backend will only trust the application at dspace.ui.url. Therefore, you
should first verify that your dspace.ui.url setting (in your local.cfg) exactly matches the primary URL of
your User Interface (i.e. the URL you see in the browser). This must be an exact match: mode (http vs https),
domain, port, and subpath(s) all must match.
• If you need to trust additional client applications / URLs, those MUST be added to the
rest.cors.allowed-origins configuration. See REST API(see page 502) for details on this configuration.
• Also, check your Tomcat (or servlet container) log files. If Tomcat throws a syntax or other major error, it may
return an error response that triggers a CORS error. In this scenario, the CORS error is only a side effect of a
larger error.
If you modify either of the above settings, you will need to restart Tomcat for the changes to take effect.
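One way to see whether the backend trusts a given origin is to look at the Access-Control-Allow-Origin header it returns. The sketch below is illustrative only: cors_allows_origin is a hypothetical helper that reads response headers from standard input, and the sample headers are made up for the demonstration rather than captured from a real DSpace server.

```shell
#!/bin/sh
# Hypothetical helper: read HTTP response headers on stdin and succeed only if
# Access-Control-Allow-Origin equals the origin given as $1.
cors_allows_origin() {
    expected="$1"
    # Header names are case-insensitive; strip the carriage returns HTTP lines end with.
    allowed="$(tr -d '\r' | awk 'tolower($0) ~ /^access-control-allow-origin:/ {
        sub(/^[^:]*:[ \t]*/, ""); print; exit }')"
    [ "$allowed" = "$expected" ]
}

# Illustrative headers only (not captured from a real server).
sample_headers='HTTP/1.1 200 OK
Access-Control-Allow-Origin: https://demo7.dspace.org
Vary: Origin'

if printf '%s\n' "$sample_headers" | cors_allows_origin "https://demo7.dspace.org"; then
    echo "origin is trusted"
else
    echo "origin is NOT trusted" >&2
fi
```

In practice the headers would come from a real request, for example curl -s -D - -o /dev/null -H "Origin: [dspace.ui.url]" [dspace.server.url]/api piped into cors_allows_origin; the /api path is an assumption here.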

2.5.3 "403 Forbidden" error with a message that says "Access is denied. Invalid
CSRF Token"
First, double check that you are seeing that exact error message. A 403 Forbidden error may be thrown in a
variety of scenarios. For example, a 403 may be thrown if a page requires a login, if you have entered an invalid
username or password, or even sometimes when there is a CORS error (see previous installation issue for how to
solve that).

150 https://wiki.lyrasis.org/display/DSPACE/Support
151 https://wiki.lyrasis.org/display/DSPACE/Troubleshoot+an+error


If you are seeing the "Invalid CSRF Token" message (especially on every login), this is usually the result of a
configuration / setup issue.
Here are some things you should double check:
1. If you need to be able to login to the REST API from other domains, then your Backend must be running
HTTPS.
a. If the REST API Backend is running HTTP, then it will always send the required DSPACE-XSRF-COOKIE
cookie with the attribute SameSite=Lax. This setting means that the cookie will not be sent (by your
browser) to any other domains. Effectively, this will block all logins from any domain that is not the
same as the REST API (as this cookie will not be sent back to the REST API as required for CSRF
validation). In other words, running the REST API on HTTP is only possible if the User Interface is
running on the exact same domain. For example, running both on 'localhost' with HTTP is a common
development setup, and this will work fine.
b. In order to allow for cross-domain logins, you MUST enable HTTPS on the REST API. This will result in
the DSPACE-XSRF-COOKIE cookie being set to SameSite=None; Secure. This setting means the
cookie will be sent cross domain, but only for HTTPS requests. It also allows the user interface (or
other client applications) to be on any domain, provided that the domain is trusted by CORS (see
rest.cors.allowed-origins setting in REST API(see page 502))
2. Verify that your User Interface's "rest" section matches the value of "dspace.server.url" configuration
on the Backend. This simply ensures your UI is sending requests to the correct REST API. Also pay close
attention that both specify HTTPS when necessary (see previous bullet).
3. Verify that your "dspace.server.url" configuration on the Backend matches the primary URL of the REST
API (i.e. the URL you see in the browser). This must be an exact match: mode (http vs https), domain, port,
and subpath(s) all must match, and it must not end in a trailing slash (e.g. "https://api7.dspace.org/server" is
valid, but "https://api7.dspace.org/server/" may cause problems).
4. Verify that your "dspace.ui.url" configuration on the Backend matches the primary URL of your User
Interface (i.e. the URL you see in the browser). This must be an exact match: mode (http vs https), domain,
port, and subpath(s) all must match, and it must not end in a trailing slash (e.g. "https://demo7.dspace.org"
is valid, but "https://demo7.dspace.org/" may cause problems).
5. Verify that nothing (e.g. a proxy) is blocking Cookies and HTTP Headers from being passed between the UI
and REST API. DSpace's CSRF protection relies on the client (User Interface) being able to return both a valid
DSPACE-XSRF-COOKIE cookie and a matching X-XSRF-TOKEN header back to the REST API for
validation. See our REST Contract for more details: https://github.com/DSpace/RestContract/blob/main/csrf-tokens.md
6. If you are running a custom application, or accessing the REST API from the command-line (or other third
party tool like Postman152), you MUST ensure you are sending the CSRF token on every modifying request.
See our REST Contract for more details https://github.com/DSpace/RestContract/blob/main/csrf-tokens.md
For additional information on how DSpace's CSRF Protection works, see our REST Contract at
https://github.com/DSpace/RestContract/blob/main/csrf-tokens.md
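The rule in item 1 above can be summarized as a small decision: an HTTP backend yields SameSite=Lax, so logins only work from the same domain, while an HTTPS backend yields SameSite=None; Secure. The sketch below encodes just that rule. Both function names are hypothetical, the host comparison is a literal string match (stricter than a browser's own notion of "same site"), and it does not check CORS trust, which is still required in the HTTPS case.

```shell
#!/bin/sh
# Hypothetical helper: print the SameSite policy described above for
# DSPACE-XSRF-COOKIE, based only on the scheme of the backend URL ($1).
xsrf_cookie_policy() {
    case "$1" in
        https://*) echo "SameSite=None; Secure" ;;
        http://*)  echo "SameSite=Lax" ;;
        *)         echo "unknown scheme: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: succeed if logins from the UI URL ($2) can work against
# the backend URL ($1): either the backend uses HTTPS, or both share one host.
login_can_work() {
    backend_host="${1#*://}"; backend_host="${backend_host%%[:/]*}"
    ui_host="${2#*://}";      ui_host="${ui_host%%[:/]*}"
    case "$1" in
        https://*) return 0 ;;
    esac
    [ "$backend_host" = "$ui_host" ]
}

xsrf_cookie_policy "http://localhost:8080/server"
xsrf_cookie_policy "https://api.mydspace.edu/server"

# An HTTP backend on a different host than the UI: logins are blocked.
if login_can_work "http://api.mydspace.edu:8080/server" "http://mydspace.edu:4000"; then
    echo "login can work"
else
    echo "login will be blocked: enable HTTPS on the backend"
fi
```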

2.5.4 Using a Self-Signed SSL Certificate causes the Frontend to not be able to
access the Backend
If you setup the backend to use HTTPS with a self-signed SSL certificate, then Node.js (which the frontend runs on)
may not "trust" that certificate by default. This will result in the Frontend not being able to make requests to the
Backend.
One possible workaround (untested as of yet) is to try setting the NODE_EXTRA_CA_CERTS environment variable153
(which tells Node.js to trust additional CA certificates).

152 https://www.postman.com/
153 https://nodejs.org/api/cli.html#cli_node_extra_ca_certs_file


Another option is to avoid using a self-signed SSL certificate. Instead, create a real, issued SSL certificate using
something like Let's Encrypt154 (or similar free services)

2.5.5 My REST API is running under HTTPS, but some of its "link" URLs are
switching to HTTP?
This scenario may occur when you are running the REST API behind an HTTP proxy (e.g. Apache HTTPD's
mod_proxy_http, Nginx's proxy_pass or any other proxy that is forwarding from HTTPS to HTTP).
The fix is to ensure the DSpace REST API is sent the X-Forwarded-Proto header (by your proxying service), telling
it that the forwarded protocol is HTTPS

X-Forwarded-Proto: https

In general, when running behind a proxy, the DSpace REST API depends on accurate X-Forwarded-* headers to be
sent by that proxy.

2.5.6 Database errors occur when you run ant fresh_install

There are two common errors that occur.
• If your error looks like this:

[java] 2004-03-25 15:17:07,730 INFO org.dspace.storage.rdbms.InitializeDatabase @ Initializing Database
[java] 2004-03-25 15:17:08,816 FATAL org.dspace.storage.rdbms.InitializeDatabase @ Caught exception:
[java] org.postgresql.util.PSQLException: Connection refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
[java] at org.postgresql.jdbc1.AbstractJdbc1Connection.openConnection(AbstractJdbc1Connection.java:204)
[java] at org.postgresql.Driver.connect(Driver.java:139)

it usually means you haven't yet added the relevant configuration parameter to your PostgreSQL
configuration (see above), or perhaps you haven't restarted PostgreSQL after making the change. Also, make
sure that the db.username and db.password properties are correctly set in [dspace]/config/local.cfg (which overrides dspace.cfg). An
easy way to check that your DB is working OK over TCP/IP is to try this on the command line:

psql -U dspace -W -h localhost

Enter the dspace database password, and you should be dropped into the psql tool with a dspace=> prompt.
• Another common error looks like this:

154 https://letsencrypt.org/


[java] 2004-03-25 16:37:16,757 INFO org.dspace.storage.rdbms.InitializeDatabase @ Initializing Database
[java] 2004-03-25 16:37:17,139 WARN org.dspace.storage.rdbms.DatabaseManager @ Exception initializing DB pool
[java] java.lang.ClassNotFoundException: org.postgresql.Driver
[java] at java.net.URLClassLoader$1.run(URLClassLoader.java:198)
[java] at java.security.AccessController.doPrivileged(Native Method)
[java] at java.net.URLClassLoader.findClass(URLClassLoader.java:186)

This means that the PostgreSQL JDBC driver is not present in [dspace]/lib. See above.


3 Upgrading DSpace
 These instructions are valid for any of the following upgrade paths:
• Upgrading ANY prior version (1.x.x, 3.x, 4.x, 5.x, 6.x or 7.x) of DSpace to DSpace 7.x (latest version)
For more information about new features or major changes in previous releases of DSpace, please refer to
the following:
• Releases155 - Provides links to release notes for all prior releases of DSpace
• Version History(see page 696) - Provides detailed listing of all changes in all prior releases of DSpace

 Upgrading database structure/data is now automated!

The underlying DSpace database structure changes and data migrations are now AUTOMATED (using
FlywayDB156). This means that you no longer need to manually run SQL scripts. Instead, the first time you
run DSpace, it will auto-update your database structure (as needed) and migrate all your data to be
compatible with the installed version of DSpace. This allows you to concentrate your upgrade efforts on
customizing your site without having to worry about migrating your data!
For example, if you were running DSpace 1.4, and you wish to upgrade to DSpace 5, you can follow the
simplified instructions below. As soon as you point your DSpace 5 installation against the older DSpace
1.4-compatible database, your database tables (and data) will automatically be migrated to be compatible
with DSpace 5.
See below for a specific note on troubleshooting "ignored" migrations (a rare circumstance, but known to
happen if you upgrade from DSpace 5 to a later version of DSpace).

 Please refrain from customizing the DSpace database tables. It will complicate your next upgrade!
With the addition of our automated database upgrades, we highly recommend AGAINST customizing the
DSpace database tables/structure or backporting any features that change the DSpace tables/structure.
Doing so will often cause the automated database upgrade process to fail (and therefore will complicate
your next upgrade).
If you must add features requiring new database tables/structure, we recommend creating new tables
(instead of modifying existing ones), as that is usually much less disruptive to our automated database
upgrade.

155 https://wiki.lyrasis.org/display/DSPACE/Releases
156 https://flywaydb.org/


 Test Your Upgrade Process

In order to minimize downtime, it is always recommended to first perform a DSpace upgrade using a
Development or Test server. You should note any problems you may have encountered (and also how to
resolve them) before attempting to upgrade your Production server. It also gives you a chance to
"practice" at the upgrade. Practice makes perfect, and minimizes problems and downtime. Additionally, if
you are using a version control system, such as subversion or git, to manage your locally developed
features or modifications, then you can do all of your upgrades in your local version control system on
your Development server and commit the changes. That way your Production server can just checkout
your well tested and upgraded code.

 In the notes below [dspace] refers to the install directory for your existing DSpace installation, and
[dspace-source] to the source directory for DSpace. Whenever you see these path references, be sure to
replace them with the actual path names on your local system.

• Release Notes / Significant Changes(see page 76)

• Upgrading the Backend (Server API)(see page 77)
• Backup your DSpace Backend(see page 77)
• Update Backend Prerequisite Software(see page 78)
• Upgrading the Backend Steps(see page 78)
• Upgrading the Frontend (User Interface)(see page 85)
• Troubleshooting Upgrade Issues(see page 86)
• "Ignored" Flyway Migrations(see page 86)
• Manually updating the Metadata Registries(see page 86)

3.1 Release Notes / Significant Changes

DSpace 7.0 features some significant changes which you may wish to be aware of before beginning your upgrade:
• XMLUI and JSPUI are no longer supported or distributed with DSpace. All users should install and utilize
the new Angular User Interface. See the "Installing the Frontend (User Interface)" instructions in Installing
DSpace(see page 53)
• The old REST API ("rest" webapp from DSpace v4.x-6.x) is deprecated and will be removed in v8.x. The
new REST API(see page 502) (provided in the "server" webapp) replaces all functionality available in the older
REST API. If you have tools that rely on the old REST API, you can still (optionally) build & deploy it alongside
the "server" webapp via the "-Pdspace-rest" Maven flag. See REST API v6 (deprecated)(see page 506)
• Solr must be installed separately due to changes in the packaging of recent Solr releases. The indexes
have been reconfigured and must be rebuilt. See below.
• GeoIP location database must be installed separately due to changes in Maxmind's terms and conditions.
MaxMind has changed the terms and procedure157 for obtaining and using its GeoLite2 location database.
Consequently, DSpace no longer automatically downloads the database during installation or update, and
the DSpace-specific database update tool has been removed. If you wish to (continue to) record client
location data in SOLR Statistics(see page 338), you will need to make new arrangements. See below.
• The Submission Form configuration has changed. The "item-submission.xml" file has changed its
structure, and the "input-forms.xml" has been replaced by a "submission-forms.xml". See Submission User
Interface(see page 260)

157 https://blog.maxmind.com/2019/12/18/significant-changes-to-accessing-and-using-geolite2-databases/

• The traditional, 3-step Workflow system has been removed in favor of the Configurable Workflow
System(see page 250). For most users, you should see no effect or difference. The default setup for this
Configurable Workflow System is identical to the traditional, 3-step workflow ("Approve/Reject", "Approve/
Reject/Edit Metadata", "Edit Metadata")
• The old BTE import framework has been removed in favor of the Live Import Framework(see page 274)
(features of BTE have been ported to Live Import).
• ElasticSearch Usage Statistics have been removed. Please use SOLR Statistics(see page 338) or DSpace
Google Analytics Statistics(see page 363).
• Configuration(see page 552) has been upgraded to Apache Commons Configuration version 2. For most
users, you should see no effect or difference. No DSpace configuration files were modified during this upgrade
and no configurations or settings were renamed or changed. However, if you locally modified or customized
the [dspace]/config/config-definition.xml (DSpace's Apache Commons Configuration settings),
you will need to ensure those modifications are compatible with Apache Commons Configuration version 2.
See the Apache Commons Configuration's configuration definition file reference158 for more details.
• Handle server has been updated to v9.3. Most users will see no effect or difference. However, a minor
change to the Handle Server configuration is necessary; see below.
• Many backend prerequisites have been upgraded to avoid "end of life" versions. Therefore, pay very
close attention to the required prerequisites listed below.
• A large number of old/obsolete configurations were removed. See the "7.0 Configurations Removed" section in
the Release Notes(see page 21).

3.2 Upgrading the Backend (Server API)

3.2.1 Backup your DSpace Backend

Before you start your upgrade, it is strongly recommended that you create a backup of your DSpace content.
Backups are easy to recover from; a botched install/upgrade is very difficult if not impossible to recover from. The
DSpace-specific things to back up are: configs, source code modifications, database, and assetstore. On the server
that runs DSpace, you might additionally consider checking on your cron/scheduled tasks, servlet container, and
database.
Make a complete backup of your system, including:
• Database: Make a snapshot/dump of the database. For the PostgreSQL database use
Postgres' pg_dump159 command. For example:

pg_dump -U [database-user] -f [backup-file-location] [database-name]

• Assetstore: Back up the assetstore directory ([dspace]/assetstore by default, and any other assetstores
configured in [dspace]/config/spring/api/bitstore.xml).
• Configuration: Back up the entire directory content of [dspace]/config.
• Customizations: If you have custom code, such as themes, modifications, or custom scripts, you will want to
back them up to a safe location.
• Statistics data: what to back up depends on what you were using before: the options are the default SOLR
Statistics(see page 338), or the legacy statistics. Legacy stats utilizes the dspace.log files, while SOLR Statistics
stores data in [dspace]/solr/statistics. A simple copy of the logs or the Solr core directory tree
should give you a point of recovery, should something go wrong in the update process. We can't stress this
enough: your users depend on these statistics more than you realize. You need a backup.
• Authority data: stored in [dspace]/solr/authority. As with the statistics data, making a copy of the
directory tree should enable recovery from errors.

158 https://commons.apache.org/proper/commons-configuration/userguide/howto_combinedbuilder.html#Configuration_definition_file_reference
159 http://www.postgresql.org/docs/8.4/static/app-pgdump.html
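
The file-based items in the checklist above can be archived in one pass. The following is a minimal sketch, not an official DSpace tool: the function name, the backup location, and the choice to include [dspace]/log (for legacy statistics) are assumptions made for illustration. It does not cover additional assetstores configured in bitstore.xml or custom source code, and the database dump is still done separately with pg_dump.

```shell
#!/bin/sh
# Sketch: archive the DSpace directories named in the backup checklist.
# The function name and directory list are illustrative assumptions.

# backup_dspace_dirs DSPACE_DIR BACKUP_DIR
# Writes one tar.gz per directory that exists and skips any that are missing.
backup_dspace_dirs() (
    dspace_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || exit 1
    for sub in assetstore config solr/statistics solr/authority log; do
        if [ -d "$dspace_dir/$sub" ]; then
            # Flatten "solr/statistics" to "solr-statistics" for the file name.
            name=$(printf '%s' "$sub" | tr '/' '-')
            tar -czf "$backup_dir/$name.tar.gz" -C "$dspace_dir" "$sub" || exit 1
            echo "archived $sub"
        else
            echo "skipped $sub (not found)"
        fi
    done
)

# Example (not run automatically; paths and names are placeholders):
#   pg_dump -U dspace -f /backups/dspace-db.sql dspace
#   backup_dspace_dirs /dspace /backups/dspace-files
```

Keeping one archive per directory makes it possible to restore, for example, only the statistics core without touching the assetstore.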

3.2.2 Update Backend Prerequisite Software

DSpace 7.x requires the following versions of prerequisite software:
• Updated: Java 11 (Oracle or OpenJDK)
• Updated: Apache Maven 3.3.x or above
• Updated: Apache Ant 1.10.x or above
• Database
• Updated: PostgreSQL 11 (with pgcrypto installed), OR
• Oracle 10g or above
• Updated: Tomcat 9.x
• New: Solr 8.x. See step #11 below.
• New: (optional) IP to City Database for location-based Statistics (either MaxMind's GeoLite City Database160
or DB-IP's City Lite Database161). See step #13 below.
Refer to the Backend Requirements section of "Installing DSpace" (see page 53) for more details around configuring
and installing these prerequisites.
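
Before starting, it can help to confirm the command-line prerequisites from a script. The sketch below is illustrative only: the function names are made up for this example, it treats the versions listed above as minimums (a simplification), and it assumes a sort(1) that supports -V (as in GNU coreutils). Tomcat, PostgreSQL and Solr are not checked here.

```shell
#!/bin/sh
# Sketch: compare installed tool versions with the versions listed above.
# Assumes sort -V (version sort) is available, e.g. GNU coreutils.

# first_version: print the first dotted number found on standard input.
first_version() {
    grep -Eo '[0-9]+(\.[0-9]+)*' | head -n 1
}

# version_ge A B: succeed when version A is greater than or equal to B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME MINIMUM COMMAND...: run COMMAND, parse its version, report.
check_tool() (
    name=$1; minimum=$2; shift 2
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "FAIL $name not found (need >= $minimum)"
        exit 1
    fi
    found=$("$@" 2>&1 | first_version)
    if [ -n "$found" ] && version_ge "$found" "$minimum"; then
        echo "OK   $name $found (need >= $minimum)"
    else
        echo "FAIL $name ${found:-unknown} (need >= $minimum)"
        exit 1
    fi
)

# Example (not run automatically):
#   check_tool Java  11   java -version
#   check_tool Maven 3.3  mvn -v
#   check_tool Ant   1.10 ant -version
```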

3.2.3 Upgrading the Backend Steps

1. Ensure that your database is compatible: DSpace 6.x introduced new database requirements, which still
apply (refer to the Backend Requirements section of "Installing DSpace" (see page 53) for full details).
a. PostgreSQL databases: PostgreSQL 11 is required for DSpace 7.x (see the prerequisites above; DSpace 6.x
required 9.4 or above), and the "pgcrypto" extension must be installed.
i. Notes on installing pgcrypto
1. On most Linux operating systems (Ubuntu, Debian, RedHat), this extension is provided
in the "postgresql-contrib" package in your package manager. So, ensure you've
installed "postgresql-contrib".
2. On Windows, this extension should be provided automatically by the installer (check
your "[PostgreSQL]/share/extension" folder for files starting with "pgcrypto")
ii. Enabling pgcrypto on your DSpace database. (Additional options/notes in the Installation
Documentation(see page 53))

# Log in to your "dspace" database as a superuser
psql --username=postgres dspace

# Then, at the psql prompt, enable the pgcrypto extension on this database
CREATE EXTENSION pgcrypto;

b. Oracle databases: Oracle databases have no additional requirements at this time.

2. From your old version of DSpace, dump your authority and statistics Solr cores. (Only necessary
when upgrading from DSpace 6 or older and you want to keep your authority records and/or SOLR
Statistics(see page 338).)

160 https://dev.maxmind.com/geoip/geoip2/geolite2/
161 https://db-ip.com/db/download/ip-to-city-lite

[dspace]/bin/dspace solr-export-statistics -i authority

[dspace]/bin/dspace solr-export-statistics -i statistics

The dumps will be written to the directory [dspace]/solr-export. This may take a long time and require
quite a lot of storage. In particular, the statistics core is likely to be huge, perhaps double the size of the
content of solr/statistics/data. You should ensure that you have sufficient free storage.

This is not the same as the disaster-recovery backup that was done above. These dumps will be reloaded
into new, reconfigured cores later(see page 84).

If you are sharding your statistics data, you will need to dump each shard separately. The index names for
prior years will be statistics-YYYY (for example: statistics-2017, statistics-2018, etc.). The
current year's statistics shard is named statistics, and you should dump that one too.
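
When there are several shards, the per-core export can be looped. This is a sketch under one assumption that you should verify locally: that each core is a directory directly under [dspace]/solr, so shards can be discovered by name. The function names are made up for this example.

```shell
#!/bin/sh
# Sketch: export the authority core and every statistics shard in turn.
# Assumes each core is a directory directly under [dspace]/solr.

# list_export_cores SOLR_DIR: print "authority", "statistics" and any
# "statistics-YYYY" shard directories found under SOLR_DIR, one per line.
list_export_cores() (
    cd "$1" || exit 1
    for core in authority statistics statistics-[0-9][0-9][0-9][0-9]; do
        [ -d "$core" ] && echo "$core"
    done
    exit 0
)

# export_all_cores DSPACE_DIR: run solr-export-statistics once per core.
export_all_cores() (
    dspace_dir=$1
    list_export_cores "$dspace_dir/solr" | while read -r core; do
        echo "Exporting $core ..."
        # Redirect stdin so the command cannot consume the list of cores.
        "$dspace_dir/bin/dspace" solr-export-statistics -i "$core" </dev/null || exit 1
    done
)

# Example (not run automatically; /dspace is a placeholder):
#   export_all_cores /dspace
```

The loop stops at the first failed export, so a partially written dump is noticed before the upgrade continues.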
3. Download the latest DSpace release162 from the DSpace GitHub Repository. You can choose to either
download the zip or tar.gz file provided by GitHub, or you can use "git" to checkout the appropriate tag (e.g.
dspace-7.0) or branch.
a. Unpack it using "unzip" or "gunzip". If you have an older version of DSpace installed on this same
server, you may wish to unpack the new release to a different location than the older one. This will
ensure no files are accidentally overwritten during the unpacking process, and allow you to compare
configs side by side.
b. For ease of reference, we will refer to the location of this unzipped version of the DSpace release as
[dspace-source] in the remainder of these instructions.
4. Replace your old build.properties file with a local.cfg (ONLY REQUIRED if upgrading from DSpace 5 or
previous): As of DSpace 6.0, the build.properties configuration file has been replaced by an enhanced
local.cfg configuration file. Therefore, any old build.properties file (or similar [dspace-source]/
*.properties files) WILL BE IGNORED. Instead, you should create a new local.cfg file, based on the
provided [dspace-source]/dspace/config/local.cfg.EXAMPLE and use it to specify all of your locally
customized DSpace configurations. This new local.cfg can be used to override ANY setting in any other
configuration file (dspace.cfg or modules/*.cfg). To override a default setting, simply copy the
configuration into your local.cfg and change its value(s). For much more information on the features of
local.cfg, see the Configuration Reference(see page 552) documentation and the local.cfg Configuration File(see
page 560) section on that page.

cd [dspace-source]
cp dspace/config/local.cfg.EXAMPLE local.cfg

# Then edit the local.cfg, specifying (at a minimum) your basic DSpace configuration settings.
# Optionally, you may copy any settings from other *.cfg configuration files into your local.cfg to override them.
# After building DSpace, this local.cfg will be copied to [dspace]/config/local.cfg, where it will also be used at runtime.

5. Build DSpace Backend. Run the following commands to compile DSpace:

cd [dspace-source]
mvn -U clean package

The above command will re-compile the DSpace source code and build its "installer". You will find the result
in [dspace-source]/dspace/target/dspace-installer

162 https://github.com/DSpace/DSpace/releases

 Defaults to PostgreSQL settings

Without any extra arguments, the DSpace installation package is initialized for PostgreSQL. If you
use Oracle instead, you should build the DSpace installation package as follows:
mvn -Ddb.name=oracle -U clean package

6. Stop Tomcat (or servlet container). Take down your servlet container.

a. For Tomcat, use the $CATALINA_HOME/bin/shutdown.sh script. (Many Unix-based installations will
have a startup/shutdown script in the /etc/init.d or /etc/rc.d directories.)
7. Update DSpace Installation. Update the DSpace installation directory with the new code and libraries.
Issue the following commands:

cd [dspace-source]/dspace/target/dspace-installer
ant update

8. Update your DSpace Configurations. Depending on the version of DSpace you are upgrading from, not all
steps are required.
a. If you are upgrading from DSpace 5.x or below, perform the following steps. If you are upgrading from
DSpace 6.x, you can skip these steps and move to the next bullet.
i. Search/Browse requires Discovery: As of DSpace 6, only Discovery(see page 386) (Apache Solr)
is supported for search/browse. Support for Legacy Search (using Apache Lucene) and Legacy
Browse (using database tables) has been removed, along with all their configurations.
ii. XPDF media filtering no longer exists: XPDF media filtering, deprecated in DSpace 5, has
been removed. If you used this, you will need to reconfigure using the remaining
alternatives(see page 469) (e.g. PDF Text Extractor and/or ImageMagick PDF Thumbnail
Generator)
b. If you are upgrading from DSpace 6.x or below, perform the following steps. All administrators will
need to perform them.
i. Review your customized configurations (recommended to be in local.cfg): As mentioned
above, we recommend any local configuration changes be placed in a local.cfg Configuration
File(see page 560). With any major upgrade some configurations may have changed. Therefore,
it is recommended to review all configuration changes that exist in the config directory, and
its subdirectories, concentrating on configurations you previously customized in your
local.cfg. See also the Configuration Reference(see page 552).
ii. Remove obsolete configurations. With the removal of the JSPUI and XMLUI, a large number
of server-side (backend) configurations were made obsolete and were therefore removed
between the 6.x and 7.0 releases. A full list can be found in the Release Notes(see page 24).
iii. Migrate or recreate your Submission configuration. As of DSpace 7, the submission
configuration has changed. The format of the "item-submission.xml" file has been updated,
and the older "input-forms.xml" has been replaced by a new "submission-forms.xml". You
can choose to either start fresh with the new v7 configuration files, or you can use the steps
below to migrate your old configurations into the new format. See the Submission User
Interface(see page 260) for more information.
1. First, create a temporary folder to copy your old (v5 or v6) configurations into:

# Example of creating a [dspace]/config/temp folder for this migration

# You must replace [dspace] with the full path of your DSpace 7 installation.
cd [dspace]/config
mkdir temp

2. Copy your old (v5 or v6) "item-submission.xml" and "input-forms.xml" into that
temporary folder
3. Run the command-line migration script to migrate them to v7 configuration files

# This example uses [dspace] as a placeholder for all paths.
# Replace it with either the absolute or relative path of these files
[dspace]/bin/dspace submission-forms-migrate -s [dspace]/config/temp/item-submission.xml -f [dspace]/config/temp/input-forms.xml

4. The result will be two files. These are valid v7 configurations based on your original
submission configuration files.
a. [dspace]/config/item-submission.xml.migrated
b. [dspace]/config/submission-forms.xml.migrated
5. These "*.migrated" files have no inline comments, so you may want to edit them
further before installing them (by removing the ".migrated" suffix). Alternatively, you
may choose to copy sections of the *.migrated files into the default configurations in
the [dspace]/config/ folder, therefore retaining the inline comments in those
default files.
iv. City IP Database file for Solr Statistics has been renamed. The old [dspace]/config/
GeoLiteCity.dat file is no longer maintained by its provider. You can delete it. The new file
is named GeoLite2-City.mmdb by default. If you have configured a different name and/or
location for this file, you should check the setting of usage-statistics.dbfile in
[dspace]/config/modules/usage-statistics.cfg (and perhaps move your custom
setting to local.cfg).
v. tm-extractors media filtering (WordFilter) no longer exists: the PoiWordFilter plugin now
fulfills this function. If you still have WordFilter configured, remove from dspace.cfg and/or
local.cfg all lines referencing org.dspace.app.mediafilter.WordFilter and
uncomment all lines referencing org.dspace.app.mediafilter.PoiWordFilter.
vi. Re-configure Solr URLs: change the value of solr.server to point at your new Solr
external service. It will probably become something like
solr.server = https://${dspace.hostname}:8983/solr. Also review the values of:
1. discovery.search.server
2. oai.solr.url
3. solr.authority.server
4. solr.statistics.server
vii. Sitemaps are now automatically generated/updated: A new sitemap.cron setting exists
in the dspace.cfg which controls when Sitemaps are generated. By default they are enabled to
update once per day, for optimal SEO. See Search Engine Optimization(see page 485) docs for
more detail
1. Because of this change, if you had a system cron job which ran "./dspace
generate-sitemaps", this system cron job can be removed in favor of the new
sitemap.cron setting.
9. Upgrade your database (optional, but recommended for major upgrades). As of DSpace 5 (and above), the
DSpace code will automatically upgrade your database (from any prior version of DSpace). By default, this
database upgrade occurs automatically when you restart Tomcat (or your servlet container). However, if
you have a large repository or are upgrading across multiple versions of DSpace at once, you may wish to
manually perform the upgrade (as it could take some time, anywhere from 5-15 minutes for large sites).
a. First, you can optionally verify whether DSpace correctly detects the version of your DSpace
database. It is very important that the DSpace version is detected correctly before you attempt the
migration:

[dspace]/bin/dspace database info

# Look for a line at the bottom that says something like:
# "Your database looks to be compatible with DSpace version ___"

b. In some rare scenarios, if your database's "sequences" are outdated, inconsistent or incorrect, a
database migration error may occur (in your DSpace logs). While this is seemingly a rare occurrence,
you may choose to run the "update-sequences" command PRIOR to upgrading your database. If your
database sequences are inconsistent or incorrect, this "update-sequences" command will auto-
correct them (otherwise, it will do nothing).

# General PostgreSQL example

psql -U [database-user] -f [dspace]/etc/postgres/update-sequences.sql [database-name]

# Example for a PostgreSQL database named "dspace", and a user account named "dspace"
# psql -U dspace -f [dspace]/etc/postgres/update-sequences.sql dspace

c. Then, you can upgrade your DSpace database to the latest version of DSpace. (NOTE: check the
DSpace log, [dspace]/log/dspace.log.[date], for any output from this command)

[dspace]/bin/dspace database migrate

If you are upgrading from DSpace 6 or earlier, there are database changes which were previously
optional but now are mandatory (specifically Configurable Workflow(see page 250) database changes).
To apply these changes, run the following instead of (or after) the above command:

[dspace]/bin/dspace database migrate ignored

d. If the database upgrade process fails or throws errors, then you likely have manually customized your
database structure (and/or backported later DSpace features to an older version of DSpace). In this
scenario, you may need to do some manual migrations before the automatic migrations will succeed.
The general process would be something like this:
i. Revert back to your current DSpace database
ii. Manually upgrade just your database past the failing migration. For example, if you are
currently using DSpace 1.5 and the "V1.6" migration is failing, you may need to first manually
upgrade your database to 1.6 compatibility. This may involve either referencing the upgrade
documentation for that older version of DSpace, or running the appropriate SQL script from
under [dspace-source]/dspace-api/src/main/resources/org/dspace/storage/rdbms/sqlmigration/
iii. Then, re-run the migration process from that point forward (i.e. re-run ./dspace database
migrate)
e. More information on the "database" command can be found in Database Utilities164 documentation.

164 https://wiki.lyrasis.org/display/DSDOC5x/Database+Utilities

 By default, your site will be automatically reindexed after a database upgrade

If any database migrations are run (even during minor release upgrades), then by default DSpace
will automatically reindex all content in your site. This process is run automatically in order to
ensure that any database-level changes are also immediately updated within the search/browse
interfaces. See the notes below under "Restart Tomcat (servlet container)" for more information.
However, you may choose to skip automatic reindexing. Some sites choose to run the reindex
process manually in order to better control when/how it runs.
To disable automatic reindexing, set discovery.autoReindex = false in config/local.cfg or
config/modules/discovery.cfg.
If you have disabled automatic reindexing, make sure to manually reindex your site by running
[dspace]/bin/dspace index-discovery -b (This must be run after restarting Tomcat)
WARNING: It is not recommended to skip automatic reindexing, unless you will manually reindex at a
later time, or have verified that a reindex is not necessary. Forgetting to reindex your site after an
upgrade may result in unexpected errors or instabilities.
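
If you script your upgrades, the setting can be applied to local.cfg without editing the file by hand. The helper below is a hypothetical sketch (it is not part of DSpace): it removes any existing uncommented definition of the key and appends the new value, leaving commented lines untouched.

```shell
#!/bin/sh
# Sketch: set a single "key = value" entry in a DSpace .cfg file.
# set_cfg_property is a made-up helper name for this example.

# set_cfg_property FILE KEY VALUE: replace an existing uncommented entry
# for KEY, or append a new line if there is none.
set_cfg_property() (
    file=$1; key=$2; value=$3
    tmp=$(mktemp) || exit 1
    # Copy every line except an existing definition of KEY.
    awk -v key="$key" '
        { line = $0; sub(/^[ \t]+/, "", line) }
        index(line, key) == 1 && substr(line, length(key) + 1) ~ /^[ \t]*[=:]/ { next }
        { print }
    ' "$file" > "$tmp" || exit 1
    # Append the new definition, then write back in place.
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file" && rm -f "$tmp"
)

# Example (not run automatically; /dspace is a placeholder):
#   set_cfg_property /dspace/config/local.cfg discovery.autoReindex false
#   ... restart Tomcat, then reindex manually:
#   /dspace/bin/dspace index-discovery -b
```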

 Sites with Oracle database backends (and Configurable Workflow enabled) may need to run a
"repair" on their database.
In version 6.3, we fixed an Oracle migration issue related to Configurable (XML) Workflow(see page
250). See DS-3788165.

If you are upgrading an Oracle-based site to 6.3 from 6.0, 6.1 or 6.2 AND had Configurable Workflow
already enabled, then you will need to manually "repair" your database to align it with the latest
schema. This does not affect PostgreSQL-based backends or any sites that are upgrading from 5.x or
below.
Simply run the following to repair your Oracle database:
[dspace]/bin/dspace database repair

10. Deploy Server web application: The DSpace backend consists of a single "server" webapp (in [dspace]/
webapps/server ). You need to deploy this webapp into your Servlet Container (e.g. Tomcat). Generally,
there are two options (or techniques) which you could use: either configure Tomcat to find the DSpace
"server" webapp, or copy the "server" webapp into Tomcat's own webapps folder. For more information &
example commands, see the Installation Guide(see page 53).
a. Optionally, you may also install the deprecated DSpace 6.x REST API web application ("rest"
webapp). If you previously used the DSpace 6.x REST API(see page 506), for backwards compatibility the
old, deprecated "rest" webapp is still available to install (in [dspace]/webapps/rest). It is NOT
used by the DSpace UI/frontend. So, most users should skip this step.
11. Install new Solr cores and rebuild your indexes. (Only necessary if upgrading from 6.x or below)
a. Copy the new, empty Solr cores to your new Solr instance.

cp -R [dspace]/solr/* [solr]/server/solr/configsets
chown -R solr:solr [solr]/server/solr/configsets

165 https://jira.duraspace.org/browse/DS-3788

b. Start Solr, or restart it if it is running, so that these new cores are loaded.

[solr]/bin/solr restart

c. Load authority and statistics from the dumps(see page 78) that you made earlier (not the
disaster-recovery backup).

[dspace]/bin/dspace solr-import-statistics -i authority

[dspace]/bin/dspace solr-import-statistics -i statistics

This could take quite some time.

If you had sharded your statistics, you will need to load the dump of each shard separately. As when
dumping, the index names will be statistics-YYYY for prior years (for example, statistics-2017 and
statistics-2018), plus statistics for the current year.

d. For Statistics shards only, upgrade legacy DSpace Object Identifiers (pre-6.4 statistics) to UUID
Identifiers.

[dspace]/bin/dspace solr-upgrade-statistics-6x -i statistics

Again, if you had sharded your statistics, you will need to run this for each shard separately. See also
SOLR Statistics Maintenance(see page 356).
e. Rebuild the oai and search cores.

[dspace]/bin/dspace oai import

[dspace]/bin/dspace index-discovery -b

If you have a great deal of content, this could take a long time.
12. Update Handle Server Configuration. (Only necessary if upgrading from 6.x or below) Because we've
updated to Handle Server v9, if you are using the built-in Handle server (most installations do), you'll need to
add the following to the end of the server_config section of your [dspace]/handle-server/config.dct file
(the only new line is the "enable_txn_queue" line)

"case_sensitive" = "no"
"storage_type" = "CUSTOM"
"storage_class" = "org.dspace.handle.HandlePlugin"
"enable_txn_queue" = "no"

a. Alternatively, you could re-run the ./dspace make-handle-config script, which is in charge of
updating this config.dct file.
13. (Optional) Set up IP to City database for location-based statistics. If you wish to (continue to) record the
geographic origin of client activity, you will need to install (and regularly update) one of the following:
• Either, a copy of MaxMind's GeoLite City database166 (in MMDB format)
• NOTE: Installing MaxMind GeoLite2 is free. However, you must sign up for a (free) MaxMind
account in order to obtain a license key to use the GeoLite2 database.
• You may download GeoLite2 directly from MaxMind, or many Linux distributions provide the
geoipupdate tool directly via their package manager. You will still need to configure your
license key prior to usage.

166 https://dev.maxmind.com/geoip/geoip2/geolite2/

• Once the "GeoLite2-City.mmdb" database file is installed on your system, you will need to
configure its location as the value of usage-statistics.dbfile in your local.cfg
configuration file.
• You can discard any old GeoLiteCity.dat database(s) found in the config/ directory (if they
exist).
• See the "Managing the City Database File" section of SOLR Statistics(see page 338) for more
information about using a City Database with DSpace.
• Or, you can alternatively use/install DB-IP's City Lite database167 (in MMDB format)
• This database is also free to use, but does not require an account to download.
• Once the "dbip-city-lite.mmdb" database file is installed on your system, you will need to
configure its location as the value of usage-statistics.dbfile in your local.cfg
configuration file.
• See the "Managing the City Database File" section of SOLR Statistics(see page 338) for more
information about using a City Database with DSpace.
14. Restart Tomcat (servlet container). Now restart your servlet container (Tomcat/Jetty/Resin) and test out
the upgrade.
a. Upgrade of database: If you didn't manually upgrade your database in the previous step, then your
database will be automatically upgraded to the latest version. This may take some time (seconds to
minutes), depending on the size of your repository, etc. Check the DSpace log ([dspace]/log/
dspace.log.[date]) for information on its status.
b. Reindexing of all content for search/browse: If your database was just upgraded (either manually
or automatically), all the content in your DSpace will be automatically re-indexed for searching/
browsing. As the process can take some time (minutes to hours, depending on the size of your
repository), it is performed in the background; meanwhile, DSpace can be used as the index is
gradually filled. But, keep in mind that not all content will be visible until the indexing process is
completed. Again, check the DSpace log ( [dspace]/log/dspace.log.[date]) for information on
its status.
i. If you wish to skip automatic reindexing, please see the Note above under the "Upgrade your
Database" step.
15. Check your cron / Task Scheduler jobs. In recent versions of DSpace, some of the script names have
changed.
a. Check the Scheduled Tasks via Cron(see page 481) documentation for details. If you have been using
the dspace stats-util --optimize tool, it is no longer recommended and you should stop.
b. WINDOWS NOTE: If you are running the Handle Server on a Windows machine, a new [dspace]/
bin/start-handle-server.bat script is available to more easily startup your Handle Server.
16. Install the new User Interface (see below)

3.3 Upgrading the Frontend (User Interface)

1. Install the new User Interface per the Installing DSpace(see page 53) guide. The JSPUI and XMLUI are no
longer supported and cannot work with the DSpace 7 backend. You will need to install the new (Angular.io)
User Interface.
a. JSPUI or XMLUI based themes cannot be migrated. That said, since the new Angular UI also uses
Bootstrap, you may be able to borrow some basic CSS from your old themes. But any HTML-level
changes will need to be reimplemented in the new UI.

167 https://db-ip.com/db/download/ip-to-city-lite

3.4 Troubleshooting Upgrade Issues

3.4.1 "Ignored" Flyway Migrations

In very rare instances, a Flyway database migration will be "ignored." One known instance of this is documented in
DS-3407. If you are upgrading from DSpace 5.x to a later version of DSpace, the migration put in place to address
DS-2818 will be "ignored" because it is not necessary. There is a special command you can run which will un-flag
this migration as "ignored."

dspace database migrate ignored

 warning: dangerous command: BACK UP YOUR DATABASE!

The dspace database migrate ignored command can be dangerous: it will attempt to re-run all
ignored migrations. In the case outlined above, this is safe. However, under other circumstances, re-
running ignored migrations can lead to unpredictable results. To be absolutely safe, be sure you have a
current backup of your database.
The presence of ignored migrations can indicate a problem in the database. It's best not to use this
command unless instructed to.
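
One way to make the "back up first" rule hard to skip is to wrap both steps in a single function that refuses to migrate unless the dump succeeds. This is only a sketch: the function name and its arguments are invented for this example, and it assumes a PostgreSQL backend with pg_dump on the PATH.

```shell
#!/bin/sh
# Sketch: re-run ignored migrations only after a fresh database dump.
# The function name and all arguments are placeholders for illustration.

# migrate_ignored_with_backup DSPACE_DIR DB_USER DB_NAME BACKUP_FILE
migrate_ignored_with_backup() (
    dspace_dir=$1; db_user=$2; db_name=$3; backup_file=$4
    # Take the dump first; stop here if it fails or produces an empty file.
    pg_dump -U "$db_user" -f "$backup_file" "$db_name" || exit 1
    [ -s "$backup_file" ] || { echo "backup is empty, aborting" >&2; exit 1; }
    # Only now re-run the ignored migrations.
    "$dspace_dir/bin/dspace" database migrate ignored
)

# Example (not run automatically; all values are placeholders):
#   migrate_ignored_with_backup /dspace dspace dspace /backups/pre-ignored.sql
```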

3.4.2 Manually updating the Metadata Registries

The database migration (./dspace database migrate) should automatically trigger your metadata/file registries to
be updated (based on the config files in [dspace]/config/registries/). However, if this update was NOT
triggered, you can also manually run these registry updates (they will not harm existing registry contents) as
follows:

[dspace]/bin/dspace registry-loader -metadata [dspace]/config/registries/dcterms-types.xml

[dspace]/bin/dspace registry-loader -metadata [dspace]/config/registries/dublin-core-types.xml
[dspace]/bin/dspace registry-loader -metadata [dspace]/config/registries/eperson-types.xml
[dspace]/bin/dspace registry-loader -metadata [dspace]/config/registries/local-types.xml
[dspace]/bin/dspace registry-loader -metadata [dspace]/config/registries/sword-metadata.xml
[dspace]/bin/dspace registry-loader -metadata [dspace]/config/registries/workflow-types.xml
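
The same six commands can be run as a loop that stops at the first failure. This is a sketch with a made-up function name; the registry file names are the ones listed above, and a missing file is treated as an error rather than skipped.

```shell
#!/bin/sh
# Sketch: load each metadata registry file listed above, stopping on error.
# load_metadata_registries is a made-up name for this example.

# load_metadata_registries DSPACE_DIR
load_metadata_registries() (
    dspace_dir=$1
    for registry in dcterms-types dublin-core-types eperson-types \
                    local-types sword-metadata workflow-types; do
        file="$dspace_dir/config/registries/$registry.xml"
        if [ ! -f "$file" ]; then
            echo "missing: $file" >&2
            exit 1
        fi
        "$dspace_dir/bin/dspace" registry-loader -metadata "$file" || exit 1
    done
)

# Example (not run automatically; /dspace is a placeholder):
#   load_metadata_registries /dspace
```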

4 Using DSpace
This page offers access to all aspects of the documentation relevant to using DSpace after it has been properly
installed or upgraded. These pages assume that DSpace is functioning properly. Please refer to the section on
System Administration(see page 410) if you are looking for information on diagnosing DSpace issues and measures
you can take to restore your DSpace to a state in which it functions properly.

4.1 Authentication and Authorization

• Authentication Plugins (see page 87)
• Embargo (see page 113)
• Managing User Accounts (see page 121)
• Request a Copy (see page 126)

4.1.1 Authentication Plugins

• Stackable Authentication Method(s) (see page 87)
• Authentication by Password (see page 89)
    • Enabling Authentication by Password (see page 89)
    • Configuring Authentication by Password (see page 89)
• Shibboleth Authentication (see page 90)
    • Enabling Shibboleth Authentication (see page 90)
    • Configuring Shibboleth Authentication (see page 91)
        • Apache "mod_shib" Configuration (required) (see page 91)
        • Sample shibboleth2.xml Configuration (see page 94)
        • Sample attribute-map.xml Configuration (for samltest.id) (see page 95)
        • DSpace Shibboleth Configuration Options (see page 96)
• LDAP Authentication (see page 101)
    • Introduction to LDAP specific terminology (see page 101)
    • Enabling LDAP Authentication (see page 102)
    • Configuring LDAP Authentication (see page 102)
    • Debugging LDAP connection and configuration (see page 108)
    • Enabling Hierarchical LDAP Authentication (see page 109)
    • Configuring Hierarchical LDAP Authentication (see page 110)
• IP Authentication (see page 111)
    • Enabling IP Authentication (see page 111)
    • Configuring IP Authentication (see page 111)
• X.509 Certificate Authentication (see page 112)
    • Enabling X.509 Certificate Authentication (see page 112)
    • Configuring X.509 Certificate Authentication (see page 112)
• Example of a Custom Authentication Method (see page 113)

4.1.1.1 Stackable Authentication Method(s)

Since many institutions and organizations have existing authentication systems, DSpace has been designed to
integrate easily with an existing authentication infrastructure. It keeps a series, or "stack", of
authentication methods, so each one can be tried in turn. This makes it easy to add new authentication methods or
rearrange the order without changing any existing code. You can also share authentication code with other sites.


Configuration File: [dspace]/config/modules/authentication.cfg

Property: plugin.sequence.org.dspace.authenticate.AuthenticationMethod

Example Value:
plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.PasswordAuthentication

The configuration property plugin.sequence.org.dspace.authenticate.AuthenticationMethod defines
the authentication stack. It is a comma-separated list of class names. Each of these classes implements a different
authentication method, or way of determining the identity of the user. They are invoked in the order specified
until one succeeds.
Existing Authentication Methods include:
• Authentication by Password (see page 89) (class: org.dspace.authenticate.PasswordAuthentication) (DEFAULT)
• Shibboleth Authentication (see page 90) (class: org.dspace.authenticate.ShibAuthentication)
• LDAP Authentication (see page 101) (class: org.dspace.authenticate.LDAPAuthentication)
• IP Address based Authentication (see page 111) (class: org.dspace.authenticate.IPAuthentication)
• X.509 Certificate Authentication (see page 112) (class: org.dspace.authenticate.X509Authentication)
An authentication method is a class that implements the interface
org.dspace.authenticate.AuthenticationMethod. It authenticates a user by evaluating the credentials
(e.g. username and password) he or she presents and checking that they are valid.
The basic authentication procedure in the DSpace Web UI is this:
1. A request is received from an end-user's browser that, if fulfilled, would lead to an action requiring
authorization taking place.
2. If the end-user is already authenticated:
• If the end-user is allowed to perform the action, the action proceeds.
• If the end-user is NOT allowed to perform the action, an authorization error is displayed.
3. If the end-user is NOT authenticated, i.e. is accessing DSpace anonymously:
   1. The parameters etc. of the request are stored.
   2. The Web UI's startAuthentication method is invoked.
   3. First it tries all the authentication methods which do implicit authentication (i.e. they work with just the
   information already in the Web request, such as an X.509 client certificate). If one of these succeeds, it
   proceeds from Step 2 above.
   4. If none of the implicit methods succeed, the UI responds by putting up a "login" page to collect credentials
   for one of the explicit authentication methods in the stack. The servlet processing that page then gives
   the proffered credentials to each authentication method in turn until one succeeds, at which point it retries
   the original operation from Step 2 above.
Please see the source files AuthenticationManager.java and AuthenticationMethod.java for more
details about this mechanism.
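
The procedure above can be sketched in a few lines of Python. This is only an illustration of the "stack" idea, not DSpace's implementation (which is written in Java): the AuthMethod class, the run_stack and login functions, and the request keys are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence


@dataclass
class AuthMethod:
    """Hypothetical stand-in for one entry in the authentication stack."""
    name: str
    implicit: bool  # True if the method works from the request alone (e.g. a client certificate)
    authenticate: Callable[[Dict[str, str]], Optional[str]]  # returns a user id, or None on failure


def run_stack(methods: Sequence[AuthMethod], request: Dict[str, str],
              implicit_only: bool) -> Optional[str]:
    """Try each method in configured order and stop at the first success."""
    for method in methods:
        if implicit_only and not method.implicit:
            continue
        user = method.authenticate(request)
        if user is not None:
            return user
    return None


def login(methods: Sequence[AuthMethod], request: Dict[str, str],
          credentials: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Implicit methods first; if none succeeds, offer the credentials to every method in turn."""
    user = run_stack(methods, request, implicit_only=True)
    if user is not None or credentials is None:
        return user
    return run_stack(methods, {**request, **credentials}, implicit_only=False)


# Example stack: a certificate-style implicit method followed by a password-style explicit one.
x509 = AuthMethod("x509", True, lambda r: r.get("client_cert_subject"))
password = AuthMethod(
    "password", False,
    lambda r: r.get("email") if r.get("password") == "secret" else None,
)
stack = [x509, password]
```

Reordering the list changes which method gets the first chance to succeed, which mirrors how reordering the comma-separated class list changes the stack.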


Authentication by Password

Enabling Authentication by Password

By default, this authentication method is enabled in DSpace.
To confirm that Authentication by Password is enabled, ensure the
org.dspace.authenticate.PasswordAuthentication class is listed as one of the AuthenticationMethods in
the following configuration:

Configuration File: [dspace]/config/modules/authentication.cfg

Property: plugin.sequence.org.dspace.authenticate.AuthenticationMethod

Example Value:
plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.PasswordAuthentication

Configuring Authentication by Password

The default method org.dspace.authenticate.PasswordAuthentication has the following properties:
• Use of inbuilt e-mail address/password-based log-in. This is achieved by forwarding a request that is
attempting an action requiring authorization to the password log-in servlet, /password-login. The
password log-in servlet (org.dspace.app.webui.servlet.PasswordServlet) contains code that will
resume the original request if authentication is successful, as per step 3 described above.
• Users can register themselves (i.e. add themselves as e-people without needing approval from the
administrators), and can set their own passwords when they do this.
• Users are not members of any special (dynamic) e-person groups.
• You can restrict the domains from which new users are able to register. To enable this feature, uncomment
the following line in [dspace]/config/modules/authentication-password.cfg:
authentication-password.domain.valid = example.com
Example options might be '@example.com' to restrict registration to users with addresses ending in
@example.com, or '@example.com, .ac.uk' to restrict registration to users with addresses ending in
@example.com or with addresses in the .ac.uk domain.
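
The domain restriction described above amounts to a suffix match on the e-mail address. The following Python sketch illustrates that reading; the function names are hypothetical, and treating an empty list as "no restriction" and ignoring case are assumptions of this sketch rather than documented DSpace behavior.

```python
from typing import List


def parse_valid_domains(value: str) -> List[str]:
    """Split a comma-separated setting such as '@example.com, .ac.uk' into suffixes."""
    return [part.strip().lower() for part in value.split(",") if part.strip()]


def may_self_register(email: str, valid_domains: List[str]) -> bool:
    """Return True if the address ends in one of the configured suffixes."""
    if not valid_domains:
        return True  # assumption: an empty setting means registration is unrestricted
    address = email.strip().lower()
    return any(address.endswith(suffix) for suffix in valid_domains)


domains = parse_valid_domains("@example.com, .ac.uk")
```

With this reading, '@example.com' matches only addresses at exactly that domain, while '.ac.uk' matches any address whose domain ends in .ac.uk.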
A full list of all available Password Authentication Configurations:

Configuration File: [dspace]/config/modules/authentication-password.cfg

Property: authentication-password.domain.valid

Example Value: authentication-password.domain.valid = @mit.edu, .ac.uk


Informational Note: This option allows you to limit self-registration to email addresses ending
in a particular domain value. The above example would limit self-
registration to individuals with "@mit.edu" email addresses and all
".ac.uk" email addresses.

Property: authentication-password.login.specialgroup

Example Value: authentication-password.login.specialgroup = My DSpace Group

Informational Note: This option allows you to automatically add all password authenticated
user sessions to a specific DSpace Group (the group must exist in DSpace)
for the remainder of their logged in session.

Property: authentication-password.digestAlgorithm

Example Value: authentication-password.digestAlgorithm = SHA-512

Informational Note: This option specifies the hashing algorithm to be used in converting plain-
text passwords to more secure password digests. The example value is
the default. You may select any digest algorithm available through
java.security.MessageDigest on your system. At least MD2, MD5, SHA-1,
SHA-256, SHA-384, and SHA-512 should be available, but you may have
installed others. Most sites will not need to adjust this.
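
To illustrate what selecting a digest algorithm means, here is a minimal Python sketch that hashes a salted password with a configurable algorithm. It is not DSpace's code: the digest_password function, the way the salt is combined with the password, and the single hashing round are assumptions made for the example. Java names such as "SHA-512" correspond to hashlib names such as "sha512".

```python
import hashlib


def digest_password(password: str, salt: bytes, algorithm: str = "SHA-512") -> str:
    """Hash salt + password with the named algorithm and return a hex digest.

    Assumption: one round over salt followed by the UTF-8 password bytes.
    """
    hashlib_name = algorithm.replace("-", "").lower()  # "SHA-512" -> "sha512"
    hasher = hashlib.new(hashlib_name)
    hasher.update(salt)
    hasher.update(password.encode("utf-8"))
    return hasher.hexdigest()
```

Changing the algorithm changes the length and value of the stored digest, so digests produced under one algorithm cannot be compared with digests produced under another.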

Shibboleth Authentication

Enabling Shibboleth Authentication

To enable Shibboleth Authentication, you must ensure the org.dspace.authenticate.ShibAuthentication
class is listed as one of the AuthenticationMethods in the following configuration:

Configuration File: [dspace]/config/modules/authentication.cfg

Property: plugin.sequence.org.dspace.authenticate.AuthenticationMethod

Example Value:
plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.ShibAuthentication

(NOTE: This setting may be repeated to support multiple AuthenticationMethods)

Configuring Shibboleth Authentication

Shibboleth is a distributed authentication system for securely authenticating users and passing attributes about the
user from one or more identity providers. In the Shibboleth terminology DSpace is a Service Provider which receives
authentication information and then based upon that provides a service to the user. To use Shibboleth, DSpace
requires that you use Apache installed with the mod_shib module acting as a proxy for all HTTP requests for your
servlet container (typically Tomcat). DSpace will receive authentication information from the mod_shib module
through HTTP headers.
Before DSpace will work with Shibboleth, you must have the following:
1. An Apache web server with the "mod_shib" module installed. As mentioned, this mod_shib module acts as a
proxy for all HTTP requests for your servlet container (typically Tomcat). Any requests to DSpace that
require authentication via Shibboleth should be redirected to 'shibd' (the shibboleth daemon) by this
"mod_shib" module. Details on installing/configuring mod_shib in Apache are available at:
https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPApacheConfig
We also have a sample Apache + mod_shib configuration provided below.
2. An external Shibboleth IdP (Identity Provider). Using mod_shib, DSpace will only act as a Shibboleth SP
(Service Provider). The actual Shibboleth Authentication & Identity information must be provided by an
external IdP. If you are using Shibboleth at your institution already, then there already should be a
Shibboleth IdP available. More information about Shibboleth IdPs versus SPs is available at:
https://wiki.shibboleth.net/confluence/display/SHIB2/UnderstandingShibboleth
For more information on installing and configuring a Shibboleth Service Provider see:
https://wiki.shibboleth.net/confluence/display/SHIB2/Installation
Note about Shibboleth Active vs Lazy Sessions:
When configuring your Shibboleth Service Provider there are two Shibboleth paradigms you may use: Active or Lazy
Sessions. With Active Sessions, the mod_shib module is configured to protect an entire URL space. No one will
be able to access that URL space without first authenticating with Shibboleth. Using this method you will need to
configure Shibboleth to protect the URL: "/shibboleth-login". The alternative, Lazy Sessions, does not protect any
specific URL. Instead, Apache will allow access to any URL, and the application may initiate an
authenticated session when it wants to.
The Lazy Session method is preferable for most DSpace installations, as you usually want to provide public access
to (most) DSpace content, while restricting access to only particular areas (e.g. administration UI/tools, private
Items, etc.). When Active Sessions are enabled your entire DSpace site will be access restricted. In other words,
when using Active Sessions, Shibboleth will require everyone to first authenticate before they can access any part of
your repository (which essentially results in a "dark archive", as anonymous access will not be allowed).
Apache "mod_shib" Configuration (required)
As mentioned above, you must have Apache with the "mod_shib" module installed in order for DSpace to be able to
act as a Shibboleth Service Provider (SP). The mod_shib module acts as a proxy for all HTTP requests for your
servlet container (typically Tomcat). Any requests to DSpace that require authentication via Shibboleth should be
redirected to 'shibd' (the shibboleth daemon) by this "mod_shib" module. Details on installing/configuring
mod_shib in Apache are available at: https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPApacheConfig


General information about installing/configuring Shibboleth Service Providers (SPs) can be found at:
https://wiki.shibboleth.net/confluence/display/SHIB2/Installation
A few extra notes/hints when configuring mod_shib & Apache:
• In Debian based environments, "mod_shib" tends to be in a package named something like
"libapache2-mod-shib2"
• The Shibboleth setting "ShibUseHeaders" is no longer required to be set to "On", as DSpace will correctly
utilize attributes instead of headers.
• When "ShibUseHeaders" is set to "Off" (which is recommended in the mod_shib documentation [168]),
proper configuration of Apache to pass attributes to Tomcat (via either mod_jk or mod_proxy) can be
a bit tricky. SWITCH has some great documentation [169] on exactly what you need to do. We will
eventually paraphrase/summarize this documentation here, but for now, the SWITCH page will have
to do.
• When initially setting up Apache & mod_shib, https://samltest.id/ provides a great testing ground for your
configurations. This site provides a sample/demo Shibboleth IdP (as well as a sample Shibboleth SP) which
you can test against. It acts as a "sandbox" to get your configurations working properly, before you point
DSpace at your production Shibboleth IdP.
• You also may wish to review the Shibboleth setup in our "dspace-shibboleth" Docker setup [170], which the
development team uses for testing (and it uses https://samltest.id as the IdP). It may provide you with good
examples/hints on getting everything set up. However, keep in mind this code has not been tested in
Production scenarios.
Below, we have provided a sample Apache configuration. However, as every institution has its own specific
Apache setup/configuration, it is highly likely that you will need to tweak this configuration in order to get it
working properly. Again, see the official mod_shib documentation for much more detail about each of these
settings: https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPApacheConfig
These configurations are meant to be added to an Apache <VirtualHost> which acts as a proxy to your Tomcat
(or other servlet container) running DSpace. More information on Apache VirtualHost settings can be found at:
https://httpd.apache.org/docs/2.2/vhosts/

168 https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPApacheConfig#NativeSPApacheConfig-AuthConfigOptions
169 https://www.switch.ch/de/aai/support/serviceproviders/sp-access-rules.html#javaapplications
170 https://github.com/DSpace/DSpace/tree/main/dspace/src/main/docker/dspace-shibboleth


#### SAMPLE MOD_SHIB CONFIGURATION FOR APACHE2 (it may require local modifications based on your Apache setup) ####
# While this sample VirtualHost is for HTTPS requests (recommended for Shibboleth, obviously),
# you may also need/want to create one for HTTP (*:80)
<VirtualHost *:443>
    ...
    # PLEASE NOTE: We have omitted many Apache settings (ServerName, LogLevel, SSLCertificateFile, etc)
    # which you may need/want to add to your VirtualHost

    # As long as Shibboleth module is installed, enable all Shibboleth/mod_shib related settings
    <IfModule mod_shib>
        # Shibboleth recommends turning on UseCanonicalName
        # See "Prepping Apache" in https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPApacheConfig
        UseCanonicalName On

        # Most DSpace instances will want to use Shibboleth "Lazy Session", which ensures that users
        # can access DSpace without first authenticating via Shibboleth.
        # This section turns on Shibboleth "Lazy Session". Also ensures that once they have authenticated
        # (by accessing /Shibboleth.sso/Login path), then their Shib session is kept alive
        <Location />
            AuthType shibboleth
            ShibRequireSession Off
            require shibboleth
            # If your "shibboleth2.xml" file specifies an <ApplicationOverride> setting for your
            # DSpace Service Provider, then you may need to tell Apache which "id" to redirect Shib requests to.
            # Just uncomment this and change the value "my-dspace-id" to the associated @id attribute value.
            #ShibRequestSetting applicationId my-dspace-id
        </Location>

        # If a user attempts to access the DSpace shibboleth endpoint, force them to authenticate via Shib.
        <Location "/server/api/authn/shibboleth">
            Order deny,allow
            Allow from all
            AuthType shibboleth
            ShibRequireSession On
            # Please note that setting ShibUseHeaders to "On" is a potential security risk.
            # You may wish to set it to "Off". See the mod_shib docs for details about this setting:
            # https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPApacheConfig#NativeSPApacheConfig-AuthConfigOptions
            # Here's a good guide to configuring Apache + Tomcat when this setting is "Off":
            # https://www.switch.ch/de/aai/support/serviceproviders/sp-access-rules.html#javaapplications
            ShibUseHeaders On
            Require shibboleth
        </Location>

        # If a user attempts to access the DSpace login endpoint, ensure Shibboleth is supported but other
        # auth methods can be too.
        <Location "/server/api/authn/login">
            Order deny,allow
            Allow from all
            AuthType shibboleth
            # For DSpace, this is required to be off otherwise the available auth methods will be not visible
            ShibRequireSession Off
            # Please note that setting ShibUseHeaders to "On" is a potential security risk.
            # You may wish to set it to "Off". See the mod_shib docs for details about this setting:
            # https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPApacheConfig#NativeSPApacheConfig-AuthConfigOptions
            # Here's a good guide to configuring Apache + Tomcat when this setting is "Off":
            # https://www.switch.ch/de/aai/support/serviceproviders/sp-access-rules.html#javaapplications
            ShibUseHeaders On
        </Location>

        # Ensure /Shibboleth.sso path (in Apache) can be accessed
        # By default it may be inaccessible if your Apache security is tight.
        <Location "/Shibboleth.sso">
            Order deny,allow
            Allow from all
            # Also ensure Shibboleth/mod_shib responds to this path
            SetHandler shib
        </Location>

        # Finally, you may need to ensure requests to /Shibboleth.sso are NOT redirected
        # to Tomcat (as they need to be handled by mod_shib instead).
        # NOTE: THIS SETTING IS LIKELY ONLY NEEDED IF YOU ARE USING mod_proxy TO REDIRECT
        # ALL REQUESTS TO TOMCAT (e.g. ProxyPass /server ajp://localhost:8009/server)
        ProxyPass /Shibboleth.sso !
    </IfModule>

    ...

    # You will likely need Proxy settings to ensure Apache is proxying requests to Tomcat for the DSpace REST API
    # The below is just an example of proxying for REST API only. It requires installing & enabling
    # "mod_proxy" and "mod_proxy_ajp"
    ## Proxy / Forwarding Settings ##
    <Proxy *>
        AddDefaultCharset Off
        Order allow,deny
        Allow from all
    </Proxy>

    # Proxy all requests to /server to Tomcat via AJP
    ProxyPass /server ajp://localhost:8009/server
    ProxyPassReverse /server ajp://localhost:8009/server

    # Optionally, also proxy Angular UI (if on same server). This requires "mod_proxy_http"
    #ProxyPass / http://localhost:4000/
    #ProxyPassReverse / http://localhost:4000/
</VirtualHost>

Sample shibboleth2.xml Configuration

In addition, here's a sample "ApplicationOverride" configuration for "shibboleth2.xml". This particular
"ApplicationOverride" is configured to use the Test IdP provided by https://samltest.id/ and is just meant as an
example. In order to enable it for testing purposes, you must specify ShibRequestSetting applicationId
samltest in your Apache mod_shib configuration (see above). An additional, more detailed example is provided in
our "dspace-shibboleth" Docker configurations at
https://github.com/DSpace/DSpace/blob/main/dspace/src/main/docker/dspace-shibboleth/shibboleth2.xml

<ApplicationOverride id="samltest" entityID="http://[mydspace.edu]/shibboleth" REMOTE_USER="eppn persistent-id targeted-id">
    <Sessions lifetime="28800" timeout="3600" checkAddress="false" relayState="ss:mem"
              handlerSSL="true" cookieProps="; path=/; SameSite=None; secure; HttpOnly">
        <SSO entityID="https://samltest.id/saml/idp">
            SAML2 SAML1
        </SSO>
    </Sessions>

    <!-- See also: https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPMetadataProvider -->
    <MetadataProvider type="XML" uri="https://samltest.id/saml/idp"
                      backingFilePath="samltest-metadata.xml" reloadInterval="180000"/>
</ApplicationOverride>

Sample attribute-map.xml Configuration (for samltest.id)

In order to use the above example for https://samltest.id/, you may also need to modify your attribute-map.xml to
support their attributes. Again, a more complete example is in our "dspace-shibboleth" Docker configurations at
https://github.com/DSpace/DSpace/blob/main/dspace/src/main/docker/dspace-shibboleth/attribute-map.xml


<Attributes xmlns="urn:mace:shibboleth:2.0:attribute-map" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <Attribute name="urn:oid:0.9.2342.19200300.100.1.1" id="uid"/>
    <Attribute name="urn:oid:0.9.2342.19200300.100.1.3" id="mail"/>
    <Attribute name="urn:oid:2.5.4.4" id="sn"/>
    <Attribute name="urn:oid:2.16.840.1.113730.3.1.241" id="displayName"/>
    <Attribute name="urn:oid:2.5.4.20" id="telephoneNumber"/>
    <Attribute name="urn:oid:2.5.4.42" id="givenName"/>
    <Attribute name="https://samltest.id/attributes/role" id="role"/>

    ...

</Attributes>

DSpace Shibboleth Configuration Options

Authentication Methods:
DSpace supports authentication using a NetID or an email address. A user's NetID is a unique identifier from the IdP
that identifies a particular user. The NetID can take almost any form, such as a unique integer or string, or, with
Shibboleth 2.0, a "targeted id". You will need to coordinate with your Shibboleth federation or identity provider.
There are three ways to supply identity information to DSpace:
1) NetID from Shibboleth Header (best)
The NetID-based method is superior because users may change their email address with the identity provider.
When this happens, DSpace cannot associate their new address with their old account if it identifies users by email.

2) Email address from Shibboleth Header (okay)
If a NetID header is not available or not found, DSpace will fall back to identifying a user based upon
their email address.

3) Tomcat's Remote User (worst)

If neither Shibboleth header is found, DSpace will, as a last resort, look at Tomcat's remote
user field. This is the least attractive option because Tomcat has no way to supply additional attributes about a
user. Because of this, the autoregister option is not supported if this method is used.
Identity Scheme Migration Strategies:
If you are currently using email-based authentication (either 2 or 3) and want to upgrade to NetID-based
authentication, there is an easy path. Simply enable Shibboleth to pass the NetID attribute and set the
netid-header below to the correct value. When a user attempts to log in, DSpace will first look for an EPerson
with the passed NetID. When this fails, DSpace will fall back to email-based authentication and will then
update the user's EPerson account record to set their NetID, so all future authentications for this user will be
based upon NetID. Note that DSpace will prevent an account from switching NetIDs: if an account
already has a NetID set and the user tries to authenticate with a different NetID, the authentication will fail.
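
The migration path described above can be sketched as follows. This is an illustration in Python under stated assumptions, not DSpace's Java implementation: the EPerson dataclass here is a simplified stand-in, and find_eperson is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EPerson:
    """Simplified stand-in for a DSpace account (hypothetical fields)."""
    email: str
    netid: Optional[str] = None


def find_eperson(people: List[EPerson], netid: Optional[str],
                 email: Optional[str]) -> Optional[EPerson]:
    """Look up by NetID first, then fall back to email and record the NetID."""
    if netid:
        for person in people:
            if person.netid == netid:
                return person
    if email:
        for person in people:
            if person.email.lower() == email.lower():
                if person.netid and netid and person.netid != netid:
                    return None  # an account may not switch NetIDs
                if netid and person.netid is None:
                    person.netid = netid  # future logins will match on NetID
                return person
    return None


accounts = [EPerson(email="a@example.edu")]
```

After the first NetID-bearing login, the account is found by NetID even if the user's email address later changes at the identity provider.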
EPerson Metadata:
One of the primary benefits of using Shibboleth-based authentication is receiving additional attributes about users,
such as their names, telephone numbers, and possibly their academic department or graduation semester if
desired. DSpace treats the first and last name attributes differently because they (along with email address) are the
three pieces of minimal information required to create a new user account. For both first and last name, supply
direct mappings to the Shibboleth headers. In addition to the first and last name, DSpace supports other metadata
fields such as phone, or really anything you want to store on an eperson object. Beyond the phone field, which is
accessible in the user's profile screen, none of these additional metadata fields will be used by DSpace out of the
box. However, if you develop any local modification you may access these attributes from the EPerson object. The
Vireo ETD workflow system utilizes this to aid students when submitting an ETD.
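
Additional attributes are mapped with the authentication-shibboleth.eperson.metadata property, whose value is a comma-separated list of "header => field" pairs. The Python sketch below shows one way such a value could be parsed and applied to incoming headers; both function names are hypothetical and the parsing rules are an assumption based on the example value in the configuration table.

```python
from typing import Dict


def parse_metadata_mapping(value: str) -> Dict[str, str]:
    """Parse 'SHIB-telephone => phone, SHIB-cn => cn' into {header: eperson_field}."""
    mapping: Dict[str, str] = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue
        header, arrow, field = pair.partition("=>")
        if not arrow:
            raise ValueError(f"expected 'header => field', got {pair!r}")
        mapping[header.strip()] = field.strip()
    return mapping


def eperson_metadata_from_headers(headers: Dict[str, str],
                                  mapping: Dict[str, str]) -> Dict[str, str]:
    """Copy each mapped header that is present into its eperson metadata field."""
    return {field: headers[header] for header, field in mapping.items() if header in headers}


mapping = parse_metadata_mapping("SHIB-telephone => phone, SHIB-cn => cn")
```

Headers that are not listed in the mapping are ignored, so only the attributes you explicitly map end up on the account.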
Role-based Groups:
DSpace is able to place users into pre-defined groups based upon values received from Shibboleth. Using this
option you can place all faculty members into a DSpace group when the correct affiliation attribute is provided.
Groups assigned this way are considered 'special groups': they are real groups, but the user's membership
within them is not recorded in the database. Each time a user authenticates they are automatically placed
within the pre-defined DSpace group, so if the user loses their affiliation then the next time they log in they will no
longer be in the group.

Depending upon the Shibboleth attribute used in the role-header, it may be scoped. Scoped is Shibboleth
terminology for identifying where an attribute originated from. For example, a student's affiliation may be encoded
as "[email protected]". The part after the @ sign is the scope, and the preceding part is the value. You may use
the whole value or only the value or scope. Using this you could generate a role for students at one institution
different from students at another institution. Or, if you turn on ignore-scope, you could ignore the institution and
place all students into one group.
The values extracted (a user may have multiple roles) will be used to look up which groups to place the user into.
The groups are defined as "authentication-shibboleth.role.<role-name>", which is a comma-separated list of
DSpace groups.
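
The role lookup described above can be illustrated with a short Python sketch. The names role_key and groups_for, the sample address student@example.edu, the use of ";" to separate multiple values, and the case-insensitive lookup are assumptions of this sketch; the role map stands in for the authentication-shibboleth.role.<role-name> properties.

```python
from typing import Dict, List


def role_key(raw: str, ignore_scope: bool, ignore_value: bool) -> str:
    """Reduce a possibly scoped value such as 'student@example.edu' to a lookup key."""
    raw = raw.strip()
    value, at_sign, scope = raw.partition("@")
    if not at_sign:
        return raw  # unscoped attribute: use it as-is
    if ignore_scope:
        return value  # drop everything after the @ sign
    if ignore_value:
        return scope  # drop everything before the @ sign
    return raw


def groups_for(header: str, role_map: Dict[str, str],
               ignore_scope: bool = True, ignore_value: bool = False) -> List[str]:
    """Collect the DSpace groups for every role found in the header, without duplicates."""
    groups: List[str] = []
    for raw in header.split(";"):
        if not raw.strip():
            continue
        key = role_key(raw, ignore_scope, ignore_value).lower()
        for group in role_map.get(key, "").split(","):
            group = group.strip()
            if group and group not in groups:
                groups.append(group)
    return groups


ROLE_MAP = {
    "faculty": "Faculty, Member",
    "staff": "Staff, Member",
    "student": "Students, Member",
}
```

Because membership is computed from the header on each login, a role that disappears from the header simply stops producing its groups.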

 Having issues getting Safari working?

In addition to the below settings, you may need to ensure your Shibboleth IdP is trusted by the DSpace
backend by adding it to your rest.cors.allowed-origins configuration. This is required for Safari web
browsers to work with DSpace's Shibboleth plugin.
For example, if your IdP is https://samltest.id/, then you need to append that URL to the comma-separated
list of "allowed-origins" like:
rest.cors.allowed-origins = ${dspace.ui.url}, https://samltest.id
More information on this configuration can be found in the REST API (see page 502) documentation.

Configuration File: [dspace]/config/modules/authentication-shibboleth.cfg

Property: authentication-shibboleth.lazysession

Example Value: authentication-shibboleth.lazysession = true

Informational Note: Whether to use lazy sessions or active sessions. For most DSpace
instances, you will likely want to use lazy sessions. Active sessions will
force every user to authenticate via Shibboleth before they can access
your DSpace (essentially resulting in a "dark archive").


Property: authentication-shibboleth.lazysession.loginurl

Example Value: authentication-shibboleth.lazysession.loginurl = /Shibboleth.sso/Login

Informational Note: The URL to start a Shibboleth session (only for lazy sessions). Generally this
setting will be "/Shibboleth.sso/Login"

Property: authentication-shibboleth.lazysession.secure

Example Value: authentication-shibboleth.lazysession.secure = true

Informational Note: Force HTTPS when authenticating (only for lazy sessions). Generally this is
recommended to be "true".

Property: authentication-shibboleth.netid-header

Example Value: authentication-shibboleth.netid-header = SHIB-NETID

Informational Note: The HTTP header where shibboleth will supply a user's NetID. This HTTP
header should be specified as an Attribute within your Shibboleth
"attribute-map.xml" configuration file.

Property: authentication-shibboleth.email-header

Example Value: authentication-shibboleth.email-header = SHIB-MAIL

Informational Note: The HTTP header where the shibboleth will supply a user's email address.
This HTTP header should be specified as an Attribute within your
Shibboleth "attribute-map.xml" configuration file.

Property: authentication-shibboleth.email-use-tomcat-remote-user

Example Value: authentication-shibboleth.email-use-tomcat-remote-user = false

Informational Note: When the NetID and email headers are not available, should Shibboleth
authentication fall back to using Tomcat's remote user feature? Generally
this is not recommended. See the "Authentication Methods" section above.

Property: authentication-shibboleth.reconvert.attributes

Example Value: authentication-shibboleth.reconvert.attributes = false

Informational Note: Shibboleth attributes are by default UTF-8 encoded. Some servlet
containers automatically convert the attributes from ISO-8859-1 (latin-1) to
UTF-8. As the attributes were already UTF-8 encoded, it may be necessary
to reconvert them. If you set this property to true, DSpace converts all
Shibboleth attributes retrieved from the servlet container from UTF-8 to
ISO-8859-1 and uses the result as if it were UTF-8. This procedure restores
the Shibboleth attributes if the servlet container wrongly converted them
from ISO-8859-1 to UTF-8. Set this to true if you notice character encoding
problems within Shibboleth attributes.

Property: authentication-shibboleth.autoregister

Example Value: authentication-shibboleth.autoregister = true

Informational Note: Should we allow new users to be registered automatically?

Property: authentication-shibboleth.sword.compatibility

Example Value: authentication-shibboleth.sword.compatibility = false

Informational Note: SWORD compatibility will allow this authentication method to work when
using SWORD. SWORD relies on username and password based
authentication and is entirely incapable of supporting Shibboleth. This
option allows you to authenticate usernames and passwords for SWORD
sessions without adding another authentication method onto the stack.
You will need to ensure that a user has a password. One way to do that is to
create the user via the create-administrator command line command and
then edit their permissions.
WARNING: If you enable this option while ALSO having
"PasswordAuthentication" enabled, then you should ensure that
"PasswordAuthentication" is listed prior to "ShibAuthentication" in your
authentication.cfg file. Otherwise, ShibAuthentication will be used to
authenticate all of your users INSTEAD OF PasswordAuthentication.


Property: authentication-shibboleth.firstname-header

Example Value: authentication-shibboleth.firstname-header = SHIB_GIVENNAME

Informational Note: The HTTP header where the shibboleth will supply a user's given name.
This HTTP header should be specified as an Attribute within your
Shibboleth "attribute-map.xml" configuration file.

Property: authentication-shibboleth.lastname-header

Example Value: authentication-shibboleth.lastname-header = SHIB_SN

Informational Note: The HTTP header where the shibboleth will supply a user's surname. This
HTTP header should be specified as an Attribute within your Shibboleth
"attribute-map.xml" configuration file.

Property: authentication-shibboleth.eperson.metadata

Example Value:
authentication-shibboleth.eperson.metadata = \
SHIB-telephone => phone, \
SHIB-cn => cn

Informational Note: Additional user attribute mappings. Multiple attributes may be stored for
each user. The left side is the Shibboleth-based metadata Header and the
right side is the eperson metadata field to map the attribute to.
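
To illustrate how a "header => field" mapping of this kind can be read, here is a minimal Python sketch. It is not DSpace's implementation; the function names parse_metadata_mapping and eperson_metadata, and the sample header values, are hypothetical and used only for illustration.

```python
def parse_metadata_mapping(entries):
    """Turn 'HEADER => field' strings into a {header: field} dictionary."""
    mapping = {}
    for entry in entries:
        header, sep, field = entry.partition("=>")
        if not sep or not header.strip() or not field.strip():
            raise ValueError(f"expected 'header => field', got {entry!r}")
        mapping[header.strip()] = field.strip()
    return mapping


def eperson_metadata(headers, mapping):
    """Keep only the mapped request headers that are present and non-empty."""
    return {field: headers[h] for h, field in mapping.items() if headers.get(h)}


# The two entries from the example value above (hypothetical request headers).
mapping = parse_metadata_mapping(["SHIB-telephone => phone", "SHIB-cn => cn"])
print(eperson_metadata({"SHIB-cn": "Jane Doe", "SHIB-mail": "jane@example.edu"}, mapping))
```

Only headers listed on the left side of a mapping entry end up as eperson metadata; unmapped headers (SHIB-mail in the sample) are ignored.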

Property: authentication-shibboleth.eperson.metadata.autocreate

Example Value: authentication-shibboleth.eperson.metadata.autocreate = true

Informational Note: If the eperson metadata field is not found, should it be automatically
created?

Property: authentication-shibboleth.role-header

Example Value: authentication-shibboleth.role-header = SHIB_SCOPED_AFFILIATION

Informational Note: The Shibboleth header to use for role-based mappings (see the
"Role-based Groups" section above)

Property: authentication-shibboleth.role-header.ignore-scope

Example Value: authentication-shibboleth.role-header.ignore-scope = true

Informational Note: Whether to ignore the attribute's scope (everything after the @ sign for
scoped attributes)

Property: authentication-shibboleth.role-header.ignore-value

Example Value: authentication-shibboleth.role-header.ignore-value = false

Informational Note: Whether to ignore the attribute's value (everything before the @ sign for
scoped attributes)

Property: authentication-shibboleth.role.[affiliation-attribute]

Example Value:
authentication-shibboleth.role.faculty = Faculty, Member
authentication-shibboleth.role.staff = Staff, Member
authentication-shibboleth.role.student = Students, Member

Informational Note: Mapping of affiliation values to DSpace groups. See the "Role-based
Groups" section above for more info.
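
The interaction of the role-header, ignore-scope, ignore-value and role.* settings can be sketched roughly as follows. This Python example is an illustration under stated assumptions, not DSpace's code: the function names are hypothetical, and the ";" separator for multi-valued headers, the lowercase lookup, and the precedence of ignore-scope over ignore-value are all assumptions.

```python
def normalize_affiliation(value, ignore_scope=True, ignore_value=False):
    """Strip the scope or the value from a scoped attribute such as 'faculty@example.edu'."""
    value = value.strip()
    if "@" in value:
        name, scope = value.split("@", 1)
        if ignore_scope:
            value = name      # keep what is before the @ sign
        elif ignore_value:
            value = scope     # keep what is after the @ sign
    return value


def groups_for(header, role_map, ignore_scope=True, ignore_value=False):
    """Return the DSpace group names for every affiliation in a ';'-separated header."""
    groups = []
    for raw in header.split(";"):
        if not raw.strip():
            continue
        key = normalize_affiliation(raw, ignore_scope, ignore_value).lower()
        for group in role_map.get(key, []):
            if group not in groups:
                groups.append(group)
    return groups


# Mirrors the three authentication-shibboleth.role.* examples above.
role_map = {
    "faculty": ["Faculty", "Member"],
    "staff": ["Staff", "Member"],
    "student": ["Students", "Member"],
}
print(groups_for("faculty@example.edu;staff@example.edu", role_map))
```

With ignore-scope enabled, "faculty@example.edu" is looked up as "faculty", so one role.* entry covers every scope.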

LDAP Authentication

Introduction to LDAP specific terminology

If you are unfamiliar with LDAP, the following introduction to some of its terminology might come in handy:
https://stackoverflow.com/questions/18756688/what-are-cn-ou-dc-in-an-ldap-search


Enabling LDAP Authentication

To enable LDAP Authentication, you must ensure the org.dspace.authenticate.LDAPAuthentication class is
listed as one of the AuthenticationMethods in the following configuration:

Configuration File: [dspace]/config/modules/authentication.cfg

Property: plugin.sequence.org.dspace.authenticate.AuthenticationMethod

Example Value:
plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.LDAPAuthentication

Configuring LDAP Authentication

If LDAP is enabled, new users will be able to register by entering their username and password without being
sent a registration token. Users who do not have an LDAP username and password can still register and log in with
just their email address, the same way they do without LDAP.
If you want to give any special privileges to LDAP users, create a stackable authentication method to automatically
put people who have a netid into a special group. You might also want to give certain email addresses special
privileges. Refer to the Custom Authentication Code section (see page 113) below for more information about how to
do this.

Ensure required commas are escaped in LDAP configuration

NOTE: As of DSpace 6, commas (,) are now a special character in the Configuration (see page 552) system. As
some LDAP configuration may contain commas, you must be careful to escape any required commas by
adding a backslash (\) before each comma, e.g. "\,". The configuration reference for
authentication-ldap.cfg has been updated below with additional examples.
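
Why the escaping matters can be shown with a small Python sketch. It only imitates the general idea of a list-aware configuration parser (split on unescaped commas, then unescape); it is not the parser DSpace uses, and split_config_value is a hypothetical name.

```python
import re


def split_config_value(raw):
    """Split on commas that are not preceded by a backslash, then unescape '\\,'."""
    parts = re.split(r"(?<!\\),", raw)
    return [part.strip().replace("\\,", ",") for part in parts]


unescaped = "ou=people,o=myu.edu"
escaped = r"ou=people\,o=myu.edu"
print(split_config_value(unescaped))  # two separate values
print(split_config_value(escaped))    # one value that contains a comma
```

An unescaped DN is read as a list of fragments, so a setting that expects a single DN would receive only part of it.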

Here is an explanation of each of the different LDAP configuration parameters:

Configuration File: [dspace]/config/modules/authentication-ldap.cfg

Property: authentication-ldap.enable

Example Value: authentication-ldap.enable = false


Informational Note: This setting will enable or disable LDAP authentication in DSpace. With
the setting off, users will be required to register and login with their email
address. With this setting on, users will be able to login and register with
their LDAP user ids and passwords.

Property: authentication-ldap.autoregister

Example Value: authentication-ldap.autoregister = true

Informational Note: This will turn LDAP autoregistration on or off. With this on, a new EPerson
object will be created for any user who successfully authenticates against
the LDAP server when they first login. With this setting off, the user must
first register to get an EPerson object by entering their ldap username
and password and filling out the forms.

Property: authentication-ldap.provider_url

Example Value: authentication-ldap.provider_url = ldap://ldap.myu.edu/o=myu.edu\,ou=mydept

Informational Note: This is the URL of your institution's LDAP server. You may or may not need
the /o=myu.edu part at the end. Your server may also require the ldaps://
protocol. (This field has no default value)

NOTE: As of DSpace 6, commas (,) are now a special character in the
Configuration (see page 552) system. Therefore, be careful to escape any
required commas in this configuration by adding a backslash (\) before
each comma, e.g. "\,"

Property: authentication-ldap.starttls

Example Value: authentication-ldap.starttls = false

Informational Note: Should we issue StartTLS after establishing TCP connection in order to
initiate an encrypted connection?
Note: This (TLS) is different from LDAPS:

• TLS is a tunnel for plain LDAP and is typically recognized on the same
port (standard LDAP port: 389)
• LDAPS is a separate protocol, deprecated in favor of the standard
TLS method. (standard LDAPS port: 636)


Property: authentication-ldap.id_field

Example Value: authentication-ldap.id_field = uid

Explanation: This is the unique identifier field in the LDAP directory where the
username is stored. (This field has no default value)

Property: authentication-ldap.object_context

Example Value: authentication-ldap.object_context = ou=people\,o=myu.edu

Informational Note: This is the LDAP object context to use when authenticating the user. By
default, DSpace will use this value to create the user's DN in order to
attempt to authenticate them. It is appended to the id_field and
username. For example uid=username\,ou=people\,o=myu.edu. You
will need to modify this to match your LDAP configuration. (This field has
no default value)

If your users do NOT all exist under a single "object_context" in LDAP,
then you should ignore this setting and INSTEAD use the Hierarchical
LDAP Authentication settings below (see page 110) (especially see
"search.user" or "search.anonymous")

NOTE: As of DSpace 6, commas (,) are now a special character in the
Configuration (see page 552) system. Therefore, be careful to escape any
required commas in this configuration by adding a backslash (\) before
each comma, e.g. "\,"

Property: authentication-ldap.search_context

Example Value: authentication-ldap.search_context = ou=people


Informational Note: This is the search context used when looking up a user's LDAP object to
retrieve their data for autoregistering. With autoregister=true, when
a user authenticates without an EPerson object, DSpace searches the LDAP
directory to get their name (id_field) and email address
(email_field) so that it can create one for them. So after
authenticating against uid=username,ou=people,o=myu.edu, DSpace
searches in ou=people, filtering on [uid=username]. Often the
search_context is the same as the object_context parameter, but
this depends on your LDAP server configuration. (This field has no
default value, and it MUST be specified when either
search.anonymous=true or search.user is specified)

NOTE: As of DSpace 6, commas (,) are now a special character in the
Configuration (see page 552) system. Therefore, be careful to escape any
required commas in this configuration by adding a backslash (\) before
each comma, e.g. "\,"

Property: authentication-ldap.email_field

Example Value: authentication-ldap.email_field = mail

Informational Note: This is the LDAP object field where the user's email address is stored.
"mail" is the most common for LDAP servers. (This field has no default
value)

If the "email_field" is unspecified, or the user has no email address in
LDAP, their username (id_field value) will be saved as the email in
DSpace (or appended to netid_email_domain, when specified)
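
The fallback order for the stored email address can be summarized in a short Python sketch. It is a hedged illustration of the rules described in this section, not DSpace code; resolve_email and its parameter names are hypothetical.

```python
def resolve_email(netid, ldap_email=None, netid_email_domain=None):
    """Pick the email address to store on the EPerson.

    ldap_email is the value read from email_field, or None when email_field
    is unspecified or the user has no email address in LDAP.
    """
    if ldap_email:
        return ldap_email                    # 1. address found in LDAP
    if netid_email_domain:
        return netid + netid_email_domain    # 2. netid plus configured domain
    return netid                             # 3. bare netid as a last resort


print(resolve_email("user", ldap_email="user@myu.edu"))
print(resolve_email("user", netid_email_domain="@example.com"))
print(resolve_email("user"))
```

The last case stores a value that is not a deliverable address, which is worth keeping in mind if DSpace needs to send email to these users.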

Property: authentication-ldap.netid_email_domain

Example Value: authentication-ldap.netid_email_domain = @example.com


Informational Note: If your LDAP server does not hold an email address for a user (i.e. no
email_field), you can use the following field to specify your email
domain. This value is appended to the netid (id_field) in order to make
an email address (which is then stored in the DSpace EPerson). For
example, a netid of 'user' and netid_email_domain as @example.com
would set the email of the user to be user@example.com

Please note: this field will only be used if "email_field" is unspecified
OR the user in question has no email address stored in LDAP. If both
"email_field" and "netid_email_domain" are unspecified, then the
"id_field" will be used as the email address.

Property: authentication-ldap.surname_field

Example Value: authentication-ldap.surname_field = sn

Informational Note: This is the LDAP object field where the user's last name is stored. "sn" is
the most common for LDAP servers. If the field is not found the field will
be left blank in the new eperson object. (This field has no default value)

Property: authentication-ldap.givenname_field

Example Value: authentication-ldap.givenname_field = givenName

Informational Note: This is the LDAP object field where the user's given names are stored.
"givenName" is a common choice, but the attribute name may differ between LDAP servers.
If the field is not found the field will be left blank in the new eperson
object. (This field has no default value)

Property: authentication-ldap.phone_field

Example Value: authentication-ldap.phone_field = telephoneNumber

Informational Note: This is the field where the user's phone number is stored in the LDAP
directory. If the field is not found the field will be left blank in the new
eperson object. (This field has no default value)

Property: authentication-ldap.login.specialgroup


Example Value: authentication-ldap.login.specialgroup = group-name

Informational Note: If specified, all user sessions successfully logged in via LDAP will
automatically become members of this DSpace Group (for the remainder
of their current, logged in session). This DSpace Group must already exist
(it will not be automatically created).
This is useful if you want a DSpace Group made up of all internal
authenticated users. This DSpace Group can then be used to bestow
special permissions on any users who have authenticated via LDAP (e.g.
you could allow anyone authenticated via LDAP to view special, on
campus only collections or similar)

Property: authentication-ldap.login.groupmap.*

Example Value:
authentication-ldap.login.groupmap.1 = ou=Students:ALL_STUDENTS
authentication-ldap.login.groupmap.2 = ou=Employees:ALL_EMPLOYEES
authentication-ldap.login.groupmap.3 = ou=Faculty:ALL_FACULTY

Informational Note: The left part of the value (before the ":") must correspond to a portion of
a user's DN (unless "login.groupmap.attribute" is specified; please see
below). The right part of the value corresponds to the name of an existing
DSpace group.

For example, if the authenticated user's DN in LDAP is in the following form:

cn=jdoe,OU=Students,OU=Users,dc=example,dc=edu

that user would get assigned to the ALL_STUDENTS DSpace group for the
remainder of their current session.

However, if that same user later graduates and is employed by the
university, their DN in LDAP may change to:

cn=jdoe,OU=Employees,OU=Users,dc=example,dc=edu

Upon logging into DSpace after that DN change, the authenticated user
would now be assigned to the ALL_EMPLOYEES DSpace group for the
remainder of their current session.

Note: This option can be used independently from the login.specialgroup
option, which will put all LDAP users into a single DSpace group. Both
options may be used together.
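
The DN-based group mapping described above can be sketched in Python as a substring test against the user's DN. This is an illustration, not DSpace's implementation: groups_from_dn is a hypothetical name, and the case-insensitive comparison is an assumption suggested by the example (the configuration uses "ou=" while the DN uses "OU=").

```python
def groups_from_dn(user_dn, groupmap):
    """Return the DSpace groups whose LDAP fragment occurs in the user's DN."""
    dn = user_dn.lower()
    groups = []
    for entry in groupmap:
        fragment, _, group = entry.partition(":")
        if fragment.lower() in dn:
            groups.append(group)
    return groups


# Values of authentication-ldap.login.groupmap.1 to .3 from the example above.
groupmap = [
    "ou=Students:ALL_STUDENTS",
    "ou=Employees:ALL_EMPLOYEES",
    "ou=Faculty:ALL_FACULTY",
]
print(groups_from_dn("cn=jdoe,OU=Students,OU=Users,dc=example,dc=edu", groupmap))
print(groups_from_dn("cn=jdoe,OU=Employees,OU=Users,dc=example,dc=edu", groupmap))
```

Because membership is derived from the DN at each login, it follows the directory automatically when a user's entry moves.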

Property: authentication-ldap.login.groupmap.attribute


Example Value: authentication-ldap.login.groupmap.attribute = group

Informational Note: The value of "authentication-ldap.login.groupmap.attribute"
should specify the name of a single LDAP attribute. If this property is
uncommented, it changes the meaning of the left part of
"authentication-ldap.login.groupmap.*" (see above) as follows:

• If the authenticated user has this LDAP attribute, look up the value of
this LDAP attribute in the left part (before the ":") of the
authentication-ldap.login.groupmap.* value
• If that LDAP value is found in any "authentication-ldap.login.groupmap.*"
field, assign this authenticated user to
the DSpace Group specified by the right part (after the ":") of the
authentication-ldap.login.groupmap.* value.

For example:

• authentication-ldap.login.groupmap.attribute = group
• authentication-ldap.login.groupmap.1 = mathematics:Mathematics_Group

The above would ensure that any authenticated user whose LDAP
"group" attribute equals "mathematics" is added to the DSpace
Group named "Mathematics_Group" for the remainder of their current
session. However, if that same user logged in later with a new LDAP
"group" value of "computer science", they would no longer be a
member of the "Mathematics_Group" in DSpace.
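
When a groupmap attribute is configured, the left part of each groupmap entry is compared with the values of that LDAP attribute instead of the DN. A minimal Python sketch of that comparison follows; groups_from_attribute is a hypothetical name, and the case-insensitive exact match is an assumption rather than documented behaviour.

```python
def groups_from_attribute(attribute_values, groupmap):
    """Return the DSpace groups mapped to any value of the configured LDAP attribute."""
    wanted = {value.lower() for value in attribute_values}
    groups = []
    for entry in groupmap:
        value, _, group = entry.partition(":")
        if value.lower() in wanted:
            groups.append(group)
    return groups


groupmap = ["mathematics:Mathematics_Group"]
print(groups_from_attribute(["mathematics"], groupmap))
print(groups_from_attribute(["computer science"], groupmap))
```

The second call returns no groups, which corresponds to the user losing the "Mathematics_Group" membership after their LDAP "group" value changes.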

Debugging LDAP connection and configuration

As every LDAP is different, configuring your DSpace to communicate with your LDAP can sometimes be a challenge.
We recommend using third-party LDAP tools to test your LDAP connection / username / password, and perform
sample searches to better understand what information is being returned from your local LDAP. This will help
ensure that LDAP configuration goes more smoothly.
One example of such an LDAP tool is the ldapsearch command-line tool (see footnote 174), available in most Linux
operating systems (e.g. in Debian / Ubuntu it's available in the "ldap-utils" package). Below are some example
ldapsearch commands that can be used to determine (and/or debug) specific configurations in your
authentication-ldap.cfg. In the below examples, we've used the names of specific DSpace configurations as
placeholders (in square brackets).

174 https://linux.die.net/man/1/ldapsearch


# Basic anonymous connection (for VERBOSE, add -v)
ldapsearch -x -H [provider_url]

# Debug a connection error (add -d-1)
# If you are connecting to an LDAPS URL and see connection errors (e.g. "peer cert untrusted or revoked")
# then see below note about "SSL Connection Errors"
ldapsearch -x -H [provider_url] -d-1

# Attempt to connect to [provider_url] as [search.user] (will prompt for search.user's password)
# This doesn't actually perform a query, just ensures that authentication is working
# NOTE: "search.user" is USUALLY either the full user DN (e.g. "cn=dspaceadmin,ou=people,o=myu.edu")
# or "DOMAIN\USERNAME" (e.g. "MYU\DSpaceUser"). The latter is more likely with Windows Active Directory
ldapsearch -x -H [provider_url] -D [search.user] -W

# Attempt to list the first 100 users in a given [search_context], returning the "cn", "mail" and "sn"
# fields for each
ldapsearch -x -H [provider_url] -D [search.user] -W -b [search_context] -z 100 cn mail sn

# Attempt to find the first 100 users whose [id_field] starts with the letter "t", returning the
# [id_field], "cn", "mail" and "sn" fields for each
ldapsearch -x -H [provider_url] -D [search.user] -W -b [search_context] -z 100 -s sub "([id_field]=t*)" [id_field] cn mail sn

SSL Connection Errors: If you are using ldapsearch with an LDAPS connection (secure connection), you may receive
"peer cert untrusted or revoked" errors if the LDAP SSL certificate is self-signed. You can temporarily tell LDAP to
accept any security certificate by setting TLS_REQCERT allow in your ldapsearch's ldap.conf file. Be sure to
remove this setting however after you are done testing!

# FOR TESTING ONLY! This setting disables the check for a valid LDAP Server security certificate,
# which is considered a security issue for production LDAP setups. Setting this to "allow" tells
# the LDAP client to accept any security certificates that it cannot verify or validate.
TLS_REQCERT allow

More information on this SSL workaround can be found at:

• http://www.bind9.net/manual/openldap/2.3/tls.html
• http://muzso.hu/2012/03/29/how-to-configure-ssl-aka.-ldaps-for-libnss-ldap-auth-client-config-in-ubuntu

Enabling Hierarchical LDAP Authentication

Please note that DSpace no longer contains the LDAPHierarchicalAuthentication class. This
functionality is now supported by LDAPAuthentication, which uses the same configuration options.

If your users are spread out across a hierarchical tree on your LDAP server, you may wish to have DSpace search for
the user name in your tree. Here's how it works:
1. DSpace gets the user name from the login form
2. DSpace binds to LDAP as an administrative user with the right to search in DNs (LDAP may be configured to
allow anonymous users to search)
3. DSpace searches for the entry whose DN contains the user name (the user name is part of the full DN)
4. DSpace binds with the full DN it found and the password from the login form
5. DSpace logs the user in if LDAP reports successful authentication; refuses login otherwise
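
The search-then-bind flow in the steps above can be illustrated with a self-contained Python sketch. It uses an in-memory stand-in for the LDAP server so that it runs without any network access; FakeDirectory, login, the sample DNs and the passwords are all hypothetical, and a real deployment would talk to LDAP through a client library instead.

```python
class FakeDirectory:
    """In-memory stand-in for an LDAP server: maps DN -> (attributes, password)."""

    def __init__(self, entries):
        self.entries = entries

    def bind(self, dn, password):
        entry = self.entries.get(dn)
        return entry is not None and entry[1] == password

    def search(self, base, id_field, username):
        """Subtree search: DNs under `base` whose id_field equals username."""
        return [
            dn for dn, (attrs, _) in self.entries.items()
            if dn.endswith(base) and attrs.get(id_field) == username
        ]


def login(directory, username, password, search_user, search_password,
          search_context, id_field="uid"):
    """Return the user's DN on success, or None if any step fails."""
    # Step 2: bind as the account that is allowed to search.
    if not directory.bind(search_user, search_password):
        return None
    # Step 3: find the full DN for the user name typed into the login form.
    matches = directory.search(search_context, id_field, username)
    if len(matches) != 1:
        return None
    # Steps 4 and 5: bind as that DN with the supplied password.
    user_dn = matches[0]
    return user_dn if directory.bind(user_dn, password) else None


directory = FakeDirectory({
    "cn=admin,ou=people,o=myu.edu": ({"uid": "admin"}, "s3cret"),
    "uid=jdoe,ou=math,ou=people,o=myu.edu": ({"uid": "jdoe"}, "hunter2"),
})
print(login(directory, "jdoe", "hunter2",
            "cn=admin,ou=people,o=myu.edu", "s3cret", "ou=people,o=myu.edu"))
```

The user never needs to know their full DN: the search account locates it, and the user's own password is only checked in the final bind.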

Configuring Hierarchical LDAP Authentication

Hierarchical LDAP Authentication shares all the above standard LDAP configurations (see page 102), but has some
additional settings.
You can optionally specify the search scope. If anonymous access is not enabled on your LDAP server, you will need
to specify the full DN and password of a user that is allowed to bind in order to search for the users.

Configuration File: [dspace]/config/modules/authentication-ldap.cfg

Property: authentication-ldap.search_scope

Example Value: authentication-ldap.search_scope = 2

Informational Note: This is the search scope value for the LDAP search during
autoregistering (autoregister=true). This will depend on your
LDAP server setup, and is only really necessary if your users are
spread out across a hierarchical tree on your LDAP server. This value
must be one of the following integers corresponding to the following
values:
object scope : 0
one level scope : 1
subtree scope : 2

Please note that "search_context" in the LDAP configurations
must also be specified.

Property: authentication-ldap.search.anonymous

Example Value: authentication-ldap.search.anonymous = true

Informational Note: If true, DSpace will anonymously search LDAP (in the
"search_context") for the DN of the user trying to login to DSpace.
This setting is "false" by default. By default, DSpace will either use
"search.user" to authenticate for the LDAP search (if search.user
is specified), or will use the "object_context" value to create the
user's DN.


Property: authentication-ldap.search.user
authentication-ldap.search.password

Example Value:
authentication-ldap.search.user = cn=admin\,ou=people\,o=myu.edu
authentication-ldap.search.password = password

Informational Note: The full DN and password of a user allowed to connect to the LDAP
server and search (in the "search_context") for the DN of the user
trying to login. By default, if unspecified, DSpace will either search
LDAP anonymously for the user's DN (when
search.anonymous=true), or will use the "object_context"
value to create the user's DN.

NOTE: As of DSpace 6, commas (,) are now a special character in the
Configuration (see page 552) system. Therefore, be careful to escape
any required commas in this configuration by adding a backslash (\)
before each comma, e.g. "\,"
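
How the search.user, search.anonymous and object_context settings combine to determine the user's DN can be sketched as a small decision function. This Python example is an illustration only: the function names and the returned labels are hypothetical, and the precedence of search.user over search.anonymous when both are set is an assumption that this section does not state.

```python
def dn_lookup_strategy(search_user=None, search_password=None,
                       search_anonymous=False):
    """Decide how the user's DN would be obtained for a given configuration."""
    if search_user and search_password:
        return "search-as-user"            # bind as search.user, then search
    if search_anonymous:
        return "search-anonymously"        # search without binding
    return "build-from-object_context"     # no search: construct the DN


def build_user_dn(id_field, username, object_context):
    """Fallback: id_field=username followed by the object context."""
    return f"{id_field}={username},{object_context}"


print(dn_lookup_strategy("cn=admin,ou=people,o=myu.edu", "password"))
print(dn_lookup_strategy(search_anonymous=True))
print(dn_lookup_strategy())
print(build_user_dn("uid", "jdoe", "ou=people,o=myu.edu"))
```

The last strategy only works when all users live directly under one object_context, which is why the search-based options exist for hierarchical directories.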

IP Authentication

Enabling IP Authentication
To enable IP Authentication, you must ensure the org.dspace.authenticate.IPAuthentication class is listed
as one of the AuthenticationMethods in the following configuration:

Configuration File: [dspace]/config/modules/authentication.cfg

Property: plugin.sequence.org.dspace.authenticate.AuthenticationMethod

Example Value:
plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.IPAuthentication

Configuring IP Authentication

Configuration File: [dspace]/config/modules/authentication-ip.cfg

Once enabled, you are then able to map DSpace groups to IP addresses in authentication-ip.cfg by setting
authentication-ip.GROUPNAME = iprange[, iprange ...], e.g.:

# Entries below, in order: full IP, partial IP, CIDR, netmask, IPv6
authentication-ip.MY_UNIVERSITY = 10.1.2.3, \
13.5, \
11.3.4.5/24, \
12.7.8.9/255.255.128.0, \
2001:18e8::32

Negative matches can be set by prepending the entry with a '-'. For example, if you want to include all of a class B
network except for users of a contained class C network, you could use: 111.222,-111.222.33.
Notes:
• If the Groupname contains blanks you must escape the spaces, e.g. "Department\ of\ Statistics"
• If your DSpace installation is hidden behind a web proxy, remember to set the useProxies configuration
option within the 'Logging' section of dspace.cfg to use the IP address of the user rather than the IP
address of the proxy server.
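
The range formats and negative matches described above can be illustrated with Python's standard ipaddress module. This is a sketch of the matching idea, not DSpace's IP matcher: entry_matches and in_group are hypothetical names, and treating a partial IP as a leading-octet prefix is an assumption.

```python
import ipaddress


def entry_matches(entry, address):
    """Check one range entry (full IP, partial IP, CIDR or netmask) against an address."""
    ip = ipaddress.ip_address(address)
    if "/" in entry:
        # CIDR ("11.3.4.5/24") or netmask ("12.7.8.9/255.255.128.0") notation;
        # strict=False tolerates host bits being set in the entry.
        network = ipaddress.ip_network(entry, strict=False)
        return ip.version == network.version and ip in network
    try:
        return ip == ipaddress.ip_address(entry)
    except ValueError:
        # Not a complete address: treat it as a leading-octet prefix such as "13.5".
        return address.startswith(entry.rstrip(".") + ".")


def in_group(ranges, address):
    """True if some positive entry matches and no negative ('-') entry matches."""
    included = excluded = False
    for raw in ranges:
        entry = raw.strip()
        if entry.startswith("-"):
            excluded = excluded or entry_matches(entry[1:], address)
        else:
            included = included or entry_matches(entry, address)
    return included and not excluded


ranges = ["10.1.2.3", "13.5", "11.3.4.5/24", "12.7.8.9/255.255.128.0", "2001:18e8::32"]
print(in_group(ranges, "11.3.4.200"))
print(in_group(["111.222", "-111.222.33"], "111.222.33.4"))
```

In this sketch a negative entry always wins over a positive one, regardless of the order in which the entries are listed.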

X.509 Certificate Authentication

Enabling X.509 Certificate Authentication

The X.509 authentication method uses an X.509 certificate sent by the client to establish his/her identity. It requires
the client to have a personal Web certificate installed on their browser (or other client software) which is issued by a
Certifying Authority (CA) recognized by the web server.
1. See the HTTPS installation instructions (see page 87) to configure your Web server. If you are using HTTPS with
Tomcat, note that the <Connector> tag must include the attribute clientAuth="true" so the server
requests a personal Web certificate from the client.
2. Add the org.dspace.authenticate.X509Authentication plugin first to the list of stackable
authentication methods in the value of the configuration key
plugin.sequence.org.dspace.authenticate.AuthenticationMethod

Configuration File: [dspace]/config/modules/authentication.cfg

Property: plugin.sequence.org.dspace.authenticate.AuthenticationMethod

Example Value:
plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.X509Authentication
plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.PasswordAuthentication

Configuring X.509 Certificate Authentication

Configuration File: [dspace]/config/modules/authentication-x509.cfg


1. You must also configure DSpace with the same CA certificates as the web server, so it can accept and
interpret the clients' certificates. It can share the same keystore file as the web server, or a separate one, or a
CA certificate in a file by itself. Configure it by one of these methods. Either use the Java keystore:

authentication-x509.keystore.path = path to Java keystore file
authentication-x509.keystore.password = password to access the keystore

...or the separate CA certificate file (in PEM or DER format):

authentication-x509.ca.cert = path to certificate file for CA whose client certs to accept.

2. Choose whether to enable auto-registration: If you want users who authenticate successfully to be
automatically registered as new E-Persons if they are not already, set the autoregister configuration
property to true. This lets you automatically accept all users with valid personal certificates. The default is
false.

TODO: document the remaining authentication-x509.* properties

Example of a Custom Authentication Method

Also included in the source is an implementation of an authentication method used at MIT,
edu.mit.dspace.MITSpecialGroup. This does not actually authenticate a user, it only adds the current user session to
a special (dynamic) group called 'MIT Users' (which must be present in the system!). This allows us to create
authorization policies for MIT users without having to manually maintain membership of the MIT users group.
By keeping this code in a separate method, we can customize the authentication process for MIT by simply adding it
to the stack in the DSpace configuration. None of the code has to be touched.
You can create your own custom authentication method and add it to the stack. Use the most similar existing
method as a model, e.g. org.dspace.authenticate.PasswordAuthentication for an "explicit" method (with
credentials entered interactively) or org.dspace.authenticate.X509Authentication for an implicit method.

4.1.2 Embargo
• What is an Embargo? (see page 114)
• DSpace Embargo Functionality (see page 114)
• Managing Embargoes on existing Items (see page 114)
• Configuring and using Embargo in DSpace Submission User Interface (see page 115)
• Private/Public Item (see page 115)
• Pre-3.0 Embargo Migration Routine (see page 115)
• Technical Specifications (see page 115)
• Introduction (see page 115)
• ResourcePolicy (see page 116)
• Item (see page 116)
• Item.inheritCollectionDefaultPolicies(Collection c) (see page 116)
• AuthorizeService (see page 116)
• Withdraw Item (see page 117)
• Reinstate Item (see page 117)
• Pre-DSpace 3.0 Embargo Compatibility (see page 117)
• Creating Embargoes via Metadata (see page 117)
• Introduction (see page 117)
• Setting Embargo terms via metadata (see page 117)
• Terms assignment (see page 118)
• Terms interpretation/imposition (see page 118)
• Embargo period (see page 118)
• Configuration of metadata fields (see page 118)
• Operation (see page 119)
• Extending embargo functionality (see page 119)
• Setter (see page 120)
• Lifter (see page 120)

4.1.2.1 What is an Embargo?

An embargo is a temporary access restriction placed on metadata or bitstreams (i.e. files). Its scope or duration may
vary, but the fact that it eventually expires is what distinguishes it from other content restrictions. For example, it is
not unusual for content destined for DSpace to come with permanent restrictions on use or access based on
license-driven or other IP-based requirements that limit access to institutionally affiliated users. Restrictions such
as these are imposed and managed using standard administrative tools in DSpace, typically by attaching specific
access policies (aka "resource policies") to Items, Collections, Bitstreams, etc.
Embargo functionality was originally introduced as part of DSpace 1.6, enabling embargoes on the level of items
that applied to all bitstreams included in the item. Since DSpace 3.0, this functionality has been extended to the
Submission User Interface, enabling embargoes on the level of individual bitstreams.

4.1.2.2 DSpace Embargo Functionality

Embargoes can be applied per item (including metadata) and per bitstream (i.e. file). The item level embargo will be
the default for every bitstream, although it could be customized at bitstream level.
When an embargo is set on either an item level or a bitstream level, a new ResourcePolicy (i.e. access policy) is
added to the corresponding Item or Bitstream. This ResourcePolicy will automatically control the lifting of the
embargo (when the embargo date passes). An embargo lift date is generally stored as the "start date" of such a
policy. Essentially, this means that the access rights defined in the policy do not get applied until after that date
passes (and prior to that date, the access rights will default to Admin only).
The scheduled, manual "embargo-lifter" commands (used prior to DSpace 3) are no longer necessary, and running
them is not recommended.
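
The way a policy's "start date" produces an embargo can be sketched in a few lines of Python. This is an illustration of the rule described above, not DSpace's ResourcePolicy code; policy_is_active is a hypothetical name, and treating the start and end dates as inclusive is an assumption.

```python
from datetime import date


def policy_is_active(today, start_date=None, end_date=None):
    """A policy grants access only inside its [start_date, end_date] window."""
    if start_date is not None and today < start_date:
        return False   # embargo still in effect
    if end_date is not None and today > end_date:
        return False   # policy has expired
    return True


embargo_lift = date(2013, 1, 1)
print(policy_is_active(date(2012, 12, 31), start_date=embargo_lift))
print(policy_is_active(date(2013, 1, 1), start_date=embargo_lift))
```

Because the check is made against the current date on each access, no scheduled job is needed to lift the embargo once the date passes.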

Managing Embargoes on existing Items

Administrators are able to change the lift date of any embargo by editing the authorization policy (ResourcePolicy)
on the object. These authorization policies can be managed from the Edit Item screen by clicking on
"Authorizations".
• To add an embargo, edit the appropriate policy and set a "start date". To add a full Item embargo
(including metadata), edit the Item policy. To embargo individual bitstreams, edit the appropriate Bitstream
policy.
• To remove an embargo, edit the appropriate policy, and clear out the "start date".
• To change an embargo, edit the appropriate policy, and change the "start date" to a new date.
Changes to the embargo should take effect immediately. However, as Administrators have full access to embargoed
items, you may need to log out first. After logging out, you will be subject to the embargo.


4.1.2.3 Configuring and using Embargo in DSpace Submission User Interface

Item-level embargo is not yet supported in DSpace 7.0 in the Submission user interface. In the Submission
UI, only embargoes on specific bitstreams (files) are supported. However, you can add an item-level
embargo in DSpace 7.0 using the "Managing Embargoes on existing Items" approach described above.
Item-level embargo will be available in a future 7.x release. See DSpace Release 7.0 Status (footnote 175).

Starting in DSpace 7, embargo (and lease) settings are configurable via a Spring Bean configuration file:
[dspace]/config/spring/api/access-conditions.xml
For detailed information on configuring your Embargo options (and other related options like lease or restrict to a
particular group of users), see the section on "Configuring the File Upload Step" of the Submission User Interface
(see page 260).

Private/Public Item
It is also possible to adjust the Private/Public state of an item after it has been archived in the repository. This can
be achieved from either the "Admin Search" (/admin/search), or from the "Status" tab under "Edit Item".
Private items are not retrievable through the DSpace search, browse or Discovery indexes.
Therefore, an "Admin Search" option is provided, which allows you to search across all items, including private or
withdrawn items. You can also filter your results to display only private items.

Pre-3.0 Embargo Migration Routine

If you have just upgraded from a DSpace 1.x.x version, any embargoes that are currently "in effect" will need to be
migrated into ResourcePolicies. Prior to 3.0, embargoes in DSpace were managed entirely in metadata fields (and
required running a scheduled "embargo-lifter" command). However, DSpace now stores all embargo information
directly on ResourcePolicies (i.e. "access policies"). These ResourcePolicies automatically "lift" an embargo after
the embargo date passes.
In order to migrate old embargoes into ResourcePolicies, a migration routine has been developed. Please note
that this migration routine should only need to be run ONCE (immediately after an upgrade from 1.x.x to a more
recent version of DSpace). After that point, any newly defined embargoes will automatically be stored on
ResourcePolicies.
To execute it, run the following command:

[dspace]/bin/dspace migrate-embargo -a

4.1.2.4 Technical Specifications

Introduction
The following sections illustrate the technical changes that have been made to the back-end to add the new
Advanced Embargo functionality.

175 https://wiki.lyrasis.org/display/DSPACE/DSpace+Release+7.0+Status


ResourcePolicy
When an embargo is set at item level or bitstream level, a new ResourcePolicy will be added.
Three new attributes have been introduced in the ResourcePolicy class:
• rpname: resource policy name
• rptype: resource policy type
• rpdescription: resource policy description
While rpname and rpdescription are fields manageable by users, the rptype is managed by DSpace itself. It
represents a type that a resource policy can assume, among the following:
• TYPE_SUBMISSION: all the policies added automatically during the submission process
• TYPE_WORKFLOW: all the policies added automatically during the workflow stage
• TYPE_CUSTOM: all the custom policies added by users
• TYPE_INHERITED: all the policies inherited from the enclosing object (for Item, a Collection; for Bitstream, an
Item).
Here is an example of all information contained in a single policy record:

policy_id: 4847
resource_type_id: 2
resource_id: 89
action_id: 0
eperson_id:
epersongroup_id: 0
start_date: 2013-01-01
end_date:
rpname: Embargo Policy
rpdescription: Embargoed through 2012
rptype: TYPE_CUSTOM

Item
To manage Private/Public state a new boolean attribute has been added to the Item:
• isDiscoverable
When an Item is private, the attribute will assume the value false.

Item.inheritCollectionDefaultPolicies(Collection c)
This method has been adjusted to leave custom policies, added by the users, in place and add the default collection
policies only if there are no custom policies.

AuthorizeService
Some methods have been changed on AuthorizeService to manage the new fields and some convenience methods
have been introduced:


public static List<ResourcePolicy> findPoliciesByDSOAndType(Context c, DSpaceObject o, String type);
public static void removeAllPoliciesByDSOAndTypeNotEqualsTo(Context c, DSpaceObject o, String type);
public static boolean isAnIdenticalPolicyAlreadyInPlace(Context c, DSpaceObject o, ResourcePolicy rp);
public static ResourcePolicy createOrModifyPolicy(ResourcePolicy policy, Context context, String name,
        int idGroup, EPerson ePerson, Date embargoDate, int action, String reason, DSpaceObject dso);

Withdraw Item
The feature to withdraw an item from the repository has been modified to keep all the custom policies in place.

Reinstate Item
The feature to reinstate an item in the repository has been modified to preserve existing custom policies.

Pre-DSpace 3.0 Embargo Compatibility

The Pre-DSpace 3.0 embargo functionality (see below) has been modified to adjust the policies setter and lifter.
These classes now also set the dates within the policy objects themselves in addition to setting the date in the item
metadata.

4.1.2.5 Creating Embargoes via Metadata

Introduction
Prior to DSpace 3.0, all DSpace embargoes were stored as metadata. While embargoes are no longer stored
permanently in metadata fields (they are now stored on ResourcePolicies, i.e. access policies), embargoes can still
be initialized via metadata fields.
This ability to create/initialize embargoes via metadata is extremely powerful if you wish to submit embargoed
content via electronic means (such as Importing Items via Simple Archive Format (see page 233), SWORDv1 (see page
216), SWORDv2 (see page 202), etc.).

Setting Embargo terms via metadata

Functionally, the embargo system allows you to attach "terms" to an item before it is placed into the repository,
which express how the embargo should be applied. What do we mean by "terms" here? They are really any
expression that the system is capable of turning into (1) the time the embargo expires, and (2) a concrete set of
access restrictions. Some examples:
• "2020-09-12" - an absolute date (i.e. the date the embargo will be lifted)
• "6 months" - a time relative to when the item is accessioned
• "forever" - an indefinite, or open-ended, embargo
• "local only until 2015" - both a time and an exception (public has no access until 2015, local users OK immediately)
• "Nature Publishing Group standard" - look-up to a policy somewhere (typically 6 months)
These terms are interpreted by the embargo system to yield a specific date on which the embargo can be removed
(or "lifted"), and a specific set of access policies. Obviously, some terms are easier to interpret than others (the
absolute date really requires none at all), and the default embargo logic understands only the most basic terms (the
first and third examples above). But as we will see below, the embargo system lets you add
your own interpreters to cope with any terms expression you wish to support. The date that results from the
interpretation is stored with the item. The embargo system detects when that date has passed, and removes the
embargo ("lifts it"), so the item bitstreams become available. Here is a more detailed life-cycle for an embargoed
item:

Terms assignment
The first step in placing an embargo on an item is to attach (assign) "terms" to it. If these terms are missing, no
embargo will be imposed. As we will see below, terms are carried in a configurable DSpace metadata field, so
assigning terms just means assigning a value to a metadata field. This can be done in a web submission user
interface form, in a SWORD deposit package, a batch import, etc. - anywhere metadata is passed to DSpace. The
terms are not immediately acted upon, and may be revised, corrected, removed, etc, up until the next stage of the
life-cycle. Thus a submitter could enter one value, and a collection editor replace it, and only the last value will be
used. Since metadata fields are multivalued, theoretically there can be multiple terms values, but in the default
implementation only one is recognized.

Terms interpretation/imposition
In DSpace terminology, when an Item has exited the last of any workflow steps (or if none have been defined for it),
it is said to be "installed" into the repository. At this precise time, the interpretation of the terms occurs, and a
computed "lift date" is assigned, and recorded as part of the ResourcePolicy (aka policy) of the Item. Once the lift
date has been assigned to the ResourcePolicy, the metadata field which defined the embargo is cleared. From that
point forward, all embargo information is controlled/defined by the ResourcePolicy.
It is important to understand that this interpretation happens only once (just like the installation). Therefore,
updating/changing an embargo cannot be done via metadata fields. Instead, all embargo updates must be
made to the ResourcePolicies themselves (e.g. ResourcePolicies can be managed from the Admin UI in the Edit Item
screens).
Also note that since these policy changes occur before installation, there is no time during which embargoed
content is "exposed" (accessible by non-administrators). The terms interpretation and imposition together are
called "setting" the embargo, and the component that performs them both is called the embargo "setter".

Embargo period
After an embargoed item has been installed, the policy restrictions remain in effect until the embargo date passes.
Once the embargo date passes, the policy restrictions are automatically lifted. An embargo lift date is generally
stored as the "start date" of a policy. Essentially, this means that the policy does not get applied until after that date
passes (and prior to that date, the object defaults to Admin only access).
Administrators are able to change the lift date of the embargo by editing the policy (ResourcePolicy). These policies
can be managed from the Edit Item screens.
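
The role of the policy "start date" described above can be sketched in a few lines of Java. This is a minimal, self-contained illustration and not the DSpace implementation: the class name PolicyWindowSketch and the method isInEffect are hypothetical, and the sketch assumes that a missing start or end date means "no limit" on that side and that the policy applies from the start date itself onwards.

```java
import java.time.LocalDate;

// Hypothetical sketch (not DSpace code): decides whether a policy with an
// optional start date and end date is in effect on a given day.
final class PolicyWindowSketch {

    // A null startDate or endDate is treated as "no limit" on that side (assumption).
    static boolean isInEffect(LocalDate startDate, LocalDate endDate, LocalDate today) {
        if (startDate != null && today.isBefore(startDate)) {
            return false; // embargo still running: the policy has not started yet
        }
        if (endDate != null && today.isAfter(endDate)) {
            return false; // the policy has expired
        }
        return true;
    }

    public static void main(String[] args) {
        // Dates taken from the example policy record: start_date 2013-01-01, no end_date.
        LocalDate lift = LocalDate.of(2013, 1, 1);
        System.out.println(isInEffect(lift, null, LocalDate.of(2012, 12, 31)));
        System.out.println(isInEffect(lift, null, LocalDate.of(2013, 1, 1)));
    }
}
```

Under these assumptions, a read policy whose start date is the lift date grants nothing before that date, which is why no separate "lifter" job is needed to end the embargo.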

Configuration of metadata fields

DSpace embargoes utilize standard metadata fields to hold both the "terms" and the "lift date". Which fields you
use are configurable, and no specific metadata element is dedicated or pre-defined for use in embargo. Rather, you
must specify exactly what field you want the embargo system to examine when it needs to find the terms or assign
the lift date.
The properties that specify these assignments live in dspace.cfg:


# DC metadata field to hold the user-supplied embargo terms

embargo.field.terms = SCHEMA.ELEMENT.QUALIFIER

# DC metadata field to hold computed "lift date" of embargo
embargo.field.lift = SCHEMA.ELEMENT.QUALIFIER

You replace the placeholder values with real metadata field names. If you only need the "default" embargo
behavior - which essentially accepts only absolute dates as "terms" - this is the only configuration required, except
as noted below.
There is also a property for the special date of "forever":

# string in terms field to indicate indefinite embargo

embargo.terms.open = forever

which you may change to suit linguistic or other preference.

You are free to use existing metadata fields, or create new fields. If you choose the latter, you must understand that
the embargo system does not create or configure these fields: you must follow all the standard documented
procedures for actually creating them (i.e. adding them to the metadata registry, or to display templates, etc.); this
does not happen automatically. Likewise, if you want the field for "terms" to appear in submission screens and
workflows, you must follow the documented procedure for configurable submission (basically, this means adding
the field to submission-forms.xml). The flexibility of metadata configuration makes it easy for you to restrict
embargoes to specific collections, since configurable submission can be defined per collection.
Key recommendations:
1. Use a local metadata schema. Breaking compliance with the standard Dublin Core in the default metadata
registry can create a problem for the portability of data to/from your repository.
2. If using existing metadata fields, avoid any that are automatically managed by DSpace. For example, fields
like "date.issued" or "date.accessioned" are normally automatically assigned, and thus must not be
recruited for embargo use.
3. Do not place the field for "lift date" in submission screens. This can potentially confuse submitters because
they may feel that they can directly assign values to it. As noted in the life-cycle above, this is erroneous: the
lift date gets assigned by the embargo system based on the terms. Any pre-existing value will be
overwritten. But see the next recommendation for an exception.
4. As the life-cycle discussion above makes clear, after the terms are applied, that field is no longer actionable
in the embargo system. Conversely, the "lift date" field is not actionable until the application. Thus you may
want to consider configuring both the "terms" and "lift date" to use the same metadata field. In this
way, during workflow you would see only the terms, and after item installation, only the lift date. If you wish
the metadata to retain the terms for any reason, use 2 distinct fields instead.

Operation
After the fields defined for terms and lift date have been assigned in dspace.cfg, and created and configured
wherever they will be used, you can begin to embargo items simply by entering data (dates, if using the default
setter) in the terms field. The items will automatically be embargoed as they exit workflow, and the computed lift
date will be stored on the ResourcePolicy.

Extending embargo functionality

The embargo system supplies a default "interpreter/imposition" class (the "Setter").


Setter
The default setter recognizes only two expressions of terms: either a literal, non-relative date in the fixed format
"yyyy-mm-dd" (known as ISO 8601), or a special string used for open-ended embargo (the default configured value
for this is "forever", but this can be changed in dspace.cfg to "toujours", "unendlich", etc.). It will perform a minimal
sanity check that the date is not in the past. Similarly, the default setter will only remove all read policies as noted
above, rather than applying more nuanced rules (e.g. allow access to certain IP groups, deny the rest). Fortunately,
the setter class itself is configurable and you can "plug in" any behavior you like, provided it is written in Java and
conforms to the setter interface. The dspace.cfg property:

# implementation of embargo setter plugin - replace with local implementation if applicable

plugin.single.org.dspace.embargo.EmbargoSetter = org.dspace.embargo.DefaultEmbargoSetter

controls which setter to use.
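
The interpretation rules described for the default setter can be sketched as follows. This is a hedged, self-contained illustration only: it does not implement the org.dspace.embargo.EmbargoSetter interface, and the class TermsSketch, the method parseTerms, the FOREVER constant and the choice of exception are all hypothetical. The open-ended token is passed in as a parameter to mirror the configurable embargo.terms.open value.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical sketch (not the DSpace DefaultEmbargoSetter): turns a terms
// string into a lift date, accepting only "yyyy-mm-dd" or the open-ended token.
final class TermsSketch {

    // Stand-in for "never lifts"; the real system's representation may differ.
    static final LocalDate FOREVER = LocalDate.MAX;

    static LocalDate parseTerms(String terms, String openToken, LocalDate today) {
        if (terms == null || terms.trim().isEmpty()) {
            return null; // no terms: no embargo is imposed
        }
        String value = terms.trim();
        if (value.equals(openToken)) {
            return FOREVER; // indefinite embargo
        }
        LocalDate lift;
        try {
            lift = LocalDate.parse(value); // ISO 8601 calendar date, e.g. 2020-09-12
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("Unsupported embargo terms: " + value, e);
        }
        if (lift.isBefore(today)) {
            // minimal sanity check: a lift date in the past is rejected
            throw new IllegalArgumentException("Lift date is in the past: " + value);
        }
        return lift;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2020, 1, 1);
        System.out.println(parseTerms("2020-09-12", "forever", today));
        System.out.println(parseTerms("forever", "forever", today).equals(FOREVER));
    }
}
```

A local setter that understands relative terms such as "6 months" would extend this kind of parsing step; terms the sketch does not recognise are rejected rather than guessed at.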

Lifter

DEPRECATED: The Lifter is no longer used in the DSpace API, and its use is not recommended. Embargo
lift dates are now stored on ResourcePolicies and, as such, are "lifted" automatically when the embargo
date passes. Manually running a "lifter" may bypass this automatic functionality and produce unexpected
results.

The default lifter behavior as described above - essentially applying the collection policy rules to the item - might
also not be sufficient for all purposes. It also can be replaced with another class:

# implementation of embargo lifter plugin - replace with local implementation if applicable

plugin.single.org.dspace.embargo.EmbargoLifter = org.dspace.embargo.DefaultEmbargoLifter

4.1.2.6 Pre-3.0 Embargo Lifter Commands

 DEPRECATED - Not recommended to use

The old "embargo-lifter" command no longer needs to be run. All embargoes in DSpace are now stored
on ResourcePolicies and are lifted automatically after the lift date passes. See the Embargo (see page 113)
documentation for more information.
Continuing to run the "embargo-lifter" is not recommended, and this feature will be removed entirely in a
future DSpace release.

If you have implemented the pre-DSpace 3.0 Embargo (see page 113) feature, you will need to run the lifter
periodically to check for Items with expired embargoes and lift them.

Command used: [dspace]/bin/dspace embargo-lifter
Java class: org.dspace.embargo.EmbargoManager

Arguments (short and long forms)   Description
-c or --check                      ONLY check the state of embargoed Items, do NOT lift any embargoes.
-i or --identifier                 Process ONLY this handle identifier(s), which must be an Item. Can be repeated.
-l or --lift                       Only lift embargoes, do NOT check the state of any embargoed items.
-n or --dryrun                     Do not change anything in the data model; print a message instead.
-v or --verbose                    Print a line describing the action taken for each embargoed item found.
-q or --quiet                      No output except upon error.
-h or --help                       Display brief help screen.

You must run the Embargo Lifter task periodically to check for items with expired embargoes and lift them from
being embargoed. For example, to check the status, at the CLI:

[dspace]/bin/dspace embargo-lifter -c

To lift the actual embargoes on those items that meet the time criteria, at the CLI:

[dspace]/bin/dspace embargo-lifter -l

4.1.3 Managing User Accounts

When a user registers an account for the purpose of subscribing to change notices, submitting content, or the like,
DSpace creates an EPerson record in the database. Administrators can manipulate these records in several ways.


4.1.3.1 From the browser

• Login as an Administrator
• Sidemenu "Access Control" → "People"
• Browse or search for the account you wish to modify or delete.
To modify user permissions / group memberships:
• Login as an Administrator
• Sidemenu "Access Control" → "Groups"
• Edit the Group
• Search for the EPerson & add/remove them from that group.

4.1.3.2 From the command line

The user command

The dspace user command adds, lists, modifies, and deletes EPerson records.

To create a new user account:

[dspace]/bin/dspace user --add --email [email protected] -g John -s User --password hiddensecret

[dspace]/bin/dspace user --add --netid jquser --telephone 555-555-1234 --password hiddensecret

One of the options --email or --netid is required to name the record. The complete options are:

short  long                  meaning
-a     --add                 required
-m     --email               email address
-n     --netid               "netid" (a username in an external system such as a directory; see Authentication Methods for details)
-p     --password            a password for the account. Required.
-g     --givenname           First or given name
-s     --surname             Last or surname
-t     --telephone           Telephone number
-l     --language            Preferred language
-c     --requireCertificate  Certificate required? See X.509 Authentication (see page 87) for details.


To list accounts:

[dspace]/bin/dspace user --list

This simply lists some characteristics of each EPerson.

short long meaning

-L --list required

To modify an account:

[dspace]/bin/dspace user --modify -m [email protected]

short  long                  meaning
-M     --modify              required
-m     --email               identify the account by email address
-n     --netid               identify the account by netid
-g     --givenname           First or given name
-s     --surname             Last or surname
-t     --telephone           telephone number
-l     --language            preferred language
-c     --requireCertificate  certificate required?
-C     --canLogIn            is the account enabled or disabled?
-i     --newEmail            set or change email address
-I     --newNetid            set or change netid

To delete an account:

[dspace]/bin/dspace user --delete -n martha


short  long      meaning
-d     --delete  required
-m     --email   identify the account by email address
-n     --netid   identify the account by netid

The Groomer
This tool inspects all user accounts for several conditions.

short  long        meaning
-a     --aging     find accounts not logged in since a given date
-u     --unsalted  find accounts not using salted password hashes
-b     --before    date cutoff for --aging
-d     --delete    delete disused accounts (used with --aging)

Find accounts with unsalted passwords

Earlier versions of DSpace used an "unsalted hash" method to protect user passwords. Recent versions use a salted
hash. You can find accounts which have never been converted to salted hashing:

Discovering accounts with unsalted password hashes

[dspace]/bin/dsrun org.dspace.eperson.Groomer -u

The output is a list of email addresses for matching accounts.

Find (and perhaps delete) disused accounts

You can list accounts which have not logged on since a given date:

Discovering disused accounts

[dspace]/bin/dsrun org.dspace.eperson.Groomer -a -b 07/20/1969

The output is a tab-separated-value table of the EPerson ID, last login date, email address, netid, and full name for
each matching account.
You can also have the tool delete matching accounts:


Deleting disused accounts

[dspace]/bin/dsrun org.dspace.eperson.Groomer -a -b 07/20/1969 -d

4.1.3.3 Email Subscriptions

• Introduction
• Adding new subscriptions
• System configuration for sending out daily emails

Introduction
Registered users can subscribe to collections in DSpace. After subscribing, users will receive a daily email
containing the new and modified items in the collections they are subscribed to.

 DSpace 7.0 does not yet support

Email Subscriptions are not available in DSpace 7.0. They are scheduled to be restored in a later 7.x release
(currently 7.2); see DSpace Release 7.0 Status [176].

Adding new subscriptions

Adding new subscriptions is only available to users who are logged in.
In the User Interface, new subscriptions are added on the user's Profile page.
In the JSP User Interface, a specific dialog "Receive Email Updates" is available from the dropdown in the top right
corner.

176 https://wiki.lyrasis.org/display/DSPACE/DSpace+Release+7.0+Status


System configuration for sending out daily emails

To send out the subscription emails, you need to invoke the sub-daily script from the DSpace command launcher. It
is advised to set up this script as a scheduled task using cron (see page 125).
This script can be run with the parameter -t for testing purposes. When this parameter is passed, the log level is set to
DEBUG so that more diagnostic information is written to the DSpace log file.

4.1.4 Request a Copy

• Introduction(see page 126)
• Requesting a copy using the User Interface(see page 126)
• (Optional) Requesting a copy with Help Desk workflow(see page 128)
• Email templates(see page 131)
• Configuration parameters(see page 132)
• Selecting Request a Copy strategy via Spring Configuration(see page 133)

4.1.4.1 Introduction

 DSpace 7.0 does not yet support

Request a Copy is not available in DSpace 7.0. It is scheduled to be restored in a later 7.x release
(currently 7.1); see DSpace Release 7.0 Status [177].

The request a copy functionality was added to DSpace as a measure to facilitate access in those cases when
uploaded content cannot be openly shared with the entire world immediately after submission into DSpace. It
gives users an efficient way to request access from the original submitter of the item, who can approve this access
with the click of a button. This practice complies with most applicable policies as the submitter interacts directly
with the requester on a case-by-case basis.

4.1.4.2 Requesting a copy using the User Interface

Users can request a copy by clicking the file thumbnail or the blue lock symbol displayed on files that are restricted
to them.

177 https://wiki.lyrasis.org/display/DSPACE/DSpace+Release+7.0+Status


The request form asks the user for their name, email address, and a message in which the reason for requesting
access can be entered.

After clicking request copy at the bottom of this form, the original submitter of the item will receive an email
containing the details of the request. The email also contains a link with a token that brings the original submitter
to a page where he or she can either grant or reject access. If the original submitter cannot evaluate the request, he
or she can forward this email to the right person, who can use the link containing the token without having to log
into DSpace.

Each of these buttons registers the choice of the submitter, displaying the following form in which an additional
reason for granting or rejecting the access can be added.


After hitting send, the contents of this form will be sent together with the associated files to the email address of the
requester. If access is rejected, only the reason will be sent to the requester.
After responding positively to a request for a copy, the person who approved it is presented with an optional form to
ask the repository administrator to alter the access rights of the item, allowing unrestricted open access to
everyone.

4.1.4.3 (Optional) Requesting a copy with Help Desk workflow

The optional Request Item with HelpDesk intermediary workflow is geared towards having your repository support
staff act as a helpdesk that receives all incoming RequestItem requests and then processes them. It adds the option
"Initial Reply to Requestor", which lets the requestor know that their request is being worked on, and the option
"Author Permission Request", which allows the helpdesk to email the author of the document. This is useful because
not all documents are deposited by their author, and support staff may need to track the author down when DSpace
does not have their current email address.


Initial Reply to Requester


The Author Permission Request includes information about the original request (requester name, requester email,
requester's reason for requesting). The author/submitter's name and email address will be pre-populated in the
form from the submitter, but the email address and author name are editable, as the submitters of content to
DSpace aren't always the authors.


4.1.4.4 Email templates

Most of the email templates used by Request a Copy are treated just like other email templates in DSpace. The
templates can be found in the /config/emails directory and can be altered just by changing the contents and
restarting Tomcat.


request_item.admin    template for the message that will be sent to the administrator of the repository, after the
                      original submitter requests to have the permissions changed for this item.
request_item.author   template for the message that will be sent to the original submitter of an item with the
                      request for a copy.

The templates for emails that the requester receives, which may have been customized by the approver in the
aforementioned dialog, are not managed as separate email template files. Their defaults are stored in the
Messages.properties file under the following keys:

itemRequest.response.body.approve           Default message for informing the requester of the approval
itemRequest.response.body.reject            Default message for informing the requester of the rejection
itemRequest.response.body.contactAuthor     Default message for the helpdesk to contact the author
itemRequest.response.body.contactRequester  Default message for the helpdesk to contact the requester

4.1.4.5 Configuration parameters

Request a copy is enabled by default. The following configuration parameters in dspace.cfg relate to Request a Copy:

Property: request.item.type
Example Value: request.item.type = all
Informational Note: This parameter manages who can file a request for an item. The parameter is optional. When it
is empty or commented out, request a copy is disabled across the entire repository. When set to all, any user can
file a request for a copy. When set to logged, only registered users can file a request for a copy.

Property: mail.helpdesk
Example Value: mail.helpdesk = [email protected]
Informational Note: The email address assigned to this parameter will receive the emails both for granting or
rejecting request a copy requests, as well as requests to change item policies. This parameter is optional. If it is
empty or commented out, it will default to mail.admin.
WARNING: This setting is only utilized if the RequestItemHelpdeskStrategy bean is enabled in
[dspace]/config/spring/api/requestitem.xml (see below)

Property: request.item.helpdesk.override
Example Value: request.item.helpdesk.override = true
Informational Note: Should all Request Copy emails go to the mail.helpdesk instead of the item submitter? Default
is false, which sends Item Requests to the item submitter.
WARNING: This setting is only utilized if the RequestItemHelpdeskStrategy bean is enabled in
[dspace]/config/spring/api/requestitem.xml (see below)
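
How these three settings combine can be sketched as below. This is an illustrative reading of the notes above, not DSpace code: the class RequestCopySketch and both method names are hypothetical, the example.org addresses are placeholders, treating an unrecognised request.item.type value as "disabled" is an assumption, and the recipient logic applies only when the helpdesk strategy is enabled.

```java
// Hypothetical sketch (not DSpace code) of the configuration semantics described above.
final class RequestCopySketch {

    // requestItemType mirrors request.item.type: empty or missing disables the feature,
    // "all" lets anyone file a request, "logged" requires a logged-in user.
    static boolean mayRequest(String requestItemType, boolean loggedIn) {
        if (requestItemType == null || requestItemType.trim().isEmpty()) {
            return false;
        }
        switch (requestItemType.trim()) {
            case "all":
                return true;
            case "logged":
                return loggedIn;
            default:
                return false; // unrecognised values are treated as disabled (assumption)
        }
    }

    // Mirrors request.item.helpdesk.override and mail.helpdesk, with the documented
    // fallback of mail.helpdesk to mail.admin when it is empty.
    static String recipient(boolean helpdeskOverride, String mailHelpdesk,
                            String mailAdmin, String submitterEmail) {
        if (!helpdeskOverride) {
            return submitterEmail; // default: requests go to the item submitter
        }
        boolean helpdeskSet = mailHelpdesk != null && !mailHelpdesk.trim().isEmpty();
        return helpdeskSet ? mailHelpdesk : mailAdmin;
    }

    public static void main(String[] args) {
        System.out.println(mayRequest("logged", false));
        System.out.println(recipient(true, "", "admin@example.org", "submitter@example.org"));
    }
}
```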

4.1.4.6 Selecting Request a Copy strategy via Spring Configuration

The process that DSpace uses to determine who is the recipient of the Item Request is configurable in this Spring
file: [dspace]/config/spring/api/requestitem.xml

By default the RequestItemMetadataStrategy is enabled, but falls back to the Item Submitter eperson's name
and email. You can configure the RequestItemMetadataStrategy to load the author's name and email address if
you set that information into an item metadata field. For example:


<bean class="org.dspace.app.requestitem.RequestItemMetadataStrategy"
      id="org.dspace.app.requestitem.RequestItemAuthorExtractor">
    <!--
    Uncomment these properties if you want to look up the email and the name of the author to contact
    for request copy in the item metadata.
    If you do not configure them, or if the requested item does not have these metadata, the submitter data are
    used as a fallback.

    <property name="emailMetadata" value="schema.element.qualifier" />

    -->
</bean>

Another common request strategy is to use a single Helpdesk email address to receive all of these requests (see
corresponding helpdesk configs in dspace.cfg above). If you wish to use the Helpdesk Strategy, you must first
comment out the default RequestItemMetadataStrategy bean and then uncomment the RequestItemHelpdeskStrategy
bean in the same file.

4.2 Configurable Entities

• Introduction(see page 135)
• Default Entity Models(see page 135)
• Research Entities(see page 135)
• Journals(see page 136)
• Enabling Entities(see page 136)
• 1. Configure your entity model (optionally)(see page 137)
• 2. Import entity model into the database(see page 137)
• 3. Configuration of community/collection list for Entity types(see page 137)
• 4. Configure Submission Forms for each Entity type(see page 139)
• 5. Configure Workflow for each Entity type (optionally)(see page 139)
• 6. Configure Virtual Metadata to display for related Entities (optionally)(see page 139)
• Designing your own Entity model(see page 140)
• Thinking about the object model(see page 141)
• Configuring the object model(see page 141)
• Configuring the metadata fields(see page 142)
• Configuring the item display pages(see page 142)
• Configuring virtual metadata(see page 142)
• Configuring discovery(see page 142)
• Additional Technical Details(see page 142)


4.2.1 Introduction
DSpace users have expressed the need for DSpace to be able to provide more support for different types of digital
objects related to open access publications, such as authors/author profiles, data sets etc. Configurable Entities are
designed to meet that need.
In DSpace, an Entity is a special type of Item which often has Relationships to other Entities. Breaking it down
with more details...
• Entity: Every Entity is an Item.
• This means they must belong to a Collection, just like a normal Item. (Community & Collection
objects are unchanged and unaffected by Entities.)
• Normal Items are still the "default" Item, and they are unchanged. So, not every Item is an Entity.
• Because Entities are all Items, they are immediately usable in submission/workflow process, batch
import/export, OAI-PMH, etc.
• Entity (or Item) Type: Entities all have a "dspace.entity.type" metadata field which defines their Entity/Item
"type". For example, this type may be "Person", "Project", "Publication", "Journal", etc. It's highly visible
within the User Interface as a label.
• Relationships: Based on that "type", an Entity may be related to other Entities via a Relationship. One
Entity type may support several relationship types at once. Examples of relationship types include
"isPersonOfProject" or "isPublicationOfAuthor". These relationship types are named based on the Entity
"type" (as you can likely tell). Relationships also appear on Entities as metadata using the "relation"
schema.
• Virtual Metadata: Entities of different types may also have customized visualizations in the User Interface.
These visualizations may also dynamically pull in metadata from related Entities. For example, a
Publication entity may be displayed in the User Interface with an author name dynamically pulled in from a
related Person entity. The metadata "appears" as though it is part of the Entity you are viewing, but it is
dynamically pulled via the Relationship.
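
The concepts in the list above can be pictured with a small, self-contained sketch. It is purely illustrative and does not reflect the DSpace data model or API: the class EntitySketch, its fields and methods, the relationship label "isAuthorOfPublication" and the metadata key "name" are all assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch (not the DSpace data model): an Entity is an item with a
// type, its own metadata, and named relationships to other entities.
final class EntitySketch {
    final String type; // e.g. "Publication" or "Person"
    final Map<String, String> metadata = new HashMap<>();
    final Map<String, List<EntitySketch>> relations = new HashMap<>();

    EntitySketch(String type) {
        this.type = type;
    }

    void relate(String relationshipType, EntitySketch other) {
        relations.computeIfAbsent(relationshipType, k -> new ArrayList<>()).add(other);
    }

    // "Virtual" values are read from related entities at display time rather
    // than being stored on this entity.
    List<String> virtualValues(String relationshipType, String field) {
        List<String> values = new ArrayList<>();
        for (EntitySketch related : relations.getOrDefault(relationshipType, new ArrayList<>())) {
            String value = related.metadata.get(field);
            if (value != null) {
                values.add(value);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        EntitySketch person = new EntitySketch("Person");
        person.metadata.put("name", "Jane Q. Author");
        EntitySketch publication = new EntitySketch("Publication");
        publication.relate("isAuthorOfPublication", person);
        System.out.println(publication.virtualValues("isAuthorOfPublication", "name"));
    }
}
```

In this picture, changing the name stored on the Person changes what every related Publication displays, because the value is looked up through the relationship instead of being copied.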
Entities and their Relationships are also completely configurable. DSpace provides some sample models out of the
box, which you can use directly or adapt as needed.
The Entity model also has similarities with the Portland Common Data Model (PCDM) [178], with an Entity roughly
mapping to a "pcdm:Object" and existing Communities and Collections roughly mapping to a "pcdm:Collection".
However, at this time DSpace Entities concentrate more on building a graph structure of relationships, instead of a
tree structure.

4.2.2 Default Entity Models

DSpace currently comes with the following Entity models, both of which are defined in
[dspace]/config/entities/relationship-types.xml. These Entity models are not used by default, but may be
enabled as described below.

4.2.2.1 Research Entities

Research Entities include Person, OrgUnit, Project and Publication. They allow you to create author profiles
(Person) in DSpace, and relate those people to their department(s) (OrgUnit), grant project(s) (Project) and works
(Publication).

178 https://github.com/duraspace/pcdm/wiki


• Each publication can link to projects, people and org units
• Each person can link to projects, publications and org units
• Each project can link to publications, people and org units
• Each org unit can link to projects, people and publications

4.2.2.2 Journals
Journal Entities include Journal, Journal Volume, Journal Issue and Publication (article). They allow you to
represent a Journal hierarchy more easily within DSpace, starting at the overall Journal, consisting of multiple
Volumes, and each Volume containing multiple Issues. Issues then link to all articles (Publication) which were a
part of that journal issue.
NOTE: This model includes the same "Publication" entity as the Research Entities model described above. This
Entity overlap allows you to link an article (Publication) both to its author (Person) as well as the Journal Issue it
appeared in.

4.2.3 Enabling Entities

By default, Entities are not used in DSpace. However, as described above, several models are available out of the
box and may optionally be enabled.

Keep in mind, there are a few DSpace import/export features that do not yet support Entities in DSpace
7.0. These will be coming in future 7.x releases. See DSpace Release 7.0 Status179 for prioritization
information, etc.
• AIP Backup and Restore (see page 411) does not fully support entity types or relationships. In other
words, Entities are only represented as normal Items in AIPs.
• Importing and Exporting Items via Simple Archive Format180 does not fully support entity types or
relationships. In other words, Entities are only represented as normal Items in SAF. (Note: early
work to bring this support has already begun in https://github.com/DSpace/DSpace/pull/3322)
• SWORDv1 Server (see page 216) and SWORDv2 Server (see page 202) do not yet support Entity or
relationship creation.

179 https://wiki.lyrasis.org/display/DSPACE/DSpace+Release+7.0+Status
180 https://wiki.lyrasis.org/display/DSDOC6x/Importing+and+Exporting+Items+via+Simple+Archive+Format


4.2.3.1 1. Configure your entity model (optionally)

As described above, DSpace provides two default entity models defined in
[dspace]/config/entities/relationship-types.xml. These models may be used as-is, or modified.
You can also design your own model from scratch (see "Designing your own model" section below). So, feel free to
start by modifying relationship-types.xml, or creating your own model based on the relationship-types.dtd.

4.2.3.2 2. Import entity model into the database

In order to enable a defined entity model, it MUST be imported into the DSpace database. This is achieved by using
the "initialize-entities" script. The example below will import the "out-of-the-box" entity models into your DSpace
installation:

# The -f option requires a full path to an Entities model configuration file.

[dspace]/bin/dspace initialize-entities -f [dspace]/config/entities/relationship-types.xml

If an Entity (of same type name) already exists, it will be updated with any new relationships defined in
relationship-types.xml.
If an Entity (of same type name) doesn't exist, the new Entity type will be created along with its relationships
defined in relationship-types.xml.
Once imported into the Database, the overall structure is as follows:
• All valid Entity Types are stored in the "entity_type" database table.
• All Relationship type definitions are stored in the "relationship_type" database table.
• All Relationships between entities get stored in the "relationship" table.
• Entities themselves are stored alongside Items in the 'item' table. Every Entity must have a
"dspace.entity.type" metadata field whose value is a valid Entity Type (from the "entity_type" table).
Keep in mind, your currently enabled Entity model is defined in your database, and NOT in the
"relationship-types.xml". Anytime you want to update your data model, you'd update/create a configuration
(like relationship-types.xml) and re-run the "initialize-entities" command.

4.2.3.3 3. Configuration of community/collection list for Entity types

Because all Entities are Items, they MUST belong to a Collection. Therefore, the easiest way to create different
submission forms per Entity type (e.g. Person, Project, Journal, Publication, etc) is to ensure you create a Collection
for each Entity Type (as each Collection can have a custom Submission Form).
1. Create at least one Collection for each Entity Type needing a custom Submission form. For example, a
Collection for "Person" entities, and a separate one for "Publication" entities.
2. Edit the Collection, and create a Template Item (which is used for all Entities/Items submitted to that
Collection) from the "Edit Metadata" tab
a. In the Template Item, add a single metadata field "dspace.entity.type". Give it a value matching the
Entity type (e.g. Publication, Person, Project, OrgUnit, Journal, JournalVolume, JournalIssue). This
value IS CASE SENSITIVE and it MUST match the Entity type name defined in relationship-types.xml.
i. This single metadata field will ensure that every Item submitted to this collection is
automatically assigned that Entity type. So, it ties this Collection to that type of Entity.


3. In the Edit Collection page, switch to the "Assign Roles" tab and create a "Submitters" group. Add any
people who should be allowed to submit/create this new Entity type.
a. If you only want Administrators to create this Entity type, you can skip this step. Administrators can
submit to any Collection.
4. If you want to hide this Collection, you can choose to only make it visible to that same Submitters group (or
Administrators). This does NOT hide the Entities from search or browse, but it will hide the Collection itself.
a. In the Edit Collection page, switch to the "Authorizations" tab.
b. Add a new Authorization of TYPE_CUSTOM, restricting "READ" to the Submitters group created above
(or Administrators if there is no Submitters group). You can also add multiple READ policies as
needed. WARNING: The Submitters group MUST have READ privileges to be able to submit/create
new Entities.
c. Remove the default READ policy giving Anonymous permissions.
d. Assuming you want the Entities to still be publicly available, make sure the DEFAULT_ITEM_READ
policy is set to "Anonymous"!
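The case-sensitivity requirement on "dspace.entity.type" in step 2 can be illustrated with a small Java sketch. This is not DSpace code: the class EntityTypeCheck and its method are hypothetical, and the set of names simply repeats the out-of-the-box Entity types listed above. It only shows why a value such as "person" would not tie a Collection to the "Person" type.

```java
import java.util.Set;

// Hypothetical helper for illustration only; not part of the DSpace API.
final class EntityTypeCheck {
    // Entity type names as listed in the Template Item step (case-sensitive).
    static final Set<String> KNOWN_TYPES = Set.of(
        "Publication", "Person", "Project", "OrgUnit",
        "Journal", "JournalVolume", "JournalIssue");

    // Returns true only on an exact, case-sensitive match.
    static boolean isValidEntityType(String value) {
        return value != null && KNOWN_TYPES.contains(value);
    }

    public static void main(String[] args) {
        System.out.println(isValidEntityType("Person")); // exact match
        System.out.println(isValidEntityType("person")); // wrong case, rejected
    }
}
```

If you define your own model, the set of valid names would instead come from your own relationship-types.xml.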
How you organize your Entity Types into Collections is up to you. You can create a single Collection for
all Entities of that type (e.g. an "Author Profiles" collection could be where all "Person" Entities are submitted/
stored). Or, you could create many Collections for each Entity Type (e.g. each Department in your University may
have its own Community, and underneath have a "Staff Profiles" Collection where all "Person" Entities for that
department are submitted/stored). A few example structures are shown below.
Example Structure based on the departments:
• Department of Architecture
  • Building Technology Program
  • Theses - Department of Architecture
• Department of Biology
  • Theses - Biology
• People
• Projects
OR
• Department of Architecture
  • Building Technology Program
  • Theses - Department of Architecture
  • People in Department of Architecture
  • Projects in Department of Architecture
• Department of Biology
  • Theses - Biology
  • People in Department of Biology
  • Projects in Department of Biology
Example Structure based on the publication type:
• Books
  • Book Chapter
  • Edited Volume
  • Monograph
• Theses
  • Bachelor Thesis
  • Doctoral Thesis
  • Habilitation Thesis
  • Master Thesis
• People
• Projects

4.2.3.4 4. Configure Submission Forms for each Entity type

You should have already created Entity-specific Collections in the previous step. Now, we just need to map those
Collections to Submission processes specific to each Entity.
On the backend, you will now need to modify the [dspace]/config/item-submission.xml to "map" this
Collection (or Collections) to the submission process for this Entity type.
• DSpace comes with sample submission forms for each Entity type.
• The sample <submission-process> is defined in item-submission.xml and named based on the
Entity type (e.g. Publication, Person, Project, etc).
• The metadata fields captured for each Entity are defined in a custom step in submission-forms.xml,
and named in the format "[entityType]Step" (where the entity type is camelcased). For
example: "publicationStep", "personStep", "projectStep".
• Optionally, modify those sample submission forms. See Submission User Interface (see page 260) for hints/tips
on customizing the item-submission.xml or submission-forms.xml files.
• Now, in item-submission.xml, map your Collection's handle (findable on the Collection homepage) to the
submission form you want it to use. In the below example, we've mapped a single Collection to each of the
out-of-the-box Entity types.

<name-map collection-handle="123456789/5" submission-name="Publication"/>
<name-map collection-handle="123456789/6" submission-name="Person"/>
<name-map collection-handle="123456789/7" submission-name="Project"/>
<name-map collection-handle="123456789/8" submission-name="OrgUnit"/>
<name-map collection-handle="123456789/28" submission-name="Journal"/>
<name-map collection-handle="123456789/29" submission-name="JournalVolume"/>
<name-map collection-handle="123456789/30" submission-name="JournalIssue"/>

Once your modifications to the submission process are complete, you will need to restart Tomcat (or your
servlet container) to reload the current settings.
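The effect of the name-map entries above can be pictured as a lookup from Collection handle to submission process name. The Java sketch below is an illustration only, not the DSpace implementation: the class and method names are hypothetical, and the fallback entry with handle "default" mapped to "traditional" is an assumption about the rest of item-submission.xml that the text above does not show.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of how name-map entries select a submission process.
final class SubmissionNameMapSketch {
    // Assumed key of a catch-all mapping; not shown in the example above.
    static final String DEFAULT_KEY = "default";

    // Returns the submission name mapped to the handle, or the default mapping.
    static String submissionFor(Map<String, String> nameMap, String collectionHandle) {
        String name = nameMap.get(collectionHandle);
        return name != null ? name : nameMap.get(DEFAULT_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> nameMap = new LinkedHashMap<>();
        nameMap.put(DEFAULT_KEY, "traditional");    // assumed default mapping
        nameMap.put("123456789/5", "Publication");  // from the example above
        nameMap.put("123456789/6", "Person");       // from the example above
        System.out.println(submissionFor(nameMap, "123456789/6"));
        System.out.println(submissionFor(nameMap, "123456789/99")); // unmapped handle
    }
}
```

Under these assumptions, a Collection without its own name-map entry would keep using the default submission process, so only the Entity-specific Collections need to be listed.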

4.2.3.5 5. Configure Workflow for each Entity type (optionally)

The DSpace workflow can be used for reviewing all objects in the Object Model since these objects are all Items, and
separate collections can be used. The workflow used for e.g. a Person Object can be configured to be identical to a
publication, different from a publication, or use no workflow at all.
See Configurable Workflow(see page 250) for more information on configuring workflows per Collection.

4.2.3.6 6. Configure Virtual Metadata to display for related Entities (optionally)

"Virtual Metadata" is metadata that is dynamically determined (at the time of access) based on an Entity's
relationship to other Entities. A basic example is displaying a Person Entity's name in the "dc.contributor.author"
field of a related Publication Entity. That "dc.contributor.author" field doesn't actually exist on the Publication, but
is dynamically added as "virtual metadata" simply because the Publication is linked to the Person (via a
relationship).
Virtual Metadata is configurable for all Entities and all relationships. DSpace comes with default settings for its
default Entity model, and those can be found in [dspace]/config/spring/api/virtual-metadata.xml. In
that Spring Bean configuration file, you'll find a map of each relationship type to a metadata field & its value. Here's
a summary of how it works:
• The "org.dspace.content.virtual.VirtualMetadataPopulator" bean maps every Relationship type (from
relationship-types.xml) to a <util:map> definition (of a given ID) also in the virtual-metadata.xml

<!-- For example, the isAuthorOfPublication relationship is linked to a map of ID "isAuthorOfPublicationMap" -->
<entry key="isAuthorOfPublication" value-ref="isAuthorOfPublicationMap"/>

• That <util:map> definition defines which DSpace metadata field will store the virtual metadata. It also links
to the bean which will dynamically define the value of this metadata field.

<!-- In this example, isAuthorOfPublication will be displayed in the "dc.contributor.author" field -->
<util:map id="isAuthorOfPublicationMap">
    <entry key="dc.contributor.author" value-ref="publicationAuthor_author"/>
</util:map>

• A bean of that ID then defines the value of the field, based on the related Entity. In this example, these fields
are pulled from the related Person entity and concatenated. If the Person has "person.familyName=Jones"
and "person.givenName=Jane", then the value of "dc.contributor.author" on the related Publication will be
dynamically set to "Jones, Jane".

<bean class="org.dspace.content.virtual.Concatenate" id="publicationAuthor_author">
    <property name="fields">
        <util:list>
            <value>person.familyName</value>
            <value>person.givenName</value>
            <value>organization.legalName</value>
        </util:list>
    </property>
    <property name="separator">
        <value>, </value>
    </property>
    <property name="useForPlace" value="true"/>
    <property name="populateWithNameVariant" value="true"/>
</bean>

If the default Virtual Metadata looks good to you, no changes are needed. If you make any changes, be sure to
restart Tomcat to update the bean definitions.
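To make the "Jones, Jane" example concrete, here is a minimal Java sketch of the concatenation idea. It is not the org.dspace.content.virtual.Concatenate class: the class ConcatenateSketch is hypothetical, it assumes that fields with no value on the related Entity are simply skipped, and it does not model the "useForPlace" or "populateWithNameVariant" properties.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of joining configured fields of a related Entity.
final class ConcatenateSketch {
    // Joins the values of the listed fields, in order, skipping fields
    // that are absent or empty (an assumption made for this sketch).
    static String concatenate(List<String> fields, String separator,
                              Map<String, String> relatedMetadata) {
        List<String> present = new ArrayList<>();
        for (String field : fields) {
            String value = relatedMetadata.get(field);
            if (value != null && !value.isEmpty()) {
                present.add(value);
            }
        }
        return String.join(separator, present);
    }

    public static void main(String[] args) {
        List<String> fields = List.of(
            "person.familyName", "person.givenName", "organization.legalName");
        Map<String, String> person = Map.of(
            "person.familyName", "Jones",
            "person.givenName", "Jane");
        // Value that would appear as dc.contributor.author on the Publication.
        System.out.println(concatenate(fields, ", ", person)); // prints "Jones, Jane"
    }
}
```

The same field list also names "organization.legalName", which suggests the bean can serve authors that are OrgUnits rather than Persons; in this sketch such an Entity would yield just its legal name.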

4.2.4 Designing your own Entity model

When using a different entities model, the new model has to be configured and loaded into your repository.

Using DSpace – 140

DSpace 7.x Documentation – DSpace 7.x Documentation

4.2.4.1 Thinking about the object model

First step: identify the entity types
• Which types of objects would you want to create items for: e.g. Person, Publication, JournalVolume
• Be careful not to confuse a type with a relationship. A Person is a type, an author is a relationship between
the publication and the person
Second step: identify the relationship types
• Which relationship types would you want to create between the entity items from the previous step: e.g.
isAuthorOfPublication, isEditorOfPublication, isProjectOfPublication, isOrgUnitOfPerson,
isJournalIssueOfPublication
• Multiple relationships between the same 2 types can be created: isAuthorOfPublication,
isEditorOfPublication
• Relationships are automatically bidirectional, so no need to worry about whether you want to display the
authors in a publication or the publications of an author
Third step: visualize your model
• By creating a drawing of your model, you’ll be able to quickly verify whether anything is missing

4.2.4.2 Configuring the object model

Configure the model in relationship-types.xml
• Similar to the default relationship-types.xml181, configure a relationship type per connection between 2
entity types
• Include the 2 entity type names which are being connected.
• Determine a clear and unambiguous name for the relation in both directions
• Optionally: determine the cardinality (min/max occurrences) for the relationships
• Optionally: determine default behavior for copying metadata if the relationship is deleted
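The checklist above can be pictured as one small record per connection: two entity type names plus one relation name for each direction. The Java sketch below is a hypothetical model for illustration, not the DSpace RelationshipType class. The field names leftType, rightType, leftwardType and rightwardType follow the element names as recalled from the default relationship-types.xml and should be checked against that file; which name is shown on which side is likewise an assumption.

```java
// Hypothetical model of one relationship type definition; for illustration only.
final class RelationshipTypeSketch {
    final String leftType;      // entity type on the left side, e.g. "Publication"
    final String rightType;     // entity type on the right side, e.g. "Person"
    final String leftwardType;  // relation name assumed to be listed on the left entity
    final String rightwardType; // relation name assumed to be listed on the right entity

    RelationshipTypeSketch(String leftType, String rightType,
                           String leftwardType, String rightwardType) {
        this.leftType = leftType;
        this.rightType = rightType;
        this.leftwardType = leftwardType;
        this.rightwardType = rightwardType;
    }

    // Returns the relation name seen from an item of the given entity type,
    // or null if that type does not take part in this relationship type.
    String labelOn(String entityType) {
        if (leftType.equals(entityType)) {
            return leftwardType;
        }
        if (rightType.equals(entityType)) {
            return rightwardType;
        }
        return null;
    }

    public static void main(String[] args) {
        RelationshipTypeSketch authorship = new RelationshipTypeSketch(
            "Publication", "Person", "isAuthorOfPublication", "isPublicationOfAuthor");
        System.out.println(authorship.labelOn("Publication"));
        System.out.println(authorship.labelOn("Person"));
    }
}
```

Writing the model down in this form makes it easy to spot a connection that is missing a name in one direction before you load it with "initialize-entities".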

181 https://github.com/DSpace/DSpace/blob/main/dspace/config/entities/relationship-types.xml


4.2.4.3 Configuring the metadata fields

Determining the metadata fields to use
• Dublin Core works for publications, but not for a Person, JournalVolume, …
• There are many standards which can be easily configured: schema.org182, eurocris, datacite, …
• Pick a schema which suits your needs
Configure the submission forms
• Add a form in submission-forms.xml183 for each entity type, containing the relevant metadata fields
• See also Submission User Interface(see page 260) documentation.
• Configure which relationships to create184

4.2.4.4 Configuring the item display pages

• The metadata configuration is not specific to configurable entities.
• Similar to other customizations to the item display pages, configure in Angular which metadata fields to
display and their label. A template per entity type can be created
• The relationship display is similar to the metadata configuration
• Similar to the metadata configuration: configure in Angular which relationship to display and their label

4.2.4.5 Configuring virtual metadata

• The isAuthorOfPublication relationship can be displayed for the Publication item as dc.contributor.author
• The isOrgUnitOfPerson relationship can be displayed for the Person item as organization.legalName
• This can be configured in virtual-metadata.xml185

4.2.4.6 Configuring discovery

• Configure the discovery facets, filters, sort options, …
• The facets for a Person can be job title, organization, project, …
• The filters for a Person can be person.familyName, person.givenName, …

4.2.4.7 Additional Technical Details

The original Entities design document is available in Google Docs at:
https://docs.google.com/document/d/1wEmHirFzrY3qgGtRr2YBQwGOvH1IuTVGmxDIdnqvwxM/edit
We are working on pulling that information into this Wiki space as a final home, but currently some technical details
exist only in that document.
A talk on Configurable Entities was also presented at DSpace 7 at OR2021186

182 http://schema.org
183 https://github.com/DSpace/DSpace/blob/master/dspace/config/submission-forms.xml
184 https://docs.google.com/document/d/1X0XsppZYOtPtbmq7yXwmu7FbMAfLxxOCONbw0_rl7jY/edit#heading=h.5bu9shx0j942
185 https://github.com/DSpace/DSpace/blob/master/dspace/config/spring/api/virtual-metadata.xml
186 https://wiki.lyrasis.org/display/DSPACE/DSpace+7+at+OR2021


4.3 Curation System

DSpace supports running curation tasks, which are described in this section. DSpace includes several useful tasks
out of the box, but the system is also designed to allow new tasks to be added between releases, both general
purpose tasks that come from the community, and locally written (see page 539) and deployed tasks.

• Tasks(see page 143)

• Activation(see page 143)
• Task Invocation(see page 144)
• On the command line(see page 144)
• In the admin UI(see page 145)
• In workflow(see page 146)
• In arbitrary user code(see page 147)
• Asynchronous (Deferred) Operation(see page 148)
• Task Output and Reporting(see page 148)
• Status Code(see page 148)
• Result String(see page 149)
• Reporting Stream(see page 149)
• Task Properties(see page 149)
• Task Parameters(see page 150)
• Scripted Tasks(see page 151)

4.3.1 Tasks
The goal of the curation system ("CS") is to provide a simple, extensible way to manage routine content operations
on a repository. These operations are known to CS as "tasks", and they can operate on any DSpaceObject (i.e.
subclasses of DSpaceObject), which means the entire Site, Communities, Collections, and Items: the core data
model objects. Tasks may elect to work on only one type of DSpace object, typically an Item, and in this case they
may simply ignore other data types (tasks have the ability to "skip" objects for any reason). The DSpace core
distribution provides a number of useful tasks, but the system is designed to encourage local extension: tasks
can be written for any purpose, and placed in any Java package. This gives DSpace sites the ability to customize the
behavior of their repository without having to alter the DSpace source code, and therefore without having to
manage synchronization with it. What sorts of activities are appropriate for tasks?
Some examples:
• apply a virus scan to item bitstreams (this will be our example below)
• profile a collection based on format types - good for identifying format migrations
• ensure a given set of metadata fields are present in every item, or even that they have particular values
• call a network service to enhance/replace/normalize an item's metadata or content
• ensure all item bitstreams are readable and their checksums agree with the ingest values
Since tasks have access to, and can modify, DSpace content, performing tasks is considered an administrative
function to be available only to knowledgeable collection editors, repository administrators, sysadmins, etc. No
tasks are exposed in the public interfaces.

4.3.2 Activation
For CS to run a task, the code for the task must of course be included with other deployed code (to [dspace]/lib,
WAR, etc) but it must also be declared and given a name. This is done via a configuration property in
[dspace]/config/modules/curate.cfg as follows:


### Task Class implementations
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.NoOpCurationTask = noop
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.ProfileFormats = profileformats
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.RequiredMetadata = requiredmetadata
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.ClamScan = vscan
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.MicrosoftTranslator = translate
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.MetadataValueLinkChecker = checklinks

For each activated task, a key-value pair is added. The key is the fully qualified class name and the value is the
taskname used elsewhere to configure the use of the task, as will be seen below. Note that the curate.cfg
configuration file, while in the config directory, is located under "modules". The intent is that tasks, as well as any
configuration they require, will be optional "add-ons" to the basic system configuration. Adding or removing tasks
has no impact on dspace.cfg.
For many tasks, this activation configuration is all that will be required to use it. But for others, the task needs
specific configuration itself. A concrete example is described below, but note that these task-specific configuration
property files also reside in [dspace]/config/modules.
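Each activation entry pairs a fully qualified class name with a taskname. As a rough illustration of that pairing (not the DSpace plugin service itself), the hypothetical Java sketch below splits values of the form "class = taskname" into a lookup table keyed by taskname, which is the name later used on the command line and in the UI.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for illustration; DSpace resolves these entries itself.
final class TaskRegistrySketch {
    // Turns values like "org.example.MyTask = mytask" into taskname -> class name.
    static Map<String, String> parse(List<String> values) {
        Map<String, String> classByTaskName = new LinkedHashMap<>();
        for (String value : values) {
            int eq = value.indexOf('=');
            if (eq < 0) {
                continue; // ignore malformed entries in this sketch
            }
            String className = value.substring(0, eq).trim();
            String taskName = value.substring(eq + 1).trim();
            classByTaskName.put(taskName, className);
        }
        return classByTaskName;
    }

    public static void main(String[] args) {
        Map<String, String> registry = parse(List.of(
            "org.dspace.ctask.general.NoOpCurationTask = noop",
            "org.dspace.ctask.general.ClamScan = vscan"));
        // Class that the taskname "vscan" refers to.
        System.out.println(registry.get("vscan"));
    }
}
```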

4.3.3 Task Invocation

Tasks are invoked using CS framework classes that manage a few details (to be described below). This
invocation can occur wherever needed, and CS offers several ways to invoke tasks "out of the box":

4.3.3.1 On the command line

A simple tool "CurationCli" provides access to CS via the command line. This tool bears the name "curate" in the
DSpace launcher. For example, to perform a virus check on the collection with handle "123456789/4":

[dspace]/bin/dspace curate -t vscan -i 123456789/4

The complete list of options:

option           meaning
-t taskname      name of task to perform.
-T filename      name of file containing a list of tasknames to be performed.
-e epersonID     (email address or netid) will be the superuser if unspecified.
-i identifier    ID of object to curate. May be (1) a Handle, (2) a workflow ID, or (3) 'all' to operate on the whole repository.
-q queue         name of queue to process. -i and -q are mutually exclusive.
-l limit         maximum number of objects in Context cache. If absent, unlimited objects may be added.
-s scope         declare a scope for database transactions. Scope must be: (1) 'open' (default value), (2) 'curation' or (3) 'object'.
-v               emit verbose output
-r filename      emit reporting to the named file. '-r -' writes reporting to standard out. If not specified, report is discarded silently.
-p name=value    set a runtime task parameter name to the value value. May be repeated as needed. See "Task parameters" below.

As with other command-line tools, these invocations could be placed in a cron table and run on a fixed schedule, or
run on demand by an administrator.

4.3.3.2 In the admin UI

In the UI, there are several ways to execute configured Curation Tasks:
1. From the "Curate" tab/button that appears on each "Edit Community/Collection/Item" page: this tab
allows an Administrator, Community Administrator or Collection Administrator to run a Curation Task on
that particular Community, Collection or Item. When running a task on a Community or Collection, that task
will also execute on all its child objects, unless the Task itself states otherwise (e.g. running a task on a
Collection will also run it across all Items within that Collection).
• NOTE: Community Administrators and Collection Administrators can only run Curation Tasks on the
Community or Collection which they administer, along with any child objects of that Community or
Collection. For example, a Collection Administrator can run a task on that specific Collection, or on
any of the Items within that Collection.
2. From the Administrator's "Curation Tasks" page: This option is only available to DSpace Administrators,
and appears in the Administrative side-menu. This page allows an Administrator to run a Curation Task
across a single object, or all objects within the entire DSpace site.
• In order to run a task from this interface, you must enter in the handle for the DSpace object. To run a
task site-wide, you can use the handle: [your-handle-prefix]/0
Each of the above pages exposes a drop-down list of configured tasks, with a button to 'perform' the task, or queue
it for later operation (see section below). Not all activated tasks need appear in the Curate tab - you filter them by
means of a configuration property. This property also permits you to assign to the task a more user-friendly name
than the PluginManager taskname. The property resides in [dspace]/config/modules/curate.cfg:

curate.ui.tasknames = profileformats = Profile Bitstream Formats

curate.ui.tasknames = requiredmetadata = Check for Required Metadata

When a task is selected from the drop-down list and performed, the tab displays both a phrase interpreting the
"status code" of the task execution, and the "result" message if any has been defined. When the task has been
queued, an acknowledgement appears instead. You may configure the words used for status codes in curate.cfg (for
clarity, language localization, etc):


curate.ui.statusmessages = -3 = Unknown Task

curate.ui.statusmessages = -2 = No Status Set
curate.ui.statusmessages = -1 = Error
curate.ui.statusmessages = 0 = Success
curate.ui.statusmessages = 1 = Fail
curate.ui.statusmessages = 2 = Skip
curate.ui.statusmessages = other = Invalid Status

Report output from tasks run in this way is collected by configuring a Reporter plugin. You must have exactly one
Reporter configured. The default is to use the FileReporter, which writes a single report of the output of all tasks in
the run over all of the selected objects, to a file in the reports directory (configured as report.dir). See
[dspace]/config/modules/submission-configuration.cfg for the value of
plugin.single.org.dspace.curate.Reporter. Other Reporter implementations are provided, or you may supply your own.
As the number of tasks configured for a system grows, a simple drop-down list of all tasks may become too
cluttered or large. DSpace 1.8+ provides a way to address this issue, known as task groups. A task group is a simple
collection of tasks that the Admin UI will display in a separate drop-down list. You may define as many or as few
groups as you please. If no groups are defined, then all tasks that are listed in the ui.tasknames property will appear
in a single drop-down list. If at least one group is defined, then the admin UI will display two drop-down lists. The
first is the list of task groups, and the second is the list of task names associated with the selected group. A few key
points to keep in mind when setting up task groups:
• a task can appear in more than one group if desired
• tasks that belong to no group are invisible to the admin UI (but of course available in other contexts of use)
The configuration of groups follows the same simple pattern as tasks, using properties in [dspace]/config/
modules/curate.cfg. The group is assigned a simple logical name, but also a localizable name that appears in
the UI. For example:

# ui.taskgroups contains the list of defined groups, together with a pretty name for UI display
curate.ui.taskgroups = replication = Backup and Restoration Tasks
curate.ui.taskgroups = integrity = Metadata Integrity Tasks
.....
# each group membership list is a separate property, whose value is a comma-separated list of logical task names
curate.ui.taskgroup.integrity = profileformats, requiredmetadata
....
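The drop-down behavior described for task groups can be sketched as follows. This is a hypothetical Java illustration of the rules stated above, not the admin UI code: with no groups defined, every task named in ui.tasknames is offered; once at least one group exists, only the members of the selected group are offered, so a task in no group is never shown. How the real UI treats a group member that is missing from ui.tasknames is not covered here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the task drop-down rules; not the actual UI code.
final class TaskGroupSketch {
    // uiTaskNames: taskname -> display label (the ui.tasknames property).
    // groups: group name -> member tasknames (the ui.taskgroup.* properties).
    static List<String> visibleTasks(Map<String, String> uiTaskNames,
                                     Map<String, List<String>> groups,
                                     String selectedGroup) {
        if (groups.isEmpty()) {
            // No groups defined: a single list with every configured task.
            return new ArrayList<>(uiTaskNames.keySet());
        }
        // Groups defined: only members of the selected group are listed.
        return groups.getOrDefault(selectedGroup, List.of());
    }

    public static void main(String[] args) {
        Map<String, String> names = new LinkedHashMap<>();
        names.put("profileformats", "Profile Bitstream Formats");
        names.put("requiredmetadata", "Check for Required Metadata");
        names.put("vscan", "Scan for Viruses"); // label invented for this sketch

        Map<String, List<String>> groups = new LinkedHashMap<>();
        groups.put("integrity", List.of("profileformats", "requiredmetadata"));

        // "vscan" belongs to no group, so it is not offered here.
        System.out.println(visibleTasks(names, groups, "integrity"));
    }
}
```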

4.3.3.3 In workflow
CS provides the ability to attach any number of tasks to standard DSpace workflows. Using a configuration file
[dspace]/config/workflow-curation.xml, you can declaratively (without coding) wire tasks to any step in a
workflow. An example:


<taskset-map>
<mapping collection-handle="default" taskset="cautious" />
</taskset-map>
<tasksets>
<taskset name="cautious">
<flowstep name="step1">
<task name="vscan">
<workflow>reject</workflow>
<notify on="fail">$flowgroup</notify>
<notify on="fail">$colladmin</notify>
<notify on="error">$siteadmin</notify>
</task>
</flowstep>
</taskset>
</tasksets>

This markup would cause a virus scan to occur during step one of workflow for any collection, and automatically
reject any submissions with infected files. It would further notify (via email) both the reviewers (step 1 group), and
the collection administrators, if either of these are defined. If it could not perform the scan, the site administrator
would be notified.
The notifications use the same procedures that other workflow notifications do - namely email. There is a new
email template defined for curation task use: [dspace]/config/emails/flowtask_notify. This may be
language-localized or otherwise modified like any other email template.
Tasks wired in this way are normally performed as soon as the workflow step is entered, and the outcome action
(defined by the 'workflow' element) immediately follows. It is also possible to delay the performance of the task -
which will ensure a responsive system - by queuing the task instead of directly performing it:

...
<taskset name="cautious">
<flowstep name="step1" queue="workflow">
...

This attribute (which must always follow the "name" attribute in the flowstep element), will cause all tasks
associated with the step to be placed on the queue named "workflow" (or any queue you wish to use, of course),
and further has the effect of suspending the workflow. When the queue is emptied (meaning all tasks in it
performed), then the workflow is restarted. Each workflow step may be separately configured.
Like configurable submission, you can assign these task rules per collection, as well as having a default for any
collection.
As with task invocation from the administrative UI, workflow tasks need to have a Reporter configured in
submission-configuration.cfg.

4.3.3.4 In arbitrary user code

If these pre-defined ways are not sufficient, you can of course manage curation directly in your code. You would use
the CS helper classes. For example:


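// Resolve the Collection by its handle. DSpace 7 uses HandleService (obtained from
// HandleServiceFactory) in place of the older static HandleManager class.
HandleService handleService = HandleServiceFactory.getInstance().getHandleService();
Collection coll = (Collection) handleService.resolveToObject(context, "123456789/4");
Curator curator = new Curator();
// Send report output to standard out
curator.setReporter(System.out);
// Add the task by its configured taskname, then run it on the Collection
curator.addTask("vscan").curate(coll);
System.out.println("Result: " + curator.getResult("vscan"));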

would do approximately what the command line invocation did. The method "curate" just performs all the tasks
configured (you can add multiple tasks to a curator).
The above directs report output to standard out. Any class which implements Appendable may be set as the
reporter class.

4.3.4 Asynchronous (Deferred) Operation

Because some tasks may consume a fair amount of time, it may not be desirable to run them in an interactive
context. CS provides a simple API and means to defer task execution, by a queuing system. Thus, using the previous
example:

Curator curator = new Curator();

curator.addTask("vscan").queue(context, "monthly", "123456789/4");

would place a request on a named queue "monthly" to virus scan the collection. To read (and process) the queue,
we could, for example, use the command-line tool:

[dspace]/bin/dspace curate -q monthly

We could also read the queue programmatically. Any number of queues can be defined and used as needed.
In the administrative UI curation "widget", a task can either be performed immediately or placed on a queue for
later processing.

4.3.5 Task Output and Reporting

Few assumptions are made by CS about what the 'outcome' of a task may be (if any): it could, for example, produce a
report to a temporary file, or it could modify DSpace content silently. But the CS runtime does provide a few pieces of
information whenever a task is performed:

4.3.5.1 Status Code

This was mentioned above. This is returned to CS whenever a task is called. The complete list of values:

-3 NOTASK - CS could not find the requested task
-2 UNSET - task did not return a status code because it has not yet run
-1 ERROR - task could not be performed
0 SUCCESS - task performed successfully
1 FAIL - task performed, but failed
2 SKIP - task not performed due to object not being eligible
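
As a rough illustration, code that consumes these status codes might translate them into named constants as sketched below. The class StatusCodeSketch, its nested Status enum and the fromCode method are hypothetical names chosen for this example and are not part of the DSpace API; only the numeric values and their meanings are taken from the list above.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the DSpace API: it only mirrors the list above.
class StatusCodeSketch {

    enum Status {
        NOTASK(-3, "CS could not find the requested task"),
        UNSET(-2, "task has not yet run"),
        ERROR(-1, "task could not be performed"),
        SUCCESS(0, "task performed successfully"),
        FAIL(1, "task performed, but failed"),
        SKIP(2, "task not performed, object not eligible");

        final int code;
        final String meaning;

        Status(int code, String meaning) {
            this.code = code;
            this.meaning = meaning;
        }
    }

    // Translate a numeric status code into its named constant.
    static Status fromCode(int code) {
        return Arrays.stream(Status.values())
                .filter(status -> status.code == code)
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException("Unknown status code: " + code));
    }

    public static void main(String[] args) {
        Status status = fromCode(2);
        System.out.println(status + ": " + status.meaning);
    }
}
```

An unknown code raises an exception here rather than being mapped silently, which is one possible design choice among several.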

In the administrative UI, this code is translated into the word or phrase configured by the ui.statusmessages
property (discussed above) for display.

4.3.5.2 Result String

The task may define a string indicating details of the outcome. This result is displayed, in the "curation widget"
described above:

"Virus 12312 detected on Bitstream 4 of 1234567789/3"

CS does not interpret or assign result strings; the task does. A task need not assign a result, but the "best practice"
for tasks is to assign one whenever possible.

4.3.5.3 Reporting Stream

For very fine-grained information, a task may write to a reporting stream. This stream may be sent to a file or to
standard out, when running a task from the command line. Tasks run from the administrative UI or a workflow use
a configured Reporter class to collect report output. Your own code may collect the report using any
implementation of the Appendable interface. Unlike the result string, there is no limit to the amount of data that
may be pushed to this stream.

4.3.6 Task Properties

DSpace 1.8 introduced an "idiom" for tasks that require configuration data. It is available to any task whose
implementation extends AbstractCurationTask, but is completely optional. There are a number of problems
that task properties are designed to solve, but to make the discussion concrete we will start with a particular one:
the problem of hard-coded configuration file names. A task that relies on configuration data will typically encode a
fixed reference to a configuration file name. For example, the virus scan task reads a file called "clamav.cfg",
which lives in [dspace]/config/modules. It could look up its configuration properties in the ordinary way. But
tasks are supposed to be written by anyone in the community and shared around (without prior coordination), so if
another task uses the same configuration file name, there is a name collision here that can't be easily fixed, since
the reference is hard-coded in each task. In this case, if we wanted to use both at a given site, we would have to
alter the source of one of them - which introduces needless code localization and maintenance.
Task properties give us a simple solution. Here is how it works: suppose that both colliding tasks use the
task properties facility instead of ordinary configuration lookup. For example, each asks for the property
clamav.service.host. At runtime, the curation system resolves this request to a set of configuration properties,
using the name under which the task has been configured as the prefix of the properties. So, for example, if both
were installed (in, say, curate.cfg) as:

org.dspace.ctask.general.ClamAv = vscan,
org.community.ctask.ConflictTask = virusscan,
....

then the task property foo will resolve to the property named vscan.foo when called from the ClamAv task, but to
virusscan.foo when called from ConflictTask's code. Note that "vscan", etc. are locally assigned names, so we
can always prevent the "collisions" mentioned, and we make the tasks much more portable, since we remove the
"hard-coding" of config names.

Another use of task properties is to support multiple task profiles. Suppose we have a task that we want to operate
in one of two modes. A good example would be a mediafilter task that produces a thumbnail. We can either create
one if it doesn't exist, or run with "-force" which will create one regardless. Suppose this behavior was controlled by
a property in a config file. If we configured the task as "thumbnail", then we would have in (perhaps)
[dspace]/config/modules/thumbnail.cfg:

...other properties...
thumbnail.thumbnail.maxheight = 80
thumbnail.thumbnail.maxwidth = 80
thumbnail.forceupdate=false

The thumbnail generating task code would then resolve "forceupdate" to see whether filtering should be forced.
But an obvious use-case would be to want to run force mode and non-force mode from the admin UI on different
occasions. To do this, one would have to stop Tomcat, change the property value in the config file, restart, etc.
However, we can use task properties to elegantly rescue us here. All we need to do is go into the config/modules
directory, and create a new file, perhaps called thumbnail.force.cfg. In this file, we put the properties:

thumbnail.force.thumbnail.maxheight = 80
thumbnail.force.thumbnail.maxwidth = 80
thumbnail.force.forceupdate=true

Then we add a new task (really just a new name, no new code) in curate.cfg:

org.dspace.ctask.general.ThumbnailTask = thumbnail
org.dspace.ctask.general.ThumbnailTask = thumbnail.force

Consider what happens: when we perform the task "thumbnail" (using taskProperties), it uses the thumbnail.*
properties and operates in "non-force" profile (since the value is false), but when we run the task
"thumbnail.force" the curation system uses the thumbnail.force.* properties. Notice that we did all this via
local configuration - we have not had to touch the source code at all to obtain as many "profiles" as we would like.
See Task Properties in Curation Tasks(see page 539) for details of how properties are resolved in task code.
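
The prefixing rule described in this section can be pictured with the following minimal sketch. It is not the curation system's actual code: the class TaskPropertySketch, the in-memory CONFIG map and its values are assumptions made for illustration, standing in for whatever the site's configuration files contain.

```java
import java.util.Map;

// Hypothetical sketch, not DSpace code: a task-relative property name is
// prefixed with the locally assigned task name before it is looked up.
class TaskPropertySketch {

    // Stand-in for the site's loaded configuration (values are made up).
    static final Map<String, String> CONFIG = Map.of(
            "vscan.foo", "value for the ClamAv task",
            "virusscan.foo", "value for ConflictTask",
            "thumbnail.forceupdate", "false",
            "thumbnail.force.forceupdate", "true");

    // Resolve a property on behalf of the task configured under taskName.
    static String taskProperty(String taskName, String property) {
        return CONFIG.get(taskName + "." + property);
    }

    public static void main(String[] args) {
        // Same task code, two locally assigned names, two different profiles.
        System.out.println(taskProperty("thumbnail", "forceupdate"));
        System.out.println(taskProperty("thumbnail.force", "forceupdate"));
    }
}
```

Because the prefix is the locally assigned task name, two tasks that ask for the same property name never read each other's values, and one implementation can be given several profiles.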

4.3.7 Task Parameters

New in DSpace 7, you can pass parameters to a task at invocation time. These runtime parameters will be
presented to the task as if they were task properties (see above) and, if present, will override the value of
identically-named properties. Example:

Task parameters

bin/dspace curate -t reticulate -i 123456789/36 -p foreground=red -p background=green
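
The override behaviour can be sketched as follows, assuming the parameters arrive as name=value strings as in the command above. The class TaskParameterSketch, the method effectiveProperties and the sample property values are hypothetical and only illustrate the precedence rule; they do not show how the curate command itself is implemented.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: runtime parameters are overlaid on configured task properties.
class TaskParameterSketch {

    // Return the configured properties with each name=value parameter applied on top.
    static Map<String, String> effectiveProperties(Map<String, String> configured,
                                                   List<String> parameters) {
        Map<String, String> effective = new HashMap<>(configured);
        for (String parameter : parameters) {
            int eq = parameter.indexOf('=');
            if (eq < 1) {
                throw new IllegalArgumentException("Expected name=value, got: " + parameter);
            }
            effective.put(parameter.substring(0, eq), parameter.substring(eq + 1));
        }
        return effective;
    }

    public static void main(String[] args) {
        // Made-up configured values for the example task.
        Map<String, String> configured = Map.of("foreground", "blue", "border", "thin");
        Map<String, String> effective =
                effectiveProperties(configured, List.of("foreground=red", "background=green"));
        System.out.println(effective.get("foreground")); // red: the parameter overrides the property
        System.out.println(effective.get("border"));     // thin: untouched by the parameters
    }
}
```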

4.3.8 Scripted Tasks

The procedure to set up curation tasks in Jython is described on a separate page: Curation tasks in
Jython(see page 543)

DSpace 1.8 introduced limited (and somewhat experimental) support for deploying and running tasks written in
languages other than Java. Since version 6, Java has provided a standard way (API) to invoke so-called scripting or
dynamic language code that runs on the Java Virtual Machine (JVM). Scripted tasks are those written in a language
accessible from this API. The exact number of supported languages will vary over time, and the degree of maturity
of each language, or suitability of the language for curation tasks, will also vary significantly. However, preliminary
work indicates that Ruby (using the JRuby runtime) and Groovy may prove viable task languages.
Support for scripted tasks does not include any DSpace pre-installation of the scripting language itself - this must
be done according to the instructions provided by the language maintainers, and typically only requires a few
additional jars on the DSpace classpath. Once one or more languages have been installed into the DSpace
deployment, task support is fairly straightforward. One new property must be defined in
[dspace]/config/modules/curate.cfg:

curate.script.dir = ${dspace.dir}/scripts

This merely defines the directory location (usually relative to the deployment base) where task script files should be
kept. This directory will contain a "catalog" of scripted tasks named task.catalog that contains information
needed to run scripted tasks. Each task has a 'descriptor' property with value syntax:
<engine>|<relFilePath>|<implClassCtor>
An example property for a link checking task written in Ruby might be:

linkchecker = ruby|rubytask.rb|LinkChecker.new

This descriptor means that a "ruby" script engine will be created, a script file named "rubytask.rb" in the
directory <script.dir> will be loaded, and the resolver will expect that an evaluation of "LinkChecker.new" will
provide a correct implementation object. Note that the task must be configured in all other ways just like Java tasks
(in ui.tasknames, ui.taskgroups, etc).
Script files may embed their descriptors to facilitate deployment. To accomplish this, a script must include the
descriptor string with syntax:
$td=<descriptor> somewhere on a comment line. For example:

# My descriptor $td=ruby|rubytask.rb|LinkChecker.new

For reasons of portability, the <relFilePath> component may be omitted in this context. Thus,
"$td=ruby||LinkChecker.new" will be expanded to a descriptor with the name of the embedding file.
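
To make the descriptor syntax concrete, the sketch below parses a descriptor and extracts an embedded one from a comment line. It is an illustration only, not the resolver used by DSpace: the class TaskDescriptorSketch and its methods are hypothetical, and it assumes that an embedded descriptor contains no whitespace and ends at the first whitespace character or at the end of the line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the <engine>|<relFilePath>|<implClassCtor> syntax.
class TaskDescriptorSketch {

    // Assumption: the embedded descriptor runs from "$td=" to the next whitespace.
    private static final Pattern EMBEDDED = Pattern.compile("\\$td=(\\S+)");

    // Split a descriptor into {engine, relFilePath, implClassCtor}.
    // An empty relFilePath is replaced with the name of the embedding file.
    static String[] parse(String descriptor, String embeddingFile) {
        String[] parts = descriptor.split("\\|", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed descriptor: " + descriptor);
        }
        if (parts[1].isEmpty()) {
            parts[1] = embeddingFile;
        }
        return parts;
    }

    // Find a descriptor embedded in a comment line, or return null if there is none.
    static String findEmbedded(String line) {
        Matcher matcher = EMBEDDED.matcher(line);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String embedded = findEmbedded("# My descriptor $td=ruby||LinkChecker.new");
        String[] parts = parse(embedded, "rubytask.rb");
        System.out.println(String.join(" | ", parts));
    }
}
```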

4.3.9 Bundled Tasks

DSpace bundles a small number of tasks of general applicability. Those that do not require configuration (or have
usable default values) are activated by default to demonstrate the use of the curation system. They may be
deactivated by means of configuration, if desired, without affecting system integrity. Those that require
configuration may be enabled (activated) by editing DSpace configuration files. Each task is briefly described
in this section.
All bundled tasks are in the package org.dspace.ctask.general. So, for example, to activate the no-operation
task, which is implemented in the class NoOpCurationTask, one would configure:

plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.NoOpCurationTask = noop

4.3.9.1 Bitstream Format Profiler Task

The task with the taskname 'profileformats' (in the admin UI it is labeled "Profile Bitstream Formats") examines all
the bitstreams in an item and produces a table ("profile") which is assigned to the result string. It is activated by
default, and is configured to display in the administrative UI. The result string has the layout:

10 (K) Portable Network Graphics
5 (S) Plain Text

where the left column is the count of bitstreams of the named format and the letter in parentheses is an
abbreviation of the repository-assigned support level for that format:

U Unsupported
K Known
S Supported

The profiler will operate on any DSpace object. If the object is an item, then only that item's bitstreams are profiled;
if a collection, all the bitstreams of all the items; if a community, all the items of all the collections of the
community.

4.3.9.2 Link Checker Tasks

Two link checker tasks, BasicLinkChecker and MetadataValueLinkChecker, can be used to check for broken or
unresolvable links appearing in item metadata.
These tasks are intended as prototypes / examples for developers and administrators who are new to the curation
system.
These tasks are not configurable.

Basic Link Checker

BasicLinkChecker iterates over all metadata fields ending in "uri" (e.g. dc.relation.uri, dc.identifier.uri,
dc.source.uri ...), attempts a GET to the value of the field, and checks for a 200 OK response.
Results are reported in a simple "one row per link" format.

Metadata Value Link Checker

MetadataValueLinkChecker parses all metadata fields for valid HTTP URLs, attempts a GET to those URLs, and
checks for a 200 OK response.
Results are reported in a simple "one row per link" format.
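
The two selection strategies and the 200 OK check can be approximated as in the sketch below. This is not the code of the bundled tasks: the class LinkCheckSketch, its method names, the URL pattern and the five-second timeouts are all assumptions made for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the bundled implementation.
class LinkCheckSketch {

    // Assumption: a URL is "http://" or "https://" followed by non-whitespace characters.
    private static final Pattern HTTP_URL = Pattern.compile("https?://\\S+");

    // BasicLinkChecker-style selection: only fields whose name ends in "uri".
    static boolean isUriField(String fieldName) {
        return fieldName.endsWith(".uri");
    }

    // MetadataValueLinkChecker-style selection: any HTTP(S) URL found in a value.
    static List<String> extractUrls(String value) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = HTTP_URL.matcher(value);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    // Issue a GET and report whether the server answered 200 OK.
    static boolean isOk(String url) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        connection.setRequestMethod("GET");
        connection.setConnectTimeout(5000);
        connection.setReadTimeout(5000);
        try {
            return connection.getResponseCode() == HttpURLConnection.HTTP_OK;
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) {
        System.out.println(isUriField("dc.identifier.uri"));
        System.out.println(extractUrls("See http://example.org/a and https://example.org/b"));
    }
}
```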

4.3.9.3 MetadataWebService Task

DSpace item metadata can contain any number of identifiers or other field values that participate in networked
information systems. For example, an item may include a DOI which is a controlled identifier in the DOI registry.
Many web services exist to leverage these values, by using them as 'keys' to retrieve other useful data. In the DOI
case for example, CrossRef provides many services that given a DOI will return author lists, citations, etc. The
MetadataWebService task enables the use of such services, and allows you to obtain and (optionally) add to DSpace
metadata the results of any web service call to any service provider. You simply need to describe what service you
want to call, and what to do with the results. Using the task code ([taskcode]), you can create as many distinct
tasks as you have services you want to call.
Each task description lives in a configuration file in 'config/modules' (or in your local.cfg), and is a simple properties
file, like all other DSpace configuration files (see Configuration Reference(see page 552)). All of the settings associated
with a given task should be prepended with the task name (as assigned in config/modules/curate.cfg). For
example, if the task name is issn2pubname in curate.cfg, then all settings should start with "issn2pubname."
Your settings can either be set in your local.cfg, or in a new configuration file which is included (include =
path/to/new/file.cfg) into either your local.cfg or the dspace.cfg. See the Configuration Reference(see page 552)
for examples of including configuration files, or modifying your local.cfg.
There are a few required properties you must configure for any service, and for certain services, a few additional
ones. An example will illustrate best.

ISSN to Publisher Name

Suppose items (holding journal articles) include 'dc.identifier.issn' when available. We might also want to catalog
the publisher name (in 'dc.publisher'). The cataloger could look up the name given the ISSN in various sources, but
this 'research' is tedious, costly and error-prone. There are many good quality, free web services that can furnish
this information. So we will configure a MetadataWebService task to call a service, and then automatically assign
the publisher name to the item metadata. As noted above, all that is needed is a description of the service, and
what to do with the results. Create a new file in 'config/modules' called 'issn2pubname.cfg' (or whatever is
mnemonically useful to you). The first property in this file describes the service in a 'template'. The template is just
the URL to call the web service, with parameters to substitute values in. Here we will use the 'Sherpa/Romeo'
service:

[taskcode].template=http://www.sherpa.ac.uk/romeo/api29.php?issn={dc.identifier.issn}

When the task runs, it will replace '{dc.identifier.issn}' with the value of that field in the item. If the field has
multiple values, the first one will be used. The call to the above URL will return an XML document containing
information (including the publisher name) about that ISSN. We need to describe what to do with this response
document, i.e. what elements we want to extract, and what to do with the extracted content. This description is
encoded in a property called the 'datamap'. Using the example service above we might have:

[taskcode].datamap=//publisher/name=>dc.publisher,//romeocolor

Each separate instruction is separated by a comma, so there are 2 instructions in this map. The first instruction
essentially says: find the XML element 'publisher name' and assign the value or values of this element to the
'dc.publisher' field of the item. The second instruction says: find the XML element 'romeocolor', but do not add it to
the DSpace item metadata - simply add it to the task result string (so that it can be seen by the person running the
task). You can have as many instructions as you like in a datamap, which means that you can retrieve multiple
values from a single web service call. A little more formally, each instruction consists of either one or three parts. The first
(mandatory) part identifies the desired data in the response document. The syntax (here '//publisher/name') is an
XPath 1.0 expression, which is the standard language for navigating XML trees. If the value is to be assigned to the
DSpace item metadata, then 2 other parts are needed. The first is the 'mapping symbol' (here '=>'), which is used to
determine how the assignment should be made. There are 3 possible mapping symbols, shown here with their
meanings:

'->' mapping will add to any existing value(s) in the item field
'=>' mapping will replace any existing value(s) in the item field
'~>' mapping will add *only if* item field has no existing value(s)

The third part (here 'dc.publisher') is simply the name of the metadata field to be updated. These two mandatory
properties (template and datamap) are sufficient to describe a large number of web services. All that is required to
enable this task is to edit 'config/modules/curate.cfg' (or your local.cfg), and add 'issn2pubname' to the
list of tasks:

plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.MetadataWebService = issn2pubname

plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.MetadataWebService = doi2crossref

If you wish the task to be available in the Admin UI, see the Invocation from the Admin UI documentation
(above) about how to configure it. The remaining sections describe some more specialized needs using the
MetadataWebService task.
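
The datamap syntax and the three mapping symbols described above can be sketched as follows. This is a simplified illustration rather than the task's real parser: the class DatamapSketch and its methods are hypothetical, and the sketch assumes that neither the XPath expression nor the field name contains a comma or one of the mapping symbols.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of datamap parsing and of the three mapping symbols.
class DatamapSketch {

    static final String[] SYMBOLS = {"->", "=>", "~>"};

    // Split one instruction into {xpath, symbol, field}. Symbol and field are
    // null when the value is only reported in the result string.
    static String[] parseInstruction(String instruction) {
        String text = instruction.trim();
        int at = -1;
        for (String symbol : SYMBOLS) {
            int index = text.indexOf(symbol);
            if (index >= 0 && (at < 0 || index < at)) {
                at = index;
            }
        }
        if (at < 0) {
            return new String[] {text, null, null};
        }
        return new String[] {text.substring(0, at), text.substring(at, at + 2), text.substring(at + 2)};
    }

    // Split a whole datamap into its comma-separated instructions.
    static List<String[]> parse(String datamap) {
        List<String[]> instructions = new ArrayList<>();
        for (String instruction : datamap.split(",")) {
            instructions.add(parseInstruction(instruction));
        }
        return instructions;
    }

    // Compute the new field values according to the mapping symbol.
    static List<String> apply(String symbol, List<String> existing, List<String> incoming) {
        List<String> updated = new ArrayList<>(existing);
        switch (symbol) {
            case "->":                      // add to any existing values
                updated.addAll(incoming);
                break;
            case "=>":                      // replace any existing values
                updated.clear();
                updated.addAll(incoming);
                break;
            case "~>":                      // add only if there are no existing values
                if (updated.isEmpty()) {
                    updated.addAll(incoming);
                }
                break;
            default:
                throw new IllegalArgumentException("Unknown mapping symbol: " + symbol);
        }
        return updated;
    }

    public static void main(String[] args) {
        for (String[] parts : parse("//publisher/name=>dc.publisher,//romeocolor")) {
            System.out.println(String.join(" ", parts[0], String.valueOf(parts[1]), String.valueOf(parts[2])));
        }
    }
}
```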

HTTP Headers
For some web services, protocol and other information is expressed not in the service URL, but in HTTP headers.
Examples might be HTTP basic auth tokens, or requests for a particular media type response. In these cases, simply
add a property to the configuration file (our example was 'issn2pubname.cfg') containing all headers you wish to
transmit to the service:

[taskcode].headers=Accept: application/xml||Cache-Control: no-cache

You can specify any number of headers, just separate them with a 'double-pipe' ('||'). Ensure that any commas in
values are escaped (with backslash comma, i.e. '\,').
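
As an illustration of the double-pipe convention, the sketch below splits a headers value into individual name/value pairs. The class HeaderSketch and its parse method are hypothetical, and the sketch assumes the value has already been read from configuration (so any escaped commas have been resolved by then).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: split a '||'-separated headers value into name/value pairs.
class HeaderSketch {

    static Map<String, String> parse(String headers) {
        Map<String, String> parsed = new LinkedHashMap<>();
        for (String header : headers.split("\\|\\|")) {
            int colon = header.indexOf(':');
            if (colon < 1) {
                throw new IllegalArgumentException("Expected 'Name: value', got: " + header);
            }
            parsed.put(header.substring(0, colon).trim(), header.substring(colon + 1).trim());
        }
        return parsed;
    }

    public static void main(String[] args) {
        System.out.println(parse("Accept: application/xml||Cache-Control: no-cache"));
    }
}
```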

Transformations
One potential problem with the simple parameter substitutions performed by the task is that the service might
expect a different format or expression of a value than the way it is stored in the item metadata. For example, a DOI
service might expect a bare prefix/suffix notation ('10.000/12345'), whereas the DSpace metadata field might have a
URI representation ('http://dx.doi.org/10.000/12345'). In these cases one can declare a 'transformation'
of a value in the template. For example:

[taskcode].template=http://www.crossref.org/openurl/?id={doi:dc.relation.isversionof}&format=unixref

The 'doi:' prepended to the metadata field name declares that the value of the 'dc.relation.isversionof' field should
be transformed before the substitution into the template using a transformation named 'doi'. The transformation is
itself defined in the same configuration file as follows:

[taskcode].transform.doi=match 10. trunc 60

This would be read as: discard the part of the value string that precedes the occurrence of '10.', then truncate any
characters beyond length 60. You may define as many transformations as you want in any task, although generally 1 or 2
will suffice. The keywords 'match', 'trunc', etc. are names of 'functions' to be applied (in the order entered). The
currently available functions are:

'cut' <number> = remove number leading characters
'trunc' <number> = remove trailing characters after number length
'match' <pattern> = start match at pattern
'text' <characters> = append literal characters (enclose in ' ' when whitespace needed)

When the task is run, if the transformation results in an invalid state (e.g. cutting more characters than there are in
the value), the un-transformed value will be used and the condition will be logged. Transformations may also be
applied to values returned from the web service. That is, one can apply the transformation to a value before
assigning it to a metadata field. In this case, the declaration occurs in the datamap property, not the template:

[taskcode].datamap=//publisher/name=>shorten:dc.publisher,//romeocolor

Here the task will apply the 'shorten' transformation (which must be defined in the same config file) before
assigning the value to 'dc.publisher'.
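
A transformation such as 'match 10. trunc 60' can be read as a small pipeline of functions. The sketch below is one possible interpretation, not the task's actual code: the class TransformSketch is hypothetical, 'match' is assumed to look for a literal string rather than a regular expression, 'trunc' is assumed to leave shorter values unchanged, and quoted 'text' arguments containing whitespace are not handled.

```java
// Hypothetical sketch of applying a transformation such as "match 10. trunc 60".
class TransformSketch {

    // Apply each function in order; fall back to the original value if any step is invalid.
    static String transform(String value, String spec) {
        String[] tokens = spec.trim().split("\\s+");
        String result = value;
        try {
            for (int i = 0; i + 1 < tokens.length; i += 2) {
                String function = tokens[i];
                String arg = tokens[i + 1];
                switch (function) {
                    case "cut":     // remove leading characters
                        result = result.substring(Integer.parseInt(arg));
                        break;
                    case "trunc":   // keep at most the given number of characters
                        result = result.substring(0, Math.min(result.length(), Integer.parseInt(arg)));
                        break;
                    case "match":   // keep the value from the first occurrence of arg onwards
                        int start = result.indexOf(arg);
                        if (start < 0) {
                            throw new IllegalStateException("Pattern not found: " + arg);
                        }
                        result = result.substring(start);
                        break;
                    case "text":    // append literal characters
                        result = result + arg;
                        break;
                    default:
                        throw new IllegalArgumentException("Unknown function: " + function);
                }
            }
        } catch (RuntimeException e) {
            // Invalid state (e.g. cutting more characters than exist): use the untransformed value.
            return value;
        }
        return result;
    }

    public static void main(String[] args) {
        // prints 10.000/12345
        System.out.println(transform("http://dx.doi.org/10.000/12345", "match 10. trunc 60"));
    }
}
```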

Result String Programmatic Use

Normally a task result string appears in a window in the admin UI after it has been invoked. The MetadataWebService
task will concatenate all the values declared in the 'datamap' property and place them in the result string using the
format: 'name:value name:value' for as many values as declared. In the example above we would get a string like
'publisher: Nature romeocolor: green'. This format is fine for simple display purposes, but can be tricky if the values
contain spaces. You can override the space separator using an optional property 'separator' (put in the config file,
with all other properties). If you use:

[taskcode].separator=||

for example, it becomes easy to parse the result string and preserve spaces in the values. This use of the result
string can be very powerful, since you are essentially creating a map of returned values, which can then be used to
populate a user interface, or any other way you wish to exploit the data (drive a workflow, etc).
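
Parsing such a result string into a map might look like the sketch below. The class ResultStringSketch and its parse method are hypothetical, the sample publisher value is made up, and the exact spacing of the real result string is an assumption, so names and values are trimmed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: turn a 'name:value' result string into a map.
class ResultStringSketch {

    static Map<String, String> parse(String result, String separator) {
        Map<String, String> values = new LinkedHashMap<>();
        // Pattern.quote treats the separator literally, so "||" is not read as a regex.
        for (String pair : result.split(Pattern.quote(separator))) {
            int colon = pair.indexOf(':');
            if (colon < 0) {
                continue; // ignore fragments that are not name:value pairs
            }
            values.put(pair.substring(0, colon).trim(), pair.substring(colon + 1).trim());
        }
        return values;
    }

    public static void main(String[] args) {
        // Made-up result string using the '||' separator configured above.
        Map<String, String> values = parse("publisher: Nature Publishing Group||romeocolor: green", "||");
        System.out.println(values.get("publisher"));
        System.out.println(values.get("romeocolor"));
    }
}
```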

Limits and Use

A few limitations should be noted. First, since the response parsing utilizes XPath, the task can only operate on
XML (not JSON) response documents. Most web services can provide either, so this should not be a major obstacle.
The MetadataWebService can be used in many ways: showing an admin a value in the result string in a UI, running in a
batch to update a set of items, etc. One excellent configuration is to wire these tasks into submission workflow, so
that 'automatic cataloging' of many fields can be performed on ingest.

4.3.9.4 MicrosoftTranslator Task

The MicrosoftTranslator task uses the Microsoft Translate API to translate metadata values from one source language
into one or more target languages.
This task can be configured to process particular fields, and to use a default language if no authoritative language
for an item can be found. A Bing API v2 key is needed.
MicrosoftTranslator extends the more generic AbstractTranslator. This now seems wasteful, but a GoogleTranslator
had also been written to extend AbstractTranslator. Unfortunately, Google has announced that it is
decommissioning its free Translate API service, so that task has not been included in DSpace's general set of curation
tasks.
Translated fields are added in addition to any existing fields, with the target language code in the 'language'
column. This means that running a task multiple times over one item with the same configuration could result in
duplicate metadata.
This task is intended as a prototype / example for developers and administrators who are new to the curation
system.

Configure Microsoft Translator

An example configuration file can be found in [dspace]/config/modules/translator.cfg.

#---------------------------------------------------------------#
#----------TRANSLATOR CURATION TASK CONFIGURATIONS--------------#
#---------------------------------------------------------------#
# Configuration properties used solely by MicrosoftTranslator #
# Curation Task (uses Microsoft Translation API v2) #
#---------------------------------------------------------------#
## Translation field settings
##
## Authoritative language field
## This will be read to determine the original language an item was submitted in
## Default: dc.language
translator.field.language = dc.language

## Metadata fields you wish to have translated
translator.field.targets = dc.description.abstract, dc.title, dc.type

## Translation language settings
##
## If the language field configured in translator.field.language is not present
## in the record, set translator.language.default to a default source language
## or leave blank to use autodetection
translator.language.default = en

## Target languages for translation
translator.language.targets = de, fr

## Translation API settings
##
## Your Bing API v2 key and/or Google "Simple API Access" Key
## (note to Google users: your v1 API key will not work with Translate v2,
## you will need to visit https://code.google.com/apis/console and activate
## a Simple API Access key)
##
## You do not need to enter a key for both services.
translator.api.key.microsoft = YOUR_MICROSOFT_API_KEY_GOES_HERE
translator.api.key.google = YOUR_GOOGLE_API_KEY_GOES_HERE

4.3.9.5 NoOp Task

This task does absolutely nothing. It is intended as a starting point for developers and administrators wishing to
learn more about the curation system.

4.3.9.6 Required Metadata Task

The "requiredmetadata" task examines item metadata and determines whether fields that the web submission
(submission-forms.xml) marks as required are present. It sets the result string to indicate either that all required
fields are present, or constructs a list of metadata elements that are required but missing. When the task is performed
on an item, it will display the result for that item. When performed on a collection or community, the task will be
performed on each item, and will display the last item result. If all items in the community or collection have all
required fields, that will be the result for the last item in the collection. If the task fails for any item (i.e. the
item lacks one or more required fields), the process is halted. This way the results for the 'failed' items are not lost.

4.3.9.7 Virus Scan Task

The "vscan" task performs a virus scan on the bitstreams of items using the ClamAV software product.
Clam AntiVirus is an open source (GPL) anti-virus toolkit for UNIX. A port for Windows is also available. The virus
scanning curation task interacts with the ClamAV virus scanning service to scan the bitstreams contained in items,
reporting on infection(s). Like other curation tasks, it can be run against a container or item, in the GUI or from the
command line. ClamAV should be installed according to the documentation at http://www.clamav.net/. It should not be
installed in the DSpace installation directory. You may install it on the same machine as your DSpace installation, or
on another machine which has been configured properly.

Setup the service from the ClamAV documentation.

This plugin requires a ClamAV daemon installed and configured for TCP sockets. Instructions for installing ClamAV:
http://www.clamav.net/doc/latest/clamdoc.pdf
NOTICE: The following directions assume there is a properly installed and configured clamav daemon. Refer to links
above for more information about ClamAV.
The Clam anti-virus database must be updated regularly to maintain the most current level of anti-virus protection.
Please refer to the ClamAV documentation for instructions about maintaining the anti-virus database.

DSpace Configuration
In [dspace]/config/modules/curate.cfg, activate the task:
• Add the plugin to the list of curation tasks.

### Task Class implementations

plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.NoOpCurationTask = noop
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.ProfileFormats = profileformats
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.RequiredMetadata = requiredmetadata
# This is the ClamAV scanner plugin
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.ClamScan = vscan
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.MicrosoftTranslator = translate
plugin.named.org.dspace.curate.CurationTask = org.dspace.ctask.general.MetadataValueLinkChecker = checklinks

• Optionally, add the vscan friendly name to the configuration to enable it in the administrative user interface.

curate.ui.tasknames = profileformats = Profile Bitstream Formats

curate.ui.tasknames = requiredmetadata = Check for Required Metadata
curate.ui.tasknames = checklinks = Check Links in Metadata
# Enable ClamAV from UI
curate.ui.tasknames = vscan = Virus Scan

• In [dspace]/config/modules, edit configuration file clamav.cfg:

clamav.service.host = 127.0.0.1
# Change if not running on the same host as your DSpace installation.
clamav.service.port = 3310
# Change if not using standard ClamAV port
clamav.socket.timeout = 120
# Change if longer timeout needed
clamav.scan.failfast = false
# Change only if items have large numbers of bitstreams

• Finally, if desired, virus scanning can be enabled as part of the submission process upload file step. In
[dspace]/config/modules, edit configuration file submission-curation.cfg:

submission-curation.virus-scan = true

Task Operation from the Administrative user interface

Curation tasks can be run against container and item DSpace objects by e-persons with administrative privileges. A
curation tab will appear in the administrative UI after logging into DSpace:
1. Click on the curation tab.
2. Select the option configured in ui.tasknames above.
3. Select Perform.

Task Operation from the Item Submission user interface

If desired, virus scanning can be enabled as part of the submission process upload file step. In
[dspace]/config/modules, edit configuration file submission-curation.cfg:

submission-curation.virus-scan = true

Task Operation from the curation command line client

To output the results to the console:

[dspace]/bin/dspace curate -t vscan -i <handle of container or item dso> -r -

Or capture the results in a file:

[dspace]/bin/dspace curate -t vscan -i <handle of container or item dso> -r - > /<path...>/<name>

Table 1 – Virus Scan Results Table

GUI (Interactive Mode)   FailFast   Expectation
Container                T          Stop on 1st Infected Bitstream
Container                F          Stop on 1st Infected Item
Item                     T          Stop on 1st Infected Bitstream
Item                     F          Scan all bitstreams

Command Line             FailFast   Expectation
Container                T          Report on 1st infected bitstream within an item/Scan all contained Items
Container                F          Report on all infected bitstreams/Scan all contained Items
Item                     T          Report on 1st infected bitstream
Item                     F          Report on all infected bitstreams

4.4 Exporting Content and Metadata

General top level page to group all DSpace facilities for exporting content and metadata.

• Linked (Open) Data(see page 160)

• SWORDv1 Client(see page 176)
• Exchanging Content Between Repositories(see page 178)
• OAI(see page 179)
• OpenAIRE4 Guidelines Compliancy(see page 199)

4.4.1 Linked (Open) Data

• Introduction(see page 161)

• Exchanging repository contents(see page 161)
• Terminology(see page 161)
• Linked (Open) Data Support within DSpace(see page 162)
• Architecture / Concept(see page 162)
• Install a Triple Store(see page 164)
• Default configuration and what you should change(see page 164)
• Configuration Reference(see page 165)
• [dspace-source]/dspace/config/modules/rdf.cfg(see page 165)
• [dspace-source]/dspace/config/modules/rdf/constant-data-*.ttl(see page 174)
• [dspace-source]/dspace/config/modules/rdf/metadata-rdf-mapping.ttl(see page 174)
• [dspace-source]/dspace/config/modules/rdf/fuseki-assembler.ttl(see page 175)
• [dspace-source]/dspace/config/spring/api/rdf.xml(see page 175)
• Maintenance(see page 176)

4.4.1.1 Introduction

Exchanging repository contents

Most sites on the Internet are oriented towards human consumption. While HTML may be a good format for
presenting information to humans, it is not a good format for exporting data in a way that is easy for a computer to
work with. Like most software for building repositories, DSpace supports OAI-PMH(see page 179) as an interface to
expose the stored metadata. While OAI-PMH is well known in the field of repositories, it is rarely known elsewhere
(e.g. Google retired its support for OAI-PMH in 2008 [193]). The Semantic Web is a generic approach to publishing data
on the Internet together with information about its semantics. Its application is not limited to repositories or
libraries, and it has a growing user base. RDF [194] and SPARQL [195] are W3C-released standards for publishing
structured data on the web in a machine-readable way. The data stored in repositories are particularly suited for use
in the Semantic Web, as the metadata are already available; they do not have to be generated or entered manually for
publication as Linked Data. For most repositories, and certainly for Open Access repositories, sharing their stored
content is important. Linked Data is a significant opportunity for repositories to present their content in a way
that can easily be accessed, interlinked and (re)used.

Terminology
We don't want to give a full introduction into the Semantic Web and its technologies here as this can be easily found
in many places on the web. Nevertheless, we want to give a short glossary of the terms used most often in this
context to make the following documentation more readable.

Semantic Web: The term "Semantic Web" refers to the part of the Internet containing Linked
Data. Just like the World Wide Web, the Semantic Web is also woven
together by links among the data.

193 http://googlewebmastercentral.blogspot.de/2008/04/retiring-support-for-oai-pmh-in.html
194 http://www.w3.org/standards/techs/rdf#w3c_all
195 http://www.w3.org/TR/sparql11-query/


Linked Data, Linked Open Data: Data in RDF that follow the Linked Data Principles [196] are called Linked Data.
The Linked Data Principles describe the expected behavior of data
publishers, who shall ensure that the published data are easy to find, easy to
retrieve, can be linked easily and link to other data as well.
Linked Open Data is Linked Data published under an open license. There is
no technical difference between Linked Data and Linked Open Data (often
abbreviated as LOD). It is only a question of the license used to publish it.

RDF: RDF is an acronym for Resource Description Framework, a metadata model.
Don't think of RDF as a format, as it is a model. Nevertheless, there are
different formats to serialize data following RDF.

RDF/XML, Turtle, N-Triples, N3-Notation: RDF/XML, Turtle, N-Triples
and N3-Notation are probably the most well-known formats to serialize data
in RDF. While RDF/XML uses XML, Turtle, N-Triples and N3-Notation don't,
and they are easier for humans to read and write. When we use RDF in
DSpace configuration files, we currently prefer Turtle (but the code should
be able to deal with any serialization).

Triple Store: A triple store is a database to natively store data following the RDF model.
Just as you have to provide a relational database for DSpace, you have to
provide a Triple Store for DSpace if you want to use the LOD support.

SPARQL: The SPARQL Protocol and RDF Query Language is a family of protocols to
query triple stores. Since version 1.1, SPARQL can be used to manipulate
triple stores as well, to store, delete or update data in triple stores. DSpace
uses SPARQL 1.1 Graph Store HTTP Protocol and SPARQL 1.1 Query
Language to communicate with the Triple Store. The SPARQL 1.1 Query
Language is often referred to simply as SPARQL, so expect the SPARQL 1.1
Query Language if no other specific protocol out of the SPARQL family is
explicitly specified.

SPARQL endpoint: A SPARQL endpoint is a SPARQL interface of a triple store. Since SPARQL 1.1,
a SPARQL endpoint can be either read-only, allowing only to query the
stored data; or readable and writable, allowing to modify the stored data as
well. When talking about a SPARQL endpoint without specifying which
SPARQL protocol is used, an endpoint supporting SPARQL 1.1 Query
Language is meant.

4.4.1.2 Linked (Open) Data Support within DSpace

Starting with DSpace 5.0, DSpace provides support for publishing stored contents in the form of Linked (Open) Data.

Architecture / Concept
To publish content stored in DSpace as Linked (Open) Data, the data have to be converted into RDF. The conversion
into RDF has to be configurable, as different DSpace instances may use different metadata schemata, different
persistent identifiers (DOI, Handle, ...) and so on. Depending on the content to convert, the configuration and other
parameters, conversion may be time-intensive and impact performance. Content of repositories is much more
often read than created, deleted or changed, because the main goal of repositories is to safely store their contents.

196 http://www.w3.org/DesignIssues/LinkedData.html


For this reason, the content stored within DSpace is converted and stored in a triple store immediately after it is
created or updated. The triple store serves as a cache and provides a SPARQL endpoint to make the converted data
accessible using SPARQL. The conversion is triggered automatically by the DSpace event system and can be started
manually using the command line interface; both cases are documented below. There is no need to back up the
triple store, as all data stored in it can be recreated from the contents stored elsewhere in DSpace (in
the assetstore(s) and the database). Besides the SPARQL endpoint, the data should be published as RDF serializations
as well. With dspace-rdf, DSpace offers a module that loads converted data from the triple store and provides it as
an RDF serialization. It currently supports RDF/XML, Turtle and N-Triples.
Repositories use Persistent Identifiers to make content citable and to address content. Following the Linked Data
Principles, DSpace uses Persistent Identifiers in the form of HTTP(S) URIs, converting a Handle to
http://hdl.handle.net/<handle> and a DOI to http://dx.doi.org/<doi>. Altogether, DSpace Linked Data support spans all
three layers: the storage layer with a triple store, the business logic with classes to convert stored contents into
RDF, and the application layer with a module to publish RDF serializations. Just as DSpace allows you to choose
Oracle or PostgreSQL as the relational database, you may choose between different triple stores. The only
requirement is that the triple store must support the SPARQL 1.1 Query Language and the SPARQL 1.1 Graph Store HTTP
Protocol. DSpace uses these to store, update, delete and load converted data in the triple store, and it uses the
triple store to provide the data over a SPARQL endpoint.

Store public data only in the triple store!

The triple store should contain only data that are public, because the DSpace access restrictions won't
affect the SPARQL endpoint. For this reason, DSpace converts only archived, discoverable (non-private)
Items, Collections and Communities which are readable for anonymous users. Please consider this while
configuring and/or extending DSpace Linked Data support.

The org.dspace.rdf.conversion [197] package contains the classes used to convert the repository content to RDF. The
conversion itself is done by plugins. The org.dspace.rdf.conversion.ConverterPlugin [198] interface is really simple, so
take a look at it if you can program in Java and want to extend the conversion. The only important thing is that
plugins must only create RDF that can be made publicly available, as the triple store provides it using a SPARQL
endpoint for which the DSpace access restrictions do not apply. Plugins converting metadata should check whether
a specific metadata field needs to be protected or not (see org.dspace.app.util.MetadataExposure [199] on how to
check that). The MetadataConverterPlugin [200] is heavily configurable (see below) and is used to convert the
metadata of Items. The StaticDSOConverterPlugin [201] can be used to add static RDF triples (see below). The
SimpleDSORelationsConverterPlugin [202] creates links between items and collections, collections and communities,
subcommunities and their parents, and between top-level communities and the information representing the
repository itself.
As different repositories use different persistent identifiers to address their content, different algorithms to create
URIs used within the converted data can be implemented. Currently HTTP(S) URIs of the repository (called local
URIs), Handles and DOIs can be used. See the configuration part of this document for further information. If you
want to add another algorithm, take a look at the org.dspace.rdf.storage.URIGenerator [203] interface.

197 https://github.com/DSpace/DSpace/tree/master/dspace-api/src/main/java/org/dspace/rdf/conversion
198 https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/rdf/conversion/ConverterPlugin.java
199 https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/app/util/MetadataExposure.java
200 https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/rdf/conversion/MetadataConverterPlugin.java
201 https://wiki.duraspace.org/dspace-api/src/main/java/org/dspace/rdf/conversion/StaticDSOConverterPlugin.java
202 https://wiki.duraspace.org/dspace-api/src/main/java/org/dspace/rdf/conversion/SimpleDSORelationsConverterPlugin.java
203 https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/rdf/storage/URIGenerator.java


Install a Triple Store

In addition to a normal DSpace installation you have to install a triple store. You can use any triple store that
supports the SPARQL 1.1 Query Language and the SPARQL 1.1 Graph Store HTTP Protocol. If you do not have one yet, you
can use Apache Fuseki. Download Fuseki from its official download page [204] and unpack the downloaded archive.
The archive contains several scripts to start Fuseki. Use the start script appropriate to your operating system with the
options '--localhost --config=<dspace-install>/config/modules/rdf/fuseki-assembler.ttl'. Instead of changing to the
directory into which you unpacked Fuseki, you may set the variable FUSEKI_HOME. If you are using Linux and bash,
unpacked Fuseki to /usr/local/jena-fuseki-1.0.1 and installed DSpace to [dspace-install], this would look
like this:

export FUSEKI_HOME=/usr/local/jena-fuseki-1.0.1 ; $FUSEKI_HOME/fuseki-server --localhost --config=[dspace-install]/config/modules/rdf/fuseki-assembler.ttl

Fuseki's archive contains a script to start Fuseki automatically at startup as well.

Make Fuseki connect to localhost only, by using the argument --localhost when launching, if you use the
configuration provided with DSpace! The configuration contains a writeable SPARQL endpoint that allows
any connection to change or delete the content of your triple store.

Use Apache mod_proxy, mod_rewrite or any other appropriate web server/proxy to make
localhost:3030/dspace/sparql readable from the internet. Use the address under which it is accessible as the address of
your public SPARQL endpoint (see the property rdf.public.sparql.endpoint in the configuration reference(see
page 165) below).

The configuration provided within DSpace makes Fuseki store the files for the triple store under
[dspace-install]/triplestore. Using this configuration, Fuseki provides three SPARQL endpoints: two read-only
endpoints and one that can be used to change the data of the triple store. You should not use this configuration if
you let Fuseki connect to the internet directly, as it would make it possible for anyone to delete, change or add
information to the triple store. The option --localhost tells Fuseki to listen only on the loopback device. You can
use Apache mod_proxy or any other web or proxy server to make the read-only SPARQL endpoint accessible from the
internet. With the configuration described, Fuseki listens on port 3030 using HTTP. Using the address
http://localhost:3030/ you can connect to the Fuseki Web UI. http://localhost:3030/dspace/data addresses a writeable
SPARQL 1.1 Graph Store HTTP Protocol endpoint, and http://localhost:3030/dspace/get a read-only one. Under
http://localhost:3030/dspace/sparql a read-only SPARQL 1.1 Query Language endpoint can be found. The first of these
endpoints must not be accessible from the internet, while the last one should be publicly accessible.
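
As a rough illustration of how a client could talk to the read-only SPARQL 1.1 Query Language endpoint, the sketch below encodes a query as the "query" parameter of an HTTP GET request, as the SPARQL 1.1 Protocol allows. The endpoint address is the local one described above; the function names and the example query are assumptions made for this sketch, and run_select needs a running triple store.

```python
import json
import urllib.parse
import urllib.request

# Local read-only endpoint from the Fuseki configuration described above.
SPARQL_ENDPOINT = "http://localhost:3030/dspace/sparql"


def build_query_url(endpoint: str, query: str) -> str:
    """Encode a SPARQL query as the 'query' parameter of an HTTP GET request."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def run_select(endpoint: str, query: str) -> dict:
    """Send the query and parse the JSON results (needs a running triple store)."""
    request = urllib.request.Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Illustrative query only: ask for up to five triples.
EXAMPLE_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"
example_url = build_query_url(SPARQL_ENDPOINT, EXAMPLE_QUERY)
```

Once the endpoint is proxied to the internet, the same request shape would apply to the public address instead of localhost.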

Default configuration and what you should change

First, you'll want to ensure the Linked Data endpoint is enabled and configured. In your local.cfg, add rdf.enabled
= true. You can optionally change its path by setting rdf.path. It defaults to "rdf", which means the Linked
Data endpoint is available at [dspace.server.url]/rdf/ (where dspace.server.url is also specified in your
local.cfg).

204 http://jena.apache.org/documentation/serving_data/index.html#download-fuseki


In the file [dspace]/config/dspace.cfg you should look for the property
event.dispatcher.default.consumers and add rdf there. Adding rdf there makes DSpace update the triple
store automatically as the publicly available content of the repository changes.
As the Linked Data support of DSpace is highly configurable, this section gives a short list of things you probably
want to configure before using it. Below you can find more information on what can be configured.
In the file [dspace]/config/modules/rdf.cfg you should configure the address of the public SPARQL endpoint
and the address of the writable endpoint DSpace uses to connect to the triple store (the properties
rdf.public.sparql.endpoint and rdf.storage.graphstore.endpoint). In the same file you should configure
the URL that addresses the dspace-rdf module, which depends on where you deployed it (property
rdf.contextPath), and switch content negotiation on (set property rdf.contentNegotiation.enable =
true).
In the file [dspace]/config/modules/rdf/constant-data-general.ttl you should change the links to the
Web UI of the repository and the public readable SPARQL endpoint. The URL of the public SPARQL endpoint should
point to a URL that is proxied by a webserver to the Triple Store. See the section Install a Triple Store(see page 164)
above for further information.
In the file [dspace]/config/modules/rdf/constant-data-site.ttl you may add any triples that should be
added to the description of the repository itself.
If you want to change the way the metadata fields are converted, take a look into the file
[dspace]/config/modules/rdf/metadata-rdf-mapping.ttl. This is also the place to add information on how to map metadata
fields that you added to DSpace. There is already a quite acceptable default configuration for the metadata fields
which DSpace supports out of the box. If you want to use some specific prefixes in RDF serializations that support
prefixes, you have to edit [dspace]/config/modules/rdf/metadata-prefixes.ttl.

Configuration Reference
There are several configuration files to configure DSpace's LOD support. The main configuration file can be found
under [dspace-source]/dspace/config/modules/rdf.cfg. Within DSpace we use Spring to define which
classes to load. For DSpace's LOD support this is done within [dspace-source]/dspace/config/spring/api/
rdf.xml. All other configuration files are positioned in the directory [dspace-source]/dspace/config/
modules/rdf/. Configurations in rdf.cfg can be modified directly, or overridden via your local.cfg config file
(see Configuration Reference(see page 552)). You'll have to configure where to find and how to connect to the triple
store. You may configure how to generate URIs to be used within the generated Linked Data and how to convert the
contents stored in DSpace into RDF. We will guide you through the configuration file by file.

[dspace-source]/dspace/config/modules/rdf.cfg

Property: rdf.enabled
Example Value: rdf.enabled = true
Informational Note: Defines whether the RDF endpoint is enabled or disabled (disabled by default). If enabled, the RDF
endpoint is available at ${dspace.server.url}/${rdf.path}. Changing this value requires rebooting your
servlet container (e.g. Tomcat).

Property: rdf.path
Example Value: rdf.path = rdf
Informational Note: Defines the path of the RDF endpoint, if enabled. For example, a value of "rdf" (the default) means
the RDF interface/endpoint is available at ${dspace.server.url}/rdf (e.g. if "dspace.server.url =
http://localhost:8080/server", then it'd be available at "http://localhost:8080/server/rdf"). Changing this value
requires rebooting your servlet container (e.g. Tomcat).

Property: rdf.contentNegotiation.enable
Example Value: rdf.contentNegotiation.enable = true
Informational Note: Defines whether content negotiation should be activated. Set this to true if you use Linked Data
support.

Property: rdf.contextPath
Example Value: rdf.contextPath = ${dspace.baseUrl}/rdf
Informational Note: The content negotiation needs to know where to refer if anyone asks for RDF serializations of
content stored within DSpace. This property sets the URL where the dspace-rdf module can be reached on the
Internet (depending on how you deployed it).
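
To make the idea of content negotiation more concrete, here is a minimal sketch of choosing an RDF serialization from an HTTP Accept header. It is not the negotiation code used by DSpace; it only considers the three serializations named earlier (RDF/XML, Turtle, N-Triples) with their registered media types, ignores wildcards, and the function name is hypothetical.

```python
from typing import Optional

# Media types of the three serializations dspace-rdf is said to support.
SUPPORTED = {
    "text/turtle": "Turtle",
    "application/rdf+xml": "RDF/XML",
    "application/n-triples": "N-Triples",
}


def pick_serialization(accept_header: str) -> Optional[str]:
    """Return the supported serialization with the highest q-value, if any."""
    best_name, best_q = None, 0.0
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type, q = fields[0].lower(), 1.0  # q defaults to 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if media_type in SUPPORTED and q > best_q:
            best_name, best_q = SUPPORTED[media_type], q
    return best_name


choice = pick_serialization("text/html;q=0.9, application/rdf+xml;q=0.8, text/turtle")
no_rdf = pick_serialization("text/html")
```

A client that asks only for HTML gets no RDF serialization from this sketch; in a real deployment such a request would simply be served the normal web page.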

Property: rdf.public.sparql.endpoint
Example Value: rdf.public.sparql.endpoint = http://${dspace.baseUrl}/sparql
Informational Note: Address of the read-only public SPARQL endpoint supporting SPARQL 1.1 Query Language.

Property: rdf.storage.graphstore.endpoint
Example Value: rdf.storage.graphstore.endpoint = http://localhost:3030/dspace/data
Informational Note: Address of a writable SPARQL 1.1 Graph Store HTTP Protocol endpoint. This address is used to
create, update and delete converted data in the triple store. If you use Fuseki with the configuration provided as
part of DSpace 5, you can leave this as it is. If you use another Triple Store or configure Fuseki on your
own, change this property to point to a writeable SPARQL endpoint supporting the SPARQL 1.1 Graph
Store HTTP Protocol.

Property: rdf.storage.graphstore.authentication
Example Value: rdf.storage.graphstore.authentication = no
Informational Note: Defines whether to use HTTP Basic authentication to connect to the writable SPARQL 1.1 Graph Store
HTTP Protocol endpoint.

Properties: rdf.storage.graphstore.login
            rdf.storage.graphstore.password
Example Values: rdf.storage.graphstore.login = dspace
                rdf.storage.graphstore.password = ecapsd
Informational Note: Credentials for the HTTP Basic authentication if it is necessary to connect to the writable
SPARQL 1.1 Graph Store HTTP Protocol endpoint.

Property: rdf.storage.sparql.endpoint
Example Value: rdf.storage.sparql.endpoint = http://localhost:3030/dspace/sparql
Informational Note: Besides a writable SPARQL 1.1 Graph Store HTTP Protocol endpoint, DSpace needs a SPARQL 1.1 Query
Language endpoint, which can be read-only. This property allows you to set an address to be used to
connect to such a SPARQL endpoint. If you leave this property empty the property
${rdf.public.sparql.endpoint} will be used instead.

Properties: rdf.storage.sparql.authentication
            rdf.storage.sparql.login
            rdf.storage.sparql.password
Example Values: rdf.storage.sparql.authentication = yes
                rdf.storage.sparql.login = dspace
                rdf.storage.sparql.password = ecapsd
Informational Note: As for the SPARQL 1.1 Graph Store HTTP Protocol you can configure DSpace to use HTTP Basic
authentication to authenticate against the (read-only) SPARQL 1.1 Query Language endpoint.

Property: rdf.converter.DSOtypes
Example Value: rdf.converter.DSOtypes = SITE, COMMUNITY, COLLECTION, ITEM
Informational Note: Define which kind of DSpaceObjects should be converted. Bundles and Bitstreams will be converted
as part of the Item they belong to. Don't add EPersons here unless you really know what you are doing. All
converted data is stored in the triple store that provides a publicly readable SPARQL endpoint. So all
data converted into RDF is exposed publicly. Every DSO type you add here must have an HTTP URI to be
referenced in the generated RDF, which is another reason not to add EPersons here currently.

The following properties configure the StaticDSOConverterPlugin.

Properties: rdf.constant.data.GENERAL
            rdf.constant.data.COLLECTION
            rdf.constant.data.COMMUNITY
            rdf.constant.data.ITEM
            rdf.constant.data.SITE
Example Values: rdf.constant.data.GENERAL = ${dspace.dir}/config/modules/rdf/constant-data-general.ttl
                rdf.constant.data.COLLECTION = ${dspace.dir}/config/modules/rdf/constant-data-collection.ttl
                rdf.constant.data.COMMUNITY = ${dspace.dir}/config/modules/rdf/constant-data-community.ttl
                rdf.constant.data.ITEM = ${dspace.dir}/config/modules/rdf/constant-data-item.ttl
                rdf.constant.data.SITE = ${dspace.dir}/config/modules/rdf/constant-data-site.ttl
Informational Note: These properties define files to read static data from. These data should be in RDF, and by
default Turtle is used as serialization. The data in the file referenced by the property
${rdf.constant.data.GENERAL} will be included in every Entity that is converted to RDF. E.g. it can be used to point
to the address of the public readable SPARQL endpoint or may contain the name of the institution running DSpace.
The other properties define files that will be included if a DSpace Object of the specified type (collection,
community, item or site) is converted. This makes it possible to add static content to every Item, every
Collection, ...

The following properties configure the MetadataConverterPlugin.


Property: rdf.metadata.mappings
Example Value: rdf.metadata.mappings = ${dspace.dir}/config/modules/rdf/metadata-rdf-mapping.ttl
Informational Note: Defines the file that contains the mappings for the MetadataConverterPlugin. See below the
description of the configuration file [dspace-source]/dspace/config/modules/rdf/metadata-rdf-mapping.ttl.

Property: rdf.metadata.schema
Example Value: rdf.metadata.schema = file://${dspace.dir}/config/modules/rdf/metadata-rdf-schema.ttl
Informational Note: Configures the URL used to load the RDF Schema of the DSpace Metadata RDF mapping Vocabulary.
Using a file:// URI makes it possible to convert DSpace content without having an internet connection.
The version of the schema has to be the right one for the used code. In DSpace 5.0 we use the version
0.2.0. This Schema can be found here as well:
http://digital-repositories.org/ontologies/dspace-metadata-mapping/0.2.0. The newest version of the Schema can be
found here: http://digital-repositories.org/ontologies/dspace-metadata-mapping/ [205].

Property: rdf.metadata.prefixes
Example Value: rdf.metadata.prefixes = ${dspace.dir}/config/modules/rdf/metadata-prefixes.ttl
Informational Note: If you want to use prefixes in RDF serializations that support prefixes, you can define these
prefixes in the file referenced by this property.

205 http://digital-repositories.org/ontologies/dspace-metadata-mapping/0.2.0

The following properties configure the SimpleDSORelationsConverterPlugin.

Property: rdf.simplerelations.prefixes
Example Value: rdf.simplerelations.prefixes = ${dspace.dir}/config/modules/rdf/simple-relations-prefixes.ttl
Informational Note: If you want to use prefixes in RDF serializations that support prefixes, you can define these
prefixes in the file referenced by this property.

Property: rdf.simplerelations.site2community
Example Value: rdf.simplerelations.site2community = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasCommunity
Informational Note: Defines the predicates used to link from the data representing the whole repository to the top
level communities. Defining multiple predicates separated by commas will result in multiple triples.

Property: rdf.simplerelations.community2site
Example Value: rdf.simplerelations.community2site = http://purl.org/dc/terms/isPartOf, http://digital-repositories.org/ontologies/dspace/0.1.0#isPartOfRepository
Informational Note: Defines the predicates used to link from the top level communities to the data representing the
whole repository. Defining multiple predicates separated by commas will result in multiple triples.

Property: rdf.simplerelations.community2subcommunity
Example Value: rdf.simplerelations.community2subcommunity = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasSubcommunity
Informational Note: Defines the predicates used to link from communities to their subcommunities. Defining multiple
predicates separated by commas will result in multiple triples.

Property: rdf.simplerelations.subcommunity2community
Example Value: rdf.simplerelations.subcommunity2community = http://purl.org/dc/terms/isPartOf, http://digital-repositories.org/ontologies/dspace/0.1.0#isSubcommunityOf
Informational Note: Defines the predicates used to link from subcommunities to the communities they belong to.
Defining multiple predicates separated by commas will result in multiple triples.

Property: rdf.simplerelations.community2collection
Example Value: rdf.simplerelations.community2collection = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasCollection
Informational Note: Defines the predicates used to link from communities to their collections. Defining multiple
predicates separated by commas will result in multiple triples.

Property: rdf.simplerelations.collection2community
Example Value: rdf.simplerelations.collection2community = http://purl.org/dc/terms/isPartOf, http://digital-repositories.org/ontologies/dspace/0.1.0#isPartOfCommunity
Informational Note: Defines the predicates used to link from collections to the communities they belong to. Defining
multiple predicates separated by commas will result in multiple triples.

Property: rdf.simplerelations.collection2item
Example Value: rdf.simplerelations.collection2item = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasItem
Informational Note: Defines the predicates used to link from collections to their items. Defining multiple predicates
separated by commas will result in multiple triples.

Property: rdf.simplerelations.item2collection
Example Value: rdf.simplerelations.item2collection = http://purl.org/dc/terms/isPartOf, http://digital-repositories.org/ontologies/dspace/0.1.0#isPartOfCollection
Informational Note: Defines the predicates used to link from items to the collections they belong to. Defining
multiple predicates separated by commas will result in multiple triples.

Property: rdf.simplerelations.item2bitstream
Example Value: rdf.simplerelations.item2bitstream = http://purl.org/dc/terms/hasPart, http://digital-repositories.org/ontologies/dspace/0.1.0#hasBitstream
Informational Note: Defines the predicates used to link from items to their bitstreams. Defining multiple predicates
separated by commas will result in multiple triples.
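
The effect of listing several predicates in one of the rdf.simplerelations.* properties can be sketched as follows. This is a simplified illustration, not the plugin's code: the function name is hypothetical, the property value is the collection2item example from this section, and the two Handle-based URIs are invented.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def relation_triples(subject: str, predicates_value: str, obj: str) -> List[Triple]:
    """Create one (subject, predicate, object) triple per configured predicate."""
    predicates = [p.strip() for p in predicates_value.split(",") if p.strip()]
    return [(subject, predicate, obj) for predicate in predicates]


# Value of the rdf.simplerelations.collection2item example.
COLLECTION2ITEM = (
    "http://purl.org/dc/terms/hasPart, "
    "http://digital-repositories.org/ontologies/dspace/0.1.0#hasItem"
)

# Hypothetical Handle-based URIs for a collection and one of its items.
triples = relation_triples(
    "http://hdl.handle.net/123456789/2",
    COLLECTION2ITEM,
    "http://hdl.handle.net/123456789/3",
)
```

Two comma-separated predicates therefore yield two triples that share the same subject and object.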

[dspace-source]/dspace/config/modules/rdf/constant-data-*.ttl
As described in the documentation of the configuration file [dspace-source]/dspace/config/modules/rdf.cfg, the
constant-data-*.ttl files can be used to add static RDF to the converted data. The data are written in Turtle, but if
you change the file suffix (and the path to the files in rdf.cfg) you can use any other RDF serialization you like.
You can use this, for example, to add a link to the public readable SPARQL endpoint, add a link to the repository
homepage, or add a triple to every community or collection defining it as an entity of a specific type like a
bibo:collection. The content of the file [dspace-source]/dspace/config/modules/rdf/constant-data-general.ttl will
be added to every DSpaceObject that is converted. The content of the file
[dspace-source]/dspace/config/modules/rdf/constant-data-community.ttl is added to every community, the content of the
file [dspace-source]/dspace/config/modules/rdf/constant-data-collection.ttl to every collection, and the content of
the file [dspace-source]/dspace/config/modules/rdf/constant-data-item.ttl to every Item. You can use the file
[dspace-source]/dspace/config/modules/rdf/constant-data-site.ttl to specify data representing the whole repository.
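
The rule for which static files apply to which kind of object can be summarised in a few lines. This is a sketch of the rule as described above, using the default file names; the function and the lookup table are hypothetical and do not mirror the StaticDSOConverterPlugin implementation.

```python
from typing import List

# Default file names of the type-specific constant-data files.
TYPE_FILES = {
    "SITE": "constant-data-site.ttl",
    "COMMUNITY": "constant-data-community.ttl",
    "COLLECTION": "constant-data-collection.ttl",
    "ITEM": "constant-data-item.ttl",
}


def constant_data_files(dso_type: str) -> List[str]:
    """List the static RDF files applied to a converted object of the given type."""
    files = ["constant-data-general.ttl"]  # added to every converted object
    specific = TYPE_FILES.get(dso_type.upper())
    if specific is not None:
        files.append(specific)
    return files


item_files = constant_data_files("item")
site_files = constant_data_files("SITE")
```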

[dspace-source]/dspace/config/modules/rdf/metadata-rdf-mapping.ttl
This file should contain several metadata mappings. A metadata mapping defines how to map a specific metadata
field within DSpace to a triple that will be added to the converted data. The MetadataConverterPlugin uses these
metadata mappings to convert the metadata of a item into RDF. For every metadata field and value it looks if any of

207 http://purl.org/dc/terms/isPartOf,
208 http://purl.org/dc/terms/hasPart,%5C%22%20data-mce-href=

Using DSpace – 174

DSpace 7.x Documentation – DSpace 7.x Documentation

the specified mappings matches. If one does, the plugin creates the specified triple and adds it to the converted
data. In the file you'll find a lot of examples on how to define such a mapping.
For every mapping a metadata field name has to be specified, e.g. dc.title, dc.identifier.uri. In addition you can
specify a condition that is matched against the field's value. The condition is specified as a regular expression (using
the syntax of the java class java.util.regex.Pattern). If a condition is defined, the mapping will be used only on fields
those values which are matched by the regex defined as condition.
The triple created by a mapping is specified using reified RDF statements. The DSpace Metadata RDF Mapping
Vocabulary209 defines some placeholders that can be used. The most important placeholder is dm:DSpaceObjectIRI,
which is replaced by the URI used to identify the entity being converted to RDF. That means if a specific Item is
converted, the URI used to address this Item in RDF will be used instead of dm:DSpaceObjectIRI. There are three
placeholders that allow reuse of the value of a metadata field. dm:DSpaceValue will be replaced by the value as it is.
dm:LiteralGenerator allows one to specify a regex and a replacement string for it (see the syntax of the Java classes
java.util.regex.Pattern and java.util.regex.Matcher) and creates a Literal out of the field value using the regex and
the replacement string. dm:ResourceGenerator does the same as dm:LiteralGenerator, but it generates an HTTP(S)
URI that is used in its place. So you can use the resource generator to generate URIs containing modified field values
(e.g. to link to classifications). If you know regular expressions and Turtle, the syntax should be quite
self-explanatory.
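
The interplay of a condition, a regex and a replacement string can be illustrated with a minimal sketch. It is written in Python rather than Java, so the replacement syntax differs (Python uses \1 where java.util.regex.Matcher uses $1), and it is not the code DSpace runs. The function name generate_value, the sample field value and the example.org URI are hypothetical, and the sketch assumes that a condition has to match the entire field value.

```python
import re
from typing import Optional


def generate_value(value: str, condition: Optional[str],
                   pattern: str, replacement: str) -> Optional[str]:
    """Return the rewritten field value, or None if the condition rejects it."""
    # Assumption: the condition has to match the entire field value.
    if condition is not None and re.fullmatch(condition, value) is None:
        return None
    # Apply the regex and the replacement string to the field value.
    return re.sub(pattern, replacement, value)


# Hypothetical field value and patterns, for illustration only.
ddc = r"ddc:(\d+)"
print(generate_value("ddc:004", ddc, ddc, r"\1"))
print(generate_value("ddc:004", ddc, ddc, r"http://example.org/ddc/\1"))
print(generate_value("free text", ddc, ddc, r"\1"))
```

The first call mimics a literal generator and yields 004, the second mimics a resource generator and yields http://example.org/ddc/004, and the third yields None because the condition does not match, so no triple would be created for that value.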

[dspace-source]/dspace/config/modules/rdf/fuseki-assembler.ttl
This is a configuration for the triple store Fuseki of the Apache Jena project. You can find more information on the
configuration it provides in the section Install a Triple Store (see page 164) above.

[dspace-source]/dspace/config/spring/api/rdf.xml
This file defines which classes are loaded by DSpace to provide the RDF functionality. There are two things you
might want to change: the class that is responsible for generating the URIs used within the converted data, and
the list of plugins used during conversion. To change the class responsible for the URIs, change the following line:

<property name="generator" ref="org.dspace.rdf.storage.LocalURIGenerator"/>

This line defines how the URIs used within the converted data are generated. The LocalURIGenerator
generates URIs using the ${dspace.url} property. The HandleURIGenerator uses handles in the form of HTTP URLs. It
uses the property ${handle.canonical.prefix} to convert handles into HTTP URLs. The class
org.dspace.rdf.storage.DOIURIGenerator uses DOIs in the form of HTTP URLs if possible, or local URIs if there are no
DOIs. It uses the DOI resolver "http://dx.doi.org" to convert DOIs into HTTP URLs. The class
org.dspace.rdf.storage.DOIHandleGenerator does the same but uses Handles as a fallback if no DOI exists. The
fallbacks are necessary because DOIs are currently used for Items only and not for Communities or Collections.
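
The fallback behaviour of the two DOI-based generators can be pictured with a short sketch. It is an illustration in Python, not the DSpace implementation: the function names, the sample DOI, the sample handle and the handle prefix value are all hypothetical, and only the resolver address is taken from the description above.

```python
from typing import Optional

DOI_RESOLVER = "http://dx.doi.org"  # resolver named in the description above


def doi_or_local_uri(doi: Optional[str], local_uri: str) -> str:
    """Prefer a DOI-based URL; fall back to the local URI when no DOI exists."""
    if doi:
        return f"{DOI_RESOLVER}/{doi}"
    return local_uri


def doi_or_handle_uri(doi: Optional[str], handle: str, handle_prefix: str) -> str:
    """Prefer a DOI-based URL; fall back to a handle URL when no DOI exists."""
    if doi:
        return f"{DOI_RESOLVER}/{doi}"
    return handle_prefix.rstrip("/") + "/" + handle


# Hypothetical values: an Item with a DOI and a Collection without one.
print(doi_or_handle_uri("10.1234/abcd", "123456789/42", "http://hdl.handle.net/"))
print(doi_or_handle_uri(None, "123456789/7", "http://hdl.handle.net/"))
```

The second call shows why the fallback matters: a Collection has no DOI, so it is addressed through its handle instead.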
All plugins that are instantiated within the configuration file will automatically be used during the conversion. By
default the list looks like the following:

209 http://digital-repositories.org/ontologies/dspace-metadata-mapping/


<bean id="org.dspace.rdf.conversion.SimpleDSORelationsConverterPlugin" class="org.dspace.rdf.conversion.SimpleDSORelationsConverterPlugin"/>
<bean id="org.dspace.rdf.conversion.MetadataConverterPlugin" class="org.dspace.rdf.conversion.MetadataConverterPlugin"/>
<bean id="org.dspace.rdf.conversion.StaticDSOConverterPlugin" class="org.dspace.rdf.conversion.StaticDSOConverterPlugin"/>

You can remove plugins if you don't want them. If you develop a new conversion plugin, add its class to
this list.

Maintenance
As described above (see page 164), you should add rdf to the property event.dispatcher.default.consumers
in dspace.cfg. This configures DSpace to automatically update the triple store every time the publicly available
content of the repository is changed. Nevertheless, there is a command line tool that lets you update the
content of the triple store manually. As the triple store is used as a cache only, you can delete its content and
reindex it whenever you think it is necessary or helpful. The command line tool can be started with the following
command, which shows its online help:

[dspace-install]/bin/dspace rdfizer --help

The online help should give you all necessary information. There are commands to delete one specific entity; to
delete all information stored in the triple store; to convert one item, one collection or community (including all
subcommunities, collections and items) or to convert the complete content of your repository. If you start using the
Linked Open Data support on a repository that already contains content, you should run [dspace-install]/
bin/dspace rdfizer --convert-all once.
Every time content of DSpace is converted or Linked Data is requested, DSpace will try to connect to the triple store.
So ensure that it is running (as you do with e.g. your servlet container or relational database).

4.4.2 SWORDv1 Client

The embedded SWORD Client allows a user (currently restricted to an administrator) to copy an item to a SWORD
server. This allows your DSpace installation to deposit items into another SWORD-compliant repository (including
another DSpace install).

 DSpace 7.0 does not yet support

The SWORDv1 Client is not available in DSpace 7.0. It may be restored in a later 7.x release, see DSpace
Release 7.0 Status210

• Enabling the SWORD Client(see page 177)

• Configuring the SWORD Client(see page 177)

210 https://wiki.lyrasis.org/display/DSPACE/DSpace+Release+7.0+Status


4.4.2.1 Enabling the SWORD Client

The SWORDv1 Client is not available in DSpace 7.0. It may be restored in a later 7.x release, see DSpace Release 7.0
Status211

4.4.2.2 Configuring the SWORD Client

All the relevant configuration can be found in sword-client.cfg. These may be overridden in your local.cfg
config (see Configuration Reference(see page 552)).

211 https://wiki.lyrasis.org/display/DSPACE/DSpace+Release+7.0+Status


Configuration File: [dspace]/config/modules/sword-client.cfg

Property: sword-client.targets

Example value:
sword-client.targets = http://localhost:8080/sword/servicedocument, \
    http://client.swordapp.org/client/servicedocument, \
    http://dspace.swordapp.org/sword/servicedocument, \
    http://sword.eprints.org/sword-app/servicedocument, \
    http://sword.intralibrary.com/IntraLibrary-Deposit/service, \
    http://fedora.swordapp.org/sword-fedora/servicedocument

Informational note: List of remote SWORD servers. Used to build the drop-down list of
selectable SWORD targets.

Property: sword-client.file-types

Example value: sword-client.file-types = application/zip

Informational note: List of file types from which the user can select. If a type is not supported
by the remote server it will not appear in the drop-down list.

Property: sword-client.package-formats

Example value:
sword-client.package-formats = http://purl.org/net/sword-types/METSDSpaceSIP

Informational note: List of package formats from which the user can select. If a format is not
supported by the remote server it will not appear in the drop-down list.
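
Since sword-client.targets holds a comma-separated list that may be continued across lines with a trailing backslash, the following sketch shows how such a value breaks down into individual entries. It is a simplified reading of the properties syntax written for illustration, not DSpace's configuration loader, and the helper name parse_list_property is hypothetical.

```python
from typing import List


def parse_list_property(text: str, key: str) -> List[str]:
    """Return the comma-separated values of `key`, honouring backslash continuations."""
    logical, buffer = [], ""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.endswith("\\"):
            # A trailing backslash continues the value on the next line.
            buffer += stripped[:-1]
        else:
            logical.append(buffer + stripped)
            buffer = ""
    if buffer:
        logical.append(buffer)
    for entry in logical:
        if entry.startswith("#") or "=" not in entry:
            continue
        name, _, value = entry.partition("=")
        if name.strip() == key:
            return [item.strip() for item in value.split(",") if item.strip()]
    return []


sample = """
sword-client.targets = http://localhost:8080/sword/servicedocument, \\
    http://dspace.swordapp.org/sword/servicedocument
sword-client.file-types = application/zip
"""
print(parse_list_property(sample, "sword-client.targets"))
```

For the sample above, the function returns the two target URLs as separate list entries, which corresponds to two entries in the drop-down list.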

4.4.3 Exchanging Content Between Repositories

• Transferring Content via Export and Import(see page 179)
• Transferring Communities, Collections, or Items using Packages(see page 179)
• Transferring Items using Simple Archive Format(see page 179)


• Transferring Items using OAI-ORE/OAI-PMH Harvester(see page 179)

4.4.3.1 Transferring Content via Export and Import

To migrate content from one DSpace to another, you can export content from the Source DSpace and import it into
the Destination DSpace.

Transferring Communities, Collections, or Items using Packages

You may transfer any DSpace content (Communities, Collections or Items) from one DSpace to another by utilizing
the AIP Backup and Restore(see page 411) tool. This tool allows you to export content into a series of Archival
Information Packages (AIPs). These AIPs can be used to restore content (from a backup) or move/migrate content to
another DSpace installation.
For more information see AIP Backup and Restore(see page 411).

4.4.3.2 Transferring Items using Simple Archive Format

Where items are to be moved between DSpace instances (for example from a test DSpace into a production DSpace)
the Item Exporter and Item Importer(see page 233) can be used.
First, you should export the DSpace Item(s) into the Simple Archive Format, as detailed at: Importing and Exporting
Items via Simple Archive Format(see page 233). Be sure to use the --migrate option, which removes fields that would
be duplicated on import. Then import the resulting files into the other instance.

4.4.3.3 Transferring Items using OAI-ORE/OAI-PMH Harvester

 OAI Harvesting is not available in DSpace 7.0. It is scheduled to be restored in a later 7.x release (currently
7.1), see DSpace Release 7.0 Status212

You may also choose to enable the OAI-ORE Harvester. This harvester allows one DSpace installation to
harvest Items (via OAI-ORE) from another DSpace installation (or any other system supporting OAI-ORE). Items are
harvested from a remote DSpace Collection into a local DSpace Collection. Harvesting can also be scheduled to run
automatically (or on demand).
See OAI - Harvesting from another DSpace (see page 181).

4.4.4 OAI

4.4.4.1 OAI Interfaces

• OAI-PMH Server (see page 180)
    • OAI-PMH Server Activation (see page 180)
    • OAI-PMH Server Maintenance (see page 180)
• OAI-PMH / OAI-ORE Harvester (Client) (see page 181)
    • Harvesting from another DSpace (see page 181)
    • OAI-PMH / OAI-ORE Harvester Configuration (see page 181)

212 https://wiki.lyrasis.org/display/DSPACE/DSpace+Release+7.0+Status

OAI-PMH Server
In the following sections and subpages, you will learn how to configure the OAI-PMH server and activate additional
OAI-PMH crosswalks. See OAI-PMH Data Provider (see page 654) for a more detailed description of the
program.
The OAI-PMH Interface may be used by other systems to harvest metadata records from your DSpace.

OAI-PMH Server Activation

DSpace's OAI-PMH server is enabled by default. However, you can choose to enable/disable it in your local.cfg
using these configurations:

# Enable (true) or disable (false) OAI-PMH server

oai.enabled = true

# When enabled, OAI-PMH server is available at this path
oai.path = oai

If you modify either of these configurations, you must restart your Servlet Container (usually Tomcat).
• You can test that it is working by sending a request to: [dspace.server.url]/[oai.path]/request?verb=Identify
(e.g. http://localhost:8080/server/oai/request?verb=Identify)
• The response should look similar to the response from the DSpace Demo Server:
http://demo.dspace.org/oai/request?verb=Identify
If you're using a recent browser, you should see an HTML page describing your repository. What you're getting from
the server is in fact an XML file with a link to an XSLT stylesheet that renders this HTML in your browser (client-side).
Any browser that cannot interpret XSLT will display pure XML. The default stylesheet is located in
[dspace-source]/dspace-oai/src/main/resources/static/style.xsl and can be changed by configuring the
stylesheet attribute of the Configuration element in [dspace]/config/crosswalks/oai/xoai.xml.
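
The Identify check described above can also be scripted. The following Python sketch builds the request URL from the two configuration values and reads the repository name out of an Identify response. The function names and the canned sample response are hypothetical. The XML namespace is the one defined by the OAI-PMH 2.0 protocol. No network request is made here, so fetching the URL from a running server is left to the reader.

```python
import xml.etree.ElementTree as ET
from typing import Optional

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def identify_url(server_url: str, oai_path: str = "oai") -> str:
    """Build [dspace.server.url]/[oai.path]/request?verb=Identify."""
    return f"{server_url.rstrip('/')}/{oai_path.strip('/')}/request?verb=Identify"


def repository_name(identify_xml: str) -> Optional[str]:
    """Extract repositoryName from an OAI-PMH Identify response."""
    root = ET.fromstring(identify_xml)
    return root.findtext("oai:Identify/oai:repositoryName", namespaces=OAI_NS)


# Canned, shortened response for illustration; a real server returns more elements.
sample_response = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <request verb="Identify">http://localhost:8080/server/oai/request</request>
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>
"""

print(identify_url("http://localhost:8080/server"))
print(repository_name(sample_response))
```

With the default oai.path, the first call reproduces the example URL given above. A missing Identify element makes repository_name return None, which is a simple signal that the server did not answer as expected.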

 Relevant Links
• OAI 2.0 Server(see page 189) - basic information needed to configure and use the OAI Server in DSpace
• OAI-PMH Data Provider 2.0 (Internals)(see page 186) - information on how it's implemented
• http://www.openarchives.org/pmh/ - information on the OAI-PMH protocol and its usage (not
DSpace-specific)

OAI-PMH Server Maintenance

After activating the OAI-PMH server, you also need to ensure its index is updated on a regular basis. Currently, this
doesn't happen automatically within DSpace. Instead, you must schedule the [dspace.dir]/bin/dspace oai import
commandline tool to run on a regular basis (usually at least nightly, but you could schedule it more
frequently).
Here's an example cron entry that can be used to schedule an OAI-PMH reindex on a nightly basis (for a full list of
recommended DSpace cron tasks see Scheduled Tasks via Cron213):

213 https://wiki.lyrasis.org/display/DSDOC5x/Scheduled+Tasks+via+Cron


# Update the OAI-PMH index with the newest content (and re-optimize that index) at midnight every day
# NOTE: ONLY NECESSARY IF YOU ARE RUNNING OAI-PMH
# (This ensures new content is available via OAI-PMH and ensures the OAI-PMH index is optimized for better performance)
0 0 * * * [dspace.dir]/bin/dspace oai import -o > /dev/null

More information about the dspace oai commandline tool can be found in the OAI Manager(see page 191)
documentation.

OAI-PMH / OAI-ORE Harvester (Client)

This section describes the parameters used in configuring the OAI-PMH / OAI-ORE harvester. This harvester can be
used to harvest content (bitstreams and metadata) into DSpace from an external OAI-PMH or OAI-ORE server.

 DSpace 7.0 does not yet support

OAI Harvesting is not available in DSpace 7.0. It is scheduled to be restored in a later 7.x release (currently
7.1), see DSpace Release 7.0 Status214

Harvesting from another DSpace

If you are harvesting content (bitstreams and metadata) from an external DSpace installation via OAI-PMH & OAI-
ORE, you first should verify that the external DSpace installation allows for OAI-ORE harvesting.
If the external DSpace is running v6.x or below, it must be running both the OAI-PMH interface and the XMLUI
interface to support harvesting content from it via OAI-ORE.
If the external DSpace is running v7.x or above, it just needs to be running the OAI-PMH interface.
You can verify that the OAI-ORE harvesting option is enabled by following these steps:
1. First, check to see if the external DSpace reports that it will support harvesting ORE via the OAI-PMH
interface. Send the following request to the DSpace's OAI-PMH interface:
http://[full-URL-to-OAI-PMH]/request?verb=ListRecords&metadataPrefix=ore
• The response should be an XML document containing ORE, similar to the response from the DSpace
Demo Server: http://demo.dspace.org/oai/request?verb=ListRecords&metadataPrefix=ore
2. For 6.x or below, you can verify that the XMLUI interface supports OAI-ORE (it should, as long as it's a current
version of DSpace). First, find a valid Item Handle. Then, send the following request to the DSpace's XMLUI
interface: http://[full-URL-to-XMLUI]/metadata/handle/[item-handle]/ore.xml
• The response should be an OAI-ORE (XML) document which describes that specific Item. It should
look similar to the response from the DSpace Demo Server:
http://demo.dspace.org/xmlui/metadata/handle/10673/3/ore.xml
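
The two verification requests from the steps above can be assembled as follows. This is a small Python sketch with hypothetical function names. It only builds the URLs and leaves sending the requests and inspecting the responses to the reader.

```python
from urllib.parse import urlencode


def ore_list_records_url(oai_base: str) -> str:
    """Step 1: ask the OAI-PMH interface for records in the ORE format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": "ore"})
    return f"{oai_base.rstrip('/')}/request?{query}"


def xmlui_ore_url(xmlui_base: str, item_handle: str) -> str:
    """Step 2 (6.x or below): ask the XMLUI for the ORE document of one Item."""
    return f"{xmlui_base.rstrip('/')}/metadata/handle/{item_handle}/ore.xml"


# Base URLs and handle taken from the DSpace Demo Server examples above.
print(ore_list_records_url("http://demo.dspace.org/oai"))
print(xmlui_ore_url("http://demo.dspace.org/xmlui", "10673/3"))
```

Both calls reproduce the Demo Server URLs quoted in the steps, so the same helpers can be pointed at any external DSpace by swapping the base URL and handle.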

OAI-PMH / OAI-ORE Harvester Configuration

There are many possible configuration options for the OAI harvester. Most of these are contained in the [dspace]/
config/modules/oai.cfg file (unless otherwise noted below). They may be updated there or overridden in your
local.cfg config file (see Configuration Reference(see page 552)).

Configuration File: [dspace]/config/modules/oai.cfg

214 https://wiki.lyrasis.org/display/DSPACE/DSpace+Release+7.0+Status


Property: oai.harvester.eperson

Example Value: oai.harvester.eperson = [email protected]

Informational Note: The EPerson under whose authorization automatic harvesting will be
performed. This field does not have a default value and must be specified in
order to use the harvest scheduling system. This will most likely be the
DSpace admin account created during installation.

Property: oai.url

Example Value: oai.url = ${dspace.server.url}/${oai.path}

Informational Note: The base URL of the OAI-PMH disseminator webapp (i.e. do not include the
/request on the end). This is necessary in order to mint URIs for ORE
Resource Maps. The default value of ${dspace.server.url}/${oai.path} will work
for a typical installation, but should be changed if appropriate. Please note
that dspace.server.url is defined in your dspace.cfg configuration file.

Property: oai.ore.authoritative.source

Example Value: oai.ore.authoritative.source = oai

Informational Note: The webapp responsible for minting the URIs for ORE Resource Maps. If
using oai, the oai.url config value must be set.

• When set to 'oai', all URIs in ORE Resource Maps will be relative to the
OAI-PMH URL (configured by oai.url above)