gbhl/bhl-us

Prerequisites

  • SQL Express 2014 or later
  • Visual Studio 2017 or later
  • Docker (Optional) - Used to host ElasticSearch and RabbitMQ, which enable full-text search.

Notes

  • The folder holding the bhl-us source code is referred to throughout this document as <BHLRoot>.
  • These instructions assume that the databases will be named "BHL", "BHLImport", "BHLAuditArchive", and "IAAnalysis".

Setup

After downloading the bhl-us source code, do the following to get the web sites and utility applications running.

Database Creation

  1. Open a Windows command prompt.

  2. Make sure that the sqlcmd utility, which is part of the SQL Server client tools, is included in your path. More information can be found at http://technet.microsoft.com/en-us/library/ms162773.aspx.

  3. Navigate to the <BHLRoot>\Database-BHL folder.

  4. Run the BHLDBBuildScript.bat batch file. This will build the primary database.

    Usage:

    BHLDBBuildScript SERVERNAME DATABASENAME FULLTEXTCATALOGFILEPATH ISPRODUCTION DATAORSTRUCTURE

    where

    SERVERNAME is the name of the database server.
    DATABASENAME is the name of the database. It is recommended that "BHL" be used as the database name.
    FULLTEXTCATALOGFILEPATH is the path in which to place the full-text catalog file. Use quotes around this value if the path contains spaces.
    ISPRODUCTION is true for a production database, and false for a development database. Auditing triggers are removed from development databases.
    DATAORSTRUCTURE is "structure" to build the empty database (no data), "data" to add data to an existing database, or "all" to build the structure and add the data.

    Example:

    BHLDBBuildScript localhost BHL "C:\Program Files\Microsoft SQL Server\MSSQL11.SQLEXPRESS\MSSQL\DATA" false all

  5. In the new BHL database, create roles named db_executor and db_webuser.

    USE [BHL];
    CREATE ROLE db_executor;
    GRANT EXECUTE TO db_executor;

    CREATE ROLE db_webuser;
    GRANT INSERT ON dbo.PDF TO db_webuser;
    GRANT INSERT ON dbo.PDFPage TO db_webuser;
    GRANT UPDATE ON dbo.Page TO db_webuser;

  6. Navigate to the <BHLRoot>\Database-BHLImport folder.

  7. Run the BHLImportDBBuildScript.bat batch file. This will build the database used as a staging area for new material.

    Usage:

    BHLImportDBBuildScript SERVERNAME DATABASENAME DATAORSTRUCTURE

    where

    SERVERNAME is the name of the database server.
    DATABASENAME is the name of the database. It is recommended that "BHLImport" be used as the database name.
    DATAORSTRUCTURE is "structure" to build the empty database (no data), "data" to add data to an existing database, or "all" to build the structure and add the data.

    Example:

    BHLImportDBBuildScript localhost BHLImport all

  8. In the new BHLImport database, create a role named db_webuser.

    USE [BHLImport];
    CREATE ROLE db_webuser;
    GRANT SELECT ON dbo.IAFile TO db_webuser;
    GRANT SELECT ON dbo.IAItem TO db_webuser;

  9. Navigate to the <BHLRoot>\Database-BHLAuditArchive folder.

  10. Run the BHLAuditArchiveDBBuildScript.bat batch file. This will build the auditing database.

    Usage:

    BHLAuditArchiveDBBuildScript SERVERNAME DATABASENAME

    where

    SERVERNAME is the name of the database server.
    DATABASENAME is the name of the database. It is recommended that "BHLAuditArchive" be used as the database name.

    Example:

    BHLAuditArchiveDBBuildScript localhost BHLAuditArchive

  11. Navigate to the <BHLRoot>\Database-IAAnalysis folder.

  12. Run the IAAnalysisDBBuildScript.bat batch file. This will build the database used to ingest non-biodiversity-collection items from Internet Archive.

    Usage:

    IAAnalysisDBBuildScript SERVERNAME DATABASENAME DATAORSTRUCTURE

    where

    SERVERNAME is the name of the database server.
    DATABASENAME is the name of the database. It is recommended that "IAAnalysis" be used as the database name.
    DATAORSTRUCTURE is "structure" to build the empty database (no data), "data" to add data to an existing database, or "all" to build the structure and add the data.

    Example:

    IAAnalysisDBBuildScript localhost IAAnalysis all

  13. Create a new SQL Server login named BHLWebUser. Map it to a user named BHLWebUser in the new BHL database, and assign it to the "db_executor" and "db_webuser" database roles.

    USE [master];
    CREATE LOGIN [BHLWebUser] WITH PASSWORD=N'BHLWebUser';

    USE [BHL];
    CREATE USER [BHLWebUser] FOR LOGIN [BHLWebUser] WITH DEFAULT_SCHEMA=[dbo];
    ALTER ROLE [db_executor] ADD MEMBER [BHLWebUser];
    ALTER ROLE [db_webuser] ADD MEMBER [BHLWebUser];

  14. Map the BHLWebUser login to a user named BHLWebUser in the new BHLImport database, and assign it to the "db_webuser" database role.

    USE [BHLImport];
    CREATE USER [BHLWebUser] FOR LOGIN [BHLWebUser] WITH DEFAULT_SCHEMA=[dbo];
    ALTER ROLE [db_webuser] ADD MEMBER [BHLWebUser];

  15. Create a new SQL Server login named BHLService. Map it to a user named BHLService in the BHL, BHLAuditArchive, BHLImport, and IAAnalysis databases. In each database, assign the new user to the "db_datareader", "db_datawriter", and "db_owner" database roles.

    USE [master];
    CREATE LOGIN [BHLService] WITH PASSWORD=N'BHLService';

    USE [BHL];
    CREATE USER [BHLService] FOR LOGIN [BHLService] WITH DEFAULT_SCHEMA=[dbo];
    ALTER ROLE [db_datareader] ADD MEMBER [BHLService];
    ALTER ROLE [db_datawriter] ADD MEMBER [BHLService];
    ALTER ROLE [db_owner] ADD MEMBER [BHLService];

    USE [BHLAuditArchive];
    CREATE USER [BHLService] FOR LOGIN [BHLService] WITH DEFAULT_SCHEMA=[dbo];
    ALTER ROLE [db_datareader] ADD MEMBER [BHLService];
    ALTER ROLE [db_datawriter] ADD MEMBER [BHLService];
    ALTER ROLE [db_owner] ADD MEMBER [BHLService];

    USE [BHLImport];
    CREATE USER [BHLService] FOR LOGIN [BHLService] WITH DEFAULT_SCHEMA=[dbo];
    ALTER ROLE [db_datareader] ADD MEMBER [BHLService];
    ALTER ROLE [db_datawriter] ADD MEMBER [BHLService];
    ALTER ROLE [db_owner] ADD MEMBER [BHLService];

    USE [IAAnalysis];
    CREATE USER [BHLService] FOR LOGIN [BHLService] WITH DEFAULT_SCHEMA=[dbo];
    ALTER ROLE [db_datareader] ADD MEMBER [BHLService];
    ALTER ROLE [db_datawriter] ADD MEMBER [BHLService];
    ALTER ROLE [db_owner] ADD MEMBER [BHLService];
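The user-mapping statements in steps 13 to 15 repeat the same pattern for each database, so they could be generated rather than typed by hand. The following is a minimal Python sketch of that idea, not part of the bhl-us tooling. The function names are hypothetical, and the database names assume the defaults listed in the Notes section. It only builds the T-SQL text; running it (for example by saving it to a file and passing that file to sqlcmd) is left to the reader.

```python
# Hypothetical helper: builds the user-mapping T-SQL shown in the steps above.
# Database and role names are the defaults used in this document.
SERVICE_DATABASES = ["BHL", "BHLAuditArchive", "BHLImport", "IAAnalysis"]
SERVICE_ROLES = ["db_datareader", "db_datawriter", "db_owner"]


def user_mapping_sql(database, user, roles):
    """Return T-SQL that maps a login to a same-named user and adds it to roles."""
    lines = [
        f"USE [{database}];",
        f"CREATE USER [{user}] FOR LOGIN [{user}] WITH DEFAULT_SCHEMA=[dbo];",
    ]
    lines += [f"ALTER ROLE [{role}] ADD MEMBER [{user}];" for role in roles]
    return "\n".join(lines)


def service_account_sql():
    """Return the BHLService mappings for all four databases, blank-line separated."""
    return "\n\n".join(
        user_mapping_sql(database, "BHLService", SERVICE_ROLES)
        for database in SERVICE_DATABASES
    )


print(service_account_sql())
```

The login itself (CREATE LOGIN) is deliberately left out of the sketch so that the password is never embedded in a generated script.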

ElasticSearch Installation and Configuration (Optional)

The BHL web site uses ElasticSearch to implement many of its search capabilities. However, if ElasticSearch is not present, the web site will fall back to a SQL Server search implementation. The SQL Server search performs basic searches of catalog metadata. Full-text searches are disabled, faceting is not available, and the “search within a book” feature will not work. Therefore, for a full-featured BHL implementation, the instructions in this section should be followed. If a limited search is satisfactory, then this section, the “RabbitMQ” section, and the “Index Data in ElasticSearch” section can be skipped.

  1. (If necessary) Download and install Docker from docker.com. As of January 2019, the product to install is called “Docker Desktop”.

  2. Open a Windows command prompt.

  3. Get the official ElasticSearch image.

    docker pull docker.elastic.co/elasticsearch/elasticsearch:7.9.1

  4. Start a new ElasticSearch docker container that will be accessible at http://localhost:9200.

    docker run -d --name es791 -p 9200:9200 -e "http.host=0.0.0.0" -e "transport.host=127.0.0.1" docker.elastic.co/elasticsearch/elasticsearch:7.9.1

  5. Locate the "elasticsearch.yml" file within the running Docker container. The following command should return something like "/usr/share/elasticsearch/config/elasticsearch.yml".

    docker exec -it es791 find / -name "elasticsearch.yml"

  6. Copy the elasticsearch.yml file from the container to the host.

    docker cp es791:/usr/share/elasticsearch/config/elasticsearch.yml c:\elasticsearch.yml

  7. On the host, use a text editor to disable Xpack security by adding the following line to the elasticsearch.yml file:

    xpack.security.enabled: false

    The file contents should now look something like this:

    cluster.name: "docker-cluster"
    network.host: 0.0.0.0

    # minimum_master_nodes need to be explicitly set when bound on a public IP
    # set to 1 to allow single node clusters
    # Details: elastic/elasticsearch#17288
    discovery.zen.minimum_master_nodes: 1

    xpack.security.enabled: false

  8. Copy the edited elasticsearch.yml file from the host to the running Docker container.

    docker cp c:\elasticsearch.yml es791:/usr/share/elasticsearch/config/elasticsearch.yml

  9. Add the ICU analysis plug-in to ElasticSearch to add better support for Unicode characters, including Asian characters.

    docker exec -it es791 /usr/share/elasticsearch/bin/elasticsearch-plugin install analysis-icu

  10. Stop the running container.

    docker stop es791

  11. Restart the ElasticSearch container. Because the container is started by the name (es791) assigned to it in Step 4, the other arguments specified in Step 4 (-d, -e, -p) are reused automatically.

    docker start es791

  12. Verify the operation of ElasticSearch by opening a browser and navigating to http://localhost:9200. You should get a response that looks something like this (the sample was captured from an older release; with the image pulled in Step 3, "number" should read 7.9.1):

    {
      "name" : "90WsOOT",
      "cluster_name" : "docker-cluster",
      "cluster_uuid" : "Ok4_vavaTuSxn9qrsAGZwA",
      "version" : {
        "number" : "5.4.2",
        "build_hash" : "f9d9b74",
        "build_date" : "2017-02-24T17:26:45.835Z",
        "build_snapshot" : false,
        "lucene_version" : "6.4.1"
      },
      "tagline" : "You Know, for Search"
    }

  13. Create new indexes using curl or a comparable tool.

    curl -X PUT http://localhost:9200/items -d @\ElasticSearch\items.json -H "Content-Type:application/json"
    curl -X PUT http://localhost:9200/catalog -d @\ElasticSearch\catalog.json -H "Content-Type:application/json"
    curl -X PUT http://localhost:9200/authors -d @\ElasticSearch\authors.json -H "Content-Type:application/json"
    curl -X PUT http://localhost:9200/keywords -d @\ElasticSearch\keywords.json -H "Content-Type:application/json"
    curl -X PUT http://localhost:9200/names -d @\ElasticSearch\names.json -H "Content-Type:application/json"
    curl -X PUT http://localhost:9200/pages -d @\ElasticSearch\pages.json -H "Content-Type:application/json"
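Where curl is not available, the index-creation step can be approximated with the Python standard library. The sketch below is an illustration only: the function names are hypothetical, the base URL assumes the container started in Step 4, and the caller supplies the path to each index definition file (items.json, catalog.json, and so on).

```python
import urllib.request

# Index names taken from the curl commands in the step above.
INDEX_NAMES = ["items", "catalog", "authors", "keywords", "names", "pages"]


def build_index_request(index_name, definition, base_url="http://localhost:9200"):
    """Build (but do not send) the PUT request that creates one index.

    `definition` is the raw bytes of the index definition JSON file.
    """
    return urllib.request.Request(
        url=f"{base_url}/{index_name}",
        data=definition,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def create_index(index_name, definition_path, base_url="http://localhost:9200"):
    """Send the request for one index and return the HTTP status code."""
    with open(definition_path, "rb") as handle:
        body = handle.read()
    request = build_index_request(index_name, body, base_url)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling create_index once per entry in INDEX_NAMES, with the matching definition file, mirrors the six curl commands. Building the request separately from sending it keeps the URL and header logic easy to inspect without a running ElasticSearch instance.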

RabbitMQ Installation and Configuration (Optional)

  1. Open a Windows command prompt.

  2. Get the official RabbitMQ image

    docker pull rabbitmq

  3. Start a new RabbitMQ Docker container that will be accessible at http://localhost:5672.

    docker run -d --name rabbit373 --hostname my-rabbit -p 5672:5672 rabbitmq

    (RECOMMENDED) To include the RabbitMQ management console as well, which will be accessible at http://localhost:15672, use this command instead:

    docker run -d --name rabbit373 --hostname my-rabbit -p 5672:5672 -p 15672:15672 rabbitmq:management

  4. Verify the operation of RabbitMQ by opening a browser and navigating to http://localhost:5672. You should get a response that looks something like this:

    AMQP

  5. Verify the operation of the RabbitMQ management console by opening a browser and navigating to http://localhost:15672. Use guest/guest as the username/password.

    NOTE: To supply a different username and password, change the command that creates a RabbitMQ container with the management console to the following:

    docker run -d --name rabbit373mgmt --hostname my-rabbit -p 5672:5672 -p 15672:15672 -e RABBITMQ_DEFAULT_USER=user -e RABBITMQ_DEFAULT_PASS=password rabbitmq:management
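The browser check in step 4 can also be scripted. The sketch below assumes, as the "AMQP" response in step 4 suggests, that the broker answers a non-AMQP client with its own protocol header before closing the connection. The function names are hypothetical, and the host and port are the defaults used in this section.

```python
import socket


def looks_like_amqp(reply: bytes) -> bool:
    """Return True if the bytes received begin with the AMQP protocol marker."""
    return reply.startswith(b"AMQP")


def probe_rabbitmq(host="localhost", port=5672, timeout=5.0):
    """Connect to the broker port, send a plain HTTP request line (much as a
    browser would), and report whether the reply looks like AMQP."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"GET / HTTP/1.0\r\n\r\n")
        return looks_like_amqp(conn.recv(8))
```

If probe_rabbitmq() raises a connection error, the container is probably not running or port 5672 was not published.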

Application Configuration

  1. Make copies of the config files as indicated in the following list:

| Original File | Copy To |
| --- | --- |
| <BHLRoot>\BHLAdminWeb\Web.config.template | <BHLRoot>\BHLAdminWeb\Web.config |
| <BHLRoot>\BHLApiDALTest\App.config.template | <BHLRoot>\BHLApiDALTest\App.config |
| <BHLRoot>\BHLBioStorHarvest\app.config.template | <BHLRoot>\BHLBioStorHarvest\app.config |
| <BHLRoot>\BHLCoreDALTest\App.config.template | <BHLRoot>\BHLCoreDALTest\App.config |
| <BHLRoot>\BHLDOIService\app.config.template | <BHLRoot>\BHLDOIService\app.config |
| <BHLRoot>\BHLExportProcessor\App.config.template | <BHLRoot>\BHLExportProcessor\App.config |
| <BHLRoot>\BHLFlickrTagHarvest\app.config.template | <BHLRoot>\BHLFlickrTagHarvest\app.config |
| <BHLRoot>\BHLFlickrThumbGrab\app.config.template | <BHLRoot>\BHLFlickrThumbGrab\app.config |
| <BHLRoot>\BHLMETSUpload\app.config.template | <BHLRoot>\BHLMETSUpload\app.config |
| <BHLRoot>\BHLNameFileGenerator\app.config.template | <BHLRoot>\BHLNameFileGenerator\app.config |
| <BHLRoot>\BHLOAIHarvester\app.config.template | <BHLRoot>\BHLOAIHarvester\app.config |
| <BHLRoot>\BHLOCRRefresh\app.config.template | <BHLRoot>\BHLOCRRefresh\app.config |
| <BHLRoot>\BHLPageNameRefresh\app.config.template | <BHLRoot>\BHLPageNameRefresh\app.config |
| <BHLRoot>\BHLPDFGenerator\app.config.template | <BHLRoot>\BHLPDFGenerator\app.config |
| <BHLRoot>\BHLSearchIndexer\AppConfig.xml.template | <BHLRoot>\BHLSearchIndexer\AppConfig.xml |
| <BHLRoot>\BHLSearchIndexer\AppConfig.xml.template | <BHLRoot>\BHLSearchIndexer\AppConfig.Names.xml |
| <BHLRoot>\BHLSearchIndexer\AppConfig.xml.template | <BHLRoot>\BHLSearchIndexer\AppConfig.Full.xml |
| <BHLRoot>\BHLSearchIndexQueueLoad\AppConfig.xml.template | <BHLRoot>\BHLSearchIndexQueueLoad\AppConfig.xml |
| <BHLRoot>\BHLServerTest\app.config.template | <BHLRoot>\BHLServerTest\app.config |
| <BHLRoot>\BHLTextImportProcessor\app.config.template | <BHLRoot>\BHLTextImportProcessor\app.config |
| <BHLRoot>\BHLUSWeb2\ratelimit.config.template | <BHLRoot>\BHLUSWeb2\ratelimit.config |
| <BHLRoot>\BHLUSWeb2\ratelimitwhitelist.config.template | <BHLRoot>\BHLUSWeb2\ratelimitwhitelist.config |
| <BHLRoot>\BHLUSWeb2\Web.config.template | <BHLRoot>\BHLUSWeb2\Web.config |
| <BHLRoot>\BHLUSWeb2\Views\Web.config.template | <BHLRoot>\BHLUSWeb2\Views\Web.config |
| <BHLRoot>\BHLWebServiceREST.v1\app.config.template | <BHLRoot>\BHLWebServiceREST.v1\app.config |
| <BHLRoot>\BHLWebServiceREST.v1\Web.config.template | <BHLRoot>\BHLWebServiceREST.v1\Web.config |
| <BHLRoot>\IAAnalysisHarvest\App.config.template | <BHLRoot>\IAAnalysisHarvest\App.config |
| <BHLRoot>\IAHarvest\App.config.template | <BHLRoot>\IAHarvest\App.config |
| <BHLRoot>\IAHarvestAsync\App.config.template | <BHLRoot>\IAHarvestAsync\App.config |
| <BHLRoot>\SearchElasticTest\app.config.template | <BHLRoot>\SearchElasticTest\app.config |
| <BHLRoot>\SiteServiceREST.v1\app.config.template | <BHLRoot>\SiteServiceREST.v1\app.config |
| <BHLRoot>\SiteServiceREST.v1\Web.config.template | <BHLRoot>\SiteServiceREST.v1\Web.config |
| <BHLRoot>\WDHarvest\App.config.template | <BHLRoot>\WDHarvest\App.config |
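Almost every entry in the list above follows one rule: copy the file to the same folder with the trailing ".template" removed. The one exception is the BHLSearchIndexer template, which is copied to three targets. A minimal Python sketch of that rule follows. It is not part of the repository, the function name is hypothetical, and it assumes that every "*.template" file under <BHLRoot> is meant to be copied, which may include files not listed here.

```python
import shutil
from pathlib import Path

# Additional targets for templates that are copied more than once
# (taken from the BHLSearchIndexer rows in the list above).
EXTRA_TARGETS = {
    Path("BHLSearchIndexer") / "AppConfig.xml.template": [
        "AppConfig.Names.xml",
        "AppConfig.Full.xml",
    ],
}


def copy_templates(bhl_root, overwrite=False):
    """Copy each *.template file to its working name; return the files created."""
    root = Path(bhl_root)
    created = []
    for template in sorted(root.rglob("*.template")):
        if not template.is_file():
            continue
        # Default rule: same folder, same name, without the ".template" suffix.
        targets = [template.with_suffix("")]
        extra_names = EXTRA_TARGETS.get(template.relative_to(root), [])
        targets += [template.with_name(name) for name in extra_names]
        for target in targets:
            if target.exists() and not overwrite:
                continue  # keep config files that have already been edited
            shutil.copyfile(template, target)
            created.append(target)
    return created
```

By default the sketch skips targets that already exist, so re-running it after editing a config file does not discard those edits.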

 
  2. Make the following modifications to the config files:

# = denotes optional modifications that are not required for development installations

 
WWW.BIODIVERSITYLIBRARY.ORG

The primary web user interface, allowing browsing and searching the collection as well as viewing individual items.

<BHLRoot>\BHLUSWeb2\ratelimit.config

This configuration file allows rate limits to be set by IP address, User Agent, and web site endpoint. See the instructions and examples in the ratelimit.config file for more information.

<BHLRoot>\BHLUSWeb2\ratelimitwhitelist.config

This configuration file works in tandem with the ratelimit.config file. It specifies IP addresses, User Agents, and web site endpoints to omit from rate limiting (to be "whitelisted"). See the instructions and examples in the file for more information.

<BHLRoot>\BHLUSWeb2\Web.config

| Element | Value |
| --- | --- |
| # appSettings/PdfUrl | http://SITE_SERVICES_URL/pdf{0}/{1}, where SITE_SERVICES_URL is the URL for a running instance of the SiteServiceREST.v1 project |
| # appSettings/GoogleAnalyticsTrackingID | Tracking identifier for the site in Google Analytics |
| # appSettings/GeminiURL | Issue tracking service URL |
| # appSettings/GeminiUser | Issue tracking service username |
| # appSettings/GeminiPassword | Issue tracking service password |
| appSettings/ElasticSearchServerAddress | Server address for an instance of ElasticSearch |
| appSettings/SiteServicesUrl | URL for a running instance of the SiteServiceREST.v1 project |
| # appSettings/FundRaiseUpCampaignCode | FundraiseUp code for the site |
| # appSettings/TwitterConsumerKey | Consumer Key for Twitter API |
| # appSettings/TwitterConsumerSecret | Consumer Secret for Twitter API |
| # appSettings/ReCaptchaSiteKey | Site key for Google ReCaptcha service |
| # appSettings/ReCaptchaSecretKey | Secret key for Google ReCaptcha service |
| connectionStrings/BHL | Connection string for BHL database |
| # connectionStrings/Admin | Optional connection string for API logging database |
| # system.net/mailSettings/smtp/network | SMTP host address, username, and password |
 
ADMIN.BIODIVERSITYLIBRARY.ORG

The administrative user interface. It requires authorization and authentication, and allows metadata editing, reporting, and system monitoring.

<BHLRoot>\BHLAdminWeb\Web.config

| Element | Value |
| --- | --- |
| appSettings/CollectionImageUploadPath | Path in which to place uploaded images. |
| appSettings/ItunesImageUploadPath | Path in which to place uploaded images. |
| appSettings/AlertMsgPath | Path in which to place text file with informational messages. |
| appSettings/MARCUploadPath | Path in which to place uploaded MARC files. |
| appSettings/MARCUploadDrive | Drive letter or server name for MARC uploads. |
| appSettings/MARCUploadServer | Server name for MARC uploads. |
| appSettings/CitationNewPath | Path for new uploads of citation information. |
| appSettings/CitationCompletePath | Path for completed uploads of citation information. |
| appSettings/CitationErrorPath | Path for failed uploads of citation information. |
| # appSettings/FlickrUserId | Flickr user identifier. |
| appSettings/SearchServerAddress | Server address for an instance of ElasticSearch |
| appSettings/MessageQueueAdminAddress | Server address for the administrative interface of an instance of RabbitMQ |
| appSettings/SiteServicesUrl | URL for a running instance of the SiteServiceREST.v1 project |
| # appSettings/EmailFromName | Email sender name to use when sending emails. |
| # appSettings/EmailFromAddress | Email sender address to use when sending emails. |
| # appSettings/BHLUserAdminEmailAddress | Email address of a BHL user administrator. |
| appSettings/LocalFileFolder | File folder in which to place new data files ingested from Internet Archive. |
| # appSettings/FlickrKey | Flickr API key |
| # appSettings/FlickrSecret | Flickr API secret |
| connectionStrings/BHL | Connection string for BHL database |
| connectionStrings/BHLUser | Connection string for user account database |
| connectionStrings/BHLImport | Connection string for BHLImport database |

INTERNAL APIs

BHLWebServiceREST.v1

APIs that support the internal non-web applications.

<BHLRoot>\BHLWebServiceREST.v1\app.config
<BHLRoot>\BHLWebServiceREST.v1\web.config

| Element | Value |
| --- | --- |
| # appSettings/SMTPHost | Name of a SMTP server. |
| appSettings/DOIDepositFileLocation | Path to CrossRef deposit files. |
| appSettings/DOISubmitLogFileLocation | Path to Crossref log files. |
| appSettings/OCRJobNewPath | Path to new OCR job files |
| connectionStrings/BHL | Connection string for BHL database |

SiteServiceREST.v1

APIs that support the primary web UI and the administrative web interface.

<BHLRoot>\SiteServiceREST.v1\app.config
<BHLRoot>\SiteServiceREST.v1\web.config

| Element | Value |
| --- | --- |
| # appSettings/SMTPHost | Name of a SMTP server. |
| # appSettings/SearchServerStatsUrl | Search server URL for uptime stats |
| appSettings/DOIDepositFileLocation | Path to CrossRef deposit files. |
| appSettings/DOISubmitLogFileLocation | Path to Crossref log files. |
| appSettings/OCRJobNewPath | Path to new OCR job files |
| # appSettings/MQHost | Server address for a RabbitMQ instance |
| # appSettings/MQPort | Server port for a RabbitMQ instance |
| # appSettings/MQAPIPort | Server port for a RabbitMQ API instance |
| # appSettings/MQUsername | Username to access a RabbitMQ instance |
| # appSettings/MQPassword | Password to access a RabbitMQ instance |
| # appSettings/PregeneratedPdfLocation | File location of pregenerated article PDFs |
| connectionStrings/BHL | Connection string for BHL database |

DATA IMPORT APPS

BHLBioStorHarvest

Service that harvests Segment metadata from APIs that are part of the BioStor platform (https://biostor.org/).

<BHLRoot>\BHLBioStorHarvest\app.config

| Element | Value |
| --- | --- |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project |
| connectionStrings/BHLImport | Connection string for BHLImport database |
| connectionStrings/BHL | Connection string for BHL database |

BHLFlickrTagHarvest

Service that examines the BHL Flickr collection (https://www.flickr.com/photos/biodivlibrary/) and downloads new and updated tags and notes into a database.

<BHLRoot>\BHLFlickrTagHarvest\app.config

| Element | Value |
| --- | --- |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/FlickrApiKey | Flickr API Key |
| appSettings/BHLFlickrUserID | Flickr username |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |
| connectionStrings/BHLImport | Connection string for BHLImport database |
| connectionStrings/BHL | Connection string for BHL database |

IAAnalysisHarvest

Service that obtains identifiers of Internet Archive (IA) items that should be harvested into BHL even though they are not part of the IA "biodiversity" collection.

<BHLRoot>\IAAnalysisHarvest\App.config

| Element | Value |
| --- | --- |
| connectionStrings/IAAnalysis | Connection string for IAAnalysis database |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |

IAHarvest

Service that downloads metadata files for new and updated items hosted at Internet Archive. It extracts the metadata from the files and adds it to database tables. From there, it initiates procedures that clean the data and add it to the production database.

<BHLRoot>\IAHarvest\App.config

| Element | Value |
| --- | --- |
| connectionStrings/BHLImport | Connection string for BHLImport database |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |
| appSettings/LocalFileFolder | Local folder to hold downloaded files |
| # appSettings/MQAddress | Server address for a RabbitMQ instance |
| # appSettings/MQPort | Server port for a RabbitMQ instance |
| # appSettings/MQUser | Username to access a RabbitMQ instance |
| # appSettings/MQPassword | Password to access a RabbitMQ instance |
| # appSettings/MQQueue | Name of a RabbitMQ queue for identifiers of new/updated items |
| # appSettings/MQExchange | Name of a RabbitMQ exchange associated with the queue |
| # appSettings/MQErrorQueue | Name of a RabbitMQ queue for messages that are not processed successfully |
| # appSettings/MQErrorExchange | Name of a RabbitMQ exchange associated with the error queue |

IAHarvestAsync

Service that executes multiple instances of the IAHarvest process at one time, speeding up the overall process of downloading metadata files for new and updated items from Internet Archive.

<BHLRoot>\IAHarvestAsync\App.config

| Element | Value |
| --- | --- |
| connectionStrings/BHLImport | Connection string for BHLImport database |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |
| appSettings/LocalFileFolder | Local folder to hold downloaded files |

BHLOAIHarvester

Service that harvests metadata from OAI-PMH feeds and stores it in a BHL database. From there, it initiates procedures that clean the data and add it to the production database.

<BHLRoot>\BHLOAIHarvester\app.config

| Element | Value |
| --- | --- |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |
| connectionStrings/BHLImport | Connection string for BHLImport database |

WDHarvest

Service that downloads persistent identifiers associated with BHL entities in Wikidata. Identifiers are added to the production database, reports are generated identifying newly added data and potential errors, and stakeholders are notified via email.

<BHLRoot>\WDHarvest\App.config

| Element | Value |
| --- | --- |
| connectionStrings/BHLImport | Connection string for BHLImport database |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| # appSettings/AdminEmailToAddress | Process administrator recipient of report notifications sent by the process |
| # appSettings/StaffEmailToAddress | Staff member recipients of report notifications sent by the process |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |

 
UTILITY APPS

BHLDOIService

Service that submits new and updated DOI metadata to Crossref, and updates the DOI metadata in BHL.

<BHLRoot>\BHLDOIService\app.config

| Element | Value |
| --- | --- |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/CrossRefDepositorName | Depositor name associated with CrossRef account |
| appSettings/CrossRefDepositorEmail | Depositor email associated with CrossRef account |
| appSettings/CrossRefLogin | Login for CrossRef account |
| appSettings/CrossRefPassword | Password for CrossRef account |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |

BHLExportProcessor

Service that creates BHL data exports in a variety of formats, including BibTeX, MODS, RIS, KBART, and TSV.

<BHLRoot>\BHLExportProcessor\App.config

| Element | Value |
| --- | --- |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |
| appSettings/RISTitleTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISTitleFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISTitleZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISItemTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISItemFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISItemZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISSegmentTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISSegmentFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISSegmentZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISInternalTitleTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISInternalTitleFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISInternalTitleZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISInternalItemTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISInternalItemFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISInternalItemZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISInternalSegmentTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISInternalSegmentFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/RISInternalSegmentZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSTitleTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSTitleFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSTitleZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSItemTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSItemFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSItemZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSSegmentTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSSegmentFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSSegmentZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSInternalTitleTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSInternalTitleFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSInternalTitleZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSInternalItemTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSInternalItemFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSInternalItemZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSInternalSegmentTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSInternalSegmentFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/MODSInternalSegmentZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXTitleTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXTitleFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXTitleZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXItemTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXItemFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXItemZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXSegmentTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXSegmentFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXSegmentZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXInternalTitleTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXInternalTitleFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXInternalTitleZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXInternalItemTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXInternalItemFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXInternalItemZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXInternalSegmentTempFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXInternalSegmentFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/BibTeXInternalSegmentZipFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVDOIFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVAuthorFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVItemFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVPageFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVPageNameFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVPartFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVPartAuthorFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVKeywordFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVTitleFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVTitleIdentifierFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVInternalDOIFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVInternalAuthorFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVInternalItemFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVInternalPageFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVInternalPageNameFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVInternalPartFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVInternalPartAuthorFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVInternalKeywordFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVInternalTitleFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVInternalTitleIdentifierFile | Replace \\SERVER\FOLDER with valid path |
| appSettings/TSVInternalAuthorIdentifierFile | Replace \\SERVER\FOLDER with valid path |
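Because the same \\SERVER\FOLDER placeholder appears in every export setting, it can be replaced in one pass instead of setting by setting. The sketch below is a hypothetical helper, not part of bhl-us. It works on the text of the config file, and it assumes that all export files should share one base folder.

```python
# Placeholder exactly as it appears in the template settings listed above.
PLACEHOLDER = r"\\SERVER\FOLDER"


def fill_placeholder(config_text: str, real_path: str) -> str:
    """Return the config text with every placeholder replaced by real_path."""
    return config_text.replace(PLACEHOLDER, real_path)


def count_placeholders(config_text: str) -> int:
    """Return how many placeholders remain, useful as a check after editing."""
    return config_text.count(PLACEHOLDER)
```

A typical use would read <BHLRoot>\BHLExportProcessor\App.config, pass its text and a real folder such as a local export directory to fill_placeholder, write the result back, and confirm that count_placeholders returns 0.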

BHLFlickrThumbGrab

Service that downloads randomly selected BHL images from Flickr for display on the BHL home page.

<BHLRoot>\BHLFlickrThumbGrab\app.config

| Element | Value |
| --- | --- |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/FlickrAPIKey | Flickr API key |
| appSettings/ImageFileName | Replace \\SERVER\FOLDER with valid path |
| appSettings/ImageFolder | Replace \\SERVER\FOLDER with valid path |
| appSettings/ImageListFilePath | Replace \\SERVER\FOLDER with valid path |
| appSettings/DefaultFilesFolder | Replace \\SERVER\FOLDER with valid path |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |

BHLMETSUpload

Service that generates METS files for new and modified BHL Items. The METS files include bibliographic metadata and page-level metadata. After generation they are uploaded to the item's Internet Archive folder.

<BHLRoot>\BHLMETSUpload\app.config

| Element | Value |
| --- | --- |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/METSEmail | Organization email address to place in METS files |
| appSettings/IAS3AccessKey | Internet Archive access key |
| appSettings/IAS3SecretKey | Internet Archive secret key |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |

BHLNameFileGenerator

Service that generates XML files containing the scientific names in an item. After generation they are uploaded to the item's Internet Archive folder.

<BHLRoot>\BHLNameFileGenerator\app.config

| Element | Value |
| --- | --- |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/IAS3AccessKey | Internet Archive access key |
| appSettings/IAS3SecretKey | Internet Archive secret key |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |

BHLOCRRefresh

Service that downloads the DJVU file for an item from Internet Archive, parses it into individual text files (one per page), and replaces the item's existing page text files on the BHL search/file server.

<BHLRoot>\BHLOCRRefresh\app.config

| Element | Value |
| --- | --- |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/OcrJobNewPath | Path to new job files |
| appSettings/OcrJobProcessingPath | Path to job files being processed |
| appSettings/OcrJobCompletePath | Path to complete job files |
| appSettings/OcrJobErrorPath | Path to failed job files |
| appSettings/OcrJobTempPath | Path to temporary OCR files |
| appSettings/MQAddress | Message queue host URL |
| appSettings/MQPort | Message queue port |
| appSettings/MQUser | Message queue username |
| appSettings/MQPassword | Message queue password |
| appSettings/MQQueue | Name of message queue for items with updated text |
| appSettings/MQExchange | Name of MQ exchange for items with updated text |
| appSettings/MQErrorQueue | Name of error message queue |
| appSettings/MQErrorExchange | Name of MQ error exchange |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |
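
The four `MQAddress`, `MQPort`, `MQUser`, and `MQPassword` settings together describe a single RabbitMQ connection. As an illustration of how they fit together, the Python sketch below combines them into a standard AMQP URI of the form `amqp://user:password@host:port`, which can be handy for testing the broker with other tools. This README does not say how the BHL services themselves open the connection, so the helper name `build_amqp_uri` and the dictionary of settings are assumptions made for the example.

```python
from urllib.parse import quote


def build_amqp_uri(settings: dict[str, str]) -> str:
    """Combine the MQ* settings into an AMQP URI for the default virtual host.

    Hypothetical helper: the keys mirror the setting names in the table above.
    """
    # Percent-encode credentials so characters such as '@' or '/' are safe.
    user = quote(settings["MQUser"], safe="")
    password = quote(settings["MQPassword"], safe="")
    host = settings["MQAddress"]

    # int() raises ValueError if MQPort is not a number.
    port = int(settings["MQPort"])
    if not 0 < port < 65536:
        raise ValueError(f"MQPort out of range: {port}")

    # No trailing path, so the broker's default virtual host is used.
    return f"amqp://{user}:{password}@{host}:{port}"


if __name__ == "__main__":
    # Sample values only; 5672 is RabbitMQ's default AMQP port.
    example = {
        "MQAddress": "localhost",
        "MQPort": "5672",
        "MQUser": "guest",
        "MQPassword": "p@ss/word",
    }
    print(build_amqp_uri(example))
```

A non-numeric or out-of-range `MQPort` raises `ValueError`, which surfaces a common configuration mistake before any connection is attempted.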

BHLPageNameRefresh

Service that invokes the Global Names gnfinder tool to identify scientific names in page text. Identified names are added to the BHL database.

<BHLRoot>\BHLPageNameRefresh\app.config

| Element | Value |
| --- | --- |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |

BHLPDFGenerator

Service that fulfills requests for custom PDFs. Assembles the PDFs, saves them to the BHL search/file server, and emails the requestor a download link.

<BHLRoot>\BHLPDFGenerator\app.config

| Element | Value |
| --- | --- |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/PdfFilePath | Replace \\SERVER\FOLDER with valid path |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |

BHLSearchIndexer

Service that reads messages from RabbitMQ queues and adds/updates/deletes the corresponding Elasticsearch records.

<BHLRoot>\BHLSearchIndexer\AppConfig.xml

| Element | Value |
| --- | --- |
| # appSettings/EmailToAddresses | Recipients of emails sent by the process (comma-separated) |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |
| appSettings/ElasticSearchServerAddress | Search Server address, including port number |
| appSettings/MQAddress | Message queue host URL |
| appSettings/MQPort | Message queue port |
| appSettings/MQUser | Message queue username |
| appSettings/MQPassword | Message queue password |
| appSettings/MQQueueName | Name of message queue for items/authors/keywords |
| appSettings/MQErrorExchangeName | Name of MQ error exchange for items/authors/keywords |
| appSettings/MQErrorQueueName | Name of MQ error queue for items/authors/keywords |
| appSettings/DocFolder | Folder for debug output files |
| appSettings/OCRLocation | Set to “remote” |
| # connectionStrings/Production | Connection string for production BHL database |
| connectionStrings/QA | Connection string for QA BHL database |

<BHLRoot>\BHLSearchIndexer\AppConfig.Names.xml

| Element | Value |
| --- | --- |
| # appSettings/EmailToAddresses | Recipients of emails sent by the process (comma-separated) |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |
| appSettings/ElasticSearchServerAddress | Search Server address, including port number |
| appSettings/MQAddress | Message queue host URL |
| appSettings/MQPort | Message queue port |
| appSettings/MQUser | Message queue username |
| appSettings/MQPassword | Message queue password |
| appSettings/MQQueueName | Name of message queue for names |
| appSettings/MQErrorExchangeName | Name of MQ error exchange for names |
| appSettings/MQErrorQueueName | Name of MQ error queue for names |
| appSettings/DocFolder | Folder for debug output files |
| appSettings/OCRLocation | Set to “remote” |
| # connectionStrings/Production | Connection string for production BHL database |
| connectionStrings/QA | Connection string for QA BHL database |

<BHLRoot>\BHLSearchIndexer\AppConfig.Full.xml

| Element | Value |
| --- | --- |
| # appSettings/EmailToAddresses | Recipients of emails sent by the process (comma-separated) |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |
| appSettings/ElasticSearchServerAddress | Search Server address, including port number |
| appSettings/DocFolder | Folder for debug output files |
| appSettings/OCRLocation | Set to “remote” |
| appSettings/DoFullIndex | Set to “true” |
| # connectionStrings/Production | Connection string for production BHL database |
| connectionStrings/QA | Connection string for QA BHL database |
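
Two settings in AppConfig.Full.xml have fixed expected values (`OCRLocation` set to "remote" and `DoFullIndex` set to "true"), so they are easy to verify before starting a long full-index run. The Python sketch below is one possible check. It assumes that AppConfig.Full.xml stores its settings as `<add key="..." value="..."/>` elements under `<appSettings>`, like a standard .NET config file. The actual file layout is not shown in this README, and `check_full_index_config` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Expected values, taken from the AppConfig.Full.xml settings listed above.
EXPECTED = {"OCRLocation": "remote", "DoFullIndex": "true"}


def check_full_index_config(config_xml: str) -> list[str]:
    """Return a list of problems; an empty list means both settings look right.

    Assumption: settings are <add key="..." value="..."/> under <appSettings>.
    """
    root = ET.fromstring(config_xml)
    actual = {
        add.get("key"): add.get("value", "")
        for add in root.iterfind(".//appSettings/add")
    }
    problems = []
    for key, want in EXPECTED.items():
        got = actual.get(key)
        if got is None:
            problems.append(f"{key} is missing")
        elif got.strip().lower() != want:
            problems.append(f"{key} is {got!r}, expected {want!r}")
    return problems


# Hypothetical sample with DoFullIndex left at the wrong value.
SAMPLE_FULL = """<configuration>
  <appSettings>
    <add key="OCRLocation" value="remote" />
    <add key="DoFullIndex" value="false" />
  </appSettings>
</configuration>"""

if __name__ == "__main__":
    for problem in check_full_index_config(SAMPLE_FULL):
        print(problem)
```

The comparison ignores case and surrounding whitespace, which is a choice made for this sketch; the indexer itself may be stricter.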

BHLSearchIndexQueueLoad

Service that queries the database to identify recently changed entities (titles, items, segments, authors, keywords, names) and adds a message for each changed entity to the RabbitMQ queues. For changed segments, it also adds messages to a RabbitMQ queue for pre-generated PDFs.

<BHLRoot>\BHLSearchIndexQueueLoad\AppConfig.xml

| Element | Value |
| --- | --- |
| appSettings/MQAddress | Message queue host URL |
| appSettings/MQPort | Message queue port |
| appSettings/MQUser | Message queue username |
| appSettings/MQPassword | Message queue password |
| appSettings/MQQueue | Name of message queue for items/authors/keywords |
| appSettings/MQExchange | Name of MQ exchange for items/authors/keywords |
| appSettings/MQErrorExchange | Name of MQ error exchange for items/authors/keywords |
| appSettings/MQErrorQueue | Name of MQ error queue for items/authors/keywords |
| appSettings/MQQueueNames | Name of MQ queue for names |
| appSettings/MQErrorExchangeNames | Name of MQ error exchange for names |
| appSettings/MQErrorQueueNames | Name of MQ error queue for names |
| appSettings/MQQueuePDF | Name of MQ queue for pre-generated PDFs |
| appSettings/MQErrorExchangePDF | Name of MQ error exchange for pre-generated PDFs |
| appSettings/MQErrorQueuePDF | Name of MQ error queue for pre-generated PDFs |
| # appSettings/EmailToAddresses | Recipients of emails sent by the process (comma-separated) |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |
| connectionStrings/Production | Connection string for production BHL database |
| # connectionStrings/QA | Connection string for QA BHL database |

BHLTextImportProcessor

Service that parses uploaded files containing page transcripts and replaces existing page text files on the BHL search/file server.

<BHLRoot>\BHLTextImportProcessor\app.config

| Element | Value |
| --- | --- |
| # appSettings/DebugPath | Path for debugging output |
| # appSettings/EmailFromAddress | "From" address for emails sent by the process |
| # appSettings/EmailToAddress | Recipient of emails sent by the process |
| appSettings/TextImportFilePath | URL of location of text import files |
| appSettings/BHLWSUrl | URL for a running instance of the BHLWebServiceREST.v1 project. |
| # appSettings/MQAddress | Server address for a RabbitMQ instance |
| # appSettings/MQPort | Server port for a RabbitMQ instance |
| # appSettings/MQUser | Username to access a RabbitMQ instance |
| # appSettings/MQPassword | Password to access a RabbitMQ instance |
| appSettings/MQQueue | Name of message queue for items with updated text |
| appSettings/MQExchange | Name of MQ exchange for items with updated text |
| appSettings/MQErrorQueue | Name of error message queue |
| appSettings/MQErrorExchange | Name of MQ error exchange |
| connectionStrings/BHL | Connection string for BHL database |

TEST PROJECTS

BHLAPIDALTest

Unit tests for API data access methods.

<BHLRoot>\BHLApiDALTest\testhost.dll.config


BHLCoreDALTest

Unit tests for core data access methods.

<BHLRoot>\BHLCoreDALTest\testhost.dll.config

| Element | Value |
| --- | --- |
| connectionStrings/BHL | Connection string for BHL database |
| connectionStrings/Admin | Optional connection string for logging database |

BHLServerTest

Unit tests for business rule methods.

<BHLRoot>\BHLServerTest\testhost.dll.config

| Element | Value |
| --- | --- |
| connectionStrings/BHL | Connection string for BHL database |
| connectionStrings/Admin | Optional connection string for logging database |

SearchElasticTest

Unit tests for methods that interact with ElasticSearch.

<BHLRoot>\SearchElasticTest\app.config

| Element | Value |
| --- | --- |
| appSettings/ElasticSearchServerAddress | Server address for the ElasticSearch instance |

Index Data in ElasticSearch (Optional)

  1. Open the BHLUtility solution in Visual Studio.
  2. Build the BHLSearchIndexer project.
  3. Make sure you have completed the configuration within the AppConfig.Full.xml file.
  4. Run the BHLSearchIndexer project, specifying “AppConfig.Full.xml” as the application argument.

About

Source code for the Biodiversity Heritage Library's web site, databases, and supporting services.
