Skip to content

ANW-2199: fixes issues with CSV export.#3710

Merged
brianzelip merged 13 commits intomasterfrom
ANW-2199--csv-export-issues
Sep 24, 2025
Merged

ANW-2199: fixes issues with CSV export.#3710
brianzelip merged 13 commits intomasterfrom
ANW-2199--csv-export-issues

Conversation

@cpapazoglou
Copy link
Copy Markdown
Collaborator

@cpapazoglou cpapazoglou commented Sep 17, 2025

Description

Fixes:

  • context is not mapped to Resource/Accession
  • type, indicator, and barcode are mapped to correct solr fields and populated
  • CSV headers match the html headers
  • container profile and location fields are cleaned from \
  • refactor the implementation to make it more straight forward

Related JIRA Ticket or GitHub Issue

https://archivesspace.atlassian.net/browse/ANW-2199

How Has This Been Tested?

  • Manually
  • Also added automated tests

Screenshots (if appropriate):

CleanShot 2025-09-17 at 22 54 15@2x

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have read the CONTRIBUTING document.
  • I have authority to submit this code.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@coveralls
Copy link
Copy Markdown

coveralls commented Sep 17, 2025

Pull Request Test Coverage Report for Build 17953658890

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 21 of 31 (67.74%) changed or added relevant lines in 2 files are covered.
  • 78 unchanged lines in 12 files lost coverage.
  • Overall coverage increased (+0.6%) to 79.871%

Changes Missing Coverage Covered Lines Changed/Added Lines %
frontend/app/controllers/search_controller.rb 0 1 0.0%
frontend/app/helpers/export_helper.rb 21 30 70.0%
Files with Coverage Reduction New Missed Lines %
backend/app/exporters/serializers/mods.rb 1 93.97%
backend/app/exporters/models/marc21.rb 1 91.67%
backend/app/exporters/lib/export_helpers.rb 1 93.13%
backend/app/lib/progress_ticker.rb 1 94.92%
frontend/app/controllers/archival_objects_controller.rb 1 88.7%
indexer/app/lib/indexer_common.rb 2 99.52%
backend/app/exporters/models/ead.rb 2 90.18%
backend/app/lib/resource/duplicate.rb 5 93.33%
backend/app/lib/job_runners/resource_duplicate_runner.rb 7 19.23%
backend/app/exporters/serializers/ead3.rb 7 90.14%
Totals Coverage Status
Change from base Build 15878865621: 0.6%
Covered Lines: 28450
Relevant Lines: 35620

💛 - Coveralls

@cpapazoglou
Copy link
Copy Markdown
Collaborator Author

@brianzelip this is ready for a review.

@brianzelip brianzelip self-requested a review September 21, 2025 11:23
Copy link
Copy Markdown
Collaborator

@brianzelip brianzelip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cpapazoglou. The new exported CSV looks good, so do the unit tests.

Some I18n keys are called that don't exist. This should be fixed by adding appropriate new keys for 7 languages (German, English, Spanish, French, Italian, Japanese, Ukrainian) - I prefer using a llm for this, but google translate is a fallback.

There are also some very minor comments that could be removed since they seem to duplicate the nearby method names or expectations, etc.

The above two points remind me of llm agent output. If any of us are going to review llm-assisted PRs, let's all do ourselves a favor by heavily interrogating and shaping the llm output before a review! :)

Also, please add a feature test for this. There doesn't seem to be any feature specs for exporting a CSV from a search listing page, so you can start by adding to the end of frontend/spec/features/search_listing_spec.rb, ie:

context 'CSV export' do
  describe 'Top Containers' do
    # login
    # visit '/search'
    # click 'Top Containers'
    # click_on 'Download CSV'
  end
end

The spec should compare the column headers and row data from the web page to that of the CSV file. See frontend/spec/features/resources_spec.rb for an example of parsing and analyzing a CSV file.

Feel free to reach out with any questions!

Comment thread frontend/app/helpers/export_helper.rb Outdated
Comment thread frontend/app/helpers/export_helper.rb Outdated
def self.header_mappings
@header_mappings ||= begin
{
'context' => (I18n.t('search_results.filter.top_container.context', :default => 'Resource/Accession') rescue 'Resource/Accession'),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

search_results.filter doesn't exist, so all of these translations fall back to English. Smells like llm?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, missed the part of checking that keys actually exist 🤦
Fixed in 15476ee

Comment thread frontend/spec/helpers/export_helper_spec.rb Outdated
Comment thread frontend/spec/helpers/export_helper_spec.rb
@cpapazoglou
Copy link
Copy Markdown
Collaborator Author

Thanks for the review @brianzelip. I have addressed your feedback.

Regarding the CSV export, I am testing that it works and data are present not the actual header mapping that happens in the helper and it's tested separately.

Copy link
Copy Markdown
Collaborator

@brianzelip brianzelip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[UPDATE - see my comment after this comment]

@cpapazoglou

# Verify headers match between web page and CSV
# Note: CSV may have different header names due to I18n translation
expect(csv_headers).to include('Type')
expect(csv_headers).to include('Indicator')
expect(csv_headers).to include('Barcode')

# Find rows containing our test data
container_1_row = csv_data.find { |row| row.include?(@top_container_1.indicator) }
container_2_row = csv_data.find { |row| row.include?(@top_container_2.indicator) }
expect(container_1_row).to include(@top_container_1.barcode)
expect(container_1_row).to include(@top_container_1.type)
expect(container_2_row).to include(@top_container_2.barcode)
expect(container_2_row).to include(@top_container_2.type)

The above expectations don't test for all of the available headings (missing "Title","Resource/Accession", "Container Profile", "Location"), nor their order. Let's please test for both all headings and their correct order. [UPDATE]

Here is an example of getting all of the headings in order from the web page, not including the final "Actions" column:

web_page_headings = all('#tabledSearchResults > thead th').map(&:text).reject { |th| th == 'Actions' }

Let's compare web_page_headings with csv_headers to make sure they match. [UPDATE]

Please also make sure that at least one Top Container: [UPDATE]

  • includes a Container Profile
  • includes a Location
  • includes a Resource/Accession.

Then please test the CSV for the presence of all data in the same order as the web page. Arrays make for nice comparisons! [UPDATE]

Also, small nitpick that relates to our budding efforts to enforce a style guide, the following comments are examples of explicitly duplicating the information conveyed by its code. I know that there are many of these kinds of comments in the codebase, including the CSV parsing pattern that I referenced earlier in this PR. This is something we're working to remove, slowly but surely!

# Click download CSV
click_on 'Download CSV'

# Find the downloaded CSV file
files = Dir.glob(File.join(Dir.tmpdir, '*.csv'))
expect(files.length).to be >= 1

# Parse the CSV file
csv_file = File.read(files.last)
csv_data = CSV.parse(csv_file)
csv_headers = csv_data[0]

# Verify headers match between web page and CSV
# Note: CSV may have different header names due to I18n translation
expect(csv_headers).to include('Type')
expect(csv_headers).to include('Indicator')
expect(csv_headers).to include('Barcode')

My approach to writing code in this regard includes:

  • making variable and method names as self-evident as possible, as you have here
  • asking myself, "is the comment merely duplicating the information conveyed by its code?" If so, I try to edit the comment or remove it.

Lastly regarding this comment:

# Note: CSV may have different header names due to I18n translation

Actually, the headers should be exactly the same. Is this comment from a llm agent? Regardless, please ensure the comment's meaning is not accurate! (and remove the comment)

@brianzelip
Copy link
Copy Markdown
Collaborator

@cpapazoglou

Regarding the CSV export, I am testing that it works and data are present not the actual header mapping that happens in the helper and it's tested separately.

Ahh! Sorry I missed this before adding my review remarks just now. Let's not worry then about adding extra testing to the feature spec. But please address the # Note: CSV may have different header names due to I18n translation comment which seems to contradict what it is we're doing here.

@cpapazoglou
Copy link
Copy Markdown
Collaborator Author

@cpapazoglou

Regarding the CSV export, I am testing that it works and data are present not the actual header mapping that happens in the helper and it's tested separately.

Ahh! Sorry I missed this before adding my review remarks just now. Let's not worry then about adding extra testing to the feature spec. But please address the # Note: CSV may have different header names due to I18n translation comment which seems to contradict what it is we're doing here.

My approach to writing code in this regard includes:
making variable and method names as self-evident as possible, as you have here
asking myself, "is the comment merely duplicating the information conveyed by its code?" If so, I try to edit the comment or remove it.

We are aligned on this, sometimes I mistakenly think that they might be useful. I just left one which is explaining the "why".
1b6ee5c

@brianzelip brianzelip merged commit 62d9a75 into master Sep 24, 2025
68 of 76 checks passed
@brianzelip brianzelip deleted the ANW-2199--csv-export-issues branch September 24, 2025 01:36
@cdibella cdibella added this to the 4.2.0 milestone Mar 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants