Skip to content

Comments

Feature: Support inputURL and inputHostname in scrapers#6250

Merged
WithoutPants merged 3 commits intostashapp:developfrom
Gykes:inputurl-clean
Nov 10, 2025
Merged

Feature: Support inputURL and inputHostname in scrapers#6250
WithoutPants merged 3 commits intostashapp:developfrom
Gykes:inputurl-clean

Conversation

@Gykes
Copy link
Collaborator

@Gykes Gykes commented Nov 8, 2025

This PR adds a {inputURL} placeholder that can be used in both fixed values and selector expressions within YAML scraper configurations. The placeholder is automatically replaced with the actual URL that was used to perform the scrape.

Implementation Details

  • Added url field to both xpathQuery and jsonQuery structs to store the input URL
  • Added getURL() method to the mappedQuery interface
  • Updated mappedConfig.process() to replace {inputURL} placeholder in both fixed values and selectors
  • Updated all query creation calls to pass the URL parameter through the scraping pipeline

I havent officially tested this as I had so many GH issues. Will do it tomorrow or someone else can test.

Fixes: #4087

@DogmaDragon
Copy link
Collaborator

@Gykes Gykes closed this Nov 8, 2025
@Gykes Gykes deleted the inputurl-clean branch November 8, 2025 07:11
@Gykes Gykes restored the inputurl-clean branch November 8, 2025 07:12
@Gykes Gykes reopened this Nov 8, 2025
Copy link
Collaborator

@DogmaDragon DogmaDragon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation check passed.

@feederbox826
Copy link
Collaborator

Was this less work than the scraping system? probably infinitely. Was it fun making the most obscure hacky workarounds for missing URL? yes.

@Gykes
Copy link
Collaborator Author

Gykes commented Nov 9, 2025

If there would be a better way to do this as far as usability goes please let me know. I don't use scrapers so not sure if this is a painful or painless implementation.

@DogmaDragon
Copy link
Collaborator

Was this less work than the scraping system? probably infinitely. Was it fun making the most obscure hacky workarounds for missing URL? yes.

Ah yes, I remember first seeing @Maista6969 __SEPARATOR__ trick. My mind was blown. 😆

@feederbox826
Copy link
Collaborator

If there would be a better way to do this as far as usability goes please let me know. I don't use scrapers so not sure if this is a painful or painless implementation.

Adding hostname as well as URL would be great, we wouldn't need to parse it out manually

@WithoutPants WithoutPants added this to the Version 0.30.0 milestone Nov 9, 2025
@WithoutPants WithoutPants added the improvement Something needed tweaking. label Nov 9, 2025
@Gykes
Copy link
Collaborator Author

Gykes commented Nov 9, 2025

If there would be a better way to do this as far as usability goes please let me know. I don't use scrapers so not sure if this is a painful or painless implementation.

Adding hostname as well as URL would be great, we wouldn't need to parse it out manually

Shouldn't be an issue. I can add it in here or if @WithoutPants wants it in a separate PR I can do that.

@WithoutPants
Copy link
Collaborator

Adding to this PR is fine.

@WithoutPants WithoutPants changed the title Feature: Make Scrapers Self-Aware Feature: Support inputURL and inputHostname in scrapers Nov 10, 2025
@WithoutPants WithoutPants merged commit 678b3de into stashapp:develop Nov 10, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Something needed tweaking.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug Report] yml scrapers aren't self-aware of url

4 participants