cronjobs: inject canonical URLs into older manual pages (SEO)#1241
neteler merged 4 commits into OSGeo:grass8
The GRASS GIS manual pages of the different versions have long been published under a hard-to-understand scheme in which pages are invisible, redirected, or shown, and this also strongly affects their search engine ranking.

SEO: Without an indication of "canonical" URLs, the different versions wipe each other out in search engines. Canonical tags help consolidate duplicate or similar content by specifying the preferred version of a page, ensuring that search engines index and rank the desired URL while avoiding duplicate content issues.

This PR changes the cronjob scripts to
- inject "grass-stable" as the "canonical" into older manual pages under versioned URLs
- inject "grass-devel" as the "canonical" into the development manual pages under versioned URLs

This way, no "duplicate content" should occur from an SEO perspective.

Also, `robots.txt` is updated to reactivate the manual pages of old GRASS GIS versions (which now contain "grass-stable" as the canonical).

Fixes OSGeo/grass#4579
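The canonical injection described above could look roughly like the following Python sketch. It is not the code from this PR: the function name `inject_canonical`, the base URL layout `https://grass.osgeo.org/grass-stable/manuals`, and the choice to insert the tag just before `</head>` are all assumptions made for illustration.

```python
import re

# Assumed URL layout; the base URL used by the real cronjob may differ.
CANONICAL_BASE = "https://grass.osgeo.org/grass-stable/manuals"

_HEAD_CLOSE = re.compile(r"</head>", re.IGNORECASE)
_HAS_CANONICAL = re.compile(r"<link[^>]*rel=[\"']canonical[\"']", re.IGNORECASE)


def inject_canonical(html: str, page_name: str, base: str = CANONICAL_BASE) -> str:
    """Return html with a canonical <link> inserted before the first </head>.

    Pages that already carry a canonical link, or that have no </head>,
    are returned unchanged, so repeated cronjob runs do not stack tags.
    """
    if _HAS_CANONICAL.search(html):
        return html
    tag = f'<link rel="canonical" href="{base}/{page_name}">\n'
    # A callable replacement avoids any backslash or group interpretation
    # of the inserted tag text.
    return _HEAD_CLOSE.sub(lambda m: tag + m.group(0), html, count=1)
```

Making the function idempotent matters here because a cronjob would typically revisit the same files on every run.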
Note: these files are now deployed on grass.osgeo.org for testing.
Wouldn't the grass-devel and grass-stable primary content be rather similar, thus being potentially penalized as duplicates? How does this method handle pages that don't exist in later versions, or are renamed/moved?
… to point to "stable" manual (rather than "devel")
Very good point. Deployed update on grass.osgeo.org, triggered cronjob and told Google Search about it.
A few of them are handled with redirects in Apache. I would not know any other method.
Too bad, now the building with Can @landam help?
Bugfix PR: OSGeo/grass#4739
Is everything clear for this one now? |
So far, https://grass.osgeo.org/sitemap.xml listed the versioned manual pages, which is unhelpful for consolidating search engine results for the manuals. In the past months we were penalized for "duplicate content".

For an overview, see OSGeo/grass#4579

For efforts to address this situation, see
- OSGeo/grass-addons#1168
- OSGeo/grass-addons#1241

This PR changes the URLs in `sitemap.xml` from versioned manual URLs to grass-stable/grass-devel in order to complete the other PRs.
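As a rough illustration of the sitemap change, the sketch below rewrites versioned manual URLs to a single target and drops the duplicates that result. The URL pattern `grass<digits>/manuals/` and the helper name `consolidate_urls` are assumptions, and the sketch maps everything to `grass-stable`; how the actual PR decides between grass-stable and grass-devel is not shown in this thread.

```python
import re

# Assumed pattern for versioned manual URLs, e.g. ".../grass84/manuals/...".
_VERSIONED = re.compile(r"(https://grass\.osgeo\.org/)grass\d+(/manuals/)")


def consolidate_urls(urls, target="grass-stable"):
    """Rewrite versioned manual URLs to `target`, keeping first occurrences only."""
    seen = set()
    result = []
    for url in urls:
        rewritten = _VERSIONED.sub(
            lambda m: m.group(1) + target + m.group(2), url
        )
        # Several versions collapse onto the same URL, so deduplicate
        # while preserving the original order.
        if rewritten not in seen:
            seen.add(rewritten)
            result.append(rewritten)
    return result
```

URLs that do not match the versioned pattern pass through unchanged, so non-manual entries in the sitemap would be left alone.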
Almost. I am currently fixing the red box injection into the old libpython manual pages, which fails with the globbing-related error
@wenzeslaus from my side this PR is now complete.
cronjob files including
The GRASS GIS manual pages of the different versions have long been published under a hard-to-understand scheme in which pages are invisible, redirected, or shown, and this also strongly affects their search engine ranking.
SEO: Without an indication of "canonical" URLs, the different versions wipe each other out in search engines. Canonical tags help consolidate duplicate or similar content by specifying the preferred version of a page, ensuring that search engines index and rank the desired URL while avoiding duplicate content issues.
This PR changes the cronjob scripts to
This way, no "duplicate content" should occur from an SEO perspective.
Also, `robots.txt` is updated to reactivate the manual pages of old GRASS GIS versions (which now contain "grass-stable" as the canonical).
Additionally, rewrite the red box injection to avoid the globbing error `argument list too long` in old versions of the libpython manual.
Fixes OSGeo/grass#4579
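The `argument list too long` error typically appears when a shell expands a glob into more file names than a single command line can hold. One way to sidestep it, sketched here in Python rather than in the cronjob's own scripting, is to visit files one at a time inside a single process. The notice markup, the function name `inject_notice`, and the choice to insert after the opening `<body>` tag are hypothetical; the real red box HTML and insertion point are not shown in this thread.

```python
from pathlib import Path

# Hypothetical notice markup; the actual "red box" HTML is an assumption.
NOTICE = b'<div class="outdated-notice">This page documents an older version.</div>\n'


def inject_notice(directory, notice=NOTICE):
    """Insert `notice` after the opening <body> tag of each *.html file.

    Files are processed one by one in-process, so no expanded file list is
    ever passed on a command line. Returns the number of files changed.
    """
    changed = 0
    for path in sorted(Path(directory).glob("*.html")):
        data = path.read_bytes()  # bytes avoid guessing old pages' encoding
        if notice in data:
            continue  # already injected on an earlier run
        start = data.lower().find(b"<body")
        if start == -1:
            continue
        end = data.find(b">", start)
        if end == -1:
            continue
        path.write_bytes(data[: end + 1] + b"\n" + notice + data[end + 1 :])
        changed += 1
    return changed
```

Working on bytes is a deliberate choice in this sketch: older manual pages may not be UTF-8, and byte-level insertion leaves the rest of each file untouched.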