feat(route): add 北京大学学生就业指导服务中心 #7904
request content in script tag to get full text
This pull request is being automatically deployed with Vercel (learn more). 🔍 Inspect: https://vercel.com/diy/rsshub-do-not-use/2ZNZEZA4SZosZ6QfdFWdkw1DRhMg
Successfully generated as following:
https://rsshub-do-not-use-jo6lckd7z-diy.vercel.app/pku/scc/recruit - **Failed**
https://rsshub-do-not-use-jo6lckd7z-diy.vercel.app/pku/scc/recruit/xwrd - Success
https://rsshub-do-not-use-jo6lckd7z-diy.vercel.app/pku/scc/recruit/tzgg - Success
https://rsshub-do-not-use-jo6lckd7z-diy.vercel.app/pku/scc/recruit/zpxx - Success
https://rsshub-do-not-use-jo6lckd7z-diy.vercel.app/pku/scc/recruit/gfjgxx - Success
https://rsshub-do-not-use-jo6lckd7z-diy.vercel.app/pku/scc/recruit/sxxx - Success
https://rsshub-do-not-use-jo6lckd7z-diy.vercel.app/pku/scc/recruit/cyxx - Success
    Connection: 'keep-alive',
    'proxy-connection': 'keep-alive',
    Accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',

    })
    .get();

const sorted = list.sort((a, b) => b.pubDate.getTime() - a.pubDate.getTime()).slice(0, 10);
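A single GET request returns more than 1,000 items. The first few are pinned posts, and the rest are in chronological order. The list therefore has to be re-sorted by time so that the pinned posts are filtered out.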
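As a minimal sketch of the sort-then-slice step described above: the helper name `latestItems` and the sample items are made up for illustration, and the only assumption about the real data is that each item carries a `pubDate` that is a `Date`. Sorting newest-first and keeping the top entries pushes out pinned posts only when they are older than the items kept, which appears to be the case being described here.

```javascript
// Hypothetical helper: the name and the item shape are assumptions for illustration.
function latestItems(list, limit = 10) {
    // Copy before sorting so the caller's array is not mutated.
    return [...list]
        .sort((a, b) => b.pubDate.getTime() - a.pubDate.getTime())
        .slice(0, limit);
}

// Made-up sample: a pinned (old) post appears first, as on the listing page.
const sample = [
    { title: 'pinned (old)', pubDate: new Date('2020-01-01') },
    { title: 'newest', pubDate: new Date('2021-07-20') },
    { title: 'older', pubDate: new Date('2021-07-18') },
];

const top2 = latestItems(sample, 2);
console.log(top2.map((item) => item.title)); // [ 'newest', 'older' ]
```

Unlike calling `list.sort(...)` directly, this version leaves the original array order untouched, which may or may not matter for the route.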
const detail_page = await got({ method: 'get', url: item.link, headers });
const detail = cheerio.load(detail_page.data);
const script = detail('script', 'div#content-div').html();
const content_route = script.match(/\$\("#content-div"\).load\("(\S+)"\)/)[1];
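I could not reproduce this problem locally. I deployed it myself and found that the first request after each deployment fails, while subsequent requests succeed. It may be related to the request headers; I will look into it further.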
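The thread does not confirm why the first request fails. One possibility (an assumption, not a diagnosis) is that the first response lacks the expected inline script, in which case `.html()` returns `null` or `match` returns `null`, and indexing `[1]` throws. A guarded version of the extraction might look like the sketch below; `extractContentRoute` and the sample path are hypothetical.

```javascript
// Hypothetical helper: returns null instead of throwing when the script text
// is missing or does not contain the expected `.load("...")` call.
function extractContentRoute(scriptText) {
    if (typeof scriptText !== 'string') {
        return null;
    }
    // Same pattern as in the diff, with the dot before `load` escaped.
    const match = scriptText.match(/\$\("#content-div"\)\.load\("(\S+)"\)/);
    return match ? match[1] : null;
}

// Made-up inputs for illustration; the path is not taken from the real site.
const found = extractContentRoute('$("#content-div").load("/example/detail?id=1");');
const missing = extractContentRoute(null);
console.log(found, missing);
```

With a `null` result the caller can skip the item or fall back to the list entry's summary rather than failing the whole feed.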
ctx.state.data = {
    title: `北京大学学生就业指导服务中心 - ${feed_title}`,
    link: baseUrl,
Successfully generated as following:
https://rsshub-do-not-use-db4bhamgw-diy.vercel.app/pku/scc/recruit - Success
https://rsshub-do-not-use-db4bhamgw-diy.vercel.app/pku/scc/recruit/xwrd - Success
https://rsshub-do-not-use-db4bhamgw-diy.vercel.app/pku/scc/recruit/tzgg - Success
https://rsshub-do-not-use-db4bhamgw-diy.vercel.app/pku/scc/recruit/zpxx - Success
https://rsshub-do-not-use-db4bhamgw-diy.vercel.app/pku/scc/recruit/gfjgxx - Success
https://rsshub-do-not-use-db4bhamgw-diy.vercel.app/pku/scc/recruit/sxxx - Success
https://rsshub-do-not-use-db4bhamgw-diy.vercel.app/pku/scc/recruit/cyxx - Success
const sorted = list.sort((a, b) => b.pubDate.getTime() - a.pubDate.getTime()).slice(0, 10);
await got({ method: 'get', url: sorted[0].link, headers });
await wait(500);
After many attempts, I found that it only works this way: send one request first, wait for a while, and then send the real requests.
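The warm-up workaround described above can be sketched in isolation as follows. The definition of `wait` is not shown in the diff, so the `setTimeout`-based version here is an assumption, and `fetchWithWarmup` and its `request` parameter are hypothetical names. Unlike the diff, this sketch also swallows an error from the warm-up call, which is a deliberate deviation.

```javascript
// Assumed implementation of `wait`; the PR's own helper is not shown in the thread.
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: `request` is any async function that takes a URL.
async function fetchWithWarmup(request, url, delayMs = 500) {
    try {
        // Warm-up request; its result is intentionally ignored.
        await request(url);
    } catch (err) {
        // The first request after a fresh deployment is expected to fail sometimes.
    }
    // Give the remote site a moment before the real request.
    await wait(delayMs);
    return request(url);
}
```

In the route, `request` could be something like `(url) => got({ method: 'get', url, headers })`, so the first detail page is fetched once as a throwaway before the real loop starts.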
It looks like this is about half done? Thanks to all the maintainers for their hard work.
Involved issue
Close #7767
Example for the proposed route(s)
New RSS Script Checklist
Make use of Puppeteer?
Note