Feature Request: Twitter Thread Archiver #345
Comments
Yeah, I've wanted this for a long time too. The way it's been implemented in other projects is as a content script that unrolls threads before snapshotting inside headless Chrome.
Would it be possible for the archiver to trigger the ThreadReader app to unroll it then archive the ThreadReader result?
Then it ends up depending on ThreadReader. What if ThreadReader becomes defunct tomorrow?
@shimizurei You'd have an archive of the ThreadReader page in your ArchiveBox.
If it's part of ArchiveBox's code, then its life depends on the maintainers of ArchiveBox. ThreadReader isn't open source, so if it goes down tomorrow, that's it. Everyone will be scrambling to find a replacement because the code is not easily available. Yes, you'll have your already created archives, but you wouldn't be able to create any more.
I'd rather do this via a Python library, CLI tool, or Puppeteer scripts (once our async Playwright worker system is out). Follow here for updates on Puppeteer script support progress: #51
I would really like this feature, and I'm willing to contribute code to make it happen, if that's welcome.
There are still a lot of structural blockers in ArchiveBox's design to running content scripts directly during archiving. The most helpful approach might be to write a dedicated extractor in Python that dumps the unrolled thread to a nicer HTML file? Look for existing tools structured like youtube-dl but for Reddit and Twitter (does a
I've been looking for a box with this functionality for a long while now, with no luck. The closest thing I've found to what I imagine is https://github.com/weskerfoot/TweetLog – however, that requires access to the developer API, which I don't have. A regular thread – a sequence of tweets making a mini article (my god, what happened to good ol' blogs?) – can otherwise be archived quite easily with Thread Reader App (by calling
ThreadReaderApp has been acquired by Twitter and shut down. I think a feasible approach would be to make a config option where a Twitter developer token can be entered and then just download the thread and put it into a simple HTML file with one `<p>` paragraph tag per tweet, maybe `<br>` for newlines. I myself would do it quick and dirty and just pretend the HTML was made by readability, but I can understand if that's too much of a hack for you 😃 I also think that this feature is now of higher importance than before because of the acquisition. Before, I just archived ThreadReaderApp links.
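For illustration, the "one `<p>` per tweet, `<br>` for newlines" idea could look roughly like the sketch below. It assumes the tweet texts have already been fetched somehow (API token, scraper, or otherwise) and are available as a plain list of strings. The function name `render_thread_html` and its parameters are hypothetical and not part of ArchiveBox.

```python
from html import escape


def render_thread_html(tweets, title="Twitter thread"):
    """Render a list of tweet texts as a minimal standalone HTML page.

    Hypothetical helper: `tweets` is assumed to be a list of plain-text
    strings, one per tweet, already in thread order.
    """
    paragraphs = []
    for text in tweets:
        # Escape HTML special characters first, then mark line breaks with <br>.
        body = escape(text).replace("\n", "<br>\n")
        paragraphs.append(f"<p>{body}</p>")
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        "<body>\n" + "\n".join(paragraphs) + "\n</body></html>\n"
    )


if __name__ == "__main__":
    # Example input; a real extractor would supply the fetched tweet texts.
    print(render_thread_html(["First tweet\nwith two lines", "Second tweet"]))
```

Escaping before inserting the `<br>` tags keeps any markup inside a tweet from being interpreted as HTML, which matters if the output is later fed to a reader-mode style parser.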
How about Nitter?
FYI we use Mercury (recently renamed
Can something like the Thread Reader App be incorporated into ArchiveBox?
Type
What is the problem that your feature request solves
We can save Twitter threads (NOT individual Twitter posts) as functionally complete articles.
Describe the ideal specific solution you'd want, and whether it fits into any broader scope of changes
A nice article PDF like the one the Thread Reader app produces.
What hacks or alternative solutions have you tried to solve the problem?
ThreadReader App
How badly do you want this new feature?