This repository was archived by the owner on Jun 21, 2022. It is now read-only.

webis-de/webis-web-archiver

Note: development continues as scriptor.

webis-web-archiver

Source code and scripts for the Webis Web Archiver.

If you use the archiver, please cite the paper that describes it in detail.

Quickstart

You need to have Docker installed.

Then, on a Unix machine:

  • run src-bash/archive.sh to archive web pages. It will display usage hints.
  • run src-bash/reproduce.sh to reproduce web pages from an archive. It will display usage hints.

The scripts will automatically download and run the Docker image (2GB+ due to all the fonts).

For other OSes, have a look at the shell scripts and adjust the call to docker run accordingly.

Custom user simulation scripts

  • Write a class that extends InteractionScript.

  • You can use the ScrollDownScript as an example, or extend it.

  • The utility class Windows offers static helper methods for frequently used interactions.

  • Compile your script with the binaries in the class path and create a JAR from it.

  • Place the JAR into a directory named "scriptname-1.0.0", where you replace "scriptname" by the name of your script.

  • Create a file "script.conf" with the following content and put it into the same directory:

    script = packages.of.your.ScriptClass
    environment.name = de.webis.java
    environment.version = 1.0.0
    

    where you replace "packages.of.your.ScriptClass" accordingly. For the example ScrollDownScript, that would be

    script = de.webis.webarchive.environment.scripts.ScrollDownScript
    
  • The script src-bash/compile-scroll-down-script.sh illustrates the complete compilation process for the ScrollDownScript. Adapt it for your own script.

  • When running archive.sh or reproduce.sh, specify the parent directory of the new directory with "--scriptsdirectory" and give the script name (as in the name of the new directory) with "--script".
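
The directory layout described in the steps above can be sketched in shell as follows. This is only an illustration under assumptions: the script name "myscript", the class name "com.example.MyScript", the scripts directory "./scripts", and the JAR file name are hypothetical placeholders, not names from this repository. Compiling the class and building the JAR are not shown; see src-bash/compile-scroll-down-script.sh for that part.

```shell
#!/bin/sh
set -eu

# Hypothetical placeholders: replace with your own script name and class.
script_name="myscript"
script_class="com.example.MyScript"

# Directory you would later pass with --scriptsdirectory (assumed location).
scripts_dir="./scripts"

# The script lives in a sub-directory named "<scriptname>-1.0.0".
script_dir="$scripts_dir/$script_name-1.0.0"
mkdir -p "$script_dir"

# Write script.conf with the three entries listed above.
cat > "$script_dir/script.conf" <<EOF
script = $script_class
environment.name = de.webis.java
environment.version = 1.0.0
EOF

# Finally, copy your compiled JAR next to script.conf, for example:
#   cp myscript.jar "$script_dir/"
# (the JAR file name is an assumption; it is not created by this sketch)
```

After this, "./scripts" is the directory to give with "--scriptsdirectory", and the script name from the new directory's name is what "--script" expects.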
