Downloading msmarco_v2_doc.tar #7

@seanmacavaney

We're really excited that the v2 document corpus is now available! A couple of questions:

  • Since this file is pretty big, is it possible to replicate it in multiple regions? I'm seeing pretty drastic differences in download speeds depending on where the request is coming from. On the West Coast US: ~100MB/s. On the East Coast: ~10MB/s. In the UK: 2-3MB/s. Using azcopy didn't make a difference.
  • And/or could HTTP Range requests be enabled on the files, allowing downloads to recover from network interruptions without needing to start over? (I'm not super familiar with Azure, but from what I can tell, it looks like this is something that can be enabled.)
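
To illustrate the second request, here is a minimal client-side sketch of how a download could resume if Range requests were enabled. It is not tested against the actual hosting; the function names `range_header` and `resume_download` and the URL in the usage note are hypothetical, and it assumes the server answers a ranged request with `206 Partial Content`.

```python
import os
import urllib.request


def range_header(bytes_on_disk: int) -> dict:
    """Build a Range header asking for everything after the bytes already saved."""
    return {"Range": f"bytes={bytes_on_disk}-"} if bytes_on_disk > 0 else {}


def resume_download(url: str, dest: str, chunk_size: int = 1 << 20) -> None:
    """Download url to dest, continuing from a partial file if one exists.

    Assumption: a server that supports ranges replies 206; one that ignores
    the header replies 200 with the full body. If dest is already complete,
    the server may reply 416 and urlopen raises HTTPError.
    """
    offset = os.path.getsize(dest) if os.path.exists(dest) else 0
    request = urllib.request.Request(url, headers=range_header(offset))
    with urllib.request.urlopen(request) as response:
        # Append only when the server honoured the range; otherwise start over.
        mode = "ab" if response.status == 206 else "wb"
        with open(dest, mode) as out:
            while chunk := response.read(chunk_size):
                out.write(chunk)
```

After an interruption, calling `resume_download("https://example.com/msmarco_v2_doc.tar", "msmarco_v2_doc.tar")` again (placeholder URL) would pick up from the current size of the partial file rather than from byte zero. Without server-side Range support the same call falls back to a full re-download.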
