-
Notifications
You must be signed in to change notification settings - Fork 133
Download Cmdlet #124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Download Cmdlet #124
Conversation
|
I would love to see SFTP and FTPS on that list. Or at least keep them in mind when considering future extensions to the cmdlet. |
|
@KevinMarquette My plan is to start by fixing the gap created by the API switch from I hope that eventually this could support FTPS, SFTP, and SCP. Assuming we can find good .NET libraries to do those, the interfaces could be used to wrap pretty much any implementation and then plugged into the Then we can all start plugging our own funky protocols in So the goal here is to start small, but make it easily extensible. for new features and new protocols. |
|
I'm all for anything that increases utility with PS, and this seems like a much-needed addition for PS Core. The ability to handle multiple different downloading protocols would definitely help PS have a much more useful presence. I find myself wondering if PS would ever be capable of handling torrent downloads via To be clear, you're planning on bringing over the Also, regarding filename conflicts. I can definitely see the utility here in providing files with a datetime stamp where conflicts arise. However, I'm not sure that really has a massive amount of merit, as the file itself would carry that metadata (CreationTime) regardless, would it not? Perhaps a simpler approach that is more commonly present would also be okay. Most browsers simply append an integer to the filename, incrementing it until they get a unique filename. I think that's likely to be perfectly sound for most purposes. |
|
I think this is a great idea. You mentioned "Automatic Filename Resolution Feature" and "AsJob" in Download-Cmdlet.md and this really hit home because I actually wrote some C# not too long ago that accomplished exactly those features. It's mostly cobbled together from a few stackoverflow threads, so I don't really feel comfortable sharing (i.e. not a lot of original thought went into it), but I definitely had a need for those features. Along the same lines, I also wrote something that checks if there is enough space on the target drive before downloading the target. Not sure if that's something you could fit into your plans. In case you're interested, here's the basic idea: Also, I'm pretty sure the above won't always work depending on the website hosting the file. But I think you know a lot more than me about that stuff. |
|
Can we not use How about |
|
@Jaykul I'm not married to the name. I have even considered |
|
@vexx32 I have updated the PR to include the missing resume documentation. I had it listed in the syntax, but didn't call it out as a feature explicitly.Originally, that feature I had planed to only implement in the new cmdlet and not make it available in the web cmdlets, but there would be some pain points for advanced download scenarios that it still needed to address. In any case, the Resume feature is a first class feature and requires the protocol handler to implement Re: auto name resolution using date instead of an iterated integer. It's not about meta data. The reason it was chosen is that it is less likely to already exist, that makes it easier to implement. if you have use wget enough you know that multiple files get downloaded maybe you delete the |
|
@pldmgg re: space check. I think that is something we can iterate with later and maybe not focus on for the initial inclusion. if enough transfer protocols support the ability to query the size before transfer, then it can be extended with another interface and some more internal logic. |
|
After doing some mocking of this, I think I should revisit the interface definitions. I personally hate methods so I don't even know why I originally chose them. I think properties work best for most of those. Also, I realized that the Dictionary would be awkward if the handler is directly queried from there. So it would probably be best to include a facorty interface and use that as the value type for the dictionary: internal interface IInvokeDownloadProtocolFactory
{
IInvokeDownloadProtocol NewInvokeDownloadProtocol();
}
public class InvokeDownloadCmdlet : PSCmdlet
{
static Dictionary<string,IInvokeDownloadProtocolFactory> ProtocolHandlers;
}Anyway, I have done a mockup of the interfaces with properties and the new factory and how they would work in the cmdlet here: It's obviously not functional code. It's more just to demonstrate how I envision the interfaces working in the cmdlet to enable features on a protocol. I'll wait for some more feed back before updating the RFC with those changes. |
|
Also, as far as determining the file name, what about the I think there are a couple of other tricks too -- otherwise, you're going to end up with a lot of downloads named I think the alternative of using incrementing numbers to differentiate is probably desirable, because the full date-time stamp is going to make file names extremely large (although perhaps this could be an option?) Finally, I wonder if you shouldn't passthru by default. Specifically, if you're generating names whenever you hit a collision, that will mean people will almost always need to -Passthru in order to reliably write scripts using this cmdlet. |
1-Draft/RFCNNNN-Download-Cmdlet.md
Outdated
|
|
||
| # Download Cmdlet | ||
|
|
||
| This RFC is to propose the design of a new cmdlet specialized for remote file downloads to compliment the existing Web Cmdlets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a spelling / homophone nit (the RFC is great): it should be complement, not compliment.
1-Draft/RFCNNNN-Download-Cmdlet.md
Outdated
| The cmdlet shall have a subset of the Web Cmdlets features that are specific or essential to downloading files. | ||
| For example, the cmdlet will not support body content nor HTTP methods other than GET. | ||
|
|
||
| As the cmdlet is not centered HTTP and HTTPS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing "around" after "centered"
1-Draft/RFCNNNN-Download-Cmdlet.md
Outdated
| The cmdlet shall be extensible to support additional protocols. | ||
| The protocol handlers shall be extensible to handle additional features. | ||
|
|
||
| The extensibility is accomplished by Interfaces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Interfaces" -> "interfaces"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, consider not capitalizing "Cmdlet" either ("Web" is increasingly lowercased too).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which instances? Web Cmdlets is a proper noun.
1-Draft/RFCNNNN-Download-Cmdlet.md
Outdated
| These APIs will begin as internal for several versions. | ||
| Once they are mature, they will be made public. | ||
| This will allow users to extend and replace protocols with their own handlers. | ||
| A set of default handlers will be initialize. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"initialize" -> "initialized"
| 1. Illegal file name characters will be replaced with their URL encoded equivalent. | ||
| 1. If a file with that name is present, the basename will be appended with a period followed by `DateTime.Now.ToString("yyyMMddHHmmssfff")` | ||
| 1. The above rule will repeat until it locates a name that does not exist or 100 attempts | ||
| 1. If in the unlikely event all the above fails, an error will be generated and the user will be suggested to supply a path or change directories. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth clarifying here - and in all other cases below - whether the error is non-terminating or statement-terminating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's intentionally not clarified. I have not yet decided if it is worth expanding this to handle multiple URIs. if it is extended to handle multiple URIs it would not be terminating. If it is not extended.. it would end the cmdlet anyway as there is nothing left to process.
In other words.. this is up for discussion.
1-Draft/RFCNNNN-Download-Cmdlet.md
Outdated
| ## Alternate Proposals and Considerations | ||
|
|
||
| Instead of being extensible, the cmdlet could be a closed system. | ||
| There have been many lessons learned from the Web Cmdlets with regards to extensibility. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"regards" -> "regard"
| `-Headers` may make sense for `HTTP` and `HTTPS`, | ||
| but they will make no sense for `SSH`. | ||
|
|
||
| Alternate automated file name resolution strategies could be considered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, PowerShell's cmdlets simply fail if the target file already exists and require -Force to force replacement; examples: Move-Item, New-Item.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in this one instance it is better to have parity with peer technologies than with PowerShell. There are decades worth of expected behavior built into this kind of functionality, seeking functional parity with PowerShell in this case may not be the best approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should still have a -Force to force replacement, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And just to clarify, if the -Path is specified and the file already exists, then we would get the error and have to use a -Force to save it. The automated name only applied when the -Path is not specified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another discussion point on Path which would break what you propose here is whether a folder can be specified and allowing automatic name resolution create the file in the designated folder.
Invoke-Download 'http://contoso.com/file.exe' -Path `c:\downloads\`There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On -Force, I cringe at it, not the idea of allow overwriting the file, just the vague parameter name. it works well when the verb and noun make it clear what -Force would do. But the get-* commands that have -Force are mostly about showing hidden items, and there are no Invoke-* commands that have -Force.
I guess one assumption would be that if you are using -Force you would already know the file exists? Otherwise, would you really care about the autogen file name enough to overwrite if you didn't? And if you did know it exists, you could supply a path.
Again, not opposed to the idea. Just need to feel out all the possible issues with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My expectation when calling with a -Path to a file is that I would be able to save the file to that path.
Invoke-Download 'http://contoso.com/layout.csv' -Path 'c:\website\layout.csv'
If this will not overwrite the target file, then I would expect to be able to call -Force.
Invoke-Download 'http://contoso.com/layout.csv' -Path 'c:\website\layout.csv' -Force
I would use -Force to not care if the file already exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point.. I'm too used to the Web Cmdlets which implemented -OutFile instead of -Path/-LiteralPath and it overwrites if it exists. I'm not sure which is better.. making it work like the Web Cmdlets, more like peer technology, or more like other PowerShell commands... it seems like no matter what behavior is chosen it will be surprising depending on what you are coming from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer -AllowClobber to the generic -Force which is way overloaded
|
Re:
I will add two comments here, one for using the time stamp and another for using the incremental number and let people vote on them. I personally hate the
That's a good point. and if this is changed to a |
|
If you would prefer the automatic name resolution to use Downloading file.exe: if file.exe already exists, file.1.exe will be used. if file.exe and file.1.exe exist, file.2.exe will be used. |
|
If you would prefer the automatic name resolution to use the timestamp if the file exists and to try again until a filename is found that doesn't exist. Give this comment thumbs up. Thumbs down will not be counted, instead vote for the other comment. Downloading file.exe: if file.exe already exists, file.20180331124709627.exe will be used. if file.exe and file.20180331124709627.exe exist, file.20180331124709945.exe will be used. |
|
I think the |
|
Vote for |
|
Vote for |
|
I think |
|
We got into an intense discussion today with @PowerShell/powershell-committee on the naming today. We brainstormed a little without thinking about approved verbs. When I think of downloading a file, I think "download, fetch, get, pull, obtain, acquire". @JamesWTruher felt pretty strongly about using Where we landed--and we'd like feedback because we haven't had time time to think through this all the way--was that we should have a greater discussion on whether it might make sense to significantly expand What are folks' thoughts on this approach? |
|
That being said, it's worth mentioning we have PowerShell/PowerShell#4714 which might be worth revisiting as well |
|
@joeyaiello I'm not a fan of adding this functionality to copy-item. While that initially seems like a good idea, one motivation behind this is multiple requests across issues, at conventions, on twitter, and in person for functionality similar to wget Where it downloads the file and saves it to the CWD without a tone of parameters/arguments. Though, from the perspective of having something lightweight added into the base language (allowing for micro-installations of PS without all the additional modules) baking some kind of base functionality for HTTP/FTP get into Copy-Item does seem attractive. However, Copy-Item has a complex overloaded API and I feel adding web functionality to is is just going to open flood gates on extending that API further. On the naming..... *sigh... can someone PLEASE look at the API and see if it makes sense? I don't care what color the bikeshed is.... I want to know if it will stand on its own and keep the rain out. ;-P |
|
@markekraus Yes, it may be more complicated, but the desired experience for download is: copy-item https://foo.com/file.zip
copy-item ftp://foo.com/file.zip
copy-item smb://foo.com/file.zip
copy-item sftp://foo.com/file.zipUploads would be similar. |
|
@SteveL-MSFT Does that mean this would be supported too? Copy-Item c:\some\other\file\path\file.txt |
|
@markekraus well, if you follow Explorer, it would create I don't think remote and local operations have to be symmetric |
|
They certainly don't have to be, but that would mean further causing the behaviour of the cmdlet to vary in terms of what parameters are expected depending on which type of provider path you give it. I'm not sure diverging the behaviour there makes sense. I'm sure the dynamic parameter code for that cmdlet is already far too complicated. 😕 |
|
Well, Those used to both a source and a destination argument being required can simply use copy-item https://foo.com/file.zip . |
|
@mklement0 oh wow... I should not have just assumed that didn't work... O_O |
|
Yeah, @markekraus, it is is surprising; as I just found out, |
|
I personally would like a new cmdlet instead of overriding |
|
@daxian-dbw Can you provide some feedback on what I propose for the internals? There has been a ton of discussion about the name and public PowerShell command API, but I've had very little feedback on the internal design. |
| 1. The last segment of `Uri.Segments` | ||
| 1. In the case that the last segment is `/`, the file name `index.html` will be used | ||
| 1. Illegal file name characters will be replaced with their URL encoded equivalent. | ||
| 1. If a file with that name is present, the basename will be appended with a period followed by `DateTime.Now.ToString("yyyMMddHHmmssfff")` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason you decided not to follow the convention of adding (n) to the filename? The file itself will have a creation timestamp.
|
|
||
| Example 6: | ||
|
|
||
| * Uri: `https://i.imgur.com/gXnRf4J.jpg?these=query&elements=will&be=ignored` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about an example where the URI dynamically generates a file?
| i.e. the file is already completely downloaded and there is nothing to resume. | ||
|
|
||
| If the resume was successful, the cmdlet will append to existing file. | ||
| If resume was not successful, the cmdlet will attempt a full download load of the same file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| If resume was not successful, the cmdlet will attempt a full download load of the same file. | |
| If resume was not successful, the cmdlet will attempt a full download of the same file. |
| but they will make no sense for `SSH`. | ||
|
|
||
| Alternate automated file name resolution strategies could be considered. | ||
| `wget` uses `.1` at the end of a file when a file exists and increments until if fined s file name that doesn't exist. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| `wget` uses `.1` at the end of a file when a file exists and increments until if fined s file name that doesn't exist. | |
| `wget` uses `.1` at the end of a file when a file exists and increments until if fined a file name that doesn't exist. |
| { | ||
| internal static Dictionary<String, IInvokeDownloadProtocol> ProtocolHandlers; | ||
| } | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both external and internal implementations of those interfaces need to somehow register to the cmdlet. How would the registration work?
- You could expose a static methods from the cmdlet type for a module to call to register its implementation types. Then the module needs to implement
IModuleAssemblyInitializer.OnImportto register the protocol when it's being imported, and maybe unregister viaIModuleAssemblyCleanup.OnRemovewhen it's being removed? - Add another cmdlet for registration? Maybe not really needed ...
What are your thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way how JobManager deals with 3rd party JobSourceAdapter types is to register them when importing the module. And it looks to me the registered JobSourceAdapter types never got unregistered when the module is removed ...
I don't think we should update the engine code to register 3rd party download protocol implementation during module importing. It should be more explicit and something done by the module itself.
I guess providing cmdlets for registration/unregistration also makes sense, so a module author can also choose to register in .psm1 file, and unregister with the OnRemove property of PSModuleInfo.
I understand the desire to avoid monolithic cmdlets that are overwhelming in terms of both implementation complexity and UX (a bewildering array of parameter sets). How about Of course, that implies that we may need a separate |
|
Tabling the discussion on the name right now, as we wasted lots of time last week on it without no meaningful progress. Right now, we in the @PowerShell/powershell-committee think that this implementation proposal looks mostly right, but we won't be sure about the interface definitions until someone (ideally @markekraus) starts implementing a couple different protocols against it. Ideally, we would like to have someone else (on @SteveL-MSFT's team, if no one else signs up) to implement one of these protocols to validate that the interface is general enough. For now, we say rock on the implementation behind an experimental feature flag, and we can revise this RFC as we learn more. Can you make any planned updates, update your filename and front-matter to RFC0034, and re-target the |
| The syntax shall be as follows: | ||
|
|
||
| ```none | ||
| Invoke-Download -Uri <Uri> [-Credential <PSCredential>] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's important for this cmdlet to implement IDynamicParameters, and design a way to allow the type that implements IInvokeDownloadProtocol to provide the dynamic parameters as needed.
| { | ||
| Stream GetStream(PSCmdlet Cmdlet); | ||
| Uri GetUri(); | ||
| void SetUri(Uri RequestUri); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This interface might need an extra method GetDynamicParameters(<some-parameter>). So if the protocol requires additional inputs from users, the implementation can extend the cmdlet syntax by defining dynamic parameters.
|
I love how PowerShell can do magical things. From this point of view, it is much more promising to implement Copy-Item and inter-provider copying. It realizes fantastic flexibility. If this path is correct then the name of the cmdlet is not essential.
@joeyaiello Since there is no clear preference, you could ask the project’s founders to decide |
|
Given how we're moving forward on experimenting with modules/cmdlets in the Gallery before delivering them into PowerShell, we'd love to see someone ship a prototype or v1 of this to validate that the current thinking makes sense (and to exhaust the naming issue/discussion in the wild). |
|
Given the gap since |
|
@joeyaiello, is this path to potential inclusion in PowerShell itself via gallery-published prototypes documented / formalized somewhere, in a detailed, procedural manner? It would help to have a document to point people to that explains the process. |
For details of the RFC, please see the file changes.
Automatic Name Rsulution
To vote for
.1,.2style auto file name resolution vote thumbs up on this comment.To vote for timestamp based file name resolution vote thumbs up on this comment.
Cmdlet Name
To vote for
Invoke-Downloadvote thumbs up on this comment.To vote for
Get-RemoteFilevote thumbs up on this comment.