2018-03-26

New Feature for 6.1: Resume Partial Downloads with Invoke-WebRequest and Invoke-RestMethod


Intro

I’m excited to announce a new feature for Invoke-WebRequest and Invoke-RestMethod that will ship with PowerShell Core 6.1.0: Resume Downloads!

This is a feature that has been requested many times over the years, and I’m pleased to say that it will be included in the next release of PowerShell Core. You should be able to preview the feature in 6.1.0-preview.2.

You can see the code changes in PR #6447.


Background

Have you ever had the download of a large file interrupted? Either the internet goes out, the computer reboots, or some problem occurs with the site you are downloading the file from, and you are staring at 5GB of a 10GB file on your computer. You think, “No problem! I can just start the download again and it will pick up where it left off!” However, instead of downloading only what you are missing, you discover that the download has restarted from scratch.

Web browsers, curl, and wget all support resuming downloads. If your download stops before it is complete, your web browser can generally pick up where the download left off. Some of the CLI web tools support this too through special command line switches. But, Invoke-WebRequest and Invoke-RestMethod have not supported this. If you try to use a partial file as the -OutFile, the web cmdlets will start over downloading the file and overwrite the existing file.

This can be a huge pain with large files, poor internet connections, or busy websites. The fact that you have to start all over from the beginning is bad enough, but sometimes you have to repeat the process over and over until it finally succeeds.

But, fret no more! Download Resume is coming in PowerShell Core 6.1!

On With The Code!

So, how will this new feature work? It’s pretty simple, actually: you just supply the same -OutFile and add the -Resume switch!

The following code will partially download a file. The target file is a picture of Rin at https://i.imgur.com/gXnRf4J.jpg.

$Uri = 'https://i.imgur.com/gXnRf4J.jpg'
$OutFile = 'c:\temp\rin.jpg'
# Ask the server for only the first 10,001 bytes to simulate a partial download.
$Header = @{Range = "bytes=0-10000"}
# Remove any previous copy so we start from a clean slate.
Remove-Item $OutFile -ErrorAction SilentlyContinue -Force -Confirm:$false
Invoke-WebRequest -Uri $Uri -Headers $Header -OutFile $OutFile

You can see we are removing the file, if it exists, for good measure! You can ignore the header; I’m just using it to tell the remote server to send only the first 10,001 bytes (bytes 0 through 10000). Let’s check to see if we got that many:

Get-Item $OutFile | Select-Object Length

Which outputs:

Length
------
10001

Good! We have now simulated a partial download. At this point you can ignore all of the above; none of it is required to resume a download. I just wanted to provide a way for everyone to test this themselves.

Here is how you download the rest of your partially downloaded file:

$result = Invoke-WebRequest -PassThru -Uri $Uri -OutFile $OutFile -Resume

All you have to do is supply the path of the partial download to -OutFile and use the -Resume switch. The -PassThru switch is there just so we can demonstrate that only the remainder of the file was downloaded; it is not required.

Here is how we can verify everything worked as expected:

Get-Item $OutFile | Select-Object Length
$result.Headers.'Content-Range'
$result.Headers.'Content-Length'

The output:

Length
------
29339

bytes 10001-29338/29339
19338

The file size is now 29,339 bytes, but the download was only 19,338 bytes! The Content-Range header shows the byte range the server returned along with the total size of the file in bytes, and the Content-Length header confirms that only the missing 19,338 bytes were transferred.

Since I was including a bunch of unnecessary things for the sake of demonstration, here is all that is required to resume downloading the partial file:

Invoke-WebRequest -Uri $Uri -OutFile $OutFile -Resume

Simple!


More Details and a Warning

First of all, the original usage of -OutFile remains the same. If you do not supply -Resume, the local file will be overwritten and the remote file will be downloaded from scratch. We do not try to auto-detect whether you are resuming, so the -Resume switch is required.

The resume feature operates on file size only. It does not do any validation to ensure that the file supplied to -OutFile is the same as the remote file you are trying to download. It is up to you to ensure you are resuming the correct file.
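Since no content validation is performed, one way to protect yourself is to compare a checksum once the download completes. This is a minimal sketch, assuming you know the expected SHA256 hash of the remote file; the hash value below is a placeholder, not a real one:

```powershell
# Placeholder value: substitute the hash actually published for the file you downloaded.
$ExpectedHash = '0000000000000000000000000000000000000000000000000000000000000000'

# Hash the completed local file and compare.
$ActualHash = (Get-FileHash -Path $OutFile -Algorithm SHA256).Hash
if ($ActualHash -ne $ExpectedHash) {
    Write-Warning 'Hash mismatch: the resumed file may be corrupt or a different file entirely.'
}
```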

If the local file is smaller than the remote file, the web cmdlets will append the remaining bytes of the remote file to the end of the local file.

If the local file is the same size as the remote file, the local file will remain untouched, as we assume you have already completed downloading the file. This results in a 416 status code from the remote server. We make a special exception for this status and treat it as not being an error, but only when the files are the same size.

If the local file is larger than the remote file, then we assume this is not the same file. The local file will be overwritten and the remote file will be downloaded from scratch.

If the remote server does not support resuming downloads, the local file will be overwritten and the remote file will be downloaded from scratch.

If the local file does not exist, the file will be created and the remote file will be downloaded from scratch.
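The rules above boil down to a comparison of the local and remote file sizes. Here is an illustrative helper function summarizing that decision logic; this is my own restatement of the behavior described above, not the cmdlets’ actual implementation:

```powershell
# Illustrative only: maps local/remote file sizes to the resume behavior described above.
function Get-ResumeAction {
    param(
        [long]$LocalLength,   # size of the file given to -OutFile (0 if it does not exist)
        [long]$RemoteLength   # size of the remote file
    )
    if ($LocalLength -eq 0)             { return 'DownloadFromScratch' }
    if ($LocalLength -lt $RemoteLength) { return 'AppendRemainingBytes' }
    if ($LocalLength -eq $RemoteLength) { return 'LeaveUntouched' }      # server returns 416
    return 'OverwriteAndRestart'        # local file is larger: assumed to be a different file
}
```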

Since many of the outcomes involve overwriting the local file and downloading from scratch, you should treat this feature as destructive. If the local data is important to you, make a backup copy first. Usually, when resuming a download, the partial data is useless on its own anyway, so I’m just being overly cautious with this warning. It’s best to think of the resume feature as “best effort”: it tries to resume if possible, but if it cannot, it falls back to the original behavior of -OutFile.
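Because -Resume falls back safely to the original -OutFile behavior, it pairs well with a simple retry loop on flaky connections. A minimal sketch; the attempt count and delay are arbitrary choices, and $Uri/$OutFile are the variables from the earlier examples:

```powershell
$MaxAttempts = 5
for ($Attempt = 1; $Attempt -le $MaxAttempts; $Attempt++) {
    try {
        # -Resume picks up the partial file left by a previous failed attempt, if any.
        Invoke-WebRequest -Uri $Uri -OutFile $OutFile -Resume -ErrorAction Stop
        break  # download completed
    }
    catch {
        Write-Warning "Attempt $Attempt failed: $_"
        Start-Sleep -Seconds 5
    }
}
```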


Conclusion

This was a somewhat difficult feature to implement. I spent quite a few weeks trying to hammer out all the scenarios and how to address them. It was also a bit painful to create tests for. My compliments to browser developers for making this work; it’s no simple task, for sure! But I love a good challenge. This was fun to tackle, I learned a lot about the HTTP Range header, and I’m happy to finally implement something that I and many others have wanted for a long time!

If you end up using this feature, please, let me know! If you run into issues, please file a bug report.

I hope you enjoy this new feature and I’m looking forward to hearing about how you make use of it!