2017-11-27

PowerShell Core Web Cmdlets in Depth (Part 1)

2017112501

Intro

I recently spoke at the North Texas PC Users Group's PowerShell Special Interest Group on the topic of the Web Cmdlets in PowerShell Core. I spoke for a full hour because there is just so much new and different about Invoke-RestMethod and Invoke-WebRequest between Windows PowerShell 5.1 and PowerShell Core 6.0.0. In fact, because I was limited to an hour, I couldn't go as in depth or cover as many things as I would have liked. This blog series will cover what I covered in that presentation and more. At this time I plan to have 3 parts with a possible 4th as an addendum should anything change between now and GA.


Before You Begin

This series goes in depth into the Web Cmdlets as they are in PowerShell Core 6.0.0. There is quite a bit to cover. In order to save time and space, I will not go in depth into certain related concepts (especially with regard to HTTP) and will leave learning and researching those concepts up to the reader. I will attempt to provide a comprehensive look at all of the differences between 5.1 and 6.0.0; however, I may miss something. Also, this article may be a rough read for the novice PowerShell user and even, at times, for the intermediate PowerShell user. I will cover the .NET "under the hood" changes as well as changes visible to the PowerShell user. Some of the concepts may be considered advanced and others are relevant only in the underlying C# code. I will attempt to provide links to information for concepts I do not cover directly. I will not be defining many initialisms and acronyms that are considered common in .NET, C# or PowerShell. I will only define initialisms and acronyms where they are not common or of my own making.

Current Version

At the time this series was written, PowerShell Core 6.0.0-rc had just been released. There are no plans to add any new features between now and GA to the Web Cmdlets, so information in this article should be relevant for 6.0.0 when it is finally released. There may be additional bug fixes added between now and GA. If any do occur, I will add an addendum section at the end of the series.

About The Author

Since not all new readers take the time to read all my previous blogs to find out who I am, I wanted to take this opportunity to reintroduce myself as well as provide my qualifications for writing on the Web Cmdlets in PowerShell Core.

My name is Mark Kraus and I'm a Lead IT Solutions Architect for Mitel. In addition to my day job, which has a heavy dose of PowerShell automation, I volunteer my time to assist PowerShell users in /r/PowerShell and the PoshCode Slack.

In the past 4 months I have contributed heavily to the PowerShell Core project, enough so to have been "promoted" to a Collaborator for the project. The area to which I have contributed the most is the Web Cmdlets, Invoke-RestMethod and Invoke-WebRequest. I know many of the new features intimately because I was primarily responsible for adding them. Through that work, I have also become closely acquainted with the Web Cmdlet code base and its history since 5.1.


Under The Hood

Several key changes have been made to the underlying architecture of the Web Cmdlets in PowerShell Core. Some of these are visible to the PowerShell user and others are less so. This section covers the changes made to the underlying .NET APIs called by the Web Cmdlets and how those changes manifest themselves in ways that will be obvious or obscure to PowerShell users.


The Move from WebRequest to HttpClient

The chief architectural change in PowerShell Core 6.0.0 is in the primary .NET API called by the Web Cmdlets. In PowerShell 5.1, the web cmdlets used System.Net.WebRequest. In PowerShell Core 6.0.0 the web cmdlets now use System.Net.Http.HttpClient. HttpClient is a newer API introduced in .NET Framework 4.5. It is more specialized for dealing with HTTP requests and includes features for modern REST APIs. WebRequest is an older and more generalized web API with broader support for web protocols outside of HTTP. You can read about the differences in the APIs in depth here.

This API switch has several effects on the PowerShell user experience. The rest of the Under The Hood section covers the cascading changes that result from this switch.


Strict Header Parsing

One of the goals of the HttpClient API is to provide a more object oriented and standards based approach to HTTP. This manifests in the PowerShell user experience by changing the behavior of how the -Headers and -UserAgent parameters are parsed. In Windows PowerShell, you could supply pretty much any request header and value to -Headers or whatever you wanted to -UserAgent and the cmdlets would make the request without issue. In PowerShell Core, the default is to parse the headers for standards compliance. This means if the header you are supplying is a well known header with a standards defined value format, supplying a non-standards compliant value will result in an error.

$Params = @{
    Uri = 'http://httpbin.org/headers'
    headers = @{
        "if-match" = "12345"
    }
}
Invoke-WebRequest @Params

In 5.1 the above would work without issue. However, in 6.0.0 you will get the following error:

Invoke-WebRequest : The format of value '12345' is invalid.
At line:1 char:1
+ Invoke-WebRequest @Params
+ ~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : NotSpecified: (:) [Invoke-WebRequest], FormatException
+ FullyQualifiedErrorId : System.FormatException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand

The If-Match request header is defined in RFC 7232 section 3.1. The value is an entity-tag, which is required to be enclosed in double quotes.

"if-match" = '"12345"'

Some remote endpoints are tolerant and will allow unquoted entity-tags. In 5.1 you could use those tolerant remote endpoints with "lazy" entity-tags without fuss from the cmdlets. However, due to the switch to HttpClient in PowerShell Core, you need to either make the value standards compliant or make use of the -SkipHeaderValidation parameter switch.

$Params = @{
    Uri = 'http://httpbin.org/headers'
    headers = @{
        "if-match" = '"12345"'
    }
}
Invoke-WebRequest @Params

# or:

$Params = @{
    Uri = 'http://httpbin.org/headers'
    headers = @{
        "if-match" = "12345"
    }
}
Invoke-WebRequest @Params -SkipHeaderValidation

This also applies to the -UserAgent parameter:

# Errors:
$Params = @{
    Uri = 'http://httpbin.org/user-agent'
    UserAgent = "Non-Standards Compliant: User-Agent"
}
Invoke-WebRequest @Params

# Succeeds:
$Params = @{
    Uri = 'http://httpbin.org/user-agent'
    UserAgent = "Non-Standards Compliant: User-Agent"
}
Invoke-WebRequest @Params -SkipHeaderValidation
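
The validation itself comes from the HttpClient header collections, so it can be reproduced without making a request. The following is a minimal sketch for PowerShell Core only; Test-RequestHeaderValue is a hypothetical helper name and not part of the web cmdlets. It relies on HttpRequestHeaders.Add() throwing when a value does not parse.

```powershell
# Hypothetical helper (PowerShell Core only): reports whether HttpClient's
# request header collection accepts a given header name/value pair.
function Test-RequestHeaderValue {
    param(
        [string]$Name,
        [string]$Value
    )
    $request = [System.Net.Http.HttpRequestMessage]::new()
    try {
        # Add() parses the value for well known headers and throws on failure.
        $request.Headers.Add($Name, $Value)
        $true
    }
    catch {
        # Rejected: a malformed value, or a name that is not a request header.
        $false
    }
    finally {
        $request.Dispose()
    }
}

Test-RequestHeaderValue -Name 'If-Match' -Value '12345'     # unquoted entity-tag
Test-RequestHeaderValue -Name 'If-Match' -Value '"12345"'   # quoted entity-tag
```

The first call should be rejected and the second accepted, which mirrors the behavior of -Headers without -SkipHeaderValidation.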

If you are working with code that sets headers and you want it to work on both 5.1 and 6.0.0, you may want to consider adding the following:

$PSDefaultParameterValues['Invoke-RestMethod:SkipHeaderValidation'] = $true
$PSDefaultParameterValues['Invoke-WebRequest:SkipHeaderValidation'] = $true

This will turn on -SkipHeaderValidation if it exists. On 5.1 the parameter does not exist, so the default is simply ignored. On 6.0.0 it activates the parameter switch on every call.
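
If you would rather not change session-wide defaults, another option is to check whether the parameter exists and splat it only when it does. This is a sketch under that assumption; Get-SkipHeaderValidationSplat is a hypothetical helper name.

```powershell
# Hypothetical helper: returns a hashtable that can be splatted onto a web
# cmdlet call. It contains SkipHeaderValidation only when the target command
# actually exposes that parameter (6.0.0), and is empty otherwise (5.1).
function Get-SkipHeaderValidationSplat {
    param(
        [string]$CommandName = 'Invoke-WebRequest'
    )
    $splat = @{}
    $command = Get-Command -Name $CommandName -ErrorAction Stop
    if ($command.Parameters.ContainsKey('SkipHeaderValidation')) {
        $splat['SkipHeaderValidation'] = $true
    }
    $splat
}
```

Usage would look like this: $Skip = Get-SkipHeaderValidationSplat, followed by Invoke-WebRequest @Params @Skip. The call is identical on both versions and only the contents of $Skip differ.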

More Info: Issue #2895, Pull Request #4085, and Pull Request #4479.

HttpWebResponse is now HttpResponseMessage

Another side effect of the switch to HttpClient in PowerShell Core is the change in the objects returned from the API. In 5.1, the WebRequest API would return one of System.Net.FileWebResponse, System.Net.FtpWebResponse, or System.Net.HttpWebResponse. These all derive from the System.Net.WebResponse base class. In 6.0.0, HttpClient returns a System.Net.Http.HttpResponseMessage. This probably doesn't concern the vast majority of PowerShell users, but it should be of particular note for those who wrap or proxy the web cmdlets, particularly in the areas of response header parsing (covered later) and error handling.

This means that the BaseResponse property on BasicHtmlWebResponseObject and the Response property on exceptions thrown by the web cmdlets are now HttpResponseMessage objects.

$Response = Invoke-WebRequest 'http://httpbin.org/get'
$Response.BaseResponse.GetType().FullName
try {
    Invoke-WebRequest 'http://httpbin.org/status/418' -ErrorAction Stop
}
Catch {
    $_.Exception.Response.GetType().FullName
}

Result:

System.Net.Http.HttpResponseMessage
System.Net.Http.HttpResponseMessage

This object has a very different shape than the HttpWebResponse object in 5.1, to the point of near incompatibility. If you have code in 5.1 that directly accesses these objects, you will need to make many changes to make it work with 6.0.0. If you need your code to work on both 5.1 and 6.0.0, you will need to gate the logic on the PowerShell version.
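
A version gate can be as simple as the sketch below. Get-WebCmdletResponseTypeName is a hypothetical helper, and the -MajorVersion parameter exists only so that both branches can be exercised from one session. In real code, each branch would hold the version-specific logic.

```powershell
# Hypothetical helper: returns the name of the response type the web cmdlets
# expose in BaseResponse for a given major PowerShell version.
function Get-WebCmdletResponseTypeName {
    param(
        [int]$MajorVersion = $PSVersionTable.PSVersion.Major
    )
    if ($MajorVersion -ge 6) {
        # PowerShell Core: HttpClient based
        'System.Net.Http.HttpResponseMessage'
    } else {
        # Windows PowerShell 5.1 and earlier: WebRequest based
        'System.Net.HttpWebResponse'
    }
}

Get-WebCmdletResponseTypeName -MajorVersion 5
Get-WebCmdletResponseTypeName -MajorVersion 6
```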


Headers Values are Now String Arrays

A side effect of the change to HttpClient and the resulting HttpResponseMessage is that the values of the Headers property of BasicHtmlWebResponseObject are now String Arrays instead of Strings. HttpClient implements support for multiple response headers with the same name as an array. This requires a bit of explaining on HTTP Response headers. There are two RFC defined methods for returning multiple values in a response header. One is to use a field separator like this:

X-Header: Value1,Value2,Value3

The other is to use multiple headers with the same name like this:

X-Header: Value1
X-Header: Value2
X-Header: Value3

Sometimes response header values need to contain separators as part of their value. When that is the case, a remote endpoint may opt to use the second method and return multiple headers with the same name. In 5.1, regardless of how the remote endpoint returned the headers, the header would be presented as a single string joined by commas:

"Value1,Value2,Value3" -eq $Response.Headers.'X-Header'

In 6.0.0, when there are multiple response headers with the same name, each value becomes an array element:

"Value1" -eq $Response.Headers.'X-Header'[0]
"Value2" -eq $Response.Headers.'X-Header'[1]
"Value3" -eq $Response.Headers.'X-Header'[2]

This could be an issue for PowerShell users who process response headers. If your code was made with the assumption that the headers were a single string (and there is a very high chance that it was), your logic will need to be updated. Even when a single value is returned for a response header, this can still cause issues. For example, the Expires header:

$Response = Invoke-WebRequest -uri 'https://google.com'
$Response.Headers.'Expires'

Result:

-1

That displayed result is the same for both 5.1 and 6.0.0 due to default formatting. But if you try to assign this to a strongly typed variable or class property this will be an issue:

[int]$Expires = $Response.Headers.'Expires'

On 5.1 the above would execute without issue. However, on 6.0.0 you get the following:

Cannot convert the "System.String[]" value of type "System.String[]" to type "System.Int32".
At line:1 char:1
+ [int]$Expires = $Response.Headers.'Expires'
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : MetadataError: (:) [], ArgumentTransformationMetadataException
+ FullyQualifiedErrorId : RuntimeException

PowerShell does some magic for you to parse strings into integers. That magic does not extend to arrays of strings. You have a couple of options for dealing with this while making your code forwards and backwards compatible. The first is to join the array with commas and treat the string as you always did in 5.1:

[int]$Expires = $Response.Headers.'Expires' -Join ','

This works because in 5.1 there is nothing to join so it just becomes the original string and in 6.0.0 it converts the array into a single string. A single array element becomes a single string without any commas.

Another option is to use Select-Object when you know you will only be dealing with 1 header value:

[int]$Expires = $Response.Headers.'Expires' | Select-Object -First 1

Obviously on 5.1 there is only one object anyway and on 6.0.0 it will return just the first element of the array.

Avoid Using the Index Operator

You may be wondering why you shouldn't just use the index operator.

# DO NOT DO THIS:
[int]$Expires = $Response.Headers.'Expires'[0]
$Expires

On 6.0.0 you will see:

-1

However, on 5.1 you will see this:

45

When you use the index operator on a String, it blows the String up into a Char array and then grabs the referenced Char. In the String "-1", "-" is the index 0 Char. When you convert the Char "-" into an Int you get the number 45. In any case, avoid using the index accessor unless you are working strictly with 6.0.0.
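
If you do want index-style access that behaves the same on both versions, wrapping the value in the array subexpression operator @() first is one way to get it. This is a small sketch; Get-FirstHeaderValue is a hypothetical helper name.

```powershell
# Hypothetical helper: returns the first header value whether the input is a
# single String (5.1) or a String array (6.0.0).
function Get-FirstHeaderValue {
    param(
        $Value
    )
    # @() leaves an array unchanged and wraps a lone String in a one-element
    # array, so [0] never indexes into the characters of a String.
    @($Value)[0]
}

[int]$FromString = Get-FirstHeaderValue -Value '-1'
[int]$FromArray  = Get-FirstHeaderValue -Value ([string[]]@('-1'))
$FromString
$FromArray
```

Both variables should hold -1, because the String is never indexed as a Char array.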


Content Headers are Separated

Another consequence of the change to HttpClient and the resulting HttpResponseMessage is that Content related headers are separated from other response headers in the BaseResponse property on BasicHtmlWebResponseObject and the Response property on exceptions. In 5.1, all of the response headers are available in the HttpWebResponse.Headers property. In 6.0.0 content related headers such as Content-Type and Content-Length are not available in HttpResponseMessage.Headers. Instead, they can be found in HttpResponseMessage.Content.Headers.

$Response = Invoke-WebRequest 'http://httpbin.org/get'
$Response.BaseResponse.Headers.GetValues('Content-Type')

In 5.1 you get:

application/json

In 6.0.0 you get:

Exception calling "GetValues" with "1" argument(s): "Misused header name. Make sure request headers are used with HttpRequestMessage, response headers with HttpResponseMessage, and content headers with HttpContent objects."
At line:1 char:1
+ $Response.BaseResponse.Headers.GetValues('Content-Type')
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : InvalidOperationException

You can access the content related headers in 6.0.0 like this:

$Response.BaseResponse.Content.Headers.GetValues('Content-Type')

This does not apply to BasicHtmlWebResponseObject.Headers or to headers returned from the -ResponseHeadersVariable parameter on Invoke-RestMethod (more on that new feature later). PowerShell Core reconstitutes all headers into a single dictionary:

$Response = Invoke-WebRequest 'http://httpbin.org/get'
$Response.Headers.'Content-Type'

That will have the same output on 5.1 and 6.0.0:

application/json

This should only affect you if your code is working with web cmdlet exceptions or you choose to work directly with the BaseResponse property on BasicHtmlWebResponseObject.
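
If you do work with the HttpResponseMessage directly and don't want to remember which collection a header lives in, a lookup that tries both is one option. This is a sketch for 6.0.0 only; Get-ResponseMessageHeader is a hypothetical helper name. It assumes TryGetValues() returns false, rather than throwing, when asked for a header that belongs to the other collection.

```powershell
# Hypothetical helper (PowerShell Core only): looks for a header in the
# response headers first and then in the content headers.
function Get-ResponseMessageHeader {
    param(
        [System.Net.Http.HttpResponseMessage]$Response,
        [string]$Name
    )
    $values = $null
    if ($Response.Headers.TryGetValues($Name, [ref]$values)) {
        return $values
    }
    # Content can be null when the response has no body.
    if ($null -ne $Response.Content -and
        $Response.Content.Headers.TryGetValues($Name, [ref]$values)) {
        return $values
    }
}

$Response = Invoke-WebRequest 'http://httpbin.org/get'
Get-ResponseMessageHeader -Response $Response.BaseResponse -Name 'Content-Type'
```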


WebHeaderCollection is now HttpResponseHeaders or HttpContentHeaders

On the topic of changes to headers in the BaseResponse property on BasicHtmlWebResponseObject and the Response property on exceptions, it's important to note that the underlying types for these have changed as well. In 5.1 HttpWebResponse.Headers was a System.Net.WebHeaderCollection. In 6.0.0 HttpResponseMessage.Headers is a System.Net.Http.Headers.HttpResponseHeaders and HttpResponseMessage.Content.Headers is a System.Net.Http.Headers.HttpContentHeaders. These APIs are similar but different enough to likely break any code you have working against them in 5.1. Rather than spend time here describing all of the differences in the APIs, please go and read the documentation for the new APIs. The only reliable way to work with this in both 5.1 and 6.0.0 is to have version specific code called behind a version check. On the bright side, many of the response headers are now accessible as strongly typed objects. For example, the Date header:

# In 6.0.0 Only:
$Response.BaseResponse.Headers.Date

Result:

DateTime      : 11/27/2017 1:33:51 PM
UtcDateTime   : 11/27/2017 1:33:51 PM
LocalDateTime : 11/27/2017 7:33:51 AM
Date          : 11/27/2017 12:00:00 AM
Day           : 27
DayOfWeek     : Monday
DayOfYear     : 331
Hour          : 13
Millisecond   : 0
Minute        : 33
Month         : 11
Offset        : 00:00:00
Second        : 51
Ticks         : 636473864310000000
UtcTicks      : 636473864310000000
TimeOfDay     : 13:33:51
Year          : 2017

However, these types are not present in the BasicHtmlWebResponseObject.Headers where only the raw response header value strings are available (PR #4494).


Error Handling (404 etc.)

There are several "gotchas" that you may run into with regards to error handling. Anyone who is wrapping or proxying the web cmdlets is likely doing some form of error handling. If the remote endpoint returns a 404, for example, your code may be processing and/or parsing the response body. Just running the following on 5.1 and 6.0.0 will make it immediately clear there are some differences to consider:

Invoke-WebRequest 'http://httpbin.org/status/418'

Result on 5.1:

Invoke-WebRequest : -=[ teapot ]=- _...._ .' _ _ `. | ."` ^ `". _, \_;`"---"`|// | ;/ \_ _/ `"""`
At line:1 char:1
+ Invoke-WebRequest 'http://httpbin.org/status/418'
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-WebRequest], WebException
    + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand

Result on 6.0.0:

Invoke-WebRequest :
    -=[ teapot ]=-
       _...._
     .'  _ _ `.
    | ."` ^ `". _,
    \_;`"---"`|//
      |       ;/
      \_     _/
        `"""`
At line:1 char:1
+ Invoke-WebRequest 'http://httpbin.org/status/418'
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : InvalidOperation: (Method: GET, Re...rShell/6.0.0
}:HttpRequestMessage) [Invoke-WebRequest], HttpResponseException
+ FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand

This is another area where the only way I have found to work cross-version is to include code for both versions behind a version check.

Before I continue, here is some reference code:

try {
    Invoke-WebRequest 'http://httpbin.org/status/418' -ErrorAction Stop
}
Catch {
    $err = $_
}

The first obvious difference is the ErrorDetails.Message property. In 5.1 you can see above this is squashed to a single line whereas the line breaks are preserved in 6.0.0. Unfortunately, this appears to be the only way to retrieve the response body in 6.0.0. In 5.1 you could do this:

$Stream = $err.Exception.Response.GetResponseStream()
$Stream.Position = 0
$Reader = [System.IO.StreamReader]::new($Stream)
$Reader.ReadToEnd()

That would return the response body as a string. With the switch to HttpClient the method would be different:

# Currently Not Working
$Stream = $err.Exception.Response.Content.ReadAsStreamAsync().GetAwaiter().GetResult()
$Reader = [System.IO.StreamReader]::new($Stream)
$Reader.ReadToEnd()

However, this is currently not possible. The Web Cmdlets are already doing this and then disposing the reader whenever the status code is not a success status code. This causes the underlying stream to close as well. Once the stream is closed it can't be reopened. That means you cannot get the original content anywhere other than ErrorDetails.Message. This probably wouldn't be an issue if the web cmdlets were not also stripping content from the response body to make it human readable. I have opened issue #5555 to track this.
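
Until that changes, a wrapper that needs the error body on both versions can branch on the type of the response attached to the exception. This is a sketch under the constraints described above; Get-WebErrorBody is a hypothetical helper name. On 6.0.0 it can only return the already processed ErrorDetails.Message text.

```powershell
# Hypothetical helper: returns the error response body from an ErrorRecord
# produced by the web cmdlets.
function Get-WebErrorBody {
    param(
        [System.Management.Automation.ErrorRecord]$ErrorRecord
    )
    # Look the property up safely; not every exception has a Response.
    $property = $ErrorRecord.Exception.PSObject.Properties['Response']
    $response = if ($property) { $property.Value }

    if ($response -is [System.Net.HttpWebResponse]) {
        # 5.1: the response stream can still be rewound and re-read.
        $stream = $response.GetResponseStream()
        $stream.Position = 0
        $reader = [System.IO.StreamReader]::new($stream)
        try { return $reader.ReadToEnd() } finally { $reader.Dispose() }
    }

    # 6.0.0: the stream is already closed, so fall back to ErrorDetails.
    if ($null -ne $ErrorRecord.ErrorDetails) {
        return $ErrorRecord.ErrorDetails.Message
    }
}
```

With the reference code above, Get-WebErrorBody -ErrorRecord $err would return the teapot body on either version, squashed or not depending on what the cmdlet preserved.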

Another surprising fact is that HttpResponseMessage treats anything that is not a 200 level status code as not successful. For example, 300 level redirects:

$Params = @{
    Method = 'HEAD'
    Uri = 'https://httpbin.org/redirect/6'
    MaximumRedirection = 0
    ErrorAction  = 'Ignore'
}
(Invoke-WebRequest @Params).Headers.Location

In 5.1:

/relative-redirect/5

In 6.0.0:

Invoke-WebRequest : Response status code does not indicate success: 302 (FOUND).
At line:1 char:2
+ (Invoke-WebRequest @Params).Headers.Location
+  ~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : InvalidOperation: (Method: HEAD, R...rShell/6.0.0
}:HttpRequestMessage) [Invoke-WebRequest], HttpResponseException
+ FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand

So be mindful of that as well.
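
On 6.0.0 the redirect target is still reachable through the response attached to the exception. This is a sketch for 6.0.0 only; Get-RedirectTarget is a hypothetical helper name. It relies on the IsSuccessStatusCode and Headers.Location properties of HttpResponseMessage.

```powershell
# Hypothetical helper (PowerShell Core only): returns the Location header as a
# Uri when the response is a 300 level redirect, and nothing otherwise.
function Get-RedirectTarget {
    param(
        [System.Net.Http.HttpResponseMessage]$Response
    )
    $code = [int]$Response.StatusCode
    # IsSuccessStatusCode is true only for 200 level status codes.
    if (-not $Response.IsSuccessStatusCode -and $code -ge 300 -and $code -lt 400) {
        $Response.Headers.Location
    }
}

try {
    Invoke-WebRequest -Method HEAD -Uri 'https://httpbin.org/redirect/6' -MaximumRedirection 0 -ErrorAction Stop
}
catch {
    Get-RedirectTarget -Response $_.Exception.Response
}
```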


Part 1 Conclusion

As you can see, just a few under the hood changes have radically affected the PowerShell user experience for Invoke-RestMethod and Invoke-WebRequest. These changes primarily affect advanced users of the cmdlets and anyone who is writing wrapper or proxy functions for them. There is quite a bit of work required to make that code work for both 6.0.0 and 5.1. This means you may need to decide if the effort is worth it. Since much of what I am working with targets cross-platform users, I've personally decided that my code will be ported to work with 6.0.0 only. I also added several new features to the web cmdlets to address shortcomings in 5.1, and the only way to take full advantage of them is to remove my 5.1 support. In your situation the move might not be worth it: your environment might not include 6.0.0, or other factors (such as a requirement for the ActiveDirectory module) may make 6.0.0 a stretch at this point.

Part 2 will cover the outstanding issues, deprecated features, and missing features in the PowerShell Core 6.0.0 web cmdlets. It may be a week or two before it is published. Check back soon!

Join the conversation on Reddit!

2 comments:

Peter Bertok said...

For the love of god, please add a switch to ignore the HTTP error code!

Forcefully converting anything other than HTTP/200 to a hard error makes several Web APIs totally unusable from PowerShell, as the error responses have content that must be parsed by clients.

Mark Kraus said...

@Peter Bertok

That is a good suggestion! The best way to get your suggestions implemented is to open an issue at https://github.com/PowerShell/PowerShell/issues