Intro
On Monday morning I opened the PowerShell Slack and started to catch up on the conversations I had missed. The most recent conversation that morning was someone, yet again, asking how to get rid of the annoying first line in the output from Export-Csv. This is a very common question, and the issue trips up many PowerShell novices. I've probably answered this question several dozen times on /r/PowerShell. It also caused me grief when I was just starting out with the language.
This prompted me to ask myself and the Slack channel, "Does anyone actually use the default behavior of including the type information?" My gut and the few users online at the time told me no. So I created an issue in the PowerShell repo, then asked for feedback from /r/PowerShell, Twitter, and Slack. The result was 175 thumbs up, no thumbs down, and not a single person coming out in support of the default behavior.
Since the change is a breaking change, the issue went before the PowerShell Committee on Wednesday and was ultimately approved. I quickly made my PR, and early this morning the PR was merged to master. That's right: starting with PowerShell 6.0.0-Beta.9, -NoTypeInformation will be the default behavior on Export-Csv and ConvertTo-Csv.
If you want this feature sooner rather than later, you can either build the current master or pick up tomorrow's nightly build.
I use ConvertTo-Csv for all of my examples in this post for convenience. There is little difference between it and Export-Csv besides the fact that one puts a string in the output stream and the other in a file. They both share the same base class where this change was implemented so they both share the same new behavior.
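A minimal sketch of that equivalence, assuming a throwaway temp file and a made-up sample object (neither is from the original examples). -NoTypeInformation is passed explicitly so that both cmdlets behave the same way on old and new versions:

```powershell
# Hypothetical sample object; any object would do.
$sample = [PSCustomObject]@{ Name = 'Sprocket'; Count = 3 }

# ConvertTo-Csv writes the CSV lines to the output stream as strings.
$fromStream = $sample | ConvertTo-Csv -NoTypeInformation

# Export-Csv writes the same lines to a file instead.
$path = [System.IO.Path]::GetTempFileName()
$sample | Export-Csv -Path $path -NoTypeInformation
$fromFile = Get-Content -Path $path
Remove-Item -Path $path

# Compare the two results line by line.
($fromStream -join "`n") -eq ($fromFile -join "`n")
```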
History
So what is this legacy default behavior all about? If you have used either CSV cmdlet you have probably run into this, discovered the odd behavior, learned about -NoTypeInformation, and then just used the parameter all the time. Maybe you dug into this, maybe you didn't. I certainly didn't care about it for some time. I just assumed the default behavior was intended for some use case my inexperienced self had not encountered yet. I dug into this behavior twice: once long ago as an academic exercise to discover what it does and again during this PR to confirm my understanding of it.
The Type Information included by default in versions 5.1 and older looks something like this:
Result:
The first line is #TYPE followed by the full name of the type of object supplied. One would expect the CSV data to be converted back to that type on import. However, that doesn't happen:
Result:
It never reconstitutes the object back to its original type. Import-Csv and ConvertFrom-Csv only ever return [PSCustomObject].
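A minimal sketch of that round trip. The CSV text here is written by hand in the legacy format (an assumption made so the snippet does not depend on which PowerShell version produced it), using System.Version as an arbitrary example type:

```powershell
# Hand-written CSV in the legacy format, with the #TYPE line on top.
$legacyCsv = @(
    '#TYPE System.Version'
    '"Major","Minor","Build","Revision"'
    '"1","2","3","4"'
)

$imported = $legacyCsv | ConvertFrom-Csv

# The import does not rebuild a [version]; it yields a custom object.
$imported.GetType().FullName     # System.Management.Automation.PSCustomObject
$imported -is [version]          # False

# Every property comes back as a plain string.
$imported.Major.GetType().Name   # String
```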
So what is the type data used for? Well, the reality is: nothing. It is applied as a PSTypeName to the [PSCustomObject], but it is prefixed with "CSV:":
Result:
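A minimal sketch for inspecting that prefix yourself, again assuming hand-written legacy-format CSV text and System.Version as an arbitrary example type. The exact contents of the type name list may vary between versions, so the snippet only filters for the prefixed entry:

```powershell
# Hand-written CSV in the legacy format.
$legacyCsv = @(
    '#TYPE System.Version'
    '"Major","Minor"'
    '"1","2"'
)

$imported = $legacyCsv | ConvertFrom-Csv

# List every type name attached to the imported object.
$imported.PSTypeNames

# Keep only the entries carrying the "CSV:" prefix.
$prefixed = @($imported.PSTypeNames -like 'CSV:*')
$prefixed
```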
The usefulness of the CSV cmdlets for object serialization is next to nil. Even if you made heavy use of PSTypeName:
They would import broken:
If you had planned on using the CSV cmdlets for serialization, you would need to either transform them after import, or have extra logic and format data to accommodate both your natural and deserialized custom types.
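A minimal sketch of the "transform them after import" option. The type name Contoso.Widget and the hand-written legacy CSV text are both hypothetical, chosen only for illustration:

```powershell
# A custom object tagged with a hypothetical type name.
$widget = [PSCustomObject]@{
    PSTypeName = 'Contoso.Widget'
    Name       = 'Sprocket'
}
$widget.PSTypeNames[0]   # Contoso.Widget

# Legacy-format CSV for that object, written by hand for this sketch.
$legacyCsv = @(
    '#TYPE Contoso.Widget'
    '"Name"'
    '"Sprocket"'
)
$imported = $legacyCsv | ConvertFrom-Csv

# Inspect what the import attached before fixing it up.
$imported.PSTypeNames

# Post-import transform: put the natural type name back in first position
# so format data and type checks written for Contoso.Widget apply again.
$imported.PSTypeNames.Insert(0, 'Contoso.Widget')
$imported.PSTypeNames[0]   # Contoso.Widget
```

The alternative mentioned above, shipping format data for both the natural and the "CSV:" prefixed names, avoids the transform step at the cost of maintaining two sets of definitions.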
And that uselessness is why just about every script that uses the CSV cmdlets includes -NoTypeInformation in the call or the following at the beginning:
That parameter allows you to exclude the useless type information.
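One common way to set this once at the top of a script is the $PSDefaultParameterValues preference variable. A minimal sketch, assuming that is the mechanism meant here, with System.Version as an arbitrary sample object:

```powershell
# Make -NoTypeInformation the default for both CSV cmdlets in this session.
$PSDefaultParameterValues['Export-Csv:NoTypeInformation']    = $true
$PSDefaultParameterValues['ConvertTo-Csv:NoTypeInformation'] = $true

# No explicit switch is needed on the call any more.
$lines = [version]'1.2.3.4' | ConvertTo-Csv

# The first line is now the header row rather than a #TYPE comment.
$lines[0]
```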
Update: Steve Lee from the PowerShell Team commented on the history of this behavior:
For the historical record, when this issue was discussed in the committee one of the members who is a PM (and shall not be named) remembered that the original intent was to have a more robust extensible type system for csv so that you could get back the original object, but that work never happened so all you had was this one comment.
I think this is a great example of the benefit of being not only Open Source, but open development. When we were purely a Windows project, we would review the feedback on Uservoice, but ultimately with limited resources we could only address the top issues that weren't either overly expensive nor risky. With OSS, we can get the feedback much quicker and respond much quicker (although now we've built up a queue of items for committee review...) and even better, community people like /u/markekraus can submit a PR to get something fixed that may not pop up high enough in our priority queue to address ourselves.
With all that said, we in the committee did debate the risk of regression to users as it is a breaking change and it seemed that the risk was low. There have been other issues brought up where we all agreed it would be the "right thing to do", however, we have not thrown compatibility out the window and error on the side of maintaining compatibility with Windows PowerShell when we don't have sufficient data
The Future
Obviously, there is going to be someone out there who has their entire workflow built around the legacy default behavior. The behavior has been there for a long time. It is a breaking change, but one for the better in my opinion. The -NoTypeInformation parameter is not going away. It will be hidden with DontShow, but it is still there and still works the way it did in older versions. A new parameter, -IncludeTypeInformation, has been added; it reverts the cmdlets to their legacy behavior of including type information.
New Default Behavior:
Result:
Reverting to legacy behavior:
Result:
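A minimal sketch contrasting the two behaviors. It assumes PowerShell 6.0.0-Beta.9 or later, since -IncludeTypeInformation does not exist in earlier versions, and uses System.Version as an arbitrary sample object:

```powershell
# Requires PowerShell 6.0.0-Beta.9 or later.
$sample = [version]'1.2.3.4'

# New default: the first line is the header row.
$newDefault = $sample | ConvertTo-Csv
$newDefault[0]

# Legacy behavior on request: the first line is the type comment.
$legacy = $sample | ConvertTo-Csv -IncludeTypeInformation
$legacy[0]   # #TYPE System.Version

# The hidden -NoTypeInformation switch is still accepted, so old scripts keep working.
$oldScript = $sample | ConvertTo-Csv -NoTypeInformation
$oldScript[0] -eq $newDefault[0]   # True
```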
The Power of Open Source
There you have it. An almost universally disliked default feature of PowerShell has been changed for the better. This would not have been possible without the support of the community. Your voices were heard. I may have done the dirty work of coding this, but you, the community, made this change possible by showing support for it. This shows how willing Microsoft is to make PowerShell a true Open Source project. The fact that the code is on GitHub is not for show and it's not just a publicity stunt. Many changes have been made to PowerShell Core by the community and many more are in the works and on the way.
As another example, an issue was added to request the ability to supply explicit types to generic methods. If you don't understand all those words, don't worry. It's not a key feature of PowerShell, but it is useful when working with the underlying .NET. It is a bit esoteric and its use cases are on the fringes in PowerShell. But it is something that is missing. This comment from PowerShell Team member Jason Shirk sums this up well:
This is totally a reasonable thing to add, and a perfect example of why I'm excited PowerShell is open source.
People do want this feature, but as you point out, it's not critical for a shell. With limited bandwidth, this feature was never quite important enough for a small team to implement, but that's not an issue anymore - there are plenty of motivated and capable people who might add this feature now.
And that's it for now! Stay tuned for my next blog on another new feature I have added for Invoke-WebRequest and Invoke-RestMethod.