Friday, September 16, 2011

So, what the heck is WinRT?

First of all, here is the MSDN article that is currently the most detailed one on the topic:
http://msdn.microsoft.com/en-us/library/windows/apps/hh464947(v=VS.85).aspx

Which is a curious article. The very first sentence says, "The Windows Runtime is designed to enable you to create Metro style apps in a language-independent way" - which is all well and good - but notice how it doesn't say "Windows Runtime is ..." anywhere. Instead, it goes on and on about what you can do with WinRT. Let me try to come up with a somewhat long-winded explanation with some examples and some exploratory demos that can, hopefully, substitute for a definition.

First, let's take a .NET angle on this. What is CLI? Well, first and foremost it's the type system, the object model - reference and value types, classes and interfaces and delegates, properties, methods and events. There's also a metadata format to describe all this. Then there's the bytecode - IL - to implement methods in; and the VM - CLR - which provides JIT and GC to actually run all this.

Let's see what we can take out of this picture. GC is an easy first target - let's throw in reference counting for reference types, and weak references to deal with reference cycles, and most of our memory problems are handled. Note, I say "most", because now you do have to explicitly deal with those reference cycles if and when they arise. On the other hand, we no longer have a mysterious background thread that occasionally pauses everything else in the process while it's cleaning up our mess.
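To make the trade-off concrete, here is a minimal sketch in portable C++, using std::shared_ptr and std::weak_ptr as stand-ins for refcounted and weak references. The Parent and Child types are hypothetical and exist only for illustration; this is not how WinRT itself is implemented, just the same idea in miniature.

```cpp
#include <cassert>
#include <memory>

// Hypothetical types for illustration; not part of any WinRT API.
struct Child;

struct Parent {
    std::shared_ptr<Child> child;   // strong reference: the parent keeps the child alive
};

struct Child {
    std::weak_ptr<Parent> parent;   // weak back-reference: does not keep the parent alive
};

// Builds a parent/child pair, drops the last strong reference to the parent,
// and reports whether the parent was actually destroyed.
bool parent_is_released_without_cycle() {
    std::weak_ptr<Parent> observer;
    {
        auto parent = std::make_shared<Parent>();
        parent->child = std::make_shared<Child>();
        parent->child->parent = parent;   // the back-pointer is weak, so no cycle forms
        observer = parent;
    }   // the last strong reference to the parent goes away here
    return observer.expired();
}
```

Had Child held a strong reference back to Parent, the two objects would have kept each other alive forever - which is exactly the case you now have to spot and handle by hand.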

Alright, who's next against the wall? It would seem that we could do without JIT - just compile code in advance and package it as a native binary; after all, that's what NGen does already, right? Well, it's not quite as easy as it sounds: for one, you have to get rid of generic classes, since their instantiations can have different native implementations. Ditto for generic methods. Generic interfaces are still okay, though, since no code generation occurs there - it's just a contract declaration. And delegates are semantically just single-method interfaces, so they're okay to be generic, as well. For another thing, now that we have native code, we need to standardize on our ABI, so that two different assemblies can call each other. That means strictly defining binary representations of objects - down to the level of vtable layouts - and setting in stone our refcounting protocol and lifetime management rules (i.e. who is responsible for incrementing/decrementing the reference count when passing references around). We also need to figure out something for casting those references.
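The kind of contract described above can be sketched in plain C++ roughly as follows. All names here (IObject, IGreeter, Greeter) are made up for illustration, and interface IDs are plain strings rather than the GUIDs that real COM uses; treat it as a simplified model of the idea, not the actual ABI.

```cpp
#include <atomic>
#include <cassert>
#include <cstring>

// Hypothetical, simplified stand-in for COM's IUnknown contract.
// The order of these three virtual methods matters: it is what a fixed
// vtable layout pins down, so that separately compiled binaries agree.
struct IObject {
    virtual bool QueryInterface(const char* iid, void** out) = 0;  // "casting"
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IObject() = default;   // objects are destroyed only through Release()
};

struct IGreeter : IObject {
    virtual int GreetingLength() = 0;
protected:
    ~IGreeter() = default;
};

class Greeter final : public IGreeter {
    std::atomic<unsigned long> refs_{1};   // the creator owns the first reference
public:
    bool QueryInterface(const char* iid, void** out) override {
        if (std::strcmp(iid, "IObject") == 0 || std::strcmp(iid, "IGreeter") == 0) {
            *out = static_cast<IGreeter*>(this);
            AddRef();   // lifetime rule: a reference handed out is already counted
            return true;
        }
        *out = nullptr;
        return false;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;   // the last owner releasing destroys the object
        return left;
    }
    int GreetingLength() override { return 5; }
};
```

The caller's side of the protocol is then: every successful QueryInterface must eventually be paired with a Release, and so must the initial reference obtained at creation.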

So we've already sacrificed several things, but what did we gain? No GC and no JIT... hm... so we don't really have a "virtual machine" now, do we? It's just a bunch of runtime services - libraries, really - to locate and load assemblies, deal with primitive types and metadata, and so on.

Congratulations, we now (almost) have WinRT! Now, before anyone gets confused - the above does not mean that WinRT actually is CLR with stuff stripped from it - it is most emphatically not; but it's a useful mental model for a .NET developer. In practice, WinRT is actually COM at heart, with a well-defined set of primitive types and core interfaces (such as collections) on top of that, and an extended object model that borrows a lot from .NET. So now you have COM with namespaces; delegates - here literally single-method interfaces - and .NET-style events with add/remove accessors taking those delegates; generic interfaces; a single immutable opaque string type (HSTRING); and a mandatory extensible metadata format in place of the optional type libraries of yore. So it's COM, but evolved to a level that is much closer to .NET than it ever was.
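To illustrate what "a delegate is a single-method interface" and "events are add/remove accessors" look like in practice, here is a rough model in standard C++. Everything in it (IValueChangedHandler, Counter, Recorder, the integer token returned by the add accessor) is a hypothetical simplification of the pattern, with std::shared_ptr standing in for refcounted references.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

// A "delegate" modeled literally as an interface with one method.
struct IValueChangedHandler {
    virtual void Invoke(int newValue) = 0;
    virtual ~IValueChangedHandler() = default;
};

// A class exposing an event as a pair of add/remove accessors.
class Counter {
    std::map<std::int64_t, std::shared_ptr<IValueChangedHandler>> handlers_;
    std::int64_t nextToken_ = 1;
    int value_ = 0;
public:
    // "add" accessor: registers a handler and returns a token identifying it.
    std::int64_t add_ValueChanged(std::shared_ptr<IValueChangedHandler> handler) {
        std::int64_t token = nextToken_++;
        handlers_[token] = std::move(handler);
        return token;
    }
    // "remove" accessor: unregisters the handler behind the token.
    void remove_ValueChanged(std::int64_t token) { handlers_.erase(token); }

    // Changes state and raises the event on every registered handler.
    void Increment() {
        ++value_;
        for (auto& entry : handlers_) entry.second->Invoke(value_);
    }
};

// A handler that remembers the last value it was told about.
struct Recorder final : IValueChangedHandler {
    int last = 0;
    void Invoke(int newValue) override { last = newValue; }
};
```

Subscribing is then a call to add_ValueChanged with a Recorder instance, and unsubscribing is a call to remove_ValueChanged with the token that came back.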

That point about metadata format actually necessitates some further explanation. A WinRT component is described by a .winmd file, which is - surprise, surprise! - a CLI (Ecma-335) assembly, with a few extensions. Most CLI metadata tables are reused as is - WinRT classes are described as CLI classes, WinRT methods as CLI methods, and so on. You can use ILDASM to crack them open - if you have Developer Preview installed, have a look inside "C:\Program Files (x86)\Windows Kits\8.0\Windows Metadata" (where VS gets them from), or alternatively at "C:\Windows\System32\WinMetadata" (where WinRT itself does).

Now, the .winmd file may or may not contain the implementation of classes and methods described in it. WinRT use of CLI assemblies is really only concerned with metadata; when an instance of a class is "activated" (created), the class factory is looked up by other means, closely resembling traditional COM activation (DllGetClassObject and friends). More clues on how this all comes together can be found by inspecting the "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\WindowsRuntime\ActivatableClassId" registry key in RegEdit.

For WinRT components that are purely native, the class factory - and the class itself - normally live in a separate native DLL file somewhere else. If you go to "C:\Windows\System32", and search for "Windows.*.dll", you'll see a few of those - usually one .dll backing several related .winmd files. This, in particular, applies to all components in WinRT system libraries (i.e. corresponding to .winmd files in the aforementioned "Windows Metadata" folder) - they are all 100% native code. There is a certain well-defined interface to request a DLL to provide a class factory for a particular class, not dissimilar to DllGetClassObject of vanilla COM.

Alright, so, looking from the perspective of a COM developer, what else does WinRT have compared to COM? For one, all WinRT objects provide Reflection-like capabilities - you can query any object for the list of interfaces it implements, as well as its actual class type, and then look up the corresponding metadata in .winmd. Another is support for weak references throughout the object model - this could be done in COM on a case-by-case basis, but there was no standard universal protocol for all objects supporting weak references to use. In WinRT, there is IWeakReference and IWeakReferenceSource, and the vast majority of classes support them. Finally, there are extensible, .NET-style attributes that let metadata be extended pretty much arbitrarily.
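As a rough picture of the Reflection-like part, imagine every object implementing a small base interface along these lines. The names below (IInspectableLike, DemoPropertySet, Implements, and the "Demo." identifiers) are invented for this sketch, and plain strings stand in for the interface IDs and class names a real runtime would hand back.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical base interface: every object can describe itself.
struct IInspectableLike {
    virtual std::vector<std::string> GetInterfaceNames() const = 0;
    virtual std::string GetRuntimeClassName() const = 0;
    virtual ~IInspectableLike() = default;
};

// A made-up class that reports its own name and the interfaces it implements.
class DemoPropertySet final : public IInspectableLike {
public:
    std::vector<std::string> GetInterfaceNames() const override {
        return {"Demo.IPropertySet", "Demo.IMap"};
    }
    std::string GetRuntimeClassName() const override { return "Demo.PropertySet"; }
};

// Generic helper that works on any object without knowing its concrete type.
bool Implements(const IInspectableLike& object, const std::string& interfaceName) {
    const std::vector<std::string> names = object.GetInterfaceNames();
    return std::find(names.begin(), names.end(), interfaceName) != names.end();
}
```

With the class name in hand, a tool or language runtime can go look up the full description of the type in the matching .winmd file.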

Then there are the new collection interfaces. Collections in COM have always been a hassle - you had the IEnum... design pattern, which was strongly typed, but not reusable (since every type X had its own separate IEnumX unrelated to all others). Then you had IEnumVARIANT, which was reusable but untyped. For anything above and beyond that, there wasn't any common ground - everyone defined collection interfaces as they saw fit. In WinRT this finally gets proper treatment with a new set of generic interfaces - IIterator<T>, IIterable<T>, IVector<T>, IMap<K, V> etc - the names speak for themselves. Read-only and observable variations are also available. Now we can finally write reusable components with a sane API, where collections from one component can be passed to another!
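Here is a sketch of why generic interfaces fix the reuse problem, written as ordinary C++ templates. The interface names mirror the WinRT ones, but the signatures are simplified, and VectorSequence and Sum are hypothetical helpers added for the example; std::unique_ptr stands in for a refcounted reference.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// One generic iterator contract, reusable for every element type T.
template <typename T>
struct IIterator {
    virtual T Current() const = 0;
    virtual bool HasCurrent() const = 0;
    virtual bool MoveNext() = 0;   // advances; returns whether an element is available
    virtual ~IIterator() = default;
};

template <typename T>
struct IIterable {
    virtual std::unique_ptr<IIterator<T>> First() const = 0;
    virtual ~IIterable() = default;
};

// A hypothetical producer backed by std::vector.
template <typename T>
class VectorSequence final : public IIterable<T> {
    std::vector<T> items_;

    class Iter final : public IIterator<T> {
        const std::vector<T>& items_;
        std::size_t index_ = 0;
    public:
        explicit Iter(const std::vector<T>& items) : items_(items) {}
        T Current() const override { return items_.at(index_); }
        bool HasCurrent() const override { return index_ < items_.size(); }
        bool MoveNext() override {
            if (index_ < items_.size()) ++index_;
            return HasCurrent();
        }
    };

public:
    explicit VectorSequence(std::vector<T> items) : items_(std::move(items)) {}
    std::unique_ptr<IIterator<T>> First() const override {
        return std::unique_ptr<IIterator<T>>(new Iter(items_));
    }
};

// A consumer written only against the interfaces, so it works with any producer.
int Sum(const IIterable<int>& source) {
    int total = 0;
    for (auto it = source.First(); it->HasCurrent(); it->MoveNext()) {
        total += it->Current();
    }
    return total;
}
```

Sum knows nothing about VectorSequence; any other component that hands out an IIterable<int> can be plugged in without either side being changed.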

It's also worth pointing out one important thing that hasn't changed much from COM: error handling. Under the hood, it's still all HRESULTs, which all methods have to return.

All well and good; now that we have all those WinRT components, how are they actually used? Well, since they're COM, in C++ you can just treat them as such... if you're feeling masochistic. I mean, do "AddRef" and "Release" bring up pleasant memories? Ouch... thought so. What about "CComPtr<T>" and "COM_INTERFACE_ENTRY"... no, no, don't hit the same spot, please! I'll explain myself right now.

Like I said, if you're feeling masochistic, you can do that kind of thing. But if you're like most of the rest of us, VC++ has some nifty language extensions (officially named "Visual C++ Component Extensions", and codenamed C++/CX by Herb Sutter in his //BUILD presentation) that let you cheat and do it the easy way. Let me present an example here - this is what you get as a starting point when you create a new "WinRT Component DLL" C++ project in Developer Preview:
using namespace Windows::Foundation;

namespace WinRTComponentDll1
{
    public delegate void SomeEventHandler(int i);

    public ref class WinRTComponent sealed
    {
        int _propertyAValue;

    public:

        WinRTComponent();
        ~WinRTComponent();

        property int PropertyA
        {
            int get()
            {
                return _propertyAValue;
            }
            void set(int propertyAValue)
            {
                _propertyAValue = propertyAValue;
            }
        }

        int Method(Platform::String^ s);

        event SomeEventHandler^ someEvent;
    };
}

If that looks suspiciously like something you've seen before, you're right - this is very, very similar to C++/CLI. The crucial difference here is that the above compiles to pure native code: those ^ types are now COM references with implicit refcounting, and "ref class" defines WinRT (i.e. COM) classes. Inheriting from an interface will extend QueryInterface for you accordingly, and inheriting from a class will wire up object composition to achieve the same effect. Strings are seen not as raw handles, but as a reference type, with implicit conversion from string literals - but under the hood it's all HSTRING, and their lifetime is managed for you. For casts you use dynamic_cast, which calls QueryInterface under the hood. Also note how HRESULTs are nowhere to be seen, and neither are [retval] out parameters that are used in COM to return actual method results: [retval] is projected as a method return value, and HRESULTs are wrapped into exceptions.
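The last point is easy to picture with a hand-written wrapper. The following is only a sketch in standard C++ of what such a projection does conceptually: ParseDigit_Abi, ParseDigit, HResult and HResultError are all hypothetical names, and the exception type is a stand-in rather than what C++/CX actually throws.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

using HResult = std::int32_t;   // stand-in for HRESULT
const HResult kOk = 0;          // same value as S_OK
const HResult kInvalidArg = static_cast<HResult>(0x80070057u);   // same value as E_INVALIDARG

// ABI-level shape: the status code is the return value, and the actual
// result comes back through an out parameter (what IDL marks as [retval]).
HResult ParseDigit_Abi(char c, int* result) {
    if (result == nullptr || c < '0' || c > '9') return kInvalidArg;
    *result = c - '0';
    return kOk;
}

// Hypothetical exception type carrying the original status code.
struct HResultError : std::runtime_error {
    HResult code;
    explicit HResultError(HResult hr) : std::runtime_error("call failed"), code(hr) {}
};

// Projected shape: the out parameter becomes the return value,
// and a failure status becomes an exception.
int ParseDigit(char c) {
    int result = 0;
    HResult hr = ParseDigit_Abi(c, &result);
    if (hr < 0) throw HResultError(hr);   // negative status values signal failure
    return result;
}
```

Callers of ParseDigit never see a status code unless something goes wrong, which is the experience the language extensions give you for every WinRT method.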

Unlike C++/CLI, this new language does not suffer from schizophrenia, in the sense that there's no forced separation into "managed" and "unmanaged" worlds for objects. This means that you can mix things freely - e.g. have a vanilla C++ struct or class with a Platform::String^ field, or a "ref class" with a std::string field.

So, what about .NET? If you've just thought "COM Interop", and shuddered in horror at the thought, worry not. That lesson has been learned, and WinRT is much more straightforward to deal with. Unlike classic COM, there are no more intermediate artifacts such as interop assemblies (embedded or not). Instead, you literally add references to .winmd metadata files, and magically get all types therein available in your .NET programs as if they were written in .NET in the first place.

Just to drive this point home, here is a little experiment you can do with Developer Preview. Let's start with this C# code:
using Windows.Foundation.Collections;

class Program
{
    static void Main()
    {
        var propSet = new PropertySet();
        propSet["foo"] = 123;
    }
}

This can be compiled from the VS command line with a somewhat lengthy invocation:
csc test.cs /r:Windows.Foundation.winmd "/lib:C:\Program Files (x86)\Windows Kits\8.0\Windows Metadata"

Note how we reference the .winmd file using /r, just as if it were a regular .NET assembly. Now, to check that the compiler isn't cheating and inserting some artifacts generated from .winmd into the output .exe, use ILDASM on it. Here's the manifest:
.assembly extern mscorlib
{
    .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
    .ver 4:0:0:0
}
.assembly extern windowsruntime Windows.Foundation
{
    .ver 255:255:255:255
}
.assembly test
{
    …
}
.module test.exe

Note the "assembly extern windowsruntime" declaration. It really is a special kind of assembly reference, even on IL level. The only type you can see in our assembly is "Program", and IL disassembly of Main() is straightforward:
.method private hidebysig static void Main() cil managed
{
    .entrypoint
    .maxstack 3
    .locals init (class [Windows.Foundation]Windows.Foundation.Collections.PropertySet V_0)
    nop
    newobj instance void [Windows.Foundation]Windows.Foundation.Collections.PropertySet::.ctor()
    stloc.0
    ldloc.0
    ldstr "foo"
    ldc.i4.s 123
    box [mscorlib]System.Int32
    callvirt instance void class [mscorlib]System.Collections.Generic.IDictionary`2<string,object>::set_Item(!0, !1)
    nop
    ret
} // end of method Program::Main

Nothing criminal here, either. We actually use the "newobj" instruction to create the instance of the class - again, as if it were a .NET class - and call a method on it using "callvirt". To sum it up, the actual interop happens deep in CLR, where it recognizes types in .winmd files as special, and handles all normal operations on them - "newobj", calls etc - by directing them to the corresponding WinRT facilities (such as activation), or to the WinRT component itself. Types get similar treatment; e.g. .NET delegate types are a special kind of class, while WinRT delegate types are interfaces - but in .NET you still see them as classes inheriting from System.MulticastDelegate. Inheriting from WinRT classes is also possible if they are marked as supporting that (i.e. "not sealed" in .NET parlance), and is transparently implemented using composition.

One other interesting thing that stands out in the code above is the method call - look, it's the set_Item() property setter from... System.Collections.Generic.IDictionary<string,object>! How come? We're dealing with a native WinRT component - where'd it get a CLR interface from? Well, the truth is, if you look at the metadata file for it (Windows.Foundation.winmd) - it doesn't. Here's what ILDASM says about it:
.class public auto ansi windowsruntime sealed Windows.Foundation.Collections.PropertySet
    extends [mscorlib]System.Object
    implements Windows.Foundation.Collections.IPropertySet,
        class Windows.Foundation.Collections.IObservableMap`2<string,object>,
        class Windows.Foundation.Collections.IMap`2<string,object>,
        class Windows.Foundation.Collections.IIterable`1<class Windows.Foundation.Collections.IKeyValuePair`2<string,object>>

No IDictionary (or, indeed, System.Anything, aside from the must-have System.Object) in sight. However, it does implement several WinRT interfaces which look like they might be related - most notably, IMap<string, object>. Indeed, IMap is the WinRT interface that corresponds to the same concept as .NET's IDictionary. So, to save .NET developers the hassle of dealing with another name for the same thing, CLR projects all mentions of IMap<K,V> as IDictionary<K,V>, mapping methods and properties accordingly - so e.g. the IMap::Size property becomes IDictionary.Count, and the IMap::Lookup() and IMap::Insert() methods become the getter and setter for the IDictionary.Item indexer. This goes both ways, in the sense that a .NET object implementing IDictionary<K,V> will be seen by WinRT as implementing IMap<K,V>.
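Conceptually, this kind of mapping is just an adapter that renames members. Here is a small model of it in standard C++; IMapLike, SimpleMap and DictionaryView are hypothetical names with simplified signatures, and the real CLR does this internally rather than through a wrapper class like this one.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// WinRT-flavored shape (simplified): Size / Lookup / Insert.
template <typename K, typename V>
struct IMapLike {
    virtual std::size_t Size() const = 0;
    virtual V Lookup(const K& key) const = 0;
    virtual void Insert(const K& key, const V& value) = 0;
    virtual ~IMapLike() = default;
};

// A made-up "native component" implementing that shape.
template <typename K, typename V>
class SimpleMap final : public IMapLike<K, V> {
    std::map<K, V> data_;
public:
    std::size_t Size() const override { return data_.size(); }
    V Lookup(const K& key) const override { return data_.at(key); }
    void Insert(const K& key, const V& value) override { data_[key] = value; }
};

// .NET-flavored view over the same object: Count / get_Item / set_Item.
template <typename K, typename V>
class DictionaryView {
    IMapLike<K, V>& map_;
public:
    explicit DictionaryView(IMapLike<K, V>& map) : map_(map) {}
    std::size_t Count() const { return map_.Size(); }                        // Size   -> Count
    V get_Item(const K& key) const { return map_.Lookup(key); }              // Lookup -> indexer getter
    void set_Item(const K& key, const V& value) { map_.Insert(key, value); } // Insert -> indexer setter
};
```

The view holds no data of its own; whatever is written through set_Item lands in the underlying map, which is why the IL above can call set_Item on a native PropertySet.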

All other WinRT collection interfaces get the same treatment, and so do some other types - e.g. Windows.Foundation.Uri becomes System.Uri, Windows.Foundation.DateTime becomes System.DateTime, and so on. Again, this all happens deep inside CLR - as can be seen from the code snippet above, on IL level, you already see the CLR projection, not the original WinRT interface.

(Incidentally, this means that any existing CLR language implementation can, in theory, be adapted to consume WinRT with relative ease. Its compiler would still have to deal with .winmd files, though, as in their raw form they don't properly reflect the projection that will occur at runtime.)

All of the above is what in WinRT parlance is called a "projection" - adaptation of WinRT concepts to those of a particular language and/or platform. One of the WinRT projections provided out of the box is the .NET one. Another one is, in fact, the aforementioned C++/CX. And then there is a JavaScript projection for Metro web applications. Third parties can, and are encouraged to, offer additional projections for their languages and frameworks, so hopefully we'll see more options soon.

So, from the perspective of a .NET developer, the entire WinRT standard library can be treated as an extension - or, in some cases, replacement - of the .NET class library. Similarly, from the perspective of an HTML/JS developer, WinRT can be treated as an extension of the standard JavaScript library. The fact that this extension is implemented in native code is, effectively, transparent to the developer.

From another perspective, the point of WinRT is to provide a .NET-like rich, high-level class library - including UI with XAML, data binding, styles, templates and cookies! - that can be used from C++ without the latter taking a dependency on CLR and its services (for some reason, C++ devs tend to be nervous in the presence of GC, for example, even if it doesn't touch "their" objects).

But there's another interesting aspect. So far, I've been talking mostly about consuming WinRT APIs, but the truth is that you can actually write your own WinRT components, in either C++ or C#. So you can have a C++ application using a component written in C#, or vice versa; and then a Metro web application using both. So this is also, in a sense, a reinterpretation of one of the fundamental ideas behind CLR - cross-language interoperability - but without a shared runtime, using only a common ABI and class library.