March 2011 - Posts - Pavel's Blog

Pavel's Blog

Pavel is a software guy that is interested in almost everything
software related... way too much for too little time


Asynchronous Programming with C# v.Next

Published at Mar 29 2011, 03:54 PM by pavely

Today I delivered a session on asynchronous programming with the Async CTP (note that it currently works with Visual Studio 2010 RTM and not SP1).

Thank you all for coming! It was very interesting for me, probably as much as it was for you!

I’ve attached the presentation and demos. The recording will probably be up in a few days.

I’ll discuss some of the things in the session in upcoming posts, so stay tuned.

Thanks again!

Presentation

Demos

More Media Info with Media Foundation

Published at Mar 28 2011, 07:07 PM by pavely

Getting information on media files is possible through Media Foundation, as we’ve seen, using the various “descriptors”. If we look at Windows Explorer, we can see other information presented, such as artist, title and other metadata. This information is also accessible through Media Foundation, without resorting to the (mostly) dreadful shell API.

To get to these properties, we can query the media source for the IMFGetService interface, which is conceptually similar to IServiceProvider used in various APIs; that is, an interface that allows getting another interface that is easier to implement as a separate object. We’ll query for the MF_PROPERTY_HANDLER_SERVICE service type and look for the IPropertyStore interface. We can do this more fluently using the MFGetService helper.

With an IPropertyStore pointer in hand, we can query various properties (the full list is here).

And in code:

CComPtr<IMFSourceResolver> spResolver;
CHECK_HR(::MFCreateSourceResolver(&spResolver));
MF_OBJECT_TYPE type;
CComPtr<IUnknown> spUnkSource;
CHECK_HR(spResolver->CreateObjectFromURL(url, MF_RESOLUTION_MEDIASOURCE, NULL, &type, &spUnkSource));
CComQIPtr<IMFMediaSource> spSource(spUnkSource);
 
CComPtr<IPropertyStore> spStore;
CHECK_HR(::MFGetService(spSource, MF_PROPERTY_HANDLER_SERVICE, __uuidof(IPropertyStore), (void**)&spStore));
 
PROPVARIANT value;
::PropVariantInit(&value);
if(SUCCEEDED(spStore->GetValue(PKEY_Title, &value)) && value.pwszVal) {
	wcout << "Title: " << value.pwszVal << endl;
	::PropVariantClear(&value);
}
if(SUCCEEDED(spStore->GetValue(PKEY_Author, &value)) && value.pwszVal) {
	wcout << "Author: " << value.pwszVal << endl;
	::PropVariantClear(&value);
}
if(SUCCEEDED(spStore->GetValue(PKEY_Audio_Format, &value)) && value.pwszVal) {
	wcout << "Audio Format: " << value.pwszVal << endl;
	::PropVariantClear(&value);
}
if(SUCCEEDED(spStore->GetValue(PKEY_Audio_ChannelCount, &value)) && value.intVal) {
	cout << "# Audio channels: " << value.intVal << endl;
}
if(SUCCEEDED(spStore->GetValue(PKEY_Audio_EncodingBitrate, &value)) && value.intVal) {
	cout << "Average bitrate: " << (value.intVal >> 10) << " kbps" << endl;
}
if(SUCCEEDED(spStore->GetValue(PKEY_Audio_SampleRate, &value)) && value.intVal) {
	cout << "Samples/sec: " << value.intVal << endl;
}
if(SUCCEEDED(spStore->GetValue(PKEY_Audio_SampleSize, &value)) && value.intVal) {
	cout << "Bits/sample: " << value.intVal << endl;
}
 
// the frame rate is reported in frames per 1000 seconds; divide as floating point to keep fractions
if(SUCCEEDED(spStore->GetValue(PKEY_Video_FrameRate, &value)) && value.uintVal) {
	cout << "Video frame rate: " << value.uintVal / 1000.0 << " fps" << endl;
}
if(SUCCEEDED(spStore->GetValue(PKEY_Video_FrameWidth, &value)) && value.intVal) {
	PROPVARIANT height;
	::PropVariantInit(&height);
	if(SUCCEEDED(spStore->GetValue(PKEY_Video_FrameHeight, &height)))
		cout << "Video frame size: " << value.intVal << " X " << height.intVal << endl;
}
if(SUCCEEDED(spStore->GetValue(PKEY_Video_Director, &value)) && value.pwszVal) {
	wcout << "Director: " << value.pwszVal << endl;
	::PropVariantClear(&value);
}

Those are just some of the properties we can query. It’s also possible to enumerate all the properties in the IPropertyStore using IPropertyStore::GetCount and IPropertyStore::GetAt, displaying their names and values.

Try it on some media files!

Asynchronous Programming with C# v.Next Event

Published at Mar 26 2011, 12:02 PM by pavely

This upcoming Tuesday (March 29th), I’ll be giving a lecture on some of the new features expected in C# 5.0 regarding asynchronous programming, at the Microsoft offices in Ra’anana. These features were first revealed at the Microsoft PDC 2010 conference. A CTP version is available for download here, but beware that it currently does not work with Visual Studio 2010 with SP1 installed; it requires an earlier version, such as the RTM version of VS 2010. An updated CTP that works with VS 2010 SP1 will probably be released soon.

The registration link is https://msevents.microsoft.com/cui/EventDetail.aspx?EventID=1032479685&culture=he-IL, and although technically it indicates the event is full, you may still find a spot just by showing up.

Maybe I’ll see you there?

Getting Media File Info

Published at Mar 20 2011, 11:55 AM by pavely

So, what can we do with Media Foundation? One of the simplest things, perhaps, is getting information on some media file, somewhat similar to what we see in Windows Explorer, but we can dig deeper if we like. Let’s get started.

First, we’ll create a simple Win32 console application named MediaInfo (I check the box to include ATL headers, as we’ll use ATL smart pointers). We then add some Media Foundation includes (e.g. in StdAfx.h):

#include <mfidl.h>
#include <mfapi.h>

 

These are the basic header files for MF (there are more). We’ll also need to link against the MF libraries. I prefer to do that with #pragma comment(lib, ...) instead of Project->Properties:

#pragma comment(lib, "mfplat.lib")
#pragma comment(lib, "Mfuuid.lib")
#pragma comment(lib, "mf.lib")

 

This helps ensure I get the libs no matter what: debug or release, 32/64 bit, if I lose the project file, etc. (a bit paranoid, I know).

In main, we need first to initialize COM and Media Foundation:

::CoInitializeEx(0, COINIT_MULTITHREADED);
::MFStartup(MF_VERSION);

 

The MFStartup function must be called before most MF operations. Next, we’ll create a local function called DisplayInfo that takes a file path and displays some info to the console. So, the entire main function looks like so:

int _tmain(int argc, _TCHAR* argv[]) {
   if(argc != 2) {
      cout << "Usage: MediaInfo <path>" << endl;
      return 1;
   }

   ::CoInitializeEx(0, COINIT_MULTITHREADED);
   ::MFStartup(MF_VERSION);

   HRESULT hr = DisplayInfo(argv[1]);
   if(FAILED(hr)) {
      cout << "Error: " << hex << hr << endl;
   }

   ::MFShutdown();
   ::CoUninitialize();

   return 0;
}

Now for the real stuff. How do we get information about a media file? We need a media source, an MF abstraction of some source for media data (in our case a file, but it can be anything, such as a live feed from a camera).

To get a media source for a file, we’ll need the help of another MF entity, the Source Resolver. This object can “resolve” a source file into a media source object (implementing the IMFMediaSource interface). If the source resolver fails, it’s safe to say that MF does not recognize the file’s format, perhaps because a decoder is missing:

HRESULT DisplayInfo(LPCWSTR url) {
   CComPtr<IMFSourceResolver> spResolver;
   CHECK_HR(::MFCreateSourceResolver(&spResolver));
   MF_OBJECT_TYPE type;
   CComPtr<IUnknown> spUnkSource;
   CHECK_HR(spResolver->CreateObjectFromURL(url, MF_RESOLUTION_MEDIASOURCE, NULL, &type, &spUnkSource));
   CComQIPtr<IMFMediaSource> spSource(spUnkSource);

 

The CHECK_HR is a simple macro that checks the returned HRESULT and backs out if it’s a failure code:

#define CHECK_HR(x) { HRESULT _hr = (x); if(FAILED(_hr)) return _hr; }
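
One caveat with a brace-only macro like this: if it is used as the body of an unbraced if that is followed by an else, the trailing semicolon ends the statement early and the else no longer compiles. A common remedy is the do { ... } while(0) idiom. Here’s a minimal sketch of that variant. So that it compiles without the Windows SDK, HResult, kOk, kFail, Failed, CHECK_HR_SKETCH and the two step functions are hypothetical stand-ins of mine, not part of Media Foundation or the attached project:

```cpp
#include <cstdint>

// Stand-ins so the sketch compiles without the Windows SDK.
// On Windows, use the real HRESULT, FAILED, S_OK and E_FAIL instead.
typedef std::int32_t HResult;
const HResult kOk   = 0;
const HResult kFail = static_cast<HResult>(0x80004005u); // same bit pattern as E_FAIL

inline bool Failed(HResult hr) { return hr < 0; }

// do { ... } while (0) turns the macro into a single statement, so it behaves
// correctly after an unbraced if/else and requires a trailing semicolon.
#define CHECK_HR_SKETCH(x) \
    do { HResult hr_ = (x); if (Failed(hr_)) return hr_; } while (0)

// Hypothetical steps standing in for calls such as MFCreateSourceResolver.
inline HResult StepThatSucceeds() { return kOk; }
inline HResult StepThatFails()    { return kFail; }

// Returns the first failure code, or kOk when every step succeeds.
// stepsCompleted reports how far the function got before returning.
inline HResult RunSteps(bool failSecond, int* stepsCompleted) {
    *stepsCompleted = 0;
    CHECK_HR_SKETCH(StepThatSucceeds());
    ++*stepsCompleted;
    if (failSecond)
        CHECK_HR_SKETCH(StepThatFails());   // would not compile with a brace-only macro
    else
        CHECK_HR_SKETCH(StepThatSucceeds());
    ++*stepsCompleted;
    return kOk;
}
```

On Windows the same shape works with the real HRESULT and FAILED. Calling RunSteps(true, &n) returns the failure code and leaves n at 1, because the macro returns before the second increment.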

 

First, we create the source resolver (MFCreateSourceResolver), and then we try to obtain a media source with the CreateObjectFromURL method. This call is synchronous, so there may be a short delay. That doesn’t matter in our case, but where it does (perhaps when running on a UI thread), there are asynchronous alternatives (BeginCreateObjectFromURL and EndCreateObjectFromURL).

Now that we have a media source, we need something called a presentation descriptor. This object describes the “presentation”, an MF term for a set of media streams sharing a common timeline:

CComPtr<IMFPresentationDescriptor> spDesc;
CHECK_HR(spSource->CreatePresentationDescriptor(&spDesc));

 

A presentation descriptor holds stream descriptors. Each stream corresponds to some media data, such as audio or video. We now need to iterate over all stream descriptors looking for a video and/or audio stream, then describe it:

DWORD count;
CHECK_HR(spDesc->GetStreamDescriptorCount(&count));
for(DWORD i = 0; i < count; i++) {
   BOOL selected;
   CComPtr<IMFStreamDescriptor> spStreamDesc;
   CHECK_HR(spDesc->GetStreamDescriptorByIndex(i, &selected, &spStreamDesc));
   if(selected) {
      // analyze stream descriptor
   }
}

 

A stream descriptor describes a stream of data (e.g. video or audio). Only selected streams are of interest. An unselected stream means there is data but it’s not interesting for some reason. Technically, we can select or deselect streams ourselves, but we’re just looking at the descriptors, not playing the streams. So, we’ll stick with selected streams.

A stream may have one or more media types. For example, a video stream may be capable of providing data in more than one resolution. We need to get to those media types, look for the current one and display its properties:

CComPtr<IMFMediaTypeHandler> spHandler;
CHECK_HR(spStreamDesc->GetMediaTypeHandler(&spHandler));
CComPtr<IMFMediaType> spMediaType;
CHECK_HR(spHandler->GetCurrentMediaType(&spMediaType));

 

Once we have the media type, we can finally get some details.

The first thing we need to know is what kind of media this is. The typical result is audio or video. If it’s something else, we’ll ignore it:

GUID major;
spMediaType->GetMajorType(&major);
bool video;
if(major == MFMediaType_Audio)
   video = false;
else if(major == MFMediaType_Video)
   video = true;
else
   continue;

 

The GetMajorType method returns the actual type of the media (as a GUID). There are several predefined GUIDs for this in the MF headers. We could actually get that from the media type handler directly (IMFMediaTypeHandler has a GetMajorType method of its own).

Next, we want to display information that is common to audio and video, and then get specific info for audio and video.

Information in MF is stored in attributes. An attribute store implements the IMFAttributes interface, and it’s a kind of property bag, where keys are always GUIDs and values may be of several types (the value itself is stored in a PROPVARIANT), such as UINT32, UINT64, WCHAR*, GUID, IUnknown*, double and a BLOB. The IMFMediaType interface inherits from IMFAttributes, so we can query it directly. The complete list of attributes for MF can be found here.

Common attributes for a media type can be found here. Let’s display some of them:

cout << "Stream Index: " << i << endl;
cout << "Media type: " << (video ? "Video" : "Audio") << endl;

UINT32 compressed;
HRESULT hr = spMediaType->GetUINT32(MF_MT_COMPRESSED, &compressed);
if(SUCCEEDED(hr))
   cout << "Compressed: " << (compressed ? "True" : "False") << endl;

GUID guid;
WCHAR buffer[128];   // receives the GUID in string form
if(SUCCEEDED(spMediaType->GetGUID(MF_MT_SUBTYPE, &guid))) {
   ::StringFromGUID2(guid, buffer, 128);
   wcout << "Subtype GUID: " << buffer << endl;
}

 

There is a list of possible subtypes for video and audio. We need some conversion from a GUID to a more readable description using some lookup table. This is left as an exercise for the reader. Now let’s turn our attention to video attributes:

if(video) {
   UINT32 num;
   if(SUCCEEDED(spMediaType->GetUINT32(MF_MT_AVG_BITRATE, &num)))
      cout << "Average bitrate: " << (num >> 10) << " Kbps" << endl;
   UINT32 width, height;
   if(SUCCEEDED(::MFGetAttributeSize(spMediaType, MF_MT_FRAME_SIZE, &width, &height)))
      cout << "Frame size: " << width << " X " << height << endl;
   // width/height are reused here as the frame rate numerator/denominator
   if(SUCCEEDED(::MFGetAttributeRatio(spMediaType, MF_MT_FRAME_RATE, &width, &height)) && height != 0)
      cout << "Frame rate: " << width / (float)height << " FPS" << endl;
}
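
A side note on what the two helpers do behind the scenes: MF_MT_FRAME_SIZE and MF_MT_FRAME_RATE are each stored as a single UINT64, with the first number (width or numerator) in the upper 32 bits and the second (height or denominator) in the lower 32 bits. MFGetAttributeSize and MFGetAttributeRatio unpack it for us. The following is a portable sketch of that packing; PackPair, UnpackPair and RatioToDouble are names I made up for illustration, not MF functions:

```cpp
#include <cstdint>

// The first value (width or numerator) goes in the upper 32 bits,
// the second (height or denominator) in the lower 32 bits.
inline std::uint64_t PackPair(std::uint32_t high, std::uint32_t low) {
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// Splits a packed 64-bit value back into its two 32-bit halves.
inline void UnpackPair(std::uint64_t packed, std::uint32_t* high, std::uint32_t* low) {
    *high = static_cast<std::uint32_t>(packed >> 32);
    *low  = static_cast<std::uint32_t>(packed & 0xFFFFFFFFu);
}

// Converts a packed ratio to a floating-point value.
// Returns 0 when the denominator is 0 instead of dividing by zero.
inline double RatioToDouble(std::uint64_t packedRatio) {
    std::uint32_t num = 0, den = 0;
    UnpackPair(packedRatio, &num, &den);
    return den == 0 ? 0.0 : static_cast<double>(num) / den;
}
```

For example, RatioToDouble(PackPair(30000, 1001)) gives roughly 29.97, the familiar NTSC frame rate expressed as a ratio.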

 

There are many other attributes we can query. Let’s look at audio:

   else {
      UINT32 num;
      if(SUCCEEDED(spMediaType->GetUINT32(MF_MT_AUDIO_BITS_PER_SAMPLE, &num)))
         cout << "Bits/sample: " << num << endl;
      if(SUCCEEDED(spMediaType->GetUINT32(MF_MT_AUDIO_NUM_CHANNELS, &num)))
         cout << "# Channels: " << num << endl;
      if(SUCCEEDED(spMediaType->GetUINT32(MF_MT_AUDIO_AVG_BYTES_PER_SECOND, &num)))
         cout << "Average bytes/sec: " << num << endl;
      if(SUCCEEDED(spMediaType->GetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, &num)))
         cout << "Samples/sec: " << num << endl;
   }
   cout << endl;
}
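
As for the subtype lookup left as an exercise earlier, here’s one possible starting point. As far as I know, most MF subtype GUIDs share the same tail (xxxxxxxx-0000-0010-8000-00AA00389B71) and differ only in the first 32-bit field: compressed video subtypes put a FOURCC code there, audio subtypes a wave format tag. The sketch below works on that first field only (GUID::Data1 in Windows terms), so it compiles anywhere. DescribeSubtypeData1 is a hypothetical helper, the tag table is deliberately tiny, and uncompressed video subtypes (which don’t carry a printable FOURCC) simply come back as unknown:

```cpp
#include <cctype>
#include <cstdint>
#include <string>

// Describes the first 32-bit field (Data1) of a Media Foundation subtype GUID.
// Assumption: the GUID uses the common base xxxxxxxx-0000-0010-8000-00AA00389B71.
inline std::string DescribeSubtypeData1(std::uint32_t data1, bool video) {
    if (!video) {
        // Audio subtypes carry a wave format tag; only a few well-known tags are listed.
        switch (data1) {
            case 0x0001: return "PCM";
            case 0x0003: return "IEEE float";
            case 0x0055: return "MP3";
            case 0x0161: return "Windows Media Audio";
            case 0x1610: return "AAC";
            default:     return "Unknown audio format tag";
        }
    }
    // Compressed video subtypes usually carry a FOURCC, least significant byte first.
    std::string fourcc;
    for (int shift = 0; shift < 32; shift += 8) {
        unsigned char c = static_cast<unsigned char>((data1 >> shift) & 0xFF);
        if (!std::isprint(c))
            return "Unknown video subtype";
        fourcc += static_cast<char>(c);
    }
    return fourcc;
}
```

With the guid variable from the code above, a call would look like DescribeSubtypeData1(guid.Data1, video).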

 

One thing we forgot is to show the media file’s duration… Let’s add that. This is not a stream-based attribute, but a file-based one (or more precisely, presentation-based):

UINT64 duration;
if(SUCCEEDED(spDesc->GetUINT64(MF_PD_DURATION, &duration))) {
   CTimeSpan span(duration / 10000000);
   wcout << "Duration: " << (LPCWSTR)span.Format(L"%H:%M:%S") << endl;
}

CTimeSpan is a small shared MFC/ATL class that is used here for formatting purposes. The original value is in 100-nanosecond units. This is converted to seconds when passed to CTimeSpan.
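
If you’d rather not pull in CTimeSpan, the conversion is simple enough to do by hand. Here’s a minimal sketch, assuming the duration is the UINT64 read from MF_PD_DURATION; FormatDuration100ns is a made-up helper name. Unlike CTimeSpan’s %H, which (as I understand it) shows only the hours within the current day, this version shows total hours:

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a duration given in 100-nanosecond units as HH:MM:SS.
inline std::string FormatDuration100ns(std::uint64_t duration100ns) {
    const std::uint64_t kUnitsPerSecond = 10000000ULL;  // 10^7 units of 100 ns per second
    std::uint64_t totalSeconds = duration100ns / kUnitsPerSecond;  // fractions are truncated
    unsigned long long hours = totalSeconds / 3600;
    unsigned minutes = static_cast<unsigned>((totalSeconds / 60) % 60);
    unsigned seconds = static_cast<unsigned>(totalSeconds % 60);
    char text[32];
    std::snprintf(text, sizeof(text), "%02llu:%02u:%02u", hours, minutes, seconds);
    return text;
}
```

The division by 10,000,000 is the same one passed to CTimeSpan above; the rest is just splitting seconds into hours, minutes and seconds.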

Here’s an output for some video file I have:

image

Here’s an example of an MP3 file:

image

The entire project is attached.

Introduction to TopoEdit

Published at Mar 11 2011, 05:33 PM by pavely

In Windows Media Foundation, TopoEdit is the equivalent of DirectShow’s GraphEdit tool. Using a simple graphic interface, one can build topologies (the equivalent of a DirectShow filter graph), and “run” them, that is, start the flow of data, from a source node towards one or more output nodes. We’ll see that in a minute.

To open TopoEdit, the Windows SDK should be installed. Navigate using Windows Explorer to something like C:\Program Files\Microsoft SDKs\Windows\v7.1\Bin and run TopoEdit.Exe. You should see something like this:

image

Not too exciting at this point. We have a blank working area where our topology would show.

The easiest way to start experimenting is to use the File->Render File… menu item. Select some media file (such as a WMV or MP3 file). If all is well, you should see something like this:

image

What just happened? TopoEdit used Media Foundation’s source resolver and with some extra work, tried to build a topology that manages to deal with the selected source (here having two streams of data: one audio, the other video) and get each stream to a “visual” output, namely an Audio renderer and a Video renderer.

Pressing the “Play” toolbar button (or selecting from the menu Controls->Play) should show a new window with the media file playing. You can change the playing rate, pause or stop playback. In essence, you have a simple Windows Media Player.

Let’s try to create this topology in a different way. Select File->New from the menu, answer “yes” to the prompt about continuing without saving the current topology, and you should be back at an empty work area.

Now go to the Topology menu and select “Add Source…”. Navigate to a media file of your choosing. Let’s assume the previous file, which contains both video and audio. You should see something like this:

image

We’ll disregard the “attributes” pane on the right for now.

What we have is a node within the topology of type “source”. A source node is required, and a media file is a common source. Other possibilities may be a video capture device (camera for example) or a microphone. Technically it can be anything that matches a specific interface.

We can see in the toolbar area that the topology is not resolved, meaning there is no valid path from a source node to an output node – we have no output node yet. Clicking “Play” fails with a topology resolution error.

We need an output node. The most obvious may be a video renderer. Select “Add EVR” from the Topology menu. The EVR is the Enhanced Video Renderer, which is MF’s default supplied video renderer. Connect the output of the Video input stream to the Video Renderer’s input node by clicking and dragging. It should look something like:

image

Now for the magic part: select Topology->Resolve Topology. If you’ve selected a WMV file, a “WMVideo Decoder” node appears, resolving the mismatch that naturally exists between a WMV source and the EVR. The WMV video stream needs to be decoded before it can be rendered. MF is smart enough to look for matching transforms that can do the job. Pressing “Play” now shows video but no sound.

image

We can provide audio output in a similar manner. Select Topology->Add SAR to add a sound output device. Connect the audio stream to the Audio Renderer, and resolve the topology again. Now you can fully play the media file.

Now let’s try something more daring.

Disconnect the WMVideo Decoder from the Video renderer by clicking one of the pins and pressing DELETE.

image

Now select from the menu Topology->Add Tee. A “splitter” node is created. Select from the menu Topology->Add EVR.

image

Now connect the WMVideo Decoder to the input of the “Tee” and connect the Tee’s output to one of the EVRs. Automatically another output appears on the Tee. Connect it to the other EVR.

image

Now resolve the topology. Press Play. You should see two video windows playing at once.

Summary

TopoEdit is a useful tool. Unfortunately, it’s not useful enough. Setting attributes is difficult. Setting properties (in the MF sense, which I’ll discuss in a later post) is not possible as far as I could see. Also, the tool has bugs and its UI is clunky at best. Looking at its About box shows:

image

Looks like May 2010, but the copyright is 2005. I tend to believe the copyright. The tool could use some work. That said, its source code is part of the Windows SDK, so it can be tweaked.

Until next time with the Media Foundation, happy topologies!

Introduction to Windows Media Foundation

Published at Mar 07 2011, 11:26 PM by pavely

I’ve been writing a new course on this technology, so I thought I’d share some of my experiences with the Windows Media Foundation.

What is Windows Media Foundation?

The Windows Media Foundation is technically the successor of DirectShow (which is still around and very much supported), introduced in Windows Vista and enhanced in Windows 7.

It’s a multimedia platform, capable of playing, analyzing, writing and otherwise transforming media (mostly video & audio, but can technically be anything). It’s based on similar principles as DirectShow, such as interface based programming using COM, which naturally lends itself to multiple implementations.

MF exposes a COM API, much like many native technologies these days. Some people run away almost immediately when they hear “COM” uttered around some API, but at its core COM mandates an interface-based programming model, which is a good thing. The “bad” thing is probably the apartment model, which most people using COM don’t fully understand, or at least hate.

This is understandable, as the model is not a simple one, albeit a necessary evil considering the time COM was conceived (around 1993). At that time, multithreading of any kind was a new concept in the Windows arena – Windows 3.x didn’t have any and many programmers were using VB 3.0/4.0/5.0 and later 6.0, which did not support the creation of threads. COM apartments were built to protect objects that couldn’t protect themselves.

If COM were invented today, there would be no apartments. It would be similar to the .NET CLR model: everyone must be aware of multithreading and its perils, and that’s that. There are “helpers”, such as synchronization contexts, but the model is basically multithreaded all the way. Programmers simply had to grow up (yes, even the VB.NET guys).

Media Foundation states that its objects live in an MTA, but they are not “full COM objects”, which is not an official term. Either you’re a COM object or you’re not. You can’t technically be “part” COM. What MF means is that it does not support a proxy coming from an STA; such an interface pointer must be marshaled to the MTA (or TNA) and then handed over to MF. This is pretty strange, so I tried to get a better feel of what this means.

In a perfect COM world, there would be no apartments and no proxies (in process). This is in fact achievable, by aggregating the Free Threaded Marshaler (FTM). This object implements IMarshal and ensures that whenever an interface pointer on the object is requested, it always returns a direct pointer, never a proxy. This sounds ideal, and it is, provided one avoids some potential “gotchas”, the main one being an FTM-based object that wants to hold interface pointers to non-FTM-based objects. This is problematic, because when the FTM-based object wants to call the other object it must do so with the correct pointer, be it a proxy or not, depending on the calling apartment. Although an FTM-based object is apartment agnostic, the other object isn’t. The usual solution to this is to register the interface pointer upon reception in the Global Interface Table (GIT), getting back a DWORD cookie that is apartment agnostic. Then, the FTM-based object can get a correct interface pointer from any thread by getting it from the GIT, using it, and releasing it, while continuing to keep that cookie. If this explanation confuses you, dear reader, you’re not alone, and if you didn’t hate apartments up till now, you may start now.

Does MF aggregate the FTM?

I was pretty sure that MF objects aggregate the FTM. After all, it’s almost a perfect solution. However, querying various objects for IMarshal (a necessary first sign of possible FTM aggregation) failed. MF objects don’t aggregate the FTM. Most of them are not created through the official COM CoCreateInstance API, but are created privately, partially leading to the “not full COM objects” statement. Checking some interfaces in the registry revealed that MF has its own proxy/stub DLL (it does not use the type library marshaler). This is a snapshot of one of the common interfaces in MF, IMFAttributes:

image

What does all this mean? I’m not entirely sure, and the documentation on this is (IMHO) too complex. The only sure thing is that client threads had better be in the MTA, and client objects had better be in the MTA as well or aggregate the FTM.

After all this COM geek talk, what really is Media Foundation? I’ll talk about that in the next post in this series, starting with the TopoEdit tool, the equivalent (sort of) of DirectShow’s GraphEdit.

Technology Radio show that I participated in

Published at Mar 02 2011, 08:40 PM by pavely

Last week I was invited to a radio show called “Technofobia” (origin in Hebrew) in the Interdisciplinary Center (IDC) in Hertzliya (Israel) (106.4 FM). This was a “special” talk about development technologies. There was a Ruby guy, a Python guy, a C++ guy (actually a girl), an iPhone guy (the famous Yossi Taguri) and myself, as the “Microsoft guy”. I had to repeatedly (and during the music breaks) tell the hosts that I don’t work at Microsoft, I just use, teach, mostly like and develop with those technologies. It simply didn’t take. Oh well…

You can hear a podcast of the show here (that’s the February 24th show. Sorry, non-Hebrew speaker guys & girls, that was all in Hebrew). It was a fun experience, although we ended up talking very little about what we were actually supposed to talk about, so it seems there will be a part 2 sometime in the near future. Stay tuned…