I am currently implementing Reservation (a pattern for “partial temporary commitment” when using Sagas). The implementation calls a few of the saga members and asks each of them to reserve an asset. Since network calls can take some time, I wanted to make each call on its own thread, collect any errors (failures to reserve or communication exceptions) and then have all the threads reconverge on the initiating thread, where any problems are sorted out (e.g. by retrying the reservation). It basically looks something like the figure below:
As you can probably guess, this type of synchronization is called Rendezvous (as the threads “meet”). It is also fairly easy to implement in .NET. I usually do something along the lines of the code snippet below, and things like Parallel.For in C# 4.0 will give a similar effect:
    private void DispatchThreads(List<Thread> threads, List<SharedContext> sharedContexts, ParameterizedThreadStart threadFunction)
    {
        // Initiate threads: one per shared context
        foreach (var context in sharedContexts)
        {
            var dispatchThread = new Thread(threadFunction);
            threads.Add(dispatchThread);
            dispatchThread.Start(context);
        }

        // Rendezvous: block until every dispatched thread has finished
        foreach (var dispatchThread in threads)
            dispatchThread.Join();
    }
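For comparison, here is a minimal sketch of the same fork/rendezvous shape using the .NET 4 task library, including the error collection I described at the top. `ReservationDispatcher` and `ReserveAll` are names I made up for illustration, and the `reserve` delegate stands in for the real call to a saga member; treat it as one possible shape rather than the actual implementation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ReservationDispatcher
{
    // Runs reserve(member) for every member in parallel and returns the
    // members whose reservation failed, paired with the exception thrown.
    public static IList<KeyValuePair<string, Exception>> ReserveAll(
        IEnumerable<string> members, Action<string> reserve)
    {
        var failures = new ConcurrentBag<KeyValuePair<string, Exception>>();

        // Parallel.ForEach returns only after every iteration has finished,
        // which is the rendezvous point.
        Parallel.ForEach(members, member =>
        {
            try
            {
                reserve(member);
            }
            catch (Exception ex)
            {
                // Collect instead of rethrowing so one failure does not
                // hide the others.
                failures.Add(new KeyValuePair<string, Exception>(member, ex));
            }
        });

        return failures.ToList();
    }
}
```

The caller gets back the list of failures on the initiating thread and can decide there whether to retry or compensate.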
However, that’s not why I am writing this post. It turns out I have a more complicated situation. In my setup the code of “Main Thread” (in the figure above) can be run by multiple different threads, which are initiated by external events (i.e. I don’t control when they run). What I needed to do is make sure that if several threads run in parallel, they all wait for the last of them to finish one section of the code before any of them moves on. To explain this better, consider the figure below:
Thread 1 starts, sometime after that Thread 2 starts and then Thread 3 starts as well. Thread 1 and Thread 2 have to hold and wait until Thread 3 finishes. Thread 4 starts after the Rendezvous, so none of the other threads care about it. Had it started just before Thread 3 finished, all the threads would have had to wait for it as well. Basically I need an effect similar to chords in Polyphonic C#.
I couldn’t find anything in .NET 3.5 that does this. It is kind of what ManualResetEvent does; however, it is not quite that, since we need to wait for the event only when we’re done. Also we need to let other threads “know” we are running so that they wait for us. So I wrapped a ManualResetEvent with a few methods to do just that and ended up with the following:
    public class Rendezvous
    {
        private readonly ManualResetEvent Synch = new ManualResetEvent(true);
        private readonly object Locker = new object();
        private int Counter = 0;

        /// <summary>
        /// Join the Rendezvous - mark that a thread is active and that it should finish before the others can continue
        /// </summary>
        public void MarkStart()
        {
            lock (Locker)
            {
                if (Counter < 0) Counter = 0;
                Counter++;
                Synch.Reset();
            }
        }

        /// <summary>
        /// Allows a thread to wait for others without having the others wait for it
        /// </summary>
        public void Wait()
        {
            Synch.WaitOne();
        }

        /// <summary>
        /// Mark that this thread is done and wait until every other active thread is done as well
        /// </summary>
        public void MarkFinishAndWait()
        {
            lock (Locker)
            {
                // The last active thread to finish releases everybody
                if (--Counter <= 0) Synch.Set();
            }
            // It is OK if we context switch here and another thread calls MarkStart, since we want to synchronize the end times
            Wait();
        }
    }
As long as all the threads have access to the same instance of the Rendezvous class they can synchronize their end times.
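To show how the class is meant to be used, here is a small usage sketch. It carries a compact copy of the Rendezvous class (without the negative-counter guard) so that it compiles on its own; `RendezvousDemo`, the sleep durations and the timestamp bookkeeping are all made up for the demo. Note that it relies on all three workers calling MarkStart before the fastest one finishes, which the sleep times make very likely but do not strictly guarantee.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

// Compact copy of the class above so the sketch compiles on its own.
public class Rendezvous
{
    private readonly ManualResetEvent synch = new ManualResetEvent(true);
    private readonly object locker = new object();
    private int counter;

    public void MarkStart()
    {
        lock (locker) { counter++; synch.Reset(); }
    }

    public void MarkFinishAndWait()
    {
        lock (locker) { if (--counter <= 0) synch.Set(); }
        synch.WaitOne();
    }
}

public static class RendezvousDemo
{
    // Returns true when no worker got past the rendezvous before the
    // slowest worker had finished its work.
    public static bool Run()
    {
        var rendezvous = new Rendezvous();
        int[] workMillis = { 100, 200, 300 };   // arbitrary demo durations
        var workDone = new long[workMillis.Length];
        var released = new long[workMillis.Length];

        var threads = workMillis.Select((millis, i) => new Thread(() =>
        {
            rendezvous.MarkStart();             // "I am running, wait for me"
            Thread.Sleep(millis);               // stands in for the real work
            workDone[i] = Stopwatch.GetTimestamp();
            rendezvous.MarkFinishAndWait();     // hold until everyone is done
            released[i] = Stopwatch.GetTimestamp();
        })).ToList();

        threads.ForEach(t => t.Start());
        threads.ForEach(t => t.Join());

        return released.Min() >= workDone.Max();
    }
}
```

In the real setup the threads are started by external events rather than by one dispatcher, but the calls each thread makes (MarkStart on entry, MarkFinishAndWait at the end of the section) are the same.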
Rendezvous is a common pattern for workflow systems (also known as Join-And). It is the first time I had to do that in C# with uncontrolled/uncoordinated thread starts so I don’t know how useful it would be to you. However I thought it is also interesting reading so I published it anyway :)
Reacting to a comment left by Frans Bouma, Ayende recently wrote about “Maintainability”:
Maintainable is a value that can only be applied by someone who is familiar with the codebase. If that someone find it hard to work on the codebase, it is hard to maintain. If someone with no knowledge of a codebase find it hard to work with it, tough luck, but that doesn’t say anything about the maintainability of a code base.
I usually agree with what Ayende has to say, but not this time. First, I hope that by “someone who is familiar with the codebase” he doesn’t mean the person who actually wrote the code, since if the person who wrote the code can’t understand what he/she wrote then the code base is doomed anyway.
In the wider sense, “someone who is familiar with the codebase” is just part of the picture. A code base is only maintainable if a reasonably professional developer can get to a point where she is familiar enough with the code to be able to maintain it. This doesn’t imply that the time it takes to be productive with the code base is zero, but the less time it takes to get up to speed, the more maintainable the code is.
In any event, for a codebase to be maintainable, it has to show several quality attributes. For the most part I agree with the definition of Maintainability in ISO 9126:2001 Software Engineering Product Quality*
6.5 Maintainability
The capability of the software product to be modified. Modifications may include corrections, improvements or adaptation of the software to changes in environment, and in requirements and functional specifications.
- 6.5.1 Analysability - The capability of the software product to be diagnosed for deficiencies or causes of failures in the software, or for the parts to be modified to be identified.
- 6.5.2 Changeability - The capability of the software product to enable a specified modification to be implemented.
NOTE 1 Implementation includes coding, designing and documenting changes.
NOTE 2 If the software is to be modified by the end user, changeability may affect operability.
- 6.5.3 Stability - The capability of the software product to avoid unexpected effects from modifications of the software.
- 6.5.4 Testability - The capability of the software product to enable modified software to be validated.
- 6.5.5 Maintainability compliance -The capability of the software product to adhere to standards or conventions relating to maintainability.
Naturally, being a standard, it has the “compliance” thingy, which is usually only relevant for large organizations and projects. For the most part, though, the aspects mentioned above are what you need to take care of when you want someone besides yourself to make changes to the software.
The view of Maintainability Ayende uses is problematic, esp. when we consider that (successful) software will spend most of its life in maintenance and not in development (you can read Robert L. Glass’s excellent “Software Maintenance is a Solution, Not a Problem” paper in this regard). Assuming someone maintaining the code will always be familiar with it is expecting the same developer(s) to stay on the same project for as long as the project lives (which is not likely) and/or assuming the project will have a short life (not something I’d want from my projects).
So don’t forget that other people will have to maintain your code and they probably won’t live the code base as you do. Or, as they say in “Code for the maintainer” on the C2 wiki:
“Always code as if the person who ends up maintaining your code is a violent psychopath who knows where you live.” :)
* ISO 9126 is a multi-part standard for QA. I think ISO 9126:2001 is a good quick reference for quality attributes (i.e. something you can look at when you try to elicit quality attributes for an architecture). Personally, I think the other parts of the standard are pretty useless, but that's another story :)
In one of my previous posts (REST: good, bad and ugly), I made a passing comment about how I think using CRUD in a RESTful service is a bad practice. I received a few comments and questions asking why I say that. So what’s wrong with CRUD and REST?
On the surface, it seems like a very good fit (both technically and architecturally), however scratch that surface, and you’d see that it isn’t a good fit for either.
REST over HTTP is the most common (almost only) implementation of the REST architectural style - to the point REST over HTTP is synonymous with REST. I would say most of the people who think of REST in CRUD terms, think about mapping of the HTTP verbs.
CRUD stands for Create, Read, Update and Delete, the four basic database operations. Some of the HTTP verbs, namely POST, GET, PUT and DELETE (there are others, like OPTIONS or HEAD), seem to have a 1-1 mapping to CRUD. As I said earlier, they don’t. The table below briefly contrasts HTTP verbs and CRUD:
| Verb | CRUDdy Candidate | Actually |
| GET | SELECT (Read) | Get a representation of a resource. While it is very similar to SELECT, it also has a few features beyond an out-of-the-box SELECT, e.g. by using If-Modified-Since (and similar modifiers) you might get an empty reply. |
| DELETE | Delete | Maps well |
| PUT | Update | PUT looks like an update but it isn’t, since: 1. You have to provide a complete replacement for the resource (again similar to update but not quite). 2. You can use PUT to create a resource (when the URI is set by the client). |
| POST | Insert | It can be used to create a resource, but it should be a child/subordinate one. Furthermore, it can be used to provide a partial update to a resource (i.e. not resulting in a new URI). |
| OPTIONS | ? | Get the available ways to continue considering the current state of the resource |
| HEAD | ? | Get the headers or metadata about the resource (which you would otherwise GET) |
The way I see it, the HTTP verbs are more document oriented than database oriented (which is why document databases like CouchDB are seamlessly RESTful). In any event, what I tried to show here is that while you can update, delete and create new resources the way you do that is not exactly CRUD in the database sense of the word – at least when it comes to using the HTTP verbs.
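To make the PUT/POST rows of the table a bit more tangible, here is a minimal in-memory sketch. `ResourceStore` and its methods are hypothetical names used only for illustration (there is no HTTP stack here); the point is only the difference in who picks the URI and what happens when a call is repeated.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory resource store; names are for illustration only.
public class ResourceStore
{
    private readonly Dictionary<string, string> resources =
        new Dictionary<string, string>();
    private int nextId = 1;

    // PUT: the client names the URI and sends a complete representation.
    // Returns true when the resource was created, false when an existing
    // one was replaced. Repeating the call leaves the same state, so it
    // is idempotent.
    public bool Put(string uri, string representation)
    {
        bool created = !resources.ContainsKey(uri);
        resources[uri] = representation;
        return created;
    }

    // POST to a collection: the server picks the URI of the new
    // subordinate resource and returns it. Repeating the call creates
    // another resource, so it is not idempotent.
    public string Post(string collectionUri, string representation)
    {
        string uri = collectionUri.TrimEnd('/') + "/" + nextId++;
        resources[uri] = representation;
        return uri;
    }

    public int Count { get { return resources.Count; } }
}
```

Calling `Put("/orders/42", ...)` twice leaves one resource, while calling `Post("/orders", ...)` twice leaves two, which is exactly where the "Update"/"Insert" analogy stops working.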
However, the main reason CRUD is wrong for REST is an architectural one. One of the base characteristics(*) of REST is using hypermedia to externalize the state machine of the protocol (a.k.a. HATEOAS – Hypermedia as the Engine of Application State). The URI to URI transition is what makes the protocol tick (the transaction implementation by Alexandros discussed in the previous post shows a good example of following this principle).
Tim Ewald explains this nicely (in a post from 2007…) :
… Here's what I came to understand. Every communication protocol has a state machine. For some protocols they are very simple, for others they are more complex. When you implement a protocol via RPC, you build methods that modify the state of the communication. That state is maintained as a black box at the endpoint. Because the protocol state is hidden, it is easy to get things wrong. For instance, you might call Process before calling Init. People have been looking for ways to avoid these problems by annotating interface type information for a long time, but I'm not aware of any mainstream solutions. The fact that the state of the protocol is encapsulated behind method invocations that modify that state in non-obvious ways also makes versioning interesting.
The essence of REST is to make the states of the protocol explicit and addressable by URIs. The current state of the protocol state machine is represented by the URI you just operated on and the state representation you retrieved. You change state by operating on the URI of the state you're moving to, making that your new state. A state's representation includes the links (arcs in the graph) to the other states that you can move to from the current state. This is exactly how browser based apps work, and there is no reason that your app's protocol can't work that way too. (The ATOM Publishing protocol is the canonical example, though its easy to think that its about entities, not a state machine.)
If you are busy with inserting and updating (CRUDing) resources you are not, in fact, thinking about protocols or externalizing a State machine and, in my opinion, miss the whole point about REST.
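As a rough illustration of what "externalizing the state machine" can look like, here is a sketch where the representation of an order carries only the links that are valid in its current state. The order states, the link relations and the URI layout are all assumptions I invented for the example; they are not taken from any particular service.

```csharp
using System;
using System.Collections.Generic;

public enum OrderState { Draft, Submitted, Paid, Cancelled }

public class Link
{
    public string Rel { get; set; }
    public string Href { get; set; }
}

public static class OrderRepresentation
{
    // The links returned for an order are the arcs out of its current
    // state; a client follows them instead of hard-coding the workflow.
    public static List<Link> LinksFor(int orderId, OrderState state)
    {
        string self = "/orders/" + orderId;
        var links = new List<Link> { new Link { Rel = "self", Href = self } };

        switch (state)
        {
            case OrderState.Draft:
                links.Add(new Link { Rel = "submit", Href = self + "/submission" });
                links.Add(new Link { Rel = "cancel", Href = self + "/cancellation" });
                break;
            case OrderState.Submitted:
                links.Add(new Link { Rel = "payment", Href = self + "/payment" });
                links.Add(new Link { Rel = "cancel", Href = self + "/cancellation" });
                break;
            // Paid and Cancelled are terminal: only "self" remains.
        }
        return links;
    }
}
```

A client that only ever follows the links it is handed cannot "call Process before calling Init", which is the problem Tim describes for RPC.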
CRUD services lead to and promote the database-as-a-service kind of thinking (e.g. ADO.NET Data Services), which, as I explained in another post last year, is a bad idea since:
- It circumvents the whole idea about "Services" - there's no business logic.
- It is exposing internal database structure or data rather than a thought-out contract.
- It encourages bypassing real services and going straight to their data.
- It creates a blob service (the data source).
- It encourages minuscule demi-services (the multiple "interfaces" of said blob) that disregard a few of the fallacies of distributed computing.
- It is just client-server in sheep's clothing.
The main theme of this and the previous post is that if we try to drag REST to the same old, same old stuff we always did, we won’t really get that many benefits. In fact, the “old” ways of doing that stuff are probably more suitable for the job anyway, since they have been in use for a while now and they are “tried and tested” (“You can’t win an argument with an idiot, he’ll just drag you down to his level and beat you with experience” …). REST is just a different paradigm than RPC, ACID transactions and CRUD.
* I know I sound like a broken record on that, but our industry has a history of diluting terms to a point where they almost stop being useful (SOA comes to mind...). The way I see it, you can have 3 levels on your way to REST over HTTP:
- You can be using HTTP and XML/JSON – this is level 1 or “Using standards”.
- You can be using the HTTP verbs properly and/or applying document oriented communications – this is level 2 or “Rest-like” interface
- You can conform to all REST constraints and be at level 3 or “RESTful”.
All levels can be useful and bring you merit but only the 3rd is REST
Yesterday I read an interesting paper called “RETRO: A RESTful Transaction Model”. On the good side, I have to say it is one of the best RESTful models I’ve seen thus far. The authors took special care to satisfy the different REST constraints, unlike many “RESTful” services (e.g. Twitter, which returns identifiers and not URIs). On the downside, I think a distributed transaction model is bad for REST; in other words, I don’t see a reason for going through this effort and jumping through all these hoops.
Why?
For the same reasons transactions are wrong for SOA and why WS-AtomicTransaction is wrong for SOAP web services:
- Service Boundary – a service boundary, RESTful or otherwise, is a trust boundary. Atomic transactions require holding locks, and holding them on behalf of a foreign service opens a security hole (it makes a denial of service attack much easier)
- You cannot assume atomicity between two different entities or resources. Esp. when these resources belong to different businesses.
- Transactions introduce coupling (at least in time)
- Transactions hinder scalability – It isn’t that you can’t scale but it is much harder
For REST it is even worse. Since using hypermedia as the engine of state change means that the hypermedia actually describes the protocol, we clutter the business representations (the representations of real business entities like customer, order etc.) with transactional nitty-gritty. As the authors say:
“our model explicitly identifies locks, transactions, owners and conditional representations as explicit, linkable resources. In fact, every significant entity in our model is represented as a resource in order to comply with this constraint.”
This also means that programming the resources themselves will get much more complicated.
I think that if you want to reap the benefits of REST you should keep the protocol simple and focus on the business and technical merits you can get, not bog it all down with needless complexity. It seems to me that RETRO is a good mental exercise to show transactions can be RESTful. I think, however, that it is overkill for RESTful implementations.
RESTful architectures will be better off with BASE (Basically Available, Soft state, Eventually consistent) and/or ACID2 (Associative, Commutative, Idempotent and Distributed) models, or at least the Saga model (which the authors intend to tackle next), which is a better candidate (IMHO) for achieving distributed consensus.
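To give a feel for what the ACID2 properties buy you, here is a tiny sketch of a merge operation that is associative, commutative and idempotent. The scenario (replicas exchanging the sets of order IDs they have already applied) and the `ReplicaState` name are assumptions made up for the example.

```csharp
using System.Collections.Generic;

public static class ReplicaState
{
    // Merge two replicas' sets of applied order IDs. Set union is
    // associative, commutative and idempotent, so replicas converge no
    // matter how messages are grouped, ordered or duplicated.
    public static HashSet<string> Merge(
        IEnumerable<string> left, IEnumerable<string> right)
    {
        var merged = new HashSet<string>(left);
        merged.UnionWith(right);
        return merged;
    }
}
```

Because the order and number of deliveries do not matter, nobody has to hold locks on behalf of anybody else to get to a consistent result eventually.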
This is another post (<Rant>) about WCF default behavior and how it can make the life of developers miserable ( you can also check out “WCF defaults limit scalability” and “Another WCF gotcha - calling another service/resource within a call”)
Anyway, the trigger for this is a post by Ayende called “WCF works in mysterious ways”. Ayende posted some code he wrote which was throwing a serialization exception. You can see his post for the full code, but in a nutshell he was defining a large object graph (8192 objects that contain other objects) and was trying to send that over the wire. Here’s a short excerpt from the service definition:
1: [ServiceBehavior(
2: InstanceContextMode = InstanceContextMode.Single,
3: ConcurrencyMode = ConcurrencyMode.Single,
4: MaxItemsInObjectGraph = Int32.MaxValue
5: )]
6: public class DistributedHashTableMaster : IDistributedHashTableMaster
7: {
8: private readonly Segment[] segments;
9:
10: public DistributedHashTableMaster(NodeEndpoint endpoint)
11: {
12: segments = Enumerable.Range(0, 8192).Select(i =>
13: new Segment
14: {
15: AssignedEndpoint = endpoint,
16: Index = i
17: }).ToArray();
18: }
19:
20: public Segment[] Join()
21: {
22: return segments;
23: }
24: }
25:
26: [ServiceContract]
27: public interface IDistributedHashTableMaster
28: {
29: [OperationContract]
30: Segment[] Join();
31: }
32:
33: public class NodeEndpoint
34: {
35: public string Sync { get; set; }
36: public string Async { get; set; }
37: }
38:
39: public class Segment
40: {
41: public Guid Version { get; set; }
42:
43: public int Index { get; set; }
44: public NodeEndpoint AssignedEndpoint { get; set; }
45: public NodeEndpoint InProcessOfMovingToEndpoint { get; set; }
46:
47: public int WcfHatesMeAndMakeMeSad { get; set; }
48: }
As you can see in line 4, the service is properly decorated with an attribute to enlarge the number of objects in the graph. So, looking at the code, I initially suggested he add a few ServiceKnownType and DataContract/DataMember attributes on the data classes (as the serialization sometimes needs some guidance). After that didn’t help, I actually ran the code and then noticed that the code was missing that same setting on the client side. So to fix the problem, the client side code below
    var channel =
        new ChannelFactory<IDistributedHashTableMaster>(binding, new EndpointAddress(uri))
            .CreateChannel();
    channel.Join();
needs to change to something like:
    var channelFactory =
        new ChannelFactory<IDistributedHashTableMaster>(binding, new EndpointAddress(uri));

    // Raise the object-graph limit on every operation of the client-side contract
    foreach (var operationDescription in channelFactory.Endpoint.Contract.Operations)
    {
        // Find returns null (instead of throwing) when the behavior is absent
        var dataContractBehavior =
            operationDescription.Behaviors.Find<DataContractSerializerOperationBehavior>();

        if (dataContractBehavior != null)
        {
            dataContractBehavior.MaxItemsInObjectGraph = int.MaxValue;
        }
    }

    var channel = channelFactory.CreateChannel();
    channel.Join();
The main problem I find with this piece of code is the fact that it is needed at all. As the post’s title suggests, I find this behavior greatly affects the loose coupling of anything that uses WCF (services or other components).
WCF requires that any change you make to the channel on the server side would be reflected in the channel on each and every client (e.g. we have a similar setting where we enlarge message sizes for webHttpBinding and there are many other such examples).
Sure, you say, that is just like adding a new field in the contract, isn’t it? Well, no, it isn’t, since unlike anything else which appears in the (verbose as it is) SOAP contract, these changes in default values, which are purely a WCF design choice, are not documented. Again, the changes in default values are not part of the contract. These are things you need to remember to pass on to your service consumers. So not only do I pay the overhead of having an explicit contract (e.g. vs. REST), it really doesn’t work. It means that two components that use the same contract may not be interchangeable if one returns more data (in this case). It means that the two sides are coupled by the need to change these defaults, and for what? WCF is smart enough to know how long the message is; WCF is smart enough to handle the message (if I encourage it by setting a behavior); why can’t it add 2 and 2 by itself?
Sometimes I just wish WCF had a TrainingWheels or DemosOnly attribute I could just set to false and make all this crap go away…
</Rant>
I recently read a post by Tim Bray where he states that building on web technologies let you get away with believing some of the fallacies of distributed computing.
I personally think he is a little optimistic in that claim.
On “The network is reliable” – Tim says that the connectionless nature of HTTP helps (it does) and that the idempotency of GET, PUT and DELETE helps as well. I say that GET, PUT and DELETE are idempotent only if the people implementing the server side make them so, i.e. consider the fallacy. The fact that the HTTP specification says they should be idempotent doesn’t automatically make each implementation compliant.
On “Latency is Zero” – Tim says the web makes it worse but, he claims, users got used to that. Even if they did, I think that users are just part of the picture, since the programmable web is also making strides. Also, as Tim says, it is actually worse. Not to mention that latency isn’t constant either.
On “Bandwidth is infinite” – Again Tim agrees that it is worse but says people learn to take note of it. Again, learning that it is there doesn’t mean the fallacy is gone, just that people are less likely to presume it.
On “The Network is secure” – Tim says it’s probably the “least-well-addressed by the web” – no argument here.
On “Topology doesn’t change” – Tim says URIs help mitigate it. Again, Tim is assuming people make URIs permanent or will always return a temporary/permanent redirect when a URI changes. Good luck with that.
On “There is one administrator” – Tim says that yes, that’s the case, but who cares. Well, an example I usually give is the time I deployed an ASP.NET application which worked for a while, until the hosting company decided to change their policy to partial trust (the app needed full trust). When that happens to you, you care. If you mash up with someone else, you care, etc.
On “Transport cost is Zero” – Tim says it is the same as for bandwidth, i.e. worse.
On “The network is homogeneous” – Tim says that this is the “web’s single greatest triumph”. I actually agree with that, as long as all of you stick to using the web’s ubiquitous standards (HTTP, XML/JSON). If you have parts of your application that can’t use them, you still need to pay attention.
One thing I am really puzzled by is Tim’s conclusion :
“If you’re building Web technology, you have to worry about these things. But if you’re building applications on it, mostly you don’t.”
Since even according to him only 4 fallacies are covered by the web… (I think only 1)
In any event, I agree that the web standards, and REST in particular, do contain guidelines that take the fallacies into consideration. However, it is still up to developers to understand the problems they’ll create if they don’t follow these guidelines. Assuming that they always do is, well, overly optimistic in my experience.
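Take the idempotency point from the “network is reliable” item as an example. Here is a small sketch of two ways a server might implement DELETE on the same resource; only one of them survives a client retrying after a lost response. `CartResource` and the refund counter are hypothetical, made up to have a visible side effect.

```csharp
using System.Collections.Generic;

public class CartResource
{
    private readonly HashSet<string> items = new HashSet<string>();

    // Counts a side effect of deleting (think: refunds issued).
    public int RefundsIssued { get; private set; }

    public void Add(string itemId) { items.Add(itemId); }

    // Idempotent: a retried DELETE finds nothing to remove and
    // therefore triggers no second refund.
    public void DeleteIdempotent(string itemId)
    {
        if (items.Remove(itemId)) RefundsIssued++;
    }

    // Not idempotent: every call triggers the side effect, even though
    // the HTTP verb in front of it is still DELETE.
    public void DeleteNaive(string itemId)
    {
        items.Remove(itemId);
        RefundsIssued++;
    }
}
```

Both versions can sit behind the same verb and the same URI; the standard only says what should happen, it cannot make the second version behave.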
You can also read a paper I published a few years ago which explains the fallacies and why they are still relevant today.
Michael Poulin @ ebizq doesn’t like the Active Service pattern. I suggest you read his post first, but in a nutshell Michael sees two possible ways to understand the term Active Service:
“a) service view - a service that actively looking for companions to complete its own task
b) consumer view – a service which triggers its own execution by itself”
…and he doesn’t like either…
I think that both of these definitions aren’t that far… and I like both :)
The way I see it there are two concerns here:
1. Are services only reactive (“passive”) ? - i.e. The service only “works” when it gets a request from a service consumer (user/another service/an orchestration engine) ? If the service also has at least one thread working to do internal stuff (e.g. scavenging outdated data, pre-fetching data from other service etc.) then that’s what I call an Active Service (option “b” above)
2. How do services get data they need to complete a request when they actually get a request – There are many possibilities here: events, pub/sub, an orchestration engine that takes care of that, services that check for a known contract in a registry and then go to that service, even hardcoded. The options where the service looks for other services (e.g. using a registry) is option "a” above.
So basically all the options are valid: a service can be a+b, just a, just b or neither, and, in my eyes, these are orthogonal concerns.
Regarding pre-fetching – I think this can be beneficial as a way to achieve caching. Note that if you control both sides and you’ve got the needed infrastructure then it is probably better to push changes (eventing or pub/sub) but that’s not always the case.
In the comment I left on Michael’s blog I talked about different strategies for services “There are several strategies for that - one is to take that knowledge out of the service (e.g. using choreography or orchestration), providing a subscription and/or wiring infrastructure i.e. something that will tell you where to find certain contracts, hard coding , registry , using uniform interfaces (e.g. REST) etc.”
Let’s take a concrete (albeit very, very simplistic) scenario to illustrate some of the approaches.
Business scenario: When a customer makes an order we want to give a 5% discount to preferred customers. A customer gets preferred status upon a business decision (annual orders of 1M$ or knowing the CEO or whatever) and the status lasts for a year from the date it was introduced.
For the sake of this discussion say we have two services (again this is overly simplified) an Ordering service and a Customer service.
Here are a few technical options
Technical Scenario 1.
Customer places an order. The ordering service talks to “the” customer service to check if the customer deserves a discount. If she does, the ordering service updates the order with the discount and presents it to the customer to finalize the order.
Technical Scenario 2.
Same as 1, with the ordering looking for a service that matches the customer contract it knows about
Technical Scenario 3
The ordering service asks “the” Customer service twice a day for a list of discounts and caches the result. When the user sends her order, it calculates the price and presents it to her.
Technical Scenario 4
Same as 3, with the ordering looking for a customer service (not using a known service)
Technical Scenario 5
The customer service sends a message to known subscribers whenever a new customer status occurs. The ordering service listens for that and updates its internal cache. When the customer places her order, the ordering service hits the cache for the discount.
Technical Scenario 6
same as 5 but publishing an event to unknown subscribers
Technical Scenario 7
The customer service publishes an event with the discounts (or changes in discounts) twice a day. The ordering service listens for that and updates its internal cache. When the customer places her order, the ordering service hits the cache for the discount.
Technical Scenario 8
The customer order is passed to an orchestrating service, which hits a customer service for a discount and then passes all the data to an ordering service
…
There are quite a few more options and variants on the options listed but which one is best?
Yeah, you’ve guessed it: it depends. It depends since each option has its own strengths and weaknesses, which can work best in different circumstances. It also depends on the available infrastructure, on the structure of other services, on the services being internal or external etc.
For instance, scenario 1 is less flexible than most others but it is simple to implement. There is coupling in time between ordering and customer (both have to be up for the order to complete). Scenario 4 needs to solve the problem of finding other services (e.g. using some kind of registry, or other services “pushing” their existence or whatever), but when a customer makes her request it (most likely) has all the needed info to process that request, making the ordering service more autonomous. As a side note, the fact that different approaches to achieve the same end-goal work in different situations is why I decided to write patterns in the first place.
Lastly, in case you are wondering the scenarios are:
1 – choreography with pre-known (configured or hardcoded) companion services
2 – choreography with “active service” of type a (ordering is active)
3- choreography with “active service” type b (ordering is active)
4 – Choreography with “active service” type a + b (ordering is active)
5 – pub/sub (e.g. using an ESB)
6 – eventing
7- eventing with “active service” type b (customer is active)
8 - orchestration
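To make the “active service” type b flavor (scenario 3) a little more concrete, here is a rough sketch of an ordering service that pre-fetches the preferred-customer list and prices orders from its local cache. `ICustomerService`, `FixedCustomerService` and the method names are hypothetical, invented for this example; how `RefreshDiscounts` gets scheduled (a timer, a background thread) is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical contract of the Customer service.
public interface ICustomerService
{
    // Customer IDs that currently hold preferred status.
    IEnumerable<string> GetPreferredCustomers();
}

// Trivial stand-in for the real Customer service, for demo purposes.
public class FixedCustomerService : ICustomerService
{
    private readonly string[] preferred;
    public FixedCustomerService(params string[] preferred) { this.preferred = preferred; }
    public IEnumerable<string> GetPreferredCustomers() { return preferred; }
}

public class OrderingService
{
    private readonly ICustomerService customers;
    private readonly object gate = new object();
    private HashSet<string> preferred = new HashSet<string>();

    public OrderingService(ICustomerService customers)
    {
        this.customers = customers;
    }

    // Called by the service's own background thread or timer (twice a
    // day in scenario 3), not by a consumer request.
    public void RefreshDiscounts()
    {
        var fresh = new HashSet<string>(customers.GetPreferredCustomers());
        lock (gate) { preferred = fresh; }
    }

    // Handles an order using only local data, so the Customer service
    // does not have to be up at this point.
    public decimal PriceOrder(string customerId, decimal listPrice)
    {
        bool isPreferred;
        lock (gate) { isPreferred = preferred.Contains(customerId); }
        return isPreferred ? listPrice * 0.95m : listPrice;
    }
}
```

The trade-off is the one discussed above: the ordering service is more autonomous when the order arrives, at the cost of working with discounts that may be up to half a day stale.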
I recently got a request from Alik for my opinion on REST. I think this
might be interesting for a wider audience and decided to blog my answer
here.
Note: I also have a REST presentation I prepared awhile ago, which is
downloadable from here
(ppt)
The good
As you probably know REST is an architectural style defined
by Roy Fielding for the web which is built on several foundations
(client/server, uniform interface etc.) which gives it a lot of strength in
affected areas. The top three in my opinion are:
- (relatively) Easy to integrate – a good RESTful API is discoverable from the
initial URI onward. This doesn’t suggest that any application calling on
your service will automagically know what to do. It does mean, however, that the
developer reading your API and trying to integrate it has an easier life, esp.
since hypermedia provides you the roadmap of what to do next.
- Another feature for ease of integration which has to do with REST over HTTP
(THE most common implementation of REST ) is the use of ubiquitous standards.
Speaking HTTP which is the protocol of the web, emitting JSON or ATOMPub means
it is much easier to find a library that can connect to you on any language and
platform.
- Scalability – stateless communication, replicated repository make for a good
scalability potential.
Do note that, as with any architecture/technology, a bad implementation can
negate all the benefits.
Other REST goodness includes things like the notion of the URI, idempotence of GET
in REST over HTTP etc.
The Bad
Some of the problems of REST aren’t inherent problems of the architectural
style but rather drawbacks of the REST over HTTP implementation. Most notable of
these is what’s known as “lo-rest”
(using just GET and POST) – While technically it might still be RESTful, to me a
uniform interface with 2 verbs is too small to be really helpful (which indeed
makes a lot of the implementation unRESTful see “The Ugly” below)
One problem which isn’t HTTP specific is handling REST: programming languages
are not resource oriented, so the code that handles URI mapping tends to get
messy. Actually Microsoft did a relatively good
job of implementing Joe Gregorio’s idea of URI
mapping, which helps alleviate some of the problem. On the other hand it is
relatively hard to make the REST API hypertext driven (which is a constraint
of REST).
Lastly and most importantly REST is not the answer to everything (see also
another post I made on using
REST along with other architectural styles) – e.g. most REST implementations
I know do not support the notion of pub/sub (Roy did suggest a REST
implementation called WAKA that enables this but most people have never even heard of
it). Be wary of the “Hammer” syndrome: REST is a good tool for your toolset but
it isn’t the only one.
The Ugly
In my opinion there are 2 main ugly sides to REST. The first is Zealots.
That isn’t something unique to REST; any good technology/idea (Agile, TDD etc.)
gets its share of followers who think that <insert favorite idea> is the
best thing since sliced bread and that everybody should do as they do or
else.
The real ugliness comes from the misusers. There’s a lot of
misunderstanding. The fact that REST over HTTP has become synonymous with REST
leads people to think that HTTP is REST. I recently read a REST
book review on Colin’s blog where “the author states that although
hypermedia is important in REST it isn't covered in the book because WCF has
poor support for it”, i.e. a book on REST which ignores one of the important
constraints of the style.
Other misuses include building an implementation that is GETsful (i.e. does
everything with HTTP GET), doing plain RPC where the URI is the command, doing
CRUD with HTTP verbs, etc.
The point is that REST seems simple but it isn’t. It requires a shift in
thinking (e.g. identifying resources, externalizing the state transitions etc.).
However, as noted above, done right it can be an important and useful tool in
your toolset.
If you recall, what I currently work on is a type of visual search engine.
In a nutshell, when we get a request (an image) we allocate a bunch of
algorithmic engines in a grid-like manner to process the image (e.g. try to
perform OCR or whatever). As it happens, we are developing the different
components using several different environments(*), e.g. the control bits run
on Windows (.NET) and most algorithms run on Linux (mostly C++).
The need for easy cross-platform communications and
extensibility, the resource nature of the solution and a few other
tidbits led us to design our solution in a RESTful manner.
If you are a .NET developer/architect you may know that implementing a RESTful
application in Windows Communication Foundation (WCF) used to mean jumping
through hoops. For instance, you had to go back to basics and use HttpRequest
and HttpResponse, handle the breakdown and parsing of URI hierarchies yourself,
not to mention fight with the bindings.
Fortunately this all changed with WCF 3.5. True, .NET doesn't have (to my knowledge anyway) something like RESTlets, but at least building REST on HTTP is pretty straightforward.
Consider for example the following excerpt:
[ServiceContract(Namespace = "http://paperlnx.Contracts/2007/12", Name = "ISessions")]
public interface ISessions
{
    [OperationContract]
    [WebGet(UriTemplate = "/Sessions/{sessionId}")]
    [ServiceKnownType(typeof(Atom10FeedFormatter))]
    SyndicationFeedFormatter ListSessionStatus(string sessionId);
    .
    .
    .
With these 6 lines of code you see the essence of the .NET 3.5 REST goodies:
- Integrated support for HTTP verbs - The sample above shows the support for
GET. You can get the other verbs almost as easily with the WebInvoke attribute.
To do that, simply specify the verb you want, e.g.
[WebInvoke(Method = "PUT")], [WebInvoke(Method = "DELETE")] etc.
- Support for URI templates - In a way not too far from Joe Gregorio's IETF draft, WCF supports the notion of providing a way to describe families of URIs. This is done using the UriTemplate class.
The WebGet and WebInvoke attributes also accept URI templates (through their
UriTemplate property) and map the template variables (the ones in curly
brackets {}) to the parameters of methods.
- Support for standard formats - You can use plain XML or you can choose to use
the RSS and Atom syndication formats. In its most basic form you just create a
SyndicationFeed and format it as an Atom feed, which is what we do for error
messages:
public static SyndicationFeedFormatter GenerateAtomError(string errormessage, string description, Uri location)
{
    SyndicationFeed feed = new SyndicationFeed(errormessage, description, location);
    return new Atom10FeedFormatter(feed);
}
Naturally you can also add items and element extensions to all elements (e.g. the feed or items).
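As a rough illustration of adding items and element extensions, here is a minimal sketch that builds a status feed with one item per status line. It is not our production code: the FeedSketch class, the BuildStatusFeed method, the status strings and the http://example.org/sketch extension namespace are all invented for the example. It assumes a reference to the assembly that provides System.ServiceModel.Syndication.

```csharp
using System;
using System.Collections.Generic;
using System.ServiceModel.Syndication;
using System.Xml;

public static class FeedSketch
{
    // Made-up namespace for the element extensions in this sketch.
    private const string ExtensionNamespace = "http://example.org/sketch";

    // Builds an Atom feed with one item per status line.
    public static SyndicationFeedFormatter BuildStatusFeed(
        string sessionId, IEnumerable<string> statuses, Uri location)
    {
        var items = new List<SyndicationItem>();
        foreach (string status in statuses)
        {
            var item = new SyndicationItem("Engine status", status, location);
            // Element extension on the item: extra data outside the Atom vocabulary.
            item.ElementExtensions.Add("sessionId", ExtensionNamespace, sessionId);
            items.Add(item);
        }

        var feed = new SyndicationFeed(
            "Session " + sessionId, "Status of session " + sessionId, location, items);
        // Element extension on the feed itself.
        feed.ElementExtensions.Add("itemCount", ExtensionNamespace, items.Count);
        return new Atom10FeedFormatter(feed);
    }

    public static void Main()
    {
        SyndicationFeedFormatter formatter = BuildStatusFeed(
            "42",
            new[] { "OCR: done", "Matching: running" },
            new Uri("http://localhost/Sessions/42"));

        // Write the Atom document to the console to inspect the result.
        var settings = new XmlWriterSettings { Indent = true };
        using (XmlWriter writer = XmlWriter.Create(Console.Out, settings))
        {
            formatter.WriteTo(writer);
        }
    }
}
```

The extension values are serialized as extra XML elements next to the standard Atom ones, so clients that only understand Atom can simply ignore them.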
All in all, I am a happy camper :) After all, when you make an
architectural decision, you always need to review it once you have opted for
an underlying technology. Even when a decision is right, the friction
caused by a technology which doesn't accommodate it well can both make
your life miserable and make a good decision bad. .NET 3.5 with its
newly added support for REST increases the architectural freedom and
that's always a good thing.
* Among other things, it helps us avoid the "Network is homogeneous" fallacy -
but that's another story :)
PaperLnx develops an advanced visual search solution for mobile
handsets based on computer vision and image understanding technologies
developed by Rafael. PaperLnx solves the cumbersome web surfing
experience on mobile handsets by enabling end users to send captured
images from their mobiles to retrieve relevant information for the
object photographed.
We now have a few open positions for the following profiles:
Senior Developer
We are looking for a highly motivated, resourceful and intelligent
developer. Good interpersonal and communication skills will be very
much appreciated. A team player. Broad thinking and problem solving
capabilities are also desired.
- At least 5 years of server side development with a thorough understanding
of Object Oriented principles and an understanding of architectural styles
and design patterns.
- Experience in multithreading and in distributed systems.
- .NET/Ruby experience
- Integration with C++ components
- Experience with O/R Mappers such as nHibernate/ActiveRecord etc.
- Video processing experience/ familiarity with video related protocols (H.324m, H.323 etc.) a plus
- Web experience (AJAX, CSS, ASP.NET) a plus
- Mobile Internet backend development a big advantage
- Understanding
of architectural constraints (security, availability, scalability etc.)
for internet scale platforms a big advantage
- Team leading skills – an advantage.
Algorithms Developer
We are looking for an experienced algorithm developer or an outstanding M.Sc. graduate with an image processing concentration. A team player interested in joining a young and dynamic firm.
- Experience developing image processing algorithms using Matlab
- Computer vision/Image processing experience a must
- Experience transitioning algorithms from Matlab to software (C++/C/C#/F#) a big plus
- Familiarity with video compression protocols a plus
- Experience with performance tuning and scaling optimizations
If you think you qualify and want to join a promising startup you can send your CV (hint: my first name at paperlnx .com) or call me at 052-3331027
Arnon
Microsoft uses the "live labs" to release all sorts of test balloons. Sometimes we get really nifty stuff like Photosynth or SeaDragon. Unfortunately, sometimes we get not so bright ideas like Volta.
Ok, so what is Volta? Here's what the project's homepage has to say (emphasis mine):
"The Volta technology preview is a developer toolset that enables you to
build multi-tier web applications by applying familiar techniques and
patterns. First, design and build your application as a .NET client
application, then assign the portions of the application to run on the
server and the client tiers late in the development process. The
compiler creates cross-browser JavaScript for the client tier, web
services for the server tier, and communication, serialization,
synchronization, security, and other boilerplate code to tie the tiers
together.
Developers can target either web browsers or the CLR as clients and Volta handles the complexities of tier-splitting for you.
Volta comprises tools such as end-to-end profiling to make
architectural refactoring and optimization simple and quick. In effect,
Volta offers a best-effort experience in multiple environments without
any changes to the application."
The idea sounds very compelling - I kid you not. So what's the problem?
The first issue is that, as a platform/framework (MS would say factory),
Volta tries to accomplish too much. On the one hand Volta is another go
at the web/desktop convergence trend.
On the other hand it is supposed to be a solution for "painless"
tier-splitting. Both of these tasks are very heavy. My opinion is that
the Single Responsibility Principle (while originally defined for objects) applies here, and Volta should choose one thing and try to excel in that.
What's more disturbing to me is the automatic handling of the "complexities of tier-splitting". Here's another excerpt from the Volta site which further explains the "tier-splitting" concept:
Objective
We have an application that runs in a monolithic environment, say the
browser. We want parts of this application to run in other
environments, such as servers. We don’t want to litter the application
with plumbing code.
Rationale
The standard
techniques for distributed applications infuse our code everywhere with
information about what parts run where. This makes the code hard to
change. Typically, once we make these decisions we can’t change them
because it is too expensive. However, environments, requirements, and
performance profiles change and we’re stuck with applications that
can’t adapt to new realities. We need to separate the concerns about
what the application does from the concerns about where parts of the
application run.
Without Volta, we are forced to decide
where code runs before we know everything it is going to do, in
particular before we know the communication frequencies and delays.
Development methodologies force us to make irreversible decisions too
early in the application lifecycle. Volta gives us the means to delay
decisions until we have adequate information to base them on.
Recipe
Volta tier splitting automates the creation of the communication
plumbing code, serialization, and remoting. Simply mark classes or
methods with a custom attribute that tells the Volta compiler where
they should run. Unmarked classes and methods continue to run on the
client.
We may base our decisions about tier assignment on
any criteria we like, such as performance or location of critical
assets and capabilities. Because Volta automates boilerplate code and
processes for dispersing code, it is easy for us to experiment with and
change assignments of classes and methods to tiers.
Wow, Agile development at its best, allowing us to postpone architectural
decisions. That just sounds too good to be true. Well, the problem is
that it is too good to be true. Abstracting the network out,
and providing location transparency without thinking about the
implications of distribution, is the reason "distributed objects"
failed. For example, here is what Harry Pierson (DevHawk)
had to say about distributed objects:
"...back in 2003, mainstream platforms typically used a distributed object approach
to building distributed apps. Distributed objects were widely
implemented and fairly well understood. You created an object like
normal, but the underlying platform would create the actual object on a
remote machine. You'd call functions on your local proxy and the
platform would marshal the call across the network to the real object.
The network hop would still be there, but the platform abstracted away
the mechanics of making it. Examples of distributed object platforms
include CORBA via IOR, Java RMI, COM via DCOM and .NET Remoting. The
(now well documented and understood) problem with this approach is that
distributed objects can't be designed like other objects. For
performance reasons, distributed objects have to have what Martin
Fowler called
a "coarse-grained interface", a design which sacrifices flexibility and
extensibility in return for minimizing the number of cross-network
calls. Because the network overhead can't be abstracted away,
distributed objects are a very leaky abstraction."
So here comes Volta and tells us to just put a [RunAtOrigin] attribute on the
code you want on another tier, and if you don't like that you can change
it to another place in your application and what not. Note that the
notion that you can automate some or maybe even all of the distribution
"boilerplate" code may be viable. The problem is in the premise that
you can seamlessly move that boundary around. There's a fundamental
difference between tiers and layers.
Tiers should be treated as a boundary. Volta designers do talk about Security but they seem to forget a few of the other
fallacies of distributed computing...
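To illustrate why moving a tier boundary is not free, here is a small sketch (my own, not Volta code) that counts round trips for a fine-grained interface versus a coarse-grained one. The RemoteCustomerProxy class and its data are hypothetical, and every method call on it stands in for one network hop.

```csharp
using System;

// Plain data holder returned by the coarse-grained call.
public sealed class CustomerDto
{
    public string Name;
    public string Street;
    public string City;
}

// Hypothetical proxy for an object living on another tier.
// Each method call stands in for one network round trip.
public sealed class RemoteCustomerProxy
{
    private readonly CustomerDto _remote =
        new CustomerDto { Name = "Ada", Street = "1 Main St", City = "Springfield" };

    public int RoundTrips { get; private set; }

    // Fine-grained interface: natural for a local object, chatty across a tier.
    public string GetName() { RoundTrips++; return _remote.Name; }
    public string GetStreet() { RoundTrips++; return _remote.Street; }
    public string GetCity() { RoundTrips++; return _remote.City; }

    // Coarse-grained interface: one call returns everything the caller needs.
    public CustomerDto GetCustomer()
    {
        RoundTrips++;
        return new CustomerDto { Name = _remote.Name, Street = _remote.Street, City = _remote.City };
    }
}

public static class TierBoundaryDemo
{
    public static string FormatChatty(RemoteCustomerProxy proxy)
    {
        return proxy.GetName() + ", " + proxy.GetStreet() + ", " + proxy.GetCity();
    }

    public static string FormatCoarse(RemoteCustomerProxy proxy)
    {
        CustomerDto customer = proxy.GetCustomer();
        return customer.Name + ", " + customer.Street + ", " + customer.City;
    }

    public static void Main()
    {
        var chatty = new RemoteCustomerProxy();
        FormatChatty(chatty);
        Console.WriteLine("chatty round trips: " + chatty.RoundTrips);

        var coarse = new RemoteCustomerProxy();
        FormatCoarse(coarse);
        Console.WriteLine("coarse round trips: " + coarse.RoundTrips);
    }
}
```

Both methods produce the same string, but the first crosses the boundary three times and the second once. Code written against the fine-grained interface does not become well behaved just because an attribute moved the class to another tier.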
While everybody and their sister (especially in this blogging community) is busy celebrating the release of VS2008, a more interesting* release happened today: the first community release of Ruby.NET (version 0.9). This is another step in the languages trend I discussed here a few weeks ago.
The release is said to have a lot of improvements; however, Ruby.NET
isn't running Rails just yet. Hopefully we'll have that soon. Another
thing I would love to have is to use Ruby's testing frameworks like Mocha to test .NET classes (which already works for Java and JRuby). Well, I am off to test that now :) - you can do that too, if you download the bits.
By the way, you may also want to read the paper discussing the design of Ruby.NET (by Wayne Kelly and John Gough, who started this project).
*my blog - my opinions :)
I received a number of inquiries regarding PaperLnx following the help we rendered to Yediot in enhancing the quality of the video of Yitzhak Rabin's assassination (The link is to a site in Hebrew).
Our business is not video enhancement per se. What we do is use this
and other similar proprietary technologies to provide a form of visual
search for pictures taken on mobile phone cameras. "Surfing" on a
mobile phone is not a good user experience; typing URLs and search
terms is cumbersome and lengthy. Using these image understanding
capabilities we enable end-users to get at the relevant multi-media
content for the object they are interested in (the object in the
picture taken). This type of application is called a "physical world
connection" solution. Naturally we are not alone in this space, but
we think that the technology we have gives us a competitive edge in the
robustness of our solution.
We are now at the stage where the technology is pretty solid and we
"only" need to turn this into a product, which is, as I mentioned in
another post, why we are looking for a few good men (and/or women) to join us.
One question I don't hear asked
too much is "who tests the tests?" After all, we are writing all this
additional code. If we write so many bugs in our production code that
we need tests, what are the chances the test code is clean?
The current answer I have is that the code, the tests and the acceptance
tests all test each other, so if one fails we'll spot the problem in at
least one of the others. I hope that this is a good enough answer... :)
What do you think?
Pete Lacey has a post called "What is SOA?" where he defines SOA as follows:
"- Network Oriented Computing (NOC): An approach to computing that makes
business logic available over the network in a standardized and interoperable
manner.
- Service Oriented Architecture (SOA): A technical approach to NOC that has a
non-uniform service interface as its principle abstraction. Today, SOAP/WS-*
is the chief implementation approach.
- Resource Oriented Architecture (ROA): A technical approach to NOC that has an
addressable, stateful resource as its principle abstraction. Today, REST/HTTP
is the chief implementation approach.
- Business Service Architecture (BSA): An unnecessary term (also not an
architecture) that tries to make the obvious something special. Aka, business
analysis. Aka, requirements gathering"
I am sorry but I beg to differ.
The first thing to note (again) is the
architecture vs. architecture style differentiation I mentioned in a previous post (you can
see a similar definition by Stuart Charlton). Here is a quick reminder:
Software architecture
is the collection of the fundamental decisions about a software
product/solution designed to meet the project's quality attribute
requirements. The architecture includes the main components, their main
attributes, and their collaboration (i.e. interactions and behavior) to
meet the quality attributes. Architecture can and usually should be
expressed in several levels of abstraction (depending on the project's
size).
An Architectural style is a blueprint
that can be used when you design an architecture. An
architectural style defines some of the components and their attributes
as well as places constraints on how they can interact.
My claim is that SOA is an architectural style for distributed computing
which puts extra emphasis on the interface (and hence gets the easier
interoperability). Ok, if SOA is indeed an architectural style, we
should be able to define it as a set of components, interactions and
attributes. Well, I already did that a while ago (in a paper called "What is SOA anyway?"). And while it may not be perfect, I think it is a reasonable definition all the same:
"SOA
is an architectural style for building systems based on interacting
coarse grained autonomous components called services. Each service
expose processes and behavior through contracts, which are composed of
messages at discoverable addresses called endpoints. Services’ behavior
is governed by policies which can be set externally to the service
itself. "
You can see the above mentioned paper for a little more detail on each of the components.
ROA, in my opinion, is just a re-branding of REST so that it would be easier
to discuss it as an architectural style and not connect it to the HTTP
implementation - which is what a lot of REST proponents are doing.
By the way, as I pointed out before,
there are a few other important architectural styles that are related
to distributed systems like Event driven architecture, Space-based
architecture, peer-to-peer etc.
As for "Business Service
Architecture" - I personally like to think about that as an "SOA
initiative", as in the strategic decision to try to implement an SOA in
an organization while trying to achieve the more nebulous traits like
business and IT alignment etc. (which is why it is neither architecture
nor architecture style).