Synchronicity. Sometimes it feels like it’s more than just a 35-year-old album by The Police.
This week I was filling in a questionnaire about Glue42 from Norman & Sons, as part of their forthcoming survey of the Desktop Interop Space. Part of the questionnaire asked for a timeline for the 6+ year history of Glue42. One point that emerged was how our concept of Desktop Interop changed as we dealt with new requirements and problems resulting from growing Glue42 usage.
At the same time, one of our salespeople reported that a potential client could not understand why we made such a big deal about the breadth of Glue42’s native interop support. Surely Pub/Sub is all you need? As an example, one of their developers had implemented Request/Response in less than two days, so who cares whether Request/Response and the other interop primitives are natively supported by the product?
This made me recall the time I first met Pub/Sub with TIBCO RV, well over 20 years ago. The model was incredibly liberating, since it allowed so much more flexibility in designing systems, in particular the removal of the central server. Like so many stories of youthful excitement, the promised simplicity turned out to be a bit more complicated.
This piece can be used to help any newly minted network architect think about some of the issues their implementation may face over the coming months and years, and why we think Glue42 native support is important.
I restrict myself here to Request/Response, which was the first interop service offered in Glue42. As usage grew, our clients needed additional interop services such as Streaming, Shared Context, Channels and Window groups. These additional services are not discussed in this piece, but if there is sufficient interest I can extend this series to describe what they are and why our clients needed them.
The benefits of request/response interop
In general, companies use Desktop Interop to allow different applications to work together so that the user experience becomes simpler and more intuitive, and the user’s focus moves more easily between their applications. Interop may also be used to move to a more component-oriented development model. Whatever the motivation for interop, the basic requirement is to ‘send messages’ between applications.
Pub/Sub is now a very common networking model since it does not require applications to be marked as servers or clients. However, when developing applications, it is common to require some more ‘structure’ around messaging, and Request/Response is often the first step. An application (a server) publishes one or more ‘Methods’, and other applications (clients) can invoke (that is, call) those methods to ensure some action is carried out. In Desktop Interop, most applications are both servers and clients.
Thinking about the ‘two day implementation’ of Request/Response reported by our prospect, I remember our first implementation and the issues we have met and changes we have made over the last 6 years. This document describes some of the pitfalls I would expect their solution to encounter and some of the changes that may be required to produce a mature architecture that is both robust and scalable.
A typical first solution
The first Glue42 implementation of Request/Response, 6 years ago, used Pub/Sub messaging (running over RV and over Jabber messaging) and encoded the method name in the message subject.
So, for example, if I have a method ‘Sum’ that takes two parameters A and B, I might implement the server to subscribe to a message topic like ‘MYAPP.SUM’.
To invoke the method, a client publishes a message with the request parameters on the topic MYAPP.SUM.
To return the ‘answer’, the server sends a message back to the caller, using either a special caller subject or an inbox, depending on the capabilities of the Pub/Sub implementation.
The parameters are encoded using some simple schema designed by the Network Architect who defined the interop.
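To make this first solution concrete, here is a minimal sketch of the pattern. It is not the Glue42 code or the RV API: the `Bus` class is a toy in-process stand-in for a Pub/Sub transport, and the `_INBOX.n` reply subject, the `reply_to` field and the dictionary message schema are all assumptions made for illustration.

```python
import itertools
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class Bus:
    """Toy in-process stand-in for a Pub/Sub transport (hypothetical)."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, subject: str, handler: Handler) -> None:
        self._subs[subject].append(handler)

    def publish(self, subject: str, message: dict) -> None:
        # Deliver synchronously to every current subscriber of the subject.
        for handler in list(self._subs.get(subject, [])):
            handler(message)


_inbox_ids = itertools.count(1)


def serve_sum(bus: Bus) -> None:
    """Server side: subscribe to the method subject and reply to the caller."""

    def on_request(msg: dict) -> None:
        result = msg["a"] + msg["b"]
        bus.publish(msg["reply_to"], {"result": result})

    bus.subscribe("MYAPP.SUM", on_request)


def call_sum(bus: Bus, a: float, b: float) -> float:
    """Client side: publish the request and read the reply from a private inbox."""
    inbox = f"_INBOX.{next(_inbox_ids)}"  # assumed naming scheme for reply subjects
    replies: List[dict] = []
    bus.subscribe(inbox, replies.append)
    bus.publish("MYAPP.SUM", {"a": a, "b": b, "reply_to": inbox})
    # With no server running, replies is empty and this line raises IndexError.
    return replies[0]["result"]


bus = Bus()
serve_sum(bus)
answer = call_sum(bus, 2, 3)
```

Note how little the sketch handles: there is no timeout, no error reply and no answer to the question of what happens when nobody is listening on MYAPP.SUM.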
The demo looks great. You run the application implementing the Sum method (server), run the application calling the method (client), press a button on the client and the code is invoked in the server and the result returned to the Client.
This probably only took half a day, so plenty of time to make it production ready in the next day and a half. Well maybe not, so let’s think about the possible journey to maturity this solution might follow. By maturity I mean a robust and scalable implementation of Request/Response. As we will see, scalability is an open question: the largest Glue42 client has produced 800 Glue42 applications, and a typical user will have well over 30 applications available for use.
Some of the issues you may discover during the first stages of adoption of this first solution include:
1. Run different environments
Say you need to run Dev and UAT versions of your applications, and you need to keep their messages separate. Easy! Just add ‘structured’ prefixes to the subject for each environment.
It is true that as you start deploying this more widely, you will find the prefix becoming more complex, but the issues of traffic segmentation tend not to really bite until deployment has reached hundreds of users and tens of applications.
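As a rough illustration of the prefix idea, the helper below builds a subject from an environment, an application and a method name. The function name `subject_for` and the ENV.APP.METHOD layout are assumptions for this sketch, not a scheme taken from any real product.

```python
def subject_for(env: str, app: str, method: str) -> str:
    """Build a subject such as 'DEV.MYAPP.SUM' (layout is an assumption)."""
    parts = [env, app, method]
    for part in parts:
        # Reject empty tokens and tokens containing the separator, so that one
        # environment can never accidentally address another.
        if not part or "." in part:
            raise ValueError(f"invalid subject token: {part!r}")
    return ".".join(part.upper() for part in parts)


dev_subject = subject_for("dev", "myapp", "sum")
uat_subject = subject_for("uat", "myapp", "sum")
```

Every extra dimension you later need to segment by (region, user, desk) becomes another token in this prefix, which is where the complexity starts to creep in.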
2. Method detection
Discoverability is a key feature of good design, for example client apps will often want to discover whether a ‘method’ is available and only enable a button if it is.
There are multiple issues to consider here that will usually require extensions to the Request/Response code. For example, what happens if the server is started after the client? What happens if the server stops?
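One possible shape for this extension is a small registry that servers announce themselves to and clients watch. The sketch below is an assumption about how it could look, with hypothetical names (`MethodRegistry`, `announce`, `withdraw`, `watch`); in a real deployment the announcements would themselves travel over the Pub/Sub bus, and a crashed server would need heartbeats or similar to be detected.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Set

Watcher = Callable[[bool], None]


class MethodRegistry:
    """Tracks which servers currently offer each method (hypothetical design)."""

    def __init__(self) -> None:
        self._servers: DefaultDict[str, Set[str]] = defaultdict(set)
        self._watchers: DefaultDict[str, List[Watcher]] = defaultdict(list)

    def is_available(self, method: str) -> bool:
        return bool(self._servers.get(method))

    def watch(self, method: str, callback: Watcher) -> None:
        # Report the current state immediately, so it does not matter whether
        # the client or the server started first.
        self._watchers[method].append(callback)
        callback(self.is_available(method))

    def announce(self, method: str, server_id: str) -> None:
        self._update(method, server_id, present=True)

    def withdraw(self, method: str, server_id: str) -> None:
        self._update(method, server_id, present=False)

    def _update(self, method: str, server_id: str, present: bool) -> None:
        before = self.is_available(method)
        if present:
            self._servers[method].add(server_id)
        else:
            self._servers[method].discard(server_id)
        after = self.is_available(method)
        if before != after:  # notify only when availability actually changes
            for callback in self._watchers[method]:
                callback(after)


button_states: List[bool] = []
registry = MethodRegistry()
registry.watch("MYAPP.SUM", button_states.append)  # client starts first
registry.announce("MYAPP.SUM", "server-1")         # server appears later
registry.withdraw("MYAPP.SUM", "server-1")         # server stops
```

Here `button_states` stands in for the enabled state of the client’s button as the server comes and goes.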
3. Multiple servers for same method
- What happens if there are two servers offering the same method?
- Is this a config error or a legitimate deployment option?
- What options do you want to offer the client when they issue their request? Run the invoke on all servers, or on one server (and if so, which one)?
- When is the invoke complete? First server, all servers?
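One way to make these choices explicit is to have the client state its target when it invokes. The sketch below assumes two hypothetical policies, `Target.FIRST` and `Target.ALL`, and calls servers directly as in-process functions rather than over a bus; ‘first’ here simply means the first server that registered, which is only one of several reasonable answers to ‘which one?’.

```python
from enum import Enum
from typing import Callable, Dict


class Target(Enum):
    FIRST = "first"  # invoke only the first registered server
    ALL = "all"      # invoke every registered server


def invoke(servers: Dict[str, Callable[..., object]],
           target: Target, *args: object) -> Dict[str, object]:
    """Invoke a method on the chosen servers and return results keyed by server id."""
    if not servers:
        raise LookupError("no server offers this method")
    chosen = list(servers.items())  # dicts preserve registration order
    if target is Target.FIRST:
        chosen = chosen[:1]
    # The invoke is treated as complete once every chosen server has answered.
    return {server_id: handler(*args) for server_id, handler in chosen}


servers = {
    "app-a": lambda a, b: a + b,
    "app-b": lambda a, b: a + b,
}
first_only = invoke(servers, Target.FIRST, 2, 3)
everyone = invoke(servers, Target.ALL, 2, 3)
```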
4. Return values
- Does every invoke return a value?
- What happens if a server crashes while executing an invoke?
- What happens if a client tries to call the same method on the same server at high frequency?
- Is the second, pending call, queued or discarded?
Although user interactions and desktop interop have a slow message rate when compared to market data, overload and throttling can become an issue after recovery from service outages, or possibly just while a laptop recovers from hibernation.
Should you fail calls, or queue them and retry?
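To show how these questions turn into code, here is a sketch of one possible set of answers: every invoke returns a result envelope, a handler failure is reported to the caller instead of vanishing, and pending calls are queued up to a limit and discarded beyond it. The names `Result` and `BoundedInvoker` and the discard-when-full policy are assumptions; a detected failure in the handler is also much easier to deal with than a server process that really crashes, which would need a timeout on the client side.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Tuple


@dataclass
class Result:
    ok: bool
    value: object = None
    error: str = ""


class BoundedInvoker:
    """Queues pending calls up to a limit; calls beyond the limit are discarded."""

    def __init__(self, handler: Callable[..., object], max_pending: int) -> None:
        self._handler = handler
        self._max_pending = max_pending
        self._pending: Deque[Tuple[object, ...]] = deque()

    def submit(self, *args: object) -> bool:
        """Return True if the call was queued, False if it was discarded."""
        if len(self._pending) >= self._max_pending:
            return False  # the caller decides whether to retry later
        self._pending.append(args)
        return True

    def drain(self) -> List[Result]:
        """Run all queued calls in order, turning exceptions into error results."""
        results: List[Result] = []
        while self._pending:
            args = self._pending.popleft()
            try:
                results.append(Result(ok=True, value=self._handler(*args)))
            except Exception as exc:
                results.append(Result(ok=False, error=f"{type(exc).__name__}: {exc}"))
        return results


invoker = BoundedInvoker(lambda a, b: a / b, max_pending=2)
accepted = [invoker.submit(4, 2), invoker.submit(1, 0), invoker.submit(9, 3)]
results = invoker.drain()
```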
Adulthood – problems of success
Let’s imagine you manage to do enough to get your applications into production. You are typically only running a few apps. At this point people start liking Interop and therefore want more of it. Here are some of the challenges you might face as your solution enters adulthood:
A major bug, or a requirement for a must-have new feature, is often discovered as the number of applications increases. You make the changes to the Request/Response code and now you need to deploy the new version. No problem.
How does that get into the production apps? Does everyone include the same version from a common location? Or does each app have its own version?
Let’s hope that the change is backward compatible. And of course, don’t forget to implement the fix/enhancements into each of your language bindings!
And let’s hope that there is no problem if applications using different versions of your interop logic are running on the same Pub/Sub bus, because maybe you can’t force a restart at every user’s desktop, or there are problems patching the nine other apps that run your solution.
Just as well you don’t have 800 applications using your interop like some vendors.
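One common way to survive mixed versions on the same bus is to put a protocol version in every message and have receivers tolerate fields they do not know. The sketch below assumes a hypothetical envelope with `version`, `method` and `args` fields, and a rule that only the major version must match; none of this is taken from Glue42.

```python
from typing import Any, Dict

SUPPORTED_MAJOR = 1  # hypothetical major protocol version this build understands


def parse_request(envelope: Dict[str, Any]) -> Dict[str, Any]:
    """Accept any minor version of the supported major version.

    Unknown fields added by newer senders are ignored, and messages from
    senders that predate the version field are treated as version 1.0.
    """
    version = str(envelope.get("version", "1.0"))
    major = int(version.split(".")[0])
    if major != SUPPORTED_MAJOR:
        raise ValueError(f"unsupported protocol version {version}")
    return {"method": envelope["method"], "args": envelope.get("args", {})}


# A newer client (1.4) adds a field this receiver has never heard of.
newer = parse_request(
    {"version": "1.4", "method": "MYAPP.SUM", "args": {"a": 1, "b": 2}, "trace_id": "x"}
)
# An older client sends no version at all.
older = parse_request({"method": "MYAPP.SUM", "args": {"a": 1, "b": 2}})
```

The same rule then has to be implemented identically in every language binding, which is part of the cost described above.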
7. New language support
At this time, you probably have to increase the size of the team supporting core services, but “Hey, no problem, internal developers are practically free and think of the license fees you are saving!”
Maturity – Problems of scale
8. Supporting developers
At the start, the interop library is often written by the same developer who is working with the applications. As the interop becomes more successful, other developers need to start writing interop code.
No Problem. Your original developer probably created great documentation, sample applications and has plenty of time for support and training. Or maybe not, in which case just expand the core team with a few more resources for documentation and support, and don’t forget to add a project manager as the team size grows.
At some point, one of the new dev teams finishes its first app and starts testing it against existing applications. It doesn’t work. Just use the interop developer debug tools and inter-application test tools! Did your core team get around to writing those yet?
9. Multi machine interop
Some of your users have multiple desktop machines and want to interop between applications running on different machines.
Lots of ways to do this. Probably a good idea to start reading about the history of network architecture and maybe speaking to some people who worked with TIB RV back in the day and have seen the full majesty of a NAK storm across a production trading network on a very busy day.
Mid life crisis – Security
Eventually you have a great, scalable, robust interop mechanism for Request/Response. It may even support multi-machine interop. It has probably taken multiple developers to get here. Is their cost lower than the license fees you would have paid to a vendor? Maybe not, but it has been interesting work, good for our network architect’s resume and, of course, it is so much more flexible to build your own code.
10. Security model
Then comes the day when someone asks the killer question.
What is your security model?
How do you prevent rogue applications participating in your interop?
At which point you go to talk to your ‘technology historian’ and ask him how this was solved back in the TIBCO days.
The answer is surprising:
“Oh, you need to trust your deployment process (and pray that no rogue applications run on your network). But that was not such a big issue since every app on a trading network must be deployed onto corporate machines following strict build and test procedures.”
Probably true 20 years ago, but today when applications can be loaded from the public web, those assumptions can look outdated and a bit naïve.
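As one illustration of what a first step beyond ‘trust and pray’ might look like, the sketch below signs each message with a shared secret so that a receiver can reject messages from applications that do not hold the key. This is an assumption made for illustration, not a description of the Glue42 security model, and it leaves the hard parts open: any application holding the key can still sign anything, and distributing and rotating the key safely is a problem in its own right. The `sign` and `verify` helpers and the envelope layout are hypothetical; the `hmac`, `hashlib` and `json` modules are from the Python standard library.

```python
import hashlib
import hmac
import json


def sign(message: dict, key: bytes) -> dict:
    """Wrap a message in an envelope carrying an HMAC-SHA256 tag."""
    body = json.dumps(message, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return {"body": message, "tag": tag}


def verify(envelope: dict, key: bytes) -> bool:
    """Return True only if the tag matches the body under this key."""
    body = json.dumps(envelope["body"], sort_keys=True).encode("utf-8")
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, envelope.get("tag", ""))


shared_key = b"example-key-for-illustration-only"
envelope = sign({"method": "MYAPP.SUM", "a": 2, "b": 3}, shared_key)
tampered = {"body": {**envelope["body"], "a": 99}, "tag": envelope["tag"]}
```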
The problems given here can all be solved, but each solution to each problem has different costs and benefits, and the ‘right answer’ changes over time. The solution you come up with to get the first handful of applications into production may have very high costs in the future. The more applications using your Request/Response solution, the more expensive future changes are to deploy.
What can you do? You can take this (partial) list and use it to think about where you expect your interop requirements to be in 1 year and in 3 years, and use that to review your in-house solutions. Or, and here comes the sales pitch, you might ditch the ‘2 day’ solution and go to an experienced vendor. A vendor who has already travelled this road. A vendor who has also dealt with cloud deployments, inter-device interop and all the other features required for successful, large-scale deployments. A vendor like Glue42.
As an additional benefit, Glue42 gives you pre-packaged connectors for important 3rd party applications such as Outlook, Excel or Bloomberg, as well as other interop models like streaming and shared contexts.
For a free consultation on the process of implementing interop and a demo of our product you can write to us at email@example.com, and if you are of a certain age and haven’t listened to ‘Every Breath You Take’ for a while, check it out.
About the author
Leslie Spiro is one of the founders and CEO of Glue42. He is also a serial entrepreneur who has co-founded a number of significant capital markets product companies: SD&C, which delivered the first Windows-based Reuters workstation; Dealing Object Technology (DOT), which delivered RTT platform-independent real-time market data; and EasyScreen, a derivatives trading system that floated on the LSE. He is an active member of both OpenMama, part of the Linux Foundation, and FINOS, where he works with the FDC3 and Open Desktop program.