
Saturday, 24 August 2013

Why I Don't Like Frameworks and XML Configuration

One of the comments on my Singleton Pattern blog post simply gave the name of a framework followed by a question mark. My response:

"What about it? :) There are 101 frameworks out there. IME they fall under two categories: Costly, and Poorly Maintained. If they're not Poorly Maintained now, they probably will be later. There can also be resistance in corporate environments to using anything 3rd-party regardless of how good it is, and there are plenty of discussions to be had on distributed library size and closed-source unmaintainability (and the issues around merging 'official' fixes with 'private' tweaks in open-source libraries.)

You could use Microsoft Enterprise Library for Logging, but I'll be doing a blog post on that and Microsoft's Unity DI crap with bits about why having 20,000 lines of app config XML is a Bad Thing. The same applies to WCF."

I aim now to explain in further detail.

API familiarity


Ben Ellis also commented that there are plenty of other ways of achieving the same result. An example he gave was to use Service Locator. There are reasons why you wouldn't want to use Service Locator - although there are counter-arguments that point out that it does come down to how you use the Service Locator. My issue, on the other hand, is impure intent. Service Locator and Unity are two separate technologies that do roughly the same thing. They can be forced to interact, but looking at that link raises one of the most important points I want to make here: "Make sure that you understand the API and code that you are calling. It may not always be black and white."

If you are working in a team, or are likely to be bringing in short-term contractors to work on the code, or even if someone else is likely to have to maintain your code, other people might not understand the API. Expecting them to learn all about that API when they've had an urgent production issue dropped on their desk is unfair, and the most likely outcome is that they will bend and twist whatever framework(s) you've chosen to make them behave the way they expect.

XML Configuration


It might seem like I'm picking on Unity a little here. I just did a quick Google search and found The Unity Configuration Schema - the size of this thing is incredible. Even if you split it out into a separate file (which always causes a bit of fun when people remove the reference or it gets missed out of a deployment package) it requires a huge effort to learn how the damn thing goes together, and again at short notice that's going to become tricky. When you've got a number of retail stores unable to trade because of a bug, time really is of the essence. I have Juval Löwy's Programming WCF Services on my desk so I can respond accurately to questions about The WCF Configuration Schema - another monster I've actually developed some code to avoid having to deal with. The problem with having a lot of XML Configuration is that it's difficult to read.

Imagine the following scenario:

- A Junior Developer has been given the task of maintaining an overnight process
- The Junior Developer deploys some new code, but amongst the thousands of lines of XML, has forgotten to update one of the service references to point to a Production server
- The code runs that night at 02:00 and flags up an error to The Tech Support Guy who gets paid to make sure none of the red lights come on.
- That Guy thinks "That's odd, but there was a deployment so I'll have a look and see if there's anything obvious." He's faced with thousands of lines of cryptic XML and hurriedly closes the file.
- He phones the Developer who gets out of bed, fires up their laptop and signs into the corporate network so they can get a copy of the configuration file as it is in Production.
- 30 minutes later, a tired Junior Developer, not in the best frame of mind for looking at code, finally locates the erroneous setting.
- The Developer emails The Support Guy with the fix - replace "http://testserver/" with "http://prodserver/".
- The Support Guy does so - but also accidentally deletes the opening quote from the bad setting, his view obscured by ragged line lengths and settings interspersed by hierarchical elements.
- The code still doesn't work.
- They agree to come back to it in the morning, and the Junior Developer loses more sleep.

Now let's say we've abolished complex XML (my WCF configuration has App Settings like IService_Host) and replay the scenario:

- A Junior Developer has been given the task of maintaining an overnight process
- The Junior Developer deploys some new code, but - as unlikely as it now is - forgets to change one of the few settings in the XML file.
- The code runs that night at 02:00 and flags up an error to The Tech Support Guy who gets paid to make sure none of the red lights come on.
- That Guy thinks "That's odd, but there was a deployment so I'll have a look and see if there's anything obvious." He opens the config file, and notices that one of the _Host settings is wrong. He knows what "Host" means, being a network guy.
- He phones the Developer and says "There was a problem. I looked at the config file and one of the Host settings says testserver. Should I change it to prodserver and re-run the program?"
- The Developer says "Oh, sorry. Yes, please change that."
- The configuration gets changed, the process runs, the Developer gets a good night's sleep.
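
To make the second scenario concrete, here is a minimal sketch of the flat-settings idea. It is written in Python rather than .NET purely for illustration, it is not my actual WCF code, and the `key=value` layout, the `IService_Port` setting, and the `parse_settings` and `build_endpoint` helpers are all assumptions made up for the example. Only the `IService_Host` naming comes from the real configuration.

```python
# Hypothetical flat settings: one "<Contract>_<Field>" key per value.
# Only the IService_Host name is real; the rest is assumed for illustration.
SETTINGS_TEXT = """\
# Overnight process settings
IService_Host=prodserver
IService_Port=8080
"""


def parse_settings(text):
    """Parse 'key=value' lines into a dict, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def build_endpoint(settings, contract, path="/"):
    """Build a service URL for one contract from its flat settings."""
    host = settings[f"{contract}_Host"]           # required setting
    port = settings.get(f"{contract}_Port", "80")  # optional, defaults to 80
    return f"http://{host}:{port}{path}"


settings = parse_settings(SETTINGS_TEXT)
endpoint = build_endpoint(settings, "IService")
```

The point is that the file someone has to read at 02:00 is two obvious lines, and the code builds everything else from them.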

Frameworks try to be All Things to All Men


Frameworks very often try to cover all bases. They try to be configurable in many, many ways so as to cater for the many, many ways people want to work. That means they are open to abuse by people who don't know how to configure them properly and misuse by those who don't have the time to understand how they're meant to be configured. This comes back to maintainability, too. If you have an urgent problem, you're likely to simply work around the framework not doing what you need, even when the real cause is misconfiguration. That workaround won't ever get changed.

Frameworks try to use their own Language


Enterprise Library has "Blocks". Someone decided that instead of "Assemblies", "Namespaces" or "Classes", Enterprise Library should introduce a new concept. These "Blocks" are described by Microsoft as "Reusable Software Components" - so what was wrong with using the word "Components"? The Logging "Block" has "Listeners" while in the traditional logging paradigm one would call a method that Pushes a Log item to wherever it's going. Indeed, you still use the Enterprise Library Logging Block in this way.

Entity Framework has "Entities" rather than "Classes", "Tables", "Structures" or even "Object Definitions" - terms that pretty much any developer would understand. It goes further. It has "Associations" instead of "Relationships". There's an enum of "Primitive Types" that map directly to already defined CLR types - and, of course, the already present DbType enumeration.

NHibernate uses a "Session" instead of a "Connection" - I can see how the concept of a "connection" is somewhat outdated, but it's terminology that will be understood by a wider audience.

This introduction of new terms does nothing to help those who need to pick up some code and fix it fast. It does nothing to improve on-boarding speed for developers new to a project.

Frameworks have unknown side-effects


The Enterprise Library Logging "Block" (see above comment about Language) has LOADS of built-in "Listeners". We don't know (without using a tool like dotPeek) how good the code is, and even if we open it up, it might be calling code we can't get or could be too complex to delve into properly. To take Logging as an example, I've written my own because I need there to be NO thread blocking when I'm writing a log entry. I need to log and return as fast as possible. I do this by calling a static class which stores the entry in a Queue which gets drained on another thread. The point is to remove any delay that writing to disk might cause. I don't know what Enterprise Library does internally. Someone could write a new Listener that tries to write to a database - which would seriously spanner the performance of the process that's trying to log.

There was a time when we had a service timing out in the middle of the day, around 2pm. It was fine the next morning. It turned out Enterprise Library was creating a file so large that the sum of the time taken to perform all the logging was exceeding 30s.
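
The queue-and-drain approach I described for my own logger can be sketched roughly like this. It's in Python rather than .NET and is an illustration of the idea, not my production code: the `QueuedLogger` class, the `sink` callable and the `None` sentinel are all names and choices made up for the example, and it uses an instance where my real code uses a static class.

```python
import queue
import threading


class QueuedLogger:
    """Callers enqueue entries; a background thread does the slow write."""

    def __init__(self, sink):
        self._sink = sink              # callable that performs the slow write
        self._entries = queue.Queue()  # unbounded, so put() never waits for space
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def log(self, message):
        # Returns as soon as the entry is queued; no I/O on the caller's thread.
        self._entries.put(message)

    def _drain(self):
        # Runs on the background thread and writes entries in arrival order.
        while True:
            message = self._entries.get()
            if message is None:  # sentinel queued by close(): stop draining
                break
            self._sink(message)

    def close(self):
        # Entries queued before the sentinel are written before the thread exits.
        self._entries.put(None)
        self._worker.join()


written = []
logger = QueuedLogger(written.append)  # a list stands in for the log file
for i in range(3):
    logger.log(f"entry {i}")
logger.close()
```

Whatever the sink does, whether that's disk, network or a database, the cost lands on the draining thread rather than on the process that's trying to log.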

To divert from Logging, we had a case where the team lead decided to use his friend's project, a very basic ORM layer that was in active development at the time. By the time we found the performance limitations and memory leak issues, it had been abandoned and the code was unavailable to us. Another, more common ORM layer - Entity Framework - has particularly poor performance when you're trying to update a large number of records. It's not really designed for handling 300,000 records at once, since it attempts to "observe" changes and manage the persistence of those changes. Writing one record at a time was slow because of the setup & teardown cost in the Save method. Writing all 300,000 records in bulk was far too unwieldy. We had to save them in batches, in a manner discussed in this StackOverflow post.
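
The batching idea itself is simple enough to sketch. This is a language-neutral illustration in Python, not the Entity Framework code from that post: the `batches` and `save_in_batches` helpers, the `save_batch` callback and the batch size of 1000 are all assumptions for the example.

```python
from itertools import islice


def batches(records, size):
    """Yield successive lists of at most `size` records."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def save_in_batches(records, save_batch, size=1000):
    """Call save_batch once per batch and return the number of batches written."""
    count = 0
    for batch in batches(records, size):
        # Hypothetical callback: pays the setup and teardown cost once per batch
        # instead of once per record or once for the whole set.
        save_batch(batch)
        count += 1
    return count


saved_sizes = []
batch_count = save_in_batches(range(2500), lambda b: saved_sizes.append(len(b)), size=1000)
```

The right batch size is something you have to measure; it sits somewhere between paying the Save overhead on every record and asking the ORM to track everything at once.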



Of course, there are times that it DOES make sense to use a framework, but like design patterns, you should use a framework because you believe it's going to enhance the development of the code in some way. Not because you haven't used it before, and not because you want to learn it.
