It’s that time of the morning... that wondrous time when I hope that my final round of testing with my brand-spanking-new component is going to work... maybe.
And all this because there seems to be a series of hoops that one has to jump through to make any Microsoft Office COM component operate smoothly. And by smoothly I mean, in particular, convincing the COM component to let go of any and all resources it was consuming, and to quit. These two things are surprisingly hard to achieve.
It has been a night fraught with dangerous code refactoring, extensive use of Marshal.ReleaseComObject and Marshal.FinalReleaseComObject, AppDomains and various other craziness. I thought I had it at one stage, when I loaded the component's assembly into a temporary AppDomain, taking great care not to let anything cross the domain boundary, so that no assembly could leak into the main domain and everything would be cleaned up when I unloaded the AppDomain. The long and the short of that approach is that it didn’t help.
(WARNING: to save time for those looking for some direction because they are having similar issues, let me just stop you right here and tell you up front: I hacked it. The working solution I came up with was quick, dirty, and above all nowhere near elegant. But it works! You do these things when a demo to the boss is looming first thing the next morning and it's just hit 02:30.)
Now, to the background of my tale of woe... We have a Windows service application, developed by a 3rd party, that makes use of MODI (Microsoft Office Document Imaging). They use this component for OCR (Optical Character Recognition). MODI seems a fantastic alternative (far from perfect, but it gets the job done for the most part) for businesses out there who don't want to (or can’t) fork over the wads of moolah for those high-end OCR components - especially when they, almost certainly, already own a licence for Microsoft Office. The best part about it? It's actually rather good at what it does (right, enough about MODI now, perhaps I'll do a very quick post about it - not many people know about it as an alternative OCR engine you can drive from code).
However, this particular Windows service had issues (I am being very nice with that statement), and unfortunately required extensive reverse engineering and recompilation. One of the components not working correctly was the OCR (to give you an example, the vendor only OCR'ed the first page when looking for a specific text string, so if the client sent the fax with a cover sheet, it was all over). During our testing, I noticed errors popping up with the MODI component. These would happen after a few documents had been OCR'ed successfully (not necessarily with the text found, just passing through the engine). The errors were obscure-sounding ones about the COM server throwing this and that, and a good amount of GoG (Good ol' Googling) turned up nothing concrete.
Enter my "Bright Idea(tm)" of rewriting the OCR component. It should not have been a big deal. Just clean up the messy code and the COM component takes care of the rest, right? Well yes, it did the rest and more. Like holding onto the file it was supposed to OCR for dear life. After managing to pry the file handle out of MODI's tight little clutches by Marshalling my way to success and trying to make sure there simply could be no dangling references to MODI lying around... anywhere... ever… ever ever, I thought it was going to be clear sailing. Initial unit tests made me feel all warm and fuzzy, with those green lights of goodness burning bright, and watching the service chug away at a few documents and find once-lost reference numbers filled me with giddy glee.
That is, however, till I saw the RPC_E_SERVERFAULT errors (HRESULT 0x80010105, "The server threw an exception") popping up like crazy in the event log. The odd (or perhaps not so odd) thing was that if the service was restarted, MODI would once again begin to function correctly. That is, until a few documents had been processed and/or some random amount of time had passed.
Then the problems would line up. Take a number, stand in line.
Now, I am no stranger to “COM Interop” in .NET (but I am by no stretch of the imagination a guru), and I performed all the good COM housecleaning I could and then some, to the point of overkill. Try/catch blocks everywhere there should be, with finally blocks to make sure things were released after usage. Still nothing. Same issues. As I mentioned previously, I even went as far as to create a temporary AppDomain and try to load everything through there... Still the same errors.
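For the curious, the housecleaning pattern looks roughly like this. This is not my actual code, just a minimal sketch of the shape of it: the ComCleanup class and its Release and Use methods are names made up for illustration, and only the Marshal calls are real .NET API.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper (illustrative names): release COM wrappers deterministically.
static class ComCleanup
{
    // Releases obj if it is a COM wrapper. Safe to call with null or a plain
    // .NET object, in which case nothing is released and false is returned.
    public static bool Release(object obj)
    {
        if (obj == null || !Marshal.IsComObject(obj))
            return false;

        // Drops the runtime callable wrapper's reference count to zero in one go.
        Marshal.FinalReleaseComObject(obj);
        return true;
    }

    // Creates an object, runs work against it, and always releases it
    // in a finally block, even when work throws.
    public static TResult Use<T, TResult>(Func<T> create, Func<T, TResult> work)
        where T : class
    {
        T com = null;
        try
        {
            com = create();
            return work(com);
        }
        finally
        {
            Release(com);
        }
    }
}
```

The catch is that every intermediate COM object you touch (each collection, each item pulled out of a collection) gets its own wrapper and needs the same treatment, which is why the cleanup code ends up everywhere.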
Finally I cracked. Desperate times, desperate measures and all that. I pushed the OCR component out into an executable and now a whole new process is spawned when the OCR needs to happen - not just a thread. That way the entire process would close down and ensure (that's the theory anyway) all references were severed and objects cleaned up.
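Again, not the real thing, but a rough sketch of what "spawn a process per OCR job" can look like on the service side. Everything specific here is an assumption for illustration: the OcrProcessRunner name, the idea that the worker executable takes the document path as its only argument, and that it writes the recognised text to standard output and exits with 0 on success.

```csharp
using System;
using System.Diagnostics;

// Hypothetical wrapper (illustrative names): run the OCR worker as a child process.
static class OcrProcessRunner
{
    // exePath: path to the worker executable (assumed).
    // documentPath: passed to the worker as a single quoted argument (assumed contract).
    public static string Run(string exePath, string documentPath, int timeoutMs)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = "\"" + documentPath + "\"",
            UseShellExecute = false,        // required for output redirection
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            // Read asynchronously so a chatty or hung worker cannot block us
            // past the timeout.
            var outputTask = process.StandardOutput.ReadToEndAsync();

            if (!process.WaitForExit(timeoutMs))
            {
                process.Kill();
                throw new TimeoutException("OCR worker did not finish in time.");
            }

            if (process.ExitCode != 0)
                throw new InvalidOperationException(
                    "OCR worker failed with exit code " + process.ExitCode);

            // When the worker exits, the OS tears down its handles and COM
            // references, whatever state the component was in.
            return outputTask.Result;
        }
    }
}
```

The timeout matters: if the component wedges inside the worker, the service can kill that one process and move on to the next document instead of going down with it.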
The good news is that it works. It has continued to work for the last hour. I am overjoyed, to say the least. However, I really don't like it as a solution. I need to know why MODI was faulting in such a spectacular way. I just can't figure it out. If I had it in me, I would post some code, but right now I am too bushed.
I just hope I don't have to face that particular problem again for a little bit. And if I do, I would like it to be when I don't have a deadline looming over my head. At least, now, I don't have to tell the powers-that-be that it's completely broken this morning...
Little victories.