12 December 2018

When standards aren't... aka how do I deal with multiple file attachments in InfoPath

As we all know, Microsoft has been proclaiming InfoPath to be "dead" for many years now, but it wasn't until recently, with PowerApps, that we saw any glimmer of hope for a replacement.  As such, InfoPath lives on and gets migrated from one version of SharePoint to the next, version after version after version...

One of the things that made InfoPath special was the ability for users and designers to tinker with the template, move stuff around, and add or remove fields.  SharePoint would maintain all historic schemas, so that any older content created with a given schema could be opened with that schema when needed.
Enter migration time.  Cue the dramatic music please.

All migration tools, that's right, every single last one of them, when migrating InfoPath libraries, will only migrate the latest .xsn schema, which of course means that ALL forms created with older schemas are now broken!  This stems from the fact that migration tools use the APIs provided by Microsoft, and those APIs do not provide access to older schemas.
Being the industrious little problem solver that I am, I wrote my own tool to work around this problem.  The tool simply downloads the latest .xsn from the target library, extracts the template.xml and manifest.xsf for parsing, and then iterates the schema node by node, seeking matching nodes in the original source XML data and reconstructing the new target XML on the fly.  Upon completion the new XML document is uploaded to the target and is fully usable as is.  This works GREAT!
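
To make the idea concrete, here is a minimal Python sketch of that kind of node-by-node rebuild.  It is not the actual tool: the helper name rebuild_from_template and the sample field names are made up for illustration, and it ignores namespaces, nested groups and the manifest.xsf entirely.

```python
import copy
import xml.etree.ElementTree as ET


def rebuild_from_template(template_root, source_root):
    """Return a copy of the template filled with values from the source form.

    Hypothetical helper: for each top-level field in the template, look up
    the first source node with the same tag and copy its text across.
    """
    target = copy.deepcopy(template_root)
    for field in target:
        match = source_root.find(field.tag)  # first matching node only
        if match is not None:
            field.text = match.text
    return target


# The latest schema has three fields; the old form only knows two of them,
# and stores them in a different order.
template = ET.fromstring("<myFields><Title/><Owner/><Priority/></myFields>")
source = ET.fromstring(
    "<myFields><Owner>Pat</Owner><Title>Budget</Title></myFields>"
)

rebuilt = rebuild_from_template(template, source)
print(ET.tostring(rebuilt, encoding="unicode"))
```

The result follows the template's field order, carries over the values the old form had, and leaves the new field empty.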

Then we encountered multiple file attachments.
Now I don't know if you've ever ventured down the rabbit hole that is InfoPath File Attachments, but let me tell you... it's a mess!  What little information is available on the web is old, outdated and hardly applicable.  Oh and did I mention nobody, and I mean NOBODY, talks about dealing with MULTIPLE FILE ATTACHMENTS!!!

The first glimmer of hope I could glean from scouring the internet was this Microsoft support article titled "How to encode and decode a file attachment programmatically by using Visual C# in InfoPath 2003".
Right in the introduction, the article had a reference pointing to an updated version titled "How to encode and decode a file attachment programmatically  by using Visual C# in InfoPath 2010 or in InfoPath 2007".
I read both to be sure.  There's very little difference from a code perspective between the two, but I'll reference code from the 2010 example here.

The article first walks you through the InfoPathAttachmentEncoder() class, which is pretty straightforward, but it is important to note that in line 45 they are reading the file to be attached as Unicode.  In line 78 the ToBase64Transform is used to convert the Unicode binary of the file on disk to text in Base64 format, which can fit in XML.  Finally in line 93, the memory stream is grabbed as ASCII text and returned to the caller to be inserted into the InfoPath form as an attachment.

The decoder class that follows is the same except in reverse, taking in the base64, ASCII data and then converting it into a byte array in a memory stream in lines 25/26.  A BinaryReader() then reads the memory stream to decode the attachment.
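
For anyone who wants to poke at the format outside of .NET, here is a minimal Python sketch of the same encode/decode round trip.  The header layout (4 signature bytes, header size, version, a reserved field, file size, file name length, then the file name as null-terminated UTF-16) is my reading of those support articles, so treat it as an assumption rather than a spec.  The function names are my own.

```python
import base64
import struct

# Signature bytes at the start of an encoded attachment (assumed from the
# Microsoft sample code).
SIGNATURE = bytes([0xC7, 0x49, 0x46, 0x41])


def encode_attachment(file_name: str, data: bytes) -> str:
    """Wrap raw file bytes in the attachment header and Base64 encode it."""
    name_bytes = file_name.encode("utf-16-le") + b"\x00\x00"
    header = SIGNATURE + struct.pack(
        "<5I",
        0x14,                  # header size
        1,                     # version
        0,                     # reserved
        len(data),             # size of the file content in bytes
        len(name_bytes) // 2,  # name length in UTF-16 units, incl. terminator
    )
    return base64.b64encode(header + name_bytes + data).decode("ascii")


def decode_attachment(b64_text: str):
    """Return (file_name, data) from the Base64 text of an attachment node."""
    raw = base64.b64decode(b64_text)
    if raw[:4] != SIGNATURE:
        raise ValueError("not an InfoPath attachment header")
    # File size and name length sit right after the 16 fixed header bytes.
    file_size, name_units = struct.unpack_from("<II", raw, 16)
    name_start = 24
    name_end = name_start + name_units * 2
    file_name = raw[name_start:name_end - 2].decode("utf-16-le")
    data = raw[name_end:name_end + file_size]
    return file_name, data


encoded = encode_attachment("abc.doc", b"hello world")
name, content = decode_attachment(encoded)
print(name, len(content))
```

The useful part for migration work is the decoder: it lets you confirm that whatever ends up in the target XML still unpacks to the original file name and bytes.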

They never delve into dealing with multiple attachments at any point.  In fact, the part of the code that touches the attachment in the XML is very small and always assumes a single file attachment, as can be seen in the code to save the attachment thus:

As can be seen from line 8, the use of the SelectSingleNode() method assumes only one file attachment.  So this helped me understand the magic behind how the documents are encoded and stored in the XML of the InfoPath document, but it still wasn't giving me any glimpses into dealing with multiple attachments.  Further searching brought me to the BizSupportOnline.net site and in particular their "Top 10 questions about InfoPath attachments" article.  Eureka!  I thought.  This has to be it.  They had multiple questions with answers and links to videos that would hold the key information I sought... except... their video links were all dead. 😡
My heart sank to new lows as my struggle entered day two.  I stumbled upon this article from Kunaal Kapoor titled "Getting InfoPath attachments from a submitted form" wherein he explains how it could be done using the XSD Visual Studio tool to build classes off the InfoPath template.  This seemed like a major departure from my current implementation, but desperate times you know...
So I tried it and as I suspected, it didn't work.  OK, back to my former path.  I know it's in there in the XML, I just have to find it.

OK, back to basics.  Rule #1 of debugging.  NEVER ASSUME ANYTHING!
So let's check, double check and triple check all our bases again.
Let's look at the attachment definition in the form.

OK, it's a simple field control, of type Picture or File Attachment (base64) with the Repeating option checked.  Nothing out of the ordinary here.
Next I decided to compare the source and result files character by character, line by line.  Time for my beloved Beyond Compare!  If you don't have this tool in your toolbox, you're seriously missing out on productivity.  The folks at Scooter Software do an awesome job with this gem of a tool.  A definite must have for every geek.
As I did my comparison and got to the end of the first attachment, this is what I noticed:

On the left is the original source file that had 3 attachments.  On the right is the processed result with just one.  Obvious, right?  So then the question is... why do we only see one Attachment node in the code during processing?  The answer lies in a presupposition: that repeating data in standard XML is wrapped in a single parent collection node, with the repeating items inside it, thus:

    <File Name="abc.doc"/>
    <File Name="xyz.doc"/>
    <File Name="ugh.doc"/>

Instead, InfoPath chose to represent multiple attachments in this case as repeated sibling nodes with the same name, with no collection node around them:

  <Attachment Name="abc.doc"/>
  <Attachment Name="xyz.doc"/>
  <Attachment Name="ugh.doc"/>

As a result, when my code was processing the child nodes of the XML document, looking for a match against the schema in order to rebuild the new target document, it found the first Attachment node, was satisfied, and moved on.  If the attachments had been represented in a standards-based collection node, all the files would have been caught.
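
The difference is easy to reproduce.  Below is a small Python sketch, not the code from my tool; the myFields root and the two helper names are made up for illustration.  One helper grabs the first matching node, the other grabs every sibling with the same name.

```python
import copy
import xml.etree.ElementTree as ET

# Three attachments stored as same-named siblings, with no wrapper node.
source = ET.fromstring(
    "<myFields>"
    '<Attachment Name="abc.doc"/>'
    '<Attachment Name="xyz.doc"/>'
    '<Attachment Name="ugh.doc"/>'
    "</myFields>"
)


def copy_first(source_root, tag):
    """Copy only the first node with this tag (the behaviour that bit me)."""
    target = ET.Element(source_root.tag)
    match = source_root.find(tag)
    if match is not None:
        target.append(copy.deepcopy(match))
    return target


def copy_all(source_root, tag):
    """Copy every sibling node with this tag, preserving document order."""
    target = ET.Element(source_root.tag)
    for match in source_root.findall(tag):
        target.append(copy.deepcopy(match))
    return target


naive = copy_first(source, "Attachment")
fixed = copy_all(source, "Attachment")
print(len(naive), len(fixed))  # 1 3
```

In .NET terms this is the same distinction as SelectSingleNode() versus SelectNodes(): once a field can repeat, every lookup against the source has to expect a list.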

It just goes to show you.  Rule #1 is so true.  Never assume anything.  Not even when the source is a large, reputable, multi-billion dollar company. 😎

If you'd like to leverage the code I wrote to solve this, I posted a follow-on article the next day that can be found here:

Happy coding
