Introduction
Disclaimer
This is a retelling of a story from a bygone era, before 2019.
Twas a time untouched by the chaos brought upon us by the pandemic, with all of its lockdowns and the subsequent toilet paper shortages that pushed us to the brink of societal collapse.
Twas a time when generative AI, for most of us, was but a far-off fantasy from an age unheard of, and the GPT model existed mainly inside a recently published research paper by a bunch of nerds.
I really enjoyed creating the feature I'm about to describe, even though it flagrantly ignores best practices. Back then, I was an inexperienced developer, eager to prove myself in the trenches and blissfully unaware of the things I still didn't know. And the issue, as always, was urgent, with no time to spare. But the solution actually worked. Sometimes, that's all that matters.
But!
And it's a big "but", I cannot lie: this is a fun story, but the code I'm about to share? It's definitely not something you should ever let loose in any production environment.
The task at hand
I was a part of the company's in-house IT/softdev team, tasked with creating an ERP product. Our mission: to replace a decades-old legacy in-house system with something fresher-looking and better-behaving (and also in-house, good grief). It's a classic tale of "let's remake it cleanly" that slowly became "let's stuff everything into it and make it harder, better, faster, stronger" when nobody was looking.
That's the feature creep - it creeps on you when you least expect it.
Along the way, someone came up with the brilliant idea of replacing one of the company's archaic internal web apps with something that wasn't held together by duct tape, technical debt, and the tears of its maintainers. It was a PHP app responsible for parsing invoices. These were required in raw text form, so the users would preprocess them from PDF format with some ancient freeware program that worked most of the time. The web app in question, rife with poor Y2K design choices and creatively named "Converter", would convert those raw text files into CSV files containing SKUs, prices, amounts, whatever.
The idea itself was simple - upload a text file and the internal logic runs it through a series of custom-written regular expressions, handling matches with some internal black magic, spitting out a file with desired data.
I say "black magic" because sometimes it was exactly that, with occasionally complicated and nonsensical rules. Imagine something like this:
- For the current match, if five lines before this one, the fifth word begins with "X04", then apply a "104" prefix.
- If the OEM code starts with numbers from 14 to 18, divide the amount by 5 to match the internal packaging rules.
And, of course, there was zero documentation available.
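To give a feel for what such rules look like once written down, here is a minimal sketch of the two examples above. The names (LegacyRuleSketch, ApplyPrefixRule, ApplyPackagingRule) are made up for illustration, and it assumes that the "104" prefix lands on the SKU and that words are separated by whitespace; the real rules were messier.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not the original Converter code.
public static class LegacyRuleSketch
{
    // Rule 1: if the fifth word of the line five rows above the match
    // begins with "X04", prefix the SKU with "104" (SKU is an assumption).
    public static string ApplyPrefixRule(
        IReadOnlyList<string> lines, int matchIndex, string sku)
    {
        int lookbackIndex = matchIndex - 5;
        if (lookbackIndex < 0)
        {
            return sku; // not enough context above the match
        }

        string[] words = lines[lookbackIndex].Split(
            new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
        bool hasMarker = words.Length >= 5
            && words[4].StartsWith("X04", StringComparison.Ordinal);

        return hasMarker ? "104" + sku : sku;
    }

    // Rule 2: OEM codes starting with 14..18 have their amount divided by 5
    // to match the internal packaging rules.
    public static decimal ApplyPackagingRule(string oemCode, decimal amount)
    {
        if (oemCode.Length < 2)
        {
            return amount;
        }

        int prefix;
        bool isNumeric = int.TryParse(oemCode.Substring(0, 2), out prefix);
        bool isPacked = isNumeric && prefix >= 14 && prefix <= 18;

        return isPacked ? amount / 5m : amount;
    }
}
```

Note that the first rule needs the whole file plus the index of the current match, not just the matched line. That contextual lookback is exactly what makes a purely declarative "one regex per vendor" configuration fall short.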
The devil in the details
Since we were already writing a new ERP system, someone came up with a brilliant idea of incorporating "Converter" into it. A module that processes text files based on some arbitrary rules seems pretty straightforward at first glance, but only until you realize the insidiousness of the user requirements.
A tiny, itty-bitty caveat.
Aside from the fact that each contact would send invoices in their own format, for whatever reason imaginable, they'd change those formats unreasonably often. And whenever they would, that specific invoice converter inside the "Converter" would stop outputting correct data. Or any data at all.
Such issues would have critical priority, as the problem would arise when the delivery was already at the company's doorstep, and the CSV file was required to allow the warehouse workers to process all the wares correctly. The fix was expected to be up and running ASAP, ideally in two hours or so. And sometimes, by the time you'd be done fixing that logic, you'd hear about another invoice not parsing. Or two.
While PHP apps can handle direct changes to their logic fairly well (for example, by directly editing the .php files on the web server, which is an amazing practice), a .NET app cannot do that. Each fix, or batch of fixes if you're lucky, requires rolling out a new version. In this specific case, it could easily result in releasing a dozen versions weekly, perhaps more. And the CI/CD was...
Well, it wasn't. There were some .bat files and network drives.
I'm not judging - we had an amazing dev team that might not have been very AGILE or overly SCRUM-y, but we sure knew how to jury-rig things. Which is a respectable skill. So, after some brainstorming, we decided to implement a solution that would allow editing the logic on the fly, just like the PHP gods intended their software to operate. It would require pieces of dynamically compiled C# code to handle parsing the text files, and that code would have to be editable by people responsible for the feature's maintenance.
And then, I was told to write it.
Problem overview
Okay, enough with the lore, we're not 40k nerds. Since we're about to dive into the technical side, let's drop most of the wittiness and talk actual reqs.
Business requirements
- The module must accept raw .txt files (PDF invoices preprocessed with an external tool) and output CSV files containing structured data: SKU, count, price per unit.
- Each vendor has at least one unique format. The system must support maintaining separate parsing configurations assigned to vendors in a many-to-one relationship.
- Parsing rules must support arbitrary and unknown complexity, including contextual lookbacks, conditional transformations, and derived logic.
- When a vendor changes their invoice format, the fix must be deployable within approximately two hours without application downtime.
- Maintainers must be able to author, edit, and test the parsing logic live.
Technical requirements
- Parsing logic must be editable at runtime without recompiling or redeploying the application.
- Runtime-editable logic must be written in C# and dynamically compiled within the host .NET Framework application.
- It has to be integrated into the existing ERP as a module, not shipped as a standalone app.
- The solution cannot depend on CI/CD infrastructure.
- Dynamically compiled code must be type-safe.
Interpretation
On the surface, we're looking at a simple text-to-CSV parser, and frankly, that's a weekend project or junior task territory.
However, the real problem isn't parsing the text. It's the frequency of the change. Vendors swapping their invoice formats constantly and fixes being needed in hours, not days, is already a completely different beast. Add no sane deployment pipeline to this, and you're looking at something more existential.
The actual challenge is: how do you build something in .NET that lets you rewrite its own logic on the fly, safely (or semi-safely at least), and without redeploying?
This is where .NET becomes your ball and chain, instead of a modern upgrade. The platform is compiled by nature (into IL that the CLR then JIT-compiles, but still compiled), and fundamentally, it does not want you to do this. PHP will happily let you edit a file on the server, hit refresh, and carry on with your life. With .NET, you're basically asking a compiled ecosystem to cosplay as a scripting language, which is the kind of thing that makes software architects turn gray a decade early.
The good news is that .NET actually has the tools for this; they're just not meant for what we're going to build with them (and generally, people discourage using them). CodeDomProvider, reflection, and runtime assembly generation exist for legitimate purposes, not to satisfy a "the delivery truck is idling in the parking lot, and the CSV is borked" requirement.
But the building blocks are there, and if you're insane enough to stack them into something that works and ship it, then it certainly is an option.
Hydration break 💧
Alright, we're halfway through. The juicy part is just ahead, so let's spare a moment to take a sip of whatever you're drinking. And if you're not, grab a glass of lukewarm water. Hydration is no joke.
Ready?
Type-safety in dynamically compiled code
In the technical part, I'll walk you through the steps I took to create something that is pleasant to interact with inside the IDE. We'll start with the simplest possible implementation of dynamic compilation at runtime in order to gain some much-needed lore about the concept. Then, we'll keep iterating to make it a bit more humane. So, let's dive in.
Naive native implementation
The stack we're working with here is .NET Framework 4.6.2, if I remember correctly. None of the modern dotnet niceties, and we're deeply entrenched in the Windows environment, trying to make do.
And it turns out that dynamically compiling C# code is... surprisingly not hard at all. The simplest implementation would look like this...
First, we need the external code itself, so let's store it inside a variable.
    var code = @"using System.Collections.Generic;
    namespace InvoiceParser
    {
        public class InvoiceParser
        {
            public IEnumerable<string> ParseInvoice(IEnumerable<string> input)
            {
                foreach (var code in input)
                {
                    yield return ""Parsed line: "" + code;
                }
            }
        }
    }";
Next, we simply compile it, like this:
    using System.CodeDom.Compiler;

    // ...

    public CompilerResults CompileCode(string code)
    {
        // The provider wraps the C# compiler and is disposable.
        using (var codeDomProvider = CodeDomProvider.CreateProvider("CSharp"))
        {
            var compilerParameters = new CompilerParameters();
            // Check compilerResults.Errors before touching CompiledAssembly.
            var compilerResults = codeDomProvider
                .CompileAssemblyFromSource(compilerParameters, code);
            return compilerResults;
        }
    }
And... that's it. You end up with a compiled assembly that can be used almost as if we had it referenced in the current project.
Almost.
Because invoking it ain't so pretty. We're operating with an object that neither IntelliSense nor the compiler knows a dang thing about. Forget about using it like any other class instance, and you can't simply cast it to anything, because there's no contract for the abstraction available. Instead, you can either use the dynamic keyword and write dirty code, or you can invoke the method by name through reflection... which looks ugly.
    var results = CompileCode(code);
    var instance = results.CompiledAssembly
        .CreateInstance("InvoiceParser.InvoiceParser");

    // Option 1: dynamic type encapsulation
    dynamic dynamicInstance = instance;
    dynamicInstance.ParseInvoice(invoiceLines);

    // Option 2: named invoke through reflection
    var parseInvoiceMethod = instance.GetType()
        .GetMethod("ParseInvoice");
    var parsedInvoice = parseInvoiceMethod
        .Invoke(instance, new object[] { invoiceLines });
The first option is the kind of stuff mothers warn children about: "Don't ever use dynamic in your code, lest Uncle Bob come for you at midnight's wake." Option two is much better, something that could probably pass as a finished feature. It's just ugly-looking. Of course, you could stick that inside a facade, seal it up, and try to forget about the committed atrocities. But myself, I still felt uncomfortable with that idea.
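For what it's worth, that facade could look roughly like the sketch below. The names ReflectionParserFacade and EchoParser are made up for illustration, and EchoParser merely stands in for whatever instance comes out of the compiled assembly; nothing like this shipped in the actual project.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical facade that hides the reflection call behind a typed method.
public sealed class ReflectionParserFacade
{
    private readonly object _instance;
    private readonly MethodInfo _parseInvoice;

    public ReflectionParserFacade(object instance)
    {
        if (instance == null) throw new ArgumentNullException(nameof(instance));

        _instance = instance;
        // Resolve the method once, by name and parameter type.
        _parseInvoice = instance.GetType().GetMethod(
            "ParseInvoice", new[] { typeof(IEnumerable<string>) });

        if (_parseInvoice == null)
        {
            throw new MissingMethodException(
                instance.GetType().FullName, "ParseInvoice");
        }
    }

    public IEnumerable<string> ParseInvoice(IEnumerable<string> lines)
    {
        // The cast fails at runtime if the external code changes its return type.
        return (IEnumerable<string>)_parseInvoice
            .Invoke(_instance, new object[] { lines });
    }
}

// Stand-in for the dynamically compiled class, used only to try the facade out.
public class EchoParser
{
    public IEnumerable<string> ParseInvoice(IEnumerable<string> input)
    {
        foreach (var line in input)
        {
            yield return "Parsed line: " + line;
        }
    }
}
```

Callers get a typed method, but every mistake in the external code (a renamed method, a different signature) still surfaces only at runtime, which is the part the facade can't fix.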
Wouldn't it be so much better if we could somehow... interact with our compiled code in a type-safe manner, just like with any precompiled assembly?
Type-safety through contracts
So, our code compiles - that's awesome. 🥳
But like I said, the ergonomics are still icky. Let's civilize it.
While using types sounds nice and simple, we still haven't figured out how to inform our IDE or the compiler about the contents of the code we'll be loading from text. We can't just reference a text file like it's an assembly, so we'll need to work around this. The simplest way to achieve this is by making our external code implement an interface and then casting our instance to that interface.
However, we need to somehow define this interface both within the static and dynamic parts of the code without duplicating the interface definition (even with exactly the same namespaces) because those won't be the same to the compiler. The CLR considers them different if they're in different assemblies. That's why we need a shared library.
Let's create an additional Class Library project, put an interface inside there, and reference it both in program code and code for dynamic compilation. Here's what it looks like:
    using System.Collections.Generic;

    namespace Common
    {
        public interface IInvoiceParser
        {
            IEnumerable<string> ParseInvoice(IEnumerable<string> input);
        }
    }
And then, in our external code, we do this:
    using System.Collections.Generic;
    using Common;

    namespace InvoiceParser
    {
        public class InvoiceParser : IInvoiceParser
        {
            public IEnumerable<string> ParseInvoice(IEnumerable<string> input)
            {
                foreach (var code in input)
                {
                    yield return "Parsed line: " + code;
                }
            }
        }
    }
With that prepared, let's write our final implementation.
    using System;
    using System.CodeDom.Compiler;
    using Common;

    namespace Main
    {
        internal class Program
        {
            private static string _compilerLanguage = "CSharp";
            private static string _interfaceDllName = "Common.dll";
            private static string _externalCodeClass = "InvoiceParser.InvoiceParser";
            private static string _externalCodeFile = "ExternalCode.txt";

            static void Main(string[] args)
            {
                CompilerResults compilerResults;

                using (var codeDomProvider = CodeDomProvider
                    .CreateProvider(_compilerLanguage))
                {
                    var code = System.IO.File
                        .ReadAllText(_externalCodeFile);
                    var compileParams = new CompilerParameters
                    {
                        GenerateExecutable = false,
                        GenerateInMemory = true
                    };
                    // The external code needs the shared interface assembly.
                    compileParams.ReferencedAssemblies
                        .Add(_interfaceDllName);
                    compilerResults = codeDomProvider
                        .CompileAssemblyFromSource(
                            compileParams, code);
                }

                // Errors also contains warnings; HasErrors ignores those.
                if (compilerResults.Errors.HasErrors)
                {
                    throw new Exception("External code failed to compile.");
                }

                // CreateInstance returns object; "as" yields null on a type mismatch.
                var parser = compilerResults.CompiledAssembly
                    .CreateInstance(_externalCodeClass)
                    as IInvoiceParser;
                if (parser == null)
                {
                    throw new Exception("Parser failed to be instantiated.");
                }

                var input = new string[]
                {
                    "First line",
                    "Second line",
                    "Third line"
                };

                var output = parser.ParseInvoice(input);
                Console.WriteLine(string
                    .Join(Environment.NewLine, output));
                Console.ReadKey();
            }
        }
    }
And that's all there is to it. Just look how neat the code looks - it's all recognized by Intellisense, without any kind of weirdness, and added complexity is pretty much none. What's not to love?
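Well, besides memory leaks. On .NET Framework, you can't unload an individual assembly once it's loaded; the only way out is to unload the entire AppDomain it lives in. So every recompile leaves another assembly sitting in the process until it exits. For a server, that's a nightmare scenario. In our case, the Converter module was running in the client-side application, so it was no biggie.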
But with this in mind, and the fact that you still need to implement some sort of mini-IDE inside your application, or at the very least something that lets you compile the code, grab the errors, and iterate until you write working code, this approach is sub-par.
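The "compile, grab the errors, iterate" loop mostly boils down to turning CompilerResults.Errors into something a maintainer can read. A minimal sketch, assuming a hypothetical helper named CompilerDiagnostics and a message format of my own choosing (the original module's GUI is not shown here):

```csharp
using System;
using System.CodeDom.Compiler;
using System.Collections.Generic;

// Hypothetical helper: formats CodeDOM diagnostics for display in an editor pane.
public static class CompilerDiagnostics
{
    public static IList<string> Describe(CompilerErrorCollection errors)
    {
        var messages = new List<string>();

        foreach (CompilerError error in errors)
        {
            // The collection mixes warnings and errors; IsWarning tells them apart.
            var severity = error.IsWarning ? "warning" : "error";
            messages.Add(string.Format(
                "({0},{1}): {2} {3}: {4}",
                error.Line, error.Column, severity,
                error.ErrorNumber, error.ErrorText));
        }

        return messages;
    }
}
```

One thing to keep in mind: if the stored code is wrapped in a template before compilation, the reported line numbers refer to the wrapped source, so they'd need an offset before being shown next to the maintainer's snippet.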
It's 2026. Just set up a microservice and be done with it, or something like that.
An afterword
We need to acknowledge something here: this is dated stuff.
CodeDomProvider has long since shared the fate of the dodo bird - it's effectively obsolete. And unlike dinosaurs, rarely mourned. These days, you'd use Roslyn instead: recommended by 9 out of 10 dentists, more secure, an elegant compiler for more civilized times. It's the modern way to do dynamic C#. If you're thinking about grabbing that 2010 crap I showcased here, and blindly implementing it into your 2026 project with the Copypaste's Method... Well, don't. This article is a historical curio, a war story, not a recommended implementation.
(No, I have no idea why dentists recommend Roslyn.)
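For comparison, here's roughly what the same compile step looks like with Roslyn. This is a sketch, not a drop-in replacement: it assumes the Microsoft.CodeAnalysis.CSharp NuGet package is installed, RoslynCompilerSketch is a made-up name, and the caller is expected to pass the paths of every assembly the external code needs (the core library plus, for the contract-based approach, the shared interface DLL).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper: compiles C# source into an in-memory assembly via Roslyn.
public static class RoslynCompilerSketch
{
    public static Assembly Compile(string code, IEnumerable<string> referencePaths)
    {
        var syntaxTree = CSharpSyntaxTree.ParseText(code);
        var references = referencePaths
            .Select(path => MetadataReference.CreateFromFile(path));

        var compilation = CSharpCompilation.Create(
            "DynamicParser_" + Guid.NewGuid().ToString("N"),
            new[] { syntaxTree },
            references,
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        using (var stream = new MemoryStream())
        {
            var result = compilation.Emit(stream);
            if (!result.Success)
            {
                // Surface only the errors; warnings do not fail the build.
                var errors = result.Diagnostics
                    .Where(d => d.Severity == DiagnosticSeverity.Error)
                    .Select(d => d.ToString());
                throw new InvalidOperationException(
                    string.Join(Environment.NewLine, errors));
            }

            return Assembly.Load(stream.ToArray());
        }
    }
}
```

Usage would be along the lines of RoslynCompilerSketch.Compile(code, new[] { typeof(object).Assembly.Location }), followed by the same CreateInstance call as before. The safety notice below applies to this version just as much.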
Secondly (and more importantly), an obligatory safety notice.
The examples above? That's user-editable C# code running in your environment. Let me make it painfully clear: implementing this means slapping a humongous, neon-lit sign that says:
REMOTE CODE EXECUTION EXPLOIT - THIS WAY!
If you let the wrong person near that, it could be used to run some nasty exploits. Hell, if you don't let anyone near it, and somehow, someone exploits another hole to access it - it will be used to run some nasty exploits.
And that's an express ticket to Real World Consequences town, population: you. In .NET Framework, back then, you could limit the risk by running your dynamically compiled code inside a low-privilege AppDomain. It's not perfect security... but let's be real: if you want perfect security, this is not the kind of implementation you should be using.
Sure, we could go through some steps where we try to kinda, sorta sandbox it, but here's probably the BEST advice you could hear in regards to CodeDomProvider:
⚠️ DO NOT DO THIS. EVER. ANYWHERE. ⚠️
As for us back then, we deemed it safe... or "safe enough". The dynamic code in question would be managed by someone from within the IT team.
And finally, thirdly... how did the story end?
Obviously, while the code works, it's hardly a real implementation of handling various invoices. How did I manage to code that? Did the implementation eventually go FUBAR? Was I chased off from the company's premises months later by an angry pitchfork-wielding mob?
Well, it all went... surprisingly well?
The way I handled the multiple supplier logic was by storing the code of the ParseInvoice method inside an SQL table. I would then retrieve the contents, wrap them with a template class, and compile it. Everything else matters little in terms of this article - the GUI, testing, all that nonsense.
    // Lots of code omitted for brevity

    // Literal braces are doubled so that string.Format only substitutes {0}.
    private const string codeWrapper = @"
        public IEnumerable<string> ParseInvoice(IEnumerable<string> input)
        {{
            {0}
        }}";

    public IEnumerable<string> ParseInvoice(int supplierId,
        IEnumerable<string> invoice)
    {
        var methodCode = _externalCodeRepository.GetBySupplier(supplierId);
        var externalCode = string.Format(codeWrapper, methodCode);
        var parser = _invoiceParserFactory.Create(externalCode);
        return parser.ParseInvoice(invoice);
    }
It was probably the first time in my career when I purposefully broke the Bestest Practices(tm) in exchange for everyone's convenience, myself included. Would I do it again today, under the exact same constraints?
Reluctantly, but I would.
As for an ideal implementation, in an ideal world without technological constraints? I'm sure you can think of several solutions yourselves. Personally, I'd opt for a microservice. Easy to maintain, quick to deploy, and done as it should be.
But back then, I didn't even know about such a concept.
It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of the joy of coding, it was the epoch of never caring about best practices.
And sometimes, I do miss those times.
Until I open someone else's "TODO: temporary, rewrite later" fix from 2015. Then, I'm reminded we're all just caretakers in the Museum of Bad Decisions.