C# Best Practices and Tips by Toptal Developers


This resource contains a collection of C# best practices and C# tips provided by our Toptal network members. As such, this page will be updated on a regular basis to include additional information and cover emerging techniques. This is a community-driven project, so you are encouraged to contribute as well, and we are counting on your feedback.

C# is a general-purpose, multi-paradigm, object-oriented programming language developed by Microsoft, primarily intended for developing software components for deployment in distributed environments. By design, C# emphasizes portability, software robustness, and internationalization.

Check out the Toptal resource pages for additional information on C#. There is a C# hiring guide, C# job description, common C# mistakes, and C# interview questions.

How to Process Text Files Quickly Using Reactive .NET?

Processing text files is something we’ve all done as programmers. However, manipulating large (multi-GB) files raises an issue: How can you do it as quickly as possible, and without loading the entire file into memory?

Text files usually have a header (one to several lines), followed by some lines of data. Processing such a file generates another text file, with each input line resulting in one or more output lines.

In this short article, I’m going to explain how to process such files using the Rx.NET library. The entire example project is available at Bitbucket. Here is how to solve this task.

First, load a file from a URL and return an IObservable<string>, with each string being an individual line. I have split this into the following classes:

The WebLoader class is responsible for opening a resource (a local or remote file) and returning it as a stream. I extracted this as a separate class, instead of just using the Open method in the RemoteTextLoader, because I wanted to be able to write self-contained unit tests, without accessing the network.

The RemoteTextLoader class will take that stream and return it as an observable of strings, with each string being a separate line. You can see the main code below; loader is an instance of the WebLoader class, and the ReadLoop method simply returns one line at a time.

        public IObservable<string> Read(string url)
        {
            return Observable.Using(
                () => new StreamReader(loader.Open(url)),
                sr => ReadLoop(sr).ToObservable(scheduler));
        }

The RemoteTextLoader.Read method wraps the reader in an Observable.Using call, which ensures that the StreamReader and its underlying stream are closed when the observable completes. This is important to prevent resource leaks. The private ReadLoop method runs on the given scheduler, so that it does not block the calling thread.
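The ReadLoop method itself is not shown here. A minimal sketch of what such a method might look like, assuming it is a plain C# iterator over the reader (the LineReader class name is hypothetical and not taken from the example project):

```csharp
using System.Collections.Generic;
using System.IO;

public static class LineReader
{
    // Hypothetical stand-in for the private ReadLoop method:
    // lazily yields one line at a time, so only the current line
    // needs to be held in memory.
    public static IEnumerable<string> ReadLoop(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }
}
```

Because the iterator is lazy, nothing is read until a consumer starts enumerating, which is what allows the scheduler to pull lines on its own thread.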

Finally, the WebStream class is responsible for closing not only the input stream but also the web request, also to prevent leaking resources.

Then, after the input file has been turned into a (reactive) stream of lines, that stream must be split into several streams and each of those processed in parallel on a different thread.

The ProducerConsumerSplitter class handles the first part; its Split method takes an input observable and a count and returns an array of observables.

    public class ProducerConsumerSplitter<T> : Splitter<T>
    {
        public IObservable<T>[] Split(IObservable<T> observable, int count)
        {
            // block if the collection grows past (thread count * 10) items
            var collection = new BlockingCollection<T>(count * 10);

            observable.Subscribe(collection.Add, () => collection.CompleteAdding());

            return Enumerable
                .Range(0, count)
                .Select(_ => CreateConsumer(collection))
                .ToArray();
        }

        private static IObservable<T> CreateConsumer(BlockingCollection<T> collection)
        {
            return Observable.Create<T>(o =>
            {
                while (!collection.IsCompleted)
                {
                    T item;
                    if (collection.TryTake(out item))
                        o.OnNext(item);
                }
                o.OnCompleted();

                return Disposable.Empty;
            });
        }
    }

The splitter uses a producer-consumer pattern to split one observable into many. The producer part subscribes to the input observable and starts adding its items to the collection, blocking if the number of items is larger than an arbitrarily chosen limit of ten times the number of output observables. This way, the consumer observables aren’t starved of items to process, and we don’t fill up the memory with a multi-GB input file either.

The consumers simply read items from the collection until the IsCompleted property returns true, which occurs once the collection is empty and there are no more items waiting to be added to it. Note that you should not use the IsAddingCompleted property here, since it is set to true as soon as the producer has finished adding items to the collection, even if there are still items to be processed.
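The difference between the two properties can be checked in isolation. The following sketch uses only BlockingCollection from the standard library; the CompletionFlags class and its Sample method are illustrative names, not part of the example project:

```csharp
using System.Collections.Concurrent;

public static class CompletionFlags
{
    // Adds one item, marks adding as complete, optionally takes the item
    // back out, and reports both flags at that moment.
    public static (bool IsAddingCompleted, bool IsCompleted) Sample(bool drain)
    {
        using (var collection = new BlockingCollection<int>(10))
        {
            collection.Add(42);
            collection.CompleteAdding();

            if (drain)
            {
                collection.Take();
            }

            return (collection.IsAddingCompleted, collection.IsCompleted);
        }
    }
}
```

Sample(false) reports IsAddingCompleted as true while IsCompleted is still false, because one item is still waiting. Only after the item is taken, as in Sample(true), do both flags become true.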

The actual processing on multiple threads is done in the ParallelProcessor class, which delegates loading and splitting to the previous classes and then calls a LineProcessor to turn each input line into one or more output lines.

        public IObservable<string> Process(string url, int headerSize, LineProcessor lineProcessor)
        {
            var lines = loader.Read(url);

            // need to use Publish here because otherwise the stream will be enumerated twice
            return lines.Publish(shared =>
            {
                var header = shared.Take(headerSize).ToArray();
                var rest = shared.Skip(headerSize);

                var streams = splitter.Split(rest, ThreadCount);

                // using SubscribeOn instead of ObserveOn because processing starts immediately when subscribing
                return header
                    .SelectMany(h => streams
                        .Select(stream => ProcessLines(stream, h, lineProcessor)
                            .SubscribeOn(scheduler)))
                    .Merge();
            });
        }

This class is responsible for loading the file, processing each line (using as many threads as requested) and then merging the results into a single observable.

Finally, the main program writes the result to an output.txt file and also measures the time spent processing the entire file. The processing done by the tester program is rather simplistic, so no doubt the real code would be slower, but I’m getting close to one million lines processed per second on my machine, which is encouraging.

Contributors

Marcel Popescu

Freelance C# Developer
Romania

Marcel is a senior developer with over 20 years of experience. He prefers back-end development, is great with algorithms, and prides himself on well-designed code. He has written an introductory book on TDD and is currently mentoring several junior programmers.


Know The Libraries

While the .NET Framework Class Library (FCL) provides a wide variety of classes for most scenarios, the third-party library ecosystem has blossomed in recent years to fill some of its gaps, and many of these libraries are worth knowing. Here, we cover some of the most useful and popular third-party libraries, all of them open source and therefore free, to help you select the best one for your needs.

Json.NET

As the name suggests, Json.NET is a library for working with JSON, providing methods for serializing and deserializing between JSON and .NET objects. It is highly customizable, allowing you to specify how the JSON should be formatted, how to handle dates, how the property names should be capitalized, and so on. Json.NET is available as a Portable Class Library, making it usable for Xamarin and Windows Phone projects as well.
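As a rough illustration of that customization, the sketch below serializes and deserializes a small object with camel-cased property names and a date-only format. It assumes the Newtonsoft.Json NuGet package is referenced; the Person and JsonDemo types are made up for this example.

```csharp
using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

// Hypothetical model type used only for this example.
public class Person
{
    public string Name { get; set; }
    public DateTime Born { get; set; }
}

public static class JsonDemo
{
    // Shared settings: camelCase property names and date-only strings.
    private static readonly JsonSerializerSettings Settings = new JsonSerializerSettings
    {
        ContractResolver = new CamelCasePropertyNamesContractResolver(),
        DateFormatString = "yyyy-MM-dd"
    };

    public static string ToJson(Person person)
    {
        return JsonConvert.SerializeObject(person, Settings);
    }

    public static Person FromJson(string json)
    {
        return JsonConvert.DeserializeObject<Person>(json, Settings);
    }
}
```

Passing the same settings object to both calls keeps the date format consistent in both directions.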

Dapper

Created by StackExchange, Dapper is a lightweight object mapper, compatible with any database that has an ADO.NET provider. Put simply, it executes a query and maps the results to your classes; it does not generate database queries or model classes for you. What makes Dapper stand out from other, similar libraries? Speed. Dapper is much faster than the Entity Framework and only a little slower than raw ADO.NET.

NLog

Detailed logging not only allows you to monitor application performance, but it’s also key to diagnosing issues in production. NLog comes with dozens of layout renderers that declaratively add contextual information to every log message. NLog is extensible too, letting you create a custom log target or a custom layout renderer. Configuration options allow you to log to the file system, to a database, or to the event log, or to send log messages by email.

Topshelf

Topshelf makes writing a Windows Service a breeze by taking care of all the behind-the-scenes details after you’ve created an ordinary console application. There’s no need to inherit from the ServiceBase class or to add a ProjectInstaller/ServiceProcessInstaller; simply tell Topshelf what your service should be named and which methods it should call when the service is started and stopped, and you will get a console application that you can debug without jumping through the usual hoops.
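A minimal sketch of such a console application follows, assuming the Topshelf NuGet package is referenced. HeartbeatService and the service names are invented for illustration.

```csharp
using System;
using Topshelf;

// Hypothetical service: prints a message once per second while running.
public class HeartbeatService
{
    private readonly System.Timers.Timer timer = new System.Timers.Timer(1000) { AutoReset = true };

    public bool IsRunning { get; private set; }

    public HeartbeatService()
    {
        timer.Elapsed += (sender, e) => Console.WriteLine("Still alive at {0:T}", e.SignalTime);
    }

    public void Start() { timer.Start(); IsRunning = true; }

    public void Stop() { timer.Stop(); IsRunning = false; }
}

public static class Program
{
    public static void Main()
    {
        HostFactory.Run(x =>
        {
            x.Service<HeartbeatService>(s =>
            {
                s.ConstructUsing(name => new HeartbeatService());
                s.WhenStarted(service => service.Start());   // called on service start
                s.WhenStopped(service => service.Stop());    // called on service stop
            });
            x.RunAsLocalSystem();
            x.SetServiceName("HeartbeatSample");             // illustrative name
            x.SetDisplayName("Heartbeat Sample");
            x.SetDescription("Sample Topshelf host that writes a heartbeat message.");
        });
    }
}
```

Started from the command line or a debugger, this runs as a plain console application. The same executable can then be registered as a Windows Service.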

Hangfire

Hangfire is a library that allows you to add different kinds of background jobs to your ASP.NET projects. It supports a variety of background task types such as fire-and-forget, delayed, and recurring, without needing a separate Windows Service. It includes a beautiful, sleek dashboard so you can quickly get an overview of all jobs, running or failed. The library will also synchronize jobs on different machines.

AutoFixture

By using AutoFixture for your unit testing, you won’t need mocks for peripheral dependencies in any given test. AutoFixture “auto-magically” creates anonymous variables and an instance of the system under test (SUT). As a result, adding a new parameter to the constructor of the SUT will not break any of your tests. If you integrate AutoFixture with other unit testing frameworks, you will receive a SUT instance and the other variables as test parameters, minimizing your test code even further.

Humanizer

Need to convert a Pascal case string to a readable sentence? Need to pluralize or singularize words? Convert numbers to words? Dates to a relative date or time spans to a friendly string? The Humanizer library “humanizes” your data, supporting English and almost forty other human languages.
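The calls below sketch a few of these conversions, assuming the Humanizer NuGet package is referenced and the current culture is English. HumanizerDemo is a made-up wrapper class; the extension methods themselves come from the library.

```csharp
using System;
using Humanizer;

public static class HumanizerDemo
{
    // Returns a handful of humanized values; results depend on the current culture.
    public static string[] Samples()
    {
        return new[]
        {
            "PascalCaseInputStringIsTurnedIntoSentence".Humanize(), // Pascal case to sentence
            "Man".Pluralize(),                                      // singular to plural
            "Men".Singularize(),                                    // plural to singular
            3501.ToWords(),                                         // number to words
            TimeSpan.FromDays(1).Humanize()                         // time span to friendly text
        };
    }
}
```

Each conversion is an extension method on an ordinary .NET type, so no setup is required beyond importing the Humanizer namespace.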

AngleSharp

AngleSharp is a library for parsing and manipulating HTML documents. It fully supports HTML5 and, just like browsers, it can handle malformed HTML. After parsing the HTML, it provides a DOM, which you can query with CSS selectors or filter with LINQ. The DOM is fully interactive, so you can remove or add new elements and the DOM will reflect the changes. AngleSharp is available as a Portable Class Library, making it usable for Xamarin and Windows Phone projects as well.

ImageProcessor

If you need to manipulate images in your app, ImageProcessor is your new best friend; it can resize, rotate, watermark, crop, apply various filters, and more, and it does so very efficiently. ImageProcessor’s fluent interface makes it easy to explore the API, and the library does not depend on GDI+, so it also works on .NET Core.

VerbalExpressions

Regular expressions (also known as RegEx) are powerful, but they are generally hard to write, read, and maintain. This is the problem VerbalExpressions tries to solve; it comes with a descriptive, fluent API that results in more expressive, easy-to-follow code. Just consider this ugly RegEx code to check if a string is a valid URL:

 var regex = new Regex(@"^(http)(s)?(://)(www\.)?([^\ ]*)$");
 var isValidUrl = regex.IsMatch("http://toptal.com");

Now, take a look at the corresponding VerbalExpression code that would replace it:

 var verbalExpression = new VerbalExpressions()
     .StartOfLine()
     .Then("http")
     .Maybe("s")
     .Then("://")
     .Maybe("www.")
     .AnythingBut(" ")
     .EndOfLine();
 var isValidUrl = verbalExpression.Test("http://toptal.com");
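To see how the raw pattern behaves on a few inputs, the following sketch wraps it in a small helper using only System.Text.RegularExpressions. UrlCheck and LooksLikeUrl are illustrative names. Note that the pattern is deliberately loose: it only checks the scheme and the absence of spaces.

```csharp
using System.Text.RegularExpressions;

public static class UrlCheck
{
    // Same pattern as the RegEx example: http or https, optional "www.",
    // then any run of characters that contains no space.
    private static readonly Regex UrlPattern =
        new Regex(@"^(http)(s)?(://)(www\.)?([^\ ]*)$");

    public static bool LooksLikeUrl(string candidate)
    {
        return UrlPattern.IsMatch(candidate);
    }
}
```

With this helper, "http://toptal.com" and "https://www.toptal.com" match, while "ftp://toptal.com" (wrong scheme) and "http://top tal.com" (contains a space) do not.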

MiniProfiler

MiniProfiler is a simple yet very powerful mini-profiler for ASP.NET. It profiles all your database commands, regardless of whether you use raw ADO.NET or an ORM, such as LINQ to SQL or the Entity Framework. It includes neat functionality, such as explicitly profiling part of your code by wrapping it with a call to the Step method. NoSQL databases, such as RavenDB and MongoDB, are also supported, and it has providers for ASP.NET MVC as well as WCF.

Contributors

Giorgi Dalakishvili

Freelance C# Developer
Georgia

Giorgi is a software developer with more than a decade of experience. He has worked on a wide variety of applications including mobile applications, console applications/Windows services, large web applications, REST APIs, web services, and desktop/Mac apps. He also maintains several open-source projects on GitHub. He works mainly with C#, Xamarin, SQL Server, Oracle, ASP.NET, ASP.NET Core MVC, Entity Framework, Android, iOS, and WinForms.
