Over the course of my career I have had to query, extract, and analyze data sets from all sorts of databases. Microsoft Excel is probably the number one tool for this job, but Excel has limitations when it comes to consuming and analyzing normalized databases. That is where someone like me comes in: I build a query that extracts and de-normalizes the data set to make it easy for end users to run their analysis. The type of analysis I'm talking about here is quick, ad-hoc analysis: someone asked a question and we are going to answer it once. Excel has some pretty slick tooling for analyzing normalized databases, but it generally isn't worth setting that up just to answer one question.
During my career I have been both the query writer and extractor, and the analyzer (working with a data set in Excel to answer a question). My favorite tool for the first job is LinqPad. With LinqPad, you are querying in C#, not SQL. My favorite use case for LinqPad over any other SQL IDE is advanced joining and aggregation across multiple queries. Producing a single data set to work with can often be done more quickly (in my opinion) by creating multiple small queries and combining the results than by trying to write one large SQL query with several joins. I also recently discovered that LinqPad can display WPF and WinForms controls in its output window. I figured, why not build and display a .NET chart object? It works! So why stop with displaying a single chart that I build once? Why not be able to display a chart of any IEnumerable?
I created an extension method for LinqPad called "ToChart" that converts any .NET IEnumerable (i.e., the results of a query in LinqPad) into a .NET chart object. Using reflection, this proved to be pretty straightforward. The objects you want to display as data points in the chart must follow some naming conventions for their properties. For example, you must have a property called "Key" to serve as the X-axis value. Any numeric properties in the object are then made into Y-axis points, or series. You can also customize the properties of each series by appending the property name to a new property in your object; for example, color can be controlled like so: Value_Color = Color.Red. In the screenshot below you can see where I'm setting the Color and Tooltip of each point in the "Hours" series.
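To illustrate the idea, here is a minimal sketch of the reflection behind a ToChart-style extension. This is not the actual ToChart source; the names ExtractSeries and IsNumeric are my own, and it returns plain data rather than a Chart object so the reflection logic stands alone:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class ChartExtensions
{
    // Finds the "Key" property for the X axis and treats every other
    // numeric property as a Y-axis series, per the conventions above.
    public static Dictionary<string, List<(object Key, double Value)>> ExtractSeries<T>(
        this IEnumerable<T> items)
    {
        var props = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance);
        var keyProp = props.First(p => p.Name == "Key");
        var numericProps = props.Where(p => p != keyProp && IsNumeric(p.PropertyType)).ToList();

        var series = numericProps.ToDictionary(p => p.Name, _ => new List<(object, double)>());
        foreach (var item in items)
        {
            var key = keyProp.GetValue(item);
            foreach (var p in numericProps)
                series[p.Name].Add((key, Convert.ToDouble(p.GetValue(item))));
        }
        return series;
    }

    private static bool IsNumeric(Type t) =>
        t == typeof(int) || t == typeof(long) || t == typeof(float) ||
        t == typeof(double) || t == typeof(decimal);
}
```

In the real extension, each extracted series would be added to a System.Windows.Forms.DataVisualization.Charting.Chart, which LinqPad can then render in its output window.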
Why did I do this? Number one, because it was fun and because I could. Second, it allows me to stay out of Excel and keeps me closer to the data when trying to answer a question. Finally, it helps me answer the question faster. Excel is a great tool, but if I can remove one step from the question-answering process, that is great. Like I said earlier, I use this for quick ad-hoc question answering, not for detailed or recurring reporting. For more robust or recurring reporting I would look to SSRS or Excel to build a solution.
ASP.NET Core recently hit the 1.0 milestone, and one of the most interesting features in this version (or edition; I'm not sure of the future of the MVC 5.x series) is that dependency injection is baked into the core of the framework. That means our controllers are no longer created by simplistic construction logic and can have complex dependencies. In order to facilitate my unit testing requirements, my team has (for years) overridden the default IDependencyResolver in MVC to use Unity (Microsoft's IoC library) to resolve all dependencies ("services") for MVC. This approach is functional, and great for unit testing because you can inject mocked members into your controller, but it would sure be nice if that functionality were cooked into the framework so that my team didn't have to bolt it on. You can see the details about how to configure DI in MVC Core here.
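For a sense of what "baked in" looks like, here is a sketch of constructor injection in MVC Core. The IGreetingService, GreetingService, and HomeController types are hypothetical examples of mine, not code from the post:

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.DependencyInjection;

public interface IGreetingService { string Greet(string name); }

public class GreetingService : IGreetingService
{
    public string Greet(string name) => $"Hello, {name}";
}

public class Startup
{
    // Registrations made here are available to every controller; no
    // custom IDependencyResolver override required.
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddMvc();
        services.AddTransient<IGreetingService, GreetingService>();
    }
}

public class HomeController : Controller
{
    private readonly IGreetingService _greetings;

    // The framework resolves this dependency when it creates the controller,
    // and a unit test can pass in a mock directly.
    public HomeController(IGreetingService greetings) => _greetings = greetings;

    public IActionResult Index() => Content(_greetings.Greet("world"));
}
```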
Unfortunately, I am not ready to jump into MVC Core for my team's production applications. It's not a me thing, it is a we thing: jumping into MVC Core is really a decision that all our development teams need to agree to, and we just have too many other things to focus on. I really believe that MVC Core is where we will go in the future, so I did decide to jump into the DI stack that Microsoft is using in MVC Core. Unfortunately, Microsoft's new DI stack is not Unity. I decided to go against our de facto standard of Unity because if we do go to MVC Core, there's no reason to continue with Unity. Unless the new DI stack turned out to be terribly complicated, it would become my team's standard going forward. Spoiler alert: IMHO, this new DI stack is better and easier to use than Unity.
Unity (and Ninject, and Autofac, etc.) have a ton of awesomely cool features that I have never used in a professional project. Ever since I understood the idea of dependency injection, I have wanted a simpler tool. In fact, to learn how DI works and to create a simpler tool to use, I once created my own IoC container. From my experience and experimentation, the new DI stack is extremely simple. The root interface, IServiceProvider, has only a single method, GetService. There are several helper methods that make it simpler to use, like the generic version of that method.
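A minimal sketch of that simplicity, using the Microsoft.Extensions.DependencyInjection package; the IClock service is my own illustrative example, not one from the post:

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

public interface IClock { DateTime Now { get; } }

public class SystemClock : IClock
{
    public DateTime Now => DateTime.UtcNow;
}

public class Program
{
    public static void Main()
    {
        // Register services in a ServiceCollection...
        var services = new ServiceCollection();
        services.AddSingleton<IClock, SystemClock>();

        // ...then build the provider. The root interface exposes just
        // one method: GetService(Type).
        IServiceProvider provider = services.BuildServiceProvider();
        var viaInterface = (IClock)provider.GetService(typeof(IClock));

        // The generic extension method is the friendlier helper.
        var viaHelper = provider.GetService<IClock>();

        Console.WriteLine(viaHelper.Now);
    }
}
```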
All developers should know the value of unit testing and automated builds. They ensure quality is built into, and maintained in, your software products. You don't want your interns to come in, add some cool functionality to your product, and deploy a broken product because they didn't understand how their changes impacted existing functionality. In my last post in this series I went over how to create unit tests for a Node.js application. Unit tests by themselves are great, but you (and your intern) need to execute them for there to be any value in having them. Chances are that intern doesn't know about unit testing and won't know they should, or even how to, run the unit tests before committing their changes. Fortunately, there are tools that automate testing and reporting on commits. Travis is one such tool, and it is free to use. Setting it up for my Node application was stupid easy. There were three steps. First, I needed to define a "default" gulp task like so:
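A gulpfile entry along those lines might look like this (gulp 3.x syntax, matching the era of the post; only the "run-test" task name comes from the post itself):

```javascript
// gulpfile.js — a "default" task that simply delegates to the
// existing "run-test" task, so Travis has something to run.
var gulp = require('gulp');

gulp.task('default', ['run-test']);
```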
This step is necessary because I only defined the "run-test" task in gulp when setting up the unit tests; Travis runs the "default" gulp task by default, so it needs to exist. Second, you need to create a ".travis.yml" file to define how Travis should run.
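A minimal ".travis.yml" for a Node project using gulp might look like the following. The post's actual file isn't shown here, so the Node version and the global gulp install are assumptions on my part:

```yaml
# .travis.yml — run the project's gulp "default" task on every push.
language: node_js
node_js:
  - "4"
before_script:
  - npm install -g gulp
script: gulp
```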
Third, you need to log in to Travis with your GitHub account and select the repository to test. That is it. You are then building and testing automatically. If all is set up right, you will see a screen like this shortly after pushing your changes to GitHub, complete with your "build passing" badge. Awesome!
Recently, one of my employees was working to automate an Excel-based report. The old-school approach to Excel automation involved heavy use of macros and VBA, which would execute queries or copy data around between sheets spread out on network shares to accomplish some data task. In my experience this was very brittle and prone to failure. This time around, I told my employee to use Power Pivot to accomplish the automation. There is no VBA, only SQL statements that get executed. The visualization is accomplished using Pivot Tables and Pivot Charts. Power Pivot is awesome in that it can pull data from many different types of data sources: SQL Server, Oracle, SSRS, SSAS, MySQL, and Excel sheets. The trick is that in order to make your data model work well for reporting, you have to think a little like a data warehouse architect. This will enable a power user to create self-refreshing data sets that can be analyzed in Pivot Tables and Pivot Charts. Let me walk you through a simple Power Pivot model that I created to track some of my personal fitness goals and health metrics.
Let's start with the all-important time table. In data warehousing, almost all measurable data ties to a point in time, and we model that with a time dimension. In my example, I have a sheet with one column containing a date and time. I incremented each row by one hour and copied it down about 10,000 times. You then click "Add to Data Model" and you have created a Power Pivot table. To make this more useful and interesting for slicing data, you need to add some columns. For example, add the day of week with this expression: FORMAT([DateTime], "ddd"). Finally, create a key column. I learned this trick from a former colleague: you can create an easy-to-read key for dates and times with a little addition and multiplication. You could easily stop there or expand it out to minutes or even seconds: (YEAR([DateTime]) * 10000) + (MONTH([DateTime]) * 100) + DAY([DateTime]).
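Written out as DAX calculated columns, the two expressions above look like this (the column names DayOfWeek and DateKey are my own labels, and the hour-level extension is a sketch, not from the post):

```
DayOfWeek = FORMAT([DateTime], "ddd")
DateKey   = (YEAR([DateTime]) * 10000) + (MONTH([DateTime]) * 100) + DAY([DateTime])

-- Worked example: 2016-05-09 becomes 2016*10000 + 5*100 + 9 = 20160509,
-- which reads naturally as year/month/day.
-- Expanding the key to the hour would follow the same pattern:
-- DateHourKey = ([DateKey] * 100) + HOUR([DateTime])   -- e.g. 2016050913
```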
Adding a Sheet to Power Pivot
Once you have your time table situated, you are ready to start creating tables with data to measure: fact tables, in data warehouse lingo. I wanted to start with three measures: calories, exercise, and weight. All of these tables started out very simple; there was a column for the date and a column for the value. In the Power Pivot designer I then repeated the creation of the date key on each table. The final step is to create an association between the fact tables and the time dimension using the calculated DateKey column in Power Pivot; the easiest way to do that is by clicking and dragging in the Power Pivot designer. Once that is complete, you are ready to start adding Pivot Tables or Pivot Charts to your spreadsheet.
A Pivot Chart
The Completed Model
Immediately after getting my first Node project up and running, I started to ask: how do I write and test my code? My corporate experience taught me the importance of automated testing, how to create tests in C# with Visual Studio, and how to automate the testing with TFS, but Node and VS Code are totally different. Visual Studio is an IDE and it will take care of almost everything for you. You simply need to create a new project (a "test" project) in your solution and you are good to start writing tests. Executing your tests is as easy as a file menu click or a keyboard shortcut. You don't have to think about the testing framework or how your code is built or executed, because the IDE takes care of that for you.
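In VS Code (circa 2016), that wiring lives in a ".vscode/tasks.json" file. The post's actual file isn't shown here, but a minimal sketch that marks a gulp task as the test command might look like this; the "run-test" task name is the one used elsewhere in this series, and the rest is assumed:

```json
{
    "version": "0.1.0",
    "command": "gulp",
    "isShellCommand": true,
    "tasks": [
        {
            "taskName": "run-test",
            "isTestCommand": true
        }
    ]
}
```

With "isTestCommand" set, VS Code binds that task to its "Run Test Task" command and keyboard shortcut.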
Once you have these in place, you can run your tests with the test command in the Command Palette or via the keyboard shortcut.