Introduction

Only tested code is worth something. A code coverage report shows you which code is tested and which is not. That is essential for ensuring the functionality of the code, and for demonstrating it to customers. There are also many benefits for you and your team: in a growing codebase, you get an easy overview of which code is tested and working and where your team still needs to add tests. Your code quality will improve too, because testing forces you to engage with the code’s structure. Complicated passages with many if conditions stand out and can then be refactored.
Therefore it is important to keep an eye on the percentage of tested code.
Because we had trouble finding a good guide on how to do this, we decided to write this post to help those who run into the same problem.
As a reference we used the best guide we found (even if it does not work out of the box):
https://medium.com/@dijin123/angular-unit-testing-code-coverage-report-in-azure-devops-build-pipeline-a062c881609c

To do the same thing with a dotnet backend you can follow this guide:
https://dejanstojanovic.net/aspnet/2020/may/setting-up-code-coverage-reports-in-azure-devops-pipeline/

In this post we show you how to add a coverage report to your Angular project. In addition, we will also show you how to display the code coverage on an Azure DevOps dashboard.

Coverage Report Screenshot

Dashboard

Install Junit-Reporter

Run the following command to install the junit-reporter, which we will use to export the results of our unit tests in a format Azure DevOps can publish.

npm install karma-junit-reporter --save-dev

You need to add an additional script to your package.json so that ng test runs once without watching for file changes:

"test-headless": "ng test --watch=false --browsers=ChromeHeadless --reporters=junit --code-coverage"

Your package.json file should look like this now:

{
  "name": "angular-testing",
  "version": "0.0.0",
  "scripts": {
    "ng": "ng",
    "start": "ng serve",
    "build": "ng build",
    "watch": "ng build --watch --configuration development",
    "test": "ng test",
    "test-headless": "ng test --watch=false --browsers=ChromeHeadless --reporters=junit --code-coverage"
  },
  "private": true,
  "dependencies": {
    "@angular/animations": "^14.1.0",
    "@angular/common": "^14.1.0",
    "@angular/compiler": "^14.1.0",
    "@angular/core": "^14.1.0",
    "@angular/forms": "^14.1.0",
    "@angular/platform-browser": "^14.1.0",
    "@angular/platform-browser-dynamic": "^14.1.0",
    "@angular/router": "^14.1.0",
    "rxjs": "~7.5.0",
    "tslib": "^2.3.0",
    "zone.js": "~0.11.4"
  },
  "devDependencies": {
    "@angular-devkit/build-angular": "^14.1.1",
    "@angular/cli": "~14.1.1",
    "@angular/compiler-cli": "^14.1.0",
    "@types/jasmine": "~4.0.0",
    "jasmine-core": "~4.2.0",
    "karma": "~6.4.0",
    "karma-chrome-launcher": "~3.1.0",
    "karma-coverage": "~2.2.0",
    "karma-jasmine": "~5.1.0",
    "karma-jasmine-html-reporter": "~2.0.0",
    "karma-junit-reporter": "^2.0.1",
    "typescript": "~4.7.2"
  }
}

Edit the karma.conf.js

Next we need to edit the karma.conf.js in order to use the junit reporter and to export the code coverage as a cobertura file.
To enable the junit-reporter we add require('karma-junit-reporter') to the plugins array.
To export the results in the cobertura format we add { type: 'cobertura' } to the reporters
array inside the coverageReporter. If you have not changed anything else in this file it should look like this:

karma.conf.js

// Karma configuration file, see link for more information
// https://karma-runner.github.io/1.0/config/configuration-file.html

module.exports = function (config) {
  config.set({
    basePath: '',
    frameworks: ['jasmine', '@angular-devkit/build-angular'],
    plugins: [
      require('karma-jasmine'),
      require('karma-chrome-launcher'),
      require('karma-jasmine-html-reporter'),
      require('karma-coverage'),
      require('@angular-devkit/build-angular/plugins/karma'),
      // --------------> Add the following line <--------------
      require('karma-junit-reporter')
    ],
    client: {
      jasmine: {
        // you can add configuration options for Jasmine here
        // the possible options are listed at https://jasmine.github.io/api/edge/Configuration.html
        // for example, you can disable the random execution with `random: false`
        // or set a specific seed with `seed: 4321`
      },
      clearContext: false // leave Jasmine Spec Runner output visible in browser
    },
    jasmineHtmlReporter: {
      suppressAll: true // removes the duplicated traces
    },
    coverageReporter: {
      dir: require('path').join(__dirname, './coverage/angular-testing'),
      subdir: '.',
      reporters: [
        { type: 'html' },
        { type: 'text-summary' },
        // --------------> Add the following line <--------------
        { type: 'cobertura' }
      ]
    },
    reporters: ['progress', 'kjhtml'],
    port: 9876,
    colors: true,
    logLevel: config.LOG_INFO,
    autoWatch: true,
    browsers: ['Chrome'],
    singleRun: false,
    restartOnFileChange: true
  });
};
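
Optionally, karma-junit-reporter can also be configured through its own junitReporter section (placed next to the coverageReporter). A hedged sketch using its documented options; the values below mirror the defaults, so only change them if you also adjust the testResultsFiles glob in the pipeline further down:

junitReporter: {
  // '' resolves relative to Karma's basePath; point it to a folder like 'test_report' if you prefer
  outputDir: '',
  // leave undefined so the reporter derives the file name (including the browser name) itself
  outputFile: undefined,
  useBrowserName: true
},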

Coverage Report Publishing Pipeline

Use the following pipeline to publish the test results and the coverage report.

pool:
  vmImage: ubuntu-latest

steps:
  - task: NodeTool@0
    inputs:
      versionSpec: '18.7.x'

  - task: Npm@1
    inputs:
      command: 'install'

  - task: Npm@1
    inputs:
      command: 'custom'
      customCommand: 'run build'

  - task: Npm@1
    displayName: Test
    inputs:
      command: custom
      workingDir: ''
      verbose: false
      # run the command we created in the package.json
      customCommand: 'run test-headless'
    continueOnError: true

    # Publishes test results
  - task: PublishTestResults@2
    displayName: 'Publish Test Results $(Build.SourcesDirectory)/TESTS-*.xml'
    inputs:
      testResultsFiles: '$(Build.SourcesDirectory)/TESTS-*.xml'

    # Publishes Coverage from cobertura xml
  - task: PublishCodeCoverageResults@1
    displayName: 'Publish code coverage from $(Build.SourcesDirectory)/coverage/**/*cobertura-coverage.xml'
    inputs:
      codeCoverageTool: Cobertura
      summaryFileLocation: '$(Build.SourcesDirectory)/coverage/**/*cobertura-coverage.xml'

After the pipeline has run successfully, you can view the published code coverage under the Code Coverage tab of the pipeline run. Here is how it should look if you followed all the steps:

Code Coverage Pipeline Overview

Add Report to Dashboard

Now we add the report to our dashboard.

Current Code Coverage widget

Preview:
Code Coverage Widget Preview

On your Azure DevOps dashboard, click Edit and then "Add a widget", and search for "code coverage". If you don’t find the widget, install it from the marketplace:
https://marketplace.visualstudio.com/items?itemName=shanebdavis.code-coverage-dashboard-widgets

Code Coverage Widget

You need this configuration:

Code Coverage Widget Settings

Code Coverage over time widget

Preview:
Code Coverage Widget2 Preview

In addition, we want to show how the coverage develops over time. To do this, we search again for "code coverage" in the widget selection and choose the following one:

Code Coverage Widget2

If you don’t find the widget, install it from the marketplace:
https://marketplace.visualstudio.com/items?itemName=davesmits.codecoverageprotector

You need this configuration:

Code Coverage Widget2 Settings

The result gives a good overview of the test status of our code.

Conclusion

We believe that coverage reports contribute a lot to every team and give customers confidence in the product.
In our project, for example, we decided to add these reports to the dashboard, and in the process of getting the numbers up we also learned a lot about testing in general, because we could no longer leave complicated parts untested just to avoid difficult mocks. We also found a lot of bugs and bad dependencies in our code while testing it. For our project in particular, introducing code coverage reports helped a lot in terms of code quality.
We hope that in the future significantly more projects will use a coverage report to produce better code and more reliable products.

PS

The only thing we would still wish for is dark mode support for the dashboard widgets, which would round off the overall look of the dashboard.

© authored by Gregor Pfister and Nicolas Lerch

Vectorization, or Single Instruction Multiple Data (SIMD), is the art of performing the same operation simultaneously on a small block of different pieces of data. While many compilers do support some kind of auto-vectorization, it’s often fragile and even seemingly unrelated code changes can lead to performance regressions. SIMD code written by hand is difficult to read and has to be implemented multiple times for multiple platforms, with fallbacks if the customer’s CPU doesn’t support it.

Over the Christmas holidays I had the chance to revisit a personal project and port it from .NET 6 to .NET 7, where I gave vectorization a second look. I don’t think the statement above is 100% true anymore. In some ways, it turned out to be the opposite. Let me give you an example: a Color struct with two methods, Equals as an example of an operator and AlphaBlend, a simple textbook implementation of layering two pixels on top of each other.

[StructLayout(LayoutKind.Sequential)]
public readonly struct Color
{
    // Alpha aka opacity
    public float A { get; }
    // Red
    public float R { get; }
    // Green
    public float G { get; }
    // Blue
    public float B { get; }

    // constructor omitted

    public Color AlphaBlend(Color top) => new(
        top.A + A * (1 - top.A),
        top.R + R * (1 - top.A),
        top.G + G * (1 - top.A),
        top.B + B * (1 - top.A));

    public bool Equals(Color other) => 
           A == other.A
        && R == other.R
        && G == other.G
        && B == other.B;
}

While both methods are four lines long, the actual logic is only one line, repeated four times. As the name suggests, Single Instruction Multiple Data is meant for these kinds of tasks.

C#’s Vector in the past

C# provides four different types that help us with vectorization: Vector64, Vector128, Vector256 (all since .NET Core 3.0) and Vector (since .NET Core 1.0 and .NET Standard 2.1). Vector has an unspecified size and uses one of the other three types internally. The others are fixed size, as their names suggest, but can be split in different ways. One Vector128 can hold two 64-bit longs, four 32-bit ints or eight 16-bit shorts. Floating point numbers are supported as well, but custom structs will fail at runtime.
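
To make those lane splits concrete, here is a small illustration (the values are arbitrary); the Vector128.Create overloads pick the element type for you:

using System.Runtime.Intrinsics;

Vector128<long>  twoLongs   = Vector128.Create(1L, 2L);                 // 2 x 64 bit
Vector128<int>   fourInts   = Vector128.Create(1, 2, 3, 4);             // 4 x 32 bit
Vector128<float> fourFloats = Vector128.Create(1.0f, 2.0f, 3.0f, 4.0f); // 4 x 32 bit
// eight 16-bit shorts work the same way via the Create(short, ..., short) overload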

I implemented a vectorized version of AlphaBlend prior to C# 11, for performance and out of curiosity. It wasn’t pretty. Color is a struct of four 32-bit floats, 128 bits in total, the same size as Vector128. To perform operations on a Vector128, you had to use the functions defined in the Sse/Avx static classes. These fail at runtime if the host’s CPU doesn’t support them, and it’s up to you to prevent crashes that may never happen on your own system. This is usually done through a (scalar) fallback in every method or by requiring a minimum level of SIMD support (and possibly exiting the application on startup).

// Don't destroy the performance gains before doing the actual calculation.
public Vector128<float> Vec128 => Unsafe.As<Color, Vector128<float>>(ref Unsafe.AsRef(in this));

public Color AlphaBlend(Color top) 
{
    if (Sse.IsSupported)
    {
        var vResult = Sse.Multiply(Vec128, Vector128.Create(1.0f - top.A));
        vResult = Sse.Add(vResult, top.Vec128);
        return new(vResult);
    }
    else
    {
        return new(
            a: top.A + A * (1 - top.A),
            r: top.R + R * (1 - top.A),
            g: top.G + G * (1 - top.A),
            b: top.B + B * (1 - top.A));
    }
}

Are those unsafe casts safe? The unit tests pass but I’m not sure I would risk it in production. It doesn’t support ARM CPUs and looks a lot scarier in languages that only provide cryptic names or have to fall back to metaprogramming to eliminate the if-else branch. At least C#’s JIT is smart enough to detect such simple patterns at runtime and rewrite the function without branches (depending on what the host’s CPU supports).

C#’s Vector today

With .NET 7 and C# 11 you don’t have to write the code above anymore, because the Vector types now have their own operators and utility functions! It’s a small change that has a massive impact on readability and maintainability. The vectorized version is now on par with the scalar version in terms of readability and even a little shorter. Since it was so easy to make the entire class vectorized, I also changed the ARGB values to be of type Vector128 internally and provide getters instead, which yielded additional performance improvements. The Unsafe methods seem not to be zero cost like I initially thought. Generic Math (also new in C# 11, not shown here) was a breeze to implement.

public readonly struct Color
{
    private readonly Vector128<float> argb;

    public Vector128<float> Vector => argb;
    // Alpha
    public float A => argb.GetElement(0);
    // Red
    public float R => argb.GetElement(1);
    // Green
    public float G => argb.GetElement(2);
    // Blue
    public float B => argb.GetElement(3);

    // constructor omitted

    public Color AlphaBlend(Color top) => new(Vector * Vector128.Create(1.0f - top.A) + top.Vector);

    public bool Equals(Color other) => Vector == other.Vector;
}

Are there still functions that cannot be implemented this way? Would it be even faster to MemoryMarshal.Cast an array of colors to a span of Vector256? Of course. But the next time you write a simple container of homogeneous data, stop and think for a second whether you could express it as a Vector internally. Maybe someone will thank you for your extra care someday, or at least learn something new while looking through your source code.
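
For the curious, a rough sketch of that MemoryMarshal.Cast idea might look like this (a toy example; the uniform "dim every channel" operation is chosen only to keep it simple, and an odd trailing pixel would need separate handling):

using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

static void Dim(Span<Color> pixels)
{
    // Two 16-byte Colors fit into one 32-byte Vector256; any odd trailing pixel
    // is simply not covered by the cast span (Cast truncates).
    Span<Vector256<float>> packed = MemoryMarshal.Cast<Color, Vector256<float>>(pixels);
    Vector256<float> half = Vector256.Create(0.5f);

    for (int i = 0; i < packed.Length; i++)
        packed[i] *= half; // halves A, R, G and B of two pixels at once
}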

In a previous blogpost, we explored a concept and mindset for building bridges between systems (and teams) in agile development environments. The proposed way has its pros and cons, but it is a convenient approach for smooth and rapid PoC and MVP development. However, just mocking something locally (on your own machine) is not going to bridge gaps between teams; so, we somehow need to make our mock service accessible.

In this part of the series, we are going to dockerize our application and host it on a Kubernetes cluster. For this we have multiple options. While all following concepts apply to each of the big cloud providers, we are going to use Microsoft Azure (in this case Azure Kubernetes Service – AKS and Azure Container Registry – ACR).


Docker, oh Docker.

The first step is to dockerize our application. The following is specific to our application, but there are a lot of great tutorials on how to dockerize different kinds of applications across all languages and technologies. (If you are a German reader, you can follow this post by my colleague!) Anyways, let’s create the Dockerfile.

FROM python:3.8.5

RUN mkdir /code
WORKDIR /code
ADD . /code/
RUN pip install -r requirements.txt

RUN groupadd -r appuser && useradd -r -g appuser appuser
USER appuser

EXPOSE 9090
CMD ["python", "/code/app.py"]

We use a simple python 3.8.5 base image (based on Debian). For this context this is absolutely okay, but depending on the use case we might want smaller images like python-alpine or Debian-slim. However, python docker images are used a lot in the machine learning world (esp. w/ libraries like sklearn and xgboost) and there we might need certain OS libraries that would not be included in a tiny alpine image (at least not out-of-the-box). Our Dockerfile is not the most sophisticated ever, but it does the job. To learn more, see the official docs at docker docs and the docker blog.

Ok – let’s walk through our Dockerfile. It creates a Linux image with python 3.8.5 installed, copies our application code onto it and then installs dependencies via pip install. To not run the docker container as root user, we create a new system user (RUN groupadd -r appuser && useradd -r -g appuser appuser) and tell docker to use this user to run the container (USER appuser). The last step is to expose our application on port 9090 and specify the entry point to start the app (CMD ["python", "/code/app.py"]).
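
Before anything goes to the cloud, a quick local sanity check of the image could look like this (the tag is arbitrary; map whichever port your app.py actually listens on):

$ docker build -t wellroom-room:local .
$ docker run --rm -p 9090:9090 wellroom-room:local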

Mix in the cloud

We could build and run the docker image locally, but we want to make it accessible. An easy way could be to use Azure Container Instances. If we just wanted to run one single container, this would be the easiest solution. For this blogpost, we take another route and create our own cloud cluster. This way we can deploy multiple containers there (e.g. multiple ideas for a proof of concept, different versions in parallel etc.). Well, let’s create the cluster!

There are endless ways of working with infrastructure. Whenever we are new to a cloud provider or to any specific resource, we usually use the provider’s web portal (e.g. Azure Portal, Google Cloud Console, AWS Management Console) to explore our options. If we know how to integrate our cloud resources and have multiple similar or short-lived environments, Infrastructure as Code (e.g. w/ terraform, the k8s operator pattern or provider-specific solutions: Azure: ARM templates, AWS: CloudFormation, Google: CDM) is the way to go.

For everything in between, I love to use CLIs whenever possible. To create an AKS cluster, the CLI of choice is the azure cli (short: az). As the CLI is cross-platform, you can choose your environment. A lot of colleagues go with PowerShell (also available on Linux and macOS!) or the native macOS terminal. I prefer the new windows terminal + wsl2 + ubuntu 20.04 on windows.

If you do not have an azure subscription, start a free trial. The $200 free credit is more than we need for this. Make sure to install the Kubernetes CLI (kubectl) before you continue. Let’s spin up a console!

  1. First, we log in to Azure with our Azure account.

    $ az login
    You have logged in. Now let us find all the subscriptions to which you have access...
    CloudName    IsDefault    Name                            State    TenantId
    -----------  -----------  ------------------------------  -------  ------------------------------------
    AzureCloud   True         Visual Studio Premium mit MSDN  Enabled  XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
  2. We need to create a resource group to place our cluster in. (Microsoft’s way of grouping multiple Azure resources together is called resource groups. There are no cloud resources outside of resource groups.)

    $ az group create \
    --name tkr-blog-aks-we \
    --location westeurope
    Location    Name
    ----------  ---------------
    westeurope  tkr-blog-aks-we
  3. Then, we provision a cluster into this group. After some minutes we have a cluster called tkr-blog-k8s-we01 in a resource group tkr-blog-aks-we. While we only added two nodes to the cluster, it’s quite impressive how fast the provisioning is.

    $ az aks create --resource-group tkr-blog-aks-we --name tkr-blog-k8s-we01 --node-count 2 --generate-ssh-keys
    Succeeded tkr-blog-aks-we
  4. Now we fetch the credentials to connect to the cluster and store them in the local .kube-config (this requires kubectl to be installed).

    $ az aks get-credentials \
    --name tkr-blog-k8s-we01 \
    --resource-group tkr-blog-aks-we
    Merged "tkr-blog-k8s-we01" as current context in /home/username/.kube/config
  5. We are done! We could start exploring our cluster now.

    $ kubectl get all
    NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
    service/kubernetes   ClusterIP   10.0.0.1     <none>        443/TCP   5m24s

Just a tiny bit more infrastructure

Ok, let’s recap. We now have:

  • a cluster somewhere in the cloud and
  • source code locally (or in a git repository).

In the container world (or esp. in the docker world) we bridge this gap with container registries and CI/CD pipelines. To not expand the scope of this blog post too much, we’ll omit the CI/CD part and build and deploy from our local machine (spoiler: automating this is going to be part III of the series). That said, we still need the registry. Of course, every cloud provider has their own registry and there are other popular alternatives like, obviously, Docker Hub or JFrog Artifactory. For the sake of this blog, let’s stick with Azure, so Azure Container Registry (ACR) it is.

$ az acr create \
  --resource-group tkr-blog-aks-we \
  --name tkrblog \
  --sku Basic

Our cluster needs access to the registry (to pull images), so we grant the permission via the CLI. If we were not using two resources that integrate this well, we would simply use ImagePullSecrets (a sketch follows after the next command).

$ az aks update \
  --name tkr-blog-k8s-we01 \
  --resource-group tkr-blog-aks-we \
  --attach-acr tkrblog
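
If registry and cluster did not integrate this well, the ImagePullSecrets route would roughly consist of creating a registry secret in the cluster (names and credentials below are placeholders)

$ kubectl create secret docker-registry regcred \
  --docker-server=tkrblog.azurecr.io \
  --docker-username=<registry-user> \
  --docker-password=<registry-password>

and referencing it in the pod template of the deployment:

imagePullSecrets:
  - name: regcred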

After all the infrastructure work, we have:

  • a connexion application (previous blog post),
  • means to wrap it into a docker image (Dockerfile),
  • a registry to store the image (ACR) and
  • a cluster to run the container (AKS).

Building the Docker image

Usually this should be done in the CI/CD tool of your choice; likely something like Jenkins, GitHub workflows, CircleCI or TeamCity. Other great contenders are Azure DevOps’ Pipelines (seamless integration with the other azure services like aks and acr) or my personal favorite GoCD, which you could easily run inside the AKS cluster via the GoCD helm chart. The hottest challenger is probably Argo CD or other GitOps tools. Anyways, let’s build locally.

$ docker build -t tkrblog.azurecr.io/wellroom:1.0.0-dev.1 .
...
$ docker login tkrblog.azurecr.io
...
$ docker push tkrblog.azurecr.io/wellroom:1.0.0-dev.1
...

If you happen to use Azure DevOps, the build pipeline would look like this (otherwise just skip it):

trigger:
- master

resources:
- repo: self

variables:
  dockerRegistryServiceConnection: 'XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXX'
  imageRepository: 'wellroom'
  containerRegistry: 'tkrblog.azurecr.io'
  dockerfilePath: '$(Build.SourcesDirectory)/Dockerfile'
  tag: '1.0.0-dev.1'
  vmImageName: 'ubuntu-latest'

stages:
- stage: Build
  displayName: Build and push stage
  jobs:
  - job: Build
    displayName: Build
    pool:
      vmImage: $(vmImageName)
    steps:
    - task: Docker@2
      displayName: Build and push an image to container registry
      inputs:
        command: buildAndPush
        repository: $(imageRepository)
        dockerfile: $(dockerfilePath)
        containerRegistry: $(dockerRegistryServiceConnection)
        tags: |
          $(tag)

Deployments and Services in Kubernetes

So here we are: docker image in the registry, cluster with registry access. Now we need to bring this image into our cluster. There are multiple ways to deploy workloads to Kubernetes, but without yaml files this wouldn’t be a post about Kubernetes. So, let’s declare what we want Kubernetes to do with our image (deployment.yaml):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: wellroom-room-depl
  labels:
    app: wellroom
    component: wellroom-room
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wellroom-room-micro
  template:
    metadata:
      labels:
        app: wellroom-room-micro
    spec:
      containers:
        - name: wellroom-room
          image: tkrblog.azurecr.io/wellroom:1.0.0-dev.1
          ports:
            - name: http
              containerPort: 3000
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"

To apply this configuration to Kubernetes (as we are still connected to the cluster):

kubectl apply -f ./deployment.yaml

Essentially, we told Kubernetes to deploy our application. The work Kubernetes does after this command is quite impressive, but out of scope for this post. To make it short: Kubernetes runs the image as a container in a so-called pod.
Read about Kubernetes Deployments, ReplicaSets and Pods if you want to dig deeper.
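
If you want to verify that the pod actually came up, a quick look using the labels and names from our deployment.yaml does the trick:

$ kubectl get pods -l app=wellroom-room-micro
$ kubectl logs deploy/wellroom-room-depl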

Since pods are isolated workloads, they are not accessible by default. We need to expose them with a service. For us, this is easily done by creating a service definition (service.yaml) that exposes the deployment. Behind that, we could explore things like Kubernetes’ impressive service discovery.

apiVersion: v1
kind: Service
metadata:
  name: wellroom-room
  labels:
    app: wellroom
    component: wellroom-room
spec:
  # type: LoadBalancer
  ports:
    - port: 443
      targetPort: 3000
      protocol: TCP
  selector:
    app: wellroom-room-micro
$ kubectl apply -f ./service.yaml

Finalizing the puzzle: the benefit of the cloud.

Well, there we have it: our running mock service in a cloud-based Kubernetes cluster. Before we uncomment the line type: LoadBalancer, we’ll explore what it does. With the line commented out like this, Kubernetes exposes our running pod (created by the deployment.yaml) with the service (service.yaml). Since we did not specify where to expose it, Kubernetes makes it available cluster-internally only (the default for .spec.type is ClusterIP). That’s what you usually want if you have a microservice architecture: only services inside the cluster can call each other. But that does not fit our use case; we want to show our service to the world! There are a couple of ways to achieve this in Kubernetes (and esp. on a cloud instance!).

  • port-forwarding: this is for the development process only and no real option here (a one-liner example follows below the list).
  • NodePort: short answer – don’t use it (maybe if you have zero budget).
  • LoadBalancer: exposes one service to the world.
  • Ingress: actually, not a service, but the most powerful way (multiple services with same IP address/Routing, SSL, Authentication, …).
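
For completeness, the port-forwarding option from the list is a single command during development (the local port 8443 is an arbitrary choice):

$ kubectl port-forward svc/wellroom-room 8443:443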

If you want to dig deeper into this, this compact article provides great visuals to explain the differences. We stick with the LoadBalancer (could also use NodePort – but as I said, don’t use it…).

apiVersion: v1
kind: Service
metadata:
  name: wellroom-room
  labels:
    app: wellroom
    component: wellroom-room
spec:
  type: LoadBalancer
  ports:
    - port: 443
      targetPort: 3000
      protocol: TCP
  selector:
    app: wellroom-room-micro
$ kubectl apply -f ./service.yaml
$ kubectl get svc wellroom-room -w
NAMESPACE     NAME             TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)         AGE
default       wellroom-room    LoadBalancer   10.0.39.110    pending         443:3000/TCP    52s
...
default       wellroom-room    LoadBalancer   10.0.39.110    52.156.88.187   443:3000/TCP    52s

The last shell command shows our service as pending while Azure provisions an actual load balancer for us. Once this is done (the -w parameter watches for changes to the resource), we’ll see the external IP address where we can finally reach our mock service. So, if we browse to https://52.156.88.187, we can see our publicly accessible API!

The final folder structure looks like this.

wellroom-room/
├── Dockerfile
├── README.md
├── api
│   ├── business_controller
│   │   └── rooms.py
│   └── tech_controller
│       └── health.py
├── app.py
├── deployment.yaml
├── requirements.txt
├── service.yaml
└── swagger.yaml

There is always more: what we did cover and what not.

While we did cover a lot of topics in this post, we also skipped a few. For example, our service now has a dynamic IP address. As soon as we want to modify the service (i.e. the service.yaml, not the deployment.yaml), the public IP address is going to change. Usually we either want a static IP and/or a DNS name. Other mandatory topics we skipped in this run were things like X-API-Keys, token mechanisms, certificate authentication with ingress etc. Nevertheless, we

  • learned how to create an Azure Kubernetes Service, an Azure Container Registry and how we can connect them.
  • We accessed the cluster on the console of our choice and
  • containerized our mock application and made it Kubernetes-ready.

Sneak peek at Part III

We did an initial set-up, but a lot of the steps were rather manual. The next post focuses on automating the build and release process. We are going to learn how Azure DevOps Pipelines integrate with Kubernetes and how we can use Kubernetes namespaces as environments for our staging from development through testing to production.

Building bridges is hard

In a world where everybody builds distributed systems (on a small scale: for microservice-based systems; on a larger scale: for a global market) and time to market is a key success factor, we face the challenges of parallel development and synchronization across teams and products. While agile frameworks like SCRUM address some of the arising topics (e.g. having a unifying definition of done for multiple teams), there are other means to tackle the complexity.
But before we jump to a solution, let’s take one step back and review one of the typical issues:

  • One team or company builds an awesome API and another team integrates it into their product.
  • The second team must wait for team one to have a working API before they can start their integration work. However, even after they waited for team one to finish, they face major problems during their development.

These could occur while working with the API directly but might as well be totally unrelated (e.g. when trying to automate an unclear business process). The reasons don’t matter that much when the second team’s product arrives on the market too late and has lost its worth.
We (i.e. software engineers) have solved these kinds of problems on other layers already. When building a database-heavy application, we start with the data model (+ database) and then build the user-facing application and the data-oriented pipelines, optimizations etc. in parallel – instead of developing sequentially. We call this database first.


So, let’s talk about API first…

The idea is not complicated: two (or more) teams design an interface collaboratively. Usually, this is achieved by a proposal from one team followed by discussions. The result should be an interface description that all parties agree on. In our case we are going to use the OpenAPI specification (or Swagger if you don’t look too closely). We can create an example for our case here: here or here.
Disclaimer: There are a lot of paid versions out there, but these two are free and work without registration. So, they are either nice or steal your data.

For the example in this post we want to create an API for a room-booking service (think of it like the Outlook room booking feature for meetings). We’ll start with two endpoints:

  • GET /rooms gives a list of all available rooms and
  • GET /roomdetails provides more specific information for a given room

For good measure we throw in a technical service endpoint with

  • GET /health that provides a simple ‘ok’ in case our service is up and running.

Using one of the API designers and only minimal features of OpenAPI, we end up with our swagger.yaml:

swagger: "2.0"
info:
  description: 'a simple room booking service'
  version: 1.0.0
  title: room booking service

basePath: /v1

schemes:
  - http
consumes:
  - application/json
produces:
  - application/json

paths:
  /health:
    get:
      operationId: ''
      responses:
        '200':
          description: 'get service health'
  /rooms:
    get:
      operationId: ''
      responses:
        '200':
          description: 'get all rooms + meta data'
  /roomdetails:
    get:
      operationId: ''
      responses:
        '200':
          description: 'get details for a room'

Before we implement our (mock) API, we create our webservice with a few lines of code (app.py):

import connexion

app = connexion.App(__name__,
                    options={"swagger_ui": True})
app.add_api('swagger.yaml')
application = app.app

if __name__ == '__main__':
    app.run(port=3000, server='gevent')

Before we explore what we see here, we’ll install our dependencies first. There are multiple ways to do this in python, but for the easy way we use pip and our requirements.txt. For people unfamiliar with python, this is like npm & package.json for Node.js or Maven & the pom.xml in Java.

connexion[swagger-ui]
gevent

Connexion is an API-first framework built in Python by Zalando (open source). It leverages popular technology components like Flask and OAuth2 and integrates various industry standards to speed up (microservice) development.

Gevent is a python networking library that is used as a WSGI server in this case. For our purpose we do not need more information on this, but if you want to dig deeper, check out this excellent blogpost.

Now, let’s come back to our code! As you can see, we create a new connexion.App(...) with the swagger ui enabled and add an API (reference) that the app exposes (our previously created swagger.yaml). We are pretty close to running our API-first application.

Connecting the dots

For the last steps, we need a mock implementation of our endpoints. In a real-world example, you would have defined responses in your swagger specification already, and a variety of tools can generate server mocks from those. However, as we have a slim example, we create this without any tooling help.
Considering our current project structure, we’ll just add two files for our API controllers. We create an api/ folder with two subfolders for our "business" logic and our technical health-endpoint (rooms.py and health.py):

app.py
requirements.txt
swagger.yaml
api
|-- business_controller
|   `--rooms.py
|-- tech_controller
    `-- health.py

Simple mock-returns for rooms.py and health.py:

# api/business_controller/rooms.py
def get_room():
    return ["Kallista", "Io"]

def get_room_details():
    return [{"name": "Kallista", "space": 3}, {"name": "Io", "space": 4}]

# api/tech_controller/health.py
def get():
    return 'ok'

Ok, let’s review what we got so far…

  • We have an API specification (at this point another team could start their work).
  • We created a lean python webserver that exposes our API from our specification.
  • Lastly, to actually run this, we coded a trivial stub implementation of our API.

The last thing we need to do is to connect our API routes to our controller functions. This is easily done by enhancing our swagger.yaml slightly (adding the path to our controllers as operationId):

swagger: "2.0"

info:
  description: 'a simple room booking service'
  version: 1.0.0
  title: room ms

basePath: /v1

schemes:
  - http
consumes:
  - application/json
produces:
  - application/json

paths:
  /health:
    get:
      operationId: api.tech_controller.health.get
      responses:
        '200':
          description: 'get service health'
  /rooms:
    get:
      operationId: api.business_controller.rooms.get_room
      responses:
        '200':
          description: 'get all rooms + meta data'
  /roomdetails:
    get:
      operationId: api.business_controller.rooms.get_room_details
      responses:
        '200':
          description: 'get detail for a room'

Explore and verify our API first mock service

Now let’s enjoy our service:

~/DEV/wellroom-room$ pip3 install -r requirements.txt
~/DEV/wellroom-room$ python3 app.py

We can now call our API at http://localhost:3000/v1/health or http://localhost:3000/v1/rooms with a tool like Postman or Insomnia, but we can also browse to http://localhost:3000/v1/ui/ and explore the API via the built-in swagger-ui in a browser.
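
A quick smoke test from the terminal works as well; based on the mock returns above, the calls should answer roughly like this (exact JSON formatting may differ):

$ curl http://localhost:3000/v1/health
"ok"
$ curl http://localhost:3000/v1/rooms
["Kallista", "Io"]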

Wrap up and what we didn’t cover

We learned about the benefits of API first and saw one way to approach the topic. We got a short glimpse of what python with connexion can do for us. From a tooling standpoint, we saw tools to create OpenAPI specifications.
The two main things we are missing at this point are the security component and quality assurance (e.g. testing + linting). The first part is a topic of its own, but one of the great benefits of using connexion is the native integration of standard mechanisms like OAuth2 and X-API-KEYs. For testing and linting there are great blogs, books and videos out there….😉

Follow up

This blogpost is part of a series.

  • The next post is going to focus on how to integrate such a microservice in an enterprise ecosystem:

    • Dockerize the application,
    • use Azure Container Registry to store our application,
    • create and access an Azure Kubernetes Service and finally
    • deploy our service in this Kubernetes cluster.
  • The third post shows how to enforce improvement with continuous delivery and/or continuous deployment for our room service:

    • leverage Azure DevOps’ Build and Release Pipelines for our service and
    • integrate our Kubernetes cluster as environments in Azure DevOps.

R is a powerful tool for data scientists but a huge pain for software developers and system administrators.
It’s hard to automate and integrate into other systems. That’s because it was developed mostly in academic environments for ad-hoc analytics instead of in IT departments for business processes.
As every meal gets better if you gratinate it with cheese, every command line tool gets better if you wrap it in a PowerShell Commandlet.

PSRTools

You can do a lot with the R command line tool Rscript.exe, but the developers of R seem to have a strange understanding of concepts like standard streams and return codes. So I created the PowerShell module PSRTools, which provides Commandlets for basic tasks needed for CI/CD pipelines.

If you have build dependencies, you have to install an R package from a repository. You can specify a snapshot to get reproducible builds. An example is:

Install-RPackage -Name 'devtools' -Repository 'https://mran.microsoft.com/' -Snapshot '2019-02-01'

In case you have the package on disk, you can install it with the Path parameter.
In both cases you can specify the library path to install it to a specific location.

Install-RPackage -Path '.\devtools.tar.gz' -Library 'C:\temp\build\library'

If you have inline documentation in your R functions, you have to generate Rd files for the package. There is the Commandlet New-RDocumentation for that.
The build can be executed using New-RPackage. Both Commandlets have the parameters Path for the project location and Library for required R packages.

Build Script

The build script depends a little on your solution. It could be something like:


param (
    $Project,
    $StagingDirectory
)

$Repository = 'https://mran.microsoft.com/'
$Snapshot = '2019-02-01'
$Library = New-Item `
    -ItemType Directory `
    -Path ( Join-Path (
        [System.IO.Path]::GetTempPath()
    ) (
        [System.Guid]::NewGuid().ToString()
    ))

Import-Module PSRTools -Scope CurrentUser
Set-RScriptPath 'C:\Program Files\Microsoft\R Open\R-3.5.2\bin\x64\Rscript.exe'

Install-RPackage -Name 'devtools' -Repository $Repository -Snapshot $Snapshot -Library $Library
Install-RPackage -Name 'roxygen2' -Repository $Repository -Snapshot $Snapshot -Library $Library

New-RDocumentation -Path $Project -Library $Library
$Package = New-RPackage -Path $Project -Library $Library
Copy-Item -Path $Package -Destination $StagingDirectory

For build scripts I recommend InvokeBuild, but I wanted to keep the example simple.

Continuous Integration

To achieve automated builds, you have to integrate the build script into a CI system.
I tested it successfully in AppVeyor and Azure Pipelines. Both work pretty much the same.

If you don’t have your own build agent that you can prepare manually, you need to install R and RTools in your build script.
That can be done using Chocolatey.

In your appveyor.yml it looks like this:

install:
  - choco install microsoft-r-open
  - choco install rtools
build_script:
  - ps: .\Build.ps1 -Project $env:APPVEYOR_BUILD_FOLDER -StagingDirectory "$env:APPVEYOR_BUILD_FOLDER\bin"

In your azure-pipelines.yml it looks like this:

steps:
- script: |
    choco install microsoft-r-open
    choco install rtools
  displayName: Install build dependencies
- task: PowerShell@2
  inputs:
    targetType: 'inline'
    script: |
      .\Build.ps1 -Project '$(Build.SourcesDirectory)' -StagingDirectory '$(Build.ArtifactStagingDirectory)'

Not that terrible anymore, right?

As software engineers in fast-paced projects, we risk getting overwhelmed by huge workloads and deadlines. Usually the work is fun, and that is why we can cope with it over longer periods, but we need moments to breathe and think about other things. I do not want to talk about general work-life balance here; I want to talk about the work portion. We need to take breaks or do some light activity (like taking a short walk in the nearest park). Another thing we have found refreshing is retrospectives! Obviously their objective is to improve processes, increase efficiency and discover impediments (check the SCRUM GUIDE, p. 14), but they are also fun during times with high workloads. Therefore, if you feel like skipping them to have more time that is "productive" – resist.

Shout out

Following are some ideas on how to make your team’s retrospective a bit more exciting. Be careful that they fit your team. Do not force your colleagues to do what they do not want to, especially as the following ideas might be a bit too cheesy.
A shout out to the various sites on the web from which we took ideas and inspiration.

In addition, an extra shout out to my colleague Harald Wittmann for coming up with some great ideas.
Without further ado, here are ideas for retrospectives to inspire your team.

Visual and playful

Super hero

Everybody likes superheroes, and so do we. It does not matter if you are a fan of the MCU, of DC, or if you feel indifferent about Superman, Thor & friends; in this retrospective, the hero is just a vessel. They help the team to jump out of character and enable you to see your processes from a different angle.


Superhero retreat: Visualize your main base here. In our case, it is called "the engine tower", because there is "engine" in our project name. In your case, this should represent your project. It is the place where you feel comfortable, a safe space. Here you should collect what you did great, what your strengths are and what you can rely on.
Sidekick: Every hero has someone to help them in dangerous times. In the retrospective, this is where you can call for help: write down where you need support. An example could be, "we need more time for research on a topic before we start implementing".
Gadgets: Imagine a belt with all your tools for converting ideas into software. What tools/skills can you improve or add to your belt?
Villain: This one is easy – what are the obstacles in your way to achieving your objectives? Write down what could interrupt and/or destroy your goals.

The three little pigs

Play your team through the fable of the three little pigs. This is visual and just a tiny bit childish, which enables you to jump out of role (again). A piglet in a fairy tale can say completely different things than a senior software engineer. In addition, this retrospective is easy to understand.


The three huts (straw/sticks/brick) visualize things that can slip/rip easily, things that work but can be improved, and things that the team can rely on. Finally, we have the Big Bad Wolf. Like the villain previously, he represents risks, obstacles and impediments.

The boat

As you might realize by now, we are a visual team. For us, the biggest benefit is that you view things from a different point of view and say things you would not have thought of or talked about otherwise.


The island is your goal. It is the ultimate objective. Be careful, we are talking process goals not software features or releases.
The wind is what gets you there. Collect what makes you efficient and helps you to continue and focus your efforts.
In contrast, post things that delay you near the anchor. Bigger risks (i.e. not just slowing, but also stopping your work) would find their place near the rocks.

The box

The box follows the same approach as the previous ideas, but in a more basic and easily understandable way. Draw a box. It represents your current environment/toolbox. Collect what is in there (keep/recycle), what should be in the box but is not yet, and what you want to kick out.


If you want to, you could try this retro with a physical box.

The traditional

While the visual ideas are great, they require some set up and preparation. That is why we mix in more traditional retrospectives, too.

Liked learned lacked

Like the following ones, this follows the typical retrospective pattern, where you collect what was good, bad and what can be improved.


Liked: What went well, what was new that improved your process?
Learned: When something was harder than expected or took longer than estimated, try to derive what you learned from that so you do not repeat the mistake.
Lacked: What are things that slowed your progress? What are things you want to get better in? An example could be "degree of automation (around product tooling)".

Sad mad glad

The same approach as the liked – learned – lacked.


What made you sad last sprint (what kept your efforts below expectations) or even mad (things that made parts of the sprint fail)? Do not forget to collect what went well (glad).

Start Stop Continue

A mix of the simpler retrospectives and "the box". Sadly, it does not rhyme like the other ones.


Start: Things you could improve, e.g. planning in more time for the presentation of concepts/mock-ups to validate them with the business colleagues.
Stop: Things you should stop doing in the next sprint, e.g. letting requests grow too big.
Continue: Things that went well and that you should not forget about, e.g. picking a topic and discussing it in a two-person session while you walk through the nearby park.

I Wish / I Like / If everything was possible

This focuses on "I wish" (what could have been better, what is the team/process lacking) and "I like" (what did we do well, what learnings occurred during the sprint) at first.



Then we add an element we had in previous ideas (e.g. what would be your superpower…): "if everything was possible…". This enables us to dream and speak openly. While these things are utopian, you can try to take a step in their direction. An example could be, "If everything was possible, I would only write clean code". This could show that I had trouble with this in the past: either I did not have enough time to refactor a section, lacked knowledge about clean code, or something along those lines. Now we can derive actions from this, like looking for a book or a training, running a session on typical code smells, estimating more time for refactoring, planning in more code reviews…
In the usual work environment, I would not just say, "Hey folks, I write dirty code" (well, maybe…).

Other Ideas

I hope that you found some inspiration reading this. To top it off, here are two additional tips:

  • Before the retrospective, go back to your backlog and write down what you achieved during the last sprint. Then, at the start of the meeting, just list the topics and challenges you overcame. This helps the team to think back and remember things they want to bring up in the retrospective. For us, a morale boost helps with such a highly creative artifact as a scrum retrospective.
  • Try to visualize your team structure somehow. This could be a football field or the ship we saw previously. What position would everybody take? (Think about captain, on board scrubbing the deck…)

Do you have ideas for retrospectives? Leave a comment or send me an email!

In data analytics, there are a lot of nice and shiny buzzwords, products and concepts. Before you decide anything, you should be clear about your actual and future needs. Your analytics infrastructure should enable you to analyze data. But there are aspects that different architectures support differently, and you have to make trade-offs. There is no free lunch.

Here are some explanations that should help you figure out what you need and enable you to compare different approaches.

Data Location

It’s important where the data is located. You can leave the data on the source system and read it for each analysis, or you can copy the data to an analytics system.

Pros of leave and read are:

  • You will save space.
  • You will always have the most recent data.

Pros of copy are:

  • You may get fast analyses, since you can optimize the storage for analytics.
  • You maintain fast operations, since you don’t read data that the source system wants to use at the same time.
  • You get consistent results, since you control the updates.
  • You can implement historization and don’t lose information.

Usually the data is copied, but if the volume is big enough or you don’t have the resources you may want to leave it on the source.

Data Structure

How, or whether, your data is structured has a significant impact on your analytics infrastructure. There is structured data, semi-structured data and unstructured data. Structured data is, for example, a relational database. Semi-structured data contains information on how to separate values and identify structure. Examples of tabular semi-structured data are simple Excel sheets or CSV files; examples of hierarchical semi-structured data are XML or JSON, which are often used by web services and APIs. Examples of unstructured data are images, PDFs or plain text.

Analytics is all about reducing information so it can be consumed by humans or by processes that humans create and understand. Reduction requires structured data. Semi-structured data can be transformed into structured data. Unstructured data may be transformable into structured data, but not always, not that easily and not error-free. Avoid Excel, PDF or plain text as data sources whenever possible.

Data Transformation

Data from different sources may be hard to combine, since there is no common identifier or the formats differ. There are again two approaches: "Schema on read" and "Schema on write". "Schema on read" means you leave the data in its raw form and transform it when you analyze it. "Schema on write" means that when you write the data, you transform it into a common format. You may change the format and the data types, normalize it and deduplicate it.

Pros of "Schema on read" are:

  • You may save effort on integrating new data.
  • The raw data may contain more information than the integrated data.

Pros of "Schema on write" are:

  • Analyses using integrated data take less effort, both in development and in computation.
  • Investment in quality pays out more, due to the reuse of transformed data.
  • Preaggregated data will speed up analyses.

Data Volume

It’s hard to say what data volume counts as big, but either way it should have a big influence on your individual solution. In analytics systems, performance is often bought with redundancy. Results that are used several times or need reduced latency are precalculated and stored.
So one piece of information is stored many times in different ways. That’s performance-efficient but not storage-efficient.
Obviously that may be a problem with a lot of data.

Usually, automatically generated data, e.g. from sensors or logging functions, comes in high volumes; manually generated data, like orders in your ERP system or master data, usually does not.

Data Velocity

Data velocity means how much time elapses from data generation to analysis. High velocity may come along with other restrictions or increased effort.

Most common are scenarios with updates on a daily basis. For regulatory supervision it may be enough to update your data once a quarter.

An often misused term in this topic is real-time. If your system is real-time capable, it means that you guarantee a result within a specified time. That’s important, for example, in embedded systems in automotive or industrial environments. In a business context, real-time is used to mean best-effort latency and is only necessary in special cases. Imagine a manager who makes decisions based on reports that are updated and changed every 5 minutes. That holds the risk of reacting to random events instead of pursuing a strategy. It may be different in a cloud-based application scenario, where you want to scale the system up or down based on usage, or in a process that changes prices in an e-commerce scenario based on recent sales.

Conclusion

You see, if someone tries to sell you something without listening to your requirements, there is a good chance you end up with something that does not deliver what you need or that could have been accomplished with less effort.

As IT consultants, we try to solve problems on a daily basis. This is our normal workload, our daily business. But this is not our only duty: we need to keep up with technical evolution and learn continuously to satisfy our customers. This is why we read about new things in blogs, visit meetups in our free time and go to conferences (like the talk this post is inspired by, "down to earth architecture" by Uwe Friedrichsen at SAS 2019 in Munich). We are influenced by all these channels and need to be careful how we use that knowledge in our working environments, or we end up with one of these stereotypical types of bad software architecture:

Stackoverflow architecture (or google-driven architecture)

We have a problem to overcome in our software system and are not familiar with the topic. Therefore, we search the internet for books, blogs or tutorials. We find a slightly related solved problem on Stackoverflow and copy the solution without much thinking.
We are not talking about copy-pasting code here, but rather abstract solutions like "where should ids be generated in CQRS". We do not want to downplay the knowledge found on Stackoverflow, but we should be sure the solution we found actually fits the problem and/or adapt it accordingly.

Conference-driven architecture

Whatever conferences you visit, you always feel attached to your track or topic. These could be things like microservices, domain-driven design or EventSourcing. While these are very good solutions to their respective problems, they might solve problems you aren’t even encountering in your domain, or there may be other good solutions.
Additionally, most of the time we are not starting an application from scratch. If we visit conferences regularly and always incorporate the hot topics, we end up with a mess after some time.

Hype-driven architecture

Similar to the conference-driven architecture, we find the hype-driven architecture. Every (new) application needs to be distributed into microservices. Of course, that’s not true. There are huge benefits in following a microservice (or SCS) approach, but there are also challenges, constraints and problems! Learning and especially applying a framework is often useful. However, you should not force a framework onto your system if there is no need for it! Most of the time, learning how to solve your domain’s problems (e.g. how to handle consistency in distributed systems or how to master personal data and GDPR) is more beneficial than being a master of a framework.

Strategic architecture (aka PowerPoint architecture)

Usually, when you join a project there are some PowerPoint slides describing the architecture of the system or application. You go through these, but your colleagues advise against doing so: "these are for compliance" or "we made this for the latest steering committee". When the slides diverge too much from the actual structure or code, misunderstandings are bound to happen! While there are reasons to display different aspects of your software to different stakeholders, try to minimize this.

Tunnel-vision architecture

As a software engineer or architect, you need to work on some topics in excessive detail. We need to build walls around us and analyze problems in-depth! Occasionally, we need to look around, too. With more experience, we learn to balance the extremes. Especially for younger developers, there is a risk of over-engineering one detail or creating problems on other ends of the system.

Blast-from-the-past architecture

Technology advances, business models evolve and the underlying software architecture needs to do so, too. There are challenges that a lot of software components face; an example is versioning of web APIs. A versioning concept for system-to-system APIs with /v1/, /v2/, /v3/ might work for applications that had a release once a month and a breaking change once a year, but it probably won’t work for a fast-paced API in an API economy where time-to-market is a driving factor.

Big design up front

In a world with perfect information, where all user needs and every aspect of your system are clear, Big Design Up Front (BDUF) could work. BDUF is closely related to the waterfall approach of developing software. This clashes with the agile world. Similar to communism and capitalism, BDUF and agile development are two paradigms where neither is inherently bad or good – it’s just that one is more practical in real life. Especially in a fast-moving world where innovation is key, agile development won the battle and there is no place for BDUF architecture.

One-size-fits-all architecture

Develop your application as a polyglot, domain-driven micro-service architecture with CQRS and EventSourcing. Use Kubernetes as the container orchestrator with Helm for deployments, Prometheus and Grafana for monitoring and Git as the source control system. The frontend is Angular, machine learning is done in Python, and we use Mongo and Cassandra for persistence. Caching is done through Redis, and the whole application needs to be cloud-agnostic and conform to all cloud-native principles. While this is a noble approach and a turn-on for software engineers, it might not suit our business needs in any way. We could solve many problems with this technology selection, but we are likely over-engineering and not optimizing our efforts.

Accidental architecture

Remember the cone of uncertainty? When you start developing a product, almost everything is blurry. You don’t know the user needs; you don’t know the scale of your application and so on. At this stage, you might not be able to find solutions to some problems because you cannot answer essential questions. At this point, you need to act accordingly! Work with interfaces, adapters and libraries that can easily be switched later, or don’t put too much effort into components that you will either replace later or reimplement in a more sophisticated version anyway.
Don’t just "do it" or you will end up with a mess of decisions that nobody wanted to make. Another way accidental architecture happens is when the development team is either unaware of the key issues or too inexperienced to identify them.

How do we make sure not to end up with one of these? I’ll look for a more detailed answer in another article, but it boils down to this: we should ask why we need architecture in the first place.
We have requirements, constraints, problems etc. We figure out solutions (for example with an approach like "orient – explore – evaluate – support" from Uwe Friedrichsen). When we follow this path, we protect our systems from the types of bad architecture above. And if, however unlikely it seems, we still end up with an architecture similar to one of the ones above, that’s fine: we engineered it with the right intentions. Additionally, learn when not to use certain solutions and follow Uwe Friedrichsen’s advice:

  • Think holistically
  • Resist hyper-specialization
  • Get a T-shape profile
  • Leave your comfort zone once in a while
  • Understand your domain
  • Don’t fall for hypes
  • Cope with technology explosion
  • Master the foundation design
  • Don’t overact

I recently had the task of automating a program with a COM interface and integrating it into a database application.
I had already used PowerShell to automate Docker, SqlPackage and other tools.
So my first thought was to use PowerShell in this case too, but due to the complexity of the task I decided against it.
I ended up with a C# solution of roughly 300 classes and I’m happy with it, but it brought me to the question of what good criteria are for choosing between PowerShell and something else.

Basically, PowerShell is a nice hammer, but not every problem is a nail.
And since almost every programming language is Turing complete, you can solve every problem in every language, but each has its pros and cons.
This is especially true for .NET-based languages like C#, F# and PowerShell, since they share the same libraries.
So you can develop a nice graphical user interface using Windows Presentation Foundation in PowerShell, even though it was originally designed for C# applications.
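To make that concrete, here is a minimal sketch of a WPF window built directly from PowerShell; the window title and button content are just placeholders:

Add-Type -AssemblyName PresentationFramework

# Build a simple window with a single button
$window = New-Object System.Windows.Window
$window.Title  = 'Hello from PowerShell'
$window.Width  = 300
$window.Height = 150

$button = New-Object System.Windows.Controls.Button
$button.Content = 'Close me'
$button.Add_Click({ $window.Close() })

$window.Content = $button
$window.ShowDialog() | Out-Null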

There are many problems out there that you can solve in PowerShell with less code than in other languages, which makes the solution faster to write and easier to maintain.
But now I will show you some cases where PowerShell is a little painful and other languages are a better choice.

Inheritance and Polymorphism

PowerShell is object-oriented, and where there is object orientation, polymorphism is never far away. So you define interfaces and maybe several implementations of them.
But since PowerShell is an interpreted language, there is no type or interface checking before runtime.
By default, everything is of type Object, and you only find out whether a method is available when you execute the code.
You can assert types, but you don’t have to.
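As a small sketch of what that looks like in practice (the variable and function names are made up for illustration):

# Without a type constraint the variable accepts anything
$value = 42
$value = 'now a string'      # no complaint

# With a type constraint PowerShell enforces the type at assignment time
[int]$count = 42
# $count = 'ten'             # would throw: cannot convert "ten" to System.Int32

# The same works for function parameters
function Get-Square {
    param([int]$Number)
    return $Number * $Number
}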

There are different ways to get new objects in PowerShell.
Often they are created in Commandlets that are written in C#, like Get-Process or New-Item.
Another common option is to create a custom object using New-Object -Type PsCustomObject -Property @{ 'Foo' = 'Bar' }.
That creates a generic object that can be extended with any property or method at runtime.
Another option is to create the object in PowerShell but write the class definition in C#.
You can do that from existing .NET libraries or even at runtime:
store the C# code in a variable and add the classes with Add-Type.
Those were the options up to PowerShell version 4; in version 5, classes were introduced.
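To make these options a bit more tangible, here is a rough sketch of all four side by side; the class, property and variable names are just examples:

# 1. Objects returned by a Commandlet written in C#
$process = Get-Process -Id $PID

# 2. A PsCustomObject that can be extended at runtime
$item = New-Object -TypeName PsCustomObject -Property @{ 'Foo' = 'Bar' }
Add-Member -InputObject $item -MemberType NoteProperty -Name 'Baz' -Value 42

# 3. A C# class definition compiled at runtime with Add-Type
$source = @"
public class Greeter
{
    public string Name { get; set; }
    public string Greet() { return "Hello " + Name; }
}
"@
Add-Type -TypeDefinition $source
$greeter = New-Object Greeter
$greeter.Name = 'World'
$greeter.Greet()

# 4. A PowerShell class (version 5 and later)
class Point {
    [int]$X
    [int]$Y
    [int] Sum() { return $this.X + $this.Y }
}
$point = [Point]::new()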

All these methods have their reasons.
Let me explain that using some questions:

  • Why would you want to create a PowerShell class, if you can use a PsCustomObject?
  • Why would you want to create a PsCustomObject, if you can use a Hashmap?

Commandlets are the default if you want to use existing PowerShell modules.
Hashmaps are the default if you need custom attributes in an object.
But if you want to pass data to existing Commandlets, for example to write it to CSV files, then the easiest way is to use a PsCustomObject.
If you write your own functions that require parameters with specific properties and methods, then it’s better to define a class, which can easily be validated (see the sketch below).
The next level of complexity comes when you write a function whose parameters may be of one type or another that share the same interface.
Then you start thinking about abstract methods and code reuse between these classes.
Here, C# offers more language constructs to simplify the code, and compile-time validation improves the quality.
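To illustrate the PsCustomObject and class cases, here is a minimal sketch; the file name, class and function are invented for the example:

# PsCustomObjects are the easiest way to feed data into existing Commandlets like Export-Csv
$rows = foreach ($i in 1..3) {
    [PsCustomObject]@{ Id = $i; Name = "Item$i" }
}
$rows | Export-Csv -Path '.\items.csv' -NoTypeInformation

# A class makes the expected shape of a parameter explicit and validatable
class Customer {
    [string]$Name
    [string]$Email
}

function Send-Welcome {
    param([Customer]$Customer)    # anything that is not a Customer is rejected
    "Welcome, $($Customer.Name) <$($Customer.Email)>"
}

Send-Welcome -Customer ([Customer]@{ Name = 'Ada'; Email = 'ada@example.com' })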

So if you start to create inherited classes in PowerShell, you have probably gone too far.
Maybe it’s better to write a PowerShell module or a DLL in C# and include it in your PowerShell code.
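Pulling such a compiled library into a script is a one-liner; the assembly, module and type names below are purely hypothetical:

# Load a compiled C# assembly and use its classes from PowerShell
Add-Type -Path '.\MyCompany.Tools.dll'            # hypothetical assembly name
$worker = New-Object MyCompany.Tools.Worker       # hypothetical class from that assembly

# Or, if the C# project is packaged as a binary PowerShell module:
Import-Module '.\MyCompany.Tools\MyCompany.Tools.psd1'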

Concurrency and Parallel Computing

Since PowerShell supports using the System.Threading library of .NET, you can do multicore computation in PowerShell.
In some cases this is not even a bad idea.
A common case where PowerShell is used is automation and integration of other tools.
For example, running a compiler, calling a web service and so on. These tools may produce output that you want to process while the tool is still working.
Sometimes you even have to, because otherwise the output buffer would overflow and you would not get the entire output.
In that case you can define a PowerShell script block as a variable and register it as an event handler.
But there are other cases where you have several parallel processes that need to be synchronized somehow and that may communicate with each other. Then C# or F# have better constructs to manage asynchronous calls.
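Here is a rough sketch of the event-handler approach for reading tool output while the tool is still running, using ping.exe merely as a stand-in for a long-running external tool:

# Start an external tool and consume its output asynchronously,
# so the output buffer never overflows
$psi = New-Object System.Diagnostics.ProcessStartInfo
$psi.FileName               = 'ping.exe'
$psi.Arguments              = 'localhost -n 5'
$psi.RedirectStandardOutput = $true
$psi.UseShellExecute        = $false

$process = New-Object System.Diagnostics.Process
$process.StartInfo = $psi

# This script block runs whenever the tool writes a line to stdout
$onOutput = { Write-Host "OUTPUT: $($EventArgs.Data)" }
Register-ObjectEvent -InputObject $process -EventName OutputDataReceived -Action $onOutput | Out-Null

$process.Start() | Out-Null
$process.BeginOutputReadLine()
$process.WaitForExit()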

Software development is hard. Sure, there are things that can make your life easier (e.g. containers or a ubiquitous language), but sadly there is "No Silver Bullet", as Frederick P. Brooks Jr. concludes in his 16-page essay. With our advances in technology, development becomes easier and faster. But some things may not bring the redemption we hoped for (like "automatic" programming or even OOP and the somewhat newer AI).
One of the more promising members of the redemption-club is the "Great Designer" (p.15) of the software system. They build software "faster, smaller, simpler, cleaner […] with less effort". Today, we call someone with the skillset described by Brooks a "software architect".

In 2019, I went to a great Summit in Munich where Trisha Gee (@trisha_gee) gave the keynote about the required skillset of a software architect. I want to share her insights, mixed with my own views, here:

Master of communication

The software architect is a master of communication. Obviously, this is not limited to verbal communication, but also includes writing skills. And writing does not stop at good programming and documentation skills: things that matter are e-mails, Slack and Twitter! Asking questions like "what are we building?" and "what skills does the team have?" is as important as listening to the answers and translating them into software.

"Your code does not speak to the machine. It speaks to the next one who reads it!"

Talk to different people. Talk to developers, domain experts and users. Try to get a feel for their problems, challenges and constraints within their domain.

Adaptability & open-mindedness

Be open-minded! There are a thousand views on any simple topic. Users and domain experts might change their minds rapidly, and technology and processes change. It is your job to order things, estimate the impact and derive actions.

It’s not the year of K8s!

No, Kubernetes, AI and agile development are not the magic solution to every problem. Always learn what’s needed.

Prioritization & time management

We all work in projects. There is always too much work for too few people – deal with it. Allocate time for yourself. Make a plan for your work, for time at home and absolutely free time. Mental health is an essential part of a "Great Designer". As an architect, your time is limited and valuable. You cannot learn everything, but try to keep up.

Stay technical

Most of the things up until this point are non-technical. But be careful; do not underestimate the "Business Analyst Movement". Trisha points out that especially women are pushed into non-technical and softer roles too often. Don’t become a PM, stay an architect.

Scale out

At some point in the history of software engineering, we understood that scaling out may be better than scaling up [Admiral Grace Hopper]. The same applies to great engineers. Instead of just getting better yourself, help others to get better.

If you want to be 10 times more productive, teach 9 people your skillset.

Use "pair programming" more often, but do not stop at development. Do it for deployments and troubleshooting with a DevOps engineer and for domain building with a business analyst. Code reviews and walkthroughs "are not for finding bugs only – they are about sharing information and writing the best system you can". If your company supports it, Trisha recommends 20% time. Another idea for sharing is a book club where five people read a book – one or two chapters each – and tell the others about the key information in their part. This way everybody can pick up a little knowledge and decide if it’s worth reading the whole thing.

"Nobody knows how good you are! Teaching makes you look good."

There are different ways of teaching and being taught. You can teach in internal, informal (or less formal) sessions during lunch time, or you can visit user groups and speak at conferences. As usual, there are pros and cons for each type. Decide what’s best for you.

If you don’t like sharing with strangers, share with your colleagues. This way you avoid overly narrow specialization and knowledge silos.

Retention and recruitment

Being a good architect means finding new projects and interesting topics in your environment. That is the easy part. Also, watch out for new colleagues and keep your team(s) happy! Be a good role model, a paragon of a great designer.

Community support

We love Stack Overflow! We visit conferences and we gather at meetups. You cannot explore every technology yourself – especially not as an emerging architect. You need to consume what the community provides, but you also need to give back. You can talk about the personal challenges you faced when your first big project failed, or you can commit to an open-source project: maybe there are easy enhancements for your favorite JavaScript library, or you could build a Python wrapper for a public REST API.
Do you like Goldman Sachs? Probably. But aren’t they an evil banking company? Probably. Nevertheless, their developers are avid contributors to Java libraries. They published their enhanced version of the Java collections (called GS Collections) and influenced a lot of things like the Java Stream API.
The same thing goes for Microsoft. They open-sourced their .NET Core platform as part of the .NET Foundation and publish the code of the best IDE ever created on GitHub.