Why you really need a private nuget feed

So I have to be honest, I’m the kind of person who is always looking for a better way to do something, and it probably speaks volumes about my chosen career that I am that way. One of the reasons I’m so passionate about technology is that, on a foundational level, it is supposed to change and improve our lives.

To that end, I’ve been having a lot of conversations lately around development and managing technical debt, sprawl, and more generally how to have more control over the major enterprise applications being built.

One of the biggest changes to coding, I would argue since the idea of Object-Oriented programming, has been the advent of the micro-service. This idea that I can build out a swarm of smaller services and leverage all these elements to do much larger tasks, and scale them independently has fundamentally changed how many of us look at Software Architecture and Solution development.

But with this big step forward, all too often I see a big step backward as well.

And one of the ways this regression into bad habits happens is that development becomes the “Wild West”: everyone runs off building individual micro-services with no regard for operations or manageability, which ultimately leads to a myriad of services that are very difficult to maintain.

For example, all services have several common components:

  1. Configuration Management
  2. Secret Management
  3. Monitoring / Alerting
  4. Exception handling
  5. Logging
  6. Availability Checking
  7. Health probe endpoints

And if these elements are not coordinated, what can happen is that the solution becomes very difficult to maintain: if you have 100 services in the application, you could potentially have 100 different ways configuration is managed. This is a nightmare, to be honest. The good news is that this problem has already been solved, with npm and nuget!

One of the key benefits of leveraging a custom nuget server is that you gain premade application components that developers can snap together like Lego blocks, increasing their productivity and giving them a jump start on every new service they build.

Now most developers I know love to reuse things; none of us got into this field to rewrite code for the hell of it.

Now if you create nuget packages around these types of operations (and others you will identify), and hook them up to their own CI/CD process that publishes to an artifact feed (a sample pipeline sketch follows the list below), you gain several benefits:

  1. Developers stop reinventing the wheel:  You gain the productivity of being able to plug and play with code. You basically get to say “I’m building a new service, and I need X, Y, and Z…”
  2. We get away from “It works on my machine”:  All dependencies are bundled, so you can get to a container that is more cleanly implemented.
  3. You gain more change control and management:  By versioning changes to nuget packages, and leveraging pre-release packages appropriately, we can roll out changes to some services and easily see which services are consuming older packages. Ultimately we can manage the updates of each service independently, while taking the burden of tracking that off the developer and onto a service.
  4. Updates to shared code become easier, almost an afterthought:  Stop me if you’ve heard this one before. You are performing updates to a service that leverages an older piece of shared code in which someone discovered a bug. You are focused on that bug, don’t think about the shared code that should be updated at the same time, and it falls through the cracks. When you finish, the new version of the service is deployed with the old shared code and no one is the wiser, until a later change enforces something and breaks the service. By leveraging nuget packages, it becomes standard practice to say “I’m starting work on this service, I select manage nuget packages, it shows me a list of packages with updates, and I update them all.”
  5. You gain more ability to catch things early, ahead of release, preventing delays:  A nuget feed makes it a lot easier to coordinate the changes being made to common code, and easier for developers to adjust. For example, if I’m working on Service A, and during our standup I find out that another member of my team has a pre-release of the nuget package I’m using, and both are going to prod around the same time, I can pull in the pre-release and make the changes now rather than integrating everything at the end and pushing last-minute changes.
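
To make that concrete, below is a rough sketch of what one of these package pipelines might look like in Azure DevOps. This is only an illustration: the project path and feed name are placeholders, and the exact tasks and versioning scheme will vary with your setup.

trigger:
  branches:
    include:
      - master

pool:
  vmImage: 'ubuntu-latest'

steps:
  # Pack the shared library into a .nupkg (project path is a placeholder)
  - task: DotNetCoreCLI@2
    inputs:
      command: 'pack'
      packagesToPack: '**/MyCompany.Shared.Logging.csproj'

  # Push the package to the private Azure Artifacts feed (feed name is a placeholder)
  - task: NuGetCommand@2
    inputs:
      command: 'push'
      packagesToPush: '$(Build.ArtifactStagingDirectory)/**/*.nupkg'
      nuGetFeedType: 'internal'
      publishVstsFeed: '{your feed name}'

From there, consuming services just reference the package version they want, and updating shared code becomes a package update rather than a copy-and-paste exercise.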

There are some great posts out there that get into the specifics of this and how to implement it.

MagicMirror on the Wall

So lately, my wife and I have been spending a lot of time organizing different elements of our lives. Specifically, Covid-19 has created a situation where our lives are very different from what we were used to previously, and it has caused us to re-evaluate the tips and tricks we use to organize our lives.

One such change has been the need to coordinate and communicate more with regard to the family calendar and a variety of other information about our lives. So I decided if we were going to bring back “The family bulletin board”, that I wanted it to integrate with our digital footprint and tools to make life easier.

After doing some research, I settled on MagicMirror, mainly because of articles like this. And I had a spare monitor, with a Raspberry Pi 3 lying around the office, so I figured why not. The plan was to implement the initial prototype and get it working in my office, and then invest in something better for my wife to help manage everything.

So I have to admit, this was surprisingly easy to set up and did not require much effort. I was even able to automate pushing updates to this device pretty easily. So if you are looking for a cheap option for getting something like this working, I recommend MagicMirror.

Here’s the current state of the PoC:

So as a walkthrough, here are the steps I went through:

  • Download Raspbian OS, and flash the Micro SD card using Etcher.
  • Put the Micro SD into the Raspberry Pi and boot it up. From there you will get a step-by-step wizard for configuring wifi and setting up passwords, etc.
  • From there, I executed the manual installation methods found here.

That was really it for getting the baseline out of the box. From there I had a nice black screen with holidays, date/time, and a few other widgets. The next step was to install the widgets I cared about. You can find a list of the 3rd party widgets here.

Now I wanted MagicMirror to load up on this monitor without me having to do anything, so I followed the steps here. I did make some additional modifications from here that helped to make my life easier.

The one problem I ran into was the process of updating the device. I could have just remoted into the device to execute updates to config files and other scripts, but for me this was frustrating; I really want to keep configurations for this kind of thing source controlled as a practice. So I ended up creating a private GitHub repo for the configuration of my MagicMirror.

This worked fine except for the fact that every time I needed to update the mirror, I had to push a file to the device, and then copy it over and reboot the device.

So instead, what I ended up doing was building a CI/CD pipeline that pushes changes to the MagicMirror.

So what I did was the following:

  • Create a blob storage account and container.
  • Create an ADO pipeline for my GitHub repo.
  • And add only one task to that pipeline:
- task: AzureCLI@2
  inputs:
    azureSubscription: '{Subscription Name}'
    scriptType: 'bash'
    scriptLocation: 'inlineScript'
    inlineScript: 'az storage blob upload --connection-string $BLOBSTORAGE -f configdev.js -c test -n configdev.js'

Now whenever I push an update to the master branch, a copy of the config file is pushed directly to blob storage.

Now came the problem of how to get the device to pull down the new config file. If you look at the instructions above for making MagicMirror auto-start, they mention an mm.sh file, which I updated to be the following:

cd ./MagicMirror
cd config
# Pull the latest config from blob storage (URL elided) and swap it in
curl -o config.dev.js https://...blob url.../config.js
cp config.dev.js config.js
cd ..
# Start MagicMirror on the attached display
DISPLAY=:0 npm start

Now with this update, the MagicMirror will pick up a fresh config on every restart. So all I have to do is “sudo reboot” to make it pick up the new version.

I’m going to continue to build this out, and will likely have more blog posts on this topic moving forward. But I wanted to get something out about the beginning. Some things I’ve been thinking about adding are:

  • Examine the different types of widgets
  • Calendar integration
  • Google Calendar integration
  • To Do Integration
  • Updating to shut off at 8pm and start up at 6am
  • Building ability to recognize when config has been updated and trigger a reboot automatically

Adding Application Insights to Azure Functions

So I wanted to share a quick post on something that is rather small but I was surprised at how little documentation there was with regard to how to implement it.

Monitoring is honestly one of the most important things you can do with regard to cloud applications. There’s an old saying, “what gets measured…matters.” And I find this expression to be very true in Application Development.

So if you are building out an Azure Function, what steps are required to enable Azure Application Insights for your Functions? For this post I’ll be focusing on adding it to a .NET Core function app.

Step 1 – Add the nuget packages

No surprise, it all starts with a nuget package: you are going to want to add “Microsoft.ApplicationInsights.AspNetCore”.

Additionally, you are going to need the “Microsoft.Azure.Functions.Extensions” package.
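
If you prefer the command line, both packages can also be added with the dotnet CLI from the function project directory (versions are omitted here so the latest compatible versions are pulled):

dotnet add package Microsoft.ApplicationInsights.AspNetCore
dotnet add package Microsoft.Azure.Functions.Extensions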

Step 2 – Update the Startup.cs

Now if you haven’t configured your function app to have a Startup.cs file, then I’m of the opinion you’ve made some mistakes, because honestly I can’t overstate the importance of dependency injection for following recommended practices and setting yourself up for success. Once you’ve done that, you will be able to add the following line to the override of “Configure”, shown below:

using Microsoft.Azure.Functions.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection;

[assembly: FunctionsStartup(typeof(SampleProject.Services.Startup))]

namespace SampleProject.Services
{
    public class Startup : FunctionsStartup
    {
        public override void Configure(IFunctionsHostBuilder builder)
        {
            // Register the telemetry wrapper and the out-of-the-box Application Insights telemetry
            builder.Services.AddScoped<ITelemetryProvider, TelemetryProvider>();
            builder.Services.AddApplicationInsightsTelemetry();
        }
    }
}

The above code will use the “AddApplicationInsightsTelemetry()” method to capture the out-of-the box telemetry.

That leaves the outstanding question of capturing custom or application-specific telemetry. For that, I recommend implementing a wrapper class around the TelemetryClient. Personally, this is a general practice I always follow, as it removes hard dependencies in an application and helps with flexibility later.

For this, the “TelemetryProvider” is the following:

using System;
using System.Collections.Generic;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;

public class TelemetryProvider : ITelemetryProvider
{
    private TelemetryClient _client;

    public TelemetryProvider()
    {
        _client = new TelemetryClient(TelemetryConfiguration.CreateDefault());
    }

    public void TrackEvent(string name)
    {
        _client.TrackEvent(name);
    }

    public void TrackEvent(string name, Dictionary<string, string> properties)
    {
        _client.TrackEvent(name, properties);
    }

    public void TrackEvent(string name, Dictionary<string, string> properties, Dictionary<string, double> metrics)
    {
        _client.TrackEvent(name, properties, metrics);
    }

    public void TrackMetric(string name, double value)
    {
        _client.TrackMetric(name, value);
    }

    public void TrackException(Exception ex)
    {
        _client.TrackException(ex);
    }

    public void TrackDependency(string name, string typeName, string data, DateTime startTime, TimeSpan duration, bool success)
    {
        // Note: TelemetryClient.TrackDependency expects the dependency type name first
        _client.TrackDependency(typeName, name, data, startTime, duration, success);
    }
}

With an interface of the following:

public interface ITelemetryProvider
{
    void TrackDependency(string name, string typeName, string data, DateTime startTime, TimeSpan duration, bool success);
    void TrackEvent(string name);
    void TrackEvent(string name, Dictionary<string, string> properties);
    void TrackEvent(string name, Dictionary<string, string> properties, Dictionary<string, double> metrics);
    void TrackException(Exception ex);
    void TrackMetric(string name, double value);
}

After you’ve implemented this provider, it becomes very easy to capture telemetry related to specific functionality in your application and leverage Application Insights as a total telemetry solution.
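
As a quick usage sketch, here is one way the provider might be consumed once it is registered in Startup.cs. The function and event names below are hypothetical, and this is just an illustration of constructor injection with the wrapper:

using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;

namespace SampleProject.Services
{
    public class SampleFunction
    {
        private readonly ITelemetryProvider _telemetry;

        // Resolved by the DI container configured in Startup.cs
        public SampleFunction(ITelemetryProvider telemetry)
        {
            _telemetry = telemetry;
        }

        [FunctionName("SampleFunction")]
        public Task<IActionResult> Run(
            [HttpTrigger(AuthorizationLevel.Function, "get")] HttpRequest req)
        {
            // Record a custom event alongside the out-of-the-box telemetry
            _telemetry.TrackEvent("SampleFunctionInvoked",
                new Dictionary<string, string> { { "path", req.Path.ToString() } });

            return Task.FromResult<IActionResult>(new OkResult());
        }
    }
}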

Learning how to use Power BI Embedded

So, lately I’ve been doing a lot more work with Power BI Embedded, and discussions around the implementation of Power BI Embedded within applications.

As I’ve discussed, Power BI itself can be a complicated topic, especially when it comes to getting a handle on all the licensing. Look here for an explanation of that licensing.

But another big question is even then, what does it take to implement Power BI Embedded? What kind of functionality is available? The first resource I would point you at is the Power BI Embedded Playground. This site really is fantastic, giving working samples of how to implement Power BI Embedded in a variety of use-cases and giving you the code to leverage in the process.

But more than that, leveraging a tool like Power BI Embedded does require further training, and there are plenty of tutorials and online courses out there you might find useful.

There are some videos out there that give a wealth of good information on Power BI Embedded, and some of them can be found here.

There is a wealth of information out there and this is just a post to get you started, but once you get going, Power BI Embedded can make it really easy to embed amazing analytics capabilities into your applications.

How to pull blobs out of Archive Storage

So if you’re building a modern application, you definitely have a lot of options for storage of data, whether that be traditional database technologies (SQL, MySQL, etc) or NoSQL (Mongo, Cosmos, etc), or even just blob storage. Of the above options, Blob storage is by far the cheapest, providing a very low cost option for storing data long term.

The best way though to ensure that you get the most value out of blob storage, is to leverage the different tiers to your benefit. By using a tier strategy for your data, you can pay significantly less to store it for the long term. You can find the pricing for azure blob storage here.

Now most people are hesitant to leverage the archive tier because the idea of having to wait for the data to be rehydrated has a tendency to scare them off. But it’s been my experience that most data leveraged for business operations has a shelf-life, and archiving that data is definitely a viable option, especially for data that is not accessed often. I would challenge most people storing blobs to capture metrics on how often their older data is actually accessed. When you compare the need to “wait for retrieval” against the cost savings of archive, in my experience the balance tends to lean heavily towards leveraging archive for long-term storage.

How do you move data to archive storage

When storing data in Azure Blob Storage, the process of uploading a blob is fairly straightforward, and all it takes is setting the access tier to “Archive” to move that blob to archive storage.

The below code uploads a local file to blob storage and then sets it to the Archive tier:

var accountClient = new BlobServiceClient(connectionString);

var containerClient = accountClient.GetBlobContainerClient(containerName);

// Get a reference to a blob
BlobClient blobClient = containerClient.GetBlobClient(blobName);

Console.WriteLine("Uploading to Blob storage as blob:\n\t {0}\n", blobClient.Uri);

// Open the file and upload its data
using FileStream uploadFileStream = File.OpenRead(localFilePath);
var result = blobClient.UploadAsync(uploadFileStream, true);

result.Wait();

uploadFileStream.Close();

Console.WriteLine("Setting Blob to Archive");

blobClient.SetAccessTier(AccessTier.Archive);

How to re-hydrate a blob in archive storage?

There are two ways of re-hydrating blobs:

  1. Copy the blob to another tier (Hot or Cool)
  2. Set the access tier to Hot or Cool

It really is that simple, and it can be done using the following code:

var accountClient = new BlobServiceClient(connectionString);

var containerClient = accountClient.GetBlobContainerClient(containerName);

// Get a reference to a blob
BlobClient blobClient = containerClient.GetBlobClient(blobName);

blobClient.SetAccessTier(AccessTier.Hot);

After doing the above, the blob will start the rehydration process automatically, and you then need to monitor the blob’s properties to see when it has finished hydrating.
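
For completeness, the first option (copying the archived blob to a new blob in an online tier) can be sketched roughly as follows, continuing from the containerClient above. This is a rough sketch assuming the Azure.Storage.Blobs v12 SDK, and the destination blob name is just an illustrative placeholder:

// Get a client for the archived blob and one for a new destination blob
BlobClient archivedBlob = containerClient.GetBlobClient(blobName);
BlobClient hotCopy = containerClient.GetBlobClient(blobName + "-hot");

// Start a copy that rehydrates the data into the Hot tier on the destination blob
hotCopy.StartCopyFromUri(archivedBlob.Uri, new BlobCopyFromUriOptions
{
    AccessTier = AccessTier.Hot,
    RehydratePriority = RehydratePriority.Standard
});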

Monitoring the re-hydration of a blob

One easy pattern for monitoring the blobs as they are rehydrated is to implement a queue and an azure function to monitor the blob during this process. I did this by implementing the following:

For the message model, I used the following to track the hydration process:

public class BlobHydrateModel
{
    public string BlobName { get; set; }
    public string ContainerName { get; set; }
    public DateTime HydrateRequestDateTime { get; set; }
    public DateTime? HydratedFileDataTime { get; set; }
}

And then implemented the following code to handle the re-hydration process:

using System;
using System.Text;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Queues;
using Newtonsoft.Json;

public class BlobRehydrationProvider
{
    private string _cs;

    public BlobRehydrationProvider(string cs)
    {
        _cs = cs;
    }

    public void RehydrateBlob(string containerName, string blobName, string queueName)
    {
        var accountClient = new BlobServiceClient(_cs);
        var containerClient = accountClient.GetBlobContainerClient(containerName);

        // Get a reference to the blob and move it back to the Hot tier to start rehydration
        BlobClient blobClient = containerClient.GetBlobClient(blobName);
        blobClient.SetAccessTier(AccessTier.Hot);

        // Queue a message so a function can poll the blob until rehydration completes
        var model = new BlobHydrateModel() { BlobName = blobName, ContainerName = containerName, HydrateRequestDateTime = DateTime.Now };

        QueueClient queueClient = new QueueClient(_cs, queueName);
        var json = JsonConvert.SerializeObject(model);
        string requeueMessage = Convert.ToBase64String(Encoding.UTF8.GetBytes(json));
        queueClient.SendMessage(requeueMessage);
    }
}

Using the above code, setting the blob to Hot and queuing a message triggers an Azure Function, which then monitors the blob’s properties using the following:

[FunctionName("CheckBlobStatus")]
public static void Run([QueueTrigger("blobhydrationrequests", Connection = "StorageConnectionString")]string msg, ILogger log)
{
    var model = JsonConvert.DeserializeObject<BlobHydrateModel>(msg);

    var connectionString = Environment.GetEnvironmentVariable("StorageConnectionString");

    var accountClient = new BlobServiceClient(connectionString);

    var containerClient = accountClient.GetBlobContainerClient(model.ContainerName);

    BlobClient blobClient = containerClient.GetBlobClient(model.BlobName);

    log.LogInformation($"Checking Status of Blob: {model.BlobName} - Requested : {model.HydrateRequestDateTime.ToString()}");

    var properties = blobClient.GetProperties();
    if (properties.Value.ArchiveStatus == "rehydrate-pending-to-hot")
    {
        log.LogInformation($"File { model.BlobName } not hydrated yet, requeuing message");
        QueueClient queueClient = new QueueClient(connectionString, "blobhydrationrequests");
        string requeueMessage = Convert.ToBase64String(Encoding.UTF8.GetBytes(msg));
        queueClient.SendMessage(requeueMessage, visibilityTimeout: TimeSpan.FromMinutes(5));
    }
    else
    {
        log.LogInformation($"File { model.BlobName } hydrated successfully, sending response message.");
        //Trigger appropriate behavior
    }
}

By checking the ArchiveStatus, we can tell when the blob is re-hydrated and can then trigger the appropriate behavior to push that update back to your application.

TerraForm – Using the new Azure AD Provider

So by using TerraForm, you gain a lot of benefits, including being able to manage all parts of your infrastructure with HCL, which makes it rather easy to keep everything consistent and repeatable.

A key part of that is not only being able to manage the resources you create, but also access to them, by creating and assigning service principals. In older versions of TerraForm this was possible using azurerm_azuread_application and the other Azure AD elements. I had previously done this in the Kubernetes template I have on GitHub.

Now, with version 2.0 of the azurerm provider, there have been some pretty big changes, including removing all of the Azure AD elements and moving them to their own provider, and the question becomes “How does that change my template?”

Below is an example of the older approach; it shows the creation of a service principal with a random password, and the creation of an access policy for a key vault.

resource "random_string" "kub-rs-pd-kv" {
  length = 32
  special = true
}

data "azurerm_subscription" "current" {
    subscription_id =  "${var.subscription_id}"
}
resource "azurerm_azuread_application" "kub-ad-app-kv1" {
  name = "${format("%s%s%s-KUB1", upper(var.environment_code), upper(var.deployment_code), upper(var.location_code))}"
  available_to_other_tenants = false
  oauth2_allow_implicit_flow = true
}

resource "azurerm_azuread_service_principal" "kub-ad-sp-kv1" {
  application_id = "${azurerm_azuread_application.kub-ad-app-kv1.application_id}"
}

resource "azurerm_azuread_service_principal_password" "kub-ad-spp-kv" {
  service_principal_id = "${azurerm_azuread_service_principal.kub-ad-sp-kv1.id}"
  value                = "${element(random_string.kub-rs-pd-kv.*.result, count.index)}"
  end_date             = "2020-01-01T01:02:03Z"
}

resource "azurerm_key_vault" "kub-kv" {
  name = "${var.environment_code}${var.deployment_code}${var.location_code}lkub-kv1"
  location = "${var.azure_location}"
  resource_group_name = "${azurerm_resource_group.management.name}"

  sku {
    name = "standard"
  }

  tenant_id = "${var.keyvault_tenantid}"

  access_policy {
    tenant_id = "${var.keyvault_tenantid}"
    object_id = "${azurerm_azuread_service_principal.kub-ad-sp-kv1.id}"

    key_permissions = [
      "get",
    ]

    secret_permissions = [
      "get",
    ]
  }
  access_policy {
    tenant_id = "${var.keyvault_tenantid}"
    object_id = "${azurerm_azuread_service_principal.kub-ad-sp-kv1.id}"

    key_permissions = [
      "create",
    ]

    secret_permissions = [
      "set",
    ]
  }

  depends_on = ["azurerm_role_assignment.kub-ad-sp-ra-kv1"]
}

Now as I mentioned, with the change to the new provider, this code needs to be updated. Below is an updated form of the code that generates a service principal with a random password.

provider "azuread" {
  version = "=0.7.0"
}

resource "random_string" "cds-rs-pd-kv" {
  length = 32
  special = true
}

resource "azuread_application" "cds-ad-app-kv1" {
  name = format("%s-%s%s-cds1",var.project_name,var.deployment_code,var.environment_code)
  oauth2_allow_implicit_flow = true
}

resource "azuread_service_principal" "cds-ad-sp-kv1" {
  application_id = azuread_application.cds-ad-app-kv1.application_id
}

resource "azuread_service_principal_password" "cds-ad-spp-kv" {
  service_principal_id  = azuread_service_principal.cds-ad-sp-kv1.id
  value                = random_string.cds-rs-pd-kv.result
  end_date             = "2020-01-01T01:02:03Z"
}

Notice how much cleaner the code is: we are no longer using ${} for string interpolation, and the resources themselves are much simpler. So the next question is, how do I connect this with my code to assign this service principal to a key vault access policy?

You can accomplish that with the following code, which is in a different file in the same directory:

resource "azurerm_resource_group" "cds-configuration-rg" {
    name = format("%s-Configuration",var.group_name)
    location = var.location 
}

data "azurerm_client_config" "current" {}

resource "azurerm_key_vault" "cds-kv" {
    name = format("%s-%s%s-kv",var.project_name,var.deployment_code,var.environment_code)
    location = var.location
    resource_group_name = azurerm_resource_group.cds-configuration-rg.name 
    enabled_for_disk_encryption = true
    tenant_id = data.azurerm_client_config.current.tenant_id
    soft_delete_enabled = true
    purge_protection_enabled = false
 
  sku_name = "standard"
 
  access_policy {
    tenant_id = data.azurerm_client_config.current.tenant_id
    object_id = data.azurerm_client_config.current.object_id
 
    key_permissions = [
      "create",
      "get",
    ]
 
    secret_permissions = [
      "set",
      "get",
      "delete",
    ]
  }

  access_policy {
    tenant_id = data.azurerm_client_config.current.tenant_id
    object_id = azuread_service_principal.cds-ad-sp-kv1.id

    key_permissions = [
      "get",
    ]

    secret_permissions = [
      "get",
    ]
  }
}

Notice that I am able to reference the “azuread_service_principal.cds-ad-sp-kv1.id” to access the newly created service principal without issue.

Good practices for starting with containers

So I really hate the saying “best practices” mainly because it creates a belief that there is only one right way to do things. But I wanted to put together a post around some ideas for strengthening your micro services architectures.

As I’ve previously discussed, Micro service architectures are more complicated to implement but have a lot of huge benefits to your solution. And some of those benefits are:

  • Independently deployable pieces, no more large scale deployments.
  • More focused testing efforts.
  • Using the right technology for each piece of your solution.
  • Increased resiliency from cluster-based deployments.

But for a lot of people, including myself the hardest part of this process is how do you structure a micro-service? How small should each piece be? How do they work together?

So here are some practices I’ve found helpful if you are starting to leverage this in your solutions.

One service = one job

One of the first questions is how small my containers should be. Is there such a thing as too small? A good rule of thumb is to focus on separation of concerns. If you take every use-case and break it down to a single purpose, you’ll find you get to a good micro-service design pretty quickly.

Let’s look at an example. I recently worked on a solution with a colleague of mine that pulled data from an API and then extracted that information to put it into a data model.

In the monolithic way of thinking, that would have been one API call: pass in the data, then cycle through and process it. But the problem was throughput; if I had pulled the 67 different regions, with 300+ records per region, and processed it all as a batch, it would have been one gigantic, unwieldy API call.

So instead, we had one function that cycled through the regions, and pulled them all to json files in blob storage, and then queued a message.

Then we had another function that, when a message is queued, takes that message, reads in the records for that region, and processes them, saving them to the database. This separate function is another micro-service.

Now there are several benefits to this approach, but chief among them, the second function can scale independently of the first, and I can respond to queued messages as they come in, using asynchronous processing.

Three words… Domain driven design

For a great definition of Domain-Driven Design, see here. The idea is pretty straightforward: the structure of your software should mirror the business logic and processes it implements.

So for example, your micro-services should mirror what they are trying to do. Like let’s take the most straightforward example…e-commerce.

Say we have to track orders, and once an order is submitted the process looks like the following:

  • Orders are submitted.
  • Inventory is verified.
  • Order Payment is processed.
  • Notification is sent to supplier for processing.
  • Confirmation is sent to the customer.
  • Order is fulfilled and shipped

Looking at the above, one way to implement this would be to do the following:

  • OrderService: Manage the orders from start to finish.
  • OrderRecorderService: Record order in tracking system, so you can track the order throughout the process.
  • OrderInventoryService: Takes the contents of the order and checks it against inventory.
  • OrderPaymentService: Processes the payment of the order.
  • OrderSupplierNotificationService: Interact with a 3rd party API to submit the order to the supplier.
  • OrderConfirmationService: Send an email confirming the order is received and being processed.
  • OrderStatusService: Continues to check the 3rd party API for the status of the order.

If you notice above, outside of an orchestration they match exactly what the steps were according to the business. This provides a streamlined approach that makes it easy to make changes, and easy to understand for new team members. More than likely communication between services is done via queues.

For example, let’s say the company above wants to expand to accept Venmo as a payment method. Really that means you have to update the OrderPaymentService to accept the option and process the payment. Additionally, OrderPaymentService might itself be an orchestration service between different micro-services, one per payment method.

Make them independently deployable

This is the big one, if you really want to see benefit of microservices, they MUST be independently deployable. This means that if we look at the above example, I can deploy each of these separate services and make changes to one without having to do a full application deployment.

Take the scenario above, if I wanted to add a new payment method, I should be able to update the OrderPaymentService, check-in those changes, and then deploy that to dev, through production without having to deploy the entire application.

Now, the first time I heard that I thought that was the most ridiculous thing I ever heard, but there are some things you can do to make this possible.

  • Each Service should have its own data store: If you make sure each service has its own data store, that makes it much easier to manage version changes. Even if you are going to leverage something like SQL Server, then make sure that the tables leveraged by each micro-service are used by that service, and that service only. This can be accomplished using schemas.
  • Put layers of abstraction between service communication: For example, a great common example is queuing or eventing. If you have a message being passed through, then as long as the message leaving doesn’t change, then there is no need to update the receiver.
  • If you are going to do direct API communication, use versioning. If you do have to have APIs connecting these services, leverage versioning to allow for micro-services to be deployed and change without breaking other parts of the application.

Build with resiliency in mind

If you adopt this approach to micro-services, one of the biggest things you will notice quickly is that each micro-service becomes its own black box. As such, I find it’s good to build each of these components with resiliency in mind: things like leveraging Polly for retry or circuit breaker patterns. These are great ways of making sure that your services remain resilient, and it will have a cumulative effect on your application.

For example, take our OrderPaymentService above. We know that queue messages should be coming in with the order and payment details. If we take a microscope to this service and ask how it could fail, it’s not hard to get to a list like this:

  • Message comes through in a bad format.
  • Payment service can’t be reached.
  • Payment is declined (for any one of a variety of reasons)
  • Service fails while waiting on payment to be processed.

Now for some of the above, it’s just some simple error handling, like checking the format of the message. We can also build logic to check if the payment service is available, and do an exponential retry until it is.

We might also consider implementing a circuit breaker that says if we can’t process payments after so many tries, the service switches to an unhealthy state and triggers a notification workflow.
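
As a rough sketch of what that could look like with Polly (the payment gateway call and the exception type are hypothetical, and the retry counts and break duration are purely illustrative):

using System;
using System.Net.Http;
using Polly;

// Retry transient failures with exponential backoff
var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .WaitAndRetryAsync(5, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));

// Trip the breaker after repeated failures so we stop hammering the payment provider
var circuitBreaker = Policy
    .Handle<HttpRequestException>()
    .CircuitBreakerAsync(
        exceptionsAllowedBeforeBreaking: 3,
        durationOfBreak: TimeSpan.FromMinutes(1));

// Combine the two policies and wrap the (hypothetical) payment call inside an async method
var resilientPolicy = Policy.WrapAsync(retryPolicy, circuitBreaker);

await resilientPolicy.ExecuteAsync(() => paymentGateway.ProcessPaymentAsync(order));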

And in the final scenario, we could implement a state store that indicates the state of the payment being processed should a service fail and need to be picked up by another.

Consider monitoring early

This is the one that everyone forgets, but it dovetails nicely out of the previous one. It’s important that there be a mechanism for tracking and monitoring the state of your micro-service. I find too often it’s easy to say “Oh the service is running, so that means it’s fine.” That’s like saying that just because the homepage loads, a full web application is working.

You should build into your micro-services the ability to track their health and enable a way of doing so for operations tools. Let’s face it, at the end of the day, all code will eventually be deployed, and all deployed code must be monitored.

So for example, looking at the above, if I build a circuit breaker pattern into OrderPaymentService, every failure can update a status held in memory that marks the service as unhealthy. I can then expose an http endpoint that returns the status of that breaker (a minimal sketch of such an endpoint follows below):

  • Closed: Service is running fine and healthy
  • Half-Open: Service is experiencing some errors but still processing.
  • Open: Service is taken offline for being unhealthy.

I can then build out logic so that when the breaker moves to Half-Open, or even Open, specific events occur.
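
A minimal sketch of such an endpoint is below. The breaker state tracker here is a hypothetical in-memory service (in practice this might come from Polly’s circuit breaker state or a health check library), but it shows the shape of the idea:

using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("health")]
public class HealthController : ControllerBase
{
    // Hypothetical service tracking the current breaker state in memory
    private readonly ICircuitStateTracker _circuitState;

    public HealthController(ICircuitStateTracker circuitState)
    {
        _circuitState = circuitState;
    }

    [HttpGet("breaker")]
    public IActionResult GetBreakerStatus()
    {
        // Returns "Closed", "HalfOpen", or "Open" so operations tooling can alert on it
        return Ok(new { status = _circuitState.CurrentState.ToString() });
    }
}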

Start small, don’t boil the ocean

This one seems kind of ironic given the above. But if you are working on an existing application, you will never be able to convince management to let you junk it and start over. So what I have done in the past is take an application, and when it’s time to make a change to a part of it, take the opportunity to re-architect and make it more resilient. Deconstruct the pieces and implement a micro-service approach to resolving the problem.

Stateless over stateful

Honestly, this is just good practice to get used to. Most container technologies, like Docker or Kubernetes, really favor the idea of elastic scale and the ability to start or kill a process at any time. This becomes a lot harder if you have to manage state within a container. If you must manage state, I would definitely recommend using an external store for that information.

Now I know not every one of these might fit your situation, but I’ve found that these practices make it much easier to transition to creating micro-services for your solutions and seeing the benefits of doing so.

Configuring SQL for High Availability in Terraform

Hello all, a short post this week, but as we talk about availability, chaos engineering, etc., one of the most common data elements I see out there is SQL, and specifically Azure SQL. SQL is a prevalent and common data store; it’s everywhere you look.

Given that, many shops are implementing infrastructure-as-code to manage configuration drift and provide increased resiliency for their applications, which is definitely a great way to do that. The one thing a couple of people have asked me about that isn’t clear is how to configure geo-replication in TerraForm.

This is actually built into the TerraForm azurerm provider, and can be done with the following code:

provider "azurerm" {
    subscription_id = ""
    features {
    key_vault {
      purge_soft_delete_on_destroy = true
    }
  }
}

data "azurerm_client_config" "current" {}

resource "azurerm_resource_group" "rg" {
  name     = "sqlpoc"
  location = "{region}"
}

resource "azurerm_sql_server" "primary" {
  name                         = "kmack-sql-primary"
  resource_group_name          = azurerm_resource_group.rg.name
  location                     = azurerm_resource_group.rg.location
  version                      = "12.0"
  administrator_login          = "sqladmin"
  administrator_login_password = "{password}"
}

resource "azurerm_sql_server" "secondary" {
  name                         = "kmack-sql-secondary"
  resource_group_name          = azurerm_resource_group.rg.name
  location                     = "usgovarizona"
  version                      = "12.0"
  administrator_login          = "sqladmin"
  administrator_login_password = "{password}"
}

resource "azurerm_sql_database" "db1" {
  name                = "kmackdb1"
  resource_group_name = azurerm_sql_server.primary.resource_group_name
  location            = azurerm_sql_server.primary.location
  server_name         = azurerm_sql_server.primary.name
}

resource "azurerm_sql_failover_group" "example" {
  name                = "sqlpoc-failover-group"
  resource_group_name = azurerm_sql_server.primary.resource_group_name
  server_name         = azurerm_sql_server.primary.name
  databases           = [azurerm_sql_database.db1.id]
  partner_servers {
    id = azurerm_sql_server.secondary.id
  }

  read_write_endpoint_failover_policy {
    mode          = "Automatic"
    grace_minutes = 60
  }
}

Now the above TF will deploy two database servers with geo-replication configured. The key part is the following:

resource "azurerm_sql_failover_group" "example" {
  name                = "sqlpoc-failover-group"
  resource_group_name = azurerm_sql_server.primary.resource_group_name
  server_name         = azurerm_sql_server.primary.name
  databases           = [azurerm_sql_database.db1.id]
  partner_servers {
    id = azurerm_sql_server.secondary.id
  }

  read_write_endpoint_failover_policy {
    mode          = "Automatic"
    grace_minutes = 60
  }
}

The important elements are “server_name” and “partner_servers”; this makes the connection to where the data is being replicated. And then the “read_write_endpoint_failover_policy” sets up the failover policy.

Terraform – Key Rotation Gotcha!

So I did want to write about something that I discovered recently when investigating a question: key rotation, and how TerraForm state is impacted by it.

The question is this: if you have a key vault and you ask any security expert, the number one rule is that key rotation is absolutely essential. This is important because it helps manage the blast radius of an attack and keeps access keys changing in a way that makes them harder to compromise.

Now, those same experts will also tell you this should be done via automation, so that no human eye has ever seen that key, which is easy enough to accomplish. The documentation released by Microsoft, here, discusses how to rotate keys with Azure Automation.

Personally, I like this approach because it makes the key rotation process a part of normal system operation, and not something you as a DevOps engineer or developer have to keep track of. It also makes it easy to report for compliance audits if you’re in the government space.

But I’ve been seeing this growing movement online of people who say to have TerraForm generate your keys, and do rotation of those keys using randomization in your TerraForm scripts. The idea being that you can automate random values in your TerraForm script to generate the keys, and I do like that, but overall I’m not a fan of this.

The reason is that it makes key rotation a deployment activity, and if your environment gets large enough that you start doing “scoped” deployments, it removes any rhyme or reason from your key generation; it’s based solely on when you run the scripts.

Now that does pose a problem with state at the end of the day. Let’s take the following code:

provider "azurerm" {
    subscription_id = "...subscription id..."
    features {
    key_vault {
      purge_soft_delete_on_destroy = true
    }
  }
}

data "azurerm_client_config" "current" {}

resource "azurerm_resource_group" "superheroes" {
  name     = "superheroes"
  location = "usgovvirginia"
}

resource "random_id" "server" {
  keepers = {
    ami_id = 1
  }

  byte_length = 8
}

resource "azurerm_key_vault" "superherovault" {
  name                        = "superheroidentities"
  location                    = azurerm_resource_group.superheroes.location
  resource_group_name         = azurerm_resource_group.superheroes.name
  enabled_for_disk_encryption = true
  tenant_id                   = data.azurerm_client_config.current.tenant_id
  soft_delete_enabled         = true
  purge_protection_enabled    = false

  sku_name = "standard"

  access_policy {
    tenant_id = data.azurerm_client_config.current.tenant_id
    object_id = data.azurerm_client_config.current.object_id

    key_permissions = [
      "create",
      "get",
    ]

    secret_permissions = [
      "set",
      "get",
      "delete",
    ]
  }

  tags = {
    environment = "Testing"
  }
}

resource "azurerm_key_vault_secret" "Batman" {
  name         = "batman"
  value        = "Bruce Wayne"
  key_vault_id = azurerm_key_vault.superherovault.id

  tags = {
    environment = "Production"
  }
}

Now based on the above, all good right? When I execute my TerraForm script, I will have a secret named “batman” with a value of “Bruce Wayne.”

But the only problem here will be if I go to the Azure Portal and change that value, say I change the value of “batman” to “dick grayson”, and then I rerun my TerraForm apply.

It will want to reset that secret back to “Bruce Wayne”. And we’ve broken our key rotation at this point… now what?

My thought on this is that it’s easy enough to wrap the “terraform apply” in a bash script and, before you execute it, run a “terraform refresh” to re-pull the values from the cloud into your TerraForm state.

If you don’t like that option, there is another solution: use the lifecycle block within the resource to tell TerraForm to ignore changes to the secret’s value, preventing updates to the key vault secret when the keys have been changed as part of rotation.
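
As a minimal sketch of that approach, using the “batman” secret from above, the lifecycle meta-argument would look something like this:

resource "azurerm_key_vault_secret" "Batman" {
  name         = "batman"
  value        = "Bruce Wayne"
  key_vault_id = azurerm_key_vault.superherovault.id

  lifecycle {
    # Ignore changes made outside of TerraForm (e.g. by a rotation runbook)
    ignore_changes = [value]
  }
}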

Copying blobs between storage accounts / regions

So a common question I get is about copying blobs. If you are working with Azure Blob Storage, it’s sort of inevitable that you will need to do a data copy at some point, whether that be for a migration, a re-architecture, or any number of other reasons.

Now this is something where I’ve seen all different versions of doing a data copy. And I’m going to talk through those options here, and ultimately how best to execute a copy within Azure Blob Storage.

I want to start with the number 1, DO NOT DO, option. That option is “build a utility to cycle through and copy blobs one by one.” This is the least desirable option for moving data, for a few reasons:

  • Speed – This is going to be a single threaded, synchronous operation.
  • Complexity – This feels counter-intuitive, but the process of ensuring data copies, building fault handling, etc…is not easy. And not something you want to take on when you don’t have to.
  • Chances of Failure – Long running processes are always problematic, always. As these processes can fail, and when they do they can be difficult to recover from. So you are opening yourself up to potential problems.
  • Cost – At the end of the day, you are creating a long running process that will need to have compute running 24/7 for an extended period. Compute in the cloud costs money, so this is an additional cost.

So the question is, if I shouldn’t build my own utility, how do we get this done? There are really two options that I’ve used successfully in the past:

  • AzCopy – This is the tried and true option. This utility provides an easy command line interface for kicking off copy jobs that can be run either synchronously or asynchronously. Even in its synchronous mode, you will see higher throughput for the copy. This removes some of the issues from above, but not all (a sample command follows this list).
  • Copy API – A newer option, the REST API enables a copy operation. This provides the best possible throughput and prevents you from having to create a VM, allowing for asynchronous, server-side copy operations in Azure to facilitate this operation. The API is easy to use and documentation can be found here.
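
As a sample of the AzCopy option, a server-side copy between two accounts with AzCopy v10 looks something like this (account names, container names, and SAS tokens are placeholders):

azcopy copy \
  "https://sourceaccount.blob.core.windows.net/sourcecontainer?<SAS>" \
  "https://destaccount.blob.core.windows.net/destcontainer?<SAS>" \
  --recursive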

Ultimately, there are lots of ways and scenarios you can leverage these tools to copy data. The other one that I find usually raises questions, is if I’m migrating a large volume of data, how do I do it to minimize downtime.

The way I’ve accomplished this is to break the data down as follows:

  • Sort the data by age oldest to newest.
  • Starting with the oldest blobs, break them down into the following chunks.
  • Move the first 50%
  • Move the next 30%
  • Move the next 10-15%
  • Take a downtime window to copy the last 5-10%

By doing so, you gain the ability to minimize your downtime window while maximizing the backend copy. Now, the above process only works if your newer data is accessed more often than your older data, but when that is the case, it is a good option for moving your blobs while minimizing downtime.