Semantic Tags in NetBox

Although NetBox has a pretty nice data model and a lot of things found in common environments can be modeled pretty well, it sometimes reaches its limits. In such situations you’re faced with multiple options to go forward

  1. Introduce (a number of) custom field(s) to extend the data model
  2. Write a plugin to extend the data model (and logic)
  3. Introduce a (set of) tag(s) to allow users to place bumper stickers on things
  4. File a feature request to add some generic missing feature

In this post we’ll look into options for using Tags to denote characteristics of devices, VMs, interfaces, prefixes, IPs, etc.

Simple labels

Sometimes it’s enough to just label a device, an interfaces, etc. with a static bit of information. Examples would be

  • Enable backup on this device (e.g. tag backup)
  • Enable DHCP on this interface (e.g. tag DHCP)
  • Mark an interface as planned or offline (e.g. tags planned or offline)
  • Denote an interface is of type Wireguard (e.g. tag Wireguard)
  • etc.

For those cases you can create a tag, e.g. planned, and apply it to any interface which is configured in NetBox, should be configured on the device but isn’t connected yet, so that effectively you can push down (parts of) the interface configuration to the device but inhibit alarms for it being offline or IGP adjacencies being down. The same is true for device which need to be backed up, etc.

One way of thinking of these example is as a boolean switch, which is set to true if the tag is applied. This is easy for cases where the “false” value is the default, so only the outliers – meaning “true” values – need to be represented explicitly.

Semantic labels

But what to do when there is a need for a tri-state value, e.g. true, false, unset, with unset being the default?

One example would be a part of an SDN controller which figures out if uRPF should be enabled on a given interface or not. This usually can be done programmatically, but in some situations an explicit override may be required. That’s where – what I would call them – Semantic Labels come in!

This means, that we also put the value we want to assign a given label into the tag. In this case, we create two labels called urpf=on as well as urpf=off (or urpf=enable and urpf=disable as you fancy) and apply these to any interface requiring an explicit override. If no tag is present the code decides about the fate of uRPF.

Why not use custom fields?!

At this point you may be wondering “Why bother? Just create a custom field, you can provide the allowed options and get input validation for free!

The reason is that if you go down this path, you can create a fairly long list of custom fields in the UI (and API) which are likely not set for the majority of you interfaces (in this example). This creates clutter within the UI and is likely to confuse users, especially if you end up with a number of custom fields.

My rule of thumb is, that I only create a custom field if it’s relevant for all or a least the majority of entities of a given kind. For example, if you wanted to store the data center tier of all the DC (read: Sites) you are present in, it might make sense to store this information as a custom fields on the Site model, with a predefined list of values (1 – 4).

More examples

In the example above we could also have created two labels, e.g. uRPF enable and uRPF disable and let our automation handle the differentiation – which actually is how the FFHO SDN is built as of today.

For other bits of information, with more possible values, the benefits of the key=value approach can shine, for example if you want to

  • explicitly set the OSPF cost of an interface (e.g. ospf_cost=100)
  • store the maximum capacity of VRFs of a box (e.g. capacity=62 for a box with a Mellanox Spectrum 1 ASIC with a default + mgmt VRF in place)

Although I now prefer the key=value approach as it’s easy to recognize and parse, you can obviously place values in labels by following any common format you define. For the Freifunk Hochstift Backbone I introduced labels of the format batman_connect_<instance_name> (e.g. batman_connect_pad-cty) enable B.A.T.M.A.N. adv. overlay for a given instance on a tagged interface.

NetBox interface overview with Tags

The next level – prefix lists

Oliver – takt – Geiselhardt-Herms took this to the next level in a previous role and created tags with a hierarchical naming structure for prefixes, following the format prefix-list=key1:key2:key3

The idea is that any prefix with a prefix-list tag with value key1:key2:key3 is added into the prefix lists name

  • key1:key2:key3
  • key1:key2
  • key1

An example structure could be

  • <VRF>:<purpose>:<label>, e.g. INTERNET:CUSTOMER:<customer name here>
  • <purpose>:<region>:<device>, e.g. CUSTOMERS:DE:core01.fra01
  • etc.

This obviously can be extended to more levels if it were to be required.

NetBox scripts – now with clean error handling

Recently I wrote a scripts which aids in provisioning things inside our network, which will do some sanity checks and if all is good set up a number of things (prefixes, IPs, sub-interfaces, IPs on them, etc.). The reason to do this inside a script is to do this in an atomic operation, so either the full provision process is done, or nothing is changed at all.

If the sanity checks fail (invalid input, trying to create something with overlapping resources, etc.) the script should fail and ideally report a clear error message one what was wrong, which should be reported to the caller (via API).

Now I was looking into exiting the script on an error and only found the option to throw any Exception which will produce a stack trace of all internal Exceptions which had been caught and handled. There is a AbortTransaction Exception, which allows to terminate the script and thereby the DB transaction, but it was not designed to carry an error message.

Looking into the code it seemed like adding support to gracefully abort a would be rather straight forward to add, so I did (issue, PR). Today the PR got merged and NetBox v3.4.4 (and later) includes the AbortScript exception to elegantly abort scripts, which you can use like this:

from utilities.exceptions import AbortScript

if some_error:
     raise AbortScript("Some meaningful error message")

Modelling (C)WDM MUXes in NetBox/Nautobot – the universal way

A while ago we had a discussion in #DENOG on how to best model CWDM MUXes in NetBox/Nautobot, so they can be used to build a network topology, which can be leveraged for holistic automation, and are prepared for augments, repairs and network changes.

8Ch CWDM MUX

“Natural” way of modelling

The “natural” way of modelling for example an 8Ch CWDM MUX – as shown above – would be do create a DeviceType containing one RearPort with 8 position as well as 8 FrontPorts which map to one position each. So logically this would look like the following

Inside NetBox the Rear Port and Front Ports view of the DeviceType could look like this

One Rear Port with 8 positions
Eight Front Ports mapped to one position each

This model works absolutely, allows tracing connections through the MUXes etc. as long as we always use exactly the same MUX on both ends off the fiber for one WDM setup.

Limitations

If one MUX would decay over time and needs to be replaced by another model e.g. due to supply chain issues, which for example might have 10 channels and does not skip 1390 and 1410, the mapping would fail. This obviously would also be the case if for an expansion the setup should be changed to 18Ch MUXes and this could only be done one side after another.

Universal way of modelling

As the ITU CWDM standard only defines 18 CWDM channels in total with a fixed spacing of 20nm we can use one simple trick to overcome this limitation: Always define all 18 positions and only map the existing Front Ports to their associated position as shown in the following table.

PositionChannel
1
1270
21290
31310
41330
51350
61370
71390
81410
91430
101450
111470
121490
131510
141530
151550
161570
171590
181610

So the Front Ports of the MUX shown above would be mapped to positions 1-6, 9, and 10. This way all CWDM MUXes can be connected in NetBox/Nautobot and all channels which exist on both ends can be connected and traced through.

Eight Front Ports mapped to one their associate (existing) positions

What about DWDM?

For DWDM systems this obviously would be somewhat harder given the bigger amount of possible channel, especially when different spacing options (25, 50, 100GHz) are taken into account. It would be possible though to go for 160 positions and map the channels accordingly given strict adherence to a to be predefined map.

DENOG12 Netbox workshop summary

One week ago DENOG12 took place virtually as an awesome venueless conference and I had the pleasure to hold a workshop about netbox and how to use it for automation. As promised here are the notes we collected while the workshop and projects that emerged since. This is provided as-is and as I don’t have used most of the external tools/scripts/reports/etc. linked below 🙂

Deployment

I forked the officiall netbox-docker GIT repository to set up netbox as a number of Docker containers and made some small changes, to run the PostgreSQL DB outside of Docker and restarte the container on Docker restarts / system reboots.

As the time of this writing this repo is configured for netbox version 2.9.8.

Device Types

The community Device Type library

Within the netbox-community organization on Github you can find the semi-official community Netbox Device Type library where a lot of Device Types are present and ready to be imported into your Netbox. Be aware that you have to create the Manufacturer first before you can import Device Types for that manufacturer. If you happen to create Device Types for devices which are no present in the library please open a PR – sharing is caring 🙂

Importing multiple Device Types at once

Someone mentioned this script/repository which offers the possibility to import multiple Device Types at once. (I didn’t test it yet :))

Reports and Scripts

This repository  hold a bunch of example Netbox Reports for various things within Netbox (circuits, Cabling, IPs/DNS, VMs, etc.) as well as Netbox Scripts to create a VM as well as a geolocator for a site.

My own netbox-scripts repository contains a script to populate a Freifunk Hochstift Backbone POP from one script.

This (archived) repository holds a bunch for tools to import/sync stuff into/with netbox.

Wikimedia seems to use Netbox, too and has open sourced some tooling including a zone file generator.

Other

Johannes wrote an article about adding own buttons within Netbox to open an SSH session into a router.

If you want to extent the authentication options of your Netbox there is a Plugin for SSO using SAML2.

A thread from the Google Group on setting custom fields via the API.

Deploying a Freifunk Hochstift backbone POP with Netbox Scripts

Some weeks ago Network to Code held the first (virtual) Netbox Day (YouTube playlist, Slides repo on github). John Anderson gave a great NetBox Extensibility Overview and introduced me to Netbox Scripts (Video, Slide deck, Slide 28) which allow to add custom Python code to add own procedures to netbox. I was hooked. About three to four hours of fiddling, digging through the docs, and some hundred lines of Python later I had put together a procedure to provision a complete Freifunk Hochstift Backbone POP within Netbox according to our design. I’m going to share my proof of concept code here and  walk you through the key parts of the script.

Netbox scripts provide a great and really simple interface to codify procedures and design principles which apply to your infrastructure and fire up complex network setups within netbox by just entering a set of config parameters in a form like the following and a click of one button.

Provision Backbone POP form
Provision Backbone POP form

Continue reading Deploying a Freifunk Hochstift backbone POP with Netbox Scripts