SolarWinds Hybrid Cloud Observability First Steps, part one

I deployed SolarWinds Hybrid Cloud Observability (HCO), and now it's time to start tuning it.
I will show what to do next and explain how to avoid common pitfalls before they become an issue.
Again, this is a lab install, but most of the steps apply to production deployments, too.

So, let’s call this SolarWinds Hybrid Cloud Observability first steps, part one – as other details will follow!

The first time the web dashboard starts, it asks for a password. There are built-in complexity requirements, so neither “password” nor “Password” will work!


The Discovery Wizard

The first step is to get devices into the platform. Before clicking start here, I check the “don’t show this again” box:

Now I need to define the scope of the discovery.
For me, it’s a single subnet, but you can add as much information here as required.
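To get a feel for how many addresses a scope covers before you start the scan, here is a minimal sketch using Python's standard `ipaddress` module. The subnets are made-up lab values, and `scope_size` is a helper name I invented for this illustration; it is not part of the platform.

```python
import ipaddress

def scope_size(subnets):
    """Count the usable host addresses across all given subnets."""
    total = 0
    for subnet in subnets:
        network = ipaddress.ip_network(subnet)
        # hosts() skips the network and broadcast addresses
        total += sum(1 for _ in network.hosts())
    return total

# Hypothetical lab scope; replace with your own ranges.
print(scope_size(["192.168.1.0/24"]))
```

Every extra subnet adds to the number of addresses the wizard has to probe, so it pays to keep the scope as tight as your environment allows.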

The second step asks you to add virtualization entities.
A while back I switched to a KVM hypervisor, which isn’t supported by the wizard, so I ignore it.

Leave the agent setting as it is; there’s only one agent deployed right now, and that’s on the server we are connected to.

For now, I leave this part empty, too. It defines SSH connections for managing network device configurations. I will deal with this manually later.

The next step is interesting.

In a nutshell, you need to provide all SNMP community strings available in your environment.
In all my years, I’ve seen countless companies using many different community strings that result from inadequate design and careless work; sorry for being honest.

In a perfect world, you would have one or at most two “read-only” strings, and maybe one “read-write.”
In a perfect world, you would also change both from the defaults, meaning anything other than the ones shown below.

Now, the pitfall here is this: the platform accepts an unlimited number of strings, but the more you provide, the longer the discovery will take, as it tries one string after the other.
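A rough back-of-the-envelope sketch shows why the number of strings matters. It assumes the worst case, where a device answers none of the strings and every attempt runs into a timeout. The timeout and retry values are assumptions for illustration, not the product's defaults, and a real discovery probes devices in parallel, so treat the result as a relative comparison rather than a prediction.

```python
def worst_case_snmp_seconds(devices, community_strings, timeout_s=2.0, retries=1):
    """Upper bound on time spent waiting for SNMP replies when probing serially.

    timeout_s and retries are illustrative assumptions, not product defaults.
    """
    attempts_per_string = 1 + retries
    return devices * community_strings * attempts_per_string * timeout_s

# Compare one, two, and ten community strings across a /24 worth of addresses.
for count in (1, 2, 10):
    seconds = worst_case_snmp_seconds(devices=254, community_strings=count)
    print(f"{count:>2} strings: up to {seconds / 60:.1f} minutes of waiting")
```

The wait grows linearly with the number of strings, which is the whole argument for cleaning up your community strings first.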

I will delete the “private” one in my environment and operate with “public” only.

To communicate with Windows servers, we need to provide credentials.
In my lab, I’m using the Domain Admin; in production, that’s a big no-no. There, you would create a dedicated monitoring account that is a domain member but “just” a local admin. There are also ways to limit the permissions further, as explained here and in the other linked articles.

The switch on top here is essential. That’s a feature SolarWinds added a few years back, as many first-time users tried to monitor anything that responds to ping and wondered why there was zero information coming from those machines.
I suggest sticking to no; I’ll explain how to add an ICMP-only node and why you would do that in a bit.
The best practice for most situations is not to change anything on this screen.

The same applies to the next screen. There’s no need to tune the settings in a regular-sized and healthy environment.
You would only change the settings if the wizard gets stuck or if you’re a multinational with more than 200k network devices.

Never automate the first run.
If you need continuous discovery, and there are situations where you would want it, create another process and tune it appropriately.

Time for a coffee! Or, in huge environments, for lunch.

Not much here in my downsized homelab. I already see that something is missing, and there’s something I’m not interested in, so I will uncheck my printer before continuing. Unknown means that something was discovered and responded to SNMP but didn’t come with the OID for identification. We’ll deal with that later.

I remove the loopbacks. In production, you would keep them, at least from network devices with OSPF etc., but I don’t need them here.

A few applications have been discovered already:

And we can choose to monitor hardware and software inventories:

After this step, the wizard will import the results and offer to discover more applications, but I will do that later.
Let’s see what we have now!

Post-discovery steps

I mentioned already that I’m suffering from IT-OCD, so I want to clean up now.

Let’s click the SolarWinds logo top left to go to the summary dashboard. It’s messy, so let’s remove all the unnecessary information. This will also speed up the page delivery, as most of the widgets are based on database queries. Here’s a short 40 sec video:

Now populate the All Nodes widget. It’s still messy in its default setting.

I’m going to click the “unknown” device to check why it’s unknown:

The machine type isn’t populated because the system OID (sysObjectID) isn’t properly registered.
If you look up .55062 in a search engine, nothing useful comes back. There isn’t much you can do here.
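If you want to check which vendor number hides in a system OID, the sketch below pulls out the enterprise number, which is the arc directly after the standard `1.3.6.1.4.1` (enterprises) prefix. I'm assuming here that .55062 is that enterprise number; the full OID in the example is hypothetical, and `enterprise_number` is a helper name made up for this post.

```python
ENTERPRISES_PREFIX = (1, 3, 6, 1, 4, 1)  # iso.org.dod.internet.private.enterprises

def enterprise_number(sys_object_id):
    """Return the enterprise number from a dotted numeric OID, or None
    if the OID does not sit under the enterprises subtree."""
    parts = tuple(int(p) for p in sys_object_id.strip(".").split("."))
    prefix_len = len(ENTERPRISES_PREFIX)
    if len(parts) <= prefix_len or parts[:prefix_len] != ENTERPRISES_PREFIX:
        return None
    return parts[prefix_len]

# Hypothetical sysObjectID; only the 55062 part comes from my lab device.
print(enterprise_number(".1.3.6.1.4.1.55062.1"))
```

That number is what you would look up in a registry or quote when you contact the vendor.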

Ideally, you need to talk to the vendor and ask them to get their stuff in order and properly register the OID.

The other option is to ask SolarWinds in this thread to add the OID manually, but that would be more of a workaround than a proper fix, and it will take a while.

But I don’t particularly appreciate how nodes are displayed in the All Nodes widget anyway, so I will show you another workaround. You are in complete control of this one, and it won’t take long and has enormous benefits for various things you will do with the platform in the future.

Custom Properties

Go to Settings / All Settings / Manage custom properties, mark all and delete what’s there:

Custom properties are like tags or labels, which you can attach to everything.
You can use them for dashboards, reports, account limitations, alerts, or, as in our case, for widgets.

Suitable candidates would be properties based on location, departments, or a complex application delivery process.

I created one called “DeviceType,” and here are my settings:

In the next step I’m going to assign the values:

Now let’s return to our “All Nodes” widget and click edit in the top right corner.
This is what it looks like before I apply any change:

Now a screenshot showing my changes:

And this is the result. Success!
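Conceptually, the widget now does nothing more than group nodes by the value of the custom property. Here is a minimal sketch of that idea in plain Python. The node names and DeviceType values are invented for illustration, and the code does not talk to the platform at all.

```python
from collections import defaultdict

def group_by_property(nodes, prop, fallback="Unassigned"):
    """Group node captions by the value of a custom property.

    Nodes without a value end up under the fallback label.
    """
    groups = defaultdict(list)
    for node in nodes:
        groups[node.get(prop) or fallback].append(node["caption"])
    return dict(groups)

# Hypothetical lab inventory
nodes = [
    {"caption": "core-switch", "DeviceType": "Network"},
    {"caption": "dc01", "DeviceType": "Server"},
    {"caption": "hco01", "DeviceType": "Server"},
    {"caption": "mystery-box"},  # the "unknown" device, no value assigned yet
]

for device_type, captions in group_by_property(nodes, "DeviceType").items():
    print(f"{device_type}: {', '.join(captions)}")
```

The nice side effect is that the "unknown" device gets a sensible home as soon as you assign it a value, regardless of what its OID says.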

Dealing with alerts

In the navigation bar, click Alerts & Activity / Alerts, then Manage Alerts, and sort by “Enabled (On/Off)” as shown here:

Now I want you to mark all and select “Disable Alerts.”

Rinse and repeat until all alerts are disabled. Yes, I want you to disable all alerts.

Why did we do this?

The built-in alerts are merely templates to give you an idea of what an alert should look like, but they are not and will never be a perfect match for your environment.
Many first-time users don’t understand that and stick with the out-of-the-box alerts.
One of the results is a lot of spam.

The platform has an automated baselining feature, but it needs at least seven days of data to suggest a baseline and define the “normal” condition. So, give it some time.
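To illustrate why a baseline needs some history, here is a simplified sketch. It is not the platform's actual calculation: I'm assuming one averaged value per day and a common statistical approach, where thresholds sit a few standard deviations above the mean. The function name, the sigma multipliers, and the sample values are all assumptions.

```python
import statistics

def suggest_thresholds(daily_values, min_days=7, warn_sigma=2.0, crit_sigma=3.0):
    """Suggest warning/critical thresholds from daily averages.

    Returns None when there is not enough history to say what "normal" is.
    The sigma multipliers are illustrative assumptions.
    """
    if len(daily_values) < min_days:
        return None
    mean = statistics.mean(daily_values)
    sigma = statistics.pstdev(daily_values)
    return {
        "warning": mean + warn_sigma * sigma,
        "critical": mean + crit_sigma * sigma,
    }

# Hypothetical CPU load averages in percent, one value per day
print(suggest_thresholds([22, 25, 24]))                    # not enough data yet
print(suggest_thresholds([22, 25, 24, 30, 27, 23, 26]))
```

With only a few days of data, a single busy afternoon would skew the result, which is why waiting a full week is worth it.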

Furthermore, each alert that’s enabled but doesn’t have any meaning to you is consuming CPU resources. The system will continuously check for the alert conditions in the background, so get rid of those alerts in the beginning.

Add another account

If you lose access to the default admin account for whatever reason, there are steps to reset the password in the database.
While that’s possible, a much simpler way is to create a second account.
Sounds too easy.

Go to Settings / All Settings / Manage Accounts and create an Orion Individual Account.

Yes, please, even if you use AD auth as your primary login method.

Make it an admin:

Store the credentials somewhere secure!

We did change a lot, didn’t we?

I think it’s enough for now. Let the information sink in, and we’ll continue another time!
