Tutorial - Service availability check

Part 1: Creating a service availability script

We have already written service availability scripts for use with our Generic Script plugin. Download either the Python or the Bash version and edit the few configuration settings inside the script; after that it should work out of the box.

We recommend using the Python version because the Python code is better structured and easier to read.

If you are having problems with customizing the service availability script, or something is not clear, you can visit our Generic Script plugin docs for more information. You can also contact us at support@coscale.com and we will try to address any problems you might have.
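
The downloadable scripts are not reproduced here. As a rough illustration of what such a check does, the Python sketch below tests whether a TCP port accepts connections. It returns 0 when the service is available and 1 otherwise, matching the convention used by the alert rule in Part 4. The function name check_tcp_service and the example hosts and ports are hypothetical, and the sketch does not produce the output format that the Generic Script plugin expects; see the plugin docs for that.

```python
import socket


def check_tcp_service(host, port, timeout=5.0):
    """Return 0 if a TCP connection to host:port succeeds, 1 otherwise."""
    try:
        # create_connection resolves the host and opens a TCP connection;
        # the "with" block closes the socket again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return 0
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return 1


if __name__ == "__main__":
    # Hypothetical services; replace with your own hosts and ports.
    services = {"web": ("localhost", 80), "database": ("localhost", 5432)}
    for name, (host, port) in services.items():
        print(name, check_tcp_service(host, port, timeout=2.0))
```

A check like this only confirms that the port is reachable. Depending on the service, an HTTP request or a test query may be a better indicator of availability.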

Part 2: Creating an agent with a service availability script attached to it using the Generic Script plugin

  1. Go to Datasources > Agent

  2. Create a new agent or edit an existing one. The process is the same in both cases.

    Create new agent screenshot
  3. This tutorial only covers configuring the plugin that the agent needs. For more extensive documentation on the whole process of configuring an agent, please visit the agent docs. In general the process is straightforward and self-explanatory.

    Now go to the Plugin step and add the Generic Script plugin. A dialog will appear with the configuration steps for this plugin. The next step explains what to fill in.

    Generic script screenshot
  4. Configuring the Generic Script plugin is simple: you only have to provide the path to the script that runs the service availability check. The path is the location of the script on your server.
    • Go to the script step.
    • Scroll down and click add another file.
    • Fill in the path to the script in the input box that appears and click Finish.
    Generic script walkthrough screenshot
  5. Continue through the steps and finish the agent configuration. Then download it and install it on your server(s).

    If you are editing an existing agent, the agent will update itself automatically after you finish the configuration. The auto-update can take a few minutes to complete, so please be patient.

  6. Don't forget to deploy the service availability script to your server, at the same path that you provided in the configuration of the Generic Script plugin.

    Generally, only one server should run the Generic Script plugin with the service availability script. This does mean that if that server goes down, the availability check stops working as well, so you should have a backup system in place for this situation.
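
A mismatch between the configured path and the deployed location is easy to overlook. The Python sketch below is one way to check on the server that a file exists at the configured path. It also checks that the file is executable, which is an assumption here and not something this tutorial states. The function name script_is_deployed and the example path are hypothetical.

```python
import os


def script_is_deployed(path):
    """Return True if path is an existing regular file that can be executed."""
    # Assumption: the agent needs execute permission on the script.
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    # Hypothetical path; use the one entered in the plugin configuration.
    configured_path = "/opt/scripts/service_availability.py"
    print(configured_path, script_is_deployed(configured_path))
```

Run the check as the same user that runs the agent, since file permissions can differ between users.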

Part 3: Waiting for the first run of the service availability script

We have to wait until the script has been picked up by the agent and has run at least once. On the first run the agent will create the metrics in CoScale that have been defined in the service availability script.

Until the metrics have been created in CoScale, they are not available and you cannot create alerts for them.

Part 4: Creating alerts for services that become unavailable

  1. Go to Alerts > Manage

  2. Create a new notification scheme

    Create notification scheme screenshot
  3. Fill in the name and add the email addresses of the main recipients.

    The rest of the form is optional, but it is worth considering whether the optional settings would be useful in your situation.

    Alert schema screenshot
  4. Now you will see a new alert block. Continue by clicking on the add new alert button.

    Add alert screenshot
  5. Configure this new alert:
    • Select alert type: Static
      Add alert screenshot
    • Select Server Metric. The Generic Script plugin only pushes server metrics.
      Add alert screenshot
    • Configure the alert rule:
      • Choose the correct metric in the first select box. Look for the same name as you provided in the script.
      • Set the condition to "not equal to 0" (0 means the service is available, anything else means it is not available).
      • Set the period to between 120s and 180s to avoid too many false positives.
      • Auto resolving is turned on by default, and we recommend leaving it on.
        Add alert screenshot
    • Give your alert a name.
      Add alert screenshot
    • Click on Finish and your new alert will be active.
      Add alert screenshot
  6. Repeat this process for each service for which you want to be alerted when it becomes unavailable.
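
To see why a period of 120s to 180s reduces false positives, it can help to model the rule. The Python sketch below assumes that a static alert fires only when every sample inside the period violates the condition, so a single failed check between successful ones does not trigger it. This is an illustrative model, not CoScale's documented evaluation logic, and the function name should_alert and the sample data are hypothetical.

```python
def should_alert(samples, period_s, now):
    """Return True if every sample in the last period_s seconds is non-zero.

    samples is a list of (timestamp_in_seconds, value) pairs, where a value
    of 0 means the service was available at that time.
    """
    window = [value for ts, value in samples if now - period_s < ts <= now]
    # No data in the window is treated as "no alert" in this model.
    return bool(window) and all(value != 0 for value in window)


if __name__ == "__main__":
    # One failed check at t=60 followed by recovery: no alert.
    blip = [(0, 0), (60, 1), (120, 0), (180, 0)]
    print(should_alert(blip, period_s=120, now=180))

    # Failing for the whole period: alert.
    outage = [(0, 0), (60, 1), (120, 1), (180, 1)]
    print(should_alert(outage, period_s=120, now=180))
```

Under this model, a longer period means fewer alerts from short blips but a longer delay before a real outage is reported.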