Settings

The Spidermon settings allow you to customize the behaviour of your monitors by enabling, disabling and configuring features such as which monitors run, monitor actions, item validation and notifications.

Built-in settings reference

Here’s a list of all available Spidermon settings, in alphabetical order, along with their default values and the scope where they apply. These settings must be defined in the settings.py file of your Scrapy project.

SPIDERMON_ENABLED

Default: False

Whether to enable Spidermon.
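As a minimal sketch, enabling Spidermon in settings.py could look like the following. The EXTENSIONS entry is an assumption based on Spidermon's installation guide and is not described in this reference; check the extension path and priority value against the version you have installed.

```python
# settings.py (sketch)

# Turn Spidermon on; it is disabled by default.
SPIDERMON_ENABLED = True

# Assumption: the Spidermon extension also has to be registered with Scrapy.
# The path and priority below follow the installation guide, not this page.
EXTENSIONS = {
    "spidermon.contrib.scrapy.extensions.Spidermon": 500,
}
```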

SPIDERMON_EXPRESSIONS_MONITOR_CLASS

Default: spidermon.python.monitors.ExpressionMonitor

A subclass of spidermon.python.monitors.ExpressionMonitor.

This class will be used to generate expression monitors.

Note

You probably will not need to change this setting unless you have an advanced use case and need to change how the context data is built or how the on-the-fly MonitorSuite classes are generated. Otherwise, the default should be enough.

SPIDERMON_PERIODIC_MONITORS

Default: {}

A dict containing the monitor suites that must be executed periodically as key and the time interval (in seconds) between the executions as value.

For example, the following suite will be executed every 30 minutes:

SPIDERMON_PERIODIC_MONITORS = {
    'tutorial.monitors.PeriodicMonitorSuite': 1800,
}

SPIDERMON_SPIDER_CLOSE_MONITORS

Default: []

List of monitor suites to be executed when the spider closes.
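For illustration, a suite can be registered by its import path. The module and class name below are hypothetical, modelled on the tutorial.monitors naming used elsewhere in this reference.

```python
# settings.py (sketch)

# Hypothetical suite path: replace it with the import path of your own
# MonitorSuite subclass.
SPIDERMON_SPIDER_CLOSE_MONITORS = [
    "tutorial.monitors.SpiderCloseMonitorSuite",
]
```

The same shape applies to the other settings that take a list of monitor suites.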

SPIDERMON_SPIDER_CLOSE_EXPRESSION_MONITORS

Default: []

List of dictionaries describing expression monitors to run when a spider is closed.
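The sketch below shows what one such dictionary might look like. The keys ("name", "tests", "description", "expression") follow the layout used in Spidermon's expression monitor documentation and are an assumption here, as are the monitor name and the spider name "quotes"; verify them against your installed version.

```python
# settings.py (sketch)

# Assumed layout: each dictionary describes one monitor, and each entry in
# "tests" holds a Python expression that must evaluate to a truthy value.
SPIDERMON_SPIDER_CLOSE_EXPRESSION_MONITORS = [
    {
        "name": "SpiderNameMonitor",  # hypothetical monitor name
        "tests": [
            {
                "name": "test_spider_name",
                "description": "The spider name matches the expected one",
                "expression": "spider.name == 'quotes'",  # hypothetical spider
            },
        ],
    },
]
```

The open and engine-stop variants of this setting take the same kind of list.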

SPIDERMON_SPIDER_OPEN_MONITORS

Default: []

List of monitor suites to be executed when the spider starts.

SPIDERMON_SPIDER_OPEN_EXPRESSION_MONITORS

Default: []

List of dictionaries describing expression monitors to run when a spider is opened.

SPIDERMON_ENGINE_STOP_MONITORS

Default: []

List of monitor suites to be executed when the crawler engine is stopped.

SPIDERMON_ENGINE_STOP_EXPRESSION_MONITORS

Default: []

List of dictionaries describing expression monitors to run when the engine is stopped.

SPIDERMON_ADD_FIELD_COVERAGE

Default: False

When enabled, Spidermon will add statistics about the number of items scraped and coverage for each existing field following this format:

'spidermon_item_scraped_count/<item_type>/<field_name>': <item_count>
'spidermon_field_coverage/<item_type>/<field_name>': <coverage>

Note

Nested fields are also supported. For example, if your spider returns these items:

[
  {
    "field_1": {
      "nested_field_1_1": "value",
      "nested_field_1_2": "value",
    },
  },
  {
    "field_1": {
      "nested_field_1_1": "value",
    },
    "field_2": "value"
  },
]

Statistics will be like the following:

'spidermon_item_scraped_count/dict': 2,
'spidermon_item_scraped_count/dict/field_1': 2,
'spidermon_item_scraped_count/dict/field_1/nested_field_1_1': 2,
'spidermon_item_scraped_count/dict/field_1/nested_field_1_2': 1,
'spidermon_item_scraped_count/dict/field_2': 1,
'spidermon_field_coverage/dict/field_1': 1,
'spidermon_field_coverage/dict/field_1/nested_field_1_1': 1,
'spidermon_field_coverage/dict/field_1/nested_field_1_2': 0.5,
'spidermon_field_coverage/dict/field_2': 0.5,
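
To make the arithmetic behind these numbers explicit, here is a small standalone sketch that reproduces them for the two example items. It is not Spidermon's implementation; the helper names field_paths and coverage_stats are hypothetical, and coverage is assumed to be the number of items containing a field divided by the total number of items of that type.

```python
from collections import Counter


def field_paths(item, prefix=""):
    """Yield the slash-separated path of every field in a nested dict."""
    for name, value in item.items():
        path = f"{prefix}/{name}" if prefix else name
        yield path
        if isinstance(value, dict):
            # Descend into nested fields, extending the path.
            yield from field_paths(value, path)


def coverage_stats(items, item_type="dict"):
    """Build count and coverage statistics in the format shown above."""
    counts = Counter()
    for item in items:
        counts.update(field_paths(item))
    total = len(items)
    stats = {f"spidermon_item_scraped_count/{item_type}": total}
    for path, count in counts.items():
        stats[f"spidermon_item_scraped_count/{item_type}/{path}"] = count
        # Assumed definition: share of all items that contain this field.
        stats[f"spidermon_field_coverage/{item_type}/{path}"] = count / total
    return stats


items = [
    {"field_1": {"nested_field_1_1": "value", "nested_field_1_2": "value"}},
    {"field_1": {"nested_field_1_1": "value"}, "field_2": "value"},
]
stats = coverage_stats(items)
```

With these items, nested_field_1_2 and field_2 each appear in one of the two items, so their coverage is 0.5, while field_1 and nested_field_1_1 appear in both and have a coverage of 1.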

SPIDERMON_FIELD_COVERAGE_SKIP_NONE

Default: False

When enabled, returned fields that have None as value will not be counted as fields with a value.

Consider a spider that returns the following items:

[
  {
    "field_1": None,
    "field_2": "value",
  },
  {
    "field_1": "value",
    "field_2": "value",
  },
]

If this setting is set to True, spider statistics will be:

'spidermon_item_scraped_count/dict': 2,
'spidermon_item_scraped_count/dict/field_1': 1,  # Ignored None value
'spidermon_item_scraped_count/dict/field_2': 2,
'spidermon_field_coverage/dict/field_1': 0.5,  # Ignored None value
'spidermon_field_coverage/dict/field_2': 1,

If this setting is not provided or set to False, spider statistics will be:

'spidermon_item_scraped_count/dict': 2,
'spidermon_item_scraped_count/dict/field_1': 2,  # Did not ignore None value
'spidermon_item_scraped_count/dict/field_2': 2,
'spidermon_field_coverage/dict/field_1': 1,  # Did not ignore None value
'spidermon_field_coverage/dict/field_2': 1,
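
The effect of this setting on the counts can be sketched in plain Python. This is an illustration of the counting rule only, not Spidermon's code; the function count_fields and its skip_none parameter are hypothetical names.

```python
def count_fields(items, skip_none=False):
    """Count how many items carry each top-level field.

    When skip_none is True, a field whose value is None is not counted.
    """
    counts = {}
    for item in items:
        for name, value in item.items():
            if skip_none and value is None:
                continue  # Treat a None value as a missing field.
            counts[name] = counts.get(name, 0) + 1
    return counts


items = [
    {"field_1": None, "field_2": "value"},
    {"field_1": "value", "field_2": "value"},
]

counts_skipping_none = count_fields(items, skip_none=True)
counts_keeping_none = count_fields(items, skip_none=False)

# Coverage is the per-field count divided by the number of items.
coverage_skipping_none = {
    name: count / len(items) for name, count in counts_skipping_none.items()
}
```

Here field_1 is counted once when None values are skipped and twice otherwise, which is what moves its coverage between 0.5 and 1.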