This topic provides guidelines for deciding when to scale. The information is indicative only and is intended for reference.

The following two factors indicate that you might need to scale the Collection Station, Indexer, or Search components:

  • Product performance is deteriorating.
  • Usage of hardware resources such as the processor, memory, storage, and disk I/O starts exceeding acceptable limits.

When you scale a component, keep in mind that the issue that prompted the scaling can shift to another component. For example, suppose you notice that the rate of indexing is not keeping up with the incoming volume of data. After some analysis, you determine that the issue lies with the Collection Station, and you scale it. The indexing rate can now keep up with the incoming data volume. Later, you add a few hundred more data collectors to the system and notice that the indexing rate is once again falling behind the incoming volume of data. You might erroneously conclude that you need to scale the Collection Station again. In this scenario, it is possible that the problem has shifted from the Collection Station to the Indexer, and that the Indexer needs to be scaled.

Indicators for scaling Collection Station

The Collection Station needs to be scaled when the volume of data to be indexed exceeds the capacity of the currently available hardware.

The following table lists the factors that you can use as indicators for scaling up the Collection Station component, along with the functions that each factor supports.

Factor | Indicators for scaling | Functions supported
Processor | Average CPU required exceeds 75%. | Large volume of indexing; large number of data collectors
Memory | Maximum Java heap size required is greater than 80% of the physical memory available. This is controlled by the wrapper.java.maxmemory property. For more information, see Component configuration recommendations for horizontal scaling. |
Disk I/O | Disk I/O transfer rate exceeds 75% of the disk I/O available. |

Indicators for scaling Indexer

The Indexer needs to be scaled when the overall product usage exceeds the hardware capacity and starts negatively impacting the Collection Station (indexing) or Search components.

The following table lists the factors that you can use as indicators for scaling up the Indexer component, along with the functions that each factor supports.

Factor | Indicators for scaling | Functions supported
Processor | Average CPU required exceeds 75%. | High volume of indexing; high number of concurrent searches
Memory | Maximum Java heap size required is greater than 60% of the physical memory available. This is controlled by the wrapper.java.maxmemory property. For more information, see Component configuration recommendations for horizontal scaling. | High number of concurrent searches; search queries with a high number of fields; higher data retention period
Disk I/O | Disk I/O transfer rate exceeds 75% of the disk I/O available. | High volume of indexing; high number of concurrent searches
Disk space | Disk space required exceeds 75% of the available disk space. Disk space required = (Rate of disk space growth per day) * (Number of days of data retention) | Large volume of data; higher data retention period
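
The disk space formula in the Indexer table can be worked through with a small calculation. The following Python sketch is an illustration only, not part of the product: the function names and the sample figures (20 GB of growth per day, 30 days of retention, 750 GB available) are assumptions chosen for the example.

```python
def disk_space_required_gb(growth_gb_per_day, retention_days):
    """Disk space required = (rate of growth per day) * (days of retention)."""
    return growth_gb_per_day * retention_days


def exceeds_disk_space_indicator(required_gb, available_gb, threshold=0.75):
    """True when the required space is more than 75% of the available space."""
    return required_gb > threshold * available_gb


if __name__ == "__main__":
    # Hypothetical figures: 20 GB/day growth, 30 days of retention.
    required = disk_space_required_gb(20, 30)
    print(required)  # 600
    # 75% of 750 GB is 562.5 GB, so 600 GB exceeds the indicator.
    print(exceeds_disk_space_indicator(required, available_gb=750))  # True
```

With the same 600 GB requirement and 1000 GB available, the indicator is not exceeded, because 75% of 1000 GB is 750 GB.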

Indicators for scaling Search

The Search component needs to be scaled when the overall product usage exceeds the hardware capacity and starts negatively impacting the Search component.

Use the following factors as indicators for scaling up the Search component:

Factor | Indicators for scaling | Functions supported
Processor | Average CPU required exceeds 75%. | High number of concurrent searches; high number of email notifications (containing reports)
Memory | Maximum Java heap size required is greater than 75% of the physical memory available. This is controlled by the wrapper.java.maxmemory property. For more information, see Component configuration recommendations for horizontal scaling. | High number of concurrent searches
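
The thresholds listed for the Collection Station, Indexer, and Search components can be collected into a single lookup to check measured usage against them. The following Python sketch assumes that you already have usage figures expressed as fractions (for example, 0.62 for 62% average CPU). The dictionary keys and the function name are hypothetical and are not product settings.

```python
# Indicator thresholds from the tables in this topic, expressed as fractions.
# "heap" is the maximum Java heap size required relative to physical memory.
THRESHOLDS = {
    "collection_station": {"cpu": 0.75, "heap": 0.80, "disk_io": 0.75},
    "indexer": {"cpu": 0.75, "heap": 0.60, "disk_io": 0.75, "disk_space": 0.75},
    "search": {"cpu": 0.75, "heap": 0.75},
}


def exceeded_indicators(component, usage):
    """Return the sorted names of the factors whose usage exceeds the threshold.

    usage maps a factor name to a fraction between 0 and 1; factors that are
    missing from usage are treated as not exceeded.
    """
    limits = THRESHOLDS[component]
    return sorted(
        factor for factor, limit in limits.items()
        if usage.get(factor, 0.0) > limit
    )


if __name__ == "__main__":
    # Hypothetical Indexer measurements.
    sample = {"cpu": 0.62, "heap": 0.70, "disk_io": 0.40, "disk_space": 0.80}
    print(exceeded_indicators("indexer", sample))  # ['disk_space', 'heap']
```

Note that the same heap usage of 70% exceeds the Indexer indicator (60%) but not the Collection Station (80%) or Search (75%) indicators.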

Variables that impact hardware resources

Overall, the variables described in the following table determine the amount by which the resources required by the Collection Station, Indexer, and Search components are impacted. Your decision to scale depends on the amount by which the resources in your environment are affected by each of these usage factors.

For example, if you have a large volume of data to index, then you need to scale the Collection Station and Indexer components. Because the level of impact for this usage factor is high, you must certainly consider scaling.

The following table lists the variables that impact your hardware resources and thereby your decision to scale the Collection Station, Indexer, and Search components. The levels of impact indicate the impact on the Java heap size and CPU and are described as High, Medium, Low, or None.

Variables that determine impact to components that can be scaled

Variable | Collection Station | Indexer | Search
Large volume of data to index | High | High | None
Large number of concurrent users actively using the product | None | High | High
Large number of email notifications (containing reports) | None | None | High
Large number of notifications | None | Medium | None
Large number of fields added to the Filters panel | None | High | None
Concurrent searches with large time ranges | None | High | None
Large number of concurrent searches (without search commands) | None | High | Low
Large number of concurrent simple search commands (dedup, group, filter, head, tail) | None | High | Medium
Large number of concurrent advanced search commands (timechart, stats, top, rare) | None | High | Medium (Java heap size), Low (CPU)
Higher data retention period (for Java heap size) | None | High | None
Large number of data collectors | Medium | Medium | None
Large number of dashboard (or dashlet) executions | None | High | Medium
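
One way to read the table is to take the variables that apply to your environment and find the highest impact level for each component. The following Python sketch encodes the table as a lookup for that purpose. The short keys and the function name are hypothetical. For advanced search commands, the Search column is recorded as Medium (the Java heap size value), which is a simplification of the Medium/Low split in the table.

```python
COMPONENTS = ("Collection Station", "Indexer", "Search")
RANK = {"None": 0, "Low": 1, "Medium": 2, "High": 3}

# variable: (Collection Station, Indexer, Search), as listed in the table.
IMPACT = {
    "data_volume": ("High", "High", "None"),
    "concurrent_users": ("None", "High", "High"),
    "email_notifications": ("None", "None", "High"),
    "notifications": ("None", "Medium", "None"),
    "filter_panel_fields": ("None", "High", "None"),
    "large_time_range_searches": ("None", "High", "None"),
    "concurrent_searches": ("None", "High", "Low"),
    "simple_search_commands": ("None", "High", "Medium"),
    # Table lists Medium (Java heap size) and Low (CPU) for Search.
    "advanced_search_commands": ("None", "High", "Medium"),
    "data_retention": ("None", "High", "None"),
    "data_collectors": ("Medium", "Medium", "None"),
    "dashboard_executions": ("None", "High", "Medium"),
}


def highest_impact(observed):
    """Return the highest impact level per component for the given variables."""
    result = {component: "None" for component in COMPONENTS}
    for variable in observed:
        for component, level in zip(COMPONENTS, IMPACT[variable]):
            if RANK[level] > RANK[result[component]]:
                result[component] = level
    return result


if __name__ == "__main__":
    print(highest_impact(["data_collectors", "email_notifications"]))
```

For these two variables, the sketch reports Medium for the Collection Station and Indexer and High for Search, which suggests looking at the Search component first.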

Assessing the load handled by Collection Station or Indexer

You can assess the amount of load handled by the Collection Station or Indexer by finding out the indexing lag. The indexing lag is the time between the time at which data was collected and the time at which it was indexed, for each poll made by the data collector.

To find the indexing lag, search the data available in the Collection_metrics.log file. For example, you can run the following search query:

_index=metrics * && engine=COLLECTION_STATION && indexing-lag=* |
head 100000 | filter indexing-lag > 300000

This search query displays results that have an indexing lag greater than five minutes. The indexing lag is indicated by the indexing-lag field in the search results. The value of this field is provided in milliseconds. Any results returned by this search query indicate that the indexing activity is not able to keep pace with the collection activity.

Example

You can modify the search query in various ways depending on your goal.

Example 1: If you want to see results with an indexing lag greater than two minutes, you need to change the string 300000 to 120000 in the search query, as follows:

_index=metrics * && engine=COLLECTION_STATION && indexing-lag=* |
head 100000 | filter indexing-lag > 120000
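
Because the indexing-lag field is in milliseconds, changing the threshold amounts to converting minutes to milliseconds and substituting the value. The following Python sketch builds the query string for a given threshold. The function name is hypothetical, and the sketch assumes that the query behaves the same when written on a single line.

```python
def lag_query(threshold_minutes, limit=100000):
    """Build the indexing-lag search query for a threshold given in minutes."""
    # The indexing-lag field is reported in milliseconds.
    threshold_ms = int(threshold_minutes * 60 * 1000)
    return (
        "_index=metrics * && engine=COLLECTION_STATION && indexing-lag=* | "
        f"head {limit} | filter indexing-lag > {threshold_ms}"
    )


if __name__ == "__main__":
    print(lag_query(2))  # threshold of 120000 ms
    print(lag_query(5))  # threshold of 300000 ms
```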

Example 2: If you want to use the search query to see the pattern in which the indexing lag occurred for the last seven days, then you can run the following search query with the time range set to Last 7 days.

_index=metrics * && engine=COLLECTION_STATION && indexing-lag=* |
head 100000 | filter indexing-lag > 300000 | timechart span=1d count(indexing-lag), avg(indexing-lag) by collectorid

In the preceding search query, collectorid represents the Collection Station identifier.

This query displays the following information for each day per collectorid:
  • Total number of polls where the indexing lag is greater than five minutes (represented by the count function).
  • Average indexing lag obtained from the polls where the indexing lag is greater than five minutes (represented by the avg function).
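
To make the two aggregates concrete, the following Python sketch reproduces the same grouping on a handful of made-up poll records. It only illustrates what the count and avg functions report per day and per collectorid. It is not how the product computes the chart, and the dates, identifiers, and lag values are hypothetical.

```python
from collections import defaultdict


def summarize_lag(polls, threshold_ms=300000):
    """Group polls above the threshold by (day, collectorid).

    polls is an iterable of (day, collectorid, indexing_lag_ms) tuples.
    Returns {(day, collectorid): (count, average_lag_ms)}.
    """
    groups = defaultdict(list)
    for day, collectorid, lag_ms in polls:
        if lag_ms > threshold_ms:  # same condition as the filter command
            groups[(day, collectorid)].append(lag_ms)
    return {
        key: (len(lags), sum(lags) / len(lags))
        for key, lags in groups.items()
    }


if __name__ == "__main__":
    sample_polls = [
        ("2024-01-01", "cs-1", 310000),
        ("2024-01-01", "cs-1", 450000),
        ("2024-01-01", "cs-2", 100000),  # below five minutes, ignored
        ("2024-01-02", "cs-1", 600000),
    ]
    for key, (count, average) in sorted(summarize_lag(sample_polls).items()):
        print(key, count, average)
```

In this sample, cs-1 has two lagging polls on the first day with an average lag of 380000 ms, and cs-2 does not appear because its only poll is under the threshold.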

You can also monitor the search results obtained by running the preceding search queries by adding dashboards or notifications. To do this, you need to first save the search query. For more information, see Managing saved searches.