This topic provides guidelines that you can use to decide when to scale. The information is indicative only and is meant for reference purposes.
The following two factors are an indication that you might need to scale the Collection Station, Indexer, or Search components:
When you scale a component, keep in mind that the bottleneck that prompted the scaling can shift to another component. For example, you notice that the rate of indexing is not keeping up with the incoming volume of data. After some analysis, you determine that the Collection Station is the cause, and you scale it. The indexing rate then keeps up with the incoming data volume. Later, you add a few hundred more data collectors to the system and notice that the indexing rate is once again falling behind. You might erroneously conclude that you need to scale the Collection Station again. In this scenario, it is possible that the problem has shifted from the Collection Station to the Indexer, and that the Indexer needs to be scaled.
The Collection Station needs to be scaled when a large volume of data to be indexed exceeds the capacity of the current available hardware.
The following table lists the factors that indicate a need to scale up the Collection Station component, along with the functions that each factor supports.
Factor | Indicators for scaling | Functions supported |
---|---|---|
Processor | Average CPU required exceeds 75%. | |
Memory | Maximum Java heap size required is greater than 80% of the physical memory available. This is controlled by the | |
Disk I/O | Disk I/O transfer rate exceeds 75% of the disk I/O available. | |
The Indexer needs to be scaled when the overall product usage exceeds the hardware capacity and starts negatively impacting the Collection Station (indexing) or Search components.
The following table lists the factors that indicate a need to scale up the Indexer component, along with the functions that each factor supports.
Factor | Indicators for scaling | Functions supported |
---|---|---|
Processor | Average CPU required exceeds 75%. | |
Memory | Maximum Java heap size required is greater than 60% of the physical memory available. This is controlled by the | |
Disk I/O | Disk I/O transfer rate exceeds 75% of the disk I/O available. | |
Disk space | Disk space required exceeds 75% of the available disk space. Disk space required = (Rate of disk space growth per day) * (Number of days of data retention) | |
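The disk space indicator can be checked with simple arithmetic. The following is a minimal sketch of the formula from the table; the function names and the sample figures (12 GB per day, 30 days of retention, 400 GB of disk) are hypothetical and used only for illustration.

```python
def disk_space_required_gb(growth_gb_per_day, retention_days):
    """Disk space required = (rate of disk space growth per day) * (days of data retention)."""
    return growth_gb_per_day * retention_days


def indexer_disk_needs_scaling(growth_gb_per_day, retention_days, available_gb, threshold=0.75):
    """Return True when the required disk space exceeds 75% of the available disk space."""
    required = disk_space_required_gb(growth_gb_per_day, retention_days)
    return required > threshold * available_gb


# Hypothetical figures: 12 GB/day growth, 30-day retention, 400 GB of available disk.
required = disk_space_required_gb(12, 30)
print(required, indexer_disk_needs_scaling(12, 30, 400))
```

With these sample figures, the required space is 360 GB, which is above 75% of 400 GB (300 GB), so the indicator for scaling is met.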
The Search component needs to be scaled when the overall product usage exceeds the hardware capacity and starts negatively impacting the Search component.
Use the following factors as indicators for scaling up the Search component:
Factor | Indicators for scaling | Functions supported |
---|---|---|
Processor | Average CPU required exceeds 75%. | |
Memory | Maximum Java heap size required is greater than 75% of the physical memory available. This is controlled by the | High number of concurrent searches |
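The thresholds in the three tables above can be collected into one lookup to compare against measured usage. This is only a sketch: the dictionary keys, the function name, and the way usage is expressed (as fractions between 0 and 1) are assumptions for illustration, and how you measure usage depends on your monitoring tools.

```python
# Thresholds from the tables above, expressed as fractions of the available resource.
# "heap" is the maximum Java heap size required relative to the physical memory available.
THRESHOLDS = {
    "Collection Station": {"cpu": 0.75, "heap": 0.80, "disk_io": 0.75},
    "Indexer": {"cpu": 0.75, "heap": 0.60, "disk_io": 0.75, "disk_space": 0.75},
    "Search": {"cpu": 0.75, "heap": 0.75},
}


def exceeded_factors(component, usage):
    """Return the factors whose measured usage exceeds the threshold for the component.

    usage maps a factor name to the fraction used (hypothetical input format).
    Factors missing from usage are treated as not exceeded.
    """
    limits = THRESHOLDS[component]
    return sorted(f for f, limit in limits.items() if usage.get(f, 0.0) > limit)


# Hypothetical measurements for an Indexer host.
print(exceeded_factors("Indexer", {"cpu": 0.50, "heap": 0.65, "disk_io": 0.80}))
```

A non-empty result suggests that the component is a candidate for scaling, subject to the caveat that the bottleneck can shift between components.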
Overall, the variables described in the following table determine the amount by which the resources required by the Collection Station, Indexer, and Search components are impacted. Your decision to scale depends on the amount by which the resources in your environment are affected by each of these usage factors.
For example, if you have a large volume of data to index, then you need to scale the Collection Station and Indexer components. Because the level of impact for this usage factor is high, you must certainly consider scaling.
The following table lists the variables that impact your hardware resources and thereby influence your decision to scale the Collection Station, Indexer, and Search components. The levels of impact indicate the impact on the Java heap size and CPU, and are described as High, Medium, Low, and None.
Variables that determine impact to components that can be scaled
Variable | Collection Station | Indexer | Search |
---|---|---|---|
Large volume of data to index | High | High | None |
Large number of concurrent users actively using the product | None | High | High |
Large number of email notifications (containing reports) | None | None | High |
Large number of notifications | None | Medium | None |
Large number of fields added to the Filters panel | None | High | None |
Concurrent searches with large time ranges | None | High | None |
Large number of concurrent searches (without search commands) | None | High | Low |
Large number of concurrent simple search commands (dedup, group, filter, head, tail) | None | High | Medium |
Large number of concurrent advanced search commands (timechart, stats, top, rare) | None | High | Medium (Java heap size), Low (CPU) |
Higher data retention period (for Java heap size) | None | High | None |
Large number of data collectors | Medium | Medium | None |
Large number of dashboard (or dashlet) executions | None | High | Medium |
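One way to use the table is as a lookup from a usage variable to the components worth reviewing. The sketch below copies a few rows from the table; the data structure, the function name, and the default cut-off of Medium are assumptions for illustration, not product behavior.

```python
COMPONENTS = ("Collection Station", "Indexer", "Search")
LEVELS = ["None", "Low", "Medium", "High"]  # ordered from weakest to strongest impact

# A few rows copied from the table above: (Collection Station, Indexer, Search).
IMPACT = {
    "Large volume of data to index": ("High", "High", "None"),
    "Large number of concurrent users actively using the product": ("None", "High", "High"),
    "Large number of concurrent searches (without search commands)": ("None", "High", "Low"),
    "Large number of data collectors": ("Medium", "Medium", "None"),
    "Large number of dashboard (or dashlet) executions": ("None", "High", "Medium"),
}


def components_to_review(variable, minimum="Medium"):
    """Return the components whose impact level for the variable is at least `minimum`."""
    floor = LEVELS.index(minimum)
    return [
        component
        for component, level in zip(COMPONENTS, IMPACT[variable])
        if LEVELS.index(level) >= floor
    ]


print(components_to_review("Large volume of data to index"))
```

For a large volume of data to index, the lookup returns the Collection Station and the Indexer, which matches the example given earlier in this topic.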
You can assess the load handled by the Collection Station or Indexer by finding the indexing lag. The indexing lag is the time between the moment at which data was collected and the moment at which it was indexed, for each poll made by the data collector.
To find the indexing lag, run searches on the data available in the Collection_metrics.log file. For example, you can run the following search query:
_index=metrics * && engine=COLLECTION_STATION && indexing-lag=* |
head 100000 | filter indexing-lag > 300000
This search query displays results that have an indexing lag greater than five minutes. The indexing lag is indicated by the indexing-lag field in the search results, and its value is given in milliseconds. Results returned by this search query indicate that the indexing activity is not able to keep pace with the collection activity.
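To make the definition concrete, the following sketch computes the indexing lag for a single poll and compares it with the five-minute threshold used in the query. The function name and the sample timestamps are hypothetical; the product reports the value for you in the indexing-lag field.

```python
from datetime import datetime, timedelta

LAG_THRESHOLD_MS = 300000  # five minutes, the threshold used in the search query


def indexing_lag_ms(collected_at, indexed_at):
    """Lag between collection and indexing for one poll, in whole milliseconds."""
    return (indexed_at - collected_at) // timedelta(milliseconds=1)


# Hypothetical poll: collected at 12:00:00 and indexed at 12:06:30.
collected = datetime(2024, 1, 1, 12, 0, 0)
indexed = datetime(2024, 1, 1, 12, 6, 30)
lag = indexing_lag_ms(collected, indexed)
print(lag, lag > LAG_THRESHOLD_MS)  # 390000 True
```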
Example
You can modify the search query in various ways depending on your goal.
Example 1: If you want to see results with an indexing lag greater than two minutes, change the string 300000 to 120000 in the search query, as follows:
_index=metrics * && engine=COLLECTION_STATION && indexing-lag=* |
head 100000 | filter indexing-lag > 120000
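Because the threshold is expressed in milliseconds, it can help to generate the query text from a value in minutes. The helper below is a hypothetical convenience for building the query string shown above; it is not part of the product, and you still run the resulting query in the product's search interface.

```python
def lag_query(minutes, head=100000):
    """Build the indexing-lag search query for a threshold given in minutes."""
    threshold_ms = int(minutes * 60 * 1000)  # the indexing-lag field is in milliseconds
    return (
        "_index=metrics * && engine=COLLECTION_STATION && indexing-lag=* | "
        f"head {head} | filter indexing-lag > {threshold_ms}"
    )


print(lag_query(2))
```

Calling the helper with 2 produces the two-minute query from Example 1, and calling it with 5 produces the original five-minute query.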
Example 2: If you want to use the search query to see the pattern in which the indexing lag occurred for the last seven days, then you can run the following search query with the time range set to Last 7 days.
_index=metrics * && engine=COLLECTION_STATION && indexing-lag=* |
head 100000 | filter indexing-lag > 300000 | timechart span=1d count(indexing-lag), avg(indexing-lag) by collectorid
In the preceding search query, collectorid represents the Collection Station identifier.
For each collectorid and for each day, the query returns the number of results with an indexing lag greater than five minutes (count function) and the average indexing lag of those results (avg function).
You can also monitor the search results obtained by running the preceding search queries by adding dashboards or notifications. To do this, you need to first save the search query. For more information, see Saving and sharing searches for analytics and monitoring.