Generic - NCSA log parser
This topic describes how to configure and use the NCSA log parser, which parses web server log files (Apache and IIS).
Requirements
Obtain a sample of the file to be parsed and validate its format. It must be a standard NCSA-formatted log file.
The following log entry formats are supported:
#RECORD GET
10.205.25.53 - - [27/Oct/2004:04:02:04 +0200] "GET http://u4p.intjs HTTP/1.1" 304 238 TCP_IMS_HIT:NONE
#RECORD GET
2004-11-14 00:00:17 10.21.12.162 - 10.21.13.26 80 GET /dslib/IsClusterAvailable.asp - 200 0
#RECORD POST
2004-11-14 00:00:17 10.21.12.162 - 10.21.13.26 80 POST /dslib/IsClusterAvailable.asp - 200 0
#RECORD POST
62.11.106.82 - - [01/Nov/2004:00:00:26 +0100] "POST /aol2004/aolres/ HTTP/1.1" 200 87808
#RECORD GET/POST IIS 6
2008-10-07 22:35:44 GET /Default.asp - hosting\ed_adm 151.53.255.231 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+it;+rv:1.9.0.3)+Gecko/2008092417+Firefox/3.0.3 302 507 458
#RECORD GET/POST IIS 6
2008-10-07 22:35:47 GET /clienti/Graphic/Images/titolo_statistiche.gif - hosting\finanzia_adm 151.53.255.231 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+it;+rv:1.9.0.3)+Gecko/2008092417+Firefox/3.0.3 200 699 592
2008-10-07 22:35:51 POST /clienti/ScriptServerSide/Exec_Statistiche.asp - hosting\finanzia_adm 151.53.255.231 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+it;+rv:1.9.0.3)+Gecko/2008092417+Firefox/3.0.3 200 330 749
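Before configuring the ETL, it can help to check a sample file against the shapes shown above. The following is a minimal sketch in Python, not the parser's actual matching logic: the regular expressions, the labels "ncsa-common" and "w3c-style", and the function names classify_line and unrecognized_lines are all assumptions made for illustration, derived only from the sample entries in this topic.

```python
import re

# Assumed pattern for NCSA common-format entries, such as:
# host ident user [timestamp] "METHOD url protocol" status size [extra]
NCSA_COMMON = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+)(?: (?P<protocol>[^"]+))?" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: (?P<extra>.*))?$'
)

# Assumed loose pattern for the date-first (IIS-style) entries:
# date time [fields] METHOD url [fields] status [numeric fields]
W3C_STYLE = re.compile(
    r'^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) '
    r'(?:\S+ )*?(?P<method>GET|POST) (?P<url>\S+) '
    r'(?:\S+ )*?(?P<status>\d{3})(?: \d+)*$'
)


def classify_line(line):
    """Return a hypothetical format label for one log line, or None."""
    line = line.rstrip("\r\n")
    if NCSA_COMMON.match(line):
        return "ncsa-common"
    if W3C_STYLE.match(line):
        return "w3c-style"
    return None


def unrecognized_lines(lines):
    """Return 1-based numbers of lines that match neither pattern.

    Blank lines and lines starting with '#' (labels or directives)
    are skipped rather than reported.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if classify_line(stripped) is None:
            bad.append(number)
    return bad
```

Passing the lines of a sample file to unrecognized_lines gives a quick list of entries worth inspecting by hand. An empty result only means that the lines resemble the samples above. It does not guarantee that the ETL will accept the file.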
Then, define a log file access strategy. BMC Helix Continuous Optimization can access files through different methods:
- Local repository - Files are stored in the BMC Helix Continuous Optimization repository, that is, a repository hosted on the filesystem of the server running the ETL Engine.
- Remote repository - Files are stored on a remote repository that BMC Helix Continuous Optimization can access through one of the following protocols: SCP, FTP, SFTP, or Windows Shares. If you select Windows Shares, a Samba client must be installed on the ETL Engine server. An account with at least read-only access to the remote repository is required.
Integration steps
To integrate BMC Helix Continuous Optimization with the NCSA log parser, perform the following task:
- Navigate to Administration > ETL & SYSTEM TASKS > ETL tasks.
- In the ETL tasks page, click Add > Add ETL under the Last run tab.
In the Add ETL page, set values for the following properties under each expandable tab.
Basic properties
Advanced properties
- Click Save.
You return to the Last run tab under the ETL tasks page.
- Validate the results in simulation mode: In the ETL tasks table under ETL tasks > Last run, locate your ETL (ETL task name), and click the run icon to run the ETL.
After you run the ETL, the Last exit column in the ETL tasks table displays one of the following values:
- OK: The ETL executed without any error in simulation mode.
- WARNING: The ETL execution returned some warnings in simulation mode. Check the ETL log.
- ERROR: The ETL execution returned errors and was unsuccessful. Edit the active Run configuration and try again.
- Switch the ETL to production mode by performing the following steps:
- In the ETL tasks table under ETL tasks > Last run, click the ETL under the Name column.
- In the Run configurations table in the ETL details page, click the edit icon to edit the active run configuration.
- In the Edit run configuration page, navigate to the Run configuration expandable tab and set Execute in simulation mode to No.
- Click Save.
- Locate the ETL in the ETL tasks table and click the run icon to run it, or schedule an ETL run.
After you run the ETL, or schedule it for a run, it extracts the data from the source and transfers it to the BMC Helix Continuous Optimization database.
Metrics
This section describes which metrics are populated by default.
Supported datasets
Related topics
Dataset reference for ETL tasks