ELIS 2013 Manual

Reporting

Reporting Engine Technical Details

ETL Cron Process:

To aid in delivering reports accurately and quickly, ELIS Reports builds a data warehouse of data from your Moodle/ELIS site. The ETL Cron process is the tool that performs this function. ETL stands for "Extract, Transform, Load" - this describes what the cron process is doing - we extract data from your Moodle logs, ELIS and Moodle tables, etc. We transform this data into easily retrievable forms - for example for the time in course data we run an algorithm to estimate time spent, and load the results into the data warehouse. Finally, we load the extracted and transformed results into the data warehouse for retrieval by the ELIS Reports BI engine.

Technically the ELIS data warehouse is optimized for reading data, whereas many of the tables it pulls that data from are optimized for writing data. By extracting data from write optimized tables and loading it into read optimized tables in the data warehouse, we make the retrieval of the data much more efficient and rapid. This, along with the algorithmic processing described above, enables ELIS Reports to deliver very large recordset reports in a reasonable timeframe (see test results below for a ~50,000 user site on a Remote-Learner Level 4 cluster).

To avoid having the ETL process impact site performance, we process this data in small chunks. Every 5 minutes the process is initiated - however if the process detects the server is under heavy load from user activity or other processes, it will cancel itself, and check at the next 5 minute interval. This means that most processing is automatically done at times when your site is not under load from users or other processes. This optimization also means that processing on an active site is never really 'done' as there is always new data being delivered to the log tables by user activity. For example in the site below, out of ~5.7 million log records, there are 40 records remaining to be processed.

This small number of unprocessed records won't greatly impact the accuracy of the reports. For example if a user is showing they have spent 40 hours on the site, and today they spend another hour, their 41st hour might not show up in reports until the ETL process has caught up with today's records. If you consistently see a high number (>200) in the 'remain to be processed', then you may need to have a larger server or more server resources assigned to your RHEV instance - contact your Remote-Learner sales representative to inquire about adding more resources to your server. A separate reporting server (dedicated only to running reports) may also be an option.

When a Moodle site is upgraded to ELIS, and/or when a larger number of log records have been created than ELIS was able to process in 5 minutes, there is a message that is displayed in your Health Check stating that the ETL process is in progress. The message will display the number of records processed and the number of records that remain to be processed. The following image is showing an example.

ETL process in progress warning


Reporting Engine Optimization:

The following data provides some rough numbers for performance changes in the PHP report engine.

Some specific notes:

  • these numbers specifically reflect the performance related to running the Individual User and Individual Course Progress reports for a user with a small dataset, with about 50,000 users present on the site
  • memory usage numbers represent the "peak" usage within the execution
  • run times are measured using firebug, with the exception of the report export cases, which were measured through PHP
Scenario Description
Memory Usage (1.9.1 code)
Run Time (1.9.1 code)
Memory Usage (1.9.2 code)
Run Time (1.9.2 code)
Comments
- Individual User Report - Click on report and view the parameters screen for the first time after logging in
183.7 MB
35.2s (onload: 31.86s)
88.7 MB
9.14s (onload: 20.38s)
- Individual Course Progress Report - Click on report and view the parameters screen for the first time after logging in 183.9 MB
36.68s (onload: 33.56s) 88.7 MB
19.28s (onload: 21.44s)
- Individual User Report - Running the report for the RL Admin user
298.1 MB
25.66s
77.7 MB
5.58s
- Individual Course Progress Report - Running the report for the RL Admin user
297.2 MB
24.81s 77.8 MB
5.18s
- Individual User Report - Returning to the parameter screen after running the report
125.6 MB
13.56s 77.7 MB
6.21s
- Individual Course Progress Report - Returning to the parameter screen after running the report 125.9 MB
16.42s
77.8 MB
17.15s This is now slower because the report parameters are not cached
- Individual User Report - Hitting the "Refresh" button
164.3 MB
20.13s
18.7 MB
0.421s
- Individual Course Progress Report - Hitting the "Refresh" button
163.2 MB
25.7s 19.0 MB
0.414s
- Individual User Report - Sorting by first column
163.9 MB
20.3s 18.7 MB
0.531s

- Individual Course Progress Report - Sorting by first column
163,7 MB
25.48s 19.0 MB
0.607s
- Individual User Report - CSV export
162.8 MB
8.4809s 19.6 MB
0.1761s
- Individual Course Progress Report - CSV Export
162.8 MB
13.5448 19.7 MB
0.1699s
-Individual User Report - PDF Export
162.8 MB
8.5597s 20.2 MB
0.1872s
- Individual Course Progress Report - PDF Export
162.9 MB
13.8428 20.4 MB
0.2033s