Saturday, March 8, 2025

Analyze slow query logs from Amazon RDS and process them using pt-query-digest

To analyze slow query logs from Amazon RDS and process them using pt-query-digest, you will need to follow a series of steps to first obtain the log, then process and analyze it.




 Below is a step-by-step guide:

Step 1: Enable Slow Query Log in Amazon RDS

  1. Ensure slow query logging is enabled:

    • You can enable the slow query log by modifying the DB parameter group associated with your RDS instance.
    • Navigate to the RDS DashboardParameter Groups → Select the parameter group associated with your instance.
    • Modify the following parameters:
      • slow_query_log = 1
      • long_query_time = <time> (e.g., set to 1 second)
      • log_output = FILE (if not already set)
      • You may need to restart the RDS instance for changes to take effect.
  2. Check the RDS log export settings:

    • Go to RDS DashboardLog & events and make sure that logs are being stored and available for download.
    • Logs will typically be available in CloudWatch Logs or as downloadable files directly from RDS.

Step 2: Retrieve Slow Query Logs from RDS

  • You can use the AWS Management Console or AWS CLI to access your logs.

Using AWS Management Console:

  • In the RDS Console, navigate to the Logs & events section.
  • Select the desired DB instance and view the slow query log entries.

Using AWS CLI:

To download the logs via AWS CLI, run:

aws rds download-db-log-file-portion --db-instance-identifier <your-db-instance-id> --log-file-name slowquery/mysql-slowquery.log --starting-token 0 --output text > slowquery.log

This command downloads the slow query log into a file named slowquery.log.

Step 3: Install pt-query-digest

pt-query-digest is a tool from Percona Toolkit that processes and analyzes MySQL slow query logs.

  1. Install Percona Toolkit (which contains pt-query-digest):

    On Ubuntu/Debian:

    sudo apt-get update
    sudo apt-get install percona-toolkit
    

    On CentOS/Red Hat:

    sudo yum install percona-toolkit
    

Step 4: Run pt-query-digest on the Slow Query Log

Once you have the slow query log (e.g., slowquery.log), you can process it using pt-query-digest.

pt-query-digest slowquery.log > slowquery_analysis.txt

This command will process the log file and generate a detailed analysis in slowquery_analysis.txt.

Step 5: Review the Analysis

The output from pt-query-digest will include:

  • The most time-consuming queries.
  • Query statistics (e.g., execution time, number of rows examined, etc.).
  • Query patterns that can be optimized (e.g., missing indexes).

This information will help you identify performance bottlenecks in your queries.

Additional Options for pt-query-digest

  • Analyze queries by time: If you want to see queries ordered by the total time spent, you can use the --order-by flag.

    pt-query-digest --order-by=Time slowquery.log > slowquery_analysis_time_sorted.txt
    
  • Ignore certain queries: You can also use --ignore-query to filter out certain queries you don’t want to analyze.

    pt-query-digest --ignore-query="SELECT NOW()" slowquery.log
    

Step 6: Optimize Based on Results

Using the analysis results, focus on queries with:

  • High execution time.
  • Large numbers of rows examined.
  • Potentially missing indexes.

Consider adding appropriate indexes, rewriting queries for efficiency, or adjusting your database configuration to improve performance.


Let me know if you need further details or help with the output analysis!