    Annotating Your Logs

    By Josh Williams
    March 31, 2011

    We recently did some PostgreSQL performance analysis for a client whose application was having some scaling problems. In essence, they wanted to know where Postgres was getting bogged down, and once we knew that we'd be able to target some fixes. But to get to that point, we had to gather a whole bunch of log data for analysis while the test software hit the site.

    This is on Postgres 8.3 in a rather locked down environment, by the way. Coordinated pg_rotate_logfile() calls were useful, but the process would occasionally devolve to something resembling: “Okay, we’re adding 60 more users … now!” Then I’d write down the time stamp and figure out an appropriate place to slice the log file later.
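
    As a reminder, pg_rotate_logfile() just asks the server to switch to a new log file on demand; it requires superuser privileges, and only does anything when the server’s own log collection is in use. Reusing this post’s database for a quick example:

    pubsite=# SELECT pg_rotate_logfile();
     pg_rotate_logfile 
    -------------------
     t
    (1 row)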

    That got me thinking: what if we could just drop an entry into the log file, and use it to filter things out later? My first instinct was to see whether a patch would be accepted, maybe a wrapper for ereport(), something easy. Turns out, it’s even easier than that…

    pubsite=# DO $$BEGIN RAISE LOG 'MARK: 60 users'; END;$$;
    DO
    Time: 0.464 ms
    pubsite=# DO $$BEGIN RAISE LOG 'MARK: 120 users'; END;$$;
    DO
    Time: 0.378 ms
    pubsite=# DO $$BEGIN RAISE LOG 'MARK: 360 users'; END;$$;
    DO
    Time: 0.700 ms
    

    Of course the above will only work on version 9.0 and up, since it relies on DO blocks. Previous versions that have PL/pgSQL turned on can just create a function that does the same thing.
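
    For instance, something along these lines should do it (a rough sketch; the log_mark name is just for illustration):

    CREATE FUNCTION log_mark(message text) RETURNS void AS $$
    BEGIN
        -- LOG-level messages go to the server log rather than the client
        RAISE LOG 'MARK: %', message;
    END;
    $$ LANGUAGE plpgsql;

    SELECT log_mark('60 users');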

    The “LOG” severity level is an informational message that’s supposed to always make it into the log files. So with those marks in place, a grep through the log can reveal just where they appear, and sed can extract the sections of log between those lines and feed them into your favorite analysis utility:

    postgres@mothra:~$ grep -n 'LOG:  MARK' /var/log/postgresql/postgresql-9.0-main.log 
    19180:2011-03-31 20:20:37 EDT LOG:  MARK: 60 users
    19478:2011-03-31 20:25:48 EDT LOG:  MARK: 120 users
    20247:2011-03-31 20:32:15 EDT LOG:  MARK: 360 users
    postgres@mothra:~$ sed -n '19180,19478p' /var/log/postgresql/postgresql-9.0-main.log | bin/pgsi.pl > 60users.html
    
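
    With more than a couple of marks, that slicing gets tedious, so it could be scripted. Here’s a rough, untested sketch that cuts the log into one file per interval between consecutive marks (the section_N.log names are made up):

    #!/bin/sh
    # Slice a Postgres log into sections, one per pair of consecutive MARK lines.
    LOG=/var/log/postgresql/postgresql-9.0-main.log
    i=0
    prev=
    grep -n 'LOG:  MARK' "$LOG" | cut -d: -f1 |
    while read line; do
        if [ -n "$prev" ]; then
            i=$((i + 1))
            sed -n "${prev},${line}p" "$LOG" > "section_${i}.log"
        fi
        prev=$line
    done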

    Oh, and the performance problem? Turns out it wasn’t Postgres at all: the average execution time of every single query varied only minimally as the concurrent user count was scaled higher and higher. But that’s another story.

    Tags: database, performance, postgres
