    Shell Command Outputs Truncated in Python

    By Selvakumar Arumugam
    April 5, 2018

    Recently I was working on a Python script that parses and processes the output of shell commands on Ubuntu. The output captured by the script turned out to be truncated.

    The sections below walk through the debugging process, identify the root cause, and explain the solution in detail, using Python 2.

    Problem

    The following code block shows the output of a shell command that lists the name and version of each installed package on Ubuntu.

    # dpkg -l | grep ^ii | awk '{print $2 "    " $3}'
    accountsservice    0.6.35-0ubuntu7.3
    acl    2.2.52-1
    adduser    3.113+nmu3ubuntu3
    ant    1.9.3-2build1
    ant-optional    1.9.3-2build1
    apache2    2.4.7-1ubuntu4.18
    apache2-bin    2.4.7-1ubuntu4.18
    apache2-data    2.4.7-1ubuntu4.18
    apache2-utils    2.4.7-1ubuntu4.18
    apparmor    2.10.95-0ubuntu2.6~14.04.1
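
Since the goal was to parse this output in Python, here is a minimal sketch of turning such "name    version" lines into a dictionary. The function name parse_packages and the sample text are illustrative assumptions, not part of the original script.

```python
def parse_packages(text):
    """Map package name -> version from lines of the form 'name    version'."""
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        name, version = fields
        packages[name] = version
    return packages


sample = "acl    2.2.52-1\nant    1.9.3-2build1\n"
print(parse_packages(sample))
```

A parser like this cannot tell a truncated version string from a complete one, which is why the truncation described next matters.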
    

    The same shell command runs from the Python console, but the versions of several packages come back truncated, for example accountsservice, adduser, and apache2.

    >>> import subprocess
    >>> installed_packages = subprocess.check_output(['dpkg -l | grep ^ii | awk \'{print $2 "    " $3}\''], shell=True)
    >>> print installed_packages
    accountsservice    0.6.35-0ubuntu7.
    acl    2.2.52-1
    adduser    3.113+nmu3ubuntu
    ant    1.9.3-2build1
    ant-optional    1.9.3-2build1
    apache2    2.4.7-1ubuntu4.1
    apache2-bin    2.4.7-1ubuntu4.1
    apache2-data    2.4.7-1ubuntu4.1
    apache2-utils    2.4.7-1ubuntu4.1
    apparmor    2.10.95-0ubuntu2
    

    Root Cause

    To identify the root cause of the problem, I started with the underlying dpkg -l command, without any filtering or processing. I noticed two different results for this command, depending on whether its output was piped to less. Piped to less, it showed the complete result (with scrolling), as below.

    # dpkg -l | less
    Desired=Unknown/Install/Remove/Purge/Hold
    | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
    |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
    ||/ Name                                  Version                                    Architecture Description
    +++-=====================================-==========================================-============-===============================================================================
    rc  aacraid                               1.2.1-52011                                amd64        This driver supports Adaptec by PMC aacraid family of cards.
    ii  accountsservice                       0.6.35-0ubuntu7.3                          amd64        query and manipulate user account information
    ii  acl                                   2.2.52-1                                   amd64        Access control list utilities
    ii  adduser                               3.113+nmu3ubuntu3                          all          add and remove users and groups
    ii  ant                                   1.9.3-2build1                              all          Java based build tool like make
    ii  ant-optional                          1.9.3-2build1                              all          Java based build tool like make - optional libraries
    ii  apache2                               2.4.7-1ubuntu4.18                          amd64        Apache HTTP Server
    ii  apache2-bin                           2.4.7-1ubuntu4.18                          amd64        Apache HTTP Server (binary files and modules)
    ii  apache2-data                          2.4.7-1ubuntu4.18                          all          Apache HTTP Server (common files)
    ii  apache2-utils                         2.4.7-1ubuntu4.18                          amd64        Apache HTTP Server (utility programs for web servers)
    rc  apache2.2-common                      2.2.22-1ubuntu1.11                         amd64        Apache HTTP Server common files
    ii  apparmor                              2.10.95-0ubuntu2.6~14.04.1                 amd64        user-space parser utility for AppArmor
    

    Run directly, however, dpkg -l prints truncated data to the screen because of the column width constraint. The truncated values exactly match the Python console output. The output width is determined by the COLUMNS environment variable: commands such as dpkg fit their output columns to the width given by COLUMNS.

    # echo $COLUMNS
    127
    
    # dpkg -l
    Desired=Unknown/Install/Remove/Purge/Hold
    | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
    |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
    ||/ Name                   Version          Architecture     Description
    +++-======================-================-================-==================================================
    rc  aacraid                1.2.1-52011      amd64            This driver supports Adaptec by PMC aacraid family
    ii  accountsservice        0.6.35-0ubuntu7. amd64            query and manipulate user account information
    ii  acl                    2.2.52-1         amd64            Access control list utilities
    ii  adduser                3.113+nmu3ubuntu all              add and remove users and groups
    ii  ant                    1.9.3-2build1    all              Java based build tool like make
    ii  ant-optional           1.9.3-2build1    all              Java based build tool like make - optional librari
    ii  apache2                2.4.7-1ubuntu4.1 amd64            Apache HTTP Server
    ii  apache2-bin            2.4.7-1ubuntu4.1 amd64            Apache HTTP Server (binary files and modules)
    ii  apache2-data           2.4.7-1ubuntu4.1 all              Apache HTTP Server (common files)
    ii  apache2-utils          2.4.7-1ubuntu4.1 amd64            Apache HTTP Server (utility programs for web serve
    rc  apache2.2-common       2.2.22-1ubuntu1. amd64            Apache HTTP Server common files
    ii  apparmor               2.10.95-0ubuntu2 amd64            user-space parser utility for AppArmor
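
The truncated values are consistent with a plain fixed-width cut. In the listing above, the Version column is 16 characters wide (the run of = signs under the Version header), and slicing the full versions to that width reproduces what the Python console showed. This is only an illustration of the effect, not dpkg's actual code, and VERSION_WIDTH is an assumption read off the sample output.

```python
VERSION_WIDTH = 16  # assumption: taken from the header line in the sample output

full_versions = [
    "0.6.35-0ubuntu7.3",
    "3.113+nmu3ubuntu3",
    "2.4.7-1ubuntu4.18",
    "2.10.95-0ubuntu2.6~14.04.1",
]

for version in full_versions:
    # Slicing keeps at most VERSION_WIDTH characters, like a fixed-width column.
    print(version[:VERSION_WIDTH])
```

The four lines printed are the truncated versions of accountsservice, adduser, apache2, and apparmor from the Problem section.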
    

    Solution

    Python's subprocess module returns the complete, untruncated output of the shell command when the argument env={} is passed to the check_output function:

    >>> installed_packages = subprocess.check_output(['dpkg -l | grep ^ii | awk \'{print $2 "    " $3}\''], shell=True, env={})
    >>> print installed_packages
    accountsservice    0.6.35-0ubuntu7.3
    acl    2.2.52-1
    adduser    3.113+nmu3ubuntu3
    ant    1.9.3-2build1
    ant-optional    1.9.3-2build1
    apache2    2.4.7-1ubuntu4.18
    apache2-bin    2.4.7-1ubuntu4.18
    apache2-data    2.4.7-1ubuntu4.18
    apache2-utils    2.4.7-1ubuntu4.18
    apparmor    2.10.95-0ubuntu2.6~14.04.1
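
A variation on this fix, sketched under assumptions: env={} removes every variable, including PATH and locale settings, so a script that needs those could instead copy the current environment and drop only COLUMNS. The helper name check_output_without_columns is hypothetical, and the demonstration command just echoes the variable so that it also runs on systems without dpkg.

```python
import os
import subprocess


def check_output_without_columns(command):
    """Run a shell command with the parent's environment minus COLUMNS."""
    env = dict(os.environ)      # copy, so the parent's environment is untouched
    env.pop("COLUMNS", None)    # no error if COLUMNS was not set
    return subprocess.check_output(command, shell=True, env=env)


# Demonstration: pretend the parent process has COLUMNS set.
os.environ["COLUMNS"] = "127"
print(check_output_without_columns('echo "COLUMNS=${COLUMNS:-unset}"'))
```

The child shell should report COLUMNS as unset, while the rest of the parent's environment is still passed along.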
    

    Explanation

    Curious to know what is happening behind the scenes? To start the child process, the check_output function ends up calling one of the C library functions execv or execve. It chooses between them based on the env argument.

    Reference:

    When no env argument is passed to subprocess.check_output, the os.execv function is called.

    When an env argument is passed to subprocess.check_output, the os.execve function is called.

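    for (i = 0; exec_array[i] != NULL; ++i) {
        const char *executable = exec_array[i];
        if (envp) {
            /* env was given: the child sees only the variables in envp */
            execve(executable, argv, envp);
        } else {
            /* no env: the child inherits the parent's environment */
            execv(executable, argv);
        }
        /* ... */
    }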
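
The same branch can be sketched in Python with the os module, which exposes fork, execv, and execve directly. This is a simplified illustration for POSIX systems, not the subprocess implementation; spawn_and_capture is a hypothetical helper, and the child command only reports whether it sees COLUMNS.

```python
import os


def spawn_and_capture(argv, env=None):
    """Fork, exec argv[0] in the child, and return the child's stdout as bytes."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child process
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)  # send the child's stdout into the pipe
            if env is None:
                os.execv(argv[0], argv)        # inherit the parent's environment
            else:
                os.execve(argv[0], argv, env)  # use only the given environment
        finally:
            os._exit(127)  # reached only if exec failed
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:  # empty read means the child closed its end
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


os.environ["COLUMNS"] = "127"
report = ["/bin/sh", "-c", 'echo "${COLUMNS:-unset}"']
print(spawn_and_capture(report))          # execv path
print(spawn_and_capture(report, env={}))  # execve path with an empty environment
```

The first call goes through execv and the child reports 127; the second goes through execve with an empty environment and the child reports unset.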
    

    What makes the execv and execve functions produce different output?

    The execv function takes no environment argument, so the child process inherits the caller's environment, including the COLUMNS variable. That leads dpkg to truncate its output columns to fit a width of 127, as on our reference system.

    # echo $COLUMNS
    127
    
    >>> print subprocess.check_output(['dpkg -l | grep libqtcore4'], shell=True)
    ii  libqtcore4:amd64          4:4.8.5+git192-g0 amd64             Qt 4 core module
    
    >>> print subprocess.check_output(['dpkg -l | grep libqtcore4'], shell=True, env={'COLUMNS':'127'})
    ii  libqtcore4:amd64          4:4.8.5+git192-g0 amd64             Qt 4 core module
    

    The execve function takes an additional argument, an array of environment variables (see environ), and the child process sees only those variables. With env={} that environment is empty, so COLUMNS is not initialised and the output values are returned without any column width restriction. Setting COLUMNS to an empty string has the same effect, as the second example shows.

    >>> print subprocess.check_output(['dpkg -l | grep libqtcore4'], shell=True, env={})
    ii  libqtcore4:amd64                      4:4.8.5+git192-g085f851+dfsg-2ubuntu4.1    amd64        Qt 4 core module
    
    >>> print subprocess.check_output(['dpkg -l | grep libqtcore4'], shell=True, env={'COLUMNS':''})
    ii  libqtcore4:amd64                      4:4.8.5+git192-g085f851+dfsg-2ubuntu4.1    amd64        Qt 4 core module
    

    For more details, refer to the man pages of execv, execve, and environ.

    Conclusion

    It is a good idea to pass the env={} argument to subprocess.check_output whenever you process shell command output in Python and the command does not depend on other environment variables, since env={} clears all of them. It helps avoid unstable results down the line due to truncated values.
