timeout on os.popen3?

  • selwyn

    timeout on os.popen3?

    hi all,

    I would like some advice on how I can include a timeout for a scanning
    operation using unzip on linux and os.popen3.

    I am scanning through about 30g of rescued zip files, looking for xml
    extensions within those files. What I have put together 'works', but
    only for what appears to be properly reconstructed files.
    Unfortunately, some aren't AND for some reason no standard error
    messages are being triggered. This causes my script to hang indefinitely.

    What I would like is for the script to move on to the next file, after
    say a 10sec period of inactivity, but am unsure how this could be
    included. I have googled around and think the select module may be
    helpful, but after reading the docs I am still confused :-(

    Here is what I have so far:

    #!/usr/bin/python
    import os, sys, time, string

    filesscanned = 0
    possibles = 0
    nonzips = 0
    files = []
    a = os.listdir(sys.argv[1])

    for i in a:
        print i
        stdin, stdout, stderr = os.popen3('unzip -l %s%s' % (sys.argv[1], i))
        if stderr.read() == '':
            zippedfiles = string.lower(stdout.read())
            if zippedfiles.find('xml') != -1:
                os.system("""cp '%s%s' candidates""" % (sys.argv[1], i))
                possibles += 1
                print 'found a candidate:- %s%s' % (sys.argv[1], i)
                files.append(i)
        else:
            os.system("""cp '%s%s' nonzips""" % (sys.argv[1], i))
            nonzips += 1
            print 'found nonzip or broken file:- %s' % i

        filesscanned += 1

    Any help gratefully received.
    cheers,
    Selwyn.

  • Donn Cave

    #2
    Re: timeout on os.popen3?

    Quoth selwyn:
    | I would like some advice on how I can include a timeout for a scanning
    | operation using unzip on linux and os.popen3.
    |
    | I am scanning through about 30g of rescued zip files, looking for xml
    | extensions within those files. What I have put together 'works', but
    | only for what appears to be properly reconstructed files.
    | Unfortunately, some aren't AND for some reason no standard error
    | messages are being triggered. This causes my script to hang indefinitely.
    |
    | What I would like is for the script to move on to the next file, after
    | say a 10sec period of inactivity, but am unsure how this could be
    | included. I have googled around and think the select module may be
    | helpful, but after reading the docs I am still confused :-(

    You might be able to manage it with select. When you start up a program
    on two or more pipes, you have kind of a juggling act. Select is the
    juggler: it can tell which pipe has data to read and which is ready for
    more data to be written to it. However, it's still a juggling act and
    you need some skill, too. If you decide to try it, also read about
    os.read, and don't try to use the file object for reading, because its
    buffering can hold data that select doesn't know about.
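For readers who want to see the juggling act spelled out, here is a minimal sketch of a select loop with an inactivity timeout. It is only one way to do it and makes several assumptions not in the thread: it targets current Python 3 and uses subprocess.Popen in place of os.popen3, it runs on a Unix-like system (select on pipes does not work on Windows), and the function name run_with_idle_timeout and its idle_timeout parameter are made up for illustration.

```python
import os
import select
import subprocess


def run_with_idle_timeout(cmd, idle_timeout=10.0):
    """Run cmd and give up if neither pipe yields data for idle_timeout seconds.

    Hypothetical helper for illustration.
    Returns (timed_out, stdout_bytes, stderr_bytes).
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.DEVNULL,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_fd = proc.stdout.fileno()
    err_fd = proc.stderr.fileno()
    chunks = {out_fd: [], err_fd: []}
    open_fds = [out_fd, err_fd]
    timed_out = False

    while open_fds:
        # Wait until at least one pipe is readable, or the timeout expires.
        ready, _, _ = select.select(open_fds, [], [], idle_timeout)
        if not ready:
            # Nothing arrived on either pipe in time: stop the child.
            timed_out = True
            proc.kill()
            break
        for fd in ready:
            # os.read on the raw descriptor, not the buffered file object.
            data = os.read(fd, 4096)
            if data:
                chunks[fd].append(data)
            else:
                # An empty read means the child closed this pipe.
                open_fds.remove(fd)

    proc.stdout.close()
    proc.stderr.close()
    proc.wait()
    return timed_out, b"".join(chunks[out_fd]), b"".join(chunks[err_fd])
```

Under those assumptions, something like run_with_idle_timeout(['unzip', '-l', path], 10.0) would let the scanning loop skip to the next file whenever timed_out comes back true. Passing the command as a list also avoids the shell quoting problems that file names with spaces cause.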

    On the other hand, if you don't mind writing to disk files instead, that
    will completely eliminate this aspect of your problem. Like

    file = '%s%s' % (sys.argv[1], i)
    ev = os.system('unzip -l "%s" > /tmp/zout 2> /tmp/zerr' % (file,))
    if os.WEXITSTATUS(ev) != 0 or nonEmptyFile('/tmp/zerr'):
        dealWithError(ev, open('/tmp/zerr', 'r'))
    elif searchFile('/tmp/zout', 'xml'):
        dealWithFile(file)
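The helpers nonEmptyFile and searchFile are named in the snippet but never defined in the thread. A rough sketch of what they might look like follows, assuming current Python 3; the behavior shown (a size check, and a case-insensitive substring search) is a guess at the intent, not something the reply spells out.

```python
import os


def nonEmptyFile(path):
    """Return True if path exists and contains at least one byte."""
    try:
        return os.path.getsize(path) > 0
    except OSError:
        # A missing or unreadable file is treated as empty here (an assumption).
        return False


def searchFile(path, needle):
    """Return True if needle occurs in the file, ignoring case."""
    needle = needle.lower()
    # errors="replace" keeps odd bytes in the unzip listing from raising.
    with open(path, "r", errors="replace") as f:
        for line in f:
            if needle in line.lower():
                return True
    return False
```

The case-insensitive match mirrors the string.lower call in the original script. dealWithError and dealWithFile are left out because what they should do (copy to nonzips or candidates, presumably) depends on the caller.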

    Donn Cave


    • selwyn

      #3
      Re: timeout on os.popen3?

      > On the other hand, if you don't mind writing to disk files instead, that
      > will completely eliminate this aspect of your problem.

      Thanks, that worked perfectly - a classic case of not seeing the forest
      for the trees!
