Deleting specific characters from a string

  • Behrang Dadsetan

    Deleting specific characters from a string

    Hi all,

    I would like to delete specific characters from a string.
    As an example, I would like to delete all of the '@' and '&' characters in the string
    'You are ben@orange?enter&your&code' so that it becomes
    'You are benorange?enteryourcode'.

    So far I have been doing it like this:
    str = 'You are ben@orange?enter&your&code'
    str = ''.join([ c for c in str if c not in ('@', '&')])

    but that looks so ugly. I am hoping to see nicer examples to achieve
    the above.

    Thanks.
    Ben.

  • Matt Shomphe

    #2
    Re: Deleting specific characters from a string

    Maybe a new method should be added to the str class, called "remove".
    It would take a list of characters and remove them from the string:


    class RemoveString(str):
        def __init__(self, s=None):
            str.__init__(self, s)
        def remove(self, chars):
            s = self
            for c in chars:
                s = s.replace(c, '')
            return s

    if __name__ == '__main__':
        r = RemoveString('abc')
        e = r.remove('c')
        print r, e
        # prints "abc ab" -- it's not "in place" removal

    M@




    • Donn Cave

      #3
      Re: Deleting specific characters from a string

      In article <5ab0af73.0307091044.254f5aab@posting.google.com>,
      MatthewS@HeyAnita.com (Matt Shomphe) wrote:

      > Maybe a new method should be added to the str class, called "remove".
      > It would take a list of characters and remove them from the string:

      Check out the translate function - that's what its optional
      deletions argument is for.

      Donn Cave
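Side note for readers on Python 3, where `string.maketrans` no longer exists: the "deletions" idea can be sketched with the three-argument form of `str.maketrans`. The helper name `delete_chars` is made up for illustration and is not something posted in this thread.

```python
def delete_chars(text, unwanted):
    """Return text with every character found in `unwanted` removed."""
    # The third argument of str.maketrans lists the characters to delete.
    table = str.maketrans('', '', unwanted)
    return text.translate(table)


print(delete_chars('You are ben@orange?enter&your&code', '@&'))
```

The table could be built once and reused if the same set of characters is removed from many strings.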

      • John Hunter

        #4
        Re: Deleting specific characters from a string

        >>>>> "Matt" == Matt Shomphe <MatthewS@HeyAnita.com> writes:

        Matt> Maybe a new method should be added to the str class, called
        Matt> "remove". It would take a list of characters and remove
        Matt> them from the string:

        you can use string translate for this, which is shorter and faster
        than using the loop.

        class rstr(str):
            _allchars = "".join([chr(x) for x in range(256)])
            def remove(self, chars):
                return self.translate(self._allchars, chars)

        me = rstr('John Hunter')
        print me.remove('ohn')

        Also, you don't need to define a separate __init__, since you are not
        overloading the str default.

        JDH


        • Behrang Dadsetan

          #5
          Re: Deleting specific characters from a string

          Donn Cave wrote:
          > In article <5ab0af73.0307091044.254f5aab@posting.google.com>,
          > MatthewS@HeyAnita.com (Matt Shomphe) wrote:
          >>Maybe a new method should be added to the str class, called "remove".
          >>It would take a list of characters and remove them from the string:
          > Check out the translate function - that's what its optional
          > deletions argument is for.

          >>> import string
          >>> str = 'You are Ben@orange?enter&your&code'
          >>> str.translate(string.maketrans('',''), '@&')
          and
          >>> str.replace('&', '').replace('@', '')
          are also ugly...

          The first version is completely unreadable. I guess my initial example
          ''.join([ c for c in str if c not in ('@', '&')]) was easier to read
          than the translate version (who would guess, without having to peek in the
          documentation of translate, that that line deletes @ and &?!) but I am
          not sure ;)

          The second, on the other hand, is acceptable. The examples you gave me use the
          string module.
          I think I read somewhere that the methods of the string object should be
          preferred over the string module. Is that right?

          Thanks anyhow, I will go for the replace(something, '') method.
          Ben.


          • Walter Dörwald

            #6
            Re: Deleting specific characters from a string

            What about the following:

            str = 'You are ben@orange?enter&your&code'
            str = filter(lambda c: c not in "@&", str)

            Bye,
            Walter Dörwald
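A note for anyone trying this on Python 3: there `filter()` returns a lazy iterator rather than a string, so the pieces have to be joined back together. The sketch below assumes Python 3 and uses the variable name `text` instead of rebinding `str`; a generator expression is shown next to it as an equivalent that avoids the lambda.

```python
text = 'You are ben@orange?enter&your&code'

# filter() yields the kept characters one at a time; join() rebuilds the string.
via_filter = ''.join(filter(lambda c: c not in '@&', text))

# The same selection written as a generator expression.
via_genexp = ''.join(c for c in text if c not in '@&')

print(via_filter)
print(via_genexp)
```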



            • Behrang Dadsetan

              #7
              Re: Deleting specific characters from a string

              def isAcceptableChar(character):
                  return charachter in "@&"

              str = filter(isAcceptableChar, str)

              is going to finally be what I am going to use.
              I not feel lambdas are so readable, unless one has serious experience in
              using them and python in general. I feel it is acceptable to add a named
              method that documents with its name what it is doing there.

              But your example would probably have been my choice if I was more
              familiar with that type of use and the potential readers of my code were
              also familiar with it. Many thanks!

              Ben.


              • Walter Dörwald

                #8
                Re: Deleting specific characters from a string

                Behrang Dadsetan wrote:

                > I not feel lambdas are so readable, unless one has serious experience in
                > using them and python in general. I feel it is acceptable to add a named
                > method that documents with its name what it is doing there.

                You're not the only one with this feeling. Compare "the eff-bot's
                favourite lambda refactoring rule":


                > [...]

                Bye,
                Walter Dörwald


                • John Hunter

                  #9
                  Re: Deleting specific characters from a string

                  >>>>> "Behrang" == Behrang Dadsetan writes:

                  Behrang> is going to finally be what I am going to use. I not
                  Behrang> feel lambdas are so readable, unless one has serious
                  Behrang> experience in using them and python in general. I feel it
                  Behrang> is acceptable to add a named method that documents with
                  Behrang> its name what it is doing there.

                  If you want to go the functional programming route, you can generalize
                  your function somewhat using a callable class:

                  class remove_char:
                      def __init__(self, remove):
                          self.remove = dict([(c, 1) for c in remove])

                      def __call__(self, c):
                          return not self.remove.has_key(c)

                  print filter(remove_char('on'), 'John Hunter')

                  Cheers,
                  Jh Huter
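For comparison, the same "generalized predicate" idea might look like this on Python 3, where `dict.has_key` is gone. This is only a sketch: `make_keeper` is a hypothetical name, and a closure over a `frozenset` stands in for the callable class.

```python
def make_keeper(unwanted):
    """Build a predicate that is true for characters NOT in `unwanted`."""
    unwanted = frozenset(unwanted)  # set membership replaces dict.has_key

    def keep(ch):
        return ch not in unwanted

    return keep


# filter() is lazy on Python 3, so join the kept characters explicitly.
print(''.join(filter(make_keeper('on'), 'John Hunter')))
```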


                  • Bengt Richter

                    #10
                    Re: Deleting specific characters from a string

                    On Wed, 09 Jul 2003 23:36:03 +0200, Behrang Dadsetan wrote:

                    >Walter Dörwald wrote:
                    >
                    >> What about the following:
                    >>
                    >> str = 'You are ben@orange?enter&your&code'
                    >> str = filter(lambda c: c not in "@&", str)
                    Aaack! I cringe seeing builtin str name rebound like that ;-/

                    >def isAcceptableChar(character):
                    > return charachter in "@&"
                        return character not in "@&"
                    >
                    >str = filter(isAcceptableChar, str)
                    >
                    >is going to finally be what I am going to use.
                    That's not going to be anywhere near as fast as Donn's translate version.

                    >I not feel lambdas are so readable, unless one has serious experience in
                    >using them and python in general. I feel it is acceptable to add a named
                    >method that documents with its name what it is doing there.
                    >
                    >But your example would probably have been my choice if I was more
                    >familiar with that type of use and the potential readers of my code were
                    >also familiar with it. Many thanks!
                    >
                    IMO, if you are going to define a function like isAcceptableChar, only to use it
                    with filter, why not write a function to do the whole job, and whose invocation
                    reads well, while hiding Donn's fast translate version? E.g., substituting the literal
                    value of string.maketrans('',''):

                    ====< removechars.py >===========================================================
                    def removeChars(s, remove=''):
                        return s.translate(
                            '\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f'
                            '\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f'
                            ' !"#$%&\'()*+,-./'
                            '0123456789:;<=>?'
                            '@ABCDEFGHIJKLMNO'
                            'PQRSTUVWXYZ[\\]^_'
                            '`abcdefghijklmno'
                            'pqrstuvwxyz{|}~\x7f'
                            '\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f'
                            '\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f'
                            '\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf'
                            '\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf'
                            '\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf'
                            '\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf'
                            '\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef'
                            '\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
                            , remove)

                    if __name__ == '__main__':
                        import sys
                        args = sys.argv[1:]
                        fin = sys.stdin; fout = sys.stdout; remove = '' # defaults
                        while args:
                            arg = args.pop(0)
                            if arg == '-fi': fin = file(args.pop(0))
                            elif arg == '-fo': fout = file(args.pop(0), 'w') # output file must be opened for writing
                            else: remove = arg
                        for line in fin:
                            fout.write(removeChars(line, remove))
                    =================================================================================
                    Not tested beyond what you see here ;-)

                    [16:40] C:\pywk\ut>echo "'You are ben@orange?enter&your&code'" |python removechars.py "@&"
                    "'You are benorange?enteryourcode'"

                    [16:41] C:\pywk\ut>echo "'You are ben@orange?enter&your&code'" |python removechars.py aeiou
                    "'Y r bn@rng?ntr&yr&cd'"

                    Copying a snip above to the clipboard and filtering that with no removes and then (lower case) vowels:

                    [16:41] C:\pywk\ut>getclip |python removechars.py
                    >I not feel lambdas are so readable, unless one has serious experience in
                    >using them and python in general. I feel it is acceptable to add a named
                    >method that documents with its name what it is doing there.
                    >
                    >But your example would probably have been my choice if I was more
                    >familiar with that type of use and the potential readers of my code were
                    >also familiar with it. Many thanks!

                    [16:42] C:\pywk\ut>getclip |python removechars.py aeiou
                    >I nt fl lmbds r s rdbl, nlss n hs srs xprnc n
                    >sng thm nd pythn n gnrl. I fl t s ccptbl t dd nmd
                    >mthd tht dcmnts wth ts nm wht t s dng thr.
                    >
                    >Bt yr xmpl wld prbbly hv bn my chc f I ws mr
                    >fmlr wth tht typ f s nd th ptntl rdrs f my cd wr
                    >ls fmlr wth t. Mny thnks!


                    Regards,
                    Bengt Richter


                    • Jeff Hinrichs

                      #11
                      Re: Deleting specific characters from a string

                      "John Hunter" wrote:
                      > If you want to go the functional programming route, you can generalize
                      > your function somewhat using a callable class:
                      >
                      > class remove_char:
                      >     def __init__(self, remove):
                      >         self.remove = dict([(c, 1) for c in remove])
                      >
                      >     def __call__(self, c):
                      >         return not self.remove.has_key(c)
                      >
                      > print filter(remove_char('on'), 'John Hunter')

                      I've been following this thread, and on a whim I built a test harness to
                      time the different ideas that have been put forth in this thread. I will
                      post complete results tomorrow on the web but the short version is that
                      using the .replace method is the overall champ by quite a bit. Below is the
                      function I tested against the others in the harness:

                      def stringReplace(s, c):
                          """Remove any occurrences of characters in c, from string s
                          s - string to be filtered, c - characters to filter"""
                          for a in c:
                              s = s.replace(a, '')
                          return s

                      It wins also by being easy to understand, no filter or lambda. Not that I
                      have anything against filter or lambda, but when the speediest method is the
                      most readable, that solution is definitely the Pythonic champ. :)

                      -Jeff Hinrichs
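The harness itself was not posted in the thread, so here is only a rough sketch of how such a comparison could be set up on Python 3 with the standard `timeit` module. The function names, the sample input and the repeat count are all assumptions chosen for illustration, and the numbers it prints will depend on the interpreter, the input size and how many characters are removed.

```python
import re
import timeit

TEXT = 'You are ben@orange?enter&your&code' * 100  # arbitrary sample input
UNWANTED = '@&'


def with_replace(s, chars):
    for ch in chars:
        s = s.replace(ch, '')
    return s


def with_translate(s, chars):
    return s.translate(str.maketrans('', '', chars))


def with_join(s, chars):
    return ''.join(c for c in s if c not in chars)


def with_regex(s, chars):
    return re.sub('[' + re.escape(chars) + ']', '', s)


CANDIDATES = [with_replace, with_translate, with_join, with_regex]

# Every candidate must produce the same result before timings mean anything.
expected = with_replace(TEXT, UNWANTED)
assert all(f(TEXT, UNWANTED) == expected for f in CANDIDATES)

for func in CANDIDATES:
    seconds = timeit.timeit(lambda: func(TEXT, UNWANTED), number=200)
    print('%-15s %.4f s' % (func.__name__, seconds))
```

Checking that all candidates agree first guards against timing a variant that is fast only because it is wrong.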




                      • Behrang Dadsetan

                        #12
                        Re: Deleting specific characters from a string

                        Asun Friere wrote:
                        > Behrang Dadsetan wrote in message news:<3F0C8AC3.5010304@dadsetan.com>...
                        >
                        >>def isAcceptableChar(character):
                        >> return charachter in "@&"
                        >>
                        >>str = filter(isAcceptableChar, str)
                        >>
                        >>is going to finally be what I am going to use.
                        >
                        > Might 'return character not in "@&"' work better?

                        ahem... yes of course...


                        • Behrang Dadsetan

                          #13
                          Re: Deleting specific characters from a string

                          Jeff Hinrichs wrote:
                          > def stringReplace(s, c):
                          >     """Remove any occurrences of characters in c, from string s
                          >     s - string to be filtered, c - characters to filter"""
                          >     for a in c:
                          >         s = s.replace(a, '')
                          >     return s
                          >
                          > It wins also by being easy to understand, no filter or lambda. Not that I
                          > have anything against filter or lambda, but when the speediest method is the
                          > most readable, that solution is definitely the Pythonic champ. :)

                          Well, I really had nothing against the filter, but this solution also
                          looks acceptable.

                          Thanks.
                          Ben.


                          • Giles Brown

                            #14
                            Re: Deleting specific characters from a string

                            Behrang Dadsetan wrote:
                            > >>> str = 'You are Ben@orange?enter&your&code'
                            > >>> str.translate(string.maketrans('',''), '@&')
                            > and
                            > >>> str.replace('&', '').replace('@', '')
                            > are also ugly...

                            Well beauty is in the eye of the beholder they say.
                            Has anybody mentioned the re module yet?

                            >>> s = 'You are ben@orange?enter&your&code'
                            >>> import re
                            >>> re.sub('[@&]', '', s)
                            'You are benorange?enteryourcode'

                            Giles


                            • Paul Rudin

                              #15
                              Re: Deleting specific characters from a string

                              >>>>> "Jeff" == Jeff Hinrichs writes:

                              > I've been following this thread, and on a whim I built a test
                              > harness to time the different ideas that have been put forth in
                              > this thread. I will post complete results tomorrow on the web
                              > but the short version is that using the .replace method is the
                              > overall champ by quite a bit. Below is the function I tested
                              > against the others in the harness:

                              > def stringReplace(s, c):
                              >     """Remove any occurrences of characters in c, from string s
                              >     s - string to be filtered, c - characters to filter"""
                              >     for a in c:
                              >         s = s.replace(a, '')
                              >     return s

                              > It wins also by being easy to understand, no filter or lambda.
                              > Not that I have anything against filter or lambda, but when the
                              > speediest method is the most readable, that solution is
                              > definitely the Pythonic champ. :)

                              I haven't been following this thread closely, but isn't a regexp the
                              obvious way to do this? I'd expect it to be faster than your solution,
                              particularly on large input (although I haven't actually
                              tried). Arguably it's more pythonic too :-)

                              re.compile(r).sub('',s)

                              where r is the obvious disjunctive regexp mentioning each of the
                              characters you want to remove. If you want to construct such a regexp
                              from a list of characters:

                              r = reduce(lambda x,y: x+'|'+y, c,'')[1:]

                              So putting it all together as an alternative version of your function:


                              !!warning - untested code!!

                              import re

                              def stringReplace(s, c):
                                  r = reduce(lambda x,y: x+'|'+y, c,'')[1:]
                                  return re.compile(r).sub('',s)
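One caveat about the untested snippet: joining raw characters with '|' would likely misbehave if a character to remove is itself special in a regexp (for example '?' or '|'). A hedged Python 3 sketch of the same idea, using a character class and `re.escape`, might look like this; the name `string_replace` and the empty-input guard are additions for illustration, not part of the original post.

```python
import re


def string_replace(s, chars):
    """Remove every character listed in `chars` from `s` using one regexp."""
    if not chars:
        return s  # an empty character class '[]' would not compile
    # re.escape guards against characters such as '|', '?' or ']' that
    # would otherwise change the meaning of the pattern.
    pattern = re.compile('[' + re.escape(chars) + ']')
    return pattern.sub('', s)


print(string_replace('You are ben@orange?enter&your&code', '@&'))
```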
