Skimping on memory in Python classes

  • the_rev_dharma_roadkill

    Skimping on memory in Python classes

    Hi,
    I'm a Python newbie, but I have some experience in other languages.

    I need to create about 100,000 instances of one class. Each instance
    has two lists, one usually empty, the other containing exactly 200
    elements which differ widely between the 100,000 instances, but about
    half of the elements in these lists will be empty strings:

    instance x (one of 100,000) contains:
    list A, 200 elements but half are emptyString
    list B, usually empty (None, not []), but can contain a few small
    elements
    variable X, a fairly short string.
    I also set __init__, __cmp__, and an attribute access function.

    Question: How can I reduce the memory used to a minimum?

    I have already set __slots__ = ('A', 'B', 'X')
    and this shaved about 10% off of the used memory, which is well worth
    it.
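For readers following along, here is a minimal sketch of what the `__slots__` declaration changes. The class names `Plain` and `Slotted` are hypothetical, and the code assumes a current Python 3 interpreter rather than the version used in the original post. It only shows that slotted instances carry no per-instance `__dict__`; the actual saving depends on the interpreter version.

```python
class Plain:
    """Ordinary class: every instance carries its own __dict__."""

    def __init__(self, a, b, x):
        self.A = a
        self.B = b
        self.X = x


class Slotted:
    """Same attributes, stored in fixed slots instead of a __dict__."""

    __slots__ = ('A', 'B', 'X')

    def __init__(self, a, b, x):
        self.A = a
        self.B = b
        self.X = x


plain = Plain([''] * 200, None, 'key')
slotted = Slotted([''] * 200, None, 'key')

# A slotted instance has no __dict__, so it cannot grow new attributes.
plain_has_dict = hasattr(plain, '__dict__')
slotted_has_dict = hasattr(slotted, '__dict__')
```

The trade-off is that assigning an attribute not listed in `__slots__` raises `AttributeError`.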

    Any other proven techniques out there? Is there much point in
    creating a new metaclass for my class? How about replacing
    emptyStrings with Nones? Is there a fast (runtime) way of translating
    between '' and None?

    Cheers,
    Doug
  • Raymond Hettinger

    #2
    Re: Skimping on memory in Python classes


    "the_rev_dharma_roadkill" <doug.hendricks@tnzi.com> wrote:
    > Hi,
    > I'm a Python newbie, but I have some experience in other languages.
    >
    > I need to create about 100,000 instances of one class. Each instance
    > has two lists, one usually empty, the other containing exactly 200
    > elements which differ widely between the 100,000 instances, but about
    > half of the elements in these lists will be empty strings:
    >
    > instance x (one of 100,000) contains:
    > list A, 200 elements but half are emptyString
    > list B, usually empty (None, not []), but can contain a few small
    > elements
    > variable X, a fairly short string.
    > I also set __init__, __cmp__, and an attribute access function.
    >
    > Question: How can I reduce the memory used to a minimum?
    >
    > I have already set __slots__ = ('A', 'B', 'X')
    > and this shaved about 10% off of the used memory, which is well worth
    > it.
    >
    > Any other proven techniques out there? Is there much point in
    > creating a new metaclass for my class? How about replacing
    > emptyStrings with Nones? Is there a fast (runtime) way of translating
    > between '' and None?

    If the list of 200 elements doesn't change, it may be better to use a
    tuple instead of a list.
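A rough way to see the size of the tuple saving, assuming CPython and using `sys.getsizeof`, which reports only the container itself and not the strings it refers to. The variable names are illustrative, and the exact byte counts vary by version and platform.

```python
import sys

# 200 references to the empty string, stored two different ways.
items = [''] * 200
as_list = list(items)
as_tuple = tuple(items)

# Shallow sizes: the container header plus one pointer per element.
list_bytes = sys.getsizeof(as_list)
tuple_bytes = sys.getsizeof(as_tuple)
saving = list_bytes - tuple_bytes
```

The difference is a small fixed amount per container, not a per-element saving, so it adds up slowly even over 100,000 instances.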

    If the list contents are all of the same type, the array module provides
    a space efficient storage solution.
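As an illustration of the `array` suggestion, the sketch below packs 200 integers into an `array('i')` and compares it with a list of the same values. The integer data is a made-up example (the thread's real data is strings), and the sizes come from `sys.getsizeof` on CPython.

```python
import sys
from array import array

# 200 hypothetical integer fields, all of the same type.
values = list(range(1000, 1200))
packed = array('i', values)

# A list stores pointers to separate int objects; count both parts.
list_total = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# An array stores the raw machine values inline in one buffer.
array_total = sys.getsizeof(packed)
```

This only helps when every element fits one fixed-size type code, which is why it suits numbers better than variable-length strings.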

    Empty strings are like None in that they all refer to a single object,
    so there are no savings there.
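A small sketch of both points, assuming CPython, where sharing a single empty-string object is an implementation detail rather than a language guarantee. The helpers `to_none` and `to_str` are hypothetical names for one cheap way to translate between `''` and `None` using `or`.

```python
# Several different ways of producing an empty string.
empties = ['', str(), 'abc'[:0], ''.join([])]

# Number of distinct objects behind those four references.
# If they are all shared, this is 1.
distinct_empties = len({id(s) for s in empties})


def to_none(value):
    """Map '' (or any other falsy value) to None; keep other values."""
    return value or None


def to_str(value):
    """Map None (or any other falsy value) to ''; keep other values."""
    return value or ''


fields = ['abc', '', 'de']
as_nones = [to_none(f) for f in fields]
round_trip = [to_str(f) for f in as_nones]
```

Because `or` tests truthiness, these helpers would also rewrite other falsy values such as `0`, so they are only appropriate when the fields are known to be strings or `None`.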


    Raymond Hettinger



    • the_rev_dharma_roadkill

      #3
      Re: Skimping on memory in Python classes

      "Raymond Hettinger" <vze4rx4y@verizon.net> wrote in message news:<hVkYa.15407$W%3.182@nwrdny01.gnilink.net>...

      [snip]

      > >
      > > Any other proven techniques out there? Is there much point in
      > > creating a new metaclass for my class? How about replacing
      > > emptyStrings with Nones? Is there a fast (runtime) way of translating
      > > between '' and None?
      >
      > If the list of 200 elements doesn't change, it may be better to use a
      > tuple instead of a list.

      That sounds good, but it doesn't seem to make much difference.

      >
      > If the list contents are all of the same type, the array module provides
      > a space efficient storage solution.

      Most of the elements are either empty strings or strings of len() from
      1 to about 50. As I read/experiment, array.array is good for
      fixed-size items such as numbers or single characters (len() == 1).
      Very efficient, yes, but not flexible enough.

      Maybe my own module, written in C, is the answer. Or not. My tests
      seem to indicate that most of the problem is the per-object memory,
      not the list-size-related memory. If I cut the size of the list from
      200 elements down to 35 elements, my memory use is only cut in half.
      That's nothing to sneeze at, but it's obvious that other terms are
      important.
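One way to check where the bytes go is to split a record's footprint into a fixed per-instance part, the list container, and the string elements. The sketch below is an assumption-laden illustration: `Record`, `breakdown` and `sample_fields` are hypothetical names, the sample data is invented, and `sys.getsizeof` figures on a current CPython will not reproduce the numbers measured in the post.

```python
import sys


class Record:
    """Hypothetical stand-in for the class discussed in the thread."""

    __slots__ = ('A', 'B', 'X')

    def __init__(self, a, b, x):
        self.A = a
        self.B = b
        self.X = x


def breakdown(rec):
    """Return (fixed, container, elements) byte counts for one record."""
    # Fixed cost: the instance itself plus its short string X.
    fixed = sys.getsizeof(rec) + sys.getsizeof(rec.X)
    # The list object holding the element pointers.
    container = sys.getsizeof(rec.A)
    # Count each distinct string object once; a shared '' is not repeated.
    distinct = {id(s): s for s in rec.A}
    elements = sum(sys.getsizeof(s) for s in distinct.values())
    return fixed, container, elements


def sample_fields(n):
    """Build n fields: about half empty, the rest distinct 25-char strings."""
    filled = [('field%03d' % i).ljust(25, 'x') for i in range(n // 2)]
    return [''] * (n - n // 2) + filled


big = breakdown(Record(sample_fields(200), None, 'key'))
small = breakdown(Record(sample_fields(35), None, 'key'))
```

Comparing `big` and `small` shows which of the three parts shrinks with the list length and which stays constant per instance.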

      Maybe not storing 100,000 instances as objects is the answer. It
      should be possible to represent each object as a single (large) tuple
      or list or list of lists. Code readability will suffer. I'll
      experiment at a later date.
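A sketch of the tuple-per-record idea, using `collections.namedtuple` from the current standard library (it was not available when this thread was written) to keep field access readable. The type name `Record` and the sample values are illustrative assumptions.

```python
from collections import namedtuple

# Each record is a plain tuple underneath, with named access to its fields.
Record = namedtuple('Record', ['A', 'B', 'X'])

records = [
    Record(A=tuple(['', 'abc'] * 100), B=None, X='key%d' % i)
    for i in range(3)
]

first = records[0]
# Fields can be read by name or by position.
by_name = first.X
by_index = first[2]
```

Records built this way are immutable, so changing a field means creating a new tuple, for example with `first._replace(X='other')`.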

      >
      > Empty strings are like None in that they all refer to a single object,
      > so there are no savings there.

      That is good to know. Thanks.

      > Raymond Hettinger

      Doug
