Closed
Description
The crash (Fatal Python error: GC object already tracked) appears to happen on the compute() call.
[py-dev] c:\Data\test
λ python test_dask.py
Platform: Windows-2008ServerR2-6.1.7601-SP1
Python: 3.5.1 |Anaconda 2.4.1 (64-bit)| (default, Jan 29 2016, 15:01:46) [MSC v.1900 64 bit (AMD64)]
Fatal Python error: GC object already tracked
Thread 0x000010d4 (most recent call first):
  File "C:\Python\envs\py-dev\lib\linecache.py", line 95 in updatecache
  File "C:\Python\envs\py-dev\lib\linecache.py", line 47 in getlines
  File "C:\Python\envs\py-dev\lib\linecache.py", line 16 in getline
  File "C:\Python\envs\py-dev\lib\traceback.py", line 282 in line
  File "C:\Python\envs\py-dev\lib\traceback.py", line 358 in extract
  File "C:\Python\envs\py-dev\lib\traceback.py", line 474 in __init__
  File "C:\Python\envs\py-dev\lib\traceback.py", line 117 in format_exception
  File "C:\Python\envs\py-dev\lib\traceback.py", line 163 in format_exc
  File "C:\Python\envs\py-dev\lib\threading.py", line 924 in _bootstrap_inner
  File "C:\Python\envs\py-dev\lib\threading.py", line 882 in _bootstrap

Current thread 0x00001fd8 (most recent call first):
  File "C:\Python\envs\py-dev\lib\threading.py", line 241 in __exit__
  File "C:\Python\envs\py-dev\lib\queue.py", line 176 in get
  File "C:\Python\envs\py-dev\lib\multiprocessing\pool.py", line 376 in _handle_tasks
  File "C:\Python\envs\py-dev\lib\threading.py", line 862 in run
  File "C:\Python\envs\py-dev\lib\threading.py", line 914 in _bootstrap_inner
  File "C:\Python\envs\py-dev\lib\threading.py", line 882 in _bootstrap

Thread 0x000009b0 (most recent call first):
  File "C:\Python\envs\py-dev\lib\multiprocessing\pool.py", line 367 in _handle_workers
  File "C:\Python\envs\py-dev\lib\threading.py", line 862 in run
  File "C:\Python\envs\py-dev\lib\threading.py", line 914 in _bootstrap_inner
  File "C:\Python\envs\py-dev\lib\threading.py", line 882 in _bootstrap

Thread 0x00001348 (most recent call first):

Thread 0x00001edc (most recent call first):
  File "C:\Python\envs\py-dev\lib\threading.py", line 293 in wait

Thread 0x00001624 (most recent call first):
  File "C:\Python\envs\py-dev\lib\threading.py", line 293 in wait
  File "C:\Python\envs\py-dev\lib\queue.py", line 164 in get
  File "C:\Python\envs\py-dev\lib\site-packages\dask\async.py", line 467 in get_async
  File "C:\Python\envs\py-dev\lib\site-packages\dask\threaded.py", line 57 in get
  File "C:\Python\envs\py-dev\lib\site-packages\dask\base.py", line 110 in compute
  File "C:\Python\envs\py-dev\lib\site-packages\dask\base.py", line 37 in compute
  File "test_dask.py", line 44 in <module>
[py-dev] c:\Data\test
λ
test_dask.py
import faulthandler
faulthandler.enable()
from itertools import repeat
import sys
import dask as dsk
import dask.dataframe
from IPython import sys_info
import numpy as np
from numpy.random import randint
import pandas as pd
info = eval(sys_info())
print('Platform:\t', info['platform'])
print('Python:\t', info['sys_version'])
N = 100
dates = pd.date_range('01-Feb-2016', '29-Feb-2016')
region = np.asarray(['ABC', 'DEF', 'GHI', 'JKL'])
product = np.asarray(['A', 'B', 'C', 'D'])
header = ['DATE', 'REGION', 'MISSING', 'PRODUCT']
for date in dates:
    data = zip(
        repeat(date, N),
        region[randint(0, region.size, N)],
        repeat('', N),
        product[randint(0, product.size, N)],
    )
    with open("dummy_{:%Y%m%d}.csv".format(date), 'w') as fid:
        fid.write(','.join(header) + '\n')
        for row in data:
            fid.write(','.join(map(str, row)) + '\n')
data = dsk.dataframe.read_csv('dummy_*.csv', parse_dates=['DATE'], dayfirst=True)
max_date = data.DATE.max().compute()
print(max_date)
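For anyone reading the log: the per-thread dump in this report comes from the faulthandler.enable() call at the top of test_dask.py, which prints every thread's stack when the interpreter hits a fatal error. Below is a minimal, stdlib-only sketch of the same mechanism, triggered on demand rather than by a crash. The helper name dump_all_threads and the idle worker thread are hypothetical and exist only for illustration. This does not reproduce the bug.

```python
import faulthandler
import tempfile
import threading


def dump_all_threads():
    """Return faulthandler's stack dump for every live thread as a string.

    Hypothetical helper for illustration; not part of test_dask.py.
    """
    stop = threading.Event()
    # A second thread, parked in Event.wait(), so the dump shows more than one stack.
    worker = threading.Thread(target=stop.wait, name="idle-worker")
    worker.start()
    try:
        # faulthandler writes straight to a file descriptor, so a real file is needed.
        with tempfile.TemporaryFile(mode="w+") as fh:
            faulthandler.dump_traceback(file=fh, all_threads=True)
            fh.seek(0)
            return fh.read()
    finally:
        stop.set()
        worker.join()


report = dump_all_threads()
print(report)
```

The output has the same shape as the log in this report: one "(most recent call first)" block per thread, with the calling thread marked as "Current thread".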