Tue Jan 9 16:30:04 UTC 2024 I: starting to build python-streamz/bullseye/i386 on jenkins on '2024-01-09 16:29' Tue Jan 9 16:30:04 UTC 2024 I: The jenkins build log is/was available at https://jenkins.debian.net/userContent/reproducible/debian/build_service/i386_6/10669/console.log Tue Jan 9 16:30:04 UTC 2024 I: Downloading source for bullseye/python-streamz=0.6.2-1 --2024-01-09 16:30:04-- http://cdn-fastly.deb.debian.org/debian/pool/main/p/python-streamz/python-streamz_0.6.2-1.dsc Connecting to 78.137.99.97:3128... connected. Proxy request sent, awaiting response... 200 OK Length: 2327 (2.3K) [text/prs.lines.tag] Saving to: ‘python-streamz_0.6.2-1.dsc’ 0K .. 100% 1.68M=0.001s 2024-01-09 16:30:04 (1.68 MB/s) - ‘python-streamz_0.6.2-1.dsc’ saved [2327/2327] Tue Jan 9 16:30:04 UTC 2024 I: python-streamz_0.6.2-1.dsc -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 Format: 3.0 (quilt) Source: python-streamz Binary: python3-streamz Architecture: all Version: 0.6.2-1 Maintainer: Debian Med Packaging Team Uploaders: Nilesh Patra Homepage: https://github.com/python-streamz/streamz/ Standards-Version: 4.5.1 Vcs-Browser: https://salsa.debian.org/med-team/python-streamz Vcs-Git: https://salsa.debian.org/med-team/python-streamz.git Testsuite: autopkgtest, autopkgtest-pkg-python Testsuite-Triggers: python3-all, python3-pytest Build-Depends: debhelper-compat (= 13), dh-python, python3-all, python3-setuptools, python3-six, python3-toolz, python3-tornado, python3-pytest, python3-requests, python3-dask, python3-distributed, python3-numpy, python3-pandas, python3-flaky Package-List: python3-streamz deb python optional arch=all Checksums-Sha1: 66a0c5f1ca3f90113ab788ab8eaee82121b61d77 131780 python-streamz_0.6.2.orig.tar.gz 146f333f61a9f753976de64c41f6bfdb25d74aff 4020 python-streamz_0.6.2-1.debian.tar.xz Checksums-Sha256: c56df0d13ca03d7dd1ae82c88954baa9df1da95a9a8e6682b401c80963375b3f 131780 python-streamz_0.6.2.orig.tar.gz 
f9ed573f05c1c2f8144f8ee794048f6c4f80e4cc6fedc07127be28a348e59dda 4020 python-streamz_0.6.2-1.debian.tar.xz Files: d959308657ca4b59bb2c7ff89c66e92c 131780 python-streamz_0.6.2.orig.tar.gz 01e9e36d2444e14e7ce9f0f2f4d076a4 4020 python-streamz_0.6.2-1.debian.tar.xz -----BEGIN PGP SIGNATURE----- iQJIBAEBCgAyFiEEPpmlJvXcwMu/HO6mALrnSzQzafEFAmAFnkMUHG5wYXRyYTk3 NEBnbWFpbC5jb20ACgkQALrnSzQzafEi8Q/9GjkL2FJrFcDB1GrGvnnLLCZon88Q WBK5Qx9rcOYY3YGovWRunvYokI4QejKY+mAgj+sRMOff7DC5zrZ9siqSUel7536U 9+IjVv+TlmDXLovibLxL4Rt5mdWoZydZmvHoHaNSPoSXQ1NOht0lNaSLnr8n2MO0 YXR1snc2BANqVet6vYCKSkL7LvrHVGvwaC57LxTgttpK5PABoSf4cJ0FjAFjwUW8 +reRWCs7k9Cflhjjz+6nmJrBpewrXksQ2RORRPLKkgol94+N7AMsIcigwnHuaokK jKTxENLhv5jPC44MG4NrfMP0WXLzc1Exzun3hT1WLAH1aSAejjm6brQh+1myZm2m utIcHjQImzkjuNQ1qbIxfChgmNAPdYjkUTfpsh1TyYUNty6rteLVgjfjkkyzHAfx b6ZytTMtFsQpJk9w+vv6MQpZws42aQA7cRyp6HS4pRxDrwrDjO2tvgUk+6slcwOC dJUe7UJsm7Fc2gZQFkIP5ZgC6/WK8syMdVzO9WnOj4Zl6JTQZpcTj/Czy+4rz9+d JMPcczZJzMWNp78owQkRTC1JH1teFECBJjUEmUj7sZuNlnBuJxbh/xor+6jyxb1l giPhgwi2rLmCwZB6y2JgEg0Y+huyzZd3DjYKsCuJBKXb3q9pvckq3/Br5d/0QCr0 H5d4FpOxqoEooZw= =wkVW -----END PGP SIGNATURE----- Tue Jan 9 16:30:04 UTC 2024 I: Checking whether the package is not for us Tue Jan 9 16:30:04 UTC 2024 I: Starting 1st build on remote node ionos6-i386.debian.net. Tue Jan 9 16:30:04 UTC 2024 I: Preparing to do remote build '1' on ionos6-i386.debian.net. Tue Jan 9 16:34:03 UTC 2024 I: Deleting $TMPDIR on ionos6-i386.debian.net. I: pbuilder: network access will be disabled during build I: Current time: Mon Feb 10 10:53:03 -12 2025 I: pbuilder-time-stamp: 1739227983 I: Building the build Environment I: extracting base tarball [/var/cache/pbuilder/bullseye-reproducible-base.tgz] I: copying local configuration W: --override-config is not set; not updating apt.conf Read the manpage for details. 
I: mounting /proc filesystem I: mounting /sys filesystem I: creating /{dev,run}/shm I: mounting /dev/pts filesystem I: redirecting /dev/ptmx to /dev/pts/ptmx I: policy-rc.d already exists I: using eatmydata during job I: Copying source file I: copying [python-streamz_0.6.2-1.dsc] I: copying [./python-streamz_0.6.2.orig.tar.gz] I: copying [./python-streamz_0.6.2-1.debian.tar.xz] I: Extracting source gpgv: unknown type of key resource 'trustedkeys.kbx' gpgv: keyblock resource '/tmp/dpkg-verify-sig.0A5_VCTm/trustedkeys.kbx': General error gpgv: Signature made Mon Jan 18 14:42:11 2021 gpgv: using RSA key 3E99A526F5DCC0CBBF1CEEA600BAE74B343369F1 gpgv: issuer "npatra974@gmail.com" gpgv: Can't check signature: No public key dpkg-source: warning: failed to verify signature on ./python-streamz_0.6.2-1.dsc dpkg-source: info: extracting python-streamz in python-streamz-0.6.2 dpkg-source: info: unpacking python-streamz_0.6.2.orig.tar.gz dpkg-source: info: unpacking python-streamz_0.6.2-1.debian.tar.xz dpkg-source: info: using patch list from debian/patches/series dpkg-source: info: applying disable-unsupported-tests.patch I: Not using root during the build. 
I: Installing the build-deps I: user script /srv/workspace/pbuilder/96905/tmp/hooks/D02_print_environment starting I: set BUILDDIR='/build/reproducible-path' BUILDUSERGECOS='first user,first room,first work-phone,first home-phone,first other' BUILDUSERNAME='pbuilder1' BUILD_ARCH='i386' DEBIAN_FRONTEND='noninteractive' DEB_BUILD_OPTIONS='buildinfo=+all reproducible=+all,-fixfilepath parallel=16 ' DISTRIBUTION='bullseye' HOME='/root' HOST_ARCH='i386' IFS=' ' INVOCATION_ID='8a60ba7944484cf7b70cb4aec9ac1140' LANG='C' LANGUAGE='en_US:en' LC_ALL='C' LD_LIBRARY_PATH='/usr/lib/libeatmydata' LD_PRELOAD='libeatmydata.so' MAIL='/var/mail/root' OPTIND='1' PATH='/usr/sbin:/usr/bin:/sbin:/bin:/usr/games' PBCURRENTCOMMANDLINEOPERATION='build' PBUILDER_OPERATION='build' PBUILDER_PKGDATADIR='/usr/share/pbuilder' PBUILDER_PKGLIBDIR='/usr/lib/pbuilder' PBUILDER_SYSCONFDIR='/etc' PPID='96905' PS1='# ' PS2='> ' PS4='+ ' PWD='/' SHELL='/bin/bash' SHLVL='2' SUDO_COMMAND='/usr/bin/timeout -k 18.1h 18h /usr/bin/ionice -c 3 /usr/bin/nice /usr/sbin/pbuilder --build --configfile /srv/reproducible-results/rbuild-debian/r-b-build.UnyymaSA/pbuilderrc_ABwt --distribution bullseye --hookdir /etc/pbuilder/first-build-hooks --debbuildopts -b --basetgz /var/cache/pbuilder/bullseye-reproducible-base.tgz --buildresult /srv/reproducible-results/rbuild-debian/r-b-build.UnyymaSA/b1 --logfile b1/build.log python-streamz_0.6.2-1.dsc' SUDO_GID='112' SUDO_UID='107' SUDO_USER='jenkins' TERM='unknown' TZ='/usr/share/zoneinfo/Etc/GMT+12' USER='root' _='/usr/bin/systemd-run' http_proxy='http://85.184.249.68:3128' I: uname -a Linux ionos6-i386 6.1.0-17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30) x86_64 GNU/Linux I: ls -l /bin total 5776 -rwxr-xr-x 1 root root 1367848 Mar 27 2022 bash -rwxr-xr-x 3 root root 38280 Jul 20 2020 bunzip2 -rwxr-xr-x 3 root root 38280 Jul 20 2020 bzcat lrwxrwxrwx 1 root root 6 Jul 20 2020 bzcmp -> bzdiff -rwxr-xr-x 1 root root 2225 Jul 20 2020 bzdiff lrwxrwxrwx 1 root root 6 
Jul 20 2020 bzegrep -> bzgrep -rwxr-xr-x 1 root root 4877 Sep 4 2019 bzexe lrwxrwxrwx 1 root root 6 Jul 20 2020 bzfgrep -> bzgrep -rwxr-xr-x 1 root root 3775 Jul 20 2020 bzgrep -rwxr-xr-x 3 root root 38280 Jul 20 2020 bzip2 -rwxr-xr-x 1 root root 17768 Jul 20 2020 bzip2recover lrwxrwxrwx 1 root root 6 Jul 20 2020 bzless -> bzmore -rwxr-xr-x 1 root root 1297 Jul 20 2020 bzmore -rwxr-xr-x 1 root root 38824 Sep 22 2020 cat -rwxr-xr-x 1 root root 71624 Sep 22 2020 chgrp -rwxr-xr-x 1 root root 67528 Sep 22 2020 chmod -rwxr-xr-x 1 root root 75752 Sep 22 2020 chown -rwxr-xr-x 1 root root 157960 Sep 22 2020 cp -rwxr-xr-x 1 root root 128724 Dec 10 2020 dash -rwxr-xr-x 1 root root 124904 Sep 22 2020 date -rwxr-xr-x 1 root root 92172 Sep 22 2020 dd -rwxr-xr-x 1 root root 100752 Sep 22 2020 df -rwxr-xr-x 1 root root 153964 Sep 22 2020 dir -rwxr-xr-x 1 root root 83644 Jan 20 2022 dmesg lrwxrwxrwx 1 root root 8 Nov 7 2019 dnsdomainname -> hostname lrwxrwxrwx 1 root root 8 Nov 7 2019 domainname -> hostname -rwxr-xr-x 1 root root 34664 Sep 22 2020 echo -rwxr-xr-x 1 root root 28 Jan 25 2023 egrep -rwxr-xr-x 1 root root 34664 Sep 22 2020 false -rwxr-xr-x 1 root root 28 Jan 25 2023 fgrep -rwxr-xr-x 1 root root 71928 Jan 20 2022 findmnt -rwsr-xr-x 1 root root 30112 Feb 26 2021 fusermount -rwxr-xr-x 1 root root 210488 Jan 25 2023 grep -rwxr-xr-x 2 root root 2346 Apr 10 2022 gunzip -rwxr-xr-x 1 root root 6447 Apr 10 2022 gzexe -rwxr-xr-x 1 root root 100952 Apr 10 2022 gzip -rwxr-xr-x 1 root root 21916 Nov 7 2019 hostname -rwxr-xr-x 1 root root 83980 Sep 22 2020 ln -rwxr-xr-x 1 root root 55572 Feb 7 2020 login -rwxr-xr-x 1 root root 153964 Sep 22 2020 ls -rwxr-xr-x 1 root root 153124 Jan 20 2022 lsblk -rwxr-xr-x 1 root root 96328 Sep 22 2020 mkdir -rwxr-xr-x 1 root root 79912 Sep 22 2020 mknod -rwxr-xr-x 1 root root 47048 Sep 22 2020 mktemp -rwxr-xr-x 1 root root 58920 Jan 20 2022 more -rwsr-xr-x 1 root root 50720 Jan 20 2022 mount -rwxr-xr-x 1 root root 13856 Jan 20 2022 mountpoint 
-rwxr-xr-x 1 root root 157996 Sep 22 2020 mv lrwxrwxrwx 1 root root 8 Nov 7 2019 nisdomainname -> hostname lrwxrwxrwx 1 root root 14 Dec 16 2021 pidof -> /sbin/killall5 -rwxr-xr-x 1 root root 38824 Sep 22 2020 pwd lrwxrwxrwx 1 root root 4 Mar 27 2022 rbash -> bash -rwxr-xr-x 1 root root 46984 Sep 22 2020 readlink -rwxr-xr-x 1 root root 75720 Sep 22 2020 rm -rwxr-xr-x 1 root root 46984 Sep 22 2020 rmdir -rwxr-xr-x 1 root root 22292 Sep 27 2020 run-parts -rwxr-xr-x 1 root root 125036 Dec 22 2018 sed lrwxrwxrwx 1 root root 4 Feb 8 15:47 sh -> dash -rwxr-xr-x 1 root root 34696 Sep 22 2020 sleep -rwxr-xr-x 1 root root 83880 Sep 22 2020 stty -rwsr-xr-x 1 root root 79396 Jan 20 2022 su -rwxr-xr-x 1 root root 34696 Sep 22 2020 sync -rwxr-xr-x 1 root root 602584 Feb 17 2021 tar -rwxr-xr-x 1 root root 13860 Sep 27 2020 tempfile -rwxr-xr-x 1 root root 108520 Sep 22 2020 touch -rwxr-xr-x 1 root root 34664 Sep 22 2020 true -rwxr-xr-x 1 root root 17768 Feb 26 2021 ulockmgr_server -rwsr-xr-x 1 root root 30236 Jan 20 2022 umount -rwxr-xr-x 1 root root 34664 Sep 22 2020 uname -rwxr-xr-x 2 root root 2346 Apr 10 2022 uncompress -rwxr-xr-x 1 root root 153964 Sep 22 2020 vdir -rwxr-xr-x 1 root root 63024 Jan 20 2022 wdctl lrwxrwxrwx 1 root root 8 Nov 7 2019 ypdomainname -> hostname -rwxr-xr-x 1 root root 1984 Apr 10 2022 zcat -rwxr-xr-x 1 root root 1678 Apr 10 2022 zcmp -rwxr-xr-x 1 root root 5898 Apr 10 2022 zdiff -rwxr-xr-x 1 root root 29 Apr 10 2022 zegrep -rwxr-xr-x 1 root root 29 Apr 10 2022 zfgrep -rwxr-xr-x 1 root root 2081 Apr 10 2022 zforce -rwxr-xr-x 1 root root 8049 Apr 10 2022 zgrep -rwxr-xr-x 1 root root 2206 Apr 10 2022 zless -rwxr-xr-x 1 root root 1842 Apr 10 2022 zmore -rwxr-xr-x 1 root root 4577 Apr 10 2022 znew I: user script /srv/workspace/pbuilder/96905/tmp/hooks/D02_print_environment finished -> Attempting to satisfy build-dependencies -> Creating pbuilder-satisfydepends-dummy package Package: pbuilder-satisfydepends-dummy Version: 0.invalid.0 Architecture: i386 
Maintainer: Debian Pbuilder Team Description: Dummy package to satisfy dependencies with aptitude - created by pbuilder This package was created automatically by pbuilder to satisfy the build-dependencies of the package being currently built. Depends: debhelper-compat (= 13), dh-python, python3-all, python3-setuptools, python3-six, python3-toolz, python3-tornado, python3-pytest, python3-requests, python3-dask, python3-distributed, python3-numpy, python3-pandas, python3-flaky dpkg-deb: building package 'pbuilder-satisfydepends-dummy' in '/tmp/satisfydepends-aptitude/pbuilder-satisfydepends-dummy.deb'. Selecting previously unselected package pbuilder-satisfydepends-dummy. (Reading database ... 17763 files and directories currently installed.) Preparing to unpack .../pbuilder-satisfydepends-dummy.deb ... Unpacking pbuilder-satisfydepends-dummy (0.invalid.0) ... dpkg: pbuilder-satisfydepends-dummy: dependency problems, but configuring anyway as you requested: pbuilder-satisfydepends-dummy depends on debhelper-compat (= 13); however: Package debhelper-compat is not installed. pbuilder-satisfydepends-dummy depends on dh-python; however: Package dh-python is not installed. pbuilder-satisfydepends-dummy depends on python3-all; however: Package python3-all is not installed. pbuilder-satisfydepends-dummy depends on python3-setuptools; however: Package python3-setuptools is not installed. pbuilder-satisfydepends-dummy depends on python3-six; however: Package python3-six is not installed. pbuilder-satisfydepends-dummy depends on python3-toolz; however: Package python3-toolz is not installed. pbuilder-satisfydepends-dummy depends on python3-tornado; however: Package python3-tornado is not installed. pbuilder-satisfydepends-dummy depends on python3-pytest; however: Package python3-pytest is not installed. pbuilder-satisfydepends-dummy depends on python3-requests; however: Package python3-requests is not installed. 
pbuilder-satisfydepends-dummy depends on python3-dask; however: Package python3-dask is not installed. pbuilder-satisfydepends-dummy depends on python3-distributed; however: Package python3-distributed is not installed. pbuilder-satisfydepends-dummy depends on python3-numpy; however: Package python3-numpy is not installed. pbuilder-satisfydepends-dummy depends on python3-pandas; however: Package python3-pandas is not installed. pbuilder-satisfydepends-dummy depends on python3-flaky; however: Package python3-flaky is not installed. Setting up pbuilder-satisfydepends-dummy (0.invalid.0) ... Reading package lists... Building dependency tree... Reading state information... Initializing package states... Writing extended state information... Building tag database... pbuilder-satisfydepends-dummy is already installed at the requested version (0.invalid.0) pbuilder-satisfydepends-dummy is already installed at the requested version (0.invalid.0) The following NEW packages will be installed: autoconf{a} automake{a} autopoint{a} autotools-dev{a} bsdextrautils{a} ca-certificates{a} debhelper{a} dh-autoreconf{a} dh-python{a} dh-strip-nondeterminism{a} dwz{a} file{a} gettext{a} gettext-base{a} groff-base{a} intltool-debian{a} libarchive-zip-perl{a} libblas3{a} libdebhelper-perl{a} libelf1{a} libexpat1{a} libfile-stripnondeterminism-perl{a} libgfortran5{a} libicu67{a} liblapack3{a} libmagic-mgc{a} libmagic1{a} libmpdec3{a} libpipeline1{a} libpython3-stdlib{a} libpython3.9-minimal{a} libpython3.9-stdlib{a} libreadline8{a} libsigsegv2{a} libsub-override-perl{a} libtool{a} libuchardet0{a} libxml2{a} libyaml-0-2{a} m4{a} man-db{a} media-types{a} openssl{a} po-debconf{a} python3{a} python3-all{a} python3-attr{a} python3-certifi{a} python3-chardet{a} python3-click{a} python3-cloudpickle{a} python3-colorama{a} python3-dask{a} python3-dateutil{a} python3-distributed{a} python3-distutils{a} python3-flaky{a} python3-fsspec{a} python3-heapdict{a} python3-idna{a} 
python3-importlib-metadata{a} python3-iniconfig{a} python3-lib2to3{a} python3-minimal{a} python3-more-itertools{a} python3-msgpack{a} python3-numpy{a} python3-packaging{a} python3-pandas{a} python3-pandas-lib{a} python3-pkg-resources{a} python3-pluggy{a} python3-psutil{a} python3-py{a} python3-pyparsing{a} python3-pytest{a} python3-requests{a} python3-setuptools{a} python3-six{a} python3-sortedcontainers{a} python3-tblib{a} python3-toml{a} python3-toolz{a} python3-tornado{a} python3-tz{a} python3-urllib3{a} python3-yaml{a} python3-zict{a} python3-zipp{a} python3.9{a} python3.9-minimal{a} readline-common{a} sensible-utils{a} tzdata{a} The following packages are RECOMMENDED but will NOT be installed: curl libarchive-cpio-perl libltdl-dev libmail-sendmail-perl lynx python3-bottleneck python3-bs4 python3-html5lib python3-jinja2 python3-lxml python3-matplotlib python3-numexpr python3-odf python3-openpyxl python3-partd python3-pygments python3-scipy python3-tables python3-xlwt wget 0 packages upgraded, 94 newly installed, 0 to remove and 0 not upgraded. Need to get 41.3 MB of archives. After unpacking 179 MB will be used. Writing extended state information... 
Get: 1 http://deb.debian.org/debian bullseye/main i386 bsdextrautils i386 2.36.1-8+deb11u1 [149 kB] Get: 2 http://deb.debian.org/debian bullseye/main i386 libuchardet0 i386 0.0.7-1 [67.9 kB] Get: 3 http://deb.debian.org/debian bullseye/main i386 groff-base i386 1.22.4-6 [952 kB] Get: 4 http://deb.debian.org/debian bullseye/main i386 libpipeline1 i386 1.5.3-1 [36.8 kB] Get: 5 http://deb.debian.org/debian bullseye/main i386 man-db i386 2.9.4-2 [1367 kB] Get: 6 http://deb.debian.org/debian bullseye/main i386 libpython3.9-minimal i386 3.9.2-1 [801 kB] Get: 7 http://deb.debian.org/debian bullseye/main i386 libexpat1 i386 2.2.10-2+deb11u5 [101 kB] Get: 8 http://deb.debian.org/debian bullseye/main i386 python3.9-minimal i386 3.9.2-1 [1956 kB] Get: 9 http://deb.debian.org/debian bullseye/main i386 python3-minimal i386 3.9.2-3 [38.2 kB] Get: 10 http://deb.debian.org/debian bullseye/main i386 media-types all 4.0.0 [30.3 kB] Get: 11 http://deb.debian.org/debian bullseye/main i386 tzdata all 2021a-1+deb11u10 [286 kB] Get: 12 http://deb.debian.org/debian bullseye/main i386 libmpdec3 i386 2.5.1-1 [91.9 kB] Get: 13 http://deb.debian.org/debian bullseye/main i386 readline-common all 8.1-1 [73.7 kB] Get: 14 http://deb.debian.org/debian bullseye/main i386 libreadline8 i386 8.1-1 [173 kB] Get: 15 http://deb.debian.org/debian bullseye/main i386 libpython3.9-stdlib i386 3.9.2-1 [1703 kB] Get: 16 http://deb.debian.org/debian bullseye/main i386 python3.9 i386 3.9.2-1 [466 kB] Get: 17 http://deb.debian.org/debian bullseye/main i386 libpython3-stdlib i386 3.9.2-3 [21.4 kB] Get: 18 http://deb.debian.org/debian bullseye/main i386 python3 i386 3.9.2-3 [37.9 kB] Get: 19 http://deb.debian.org/debian bullseye/main i386 sensible-utils all 0.0.14 [14.8 kB] Get: 20 http://deb.debian.org/debian bullseye/main i386 openssl i386 1.1.1w-0+deb11u1 [869 kB] Get: 21 http://deb.debian.org/debian bullseye/main i386 ca-certificates all 20210119 [158 kB] Get: 22 http://deb.debian.org/debian bullseye/main i386 
libmagic-mgc i386 1:5.39-3+deb11u1 [273 kB] Get: 23 http://deb.debian.org/debian bullseye/main i386 libmagic1 i386 1:5.39-3+deb11u1 [135 kB] Get: 24 http://deb.debian.org/debian bullseye/main i386 file i386 1:5.39-3+deb11u1 [69.2 kB] Get: 25 http://deb.debian.org/debian bullseye/main i386 gettext-base i386 0.21-4 [176 kB] Get: 26 http://deb.debian.org/debian bullseye/main i386 libsigsegv2 i386 2.13-1 [35.1 kB] Get: 27 http://deb.debian.org/debian bullseye/main i386 m4 i386 1.4.18-5 [206 kB] Get: 28 http://deb.debian.org/debian bullseye/main i386 autoconf all 2.69-14 [313 kB] Get: 29 http://deb.debian.org/debian bullseye/main i386 autotools-dev all 20180224.1+nmu1 [77.1 kB] Get: 30 http://deb.debian.org/debian bullseye/main i386 automake all 1:1.16.3-2 [814 kB] Get: 31 http://deb.debian.org/debian bullseye/main i386 autopoint all 0.21-4 [510 kB] Get: 32 http://deb.debian.org/debian bullseye/main i386 libdebhelper-perl all 13.3.4 [189 kB] Get: 33 http://deb.debian.org/debian bullseye/main i386 libtool all 2.4.6-15 [513 kB] Get: 34 http://deb.debian.org/debian bullseye/main i386 dh-autoreconf all 20 [17.1 kB] Get: 35 http://deb.debian.org/debian bullseye/main i386 libarchive-zip-perl all 1.68-1 [104 kB] Get: 36 http://deb.debian.org/debian bullseye/main i386 libsub-override-perl all 0.09-2 [10.2 kB] Get: 37 http://deb.debian.org/debian bullseye/main i386 libfile-stripnondeterminism-perl all 1.12.0-1 [26.3 kB] Get: 38 http://deb.debian.org/debian bullseye/main i386 dh-strip-nondeterminism all 1.12.0-1 [15.4 kB] Get: 39 http://deb.debian.org/debian bullseye/main i386 libelf1 i386 0.183-1 [171 kB] Get: 40 http://deb.debian.org/debian bullseye/main i386 dwz i386 0.13+20210201-1 [179 kB] Get: 41 http://deb.debian.org/debian bullseye/main i386 libicu67 i386 67.1-7 [8775 kB] Get: 42 http://deb.debian.org/debian bullseye/main i386 libxml2 i386 2.9.10+dfsg-6.7+deb11u4 [728 kB] Get: 43 http://deb.debian.org/debian bullseye/main i386 gettext i386 0.21-4 [1322 kB] Get: 44 
http://deb.debian.org/debian bullseye/main i386 intltool-debian all 0.35.0+20060710.5 [26.8 kB] Get: 45 http://deb.debian.org/debian bullseye/main i386 po-debconf all 1.0.21+nmu1 [248 kB] Get: 46 http://deb.debian.org/debian bullseye/main i386 debhelper all 13.3.4 [1049 kB] Get: 47 http://deb.debian.org/debian bullseye/main i386 python3-lib2to3 all 3.9.2-1 [77.8 kB] Get: 48 http://deb.debian.org/debian bullseye/main i386 python3-distutils all 3.9.2-1 [143 kB] Get: 49 http://deb.debian.org/debian bullseye/main i386 dh-python all 4.20201102+nmu1 [99.4 kB] Get: 50 http://deb.debian.org/debian bullseye/main i386 libblas3 i386 3.9.0-3+deb11u1 [147 kB] Get: 51 http://deb.debian.org/debian bullseye/main i386 libgfortran5 i386 10.2.1-6 [643 kB] Get: 52 http://deb.debian.org/debian bullseye/main i386 liblapack3 i386 3.9.0-3+deb11u1 [1960 kB] Get: 53 http://deb.debian.org/debian bullseye/main i386 libyaml-0-2 i386 0.2.2-1 [51.7 kB] Get: 54 http://deb.debian.org/debian bullseye/main i386 python3-all i386 3.9.2-3 [1060 B] Get: 55 http://deb.debian.org/debian bullseye/main i386 python3-attr all 20.3.0-1 [52.9 kB] Get: 56 http://deb.debian.org/debian bullseye/main i386 python3-certifi all 2020.6.20-1 [151 kB] Get: 57 http://deb.debian.org/debian bullseye/main i386 python3-pkg-resources all 52.0.0-4 [190 kB] Get: 58 http://deb.debian.org/debian bullseye/main i386 python3-chardet all 4.0.0-1 [99.0 kB] Get: 59 http://deb.debian.org/debian bullseye/main i386 python3-colorama all 0.4.4-1 [28.5 kB] Get: 60 http://deb.debian.org/debian bullseye/main i386 python3-click all 7.1.2-1 [75.7 kB] Get: 61 http://deb.debian.org/debian bullseye/main i386 python3-cloudpickle all 1.6.0-1 [21.6 kB] Get: 62 http://deb.debian.org/debian bullseye/main i386 python3-fsspec all 0.8.4-1 [65.5 kB] Get: 63 http://deb.debian.org/debian bullseye/main i386 python3-toolz all 0.9.0-1.1 [42.0 kB] Get: 64 http://deb.debian.org/debian bullseye/main i386 python3-yaml i386 5.3.1-5 [127 kB] Get: 65 
http://deb.debian.org/debian bullseye/main i386 python3-dask all 2021.01.0+dfsg-1 [672 kB] Get: 66 http://deb.debian.org/debian bullseye/main i386 python3-six all 1.16.0-2 [17.5 kB] Get: 67 http://deb.debian.org/debian bullseye/main i386 python3-dateutil all 2.8.1-6 [79.2 kB] Get: 68 http://deb.debian.org/debian bullseye/main i386 python3-msgpack i386 1.0.0-6+b1 [71.7 kB] Get: 69 http://deb.debian.org/debian bullseye/main i386 python3-psutil i386 5.8.0-1 [185 kB] Get: 70 http://deb.debian.org/debian bullseye/main i386 python3-sortedcontainers all 2.1.0-2 [31.4 kB] Get: 71 http://deb.debian.org/debian bullseye/main i386 python3-tblib all 1.7.0-1 [13.9 kB] Get: 72 http://deb.debian.org/debian bullseye/main i386 python3-heapdict all 1.0.1-1 [5288 B] Get: 73 http://deb.debian.org/debian bullseye/main i386 python3-zict all 2.0.0-1 [9400 B] Get: 74 http://deb.debian.org/debian bullseye/main i386 python3-distributed all 2021.01.0+ds.1-2.1+deb11u1 [474 kB] Get: 75 http://deb.debian.org/debian bullseye/main i386 python3-flaky all 3.7.0-1 [20.1 kB] Get: 76 http://deb.debian.org/debian bullseye/main i386 python3-idna all 2.10-1 [37.4 kB] Get: 77 http://deb.debian.org/debian bullseye/main i386 python3-more-itertools all 4.2.0-3 [42.7 kB] Get: 78 http://deb.debian.org/debian bullseye/main i386 python3-zipp all 1.0.0-3 [6060 B] Get: 79 http://deb.debian.org/debian bullseye/main i386 python3-importlib-metadata all 1.6.0-2 [10.3 kB] Get: 80 http://deb.debian.org/debian bullseye/main i386 python3-iniconfig all 1.1.1-1 [6308 B] Get: 81 http://deb.debian.org/debian bullseye/main i386 python3-numpy i386 1:1.19.5-1 [3600 kB] Get: 82 http://deb.debian.org/debian bullseye/main i386 python3-pyparsing all 2.4.7-1 [109 kB] Get: 83 http://deb.debian.org/debian bullseye/main i386 python3-packaging all 20.9-2 [33.5 kB] Get: 84 http://deb.debian.org/debian bullseye/main i386 python3-tz all 2021.1-1 [34.8 kB] Get: 85 http://deb.debian.org/debian bullseye/main i386 python3-pandas-lib i386 
1.1.5+dfsg-2 [3182 kB] Get: 86 http://deb.debian.org/debian bullseye/main i386 python3-pandas all 1.1.5+dfsg-2 [2096 kB] Get: 87 http://deb.debian.org/debian bullseye/main i386 python3-pluggy all 0.13.0-6 [22.3 kB] Get: 88 http://deb.debian.org/debian bullseye/main i386 python3-py all 1.10.0-1 [94.2 kB] Get: 89 http://deb.debian.org/debian bullseye/main i386 python3-toml all 0.10.1-1 [15.9 kB] Get: 90 http://deb.debian.org/debian bullseye/main i386 python3-pytest all 6.0.2-2 [211 kB] Get: 91 http://deb.debian.org/debian bullseye/main i386 python3-urllib3 all 1.26.5-1~exp1 [114 kB] Get: 92 http://deb.debian.org/debian bullseye/main i386 python3-requests all 2.25.1+dfsg-2 [69.3 kB] Get: 93 http://deb.debian.org/debian bullseye/main i386 python3-setuptools all 52.0.0-4 [366 kB] Get: 94 http://deb.debian.org/debian bullseye/main i386 python3-tornado i386 6.1.0-1+b1 [338 kB] Fetched 41.3 MB in 3s (16.5 MB/s) debconf: delaying package configuration, since apt-utils is not installed Selecting previously unselected package bsdextrautils. (Reading database ... 17763 files and directories currently installed.) Preparing to unpack .../0-bsdextrautils_2.36.1-8+deb11u1_i386.deb ... Unpacking bsdextrautils (2.36.1-8+deb11u1) ... Selecting previously unselected package libuchardet0:i386. Preparing to unpack .../1-libuchardet0_0.0.7-1_i386.deb ... Unpacking libuchardet0:i386 (0.0.7-1) ... Selecting previously unselected package groff-base. 
Preparing to unpack .../2-groff-base_1.22.4-6_i386.deb ... Unpacking groff-base (1.22.4-6) ... Selecting previously unselected package libpipeline1:i386. Preparing to unpack .../3-libpipeline1_1.5.3-1_i386.deb ... Unpacking libpipeline1:i386 (1.5.3-1) ... Selecting previously unselected package man-db. Preparing to unpack .../4-man-db_2.9.4-2_i386.deb ... Unpacking man-db (2.9.4-2) ... Selecting previously unselected package libpython3.9-minimal:i386. Preparing to unpack .../5-libpython3.9-minimal_3.9.2-1_i386.deb ... Unpacking libpython3.9-minimal:i386 (3.9.2-1) ... Selecting previously unselected package libexpat1:i386. Preparing to unpack .../6-libexpat1_2.2.10-2+deb11u5_i386.deb ... Unpacking libexpat1:i386 (2.2.10-2+deb11u5) ... Selecting previously unselected package python3.9-minimal. Preparing to unpack .../7-python3.9-minimal_3.9.2-1_i386.deb ... Unpacking python3.9-minimal (3.9.2-1) ... Setting up libpython3.9-minimal:i386 (3.9.2-1) ... Setting up libexpat1:i386 (2.2.10-2+deb11u5) ... Setting up python3.9-minimal (3.9.2-1) ... Selecting previously unselected package python3-minimal. (Reading database ... 18630 files and directories currently installed.) Preparing to unpack .../0-python3-minimal_3.9.2-3_i386.deb ... Unpacking python3-minimal (3.9.2-3) ... Selecting previously unselected package media-types. Preparing to unpack .../1-media-types_4.0.0_all.deb ... Unpacking media-types (4.0.0) ... 
Selecting previously unselected package tzdata. Preparing to unpack .../2-tzdata_2021a-1+deb11u10_all.deb ... Unpacking tzdata (2021a-1+deb11u10) ... Selecting previously unselected package libmpdec3:i386. Preparing to unpack .../3-libmpdec3_2.5.1-1_i386.deb ... Unpacking libmpdec3:i386 (2.5.1-1) ... Selecting previously unselected package readline-common. Preparing to unpack .../4-readline-common_8.1-1_all.deb ... Unpacking readline-common (8.1-1) ... Selecting previously unselected package libreadline8:i386. Preparing to unpack .../5-libreadline8_8.1-1_i386.deb ... Unpacking libreadline8:i386 (8.1-1) ... Selecting previously unselected package libpython3.9-stdlib:i386. Preparing to unpack .../6-libpython3.9-stdlib_3.9.2-1_i386.deb ... Unpacking libpython3.9-stdlib:i386 (3.9.2-1) ... Selecting previously unselected package python3.9. Preparing to unpack .../7-python3.9_3.9.2-1_i386.deb ... Unpacking python3.9 (3.9.2-1) ... Selecting previously unselected package libpython3-stdlib:i386. Preparing to unpack .../8-libpython3-stdlib_3.9.2-3_i386.deb ... Unpacking libpython3-stdlib:i386 (3.9.2-3) ... Setting up python3-minimal (3.9.2-3) ... Selecting previously unselected package python3. (Reading database ... 20913 files and directories currently installed.) Preparing to unpack .../00-python3_3.9.2-3_i386.deb ... Unpacking python3 (3.9.2-3) ... Selecting previously unselected package sensible-utils. 
Preparing to unpack .../01-sensible-utils_0.0.14_all.deb ... Unpacking sensible-utils (0.0.14) ... Selecting previously unselected package openssl. Preparing to unpack .../02-openssl_1.1.1w-0+deb11u1_i386.deb ... Unpacking openssl (1.1.1w-0+deb11u1) ... Selecting previously unselected package ca-certificates. Preparing to unpack .../03-ca-certificates_20210119_all.deb ... Unpacking ca-certificates (20210119) ... Selecting previously unselected package libmagic-mgc. Preparing to unpack .../04-libmagic-mgc_1%3a5.39-3+deb11u1_i386.deb ... Unpacking libmagic-mgc (1:5.39-3+deb11u1) ... Selecting previously unselected package libmagic1:i386. Preparing to unpack .../05-libmagic1_1%3a5.39-3+deb11u1_i386.deb ... Unpacking libmagic1:i386 (1:5.39-3+deb11u1) ... Selecting previously unselected package file. Preparing to unpack .../06-file_1%3a5.39-3+deb11u1_i386.deb ... Unpacking file (1:5.39-3+deb11u1) ... Selecting previously unselected package gettext-base. Preparing to unpack .../07-gettext-base_0.21-4_i386.deb ... Unpacking gettext-base (0.21-4) ... Selecting previously unselected package libsigsegv2:i386. Preparing to unpack .../08-libsigsegv2_2.13-1_i386.deb ... Unpacking libsigsegv2:i386 (2.13-1) ... Selecting previously unselected package m4. Preparing to unpack .../09-m4_1.4.18-5_i386.deb ... Unpacking m4 (1.4.18-5) ... Selecting previously unselected package autoconf. Preparing to unpack .../10-autoconf_2.69-14_all.deb ... Unpacking autoconf (2.69-14) ... Selecting previously unselected package autotools-dev. Preparing to unpack .../11-autotools-dev_20180224.1+nmu1_all.deb ... Unpacking autotools-dev (20180224.1+nmu1) ... Selecting previously unselected package automake. Preparing to unpack .../12-automake_1%3a1.16.3-2_all.deb ... Unpacking automake (1:1.16.3-2) ... Selecting previously unselected package autopoint. Preparing to unpack .../13-autopoint_0.21-4_all.deb ... Unpacking autopoint (0.21-4) ... Selecting previously unselected package libdebhelper-perl. 
Preparing to unpack .../14-libdebhelper-perl_13.3.4_all.deb ... Unpacking libdebhelper-perl (13.3.4) ... Selecting previously unselected package libtool. Preparing to unpack .../15-libtool_2.4.6-15_all.deb ... Unpacking libtool (2.4.6-15) ... Selecting previously unselected package dh-autoreconf. Preparing to unpack .../16-dh-autoreconf_20_all.deb ... Unpacking dh-autoreconf (20) ... Selecting previously unselected package libarchive-zip-perl. Preparing to unpack .../17-libarchive-zip-perl_1.68-1_all.deb ... Unpacking libarchive-zip-perl (1.68-1) ... Selecting previously unselected package libsub-override-perl. Preparing to unpack .../18-libsub-override-perl_0.09-2_all.deb ... Unpacking libsub-override-perl (0.09-2) ... Selecting previously unselected package libfile-stripnondeterminism-perl. Preparing to unpack .../19-libfile-stripnondeterminism-perl_1.12.0-1_all.deb ... Unpacking libfile-stripnondeterminism-perl (1.12.0-1) ... Selecting previously unselected package dh-strip-nondeterminism. Preparing to unpack .../20-dh-strip-nondeterminism_1.12.0-1_all.deb ... Unpacking dh-strip-nondeterminism (1.12.0-1) ... Selecting previously unselected package libelf1:i386. Preparing to unpack .../21-libelf1_0.183-1_i386.deb ... Unpacking libelf1:i386 (0.183-1) ... Selecting previously unselected package dwz. Preparing to unpack .../22-dwz_0.13+20210201-1_i386.deb ... Unpacking dwz (0.13+20210201-1) ... Selecting previously unselected package libicu67:i386. Preparing to unpack .../23-libicu67_67.1-7_i386.deb ... Unpacking libicu67:i386 (67.1-7) ... Selecting previously unselected package libxml2:i386. Preparing to unpack .../24-libxml2_2.9.10+dfsg-6.7+deb11u4_i386.deb ... Unpacking libxml2:i386 (2.9.10+dfsg-6.7+deb11u4) ... Selecting previously unselected package gettext. Preparing to unpack .../25-gettext_0.21-4_i386.deb ... Unpacking gettext (0.21-4) ... Selecting previously unselected package intltool-debian. 
Preparing to unpack .../26-intltool-debian_0.35.0+20060710.5_all.deb ... Unpacking intltool-debian (0.35.0+20060710.5) ... Selecting previously unselected package po-debconf. Preparing to unpack .../27-po-debconf_1.0.21+nmu1_all.deb ... Unpacking po-debconf (1.0.21+nmu1) ... Selecting previously unselected package debhelper. Preparing to unpack .../28-debhelper_13.3.4_all.deb ... Unpacking debhelper (13.3.4) ... Selecting previously unselected package python3-lib2to3. Preparing to unpack .../29-python3-lib2to3_3.9.2-1_all.deb ... Unpacking python3-lib2to3 (3.9.2-1) ... Selecting previously unselected package python3-distutils. Preparing to unpack .../30-python3-distutils_3.9.2-1_all.deb ... Unpacking python3-distutils (3.9.2-1) ... Selecting previously unselected package dh-python. Preparing to unpack .../31-dh-python_4.20201102+nmu1_all.deb ... Unpacking dh-python (4.20201102+nmu1) ... Selecting previously unselected package libblas3:i386. Preparing to unpack .../32-libblas3_3.9.0-3+deb11u1_i386.deb ... Unpacking libblas3:i386 (3.9.0-3+deb11u1) ... Selecting previously unselected package libgfortran5:i386. Preparing to unpack .../33-libgfortran5_10.2.1-6_i386.deb ... Unpacking libgfortran5:i386 (10.2.1-6) ... Selecting previously unselected package liblapack3:i386. Preparing to unpack .../34-liblapack3_3.9.0-3+deb11u1_i386.deb ... Unpacking liblapack3:i386 (3.9.0-3+deb11u1) ... Selecting previously unselected package libyaml-0-2:i386. Preparing to unpack .../35-libyaml-0-2_0.2.2-1_i386.deb ... Unpacking libyaml-0-2:i386 (0.2.2-1) ... Selecting previously unselected package python3-all. Preparing to unpack .../36-python3-all_3.9.2-3_i386.deb ... Unpacking python3-all (3.9.2-3) ... Selecting previously unselected package python3-attr. Preparing to unpack .../37-python3-attr_20.3.0-1_all.deb ... Unpacking python3-attr (20.3.0-1) ... Selecting previously unselected package python3-certifi. Preparing to unpack .../38-python3-certifi_2020.6.20-1_all.deb ... 
Unpacking python3-certifi (2020.6.20-1) ... Selecting previously unselected package python3-pkg-resources. Preparing to unpack .../39-python3-pkg-resources_52.0.0-4_all.deb ... Unpacking python3-pkg-resources (52.0.0-4) ... Selecting previously unselected package python3-chardet. Preparing to unpack .../40-python3-chardet_4.0.0-1_all.deb ... Unpacking python3-chardet (4.0.0-1) ... Selecting previously unselected package python3-colorama. Preparing to unpack .../41-python3-colorama_0.4.4-1_all.deb ... Unpacking python3-colorama (0.4.4-1) ... Selecting previously unselected package python3-click. Preparing to unpack .../42-python3-click_7.1.2-1_all.deb ... Unpacking python3-click (7.1.2-1) ... Selecting previously unselected package python3-cloudpickle. Preparing to unpack .../43-python3-cloudpickle_1.6.0-1_all.deb ... Unpacking python3-cloudpickle (1.6.0-1) ... Selecting previously unselected package python3-fsspec. Preparing to unpack .../44-python3-fsspec_0.8.4-1_all.deb ... Unpacking python3-fsspec (0.8.4-1) ... Selecting previously unselected package python3-toolz. Preparing to unpack .../45-python3-toolz_0.9.0-1.1_all.deb ... Unpacking python3-toolz (0.9.0-1.1) ... Selecting previously unselected package python3-yaml. Preparing to unpack .../46-python3-yaml_5.3.1-5_i386.deb ... Unpacking python3-yaml (5.3.1-5) ... Selecting previously unselected package python3-dask. Preparing to unpack .../47-python3-dask_2021.01.0+dfsg-1_all.deb ... Unpacking python3-dask (2021.01.0+dfsg-1) ... Selecting previously unselected package python3-six. Preparing to unpack .../48-python3-six_1.16.0-2_all.deb ... Unpacking python3-six (1.16.0-2) ... Selecting previously unselected package python3-dateutil. Preparing to unpack .../49-python3-dateutil_2.8.1-6_all.deb ... Unpacking python3-dateutil (2.8.1-6) ... Selecting previously unselected package python3-msgpack. Preparing to unpack .../50-python3-msgpack_1.0.0-6+b1_i386.deb ... Unpacking python3-msgpack (1.0.0-6+b1) ... 
Selecting previously unselected package python3-psutil. Preparing to unpack .../51-python3-psutil_5.8.0-1_i386.deb ... Unpacking python3-psutil (5.8.0-1) ... Selecting previously unselected package python3-sortedcontainers. Preparing to unpack .../52-python3-sortedcontainers_2.1.0-2_all.deb ... Unpacking python3-sortedcontainers (2.1.0-2) ... Selecting previously unselected package python3-tblib. Preparing to unpack .../53-python3-tblib_1.7.0-1_all.deb ... Unpacking python3-tblib (1.7.0-1) ... Selecting previously unselected package python3-heapdict. Preparing to unpack .../54-python3-heapdict_1.0.1-1_all.deb ... Unpacking python3-heapdict (1.0.1-1) ... Selecting previously unselected package python3-zict. Preparing to unpack .../55-python3-zict_2.0.0-1_all.deb ... Unpacking python3-zict (2.0.0-1) ... Selecting previously unselected package python3-distributed. Preparing to unpack .../56-python3-distributed_2021.01.0+ds.1-2.1+deb11u1_all.deb ... Unpacking python3-distributed (2021.01.0+ds.1-2.1+deb11u1) ... Selecting previously unselected package python3-flaky. Preparing to unpack .../57-python3-flaky_3.7.0-1_all.deb ... Unpacking python3-flaky (3.7.0-1) ... Selecting previously unselected package python3-idna. Preparing to unpack .../58-python3-idna_2.10-1_all.deb ... Unpacking python3-idna (2.10-1) ... Selecting previously unselected package python3-more-itertools. Preparing to unpack .../59-python3-more-itertools_4.2.0-3_all.deb ... Unpacking python3-more-itertools (4.2.0-3) ... Selecting previously unselected package python3-zipp. Preparing to unpack .../60-python3-zipp_1.0.0-3_all.deb ... Unpacking python3-zipp (1.0.0-3) ... Selecting previously unselected package python3-importlib-metadata. Preparing to unpack .../61-python3-importlib-metadata_1.6.0-2_all.deb ... Unpacking python3-importlib-metadata (1.6.0-2) ... Selecting previously unselected package python3-iniconfig. Preparing to unpack .../62-python3-iniconfig_1.1.1-1_all.deb ... 
Unpacking python3-iniconfig (1.1.1-1) ... Selecting previously unselected package python3-numpy. Preparing to unpack .../63-python3-numpy_1%3a1.19.5-1_i386.deb ... Unpacking python3-numpy (1:1.19.5-1) ... Selecting previously unselected package python3-pyparsing. Preparing to unpack .../64-python3-pyparsing_2.4.7-1_all.deb ... Unpacking python3-pyparsing (2.4.7-1) ... Selecting previously unselected package python3-packaging. Preparing to unpack .../65-python3-packaging_20.9-2_all.deb ... Unpacking python3-packaging (20.9-2) ... Selecting previously unselected package python3-tz. Preparing to unpack .../66-python3-tz_2021.1-1_all.deb ... Unpacking python3-tz (2021.1-1) ... Selecting previously unselected package python3-pandas-lib:i386. Preparing to unpack .../67-python3-pandas-lib_1.1.5+dfsg-2_i386.deb ... Unpacking python3-pandas-lib:i386 (1.1.5+dfsg-2) ... Selecting previously unselected package python3-pandas. Preparing to unpack .../68-python3-pandas_1.1.5+dfsg-2_all.deb ... Unpacking python3-pandas (1.1.5+dfsg-2) ... Selecting previously unselected package python3-pluggy. Preparing to unpack .../69-python3-pluggy_0.13.0-6_all.deb ... Unpacking python3-pluggy (0.13.0-6) ... Selecting previously unselected package python3-py. Preparing to unpack .../70-python3-py_1.10.0-1_all.deb ... Unpacking python3-py (1.10.0-1) ... Selecting previously unselected package python3-toml. Preparing to unpack .../71-python3-toml_0.10.1-1_all.deb ... Unpacking python3-toml (0.10.1-1) ... Selecting previously unselected package python3-pytest. Preparing to unpack .../72-python3-pytest_6.0.2-2_all.deb ... Unpacking python3-pytest (6.0.2-2) ... Selecting previously unselected package python3-urllib3. Preparing to unpack .../73-python3-urllib3_1.26.5-1~exp1_all.deb ... Unpacking python3-urllib3 (1.26.5-1~exp1) ... Selecting previously unselected package python3-requests. Preparing to unpack .../74-python3-requests_2.25.1+dfsg-2_all.deb ... 
Unpacking python3-requests (2.25.1+dfsg-2) ... Selecting previously unselected package python3-setuptools. Preparing to unpack .../75-python3-setuptools_52.0.0-4_all.deb ... Unpacking python3-setuptools (52.0.0-4) ... Selecting previously unselected package python3-tornado. Preparing to unpack .../76-python3-tornado_6.1.0-1+b1_i386.deb ... Unpacking python3-tornado (6.1.0-1+b1) ... Setting up media-types (4.0.0) ... Setting up libpipeline1:i386 (1.5.3-1) ... Setting up bsdextrautils (2.36.1-8+deb11u1) ... update-alternatives: using /usr/bin/write.ul to provide /usr/bin/write (write) in auto mode Setting up libicu67:i386 (67.1-7) ... Setting up libmagic-mgc (1:5.39-3+deb11u1) ... Setting up libarchive-zip-perl (1.68-1) ... Setting up libyaml-0-2:i386 (0.2.2-1) ... Setting up libdebhelper-perl (13.3.4) ... Setting up libmagic1:i386 (1:5.39-3+deb11u1) ... Setting up gettext-base (0.21-4) ... Setting up file (1:5.39-3+deb11u1) ... Setting up tzdata (2021a-1+deb11u10) ... Current default time zone: 'Etc/UTC' Local time is now: Mon Feb 10 22:53:23 UTC 2025. Universal Time is now: Mon Feb 10 22:53:23 UTC 2025. Run 'dpkg-reconfigure tzdata' if you wish to change it. Setting up autotools-dev (20180224.1+nmu1) ... Setting up libblas3:i386 (3.9.0-3+deb11u1) ... update-alternatives: using /usr/lib/i386-linux-gnu/blas/libblas.so.3 to provide /usr/lib/i386-linux-gnu/libblas.so.3 (libblas.so.3-i386-linux-gnu) in auto mode Setting up libsigsegv2:i386 (2.13-1) ... Setting up autopoint (0.21-4) ... Setting up libgfortran5:i386 (10.2.1-6) ... Setting up sensible-utils (0.0.14) ... Setting up libuchardet0:i386 (0.0.7-1) ... Setting up libmpdec3:i386 (2.5.1-1) ... Setting up libsub-override-perl (0.09-2) ... Setting up openssl (1.1.1w-0+deb11u1) ... Setting up libelf1:i386 (0.183-1) ... Setting up readline-common (8.1-1) ... Setting up libxml2:i386 (2.9.10+dfsg-6.7+deb11u4) ... Setting up libfile-stripnondeterminism-perl (1.12.0-1) ... Setting up liblapack3:i386 (3.9.0-3+deb11u1) ... 
update-alternatives: using /usr/lib/i386-linux-gnu/lapack/liblapack.so.3 to provide /usr/lib/i386-linux-gnu/liblapack.so.3 (liblapack.so.3-i386-linux-gnu) in auto mode Setting up gettext (0.21-4) ... Setting up libtool (2.4.6-15) ... Setting up libreadline8:i386 (8.1-1) ... Setting up m4 (1.4.18-5) ... Setting up intltool-debian (0.35.0+20060710.5) ... Setting up ca-certificates (20210119) ... Updating certificates in /etc/ssl/certs... 129 added, 0 removed; done. Setting up autoconf (2.69-14) ... Setting up dh-strip-nondeterminism (1.12.0-1) ... Setting up dwz (0.13+20210201-1) ... Setting up groff-base (1.22.4-6) ... Setting up libpython3.9-stdlib:i386 (3.9.2-1) ... Setting up libpython3-stdlib:i386 (3.9.2-3) ... Setting up automake (1:1.16.3-2) ... update-alternatives: using /usr/bin/automake-1.16 to provide /usr/bin/automake (automake) in auto mode Setting up po-debconf (1.0.21+nmu1) ... Setting up man-db (2.9.4-2) ... Not building database; man-db/auto-update is not 'true'. Setting up dh-autoreconf (20) ... Setting up python3.9 (3.9.2-1) ... Setting up debhelper (13.3.4) ... Setting up python3 (3.9.2-3) ... Setting up python3-sortedcontainers (2.1.0-2) ... Setting up python3-psutil (5.8.0-1) ... Setting up python3-tz (2021.1-1) ... Setting up python3-cloudpickle (1.6.0-1) ... Setting up python3-six (1.16.0-2) ... Setting up python3-flaky (3.7.0-1) ... Setting up python3-pyparsing (2.4.7-1) ... Setting up python3-certifi (2020.6.20-1) ... Setting up python3-idna (2.10-1) ... Setting up python3-toml (0.10.1-1) ... Setting up python3-urllib3 (1.26.5-1~exp1) ... Setting up python3-toolz (0.9.0-1.1) ... Setting up python3-dateutil (2.8.1-6) ... Setting up python3-msgpack (1.0.0-6+b1) ... Setting up python3-lib2to3 (3.9.2-1) ... Setting up python3-pkg-resources (52.0.0-4) ... Setting up python3-distutils (3.9.2-1) ... Setting up dh-python (4.20201102+nmu1) ... Setting up python3-more-itertools (4.2.0-3) ... Setting up python3-heapdict (1.0.1-1) ... 
Setting up python3-iniconfig (1.1.1-1) ... Setting up python3-attr (20.3.0-1) ... Setting up python3-tornado (6.1.0-1+b1) ... Setting up python3-setuptools (52.0.0-4) ... Setting up python3-tblib (1.7.0-1) ... Setting up python3-py (1.10.0-1) ... Setting up python3-colorama (0.4.4-1) ... Setting up python3-fsspec (0.8.4-1) ... Setting up python3-all (3.9.2-3) ... Setting up python3-yaml (5.3.1-5) ... Setting up python3-zipp (1.0.0-3) ... Setting up python3-click (7.1.2-1) ... Setting up python3-packaging (20.9-2) ... Setting up python3-chardet (4.0.0-1) ... Setting up python3-requests (2.25.1+dfsg-2) ... Setting up python3-numpy (1:1.19.5-1) ... Setting up python3-zict (2.0.0-1) ... Setting up python3-importlib-metadata (1.6.0-2) ... Setting up python3-pandas-lib:i386 (1.1.5+dfsg-2) ... Setting up python3-dask (2021.01.0+dfsg-1) ... Setting up python3-distributed (2021.01.0+ds.1-2.1+deb11u1) ... Setting up python3-pandas (1.1.5+dfsg-2) ... Setting up python3-pluggy (0.13.0-6) ... Setting up python3-pytest (6.0.2-2) ... Processing triggers for libc-bin (2.31-13+deb11u6) ... Processing triggers for ca-certificates (20210119) ... Updating certificates in /etc/ssl/certs... 0 added, 0 removed; done. Running hooks in /etc/ca-certificates/update.d... done. Reading package lists... Building dependency tree... Reading state information... Reading extended state information... Initializing package states... Writing extended state information... Building tag database... 
-> Finished parsing the build-deps I: Building the package I: Running cd /build/reproducible-path/python-streamz-0.6.2/ && env PATH="/usr/sbin:/usr/bin:/sbin:/bin:/usr/games" HOME="/nonexistent/first-build" dpkg-buildpackage -us -uc -b && env PATH="/usr/sbin:/usr/bin:/sbin:/bin:/usr/games" HOME="/nonexistent/first-build" dpkg-genchanges -S > ../python-streamz_0.6.2-1_source.changes dpkg-buildpackage: info: source package python-streamz dpkg-buildpackage: info: source version 0.6.2-1 dpkg-buildpackage: info: source distribution unstable dpkg-buildpackage: info: source changed by Nilesh Patra dpkg-source --before-build . dpkg-buildpackage: info: host architecture i386 debian/rules clean dh clean --with python3 --buildsystem=pybuild dh_auto_clean -O--buildsystem=pybuild I: pybuild base:232: python3.9 setup.py clean running clean removing '/build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build' (and everything under it) 'build/bdist.linux-i386' does not exist -- can't clean it 'build/scripts-3.9' does not exist -- can't clean it dh_autoreconf_clean -O--buildsystem=pybuild dh_clean -O--buildsystem=pybuild debian/rules binary dh binary --with python3 --buildsystem=pybuild dh_update_autotools_config -O--buildsystem=pybuild dh_autoreconf -O--buildsystem=pybuild dh_auto_configure -O--buildsystem=pybuild I: pybuild base:232: python3.9 setup.py config running config dh_auto_build -O--buildsystem=pybuild I: pybuild base:232: /usr/bin/python3 setup.py build running build running build_py creating /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/core.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/collection.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/sources.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying 
streamz/utils.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/utils_test.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/batch.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/dask.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/plugins.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/sinks.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/__init__.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/orderedweakset.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz copying streamz/graph.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz creating /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe copying streamz/dataframe/core.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe copying streamz/dataframe/utils.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe copying streamz/dataframe/aggregations.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe copying streamz/dataframe/__init__.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe creating /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_plugins.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying 
streamz/tests/test_graph.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_batch.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_dask.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_core.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_sources.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/py3_test_core.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_sinks.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/__init__.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests copying streamz/tests/test_kafka.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests package init file 'streamz/dataframe/tests/__init__.py' not found (or not a regular file) creating /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe/tests copying streamz/dataframe/tests/test_dataframes.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe/tests copying streamz/dataframe/tests/test_dataframe_utils.py -> /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe/tests dh_auto_test -O--buildsystem=pybuild I: pybuild pybuild:284: rm /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/tests/test_dask.py I: pybuild base:232: cd 
/build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build; python3.9 -m pytest ============================= test session starts ============================== platform linux -- Python 3.9.2, pytest-6.0.2, py-1.10.0, pluggy-0.13.0 rootdir: /build/reproducible-path/python-streamz-0.6.2, configfile: setup.cfg plugins: flaky-3.7.0 collected 1534 items / 2 skipped / 1532 selected streamz/dataframe/tests/test_dataframe_utils.py .s.s [ 0%] streamz/dataframe/tests/test_dataframes.py ............................. [ 2%] ........................................................................ [ 6%] ...F...........F....F....F....F....F....F....F....F....F....F....F....F. [ 11%] ...F....F....F....F....F....F....F....F....F....F....F....F..FF........s [ 16%] ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 20%] ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 25%] ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 30%] ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 35%] ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 39%] sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss. [ 44%] ...xxxxxx....................ss......................................... [ 49%] .F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X. [ 53%] .F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X. [ 58%] .F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X. [ 63%] .F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X..F..X. [ 67%] .FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..F [ 72%] F..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF. [ 77%] .X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X [ 81%] ..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X.. 
[ 86%] FF..X..FF..X..FF..X..FF..X..FF..X..FF..X..FF..X....FF...........x.. [ 91%] streamz/tests/test_batch.py .... [ 91%] streamz/tests/test_core.py ............................................. [ 94%] ....................................................................... [ 98%] streamz/tests/test_plugins.py .... [ 99%] streamz/tests/test_sinks.py ..... [ 99%] streamz/tests/test_sources.py .XX...Xxx [100%] =================================== FAILURES =================================== _______________________ test_dataframe_simple[1] _______________________ func = at 0xf39d9730> @pytest.mark.parametrize('func', [ lambda df: df.query('x > 1 and x < 4', engine='python'), lambda df: df.x.value_counts().nlargest(2) ]) def test_dataframe_simple(func): df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]}) expected = func(df) a = DataFrame(example=df) L = func(a).stream.sink_to_list() a.emit(df) > assert_eq(L[0], expected) streamz/dataframe/tests/test_dataframes.py:191: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 2 1 3 1 Name: x, dtype: int32, b = 2 1 3 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): 
a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[core-0-0-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9a90> indexer = at 0xf39d9bb0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 2 1.0 1 Name: y, dtype: int32 b = x 0.0 2 1.0 1 Name: y, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = 
type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[core-0-1-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9ad8> indexer = at 0xf39d9bb0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 
2 1.0 1 Name: y, dtype: int32 b = x 0.0 2 1.0 1 Name: y, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[core-0-2-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9b20> indexer = at 0xf39d9bb0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = 
pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 Name: y, dtype: int32, b = 0 2 1 1 Name: y, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[core-0-3-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9b68> indexer = at 0xf39d9bb0>, stream = @pytest.mark.parametrize('agg', [ lambda x: 
x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 2 1.0 1 Name: y, dtype: int32 b = x 0.0 2 1.0 1 Name: y, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) 
b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[core-1-0-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9a90> indexer = at 0xf39d9bf8>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 2 2 1.0 1 1 b = x y -overlapped-index-name-0 0.0 2 2 1.0 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) 
assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-1-1-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9ad8> indexer = at 0xf39d9bf8>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, 
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-1-2-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9b20> indexer = at 0xf39d9bf8>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) 
streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0 2 2 1 1 1, b = x y 0 2 2 1 1 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-1-3-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9b68> indexer = at 0xf39d9bf8>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = 
pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-2-0-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9a90> indexer = at 0xf39d9c40>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: 
x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-2-1-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9ad8> indexer = at 0xf39d9c40>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = 
a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-2-2-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9b20> indexer = at 0xf39d9c40>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0 2 1 1, b = y 0 2 1 1, check_names = True, check_dtypes = True check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == 
bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[core-2-3-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9b68> indexer = at 0xf39d9c40>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, 
check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-0-0-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9a90> indexer = at 0xf39d9bb0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert 
assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 2 1.0 1 Name: y, dtype: int32 b = x 0.0 2 1.0 1 Name: y, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[dask-0-1-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9ad8> indexer = at 0xf39d9bb0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) 
@pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 2 1.0 1 Name: y, dtype: int32 b = x 0.0 2 1.0 1 Name: y, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: 
AssertionError __________ test_groupby_aggregate[dask-0-2-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9b20> indexer = at 0xf39d9bb0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 Name: y, dtype: int32, b = 0 2 1 1 Name: y, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, 
"to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[dask-0-3-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9b68> indexer = at 0xf39d9bb0>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 2 1.0 1 Name: y, dtype: int32 b = x 0.0 2 1.0 1 Name: y, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __________ test_groupby_aggregate[dask-1-0-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9a90> indexer = at 0xf39d9bf8>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 2 2 1.0 1 1 b = x y -overlapped-index-name-0 0.0 2 2 1.0 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-1-1-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9ad8> indexer = at 0xf39d9bf8>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': 
(np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-1-2-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9b20> indexer = at 0xf39d9bf8>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), 
reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0 2 2 1 1 1, b = x y 0 2 2 1 1 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: 
AssertionError __________ test_groupby_aggregate[dask-1-3-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9b68> indexer = at 0xf39d9bf8>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() 
if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-2-0-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9a90> indexer = at 0xf39d9c40>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) 
assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-2-1-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9ad8> indexer = at 0xf39d9c40>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, 
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-2-2-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9b20> indexer = at 0xf39d9c40>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': (np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) 
streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0 2 1 1, b = y 0 2 1 1, check_names = True, check_dtypes = True check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __________ test_groupby_aggregate[dask-2-3-2] __________ agg = at 0xf39d99b8> grouper = at 0xf39d9b68> indexer = at 0xf39d9c40>, stream = @pytest.mark.parametrize('agg', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), # pytest.mark.xfail(lambda x: x.var(ddof=0), reason="don't know") ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'x', lambda a: a.index % 2, lambda a: ['x'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.y, lambda g: g, lambda g: g[['y']] # lambda g: g[['x', 'y']] ]) def test_groupby_aggregate(agg, grouper, indexer, stream): df = pd.DataFrame({'x': 
(np.arange(10) // 2).astype(float), 'y': [1.0, 2.0] * 5}) a = DataFrame(example=df.iloc[:0], stream=stream) def f(x): return agg(indexer(x.groupby(grouper(x)))) L = f(a).stream.gather().sink_to_list() a.emit(df.iloc[:3]) a.emit(df.iloc[3:7]) a.emit(df.iloc[7:]) first = df.iloc[:3] > assert assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:301: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y x 0.0 2 1.0 1, b = y x 0.0 2 1.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="y") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___________________________ test_value_counts[core] ____________________________ stream = def test_value_counts(stream): s = pd.Series(['a', 'b', 'a']) a = Series(example=s, stream=stream) b = a.value_counts() assert b._stream_type == 'updating' result = b.stream.gather().sink_to_list() a.emit(s) a.emit(s) > assert_eq(result[-1], 
pd.concat([s, s], axis=0).value_counts()) streamz/dataframe/tests/test_dataframes.py:317: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = a 4 b 2 dtype: int32, b = a 4 b 2 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___________________________ test_value_counts[dask] ____________________________ stream = def test_value_counts(stream): s = pd.Series(['a', 'b', 'a']) a = Series(example=s, stream=stream) b = a.value_counts() assert b._stream_type == 'updating' result = b.stream.gather().sink_to_list() a.emit(s) a.emit(s) > assert_eq(result[-1], pd.concat([s, s], axis=0).value_counts()) streamz/dataframe/tests/test_dataframes.py:317: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ a = a 4 b 2 dtype: int32, b = a 4 b 2 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-0-0-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1c40> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ 
lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 3 1.0 2 2.0 2 3.0 3 Name: x, dtype: int32 b = x 0.0 3 1.0 2 2.0 2 3.0 3 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > 
tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-0-0-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1c40> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 4 1.0 3 2.0 3 3.0 3 Name: x, dtype: int32 b = x 0.0 4 1.0 3 2.0 3 3.0 3 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def 
assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-0-1-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1c40> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = 
pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 3 1.0 2 2.0 2 3.0 3 Name: x, dtype: int32 b = x 0.0 3 1.0 2 2.0 2 3.0 3 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: 
int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-0-1-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1c40> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 4 1.0 3 2.0 3 3.0 3 Name: x, dtype: int32 b = x 0.0 4 1.0 3 2.0 3 3.0 3 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if 
hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-1-0-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1c88> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, 
index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 5 1.0 5 Name: x, dtype: int32 b = y 0.0 5 1.0 5 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-1-0-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 
days 00:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1c88> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 7 1.0 6 Name: x, dtype: int32 b = y 0.0 7 1.0 6 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) 
assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-1-1-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1c88> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, 
len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 5 1.0 5 Name: x, dtype: int32 b = y 0.0 5 1.0 5 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-1-1-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1c88> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), 
lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 7 1.0 6 Name: x, dtype: int32 b = y 0.0 7 1.0 6 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = 
a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-2-0-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1cd0> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) 
streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 2000-01-01 07:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int32 b = 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 2000-01-01 07:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-2-0-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bb0> grouper = at 
0xf39d1cd0> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 2000-01-01 04:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int32 b = 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 2000-01-01 04:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, 
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-2-1-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1cd0> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = 
pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 2000-01-01 07:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int32 b = 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 2000-01-01 07:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) 
tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-2-1-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1cd0> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 2000-01-01 
04:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int32 b = 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 2000-01-01 04:00:0...00-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-3-0-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1d18> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), 
marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 5 1.0 5 Name: x, dtype: int32 b = y 0.0 5 1.0 5 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if 
hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-3-0-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1d18> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 7 1.0 6 Name: x, dtype: int32 b = y 0.0 7 1.0 6 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[0-3-1-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1d18> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', 
lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 5 1.0 5 Name: x, dtype: int32 b = y 0.0 5 1.0 5 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, 
pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ___ test_groupby_windowing_value[0-3-1-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1d18> indexer = at 0xf39d1d60> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 0.0 7 1.0 6 Name: x, dtype: int32 b = y 0.0 7 1.0 6 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = 
True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError __ test_groupby_windowing_value[1-0-0-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1c40> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, 
indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 3 3 1.0 2 2 2.0 2 2 3.0 3 3 b = x y -overlapped-index-name-0 0.0 3 3 1.0 2 2 2.0 2 2 3.0 3 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-0-0-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1c40> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 4 4 1.0 3 3 2.0 3 3 3.0 3 3 b = x y -overlapped-index-name-0 0.0 4 4 1.0 3 3 2.0 3 3 3.0 3 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if 
hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-0-1-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1c40> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = 
f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 3 3 1.0 2 2 2.0 2 2 3.0 3 3 b = x y -overlapped-index-name-0 0.0 3 3 1.0 2 2 2.0 2 2 3.0 3 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-0-1-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1c40> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: 
x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 4 4 1.0 3 3 2.0 3 3 3.0 3 3 b = x y -overlapped-index-name-0 0.0 4 4 1.0 3 3 2.0 3 3 3.0 3 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, 
check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-1-0-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1c88> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], 
f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-1-0-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1c88> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) 
@pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute 
"dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-1-1-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1c88> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and 
hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-1-1-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1c88> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = 
f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-2-0-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1cd0> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), 
pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 2000-01-01 03:00:00 1 1 2000-01-01 04:00:00 1 1 2000-01-01 05:00:00 1 1 2000-01-01 06:... 08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 b = x y 2000-01-01 03:00:00 1 1 2000-01-01 04:00:00 1 1 2000-01-01 05:00:00 1 1 2000-01-01 06:... 
08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-2-0-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1cd0> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], 
#lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 2000-01-01 00:00:00 1 1 2000-01-01 01:00:00 1 1 2000-01-01 02:00:00 1 1 2000-01-01 03:... 08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 b = x y 2000-01-01 00:00:00 1 1 2000-01-01 01:00:00 1 1 2000-01-01 02:00:00 1 1 2000-01-01 03:... 
08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-2-1-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1cd0> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], 
#lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 2000-01-01 03:00:00 1 1 2000-01-01 04:00:00 1 1 2000-01-01 05:00:00 1 1 2000-01-01 06:... 08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 b = x y 2000-01-01 03:00:00 1 1 2000-01-01 04:00:00 1 1 2000-01-01 05:00:00 1 1 2000-01-01 06:... 
08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-2-1-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1cd0> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], 
#lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 2000-01-01 00:00:00 1 1 2000-01-01 01:00:00 1 1 2000-01-01 02:00:00 1 1 2000-01-01 03:... 08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 b = x y 2000-01-01 00:00:00 1 1 2000-01-01 01:00:00 1 1 2000-01-01 02:00:00 1 1 2000-01-01 03:... 
08:00:00 1 1 2000-01-01 09:00:00 1 1 2000-01-01 10:00:00 1 1 2000-01-01 11:00:00 1 1 2000-01-01 12:00:00 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-3-0-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1d18> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], 
#lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-3-0-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1d18> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[1-3-1-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1d18> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = 
pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[1-3-1-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1d18> indexer = at 0xf39d1da8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) 
@pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): 
b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-0-0-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1c40> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 0.0 3 1.0 2 2.0 2 3.0 3 b = x -overlapped-index-name-0 0.0 3 1.0 2 2.0 2 3.0 3 
check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-0-0-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1c40> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = 
pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 0.0 4 1.0 3 2.0 3 3.0 3 b = x -overlapped-index-name-0 0.0 4 1.0 3 2.0 3 3.0 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError 
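Note: every failure reported above stops at the same check: the values in column "x" agree, but the left frame has dtype int32 and the right has int64. The following is a minimal sketch of that comparison in isolation, not code from the streamz test suite. The frames `left` and `right` and the helper `dtype_mismatch_message` are hypothetical names introduced for illustration. One plausible explanation, which this log does not confirm, is that one code path on this 32-bit (i386) builder produces platform-sized integers while the other produces 64-bit integers.

```python
import numpy as np
import pandas as pd
import pandas.testing as tm

# Hypothetical stand-ins for the two frames compared in the log:
# identical values and index, differing only in integer width.
left = pd.DataFrame(
    {"x": np.array([3, 2, 2, 3], dtype="int32")},
    index=[0.0, 1.0, 2.0, 3.0],
)
right = left.astype("int64")


def dtype_mismatch_message(a, b):
    """Return the AssertionError text from a strict comparison, or None if equal."""
    try:
        tm.assert_frame_equal(a, b)  # check_dtype defaults to True
    except AssertionError as exc:
        return str(exc)
    return None


message = dtype_mismatch_message(left, right)
print(message)

# With dtype checking relaxed, the same frames compare equal,
# which shows that only the integer width differs, not the data.
tm.assert_frame_equal(left, right, check_dtype=False)
```

Whether relaxing the dtype check or casting the result to a fixed width is the appropriate fix is a decision for the package maintainers; the sketch only isolates the symptom.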
__ test_groupby_windowing_value[2-0-1-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1c40> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 0.0 3 1.0 2 2.0 2 3.0 3 b = x -overlapped-index-name-0 0.0 3 1.0 2 2.0 2 3.0 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-0-1-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1c40> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = 
pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 0.0 4 1.0 3 2.0 3 3.0 3 b = x -overlapped-index-name-0 0.0 4 1.0 3 2.0 3 3.0 3 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-1-0-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1c88> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), 
pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if 
hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-1-0-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1c88> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 
check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-1-1-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1c88> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = 
pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-1-1-1d-2] ___ func = at 
0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1c88> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) 
assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-2-0-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1cd0> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = 
df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 b = x 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-2-0-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bb0> grouper = at 
0xf39d1cd0> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 b = x 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, 
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-2-1-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1cd0> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, 
index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 b = x 2000-01-01 03:00:00 1 2000-01-01 04:00:00 1 2000-01-01 05:00:00 1 2000-01-01 06:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are 
different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-2-1-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1cd0> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 200...0 1 2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 b = x 2000-01-01 00:00:00 1 2000-01-01 01:00:00 1 2000-01-01 02:00:00 1 2000-01-01 03:00:00 1 200...0 1 
2000-01-01 08:00:00 1 2000-01-01 09:00:00 1 2000-01-01 10:00:00 1 2000-01-01 11:00:00 1 2000-01-01 12:00:00 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-3-0-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1d18> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], 
#lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-3-0-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bb0> grouper = at 0xf39d1d18> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError __ test_groupby_windowing_value[2-3-1-10h-2] ___ func = at 0xf39d1a90>, value = Timedelta('0 days 10:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1d18> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = 
pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 5 1.0 5, b = x y 0.0 5 1.0 5 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ___ test_groupby_windowing_value[2-3-1-1d-2] ___ func = at 0xf39d1a90>, value = Timedelta('1 days 00:00:00') getter = at 0xf39d1bf8> grouper = at 0xf39d1d18> indexer = at 0xf39d1df0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.var(ddof=1), lambda x: x.std(), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) 
@pytest.mark.parametrize('value', ['10h', '1d']) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 4, lambda a: 'y', lambda a: a.index, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_value(func, value, getter, grouper, indexer): index = pd.date_range(start='2000-01-01', end='2000-01-03', freq='1h') df = pd.DataFrame({'x': np.arange(len(index), dtype=float), 'y': np.arange(len(index), dtype=float) % 2}, index=index) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(value)).stream.gather().sink_to_list() value = pd.Timedelta(value) diff = 13 for i in range(0, len(index), diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[:diff] first = first[first.index.max() - value + pd.Timedelta('1ns'):] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:762: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0.0 7 1.0 6, b = x y 0.0 7 1.0 6 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): 
b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[0-0-0-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df148> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 Name: x, dtype: int32, b = x 2.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if 
hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-0-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df148> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = 
f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 Name: x, dtype: int32, b = x 2.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-0-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df148> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), 
lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int32 b = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = 
_maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-0-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df148> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int32 b = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if 
check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-1-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df148> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return 
func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 Name: x, dtype: int32, b = x 2.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-1-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df148> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), 
lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 Name: x, dtype: int32, b = x 2.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a 
= _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-1-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df148> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int32 b = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, 
**kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-0-1-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df148> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return 
func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int32 b = x 0.0 1 1.0 1 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-0-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df190> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), 
lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if 
isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-0-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df190> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, 
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-0-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df190> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def 
f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-0-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df190> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), 
lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() 
if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-1-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df190> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, 
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-1-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df190> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def 
f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-1-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df190> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: 
x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if 
isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-1-1-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df190> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, 
check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-0-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df1d8> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def 
f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 Name: x, dtype: int32, b = 0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-0-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df1d8> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), 
lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 Name: x, dtype: int32, b = 0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = 
_maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-0-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df1d8> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 Name: x, dtype: int32, b = 0 2 1 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if 
check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-0-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df1d8> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return 
func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 Name: x, dtype: int32, b = 0 2 1 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-1-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df1d8> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), 
lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 Name: x, dtype: int32, b = 0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = 
_maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-1-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df1d8> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 Name: x, dtype: int32, b = 0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: 
assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-1-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df1d8> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L 
= f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 Name: x, dtype: int32, b = 0 2 1 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-2-1-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df1d8> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), 
lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 Name: x, dtype: int32, b = 0 2 1 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) 
tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-0-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df220> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) 
assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-0-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df220> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = 
f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-0-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df220> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), 
lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = 
_maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-0-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df220> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: 
assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-1-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df220> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L 
= f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-1-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df220> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), 
lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 Name: x, dtype: int32, b = y 1.0 1 Name: x, dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) 
tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-1-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df220> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: 
assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[0-3-1-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df220> indexer = at 0xf39df268> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L 
= f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 Name: x, dtype: int32 b = y 1.0 2 2.0 1 Name: x, dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-0-0-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df148> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: 
x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 2.0 1 1 b = x y -overlapped-index-name-0 2.0 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) 
b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-0-0-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df148> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 dtype: int32, b = x 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-0-0-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df148> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): 
sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 1 1 1.0 1 1 2.0 1 1 b = x y -overlapped-index-name-0 0.0 1 1 1.0 1 1 2.0 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-0-0-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df148> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) 
@pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 dtype: int32 b = x 0.0 1 1.0 1 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > 
tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-0-1-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df148> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 2.0 1 1 b = x y -overlapped-index-name-0 2.0 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-0-1-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df148> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] 
> assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 dtype: int32, b = x 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-0-1-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df148> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) 
@pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y -overlapped-index-name-0 0.0 1 1 1.0 1 1 2.0 1 1 b = x y -overlapped-index-name-0 0.0 1 1 1.0 1 1 2.0 1 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E 
[left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-0-1-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df148> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 dtype: int32 b = x 0.0 1 1.0 1 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) 
assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-1-0-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df190> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-1-0-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df190> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: 
g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-1-0-4-2] ______ func = at 
0xf39d1f10>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df190> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = 
a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-1-0-4-3] ______ func = <function <lambda> at 0xf39d1f58>, n = 4 getter = <function <lambda> at 0xf39df0b8> grouper = <function <lambda> at 0xf39df190> indexer = <function <lambda> at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, 
check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-1-1-1-2] ______ func = <function <lambda> at 0xf39d1f10>, n = 1 getter = <function <lambda> at 0xf39df100> grouper = <function <lambda> at 0xf39df190> indexer = <function <lambda> at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 
2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-1-1-1-3] ______ func = <function <lambda> at 0xf39d1f58>, n = 1 getter = <function <lambda> at 0xf39df100> grouper = <function <lambda> at 0xf39df190> indexer = <function <lambda> at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda 
x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a 
= _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-1-1-4-2] ______ func = <function <lambda> at 0xf39d1f10>, n = 4 getter = <function <lambda> at 0xf39df100> grouper = <function <lambda> at 0xf39df190> indexer = <function <lambda> at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-1-1-4-3] ______ func = <function <lambda> at 0xf39d1f58>, n = 4 getter = <function <lambda> at 0xf39df100> grouper = <function <lambda> at 0xf39df190> indexer = <function <lambda> at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] 
> assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-2-0-1-2] ______ func = <function <lambda> at 0xf39d1f10>, n = 1 getter = <function <lambda> at 0xf39df0b8> grouper = <function <lambda> at 0xf39df1d8> indexer = <function <lambda> at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) 
@pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0 1 1, b = x y 0 1 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-2-0-1-3] ______ func = <function <lambda> at 0xf39d1f58>, n = 1 getter = <function <lambda> at 0xf39df0b8> grouper = <function <lambda> at 0xf39df1d8> indexer = <function <lambda> at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 dtype: int32, b = 0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, 
check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-2-0-4-2] ______ func = <function <lambda> at 0xf39d1f10>, n = 4 getter = <function <lambda> at 0xf39df0b8> grouper = <function <lambda> at 0xf39df1d8> indexer = <function <lambda> at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ a = x y 0 2 2 1 1 1, b = x y 0 2 2 1 1 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-2-0-4-3] ______ func = <function <lambda> at 0xf39d1f58>, n = 4 getter = <function <lambda> at 0xf39df0b8> grouper = <function <lambda> at 0xf39df1d8> indexer = <function <lambda> at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, 
getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 dtype: int32, b = 0 2 1 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-2-1-1-2] ______ func = <function <lambda> at 0xf39d1f10>, n = 1 getter = <function <lambda> at 0xf39df100> grouper = <function <lambda> at 
0xf39df1d8> indexer = <function <lambda> at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0 1 1, b = x y 0 1 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = 
a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-2-1-1-3] ______ func = <function <lambda> at 0xf39d1f58>, n = 1 getter = <function <lambda> at 0xf39df100> grouper = <function <lambda> at 0xf39df1d8> indexer = <function <lambda> at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 dtype: int32, b = 0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) 
assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-2-1-4-2] ______ func = <function <lambda> at 0xf39d1f10>, n = 4 getter = <function <lambda> at 0xf39df100> grouper = <function <lambda> at 0xf39df1d8> indexer = <function <lambda> at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = 
f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 0 2 2 1 1 1, b = x y 0 2 2 1 1 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-2-1-4-3] ______ func = <function <lambda> at 0xf39d1f58>, n = 4 getter = <function <lambda> at 0xf39df100> grouper = <function <lambda> at 0xf39df1d8> indexer = <function <lambda> at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) 
@pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 dtype: int32, b = 0 2 1 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, 
check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-3-0-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df220> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) 
# scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-3-0-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df220> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-3-0-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df220> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) 
@pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-3-0-4-3] ______ func = at 0xf39d1f58>, n = 4 
getter = at 0xf39df0b8> grouper = at 0xf39df220> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = 
a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-3-1-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df220> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = 
True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-3-1-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df220> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def 
f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[1-3-1-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df220> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: 
x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > 
tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[1-3-1-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df220> indexer = at 0xf39df2b0> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-0-0-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df148> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): 
sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 2.0 1 b = x -overlapped-index-name-0 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-0-0-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df148> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, 
lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 dtype: int32, b = x 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of 
Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-0-0-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df148> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 0.0 1 1.0 1 2.0 1 b = x -overlapped-index-name-0 0.0 1 1.0 1 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = 
type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-0-0-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df148> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) 
streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 dtype: int32 b = x 0.0 1 1.0 1 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-0-1-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df148> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) 
@pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 2.0 1 b = x -overlapped-index-name-0 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-0-1-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df148> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 2.0 1 dtype: int32, b = x 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, 
check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-0-1-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df148> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x -overlapped-index-name-0 0.0 1 1.0 1 2.0 1 b = x -overlapped-index-name-0 0.0 1 1.0 1 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-0-1-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df148> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: 
g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0.0 1 1.0 1 2.0 1 dtype: int32 b = x 0.0 1 1.0 1 2.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ 
test_groupby_windowing_n[2-1-0-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df190> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) 
if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-1-0-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df190> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, 
check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-1-0-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df190> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 
2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-1-0-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df190> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), 
pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) 
elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-1-1-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df190> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = 
type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-1-1-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df190> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] 
> assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-1-1-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df190> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) 
@pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 
/usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-1-1-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df190> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, 
check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-2-0-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df1d8> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0 1, b = x 0 1, check_names = True, check_dtypes = True check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-2-0-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df1d8> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, 
getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 dtype: int32, b = 0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-2-0-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df1d8> 
indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0 2 1 1, b = x 0 2 1 1, check_names = True, check_dtypes = True check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = 
a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-2-0-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df1d8> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 dtype: int32, b = 0 2 1 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) 
assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-2-1-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df1d8> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = 
f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0 1, b = x 0 1, check_names = True, check_dtypes = True check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-2-1-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df1d8> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) 
@pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 1 dtype: int32, b = 0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, 
**kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-2-1-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df1d8> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x 0 2 1 1, b = x 0 2 1 1, check_names = True, check_dtypes = True check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion 
assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-2-1-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df1d8> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ a = 0 2 1 1 dtype: int32, b = 0 2 1 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-3-0-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df0b8> grouper = at 0xf39df220> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) 
@pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-3-0-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 
0xf39df0b8> grouper = at 0xf39df220> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = 
b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-3-0-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df220> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} 
def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-3-0-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df0b8> grouper = at 0xf39df220> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return 
func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-3-1-1-2] ______ func = at 0xf39d1f10>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df220> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: 
x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 1, b = x y 1.0 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > 
tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-3-1-1-3] ______ func = at 0xf39d1f58>, n = 1 getter = at 0xf39df100> grouper = at 0xf39df220> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 1 dtype: int32, b = y 1.0 1 dtype: int64, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy 
to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError _____ test_groupby_windowing_n[2-3-1-4-2] ______ func = at 0xf39d1f10>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df220> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert 
len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = x y 1.0 2 2.0 1, b = x y 1.0 2 2.0 1 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="x") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError _____ test_groupby_windowing_n[2-3-1-4-3] ______ func = at 0xf39d1f58>, n = 4 getter = at 0xf39df100> grouper = at 0xf39df220> indexer = at 0xf39df2f8> @pytest.mark.parametrize('func', [ lambda x: x.sum(), lambda x: x.mean(), lambda x: x.count(), lambda x: x.size(), lambda x: x.var(ddof=1), lambda x: x.std(ddof=1), pytest.param(lambda x: x.var(ddof=0), marks=pytest.mark.xfail), ]) @pytest.mark.parametrize('n', [1, 4]) @pytest.mark.parametrize('getter', [ lambda df: df, lambda df: df.x, ]) @pytest.mark.parametrize('grouper', [ lambda a: a.x % 3, lambda a: 'y', 
lambda a: a.index % 2, lambda a: ['y'] ]) @pytest.mark.parametrize('indexer', [ lambda g: g.x, lambda g: g, lambda g: g[['x']], #lambda g: g[['x', 'y']] ]) def test_groupby_windowing_n(func, n, getter, grouper, indexer): df = pd.DataFrame({'x': np.arange(10, dtype=float), 'y': [1.0, 2.0] * 5}) sdf = DataFrame(example=df) def f(x): return func(indexer(x.groupby(grouper(x)))) L = f(sdf.window(n=n)).stream.gather().sink_to_list() diff = 3 for i in range(0, 10, diff): sdf.emit(df.iloc[i: i + diff]) sdf.emit(df.iloc[:0]) assert len(L) == 5 first = df.iloc[max(0, diff - n): diff] > assert_eq(L[0], f(first)) streamz/dataframe/tests/test_dataframes.py:813: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = y 1.0 2 2.0 1 dtype: int32, b = y 1.0 2 2.0 1 dtype: int64 check_names = True, check_dtypes = True, check_divisions = True check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) tm.assert_frame_equal(a, b, **kwargs) elif isinstance(a, pd.Series): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_series_equal(a, b, check_names=check_names, **kwargs) E AssertionError: Attributes of Series are different E E Attribute "dtype" are different E [left]: int32 E [right]: 
int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:822: AssertionError ________________ test_groupby_aggregate_with_start_state[core] _________________ stream = def test_groupby_aggregate_with_start_state(stream): example = pd.DataFrame({'name': [], 'amount': []}) sdf = DataFrame(stream, example=example).groupby(['name']) output0 = sdf.amount.sum(start=None).stream.gather().sink_to_list() output1 = sdf.amount.mean(with_state=True, start=None).stream.gather().sink_to_list() output2 = sdf.amount.count(start=None).stream.gather().sink_to_list() df = pd.DataFrame({'name': ['Alice', 'Tom'], 'amount': [50, 100]}) stream.emit(df) out_df0 = pd.DataFrame({'name': ['Alice', 'Tom'], 'amount': [50.0, 100.0]}) out_df1 = pd.DataFrame({'name': ['Alice', 'Tom'], 'amount': [1, 1]}) assert assert_eq(output0[0].reset_index(), out_df0) assert assert_eq(output1[0][1].reset_index(), out_df0) > assert assert_eq(output2[0].reset_index(), out_df1) streamz/dataframe/tests/test_dataframes.py:917: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = name amount 0 Alice 1 1 Tom 1 b = name amount 0 Alice 1 1 Tom 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if 
isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 1] (column name="amount") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError ________________ test_groupby_aggregate_with_start_state[dask] _________________ stream = def test_groupby_aggregate_with_start_state(stream): example = pd.DataFrame({'name': [], 'amount': []}) sdf = DataFrame(stream, example=example).groupby(['name']) output0 = sdf.amount.sum(start=None).stream.gather().sink_to_list() output1 = sdf.amount.mean(with_state=True, start=None).stream.gather().sink_to_list() output2 = sdf.amount.count(start=None).stream.gather().sink_to_list() df = pd.DataFrame({'name': ['Alice', 'Tom'], 'amount': [50, 100]}) stream.emit(df) out_df0 = pd.DataFrame({'name': ['Alice', 'Tom'], 'amount': [50.0, 100.0]}) out_df1 = pd.DataFrame({'name': ['Alice', 'Tom'], 'amount': [1, 1]}) assert assert_eq(output0[0].reset_index(), out_df0) assert assert_eq(output1[0][1].reset_index(), out_df0) > assert assert_eq(output2[0].reset_index(), out_df1) streamz/dataframe/tests/test_dataframes.py:917: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ a = name amount 0 Alice 1 1 Tom 1 b = name amount 0 Alice 1 1 Tom 1, check_names = True check_dtypes = True, check_divisions = True, check_index = True, kwargs = {} def assert_eq( a, b, check_names=True, check_dtypes=True, check_divisions=True, check_index=True, **kwargs, ): if check_divisions: assert_divisions(a) assert_divisions(b) if hasattr(a, "divisions") and hasattr(b, "divisions"): at = type(np.asarray(a.divisions).tolist()[0]) # numpy to python bt = type(np.asarray(b.divisions).tolist()[0]) # scalar conversion assert at == bt, (at, bt) assert_sane_keynames(a) assert_sane_keynames(b) a = _check_dask(a, check_names=check_names, 
check_dtypes=check_dtypes) b = _check_dask(b, check_names=check_names, check_dtypes=check_dtypes) if not check_index: a = a.reset_index(drop=True) b = b.reset_index(drop=True) if hasattr(a, "to_pandas"): a = a.to_pandas() if hasattr(b, "to_pandas"): b = b.to_pandas() if isinstance(a, pd.DataFrame): a = _maybe_sort(a) b = _maybe_sort(b) > tm.assert_frame_equal(a, b, **kwargs) E AssertionError: Attributes of DataFrame.iloc[:, 1] (column name="amount") are different E E Attribute "dtype" are different E [left]: int32 E [right]: int64 /usr/lib/python3/dist-packages/dask/dataframe/utils.py:818: AssertionError =============================== warnings summary =============================== .pybuild/cpython3_3.9_streamz/build/streamz/dataframe/tests/test_dataframes.py::test_windowing_n[1-1-4] .pybuild/cpython3_3.9_streamz/build/streamz/dataframe/tests/test_dataframes.py::test_windowing_n[1-1-5] /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build/streamz/dataframe/aggregations.py:99: RuntimeWarning: invalid value encountered in double_scalars result = result * n / (n - self.ddof) .pybuild/cpython3_3.9_streamz/build/streamz/dataframe/tests/test_dataframes.py::test_gc_random /usr/lib/python3/dist-packages/distributed/deploy/spec.py:330: DeprecationWarning: The explicit passing of coroutine objects to asyncio.wait() is deprecated since Python 3.8, and scheduled for removal in Python 3.11. await asyncio.wait(tasks) -- Docs: https://docs.pytest.org/en/stable/warnings.html ===Flaky Test Report=== test_tcp passed 1 out of the required 1 times. Success! test_tcp_async passed 1 out of the required 1 times. Success! 
===End Flaky Test Report=== =========================== short test summary info ============================ FAILED streamz/dataframe/tests/test_dataframes.py::test_dataframe_simple[1] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-0-0-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-0-2-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-0-3-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-1-0-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-1-2-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-1-3-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-2-0-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-2-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-2-2-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[core-2-3-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-0-0-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-0-2-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-0-3-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-1-0-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-1-2-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-1-3-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-2-0-2] FAILED 
streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-2-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-2-2-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate[dask-2-3-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_value_counts[core] - ... FAILED streamz/dataframe/tests/test_dataframes.py::test_value_counts[dask] - ... FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-0-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-0-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-0-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-0-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-1-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-1-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-1-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-1-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-2-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-2-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-2-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-2-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-3-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-3-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-3-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[0-3-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-0-0-10h-2] FAILED 
streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-0-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-0-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-0-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-1-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-1-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-1-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-1-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-2-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-2-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-2-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-2-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-3-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-3-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-3-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[1-3-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-0-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-0-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-0-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-0-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-1-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-1-0-1d-2] FAILED 
streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-1-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-1-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-2-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-2-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-2-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-2-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-3-0-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-3-0-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-3-1-10h-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_value[2-3-1-1d-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-0-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-0-4-3] FAILED 
streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-1-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-2-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[0-3-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-0-4-2] 
FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-0-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-1-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-2-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-0-1-2] FAILED 
streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[1-3-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-0-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-1-1-4-3] 
FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-2-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-0-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-0-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-0-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-0-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-1-1-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-1-1-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-1-4-2] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_windowing_n[2-3-1-4-3] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate_with_start_state[core] FAILED streamz/dataframe/tests/test_dataframes.py::test_groupby_aggregate_with_start_state[dask] = 173 failed, 817 passed, 438 skipped, 9 xfailed, 99 xpassed, 3 warnings in 199.00s (0:03:18) = E: pybuild pybuild:353: test: plugin distutils failed with: exit code=1: cd /build/reproducible-path/python-streamz-0.6.2/.pybuild/cpython3_3.9_streamz/build; python3.9 -m pytest dh_auto_test: error: pybuild --test --test-pytest -i python{version} -p 3.9 returned exit code 13 make: *** [debian/rules:10: binary] 
Error 25
dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2
I: copying local configuration
E: Failed autobuilding of package
I: unmounting dev/ptmx filesystem
I: unmounting dev/pts filesystem
I: unmounting dev/shm filesystem
I: unmounting proc filesystem
I: unmounting sys filesystem
I: cleaning the build env
I: removing directory /srv/workspace/pbuilder/96905 and its subdirectories
Tue Jan 9 16:34:04 UTC 2024 W: No second build log, what happened?