scala - Spark union weird behavior
I'm experiencing strange behavior with the union method of the RDD class, and I can't understand why it happens.
I have a class, and in there the data and filtereddata vars. The first one is initialized with textFile, followed by a map and a filter; the second one is instead initialized with sc.emptyRDD[Point]. Then I have a method like this:
do {
    /**
     * here the val lastfiltered is computed
     */
    logger.debug("filtered "+lastfiltered.count()+" points")
    filtereddata = filtereddata.union(lastfiltered)
    logger.debug("filtered far "+filtereddata.count()+" points")
    data = data.subtract(lastfiltered)
    /** here data is repartitioned **/
} while(/** here there is a condition equivalent to lastfiltered.count() == 0 **/)
logger.info("preprocessing has filtered "+filtereddata.count()+" points")
What I'm getting from the logger looks very strange to me:
filtered 13 points
filtered far 13 points
filtered 4 points
filtered far 4834 points
filtered 0 points
filtered far 0 points
preprocessing has filtered 0 points
Of course the first 2 lines are as expected, but what comes later looks strange to me. Moreover, the subtract method on the data RDD seems to work fine (the counts are as expected).
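To make the expectation concrete, here is a minimal sketch that models the loop with plain Scala collections standing in for the RDDs. It is only an illustration of the counts one would expect, not a reproduction of the Spark job: it has no lazy evaluation and no repartitioning, and the object name, the Int points and the two batches are hypothetical stand-ins for the elided filtering step.

```scala
object ExpectedCounts {
  // Hypothetical model: points are Ints, and each round filters out one
  // predefined batch (a stand-in for the elided lastfiltered computation).
  // Returns one (filtered this round, filtered so far) pair per iteration.
  def run(points: Vector[Int], batches: List[Set[Int]]): Vector[(Int, Int)] = {
    @annotation.tailrec
    def loop(
        data: Vector[Int],
        filteredData: Vector[Int],
        rounds: List[Set[Int]],
        counts: Vector[(Int, Int)]
    ): Vector[(Int, Int)] = {
      val batch = rounds.headOption.getOrElse(Set.empty[Int])
      val lastFiltered = data.filter(batch.contains)
      // Like RDD.union: plain concatenation, no de-duplication.
      val newFiltered = filteredData ++ lastFiltered
      // Like RDD.subtract: drop every element that appears in lastFiltered.
      val newData = data.filterNot(lastFiltered.toSet)
      val newCounts = counts :+ ((lastFiltered.size, newFiltered.size))
      // Stop once a round filters nothing, as in the loop above.
      if (lastFiltered.isEmpty) newCounts
      else loop(newData, newFiltered, rounds.drop(1), newCounts)
    }
    loop(points, Vector.empty, batches, Vector.empty)
  }

  def main(args: Array[String]): Unit = {
    val counts = run((1 to 100).toVector, List((1 to 13).toSet, Set(25, 50, 75, 100)))
    counts.foreach { case (last, total) =>
      println(s"filtered $last points, filtered far $total points")
    }
  }
}
```

With these hypothetical batches the pairs are (13, 13), (4, 17) and (0, 17): the running total never shrinks, which is the behavior I would expect from union and which differs from the 4834 and 0 in the log above.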
Can anyone help me understand what is happening?
Thank you! Marco