Another setting that I had to change was the spark.driver.maxResultSize.
While training the Word2Vec model, the Spark job threw a SparkException: "Job aborted due to stage failure: Total size of serialized results of x tasks (y MB) is bigger than spark.driver.maxResultSize (z MB)".
This usually happens during a collect, when the total size of task results returned to the driver exceeds the configured limit.
My setting was at the default of 1 GB when I hit this error. I increased the value to 10 GB, which resolved the issue.
Setting it to 0 means unlimited.
From the official Spark documentation:
Having a high limit may cause out-of-memory errors in driver (depends on spark.driver.memory and memory overhead of objects in JVM). Setting a proper limit can protect the driver from out-of-memory errors.
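For reference, here is a sketch of how the limit can be raised, either per job at submit time or as a cluster-wide default (the 10g value matches what worked for me; my_job.py is a placeholder for your own application):

```shell
# Raise the limit for a single job at submit time
spark-submit --conf spark.driver.maxResultSize=10g my_job.py

# Or make it the default for all jobs in conf/spark-defaults.conf:
# spark.driver.maxResultSize  10g

# 0 removes the limit entirely, but risks driver OOM as the docs warn:
# spark-submit --conf spark.driver.maxResultSize=0 my_job.py
```

Whichever route you choose, keep spark.driver.memory comfortably above the result-size limit, since the collected results still have to fit in the driver's heap.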