Closed
Description
This problem has already been fixed several times: either by increasing the amount of memory (#796, #807) or by refactoring the Pig script to minimize the memory footprint of the RANK operation (CeON/CoAnSys#425).
After the recent increase in the number of publications (to 37M), we are again struggling with a memory-related problem:
java.lang.OutOfMemoryError
at java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at java.io.DataOutputStream.writeUTF(DataOutputStream.java:401)
at java.io.DataOutputStream.writeUTF(DataOutputStream.java:323)
The full log is available here: https://pastebin.com/dk2C8wLF
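One observation on the trace above: it ends in `ByteArrayOutputStream.hugeCapacity` with no "Java heap space" message, which may mean that a single in-memory buffer was asked to grow beyond what an `int` capacity can express (about 2 GiB), not that the task heap as a whole was exhausted. If so, raising the heap size alone might not help this time. The sketch below assumes the JDK 8 behaviour, where the required capacity is computed with `int` arithmetic and a negative result is treated as an overflow. The class and method names are hypothetical and exist only for illustration; they are not part of the JDK or of Pig.

```java
public class CapacityOverflowSketch {

    /**
     * Returns true when appending len bytes to a buffer that already holds
     * count bytes would need a capacity that does not fit in an int.
     * Assumption: this mirrors the overflow check that JDK 8 performs
     * before growing the backing array of a ByteArrayOutputStream.
     */
    static boolean wouldOverflow(int count, int len) {
        int minCapacity = count + len; // wraps to a negative value on overflow
        return minCapacity < 0;
    }

    public static void main(String[] args) {
        // A buffer that is already close to the 2 GiB limit.
        int almostFull = Integer.MAX_VALUE - 10;
        System.out.println(wouldOverflow(almostFull, 100)); // prints "true"

        // A small buffer has plenty of room left.
        System.out.println(wouldOverflow(1 << 20, 100));    // prints "false"
    }
}
```

If this reading is correct, the practical question is which single record or serialized buffer grows that large during the RANK phase, which the full log should help confirm.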
The Pig execution plan is available here:
https://pastebin.com/bAUsCNjb
It again points to the RANK operation as the phase in which the map task failed:
Failed Jobs:
JobId Alias Feature Message Outputs
job_1524597382992_21544 wc_ranked ORDER_BY Message: Job failed!