Asked by Alex Hales on May 31, 2022 (2022-05-31T01:44:24+00:00)

Scala – Writing each row in a Spark DataFrame to a separate JSON file

I have a fairly large DataFrame (on the order of a million rows), and the requirement is to store each row in a separate JSON file.

For a DataFrame with this schema:

 root
 |-- uniqueID: string 
 |-- moreData: array 

Every row should be written to a path of the following form:

s3://.../folder[i]/<uniqueID>.json

where i is the first letter of the row's uniqueID.

I have looked at other questions and solutions, but they don’t satisfy my requirements.
I am trying to do this in a time-efficient way, and from what I have read so far, repartitioning is not a good option.

I tried writing the DataFrame with the maxRecordsPerFile option, but I can’t seem to control the naming of the files.

df.write.mode("overwrite")
  .option("maxRecordsPerFile", 1)
  .json(outputPath)

I am fairly new to Spark; any help is much appreciated.
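For illustration only, here is a minimal sketch of one way the file naming could be controlled: serialize each row with to_json and write it from inside foreachPartition through the Hadoop FileSystem API. This swaps in a different technique from the maxRecordsPerFile attempt above. The names RowPerFileWriter, targetPath, writeEachRow and base are hypothetical. The sketch assumes that folder[i] means the literal prefix "folder" followed by the first letter, that uniqueID is never null or empty, and that a default Hadoop Configuration created on the executor can reach the target file system (S3 credentials may need to be supplied separately).

```scala
import java.nio.charset.StandardCharsets

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.{DataFrame, Row}
import org.apache.spark.sql.functions.{col, struct, to_json}

object RowPerFileWriter {

  // Hypothetical helper: builds "<base>/folder<i>/<uniqueID>.json",
  // where i is the first character of uniqueID (assumed non-empty).
  def targetPath(base: String, uniqueID: String): String =
    s"${base.stripSuffix("/")}/folder${uniqueID.head}/$uniqueID.json"

  def writeEachRow(df: DataFrame, base: String): Unit = {
    // Turn every row into a single JSON string, keeping uniqueID next to it
    // so the executor can derive the file name.
    val jsonDf = df.select(
      col("uniqueID"),
      to_json(struct(df.columns.map(col): _*)).as("json")
    )

    // Going through .rdd avoids the overload ambiguity of
    // Dataset.foreachPartition on some Scala versions.
    jsonDf.rdd.foreachPartition { (rows: Iterator[Row]) =>
      // Assumption: default Hadoop settings are enough on the executor.
      val conf = new Configuration()
      rows.foreach { row =>
        val path = new Path(targetPath(base, row.getString(0)))
        val fs = path.getFileSystem(conf) // cached per scheme and authority
        val out = fs.create(path, true)   // true = overwrite existing file
        try out.write(row.getString(1).getBytes(StandardCharsets.UTF_8))
        finally out.close()
      }
    }
  }
}
```

With something like this, RowPerFileWriter.writeEachRow(df, outputPath) would issue one write per row from whichever partitions already exist, so no repartitioning is involved. Whether a million small writes is fast enough depends on the storage backend and is not something this sketch settles.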
