I have PySpark code deployed on Azure Databricks which basically reads and writes data to and from Cosmos DB.
In previous months, I used:
Databricks runtime version: 6.4 (Extended Support) with Scala 2.11 and Spark 2.4.5
Azure Cosmos DB Spark connector: azure-cosmosdb-spark_2.4.0_2.11-3.7.0-uber.jar
The program runs fine without any errors.
But when I upgraded my runtime version to:
Databricks runtime version: 10.4 LTS with Scala 2.12 and Spark 3.2.1
Azure Cosmos DB Spark connector: azure-cosmos-spark_3-2_2-12-4.10.0.jar
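For context, the read call looks roughly like this (ReadConfig_input_cosmos is reconstructed here with placeholder values; the real endpoint, key, database and collection names are omitted):

# Rough sketch of the read config for the old connector (placeholder values only).
ReadConfig_input_cosmos = {
    "Endpoint": "https://<my-account>.documents.azure.com:443/",
    "Masterkey": "<my-account-key>",
    "Database": "<my-database>",
    "Collection": "<my-collection>",
    "query_custom": "SELECT * FROM c"  # optional custom query
}

# Reading data from Cosmos DB via the connector's data source name.
input_data_cosmosdb = (
    spark.read.format("com.microsoft.azure.cosmosdb.spark")
    .options(**ReadConfig_input_cosmos)
    .load()
)
print("Cosmos DB columns ", input_data_cosmosdb.columns)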
It throws an error saying:
Py4JJavaError Traceback (most recent call last)
in
14 #Reading data from cosmosdb
15 #try:
---> 16 input_data_cosmosdb = spark.read.format("com.microsoft.azure.cosmosdb.spark").options(**ReadConfig_input_cosmos).load()
17 print("Cosmos DB columns ", input_data_cosmosdb.columns)
18 #except:
/databricks/spark/python/pyspark/sql/readwriter.py in load(self, path, format, schema, **options)
162 return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
163 else:
--> 164 return self._df(self._jreader.load())
165
166 def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
Py4JJavaError: An error occurred while calling o663.load.
: java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
at com.microsoft.azure.cosmosdb.spark.config.Config$.getOptionsFromConf(Config.scala:281)
at com.microsoft.azure.cosmosdb.spark.config.Config$.apply(Config.scala:229)
at com.microsoft.azure.cosmosdb.spark.DefaultSource.createRelation(DefaultSource.scala:55)
at com.microsoft.azure.cosmosdb.spark.DefaultSource.createRelation(DefaultSource.scala:40)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:385)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:356)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:323)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:323)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:222)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
at py4j.Gateway.invoke(Gateway.java:295)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:251)
at java.lang.Thread.run(Thread.java:748)
I have tried all the solutions I could find on the internet. I know this error normally appears when Scala 2.11 libraries are loaded into a Scala 2.12 project, but I am on a Scala 2.12 runtime and using the Scala 2.12 Spark connector itself, yet I still get the error.
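A quick way to double-check the Scala side from a notebook cell (a rough sketch, assuming the usual py4j access to the JVM) is:

# Print the Scala version the Databricks runtime is actually running
# (should report 2.12.x on runtime 10.4 LTS).
print(spark.sparkContext._jvm.scala.util.Properties.versionString())

# List the jars known to the Spark context, to spot a leftover
# azure-cosmosdb-spark_2.4.0_2.11 uber jar that might still be attached to the cluster.
print(spark.sparkContext._jsc.sc().listJars().mkString("\n"))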
Please let me know if anyone is using the same environment. I use Databricks on Azure and am looking for a solution.
Any answers related to Databricks are really appreciated.