I am new to Spark and am working on a generic Spark job class that provides an interface for callers to implement and supply their own transformation logic.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkJob {
    void runJob() {
        SparkSession session = SparkSession.builder()
            .master("local")
            .appName("Word Count")
            .config("spark.some.config.option", "some-value")
            .getOrCreate();
        TransformationLogic logic = new MyTransformationLogic();
        Dataset<Row> result = logic.runLogic(session);
    }
}

public interface TransformationLogic {
    Dataset<Row> runLogic(SparkSession session);
}

Callers would implement something like this:

public class MyTransformationLogic implements TransformationLogic {
    @Override
    public Dataset<Row> runLogic(SparkSession session) {
        // I have to block creation of new Spark sessions here.
        return session.sql("SELECT * FROM TABLE");
    }
}
I want to block the creation of child SparkSessions inside the interface method implementation.
- Is there a way to figure out whether a new SparkSession is being created by the caller?
- Can I leverage the SparkContext and cached data to detect the creation of a new SparkSession?
- I see that ExecutionListenerManager listens for execution metrics. Can I somehow leverage that?
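Rather than detecting session creation after the fact, one option is to prevent it at the API level: instead of handing callers the full SparkSession (which exposes newSession()), the job runner could pass a narrow facade that only exposes the operations callers need. The sketch below illustrates the pattern in plain Java; QueryRunner is a hypothetical name, the String return type is a stand-in for Dataset&lt;Row&gt;, and this is an illustration of the design, not Spark API code.

```java
public class FacadeSketch {
    // Narrow facade: callers can run SQL, but never see the SparkSession,
    // so they have no handle on which to call newSession() or the builder.
    interface QueryRunner {
        String sql(String query); // stand-in for session.sql(...)
    }

    // Same shape as the TransformationLogic interface, but taking the
    // facade instead of the raw session.
    interface TransformationLogic {
        String runLogic(QueryRunner runner);
    }

    // The job runner keeps the real session private and only lends out
    // the restricted facade to caller-supplied logic.
    public static String runJob(TransformationLogic logic) {
        QueryRunner runner = query -> "result of: " + query;
        return logic.runLogic(runner);
    }

    public static void main(String[] args) {
        // A caller can still express its transformation as before.
        String out = runJob(r -> r.sql("SELECT * FROM TABLE"));
        System.out.println(out);
    }
}
```

The trade-off is that the facade must enumerate every operation callers legitimately need; in exchange, blocking child sessions becomes a compile-time guarantee instead of a runtime check.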