You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
SparkFlightManager should also have a python interface to enable development of microservices in pyspark. The current SparkFlightManager interface should probably have to be altered as it's tied too much to Java implementation of FlightServer. It will have to be made general enough to be usable in Python FlightServers as well. For example, streamDistributedFlight right now takes FlightProducer.ServerStreamListener as a parameter, but there's of course no such thing in Python implementation.
The text was updated successfully, but these errors were encountered:
Yet another problem is that pyspark uses py4j for communication between python and java virtual machines, meaning that in Python, unlike Java/Scala, FlightServer and SparkSession will be running in different processes. The easiest way to reuse FlightManager from Python is with jpype which runs jvm in the same process. For this reason, having a single class serving requests and also calling functions in Scala Spark is not feasible. Will have to refactor SparkFlightManager into more general DistributedFlightManager class that's independent of Spark.
The method that converts DataFrame to a RDD of ArrowRecordBatches and sends them to Internal FlightServers will be a separate utility that can be independently reused from pyspark with py4j or directly called from Scala Spark.
SparkFlightManager should also have a python interface to enable development of microservices in pyspark. The current SparkFlightManager interface should probably have to be altered as it's tied too much to Java implementation of FlightServer. It will have to be made general enough to be usable in Python FlightServers as well. For example, streamDistributedFlight right now takes FlightProducer.ServerStreamListener as a parameter, but there's of course no such thing in Python implementation.
The text was updated successfully, but these errors were encountered: