feat: support pyspark-client for Spark Connect #861

@psavalle

Is your feature request related to a problem? Please describe.

pyspark-client is the lightweight Python package for Spark Connect; it ships without py4j or any of the Spark Classic dependencies. It is currently not possible to use graphframes-py with this client, because the package's __init__.py imports GraphFrameClassic, which in turn imports py4j and related dependencies.

Describe the solution you would like

pyspark-client is much lighter when using Spark Connect, and it would be great to support it, e.g., by having graphframes-py[pyspark] and graphframes-py[pyspark-client] as separate extras and catching import errors when GraphFrameClassic cannot be made available.
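
A minimal sketch of the "catch import errors" idea, in case it helps the discussion. The helper name `optional_import`, the flag `HAS_CLASSIC`, and the module path `graphframes.classic.graphframe` are assumptions made for illustration and may not match the actual package layout:

```python
import importlib


def optional_import(module_name, attr):
    """Return `attr` from `module_name`, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Raised, for example, when py4j is missing because only
        # pyspark-client is installed.
        return None
    return getattr(module, attr, None)


# Hypothetical module path; the real location of GraphFrameClassic may differ.
GraphFrameClassic = optional_import(
    "graphframes.classic.graphframe", "GraphFrameClassic"
)

# Code paths that need Spark Classic can check this flag before using the class.
HAS_CLASSIC = GraphFrameClassic is not None
```

With something like this in __init__.py, importing the package would no longer fail on a pyspark-client-only install; the Classic implementation would simply be unavailable, and the Connect implementation could be used instead.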

Component

  • Scala Core Internal
  • Scala API
  • Spark Connect Plugin
  • Infrastructure
  • PySpark Classic
  • PySpark Connect

Additional context

Are you planning on creating a PR?

  • I'm willing to make a pull-request
