Setting Up the Scala REPL for Snowpark Scala

This topic explains how to set up the Scala REPL for Snowpark.

Installing the Scala REPL

The Scala REPL (read-eval-print loop; see https://docs.scala-lang.org/overviews/repl/overview.html) is provided with the Scala build tool. To install a supported version of the Scala build tool, find the version that you plan to use at https://www.scala-lang.org/download/all.html and follow the installation instructions for that version.

Running the Scala REPL

To use the Snowpark library in the Scala REPL, do the following:

  1. If you have not already done so, download the Snowpark library archive file and extract the contents of the file.
  2. Start the REPL by running the run.sh shell script provided in the archive file:

    cd <path>/snowpark-1.18.0
    ./run.sh

The run.sh script does the following:

  • Adds the Snowpark library and its dependencies to the classpath.
  • Creates a <path>/snowpark-1.18.0/repl_classes/ directory for the classes generated by the Scala REPL.
  • Preloads the preload.scala file, which imports the com.snowflake.snowpark package and the com.snowflake.snowpark.functions object.

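If you are using a different version of the Scala REPL:

  1. Add the Snowpark library JAR file and its dependencies to the classpath.

    • The Snowpark library JAR file is in the top-level directory of the extracted TAR/ZIP archive file.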
    • The dependencies are in the lib directory of the extracted TAR/ZIP archive file.
  2. Create a temporary directory for the classes generated by the REPL, and configure the REPL to generate classes in that directory.
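The two steps above can be sketched in plain Scala. This is only an illustration, not what run.sh itself does: the helper names `jarsIn`, `snowparkClasspath`, and `newReplClassDir` are hypothetical, and the sketch assumes the archive layout described above (library JAR at the top level, dependencies in lib).

```scala
import java.io.File
import java.nio.file.Files

// Hypothetical helper: absolute paths of the JAR files directly inside `dir`,
// sorted by path. Returns an empty Seq if `dir` does not exist.
def jarsIn(dir: File): Seq[String] =
  Option(dir.listFiles()).map(_.toSeq).getOrElse(Seq.empty)
    .filter(f => f.isFile && f.getName.endsWith(".jar"))
    .map(_.getAbsolutePath)
    .sorted

// Hypothetical helper: classpath string made of the top-level JAR(s)
// followed by the JARs in the lib directory of the extracted archive.
def snowparkClasspath(archiveDir: File): String =
  (jarsIn(archiveDir) ++ jarsIn(new File(archiveDir, "lib")))
    .mkString(File.pathSeparator)

// Hypothetical helper: creates a fresh temporary directory that the REPL
// can be configured to write its generated classes into.
def newReplClassDir(): File =
  Files.createTempDirectory("repl_classes").toFile
```

How these two values are passed to the REPL depends on the REPL version. With a Scala 2 launcher they would typically go to the `-classpath` and `-Yrepl-outdir` options, but check the options that your version supports.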

Later, when defining inline user-defined functions (UDFs), you’ll need to specify the directory for the REPL classes as a dependency.
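As a rough sketch of that later step: the helper `replClassDir` below is hypothetical and only resolves the directory path. The commented call assumes a Snowpark session named `session` already exists in the REPL and that the session's `addDependency` method is used to register the directory.

```scala
import java.nio.file.Paths

// Hypothetical helper: absolute path of the repl_classes directory
// inside the extracted archive directory.
def replClassDir(archiveDir: String): String =
  Paths.get(archiveDir, "repl_classes").toAbsolutePath.normalize.toString

// Assumption: `session` is a Snowpark Session created earlier in the REPL.
// Before defining an inline UDF, you would then register the directory:
//   session.addDependency(replClassDir("<path>/snowpark-1.18.0"))
```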

Verifying Your Scala REPL Configuration

To verify that you have configured your project to use Snowpark, run a simple Snowpark code example.

  1. In the directory containing the files extracted from the .zip / .tar.gz file (i.e. the directory containing the run.sh script), create a Main.scala file that contains the code below:

    import com.snowflake.snowpark._
    import com.snowflake.snowpark.functions._
    
    object Main {
      def main(args: Array[String]): Unit = {
        // Replace the <placeholders> below.
        val configs = Map (
          "URL" -> "https://<account_identifier>.snowflakecomputing.cn:443",
          "USER" -> "<user name>",
          "PASSWORD" -> "<password>",
          "ROLE" -> "<role name>",
          "WAREHOUSE" -> "<warehouse name>",
          "DB" -> "<database name>",
          "SCHEMA" -> "<schema name>"
        )
        val session = Session.builder.configs(configs).create
        session.sql("show tables").show()
      }
    }

    Note the following:

    • Replace the placeholders with values that you use to connect to Snowflake.

    • For account_identifier, specify your account identifier.

    • If you prefer to use key pair authentication:

      • Replace PASSWORD with PRIVATE_KEY_FILE, and set it to the path to your private key file.
      • If the private key is encrypted, you must set PRIVATE_KEY_FILE_PWD to the passphrase for decrypting the private key.

      As an alternative to setting PRIVATE_KEY_FILE and PRIVATE_KEY_FILE_PWD, you can set the PRIVATEKEY property to the string value of the unencrypted private key from the private key file.

      • For example, if your private key file is unencrypted, set this to the value of the key in the file (without the -----BEGIN PRIVATE KEY----- and -----END PRIVATE KEY----- header and footer and without the line endings).
      • Note that if the private key is encrypted, you must decrypt the key before setting it as the value of the PRIVATEKEY property.
  2. From within the directory, run the run.sh script to start the Scala REPL with the settings needed for the Snowpark library:

    ./run.sh
  3. In the Scala REPL shell, enter the following command to load the example file that you just created:

    :load Main.scala
  4. Run the following statement to execute the main method of the Main object that you loaded:

    Main.main(Array[String]())

    This runs the SHOW TABLES command and prints out the first 10 rows of the results.
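For the key pair authentication options described in step 1, the configuration map might look like the sketch below. The property names come from the notes above. The placeholder values and the names `baseConfigs`, `keyFileConfigs`, `privateKeyValue`, and `privateKeyConfigs` are illustrative assumptions, not part of the Snowpark API.

```scala
// Connection properties shared by both variants. Replace the <placeholders>.
val baseConfigs: Map[String, String] = Map(
  "URL" -> "https://<account_identifier>.snowflakecomputing.cn:443",
  "USER" -> "<user name>",
  "ROLE" -> "<role name>",
  "WAREHOUSE" -> "<warehouse name>",
  "DB" -> "<database name>",
  "SCHEMA" -> "<schema name>"
)

// Variant 1: PRIVATE_KEY_FILE replaces PASSWORD.
// PRIVATE_KEY_FILE_PWD is needed only if the private key is encrypted.
val keyFileConfigs: Map[String, String] = baseConfigs ++ Map(
  "PRIVATE_KEY_FILE" -> "<path to private key file>",
  "PRIVATE_KEY_FILE_PWD" -> "<passphrase>"
)

// Hypothetical helper: turns the text of an unencrypted private key file
// into a single string by dropping the -----BEGIN/END----- lines and
// joining the remaining lines without line endings.
def privateKeyValue(pem: String): String =
  pem.split("\\r?\\n")
    .map(_.trim)
    .filterNot(line => line.isEmpty || line.startsWith("-----"))
    .mkString

// Variant 2: pass the key itself through the PRIVATEKEY property.
def privateKeyConfigs(pem: String): Map[String, String] =
  baseConfigs + ("PRIVATEKEY" -> privateKeyValue(pem))
```

Either map can be passed to Session.builder.configs in place of the configs value in Main.scala.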