modin.pandas.read_json
- modin.pandas.read_json(path_or_buf: FilePath, *, orient: str | None = None, typ: Literal['frame', 'series'] | None = 'frame', dtype: DtypeArg | None = None, convert_axes: bool | None = None, convert_dates: bool | list[str] | None = None, keep_default_dates: bool | None = None, precise_float: bool | None = None, date_unit: str | None = None, encoding: str | None = None, encoding_errors: str | None = None, lines: bool | None = None, chunksize: int | None = None, compression: Literal['infer', 'gzip', 'bz2', 'brotli', 'zstd', 'deflate', 'raw_deflate', 'none'] = 'infer', nrows: int | None = None, storage_options: StorageOptions = None, dtype_backend: DtypeBackend = _NoDefault.no_default, engine: Literal['ujson', 'pyarrow'] | None = None) -> pd.DataFrame [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.26.0/snowpark-python/src/snowflake/snowpark/modin/plugin/extensions/io_overrides.py#L244-L278)
Read new-line delimited json file(s) into a Snowpark pandas DataFrame. This API can read files stored locally or on a Snowflake stage.
Snowpark pandas first stages files (unless they’re already staged) and then reads them using Snowflake’s JSON reader.
- Parameters:
path_or_buf (str) – Local or staged file location to read from. Staged file locations start with an ‘@’ symbol. To read a local file whose name starts with ‘@’, escape it as ‘\@’. For more information on staged files, see the Snowflake documentation on stages.
orient (str) – This parameter is not supported and will raise an error.
typ ({'frame', 'series'}, default 'frame') – This parameter is not supported and will raise an error.
dtype (bool or dict, default None) – This parameter is not supported and will raise an error.
convert_axes (bool, default None) – This parameter is not supported and will raise an error.
convert_dates (bool or list of str, default True) – This parameter is not supported and will raise an error.
keep_default_dates (bool, default True) – This parameter is not supported and will raise an error.
precise_float (bool, default False) – This parameter is not supported and will be ignored.
date_unit (str, default None) – This parameter is not supported and will raise an error.
encoding (str, default 'utf-8') – Encoding to use when reading the files (for example, ‘utf-8’). See the list of Snowflake standard encodings.
encoding_errors (str, optional, default "strict") – This parameter is not supported and will raise an error.
lines (bool, default False) – This parameter is not supported and will raise an error.
chunksize (int, optional) – This parameter is not supported and will raise an error.
compression (str, default 'infer') – String (constant) that specifies the compression algorithm of the data files to be loaded. Snowflake uses this option to detect how already-compressed data files were compressed so that the compressed data in the files can be extracted for loading. See the list of Snowflake standard compressions.
nrows (int, optional) – This parameter is not supported and will raise an error.
storage_options (dict, optional) – This parameter is not supported and will be ignored.
dtype_backend ({'numpy_nullable', 'pyarrow'}, default 'numpy_nullable') – This parameter is not supported and will be ignored.
engine ({'ujson', 'pyarrow'}, default 'ujson') – This parameter is not supported and will be ignored.
- Return type:
Snowpark pandas DataFrame
- Raises:
NotImplementedError – If a parameter is not supported.
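The parameter list above splits into three groups: parameters that are honored (encoding, compression), parameters that are ignored, and parameters that raise. A minimal sketch of a pre-flight check built from that list follows. The helper check_read_json_kwargs and its constants are hypothetical and not part of Snowpark pandas, and the sketch assumes that only a value other than None (or other than the signature default) triggers the documented behavior.

```python
# Hypothetical helper, not part of Snowpark pandas. The groupings are copied
# from the parameter list; the "non-default value" rule is an assumption.
RAISES = frozenset({
    "orient", "typ", "dtype", "convert_axes", "convert_dates",
    "keep_default_dates", "date_unit", "encoding_errors", "lines",
    "chunksize", "nrows",
})
IGNORED = frozenset({"precise_float", "storage_options", "dtype_backend", "engine"})
SUPPORTED = frozenset({"encoding", "compression"})

# Non-None defaults taken from the signature.
SIGNATURE_DEFAULTS = {"typ": "frame", "compression": "infer"}


def check_read_json_kwargs(**kwargs):
    """Return the sorted names of arguments that would be ignored.

    Raises NotImplementedError if any argument documented as unsupported
    is set to a non-default value, and TypeError for unknown names.
    """
    unknown = set(kwargs) - RAISES - IGNORED - SUPPORTED
    if unknown:
        raise TypeError(f"unknown read_json arguments: {sorted(unknown)}")

    # Treat None and the signature default as "not set" (assumption).
    explicitly_set = {
        name
        for name, value in kwargs.items()
        if value is not None and SIGNATURE_DEFAULTS.get(name) != value
    }

    rejected = sorted(explicitly_set & RAISES)
    if rejected:
        raise NotImplementedError(f"unsupported read_json arguments: {rejected}")

    return sorted(explicitly_set & IGNORED)
```

Calling the helper before pd.read_json surfaces unsupported arguments without a round trip to Snowflake; the returned list names arguments that would be silently dropped.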
Notes
Both local files and files staged on Snowflake can be passed into path_or_buf. A single file or a folder that matches a set of files can be passed into path_or_buf. There is no deterministic order in which the files are read.
Examples
Read local json file.
>>> import tempfile
>>> import json
>>> temp_dir = tempfile.TemporaryDirectory()
>>> temp_dir_name = temp_dir.name
>>> data = {'A': "snowpark!", 'B': 3, 'C': [5, 6]}
>>> with open(f'{temp_dir_name}/snowpark_pandas.json', 'w') as f:
...     json.dump(data, f)
>>> import modin.pandas as pd
>>> import snowflake.snowpark.modin.plugin
>>> df = pd.read_json(f'{temp_dir_name}/snowpark_pandas.json')
>>> df
           A  B       C
0  snowpark!  3  [5, 6]
Read staged json file.
>>> _ = session.sql("create or replace temp stage mytempstage").collect()
>>> _ = session.file.put(f'{temp_dir_name}/snowpark_pandas.json', '@mytempstage/myprefix')
>>> df2 = pd.read_json('@mytempstage/myprefix/snowpark_pandas.json')
>>> df2
           A  B       C
0  snowpark!  3  [5, 6]
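The two examples so far differ only in the leading ‘@’ of the path, which is how a staged location is told apart from a local one. A minimal sketch of that rule in plain Python follows. The helpers is_staged and escape_local are hypothetical names, and the backslash escape for local names that begin with ‘@’ is an assumption about the escape form rather than confirmed behavior.

```python
def is_staged(path: str) -> bool:
    """Return True if the path would be treated as a staged location."""
    # Per the parameter description, staged locations start with '@'.
    return path.startswith("@")


def escape_local(path: str) -> str:
    """Escape a local file name so that it is not mistaken for a stage.

    Assumption: the escape form is a backslash placed before the '@'.
    """
    return "\\" + path if path.startswith("@") else path
```

With these helpers, escape_local leaves an ordinary path such as 'data.json' unchanged, and the escaped form of a local name no longer satisfies is_staged.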
Read json files from a local folder.
>>> with open(f'{temp_dir_name}/snowpark_pandas2.json', 'w') as f:
...     json.dump(data, f)
>>> df3 = pd.read_json(f'{temp_dir_name}')
>>> df3
           A  B       C
0  snowpark!  3  [5, 6]
1  snowpark!  3  [5, 6]
Read json files from a staged location.
>>> _ = session.file.put(f'{temp_dir_name}/snowpark_pandas2.json', '@mytempstage/myprefix')
>>> df4 = pd.read_json('@mytempstage/myprefix')
>>> df4
           A  B       C
0  snowpark!  3  [5, 6]
1  snowpark!  3  [5, 6]
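The examples above write one JSON object per file. Since the function is described as reading new-line delimited JSON, a file holding several records can be prepared with the standard library alone, as in the sketch below. The helper write_ndjson and the file name snowpark_pandas_rows.json are illustrative assumptions. The sketch only verifies the file contents locally; that read_json returns one row per line is an expectation based on the description, not something this code checks.

```python
import json
import os
import tempfile


def write_ndjson(path, records):
    """Write each record as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


records = [
    {"A": "snowpark!", "B": 3, "C": [5, 6]},
    {"A": "pandas!", "B": 4, "C": [7, 8]},
]

ndjson_dir = tempfile.mkdtemp()
out_path = os.path.join(ndjson_dir, "snowpark_pandas_rows.json")
write_ndjson(out_path, records)

# Round-trip check with the standard library only (no Snowflake session needed).
with open(out_path, encoding="utf-8") as f:
    round_tripped = [json.loads(line) for line in f]
```

The resulting path can then be passed to pd.read_json in the same way as the single-object files above.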