modin.pandas.read_json

modin.pandas.read_json(path_or_buf: FilePath, *, orient: str | None = None, typ: Literal['frame', 'series'] | None = 'frame', dtype: DtypeArg | None = None, convert_axes: bool | None = None, convert_dates: bool | list[str] | None = None, keep_default_dates: bool | None = None, precise_float: bool | None = None, date_unit: str | None = None, encoding: str | None = None, encoding_errors: str | None = None, lines: bool | None = None, chunksize: int | None = None, compression: Literal['infer', 'gzip', 'bz2', 'brotli', 'zstd', 'deflate', 'raw_deflate', 'none'] = 'infer', nrows: int | None = None, storage_options: StorageOptions = None, dtype_backend: DtypeBackend = _NoDefault.no_default, engine: Literal['ujson', 'pyarrow'] | None = None) → pd.DataFrame [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.26.0/snowpark-python/src/snowflake/snowpark/modin/plugin/extensions/io_overrides.py#L244-L278)

Read newline-delimited JSON file(s) into a Snowpark pandas DataFrame. This API can read files stored locally or on a Snowflake stage.

Snowpark pandas first stages files (unless they’re already staged) and then reads them using Snowflake’s JSON reader.
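As a minimal sketch of the input format described above, the following writes a newline-delimited JSON file using only the standard library. The file name `records.json` and the record contents are illustrative assumptions, not something this API requires.

```python
import json
import os
import tempfile

# Illustrative records; the field names are assumptions for this sketch.
records = [
    {"A": "snowpark!", "B": 3, "C": [5, 6]},
    {"A": "pandas", "B": 4, "C": [7, 8]},
]

out_dir = tempfile.mkdtemp()
ndjson_path = os.path.join(out_dir, "records.json")

# Newline-delimited JSON: one compact JSON object per line.
with open(ndjson_path, "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

# Sanity check: every non-empty line parses on its own.
with open(ndjson_path, encoding="utf-8") as f:
    parsed = [json.loads(line) for line in f if line.strip()]
```

A path such as `ndjson_path` could then be passed as `path_or_buf`.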

Parameters:
  • path_or_buf (str) – Local file location or staged file location to read from. Staged file locations start with an ‘@’ symbol. To read a local file whose name starts with @, escape it with a backslash (\@). For more information on staged files, see the Snowflake documentation on stages.

  • orient (str) – This parameter is not supported and will raise an error.

  • typ ({'frame', 'series'}, default 'frame') – This parameter is not supported and will raise an error.

  • dtype (bool or dict, default None) – This parameter is not supported and will raise an error.

  • convert_axes (bool, default None) – This parameter is not supported and will raise an error.

  • convert_dates (bool or list of str, default True) – This parameter is not supported and will raise an error.

  • keep_default_dates (bool, default True) – This parameter is not supported and will raise an error.

  • precise_float (bool, default False) – This parameter is not supported and will be ignored.

  • date_unit (str, default None) – This parameter is not supported and will raise an error.

  • encoding (str, default 'utf-8') – Encoding to use when reading the file (e.g. ‘utf-8’). See the list of Snowflake standard encodings.

  • encoding_errors (str, optional, default "strict") – This parameter is not supported and will raise an error.

  • lines (bool, default False) – This parameter is not supported and will raise an error.

  • chunksize (int, optional) – This parameter is not supported and will raise an error.

  • compression (str, default 'infer') – String (constant) that specifies the compression algorithm of the data files to be loaded. Snowflake uses this option to detect how already-compressed data files were compressed so that the compressed data in the files can be extracted for loading. See the list of Snowflake standard compressions.

  • nrows (int, optional) – This parameter is not supported and will raise an error.

  • storage_options (dict, optional) – This parameter is not supported and will be ignored.

  • dtype_backend ({'numpy_nullable', 'pyarrow'}, default 'numpy_nullable') – This parameter is not supported and will be ignored.

  • engine ({'ujson', 'pyarrow'}, default 'ujson') – This parameter is not supported and will be ignored.
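When porting existing pandas code, it can help to check keyword arguments against the list above before calling `read_json`. The sketch below uses a hypothetical helper, `classify_read_json_kwargs`, which is not part of Snowpark pandas; the two name sets are transcribed from this page. This page does not say whether explicitly passing a parameter's default value also raises, so the helper flags any use of the name.

```python
# Parameter groups transcribed from the parameter list on this page.
# This helper is hypothetical and not part of the library.
RAISES = frozenset({
    "orient", "typ", "dtype", "convert_axes", "convert_dates",
    "keep_default_dates", "date_unit", "encoding_errors",
    "lines", "chunksize", "nrows",
})
IGNORED = frozenset({
    "precise_float", "storage_options", "dtype_backend", "engine",
})


def classify_read_json_kwargs(kwargs):
    """Return (raising, ignored) parameter names, each sorted alphabetically."""
    raising = sorted(name for name in kwargs if name in RAISES)
    ignored = sorted(name for name in kwargs if name in IGNORED)
    return raising, ignored


raising, ignored = classify_read_json_kwargs(
    {"lines": True, "nrows": 10, "engine": "pyarrow", "encoding": "utf-8"}
)
print(raising)  # ['lines', 'nrows']
print(ignored)  # ['engine']
```

Names that appear in neither group here (`encoding` and `compression`) are the ones documented above as having an effect.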

Return type:

Snowpark pandas DataFrame

Raises:

NotImplementedError if a parameter is not supported.

Notes

path_or_buf accepts both local files and files staged on Snowflake. It can point to a single file or to a folder (or stage prefix) that contains a set of files. When several files are read, the order in which they are read is not deterministic.
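Because the read order is not deterministic, one option is to store an explicit ordering key in each record and sort on it after loading. The sketch below illustrates the idea with the standard library only: it shuffles the file list to stand in for an arbitrary read order. The `seq` field and the `part_N.json` file names are assumptions made for this example.

```python
import json
import os
import random
import tempfile

folder = tempfile.mkdtemp()

# Write three one-record files; "seq" is a hypothetical ordering key.
for seq in range(3):
    with open(os.path.join(folder, f"part_{seq}.json"), "w", encoding="utf-8") as f:
        json.dump({"seq": seq, "value": f"row-{seq}"}, f)

# Simulate an arbitrary read order by shuffling the file list.
names = os.listdir(folder)
random.shuffle(names)

rows = []
for name in names:
    with open(os.path.join(folder, name), encoding="utf-8") as f:
        rows.append(json.load(f))

# Sorting on the explicit key gives a stable order regardless of read order.
rows.sort(key=lambda row: row["seq"])
```

With a DataFrame returned by `pd.read_json(folder)`, the analogous step would presumably be a sort on the same column, for example `df.sort_values("seq")`.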

Examples

Read a local JSON file.

>>> import tempfile
>>> import json
>>> temp_dir = tempfile.TemporaryDirectory()
>>> temp_dir_name = temp_dir.name
>>> data = {'A': "snowpark!", 'B': 3, 'C': [5, 6]}
>>> with open(f'{temp_dir_name}/snowpark_pandas.json', 'w') as f:
...     json.dump(data, f)
>>> import modin.pandas as pd
>>> import snowflake.snowpark.modin.plugin
>>> df = pd.read_json(f'{temp_dir_name}/snowpark_pandas.json')
>>> df
           A  B       C
0  snowpark!  3  [5, 6]

Read a staged JSON file.

>>> _ = session.sql("create or replace temp stage mytempstage").collect()
>>> _ = session.file.put(f'{temp_dir_name}/snowpark_pandas.json', '@mytempstage/myprefix')
>>> df2 = pd.read_json('@mytempstage/myprefix/snowpark_pandas.json')
>>> df2
           A  B       C
0  snowpark!  3  [5, 6]

Read JSON files from a local folder.

>>> with open(f'{temp_dir_name}/snowpark_pandas2.json', 'w') as f:
...     json.dump(data, f)
>>> df3 = pd.read_json(f'{temp_dir_name}')
>>> df3
           A  B       C
0  snowpark!  3  [5, 6]
1  snowpark!  3  [5, 6]

Read JSON files from a staged location.

>>> _ = session.file.put(f'{temp_dir_name}/snowpark_pandas2.json', '@mytempstage/myprefix')
>>> df4 = pd.read_json('@mytempstage/myprefix')
>>> df4
           A  B       C
0  snowpark!  3  [5, 6]
1  snowpark!  3  [5, 6]