usgs_streamflow_gcp
CAMELS USGS Streamflow
Load in Python
from intake import open_catalog
cat = open_catalog("https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/hydro/camels.yaml")
ds = cat["usgs_streamflow_gcp"].to_dask()
Working with requester pays data
Several of the datasets within the cloud data catalog are contained in requester pays storage buckets. This means that a user requesting data must provide their own billing project (created and authenticated through Google Cloud Platform) to be billed for the charges associated with accessing a dataset. To set up a GCP billing project and use it for authentication in applications:
- Create a project on GCP; if this is the first time using GCP, a prompt will appear to choose a Google account to link to all GCP-related activities.
- Create a Cloud Billing account associated with the project and enable billing for the project through this account.
- Using Google Cloud IAM, add the Service Usage Consumer role to your account, which enables it to make billed requests on behalf of the project.
- From the command line, install the Google Cloud SDK; this can be done using conda:
conda install -c conda-forge google-cloud-sdk
- Initialize the gcloud command line interface, logging into the account used to create the aforementioned project and selecting it as the default project; this will allow the project to be used for requester pays access through the command line:
gcloud auth login
gcloud init
- Finally, use gcloud to establish application default credentials; this will allow the project to be used for requester pays access through applications:
gcloud auth application-default login
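After completing the steps above, it can be useful to confirm from Python that application default credentials are actually present before requesting billed data. The following is a minimal sketch, not part of the catalog tooling: `find_adc_file` is a hypothetical helper, and it assumes the usual conventions that the `GOOGLE_APPLICATION_CREDENTIALS` environment variable takes precedence and that gcloud otherwise stores the credentials under `~/.config/gcloud` on Linux and macOS. It only checks that a file exists; it does not validate the credentials or the billing project.

```python
import os
from pathlib import Path


def find_adc_file(env=None, home=None):
    """Return the path of the application default credentials file, or None.

    Hypothetical helper: `env` and `home` are overridable so the lookup
    can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # An explicit credentials file takes precedence when this variable is set.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        path = Path(explicit)
        return path if path.is_file() else None

    # Assumed default location on Linux and macOS after running
    # `gcloud auth application-default login`; Windows uses a different folder.
    default = home / ".config" / "gcloud" / "application_default_credentials.json"
    return default if default.is_file() else None


if __name__ == "__main__":
    found = find_adc_file()
    print(found if found else "No application default credentials found")
```

If the helper returns None, rerunning the final gcloud step above is the likely fix.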
Metadata
origin_url | https://ral.ucar.edu/solutions/products/camels |
Dataset Contents
Dask DataFrame Structure:
| | date | basin | QObs | flag |
|---|---|---|---|---|
| npartitions=1 | | | | |
| | datetime64[ns] | int64 | float64 | object |
| | ... | ... | ... | ... |
Dask Name: from-delayed, 3 tasks
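To illustrate how the columns listed above might be used, here is a small sketch that computes the mean observed streamflow per basin. It runs on a synthetic pandas DataFrame rather than the real data, so the basin identifiers, flow values, and flag strings are made up for illustration; only the column names come from the structure shown above.

```python
import pandas as pd

# Synthetic rows that mimic the documented columns; all values are invented.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2000-01-01", "2000-01-02", "2000-01-01", "2000-01-02"]
        ),
        "basin": [1, 1, 2, 2],
        "QObs": [10.0, 14.0, 3.0, 5.0],
        "flag": ["A", "A", "A", "A"],  # placeholder flag values
    }
)

# Mean observed streamflow for each basin.
mean_q = df.groupby("basin")["QObs"].mean()
print(mean_q)
```

The same expression should carry over to the Dask DataFrame returned by `to_dask()`, with `.compute()` appended to trigger evaluation, for example `ds.groupby("basin")["QObs"].mean().compute()`.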