usgs_streamflow_gcp

CAMELS USGS Streamflow

Load in Python

from intake import open_catalog

# Open the CAMELS hydrology catalog and load the entry lazily as a Dask DataFrame
cat = open_catalog("https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/hydro/camels.yaml")
ds = cat["usgs_streamflow_gcp"].to_dask()

Working with requester pays data

Several of the datasets within the cloud data catalog are contained in requester pays storage buckets. This means that a user requesting data must provide their own billing project (created and authenticated through Google Cloud Platform) to be billed for the charges associated with accessing a dataset. To set up a GCP billing project and use it for authentication in applications:

Create a project on GCP; if this is the first time using GCP, a prompt will appear to choose a Google account to link to all GCP-related activities.
Create a Cloud Billing account associated with the project and enable billing for the project through this account.
Using Google Cloud IAM, add the Service Usage Consumer role to your account, which enables it to make billed requests on behalf of the project.
From the command line, install the Google Cloud SDK; this can be done using conda:
```
conda install -c conda-forge google-cloud-sdk
```
Initialize the gcloud command line interface, logging into the account used to create the aforementioned project and selecting it as the default project; this will allow the project to be used for requester pays access through the command line:
```
gcloud auth login
gcloud init
```
Finally, use gcloud to establish application default credentials; this will allow the project to be used for requester pays access through applications:
```
gcloud auth application-default login
```
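
Once application default credentials exist, a Python client still has to name the project that will be billed. The following is a minimal sketch of how that might look, assuming the gcsfs library is installed (it is not mentioned above) and that the project ID is passed in or read from a `GOOGLE_CLOUD_PROJECT` environment variable. The function names and the ID `my-billing-project` are hypothetical.

```python
import os


def requester_pays_options(project=None):
    """Build storage options that bill requests to `project`.

    Falls back to the GOOGLE_CLOUD_PROJECT environment variable
    (an assumption of this sketch) when no project is passed in.
    """
    project = project or os.environ.get("GOOGLE_CLOUD_PROJECT")
    if not project:
        raise ValueError("a GCP billing project ID is required")
    return {
        "project": project,         # project billed for the access charges
        "token": "google_default",  # use application default credentials
        "requester_pays": True,     # send the billing project with requests
    }


def open_filesystem(project=None):
    """Return a gcsfs filesystem configured for requester pays access."""
    import gcsfs  # imported lazily so the helper above works without it

    return gcsfs.GCSFileSystem(**requester_pays_options(project))


options = requester_pays_options("my-billing-project")  # hypothetical ID
```

The resulting dictionary can be passed wherever gcsfs storage options are accepted; `open_filesystem` is only called once credentials and a real project ID are in place.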

Metadata

origin_url

https://ral.ucar.edu/solutions/products/camels

Dataset Contents

Dask DataFrame Structure:

                         date  basin     QObs    flag
npartitions=1
               datetime64[ns]  int64  float64  object
                          ...    ...      ...     ...

Dask Name: from-delayed, 3 tasks
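
The column layout above suggests two common operations: pulling the time series for one basin and summarizing flow per basin. The sketch below illustrates both with pandas on a small hand-made frame that mirrors the listed columns and dtypes. The rows, basin IDs, and flag strings are invented for illustration and are not taken from the dataset, and the helper names are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in rows with the same columns as the catalog entry;
# every value here is made up for illustration.
sample = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2000-01-01", "2000-01-02", "2000-01-01", "2000-01-02"]
        ),
        "basin": [1013500, 1013500, 1022500, 1022500],
        "QObs": [10.0, 12.0, 3.0, 5.0],
        "flag": ["A", "A", "A", "A:e"],
    }
)


def basin_series(df, basin_id):
    """Return the QObs time series for one basin, indexed by date."""
    subset = df[df["basin"] == basin_id]
    return subset.set_index("date")["QObs"].sort_index()


def mean_flow_by_basin(df):
    """Return the mean QObs per basin as a Series indexed by basin."""
    return df.groupby("basin")["QObs"].mean()


series = basin_series(sample, 1013500)
means = mean_flow_by_basin(sample)
```

The boolean filter and the groupby mean should also be expressible on the Dask DataFrame itself, with `.compute()` appended to materialize the result; that has not been checked against this dataset.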