The good parts and the regrettable parts of Cloud Datalab

This entry is a continuation of a previous one.

### Basics of Cloud Datalab

I have written something similar before, but to recap: Cloud Datalab is roughly the following.

• An interactive analysis environment based on Jupyter
• An environment integrated with GCP
• Jupyter and the Python libraries are packaged together as a container
• The container can be launched on (and torn down from) GCE easily through the `datalab` command

### Premise of Datalab

Datalab is designed to work closely with a GCP project.

Unless you specify otherwise, everything below runs with the defaults.

### Start-up

Let's try various things on that premise. First of all, starting Datalab.

```shell
$ datalab create --disk-size-gb 10 --no-create-repository datalab-test
```

• `--disk-size-gb` specifies the disk size. The default is 200GB, so here the smaller size of 10GB is specified.
• `--no-create-repository` skips creating the repository.
• When only the repository was removed, Datalab would not start unless `--no-create-repository` was given. What is this about? I will investigate it properly later.

### Cooperation with BigQuery

Datalab is very nice in that it can work with BigQuery. Shifting the story a little: among Jupyter's magic commands (the ones that begin with `%%`), functions for BigQuery and GCS are also provided.

### Executing a query as a magic command

It is just as the sample shows, but the splendor of being able to write a query right in a cell comes through once you try it:

```
%%bq query
SELECT id, title, num_characters
FROM `publicdata.samples.wikipedia`
WHERE wp_namespace = 0
ORDER BY num_characters DESC
LIMIT 10
```

### Running it through google.datalab.bigquery

Since the BQ query goes into a cell, the next thing you want, as in the sample, is to process the results, and you can pass them to pandas as a DataFrame. Wonderful.

```
%%bq query -n requests
SELECT timestamp, latency, endpoint
FROM `cloud-datalab-samples.httplogs.logs_20140615`
WHERE endpoint = 'Popular' OR endpoint = 'Recent'
```

```python
import google.datalab.bigquery as bq
import pandas as pd

# The query named 'requests' in the cell above, executed as a DataFrame
df = requests.execute(output_options=bq.QueryOutput.dataframe()).result()
```
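Once the results land in a pandas DataFrame, ordinary pandas operations take over. As a minimal sketch of that kind of follow-up, here is a hedged example; the hand-made data below stands in for the timestamp/latency/endpoint columns of the sample query, it is not real log data:

```python
import pandas as pd

# Stand-in for the query result: same columns as the httplogs sample
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2014-06-15 07:00:00", "2014-06-15 07:00:05",
        "2014-06-15 07:00:09", "2014-06-15 07:00:12",
    ]),
    "latency": [120, 340, 95, 210],
    "endpoint": ["Popular", "Recent", "Popular", "Recent"],
})

# Mean latency per endpoint: the kind of aggregation the notebook flow enables
summary = df.groupby("endpoint")["latency"].mean()
print(summary)
```

From here you could just as easily plot the series or resample by time, since it is a plain DataFrame.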
Doing roughly the same thing in a more API-ish way looks like this:

```python
import google.datalab.bigquery as bq
import pandas as pd

# Issue the query
query = """SELECT timestamp, latency, endpoint
FROM `cloud-datalab-samples.httplogs.logs_20140615`
WHERE endpoint = 'Popular' OR endpoint = 'Recent'"""

# Create a query object
qobj = bq.Query(query)

# Get the query results as a pandas DataFrame
df2 = qobj.execute(output_options=bq.QueryOutput.dataframe()).result()

# From here on, ordinary pandas operations work
df2.head()
```

Come to think of it, since the API is provided here, it follows naturally that the magic commands are built on top of it. In fact, if you look at the source, you can see that the `%%bq` magic command is defined there.

### Cooperation with GCS

Just as with the BigQuery sample, objects on GCS can be manipulated from a cell; the point is that you can read and write files. BigQuery results can also serve as a data source, but being able to handle data sitting on GCS transparently, as a data source as it is, is fascinating.

### Cooperation with CloudML

I confirmed that something moves through the API, but there is still a lot of behavior I do not understand, so I will skip it this time.

### Changing the instance type

Here is where the true value of the cloud shows: if you need a spec upgrade that would be impossible on-premises, it can be realized. The `create` subcommand of the `datalab` command lets you specify the instance type with the `--machine-type` option. By default it appears to come up as `n1-standard-1`.

```shell
# Delete the instance with the delete command
# The disk that was attached remains intact in this case
$ datalab delete datalab-test
```

```shell
# Start again with the same machine name, changing the instance type.
# The disk is created with the naming convention "machine name + pd",
# so using the same machine name attaches the same disk automatically.
$ datalab create --no-create-repository \
    --machine-type n1-standard-4 \
    datalab-test
```


Now you can raise or lower the machine's specs whenever necessary.

### A GPU-backed analysis environment!

This, for the time being, was the highlight of this entry.

With this!!! All that is left is to specify a GPU instance!!!! And a handy GPU machine-learning environment is yours, easily!!!!

...or so I thought, but the world does not go that easily: GPU instances are not supported in Datalab.

### Summary

Datalab does have its regrettable points: GPU instances are not supported (though I faintly hope support will come eventually), and the integration around Cloud Source Repository and Cloud ML Engine still feels rough. Even so, I think it is an important piece for building a data-analysis environment these days. Next time I want to look at these areas a little more closely.

### Other reference information

• The Datalab API
• The Python libraries bundled with Datalab

• Libraries such as OpenCV do not seem to be included
• That said, you can additionally install Python modules yourself
• OS-side commands can also be run with `!`, so you can put in packages with `apt-get` and the like
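As a hedged sketch of those last two points, from a notebook cell it would look something like this (the package names are illustrative, not ones I verified against the Datalab image):

```
# Inside a Datalab cell, '!' passes the rest of the line to the OS shell

# Add a Python module that is not bundled (illustrative package name)
!pip install opencv-python

# Install OS-side packages with apt-get
!apt-get update && apt-get install -y libsm6
```

Changes made this way live only inside the running container, so they disappear when the instance is recreated; anything you rely on would need to be reinstalled or baked into a custom image.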