Google Associate Cloud Engineer – Exploring Database in Google Cloud Platform Part 4
15. Step 09 – Playing with Cloud SQL, BigQuery and Bigtable from Command Line
Welcome back. In the last few steps, we learned a lot about the database options in Google Cloud Platform. I’m sure you are now very clear on how to choose a database for a specific scenario, so it’s time now to look at how you can export and import data into these databases. Let’s get started with Cloud SQL. Whenever you look at any of the import or export scenarios in Google Cloud Platform, you’d see that Cloud Storage is typically involved. Whenever you want to move data from one place to another, you’d first move it to Cloud Storage and then move it to the destination.
And that’s the reason why you’d see that most of these services support export to and import from Cloud Storage. Let’s start with the basic relational database in Google Cloud, which is Cloud SQL. Cloud SQL provides you with options to export data to Cloud Storage and import data from a file which is present in Cloud Storage. You can do this export using the web console, gcloud, or the Cloud SQL REST API. There are two supported formats: SQL and CSV. If you are exporting to another relational database, you’d probably want to export to SQL first, and then you can execute those SQL commands to import the data into your destination database.
As you can see here, the command to export data from Cloud SQL is very easy: gcloud sql export, followed by the format. If it’s CSV, you specify csv; otherwise, you specify sql. If you want to import, it’s gcloud sql import csv or gcloud sql import sql. Cloud SQL also supports a serverless mode for exports. If you have a large database which is being used by users right now, and you want to minimize the performance impact of the export on the live database, you can use serverless mode.
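Here is a minimal sketch of these commands; the instance name my-instance, database mydb, table my_table, and bucket my-bucket are all hypothetical placeholders:

# Export a database to a SQL dump file in Cloud Storage
gcloud sql export sql my-instance gs://my-bucket/dump.sql.gz --database=mydb

# Serverless (offloaded) export, to reduce the impact on a live instance
gcloud sql export sql my-instance gs://my-bucket/dump.sql.gz --database=mydb --offload

# Export the result of a query as CSV
gcloud sql export csv my-instance gs://my-bucket/data.csv --database=mydb --query="SELECT * FROM my_table"

# Import a SQL dump back into an instance
gcloud sql import sql my-instance gs://my-bucket/dump.sql.gz --database=mydb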
Serverless mode actually creates a temporary instance from which the export is done, so the live instance is not affected. So that’s Cloud SQL: you can export to or import from Cloud Storage. The next database that we’d be talking about is also a relational database, but a large-scale, global relational database: Cloud Spanner. Again, it’s the same story, to or from Cloud Storage. If you want to export data out of Cloud Spanner, one of the options you have is to go to the console and create an export job; in the background, the export job uses Cloud Dataflow. Now let’s move from the transactional databases towards the analytical database, or data warehouse, which is BigQuery. BigQuery supports imports and exports to Cloud Storage and other destinations.
You can either use the console or the bq command-line tool. Remember, the command-line tool for BigQuery is not gcloud, it is bq. You can use either of them to export data. BigQuery supports a number of export formats: you can export as CSV, JSON, or Avro. And because we are talking about large volumes of data, compression becomes really important. CSV and JSON support GZIP compression, and if you are using Avro, you have the option to use deflate or snappy compression. The export and import commands of bq are a little different from gcloud: with Cloud SQL, it is gcloud sql export and gcloud sql import.
However, with BigQuery, it’s bq extract and bq load. BigQuery supports import of data in a variety of ways. You can load data from Cloud Storage using bq load. A good example use case is Datastore to Cloud Storage to BigQuery: if you want to move data from Datastore to BigQuery, you can take an export from Datastore, put it in Cloud Storage, and from Cloud Storage import it into BigQuery. You can also do batch loading using the BigQuery Data Transfer Service, or use Dataflow to set up a streaming pipeline and stream the data directly into BigQuery.
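A minimal sketch of the two bq commands, assuming a hypothetical dataset mydataset, table mytable, and bucket my-bucket; the Datastore export path shape is also an assumption, so check your export’s actual layout:

# Export a table to Cloud Storage as GZIP-compressed CSV
bq extract --destination_format=CSV --compression=GZIP mydataset.mytable gs://my-bucket/export/mytable-*.csv.gz

# Load CSV files from Cloud Storage back into a table, inferring the schema
bq load --source_format=CSV --autodetect --skip_leading_rows=1 mydataset.mytable gs://my-bucket/export/mytable-*.csv.gz

# Load a Datastore export (the Datastore -> Cloud Storage -> BigQuery use case),
# pointing at the per-kind metadata file the export produced
bq load --source_format=DATASTORE_BACKUP mydataset.task gs://my-bucket/ds-export/all_namespaces/kind_Task/all_namespaces_kind_Task.export_metadata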
Import and export options are not that important when we talk about transactional databases like Cloud SQL and Cloud Spanner; typically they provide very simple import and export options. However, when it comes to big data and data warehouse tools like BigQuery, import and export become the most important thing. And that’s the reason why you would see that BigQuery supports a lot of different options: it supports a wide variety of formats, and you can load data from Cloud Storage, use the BigQuery Data Transfer Service, or even use Cloud Dataflow to set up a streaming pipeline.
Let’s look at one of the data transfer options, which is the BigQuery Data Transfer Service. If you search for BigQuery, enable the BigQuery API, and go to Data transfers, this is where you can see how to use the Data Transfer Service. We are not really going to create a data transfer, but let’s see what’s involved. Let’s just create a transfer here. You can see that we can pick the source, and you can see how many different sources it supports. You can go to AWS and pick up data from Amazon S3, or you can play it safe and say you want to take it from Google Cloud Storage.
You can even import data from the data warehouse in AWS, which is Redshift. Let’s take the easier option, which is Cloud Storage. If you pick Cloud Storage, you can give the configuration a name and then configure your schedule options; you can say you want to repeat this daily at a specific time. You can choose the dataset in BigQuery that you want the data loaded into, and after that you can choose the data source details: where the file is in Cloud Storage. If you scroll down a little, you’d see that a wide variety of options and formats are supported. The same transfer configuration can also be created from the command line, as the sketch below shows.
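A minimal sketch of creating an equivalent Cloud Storage transfer with the bq tool; the display name, dataset, bucket path, and table name are hypothetical placeholders, and the parameter keys should be verified against the Data Transfer Service documentation:

# Create a scheduled Cloud Storage -> BigQuery transfer configuration
bq mk --transfer_config \
    --data_source=google_cloud_storage \
    --display_name="Daily GCS load" \
    --target_dataset=mydataset \
    --params='{"data_path_template":"gs://my-bucket/exports/*.csv","destination_table_name_template":"mytable","file_format":"CSV","skip_leading_rows":"1"}'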
You don’t really need to remember all the details around the sources, formats, and so on. If you remember at a high level that BigQuery supports a huge variety of data transfer options, that should be more than sufficient. When you’re using BigQuery, you can use the BigQuery Data Transfer Service; the other option is to use Dataflow. If you type in Dataflow in the console (you might need to enable the API first before you can go in), you would see that Dataflow provides unified streaming and batch data processing that’s serverless, fast, and cost-effective.
We saw that when you are exporting or importing data from Cloud Spanner, we make use of Cloud Dataflow. Similarly, Cloud Dataflow can also be used to set up pipelines involving BigQuery. The important keywords here are streaming and batch. You might have a stream of data coming in, maybe analytics data, maybe logs; you can process those using Dataflow, as well as batch workloads. Streaming and batch workloads are typically considered very distinct; however, Dataflow provides a unified way of handling both. In Dataflow, the way you process data is by creating jobs. You can go in here and use any of these templates to create a job. Let’s look at what is involved.
Click Create Job from Template; on the next screen, you’d be able to choose the specific template, and you can see that a lot of export and import options are supported. Let’s just type in Spanner first. You can see that you can transfer data from Cloud Storage to Cloud Spanner, or from Cloud Spanner to Cloud Storage, and that a variety of file formats are supported: you can use either a text file or an Avro file as the intermediate format. The sketch below shows how one of these templates might be run from the command line.
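A minimal sketch of running the Spanner-to-Cloud-Storage text template with gcloud; the template name, parameter names, and all resource names here are assumptions drawn from the public template catalog, so verify them against the template’s documentation before use:

# Run the Cloud Spanner -> Cloud Storage (text) template as a Dataflow job
gcloud dataflow jobs run spanner-export-job \
    --gcs-location=gs://dataflow-templates/latest/Cloud_Spanner_to_GCS_Text \
    --region=us-central1 \
    --parameters=spannerProjectId=my-project,spannerInstanceId=my-instance,spannerDatabaseId=my-db,spannerTable=my_table,textWritePath=gs://my-bucket/spanner-export/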
Now let’s type in BigQuery to see the different export options to BigQuery that are supported through Dataflow. You can see that there are options to transfer data from Cloud Storage, Hive, Kafka, and Pub/Sub, and if you scroll down further, you’d see a lot more options: you can transfer data from a number of other data sources to BigQuery by using Dataflow. You don’t really need to remember all the details we have talked about in this specific step. At a high level, this is the summary: most of the databases support import and export to or from Cloud Storage. When we are talking about transactional databases, exports and imports are not as important, and that’s the reason why Cloud SQL and Cloud Spanner offer basic export and import options.
However, when we talk about BigQuery, a data warehouse, import and export options are very important, and that’s why you would see that it supports a wide variety of them. BigQuery not only has its own transfer service, the BigQuery Data Transfer Service; you can also use Dataflow to create pipelines that transfer data to or from BigQuery, and you can use Dataflow to create pipelines around Cloud Spanner as well. I’m sure you’re having a wonderful time, and I’ll see you in the next step.
16. Step 10a – Importing and Exporting Relational Databases
Welcome back. In this step, let’s look at how you can export and import NoSQL databases. The two NoSQL databases that we’ll be talking about, you already know: Datastore and Bigtable. Let’s start with Datastore. As usual, we will be importing and exporting data to or from Cloud Storage, and whatever we say about Datastore here applies to Firestore as well. You can export using the web console, gcloud, or the REST API. If I go into Datastore and open it up in a new tab, we are actually using Cloud Firestore in native mode, and this is where you can do the import or export from.
You can say either that you want to import data or that you want to export data. If you want to export, you can go into Export and say you want to export the entire database, or only a set of collections. Then you can choose where you want to export to: the destination is a Cloud Storage bucket. If you go back, you can also see the import option. If you click Import, the first thing you would choose is where the import file is present, and this takes you to Cloud Storage. From the Cloud Storage buckets, you can pick a file and import that data back into Firestore.
So, as expected, Cloud Datastore and Firestore support export and import to or from Cloud Storage, and you can do it using the console, gcloud, or the REST API. The gcloud command looks something like this: gcloud datastore export/import or gcloud firestore export/import, and you can decide which objects you want to export. You’d see that this command structure is very similar to Cloud SQL’s: gcloud sql export or gcloud sql import. A minimal sketch follows.
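A minimal sketch, assuming a hypothetical bucket my-bucket and collections named users and orders; the folder name of an export is generated at export time, so EXPORT_FOLDER is a placeholder for whatever your export produced:

# Export the entire Firestore database, or only specific collections
gcloud firestore export gs://my-bucket/firestore-backup
gcloud firestore export gs://my-bucket/firestore-backup --collection-ids=users,orders

# Import a previous export back into Firestore
gcloud firestore import gs://my-bucket/firestore-backup/EXPORT_FOLDER

# The Datastore-mode equivalents, filtered by kind; import points at the
# overall export metadata file
gcloud datastore export gs://my-bucket/datastore-backup --kinds=Task
gcloud datastore import gs://my-bucket/datastore-backup/EXPORT_FOLDER/EXPORT_FOLDER.overall_export_metadata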
Similar to Datastore, Bigtable also supports import and export to or from Cloud Storage; to export data from Bigtable, you create Dataflow jobs. It also supports a wide variety of formats. It’s important to remember that Bigtable often holds a lot of data, sometimes petabytes, and that’s the reason why import and export become very important and why Google Cloud supports a wide variety of options around Bigtable too. If you go over to Dataflow, enable the API, and click Create Job from Template, you’d see a variety of options you can use to import data. If I search the Dataflow templates for Datastore, you can see that you can use Dataflow to export data from Datastore to Cloud Storage and from Cloud Storage to Datastore; you can also run batch jobs to bulk delete entities in Datastore.
You’d see that there are a wide variety of options around Bigtable as well: you can transfer data from Cloud Storage to Bigtable and from Bigtable to Cloud Storage, and a variety of formats are supported, Avro, SequenceFile, and Parquet among them. In this and the last step, we looked at some of the import and export options around databases, and there are two common points we observed. One is Cloud Storage: most of the imports and exports are to and from Cloud Storage. The other is Dataflow: Dataflow is involved in the imports and exports of most of these tools. Now, before we move on to the next step, there is one important thing that you need to remember.
Whenever you’re doing an import or export, the operation uses the service account of that specific service, and that service account needs access to the Cloud Storage bucket so that it can read or write data. That’s very important. You can either use an access control list to grant permission, for example giving write permission on the bucket to a specific service account using the gsutil acl ch command, or you can assign a role to that service account: Storage Admin, Storage Object Admin, or Storage Object Creator. Both approaches are sketched below.
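A minimal sketch of both approaches; the service account and bucket names are hypothetical placeholders:

# Option 1: grant write permission on the bucket via an ACL
gsutil acl ch -u my-sa@my-project.iam.gserviceaccount.com:W gs://my-bucket

# Option 2: grant a predefined role through bucket-level IAM
gsutil iam ch serviceAccount:my-sa@my-project.iam.gserviceaccount.com:objectCreator gs://my-bucket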
Typically, you don’t want to go with the admin roles; you want the narrowest role that works. If you are creating objects, that is, doing an export, you might want to give the Storage Object Creator role. If you are doing an import, you might want to give the Storage Object Viewer role. The important thing to remember is that the service account you are using for the export or import should have access to perform the operations you want it to perform on Cloud Storage. I’m sure you had a wonderful time, and I’ll see you in the next step.
17. Step 10b – Importing and Exporting NoSQL Databases
Welcome back. Let’s quickly review some of the important things to remember about the databases. BigQuery, Datastore, and Firestore do not need VM configuration; they do not depend on VMs. However, Cloud SQL and Bigtable need VM configuration; they make use of Compute Engine virtual machines. Let’s revise the database options:
- Small, local relational databases: Cloud SQL
- Highly scalable, global relational databases: Cloud Spanner
- Data warehouse: BigQuery
- NoSQL transactional databases for a few terabytes of data: Cloud Datastore or Firestore
- Huge volumes of IoT or streaming analytics data: Cloud Bigtable
I’m sure you had a wonderful time in this section. I’ll see you in the next section.