Purpose:
To load CVS data in a database using yb-voyager
Requirements:
1) CSV data exported from the source database.
2) Git installed on the system.
Solution:
Yugabyte Voyager is an open source tool from Yugabyte which can be used to do data migrations from many supported databases into YugabyteDB. You can read about this tool on the official YB docs here.
In this article, we will talk about how this tool can be used to load CSV data into YugabyteDB.
To install this tool,
1. Clone the yb-voyager
repository (to be installed where you can connect to your databases).
git clone https://github.com/yugabyte/yb-voyager.git
2. Change the directory toyb-voyager/installer_scripts
cd yb-voyager/installer_scripts
3. Install yb-voyager using the following script:
./install-yb-voyager
If the script fails, check the/tmp/install-yb-voyager.log
file.
If you receive any errors regarding binutils, please make sure it is installed.
sudo apt-get install binutils
For more information, please follow this doc here.
Scenario:
Lets assume a case where we have some (or a large amount of) CSV data, spread across one or more files, and we want to upload this data in YugabyteDB in a seamless way. Also assume that the schema is not present in the database, the database is blank/empty in the context of this CSV data.
Lets say we have 2 files, MSSQLData1_data.csv and MSSQLData2_data.csv, which contains CSV data with the following fields:
1. MSSQLData1_data.csv - "Id","Customer","InternalId","SalesRepName"
2. MSSQLData2_data.csv - "Id","InternalId","NumTransactions","Comments","DatasetId"
Steps:
1. The first thing to do here will be to create the blank target tables with the same schema as the source table from where the csv was exported from. Please follow this document to create tables.
2. Once the blank target tables are created with the schema, you can use the yb-voyager tool to load the data from csv files to the respective tables.
yb-voyager import data file --export-dir /tmp/ --target-db-host 172.31.20.218 --target-db-user yugabyte --target-db-password yugabyte --data-dir "/tmp/" --file-table-map "MSSQLData1_data.csv:mssqldata1,MSSQLData2_data.csv:mssqldata2" --has-header
where:
--export-dir : Specifies the path to the directory containing the data files to export.
--target-db-host : Specifies the domain name or IP address of the machine on which target database server is running.
--target-db-user : Specifies the username of the target database. This has to be a super user.
--data-dir : Path to the directory containing the data files to import
--file-table-map : Comma-separated mapping between the files in data-dir to the corresponding table in the database.
Default: If this flag isn't used, theimport data file
command imports files from the --data-dir directory matching thePREFIX_data.csv
pattern, where PREFIX is the name of the table into which the file data is imported.
Example :filename1:tablename1,filename2:tablename2[,...]
--has-header -->This argument is to be specified only for CSV file type.
Default: false; change to true if the CSV file contains column names as a header.
Upon loading the data in the tables,
ubuntu@ip-172-31-29-148:/tmp$ yb-voyager import data file --export-dir /tmp/ --target-db-host 172.31.20.218 --target-db-user ammar --target-db-password password --data-dir "/tmp/" --file-table-map "MSSQLData2_data.csv:mssqldata2,MSSQLData1_data.csv:mssqldata1" --has-header
import of data in "yugabyte" database started
Target YugabyteDB version: 11.2-YB-2.15.1.0-b0
skipping already imported tables: []
Preparing to import the tables: [mssqldata1 mssqldata2]
All the tables are imported
exiting...
Comments
0 comments
Please sign in to leave a comment.