Encrypt password used by Sqoop to import or export data from database.
Sqoop became very popular and the darling tool for the industries. Sqoop has developed a lot and become very popular amongst Hadoop ecosystem. When we import or export data from database through Sqoop then we have to give password in command or in file only. I feel this is not a fully secure way to keep password in a file or pass password through command line. In this post, I will try cover the ways to specify database passwords to Sqoop in simple and secure way.
The following ways are common to pass database passwords to Sqoop:
Option 1: We can give password in command itself.
sqoop import --connect jdbc:mysql://servera.com:3306/bigdb --username dbuser --password <passwoord> --table sqoop_import -m 1 --target-dir /user/root/sanity_test/ --append
Option 2: We can keep password in a file and can pass this file in command.
sqoop import --connect jdbc:mysql://servera.com:3306/bigdb --username dbuser -password-file /home/root/.mysql.password --table sqoop_import -m 1 --target-dir /user/root/sanity_test/ --append
However, storing password in a text file is still considered not secure even though we have set the permissions.
As of Sqoop version 1.4.5, Sqoop supports the use of JKS to store passwords which would be encrypted,so that you do not need to store passwords in clear text in a file.This can be achieved using Java KeyStore.
Option 3: To generate the key:
Note: On prompt, enter the password that will be used to access the database.
[email@example.com ~]$ hadoop credential create mydb.password.alias -provider jceks://hdfs/user/root/mysql.password.jceks Enter password: Enter password again: mydb.password.alias has been successfully created. org.apache.hadoop.security.alias.JavaKeyStoreProvider has been updated. ## Now you can use mydb.password.alias file in your sqoop command like below: ## Please add --Dhadoop.security.credential.provider.path=jceks://hdfs/user/root/mysql.password.jceks to the command and change the password like follows: sqoop import --Dhadoop.security.credential.provider.path=jceks://hdfs/user/root/mysql.password.jceks --connect jdbc:mysql://server:3306/sqoop --username sqoop --password-alias mydb.password.alias --table Test --delete-target-dir --target-dir /apps/hive/warehouse/test --split-by name ##or sqoop import -Dhadoop.security.credential.provider.path=jceks://hdfs/user/root/mysql.password.jceks --connect ‘jdbc:mysql://servera.com:3306/bigdb’ --tablesqoop_import --username dbuser --password-alias mydb.password.alias -m 1 --target-dir /user/root/sanity_test/ --append
This way password is hidden inside jceks://hdfs/user/root/mysql.password.jceks and no one is able to see it.
Note: the “mydb.password.alias” is the alias that we can use to pass to Sqoop when running the command, so that no password is needed.
I hope you enjoy reading this article and will help you to protect your password. Please give your valuable feedback.