{"id":11717,"date":"2019-11-02T21:40:52","date_gmt":"2019-11-02T21:40:52","guid":{"rendered":"https:\/\/ittutorial.org\/?p=11717"},"modified":"2019-11-10T10:07:14","modified_gmt":"2019-11-10T10:07:14","slug":"import-table-mysql-to-hdfs-using-apache-sqoop","status":"publish","type":"post","link":"https:\/\/ittutorial.org\/import-table-mysql-to-hdfs-using-apache-sqoop\/","title":{"rendered":"Import table Mysql to HDFS Using Apache Sqoop"},"content":{"rendered":"<p>Hi everyone,\u00a0In this article, I will transfer a table on Mysql to the HDFS file system using sqoop.<\/p>\n<p>Sqoop provides the ability to transfer from any RDBMS to an HDFS system.<\/p>\n<p>Now let&#8217;s continue with the example,\u00a0Let&#8217;s download a sample set from kaggle<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-11724\" src=\"https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_1-1.png\" alt=\"\" width=\"1645\" height=\"648\" srcset=\"https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_1-1.png 1645w, https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_1-1-300x118.png 300w, https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_1-1-768x303.png 768w, https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_1-1-1024x403.png 1024w\" sizes=\"auto, (max-width: 1645px) 100vw, 1645px\" \/><\/p>\n<pre><\/pre>\n<p><span>Move the csv file to the virtual machine through WinSCP or a similar program.<\/span> \u00a0<span>connect to mysql and create a table that fits the data set<\/span>.<\/p>\n<pre><span>create table\u00a0 Border_Crossing_Entry ( port_Name varchar(50), state varchar(50), port int, border varchar(50), border_date date, measure varchar(50), Value int, Location varchar(50));\r\n<\/span><\/pre>\n<p><span>This script transfers the data from our csv file to the table.<\/span><\/p>\n<pre><span>load data local infile '\/home\/cloudera\/Border_Crossing_Entry_Data.csv' into table Border_Crossing_Entry fields terminated by ',' enclosed by '\"' lines terminated by '\\n' (port_Name,state,port,border,border_date,measure,Value,Location); <\/span><\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-11725\" src=\"https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_3-1.png\" alt=\"\" width=\"1130\" height=\"251\" srcset=\"https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_3-1.png 1130w, https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_3-1-300x67.png 300w, https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_3-1-768x171.png 768w, https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_3-1-1024x227.png 1024w\" sizes=\"auto, (max-width: 1130px) 100vw, 1130px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p><span>Now let&#8217;s create the target directory in HDFS. \u00a0<\/span><\/p>\n<pre>h<span>adoop fs -mkdir \/user\/cloudera\/sqoop_exp\u00a0<\/span><\/pre>\n<p><span>For import, let&#8217;s set the sqoop parameters and start the operations.<\/span><\/p>\n<pre><span>sqoop import --connect jdbc:mysql:\/\/quickstart:3306\/retail_db --username=retail_dba --password=cloudera --table Border_Crossing_Entry --m 1 --target-dir \/user\/cloudera\/sqoop.exp<\/span><\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone  wp-image-11726\" src=\"https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_4-1.png\" alt=\"\" width=\"988\" height=\"231\" srcset=\"https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_4-1.png 894w, https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_4-1-300x70.png 300w, https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_4-1-768x180.png 768w\" sizes=\"auto, (max-width: 988px) 100vw, 988px\" \/><\/p>\n<p><span>Finally, let&#8217;s take a look at the file we show as output to hdfs.<\/span><\/p>\n<pre>hadoop fs -ls \/user\/cloudera\/sqoop.exp\/<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone  wp-image-11727\" src=\"https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_5-1.png\" alt=\"\" width=\"982\" height=\"150\" srcset=\"https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_5-1.png 903w, https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_5-1-300x46.png 300w, https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/Screenshot_5-1-768x117.png 768w\" sizes=\"auto, (max-width: 982px) 100vw, 982px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>I<span>f you look inside the file with m-00000, you can see that the data set is there.<\/span><\/p>\n<p>See you in the next article&#8230;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hi everyone,\u00a0In this article, I will transfer a table on Mysql to the HDFS file system using sqoop. Sqoop provides the ability to transfer from any RDBMS to an HDFS system. Now let&#8217;s continue with the example,\u00a0Let&#8217;s download a sample set from kaggle Move the csv file to the virtual machine through WinSCP or a &hellip;<\/p>\n","protected":false},"author":67,"featured_media":11728,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[6674],"tags":[],"class_list":["post-11717","post","type-post","status-publish","format-standard","has-post-thumbnail","","category-big-data"],"aioseo_notices":[],"jetpack_featured_media_url":"https:\/\/ittutorial.org\/wp-content\/uploads\/2019\/11\/sqoop_introduction.png","jetpack_sharing_enabled":true,"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/ittutorial.org\/wp-json\/wp\/v2\/posts\/11717","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ittutorial.org\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ittutorial.org\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ittutorial.org\/wp-json\/wp\/v2\/users\/67"}],"replies":[{"embeddable":true,"href":"https:\/\/ittutorial.org\/wp-json\/wp\/v2\/comments?post=11717"}],"version-history":[{"count":2,"href":"https:\/\/ittutorial.org\/wp-json\/wp\/v2\/posts\/11717\/revisions"}],"predecessor-version":[{"id":11729,"href":"https:\/\/ittutorial.org\/wp-json\/wp\/v2\/posts\/11717\/revisions\/11729"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ittutorial.org\/wp-json\/wp\/v2\/media\/11728"}],"wp:attachment":[{"href":"https:\/\/ittutorial.org\/wp-json\/wp\/v2\/media?parent=11717"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ittutorial.org\/wp-json\/wp\/v2\/categories?post=11717"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ittutorial.org\/wp-json\/wp\/v2\/tags?post=11717"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}